Re: [DOCS] suggestion about SEO on www.postgresql.org/docs

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Antony <antony(at)cantoute(dot)com>, PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
Date: 2014-10-07 16:46:52
Message-ID: CABRT9RA+ut5JvBqd6oSX9KJb0DEJcrTnvmzACYpZ6iiSfCfx+g@mail.gmail.com
Lists: pgsql-docs pgsql-www

On Wed, Aug 27, 2014 at 6:00 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Are we using the rel="canonical" suggestion in our web docs now?

Apparently not. I looked into this and I'm not 100% certain we should
do it. But if we decide so, I'm willing to code up a patch.

https://tools.ietf.org/html/rfc6596 states:
==== 8< ====
The target (canonical) IRI MUST identify content that is either
duplicative or a superset of the content at the context (referring)
IRI. Authors who declare the canonical link relation ought to
anticipate that applications such as search engines can:

o Index content only from the target IRI (i.e., content from the
context IRIs will be likely disregarded as duplicative).

o Consolidate IRI properties, such as link popularity, to the target
IRI.

o Display the target IRI as the representative IRI.
==== 8< ====

We certainly want the second property (consolidation), but the first
one suggests that older versions of the docs would be dropped from
search engines altogether. It's not clear whether search engines are
that strict in reality -- does anyone know?

This would not be a problem if the current docs also retained notes
about earlier supported versions, which would make our latest version
a "superset" of the earlier ones.

But I believe we very rarely remove material from the docs, so I
believe the upsides outweigh the downsides.

----
Another question is whether we should make "interactive" point to
"static". Here, too, the interactive one is actually the superset,
since static doesn't include user comments. But do we care about
search engines indexing comments anyway? They're not present in
sitemap.xml either, and I've never landed on the interactive version
when coming from Google.

My proposal:
1. Doc pages that are *older* than current, and exist in the current
version have canonical URL /docs/current/static/pagename.html
2. If it doesn't exist in current, we link to the last version that
includes this page, like /docs/8.4/static/install-win32.html
3. Newer versions (devel/beta) should perhaps point to themselves and not
/current/? This would make new features googleable for testers. The
doc links use rel=nofollow when linking to them, so they're already
ranked lower by search engines.
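
To make the three rules above concrete, here is a rough sketch in
Python. It is not taken from the website code: the function name, the
(major, minor) tuple used for versions, and the pages_by_version
mapping are all made up for illustration, and it assumes the current
version's own pages would point to /docs/current/ as well.

```python
from typing import Dict, Set, Tuple

Version = Tuple[int, int]  # hypothetical representation, e.g. (8, 4) for 8.4

# Hardcoded host, matching the existing absolute URLs mentioned in this mail.
BASE_URL = "http://www.postgresql.org"


def label(version: Version) -> str:
    """Turn (8, 4) into the URL component '8.4'."""
    return ".".join(str(part) for part in version)


def canonical_url(page: str, version: Version, current: Version,
                  pages_by_version: Dict[Version, Set[str]]) -> str:
    """Pick the canonical URL for `page` as served under `version`."""
    if version > current:
        # Rule 3: devel/beta docs point to themselves.
        target = label(version)
    elif page in pages_by_version.get(current, set()):
        # Rule 1: the page still exists in the current version.
        target = "current"
    else:
        # Rule 2: fall back to the last released version containing the page.
        candidates = [v for v, pages in pages_by_version.items()
                      if v <= current and page in pages]
        target = label(max(candidates)) if candidates else label(version)
    return f"{BASE_URL}/docs/{target}/static/{page}"


if __name__ == "__main__":
    # Toy data only; not the real contents of any release.
    pages = {
        (8, 4): {"install-win32.html", "sql-select.html"},
        (9, 3): {"sql-select.html"},
        (9, 4): {"sql-select.html", "new-feature.html"},
    }
    for page, ver in [("sql-select.html", (8, 4)),
                      ("install-win32.html", (8, 4)),
                      ("new-feature.html", (9, 4))]:
        print(ver, page, "->", canonical_url(page, ver, (9, 3), pages))
```

How devel/beta versions are actually labelled in URLs is left open
here; the sketch just reuses the numeric label.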

It appears there are already lots of places that hardcode the
http://www.postgresql.org/ URL, so does it make sense to use absolute
URLs for the canonical links too?
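
For illustration, an absolute canonical link could be produced along
these lines. This is only a sketch using the Python standard library;
canonical_link_tag is a hypothetical helper, not something that exists
in the site code, and the base URL is simply the hardcoded one
mentioned above.

```python
from html import escape
from urllib.parse import urljoin


def canonical_link_tag(path: str, base: str = "http://www.postgresql.org/") -> str:
    """Build a <link rel="canonical"> element with an absolute href."""
    absolute = urljoin(base, path)  # resolve a site-relative path against the base
    return '<link rel="canonical" href="%s">' % escape(absolute, quote=True)


if __name__ == "__main__":
    print(canonical_link_tag("/docs/current/static/sql-select.html"))
```

The resulting element would go into the <head> of each doc page.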

Did I miss anything?

Regards,
Marti
