Re: [DOCS] suggestion about SEO on www.postgresql.org/docs

From: Antony <antony(at)cantoute(dot)com>
To: pgsql-docs(at)postgresql(dot)org
Subject: suggestion about SEO on www.postgresql.org/docs
Date: 2014-04-03 18:32:21
Message-ID: 04E4F5A6-6526-4DDC-A9E5-2991E3B2ED83@cantoute.com
Lists: pgsql-docs pgsql-www

May I suggest adding a link rel=canonical to the documentation pages for the current version (now 9.3)?

Ex:
http://www.postgresql.org/docs/9.3/static/

could have

<link rel="canonical" href="http://www.postgresql.org/docs/current/static/" />

I believe this could help Google offer the current documentation as its first choice.
Right now it probably treats these pages as duplicate content.

This is probably not the right place to raise this, but I couldn't work out who to send it to.

Sorry for the noise

Antony


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Antony <antony(at)cantoute(dot)com>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>
Subject: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
Date: 2014-08-27 15:00:12
Message-ID: 20140827150012.GM14956@momjian.us
Lists: pgsql-docs pgsql-www

On Thu, Apr 3, 2014 at 08:32:21PM +0200, Antony wrote:
>
>
>
> May I suggest adding a link rel=canonical to the documentation pages for the current version (now 9.3)?
>
> Ex:
> http://www.postgresql.org/docs/9.3/static/
>
> could have
>
> <link rel="canonical" href="http://www.postgresql.org/docs/current/static/" />
>
> I believe this could help Google offer the current documentation as its first choice.
> Right now it probably treats these pages as duplicate content.
>
> This is probably not the right place to raise this, but I couldn't work out who to send it to.

Are we using the rel="canonical" suggestion in our web docs now?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Antony <antony(at)cantoute(dot)com>, PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
Date: 2014-10-07 16:46:52
Message-ID: CABRT9RA+ut5JvBqd6oSX9KJb0DEJcrTnvmzACYpZ6iiSfCfx+g@mail.gmail.com
Lists: pgsql-docs pgsql-www

On Wed, Aug 27, 2014 at 6:00 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Are we using the rel="canonical" suggestion in our web docs now?

Apparently not. I looked into this and I'm not 100% certain we should
do it. But if we decide so, I'm willing to code up a patch.

https://tools.ietf.org/html/rfc6596 states:
==== 8< ====
The target (canonical) IRI MUST identify content that is either
duplicative or a superset of the content at the context (referring)
IRI. Authors who declare the canonical link relation ought to
anticipate that applications such as search engines can:

o Index content only from the target IRI (i.e., content from the
context IRIs will be likely disregarded as duplicative).

o Consolidate IRI properties, such as link popularity, to the target
IRI.

o Display the target IRI as the representative IRI.
==== 8< ====

We certainly want property 2, but property 1 suggests that older
versions of docs are dropped from search engines altogether. It's not
clear whether they are that strict in reality -- does anyone know?

This would not be a problem if the current version also retained notes
about earlier supported versions, which would make our latest version a
"superset" of the earlier ones.

But I believe we very rarely remove material from the docs, so I think
the upsides outweigh the downsides.

----
Another question is whether we should make "interactive" point to
"static" -- again, actually the interactive one is the superset, since
static doesn't include user comments. But do we care about search
engines indexing comments anyway? They're not present in sitemap.xml
either and I've never landed on the interactive version when coming from Google.

My proposal:
1. Doc pages that are *older* than current, and exist in the current
version have canonical URL /docs/current/static/pagename.html
2. If it doesn't exist in current, we link to the last version that
includes this page, like /docs/8.4/static/install-win32.html
3. Newer versions (devel/beta) should perhaps point to themselves and not
/current/? This would make new features googleable for testers. The
doc links use rel=nofollow when linking to them, so they're already
ranked lower by search engines.
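
The three rules above could be sketched roughly as follows. This is only
an illustration, not code from the website: the function name, the
`pages` mapping (version -> set of page names) and the `order` list
(versions from oldest to newest) are hypothetical. It also applies rule 1
to the current version itself, as in the original suggestion
(9.3 -> current).

```python
from typing import Dict, List, Set


def canonical_path(version: str, page: str,
                   pages: Dict[str, Set[str]],
                   order: List[str], current: str) -> str:
    """Return a site-relative canonical path for one docs page.

    `order` lists versions oldest to newest; anything after `current`
    is treated as devel/beta. All names here are hypothetical.
    """
    cur_idx = order.index(current)
    # Rule 3: devel/beta pages point to themselves.
    if order.index(version) > cur_idx:
        return f"/docs/{version}/static/{page}"
    # Rule 1: the page still exists in the current version.
    if page in pages[current]:
        return f"/docs/current/static/{page}"
    # Rule 2: otherwise use the newest released version that has it.
    for v in reversed(order[:cur_idx]):
        if page in pages.get(v, set()):
            return f"/docs/{v}/static/{page}"
    # Fallback: page not found anywhere, so keep it self-referential.
    return f"/docs/{version}/static/{page}"
```

The paths are returned without a hostname, so the question of absolute
versus relative URLs can be decided separately.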

It appears there are already lots of places that hardcode the
http://www.postgresql.org/ URL, so does it make sense to use absolute
URLs for canonical too?

Did I miss anything?

Regards,
Marti


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Antony <antony(at)cantoute(dot)com>, PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
Date: 2014-10-07 17:14:31
Message-ID: 20141007171431.GO7043@eldon.alvh.no-ip.org
Lists: pgsql-docs pgsql-www

Marti Raudsepp wrote:

> Another question is whether we should make "interactive" point to
> "static" -- again, actually the interactive one is the superset, since
> static doesn't include user comments. But do we care about search
> engines indexing comments anyway? They're not present in sitemap.xml
> either and I've never landed on the interactive version when coming from Google.

Please see this thread:
http://www.postgresql.org/message-id/CABUevEySZgGdTaKJz=DYoYYkPqhV2Pi4RAeY2vLTsAGV0me3Ug@mail.gmail.com

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Antony <antony(at)cantoute(dot)com>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>
Subject: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
Date: 2015-01-06 17:59:01
Message-ID: 20150106175901.GA17824@momjian.us
Lists: pgsql-docs pgsql-www

On Thu, Apr 3, 2014 at 08:32:21PM +0200, Antony wrote:
>
>
>
> May I suggest adding a link rel=canonical to the documentation pages for the current version (now 9.3)?
>
> Ex:
> http://www.postgresql.org/docs/9.3/static/
>
> could have
>
> <link rel="canonical" href="http://www.postgresql.org/docs/current/static/" />
>
> I believe this could help Google offer the current documentation as its first choice.
> Right now it probably treats these pages as duplicate content.
>
> This is probably not the right place to raise this, but I couldn't work out who to send it to.

Have we made this update to use "canonical" for our web docs?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Antony <antony(at)cantoute(dot)com>, PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [DOCS] suggestion about SEO on www.postgresql.org/docs
Date: 2015-01-10 15:06:24
Message-ID: 54B13FF0.8000604@kaltenbrunner.cc
Lists: pgsql-docs pgsql-www

On 10/07/2014 06:46 PM, Marti Raudsepp wrote:
> On Wed, Aug 27, 2014 at 6:00 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> Are we using the rel="canonical" suggestion in our web docs now?
>
> Apparently not. I looked into this and I'm not 100% certain we should
> do it. But if we decide so, I'm willing to code up a patch.
>
> https://tools.ietf.org/html/rfc6596 states:
> ==== 8< ====
> The target (canonical) IRI MUST identify content that is either
> duplicative or a superset of the content at the context (referring)
> IRI. Authors who declare the canonical link relation ought to
> anticipate that applications such as search engines can:
>
> o Index content only from the target IRI (i.e., content from the
> context IRIs will be likely disregarded as duplicative).
>
> o Consolidate IRI properties, such as link popularity, to the target
> IRI.
>
> o Display the target IRI as the representative IRI.
> ==== 8< ====
>
> We certainly want property 2, but property 1 suggests that older
> versions of docs are dropped from search engines altogether. It's not
> clear whether they are that strict in reality -- does anyone know?
>
> This would not be a problem if the current version also retained notes
> about earlier supported versions, which would make our latest version a
> "superset" of the earlier ones.
>
> But I believe we very rarely remove material from the docs, so I think
> the upsides outweigh the downsides.

I'm not sure how search engines really behave here. Don't we have any
SEO experts on the list who can shed some light on this?

>
> ----
> Another question is whether we should make "interactive" point to
> "static" -- again, actually the interactive one is the superset, since
> static doesn't include user comments. But do we care about search
> engines indexing comments anyway? They're not present in sitemap.xml
> either and I've never landed on the interactive version when coming from Google.
>
> My proposal:
> 1. Doc pages that are *older* than current, and exist in the current
> version have canonical URL /docs/current/static/pagename.html
> 2. If it doesn't exist in current, we link to the last version that
> includes this page, like /docs/8.4/static/install-win32.html
> 3. Newer versions (devel/beta) should perhaps point to themselves and not
> /current/? This would make new features googleable for testers. The
> doc links use rel=nofollow when linking to them, so they're already
> ranked lower by search engines.
>
> It appears there are already lots of places that hardcode the
> http://www.postgresql.org/ URL, so does it make sense to use absolute
> URLs for canonical too?

I would actually strongly prefer to _NOT_ use even more absolute URLs on
the website, for multiple reasons. One is that it will make moving the
website to https-only more difficult; the other is that it makes
playing with your own copy of it (running under a different URL) a pain.
I actually did a round of cleanups the other day (mostly on the
presskit) to remove some of the hardcoded URLs.
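
To illustrate the point, here is a minimal sketch using Python's standard
urllib.parse.urljoin. It shows how a root-relative href resolves against
whatever URL the page is served from, so the same markup follows the
scheme and host of each copy of the site. The helper name and the
localhost URL are made up for the example, and whether search engines
treat a relative canonical href the same way as an absolute one is a
separate question this does not answer.

```python
from urllib.parse import urljoin


def resolve_canonical(page_url: str, href: str) -> str:
    """Resolve a canonical href the way a client resolves any relative link."""
    return urljoin(page_url, href)


# Hypothetical root-relative canonical target, with no scheme or host.
HREF = "/docs/current/static/sql-select.html"

# The same href served from an https site and from a local copy.
print(resolve_canonical(
    "https://www.postgresql.org/docs/9.3/static/sql-select.html", HREF))
print(resolve_canonical(
    "http://localhost:8000/docs/9.3/static/sql-select.html", HREF))
```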

Stefan