Mail archive indexes are broken, URLs too

Lists: pgsql-www
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-www(at)postgreSQL(dot)org
Subject: Mail archive indexes are broken, URLs too
Date: 2006-07-16 18:43:09
Message-ID: 20438.1153075389@sss.pgh.pa.us

When Marc fixed the message-boundary pattern and regenerated the
archives, many of the existing messages changed URLs because they
got assigned slightly different numbers. I notice that the archive
search engine hasn't yet tracked this change --- if you do a search
and click on a link to a message, you'll arrive at a message close
to the one you want but probably not quite it.

Regenerating the archive indexes is presumably not hard, but there's
a bigger problem: for awhile now many of us have been in the habit
of citing old discussions by archive URLs. All those links are now
broken too, and I can't think of any easy way to fix them. And then
there's Google etc.

I wonder if it'd be better to revert the regeneration of the archives,
and only apply the new message-boundary pattern to future messages.

regards, tom lane
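
The renumbering Tom describes can be illustrated with a small sketch. It assumes, hypothetically, that the archive script numbers messages sequentially in the order a boundary regex finds them in an mbox file; the patterns, addresses, and function name below are invented for illustration and are not the actual archive script.

```python
import re

# Hypothetical mbox fragment: the second message comes from a local sender
# with no "@" in the envelope address.
RAW = (
    "From alice@example.org Sat Jul  1 10:00:00 2006\n"
    "Subject: first\n\nhello\n"
    "From bob Sat Jul  1 11:00:00 2006\n"
    "Subject: second\n\nhi\n"
    "From carol@example.org Sat Jul  1 12:00:00 2006\n"
    "Subject: third\n\nbye\n"
)

# Assumed "old" pattern: insists on an "@", so it misses bob's message.
OLD = re.compile(r"^From \S+@\S+ ", re.M)
# Assumed "fixed" pattern: accepts any envelope sender.
NEW = re.compile(r"^From \S+ ", re.M)


def number_messages(raw, boundary):
    """Split raw at each boundary match and number the pieces from 0.

    Returns {number: subject of the first Subject header in the piece}.
    """
    starts = [m.start() for m in boundary.finditer(raw)]
    ends = starts[1:] + [len(raw)]
    numbered = {}
    for n, (start, end) in enumerate(zip(starts, ends)):
        subject = re.search(r"^Subject: (.*)$", raw[start:end], re.M)
        numbered[n] = subject.group(1) if subject else None
    return numbered


print(number_messages(RAW, OLD))  # bob's message is merged into the first one
print(number_messages(RAW, NEW))  # bob's message gets its own number
```

Under the old pattern the message "third" is number 1; under the fixed pattern number 1 is "second" and "third" moves to 2, so a saved link to number 1 now lands on a neighbouring message.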


From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-www(at)postgreSQL(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-07-16 19:43:43
Message-ID: 20060716164208.Q957@ganymede.hub.org
Lists: pgsql-www

On Sun, 16 Jul 2006, Tom Lane wrote:

> When Marc fixed the message-boundary pattern and regenerated the
> archives, many of the existing messages changed URLs because they
> got assigned slightly different numbers. I notice that the archive
> search engine hasn't yet tracked this change --- if you do a search
> and click on a link to a message, you'll arrive at a message close
> to the one you want but probably not quite it.
>
> Regenerating the archive indexes is presumably not hard, but there's
> a bigger problem: for awhile now many of us have been in the habit
> of citing old discussions by archive URLs. All those links are now
> broken too, and I can't think of any easy way to fix them. And then
> there's Google etc.
>
> I wonder if it'd be better to revert the regeneration of the archives,
> and only apply the new message-boundary pattern to future messages.

Nope, for one simple reason ... if, for some reason, at some point in the
future, we have to regenerate everything anyway (ie. the last time we did
a major template change for the archives), all the #'ng is going to end up
reverting back to what it is now ... so we'd only be 'delaying the
inevitable' ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy(at)hub(dot)org MSN . scrappy(at)hub(dot)org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664


From: Jim Nasby <jnasby(at)pervasive(dot)com>
To: Marc G(dot) Fournier <scrappy(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)postgreSQL(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-07-19 18:12:06
Message-ID: FE70E4F8-2F5D-4110-AA6A-690A177552BC@pervasive.com
Lists: pgsql-www

On Jul 16, 2006, at 2:43 PM, Marc G. Fournier wrote:
> On Sun, 16 Jul 2006, Tom Lane wrote:
>> When Marc fixed the message-boundary pattern and regenerated the
>> archives, many of the existing messages changed URLs because they
>> got assigned slightly different numbers. I notice that the archive
>> search engine hasn't yet tracked this change --- if you do a search
>> and click on a link to a message, you'll arrive at a message close
>> to the one you want but probably not quite it.
>>
>> Regenerating the archive indexes is presumably not hard, but there's
>> a bigger problem: for awhile now many of us have been in the habit
>> of citing old discussions by archive URLs. All those links are now
>> broken too, and I can't think of any easy way to fix them. And then
>> there's Google etc.
>>
>> I wonder if it'd be better to revert the regeneration of the
>> archives,
>> and only apply the new message-boundary pattern to future messages.
>
> Nope, for one simple reason ... if, for some reason, at some point
> in the future, we have to regenerate everything anyway (ie. the
> last time we did a major template change for the archives), all the
> #'ng is going to end up reverting back to what it is now ... so
> we'd only be 'delaying the inevitable' ...

This is a problem for most mailing lists, but I think it's a critical
one for us since we depend very, very heavily on the archives.

Can we change the lists so that they will generate a UUID and add it
to message headers, and then allow the archive software to key off of
that?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
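
Jim's suggestion above amounts to deriving the archive key from something stored in the message itself rather than from its position. A minimal sketch follows, assuming a hypothetical `X-Archive-UUID` header added by the list software, with the standard Message-ID header as a fallback; the header name, key length, and function name are illustrative assumptions, not anything the list software is known to do.

```python
import hashlib
from email import message_from_string


def stable_key(raw_message):
    """Return a short key that depends only on the message's own headers."""
    msg = message_from_string(raw_message)
    # "X-Archive-UUID" is a hypothetical header; Message-ID is the fallback.
    ident = msg["X-Archive-UUID"] or msg["Message-ID"]
    if ident is None:
        raise ValueError("message has no stable identifier")
    digest = hashlib.sha1(ident.strip().encode("utf-8")).hexdigest()
    return digest[:16]


example = (
    "Message-ID: <1234.5678@example.org>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(stable_key(example))
```

Because the key is computed from a header, regenerating the archive with a different boundary pattern or template would not change it, as long as the header survives.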


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
Cc: pgsql-www(at)postgresql(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-07-29 15:39:00
Message-ID: 200607291539.k6TFd0B14602@momjian.us
Lists: pgsql-www

Tom Lane wrote:
> When Marc fixed the message-boundary pattern and regenerated the
> archives, many of the existing messages changed URLs because they
> got assigned slightly different numbers. I notice that the archive
> search engine hasn't yet tracked this change --- if you do a search
> and click on a link to a message, you'll arrive at a message close
> to the one you want but probably not quite it.
>
> Regenerating the archive indexes is presumably not hard, but there's
> a bigger problem: for awhile now many of us have been in the habit
> of citing old discussions by archive URLs. All those links are now
> broken too, and I can't think of any easy way to fix them. And then
> there's Google etc.
>
> I wonder if it'd be better to revert the regeneration of the archives,
> and only apply the new message-boundary pattern to future messages.

Agreed. There have been no changes since we discussed this.

The best proposal was to renumber the newly-found items to the end of
the numeric range for the pre-July 2006 archives, and to properly number
July 2006 and later archives. And this date range has to be embedded in
the archive script so if it is ever run again, this behavior continues
to happen.

The longer we take to fix this, the more likely it is that people are
creating URLs that refer to the existing pre-July 2006 numbering, which
should change. It needs to be fixed quickly.

And we can't just leave it alone because old archive emails have URLs
that point to now-incorrect numbers, and there is no good way to fix
that everywhere our emails are archived.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
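
The proposal in Bruce's message can be sketched roughly as follows. This assumes, hypothetically, that each archive period is numbered on its own, that the script can identify messages by some id, and that the numbering in use before the regeneration is available as a mapping; the names `CUTOFF`, `assign_numbers`, and `legacy_numbers` are invented for illustration.

```python
from datetime import date

# Cutoff taken from the thread ("pre-July 2006"); hard-coded so that any
# future regeneration behaves the same way.
CUTOFF = date(2006, 7, 1)


def assign_numbers(period_start, message_ids, legacy_numbers):
    """Number the messages of one archive period.

    message_ids:    ids in the order the corrected splitter finds them.
    legacy_numbers: {id: number} as assigned before the regeneration.
    """
    if period_start >= CUTOFF:
        # July 2006 and later: number properly, in found order.
        return {mid: n for n, mid in enumerate(message_ids)}

    # Older periods: every previously known message keeps its old number.
    numbers = {
        mid: legacy_numbers[mid]
        for mid in message_ids
        if mid in legacy_numbers
    }
    # Newly found messages are appended after the old numeric range.
    next_free = max(legacy_numbers.values(), default=-1) + 1
    for mid in message_ids:
        if mid not in numbers:
            numbers[mid] = next_free
            next_free += 1
    return numbers


legacy = {"a": 0, "c": 1}
found = ["a", "b", "c"]  # "b" was missed by the old boundary pattern
print(assign_numbers(date(2006, 6, 1), found, legacy))
print(assign_numbers(date(2006, 7, 1), found, legacy))
```

For the June 2006 period, "a" and "c" keep numbers 0 and 1 and the newly found "b" becomes 2, so old links stay valid; for July 2006 the messages are simply numbered in order.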


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: PostgreSQL www <pgsql-www(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-01 19:11:26
Message-ID: 200608011911.k71JBQl12986@momjian.us
Lists: pgsql-www


Is anyone working on this? Marc? If not, who can make these
modifications to the archive numbering?


--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-01 20:06:02
Message-ID: 44CFB42A.5060706@commandprompt.com
Lists: pgsql-www

Bruce Momjian wrote:
> Is anyone working on this? Marc? If not, who can make these
> modifications to the archive numbering?

I believe Marc is the only one who can. Last I heard on this, he
disagreed with rolling back the change.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-01 21:36:29
Message-ID: 200608012136.k71LaTe11804@momjian.us
Lists: pgsql-www

Joshua D. Drake wrote:
> Bruce Momjian wrote:
> > Is anyone working on this? Marc? If not, who can make these
> > modifications to the archive numbering?
>
> I believe Marc is the only one who can. Last I heard on this, he
> disagreed with rolling back the change.

I have heard no reason he doesn't like the change, and unless he can
convince most of us, it is time to make the change.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-01 21:39:26
Message-ID: 44CFCA0E.8020002@commandprompt.com
Lists: pgsql-www

Bruce Momjian wrote:
> Joshua D. Drake wrote:
>> Bruce Momjian wrote:
>>> Is anyone working on this? Marc? If not, who can make these
>>> modifications to the archive numbering?
>> I believe Marc is the only one who can. Last I heard on this, he
>> disagreed with rolling back the change.
>
> I have heard no reason he doesn't like the change, and unless he can
> convince most of us, it is time to make the change.
>

Marc wrote:

Nope, for one simple reason ... if, for some reason, at some point in
the future, we have to regenerate everything anyway (ie. the last time
we did a major template change for the archives), all the #'ng is going
to end up reverting back to what it is now ... so we'd only be 'delaying
the inevitable' ...

On July 17th.

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>, pgsql-www(at)postgresql(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-02 01:44:26
Message-ID: 200608020144.k721iQU17642@momjian.us
Lists: pgsql-www


> Marc wrote:
>
>
> Nope, for one simple reason ... if, for some reason, at some point in
> the future, we have to regenerate everything anyway (ie. the last time
> we did a major template change for the archives), all the #'ng is going
> to end up reverting back to what it is now ... so we'd only be 'delaying
> the inevitable' ...
>
> On July 17th.
>
> Joshua D. Drake

If you look below you will see my idea was to hack the script to always
use the method of putting newly found items numerically at the end for
pre-July 2006 dumps. That addresses Marc's concern.

Marc hasn't responded so I assume he is busy and will hack on this when
he gets back.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Tom Lane wrote:
> > When Marc fixed the message-boundary pattern and regenerated the
> > archives, many of the existing messages changed URLs because they
> > got assigned slightly different numbers. I notice that the archive
> > search engine hasn't yet tracked this change --- if you do a search
> > and click on a link to a message, you'll arrive at a message close
> > to the one you want but probably not quite it.
> >
> > Regenerating the archive indexes is presumably not hard, but there's
> > a bigger problem: for awhile now many of us have been in the habit
> > of citing old discussions by archive URLs. All those links are now
> > broken too, and I can't think of any easy way to fix them. And then
> > there's Google etc.
> >
> > I wonder if it'd be better to revert the regeneration of the archives,
> > and only apply the new message-boundary pattern to future messages.
>
> Agreed. There have been no changes since we discussed this.
>
> The best proposal was to renumber the newly-found items to the end of
> the numeric range for the pre-July 2006 archives, and to properly number
> July 2006 and later archives. And this date range has to be embedded in
> the archive script so if it is ever run again, this behavior continues
> to happen.
>
> The longer we take to fix this, the more likely it is that people are
> creating URLs that refer to the existing pre-July 2006 numbering, which
> should change. It needs to be fixed quickly.
>
> And we can't just leave it alone because old archive emails have URLs
> that point to now-incorrect numbers, and there is no good way to fix
> that everywhere our emails are archived.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)postgresql(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-02 02:09:11
Message-ID: 20060801230814.H1188@ganymede.hub.org
Lists: pgsql-www

On Tue, 1 Aug 2006, Bruce Momjian wrote:

>
>> Marc wrote:
>>
>>
>> Nope, for one simple reason ... if, for some reason, at some point in
>> the future, we have to regenerate everything anyway (ie. the last time
>> we did a major template change for the archives), all the #'ng is going
>> to end up reverting back to what it is now ... so we'd only be 'delaying
>> the inevitable' ...
>>
>> On July 17th.
>>
>> Joshua D. Drake
>
> If you look below you will see my idea was to hack the script to always
> use the method of putting newly found items numerically at the end for
> pre-July 2006 dumps. That addresses Marc's concern.
>
> Marc hasn't responded so I assume he is busy and will hack on this when
> he gets back.

Yup, been busy dealing with an Adaptec driver issue, will try and get
something hacked up over the coming weekend, sorry for the delay ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy(at)hub(dot)org MSN . scrappy(at)hub(dot)org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)postgresql(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-02 03:06:35
Message-ID: 200608020306.k7236ZE29636@momjian.us
Lists: pgsql-www

Marc G. Fournier wrote:
> On Tue, 1 Aug 2006, Bruce Momjian wrote:
>
> >
> >> Marc wrote:
> >>
> >>
> >> Nope, for one simple reason ... if, for some reason, at some point in
> >> the future, we have to regenerate everything anyway (ie. the last time
> >> we did a major template change for the archives), all the #'ng is going
> >> to end up reverting back to what it is now ... so we'd only be 'delaying
> >> the inevitable' ...
> >>
> >> On July 17th.
> >>
> >> Joshua D. Drake
> >
> > If you look below you will see my idea was to hack the script to always
> > use the method of putting newly found items numerically at the end for
> > pre-July 2006 dumps. That addresses Marc's concern.
> >
> > Marc hasn't responded so I assume he is busy and will hack on this when
> > he gets back.
>
> Yup, been busy dealing with an Adaptec driver issue, will try and get
> something hacked up over the coming weekend, sorry for the delay ...

Thanks. When talking via IM I didn't get the sense whether you agreed
that this was a good idea or not, so I figured I should ask on the lists
so others know it is in process.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-www(at)postgresql(dot)org
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-02 03:25:24
Message-ID: 20060802002444.O1188@ganymede.hub.org
Lists: pgsql-www

On Tue, 1 Aug 2006, Bruce Momjian wrote:

> Marc G. Fournier wrote:
>> On Tue, 1 Aug 2006, Bruce Momjian wrote:
>>
>>>
>>>> Marc wrote:
>>>>
>>>>
>>>> Nope, for one simple reason ... if, for some reason, at some point in
>>>> the future, we have to regenerate everything anyway (ie. the last time
>>>> we did a major template change for the archives), all the #'ng is going
>>>> to end up reverting back to what it is now ... so we'd only be 'delaying
>>>> the inevitable' ...
>>>>
>>>> On July 17th.
>>>>
>>>> Joshua D. Drake
>>>
>>> If you look below you will see my idea was to hack the script to always
>>> use the method of putting newly found items numerically at the end for
>>> pre-July 2006 dumps. That addresses Marc's concern.
>>>
>>> Marc hasn't responded so I assume he is busy and will hack on this when
>>> he gets back.
>>
>> Yup, been busy dealing with an Adaptec driver issue, will try and get
>> something hacked up over the coming weekend, sorry for the delay ...
>
> Thanks. When talking via IM I didn't get the sense whether you agreed
> that this was a good idea or not, so I figured I should ask on the lists
> so others know it is in process.

Oh, I still don't think it's a good idea, but understand why ...

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy(at)hub(dot)org MSN . scrappy(at)hub(dot)org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664


From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-09 17:18:14
Message-ID: 20060809141747.C7267@ganymede.hub.org
Lists: pgsql-www


Just shut down rsync while I rebuild the archives for the 'old/new' scheme,
where old is pre-July 2006 ...

will post once it's all been rebuilt ...


----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy(at)hub(dot)org MSN . scrappy(at)hub(dot)org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664


From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-09 19:27:38
Message-ID: 20060809162654.V7267@ganymede.hub.org
Lists: pgsql-www


'k, rsync is back up ... for a short period, part of the archives will
disappear, but a large portion of them has been regenerated, and I figured
I may as well let the 'feed server' start downloading now :)


----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scrappy(at)hub(dot)org MSN . scrappy(at)hub(dot)org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
Cc: PostgreSQL www <pgsql-www(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Mail archive indexes are broken, URLs too
Date: 2006-08-09 19:45:13
Message-ID: 200608091945.k79JjD314853@momjian.us
Lists: pgsql-www


Nice, thanks.


--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +