Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"

Lists: pgsql-admin
From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To: pgsql-admin(at)postgresql(dot)org
Subject: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 13:18:20
Message-ID: 4BC8639C.4010202@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Hi all,

I'm sorry for the urgency of the question. (We have a customer whose DB
is "down" since 36 hours and business operations are compromised. Thank
you for your help.)

*Background:*
- PostgreSQL 8.1 on Windows Server
- The customer has disabled regular VACUUM jobs for backup to be taken,
a year or two ago.
- They didn't told us (as far as I can remember).
- Wednesday morning at 10:55:50: database is shut down to avoid
wraparound data loss in database *db*

*What has been done:*
- The message requested a VACUUM FULL so we stopped the postmaster and
started postgres.exe to launch a VACUUM FULL.
- During the night an employee of our client has stop (CTRL+C) and
restarted many many times the VACUUM FULL (trying to see the progress of
it).
- So yesterday morning, knowing that we gave instructions to let the job
go without interruptions, which they did.
- It worked for about 24 hours now, and we don't see the end of it. The
DB folder is now 38 GB (original DB was probably around 7GB of real data
- but these were the numbers two ago).

*What can we do?*
- 1.1 Can we stop the VACUUM FULL (CTRL+C in the shell) and start
postmaster again? So the company can continue working and then continue
the VACUUM FULL during the weekend?
- 1.2 Will the transactions to avoid warparound data loss be available
(in part at least) even if we stop the vacuum?
- 2. Could we stop VACUUM FULL and simply restart postmaster and
starting a normal VACUUM even if it's slow?
- 3. Is it possible to increase the transactions limit to something
bigger as a temporary solution so that the customer can continue its work?

I feel the pain of my client and understand that they need access to
their data, but I would not like to loose all the cleaning that has been
done. They have couple tables that are couple GB in size. And if
stopping the VACUUM FULL does not give access to available transactions
before the warparound shut down security, well it give nothing to stop
it: that would be worst.

I would need an expert's advice on the question.

Being very grateful for your help,

--
Alexandre Leclerc


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 13:59:03
Message-ID: 4BC86D27.4020209@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc wrote:
> - 2. Could we stop VACUUM FULL and simply restart postmaster and
> starting a normal VACUUM even if it's slow?

This is what you want to do. VACUUM FULL is the slow way--much, much
slower--and it is not needed to clean up from wraparound issues. Here's
two more opinions on that:
http://archives.postgresql.org/message-id/5224.1255142294@sss.pgh.pa.us

And here's some notes about why you should avoid VACUUM FULL:
http://wiki.postgresql.org/wiki/VACUUM_FULL

(Note that you don't have a good CLUSTER implementation in your 8.1
server, so using that to clean up tables isn't an option)

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 14:14:00
Message-ID: 4BC82A5802000025000309D3@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:

> - PostgreSQL 8.1 on Windows Server

That's not a supported environment.

http://www.postgresql.org/about/news.865

They should really be looking at upgrading.

> - The customer has disabled regular VACUUM jobs for backup to be
> taken, a year or two ago.

Ouch. Once you recover from the immediate emergency, you they need
to institute a sane vacuum/analyze policy, probably using some
combination of autovacuum and scheduled database vacuums.

> - Wednesday morning at 10:55:50: database is shut down to avoid
> wraparound data loss in database *db*

> - The message requested a VACUUM FULL

No. It didn't. It said "execute a full-database VACUUM".

> so we stopped the postmaster and started postgres.exe to launch a
> VACUUM FULL.

It was a mistake to use FULL, since that can run for days, and will
bloat indexes. It's almost never the right thing to do.

> - During the night an employee of our client has stop (CTRL+C)
> and restarted many many times the VACUUM FULL (trying to see the
> progress of it).

Each time that's done it will add bloat, making things worse.

> - So yesterday morning, knowing that we gave instructions to let
> the job go without interruptions, which they did.
> - It worked for about 24 hours now, and we don't see the end of
> it. The DB folder is now 38 GB (original DB was probably around
> 7GB of real data - but these were the numbers two ago).

Yeah, you're going to want to clean up all the bloat from these
mis-steps, but you have more immediate issues.

> - 2. Could we stop VACUUM FULL and simply restart postmaster and
> starting a normal VACUUM even if it's slow?

I would do that, but I'm not at all sure it would be safe for anyone
to try to use the database before the VACUUM completes. Once the
database VACUUM completes, they can use the database, but they're
likely to notice it's a bit slow. In the first available
maintenance window after that (e.g., a weekend), I would recommend
that they do a pg_dump of the database and restore it, followed by
VACUUM ANALYZE (again, not FULL). And then they should work out a
plan for an upgrade to a supported version.

> - 3. Is it possible to increase the transactions limit to
> something bigger as a temporary solution so that the customer can
> continue its work?

No.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 14:46:02
Message-ID: 9048.1271429162@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> writes:
> *Background:*
> - PostgreSQL 8.1 on Windows Server
> - The customer has disabled regular VACUUM jobs for backup to be taken,
> a year or two ago.
> - They didn't told us (as far as I can remember).
> - Wednesday morning at 10:55:50: database is shut down to avoid
> wraparound data loss in database *db*

> *What has been done:*
> - The message requested a VACUUM FULL so we stopped the postmaster and
> started postgres.exe to launch a VACUUM FULL.

You misread it. You do NOT need a VACUUM FULL here, you should just run
a plain VACUUM (across the whole database, ie not one per table). It
will take a lot less time.

After you get out of this, you really need to do something about
upgrading from 8.1, or else moving the server off Windows. 8.1 on
Windows is unsupported and is full of known problems for that platform.

regards, tom lane


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To:
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 18:02:11
Message-ID: 4BC8A623.8090201@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Hi all,

I might have a problem of a greater order, but I can't see how to get an
answer. (Indeed the message didn't say anything about VACUUM FULL... I
miss interpreted the message.)

The messages says to VACUUM the database postgres.

When I execute:
postgres -D "D:\my\path" postgres
VACUUM;

I'm always getting:
WARNING: db "template1" must be vacuumed within 999593 transactions
HINT: To avoid... execute a full-database VACUUM in "template1"
... (repeated many times until 999568)

Then I try the same in template1:
postgres -D "D:\my\path" postgres
VACUUM;
WARNING: db "postgres" must be vacuumed within 999*** transactions
HINT: To avoid... execute a full-database VACUUM in "postgres"

I tried VACUUM FREEZE / VACUUM FULL...

At some point I got:
ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed only
to AC/FB9224A8
CONTEXT: writing block 0 of relation 1664/0/1214
WARNING: could not writing block 0 of 1664/0/1214
DETAIL: Multiple failures --- write error may be permanent.

Now my customer tells me they had a power outage last Sunday. Their
might be a HD problem (?).

Is the best solution to dump all / recreate the DB from scratch? What
else can I do?

It looks like the vacuum command does not want to be execute. The other
VACUUM is still progressing on the main database in another postgres.exe
shell.

(If it's the only solution, is it possible to migration from one DB
directly to the other under windows... I don't know about the | command
under "windows cmd".)

Thank you for your help.

Le 2010-04-16 10:46, Tom Lane a écrit :
> Alexandre Leclerc<aleclerc(at)ipso(dot)ca> writes:
>
>> *Background:*
>> - PostgreSQL 8.1 on Windows Server
>> - The customer has disabled regular VACUUM jobs for backup to be taken,
>> a year or two ago.
>> - They didn't told us (as far as I can remember).
>> - Wednesday morning at 10:55:50: database is shut down to avoid
>> wraparound data loss in database *db*
>>
>
>> *What has been done:*
>> - The message requested a VACUUM FULL so we stopped the postmaster and
>> started postgres.exe to launch a VACUUM FULL.
>>
> You misread it. You do NOT need a VACUUM FULL here, you should just run
> a plain VACUUM (across the whole database, ie not one per table). It
> will take a lot less time.
>
> After you get out of this, you really need to do something about
> upgrading from 8.1, or else moving the server off Windows. 8.1 on
> Windows is unsupported and is full of known problems for that platform.
>
> regards, tom lane
>
>

--
Alexandre Leclerc


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 18:10:45
Message-ID: 4BC8A825.3030006@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Hi again,

I also want to mention that maybe I'm not doing it properly.

I started "postgres.exe" and it is inside that "session", "backend>"
prompt, that I did run the VACUUM command. Is it that way or should I
use psql to connect to anything "postgres.exe" would have "done" (like
listening to a port?).

(And by the way, what is the command to quit the backend? I'm doing
Ctrl+C but the DB does not like that...)

Best regards,
Alexandre Leclerc

Le 2010-04-16 10:46, Tom Lane a écrit :
> Alexandre Leclerc<aleclerc(at)ipso(dot)ca> writes:
>
>> *Background:*
>> - PostgreSQL 8.1 on Windows Server
>> - The customer has disabled regular VACUUM jobs for backup to be taken,
>> a year or two ago.
>> - They didn't told us (as far as I can remember).
>> - Wednesday morning at 10:55:50: database is shut down to avoid
>> wraparound data loss in database *db*
>>
>
>> *What has been done:*
>> - The message requested a VACUUM FULL so we stopped the postmaster and
>> started postgres.exe to launch a VACUUM FULL.
>>
> You misread it. You do NOT need a VACUUM FULL here, you should just run
> a plain VACUUM (across the whole database, ie not one per table). It
> will take a lot less time.
>
> After you get out of this, you really need to do something about
> upgrading from 8.1, or else moving the server off Windows. 8.1 on
> Windows is unsupported and is full of known problems for that platform.
>
> regards, tom lane
>
>

--
Alexandre Leclerc


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 18:18:06
Message-ID: 4BC8638E0200002500030A01@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:

> At some point I got:
> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed
> only to AC/FB9224A8
> CONTEXT: writing block 0 of relation 1664/0/1214
> WARNING: could not writing block 0 of 1664/0/1214
> DETAIL: Multiple failures --- write error may be permanent.

You're not running out of disk space where that writes, are you?

> It looks like the vacuum command does not want to be execute. The
> other VACUUM is still progressing on the main database in another
> postgres.exe shell.

Wait -- are you saying you're running two postgres instances against
the same data directory at the same time? (I sure hope not.)

> (If it's the only solution, is it possible to migration from one
> DB directly to the other under windows... I don't know about the |
> command under "windows cmd".)

Yeah, that should work as long as you have the disk space for both
copies.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 18:32:54
Message-ID: 4BC867060200002500030A07@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:

> I also want to mention that maybe I'm not doing it properly.
>
> I started "postgres.exe" and it is inside that "session",
> "backend>" prompt, that I did run the VACUUM command. Is it that
> way

Yes, that's the single-user mode. Just don't run more than one with
the same -D value at the same time; more than one single-user
backend would not be good.

> (And by the way, what is the command to quit the backend? I'm
> doing Ctrl+C but the DB does not like that...)

In Linux it would be Ctrl+D but in Windows I think it's Ctrl+Z

-Kevin


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 18:47:42
Message-ID: 4BC8B0CE.2010401@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Le 2010-04-16 14:18, Kevin Grittner a écrit :
> Alexandre Leclerc<aleclerc(at)ipso(dot)ca> wrote:
>
>
>> At some point I got:
>> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed
>> only to AC/FB9224A8
>> CONTEXT: writing block 0 of relation 1664/0/1214
>> WARNING: could not writing block 0 of 1664/0/1214
>> DETAIL: Multiple failures --- write error may be permanent.
>>
>
> You're not running out of disk space where that writes, are you?
>

There is 32 GB free.

>
>
>> It looks like the vacuum command does not want to be execute. The
>> other VACUUM is still progressing on the main database in another
>> postgres.exe shell.
>>
>
> Wait -- are you saying you're running two postgres instances against
> the same data directory at the same time? (I sure hope not.)
>

I did. :( Shame on me. I just realised while reading doc on postgres
that it is not made for that but only for a single instance at the time.
I hope I did not break anything.

>> (If it's the only solution, is it possible to migration from one
>> DB directly to the other under windows... I don't know about the |
>> command under "windows cmd".)
>>
>
> Yeah, that should work as long as you have the disk space for both
> copies.
>
> -Kevin
>

Thank you Kevin for you time, it's most appreciated.

Right now I'll leave the standard VACUUM complete it's job. Then I'll
vacuum the postgres and template1 DB if required.

Best regards,

--
Alexandre Leclerc


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:05:25
Message-ID: 1271444725.21263.80.camel@jd-desktop.unknown.charter.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

On Fri, 2010-04-16 at 14:47 -0400, Alexandre Leclerc wrote:

> I did. :( Shame on me. I just realised while reading doc on postgres
> that it is not made for that but only for a single instance at the time.
> I hope I did not break anything.

How in the world did you pull that off? PostgreSQL checks for that.
Either way, if you actually managed to start two services against the
same data directory, I hope you have a backup, you can restore from.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <jd(at)commandprompt(dot)com>
Cc: "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:19:37
Message-ID: 4BC871F90200002500030A0D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

"Joshua D. Drake" <jd(at)commandprompt(dot)com> wrote:

> if you actually managed to start two services against the
> same data directory, I hope you have a backup, you can restore
> from.

This is 8.1 under Windows, and he connected to a different database
with each backend. He got errors writing the WAL files, and it
apparently wouldn't let him start a second VACUUM on the other
database. I'm hoping that the initial VACUUM (of the big database)
can continue and the WAL problems will cycle out without corrupting
anything. Is that overly optimistic?

It did occur to me that maybe the manual (and maybe event the
error's hint) should say right up front that once the service stops
a file-based copy of the database should be made (if at all
possible) before proceeding to attempt recovery. Also, the
"full-database vacuum" terminology seems too likely to be
interpreted as VACUUM FULL for best results. Perhaps it's worth
changing that to just "database vacuum" or "vacuum of the entire
database"?

-Kevin


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:20:54
Message-ID: w2pdcc563d11004161220o6a4a4334sf040c37529061052@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

On Fri, Apr 16, 2010 at 12:47 PM, Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:
> Le 2010-04-16 14:18, Kevin Grittner a écrit :
>>
>> Alexandre Leclerc<aleclerc(at)ipso(dot)ca>  wrote:
>>
>>
>>>
>>> At some point I got:
>>> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed
>>> only to AC/FB9224A8
>>> CONTEXT: writing block 0 of relation 1664/0/1214
>>> WARNING: could not writing block 0 of 1664/0/1214
>>> DETAIL: Multiple failures --- write error may be permanent.
>>>
>>
>> You're not running out of disk space where that writes, are you?
>>
>
> There is 32 GB free.
>
>>
>>
>>>
>>> It looks like the vacuum command does not want to be execute. The
>>> other VACUUM is still progressing on the main database in another
>>> postgres.exe shell.
>>>
>>
>> Wait -- are you saying you're running two postgres instances against
>> the same data directory at the same time?  (I sure hope not.)
>>
>
> I did. :( Shame on me. I just realised while reading doc on postgres that it
> is not made for that but only for a single instance at the time. I hope I
> did not break anything.

You've almost certainly corrupted the data store. This is why you
should always make a complete file system backup with the postmaster
stopped before you start doing rescue work.


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To:
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:29:38
Message-ID: 4BC8BAA2.1030507@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Le 2010-04-16 15:20, Scott Marlowe a écrit :
> On Fri, Apr 16, 2010 at 12:47 PM, Alexandre Leclerc<aleclerc(at)ipso(dot)ca> wrote:
>
>> Le 2010-04-16 14:18, Kevin Grittner a écrit :
>>
>>> Alexandre Leclerc<aleclerc(at)ipso(dot)ca> wrote:
>>>
>>>
>>>
>>>> At some point I got:
>>>> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed
>>>> only to AC/FB9224A8
>>>> CONTEXT: writing block 0 of relation 1664/0/1214
>>>> WARNING: could not writing block 0 of 1664/0/1214
>>>> DETAIL: Multiple failures --- write error may be permanent.
>>>>
>>>>
>>> You're not running out of disk space where that writes, are you?
>>>
>>>
>> There is 32 GB free.
>>
>>
>>>
>>>
>>>> It looks like the vacuum command does not want to be execute. The
>>>> other VACUUM is still progressing on the main database in another
>>>> postgres.exe shell.
>>>>
>>>>
>>> Wait -- are you saying you're running two postgres instances against
>>> the same data directory at the same time? (I sure hope not.)
>>>
>>>
>> I did. :( Shame on me. I just realised while reading doc on postgres that it
>> is not made for that but only for a single instance at the time. I hope I
>> did not break anything.
>>
> You've almost certainly corrupted the data store. This is why you
> should always make a complete file system backup with the postmaster
> stopped before you start doing rescue work.
>
>

Hi Scott and Kevin,

Thank you for these comments. But I'm learning the hard way right now
since all this has already been done. I appreaciate the suggestions of
Kevin on the terminology (since I'm not a native English speaking guy)
and complementing the help. I can assure that it would have been clear
if worded like that, for me at least.

Hopefully, our customer is supposed to have a full file backup from the
evening. So they would have loose only 3 hours of work in that respect.
So the plan is that after the vacuum if the DB is not working well, we
start from the backup, 3 hours before the problem, and we run a VACUUM
using pgAdmin with the normal postmaster during the weekend.

Again, thank you very much for all your time and help. It really is
helping and supportive for me (even if I'm making mistakes).

Best regards,

--
Alexandre Leclerc


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: jd(at)commandprompt(dot)com, "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>, pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:44:23
Message-ID: 13971.1271447063@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> "Joshua D. Drake" <jd(at)commandprompt(dot)com> wrote:
>> if you actually managed to start two services against the
>> same data directory, I hope you have a backup, you can restore
>> from.

> This is 8.1 under Windows, and he connected to a different database
> with each backend. He got errors writing the WAL files, and it
> apparently wouldn't let him start a second VACUUM on the other
> database. I'm hoping that the initial VACUUM (of the big database)
> can continue and the WAL problems will cycle out without corrupting
> anything. Is that overly optimistic?

Maybe, but if he doesn't have a recent backup then that's probably the
best thing to try. I'm not actually sure how he would've started two
standalone backends though --- there *is* an interlock against that,
just as there is for two postmasters in the same data directory.
Maybe if he was bullheaded enough to remove the lock file manually :-(

> Also, the
> "full-database vacuum" terminology seems too likely to be
> interpreted as VACUUM FULL for best results. Perhaps it's worth
> changing that to just "database vacuum" or "vacuum of the entire
> database"?

We did change that ...
http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php

regards, tom lane


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: jd(at)commandprompt(dot)com, Alexandre Leclerc <aleclerc(at)ipso(dot)ca>, pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:44:34
Message-ID: 4BC8BE22.5030201@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Kevin Grittner wrote:
> Also, the "full-database vacuum" terminology seems too likely to be
> interpreted as VACUUM FULL for best results. Perhaps it's worth
> changing that to just "database vacuum" or "vacuum of the entire
> database"

http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php

It's "database-wide" now instead of "full-database".

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:44:41
Message-ID: 4BC877D90200002500030A13@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:

> our customer is supposed to have a full file backup from the
> evening.

That's very good news, but given that they've not been going "by the
book" in all respects, it pays to be cautious here. Did they make
the copy while the database service was shut down? If not, did they
follow all the rules for a PITR backup?

http://www.postgresql.org/docs/8.1/interactive/backup.html

In any event, take great care that the backup is not overwritten,
deleted, or *moved* to be used for recovery. You really want to
protect that and keep it until the database has been up and running
for a few weeks, at least.

-Kevin


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 19:56:03
Message-ID: 4BC8C0D3.70406@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Le 2010-04-16 15:44, Tom Lane a écrit :
> "Kevin Grittner"<Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>
>> "Joshua D. Drake"<jd(at)commandprompt(dot)com> wrote:
>>
>>> if you actually managed to start two services against the
>>> same data directory, I hope you have a backup, you can restore
>>> from.
>>>
>
>
>> This is 8.1 under Windows, and he connected to a different database
>> with each backend. He got errors writing the WAL files, and it
>> apparently wouldn't let him start a second VACUUM on the other
>> database. I'm hoping that the initial VACUUM (of the big database)
>> can continue and the WAL problems will cycle out without corrupting
>> anything. Is that overly optimistic?
>>
> Maybe, but if he doesn't have a recent backup then that's probably the
> best thing to try. I'm not actually sure how he would've started two
> standalone backends though --- there *is* an interlock against that,
> just as there is for two postmasters in the same data directory.
> Maybe if he was bullheaded enough to remove the lock file manually :-(
>
>

The backup should work ok. The postmaster was closed every night for
file-backup.

The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped.

Is increasing the number enought to have it continue or other parameters
are required? (Or is there a way in 8.1 to increate the memory for
maintenance?) (Is there a quick hint to calculate the size required?)

Spec of the Server:
- Windows Server 2003 / 32 bits
- 3 GB ram

(Now I understand why an initial DB of 6 GB is now 38 GB: vacuuming has
been stopped and space wasted since!)

As a side question, is it possible to make a pg_dumpall on a DB that
could have been potentially damaged by the two postgres.exe executions
at the same time? (We did play arround with file read-only state in the
/base folder but not in this purpose: it was to make sure the DB was not
read only. Maybe the error message arrived after this manipulation, I
can't remember. But yes the two postgres program executed on the same
"base" folder, but not the same DB.)

Maybe our best solution is start over from the backup.

>> Also, the
>> "full-database vacuum" terminology seems too likely to be
>> interpreted as VACUUM FULL for best results. Perhaps it's worth
>> changing that to just "database vacuum" or "vacuum of the entire
>> database"?
>>
> We did change that ...
> http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php
>
>

That is great.

--
Alexandre Leclerc


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 20:02:27
Message-ID: 14240.1271448147@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> writes:
> I'm always getting:
> WARNING: db "template1" must be vacuumed within 999593 transactions
> HINT: To avoid... execute a full-database VACUUM in "template1"
> ... (repeated many times until 999568)

Yeah, I think you will get that bleat once per table processed, until
you've resolved the problem (which will likely take vacuuming each
database in the installation). Don't sweat it.

regards, tom lane


From: robin <robin(at)edesix(dot)com>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 20:02:34
Message-ID: d1b41cc2bdfea40fdfa28b562ce87a4a@edesix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

You could temporarily increase the fsm size in the postgres configuration
so as to be able to properly map all the free space. I think you're going
to do a dump/restore in due course in order to return the database to
something like it's normal size, at which point (if you're RAM constrained)
you might want to revert that parameter.

Cheers,
Robin

On Fri, 16 Apr 2010 15:56:03 -0400, Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
wrote:
> Le 2010-04-16 15:44, Tom Lane a écrit :
>> "Kevin Grittner"<Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>>
>>> "Joshua D. Drake"<jd(at)commandprompt(dot)com> wrote:
>>>
>>>> if you actually managed to start two services against the
>>>> same data directory, I hope you have a backup, you can restore
>>>> from.
>>>>
>>
>>
>>> This is 8.1 under Windows, and he connected to a different database
>>> with each backend. He got errors writing the WAL files, and it
>>> apparently wouldn't let him start a second VACUUM on the other
>>> database. I'm hoping that the initial VACUUM (of the big database)
>>> can continue and the WAL problems will cycle out without corrupting
>>> anything. Is that overly optimistic?
>>>
>> Maybe, but if he doesn't have a recent backup then that's probably the
>> best thing to try. I'm not actually sure how he would've started two
>> standalone backends though --- there *is* an interlock against that,
>> just as there is for two postmasters in the same data directory.
>> Maybe if he was bullheaded enough to remove the lock file manually :-(
>>
>>
>
> The backup should work ok. The postmaster was closed every night for
> file-backup.
>
> The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped.
>
> Is increasing the number enought to have it continue or other parameters

> are required? (Or is there a way in 8.1 to increate the memory for
> maintenance?) (Is there a quick hint to calculate the size required?)
>
> Spec of the Server:
> - Windows Server 2003 / 32 bits
> - 3 GB ram
>
> (Now I understand why an initial DB of 6 GB is now 38 GB: vacuuming has
> been stopped and space wasted since!)
>
> As a side question, is it possible to make a pg_dumpall on a DB that
> could have been potentially damaged by the two postgres.exe executions
> at the same time? (We did play arround with file read-only state in the
> /base folder but not in this purpose: it was to make sure the DB was not

> read only. Maybe the error message arrived after this manipulation, I
> can't remember. But yes the two postgres program executed on the same
> "base" folder, but not the same DB.)
>
> Maybe our best solution is start over from the backup.
>
>>> Also, the
>>> "full-database vacuum" terminology seems too likely to be
>>> interpreted as VACUUM FULL for best results. Perhaps it's worth
>>> changing that to just "database vacuum" or "vacuum of the entire
>>> database"?
>>>
>> We did change that ...
>> http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php
>>
>>
>
> That is great.
>
> --
> Alexandre Leclerc


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alexandre Leclerc" <aleclerc(at)ipso(dot)ca>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 20:08:03
Message-ID: 4BC87D530200002500030A18@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:

> The vacuum raised a "max_fsm_pages" of 142000 not enought and
> stopped.

That's probably just a warning that it wasn't able to track all the
dead space -- I would expect that. You're going to want to clean up
the bloat anyway. I would try a pg_dumpall at this point. If it
works, you might just be in good shape. I would try hard to keep
the old database when restoring it, and I recommend VACUUM ANALYZE
after restoring it. Then I would do some sanity checks to make sure
it looks like all the data is there and sound.

If it doesn't let you in with normal connections because other
databases are close to wrap-around, you'll have to vacuum those, but
that should be fast. However, before any other vacuums on that
cluster, I would try to copy the database off to some backup medium
(with the service stopped). If you have time, it's a very good idea
anyway.

I hope it all works out.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 20:14:31
Message-ID: 14487.1271448871@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc <aleclerc(at)ipso(dot)ca> writes:
> The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped.

That's just a warning that gets put out at the end of the run. Go on
with vacuuming your other databases. Right now is no time to be
worrying about FSM too small --- you need to get back to a running DB.

regards, tom lane


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To:
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 20:57:34
Message-ID: 4BC8CF3E.7000205@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Le 2010-04-16 16:14, Tom Lane a écrit :
> Alexandre Leclerc<aleclerc(at)ipso(dot)ca> writes:
>
>> The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped.
>>
> That's just a warning that gets put out at the end of the run. Go on
> with vacuuming your other databases. Right now is no time to be
> worrying about FSM too small --- you need to get back to a running DB.
>
> regards, tom lane
>
>

Robin, Tom, Kevin,

Thank you guys. I wanted to rush and vacuum the other tables and try,
but I decided to make a copy. This is actually running. (Enough mistakes
in one day to not take the time to do it.)

After that we try to launch the DB and hopefully it will be working good
enough before the next maintenance. Else, a big week-end is coming.

As for restoring the old DB, if we must go there, we will keep a copy
for sure (the copy we are making right now).

Guys, thank you very much! Your help is invaluable.

Best regards,

--
Alexandre Leclerc


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-16 23:19:47
Message-ID: n2pdcc563d11004161619rbce3a6aamcbb030cea2e0d4a2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

On Fri, Apr 16, 2010 at 2:57 PM, Alexandre Leclerc <aleclerc(at)ipso(dot)ca> wrote:
> Thank you guys. I wanted to rush and vacuum the other tables and try, but I
> decided to make a copy. This is actually running. (Enough mistakes in one
> day to not take the time to do it.)
>
> After that we try to launch the DB and hopefully it will be working good
> enough before the next maintenance. Else, a big week-end is coming.

Don't forget to schedule an upgrade to at least 8.2 since 8.1 is
considered broken and unsupported / unsupportable on windows. 8.3
changes some casting behaviour so you might want to wait until you can
test / fix code to go there, but 8.2 is usually a painless upgrade
from 8.1


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To:
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-19 12:59:19
Message-ID: 4BCC53A7.7050308@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Le 2010-04-16 19:19, Scott Marlowe a écrit :
> On Fri, Apr 16, 2010 at 2:57 PM, Alexandre Leclerc<aleclerc(at)ipso(dot)ca> wrote:
>
>> Thank you guys. I wanted to rush and vacuum the other tables and try, but I
>> decided to make a copy. This is actually running. (Enough mistakes in one
>> day to not take the time to do it.)
>>
>> After that we try to launch the DB and hopefully it will be working good
>> enough before the next maintenance. Else, a big week-end is coming.
>>
> Don't forget to schedule an upgrade to at least 8.2 since 8.1 is
> considered broken and unsupported / unsupportable on windows. 8.3
> changes some casting behaviour so you might want to wait until you can
> test / fix code to go there, but 8.2 is usually a painless upgrade
> from 8.1
>

Thank you Scott for this note.

I wanted to give some feedback on the situation:

- The vacuum completed after about 6-7 hours (by 16:15 PM). We
immediately made a backup before proceeding further. Then we vacuumed
the postgres et template1 databases.

- We were then successful at restarting postmaster and working with the
database. Everything works as expected.

Among all the help received (thank you all) I want to specially thank
Kevin and Tom for their excellent help and time through this urgent
problem we had to fix. I want to underline their detailed and precise
posts that were of great value for acting and taking decisions to
recover the DB. Thank you guys!

Again, thank you to all.

Best regards,

--
Alexandre Leclerc


From: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-19 13:05:33
Message-ID: 4BCC551D.7030102@ipso.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Le 2010-04-19 08:59, Alexandre Leclerc a écrit :
> Le 2010-04-16 19:19, Scott Marlowe a écrit :
>> On Fri, Apr 16, 2010 at 2:57 PM, Alexandre Leclerc<aleclerc(at)ipso(dot)ca>
>> wrote:
>>> Thank you guys. I wanted to rush and vacuum the other tables and
>>> try, but I
>>> decided to make a copy. This is actually running. (Enough mistakes
>>> in one
>>> day to not take the time to do it.)
>>>
>>> After that we try to launch the DB and hopefully it will be working
>>> good
>>> enough before the next maintenance. Else, a big week-end is coming.
>> Don't forget to schedule an upgrade to at least 8.2 since 8.1 is
>> considered broken and unsupported / unsupportable on windows. 8.3
>> changes some casting behaviour so you might want to wait until you can
>> test / fix code to go there, but 8.2 is usually a painless upgrade
>> from 8.1
>
> Thank you Scott for this note.
>
> I wanted to give some feedback on the situation:
>
> - The vacuum completed after about 6-7 hours (by 16:15 PM). We
> immediately made a backup before proceeding further. Then we vacuumed
> the postgres et template1 databases.
>
> - We were then successful at restarting postmaster and working with
> the database. Everything works as expected.
>
> Among all the help received (thank you all) I want to specially thank
> Kevin and Tom for their excellent help and time through this urgent
> problem we had to fix. I want to underline their detailed and precise
> posts that were of great value for acting and taking decisions to
> recover the DB. Thank you guys!
>
> Again, thank you to all.
>
> Best regards,
>

I knew I missed one -- yet not forgotten! **Thank you Greg!** You were
the first to post and it guided me for the immediate action I had to take.

Best regards,

--
Alexandre Leclerc


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alexandre Leclerc <aleclerc(at)ipso(dot)ca>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Date: 2010-04-20 14:28:39
Message-ID: 201004201428.o3KESdb12088@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Alexandre Leclerc wrote:
> Thank you Scott for this note.
>
> I wanted to give some feedback on the situation:
>
> - The vacuum completed after about 6-7 hours (by 16:15 PM). We
> immediately made a backup before proceeding further. Then we vacuumed
> the postgres et template1 databases.
>
> - We were then successful at restarting postmaster and working with the
> database. Everything works as expected.
>
> Among all the help received (thank you all) I want to specially thank
> Kevin and Tom for their excellent help and time through this urgent
> problem we had to fix. I want to underline their detailed and precise
> posts that were of great value for acting and taking decisions to
> recover the DB. Thank you guys!

As another followup item, it would be good for your organization to get
some Postgres training and a commercial support contract from someone
who is paid to help you in such a crisis, or help you long before it
becomes a crisis. These kind of problems make everyone look bad.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com