Lists: | pgsql-hackers |
---|
From: | Kuba Ouhrabka <kuba(at)comgate(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Vacuum and oldest xmin (again) |
Date: | 2004-11-04 08:31:05 |
Message-ID: | cmcpc9$1r1t$1@news.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
we use 7.4 and suffer from table and index bloat. I tracked down the
issue to vacuum using GetOldestXmin() which returns the oldest xmin of
any database in cluster, not xmin of the current database. Yes, I've
read the thread about this issue called "Problems Vacuumi'ng" - April 2004
But our situation is not caused by misbehaved clients. Let me explain.
We're running several big OLTP databases on the server. Typical
transaction is insert/update/delete and takes less than second. Up to
this point everything is fine. But there are also tasks like backup or
initial data loading that take much more time than a second - sometimes
an hour or more. And during this time _all_ our databases on the
server are not properly vacuumed because there is a long running
transaction in some other database. This is really big problem for
us. We're trying to avoid long running transactions but backup and
initial data loading are essential tasks. The only solution I can see
now, is to have several database clusters on the server in order to have
completly separated databases...
Is there something I'm missing? What can I do for correct vacuuming when
there is a long running transaction in some other database?
My suggestion is to add some more logic to vacuum to get correct oldest
xmin - local to current database. I don't want to add extra overhead to
all transactions but only to vacuum. I have no knowledge of Pg internals
but something like adding txn number to pg_stat_activity and then
checking this view (if it is enabled)... Would this be possible?
Bottom line: GetOldestXmin() is also called from Slony and also leads to
unexpected behaviour (at least for me) - replication is waiting for
transactions in completly unrelated databases.
Thanks, Kuba
From: | Andrew Sullivan <ajs(at)crankycanuck(dot)ca> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 14:12:24 |
Message-ID: | 20041104141224.GC23219@phlogiston.dyndns.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Nov 04, 2004 at 09:31:05AM +0100, Kuba Ouhrabka wrote:
> initial data loading are essential tasks. The only solution I can see
> now, is to have several database clusters on the server in order to have
> completly separated databases...
We actually do that, for the reasons you say, plus because it gives
us a certain degree of separability (and because it allows us to tune
the caches more effectively for each type of system).
> My suggestion is to add some more logic to vacuum to get correct oldest
> xmin - local to current database.
I think the problem is that the xids are in fact global values. This
is, importantly, why you get messages about not having vacuumed in a
long time in case you have a database which is not in your regular
vacuum regimen. I have my doubts that the idea of the xids "local to
current database" is even a coherent idea in Postgres, but I may be
wrong (in which case someone is bound to correct me).
A
--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
When my information changes, I alter my conclusions. What do you do sir?
--attr. John Maynard Keynes
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Kuba Ouhrabka <kuba(at)comgate(dot)cz> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 15:00:23 |
Message-ID: | 25280.1099580423@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Kuba Ouhrabka <kuba(at)comgate(dot)cz> writes:
> My suggestion is to add some more logic to vacuum to get correct oldest
> xmin - local to current database.
If you read the code a little more closely, you'd see that it already does.
regards, tom lane
From: | Andrew Sullivan <ajs(at)crankycanuck(dot)ca> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 15:29:54 |
Message-ID: | 20041104152954.GA23471@phlogiston.dyndns.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Nov 04, 2004 at 10:00:23AM -0500, Tom Lane wrote:
>
> If you read the code a little more closely, you'd see that it already does.
Hmm, so obviously I was confused in my other message. But I've seen
the same sort of effect as the OP: transactions in another database
on the same back end seem to prevent some recovery by vacuum in the
local back end. Is this just an illusion? (I can probably chalk it
up to a later-completed transaction in the same back end, of course.
I don't know if the same is true for the OP.)
A
--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.
--Dennis Ritchie
From: | Kuba Ouhrabka <kuba(at)comgate(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 15:43:22 |
Message-ID: | cmdimp$8mn$1@news.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
>>My suggestion is to add some more logic to vacuum to get correct oldest
>>xmin - local to current database.
>
>
> If you read the code a little more closely, you'd see that it already does.
>
> regards, tom lane
Could you pls tell what has changed since 7.4? I'm not able to find it...
Thanks Kuba
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Sullivan <ajs(at)crankycanuck(dot)ca> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 15:48:09 |
Message-ID: | 25747.1099583289@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Andrew Sullivan <ajs(at)crankycanuck(dot)ca> writes:
> On Thu, Nov 04, 2004 at 10:00:23AM -0500, Tom Lane wrote:
>> If you read the code a little more closely, you'd see that it already does.
> Hmm, so obviously I was confused in my other message. But I've seen
> the same sort of effect as the OP: transactions in another database
> on the same back end seem to prevent some recovery by vacuum in the
> local back end. Is this just an illusion?
I think it's most likely that there were also old transactions in the
current database. Only the shared tables (pg_shadow, pg_database,
pg_group) are vacuumed using a cutoff that depends on non-local
transactions.
Looking at the back versions, it appears this logic was put in in 7.2;
is it possible you are remembering the behavior of older versions?
regards, tom lane
From: | Kuba Ouhrabka <kuba(at)comgate(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 15:57:28 |
Message-ID: | cmdjh6$e4s$1@news.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
> I think it's most likely that there were also old transactions in the
> current database. Only the shared tables (pg_shadow, pg_database,
> pg_group) are vacuumed using a cutoff that depends on non-local
> transactions.
in my case, there are really no old transactions in current database.
> Looking at the back versions, it appears this logic was put in in 7.2;
> is it possible you are remembering the behavior of older versions?
And it's on 7.4...
The problem is fully described in thread I mentioned earlier, Tom's
excellent explanation can be found here:
Kuba
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Kuba Ouhrabka <kuba(at)comgate(dot)cz> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 16:36:48 |
Message-ID: | 26222.1099586208@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Kuba Ouhrabka <kuba(at)comgate(dot)cz> writes:
> The problem is fully described in thread I mentioned earlier, Tom's
> excellent explanation can be found here:
> http://groups.google.com/groups?hl=cs&lr=&frame=right&th=5227028cb3449572&seekm=11390.1080964720%40sss.pgh.pa.us#link14
Oh, that thing. Well, my opinion has not changed since April --- I
don't think the problem justifies slowing down every transaction start.
regards, tom lane
From: | Kuba Ouhrabka <kuba(at)comgate(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 16:45:28 |
Message-ID: | cmdmbb$v40$1@news.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
>>The problem is fully described in thread I mentioned earlier, Tom's
>>excellent explanation can be found here:
>>http://groups.google.com/groups?hl=cs&lr=&frame=right&th=5227028cb3449572&seekm=11390.1080964720%40sss.pgh.pa.us#link14
>
>
> Oh, that thing. Well, my opinion has not changed since April --- I
> don't think the problem justifies slowing down every transaction start.
My question was if the slowdown of every transaction start is really
needed - If it is possible to find some other solution, for example add
more logic to vacuum to pick up correct txn id (or little bit more
"correct"). But unfortunately I have no idea how to do this because of
my limited knowledge of Pg internals, the diffucult part I guess are the
shared relations...
Kuba
From: | Andrew Sullivan <ajs(at)crankycanuck(dot)ca> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Vacuum and oldest xmin (again) |
Date: | 2004-11-04 16:53:16 |
Message-ID: | 20041104165316.GB23471@phlogiston.dyndns.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Nov 04, 2004 at 10:48:09AM -0500, Tom Lane wrote:
> Looking at the back versions, it appears this logic was put in in 7.2;
> is it possible you are remembering the behavior of older versions?
Quite likely, in fact. Thanks for clearing that up.
A
--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
The plural of anecdote is not data.
--Roger Brinner