Lists: | pgsql-performance |
---|
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | long running transactions |
Date: | 2006-10-10 16:10:27 |
Message-ID: | 20061010161027.GA18908@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
While doing a verbose vacuum, I'm constantly hitting things like:
DETAIL: 3606 dead row versions cannot be removed yet.
I believe this is a problem, because I still do have some empty tables
requireing up to 3-400 ms just to check if the table is empty (see
thread "slow queue-like empty table").
If pg_stat_activity.query_start actually is the start time of the
transaction, then we've gotten rid of all the real long-running
transactions. Then again, if pg_stat_activity.query_start actually was
the start time of the transaction, the attribute would have been called
pg_stat_activity.transaction_start, right?
Is there any way to find the longest running transaction?
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 16:23:40 |
Message-ID: | 25404.1160497420@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tobias Brox <tobias(at)nordicbet(dot)com> writes:
> If pg_stat_activity.query_start actually is the start time of the
> transaction,
... but it isn't.
> Is there any way to find the longest running transaction?
Look in pg_locks to see the lowest-numbered transaction ID --- each
transaction will be holding exclusive lock on its own XID. You can
correlate that back to pg_stat_activity via the PID.
regards, tom lane
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tobias Brox <tobias(at)nordicbet(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 16:39:13 |
Message-ID: | 20061010163913.GA8097@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tom Lane - Tue at 12:23:40PM -0400]
> Look in pg_locks to see the lowest-numbered transaction ID --- each
> transaction will be holding exclusive lock on its own XID. You can
> correlate that back to pg_stat_activity via the PID.
Thanks a lot for the quick reply - I've already identified one
long-running transaction.
(I'm not allowed to order by xid, and not allowed to cast it to
anything, how come?)
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 16:41:42 |
Message-ID: | 20061010164142.GA5360@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tobias Brox - Tue at 06:39:13PM +0200]
> Thanks a lot for the quick reply - I've already identified one
> long-running transaction.
belonging to autovacuum ... how come?
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 16:42:52 |
Message-ID: | 25631.1160498572@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tobias Brox <tobias(at)nordicbet(dot)com> writes:
> [Tobias Brox - Tue at 06:39:13PM +0200]
>> Thanks a lot for the quick reply - I've already identified one
>> long-running transaction.
> belonging to autovacuum ... how come?
Blocked on someone else's lock, maybe?
regards, tom lane
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tobias Brox <tobias(at)nordicbet(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 17:06:39 |
Message-ID: | 20061010170639.GB5360@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tom Lane - Tue at 12:42:52PM -0400]
> > belonging to autovacuum ... how come?
>
> Blocked on someone else's lock, maybe?
hardly, the autovacuum is the only one having such a low transaction id,
and also the only one hanging around when waiting a bit and rechecking
the pg_locks table.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 17:09:52 |
Message-ID: | 25953.1160500192@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tobias Brox <tobias(at)nordicbet(dot)com> writes:
>> Blocked on someone else's lock, maybe?
> hardly, the autovacuum is the only one having such a low transaction id,
> and also the only one hanging around when waiting a bit and rechecking
> the pg_locks table.
Hmph. Is the autovac process actually doing anything (strace would be
revealing)? If not, can you attach to the autovac process with gdb and
get a stack trace to see where it's blocked?
regards, tom lane
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tobias Brox <tobias(at)nordicbet(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 17:13:04 |
Message-ID: | 20061010171304.GC5360@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tom Lane - Tue at 01:09:52PM -0400]
> Hmph. Is the autovac process actually doing anything (strace would be
> revealing)? If not, can you attach to the autovac process with gdb and
> get a stack trace to see where it's blocked?
Sorry ... I SIGINT'ed it, and now it's gone :-( I thought reloading the
config would restart autovacuum. Well, whatever, we still have the
nightly vacuum crontab.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 17:18:27 |
Message-ID: | 26123.1160500707@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tobias Brox <tobias(at)nordicbet(dot)com> writes:
> [Tom Lane - Tue at 01:09:52PM -0400]
>> Hmph. Is the autovac process actually doing anything (strace would be
>> revealing)? If not, can you attach to the autovac process with gdb and
>> get a stack trace to see where it's blocked?
> Sorry ... I SIGINT'ed it, and now it's gone :-( I thought reloading the
> config would restart autovacuum.
It'll come back after the autovacuum naptime. If it gets stuck again,
please investigate.
regards, tom lane
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tobias Brox <tobias(at)nordicbet(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 17:49:55 |
Message-ID: | 20061010174955.GA284@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tom Lane - Tue at 01:18:27PM -0400]
> >> Hmph. Is the autovac process actually doing anything (strace would be
> >> revealing)?
It's definitively doing something; mostly reading, but also some few
writes, semops and opens.
> If not, can you attach to the autovac process with gdb and
> >> get a stack trace to see where it's blocked?
(gdb) bt
#0 0xb7c599f8 in select () from /lib/tls/libc.so.6
#1 0x08253c53 in pg_usleep ()
#2 0x0812ee93 in vacuum_delay_point ()
#3 0x0812f2a5 in lazy_vacuum_rel ()
#4 0x0812ef7b in lazy_vacuum_rel ()
#5 0x0812b4b6 in vac_update_relstats ()
#6 0x0812a995 in vacuum ()
#7 0x0818d2ca in autovac_stopped ()
#8 0x0818ceae in autovac_stopped ()
#9 0x0818c848 in autovac_stopped ()
#10 0x0818c4e2 in autovac_start ()
#11 0x08192c11 in PostmasterMain ()
#12 0x08191dcf in PostmasterMain ()
#13 0x081541b1 in main ()
> It'll come back after the autovacuum naptime. If it gets stuck again,
> please investigate.
It seems stuck, has had the same transid for a long while, and the
number of undeletable dead rows in our tables are increasing.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 18:04:55 |
Message-ID: | 528.1160503495@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tobias Brox <tobias(at)nordicbet(dot)com> writes:
> (gdb) bt
> #0 0xb7c599f8 in select () from /lib/tls/libc.so.6
> #1 0x08253c53 in pg_usleep ()
> #2 0x0812ee93 in vacuum_delay_point ()
> #3 0x0812f2a5 in lazy_vacuum_rel ()
> #4 0x0812ef7b in lazy_vacuum_rel ()
> #5 0x0812b4b6 in vac_update_relstats ()
That doesn't look particularly blocked, and if you are seeing
reads/writes too, then it's doing something.
> It seems stuck, has had the same transid for a long while, and the
> number of undeletable dead rows in our tables are increasing.
Perhaps you have overly aggressive vacuum cost delay settings?
regards, tom lane
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tobias Brox <tobias(at)nordicbet(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 18:19:53 |
Message-ID: | 20061010181953.GA7584@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tom Lane - Tue at 02:04:55PM -0400]
> > It seems stuck, has had the same transid for a long while, and the
> > number of undeletable dead rows in our tables are increasing.
>
> Perhaps you have overly aggressive vacuum cost delay settings?
Perhaps, though I wouldn't expect it to sleep in the middle of a
transaction - and also, it really did seem to me that it's doing work
rather than only sleeping.
The transaction id for the vacuum process is the same now as when I
wrote the previous email, and the number of dead unremovable rows have
increased steadily.
The settings in effect are:
autovacuum_vacuum_cost_delay = 500
autovacuum_vacuum_cost_limit = 200
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tobias Brox <tobias(at)nordicbet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 18:26:53 |
Message-ID: | 772.1160504813@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tobias Brox <tobias(at)nordicbet(dot)com> writes:
>> Perhaps you have overly aggressive vacuum cost delay settings?
> autovacuum_vacuum_cost_delay = 500
> autovacuum_vacuum_cost_limit = 200
Well, that's going to cause it to sleep half a second after every dozen
or so page I/Os. I think you'd be well advised to reduce the delay.
regards, tom lane
From: | Tobias Brox <tobias(at)nordicbet(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tobias Brox <tobias(at)nordicbet(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: long running transactions |
Date: | 2006-10-10 18:43:54 |
Message-ID: | 20061010184354.GA24098@oppetid.no |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
[Tom Lane - Tue at 02:26:53PM -0400]
> > autovacuum_vacuum_cost_delay = 500
> > autovacuum_vacuum_cost_limit = 200
>
> Well, that's going to cause it to sleep half a second after every dozen
> or so page I/Os. I think you'd be well advised to reduce the delay.
Modified it to 20/250, and it definitively helped. Sorry for the
list verbosity; I should have been able to resolve this myself already
some 2-3 emails ago :-) I wanted a "soft" introduction of autovac in
production, and assumed that it was better to begin with too much sleep
than too little! Well, well.