Re: Lock partitions

Lists: pgsql-hackers
From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Lock partitions
Date: 2006-09-10 23:57:32
Message-ID: 200609102357.k8ANvWH21860@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I see this in the CVS commits for 8.2. Did we determine the proper
number of lock partitions? Should it be based on the number of buffers
or concurrent sessions allowed?

Divide the lock manager's shared state into 'partitions', so as to
reduce contention for the former single LockMgrLock. Per my recent
proposal. I set it up for 16 partitions, but on a pgbench test this
gives only a marginal further improvement over 4 partitions --- we need
to test more scenarios to choose the number of partitions.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Mark Wong <markw(at)osdl(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-11 00:30:30
Message-ID: 5772.1157934630@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> I see this in the CVS commits for 8.2. Did we determine the proper
> number of lock partitions? Should it be based on the number of buffers
> or concurrent sessions allowed?

No. NUM_LOCK_PARTITIONS needs to be a compile-time constant for a
number of reasons, and there is absolutely zero evidence to justify
making any effort (and spending any cycles) on a variable value.

It would be nice to see some results from the OSDL tests with, say, 4,
8, and 16 lock partitions before we forget about the point though.
Anybody know whether OSDL is in a position to run tests for us?

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-11 15:20:35
Message-ID: 45057EC3.7000909@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> I see this in the CVS commits for 8.2. Did we determine the proper
>> number of lock partitions? Should it be based on the number of buffers
>> or concurrent sessions allowed?
>
> No. NUM_LOCK_PARTITIONS needs to be a compile-time constant for a
> number of reasons, and there is absolutely zero evidence to justify
> making any effort (and spending any cycles) on a variable value.
>
> It would be nice to see some results from the OSDL tests with, say, 4,
> 8, and 16 lock partitions before we forget about the point though.
> Anybody know whether OSDL is in a position to run tests for us?

Yeah, I can run some dbt2 tests in the lab. I'll get started on it.
We're still a little bit away from getting the automated testing for
PostgreSQL going again though.

Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-11 15:29:28
Message-ID: 3078.1157988568@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Tom Lane wrote:
>> It would be nice to see some results from the OSDL tests with, say, 4,
>> 8, and 16 lock partitions before we forget about the point though.
>> Anybody know whether OSDL is in a position to run tests for us?

> Yeah, I can run some dbt2 tests in the lab. I'll get started on it.
> We're still a little bit away from getting the automated testing for
> PostgreSQL going again though.

Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in
src/include/storage/lwlock.h. You need a full backend recompile
after changing it, but you shouldn't need to initdb, if that helps.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mark Wong <markw(at)osdl(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-12 08:36:42
Message-ID: 1158050202.2692.444.camel@holly
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2006-09-11 at 11:29 -0400, Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
> > Tom Lane wrote:
> >> It would be nice to see some results from the OSDL tests with, say, 4,
> >> 8, and 16 lock partitions before we forget about the point though.
> >> Anybody know whether OSDL is in a position to run tests for us?
>
> > Yeah, I can run some dbt2 tests in the lab. I'll get started on it.
> > We're still a little bit away from getting the automated testing for
> > PostgreSQL going again though.
>
> Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in
> src/include/storage/lwlock.h. You need a full backend recompile
> after changing it, but you shouldn't need to initdb, if that helps.

IIRC we did that already and the answer was 16...

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Mark Wong <markw(at)osdl(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-12 13:34:55
Message-ID: 14982.1158068095@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Mon, 2006-09-11 at 11:29 -0400, Tom Lane wrote:
>> Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in
>> src/include/storage/lwlock.h. You need a full backend recompile
>> after changing it, but you shouldn't need to initdb, if that helps.

> IIRC we did that already and the answer was 16...

No, no one has shown me any numbers from any "real" tests (anything
more than pgbench on a Dell PC ...).

regards, tom lane


From: "Strong, David" <david(dot)strong(at)unisys(dot)com>
To: "PostgreSQL-development" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-12 15:46:19
Message-ID: B6419AF36AC8524082E1BC17DA2506E80302B541@USMV-EXCH2.na.uis.unisys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

We can pass on what we've seen when running tests here with different
BufMapping and LockMgr partition sizes.

We use a TPC-C inspired benchmark. Currently it is configured to run 25
backend processes. The test runs for 16 minutes as this is the minimum
amount of time we can run and obtain useful information. This gives us
24,000 seconds (25 * 16 * 60) of processing time.

The following timings have been rounded to the nearest second and
represent the amount of time amongst all backend processes to acquire
and release locks. For example, a value of 2500 seconds would mean each
backend process (25) took ~100 seconds to acquire or release a lock.
Although, in reality, the time spent locking or releasing each partition
entry is not uniform and there are some definite hotspot entries. We can
pass on some of the lock output if anyone is interested.

When using 16 buffer and 16 lock partitions, we see that BufMapping
takes 809 seconds to acquire locks and 174 seconds to release locks. The
LockMgr takes 362 seconds to acquire locks and 26 seconds to release
locks.

When using 128 buffer and 128 lock partitions, we see that BufMapping
takes 277 seconds (532 seconds improvement) to acquire locks and 78
seconds (96 seconds improvement) to release locks. The LockMgr takes 235
seconds (127 seconds improvement) to acquire locks and 22 seconds (4
seconds improvement) to release locks.

Overall, 128 BufMapping partitions improves locking/releasing by 678
seconds, 128 LockMgr partitions improves locking/releasing by 131
seconds.

With the improvements in the various locking times, one might expect an
improvement in the overall benchmark result. However, a 16 partition run
produces a result of 198.74 TPS and a 128 partition run produces a
result of 203.24 TPS.

Part of the time saved from BufMapping and LockMgr partitions is
absorbed into the WALInsertLock lock. For a 16 partition run, the total
time to lock/release the WALInsertLock lock is 5845 seconds. For 128
partitions, the WALInsertLock lock takes 6172 seconds, an increase of
327 seconds. Perhaps we have our WAL configured incorrectly?

Other static locks are also affected, but not as much as the
WALInsertLock lock. For example, the ProcArrayLock lock increases from
337 seconds to 348 seconds. The SInvalLock lock increases from 317
seconds to 331 seconds.

Due to expansion of time in other locks, a 128 partition run only spends
403 seconds less in locking than a 16 partition run.

We can generate some OProfile statistics, but most of the time saved is
probably absorbed into functions such as HeapTupleSatisfiesSnapshot and
PinBuffer which seem to have a very high overhead.

David

-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Simon Riggs
Sent: Tuesday, September 12, 2006 1:37 AM
To: Tom Lane
Cc: Mark Wong; Bruce Momjian; PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

On Mon, 2006-09-11 at 11:29 -0400, Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
> > Tom Lane wrote:
> >> It would be nice to see some results from the OSDL tests with, say,
4,
> >> 8, and 16 lock partitions before we forget about the point though.
> >> Anybody know whether OSDL is in a position to run tests for us?
>
> > Yeah, I can run some dbt2 tests in the lab. I'll get started on it.

> > We're still a little bit away from getting the automated testing for

> > PostgreSQL going again though.
>
> Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in
> src/include/storage/lwlock.h. You need a full backend recompile
> after changing it, but you shouldn't need to initdb, if that helps.

IIRC we did that already and the answer was 16...

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Strong, David" <david(dot)strong(at)unisys(dot)com>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-12 16:40:14
Message-ID: 16913.1158079214@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> When using 16 buffer and 16 lock partitions, we see that BufMapping
> takes 809 seconds to acquire locks and 174 seconds to release locks. The
> LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> locks.

> When using 128 buffer and 128 lock partitions, we see that BufMapping
> takes 277 seconds (532 seconds improvement) to acquire locks and 78
> seconds (96 seconds improvement) to release locks. The LockMgr takes 235
> seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> seconds improvement) to release locks.

While I don't see any particular penalty to increasing
NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
significant penalty (increasing PGPROC size as well as the work needed
during LockReleaseAll, which is executed at every transaction end).
I think 128 lock partitions is probably verging on the ridiculous
... particularly if your benchmark only involves touching half a dozen
tables. I'd be more interested in comparisons between 4 and 16 lock
partitions. Also, please vary the two settings independently rather
than confusing the issue by changing them both at once.

> With the improvements in the various locking times, one might expect an
> improvement in the overall benchmark result. However, a 16 partition run
> produces a result of 198.74 TPS and a 128 partition run produces a
> result of 203.24 TPS.

> Part of the time saved from BufMapping and LockMgr partitions is
> absorbed into the WALInsertLock lock. For a 16 partition run, the total
> time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> partitions, the WALInsertLock lock takes 6172 seconds, an increase of
> 327 seconds. Perhaps we have our WAL configured incorrectly?

I fear this throws your entire measurement procedure into question. For
a fixed workload the number of acquisitions of WALInsertLock ought to be
fixed, so you shouldn't see any more contention for WALInsertLock if the
transaction rate didn't change materially.

regards, tom lane


From: "Strong, David" <david(dot)strong(at)unisys(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-12 17:03:19
Message-ID: B6419AF36AC8524082E1BC17DA2506E80302B5E7@USMV-EXCH2.na.uis.unisys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

Thanks for the feedback. We'll run a few tests with differing buffer and
lock partition sizes in the range you're interested in and we'll let you
know what we see.

Our workload is not fixed, however. Our benchmark does not follow the
strict TPC-C guideline of using think times etc. We throw as many
transactions at the database as we can. So, when any time is freed up,
we will fill it with another transaction. We simply want to stress as
much as we can. As one bottleneck is removed, the time saved obviously
flows to the next.

Postgres 8.2 moves some of the time that used to be consumed by single
BufMappingLock and LockMGRLock locks to the WALInsertLock lock. We have
run tests where we made XLogInsert a NOP, because we wanted to see where
the next bottleneck would be, and some of the time occupied by
WALInsertLock lock was absorbed by the SInvalLock lock. We have not
tried to remove the SInvalLock lock to see where time flows to next, but
we might.

David

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Tuesday, September 12, 2006 9:40 AM
To: Strong, David
Cc: PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

"Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> When using 16 buffer and 16 lock partitions, we see that BufMapping
> takes 809 seconds to acquire locks and 174 seconds to release locks.
The
> LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> locks.

> When using 128 buffer and 128 lock partitions, we see that BufMapping
> takes 277 seconds (532 seconds improvement) to acquire locks and 78
> seconds (96 seconds improvement) to release locks. The LockMgr takes
235
> seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> seconds improvement) to release locks.

While I don't see any particular penalty to increasing
NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
significant penalty (increasing PGPROC size as well as the work needed
during LockReleaseAll, which is executed at every transaction end).
I think 128 lock partitions is probably verging on the ridiculous
... particularly if your benchmark only involves touching half a dozen
tables. I'd be more interested in comparisons between 4 and 16 lock
partitions. Also, please vary the two settings independently rather
than confusing the issue by changing them both at once.

> With the improvements in the various locking times, one might expect
an
> improvement in the overall benchmark result. However, a 16 partition
run
> produces a result of 198.74 TPS and a 128 partition run produces a
> result of 203.24 TPS.

> Part of the time saved from BufMapping and LockMgr partitions is
> absorbed into the WALInsertLock lock. For a 16 partition run, the
total
> time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> partitions, the WALInsertLock lock takes 6172 seconds, an increase of
> 327 seconds. Perhaps we have our WAL configured incorrectly?

I fear this throws your entire measurement procedure into question. For
a fixed workload the number of acquisitions of WALInsertLock ought to be
fixed, so you shouldn't see any more contention for WALInsertLock if the
transaction rate didn't change materially.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Strong, David" <david(dot)strong(at)unisys(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-13 09:22:17
Message-ID: 1158139337.2692.597.camel@holly
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2006-09-12 at 12:40 -0400, Tom Lane wrote:
> "Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> > When using 16 buffer and 16 lock partitions, we see that BufMapping
> > takes 809 seconds to acquire locks and 174 seconds to release locks. The
> > LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> > locks.
>
> > When using 128 buffer and 128 lock partitions, we see that BufMapping
> > takes 277 seconds (532 seconds improvement) to acquire locks and 78
> > seconds (96 seconds improvement) to release locks. The LockMgr takes 235
> > seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> > seconds improvement) to release locks.
>
> While I don't see any particular penalty to increasing
> NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
> significant penalty (increasing PGPROC size as well as the work needed
> during LockReleaseAll, which is executed at every transaction end).
> I think 128 lock partitions is probably verging on the ridiculous
> ... particularly if your benchmark only involves touching half a dozen
> tables. I'd be more interested in comparisons between 4 and 16 lock
> partitions. Also, please vary the two settings independently rather
> than confusing the issue by changing them both at once.

Good thinking David. Even if 128 is fairly high, it does seem worth
exploring higher values - I was just stuck in "fewer == better"
thoughts.

> > With the improvements in the various locking times, one might expect an
> > improvement in the overall benchmark result. However, a 16 partition run
> > produces a result of 198.74 TPS and a 128 partition run produces a
> > result of 203.24 TPS.
>
> > Part of the time saved from BufMapping and LockMgr partitions is
> > absorbed into the WALInsertLock lock. For a 16 partition run, the total
> > time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> > partitions, the WALInsertLock lock takes 6172 seconds, an increase of
> > 327 seconds. Perhaps we have our WAL configured incorrectly?
>
> I fear this throws your entire measurement procedure into question. For
> a fixed workload the number of acquisitions of WALInsertLock ought to be
> fixed, so you shouldn't see any more contention for WALInsertLock if the
> transaction rate didn't change materially.

David's results were to do with lock acquire/release time, not the
number of acquisitions, so that in itself doesn't make me doubt these
measurements. Perhaps we can ask whether there was a substantially
different number of lock acquisitions? As Tom says, that would be an
issue.

It seems reasonable that relieving the bottleneck on BufMapping and
LockMgr locks that we would then queue longer on the next bottleneck,
WALInsertLock. So again, those tests seem reasonable to me so far.

These seem to be the beginnings of accurate wait time analysis, so I'm
listening closely.

Are you using a lightweight timer?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Strong, David" <david(dot)strong(at)unisys(dot)com>
To: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-13 17:52:26
Message-ID: B6419AF36AC8524082E1BC17DA2506E80302BBD7@USMV-EXCH2.na.uis.unisys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon,

In the 16/16 (16 buffer partitions/16 lock partitions) test, the
WALInsertLock lock had 14643080 acquisition attempts and 12057678
successful acquisitions on the lock. That's 2585402 retries on the lock.
That is to say that PGSemaphoreLock was invoked 2585402 times.

In the 128/128 test, the WALInsertLock lock had 14991208 acquisition
attempts and 12324765 successful acquisitions. That's 2666443 retries.

The 128/128 test attempted 348128 more lock acquisitions than the 16/16
test and retried 81041 times more than the 16/16 test. We attribute the
rise in WALInsertLock lock accesses to the reduction in time on
acquiring the BufMapping and LockMgr partition locks. Does this seem
reasonable?

The overhead of any monitoring is of great concern to us. We've tried
both clock_gettime () and getttimeofday () calls. They both seem to have
the same overhead ~1 us/call (measured against the TSC of the CPU) and
both seem to be accurate. We realize this can be a delicate point and so
we would be happy to rerun any tests with a different timing mechanism.

David

-----Original Message-----
From: Simon Riggs [mailto:simon(at)2ndquadrant(dot)com]
Sent: Wednesday, September 13, 2006 2:22 AM
To: Tom Lane
Cc: Strong, David; PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

On Tue, 2006-09-12 at 12:40 -0400, Tom Lane wrote:
> "Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> > When using 16 buffer and 16 lock partitions, we see that BufMapping
> > takes 809 seconds to acquire locks and 174 seconds to release locks.
The
> > LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> > locks.
>
> > When using 128 buffer and 128 lock partitions, we see that
BufMapping
> > takes 277 seconds (532 seconds improvement) to acquire locks and 78
> > seconds (96 seconds improvement) to release locks. The LockMgr takes
235
> > seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> > seconds improvement) to release locks.
>
> While I don't see any particular penalty to increasing
> NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
> significant penalty (increasing PGPROC size as well as the work needed
> during LockReleaseAll, which is executed at every transaction end).
> I think 128 lock partitions is probably verging on the ridiculous
> ... particularly if your benchmark only involves touching half a dozen
> tables. I'd be more interested in comparisons between 4 and 16 lock
> partitions. Also, please vary the two settings independently rather
> than confusing the issue by changing them both at once.

Good thinking David. Even if 128 is fairly high, it does seem worth
exploring higher values - I was just stuck in "fewer == better"
thoughts.

> > With the improvements in the various locking times, one might expect
an
> > improvement in the overall benchmark result. However, a 16 partition
run
> > produces a result of 198.74 TPS and a 128 partition run produces a
> > result of 203.24 TPS.
>
> > Part of the time saved from BufMapping and LockMgr partitions is
> > absorbed into the WALInsertLock lock. For a 16 partition run, the
total
> > time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> > partitions, the WALInsertLock lock takes 6172 seconds, an increase
of
> > 327 seconds. Perhaps we have our WAL configured incorrectly?
>
> I fear this throws your entire measurement procedure into question.
For
> a fixed workload the number of acquisitions of WALInsertLock ought to
be
> fixed, so you shouldn't see any more contention for WALInsertLock if
the
> transaction rate didn't change materially.

David's results were to do with lock acquire/release time, not the
number of acquisitions, so that in itself doesn't make me doubt these
measurements. Perhaps we can ask whether there was a substantially
different number of lock acquisitions? As Tom says, that would be an
issue.

It seems reasonable that relieving the bottleneck on BufMapping and
LockMgr locks that we would then queue longer on the next bottleneck,
WALInsertLock. So again, those tests seem reasonable to me so far.

These seem to be the beginnings of accurate wait time analysis, so I'm
listening closely.

Are you using a lightweight timer?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Strong, David" <david(dot)strong(at)unisys(dot)com>
To: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-13 18:46:04
Message-ID: B6419AF36AC8524082E1BC17DA2506E803062C12@USMV-EXCH2.na.uis.unisys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

We have some results for you. We left the buffer partition locks at 128
as this did not seem to be a concern and we're still using 25 backend
processes. We ran tests for 4, 8 and 16 lock partitions.

For 4 lock partitions, it took 620 seconds to acquire locks and 32
seconds to release locks. The test produced 199.95 TPS.

For 8 lock partitions, it took 505 seconds to acquire locks and 31
seconds to release locks. The test produced 201.16 TPS.

For 16 lock partitions, it took 362 seconds to acquire locks and 22
seconds to release locks. The test produced 200.75 TPS.

And, just for grins, using 128 buffer and 128 lock partitions, took 235
seconds to acquire locks and 22 seconds to release locks. The test
produced 203.24 TPS.

Let me know if we can provide any additional information from these
tests and if there are any other tests that we can run.

David

-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Strong, David
Sent: Wednesday, September 13, 2006 10:52 AM
To: PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

Simon,

In the 16/16 (16 buffer partitions/16 lock partitions) test, the
WALInsertLock lock had 14643080 acquisition attempts and 12057678
successful acquisitions on the lock. That's 2585402 retries on the lock.
That is to say that PGSemaphoreLock was invoked 2585402 times.

In the 128/128 test, the WALInsertLock lock had 14991208 acquisition
attempts and 12324765 successful acquisitions. That's 2666443 retries.

The 128/128 test attempted 348128 more lock acquisitions than the 16/16
test and retried 81041 times more than the 16/16 test. We attribute the
rise in WALInsertLock lock accesses to the reduction in time on
acquiring the BufMapping and LockMgr partition locks. Does this seem
reasonable?

The overhead of any monitoring is of great concern to us. We've tried
both clock_gettime () and getttimeofday () calls. They both seem to have
the same overhead ~1 us/call (measured against the TSC of the CPU) and
both seem to be accurate. We realize this can be a delicate point and so
we would be happy to rerun any tests with a different timing mechanism.

David

-----Original Message-----
From: Simon Riggs [mailto:simon(at)2ndquadrant(dot)com]
Sent: Wednesday, September 13, 2006 2:22 AM
To: Tom Lane
Cc: Strong, David; PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

On Tue, 2006-09-12 at 12:40 -0400, Tom Lane wrote:
> "Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> > When using 16 buffer and 16 lock partitions, we see that BufMapping
> > takes 809 seconds to acquire locks and 174 seconds to release locks.
The
> > LockMgr takes 362 seconds to acquire locks and 26 seconds to release
> > locks.
>
> > When using 128 buffer and 128 lock partitions, we see that
BufMapping
> > takes 277 seconds (532 seconds improvement) to acquire locks and 78
> > seconds (96 seconds improvement) to release locks. The LockMgr takes
235
> > seconds (127 seconds improvement) to acquire locks and 22 seconds (4
> > seconds improvement) to release locks.
>
> While I don't see any particular penalty to increasing
> NUM_BUFFER_PARTITIONS, increasing NUM_LOCK_PARTITIONS carries a very
> significant penalty (increasing PGPROC size as well as the work needed
> during LockReleaseAll, which is executed at every transaction end).
> I think 128 lock partitions is probably verging on the ridiculous
> ... particularly if your benchmark only involves touching half a dozen
> tables. I'd be more interested in comparisons between 4 and 16 lock
> partitions. Also, please vary the two settings independently rather
> than confusing the issue by changing them both at once.

Good thinking David. Even if 128 is fairly high, it does seem worth
exploring higher values - I was just stuck in "fewer == better"
thoughts.

> > With the improvements in the various locking times, one might expect
an
> > improvement in the overall benchmark result. However, a 16 partition
run
> > produces a result of 198.74 TPS and a 128 partition run produces a
> > result of 203.24 TPS.
>
> > Part of the time saved from BufMapping and LockMgr partitions is
> > absorbed into the WALInsertLock lock. For a 16 partition run, the
total
> > time to lock/release the WALInsertLock lock is 5845 seconds. For 128
> > partitions, the WALInsertLock lock takes 6172 seconds, an increase
of
> > 327 seconds. Perhaps we have our WAL configured incorrectly?
>
> I fear this throws your entire measurement procedure into question.
For
> a fixed workload the number of acquisitions of WALInsertLock ought to
be
> fixed, so you shouldn't see any more contention for WALInsertLock if
the
> transaction rate didn't change materially.

David's results were to do with lock acquire/release time, not the
number of acquisitions, so that in itself doesn't make me doubt these
measurements. Perhaps we can ask whether there was a substantially
different number of lock acquisitions? As Tom says, that would be an
issue.

It seems reasonable that relieving the bottleneck on BufMapping and
LockMgr locks that we would then queue longer on the next bottleneck,
WALInsertLock. So again, those tests seem reasonable to me so far.

These seem to be the beginnings of accurate wait time analysis, so I'm
listening closely.

Are you using a lightweight timer?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Strong, David" <david(dot)strong(at)unisys(dot)com>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-13 20:35:53
Message-ID: 16328.1158179753@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> We have some results for you. We left the buffer partition locks at 128
> as this did not seem to be a concern and we're still using 25 backend
> processes. We ran tests for 4, 8 and 16 lock partitions.

> For 4 lock partitions, it took 620 seconds to acquire locks and 32
> seconds to release locks. The test produced 199.95 TPS.

> For 8 lock partitions, it took 505 seconds to acquire locks and 31
> seconds to release locks. The test produced 201.16 TPS.

> For 16 lock partitions, it took 362 seconds to acquire locks and 22
> seconds to release locks. The test produced 200.75 TPS.

> And, just for grins, using 128 buffer and 128 lock partitions, took 235
> seconds to acquire locks and 22 seconds to release locks. The test
> produced 203.24 TPS.

[ itch... ] I can't help thinking there's something wrong with this;
the wait-time measurements seem sane, but why is there essentially no
change in the TPS result?

The above numbers are only for the lock-partition LWLocks, right?
What are the totals --- that is, how much time is spent blocked
vs. processing overall?

regards, tom lane


From: Jim Nasby <jim(at)nasby(dot)net>
To: "Strong, David" <david(dot)strong(at)unisys(dot)com>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-13 20:37:41
Message-ID: 77177806-645E-4AAE-901E-8F6A64BC78D0@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sep 13, 2006, at 2:46 PM, Strong, David wrote:
> We have some results for you. We left the buffer partition locks at
> 128
> as this did not seem to be a concern and we're still using 25 backend
> processes. We ran tests for 4, 8 and 16 lock partitions.

Isn't having more lock partitions than buffer partitions pointless?
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: "Strong, David" <david(dot)strong(at)unisys(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-13 21:18:33
Message-ID: 16849.1158182313@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <jim(at)nasby(dot)net> writes:
> Isn't having more lock partitions than buffer partitions pointless?

AFAIK they're pretty orthogonal. It's true though that a typical
transaction doesn't hold all that many locks, which is why I don't
see a need for a large number of lock partitions.

regards, tom lane


From: "Strong, David" <david(dot)strong(at)unisys(dot)com>
To: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 15:44:20
Message-ID: B6419AF36AC8524082E1BC17DA2506E803062FD8@USMV-EXCH2.na.uis.unisys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

Taking the 4 lock vs 8 lock partitions, 4 LockMgr lock partitions spent
a total of 652 seconds in lock management (acquiring/releasing) and 8
LockMgr lock partitions spent a total of 536 in lock management. This is
an improvement of 116 seconds, but the TPS didn't improve by much - only
a 1.21 TPS improvement.

The improvement in the LockMgr processing is consumed by the next system
bottleneck downstream as more work is being let through. In this
particular case it's the WALInsertLock lock. The 4 LockMgr lock
partition test spent a total of 5868 seconds in WALInsertLock lock
management whereas the 8 LockMgr partition test spent 5945 seconds in
WALInsertLock lock management which is an increase of 77 seconds. But,
that's not the only static lock that increased in time, it's just the
most significant increase. The WALWriteLock lock increased by 12
seconds, ProcArrayLock increased by 8 seconds and SInvalLock increased
by 5 seconds. This takes the total time flowing to other locks to 102
seconds.

The locks are not the only part of the puzzle. As improvements are made
to various areas like the BufMapping and LockMgr lock partitions, other
parts of the system start to get exercised in ways that were not
possible in previous releases. We're still trying to get our arms around
all the functions that might become bottlenecks when other lock
contention is minimized.

And, improvements are being made. The locking changes from 8.0.x to
8.1.x made a significant difference in scalability. Again, the current
lock improvements in 8.2 have realized ~20% improvement over 8.1.x,
based on our testing.

We added monitoring code to the LWLockAcquire and LWLockRelease
functions. We track the total time taken to pass through LWLockAcquire
and LWLockRelease. So, if a particular backend process takes 1 second to
run through LWLockAcquire, we will track that as 1 second in lock
acquisition. Irrespective of whether my backend process was spinning or
in a semaphore wait, it's 1 second that was taken away from processing a
statement/request. We could also add timing for semaphore waits within
LWLockAcquire, if that would be a useful statistic.

Let me know if there are any other tests or metrics that would be
useful.

David

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Wednesday, September 13, 2006 1:36 PM
To: Strong, David
Cc: PostgreSQL-development
Subject: Re: [HACKERS] Lock partitions

"Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> We have some results for you. We left the buffer partition locks at
128
> as this did not seem to be a concern and we're still using 25 backend
> processes. We ran tests for 4, 8 and 16 lock partitions.

> For 4 lock partitions, it took 620 seconds to acquire locks and 32
> seconds to release locks. The test produced 199.95 TPS.

> For 8 lock partitions, it took 505 seconds to acquire locks and 31
> seconds to release locks. The test produced 201.16 TPS.

> For 16 lock partitions, it took 362 seconds to acquire locks and 22
> seconds to release locks. The test produced 200.75 TPS.

> And, just for grins, using 128 buffer and 128 lock partitions, took
235
> seconds to acquire locks and 22 seconds to release locks. The test
> produced 203.24 TPS.

[ itch... ] I can't help thinking there's something wrong with this;
the wait-time measurements seem sane, but why is there essentially no
change in the TPS result?

The above numbers are only for the lock-partition LWLocks, right?
What are the totals --- that is, how much time is spent blocked
vs. processing overall?

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 17:16:11
Message-ID: 45098E5B.6070802@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Tom Lane wrote:
>>> It would be nice to see some results from the OSDL tests with, say, 4,
>>> 8, and 16 lock partitions before we forget about the point though.
>>> Anybody know whether OSDL is in a position to run tests for us?
>
>> Yeah, I can run some dbt2 tests in the lab. I'll get started on it.
>> We're still a little bit away from getting the automated testing for
>> PostgreSQL going again though.
>
> Great, thanks. The thing to twiddle is LOG2_NUM_LOCK_PARTITIONS in
> src/include/storage/lwlock.h. You need a full backend recompile
> after changing it, but you shouldn't need to initdb, if that helps.

Sorry for the delay but looks like there's some data coming in. It also
looks like my kit is starting to be a little dated. My stored libpq
calls are failing. I'm getting this message:

ERROR: record type has not been registered

From PQerrorMessage() on line 41 from this bit of code:

/* Create the query and execute it. */
sprintf(stmt, "SELECT * FROM order_status(%d, %d, %d, '%s')",
data->c_id, data->c_w_id, data->c_d_id, data->c_last);
res = PQexec(dbc->conn, stmt);
if (!res || (PQresultStatus(res) != PGRES_COMMAND_OK &&
PQresultStatus(res) != PGRES_TUPLES_OK)) {
LOG_ERROR_MESSAGE("%s", PQerrorMessage(dbc->conn));
PQclear(res);
return ERROR;
}

LOG_ERROR_MESSAGE() is just a macro for a function that does a printf().

Any suggestions?

Thanks,
Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 18:05:15
Message-ID: 1198.1158257115@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Sorry for the delay but looks like there's some data coming in. It also
> looks like my kit is starting to be a little dated. My stored libpq
> calls are failing. I'm getting this message:

> ERROR: record type has not been registered

This is a server-side failure --- could we see how order_status()
is defined? What PG version are you testing exactly?

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 18:13:33
Message-ID: 45099BCD.1040402@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Sorry for the delay but looks like there's some data coming in. It also
>> looks like my kit is starting to be a little dated. My stored libpq
>> calls are failing. I'm getting this message:
>
>> ERROR: record type has not been registered
>
> This is a server-side failure --- could we see how order_status()
> is defined? What PG version are you testing exactly?

I took pgsqsl snapshot from cvs on Sept 11. Due to the length of the
file that order_status() is in and of order_status() itself, here's is a
url for the file in the svn repository. order_status() is defined
starting on line 710:

http://svn.sourceforge.net/viewvc/osdldbt/trunk/dbt2/storedproc/pgsql/c/funcs.c?view=markup

Thanks,
Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 18:34:51
Message-ID: 1452.1158258891@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Tom Lane wrote:
>> This is a server-side failure --- could we see how order_status()
>> is defined? What PG version are you testing exactly?

> I took pgsqsl snapshot from cvs on Sept 11. Due to the length of the
> file that order_status() is in and of order_status() itself, here's is a
> url for the file in the svn repository. order_status() is defined
> starting on line 710:

> http://svn.sourceforge.net/viewvc/osdldbt/trunk/dbt2/storedproc/pgsql/c/funcs.c?view=markup

Hmph. I think we broke something --- the error implies that some
function tried to return a tuple that hadn't been properly "blessed",
but I can't see that order_status or any of the other functions in that
file are doing anything wrong. In any case, if it used to work for you,
we had better figure out exactly why it stopped working.

I know you've posted info before on how to set up and run the dbt code,
but could you refresh my memory? Is there a URL somewhere with the info?

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 18:47:15
Message-ID: 4509A3B3.8030200@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Tom Lane wrote:
>>> This is a server-side failure --- could we see how order_status()
>>> is defined? What PG version are you testing exactly?
>
>> I took pgsqsl snapshot from cvs on Sept 11. Due to the length of the
>> file that order_status() is in and of order_status() itself, here's is a
>> url for the file in the svn repository. order_status() is defined
>> starting on line 710:
>
>> http://svn.sourceforge.net/viewvc/osdldbt/trunk/dbt2/storedproc/pgsql/c/funcs.c?view=markup
>
> Hmph. I think we broke something --- the error implies that some
> function tried to return a tuple that hadn't been properly "blessed",
> but I can't see that order_status or any of the other functions in that
> file are doing anything wrong. In any case, if it used to work for you,
> we had better figure out exactly why it stopped working.
>
> I know you've posted info before on how to set up and run the dbt code,
> but could you refresh my memory? Is there a URL somewhere with the info?

Here's a readme:
http://svn.sourceforge.net/viewvc/osdldbt/trunk/dbt2/README-POSTGRESQL?view=markup

But perhaps something much easier, using subversion:

mkdir /mnt/dbt2 # for pgdata
svn co https://svn.sourceforge.net/svnroot/osdldbt/trunk/dbt2 dbt2
cd dbt2
./configure --with-postgresql=<pgsql_dir>
make
cd scripts/pgsql/
./build_db -g -w 1
cd ..
./run_workload -w 1 -d 120 -c 20 -n

I think that should work. That will create a 1 warehouse database and
run a 120 second test with no-thinktimes. That should be sufficient to
run through each transaction more than several times.

Thanks,
Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 19:03:52
Message-ID: 1824.1158260632@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> But perhaps something much easier, using subversion:

> mkdir /mnt/dbt2 # for pgdata
> svn co https://svn.sourceforge.net/svnroot/osdldbt/trunk/dbt2 dbt2
> cd dbt2
> ./configure --with-postgresql=<pgsql_dir>

configure is not in the svn checkout. I guessed that I needed to do
aclocal/automake/autoconf, but automake fails:

$ automake
configure.ac:11: required file `config.h.in' not found
$

and I don't see anyplace to get config.h.in from.

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 19:05:13
Message-ID: 4509A7E9.5000207@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> But perhaps something much easier, using subversion:
>
>> mkdir /mnt/dbt2 # for pgdata
>> svn co https://svn.sourceforge.net/svnroot/osdldbt/trunk/dbt2 dbt2
>> cd dbt2
>> ./configure --with-postgresql=<pgsql_dir>
>
> configure is not in the svn checkout. I guessed that I needed to do
> aclocal/automake/autoconf, but automake fails:
>
> $ automake
> configure.ac:11: required file `config.h.in' not found
> $
>
> and I don't see anyplace to get config.h.in from.

Oops! 'autoreconf --install' is what I run to generate all that stuff.
Sorry about that.

Thanks,
Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 19:23:48
Message-ID: 2141.1158261828@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Oops! 'autoreconf --install' is what I run to generate all that stuff.

Ah, better. I see at least part of the problem:

CREATE OR REPLACE FUNCTION stock_level (INTEGER, INTEGER, INTEGER) RETURNS INTEGER AS '/home/tgl/dbt2/storedproc/pgsql/c/../../../storedproc/pgsql/c/funcs' LANGUAGE C STRICT;
psql:/home/tgl/dbt2/scripts/pgsql/../../storedproc/pgsql/c/stock_level.sql:7: ERROR: incompatible library "/home/tgl/dbt2/storedproc/pgsql/c/../../../storedproc/pgsql/c/funcs.so": missing magic block
HINT: Extension libraries are now required to use the PG_MODULE_MAGIC macro.

You need to add something like this to funcs.c:

#include <executor/spi.h> /* this should include most necessary APIs */
#include <executor/executor.h> /* for GetAttributeByName() */
#include <funcapi.h> /* for returning set of rows in order_status */
+
+ #ifdef PG_MODULE_MAGIC
+ PG_MODULE_MAGIC;
+ #endif

/*
#define DEBUG

With that change, I didn't see run_workload report any errors, but maybe
I don't know where to look. I'm not sure how this bug could have led to
a "type not registered" error ... the query should've just failed outright.

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 19:28:41
Message-ID: 4509AD69.3030704@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Oops! 'autoreconf --install' is what I run to generate all that stuff.
>
> Ah, better. I see at least part of the problem:
>
> CREATE OR REPLACE FUNCTION stock_level (INTEGER, INTEGER, INTEGER) RETURNS INTEGER AS '/home/tgl/dbt2/storedproc/pgsql/c/../../../storedproc/pgsql/c/funcs' LANGUAGE C STRICT;
> psql:/home/tgl/dbt2/scripts/pgsql/../../storedproc/pgsql/c/stock_level.sql:7: ERROR: incompatible library "/home/tgl/dbt2/storedproc/pgsql/c/../../../storedproc/pgsql/c/funcs.so": missing magic block
> HINT: Extension libraries are now required to use the PG_MODULE_MAGIC macro.
>
> You need to add something like this to funcs.c:
>
> #include <executor/spi.h> /* this should include most necessary APIs */
> #include <executor/executor.h> /* for GetAttributeByName() */
> #include <funcapi.h> /* for returning set of rows in order_status */
> +
> + #ifdef PG_MODULE_MAGIC
> + PG_MODULE_MAGIC;
> + #endif
>
> /*
> #define DEBUG
>
> With that change, I didn't see run_workload report any errors, but maybe
> I don't know where to look. I'm not sure how this bug could have led to
> a "type not registered" error ... the query should've just failed outright.

The error is captured in dbt2/scripts/output/*/client/error.log, where *
is the run directory.

Ok, I'll give it a shot on my system.

Thanks,
Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 19:32:48
Message-ID: 2345.1158262368@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Tom Lane wrote:
>> With that change, I didn't see run_workload report any errors, but maybe
>> I don't know where to look.

> The error is captured in dbt2/scripts/output/*/client/error.log, where *
> is the run directory.

Hm ... here's what I see in there:

Thu Sep 14 15:19:16 2006
tid:-1430387232 client.c:129
20 DB worker threads have started
Thu Sep 14 15:19:31 2006
tid:1087957312 libpq/dbc_new_order.c:111
ERROR: deadlock detected
DETAIL: Process 5334 waits for ShareLock on transaction 3505055; blocked by process 5363.
Process 5363 waits for ShareLock on transaction 3505049; blocked by process 5334.
CONTEXT: SQL statement "UPDATE stock
SET s_quantity = s_quantity - 10
WHERE s_i_id = 48368
AND s_w_id = 1"

Thu Sep 14 15:21:18 2006
tid:1089960256 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1089689920 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090636096 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090230592 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090365760 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090095424 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1089825088 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090500928 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090906432 client_interface.c:33
socket closed on _receive
Thu Sep 14 15:21:18 2006
tid:1090771264 client_interface.c:33
socket closed on _receive

Is the deadlock failure expected?

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 19:45:29
Message-ID: 4509B159.9080905@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Tom Lane wrote:
>>> With that change, I didn't see run_workload report any errors, but maybe
>>> I don't know where to look.
>
>> The error is captured in dbt2/scripts/output/*/client/error.log, where *
>> is the run directory.
>
> Hm ... here's what I see in there:
>
> Thu Sep 14 15:19:16 2006
> tid:-1430387232 client.c:129
> 20 DB worker threads have started
> Thu Sep 14 15:19:31 2006
> tid:1087957312 libpq/dbc_new_order.c:111
> ERROR: deadlock detected
> DETAIL: Process 5334 waits for ShareLock on transaction 3505055; blocked by process 5363.
> Process 5363 waits for ShareLock on transaction 3505049; blocked by process 5334.
> CONTEXT: SQL statement "UPDATE stock
> SET s_quantity = s_quantity - 10
> WHERE s_i_id = 48368
> AND s_w_id = 1"
[snip]
>
> Is the deadlock failure expected?

Ooh, that's interesting. Deadlock failure is possible although I think
we would all prefer that it didn't happen. In the scheme of the
workload having failed transactions is ok. So with respect to having an
invalid test run it's something I wouldn't worry about too much if it's
infrequent.

Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-14 23:45:24
Message-ID: 16984.1158277524@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Curious, I'm still seeing the same behavior. Maybe I'll take another
> snapshot from CVS.

Hm, maybe I need to try a bit harder here. Does the "not registered"
error happen immediately/reliably for you, or do you need to run the
test awhile?

> As for the deadlock issue you mention I've been told
> I have some seeding/random number generation problems in the kit.
> Perhaps that is related to the deadlock at least.

The nature of the locks suggests that you have two transactions trying
to update the same two rows in opposite orders. The usually recommended
fix is to ensure you use a consistent processing order within a
transaction (eg, ascending primary keys --- but any well-defined row
ordering will work).

regards, tom lane


From: markw(at)osdl(dot)org
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Mark Wong" <markw(at)osdl(dot)org>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-15 00:32:24
Message-ID: 54967.63.227.209.236.1158280344.squirrel@www.osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Mark Wong <markw(at)osdl(dot)org> writes:
>> Curious, I'm still seeing the same behavior. Maybe I'll take another
>> snapshot from CVS.
>
> Hm, maybe I need to try a bit harder here. Does the "not registered"
> error happen immediately/reliably for you, or do you need to run the
> test awhile?

It appears to occur for every order status transaction. You can monitor
the transactions by watching dbt2/scripts/output/*/driver/mix.log. A 'o'
(lowercase) indicates a successful order status transaction while a 'O'
(uppercase) indiciates an unsuccessful transaction.

or, perhaps a simplier way is to start the database then:

cd dbt2/src
./transaction_test -d localhost -l 5432 -t o

That will connect directly to the database and execute an order status
transaction.

Thanks,
Mark


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-19 20:32:43
Message-ID: 451053EB.8030201@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Curious, I'm still seeing the same behavior. Maybe I'll take another
>> snapshot from CVS.
>
> Hm, maybe I need to try a bit harder here. Does the "not registered"
> error happen immediately/reliably for you, or do you need to run the
> test awhile?

I did a gross test and my kit appears broken between the 8.0 and 8.1
releases. I'll try to narrow down the exact date.

Mark


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-20 17:20:25
Message-ID: 45117859.4090507@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong wrote:
> Tom Lane wrote:
>> Mark Wong <markw(at)osdl(dot)org> writes:
>>> Curious, I'm still seeing the same behavior. Maybe I'll take another
>>> snapshot from CVS.
>>
>> Hm, maybe I need to try a bit harder here. Does the "not registered"
>> error happen immediately/reliably for you, or do you need to run the
>> test awhile?
>
> I did a gross test and my kit appears broken between the 8.0 and 8.1
> releases. I'll try to narrow down the exact date.

I've narrowed it down between cvs pulls from Dec 14, 2005 and Dec 15,
2005. Does the attached diff appear to be a plausible cause?

Thanks,
Mark

Attachment Content-Type Size
pgsql-2005-12-14to15.patch text/plain 8.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-20 17:37:36
Message-ID: 11624.1158773856@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
>> I did a gross test and my kit appears broken between the 8.0 and 8.1
>> releases. I'll try to narrow down the exact date.

> I've narrowed it down between cvs pulls from Dec 14, 2005 and Dec 15,
> 2005. Does the attached diff appear to be a plausible cause?

No, not particularly. You sure about those dates?

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lock partitions
Date: 2006-09-20 17:47:46
Message-ID: 45117EC2.3020207@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>>> I did a gross test and my kit appears broken between the 8.0 and 8.1
>>> releases. I'll try to narrow down the exact date.
>
>> I've narrowed it down between cvs pulls from Dec 14, 2005 and Dec 15,
>> 2005. Does the attached diff appear to be a plausible cause?
>
> No, not particularly. You sure about those dates?

Ugh, double checking tells me I'm wrong. I'll keep testing.

Thanks,
Mark


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-10-18 15:47:19
Message-ID: 45364C87.8000808@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> I see this in the CVS commits for 8.2. Did we determine the proper
>> number of lock partitions? Should it be based on the number of buffers
>> or concurrent sessions allowed?
>
> No. NUM_LOCK_PARTITIONS needs to be a compile-time constant for a
> number of reasons, and there is absolutely zero evidence to justify
> making any effort (and spending any cycles) on a variable value.
>
> It would be nice to see some results from the OSDL tests with, say, 4,
> 8, and 16 lock partitions before we forget about the point though.
> Anybody know whether OSDL is in a position to run tests for us?

I have a couple of bigger runs now against a CVS checkout on 2006-09-20
(please forgive my NUM_BUFFER_PARTITIONS note if you notice that on the
web pages):

Baseline (default NUM_LOCK_PARTITIONS=4):
notpm 6589
http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/184/

NUM_LOCK_PARTITIONS=8:
notpm 4471
http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/180/

NUM_LOCK_PARTITIONS=16:
Failed to run.

The number of transaction errors increased when I increased the
NUM_LOCK_PARTITIONS, which I think is the reason it failed to run when I
set it to 16. And the throughput went down significantly (32%). Should
I try against a more recent checkout?

Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-10-18 16:47:46
Message-ID: 27780.1161190066@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> The number of transaction errors increased when I increased the
> NUM_LOCK_PARTITIONS, which I think is the reason it failed to run when I
> set it to 16.

Hmm, what sort of errors are we talking about? I wonder if you've
exposed a bug. Changing NUM_LOCK_PARTITIONS really shouldn't have any
semantic effect.

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-10-18 16:54:08
Message-ID: 45365C30.9000008@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> The number of transaction errors increased when I increased the
>> NUM_LOCK_PARTITIONS, which I think is the reason it failed to run when I
>> set it to 16.
>
> Hmm, what sort of errors are we talking about? I wonder if you've
> exposed a bug. Changing NUM_LOCK_PARTITIONS really shouldn't have any
> semantic effect.

The libpq client (error log:
http://dbt.osdl.org/dbt/dbt2dev/results/dev4-015/180/client/error.log)
is saying things like this:

ERROR: too many LWLocks taken
CONTEXT: SQL statement "DELETE FROM new_order
WHERE no_o_id = 2101
AND no_w_id = 349
AND no_d_id = 1"

A grep through that file shows that all the unexpected errors appear to
be due to "too many LWLocks taken".

Mark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Wong <markw(at)osdl(dot)org>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-10-18 17:18:41
Message-ID: 28122.1161191921@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Wong <markw(at)osdl(dot)org> writes:
> Tom Lane wrote:
>> Hmm, what sort of errors are we talking about?

> ERROR: too many LWLocks taken

That really shouldn't happen ... are you sure you did a full recompile
after changing NUM_LOCK_PARTITIONS?

Actually ... wait a moment. The default value of NUM_LOCK_PARTITIONS
is already 16 (1 << LOG2_NUM_LOCK_PARTITIONS where the latter is 4).
Are you saying you set LOG2_NUM_LOCK_PARTITIONS to 16? That would be
way too many partitions. I was thinking of testing
LOG2_NUM_LOCK_PARTITIONS in the range of about 2 to 5.

regards, tom lane


From: Mark Wong <markw(at)osdl(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Lock partitions
Date: 2006-10-18 17:22:41
Message-ID: 453662E1.3080802@osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Wong <markw(at)osdl(dot)org> writes:
>> Tom Lane wrote:
>>> Hmm, what sort of errors are we talking about?
>
>> ERROR: too many LWLocks taken
>
> That really shouldn't happen ... are you sure you did a full recompile
> after changing NUM_LOCK_PARTITIONS?
>
> Actually ... wait a moment. The default value of NUM_LOCK_PARTITIONS
> is already 16 (1 << LOG2_NUM_LOCK_PARTITIONS where the latter is 4).
> Are you saying you set LOG2_NUM_LOCK_PARTITIONS to 16? That would be
> way too many partitions. I was thinking of testing
> LOG2_NUM_LOCK_PARTITIONS in the range of about 2 to 5.

Oops, I can't read bit shifting. =p I'll do again.

Mark