Re: [PERFORM] pgbench to the MAXINT

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] pgbench to the MAXINT
Date: 2012-12-21 06:16:12
Message-ID: CABwTF4WdTZuooQVHSOSLrWWfyMQeBcGwLKzSBSsTuQxVY8hDTQ@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

On Wed, Feb 16, 2011 at 8:15 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:

> Tom Lane wrote:
>
>> I think that might be a good idea --- it'd reduce the cross-platform
>> variability of the results quite a bit, I suspect. random() is not
>> to be trusted everywhere, but I think erand48 is pretty much the same
>> wherever it exists at all (and src/port/ provides it elsewhere).
>>
>>
>
> Given that pgbench will run with threads in some multi-worker
> configurations, after some more portability research I think odds are good
> we'd get nailed by http://sourceware.org/bugzilla/show_bug.cgi?id=10320: "erand48 implementation not thread safe but POSIX says it should be".
> The AIX docs have a similar warning on them, so who knows how many
> versions of that library have the same issue.
>
> Maybe we could make sure the one in src/port/ is thread safe and make sure
> pgbench only uses it. This whole area continues to be messy enough that I
> think the patch needs to brew for another CF before it will all be sorted
> out properly. I'll mark it accordingly and can pick this back up later.
>

Hi Greg,

I spent some time rebasing this patch onto current master. Attached is
the patch, based on a master that is a couple of commits old.

Your concern about using erand48() has been resolved, since pgbench now
uses the thread-safe and concurrent pg_erand48() from src/port/.
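
To illustrate why a generator with caller-owned state sidesteps the thread-safety problem, here is a minimal sketch. It is not the src/port/ code: RandState, sketch_erand48() and sketch_getrand() are hypothetical names, and the generator is simply the well-known 48-bit linear congruential recurrence of the rand48 family, written out by hand so the example is self-contained.

```c
#include <stdint.h>

/* Hypothetical per-thread state: each worker owns its own seed, so no
 * hidden global state is shared between threads. */
typedef struct
{
	uint64_t	x;			/* low 48 bits hold the generator state */
} RandState;

/* One step of the rand48-style recurrence X' = (a * X + c) mod 2^48.
 * Returns a double in [0.0, 1.0). */
static double
sketch_erand48(RandState *st)
{
	const uint64_t a = UINT64_C(0x5DEECE66D);
	const uint64_t c = UINT64_C(0xB);
	const uint64_t mask = (UINT64_C(1) << 48) - 1;

	st->x = (a * st->x + c) & mask;
	return (double) st->x / 281474976710656.0;	/* divide by 2^48 */
}

/* Pseudo-random value in [min, max], using 64-bit arithmetic so that
 * ranges wider than a 32-bit int still work. */
static int64_t
sketch_getrand(RandState *st, int64_t min, int64_t max)
{
	return min + (int64_t) ((double) (max - min + 1) * sketch_erand48(st));
}
```

Because all state is passed in explicitly, two threads holding separate RandState values never interfere with each other, and the same seed always reproduces the same sequence.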

The patch is very much what you had posted, except for a couple of
differences due to bit-rot: (i) I didn't have to #define MAX_RANDOM_VALUE64,
since its cousin MAX_RANDOM_VALUE is no longer used by the code, and (ii) I
used a ternary operator in the DDLs[] array to decide when to use bigint vs.
int columns.
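
As a rough illustration of point (ii), the column type can be chosen with a ternary at the point where the DDL text is built. This is only a sketch and is not copied from the patch: format_accounts_ddl() and its use_bigint flag are hypothetical, and the rule that decides when the flag is set is left to the caller.

```c
#include <stdio.h>

/* Hypothetical helper: write the CREATE TABLE statement for
 * pgbench_accounts into buf, choosing the width of the aid column.
 * Returns the number of characters snprintf() would have written. */
static int
format_accounts_ddl(char *buf, size_t len, int use_bigint)
{
	return snprintf(buf, len,
					"create table pgbench_accounts("
					"aid %s not null, bid int, abalance int, filler char(84))",
					use_bigint ? "bigint" : "int");
}
```

Keeping the choice inside the format arguments means there is one DDL template rather than two nearly identical copies.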

Please review.

As for tests, I am currently running 'pgbench -i -s 21474' using
unpatched pgbench and recording the time taken. Scale factor 21475 had
actually failed to do anything meaningful with unpatched pgbench. Next,
I'll run with '-s 21475' on the patched version to see whether it does the
right thing, and in acceptable time compared to '-s 21474'.
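
The choice of these two scale factors follows from the arithmetic: pgbench creates 100,000 pgbench_accounts rows per unit of scale, so 21474 is the largest scale whose row count still fits in a signed 32-bit integer. A small sketch of that calculation (accounts_rows() and fits_in_int32() are hypothetical helper names):

```c
#include <stdint.h>

/* Rows in pgbench_accounts for a given scale factor, computed in
 * 64 bits so the multiplication itself cannot overflow. */
static int64_t
accounts_rows(int scale)
{
	return (int64_t) scale * 100000;
}

/* Nonzero if v is representable as a signed 32-bit integer. */
static int
fits_in_int32(int64_t v)
{
	return v >= INT32_MIN && v <= INT32_MAX;
}
```

Scale 21474 gives 2,147,400,000 rows, just under INT32_MAX (2,147,483,647); scale 21475 gives 2,147,500,000, which no longer fits.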

What tests would you and others like to see, to gain some confidence in
the patch? The machine I have access to has 62 GB of RAM, 16 cores with
64 hardware threads, and about 900 GB of disk space.

Linux <host> 3.2.6-3.fc16.ppc64 #1 SMP Fri Feb 17 21:41:20 UTC 2012 ppc64
ppc64 ppc64 GNU/Linux

Best regards,

PS: The primary source of patch is this branch:
https://github.com/gurjeet/postgres/tree/64bit_pgbench
--
Gurjeet Singh

http://gurjeet.singh.im/

Attachment Content-Type Size
pgbencg-64-v6.patch application/octet-stream 8.5 KB
