Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault

Lists: pgsql-bugs
From: "Bernt Marius Johnsen" <bernt(dot)johnsen(at)sun(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-28 18:53:07
Message-ID: 200910281853.n9SIr7xw050216@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5145
Logged by: Bernt Marius Johnsen
Email address: bernt(dot)johnsen(at)sun(dot)com
PostgreSQL version: 8.3.8
Operating system: Linux 2.6.27-14 (Ubuntu Interpid)
Description: Complex query with lots of LEFT JOIN causes segfault
Details:

The below query generated by the Random Query Generator
(https://launchpad.net/randgen) causes a segfault. It was caused running

./gentest.pl --dsn=dbi:Pg:user=bernt --gendata --queries=100000 --threads=1
--grammar=/home/bernt/xx.yy

(I'll follow up with a mail with xx.yy as attachment)

Query:

SELECT * from B AS alias0 LEFT JOIN BB AS alias1 LEFT JOIN B
AS alias2 LEFT JOIN A AS alias3 LEFT JOIN AA AS alias4 LEFT JOIN B
AS alias5 ON alias4.int_key = alias5.int_key ON alias3.int_key =
alias4.int_key LEFT JOIN AA AS alias6 LEFT JOIN A AS alias7 ON
alias6.int_key = alias7.int_key LEFT JOIN BB AS alias8 ON alias7.int_key
= alias8.int_key ON alias3.int_key = alias8.int_key LEFT JOIN AA AS
alias9 ON alias6.int_key = alias9.int_key ON alias2.int_key =
alias8.int_key LEFT JOIN BB AS alias10 LEFT JOIN AA AS alias11 LEFT
JOIN B AS alias12 ON alias11.int_key = alias12.int_key ON alias10.int_key
= alias11.int_key ON alias9.int_key = alias10.int_key ON alias1.int_key =
alias8.int_key LEFT JOIN BB AS alias13 LEFT JOIN A AS alias14
LEFT JOIN AA AS alias15 LEFT JOIN A AS alias16 ON alias15.int_key =
alias16.int_key LEFT JOIN B AS alias17 ON alias15.int_key =
alias17.int_key ON alias14.int_key = alias16.int_key LEFT JOIN AA AS
alias18 ON alias14.int_key = alias18.int_key LEFT JOIN B AS alias19 ON
alias15.int_key = alias19.int_key LEFT JOIN AA AS alias20 ON
alias16.int_key = alias20.int_key ON alias13.int_key = alias19.int_key
LEFT JOIN A AS alias21 ON alias13.int_key = alias21.int_key ON
alias3.int_key = alias17.int_key LEFT JOIN B AS alias22 ON alias7.int_key
= alias22.int_key LEFT JOIN A AS alias23 ON alias20.int_key =
alias23.int_key LEFT JOIN A AS alias24 ON alias14.int_key =
alias24.int_key LEFT JOIN BB AS alias25 LEFT JOIN BB AS alias26 ON
alias25.int_key = alias26.int_key LEFT JOIN A AS alias27 LEFT JOIN
A AS alias28 ON alias27.int_key = alias28.int_key LEFT JOIN B AS alias29
LEFT JOIN BB AS alias30 LEFT JOIN B AS alias31 LEFT JOIN A AS
alias32 LEFT JOIN B AS alias33 ON alias32.int_key = alias33.int_key LEFT
JOIN A AS alias34 ON alias32.int_key = alias34.int_key ON alias31.int_key
= alias33.int_key ON alias30.int_key = alias33.int_key ON alias29.int_key
= alias34.int_key ON alias27.int_key = alias34.int_key LEFT JOIN AA AS
alias35 LEFT JOIN A AS alias36 ON alias35.int_key = alias36.int_key ON
alias34.int_key = alias36.int_key LEFT JOIN A AS alias37 ON
alias33.int_key = alias37.int_key ON alias25.int_key = alias32.int_key
LEFT JOIN A AS alias38 ON alias37.int_key = alias38.int_key ON
alias15.int_key = alias37.int_key ON alias0.int_key = alias9.int_key


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Bernt Marius Johnsen <bernt(dot)johnsen(at)sun(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-29 14:17:36
Message-ID: 4AE9A400.7030702@timbira.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Bernt Marius Johnsen escreveu:
> The below query generated by the Random Query Generator
> (https://launchpad.net/randgen) causes a segfault. It was caused running
>
Could you get a core dump and post the gdb backtrace?

$ ulimit -c unlimited
$ pg_ctl start
$ psql -c "<my query goes here>" mydb
$ gdb /path/to/postgres $PGDATA/core
(gdb) bt
.
.
.
(gdb) quit

--
Euler Taveira de Oliveira
http://www.timbira.com/


From: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-29 15:09:57
Message-ID: 20091029150957.GA513@atum01.Norway.Sun.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

>>>>>>>>>>>> Euler Taveira de Oliveira wrote (2009-10-29 12:17:36):
> Bernt Marius Johnsen escreveu:
> > The below query generated by the Random Query Generator
> > (https://launchpad.net/randgen) causes a segfault. It was caused running
> >

Checkout the latest RQG from launchpad and run ./gentest.pl as shown
above (The lastest tarball misses a feature you need). xx.yy is
attached.

Run like this:
./gentest.pl --dsn=dbi:Pg:user=xxxx --gendata --queries=100000 --threads=1 --grammar=/path/to/xx.yy

> Could you get a core dump and post the gdb backtrace?
>
> $ ulimit -c unlimited
> $ pg_ctl start
> $ psql -c "<my query goes here>" mydb
> $ gdb /path/to/postgres $PGDATA/core
> (gdb) bt
> .
> .
> .
> (gdb) quit

We'll see next week If I can spare some time.

>
>
> --
> Euler Taveira de Oliveira
> http://www.timbira.com/

--
Bernt Marius Johnsen, Staff Engineer
Database Technology Group, Sun Microsystems, Trondheim, Norway

Attachment Content-Type Size
xx.yy text/plain 715 bytes

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-29 21:14:30
Message-ID: 4AEA05B6.9080203@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Bernt M. Johnsen wrote:
>>>>>>>>>>>>> Euler Taveira de Oliveira wrote (2009-10-29 12:17:36):
>> Bernt Marius Johnsen escreveu:
>>> The below query generated by the Random Query Generator
>>> (https://launchpad.net/randgen) causes a segfault. It was caused running
>>>
>
> Checkout the latest RQG from launchpad and run ./gentest.pl as shown
> above (The lastest tarball misses a feature you need). xx.yy is
> attached.
>
> Run like this:
> ./gentest.pl --dsn=dbi:Pg:user=xxxx --gendata --queries=100000 --threads=1 --grammar=/path/to/xx.yy
>
>
>
>> Could you get a core dump and post the gdb backtrace?
>>
>> $ ulimit -c unlimited
>> $ pg_ctl start
>> $ psql -c "<my query goes here>" mydb
>> $ gdb /path/to/postgres $PGDATA/core
>> (gdb) bt
>> .
>> .
>> .
>> (gdb) quit
>
> We'll see next week If I can spare some time.

I can easily reproduce the segfault on 8.4 and 8.5a2:

Program received signal SIGSEGV, Segmentation fault.
ExecHashJoinSaveTuple (tuple=0xb49c8870, hashvalue=3316173823,
fileptr=0x96185a8) at nodeHashjoin.c:775
775 BufFile *file = *fileptr;
(gdb) bt
#0 ExecHashJoinSaveTuple (tuple=0xb49c8870, hashvalue=3316173823,
fileptr=0x96185a8) at nodeHashjoin.c:775
#1 0x081cf21f in ExecHashJoin (node=0x88c6540) at nodeHashjoin.c:224
#2 0x081bd898 in ExecProcNode (node=0x88c6540) at execProcnode.c:427
#3 0x081bc445 in standard_ExecutorRun (queryDesc=0x875d22c,
direction=ForwardScanDirection, count=0) at execMain.c:1187
#4 0x0828215c in PortalRunSelect (portal=0x879197c, forward=1 '\001',
count=0, dest=0xb4f7bfb8) at pquery.c:953
#5 0x082834be in PortalRun (portal=0x879197c, count=2147483647,
isTopLevel=1 '\001', dest=0xb4f7bfb8, altdest=0xb4f7bfb8,
completionTag=0xbfe8ff9a "") at pquery.c:807
#6 0x0827f760 in exec_simple_query (
query_string=0x8751d3c " SELECT * from B AS alias0 LEFT JOIN
BB AS alias1 LEFT JOIN B AS alias2 LEFT JOIN A AS alias3
LEFT JOIN AA AS alias4 LEFT JOIN B AS alias5 ON alias4.int_key =
alias5.int_key O"...) at postgres.c:1000
#7 0x08280b8e in PostgresMain (argc=2, argv=0x86d7960,
username=0x86d7928 "mastermind") at postgres.c:3573
#8 0x082499be in ServerLoop () at postmaster.c:3366
#9 0x0824a9db in PostmasterMain (argc=3, argv=0x86d5a98) at
postmaster.c:1064
#10 0x081ea466 in main (argc=3, argv=0x86d5a98) at main.c:188

Stefan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-29 21:41:02
Message-ID: 25807.1256852462@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> I can easily reproduce the segfault on 8.4 and 8.5a2:

Doesn't crash here ... could we see the specific test data being used,
please?

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)sun(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 00:00:22
Message-ID: 407d949e0910291700s123fa568l1ae2d7506ea38f0e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Oct 29, 2009 at 2:14 PM, Stefan Kaltenbrunner
<stefan(at)kaltenbrunner(dot)cc> wrote:
> ExecHashJoinSaveTuple (tuple=0xb49c8870, hashvalue=3316173823,
> fileptr=0x96185a8) at nodeHashjoin.c:775
> 775             BufFile    *file = *fileptr;
>

That back trace doesn't make much sense. That function only has one
call site and it's called with fileptr set to an expression starting
with &. Ie, there's no way it can get null or an invalid pointer.

--
greg


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 06:21:17
Message-ID: 4AEA85DD.6010808@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>> I can easily reproduce the segfault on 8.4 and 8.5a2:
>
> Doesn't crash here ... could we see the specific test data being used,
> please?

uploaded a dump of the dataset here:

http://www.kaltenbrunner.cc/files/rand_gen_data.sql

and the query that causes the segfault:

http://www.kaltenbrunner.cc/files/rand_gen_query.sql

Stefan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 15:34:03
Message-ID: 8701.1256916843@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> uploaded a dump of the dataset here:
> http://www.kaltenbrunner.cc/files/rand_gen_data.sql
> and the query that causes the segfault:
> http://www.kaltenbrunner.cc/files/rand_gen_query.sql

[ scratches head... ] Still no crash here, and I tried it on a couple
different types of hardware. What configure parameters are you using,
and what non-default postgresql.conf settings?

regards, tom lane


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 15:43:22
Message-ID: 4AEB099A.4080804@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>> uploaded a dump of the dataset here:
>> http://www.kaltenbrunner.cc/files/rand_gen_data.sql
>> and the query that causes the segfault:
>> http://www.kaltenbrunner.cc/files/rand_gen_query.sql
>
> [ scratches head... ] Still no crash here, and I tried it on a couple
> different types of hardware. What configure parameters are you using,
> and what non-default postgresql.conf settings?

this is 8.5a2 configured with ./configure --enable-cassert
--enable-debug and just default settings(ie plain initdb with default
settings).
The OS is Debian Lenny/AMD64.

Stefan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 17:14:42
Message-ID: 11640.1256922882@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> this is 8.5a2 configured with ./configure --enable-cassert
> --enable-debug and just default settings(ie plain initdb with default
> settings).
> The OS is Debian Lenny/AMD64.

Huh. That should not be noticeably different from my F11/Xeon64
machine ... but I still can't make it crash. Somebody else is gonna
have to debug this one.

regards, tom lane


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 17:58:01
Message-ID: 4AEB2929.10704@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>> this is 8.5a2 configured with ./configure --enable-cassert
>> --enable-debug and just default settings(ie plain initdb with default
>> settings).
>> The OS is Debian Lenny/AMD64.
>
> Huh. That should not be noticeably different from my F11/Xeon64
> machine ... but I still can't make it crash. Somebody else is gonna
> have to debug this one.

hmm sorry - that was actually Lenny/i386 however I fail to reproduce
this on some of my other test boxes as well though I don't have an idea
why it crashes on my laptop(and for the original reporter) but not
elsewhere :(

Stefan


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)Sun(dot)COM>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 18:22:53
Message-ID: 4AEB2EFD.5020402@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Stefan Kaltenbrunner wrote:
> Tom Lane wrote:
>> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>>> this is 8.5a2 configured with ./configure --enable-cassert
>>> --enable-debug and just default settings(ie plain initdb with
>>> default settings).
>>> The OS is Debian Lenny/AMD64.
>>
>> Huh. That should not be noticeably different from my F11/Xeon64
>> machine ... but I still can't make it crash. Somebody else is gonna
>> have to debug this one.
>
> hmm sorry - that was actually Lenny/i386 however I fail to reproduce
> this on some of my other test boxes as well though I don't have an idea
> why it crashes on my laptop(and for the original reporter) but not
> elsewhere :(

ok I now see why you (and I) failed to reproduce the problem - it only
causes clusters/databases to crash that were actually generated using
the upthread mentioned script. it does NOT fail using a dump generated
by a database that fails(!).
So the issue must be a bit more complex and somehow relate to some prior
stuff the script does.

Stefan


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)sun(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 18:26:26
Message-ID: 407d949e0910301126o32af6e9et60190ba6fc2de2d5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Oct 30, 2009 at 11:22 AM, Stefan Kaltenbrunner
<stefan(at)kaltenbrunner(dot)cc> wrote:
> ok I now see why you (and I) failed to reproduce the problem - it only
> causes clusters/databases to crash that were actually generated using the
> upthread mentioned script. it does NOT fail using a dump generated by a
> database that fails(!).
> So the issue must be a bit more complex and somehow relate to some prior
> stuff the script does.

Does it still crash if you compile with CFLAGS='-O0 -g' ? Could you
send a backtrace from that?

--
greg


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)sun(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 18:47:49
Message-ID: 4AEB34D5.2010202@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Greg Stark wrote:
> On Fri, Oct 30, 2009 at 11:22 AM, Stefan Kaltenbrunner
> <stefan(at)kaltenbrunner(dot)cc> wrote:
>> ok I now see why you (and I) failed to reproduce the problem - it only
>> causes clusters/databases to crash that were actually generated using the
>> upthread mentioned script. it does NOT fail using a dump generated by a
>> database that fails(!).
>> So the issue must be a bit more complex and somehow relate to some prior
>> stuff the script does.
>
> Does it still crash if you compile with CFLAGS='-O0 -g' ? Could you
> send a backtrace from that?

ok just assembled a new testcase from the querylog of the tool:

http://www.kaltenbrunner.cc/files/rand_gen_crash.sql

this crashes on i386 and results in something like:

ERROR: invalid memory alloc request size 8589934592

on an AMD64 host for me.

However the error seems to go away after an ANALYZE... so I wonder if
this is just another case of "if we missestimated the size of the
hashtable we are doomed"

Stefan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)sun(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 18:52:35
Message-ID: 14648.1256928755@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> ok just assembled a new testcase from the querylog of the tool:
> http://www.kaltenbrunner.cc/files/rand_gen_crash.sql

Ah, that works (or should I say fails). Will take a look.

> However the error seems to go away after an ANALYZE... so I wonder if
> this is just another case of "if we missestimated the size of the
> hashtable we are doomed"

Well, we shouldn't dump core in any case.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, "Bernt M(dot) Johnsen" <Bernt(dot)Johnsen(at)sun(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5145: Complex query with lots of LEFT JOIN causes segfault
Date: 2009-10-30 19:03:50
Message-ID: 14790.1256929430@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> However the error seems to go away after an ANALYZE... so I wonder if
> this is just another case of "if we missestimated the size of the
> hashtable we are doomed"

Actually, what seems to be happening on a 32-bit machine is that
ExecChooseHashTableSize sets nbatch = INT_MAX/2, and then when we
try to do

hashtable->outerBatchFile = (BufFile **)
palloc0(nbatch * sizeof(BufFile *));

the memory size calculation overflows to zero, so we get an empty
outerBatchFile array. So the fix is to make sure we limit the
number of batches to something sane, perhaps work_mem / sizeof(pointer).

regards, tom lane