Re: Strange pgsql crash on MacOSX

Lists: pgsql-hackers
From: Shane Ambler <pgsql(at)007Marketing(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Strange pgsql crash on MacOSX
Date: 2006-12-23 02:47:27
Message-ID: 458C98BF.8080101@007Marketing.com

I have a dual G4 1.25 GHz with 2 GB RAM running Mac OS X 10.4.8 and
PostgreSQL 8.2.0.

This first happened to me today, and with everything I have tried it
always happens now - it had been running fine before.

The only thing I can think of that has changed in the last few days is I
have installed the last 2 security updates from Apple and the X11 update
(X11 1.1.3) that Apple released a while ago -

http://www.apple.com/support/downloads/securityupdate2006008ppc.html
http://www.apple.com/support/downloads/securityupdate20060071048clientppc.html

I can't see the first one having anything to do with postgres, as I
believe it only updates Java. The other one updates a few different areas
and may be the culprit.

I can't think of anything else I have changed just recently - certainly
not in the last couple of days.

To test and try to track down the cause, I restarted my machine, then
unzipped the released 8.2.0 source and ran the following steps. This
example uses clean data files and everything default; the startup script
has been there a while, and using pg_ctl instead makes no difference.
make check passes all tests -

./configure --prefix=/usr/local/pgsql
make check
sudo make install
cd /usr/local/pgsql
sudo mkdir data
sudo chown pgsql:pgsql data
sudo chmod 700 data
sudo -u pgsql /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
sudo /Library/StartupItems/PostgreSQL/PostgreSQL start

Then I get the following -
[devbox:~] shane% psql
Welcome to psql 8.2.0, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

postgres=# \q
psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
for freed object - object was probably modified after being freed, break
at szone_error to debug
psql(24931) malloc: *** set a breakpoint in szone_error to debug
Segmentation fault
[devbox:~] shane%

The serverlog gives me -
[devbox:local/pgsql/data] root# cat serverlog
LOG: database system was shut down at 2006-12-23 12:27:44 CST
LOG: checkpoint record is at 0/42BEB8
LOG: redo record is at 0/42BEB8; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 0/593; next OID: 10820
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system is ready

Apple's crashreporter gives me -

Date/Time: 2006-12-23 12:28:21.499 +1030
OS Version: 10.4.8 (Build 8L127)
Report Version: 4

Command: psql
Path: /usr/local/pgsql/bin/psql
Parent: tcsh [294]

Version: ??? (???)

PID: 24931
Thread: 0

Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0x3430616b

Thread 0 Crashed:
0 libSystem.B.dylib 0x90006cd8 szone_free + 3148
1 libSystem.B.dylib 0x900152d0 fclose + 176
2 libedit.2.dylib 0x96b5c334 history_end + 1632
3 libedit.2.dylib 0x96b5c7bc history + 468
4 libedit.2.dylib 0x96b5ec58 write_history + 84
5 psql 0x00008350 saveHistory + 208
6 psql 0x00008428 finishInput + 120
7 libSystem.B.dylib 0x90014578 __cxa_finalize + 260
8 libSystem.B.dylib 0x90014444 exit + 36
9 psql 0x00001d00 _start + 764
10 psql 0x00001a00 start + 48

Thread 0 crashed with PPC Thread State 64:
srr0: 0x0000000090006cd8 srr1: 0x000000000000d030
vrsave: 0x0000000000000000
cr: 0x42002444 xer: 0x0000000020000001 lr: 0x0000000090006ca4 ctr: 0x00000000900143a0
r0: 0x0000000090006ca4 r1: 0x00000000bffff610 r2: 0x0000000042002442 r3: 0x000000000000000d
r4: 0x0000000000000000 r5: 0x000000000000000d r6: 0x0000000080808080 r7: 0x0000000000000003
r8: 0x0000000039333100 r9: 0x00000000bffff545 r10: 0x0000000000000000 r11: 0x0000000042002442
r12: 0x00000000900143a0 r13: 0x0000000000000000 r14: 0x0000000000000000 r15: 0x0000000000000000
r16: 0x0000000000000000 r17: 0x0000000000000052 r18: 0x0000000000000400 r19: 0x0000000000000054
r20: 0x00000000020000a4 r21: 0x000000000180a800 r22: 0x00000000a0001fac r23: 0x00000000020000a8
r24: 0x0000000000000002 r25: 0x0000000000000002 r26: 0x0000000000000001 r27: 0x0000000034306167
r28: 0x0000000001800000 r29: 0x000000000180a400 r30: 0x000000002e616767 r31: 0x00000000900060a0

Binary Images Description:
0x1000 - 0x36fff psql /usr/local/pgsql/bin/psql
0x3f000 - 0x54fff libpq.5.dylib /usr/local/pgsql/lib/libpq.5.dylib
0x8fe00000 - 0x8fe51fff dyld 45.3 /usr/lib/dyld
0x90000000 - 0x901bcfff libSystem.B.dylib /usr/lib/libSystem.B.dylib
0x90214000 - 0x90219fff libmathCommon.A.dylib /usr/lib/system/libmathCommon.A.dylib
0x9110f000 - 0x9111dfff libz.1.dylib /usr/lib/libz.1.dylib
0x969c3000 - 0x969f1fff libncurses.5.4.dylib /usr/lib/libncurses.5.4.dylib
0x96b4d000 - 0x96b63fff libedit.2.dylib /usr/lib/libedit.2.dylib

Model: PowerMac3,6, BootROM 4.4.8f2, 2 processors, PowerPC G4 (3.2),
1.25 GHz, 2 GB
Graphics: NVIDIA GeForce4 MX, GeForce4 MX, AGP, 32 MB
Memory Module: DIMM0/J21, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM1/J22, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM2/J23, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM3/J20, 512 MB, DDR SDRAM, PC2600U-25330
AirPort: AirPort, 9.52
Network Service: Built-in Ethernet, Ethernet, en0
PCI Card: pci-bridge, pci, SLOT-3
PCI Card: firewire, ieee1394, 1x0
PCI Card: usb, usb, 1x1
PCI Card: usb, usb, 1x1
PCI Card: pci167e,225a, , 1x1
Parallel ATA Device: LITE-ON DVD SOHD-167T,
Parallel ATA Device: WDC WD1200JB-00FUA0, 111.79 GB
Parallel ATA Device: IBM-IC35L120AVVA07-0, 115.04 GB
USB Device: Apple Pro Keyboard, Mitsumi Electric, Up to 1.5 Mb/sec, 500 mA
USB Device: i350, Canon, Up to 12 Mb/sec, 500 mA
FireWire Device: unknown_device, unknown_value, Up to 400 Mb/sec

--

Shane Ambler
pgSQL(at)007Marketing(dot)com

Get Sheeky @ http://Sheeky.Biz


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shane Ambler <pgsql(at)007Marketing(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange pgsql crash on MacOSX
Date: 2006-12-23 04:20:19
Message-ID: 6585.1166847619@sss.pgh.pa.us

Shane Ambler <pgsql(at)007Marketing(dot)com> writes:
> postgres=# \q
> psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
> for freed object - object was probably modified after being freed, break
> at szone_error to debug
> psql(24931) malloc: *** set a breakpoint in szone_error to debug
> Segmentation fault

I think we've seen something like this before in connection with
readline/libedit follies. Does the crash go away if you invoke
psql with the "-n" option? If so, exactly which version of readline or
libedit are you using?

FWIW, I do not see this on a fully up-to-date 10.4.8 G4 laptop.
I see

$ ls -l /usr/lib/libedit*
-rwxr-xr-x 1 root wheel 112404 Sep 29 20:59 /usr/lib/libedit.2.dylib
lrwxr-xr-x 1 root wheel 15 Apr 26 2006 /usr/lib/libedit.dylib -> libedit.2.dylib
$

so it seems that Apple did update libedit not too long ago ...

regards, tom lane


From: Shane Ambler <pgsql(at)007Marketing(dot)com>
To: Shane Ambler <pgsql(at)007Marketing(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange pgsql crash on MacOSX
Date: 2006-12-23 11:07:30
Message-ID: 458D0DF2.3030804@007Marketing.com

Shane Ambler wrote:
> Tom Lane wrote:
>> Shane Ambler <pgsql(at)007Marketing(dot)com> writes:
>>> postgres=# \q
>>> psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
>>> for freed object - object was probably modified after being freed, break
>>> at szone_error to debug
>>> psql(24931) malloc: *** set a breakpoint in szone_error to debug
>>> Segmentation fault
>>
>> I think we've seen something like this before in connection with
>> readline/libedit follies. Does the crash go away if you invoke
>> psql with the "-n" option? If so, exactly which version of readline or
>> libedit are you using?

>
> psql -n stops the error.
>

I just found out the problem.

psql_history - I had tried to copy from a text file earlier that was
utf8 and came up with some errors. I guess these got into the history
file and stuffed it up.

Renamed it so it created a new one and all is fine now.

--

Shane Ambler
pgSQL(at)007Marketing(dot)com

Get Sheeky @ http://Sheeky.Biz


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shane Ambler <pgsql(at)007Marketing(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Strange pgsql crash on MacOSX
Date: 2006-12-23 16:25:15
Message-ID: 12706.1166891115@sss.pgh.pa.us

Shane Ambler <pgsql(at)007Marketing(dot)com> writes:
> I just found out the problem.
> psql_history - I had tried to copy from a text file earlier that was
> utf8 and came up with some errors. I guess these got into the history
> file and stuffed it up.

Hm, so the question is: is it our bug or Apple's? If you kept the
busted history file, would you be willing to send me a copy?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shane Ambler <pgsql(at)007Marketing(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Strange pgsql crash on MacOSX
Date: 2006-12-28 18:23:36
Message-ID: 21538.1167330216@sss.pgh.pa.us

Shane Ambler <pgsql(at)007Marketing(dot)com> writes:
> Tom Lane wrote:
>> Hm, so the question is: is it our bug or Apple's? If you kept the
>> busted history file, would you be willing to send me a copy?

> The zip file attached has the psql_history file that crashes when
> quitting but doesn't appear to contain the steps I did when it first
> crashed.

So the answer is: it's Apple's bug, or at least not ours. libedit
contains a typo that causes it to potentially fail when saving strings
exceeding 256 bytes. Check out this code (around line 730 in history.c):

    len = strlen(ev.str) * 4;
    if (len >= max_size) {
        char *nptr;
        max_size = (len + 1023) & 1023;
        nptr = h_realloc(ptr, max_size);

I think the intent of the max_size recalculation is to select the next
1K boundary larger than "len", but it actually produces a number *less*
than 1K. Probably "(len + 1023) & ~1023" was meant ... but even that
is wrong if len is exactly a multiple of 1024, because it will fail to
round up. So the buffer is realloc'd too small, and that results in
a potential memory clobber if the history entry is less than 1K, and a
guaranteed clobber if it's more.
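
To make the arithmetic concrete, here is a minimal C sketch of the three
roundings discussed above. It is not libedit code: the function names are
invented for illustration, and the third variant is only one possible way
to round strictly above "len", assumed here rather than taken from any
libedit release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The expression as quoted from history.c: masking with 1023 keeps only
 * the low 10 bits, so the result is always in the range 0..1023. */
static size_t grow_as_shipped(size_t len)
{
    return (len + 1023) & 1023;
}

/* The probable intent: round up to a multiple of 1024. When len is
 * already a multiple of 1024 this returns len itself, leaving no
 * spare byte beyond len. */
static size_t grow_round_up(size_t len)
{
    return (len + 1023) & ~(size_t)1023;
}

/* Hypothetical variant (an assumption for illustration): always returns
 * a multiple of 1024 that is strictly greater than len. */
static size_t grow_strictly_above(size_t len)
{
    assert(len <= SIZE_MAX - 1024);   /* guard against wraparound */
    return (len + 1024) & ~(size_t)1023;
}
```

For a 256-byte history entry, len is 1024: the shipped expression yields
1023, the "& ~1023" form yields 1024, and the third form yields 2048. For
len = 1028 the shipped expression yields just 3, while the other two both
yield 2048.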

The source code available from Apple shows that they got this code from
NetBSD originally

/* $NetBSD: history.c,v 1.25 2003/10/18 23:48:42 christos Exp $ */

so this may well be a pretty generic *BSD bug. Anyone clear on who to
report it to? I have no idea if libedit is an independent project...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shane Ambler <pgsql(at)007Marketing(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Strange pgsql crash on MacOSX
Date: 2006-12-28 23:36:18
Message-ID: 26341.1167348978@sss.pgh.pa.us

I wrote:
> The source code available from Apple shows that they got this code from
> NetBSD originally
> /* $NetBSD: history.c,v 1.25 2003/10/18 23:48:42 christos Exp $ */
> so this may well be a pretty generic *BSD bug. Anyone clear on who to
> report it to? I have no idea if libedit is an independent project...

Some digging in the NetBSD CVS shows that they found both parts of this
bug more than two years ago:

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libedit/history.c.diff?r1=1.25&r2=1.27&f=h

so the short and sweet answer is that Apple is behind the times.

regards, tom lane