Re: COPY FROM command v8.1.4

Lists: pgsql-admin
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "theman" <bitsandbytes88(at)hotmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: COPY FROM command v8.1.4
Date: 2006-09-22 21:59:31
Message-ID: 29595.1158962371@sss.pgh.pa.us

"theman" <bitsandbytes88(at)hotmail(dot)com> writes:
> lseek(10, 0, SEEK_END) = 913072128
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
> lseek(10, 0, SEEK_END) = 913080320
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
> lseek(10, 0, SEEK_END) = 913088512
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
> lseek(10, 0, SEEK_END) = 913088512
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
> lseek(10, 0, SEEK_END) = 913096704
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192

Boy, that looks like a smoking gun to me. We're dealing with a kernel
bug.

What's the platform here, including the exact kernel version? What
filesystem are you storing the database on?

regards, tom lane


From: Ray Stell <stellr(at)cns(dot)vt(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: theman <bitsandbytes88(at)hotmail(dot)com>, pgsql-admin(at)postgresql(dot)org
Subject: Re: COPY FROM command v8.1.4
Date: 2006-09-23 17:20:59
Message-ID: 20060923172059.GB4831@cns.vt.edu


Curious how you get to an OS kernel bug from these writes? Aren't we
missing the associated db reads in this trace data? Maybe there has been
other data provided off the list? Would love to see it.

Thanks.



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ray Stell <stellr(at)cns(dot)vt(dot)edu>
Cc: theman <bitsandbytes88(at)hotmail(dot)com>, pgsql-admin(at)postgresql(dot)org
Subject: Re: COPY FROM command v8.1.4
Date: 2006-09-23 17:41:00
Message-ID: 26301.1159033260@sss.pgh.pa.us

Ray Stell <stellr(at)cns(dot)vt(dot)edu> writes:
> Curious how you get to an OS kernel bug from these writes?

Note the two successive lseeks returning the same result (913088512).
With a write() of 8192 bytes in between that should have extended the
file, that is clearly the wrong answer.
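
For anyone who wants to run the same check over a longer strace log, here
is a rough sketch in Python. It is not a PostgreSQL or strace tool; the
function name find_stalled_extensions and the regular expressions are made
up for illustration. It assumes every write() lands at the end of the file
(as in the trace above) and that nothing truncates the file in between. It
flags any lseek(fd, 0, SEEK_END) whose result is smaller than the previous
end-of-file plus the bytes written since.

```python
import re

# Patterns for the two strace line shapes seen in the trace above.
LSEEK_RE = re.compile(r"lseek\((\d+), 0, SEEK_END\)\s*=\s*(\d+)")
WRITE_RE = re.compile(r"write\((\d+), .*\)\s*=\s*(\d+)\s*$")


def find_stalled_extensions(trace_lines):
    """Return (previous_end, bytes_written, observed_end) for each
    lseek(SEEK_END) that fails to reflect the writes since the last one.

    Assumption: writes are appends, so end-of-file should grow by at
    least the number of bytes written.
    """
    last_end = {}  # fd -> end-of-file reported by the previous lseek
    written = {}   # fd -> bytes written since that lseek
    anomalies = []
    for line in trace_lines:
        m = LSEEK_RE.search(line)
        if m:
            fd, end = int(m.group(1)), int(m.group(2))
            if fd in last_end and written.get(fd, 0) > 0:
                if end < last_end[fd] + written[fd]:
                    anomalies.append((last_end[fd], written[fd], end))
            last_end[fd] = end
            written[fd] = 0
            continue
        m = WRITE_RE.search(line)
        if m:
            fd, count = int(m.group(1)), int(m.group(2))
            written[fd] = written.get(fd, 0) + count
    return anomalies


# The trace from this thread, with the write buffers shortened.
SAMPLE_TRACE = [
    r'lseek(10, 0, SEEK_END) = 913072128',
    r'write(10, "\0\0\0"..., 8192) = 8192',
    r'lseek(10, 0, SEEK_END) = 913080320',
    r'write(10, "\0\0\0"..., 8192) = 8192',
    r'lseek(10, 0, SEEK_END) = 913088512',
    r'write(10, "\0\0\0"..., 8192) = 8192',
    r'lseek(10, 0, SEEK_END) = 913088512',
    r'write(10, "\0\0\0"..., 8192) = 8192',
    r'lseek(10, 0, SEEK_END) = 913096704',
]

if __name__ == "__main__":
    # Prints [(913088512, 8192, 913088512)]
    print(find_stalled_extensions(SAMPLE_TRACE))
```

The single hit is the fourth lseek: the file should have ended at
913096704 after the third write, but the kernel still reported 913088512.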

Subsequent discussion reveals that Dan is running

> Linux pike 2.6.5-7.244-smp #1 SMP Mon Dec 12 18:32:25 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux
> SLES 9 suse linux enterprise 9 service pack 3

which is probably overdue for an update ...

regards, tom lane


From: "theman" <bitsandbytes88(at)hotmail(dot)com>
To: <pgsql-admin(at)postgresql(dot)org>
Subject: Re: COPY FROM command v8.1.4
Date: 2006-09-26 14:46:04
Message-ID: BAY116-DAV15FEE1017E1AE200A7A2B9D1250@phx.gbl


Thanks a million Tom!

SLES support helped us upgrade the Linux kernel on our SLES 9 SP3 system
from 2.6.5-7.244 to 2.6.5-7.282. Since then we haven't had any blocks of
rows being rewritten or blanked out by the kernel. The new kernel is
handling the writes much better.
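
For anyone wanting to sanity-check their own filesystem the same way, a
minimal sketch in Python follows. It repeats the lseek/write pattern from
the trace against a scratch file and reports any step where end-of-file
did not advance by the number of bytes written. The function name
check_file_extension and the default block count are made up for
illustration. A clean run does not prove the kernel is free of the bug,
since the conditions that triggered the original failure are not known
from this thread.

```python
import os
import tempfile

BLOCK = 8192  # same write size as in the trace (one 8 kB page)


def check_file_extension(directory, blocks=64):
    """Append zero-filled blocks to a scratch file in `directory` and
    return a list of (expected_end, observed_end) mismatches."""
    mismatches = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(blocks):
            before = os.lseek(fd, 0, os.SEEK_END)
            written = os.write(fd, b"\0" * BLOCK)
            after = os.lseek(fd, 0, os.SEEK_END)
            if after != before + written:
                mismatches.append((before + written, after))
    finally:
        os.close(fd)
        os.unlink(path)
    return mismatches


if __name__ == "__main__":
    # Point this at a directory on the filesystem that holds the database.
    print(check_file_extension(tempfile.gettempdir()))
```

An empty list means every lseek reflected the preceding write.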
