From: | Ray Stell <stellr(at)cns(dot)vt(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Mr(dot) Dan" <bitsandbytes88(at)hotmail(dot)com>, pgsql-admin(at)postgresql(dot)org |
Subject: | Re: COPY FROM command v8.1.4 |
Date: | 2006-09-13 22:54:38 |
Message-ID: | 20060913225438.GB2690@cns.vt.edu |
Lists: | pgsql-admin |
You said "local Reiser FS." Maybe repeat the test on one of the other filesystems, Ext3 or JFS?
Tom asked about hardware issues. Is there nothing in syslog
that relates to the timing of the event? I don't recall you
responding in public to this. Maybe I missed it.
Just musing...
On Wed, Sep 13, 2006 at 12:50:26PM -0400, Tom Lane wrote:
> "Mr. Dan" <bitsandbytes88(at)hotmail(dot)com> writes:
> >> How are you doing the copies, exactly? SQL COPY command, psql \copy,
> >> something else?
>
> > We've tried SQL COPY and psql \copy and always get random results - 0, 1, or 2
> > blocks of 25 rows missing.
>
> Hmph. If it happens with a SQL COPY command then psql seems to be off
> the hook, and that also eliminates some theories about dropped TCP
> packets and such.
>
> Would you check back in the source table for the COPY and see what the
> ctid values are for the missing rows? I'm wondering about a pattern
> like "the dropped rows of a group are all on the same disk page", ie,
> what's being missed is one whole page at a time.
>
> If that's what's happening, the only very plausible theory I can think
> of is that your disk drive is sometimes glitching and returning a page
> of all-zeroes instead of what it should return. Postgres will not
> complain about this in normal operation (because there are legitimate
> error-recovery scenarios where a zero page can be in a table); it'll
> just treat the page as empty. VACUUM will complain though, so the next
> step would be to set up a test table by copying your large table and
> then repeatedly run plain VACUUM on the test table. If you get sporadic
> warnings "relation foo page N is uninitialized --- fixing" then we have
> the smoking gun. Don't run this test directly on a valuable table, as
> each such message would mean you just lost another page of data :-(
>
> FWIW, I spent several hours yesterday evening copying 6GB tables around
> to see if I could reproduce any such problem, and I couldn't...
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
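
A minimal sketch of the ctid check Tom describes, assuming the ctids of the missing rows have already been pulled from the source table as text values such as "(1234,7)" (block number first, then the row's position within that block). The function name and the sample values are hypothetical; the point is only to see whether the missing rows cluster on whole pages.

```python
from collections import defaultdict


def group_ctids_by_page(ctids):
    """Group ctid strings such as "(1234,7)" by their block (page) number.

    Returns a dict mapping block number -> sorted list of row offsets.
    """
    pages = defaultdict(list)
    for ctid in ctids:
        # Drop the surrounding parentheses, then split "block,offset".
        block, offset = ctid.strip("() ").split(",")
        pages[int(block)].append(int(offset))
    return {block: sorted(offsets) for block, offsets in pages.items()}


# Hypothetical ctids of rows present in the source table but missing
# from the copy; real values would come from the source table itself.
missing = ["(1234,1)", "(1234,2)", "(1234,3)", "(5678,1)", "(5678,2)"]

for block, offsets in sorted(group_ctids_by_page(missing).items()):
    print(f"page {block}: {len(offsets)} missing rows, offsets {offsets}")
```

If each group of missing rows shares a single page number, that would fit the "one whole page at a time" pattern; rows scattered across many pages would point elsewhere.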
--