Re: Trouble with COPY IN

Lists: pgsql-hackerspgsql-jdbc
From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Trouble with COPY IN
Date: 2010-07-16 16:56:52
Message-ID: alpine.DEB.2.00.1007161753160.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc


I'm having some trouble with a COPY IN BINARY statement. I have the
following stack trace:

Caused by: java.sql.SQLException: Error writing to database, running
statement COPY InterMineObject (OBJECT, class, tableclass, longObjType,
fullTime, dateObjType, floatType, longType, booleanType, booleanObjType,
title, bigDecimalObjType, stringObjType, doubleObjType, intObjType, name,
id, doubleType, shortObjType, rangeStart, intermine_end, seniority,
floatObjType, intType, shortType, vatNumber, debt, salary, rangeEnd,
address, age, addressId, bankId, companyId, departmentId,
businessAddressId, managerId, personalAddressId, parentId,
departmentThatRejectedMeId, CEOId) FROM STDIN BINARY
at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:56)
at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:456)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.postgresql.util.PSQLException: Tried to end inactive copy
at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:803)
at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:61)
at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:161)
at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)

I have had a brief look at the JDBC driver code, and as far as I can see,
this shouldn't ever happen. All I did was call CopyManager.copyIn(). It
should have kept the correct locks to complete the operation.

For some reason, this seems to exhibit with some data but not with others.

Any clues would be very welcome.

Matthew

--
First law of computing: Anything can go wro
sig: Segmentation fault. core dumped.


From: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-16 17:32:01
Message-ID: AANLkTim0dNU58e_Zg3OGTlylkYVAA9rtyhnb-xfzyLTl@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Could it be that COPY is erroring out, and then the driver tries to
end it anyway, causing the error? Anything interesting in the
PostgreSQL logs? Is it consistently reproducible with a specific data
set?
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 230
Foster City, CA 94404
(650) 242-3500 Main
(650) 242-3501 F
www.truviso.com

On Fri, Jul 16, 2010 at 9:56 AM, Matthew Wakeling <matthew(at)flymine(dot)org> wrote:
>
> I'm having some trouble with a COPY IN BINARY statement. I have the
> following stack trace:
>
> Caused by: java.sql.SQLException: Error writing to database, running
> statement COPY InterMineObject (OBJECT, class, tableclass, longObjType,
> fullTime, dateObjType, floatType, longType, booleanType, booleanObjType,
> title, bigDecimalObjType, stringObjType, doubleObjType, intObjType, name,
> id, doubleType, shortObjType, rangeStart, intermine_end, seniority,
> floatObjType, intType, shortType, vatNumber, debt, salary, rangeEnd,
> address, age, addressId, bankId, companyId, departmentId, businessAddressId,
> managerId, personalAddressId, parentId, departmentThatRejectedMeId, CEOId)
> FROM STDIN BINARY
>  at
> org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:56)
>  at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:456)
>  at java.lang.Thread.run(Thread.java:619)
> Caused by: org.postgresql.util.PSQLException: Tried to end inactive copy
>  at
> org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:803)
>  at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:61)
>  at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
>  at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:161)
>  at
> org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)
>
> I have had a brief look at the JDBC driver code, and as far as I can see,
> this shouldn't ever happen. All I did was call CopyManager.copyIn(). It
> should have kept the correct locks to complete the operation.
>
> For some reason, this seems to exhibit with some data but not with others.
>
> Any clues would be very welcome.
>
> Matthew
>
> --
> First law of computing:  Anything can go wro
> sig: Segmentation fault.  core dumped.
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc
>


From: Kris Jurka <books(at)ejurka(dot)com>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: Matthew Wakeling <matthew(at)flymine(dot)org>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-16 17:41:13
Message-ID: alpine.BSO.2.00.1007161336490.32040@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 16 Jul 2010, Maciek Sakrejda wrote:

> Could it be that COPY is erroring out, and then the driver tries to
> end it anyway, causing the error?

That seems likely. What's probably necessary to reproduce this is a
failure at the end of the data stream. An early error will be received
and processed normally, but an error at the end will be buffered and the
driver will think everything succeeded and try to end the copy not knowing
that it has already failed. I think we'd need to call the blocking
version of processCopyResults prior to endCopy to avoid this misleading
error.

Kris Jurka


From: Kris Jurka <books(at)ejurka(dot)com>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: Matthew Wakeling <matthew(at)flymine(dot)org>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-16 18:06:21
Message-ID: alpine.BSO.2.00.1007161401530.6634@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 16 Jul 2010, Kris Jurka wrote:

> That seems likely. What's probably necessary to reproduce this is a
> failure at the end of the data stream. An early error will be received
> and processed normally, but an error at the end will be buffered and the
> driver will think everything succeeded and try to end the copy not
> knowing that it has already failed. I think we'd need to call the
> blocking version of processCopyResults prior to endCopy to avoid this
> misleading error.
>

Actually with a little more thinking, what I said was completely bogus.

1) Blocking won't fix anything because in the success path, there's no
message to receive, so there's nothing to process.

2) The protocol docs say that after an error, CopyData,
CopyDone, and CopyFail messages are ignored, so they wouldn't be causing
this error.

So some more thought is needed here...

Kris Jurka


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-19 11:35:16
Message-ID: alpine.DEB.2.00.1007191234360.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 16 Jul 2010, Maciek Sakrejda wrote:
> Could it be that COPY is erroring out, and then the driver tries to
> end it anyway, causing the error? Anything interesting in the
> PostgreSQL logs?

There is nothing interesting in the Postgres logs at all to explain what
is going on. I have ERROR statements when I try to drop a table that isn't
there, but nothing when this COPY bugs out.

> Is it consistently reproducible with a specific data set?

>> For some reason, this seems to exhibit with some data but not with
>> others.

Yes, it seems to break with specific data. Particularly, it works with our
COPY tests, but it doesn't work with the higher-level tests.

So, the situation is that I have a database writing interface that I wrote
about six years ago, and we have been using a patched six-year-old
Postgres JDBC driver for ages. I have just updated to the newest JDBC
driver (which doesn't need to be patched for COPY), and I am now getting
this error. Could the binary format have changed slightly? (I haven't
changed my copy of Postgres when upgrading the JDBC driver - it is 8.4.4).

Thanks,

Matthew

--
There are only two kinds of programming languages: those people always
bitch about and those nobody uses. (Bjarne Stroustrup)


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Maciek Sakrejda <msakrejda(at)truviso(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-19 13:22:52
Message-ID: alpine.DEB.2.00.1007191421110.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 16 Jul 2010, Kris Jurka wrote:
> 2) The protocol docs say that after an error, CopyData, CopyDone, and
> CopyFail messages are ignored, so they wouldn't be causing this error.

The exception I receive isn't produced by the server. It is caused by the
client losing its own lock on the database connection. Surely the place
where the lock gets thrown away would be caused by a separate server
error, and should have been reported?

Matthew

--
So, given 'D' is undeclared too, with a default of zero, C++ is equal to D.
mnw21, commenting on the "Surely the value of C++ is zero, but C is now 1"
response to "No, C++ isn't equal to D. 'C' is undeclared [...] C++ should
really be called 1" response to "C++ -- shouldn't it be called D?"


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-19 14:11:06
Message-ID: alpine.DEB.2.00.1007191455550.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 16 Jul 2010, Maciek Sakrejda wrote:
> Could it be that COPY is erroring out, and then the driver tries to
> end it anyway, causing the error? Anything interesting in the
> PostgreSQL logs? Is it consistently reproducible with a specific data
> set?

It seems that the COPY command only fails with larger data sets. If I
split a binary lump into several parts which are otherwise identical, then
it works perfectly, where the complete binary lump doesn't work.

I have isolated a reasonably small example that fails. The table
definition is table1 (int col1, int col2). The following is an exception
that is thrown, containing the full hex dump of the binary COPY, for 583
rows inserted into that table. As far as I can see, there is nothing wrong
with the binary format. If I reduce the size to make it send smaller
chunks, then the error does not occur.

java.sql.SQLException: Error writing to database, running statement COPY
table1 (col1, col2) FROM STDIN BINARY, size = 10515, data =
5047434F50590AFF 0D0A000000000000 0000000002000000 04000001A2000000
0400000247000200 000004000001A300 0000040000024600 0200000004000001
A400000004000002 4500020000000400 0001A50000000400 0002440002000000
04000001A6000000 0400000243000200 000004000001A700 0000040000024200
0200000004000001 A800000004000002 4100020000000400 0001A90000000400
0002400002000000 04000001AA000000 040000023F000200 000004000001AB00
0000040000023E00 0200000004000001 AC00000004000002 3D00020000000400
0001AD0000000400 00023C0002000000 04000001AE000000 040000023B000200
000004000001AF00 0000040000023A00 0200000004000001 B000000004000002
3900020000000400 0001B10000000400 0002380002000000 04000001B2000000
0400000237000200 000004000001B300 0000040000023600 0200000004000001
B400000004000002 3500020000000400 0001B50000000400 0002340002000000
04000001B6000000 0400000233000200 000004000001B700 0000040000023200
0200000004000001 B800000004000002 3100020000000400 0001B90000000400
0002300002000000 04000001BA000000 040000022F000200 000004000001BB00
0000040000022E00 0200000004000001 BC00000004000002 2D00020000000400
0001BD0000000400 00022C0002000000 04000001BE000000 040000022B000200
000004000001BF00 0000040000022A00 0200000004000001 C000000004000002
2900020000000400 0001C10000000400 0002280002000000 04000001C2000000
0400000227000200 000004000001C300 0000040000022600 0200000004000001
C400000004000002 2500020000000400 0001C50000000400 0002240002000000
04000001C6000000 0400000223000200 000004000001C700 0000040000022200
0200000004000001 C800000004000002 2100020000000400 0001C90000000400
0002200002000000 04000001CA000000 040000021F000200 000004000001CB00
0000040000021E00 0200000004000001 CC00000004000002 1D00020000000400
0001CD0000000400 00021C0002000000 04000001CE000000 040000021B000200
000004000001CF00 0000040000021A00 0200000004000001 D000000004000002
1900020000000400 0001D10000000400 0002180002000000 04000001D2000000
0400000217000200 000004000001D300 0000040000021600 0200000004000001
D400000004000002 1500020000000400 0001D50000000400 0002140002000000
04000001D6000000 0400000213000200 000004000001D700 0000040000021200
0200000004000001 D800000004000002 1100020000000400 0001D90000000400
0002100002000000 04000001DA000000 040000020F000200 000004000001DB00
0000040000020E00 0200000004000001 DC00000004000002 0D00020000000400
0001DD0000000400 00020C0002000000 04000001DE000000 040000020B000200
000004000001DF00 0000040000020A00 0200000004000001 E000000004000002
0900020000000400 0001E10000000400 0002080002000000 04000001E2000000
0400000207000200 000004000001E300 0000040000020600 0200000004000001
E400000004000002 0500020000000400 0001E50000000400 0002040002000000
04000001E6000000 0400000203000200 000004000001E700 0000040000020200
0200000004000001 E800000004000002 0100020000000400 0001E90000000400
0002000002000000 04000001EA000000 04000001FF000200 000004000001EB00
000004000001FE00 0200000004000001 EC00000004000001 FD00020000000400
0001ED0000000400 0001FC0002000000 04000001EE000000 04000001FB000200
000004000001EF00 000004000001FA00 0200000004000001 F000000004000001
F900020000000400 0001F10000000400 0001F80002000000 04000001F2000000
04000001F7000200 000004000001F300 000004000001F600 0200000004000001
F400000004000001 F500020000000400 0001F50000000400 0001F40002000000
04000001F6000000 04000001F3000200 000004000001F700 000004000001F200
0200000004000001 F800000004000001 F100020000000400 0001F90000000400
0001F00002000000 04000001FA000000 04000001EF000200 000004000001FB00
000004000001EE00 0200000004000001 FC00000004000001 ED00020000000400
0001FD0000000400 0001EC0002000000 04000001FE000000 04000001EB000200
000004000001FF00 000004000001EA00 0200000004000002 0000000004000001
E900020000000400 0002010000000400 0001E80002000000 0400000202000000
04000001E7000200 0000040000020300 000004000001E600 0200000004000002
0400000004000001 E500020000000400 0002050000000400 0001E40002000000
0400000206000000 04000001E3000200 0000040000020700 000004000001E200
0200000004000002 0800000004000001 E100020000000400 0002090000000400
0001E00002000000 040000020A000000 04000001DF000200 0000040000020B00
000004000001DE00 0200000004000002 0C00000004000001 DD00020000000400
00020D0000000400 0001DC0002000000 040000020E000000 04000001DB000200
0000040000020F00 000004000001DA00 0200000004000002 1000000004000001
D900020000000400 0002110000000400 0001D80002000000 0400000212000000
04000001D7000200 0000040000021300 000004000001D600 0200000004000002
1400000004000001 D500020000000400 0002150000000400 0001D40002000000
0400000216000000 04000001D3000200 0000040000021700 000004000001D200
0200000004000002 1800000004000001 D100020000000400 0002190000000400
0001D00002000000 040000021A000000 04000001CF000200 0000040000021B00
000004000001CE00 0200000004000002 1C00000004000001 CD00020000000400
00021D0000000400 0001CC0002000000 040000021E000000 04000001CB000200
0000040000021F00 000004000001CA00 0200000004000002 2000000004000001
C900020000000400 0002210000000400 0001C80002000000 0400000222000000
04000001C7000200 0000040000022300 000004000001C600 0200000004000002
2400000004000001 C500020000000400 0002250000000400 0001C40002000000
0400000226000000 04000001C3000200 0000040000022700 000004000001C200
0200000004000002 2800000004000001 C100020000000400 0002290000000400
0001C00002000000 040000022A000000 04000001BF000200 0000040000022B00
000004000001BE00 0200000004000002 2C00000004000001 BD00020000000400
00022D0000000400 0001BC0002000000 040000022E000000 04000001BB000200
0000040000022F00 000004000001BA00 0200000004000002 3000000004000001
B900020000000400 0002310000000400 0001B80002000000 0400000232000000
04000001B7000200 0000040000023300 000004000001B600 0200000004000002
3400000004000001 B500020000000400 0002350000000400 0001B40002000000
0400000236000000 04000001B3000200 0000040000023700 000004000001B200
0200000004000002 3800000004000001 B100020000000400 0002390000000400
0001B00002000000 040000023A000000 04000001AF000200 0000040000023B00
000004000001AE00 0200000004000002 3C00000004000001 AD00020000000400
00023D0000000400 0001AC0002000000 040000023E000000 04000001AB000200
0000040000023F00 000004000001AA00 0200000004000002 4000000004000001
A900020000000400 0002410000000400 0001A80002000000 0400000242000000
04000001A7000200 0000040000024300 000004000001A600 0200000004000002
4400000004000001 A500020000000400 0002450000000400 0001A40002000000
0400000246000000 04000001A3000200 0000040000024700 000004000001A200
0200000004000002 4800000004000001 A100020000000400 0002490000000400
0001A00002000000 040000024A000000 040000019F000200 0000040000024B00
0000040000019E00 0200000004000002 4C00000004000001 9D00020000000400
00024D0000000400 00019C0002000000 040000024E000000 040000019B000200
0000040000024F00 0000040000019A00 0200000004000002 5000000004000001
9900020000000400 0002510000000400 0001980002000000 0400000252000000
0400000197000200 0000040000025300 0000040000019600 0200000004000002
5400000004000001 9500020000000400 0002550000000400 0001940002000000
0400000256000000 0400000193000200 0000040000025700 0000040000019200
0200000004000002 5800000004000001 9100020000000400 0002590000000400
0001900002000000 040000025A000000 040000018F000200 0000040000025B00
0000040000018E00 0200000004000002 5C00000004000001 8D00020000000400
00025D0000000400 00018C0002000000 040000025E000000 040000018B000200
0000040000025F00 0000040000018A00 0200000004000002 6000000004000001
8900020000000400 0002610000000400 0001880002000000 0400000262000000
0400000187000200 0000040000026300 0000040000018600 0200000004000002
6400000004000001 8500020000000400 0002650000000400 0001840002000000
0400000266000000 0400000183000200 0000040000026700 0000040000018200
0200000004000002 6800000004000001 8100020000000400 0002690000000400
0001800002000000 040000026A000000 040000017F000200 0000040000026B00
0000040000017E00 0200000004000002 6C00000004000001 7D00020000000400
00026D0000000400 00017C0002000000 040000026E000000 040000017B000200
0000040000026F00 0000040000017A00 0200000004000002 7000000004000001
7900020000000400 0002710000000400 0001780002000000 0400000272000000
0400000177000200 0000040000027300 0000040000017600 0200000004000002
7400000004000001 7500020000000400 0002750000000400 0001740002000000
0400000276000000 0400000173000200 0000040000027700 0000040000017200
0200000004000002 7800000004000001 7100020000000400 0002790000000400
0001700002000000 040000027A000000 040000016F000200 0000040000027B00
0000040000016E00 0200000004000002 7C00000004000001 6D00020000000400
00027D0000000400 00016C0002000000 040000027E000000 040000016B000200
0000040000027F00 0000040000016A00 0200000004000002 8000000004000001
6900020000000400 0002810000000400 0001680002000000 0400000282000000
0400000167000200 0000040000028300 0000040000016600 0200000004000002
8400000004000001 6500020000000400 0002850000000400 0001640002000000
0400000286000000 0400000163000200 0000040000028700 0000040000016200
0200000004000002 8800000004000001 6100020000000400 0002890000000400
0001600002000000 040000028A000000 040000015F000200 0000040000028B00
0000040000015E00 0200000004000002 8C00000004000001 5D00020000000400
00028D0000000400 00015C0002000000 040000028E000000 040000015B000200
0000040000028F00 0000040000015A00 0200000004000002 9000000004000001
5900020000000400 0002910000000400 0001580002000000 0400000292000000
0400000157000200 0000040000029300 0000040000015600 0200000004000002
9400000004000001 5500020000000400 0002950000000400 0001540002000000
0400000296000000 0400000153000200 0000040000029700 0000040000015200
0200000004000002 9800000004000001 5100020000000400 0002990000000400
0001500002000000 040000029A000000 040000014F000200 0000040000029B00
0000040000014E00 0200000004000002 9C00000004000001 4D00020000000400
00029D0000000400 00014C0002000000 040000029E000000 040000014B000200
0000040000029F00 0000040000014A00 0200000004000002 A000000004000001
4900020000000400 0002A10000000400 0001480002000000 04000002A2000000
0400000147000200 000004000002A300 0000040000014600 0200000004000002
A400000004000001 4500020000000400 0002A50000000400 0001440002000000
04000002A6000000 0400000143000200 000004000002A700 0000040000014200
0200000004000002 A800000004000001 4100020000000400 0002A90000000400
0001400002000000 04000002AA000000 040000013F000200 000004000002AB00
0000040000013E00 0200000004000002 AC00000004000001 3D00020000000400
0002AD0000000400 00013C0002000000 04000002AE000000 040000013B000200
000004000002AF00 0000040000013A00 0200000004000002 B000000004000001
3900020000000400 0002B10000000400 0001380002000000 04000002B2000000
0400000137000200 000004000002B300 0000040000013600 0200000004000002
B400000004000001 3500020000000400 0002B50000000400 0001340002000000
04000002B6000000 0400000133000200 000004000002B700 0000040000013200
0200000004000002 B800000004000001 3100020000000400 0002B90000000400
0001300002000000 04000002BA000000 040000012F000200 000004000002BB00
0000040000012E00 0200000004000002 BC00000004000001 2D00020000000400
0002BD0000000400 00012C0002000000 04000002BE000000 040000012B000200
000004000002BF00 0000040000012A00 0200000004000002 C000000004000001
2900020000000400 0002C10000000400 0001280002000000 04000002C2000000
0400000127000200 000004000002C300 0000040000012600 0200000004000002
C400000004000001 2500020000000400 0002C50000000400 0001240002000000
04000002C6000000 0400000123000200 000004000002C700 0000040000012200
0200000004000002 C800000004000001 2100020000000400 0002C90000000400
0001200002000000 04000002CA000000 040000011F000200 000004000002CB00
0000040000011E00 0200000004000002 CC00000004000001 1D00020000000400
0002CD0000000400 00011C0002000000 04000002CE000000 040000011B000200
000004000002CF00 0000040000011A00 0200000004000002 D000000004000001
1900020000000400 0002D10000000400 0001180002000000 04000002D2000000
0400000117000200 000004000002D300 0000040000011600 0200000004000002
D400000004000001 1500020000000400 0002D50000000400 0001140002000000
04000002D6000000 0400000113000200 000004000002D700 0000040000011200
0200000004000002 D800000004000001 1100020000000400 0002D90000000400
0001100002000000 04000002DA000000 040000010F000200 000004000002DB00
0000040000010E00 0200000004000002 DC00000004000001 0D00020000000400
0002DD0000000400 00010C0002000000 04000002DE000000 040000010B000200
000004000002DF00 0000040000010A00 0200000004000002 E000000004000001
0900020000000400 0002E10000000400 0001080002000000 04000002E2000000
0400000107000200 000004000002E300 0000040000010600 0200000004000002
E400000004000001 0500020000000400 0002E50000000400 0001040002000000
04000002E6000000 0400000103000200 000004000002E700 0000040000010200
0200000004000002 E800000004000001 0100020000000400 0002E90000000400
0001000002000000 04000002EA000000 04000000FF000200 000004000002EB00
000004000000FE00 0200000004000002 EC00000004000000 FD00020000000400
0002ED0000000400 0000FC0002000000 04000002EE000000 04000000FB000200
000004000002EF00 000004000000FA00 0200000004000002 F000000004000000
F900020000000400 0002F10000000400 0000F80002000000 04000002F2000000
04000000F7000200 000004000002F300 000004000000F600 0200000004000002
F400000004000000 F500020000000400 0002F50000000400 0000F40002000000
04000002F6000000 04000000F3000200 000004000002F700 000004000000F200
0200000004000002 F800000004000000 F100020000000400 0002F90000000400
0000F00002000000 04000002FA000000 04000000EF000200 000004000002FB00
000004000000EE00 0200000004000002 FC00000004000000 ED00020000000400
0002FD0000000400 0000EC0002000000 04000002FE000000 04000000EB000200
000004000002FF00 000004000000EA00 0200000004000003 0000000004000000
E900020000000400 0003010000000400 0000E80002000000 0400000302000000
04000000E7000200 0000040000030300 000004000000E600 0200000004000003
0400000004000000 E500020000000400 0003050000000400 0000E40002000000
0400000306000000 04000000E3000200 0000040000030700 000004000000E200
0200000004000003 0800000004000000 E100020000000400 0003090000000400
0000E00002000000 040000030A000000 04000000DF000200 0000040000030B00
000004000000DE00 0200000004000003 0C00000004000000 DD00020000000400
00030D0000000400 0000DC0002000000 040000030E000000 04000000DB000200
0000040000030F00 000004000000DA00 0200000004000003 1000000004000000
D900020000000400 0003110000000400 0000D80002000000 0400000312000000
04000000D7000200 0000040000031300 000004000000D600 0200000004000003
1400000004000000 D500020000000400 0003150000000400 0000D40002000000
0400000316000000 04000000D3000200 0000040000031700 000004000000D200
0200000004000003 1800000004000000 D100020000000400 0003190000000400
0000D00002000000 040000031A000000 04000000CF000200 0000040000031B00
000004000000CE00 0200000004000003 1C00000004000000 CD00020000000400
00031D0000000400 0000CC0002000000 040000031E000000 04000000CB000200
0000040000031F00 000004000000CA00 0200000004000003 2000000004000000
C900020000000400 0003210000000400 0000C80002000000 0400000322000000
04000000C7000200 0000040000032300 000004000000C600 0200000004000003
2400000004000000 C500020000000400 0003250000000400 0000C40002000000
0400000326000000 04000000C3000200 0000040000032700 000004000000C200
0200000004000003 2800000004000000 C100020000000400 0003290000000400
0000C00002000000 040000032A000000 04000000BF000200 0000040000032B00
000004000000BE00 0200000004000003 2C00000004000000 BD00020000000400
00032D0000000400 0000BC0002000000 040000032E000000 04000000BB000200
0000040000032F00 000004000000BA00 0200000004000003 3000000004000000
B900020000000400 0003310000000400 0000B80002000000 0400000332000000
04000000B7000200 0000040000033300 000004000000B600 0200000004000003
3400000004000000 B500020000000400 0003350000000400 0000B40002000000
0400000336000000 04000000B3000200 0000040000033700 000004000000B200
0200000004000003 3800000004000000 B100020000000400 0003390000000400
0000B00002000000 040000033A000000 04000000AF000200 0000040000033B00
000004000000AE00 0200000004000003 3C00000004000000 AD00020000000400
00033D0000000400 0000AC0002000000 040000033E000000 04000000AB000200
0000040000033F00 000004000000AA00 0200000004000003 4000000004000000
A900020000000400 0003410000000400 0000A80002000000 0400000342000000
04000000A7000200 0000040000034300 000004000000A600 0200000004000003
4400000004000000 A500020000000400 0003450000000400 0000A40002000000
0400000346000000 04000000A3000200 0000040000034700 000004000000A200
0200000004000003 4800000004000000 A100020000000400 0003490000000400
0000A00002000000 040000034A000000 040000009F000200 0000040000034B00
0000040000009E00 0200000004000003 4C00000004000000 9D00020000000400
00034D0000000400 00009C0002000000 040000034E000000 040000009B000200
0000040000034F00 0000040000009A00 0200000004000003 5000000004000000
9900020000000400 0003510000000400 0000980002000000 0400000352000000
0400000097000200 0000040000035300 0000040000009600 0200000004000003
5400000004000000 9500020000000400 0003550000000400 0000940002000000
0400000356000000 0400000093000200 0000040000035700 0000040000009200
0200000004000003 5800000004000000 9100020000000400 0003590000000400
0000900002000000 040000035A000000 040000008F000200 0000040000035B00
0000040000008E00 0200000004000003 5C00000004000000 8D00020000000400
00035D0000000400 00008C0002000000 040000035E000000 040000008B000200
0000040000035F00 0000040000008A00 0200000004000003 6000000004000000
8900020000000400 0003610000000400 0000880002000000 0400000362000000
0400000087000200 0000040000036300 0000040000008600 0200000004000003
6400000004000000 8500020000000400 0003650000000400 0000840002000000
0400000366000000 0400000083000200 0000040000036700 0000040000008200
0200000004000003 6800000004000000 8100020000000400 0003690000000400
0000800002000000 040000036A000000 040000007F000200 0000040000036B00
0000040000007E00 0200000004000003 6C00000004000000 7D00020000000400
00036D0000000400 00007C0002000000 040000036E000000 040000007B000200
0000040000036F00 0000040000007A00 0200000004000003 7000000004000000
7900020000000400 0003710000000400 0000780002000000 0400000372000000
0400000077000200 0000040000037300 0000040000007600 0200000004000003
7400000004000000 7500020000000400 0003750000000400 0000740002000000
0400000376000000 0400000073000200 0000040000037700 0000040000007200
0200000004000003 7800000004000000 7100020000000400 0003790000000400
0000700002000000 040000037A000000 040000006F000200 0000040000037B00
0000040000006E00 0200000004000003 7C00000004000000 6D00020000000400
00037D0000000400 00006C0002000000 040000037E000000 040000006B000200
0000040000037F00 0000040000006A00 0200000004000003 8000000004000000
6900020000000400 0003810000000400 0000680002000000 0400000382000000
0400000067000200 0000040000038300 0000040000006600 0200000004000003
8400000004000000 6500020000000400 0003850000000400 0000640002000000
0400000386000000 0400000063000200 0000040000038700 0000040000006200
0200000004000003 8800000004000000 6100020000000400 0003890000000400
0000600002000000 040000038A000000 040000005F000200 0000040000038B00
0000040000005E00 0200000004000003 8C00000004000000 5D00020000000400
00038D0000000400 00005C0002000000 040000038E000000 040000005B000200
0000040000038F00 0000040000005A00 0200000004000003 9000000004000000
5900020000000400 0003910000000400 0000580002000000 0400000392000000
0400000057000200 0000040000039300 0000040000005600 0200000004000003
9400000004000000 5500020000000400 0003950000000400 0000540002000000
0400000396000000 0400000053000200 0000040000039700 0000040000005200
0200000004000003 9800000004000000 5100020000000400 0003990000000400
0000500002000000 040000039A000000 040000004F000200 0000040000039B00
0000040000004E00 0200000004000003 9C00000004000000 4D00020000000400
00039D0000000400 00004C0002000000 040000039E000000 040000004B000200
0000040000039F00 0000040000004A00 0200000004000003 A000000004000000
4900020000000400 0003A10000000400 0000480002000000 04000003A2000000
0400000047000200 000004000003A300 0000040000004600 0200000004000003
A400000004000000 4500020000000400 0003A50000000400 0000440002000000
04000003A6000000 0400000043000200 000004000003A700 0000040000004200
0200000004000003 A800000004000000 4100020000000400 0003A90000000400
0000400002000000 04000003AA000000 040000003F000200 000004000003AB00
0000040000003E00 0200000004000003 AC00000004000000 3D00020000000400
0003AD0000000400 00003C0002000000 04000003AE000000 040000003B000200
000004000003AF00 0000040000003A00 0200000004000003 B000000004000000
3900020000000400 0003B10000000400 0000380002000000 04000003B2000000
0400000037000200 000004000003B300 0000040000003600 0200000004000003
B400000004000000 3500020000000400 0003B50000000400 0000340002000000
04000003B6000000 0400000033000200 000004000003B700 0000040000003200
0200000004000003 B800000004000000 3100020000000400 0003B90000000400
0000300002000000 04000003BA000000 040000002F000200 000004000003BB00
0000040000002E00 0200000004000003 BC00000004000000 2D00020000000400
0003BD0000000400 00002C0002000000 04000003BE000000 040000002B000200
000004000003BF00 0000040000002A00 0200000004000003 C000000004000000
2900020000000400 0003C10000000400 0000280002000000 04000003C2000000
0400000027000200 000004000003C300 0000040000002600 0200000004000003
C400000004000000 2500020000000400 0003C50000000400 0000240002000000
04000003C6000000 0400000023000200 000004000003C700 0000040000002200
0200000004000003 C800000004000000 2100020000000400 0003C90000000400
0000200002000000 04000003CA000000 040000001F000200 000004000003CB00
0000040000001E00 0200000004000003 CC00000004000000 1D00020000000400
0003CD0000000400 00001C0002000000 04000003CE000000 040000001B000200
000004000003CF00 0000040000001A00 0200000004000003 D000000004000000
1900020000000400 0003D10000000400 0000180002000000 04000003D2000000
0400000017000200 000004000003D300 0000040000001600 0200000004000003
D400000004000000 1500020000000400 0003D50000000400 0000140002000000
04000003D6000000 0400000013000200 000004000003D700 0000040000001200
0200000004000003 D800000004000000 1100020000000400 0003D90000000400
0000100002000000 04000003DA000000 040000000F000200 000004000003DB00
0000040000000E00 0200000004000003 DC00000004000000 0D00020000000400
0003DD0000000400 00000C0002000000 04000003DE000000 040000000B000200
000004000003DF00 0000040000000A00 0200000004000003 E000000004000000
0900020000000400 0003E10000000400 0000080002000000 04000003E2000000
0400000007000200 000004000003E300 0000040000000600 0200000004000003
E400000004000000 0500020000000400 0003E50000000400 0000040002000000
04000003E6000000 0400000003000200 000004000003E700 0000040000000200
0200000004000003 E800000004000000 01FFFF
at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:66)
at org.intermine.sql.writebatch.Batch$BatchFlusher.run(Batch.java:458)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.postgresql.util.PSQLException: Tried to end inactive copy
at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:803)
at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:61)
at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:181)
at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:161)
at org.intermine.sql.writebatch.FlushJobPostgresCopyImpl.flush(FlushJobPostgresCopyImpl.java:51)

--
"We did a risk management review. We concluded that there was no risk
of any management." -- Hugo Mills <hugo(at)carfax(dot)nildram(dot)co(dot)uk>


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-19 14:16:12
Message-ID: alpine.DEB.2.00.1007191514510.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Mon, 19 Jul 2010, Matthew Wakeling wrote:
> I have isolated a reasonably small example that fails.

Just ran the example again, and it passed. Looks like an intermittent
problem, not a deterministic one. However, it does seem that the size of
the COPY is a significant factor.

Matthew

--
Here we go - the Fairy Godmother redundancy proof.
-- Computer Science Lecturer


From: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To:
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-19 15:11:00
Message-ID: AANLkTimCYa3fvWu2-NwcUc_7GibTpFB1RjNC09IVk8IS@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

I've fixed other COPY-related bugs by running the latest code from
revision control rather than the latest released driver. My problem
doesn't appear to be the same as your problem, but it is probably
worth a test, since it is pretty trivial to build the driver from
source.

On Mon, Jul 19, 2010 at 7:16 AM, Matthew Wakeling <matthew(at)flymine(dot)org> wrote:
> On Mon, 19 Jul 2010, Matthew Wakeling wrote:
>>
>> I have isolated a reasonably small example that fails.
>
> Just ran the example again, and it passed. Looks like an intermittent
> problem, not a deterministic one. However, it does seem that the size of the
> COPY is a significant factor.
>
> Matthew
>
> --
> Here we go - the Fairy Godmother redundancy proof.
>                                       -- Computer Science Lecturer
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc
>


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 10:15:31
Message-ID: alpine.DEB.2.00.1007201114080.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Mon, 19 Jul 2010, Samuel Gendler wrote:
> I've fixed other COPY-related bugs by running the latest code from
> revision control rather than the latest released driver. My problem
> doesn't appear to be the same as your problem, but it is probably
> worth a test, since it is pretty trivial to build the driver from
> source.

That does indeed fix the problem. I didn't realise the published jar was
so old!

I have a few "changed behaviour" problems to fix up before I can give it a
completely clean bill of health. I'll get back to the list if there are
any further problems.

Matthew

--
Note: some countries impose serious penalties for a conspiracy to overthrow
the political system. THIS DOES NOT FIX THE VULNERABILITY.
-- http://seclists.org/vulnwatch/2003/q2/0002.html


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 10:29:30
Message-ID: alpine.DEB.2.00.1007201128560.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Matthew Wakeling wrote:
> That does indeed fix the problem. I didn't realise the published jar was so
> old!

Okay, no it doesn't fix the problem after all. Identical stack trace to
before. Any ideas?

Matthew

--
for a in past present future; do
for b in clients employers associates relatives neighbours pets; do
echo "The opinions here in no way reflect the opinions of my $a $b."
done; done


From: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 11:06:02
Message-ID: AANLkTinVNt56ZuxRf0KNgBR-kJPN1NkTgDkHY2DXIPQr@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

I'd be willing to take a more in-depth look if you can minimize this
to a small, at-least-semi-reproducible test case. Ideally, something
that can be included in
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/test/jdbc2/CopyTest.java
(since I noticed the driver doesn't have any WITH BINARY tests).

If that's more trouble to you than it's worth, I'd try turning on
driver-level logging and see if that gets you anything, or look at the
traffic on the wire with something like wireshark to see if you're
seeing the network traffic you expect (from the sound of it, I don't
expect the problem to be there, but it's always nice to eliminate
things). A protocol reference is available (
http://developer.postgresql.org/pgdocs/postgres/protocol-flow.html#PROTOCOL-COPY
), but you're basically looking for the pattern CopyIn CopyData*
(CopyDone|CopyFail). If for some reason you see other messages
interspersed from the client or a repeated Copy{Done,Fail} from the
client, that could be a bad sign (although again, your server logs
suggest this isn't the case).
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 230
Foster City, CA 94404
(650) 242-3500 Main
(650) 242-3501 F
www.truviso.com

On Tue, Jul 20, 2010 at 3:29 AM, Matthew Wakeling <matthew(at)flymine(dot)org> wrote:
> On Tue, 20 Jul 2010, Matthew Wakeling wrote:
>>
>> That does indeed fix the problem. I didn't realise the published jar was
>> so old!
>
> Okay, no it doesn't fix the problem after all. Identical stack trace to
> before. Any ideas?
>
> Matthew
>
> --
> for a in past present future; do
>  for b in clients employers associates relatives neighbours pets; do
>  echo "The opinions here in no way reflect the opinions of my $a $b."
> done; done
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc
>


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 11:43:53
Message-ID: alpine.DEB.2.00.1007201229110.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Maciek Sakrejda wrote:
> I'd be willing to take a more in-depth look if you can minimize this
> to a small, at-least-semi-reproducible test case. Ideally, something
> that can be included in
> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/test/jdbc2/CopyTest.java
> (since I noticed the driver doesn't have any WITH BINARY tests).

Okay, I'm looking at this now. Just tried running the unit tests (without
any modifications), and I get a single failure in date handling:

runtest:
[junit] Testsuite: org.postgresql.test.jdbc2.Jdbc2TestSuite
[junit] Tests run: 288, Failures: 0, Errors: 1, Time elapsed: 19.517 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] DatabaseEncodingTest: Skipping UNICODE database tests as test database encoding is SQL_ASCII
[junit] ------------- ---------------- ---------------
[junit] Testcase: testSetDate(org.postgresql.test.jdbc2.DateTest): Caused an ERROR
[junit] null
[junit] java.lang.IllegalArgumentException
[junit] at java.sql.Date.valueOf(Date.java:138)
[junit] at org.postgresql.test.jdbc2.DateTest.testSetDate(DateTest.java:120)
[junit]
[junit]
[junit] Test org.postgresql.test.jdbc2.Jdbc2TestSuite FAILED

Will try and produce a test for binary copy in now and let you know how it
goes.

Matthew

--
Most books now say our sun is a star. But it still knows how to change
back into a sun in the daytime.


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 13:25:29
Message-ID: alpine.DEB.2.00.1007201422270.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Matthew Wakeling wrote:
> On Tue, 20 Jul 2010, Maciek Sakrejda wrote:
>> I'd be willing to take a more in-depth look if you can minimize this
>> to a small, at-least-semi-reproducible test case. Ideally, something
>> that can be included in
>> http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/test/jdbc2/CopyTest.java
>> (since I noticed the driver doesn't have any WITH BINARY tests).
>
> Okay, I'm looking at this now.

Just written a test that writes 100000 rows into the test table using
binary copy, and it works perfectly. Not sure what's going on here, as it
still fails in my application.

One difference is that the test uses a straight connection, and my
application uses a ds.PGPooledConnection. Could that be affecting it at
all?

Matthew

--
It's one of those irregular verbs - "I have an independent mind," "You are
an eccentric," "He is round the twist."
-- Bernard Woolly, Yes Prime Minister


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Matthew Wakeling" <matthew(at)flymine(dot)org>, "Maciek Sakrejda" <msakrejda(at)truviso(dot)com>
Cc: "Samuel Gendler" <sgendler(at)ideasculptor(dot)com>, <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 13:48:05
Message-ID: 4C4562C50200002500033951@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Matthew Wakeling <matthew(at)flymine(dot)org> wrote:

> Just written a test that writes 100000 rows into the test table
> using binary copy, and it works perfectly. Not sure what's going
> on here, as it still fails in my application.

Random thought: is SSL involved in one and not the other?

-Kevin


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 13:56:49
Message-ID: alpine.DEB.2.00.1007201452360.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Kevin Grittner wrote:
> Matthew Wakeling <matthew(at)flymine(dot)org> wrote:
>
>> Just written a test that writes 100000 rows into the test table
>> using binary copy, and it works perfectly. Not sure what's going
>> on here, as it still fails in my application.
>
> Random thought: is SSL involved in one and not the other?

No.

After a little more investigation, it appears that
core.v3.QueryExecutorImpl.unlock() is being called before
core.v3.QueryExecutorImpl.endCopy(), in
core.v3.QueryExecutorImpl.processCopyResults() apparently because it
receives a "Z", called by core.v3.QueryExecutorImpl.writeToCopy().
Investigating further.

Matthew

--
I quite understand I'm doing algebra on the blackboard and the usual response
is to throw objects... If you're going to freak out... wait until party time
and invite me along -- Computer Science Lecturer


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 14:12:25
Message-ID: alpine.DEB.2.00.1007201507500.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Matthew Wakeling wrote:
> After a little more investigation, it appears that
> core.v3.QueryExecutorImpl.unlock() is being called before
> core.v3.QueryExecutorImpl.endCopy(), in
> core.v3.QueryExecutorImpl.processCopyResults() apparently because it receives
> a "Z", called by core.v3.QueryExecutorImpl.writeToCopy(). Investigating
> further.

Okay, have investigated a little further. It seems that the server may be
responding to the client a little too quickly with the "all done" message.
Here is a sequence of events in core.v3.QueryExecutorImpl:

We call startCopy(). It calls processCopyResults().
In processCopyResults, we receive character G, and we lock.

We call writeToCopy(). It calls processCopyResults().
We receive character C.
We receive character Z, and unlock.

We call endCopy(), which tries to unlock and fails.

So, should the unlocking happen elsewhere, or should endCopy not require a
lock?

Matthew

--
Jadzia: Don't forget the 34th rule of acquisition: Peace is good for business.
Quark: That's the 35th.
Jadzia: Oh yes, that's right. What's the 34th again?
Quark: War is good for business. It's easy to get them mixed up.


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 14:55:18
Message-ID: alpine.DEB.2.00.1007201548270.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Matthew Wakeling wrote:
> So, should the unlocking happen elsewhere, or should endCopy not require a
> lock?

Or a third option. We should delay the processing of the "Z" (Ready for
query) message until after the endCopy has been called.

I have still not been able to create a test that provokes this bug. I do
not understand under what conditions the server will do this, but
sometimes it will respond with a "C" and a "Z" before the endCopy message
has actually been sent to it. Maybe it is keying off the "-1" (no further
rows) part of the binary copy. In any case, I am including a patch here
which seems to fix the driver for my application. When the driver receives
a "C" message (command finished), and block is set to false, then we can
assume that a little later on we will call processCopyResults again very
soon with block set to true. Instead of setting block to true, finish
processing of messages in order to delay seeing the "Z" message until
after the endCopy method has been called.

The patch also contains a simple test for binary copy in, although it does
not exercise this bug.

Index: org/postgresql/core/v3/QueryExecutorImpl.java
===================================================================
RCS file:
/cvsroot/jdbc/pgjdbc/org/postgresql/core/v3/QueryExecutorImpl.java,v
retrieving revision 1.48
diff -u -r1.48 QueryExecutorImpl.java
--- org/postgresql/core/v3/QueryExecutorImpl.java 21 Mar 2010 07:13:37 -0000 1.48
+++ org/postgresql/core/v3/QueryExecutorImpl.java 20 Jul 2010 14:48:03 -0000
@@ -931,7 +931,9 @@
error = se;
}

- block = true;
+ if (!block) {
+ endReceiving = true;
+ }
break;

case 'E': // ErrorMessage (expected response to CopyFail)
Index: org/postgresql/test/jdbc2/CopyTest.java
===================================================================
RCS file: /cvsroot/jdbc/pgjdbc/org/postgresql/test/jdbc2/CopyTest.java,v
retrieving revision 1.4
diff -u -r1.4 CopyTest.java
--- org/postgresql/test/jdbc2/CopyTest.java 8 Apr 2010 16:18:05 -0000 1.4
+++ org/postgresql/test/jdbc2/CopyTest.java 20 Jul 2010 14:48:03 -0000
@@ -15,6 +15,7 @@
import java.sql.Statement;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
+import java.io.DataOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
@@ -272,4 +273,35 @@
assertEquals(0, getCount());
}

+ public void testBinaryCopyIn() throws Exception {
+ con.setAutoCommit(true);
+ String sql = "COPY copytest FROM STDIN BINARY";
+ ByteArrayOutputStream data = new ByteArrayOutputStream();
+ DataOutputStream dos = new DataOutputStream(data);
+ dos.writeBytes("PGCOPY\n");
+ dos.writeByte(255);
+ dos.writeBytes("\r\n");
+ dos.writeByte(0); // Signature done
+ dos.writeInt(0); // Flags - we aren't supplying OIDS
+ dos.writeInt(0); // Length of header extension
+ for (int row = 0; row < 99700; row++) {
+ dos.writeShort(3); // Three rows
+ dos.writeInt(5); // String length
+ dos.writeByte('H');
+ dos.writeByte('e');
+ dos.writeByte('l');
+ dos.writeByte('l');
+ dos.writeByte('o');
+ dos.writeInt(4); // int length
+ dos.writeInt(row); // Second argument
+ dos.writeInt(-1); // Null third argument
+ }
+ dos.writeShort(-1); // End of rows
+ dos.flush();
+ dos.close();
+ byte[] byteArray = data.toByteArray();
+ //System.err.println("Data length: " + byteArray.length);
+ long count = copyAPI.copyIn(sql, new ByteArrayInputStream(byteArray));
+ assertEquals(99700, count);
+ }
}

Hope this helps,

Matthew

--
There once was a limerick .sig
that really was not very big
It was going quite fine
Till it reached the fourth line


From: Kris Jurka <books(at)ejurka(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-20 18:58:09
Message-ID: alpine.BSO.2.00.1007201455220.5084@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Matthew Wakeling wrote:

> Okay, have investigated a little further. It seems that the server may be
> responding to the client a little too quickly with the "all done" message.
> Here is a sequence of events in core.v3.QueryExecutorImpl:
>
> We call startCopy(). It calls processCopyResults().
> In processCopyResults, we receive character G, and we lock.
>
> We call writeToCopy(). It calls processCopyResults().
> We receive character C.
> We receive character Z, and unlock.
>
> We call endCopy(), which tries to unlock and fails.
>
>
> So, should the unlocking happen elsewhere, or should endCopy not require a
> lock?

To me it seems like a protocol violation to receive CommandComplete(C) and
ReadyForQuery(Z) before we've issued CopyDone(c) and this should be fixed
server side.

Kris Jurka


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 16:38:31
Message-ID: alpine.DEB.2.00.1007221737410.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Kris Jurka wrote:
> To me it seems like a protocol violation to receive CommandComplete(C) and
> ReadyForQuery(Z) before we've issued CopyDone(c) and this should be fixed
> server side.

Is anyone following this up?

Matthew

--
If you're thinking "Oh no, this lecturer thinks Turing Machines are a feasible
method of computation, where's the door?", then you are in luck. There are
some there, there, and by the side there. Oxygen masks will not drop from the
ceiling... -- Computer Science Lecturer


From: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 17:03:44
Message-ID: AANLkTinrgKujGBdYWYVFjL3M_xeOKHxVcO8SBH7Rjqdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Can you confirm Kris's theory? You should be able to do that by
starting a binary copy with the CopyManager.copyIn(String), writing
all your binary data to it, flushing, and then sleeping for a while
before calling endCopy? If Kris is right, you should get the error
after the flush before the endCopy. If that's the case, I can take a
look at doing a patch this weekend. If I'm reading the protocol
contract right, it ultimately should be fixed in the backend, though.
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 230
Foster City, CA 94404
(650) 242-3500 Main
(650) 242-3501 F
www.truviso.com

On Thu, Jul 22, 2010 at 9:38 AM, Matthew Wakeling <matthew(at)flymine(dot)org> wrote:
> On Tue, 20 Jul 2010, Kris Jurka wrote:
>>
>> To me it seems like a protocol violation to receive CommandComplete(C) and
>> ReadyForQuery(Z) before we've issued CopyDone(c) and this should be fixed
>> server side.
>
> Is anyone following this up?
>
> Matthew
>
> --
> If you're thinking "Oh no, this lecturer thinks Turing Machines are a
> feasible
> method of computation, where's the door?", then you are in luck. There are
> some there, there, and by the side there. Oxygen masks will not drop from
> the
> ceiling...                              -- Computer Science Lecturer
>


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 17:10:13
Message-ID: alpine.DEB.2.00.1007221807090.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Thu, 22 Jul 2010, Maciek Sakrejda wrote:
> Can you confirm Kris's theory? You should be able to do that by
> starting a binary copy with the CopyManager.copyIn(String), writing
> all your binary data to it, flushing, and then sleeping for a while
> before calling endCopy? If Kris is right, you should get the error
> after the flush before the endCopy. If that's the case, I can take a
> look at doing a patch this weekend. If I'm reading the protocol
> contract right, it ultimately should be fixed in the backend, though.

I can try and reproduce that. However, I will point out that the patch I
sent to the list which delays the reading of the Z does fix the problem.

Matthew

--
Ozzy: Life is full of disappointments.
Millie: No it isn't - I can always fit more in.


From: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 17:21:05
Message-ID: AANLkTilGsobIl2yV3o8hEreUq3ffjTNnqqzOVTTvN83N@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

> I can try and reproduce that. However, I will point out that the patch I
> sent to the list which delays the reading of the Z does fix the problem.

Sorry, I missed that. I'm not crazy about the logic (if I'm following
it correctly), but it seems relatively sound (and has the benefit of
being minimally invasive).
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 230
Foster City, CA 94404
(650) 242-3500 Main
(650) 242-3501 F
www.truviso.com


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Maciek Sakrejda <msakrejda(at)truviso(dot)com>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 17:30:02
Message-ID: alpine.DEB.2.00.1007221823210.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Thu, 22 Jul 2010, Maciek Sakrejda wrote:
>> I can try and reproduce that. However, I will point out that the patch I
>> sent to the list which delays the reading of the Z does fix the problem.
>
> Sorry, I missed that. I'm not crazy about the logic (if I'm following
> it correctly), but it seems relatively sound (and has the benefit of
> being minimally invasive).

No, the logic sucks, as it is full of possible failure modes. What if you
for some reason call processCopyResults() again, after deciding to delay
after seeing the C, but before calling endCopy()? It'd see the Z and fail.

Anyway, I have just tried to get it to fail using the method you
suggested, and failed. For some reason, it seems that the server only
responds early under some circumstances that I can't work out. The test
case I have here doesn't do it. It does seem to be data dependent.

For reference, I edited CopyManager, and put a ten second sleep just
before endCopy(). I edited QueryExecutorImpl and put a flush after each
pgStream.send(data), and I put a processCopyResults() right at the
beginning of endCopy. This should provoke an error if the server does the
wrong thing.

I'll try using this in my application, and see what happens. Will get back
to you.

Matthew

--
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it. - Kernighan


From: Kris Jurka <books(at)ejurka(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 19:49:39
Message-ID: alpine.BSO.2.00.1007221548280.1768@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Thu, 22 Jul 2010, Matthew Wakeling wrote:

> On Tue, 20 Jul 2010, Kris Jurka wrote:
>> To me it seems like a protocol violation to receive CommandComplete(C) and
>> ReadyForQuery(Z) before we've issued CopyDone(c) and this should be fixed
>> server side.
>
> Is anyone following this up?

Yes, I am able to reproduce this at will now. I will look at what's going
on server side after some lunch.

Kris Jurka


From: Kris Jurka <books(at)ejurka(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-22 21:34:08
Message-ID: alpine.BSO.2.00.1007221723230.16703@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc


Per discussion and investigation on the -jdbc list, the server appears to
violate the frontend/backend protocol when binary copy data is sent to the
server. Upon receiving the binary copy end of data marker (a -1 field
count), the server immediately responds with CommandComplete and
ReadyForQuery without waiting for the frontend to issue CopyDone or
CopyFail. This confuses the JDBC driver as it doesn't think the command
sequence should have finished yet.

Attached is a patch to make the server continue to consume protocol data
until instructed to stop by the client in the same way as copying text
data to the server currently works.

http://www.postgresql.org/docs/8.4/static/protocol-flow.html#PROTOCOL-COPY
http://www.postgresql.org/docs/8.4/static/sql-copy.html

Kris Jurka

Attachment Content-Type Size
binary-copy-end.patch text/plain 1.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 00:55:13
Message-ID: AANLkTincY8oaXCvKL4bmDb3VEOj6oSXY3SHX9BYF=maA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Thu, Jul 22, 2010 at 5:34 PM, Kris Jurka <books(at)ejurka(dot)com> wrote:
> Per discussion and investigation on the -jdbc list, the server appears to
> violate the frontend/backend protocol when binary copy data is sent to the
> server.  Upon receiving the binary copy end of data marker (a -1 field
> count), the server immediately responds with CommandComplete and
> ReadyForQuery without waiting for the frontend to issue CopyDone or
> CopyFail.  This confuses the JDBC driver as it doesn't think the command
> sequence should have finished yet.
>
> Attached is a patch to make the server continue to consume protocol data
> until instructed to stop by the client in the same way as copying text data
> to the server currently works.
>
> http://www.postgresql.org/docs/8.4/static/protocol-flow.html#PROTOCOL-COPY
> http://www.postgresql.org/docs/8.4/static/sql-copy.html
>
> Kris Jurka

I guess the obvious question is whether we shouldn't instead change
the docs to match the behavior. I suspect there's almost no chance
we'd consider back-patching a change of this type, since it is a clear
behavior change. And even if we did, there would still be people
running servers with the old behavior with which JDBC and other
drivers would have to cope. Having two different behaviors might be
worse than the status quo.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-07-23 09:33:27
Message-ID: alpine.DEB.2.00.1007231009270.8895@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Thu, 22 Jul 2010, Kris Jurka wrote:
> Yes, I am able to reproduce this at will now. I will look at what's going on
> server side after some lunch.

Well, at least someone can reproduce it. I have completely failed so far,
even going so far as to perform exactly the same operation in the test as
I do in my application. It fails in the application, but passes in the
test.

Matthew

--
For those of you who are into writing programs that are as obscure and
complicated as possible, there are opportunities for... real fun here
-- Computer Science Lecturer


From: Kris Jurka <books(at)ejurka(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 13:23:55
Message-ID: alpine.BSO.2.00.1007222112070.29831@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Thu, 22 Jul 2010, Robert Haas wrote:

> On Thu, Jul 22, 2010 at 5:34 PM, Kris Jurka <books(at)ejurka(dot)com> wrote:
>>
>> Attached is a patch to make the server continue to consume protocol data
>> until instructed to stop by the client in the same way as copying text data
>> to the server currently works.
>>
>
> I guess the obvious question is whether we shouldn't instead change
> the docs to match the behavior. I suspect there's almost no chance
> we'd consider back-patching a change of this type, since it is a clear
> behavior change. And even if we did, there would still be people
> running servers with the old behavior with which JDBC and other
> drivers would have to cope. Having two different behaviors might be
> worse than the status quo.
>

It is a clear behavior change, but that's what bug fixes are. I would
advocate back-patching this because I doubt many people would be affected
by this change and I think it would be awkward trying to document how
things work differently in binary mode when sending a file end marker than
in text mode or without a file end marker. If this was fixed server side
and backpatched, I would not modify the JDBC driver to work with older
server versions.

The copy documentation is clear that you must call PQputCopyEnd or
equivalent to end the copy sequence, so this would only affect people who
are not doing that and using binary copy mode. I doubt many people are
using binary copy at all because of the additional difficulty in
generating binary format data and the potential for portability problems.

Kris Jurka


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 13:32:08
Message-ID: 4C4999D8.705@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Kris Jurka wrote:
>>
>> I guess the obvious question is whether we shouldn't instead change
>> the docs to match the behavior. I suspect there's almost no chance
>> we'd consider back-patching a change of this type, since it is a clear
>> behavior change. And even if we did, there would still be people
>> running servers with the old behavior with which JDBC and other
>> drivers would have to cope. Having two different behaviors might be
>> worse than the status quo.
>>
>
> It is a clear behavior change, but that's what bug fixes are.

That was my first reaction. I don't think we're in the business if
redefining bugs out of existence.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 13:40:35
Message-ID: 9143.1279892435@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Kris Jurka <books(at)ejurka(dot)com> writes:
> Attached is a patch to make the server continue to consume protocol data
> until instructed to stop by the client in the same way as copying text
> data to the server currently works.

I believe this is a misunderstanding of the protocol spec. The spec is
(intended to say that) we'll continue to accept data after reporting an
error, not that we will silently swallow an incorrect data stream and
not complain about it. Which is what this patch will do.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Kris Jurka <books(at)ejurka(dot)com>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 13:40:37
Message-ID: AANLkTimPUT97D0ErZ_XvTRbVsYjNC0LPQAgQNB0fGaY5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, Jul 23, 2010 at 9:32 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Kris Jurka wrote:
>>>
>>> I guess the obvious question is whether we shouldn't instead change
>>> the docs to match the behavior. I suspect there's almost no chance
>>> we'd consider back-patching a change of this type, since it is a clear
>>> behavior change. And even if we did, there would still be people
>>> running servers with the old behavior with which JDBC and other
>>> drivers would have to cope. Having two different behaviors might be
>>> worse than the status quo.
>>>
>>
>> It is a clear behavior change, but that's what bug fixes are.
>
> That was my first reaction. I don't think we're in the business if
> redefining bugs out of existence.

I certainly understand that reaction - I just worry that there might
be people depending on the current behavior. We really don't want to
get a reputation for breaking things in minor releases. But this is
not an area of the code I'm very familiar with, and I'm not in a good
position to judge the likelihood of breakage, so I'll defer to those
who are...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Kris Jurka <books(at)ejurka(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 13:50:30
Message-ID: 4C499E26.4060307@ejurka.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On 7/23/2010 6:40 AM, Tom Lane wrote:
> Kris Jurka<books(at)ejurka(dot)com> writes:
>> Attached is a patch to make the server continue to consume protocol data
>> until instructed to stop by the client in the same way as copying text
>> data to the server currently works.
>
> I believe this is a misunderstanding of the protocol spec. The spec is
> (intended to say that) we'll continue to accept data after reporting an
> error, not that we will silently swallow an incorrect data stream and
> not complain about it. Which is what this patch will do.
>

All this does is make binary mode match text mode. Are you planning on
changing text mode to match binary mode instead? Currently text mode
parsing ends at the data end marker (\.) and throws away any further
data which may or may not be ill formatted. For example there's no
complaint about copying the following data file into a single column
integer table even though there is bogus data after the file end marker

3
4
\.
asdf
aff
5
qq

Kris Jurka


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 14:11:36
Message-ID: 9653.1279894296@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Kris Jurka <books(at)ejurka(dot)com> writes:
> On 7/23/2010 6:40 AM, Tom Lane wrote:
>> I believe this is a misunderstanding of the protocol spec. The spec is
>> (intended to say that) we'll continue to accept data after reporting an
>> error, not that we will silently swallow an incorrect data stream and
>> not complain about it. Which is what this patch will do.

> All this does is make binary mode match text mode.

The fact that text mode eats data after \. is a backwards-compatibility
kluge to match the behavior of pre-7.4 COPY. It could very legitimately
be argued to be a bug in itself. I don't think that we should make
binary mode match it. The main concrete reason why not is that binary
mode has almost no redundancy. It would be really easy for the code
change you suggest to result in data being silently discarded with no
hint of what went wrong.

After some reflection, I think the real issue here is that the JDBC
driver is depending on a behavior not stated in the protocol, which
is the relative timing of FE-to-BE and BE-to-FE messages. Once you've
sent the EOF marker, the only correct follow-on for a spec-compliant
frontend is a CopyEnd message. So the backend is just sending its
response a bit sooner. There's nothing in the protocol spec forbidding
that.

I would be willing to accept a patch that avoided sending CopyEnd
immediately after receiving EOF, so long as it still threw an error
for extra data; but this is not that patch. The larger issue though
is whether it wouldn't be better to change the driver behavior instead.
I can't help thinking that the JDBC driver must be being overly cute
if this breaks it ... and you're never going to get everybody to
upgrade their server version, even if we were willing to back-patch
the change.

regards, tom lane


From: Kris Jurka <books(at)ejurka(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-23 15:35:53
Message-ID: alpine.BSO.2.00.1007231130130.23628@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 23 Jul 2010, Tom Lane wrote:

> Kris Jurka <books(at)ejurka(dot)com> writes:
>> On 7/23/2010 6:40 AM, Tom Lane wrote:
>>> I believe this is a misunderstanding of the protocol spec. The spec is
>>> (intended to say that) we'll continue to accept data after reporting an
>>> error, not that we will silently swallow an incorrect data stream and
>>> not complain about it. Which is what this patch will do.
>
>> All this does is make binary mode match text mode.
>
> The fact that text mode eats data after \. is a backwards-compatibility
> kluge to match the behavior of pre-7.4 COPY. It could very legitimately
> be argued to be a bug in itself. I don't think that we should make
> binary mode match it. The main concrete reason why not is that binary
> mode has almost no redundancy. It would be really easy for the code
> change you suggest to result in data being silently discarded with no
> hint of what went wrong.

Binary copy mode already does this (eat data silently after -1 field
count). The patch I sent just made it follow the fe/be protocol while it
does so.

jurka=# create table copytest (a int);
CREATE TABLE
jurka=# insert into copytest values (1);
INSERT 0 1
jurka=# \copy copytest to copydata with binary
jurka=# \! echo garbage >> copydata
jurka=# \copy copytest from copydata with binary
jurka=# select * from copytest;
a
---
1
1
(2 rows)

> After some reflection, I think the real issue here is that the JDBC
> driver is depending on a behavior not stated in the protocol, which
> is the relative timing of FE-to-BE and BE-to-FE messages. Once you've
> sent the EOF marker, the only correct follow-on for a spec-compliant
> frontend is a CopyEnd message. So the backend is just sending its
> response a bit sooner. There's nothing in the protocol spec forbidding
> that.

What about CopyFail? The protocol docs say nothing about the message
contents only about the messages themselves.

Kris Jurka


From: James William Pye <lists(at)jwp(dot)name>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kris Jurka <books(at)ejurka(dot)com>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-25 02:03:20
Message-ID: 7E743A09-D012-4BBB-9B87-24CA1DF51015@jwp.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Jul 23, 2010, at 7:11 AM, Tom Lane wrote:
> I can't help thinking that the JDBC driver must be being overly cute
> if this breaks it ...

I was wondering the same thing when I first saw Kris' message. However, iff I understand what JDBC is trying to achieve, I don't think I would call it "overly".

@Kris

Is this a problem because JDBC is trying to detect failures as early as possible during a COPY IN?
Or, is it just JDBC's normal MO to always be reading?

Well, I've wanted to do the former (early error detection) with py-postgresql's COPY support, and I imagine getting a read event marking completion prior to emitting done/fail could be a snag.

cheers, jwp


From: Kris Jurka <books(at)ejurka(dot)com>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-25 15:01:48
Message-ID: alpine.BSO.2.00.1007251041530.16933@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Sat, 24 Jul 2010, James William Pye wrote:

> On Jul 23, 2010, at 7:11 AM, Tom Lane wrote:
>> I can't help thinking that the JDBC driver must be being overly cute
>> if this breaks it ...
>
> I was wondering the same thing when I first saw Kris' message. However,
> iff I understand what JDBC is trying to achieve, I don't think I would
> call it "overly".
>
> Is this a problem because JDBC is trying to detect failures as early as
> possible during a COPY IN? Or, is it just JDBC's normal MO to always be
> reading?

The JDBC driver reads server messages for multiple reasons. One of them
is indeed to do early failure detection. Another is to pickup
NoticeResponse messages to avoid a network buffer deadlock. If someone
puts a trigger on the table you're copying data into that does RAISE
NOTICE 'received row X' for each row, to avoid a full network buffer
deadlock, the client must regularly read from the backend. So as we are
reading along, supposing that we're still mid-copy, we get a command
complete message. So this is possible to work around driver side by
peeking into the network stream and delaying processing of the end of copy
until the driver agrees that the copy is done, but I still maintain that
this is a server bug. It is not OK for the server to assume that the
client is done and move on, the client must tell the server what it wants
done.

Kris Jurka


From: James William Pye <lists(at)jwp(dot)name>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-28 16:31:32
Message-ID: 41218459-EDA4-4005-B791-2392991150AC@jwp.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Jul 25, 2010, at 8:01 AM, Kris Jurka wrote:
> The JDBC driver reads server messages for multiple reasons.

> One of them is indeed to do early failure detection.

That's high quality. =)

> Another is to pickup NoticeResponse messages to avoid a network buffer deadlock.

That's a good catch. I don't think psql/restore would often run into this as when COPY IN is in play, it's normally restoring a database. However, with JDBC, I imagine COPY would more often be used to do bulk loading into live tables that may very well cause a NOTICE. [Well, I reference psql/libpq because I don't recall it recognizing failure during COPY IN in the past, so I assume it's not receiving any data in that state.]

hrm, I suppose a lazy way around that problem would be to suspend all client messages(client_min_messages) during COPY IN. Tho, I guess one would still have to contend with NotificationResponse, and ParameterStatus..

> So this is possible to work around driver side by peeking into the network stream and delaying processing of the end of copy until the driver agrees that the copy is done, but

I don't think you would have to peek in. If the interface were to always hold onto the last message or last n-bytes submitted to be sent, it would be able to send the possible CopyData(EOF) and CopyDone once the COPY operation (at the interface level) is closed/shutdown/terminated. Granted, this is dependent on CopyData(EOF) not being in the middle of regular CopyData, but I gather that that would end in an ErrorResponse anyways.

> I still maintain that this is a server bug. It is not OK for the server to assume that the client is done and move on, the client must tell the server what it wants done.

I'm a bit torn here. While it would seem to be either a bug in the spec or a bug in the server, I'm inclined to call it a wart in the server's implementation of the spec.

I don't see the fix as being dangerous, but I imagine an implementor would want to have the workaround in place regardless. I certainly would.

I'd be in favor of seeing this fixed in 9.x, and the documentation updated to warn implementors about the wart in the older versions.. That is, I don't see any reason why we can't get rid of this unsightly thing considering the workarounds would still work with a wart-free server.

cheers, jwp


From: Kris Jurka <books(at)ejurka(dot)com>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-28 16:53:49
Message-ID: alpine.BSO.2.00.1007281249290.9018@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Wed, 28 Jul 2010, James William Pye wrote:

>
> hrm, I suppose a lazy way around that problem would be to suspend all
> client messages(client_min_messages) during COPY IN. Tho, I guess one
> would still have to contend with NotificationResponse, and
> ParameterStatus..

Technically you won't get NotificationResponse until transaction end, so
you don't need to worry about that mid copy.

> I don't think you would have to peek in. If the interface were to always
> hold onto the last message or last n-bytes submitted to be sent, it
> would be able to send the possible CopyData(EOF) and CopyDone once the
> COPY operation (at the interface level) is closed/shutdown/terminated.
> Granted, this is dependent on CopyData(EOF) not being in the middle of
> regular CopyData, but I gather that that would end in an ErrorResponse
> anyways.

One of the key points of confusion is that CopyData(EOF) does not result
in an error. It results in ignoring any futher data. The problem I have
is that for text mode it waits for CopyDone, but in binary mode it ends
the copy sequence immediately. Additionally the interface exposed by the
JDBC driver lets the user write arbitrary CopyData bytes to the server, so
without parsing all of that we don't know whether they've issued
CopyData(EOF) or not.

Kris Jurka


From: James William Pye <lists(at)jwp(dot)name>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-28 20:19:26
Message-ID: BDD3690F-8290-4FC4-B6DE-1ED6E5558D83@jwp.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Jul 28, 2010, at 9:53 AM, Kris Jurka wrote:
> Technically you won't get NotificationResponse until transaction end, so you don't need to worry about that mid copy.

Ah, thanks for noting that. It would appear my original reading of the async section didn't get far enough beyond "Frontends must be prepared to deal with these messages at any time, even when not engaged in a query.". I see the note below clarifying NotificationResponse.

> One of the key points of confusion is that CopyData(EOF) does not result in an error.
> It results in ignoring any futher data.
> The problem I have is that for text mode it waits for CopyDone, but in binary mode it ends the copy sequence immediately.

That is bothersome. :\

> Additionally the interface exposed by the JDBC driver lets the user write arbitrary CopyData bytes to the server, so without parsing all of that we don't know whether they've issued CopyData(EOF) or not.

Okay, so you can't know with absolute certainty without parsing the data, but the usual case would be handled by holding onto the last-N bytes or so. Enough to fit the EOF and perhaps a little more for paranoia's sake.

That's not to say that I'm missing the problem. When (not "if", "when") the user feeds data past a CopyData(EOF), it's going to get interesting.

[Thinking about the logic necessary to handle such a case and avoid network buffer deadlock...]
I would think the least invasive way to handle it would be to set the CommandComplete and ReadyForQuery messages aside when they are received if CopyDone hasn't been sent, continue the COPY operation as usual until it is shutdown, send CopyDone and, finally, "reinstate" CommandComplete and RFQ as if they were just received.. I don't think that really accommodates for CopyFail as the status in RFQ will need to be "adjusted" to match what was actually done..? Well, I'm not sure you would need to worry about NoticeResponse after a premature CommandComplete as INSERTs are no longer happening. ugh.

+1 for a fix.

Not directly regarding your patch, but while the discussion is in the general area.
I think it would be wise to throw an error when non-empty CopyData messages are received after CopyData(EOF). Chances are that the user is making a mistake and should be notified of it.

cheers, jwp


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-29 09:15:14
Message-ID: alpine.DEB.2.00.1007290952210.2654@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

(Yes, I know I'm not on the hackers list. Most interested parties should
get this directly anyway.)

>> Additionally the interface exposed by the JDBC driver lets the user
>> write arbitrary CopyData bytes to the server, so without parsing all of
>> that we don't know whether they've issued CopyData(EOF) or not.
>
> Okay, so you can't know with absolute certainty without parsing the
> data, but the usual case would be handled by holding onto the last-N
> bytes or so. Enough to fit the EOF and perhaps a little more for
> paranoia's sake.
>
> That's not to say that I'm missing the problem. When (not "if", "when")
> the user feeds data past a CopyData(EOF), it's going to get interesting.

This is the reason why the patch to the JDBC driver that I sent in is very
fragile. In the case where a user provides a binary copy with lots of data
after the EOF, the processCopyData method *will* get called after the
CommandComplete and ReadyForQuery messages have been received, even if we
try to delay processing of the ReadyForQuery message.

> [Thinking about the logic necessary to handle such a case and avoid
> network buffer deadlock...] I would think the least invasive way to
> handle it would be to set the CommandComplete and ReadyForQuery messages
> aside when they are received if CopyDone hasn't been sent, continue the
> COPY operation as usual until it is shutdown, send CopyDone and,
> finally, "reinstate" CommandComplete and RFQ as if they were just
> received..

Basically, yes. We need to introduce a little more state into the JDBC
driver. Currently, the driver is in one of two states:

1. In the middle of a copy.
2. Not in a copy.

These states are recorded in the lock system. We need to introduce a new
state, where the copy is still locked, but we know that the
CommandComplete and ReadyForQuery messages have been received. We can no
longer unlock the copy in processCopyData - we need to do that in endCopy
instead, after calling processCopyData to ensure that we wait for a valid
CommandComplete and ReadyForQuery message first.

Matthew

--
Terrorists evolve but security is intelligently designed? -- Jake von Slatt


From: Kris Jurka <books(at)ejurka(dot)com>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-08-06 23:31:45
Message-ID: alpine.BSO.2.00.1008061927300.17860@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Wed, 28 Jul 2010, James William Pye wrote:

> Not directly regarding your patch, but while the discussion is in the
> general area. I think it would be wise to throw an error when non-empty
> CopyData messages are received after CopyData(EOF). Chances are that the
> user is making a mistake and should be notified of it.
>

As this is also the direction that Tom Lane indicated we should go, here
is a patch which errors out after receiving any more copy data past the
EOF marker. This also fixes the protocol problem I previously brought up
because the act of checking to see if there is any more data does ensure
that if there isn't any more data in the current buffer, that we wait for
the client to provide CopyDone/Fail.

Kris Jurka

Attachment Content-Type Size
binary-copy-end-v2.patch text/plain 1.2 KB

From: James William Pye <lists(at)jwp(dot)name>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-08-07 00:45:15
Message-ID: 0448ABED-5F22-4EE5-95F5-A7917B25EFB6@jwp.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Aug 6, 2010, at 4:31 PM, Kris Jurka wrote:
> <binary-copy-end-v2.patch>

I think there's a snag in the patch:

postgres=# COPY data FROM '/Users/jwp/DATA.bcopy' WITH BINARY;
ERROR: row field count is -1, expected 1
CONTEXT: COPY data, line 4

Probably a quick/small fix away, I imagine.

But, I was able to trigger the new ERROR with py-postgresql:

>>> import postgresql as pg
>>> db=pg.open('localhost/postgres')
>>> q=db.prepare('copy data FROM STDIN WITH BINARY')
>>> from itertools import chain
>>> import sys
>>> db.pq.tracer = sys.stderr.write
>>> q.load_rows(chain(open('/Users/jwp/DATA.bcopy', 'rb'), (b'EXTRA',)))
↑ B(25): b'B\x00\x00\x00\x18\x00py:0x1268b30\x00\x00\x00\x00\x00\x00\x00'
↑ E(10): b'E\x00\x00\x00\t\x00\x00\x00\x00\x01'
↑ S(5): b'S\x00\x00\x00\x04'
↓ b'2'(0): b''
↓ b'G'(5): b'\x01\x00\x01\x00\x01'
↑__(7): b'PGCOPY\n'
↑__(3): b'\xff\r\n'
↑__(41): b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x00\x01\x00\x01\x00\x00\x00\x04\x00\x00\x00\x02\x00\x01\x00\x00\x00\x04\x00\x00\x00\x03\xff\xff'
↑__(5): b'EXTRA'
↑ c(5): b'c\x00\x00\x00\x04'
↑ S(5): b'S\x00\x00\x00\x04'
↓ b'E'(95): b'SERROR\x00C22P04\x00Mreceived copy data after EOF marker\x00WCOPY data, line 4\x00Fcopy.c\x00L2081\x00RCopyFrom\x00\x00'
↓ b'Z'(1): b'I'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
<snip>
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/postgresql/driver/pq3.py", line 462, in raise_server_error
raise server_error
postgresql.exceptions.BadCopyError: received copy data after EOF marker
CODE: 22P04
LOCATION: File 'copy.c', line 2081, in CopyFrom from SERVER
CONTEXT: COPY data, line 4
STATEMENT: [prepared]
sql_parameter_types: []
statement_id: py:0x1268b30
string: copy data FROM STDIN WITH BINARY
CONNECTION: [idle]
client_address: ::1/128
client_port: 63922
version:
PostgreSQL 9.1devel on x86_64-apple-darwin10.4.0, compiled by GCC i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664), 64-bit
CONNECTOR: [Host] pq://jwp:***(at)localhost:5432/postgres
category: None
DRIVER: postgresql.driver.pq3.Driver


From: Kris Jurka <books(at)ejurka(dot)com>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-08-07 07:09:59
Message-ID: alpine.BSO.2.00.1008070305470.12532@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Fri, 6 Aug 2010, James William Pye wrote:

> On Aug 6, 2010, at 4:31 PM, Kris Jurka wrote:
>> <binary-copy-end-v2.patch>
>
> I think there's a snag in the patch:
>
> postgres=# COPY data FROM '/Users/jwp/DATA.bcopy' WITH BINARY;
> ERROR: row field count is -1, expected 1
> CONTEXT: COPY data, line 4
>
> Probably a quick/small fix away, I imagine.

Hmm, not quite sure why that is. That seems to imply that it's not using
V3 protocol, but I thought binary copy could only be used with the V3
protocol. In any case, I think this new patch is more bulletproof.

Kris Jurka

Attachment Content-Type Size
binary-copy-end-v3.patch text/plain 1.2 KB

From: Kris Jurka <books(at)ejurka(dot)com>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-08-09 18:49:04
Message-ID: alpine.BSO.2.00.1008091447440.25873@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Sat, 7 Aug 2010, Kris Jurka wrote:

> On Fri, 6 Aug 2010, James William Pye wrote:
>
>> I think there's a snag in the patch:
>>
>> postgres=# COPY data FROM '/Users/jwp/DATA.bcopy' WITH BINARY;
>> ERROR: row field count is -1, expected 1
>> CONTEXT: COPY data, line 4
>>
>> Probably a quick/small fix away, I imagine.
>
> Hmm, not quite sure why that is. That seems to imply that it's not using V3
> protocol, but I thought binary copy could only be used with the V3 protocol.
> In any case, I think this new patch is more bulletproof.
>

Oh, duh. It's a server side copy not going through the client at all.
Here's a hopefully final patch.

Kris Jurka

Attachment Content-Type Size
binary-copy-end-v4.patch text/plain 1.2 KB

From: James William Pye <lists(at)jwp(dot)name>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-08-28 18:37:38
Message-ID: 060ADB72-80CA-4593-A333-DF2FD6BB0C2E@jwp.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Aug 9, 2010, at 11:49 AM, Kris Jurka wrote:
> Oh, duh. It's a server side copy not going through the client at all. Here's a hopefully final patch.

Trying it out... Works for me.

I understand the resistance to the patch, but it would be
quite nice to see this wart in the rear view. =\


From: Kris Jurka <books(at)ejurka(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Trouble with COPY IN
Date: 2010-08-31 18:13:47
Message-ID: alpine.BSO.2.00.1008311410310.1495@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

On Tue, 20 Jul 2010, Matthew Wakeling wrote:

> On Tue, 20 Jul 2010, Matthew Wakeling wrote:
>> So, should the unlocking happen elsewhere, or should endCopy not require a
>> lock?
>
> Or a third option. We should delay the processing of the "Z" (Ready for
> query) message until after the endCopy has been called.
>
> I have still not been able to create a test that provokes this bug. I do not
> understand under what conditions the server will do this, but sometimes it
> will respond with a "C" and a "Z" before the endCopy message has actually
> been sent to it. Maybe it is keying off the "-1" (no further rows) part of
> the binary copy. In any case, I am including a patch here which seems to fix
> the driver for my application. When the driver receives a "C" message
> (command finished), and block is set to false, then we can assume that a
> little later on we will call processCopyResults again very soon with block
> set to true. Instead of setting block to true, finish processing of messages
> in order to delay seeing the "Z" message until after the endCopy method has
> been called.
>
> The patch also contains a simple test for binary copy in, although it does
> not exercise this bug.

I've applied a patch to CVS to implement a variant of this that is more
selective of what it ignores. It peeks into the protocol stream and
ignores all CommandComplete messages while we aren't blocking (which
means that the driver thinks it isn't done yet).

Kris Jurka


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: James William Pye <lists(at)jwp(dot)name>, pgsql-hackers(at)postgresql(dot)org, Matthew Wakeling <matthew(at)flymine(dot)org>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [JDBC] Trouble with COPY IN
Date: 2010-09-18 20:15:45
Message-ID: 18984.1284840945@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Kris Jurka <books(at)ejurka(dot)com> writes:
>> On Fri, 6 Aug 2010, James William Pye wrote:
>>> I think there's a snag in the patch:

> Oh, duh. It's a server side copy not going through the client at all.
> Here's a hopefully final patch.

Applied with a correction: this would've totally broken binary copy in
old-style protocol, because there is no other EOF marker except the -1
in that case.

BTW, it strikes me that we could reduce the backwards-compatibility
impact of this patch if we made it ignore, rather than throw error for,
any extra data after the EOF marker. I left it as-is since ISTM the
more error checking you can have in a binary data format, the better.
But a case could be made for doing the other thing, especially if
somebody wanted to argue for back-patching this.

regards, tom lane