Quick Links

Re: Binary Cursors, and the COPY command

Lists:	pgsql-hackerspgsql-jdbc

From:	pgsql(at)mohawksoft(dot)com
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Binary Cursors, and the COPY command
Date:	2004-07-26 17:41:47
Message-ID:	10519.64.119.142.34.1090863707.squirrel@mail.mohawksoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

OK, I wrote a utility for 7.3 that takes the output of a select command in
a Binary cursor and creates a binary "COPY" file.

The premise of the utility is to take the results of two or more selects
from external databases and create a single unified table.

Here are the issues:

In 7.3, COPY BINARY was machine specific and so was the output of a binary
cursor. Everything just worked fine.

In 7.4, COPY BINARY uses "network byte order," i.e. native data types are
altered to big endian if nessisary. The documentation for binary cursors
does not specify whether or not the "binary" data is native or "network
byte order."

I have a few issues with "COPY BINARY" using "network byte order," first,
it is pointless. The problem it intends to solve, i.e. transfering across
different machine types is already answered using the tried and true ascii
method.

Second, it actually makes the COPY functionality less usable. You can not
create the data outside of the database because all the data type
definitions and manipulation functions are inside the database. (Unless
you only use simple data types, of course.)

Third, if a binary cursor does not encode the binary data as "network byte
order" a binary copy can ONLY communicate between two postgreSQL databases
because the information required to go from native ordering to network
ordering is only in the database.

Lastly, the vast majority of machines in use today are intel. Meaning that
they are small endian. Except in a very rare circumstance, two machines
that would normally be able to communicate in native byte order, will
ALWAYS have to convert data.

The only use case network byte order fixes is a BINARY COPY between
different machine types, but in doing that, it forces anyone trying to add
value to postgresql or create a utility that uses COPY to reimplement all
the data type handlers outside of the database, even if they never need to
interpret or inspect the values, because they have to do this to put them
in network byte order.

I would say that the history of the word "BINARY" would tend more to
indicate incompatible machine specific data.

I would submit that the 7.4 format of data, i.e. one data size int32
instead of an int16 followed by the optional int32 is cleaner, but I would
remove the "network byte order" and put the byte order int32 back in the
header for 7.5

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql(at)mohawksoft(dot)com
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-26 18:11:38
Message-ID:	6334.1090865498@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

pgsql(at)mohawksoft(dot)com writes:
> Lastly, the vast majority of machines in use today are intel. Meaning that
> they are small endian. Except in a very rare circumstance, two machines
> that would normally be able to communicate in native byte order, will
> ALWAYS have to convert data.

Quite honestly, that is exactly one of the reasons for using network
order. If we standardized on little-endian then 90% of programmers
would ignore the need to do anything about byte ordering issues, and
their code would be unportable to big-endian machines.

Putting a byte order flag into the header doesn't improve matters;
it just forces everybody to deal with *both* orders, which is not
simpler. (Except for those who would like not to be bothered with
portability, which is a position I have zero sympathy for.)

If you don't want to deal with this, don't use binary data. It's as
simple as that.

BTW, for your stated use-case of combining binary output from separate
databases, I should think you'd be happy about having a not-so-machine-
dependent data format. It would still work if the databases were
running on servers of different endianness.

regards, tom lane

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	pgsql(at)mohawksoft(dot)com
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-26 21:17:45
Message-ID:	410574F9.9020103@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

pgsql(at)mohawksoft(dot)com wrote:

> The only use case network byte order fixes is a BINARY COPY between
> different machine types, but in doing that, it forces anyone trying to add
> value to postgresql or create a utility that uses COPY to reimplement all
> the data type handlers outside of the database, even if they never need to
> interpret or inspect the values, because they have to do this to put them
> in network byte order.

This is not true if you happen to be using Java on the client side,
which has no idea (unless you grot around in the guts of the JVM) what
the native byte order is. This actually means that Java clients have the
opposite problem -- it's a lot of work to try to use the 7.3-style
binary formats.

> I would say that the history of the word "BINARY" would tend more to
> indicate incompatible machine specific data.

"Binary" implies "not plaintext" to me..

What about binary parameters in Bind or binary resultsets from Execute?
They follow the same format as binary COPY values. Are you suggesting
those should be changed too?

-O

From:	pgsql(at)mohawksoft(dot)com
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql(at)mohawksoft(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 02:06:28
Message-ID:	17868.24.91.171.78.1090893988.squirrel@mail.mohawksoft.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

> pgsql(at)mohawksoft(dot)com writes:
>> Lastly, the vast majority of machines in use today are intel. Meaning
>> that
>> they are small endian. Except in a very rare circumstance, two machines
>> that would normally be able to communicate in native byte order, will
>> ALWAYS have to convert data.
>
> Quite honestly, that is exactly one of the reasons for using network
> order. If we standardized on little-endian then 90% of programmers
> would ignore the need to do anything about byte ordering issues, and
> their code would be unportable to big-endian machines.

So what you are saying is that you should inconvenience 90% of your users
to make sure they do something "right?"

When you *really* think about it, by forcing a big endian byte order over
machine dependent byte order, you are inconveniencing 90% of the users,
but only helping the very small amount of people who run in mixed
environments where the server is intel and the client is big endian.

>
> Putting a byte order flag into the header doesn't improve matters;
> it just forces everybody to deal with *both* orders, which is not
> simpler. (Except for those who would like not to be bothered with
> portability, which is a position I have zero sympathy for.)

While I understand your lack of sympathy, I think you are too removed from
the trenches. There is a viable argument in many projects for machine
dependent constructs. Sometimes there are real-world performance and
delivery goals with absolutely stated objectives that the project is not
heterogenus.

>
> If you don't want to deal with this, don't use binary data. It's as
> simple as that.

That's not a good answer either. Come on, what the hell good is having a
"BINARY" if the data is not usable? Using ascii isn't good either because
you need a text parser for every possible interpretation of the various
simple types.

>
> BTW, for your stated use-case of combining binary output from separate
> databases, I should think you'd be happy about having a not-so-machine-
> dependent data format. It would still work if the databases were
> running on servers of different endianness.

Few deployments will *ever* really have different endian-ness amongst
their servers. 99% of all deployments will be the same or compatible
hardware, and probably intel at that.

OK, so you like the idea of binary being "big endian," what about the idea
of adding a keyword that is exclusive of "binary" called "native" or
something like that?

DECLARE fubar NATIVE CURSOR ...

COPY fubar {FROM | TO} {'filename' | STD..} WITH [NATIVE | BINARY] ...

From:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 06:39:49
Message-ID:	ce4t7l$pas$2@sea.gmane.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Oliver Jowett wrote:
> This is not true if you happen to be using Java on the client side,
> which has no idea (unless you grot around in the guts of the JVM) what
> the native byte order is.
>
The method java.nio.ByteOrder.nativeOrder() will tell you what the
native byte order is.

> This actually means that Java clients have the
> opposite problem -- it's a lot of work to try to use the 7.3-style
> binary formats.
>
The commonly used java.io.DataInput will always use network order but
it's easy enough to read/write little endian using the java.nio and
java.nio.channel packages.

Regards,

Thomas Hallgren

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 07:07:47
Message-ID:	4105FF43.40508@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Thomas Hallgren wrote:
> Oliver Jowett wrote:
> > This is not true if you happen to be using Java on the client side,
> > which has no idea (unless you grot around in the guts of the JVM) what
> > the native byte order is.
> >
> The method java.nio.ByteOrder.nativeOrder() will tell you what the
> native byte order is.

NIO is not present before JDK 1.4. The JDBC driver, at least, needs to
support earlier JVMs.

> > This actually means that Java clients have the
> > opposite problem -- it's a lot of work to try to use the 7.3-style
> > binary formats.
> >
> The commonly used java.io.DataInput will always use network order but
> it's easy enough to read/write little endian using the java.nio and
> java.nio.channel packages.

That's no use if you don't know the endianness of the data you're
receiving (which is what happens under <= 7.3 -- the data followed the
server's byte ordering)

The problem with using native byte orderings is not the byte ordering
itself, but that the order is unpredictable -- at best, you have to
implement code to handle both orders, and at worst you have to just take
a guess and hope you were right..

-O

From:	"Thomas Hallgren" <thhal(at)mailblocks(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 09:39:54
Message-ID:	ce587j$ga1$1@sea.gmane.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

"Oliver Jowett" <oliver(at)opencloud(dot)com> wrote in message
news:4105FF43(dot)40508(at)opencloud(dot)com(dot)(dot)(dot)
> NIO is not present before JDK 1.4. The JDBC driver, at least, needs to
> support earlier JVMs.
>
Clients only capable of network order (such as a Java 1.3 based JDBC driver)
must of course be supported still. No argument there. My objection was to
your general statement that "Java has no idea what the native byte order
is".

Another more philosophical question (more suitable on the jdbc list) is when
the Java 1.3 support should be limited (or perhaps discontinued altogether)
so that further development can exploit everything that 1.4 provides. After
all, it's been around for more than 2 years now. AFAIK, the early bugs
forcing you to cling on to the 1.3 have been fixed a long time ago. Doesn't
the current 3.0 driver make use of features from the Java 1.4 version of
java.sql already?

> The problem with using native byte orderings is not the byte ordering
> itself, but that the order is unpredictable -- at best, you have to
> implement code to handle both orders, and at worst you have to just take
> a guess and hope you were right..
>
Sure, but those problems are present regardless of implementation language.

Regards,

Thomas Hallgren

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
Cc:	pgsql-jdbc(at)postgresql(dot)org
Subject:	Re: [HACKERS] Binary Cursors, and the COPY command
Date:	2004-07-27 10:04:41
Message-ID:	410628B9.2080800@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

(moved to -jdbc)

Thomas Hallgren wrote:

> Another more philosophical question (more suitable on the jdbc list) is when
> the Java 1.3 support should be limited (or perhaps discontinued altogether)
> so that further development can exploit everything that 1.4 provides. After
> all, it's been around for more than 2 years now. AFAIK, the early bugs
> forcing you to cling on to the 1.3 have been fixed a long time ago. Doesn't
> the current 3.0 driver make use of features from the Java 1.4 version of
> java.sql already?

JDBC3 support implies JDK1.4. But the driver has the JDBC3 support
separated out specifically so it's still possible to build under earlier
JDKs.

We only recently dropped JDK1.1 support (!). I think there might be some
resistance to dropping JDK1.2 / JDK1.3 support. Personally I target
JDK1.4 so it doesn't affect me anyway, but..

What features of 1.4 were you thinking of exploiting?

-O

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 10:10:57
Message-ID:	41062A31.7000800@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Thomas Hallgren wrote:
> "Oliver Jowett" <oliver(at)opencloud(dot)com> wrote in message
> news:4105FF43(dot)40508(at)opencloud(dot)com(dot)(dot)(dot)
>
>>NIO is not present before JDK 1.4. The JDBC driver, at least, needs to
>>support earlier JVMs.
>>
>
> Clients only capable of network order (such as a Java 1.3 based JDBC driver)
> must of course be supported still. No argument there. My objection was to
> your general statement that "Java has no idea what the native byte order
> is".

I suppose so. I'd point out that the NIO byteorder info is really just
an optimization hint -- the rest of the NIO API is byteorder agnostic,
regardless of what the native byteorder is.

>>The problem with using native byte orderings is not the byte ordering
>>itself, but that the order is unpredictable -- at best, you have to
>>implement code to handle both orders, and at worst you have to just take
>>a guess and hope you were right..
>
> Sure, but those problems are present regardless of implementation language.

That was my real argument..

-O

From:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 10:41:49
Message-ID:	4106316D.2000605@mailblocks.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Oliver Jowett wrote:
> I suppose so. I'd point out that the NIO byteorder info is really just
> an optimization hint -- the rest of the NIO API is byteorder agnostic,
> regardless of what the native byteorder is.
>
The rest of the NIO API is not agnostic. A java.nio.ByteBuffer is big
endian by default and can be changed using the order method. This is the
mechanism you'd use to implement I/O that can handle either endian type.

Regards,

Thomas Hallgren

From:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
To:	Oliver Jowett <oliver(at)opencloud(dot)com>
Cc:	pgsql-jdbc(at)postgresql(dot)org, pgsql(at)mohawksoft(dot)com
Subject:	Re: [HACKERS] Binary Cursors, and the COPY command
Date:	2004-07-27 11:28:55
Message-ID:	thhal-0Q9bhAYd/UNsDDyjznLWQVb9lU3V0FY@mailblocks.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Oliver Jowett wrote:

> What features of 1.4 were you thinking of exploiting?

Well, the java.nio obviously :-)

java.nio provides a java.nio.CharBuffer. A java.lang.StringBuffer is
synchronized. The CharBuffer is not. Since the JDBC driver uses strings
in a lot of places some code could be rewritten to increase performance.

And I think that Mark (pgsql(at)mohawksoft(dot)com) has a point. 90% of all
installations would get a performance boost if native byte order was
used. IMHO, it's all about an initial protocoll negotiation (which of
course must be done in network byte order).

Regards,

Thomas Hallgren

From:	"Jeroen T(dot) Vermeulen" <jtv(at)xs4all(dot)nl>
To:	pgsql(at)mohawksoft(dot)com
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Binary Cursors, and the COPY command
Date:	2004-07-27 11:30:04
Message-ID:	20040727113003.GA51388@xs4all.nl
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

On Mon, Jul 26, 2004 at 10:06:28PM -0400, pgsql(at)mohawksoft(dot)com wrote:

> So what you are saying is that you should inconvenience 90% of your users
> to make sure they do something "right?"

I would say that was pretty solid reasoning. Exposing 10% of users to a
high data corruption risk just to get slightly better CPU performance on
I/O bound operations (think Amdahl's Law!) does not sound all that sensible
to me. Remember what happened to NT on the Alpha? Device driver writers
failed to use the portability macros for byte or 16-bit memory-mapped
hardware register accesses, exactly because it also worked without on x86.

This was one of the main reasons why they had to add 8-bit and 16-bit
loads and stores to the architecture! Similarly, lots of programs harbor
annoying, hard-to-find endianness bugs because little-endian byte order
happens to hide some typical pointer bugs.

> When you *really* think about it, by forcing a big endian byte order over
> machine dependent byte order, you are inconveniencing 90% of the users,
> but only helping the very small amount of people who run in mixed
> environments where the server is intel and the client is big endian.

I don't see this holding in situations where the user, the programmer, and
the systems manager are different (groups of) people. Users may want to
connect to different servers. Systems managers may want to replace servers.
In today's world even a non-programming user might recompile your
application on a big-endian machine. Applications may be running under
emulation, and people will prefer to run the database server natively.

All those scenarios may break the client-side application. The chance that
the mixed-endian scenario had never been tested would be close to those 90%.

> Few deployments will *ever* really have different endian-ness amongst
> their servers. 99% of all deployments will be the same or compatible
> hardware, and probably intel at that.

I'd like to add 2 points here:

1. Server architectures and client architectures are two very different
things. There are lots of SPARC, PA-RISC, MIPS, POWER, zSeries etc.
servers out there.

2. Even if both platforms are AMD or AMD-clones (we'll have to get used to
saying that as 64-bit becomes more popular), the number of situations where
this is _guaranteed_ as a part of the project definition will be much lower.

Remember, the programmer should generally support both the single-endian
and the big-endian scenario anyway. We might as well make sure that the
hardest is also the most widely tested.

Jeroen

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
Cc:	pgsql-jdbc(at)postgresql(dot)org, pgsql(at)mohawksoft(dot)com
Subject:	Re: [HACKERS] Binary Cursors, and the COPY command
Date:	2004-07-27 11:43:58
Message-ID:	41063FFE.8030400@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Thomas Hallgren wrote:

> Well, the java.nio obviously :-)
>
> java.nio provides a java.nio.CharBuffer. A java.lang.StringBuffer is
> synchronized. The CharBuffer is not. Since the JDBC driver uses strings
> in a lot of places some code could be rewritten to increase performance.

I haven't seen the string manipulation to be much of a problem with the
current driver in the profiling I've done (it'll be quite application
specific though). And I'd have thought the stringbuffer monitors would
be essentially uncontended and cheap to enter. What are the hotspots you
see?

> And I think that Mark (pgsql(at)mohawksoft(dot)com) has a point. 90% of all
> installations would get a performance boost if native byte order was
> used.

Well, not mine :) (x86 clients, sparc server, and the driver doesn't yet
use binary format in places where byte order matters anyway)

Also I am fairly suspicious about claims that native byte order will
make things go measurably faster. Do you have any profiling or
benchmarks to back that up? The low-level manipulation of protocol data
barely shows up on the profiles I've done.

-O

From:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
To:	Oliver Jowett <oliver(at)opencloud(dot)com>
Cc:	pgsql-jdbc(at)postgresql(dot)org
Subject:	Re: [HACKERS] Binary Cursors, and the COPY command
Date:	2004-07-27 12:58:36
Message-ID:	thhal-0SOvhAV2AUNsrD1b21bnUVdXao4af1P@mailblocks.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Oliver Jowett wrote:

> I haven't seen the string manipulation to be much of a problem with
> the current driver in the profiling I've done (it'll be quite
> application specific though). And I'd have thought the stringbuffer
> monitors would be essentially uncontended and cheap to enter. What are
> the hotspots you see?
>
See below...

> Also I am fairly suspicious about claims that native byte order will
> make things go measurably faster. Do you have any profiling or
> benchmarks to back that up? The low-level manipulation of protocol
> data barely shows up on the profiles I've done.

I haven't made any benchmarks and I admit that as long as the client and
server runs in separate processes, the gain will be relatively small.
Context switching and socket management are the real time consumers.

I'm the author of PL/Java. It uses its own JDBC driver on top of SPI.
The main reason for thas is that I don't want the overhead of streaming
data and flipping byte order when everything is readily available in
memory. When "client" and "server" resides in the same process the
overhead is measurable. By using java.nio in your JDBC, I beleive it
would be possible to not just use native byte ordering, but perhaps also
to create a nice abstraction allowing direct access to structures in
memory rather than streaming data, thus obliviate the need for my own
driver. And PL/Java will never run on Java 1.3 or older :-)

Do you have any opinion on that?

Regards,

Thomas Hallgren

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	Thomas Hallgren <thhal(at)mailblocks(dot)com>
Cc:	pgsql-jdbc(at)postgresql(dot)org
Subject:	Re: [HACKERS] Binary Cursors, and the COPY command
Date:	2004-07-27 22:19:03
Message-ID:	4106D4D7.6070902@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-jdbc

Thomas Hallgren wrote:

> I'm the author of PL/Java. It uses its own JDBC driver on top of SPI.
> The main reason for thas is that I don't want the overhead of streaming
> data and flipping byte order when everything is readily available in
> memory. When "client" and "server" resides in the same process the
> overhead is measurable. By using java.nio in your JDBC, I beleive it
> would be possible to not just use native byte ordering, but perhaps also
> to create a nice abstraction allowing direct access to structures in
> memory rather than streaming data, thus obliviate the need for my own
> driver. And PL/Java will never run on Java 1.3 or older :-)
>
> Do you have any opinion on that?

This is a *lot* of work. There are bigger issues to deal with than the
use of NIO -- lots of the higher-level JDBC code makes assumptions about
the details of the protocol it is speaking. I've made some inroads on
that front with the V3 protocol rewrite I did recently but there will
still be a lot more work in that area.

That said, I can see that it might work with "SPI within the same
process" as a new underlying "protocol". That could use NIO or whatever
you want, and conditionally build that code only when 1.4 + the SPI
interface is available. The query/parameter abstraction that's currently
in place is intended to give the protocol flexibility about how it
represents the data (although it is pretty basic at the moment) so I
don't think it'd be necessary to have NIO in the protocol-independent
parts of the driver.

Anyway, good luck :)

-O