Re: [HACKERS] pg_dump disaster

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alfred Perlstein <bright(at)wintelcom(dot)net>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, prlw1(at)cam(dot)ac(dot)uk, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] pg_dump disaster
Date: 2000-01-21 15:44:02
Message-ID: 4769.948469442@sss.pgh.pa.us
Lists: pgsql-hackers

Alfred Perlstein <bright(at)wintelcom(dot)net> writes:
>>>> The answer appears to be that Perlstein's "nonblocking mode" patches
>>>> have broken psql copy, and doubtless a lot of other applications as
>>>> well, because pqPutBytes no longer feels any particular compulsion
>>>> to actually send the data it's been handed. (Moreover, if it does
>>>> do only a partial send, there is no way to discover how much it sent;
>>>> while its callers might be blamed for not having checked for an error
>>>> return, they'd have no way to recover anyhow.)

> pqPutBytes _never_ felt any compulsion to flush the buffer to the backend,
> or at least not since I started using it.

Sorry, I was insufficiently careful about my wording. It's true that
pqPutBytes doesn't worry about actually flushing the data out to the
backend. (It shouldn't, either, since it is typically called with small
fragments of a message rather than complete messages.) It did, however,
take care to *accept* all the data it was given and ensure that the data
was queued in the output buffer. As the code now stands, it's
impossible to tell whether all the passed data was queued or not, or how
much of it was queued. This is a fundamental design error, because the
caller has no way to discover what to do after a failure return (nor
even a way to tell if it was a hard failure or just I-won't-block).
Moreover, no existing caller of PQputline thinks it should have to worry
about looping around the call, so even if you put in a usable return
convention, existing apps would still be broken.
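To make the "usable return convention" point concrete, here is a minimal sketch of a put routine that always reports how much it queued and separates a hard failure from a would-block condition. Every name in it (OutBuf, outbuf_put, PutStatus, OUTBUF_SIZE) is hypothetical and chosen for illustration; none of this is libpq code, and the fixed-size buffer is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not libpq code. */
#define OUTBUF_SIZE 8192

typedef enum { PUT_OK, PUT_WOULD_BLOCK, PUT_ERROR } PutStatus;

typedef struct {
    char   data[OUTBUF_SIZE];
    size_t used;    /* bytes currently queued */
    int    broken;  /* nonzero once a hard failure has been seen */
} OutBuf;

/*
 * Queue as much of src as fits.  *queued is always set, so the caller
 * knows exactly where to resume (src + *queued) after PUT_WOULD_BLOCK,
 * and can tell that case apart from a hard failure (PUT_ERROR).
 */
static PutStatus
outbuf_put(OutBuf *buf, const char *src, size_t len, size_t *queued)
{
    size_t room;
    size_t n;

    *queued = 0;
    if (buf->broken)
        return PUT_ERROR;

    room = OUTBUF_SIZE - buf->used;
    n = (len < room) ? len : room;

    memcpy(buf->data + buf->used, src, n);
    buf->used += n;
    *queued = n;

    /* Everything accepted, or only part of it because the buffer is full. */
    return (n == len) ? PUT_OK : PUT_WOULD_BLOCK;
}
```

Even with a convention like this, a caller written against the old all-or-nothing behavior would still ignore *queued, which is why a changed return value alone does not rescue existing applications.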

Similarly, PQendcopy is now willing to return without having gotten
the library out of the COPY state, but the caller can't easily tell
what to do about it --- nor do existing callers believe that they
should have to do anything about it.

> The implications of this are truly annoying, exporting the socket to
> the backend to the client application causes all sorts of problems because
> the person select()'ing on the socket sees that it's 'clear' but yet
> all their data has not been sent...

Yeah, the original set of exported routines was designed without any
thought of handling a nonblock mode. But you aren't going to be able
to fix them this way. There will need to be a new set of entry points
that add a concept of "operation not complete" to their API, and apps
that want to avoid blocking will need to call those instead. Compare
what's been done for connecting (PQconnectPoll) and COPY TO STDOUT
(PQgetlineAsync).
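As a rough sketch of what an entry point with an "operation not complete" result could look like, the following polls a pending output buffer and returns one of three states. All names here (OpStatus, PendingOut, pending_flush_poll, xmit_fn, trickle_send) are hypothetical and are not part of libpq; the transport is passed in as a callback purely so the example stands alone, and trickle_send is a stand-in that simulates a socket which accepts only a limited number of bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not libpq code. */
typedef enum { OP_DONE, OP_NOT_COMPLETE, OP_FAILED } OpStatus;

/* Transport hook: bytes written, 0 if it would block, -1 on hard error. */
typedef long (*xmit_fn)(const char *p, size_t n, void *ctx);

typedef struct {
    char   buf[256];
    size_t len;   /* bytes queued */
    size_t sent;  /* bytes already handed to the transport */
} PendingOut;

/*
 * Push queued bytes without blocking.  The caller re-invokes this
 * (typically after select() reports the socket writable) until it
 * stops returning OP_NOT_COMPLETE.
 */
static OpStatus
pending_flush_poll(PendingOut *out, xmit_fn xmit, void *ctx)
{
    while (out->sent < out->len)
    {
        long n = xmit(out->buf + out->sent, out->len - out->sent, ctx);

        if (n < 0)
            return OP_FAILED;
        if (n == 0)
            return OP_NOT_COMPLETE;   /* progress so far is remembered */
        out->sent += (size_t) n;
    }
    out->len = out->sent = 0;         /* buffer fully drained */
    return OP_DONE;
}

/* Stand-in transport: accepts at most *budget bytes, then would block. */
static long
trickle_send(const char *p, size_t n, void *ctx)
{
    size_t *budget = (size_t *) ctx;
    size_t  take = (n < *budget) ? n : *budget;

    (void) p;                         /* data is discarded in this demo */
    *budget -= take;
    return (long) take;
}
```

The point of the three-way result is that the state needed to resume lives in the library's own structure, so an application that opts in can loop on the poll call, while applications using the existing blocking entry points are left undisturbed.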

It's possible that things were broken before you got to them --- there
have been several sets of not-very-carefully-reviewed patches to libpq
during the current development cycle, so someone else may have created
the seeds of the problem. However, we weren't seeing failures in psql
before this week...

regards, tom lane
