Re: Speed dblink using alternate libpq tuple storage

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-06 02:13:45
Message-ID: 20120306.111345.25933910.horiguchi.kyotaro@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, I'm sorry for the abesnce.

> But it's broken in V3 protocol - getAnotherTuple() will be called
> only if the packet is fully read. If the packet contents do not
> agree with packet header, it's protocol error. Only valid EOF
> return in V3 getAnotherTuple() is when row processor asks
> for early exit.

Original code of getAnotherTuple returns EOF when the bytes to
be read is not fully loaded. I understand that this was
inappropriately (redundant checks?) written at least for the
pqGetInt() for the field length in getAnotherTuple. But I don't
understand how to secure the rows (or table data) fully loaded at
the point of getAnotherTuple called...

Nevertheles the first pgGetInt() can return EOF when the
previsous row is fully loaded but the next row is not loaded so
the EOF-rerurn seems necessary even if the each row will passed
after fully loaded.

> * Convert old EOFs to protocol errors in V3 getAnotherTuple()

Ok. I will do that.

> * V2 getAnotherTuple() can leak PGresult when handling custom
> error from row processor.

mmm. I will confirm it.

> * remove pqIsnonblocking(conn) check when row processor returned 2.
> I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
> on sync connection.

mmm. EOF from getAnotherTuple makes PQgetResult try furthur
reading until asyncStatus != PGASYNC_BUSY as far as I saw. And It
seemed to do so when I tried to remove 'return 2'. I think that
it is needed at least one additional state for asyncStatus to
work EOF as desied here.

> * It seems the return codes from callback should be remapped,
> (0, 1, 2) is unusual pattern. Better would be:
>
> -1 - error
> 0 - stop parsing / early exit ("I'm not done yet")
> 1 - OK ("I'm done with the row")

I almost agree with it. I will consider the suggestion related to
pqAddRow together.

> * Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().
> Main problem is that it needs to be synced with error handling
> in rest of libpq, which is unlike the rest of row processor patch,
> which consists only of local changes. All solutions here
> are either ugly hacks or too complex to be part of this patch.

Ok, I will take your advice.

> Also considering that we have working exceptions and PQgetRow,
> I don't see much need for custom error messages. If really needed,
> it should be introduced as separate patch, as the area of code it
> affects is completely different.

I agree with it.

> Currently the custom error messaging seems to be the blocker for
> this patch, because of raised complexity when implementing it and
> when reviewing it. Considering how unimportant the provided
> functionality is, compared to rest of the patch, I think we should
> simply drop it.

Ok.

> My suggestion - check in getAnotherTuple whether resultStatus is
> already error and do nothing then. This allows internal pqAddRow
> to set regular "out of memory" error. Otherwise give generic
> "row processor error".

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-03-06 02:17:47 Re: pg_upgrade --logfile option documentation
Previous Message Bruce Momjian 2012-03-06 00:37:02 Dropping PL language retains support functions