Re: Speed dblink using alternate libpq tuple storage

Lists: pgsql-hackers
From: horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp
To: markokr(at)gmail(dot)com
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-01-30 11:42:20
Message-ID: 20120130.204220.24171357.horiguchi.kyotaro@horiguti.oss.ntt.co.jp

I'm sorry.

> Thank you for comments, this is revised version of the patch.

The malloc error handling in dblink.c of the patch is broken: it
leaks memory and breaks control flow.

I'll re-send a properly fixed patch for dblink.c later.

# This severe back pain should have made me stupid :-p

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-01-31 02:59:31
Message-ID: 20120131.115931.266804142.horiguchi.kyotaro@oss.ntt.co.jp

This is the fixed version of dblink.c for the row processor.

> i'll re-send the properly fixed patch for dblink.c later.

- A malloc error in initStoreInfo now throws ERRCODE_OUT_OF_MEMORY. (new error)

- storeHandler() now returns FALSE on malloc failure. Garbage
  cleanup is done in dblink_fetch() or dblink_record_internal().
  As before, dblink reports this error on the user session as
  'unknown error/could not execute query'.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
dblink_use_rowproc_20120131.patch text/x-patch 11.8 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-01-31 07:19:22
Message-ID: CACMqXC+PJHqxhPTDdOYSHqZBenm3j7otRi5P5_f30W05o9cxig@mail.gmail.com

On Tue, Jan 31, 2012 at 4:59 AM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
> This is fixed version of dblink.c for row processor.
>
>> i'll re-send the properly fixed patch for dblink.c later.
>
> - malloc error in initStoreInfo throws ERRCODE_OUT_OF_MEMORY. (new error)
>
> - storeHandler() now returns FALSE on malloc failure. Garbage
>  cleanup is done in dblink_fetch() or dblink_record_internal().
>  The behavior that this dblink displays this error as 'unkown
>  error/could not execute query' on the user session is as it did
>  before.

Another thing: if realloc() fails, the old pointer stays valid.
So it needs to be assigned to temp variable first, before
overwriting old pointer.

And seems malloc() is preferable to palloc() to avoid
exceptions inside row processor. Although latter
one could be made to work, it might be unnecessary
complexity. (store current pqresult into remoteConn?)

--
marko


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-01 08:32:16
Message-ID: 20120201.173216.21791705.horiguchi.kyotaro@oss.ntt.co.jp

Hello,

> Another thing: if realloc() fails, the old pointer stays valid.
> So it needs to be assigned to temp variable first, before
> overwriting old pointer.

Mmm, I had misunderstood realloc. I'll fix that in the next
patch.

> And seems malloc() is preferable to palloc() to avoid
> exceptions inside row processor. Although latter
> one could be made to work, it might be unnecessary
> complexity. (store current pqresult into remoteConn?)

Hmm.. palloc may throw ERRCODE_OUT_OF_MEMORY, so I must catch it
and return NULL. That leaves no real difference from using malloc
after all.. However, the prohibition on throwing exceptions in
RowProcessor is not based on any established fact, so palloc here
may make sense if we can do that.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-01 09:52:27
Message-ID: CACMqXCL_+Wzpsb_0niZP4koB9T3-UpYLRJ0yOD6c3a-YfuKqpQ@mail.gmail.com

On Wed, Feb 1, 2012 at 10:32 AM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>> Another thing: if realloc() fails, the old pointer stays valid.
>> So it needs to be assigned to temp variable first, before
>> overwriting old pointer.
>
>  mmm. I've misunderstood of the realloc.. I'll fix there in the
> next patch.

Please wait a moment, I started doing small cleanups,
and now have some larger ones too. I'll send it soon.

OTOH, if you have already done something, you can send it;
I have various states in Git, so it should not be hard
to merge things.

>> And seems malloc() is preferable to palloc() to avoid
>> exceptions inside row processor.  Although latter
>> one could be made to work, it might be unnecessary
>> complexity.  (store current pqresult into remoteConn?)
>
> Hmm.. palloc may throw ERRCODE_OUT_OF_MEMORY so I must catch it
> and return NULL. That seems there is no difference to using
> malloc after all.. However, the inhibition of throwing exceptions
> in RowProcessor is not based on any certain fact, so palloc here
> may make sense if we can do that.

No, I was thinking about storing the result in the connection
struct and then letting the exception pass, as the
PGresult can be cleaned up later. Thus we could get rid
of TRY/CATCH in the per-row handler. (At that point
the PGresult is already under PGconn, so maybe
it's enough to clean that one up later?)

But for now, as the TRY is already there, it should also be
simple to move the palloc usage inside it.

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-01 14:19:42
Message-ID: 20120201141942.GA17303@gmail.com

On Wed, Feb 01, 2012 at 11:52:27AM +0200, Marko Kreen wrote:
> Please wait a moment, I started doing small cleanups,
> and now have some larger ones too. I'll send it soon.

I started doing small comment cleanups, but then the changes
spread out a bit...

Kyotaro-san, please review. If you don't find anything to fix,
then it's ready for committer, I think.

Changes:

== dblink ==

- Increase area in bigger steps

- Fix realloc leak on failure

== libpq ==

- Replace dynamic stack array with malloced rowBuf on PGconn.
It is only increased, never decreased.

- Made PGresult.rowProcessorErrMsg allocated with pqResultStrdup,
which seems more proper. It would be even better to
use ->errmsg for it, but its lifetime and usage
are unclear to me.

- Removed the rowProcessor, rowProcessorParam from PGresult.
They are unnecessary there.

- Moved the conditional msg outside of libpq_gettext()
- Removed the unnecessary memcpy() from pqAddRow

- Moved PQregisterRowProcessor to fe-exec.c, made pqAddRow static.

- Restored pqAddTuple export.

- Renamed few symbols:
PQregisterRowProcessor -> PQsetRowProcessor
RowProcessor -> PQrowProcessor
Mes -> Msg (more common abbreviation)

- Updated some comments

- Updated sgml doc

== Benchmark ==

I did try to benchmark whether the patch affects stock libpq usage.
But, uh, I could not draw any conclusions. It *seems* that the patch
is a bit faster with few columns and a bit slower with many, but the
difference seems smaller than the test noise. OTOH, the test noise is
pretty big, so maybe my setup is not stable enough to benchmark properly.

As the test app does not actually touch the resultset, it seems
probable that an application that actually does something with the
resultset will not see the difference.

It would be nice if anybody who has a stable pgbench setup could
run a few tests to see whether the patch moves things either way.

Just in case, I attached the minimal test files.

--
marko

Attachment Content-Type Size
libpq_rowproc_2012-02-01.diff text/x-diff 21.9 KB
libpq_rowproc_doc_2012-02-01.diff text/x-diff 6.0 KB
dblink_rowproc_2012-02-01.diff text/x-diff 13.4 KB
test_prog.c text/x-csrc 1.0 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-02 07:51:37
Message-ID: 20120202.165137.155927529.horiguchi.kyotaro@oss.ntt.co.jp

Hello, this is a new version of dblink.c.

- Memory is properly freed when realloc returns NULL in storeHandler().

- Fixed the bug where free() in finishStoreInfo() would be fed a
garbage pointer when the malloc for sinfo->valbuflen fails.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
dblink_use_rowproc_20120202.patch text/x-patch 11.9 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-02 14:30:57
Message-ID: 20120202143057.GA12434@gmail.com

On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
> Hello, This is new version of dblink.c
>
> - Memory is properly freed when realloc returns NULL in storeHandler().
>
> - The bug that free() in finishStoreInfo() will be fed with
> garbage pointer when malloc for sinfo->valbuflen fails is
> fixed.

Thanks, merged. I also did some minor coding style cleanups.

Tagging it Ready For Committer. I don't see any notable
issues with the patch anymore.

There is some potential for experimenting with more aggressive
optimizations on dblink side, but I'd like to get a nod from
a committer for libpq changes first.

--
marko

Attachment Content-Type Size
libpq_rowproc_2012-02-02-2.diff text/x-diff 22.0 KB
libpq_rowproc_doc_2012-02-02-2.diff text/x-diff 6.0 KB
dblink_rowproc_2012-02-02-2.diff text/x-diff 13.7 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-03 04:01:11
Message-ID: 20120203.130111.225335626.horiguchi.kyotaro@oss.ntt.co.jp

Hello,

> Thanks, merged. I also did some minor coding style cleanups.

Thank you for editing the many comments and some code that I'd left
unchanged out of carelessness and a lack of understanding of your
comments. I'll be more careful about that...

> There is some potential for experimenting with more aggressive
> optimizations on dblink side, but I'd like to get a nod from
> a committer for libpq changes first.

I'm looking forward to the aggressiveness of that:-)

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-07 10:25:14
Message-ID: 4F30FC0A.6020900@gmail.com

(2012/02/02 23:30), Marko Kreen wrote:
> On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
>> Hello, This is new version of dblink.c
>>
>> - Memory is properly freed when realloc returns NULL in storeHandler().
>>
>> - The bug that free() in finishStoreInfo() will be fed with
>> garbage pointer when malloc for sinfo->valbuflen fails is
>> fixed.
>
> Thanks, merged. I also did some minor coding style cleanups.
>
> Tagging it Ready For Committer. I don't see any notable
> issues with the patch anymore.
>
> There is some potential for experimenting with more aggressive
> optimizations on dblink side, but I'd like to get a nod from
> a committer for libpq changes first.

I tried to use this feature in the pgsql_fdw patch, and found some small issues.

- Typos
- mes -> msg
- funcion -> function
- overritten -> overwritten
- costom -> custom
- What is the right (or recommended) way to prevent throwing an
exception in a row-processor callback function? When should an
author use a PG_TRY block to catch exceptions in the callback?
- How to determine whether a column result is NULL should be
described clearly. Perhaps the result value is NULL when
PGrowValue.len is less than 0, right?

Regards,
--
Shigeru Hanada


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-07 14:44:09
Message-ID: 20120207144409.GA26763@gmail.com

On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
> (2012/02/02 23:30), Marko Kreen wrote:
> > On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
> >> Hello, This is new version of dblink.c
> >>
> >> - Memory is properly freed when realloc returns NULL in storeHandler().
> >>
> >> - The bug that free() in finishStoreInfo() will be fed with
> >> garbage pointer when malloc for sinfo->valbuflen fails is
> >> fixed.
> >
> > Thanks, merged. I also did some minor coding style cleanups.
> >
> > Tagging it Ready For Committer. I don't see any notable
> > issues with the patch anymore.
> >
> > There is some potential for experimenting with more aggressive
> > optimizations on dblink side, but I'd like to get a nod from
> > a committer for libpq changes first.
>
> I tried to use this feature in pgsql_fdw patch, and found some small issues.
>
> - Typos
> - mes -> msg
> - funcion -> function
> - overritten -> overwritten
> - costom -> custom

Fixed.

> - What is the right (or recommended) way to prevent from throwing
> exceptoin in row-processor callback function? When author should use
> PG_TRY block to catch exception in the callback function?

When it calls backend functions that can throw exceptions?
As all handlers running in the backend will want to convert data
to Datums, does that mean "always wrap handler code in PG_TRY"?

Although it seems we could allow exceptions, at least when we are
speaking of the Postgres backend, as the connection and result are
in an internally consistent state when the handler is called, and the
partial PGresult is stored under PGconn->result. Even the
connection is in a consistent state, as the row packet is
fully processed.

So we have 3 variants, all work, but which one do we want to support?

1) Exceptions are not allowed.
2) Exceptions are allowed, but when it happens, user must call
PQfinish() next, to avoid processing incoming data from
potentially invalid state.
3) Exceptions are allowed, and further row processing can continue
with PQisBusy() / PQgetResult()
3.1) The problematic row is retried. (Current behaviour)
3.2) The problematic row is skipped.

No clue whether that is OK for a handler written in C++; I have no
idea whether you can throw a C++ exception when part of the stack
contains raw C calls.

> - It would be better to describe how to determine whether a column
> result is NULL should be described clearly. Perhaps result value is
> NULL when PGrowValue.len is less than 0, right?

Eh, it seems it's documented everywhere except in the sgml doc. Fixed.
[ Is it better to document it as "-1" or "< 0"? ]

Also I removed one remaining dynamic stack array in dblink.c

Current state of patch attached.

--
marko

Attachment Content-Type Size
libpq_rowproc_2012-02-07.diff text/x-diff 22.0 KB
libpq_rowproc_doc_2012-02-07.diff text/x-diff 6.3 KB
dblink_rowproc_2012-02-07.diff text/x-diff 14.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-07 15:58:10
Message-ID: CA+TgmoZF8HUO__0O9gomw66bPDeTNCbXYVAuT385iMx9pgmv5w@mail.gmail.com

On Tue, Feb 7, 2012 at 9:44 AM, Marko Kreen <markokr(at)gmail(dot)com> wrote:
>> - What is the right (or recommended) way to prevent from throwing
>> exceptoin in row-processor callback function?  When author should use
>> PG_TRY block to catch exception in the callback function?
>
> When it calls backend functions that can throw exceptions?
> As all handlers running in backend will want to convert data
> to Datums, that means "always wrap handler code in PG_TRY"?

I would hate to have such a rule. PG_TRY isn't free, and it's prone
to subtle bugs, like failing to mark enough stuff in the same function
"volatile".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-08 05:44:13
Message-ID: 4F320BAD.1010205@gmail.com

(2012/02/07 23:44), Marko Kreen wrote:
> On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
>> - What is the right (or recommended) way to prevent from throwing
>> exceptoin in row-processor callback function? When author should use
>> PG_TRY block to catch exception in the callback function?
>
> When it calls backend functions that can throw exceptions?
> As all handlers running in backend will want to convert data
> to Datums, that means "always wrap handler code in PG_TRY"?

ISTM it would be necessary to tell a) what happens to PGresult and
PGconn when the row processor ends without returning, and b) how to
recover them. We can't assume much about the caller because libpq is
a client library. IMHO, it's most important to tell authors of row
processors clearly what should be done in the error case.

In such a case, must we call PQfinish, or is calling PQgetResult until
it returns NULL enough to reuse the connection? IIUC calling
pqClearAsyncResult seems sufficient, but it's not exported for clients...

Regards,
--
Shigeru Hanada


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-08 11:51:16
Message-ID: 20120208115116.GA32486@gmail.com

On Wed, Feb 08, 2012 at 02:44:13PM +0900, Shigeru Hanada wrote:
> (2012/02/07 23:44), Marko Kreen wrote:
> > On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
> >> - What is the right (or recommended) way to prevent from throwing
> >> exceptoin in row-processor callback function? When author should use
> >> PG_TRY block to catch exception in the callback function?
> >
> > When it calls backend functions that can throw exceptions?
> > As all handlers running in backend will want to convert data
> > to Datums, that means "always wrap handler code in PG_TRY"?
>
> ISTM that telling a) what happens to PGresult and PGconn when row
> processor ends without return, and b) how to recover them would be
> necessary. We can't assume much about caller because libpq is a client
> library. IMHO, it's most important to tell authors of row processor
> clearly what should be done on error case.

Yes.

> In such case, must we call PQfinish, or is calling PQgetResult until it
> returns NULL enough to reuse the connection? IIUC calling
> pqClearAsyncResult seems sufficient, but it's not exported for client...

Simply dropping ->result is not useful, as there are still rows
coming in that you then cannot process anymore.

The rule "after an exception it's valid to close the connection with
PQfinish() or continue processing with PQgetResult()/PQisBusy()/
PQconsumeInput()" seems quite simple and clear. Perhaps only a
clarification of what's valid on a sync connection versus an async
connection would be useful.

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-13 23:39:06
Message-ID: 20120213233905.GA6225@gmail.com

On Tue, Feb 07, 2012 at 04:44:09PM +0200, Marko Kreen wrote:
> Although it seems we could allow exceptions, at least when we are
> speaking of Postgres backend, as the connection and result are
> internally consistent state when the handler is called, and the
> partial PGresult is stored under PGconn->result. Even the
> connection is in consistent state, as the row packet is
> fully processed.
>
> So we have 3 variants, all work, but which one do we want to support?
>
> 1) Exceptions are not allowed.
> 2) Exceptions are allowed, but when it happens, user must call
> PQfinish() next, to avoid processing incoming data from
> potentially invalid state.
> 3) Exceptions are allowed, and further row processing can continue
> with PQisBusy() / PQgetResult()
> 3.1) The problematic row is retried. (Current behaviour)
> 3.2) The problematic row is skipped.

I converted the patch to support 3.2), that is, skip the row on exception.
That required some cleanups of the getAnotherTuple() API, plus I made
some other cleanups. Details below.

But during this I also started to think about what happens if the user
does *not* throw exceptions. E.g. Python does exceptions via regular
return values, which means complications when passing them upwards.

The main case I'm thinking of is actually a resultset iterator in a
high-level language. The current callback-only style API requires that
rows are stuffed into a temporary buffer until the network blocks and
user code gets control again. This feels clumsy for a
performance-oriented API. Another case is user code that wants to do
complex processing. Doing a lot of stuff under a callback seems
dubious. And what if some part of the code calls PQfinish() during
some error recovery?

I tried imagining some sort of getFoo()-style API for fetching in-flight
row data, but I always ended up with a "rewrite libpq" step, so I feel
it's not productive to go there.

Instead I added a simple feature: the rowProcessor can return '2',
in which case getAnotherTuple() does an early exit without setting
any error state. On the user side it appears as PQisBusy() returning
TRUE. All pointers stay valid, so the callback can just
stuff them into some temp area. ATM there is no indication, though,
of whether the exit was due to the callback or other reasons, so the
user must detect it based on whether new temp pointers appeared,
which means those must be cleaned before calling PQisBusy() again.
This actually feels like a feature: those must not stay around
after a single call.

It's included in the main patch, but I also attached it as a separate
patch so that it can be examined separately and reverted if not
acceptable.

But as it's really simple, I recommend including it.

Its usage might not be obvious, though; should we include
example code in the doc?

New feature:

* The row processor can return 2, and then PQisBusy() does an early
exit. Supported only when the connection is in non-blocking mode.

Cleanups:

* Row data is tagged as processed when rowProcessor is called,
so exceptions will skip the row. This simplifies non-exceptional
handling as well.

* Converted 'return EOF' in the V3 getAnotherTuple() to set an error
instead. AFAICS those EOFs are remnants from the V2 getAnotherTuple(),
not something coded deliberately. Note that when the V3
getAnotherTuple() is called, the row packet is fully in the buffer.
The EOF on a broken packet, meant to signify 'give me more data',
does not result in any useful behaviour; instead you can simply get
into an infinite loop.

Fix bugs in my previous changes:

* Split the OOM error handling from the custom error message handling;
previously the error message in PGresult was lost due to the OOM
logic's early free of the PGresult.

* Dropped an unused goto label from experimental debug code.

--
marko

Attachment Content-Type Size
libpq_rowproc_2012-02-14.diff text/x-diff 25.3 KB
libpq_rowproc_doc_2012-02-14.diff text/x-diff 7.4 KB
dblink_rowproc_2012-02-14.diff text/x-diff 14.0 KB
early.exit.diff text/x-diff 1.9 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-16 08:49:34
Message-ID: 20120216.174934.80061891.horiguchi.kyotaro@oss.ntt.co.jp

Hello,

I added the function PQskipRemainingResult() and used it in
dblink. This reduces the number of try-catch blocks executed in
dblink from one per row to one per query.

- fe-exec.c : new function PQskipRemainingResult.
- dblink.c : use PQskipRemainingResult in dblink_record_internal().
- libpq.sgml: documentation for PQskipRemainingResult and a related
change in RowProcessor.

> Instead I added simple feature: rowProcessor can return '2',
> in which case getAnotherTuple() does early exit without setting
> any error state. In user side it appears as PQisBusy() returned
> with TRUE result. All pointers stay valid, so callback can just
> stuff them into some temp area.
...
> It's included in main patch, but I also attached it as separate patch
> so that it can be examined separately and reverted if not acceptable.

This patch is based on the patch above and composed in the same
manner - the three main patches include all modifications, with the
'2' patch kept separate.

This patch is not rebased onto HEAD, because HEAD yields an error
about the symbol LEAKPROOF...

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
libpq_rowproc_20120216.patch text/x-patch 75.4 KB
libpq_rowproc_doc_20120216.patch text/x-patch 8.3 KB
dblink_use_rowproc_20120216.patch text/x-patch 12.5 KB
early_exit_20120216.diff text/x-patch 1.9 KB

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-16 15:17:25
Message-ID: 20120216151725.GA9208@gmail.com

On Thu, Feb 16, 2012 at 05:49:34PM +0900, Kyotaro HORIGUCHI wrote:
> I added the function PQskipRemainingResult() and use it in
> dblink. This reduces the number of executing try-catch block from
> the number of rows to one per query in dblink.

This implementation is wrong - you must not simply call PQgetResult()
when the connection is in async mode. And the resulting PGresult must
be freed.

Please just make PQsetRowProcessorErrMsg() callable from outside the
callback. That also makes what happened clear to code that sees the
final PGresult. As a bonus, this simply makes the function more
robust and less special.

Although it's easy to create some PQsetRowSkipFlag() function
that silently skips the remaining rows, I think such logic
is best left to the user to handle. It creates an unnecessary special
case when handling the final PGresult, so in the big picture
it creates confusion.

> This patch is based on the patch above and composed in the same
> manner - main three patches include all modifications and the '2'
> patch separately.

I think there is no need to carry the early-exit patch separately.
It is visible in the archives and nobody has screamed about it yet,
so I guess it's acceptable. Also, it's so simple that there is no
point in spending time rebasing it.

> diff --git a/src/interfaces/libpq/#fe-protocol3.c# b/src/interfaces/libpq/#fe-protocol3.c#

There is a backup file in your git repo.

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: shigeru(dot)hanada(at)gmail(dot)com, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-18 21:00:03
Message-ID: CACMqXCLvpkjb9+c6sqJXitMHvrRCo+yu4q4bQ--0d7L=vw62Yg@mail.gmail.com

Demos/tests of the new API:

https://github.com/markokr/libpq-rowproc-demos

Comments resulting from that:

- PQsetRowProcessorErrMsg() should take const char*

- the callback API should be (void *, PGresult *, PGrowValue *),
or with the void* at the end, but not in the middle

I have not yet looked at what needs to be done to make ErrMsg
callable outside the callback; if it requires PGconn, then we
should also add PGconn to the callback args.

> On Thu, Feb 16, 2012 at 05:49:34PM +0900, Kyotaro HORIGUCHI wrote:
>>  I added the function PQskipRemainingResult() and use it in
>> dblink. This reduces the number of executing try-catch block from
>> the number of rows to one per query in dblink.

I still think we don't need an extra skipping function.

Yes, the callback function needs to have a flag to know that
rows need to be skipped, but for such a low-level API that does
not seem too hard a requirement.

If this really needs to be made easier, then getRowProcessor
might be a better approach, allowing easy implementation
of a generic skipping function by the user.

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-24 15:46:16
Message-ID: 20120224154616.GA16985@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 14, 2012 at 01:39:06AM +0200, Marko Kreen wrote:
> I tried imaging some sort of getFoo() style API for fetching in-flight
> row data, but I always ended up with "rewrite libpq" step, so I feel
> it's not productive to go there.
>
> Instead I added a simple feature: the rowProcessor can return '2',
> in which case getAnotherTuple() exits early without setting
> any error state. On the user side it appears as if PQisBusy()
> returned TRUE. All pointers stay valid, so the callback can just
> stuff them into some temp area. ATM there is no indication
> whether the exit was due to the callback or some other reason, so the
> user must detect it based on whether new temp pointers appeared,
> which means those must be cleaned before calling PQisBusy() again.
> This actually feels like a feature: those must not stay around
> after a single call.
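On the callback side, the early-exit pattern quoted above might look roughly like this (a sketch with stand-in types; `struct stash` and the function names are hypothetical): the processor stashes the row pointers and returns 2, so PQisBusy() comes back TRUE and the caller can pick the row up from the temp area before calling PQisBusy() again.

```c
#include <stddef.h>

/* Stand-ins for libpq's types, so this sketch is self-contained. */
typedef struct { const char *value; int len; } PGrowValue;
typedef struct { int dummy; } PGresult;

/* Temp area the caller must empty before calling PQisBusy() again. */
struct stash
{
    PGresult   *hdr;
    PGrowValue *row;
};

static int stash_row(PGresult *res, PGrowValue *cols, void *param)
{
    struct stash *s = param;

    s->hdr = res;
    s->row = cols;
    return 2;       /* early exit: pointers stay valid, no error state set */
}

/* Simulate one row arriving; returns 1 if the stash was filled. */
int stash_demo(void)
{
    struct stash s = { NULL, NULL };
    PGresult    res = { 0 };
    PGrowValue  cols[1] = { { "v", 1 } };

    stash_row(&res, cols, &s);
    return (s.hdr == &res && s.row == cols);
}
```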

To see how iterating a resultset would look like I implemented PQgetRow()
function using the currently available public API:

/*
* Wait and return next row in resultset.
*
* returns:
* 1 - row data available, the pointers are owned by PGconn
* 0 - result done, use PQgetResult() to get final result
* -1 - some problem, check connection error
*/
int PQgetRow(PGconn *db, PGresult **hdr_p, PGrowValue **row_p);

code at:

https://github.com/markokr/libpq-rowproc-demos/blob/master/getrow.c

usage:

/* send query */
if (!PQsendQuery(db, q))
    die(db, "PQsendQuery");

/* fetch rows one-by-one */
while (1) {
    rc = PQgetRow(db, &hdr, &row);
    if (rc > 0)
        proc_row(hdr, row);
    else if (rc == 0)
        break;
    else
        die(db, "streamResult");
}
/* final PGresult, either PGRES_TUPLES_OK or error */
final = PQgetResult(db);

It does not look like it can replace the public callback API,
because it does not work well with fully-async connections.
But it *does* continue the line of synchronous APIs:

- PQexec(): last result only
- PQsendQuery() + PQgetResult(): each result separately
- PQsendQuery() + PQgetRow() + PQgetResult(): each row separately

Also the generic implementation is slightly messy, because
it cannot assume anything about the surrounding usage patterns,
while the same code living in some user framework can. But
for simple users who just want to synchronously iterate
over a resultset, it might be a good enough API?

It does have an inconsistency problem - the row data does
not live in a PGresult but in a custom container. The proper
API pattern would be to have PQgetRow() hand back a
functional PGresult, but that is not interesting for
high-performance users. Solutions:

- rename to PQrecvRow()
- rename to PQrecvRow() and additionally provide PQgetRow()
- Don't bother, let users implement it themselves via callback API.

Comments?

--
marko


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-26 22:19:22
Message-ID: 20120226221922.GA6981@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 24, 2012 at 05:46:16PM +0200, Marko Kreen wrote:
> - rename to PQrecvRow() and additionally provide PQgetRow()

I tried it and it seems to work as an API - there is valid behaviour
for both sync and async connections.

Sync connection - PQgetRow() waits for data from network:

if (!PQsendQuery(db, q))
    die(db, "PQsendQuery");
while (1) {
    r = PQgetRow(db);
    if (!r)
        break;
    handle(r);
    PQclear(r);
}
r = PQgetResult(db);
r = PQgetResult(db);

Async connection - PQgetRow() does a PQisBusy() loop internally,
but does not read from the network:

on read event:
    PQconsumeInput(db);
    while (1) {
        r = PQgetRow(db);
        if (!r)
            break;
        handle(r);
        PQclear(r);
    }
    if (!PQisBusy(db))
        r = PQgetResult(db);
    else
        waitForMoredata();

As it would simplify life for quite a few potential users,
it seems worth including in libpq properly.

The attached patch is on top of v20120223 of the row-processor
patch. The only change in the general code is allowing
early exit for a synchronous connection, as we now have
a valid use-case for it.

--
marko

Attachment Content-Type Size
pqgetrow.diff text/x-diff 8.9 KB