Re: Speed dblink using alternate libpq tuple storage

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-02-07 14:44:09
Message-ID: 20120207144409.GA26763@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
> (2012/02/02 23:30), Marko Kreen wrote:
> > On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
> >> Hello, This is new version of dblink.c
> >>
> >> - Memory is properly freed when realloc returns NULL in storeHandler().
> >>
> >> - The bug that free() in finishStoreInfo() will be fed with
> >> garbage pointer when malloc for sinfo->valbuflen fails is
> >> fixed.
> >
> > Thanks, merged. I also did some minor coding style cleanups.
> >
> > Tagging it Ready For Committer. I don't see any notable
> > issues with the patch anymore.
> >
> > There is some potential for experimenting with more aggressive
> > optimizations on dblink side, but I'd like to get a nod from
> > a committer for libpq changes first.
>
> I tried to use this feature in pgsql_fdw patch, and found some small issues.
>
> - Typos
> - mes -> msg
> - funcion -> function
> - overritten -> overwritten
> - costom -> custom

Fixed.

> - What is the right (or recommended) way to prevent from throwing
> exceptoin in row-processor callback function? When author should use
> PG_TRY block to catch exception in the callback function?

When it calls backend functions that can throw exceptions?
As all handlers running in backend will want to convert data
to Datums, that means "always wrap handler code in PG_TRY"?

Although it seems we could allow exceptions, at least when we are
speaking of Postgres backend, as the connection and result are
internally consistent state when the handler is called, and the
partial PGresult is stored under PGconn->result. Even the
connection is in consistent state, as the row packet is
fully processed.

So we have 3 variants, all work, but which one do we want to support?

1) Exceptions are not allowed.
2) Exceptions are allowed, but when it happens, user must call
PQfinish() next, to avoid processing incoming data from
potentially invalid state.
3) Exceptions are allowed, and further row processing can continue
with PQisBusy() / PQgetResult()
3.1) The problematic row is retried. (Current behaviour)
3.2) The problematic row is skipped.

No clue if that is ok for handler written in C++, I have no idea
whether you can throw C++ exception when part of the stack
contains raw C calls.

> - It would be better to describe how to determine whether a column
> result is NULL should be described clearly. Perhaps result value is
> NULL when PGrowValue.len is less than 0, right?

Eh, seems it's documented everywhere except in sgml doc. Fixed.
[ Is it better to document that it is "-1" or "< 0"? ]

Also I removed one remaining dynamic stack array in dblink.c

Current state of patch attached.

--
marko

Attachment Content-Type Size
libpq_rowproc_2012-02-07.diff text/x-diff 22.0 KB
libpq_rowproc_doc_2012-02-07.diff text/x-diff 6.3 KB
dblink_rowproc_2012-02-07.diff text/x-diff 14.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-02-07 14:55:14 Re: incorrect handling of the timeout in pg_receivexlog
Previous Message David Fetter 2012-02-07 14:18:23 Re: semi-PoC: kNN-gist for cubes