Re: patch: SQL/MED(FDW) DDL

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, SAKAMOTO Masahiko <sakamoto(dot)masahiko(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: SQL/MED(FDW) DDL
Date: 2010-10-05 18:43:18
Message-ID: AANLkTikSDLeoyynDTobVvKocDv25gQaC0Wjk_DmUme8a@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 5, 2010 at 12:49 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Oct 5, 2010 at 11:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> IMO this is a *clear* case of premature optimization being the root of
>>> all evil.  We should get it working first and then see what needs to be
>>> optimized by measuring, rather than guessing in a vacuum.
>
>> I have no problem with punting the issue of remote statistics to some
>> time in the future.  But I don't think we should have a half-baked
>> implementation of remote statistics.  We should either do it right
>> (doing such testing as is necessary to establish what that means) or
>> not do it at all.  Frankly, if we could get from where we are today to
>> a workable implementation of this technology for CSV files in time for
>> 9.1, I think that would be an impressive accomplishment.  Making it
>> work for more complicated cases is almost certainly material for 9.2,
>> 9.3, 9.4, and maybe further out than that.
>
> I quite agree that it's going to take multiple release cycles to have
> a really impressive version of SQL/MED.  What I'm saying is that caching
> remote statistics is several versions out in my view of the world, and
> designing support for it shouldn't be happening now.

Fair enough.

> Maybe we ought to take a step back and discuss what the development plan
> ought to be, before we spend more time on details like this.

Good idea.

> My idea of
> a good development process would involve working in parallel on at least
> two FDW adapters, so that we don't find we've skewed the API design to
> meet the needs of just one adapter.  Probably a remote-PG adapter and a
> local-CSV-file adapter would be a good combination.  I don't have any
> clear idea of how soon we might expect to see how much functionality,
> though.  Thoughts?

I'm somewhat afraid that a remote-PG adapter will turn into a can of
worms. If we give up on remote statistics, does that mean we're also
giving up on index use on the remote side? I fear that we'll end up
crafting partial solutions that will only get thrown away after a lot
of work has been invested in them. I wonder if we should focus our
first efforts on really simple cases like CSV files (as you mentioned)
and perhaps something like memcached, which has different properties
than a CSV file, but extremely simple ones. I think it's inevitable
that the API is going to get more complicated from release to release,
and probably not in backward-compatible ways, but it's too early to be
worried about that.
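For concreteness, the user-facing side of those two simple adapters
might look roughly like this (the wrapper, server, and option names are
purely illustrative; none of this DDL is nailed down yet, and a real
wrapper would also need a handler function supplied by the adapter's
shared library):

  -- hypothetical local-CSV adapter
  CREATE FOREIGN DATA WRAPPER csv_fdw;
  CREATE SERVER csv_files FOREIGN DATA WRAPPER csv_fdw;
  CREATE FOREIGN TABLE measurements (id int, reading numeric)
      SERVER csv_files
      OPTIONS (filename '/tmp/measurements.csv', format 'csv');

  -- hypothetical memcached adapter: one big key/value "table"
  CREATE FOREIGN DATA WRAPPER memcached_fdw;
  CREATE SERVER local_cache FOREIGN DATA WRAPPER memcached_fdw
      OPTIONS (host 'localhost', port '11211');
  CREATE FOREIGN TABLE kv (key text, value text) SERVER local_cache;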

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
