Re: Join push-down support for foreign tables

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Join push-down support for foreign tables
Date: 2014-09-04 16:01:20
Message-ID: CAOeZVieZFg_XT3qnSfPKTQoAPPUXpaANBjy-XiqbZ32abz34PQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 4, 2014 at 9:26 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Thu, Sep 4, 2014 at 08:41:43PM +0530, Atri Sharma wrote:
> >
> >
> > On Thursday, September 4, 2014, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >
> > On Thu, Sep 4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
> > > The main problem I see here is that accurate costing may require a
> > > round-trip to the remote server. If there is only one path that is
> > > probably OK; the cost of asking the question will usually be more than
> > > paid for by hearing that the pushed-down join clobbers the other
> > > possible methods of executing the query. But if there are many paths,
> > > for example because there are multiple sets of useful pathkeys, it
> > > might start to get a bit expensive.
> > >
> > > Probably both the initial cost and final cost calculations should be
> > > delegated to the FDW, but maybe within postgres_fdw, the initial cost
> > > should do only the work that can be done without contacting the remote
> > > server; then, let the final cost step do that if appropriate. But I'm
> > > not entirely sure what is best here.
> >
> > I am thinking eventually we will need to cache the foreign server
> > statistics on the local server.
> >
> > Wouldn't that lead to issues where the statistics get outdated and we have to
> > anyways query the foreign server before planning any joins? Or are you thinking
> > of dropping the foreign table statistics once the foreign join is complete?
>
> I am thinking we would eventually have to cache the statistics, then get
> some kind of invalidation message from the foreign server. I am also
> thinking that cache would have to be global across all backends, I guess
> similar to our invalidation cache.
>

That could lead to some bloat in storing statistics, since we may have a lot
of tables for a lot of foreign servers. Also, would VACUUM then be expected to
ANALYZE the foreign tables?

Also, how will we decide that the statistics are invalid? Will we have the
FDW query the foreign server and do some sort of comparison between the
statistics the foreign server has and the statistics we have locally? I am
trying to understand how the idea of an invalidation message from the
foreign server would work.
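
To make the question concrete, here is a minimal sketch in Python of how I
understand the proposal. It is not PostgreSQL code, and every name in it
(ForeignStatsCache, TableStats, the version counter, the fetch callback) is
hypothetical. It assumes the local side caches statistics per (server, table)
and only refetches after an invalidation message drops the entry.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (foreign server name, remote table name)


@dataclass
class TableStats:
    version: int      # hypothetical counter the remote side bumps on ANALYZE
    row_count: float  # stand-in for whatever statistics would be cached


class ForeignStatsCache:
    """Hypothetical local cache of foreign-table statistics."""

    def __init__(self, fetch: Callable[[Key], TableStats]) -> None:
        self._fetch = fetch  # performs the round trip to the foreign server
        self._entries: Dict[Key, TableStats] = {}
        self.round_trips = 0

    def invalidate(self, key: Key) -> None:
        # Called when an invalidation message for this table arrives.
        self._entries.pop(key, None)

    def get(self, key: Key) -> TableStats:
        stats = self._entries.get(key)
        if stats is None:
            # Cache miss: pay for one round trip, then remember the result.
            self.round_trips += 1
            stats = self._fetch(key)
            self._entries[key] = stats
        return stats


# Simulated foreign server state, standing in for a real remote query.
remote = {("srv1", "orders"): TableStats(version=1, row_count=1000.0)}
cache = ForeignStatsCache(lambda key: remote[key])
key = ("srv1", "orders")

first = cache.get(key)    # miss: one round trip
second = cache.get(key)   # hit: no round trip
remote[key] = TableStats(version=2, row_count=5000.0)  # remote re-ANALYZEd
stale = cache.get(key)    # no message received, so the old stats are served
cache.invalidate(key)     # invalidation message arrives
fresh = cache.get(key)    # miss again: second round trip
```

In this sketch the planner sees stale statistics until the message arrives,
which is exactly the window I am asking about. Comparing the version counter
instead would need a round trip of its own.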

Regards,

Atri
