Re: TABLESPACE and directory for Foreign tables?

Lists: pgsql-hackers
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 17:26:24
Message-ID: 5367C9C0.6030401@agliodbs.com

All,

I'm working with the cstore_fdw project, which has an interesting
property for an FDW: the FDW itself creates the files which make up the
database. This raises a couple of questions:

1) Do we want to establish a standard directory for FDWs which create
files, such as $PGDATA/base/{database-oid}/fdw/ ? Or would we want to
leave it up to each FDW to decide?

2) Do we want to support the TABLESPACE directive for FDWs?

While cstore is the first FDW to create its own files, it won't
necessarily be the last; I could imagine CSV_FDW doing so as well, or a
future SQLite_FDW which does the same. So I think the above questions
are worth answering in general. And we're planning to implement
automated file management for cstore_fdw fairly soon, so we want to make
it consistent with whatever we're doing in Postgres 9.5.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 17:52:22
Message-ID: CAFcNs+q6zu-j4JFP_7y38qDi+P+evxq==LQhSfR_JyvNA=hn3w@mail.gmail.com

On Mon, May 5, 2014 at 2:26 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
> All,
>
> I'm working with the cstore_fdw project, which has an interesting
> property for an FDW: the FDW itself creates the files which make up the
> database. This raises a couple of questions:
>
> 1) Do we want to establish a standard directory for FDWs which create
> files, such as $PGDATA/base/{database-oid}/fdw/ ? Or would we want to
> leave it up to each FDW to decide?
>

-1. Each FDW must decide.

> 2) Do we want to support the TABLESPACE directive for FDWs?
>
> While cstore is the first FDW to create its own files, it won't
> necessarily be the last; I could imagine CSV_FDW doing so as well, or a
> future SQLite_FDW which does the same. So I think the above questions
> are worth answering in general. And we're planning to implement
> automated file management for cstore_fdw fairly soon, so we want to make
> it consistent with whatever we're doing in Postgres 9.5.
>

-1. We cannot guarantee the consistency of files using an "ALTER FOREIGN
TABLE ... SET TABLESPACE ..." as a normal "ALTER TABLE ..." does.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 17:53:46
Message-ID: 11302.1399312426@sss.pgh.pa.us

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> I'm working with the cstore_fdw project, which has an interesting
> property for an FDW: the FDW itself creates the files which make up the
> database. This raises a couple of questions:

> 1) Do we want to establish a standard directory for FDWs which create
> files, such as $PGDATA/base/{database-oid}/fdw/ ? Or would we want to
> leave it up to each FDW to decide?

I think we ought to vigorously discourage FDWs from storing any files
inside $PGDATA. This cannot lead to anything except grief. Just for
starters, what will operations such as pg_basebackup do with them?

A larger and more philosophical point is that such a direction of
development could hardly be called a "foreign" data wrapper. People
would expect Postgres to take full responsibility for such files,
including data integrity considerations such as fsync-at-checkpoints
and WAL support. Even if we wanted the FDW abstractions to allow
for that, we're very far away from it. And frankly I'd maintain
that FDW is the wrong abstraction.

> 2) Do we want to support the TABLESPACE directive for FDWs?

A fortiori, no.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 18:17:18
Message-ID: 5367D5AE.1040309@agliodbs.com

On 05/05/2014 10:53 AM, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> I'm working with the cstore_fdw project, which has an interesting
>> property for an FDW: the FDW itself creates the files which make up the
>> database. This raises a couple of questions:
>
>> 1) Do we want to establish a standard directory for FDWs which create
>> files, such as $PGDATA/base/{database-oid}/fdw/ ? Or would we want to
>> leave it up to each FDW to decide?
>
> I think we ought to vigorously discourage FDWs from storing any files
> inside $PGDATA. This cannot lead to anything except grief. Just for
> starters, what will operations such as pg_basebackup do with them?

That was one advantage to putting them in PGDATA; you get a copy of the
files with pg_basebackup. Of course, they don't replicate after that,
but they potentially could, in the future, with Logical Streaming
Replication.

> A larger and more philosophical point is that such a direction of
> development could hardly be called a "foreign" data wrapper. People
> would expect Postgres to take full responsibility for such files,
> including data integrity considerations such as fsync-at-checkpoints
> and WAL support. Even if we wanted the FDW abstractions to allow
> for that, we're very far away from it. And frankly I'd maintain
> that FDW is the wrong abstraction.

Certainly pluggable storage would be a better abstraction; but we don't
have that yet. In the meantime, we have one FDW which creates files
*right now*, and we might have more in the future, so I'm trying to
establish some guidelines as to how such FDWs should behave. Regardless
of whether or not you think FDWs should be managing files, it's better
for users if all FDWs which manage files manage them in the same way.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 18:20:00
Message-ID: CABUevEzfV4rbmU2gP6R9OAhzgTZ9Fu4tgSK-WW=BNPiFATM9wg@mail.gmail.com

On Mon, May 5, 2014 at 8:17 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> On 05/05/2014 10:53 AM, Tom Lane wrote:
> > Josh Berkus <josh(at)agliodbs(dot)com> writes:
> >> I'm working with the cstore_fdw project, which has an interesting
> >> property for an FDW: the FDW itself creates the files which make up the
> >> database. This raises a couple of questions:
> >
> >> 1) Do we want to establish a standard directory for FDWs which create
> >> files, such as $PGDATA/base/{database-oid}/fdw/ ? Or would we want to
> >> leave it up to each FDW to decide?
> >
> > I think we ought to vigorously discourage FDWs from storing any files
> > inside $PGDATA. This cannot lead to anything except grief. Just for
> > starters, what will operations such as pg_basebackup do with them?
>
> That was one advantage to putting them in PGDATA; you get a copy of the
> files with pg_basebackup. Of course, they don't replicate after that,
> but they potentially could, in the future, with Logical Streaming
> Replication.
>

Presumably they'd also be inconsistent? And as such not really useful
unless you actually shut the database down before you back it up (i.e.
don't use pg_basebackup)?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 18:22:07
Message-ID: 20140505182207.GF17909@awork2.anarazel.de

On 2014-05-05 11:17:18 -0700, Josh Berkus wrote:
> On 05/05/2014 10:53 AM, Tom Lane wrote:
> > Josh Berkus <josh(at)agliodbs(dot)com> writes:
> >> I'm working with the cstore_fdw project, which has an interesting
> >> property for an FDW: the FDW itself creates the files which make up the
> >> database. This raises a couple of questions:
> >
> >> 1) Do we want to establish a standard directory for FDWs which create
> >> files, such as $PGDATA/base/{database-oid}/fdw/ ? Or would we want to
> >> leave it up to each FDW to decide?
> >
> > I think we ought to vigorously discourage FDWs from storing any files
> > inside $PGDATA. This cannot lead to anything except grief. Just for
> > starters, what will operations such as pg_basebackup do with them?
>
> That was one advantage to putting them in PGDATA; you get a copy of the
> files with pg_basebackup.

A corrupted copy. There's no WAL replay to correct skew due to write
activity while copying.

> Of course, they don't replicate after that,
> but they potentially could, in the future, with Logical Streaming
> Replication.

Nope. They're not in the WAL, so they won't be streamed out.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
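
The objection above (a file copied during write activity is torn unless
something like WAL replay corrects it) can be illustrated with a toy model.
This is only a sketch under stated assumptions: the page list, the
"every page carries the same version" invariant, and the names write_all,
copy_with_concurrent_write, replay, and is_consistent are all hypothetical.
None of them come from PostgreSQL, pg_basebackup, or cstore_fdw.

```python
# Toy model (hypothetical): a "file" is a list of pages, and it is consistent
# only when every page carries the same version number.

def write_all(pages, version, wal=None):
    """Writer: stamp every page with the new version, optionally logging it."""
    for i in range(len(pages)):
        pages[i] = version
        if wal is not None:
            wal.append((i, version))

def copy_with_concurrent_write(pages, wal=None):
    """Copy page by page; a write lands after the first page was copied."""
    backup = []
    for i in range(len(pages)):
        backup.append(pages[i])
        if i == 0:
            write_all(pages, version=2, wal=wal)  # write arrives mid-copy
    return backup

def replay(backup, wal):
    """Apply the logged changes to the copy, loosely mimicking WAL replay."""
    for page_no, version in wal:
        backup[page_no] = version
    return backup

def is_consistent(pages):
    return len(set(pages)) == 1

# Without a log, the copy mixes old and new pages.
torn = copy_with_concurrent_write([1, 1, 1, 1])
print(torn, is_consistent(torn))

# With a log of the changes, replaying it onto the copy repairs the skew.
wal = []
repaired = replay(copy_with_concurrent_write([1, 1, 1, 1], wal), wal)
print(repaired, is_consistent(repaired))
```

In this model the first copy ends up with one old page and three new ones,
while the second becomes uniform only because the changes were logged and
replayed. Files written by an FDW have no such log, which is the point being
made about both base backups and streaming.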


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 18:53:50
Message-ID: 19854.1399316030@sss.pgh.pa.us

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> On 05/05/2014 10:53 AM, Tom Lane wrote:
>> A larger and more philosophical point is that such a direction of
>> development could hardly be called a "foreign" data wrapper. People
>> would expect Postgres to take full responsibility for such files,
>> including data integrity considerations such as fsync-at-checkpoints
>> and WAL support. Even if we wanted the FDW abstractions to allow
>> for that, we're very far away from it. And frankly I'd maintain
>> that FDW is the wrong abstraction.

> Certainly pluggable storage would be a better abstraction; but we don't
> have that yet. In the meantime, we have one FDW which creates files
> *right now*, and we might have more in the future, so I'm trying to
> establish some guidelines as to how such FDWs should behave.

The guideline is simple: don't do that. We should absolutely not
encourage this until/unless we have infrastructure to support it.
Just because one FDW author thought this would be a cool thing to do
does not make it a cool thing to do, and definitely not a cool thing
to encourage others to emulate.

> Regardless
> of whether or not you think FDWs should be managing files, it's better
> for users if all FDWs which manage files manage them in the same way.

Sure. They should all keep them outside $PGDATA, making it not-our-
problem. When and if we're prepared to consider it our problem, we
will be sure to advise people.

regards, tom lane
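
The guideline to keep FDW-managed files outside $PGDATA could be checked
mechanically. A minimal sketch, assuming a hypothetical helper named
is_outside_pgdata and made-up example paths; neither is taken from
PostgreSQL or from any FDW.

```python
import os

def is_outside_pgdata(candidate, pgdata):
    """Return True if candidate does not resolve to a path under pgdata.

    Both paths are resolved first, so symlinks and ".." components cannot
    hide a location that really lives inside the data directory.
    """
    candidate = os.path.realpath(candidate)
    pgdata = os.path.realpath(pgdata)
    try:
        return os.path.commonpath([candidate, pgdata]) != pgdata
    except ValueError:
        # Raised on Windows when the paths are on different drives,
        # in which case candidate cannot be under pgdata.
        return True

# Hypothetical locations, for illustration only.
PGDATA = "/srv/pg/data"
print(is_outside_pgdata("/srv/pg/data/base/16384/fdw/t.cstore", PGDATA))
print(is_outside_pgdata("/srv/cstore/t.cstore", PGDATA))
```

A wrapper could run such a check when a file name is supplied as a table
option and refuse locations under the data directory. Whether to reject or
merely warn is a design choice this sketch does not settle.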


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-05 20:36:49
Message-ID: 5367F661.9000102@agliodbs.com

On 05/05/2014 11:53 AM, Tom Lane wrote:
> Sure. They should all keep them outside $PGDATA, making it not-our-
> problem. When and if we're prepared to consider it our problem, we
> will be sure to advise people.

OK.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-06 15:09:54
Message-ID: CA+Tgmob8ZSr3+CNOyZqBhk22qqXEm-SFh3QMGDQHV9Mrj3WJvw@mail.gmail.com

On Mon, May 5, 2014 at 1:53 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> A larger and more philosophical point is that such a direction of
> development could hardly be called a "foreign" data wrapper. People
> would expect Postgres to take full responsibility for such files,
> including data integrity considerations such as fsync-at-checkpoints
> and WAL support. Even if we wanted the FDW abstractions to allow
> for that, we're very far away from it. And frankly I'd maintain
> that FDW is the wrong abstraction.

The right abstraction, as Josh points out, would probably be pluggable
storage. Are you (or is anyone) planning to pursue that further?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TABLESPACE and directory for Foreign tables?
Date: 2014-05-06 16:48:36
Message-ID: 20415.1399394916@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, May 5, 2014 at 1:53 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> A larger and more philosophical point is that such a direction of
>> development could hardly be called a "foreign" data wrapper. People
>> would expect Postgres to take full responsibility for such files,
>> including data integrity considerations such as fsync-at-checkpoints
>> and WAL support. Even if we wanted the FDW abstractions to allow
>> for that, we're very far away from it. And frankly I'd maintain
>> that FDW is the wrong abstraction.

> The right abstraction, as Josh points out, would probably be pluggable
> storage. Are you (or is anyone) planning to pursue that further?

Well, as you've noticed, I made no progress on that since last PGCon.
It's still something I'm thinking about, but it's a hard problem.

regards, tom lane