Re: pg_basebackup fails with long tablespace paths

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: pg_basebackup fails with long tablespace paths
Date: 2014-10-20 18:59:31
Message-ID: 16477.1413831571@sss.pgh.pa.us

My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
for pg_basebackup fail when run in a sufficiently deeply-nested directory
tree. The cause appears to be that we rely on standard "tar" format
to represent the symlink for a tablespace, and POSIX tar format has a
hard-wired restriction of 99 bytes in a symlink's expansion.

What do we want to do about this? I think a minimum expectation would be
for pg_basebackup to notice and complain when it's trying to create an
unworkably long symlink entry, but it would be far better if we found a
way to cope instead.

One thing we could possibly do without reinventing "tar" is to avoid using
absolute path names if a PGDATA-relative one would do.

regards, tom lane
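
For illustration, the restriction described above comes from the classic 512-byte tar header, in which the link target occupies a fixed 100-byte field. The following is a minimal sketch using Python's standard tarfile module, not PostgreSQL's own tar code; the tablespace OID and the paths are made up.

```python
import tarfile

def ustar_symlink_header(name, target):
    """Build a 512-byte ustar header for a symlink entry, or raise ValueError."""
    info = tarfile.TarInfo(name)
    info.type = tarfile.SYMTYPE
    info.linkname = target
    return info.tobuf(format=tarfile.USTAR_FORMAT)

# Hypothetical tablespace location, short enough to fit the link field.
short_target = "/srv/ts1"
header = ustar_symlink_header("pg_tblspc/16384", short_target)

# The link target lives in a fixed field at byte offsets 157..256.
link_field = header[157:257]

# A target longer than that field cannot be represented in this format.
long_target = "/" + "d" * 120
try:
    ustar_symlink_header("pg_tblspc/16384", long_target)
    too_long_rejected = False
except ValueError:
    too_long_rejected = True
```

Python's module refuses the over-long target outright; the point of the thread is that the PostgreSQL code at the time did not.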


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-10-20 20:51:43
Message-ID: 544575DF.6060004@gmx.net

On 10/20/14 2:59 PM, Tom Lane wrote:
> What do we want to do about this? I think a minimum expectation would be
> for pg_basebackup to notice and complain when it's trying to create an
> unworkably long symlink entry, but it would be far better if we found a
> way to cope instead.

Isn't it the backend that should error out before sending truncated
file names?

src/port/tar.c:

/* Name 100 */
sprintf(&h[0], "%.99s", filename);

And then do we need to prevent the creation of tablespaces that can't be
backed up?

> One thing we could possibly do without reinventing "tar" is to avoid using
> absolute path names if a PGDATA-relative one would do.

Maybe we could hack up the tar format to store the symlink target as the
file body, like cpio does. Of course then we'd lose the property of
this actually being tar.
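
The effect of the "%.99s" format in the sprintf call quoted above, and the alternative of erroring out, can be sketched roughly as follows. This is a Python imitation for illustration only; the function names, the error message, and the example path are assumptions, not code from src/port/tar.c.

```python
def tar_name_truncating(filename):
    """Mimic sprintf(&h[0], "%.99s", filename): keep at most 99 characters."""
    return "%.99s" % filename

def tar_name_checked(filename, limit=99):
    """Refuse names that would not fit instead of silently shortening them."""
    if len(filename.encode()) > limit:
        raise ValueError("file name too long for tar format: %r" % filename)
    return filename

# Hypothetical over-long name: 10 + 120 = 130 characters.
long_name = "pg_tblspc/" + "x" * 120

# Silent truncation: the archive would carry a different name than intended.
truncated = tar_name_truncating(long_name)

# Checked variant: the caller learns about the problem instead.
try:
    tar_name_checked(long_name)
    check_raised = False
except ValueError:
    check_raised = True
```

With the truncating variant nothing signals that the stored name differs from the real one, which is why an explicit error seems preferable.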


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-10-21 03:47:44
Message-ID: CAA4eK1+s_yKpzJkW1sHA9RXdSbZGHGssGXNP_F2t9z_DhgNZ6Q@mail.gmail.com

On Tue, Oct 21, 2014 at 12:29 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
> for pg_basebackup fail when run in a sufficiently deeply-nested directory
> tree. The cause appears to be that we rely on standard "tar" format
> to represent the symlink for a tablespace, and POSIX tar format has a
> hard-wired restriction of 99 bytes in a symlink's expansion.
>
> What do we want to do about this? I think a minimum expectation would be
> for pg_basebackup to notice and complain when it's trying to create an
> unworkably long symlink entry, but it would be far better if we found a
> way to cope instead.

One way to cope with such a situation could be to create, during backup,
a backup symlink file that contains a listing of the symlinks, which archive
recovery then uses to recreate them. This is basically the solution (patch)
I have proposed for Windows [1].

[1] - https://commitfest.postgresql.org/action/patch_view?id=1512

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-10-29 00:29:39
Message-ID: 545034F3.3010407@gmx.net

On 10/20/14 2:59 PM, Tom Lane wrote:
> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
> for pg_basebackup fail when run in a sufficiently deeply-nested directory
> tree.

As for the test, we can do something like the attached to mark the test
as "TODO".

Attachment              Content-Type  Size
basebackup-tests.patch  text/x-diff   738 bytes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-10-29 14:48:00
Message-ID: CA+TgmoabjR0hN7K12gWF=MpvDOFDhwuA6LWc4=PxAVDHQNkHbQ@mail.gmail.com

On Tue, Oct 28, 2014 at 8:29 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 10/20/14 2:59 PM, Tom Lane wrote:
>> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
>> for pg_basebackup fail when run in a sufficiently deeply-nested directory
>> tree.
>
> As for the test, we can do something like the attached to mark the test
> as "TODO".

What does this actually do? It doesn't appear that it's just
disabling the test.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-10-31 03:14:04
Message-ID: 5452FE7C.2050209@gmx.net

On 10/29/14 10:48 AM, Robert Haas wrote:
> On Tue, Oct 28, 2014 at 8:29 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On 10/20/14 2:59 PM, Tom Lane wrote:
>>> My Salesforce colleague Thomas Fanghaenel observed that the TAP tests
>>> for pg_basebackup fail when run in a sufficiently deeply-nested directory
>>> tree.
>>
>> As for the test, we can do something like the attached to mark the test
>> as "TODO".
>
> What does this actually do? It doesn't appear that it's just
> disabling the test.

It still runs the tests, but doesn't count their results toward whether
the suite passes.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-11-04 20:52:12
Message-ID: 54593C7C.5080503@gmx.net

On 10/20/14 4:51 PM, Peter Eisentraut wrote:
> On 10/20/14 2:59 PM, Tom Lane wrote:
>> What do we want to do about this? I think a minimum expectation would be
>> for pg_basebackup to notice and complain when it's trying to create an
>> unworkably long symlink entry, but it would be far better if we found a
>> way to cope instead.
>
> Isn't it the backend that should error out before sending truncated
> file names?
>
> src/port/tar.c:
>
> /* Name 100 */
> sprintf(&h[0], "%.99s", filename);

Here are patches to address that. First, it reports errors when
attempting to create a tar header that would truncate file or symlink
names. Second, it works around the problem in the tests by creating a
symlink from the short-name tempdir that we had set up for the
Unix-socket directory case.

The first patch can be backpatched to 9.3. The tar code before that is
different and would need manual adjustments.

If someone has a too-long tablespace path, I think they can work around
that after this patch by creating a shorter symlink and updating the
pg_tblspc symlinks to point there.

Attachment                                                       Content-Type         Size
0001-Error-when-creating-names-too-long-for-tar-format.patch     application/x-patch  3.8 KB
0002-pg_basebackup-Adjust-tests-for-long-file-name-issues.patch  application/x-patch  3.6 KB
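
The workaround mentioned above (a shorter symlink that the pg_tblspc entry then points to) might look roughly like this. It is a sketch under assumptions: a POSIX system with symlink support, a scratch directory standing in for a real cluster, and made-up directory names and OID; it is not taken from the attached patches.

```python
import os
import tempfile

def shorten_tablespace_link(pg_tblspc_link, short_link):
    """Repoint a pg_tblspc symlink through a shorter intermediate symlink."""
    real_target = os.readlink(pg_tblspc_link)
    os.symlink(real_target, short_link)      # short name -> long real location
    os.remove(pg_tblspc_link)                # removes the link, not its target
    os.symlink(short_link, pg_tblspc_link)   # pg_tblspc entry -> short name
    return os.readlink(pg_tblspc_link)

# Demo in a scratch directory; every name below is made up for illustration.
with tempfile.TemporaryDirectory() as scratch:
    long_dir = os.path.join(scratch, *["deeply_nested_directory"] * 5, "ts1")
    os.makedirs(long_dir)
    tblspc_dir = os.path.join(scratch, "pgdata", "pg_tblspc")
    os.makedirs(tblspc_dir)
    link = os.path.join(tblspc_dir, "16384")
    os.symlink(long_dir, link)

    short = os.path.join(scratch, "t1")
    new_target = shorten_tablespace_link(link, short)

    # The tablespace still resolves to the same place, via a shorter target.
    same_place = os.path.realpath(link) == os.path.realpath(long_dir)
    shorter = len(new_target) < len(long_dir)
```

Only the link text stored in pg_tblspc gets shorter; the data stays where it was.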

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-11-08 02:03:07
Message-ID: 545D79DB.4060507@gmx.net

On 11/4/14 3:52 PM, Peter Eisentraut wrote:
> Here are patches to address that. First, it reports errors when
> attempting to create a tar header that would truncate file or symlink
> names. Second, it works around the problem in the tests by creating a
> symlink from the short-name tempdir that we had set up for the
> Unix-socket directory case.

I ended up splitting this up differently. I applied the part of the
second patch that works around the length issue in tablespaces. So the
tests now pass in 9.4 and up even in working directories with long
names. This clears up the regression in 9.4.

The remaining, not applied patch is attached. It errors when the file
name is too long and adds tests for that. This could be applied to 9.5
and backpatched, if we so choose. It might become obsolete if
https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
If that patch doesn't get accepted, I might add my patch to a future
commit fest.

Attachment                                                    Content-Type         Size
0001-Error-when-creating-names-too-long-for-tar-format.patch  application/x-patch  5.4 KB

From: Oskari Saarenmaa <os(at)ohmu(dot)fi>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-12-22 22:40:30
Message-ID: 54989DDE.8010705@ohmu.fi

08.11.2014, 04:03, Peter Eisentraut wrote:
> On 11/4/14 3:52 PM, Peter Eisentraut wrote:
>> > Here are patches to address that. First, it reports errors when
>> > attempting to create a tar header that would truncate file or symlink
>> > names. Second, it works around the problem in the tests by creating a
>> > symlink from the short-name tempdir that we had set up for the
>> > Unix-socket directory case.
> I ended up splitting this up differently. I applied the part of the
> second patch that works around the length issue in tablespaces. So the
> tests now pass in 9.4 and up even in working directories with long
> names. This clears up the regression in 9.4.
>
> The remaining, not applied patch is attached. It errors when the file
> name is too long and adds tests for that. This could be applied to 9.5
> and backpatched, if we so choose. It might become obsolete if
> https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
> If that patch doesn't get accepted, I might add my patch to a future
> commit fest.

I think we should just use the UStar tar format
(http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
allow long file names; all actively used tar implementations should be
able to handle them. I'll try to write a patch for that soonish.

Until UStar format is used, we should raise an error if a filename would
be truncated by tar instead of creating invalid archives. Also note
that the POSIX tar format allows 100-byte file names, as the name doesn't
have to be zero-terminated, but we may want to stick to 99 bytes in old
type tar anyway, as using 100-byte filenames has exposed bugs in other tar
implementations, for example
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=689582 - and
truncating at 100 bytes instead of 99 doesn't help us much anyway.

/ Oskari


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Oskari Saarenmaa <os(at)ohmu(dot)fi>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-12-23 03:00:19
Message-ID: CAA4eK1+z_aBFncVwapzjoAVe3SFUektqTjU3uL6TF+GZqpwAhw@mail.gmail.com

On Tue, Dec 23, 2014 at 4:10 AM, Oskari Saarenmaa <os(at)ohmu(dot)fi> wrote:
>
> 08.11.2014, 04:03, Peter Eisentraut wrote:
> > On 11/4/14 3:52 PM, Peter Eisentraut wrote:
> >> > Here are patches to address that. First, it reports errors when
> >> > attempting to create a tar header that would truncate file or symlink
> >> > names. Second, it works around the problem in the tests by creating a
> >> > symlink from the short-name tempdir that we had set up for the
> >> > Unix-socket directory case.
> > I ended up splitting this up differently. I applied the part of the
> > second patch that works around the length issue in tablespaces. So the
> > tests now pass in 9.4 and up even in working directories with long
> > names. This clears up the regression in 9.4.
> >
> > The remaining, not applied patch is attached. It errors when the file
> > name is too long and adds tests for that. This could be applied to 9.5
> > and backpatched, if we so choose. It might become obsolete if
> > https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
> > If that patch doesn't get accepted, I might add my patch to a future
> > commit fest.
>
> I think we should just use the UStar tar format
> (http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
> allow long file names; all actively used tar implementations should be
> able to handle them. I'll try to write a patch for that soonish.
>

I think even using UStar format won't make it work on Windows, where
the standard utilities are not able to understand the symlinks in tar.
There is already a patch [1] in this CF which will handle both cases, so I
am not sure it is a very good idea to go with a new tar format to handle
this issue.

[1] : https://commitfest.postgresql.org/action/patch_view?id=1512

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Oskari Saarenmaa <os(at)ohmu(dot)fi>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-12-23 08:33:44
Message-ID: 549928E8.5010602@ohmu.fi

23.12.2014, 05:00, Amit Kapila wrote:
> On Tue, Dec 23, 2014 at 4:10 AM, Oskari Saarenmaa wrote:
>> 08.11.2014, 04:03, Peter Eisentraut wrote:
>> > It errors when the file
>> > name is too long and adds tests for that. This could be applied to 9.5
>> > and backpatched, if we so choose. It might become obsolete if
>> > https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
>> > If that patch doesn't get accepted, I might add my patch to a future
>> > commit fest.
>>
>> I think we should just use the UStar tar format
>> (http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
>> allow long file names; all actively used tar implementations should be
>> able to handle them. I'll try to write a patch for that soonish.
>>
>
> I think even using UStar format won't make it work for Windows where
> the standard utilities are not able to understand the symlinks in tar.
> There is already a patch [1] in this CF which will handle both cases, so
> I am
> not sure if it is very good idea to go with a new tar format to handle this
> issue.
>
> [1] : https://commitfest.postgresql.org/action/patch_view?id=1512

That patch makes sense for 9.5, but I don't think it's going to be
backpatched to previous releases? I think we should also apply Peter's
patch to master and backbranches to avoid creating invalid tar files
anywhere. And optionally implement and backpatch long filename support
in tar even if 9.5 no longer creates tar files with long names.

/ Oskari


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-12-24 13:10:46
Message-ID: 549ABB56.4010505@gmx.net

On 12/22/14 5:40 PM, Oskari Saarenmaa wrote:
> I think we should just use the UStar tar format
> (http://en.wikipedia.org/wiki/Tar_%28computing%29#UStar_format) and
> allow long file names; all actively used tar implementations should be
> able to handle them. I'll try to write a patch for that soonish.

UStar doesn't handle long link targets, only long file names (and then
only up to 255 characters, which doesn't seem satisfactory).

AFAICT, to allow long link targets, the available solutions are either
pax extended headers or GNU-specific long-link extra headers.

When I create a symlink with a long target and call tar on it, GNU tar
by default creates the GNU long-link header and BSD tar by default
creates a pax header. But they are both able to extract either one.

As a demo for how this might look, attached is a wildly incomplete patch
to produce GNU long-link headers.

Attachment              Content-Type         Size
tar-gnu-longlink.patch  application/x-patch  4.1 KB
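
The difference between the formats discussed above can be checked with a short round trip. The sketch below uses Python's standard tarfile module as a stand-in for GNU tar and BSD tar, so it says nothing about the attached patch itself; the member name and the target path are invented for the example.

```python
import io
import tarfile

def roundtrip_symlink(target, fmt):
    """Write one symlink entry in the given tar format and read its target back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("pg_tblspc/16384")   # hypothetical tablespace OID
        info.type = tarfile.SYMTYPE
        info.linkname = target
        tar.addfile(info)
    raw = buf.getvalue()
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        return tar.getmembers()[0].linkname, raw

# A link target of 200 bytes, well beyond the 100-byte header field.
long_target = "/" + "/".join(["some_long_directory_name"] * 8)

# GNU long-link extra header (type 'K') and pax extended header (type 'x').
gnu_link, gnu_raw = roundtrip_symlink(long_target, tarfile.GNU_FORMAT)
pax_link, pax_raw = roundtrip_symlink(long_target, tarfile.PAX_FORMAT)

# Plain ustar has no place to put the long target.
try:
    roundtrip_symlink(long_target, tarfile.USTAR_FORMAT)
    ustar_ok = True
except ValueError:
    ustar_ok = False
```

Both extensions work by emitting an extra header block ahead of the real entry, which is why a reader that knows neither extension would see an unexpected additional member.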

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-12-24 13:12:16
Message-ID: 549ABBB0.5040908@gmx.net

On 12/22/14 10:00 PM, Amit Kapila wrote:
> There is already a patch [1] in this CF which will handle both cases, so
> I am
> not sure if it is very good idea to go with a new tar format to handle this
> issue.

I think it would still make sense to have proper symlinks in the
basebackup if possible, for clarity.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2014-12-28 01:02:52
Message-ID: CA+Tgmoa0L+bFdDg2+2kN8Z3OrZ_ZvpNmYA4w=TW2z6W8W73kxw@mail.gmail.com

On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 12/22/14 10:00 PM, Amit Kapila wrote:
>> There is already a patch [1] in this CF which will handle both cases, so
>> I am
>> not sure if it is very good idea to go with a new tar format to handle this
>> issue.
>
> I think it would still make sense to have proper symlinks in the
> basebackup if possible, for clarity.

I guess I would have assumed it would be more clear to omit the
symlinks if we're expecting the server to put them in. Otherwise, the
server has to remove the existing symlinks and create new ones, which
introduces various possibilities for failure and confusion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-01-06 21:33:12
Message-ID: 54AC5498.30304@gmx.net

On 12/27/14 8:02 PM, Robert Haas wrote:
> On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On 12/22/14 10:00 PM, Amit Kapila wrote:
>>> There is already a patch [1] in this CF which will handle both cases, so
>>> I am
>>> not sure if it is very good idea to go with a new tar format to handle this
>>> issue.
>>
>> I think it would still make sense to have proper symlinks in the
>> basebackup if possible, for clarity.
>
> I guess I would have assumed it would be more clear to omit the
> symlinks if we're expecting the server to put them in. Otherwise, the
> server has to remove the existing symlinks and create new ones, which
> introduces various possibilities for failure and confusion.

Currently, when you unpack a tarred basebackup with tablespaces, the
symlinks will tell you whether you have unpacked the tablespace tars at
the right place. Otherwise, how do you know? Secondly, you also have
the option of putting the tablespaces somewhere else by changing the
symlinks. Under the new scheme, the existing symlinks would be
overwritten (or not?). If that is actually correct, then the proposed
fix doesn't really replicate the required functionality on Windows.

One way to address this would be to do away with the symlinks altogether
and have pg_tblspc/12345 be a text file that contains the tablespace
location. Kind of symlinks implemented in user space.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-01-07 03:36:08
Message-ID: CAA4eK1+AV-r7CUxY1GHm_jdjViQLfyYer5qA=dgMW4JUg0GcrA@mail.gmail.com

On Wed, Jan 7, 2015 at 3:03 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
> On 12/27/14 8:02 PM, Robert Haas wrote:
> > On Wed, Dec 24, 2014 at 8:12 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> >> On 12/22/14 10:00 PM, Amit Kapila wrote:
> >>> There is already a patch [1] in this CF which will handle both cases, so
> >>> I am not sure if it is very good idea to go with a new tar format to
> >>> handle this issue.
> >>
> >> I think it would still make sense to have proper symlinks in the
> >> basebackup if possible, for clarity.
> >
> > I guess I would have assumed it would be more clear to omit the
> > symlinks if we're expecting the server to put them in. Otherwise, the
> > server has to remove the existing symlinks and create new ones, which
> > introduces various possibilities for failure and confusion.
>
> Currently, when you unpack a tarred basebackup with tablespaces, the
> symlinks will tell you whether you have unpacked the tablespace tars at
> the right place. Otherwise, how do you know?

Via some kind of tablespace map file, which will tell us the exact
location each symlink needs to point to; the same file will be used
to create the symlinks. So after you unpack a tarred basebackup with
tablespaces, there will be no symlinks; when you start the server
(archive recovery) using the base backup, it will create the appropriate
symlinks.

> Secondly, you also have
> the option of putting the tablespaces somewhere else by changing the
> symlinks. Under the new scheme, the existing symlinks would be
> overwritten (or not?). If that is actually correct, then the proposed
> fix doesn't really replicate the required functionality on Windows.
>
> One way to address this would be to do away with the symlinks altogether
> and have pg_tblspc/12345 be a text file that contains the tablespace
> location. Kind of symlinks implemented in user space.
>

I think this is somewhat similar to what the existing patch [1] does, with
the difference that there is just one file for all the tablespace locations
rather than an individual file in each tablespace directory.

[1] : https://commitfest.postgresql.org/action/patch_view?id=1512
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
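
To make the tablespace map idea above more concrete, here is a rough sketch of how such a file could be read. The line format ("<oid> <location>"), the comment handling, the function names, and all paths are assumptions for illustration; the thread does not specify them.

```python
def parse_tablespace_map(text):
    """Parse lines of the assumed form '<oid> <location>' into a dict."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                           # skip blanks and comments
        oid, location = line.split(None, 1)    # location may contain spaces
        mapping[oid] = location
    return mapping

def planned_symlinks(mapping, pgdata):
    """List (link_path, target) pairs that recovery would need to create."""
    return [(f"{pgdata}/pg_tblspc/{oid}", loc)
            for oid, loc in sorted(mapping.items())]

# Made-up example contents of such a map file.
example = """\
# oid  location
16384 /srv/postgres/tablespaces/fast_ssd
16385 /mnt/archive/cold storage/ts2
"""

links = planned_symlinks(parse_tablespace_map(example), "/var/lib/pgsql/data")
```

Because the locations are ordinary file contents rather than tar header fields, their length is not bound by the 100-byte link field.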


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-01-07 20:19:58
Message-ID: CA+Tgmob5b7-hFdJKHEHvhHqY5DchC0TGRDR+9EOLN2cmte+nkg@mail.gmail.com

On Tue, Jan 6, 2015 at 4:33 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Currently, when you unpack a tarred basebackup with tablespaces, the
> symlinks will tell you whether you have unpacked the tablespace tars at
> the right place. Otherwise, how do you know? Secondly, you also have
> the option of putting the tablespaces somewhere else by changing the
> symlinks.

That's a good argument for making the tablespace-map file
human-readable and human-editable, but I don't think it's an argument
for duplicating its contents inaccurately in the filesystem.

> One way to address this would be to do away with the symlinks altogether
> and have pg_tblspc/12345 be a text file that contains the tablespace
> location. Kind of symlinks implemented in user space.

Well, that's just spreading the tablespace-map file out into several
files, and maybe keeping it around after we've restored from backup.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-01-13 21:41:32
Message-ID: 54B5910C.3080103@gmx.net

On 1/7/15 3:19 PM, Robert Haas wrote:
> On Tue, Jan 6, 2015 at 4:33 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> Currently, when you unpack a tarred basebackup with tablespaces, the
>> symlinks will tell you whether you have unpacked the tablespace tars at
>> the right place. Otherwise, how do you know? Secondly, you also have
>> the option of putting the tablespaces somewhere else by changing the
>> symlinks.
>
> That's a good argument for making the tablespace-map file
> human-readable and human-editable, but I don't think it's an argument
> for duplicating its contents inaccurately in the filesystem.
>
>> One way to address this would be to do away with the symlinks altogether
>> and have pg_tblspc/12345 be a text file that contains the tablespace
>> location. Kind of symlinks implemented in user space.
>
> Well, that's just spreading the tablespace-map file out into several
> files, and maybe keeping it around after we've restored from backup.

I think the key point I'm approaching is that the information should
only ever be in one place, all the time. This is not dissimilar from
why we took the tablespace location out of the system catalogs. Users
might have all kinds of workflows for how they back up, restore, and
move their tablespaces. This works pretty well right now, because the
authoritative configuration information is always in plain view. The
proposal is essentially that we add another location for this
information, because the existing location is incompatible with some
operating system tools. And, when considered by a user, that second
location might or might not collide with or overwrite the first location
at some mysterious times.

So I think the preferable fix is not to add a second location, but to
make the first location compatible with said operating system tools,
possibly in the way I propose above.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-01-14 19:45:38
Message-ID: CA+TgmobM2tWoMWAbi57sp+7vfTU95V8+f79spA6Rs25a5aWRAg@mail.gmail.com

On Tue, Jan 13, 2015 at 4:41 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> I think the key point I'm approaching is that the information should
> only ever be in one place, all the time. This is not dissimilar from
> why we took the tablespace location out of the system catalogs. Users
> might have all kinds of workflows for how they back up, restore, and
> move their tablespaces. This works pretty well right now, because the
> authoritative configuration information is always in plain view. The
> proposal is essentially that we add another location for this
> information, because the existing location is incompatible with some
> operating system tools. And, when considered by a user, that second
> location might or might not collide with or overwrite the first location
> at some mysterious times.
>
> So I think the preferable fix is not to add a second location, but to
> make the first location compatible with said operating system tools,
> possibly in the way I propose above.

I see. I'm a little concerned that following symlinks may be cheaper
than whatever system we would come up with for caching the
tablespace-name-to-file-name mappings. But that concern might be
unfounded, and apart from it I have no reason to oppose your proposal,
if you want to do the work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-01-23 08:26:39
Message-ID: 20150123082639.GB16172@toroid.org

At 2014-12-24 08:10:46 -0500, peter_e(at)gmx(dot)net wrote:
>
> As a demo for how this might look, attached is a wildly incomplete
> patch to produce GNU long-link headers.

Hi Peter.

In what way exactly is this patch wildly incomplete? (I ask because it's
been added to the current CF).

-- Abhijit


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-02-02 13:58:03
Message-ID: CA+Tgmoa0vD4H29P9RFADQ+sMZ5GQ67NZ__3k=dmzCscLArdC9Q@mail.gmail.com

On Fri, Nov 7, 2014 at 9:03 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 11/4/14 3:52 PM, Peter Eisentraut wrote:
>> Here are patches to address that. First, it reports errors when
>> attempting to create a tar header that would truncate file or symlink
>> names. Second, it works around the problem in the tests by creating a
>> symlink from the short-name tempdir that we had set up for the
>> Unix-socket directory case.
>
> I ended up splitting this up differently. I applied the part of the
> second patch that works around the length issue in tablespaces. So the
> tests now pass in 9.4 and up even in working directories with long
> names. This clears up the regression in 9.4.
>
> The remaining, not applied patch is attached. It errors when the file
> name is too long and adds tests for that. This could be applied to 9.5
> and backpatched, if we so choose. It might become obsolete if
> https://commitfest.postgresql.org/action/patch_view?id=1512 is accepted.
> If that patch doesn't get accepted, I might add my patch to a future
> commit fest.

I think we should commit this, where by "this" I mean your patch to
error-check the length of filenames and symlinks instead of truncating
them. I don't know what will become of Amit's patch, but I think this
is a good idea anyway. We should perhaps even consider back-patching
it, because silently eating people's data is generally not cool. It's
possible that there are people out there who know that their filenames
and links are being truncated and don't care, and those people would
be unhappy to see this back-patched. However, it's also possible that
there are people who don't know that this is happening and do care,
and those people would be happy about a back-patch. I don't know
which group is larger. At the least, I think we should apply it to
master; because whatever we end up doing about Amit's patch, adding
error checks for conditions where we're chewing up somebody's
filenames and spitting out what's left over has got to be a good
thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-02-24 18:46:30
Message-ID: 54ECC706.1080409@gmx.net

On 2/2/15 8:58 AM, Robert Haas wrote:
> I think we should commit this, where by "this" I mean your patch to
> error-check the length of filenames and symlinks instead of truncating
> them.

done


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
Cc: Oskari Saarenmaa <os(at)ohmu(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_basebackup fails with long tablespace paths
Date: 2015-02-24 18:56:50
Message-ID: 54ECC972.3080605@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/23/15 3:26 AM, Abhijit Menon-Sen wrote:
> At 2014-12-24 08:10:46 -0500, peter_e(at)gmx(dot)net wrote:
>>
>> As a demo for how this might look, attached is a wildly incomplete
>> patch to produce GNU long-link headers.
>
> Hi Peter.
>
> In what way exactly is this patch wildly incomplete? (I ask because it's
> been added to the current CF).

This patch is not in the commit fest. It's just the most recently
posted patch-like attachment in this thread, which confuses the new CF app.