Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation

Lists: pgsql-hackers
From: "MauMau" <maumau307(at)gmail(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2013-10-31 12:40:07
Message-ID: 0CAA393E7F0B4FF88848B6C96B89468E@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I've found and fixed a bug that causes recovery (crash recovery, PITR) to
fail. Please find attached the patch against HEAD.

[Bug]
To reproduce the problem, do the following on Windows:

1. pg_ctl start
2. CREATE TABLESPACE tbs LOCATION 'some_tblspc_path';
3. pg_ctl stop -mi
4. pg_ctl start

The database server fails to start, leaving the below messages:

LOG: database system was interrupted; last known up at 2013-10-31 20:24:07
JST
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 0/1788938
FATAL: could not remove symbolic link "pg_tblspc/16385": Permission denied
CONTEXT: xlog redo create tablespace: 16385 "d:/tbs"
LOG: startup process (PID 2724) exited with exit code 1
LOG: aborting startup due to startup process failure

[Cause]
In redo, create_tablespace_directories() tries to remove the symbolic link
for the tablespace using unlink(). However, unlink() on Windows fails with
errno=13 (Permission denied). This is because junction points are
directories on Windows.

[Fix]
Follow destroy_tablespace_directories() and use rmdir() to remove the
junction point.

I've tested the patch. Could you review it and commit? I wish it to be
backported to all major releases.

Regards
MauMau

Attachment Content-Type Size
remove_tblspc_symlink.patch application/octet-stream 1.6 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2013-10-31 13:20:15
Message-ID: 5272590F.5020504@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 10/31/2013 08:40 AM, MauMau wrote:
> Hello,
>
> I've found and fixed a bug that causes recovery (crash recovery, PITR)
> to fail. Please find attached the patch against HEAD.
>
>
> [Bug]
> To reproduce the problem, do the following on Windows:
>
> 1. pg_ctl start
> 2. CREATE TABLESPACE tbs LOCATION 'some_tblspc_path';
> 3. pg_ctl stop -mi
> 4. pg_ctl start
>
> The database server fails to start, leaving the below messages:
>
> LOG: database system was interrupted; last known up at 2013-10-31
> 20:24:07 JST
> LOG: database system was not properly shut down; automatic recovery in
> progress
> LOG: redo starts at 0/1788938
> FATAL: could not remove symbolic link "pg_tblspc/16385": Permission
> denied
> CONTEXT: xlog redo create tablespace: 16385 "d:/tbs"
> LOG: startup process (PID 2724) exited with exit code 1
> LOG: aborting startup due to startup process failure
>
>
> [Cause]
> In redo, create_tablespace_directories() tries to remove the symbolic
> link for the tablespace using unlink(). However, unlink() on Windows
> fails with errno=13 (Permission denied). This is because junction
> points are directories on Windows.
>
>
> [Fix]
> Follow destroy_tablespace_directories() and use rmdir() to remove the
> junction point.
>
>
> I've tested the patch. Could you review it and commit? I wish it to be
> backported to all major releases.
>
>
> Regards
> MauMau
>

Why are you making this a runtime check instead of a compile time check?

cheers

andrew


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2013-10-31 14:09:34
Message-ID: F5E196DBB5614BAE84BA0B3CDBEACA09@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
> Why are you making this a runtime check instead of a compile time check?

I thought the same situation could happen as in
destroy_tablespace_directories(), but it doesn't seem to apply on second
thought. Revised patch attached, which is very simple based on compile time
check.

Regards
MauMau

Attachment Content-Type Size
remove_tblspc_symlink_v2.patch application/octet-stream 746 bytes

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "MauMau" <maumau307(at)gmail(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2013-10-31 15:03:32
Message-ID: CF99407282C0451EA38C9C53B70F1064@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "MauMau" <maumau307(at)gmail(dot)com>
> I thought the same situation could happen as in
> destroy_tablespace_directories(), but it doesn't seem to apply on second
> thought. Revised patch attached, which is very simple based on compile
> time
> check.

Sorry, I was careless to leave an old comment. Please use this version.

Regards
MauMau

Attachment Content-Type Size
remove_tblspc_symlink_v3.patch application/octet-stream 731 bytes

From: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-01-15 09:56:46
Message-ID: CAEB4t-MCXzq-Emp3xHfe7aGebShDaKhOo4MsLahh0B2Keva99w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I did worked on testing the patch on Windows, test scenario mentioned above
thread is reproducible and the provided patch resolve the issue. In case of
junction or directory unlink function
(deprecated<http://msdn.microsoft.com/en-us/library/ms235350.aspx>)
returns -1 with errno “EACCES” (i.e. Permission denied). As you have
followed destroy_tablespace_directories() function, Is there any specific
reason not to use same logic to detect type of the file/link i.e.
“(lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))”, It also seems have
more appropriate error message i.e.

src/backend/commands/tablespace.c

> ….
> 745 /*
> 746 * Try to remove the symlink. We must however deal with
> the possibility
> 747 * that it's a directory instead of a symlink --- this
> could happen during
> 748 * WAL replay (see TablespaceCreateDbspace), and it is
> also the case on
> 749 * Windows where junction points lstat() as directories.
> 750 *
> 751 * Note: in the redo case, we'll return true if this final
> step fails;
> 752 * there's no point in retrying it. Also, ENOENT should
> provoke no more
> 753 * than a warning.
> 754 */
> 755 remove_symlink:
> 756 linkloc = pstrdup(linkloc_with_version_dir);
> 757 get_parent_directory(linkloc);
> 758 if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
> 759 {
> 760 if (rmdir(linkloc) < 0)
> 761 ereport(redo ? LOG : ERROR,
> 762 (errcode_for_file_access(),
> 763 errmsg("could not remove
> directory \"%s\": %m",
> 764 linkloc)));
> 765 }
> 766 else
> 767 {
> 768 if (unlink(linkloc) < 0)
> 769 ereport(redo ? LOG : (errno == ENOENT ?
> WARNING : ERROR),
> 770 (errcode_for_file_access(),
> 771 errmsg("could not remove
> symbolic link \"%s\": %m",
> 772 linkloc)));
> 773 }
> ….

Other than this patch looks good to me.

Regards,
Muhammad Asif Naeem

On Thu, Oct 31, 2013 at 8:03 PM, MauMau <maumau307(at)gmail(dot)com> wrote:

> From: "MauMau" <maumau307(at)gmail(dot)com>
>
> I thought the same situation could happen as in
>> destroy_tablespace_directories(), but it doesn't seem to apply on second
>> thought. Revised patch attached, which is very simple based on compile
>> time
>> check.
>>
>
> Sorry, I was careless to leave an old comment. Please use this version.
>
> Regards
> MauMau
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-01-15 14:42:47
Message-ID: B8A9EC1E51FD45A281A3B61F5744B4A1@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>
> As you have
followed destroy_tablespace_directories() function, Is there any specific
reason not to use same logic to detect type of the file/link i.e.
“(lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))”, It also seems have
more appropriate error message i.e.

Thanks for reviewing and testing the patch. Yes, at first I did what you
mentioned, but modified the patch according to some advice in the mail
thread. During redo, create_tablespace_directories() needs to handle the
case where the $PGDATA/pg_tblspc/xxx is not a symlink but a directory even
on UNIX/Linux. Please see TablespaceCreateDbspace is().
destroy_tablespace_directories() doesn't have to handle such situation.

Regards
MauMau


From: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-01-16 05:09:05
Message-ID: CAEB4t-OcdKpd=x-3N4vw4obG_LtPVB9MUNNbwtTeBXPUCOMOkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi MauMau,

Ah. Sorry, I missed that part. As NTFS junctions and symbolic links are
different (although they behave similarly), there seems only a minor
inconvenience related to misleading error message i.e.

+ #ifdef WIN32
> + if (rmdir(linkloc) < 0 && errno != ENOENT)
> + #else
> if (unlink(linkloc) < 0 && errno != ENOENT)
> + #endif
> ereport(ERROR,
> (errcode_for_file_access(),
> errmsg("could not remove symbolic link \"%s\": %m",

What is your opinion about it, Is it not worth changing ? . Thanks.

On Wed, Jan 15, 2014 at 7:42 PM, MauMau <maumau307(at)gmail(dot)com> wrote:

> From: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>
>
> As you have
>>
> followed destroy_tablespace_directories() function, Is there any specific
> reason not to use same logic to detect type of the file/link i.e.
> “(lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))”, It also seems have
> more appropriate error message i.e.
>
> Thanks for reviewing and testing the patch. Yes, at first I did what you
> mentioned, but modified the patch according to some advice in the mail
> thread. During redo, create_tablespace_directories() needs to handle the
> case where the $PGDATA/pg_tblspc/xxx is not a symlink but a directory even
> on UNIX/Linux. Please see TablespaceCreateDbspace is(). destroy_tablespace_directories()
> doesn't have to handle such situation.
>
> Regards
> MauMau
>
>


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-02-04 10:47:49
Message-ID: 61C1035D921348B59F199CBEFFBEE67A@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'm sorry, I'm replying to an older mail, because I lost your latest mail by
mistake.

> Ah. Sorry, I missed that part. As NTFS junctions and symbolic links are
> different (although they behave similarly), there seems only a minor
> inconvenience related to misleading error message i.e.

You are right. Fixed.

Regards
MauMau

Attachment Content-Type Size
remove_tblspc_symlink_v4.patch application/octet-stream 913 bytes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-16 12:31:53
Message-ID: CAA4eK1+xH8tYoFK3=T3UiTf=5s66a1cC1st9jEScLYc0t1UnsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 8:12 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> From: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>
>
>> As you have
>
> followed destroy_tablespace_directories() function, Is there any specific
> reason not to use same logic to detect type of the file/link i.e.
> "(lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))", It also seems have
> more appropriate error message i.e.
>
> Thanks for reviewing and testing the patch. Yes, at first I did what you
> mentioned, but modified the patch according to some advice in the mail
> thread. During redo, create_tablespace_directories() needs to handle the
> case where the $PGDATA/pg_tblspc/xxx is not a symlink but a directory even
> on UNIX/Linux. Please see TablespaceCreateDbspace is().
> destroy_tablespace_directories() doesn't have to handle such situation.

If create_tablespace_directories() needs to handle with directory both on
Windows/Linux, then shouldn't it be a runtime check as in your first
version rather than compile time check?
Also isn't that the reason why destroy_tablespace_directories() have similar
check?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-21 06:54:03
Message-ID: E98B469BA1D74D66A69103C6392335F0@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
> On Wed, Jan 15, 2014 at 8:12 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
>> Thanks for reviewing and testing the patch. Yes, at first I did what you
>> mentioned, but modified the patch according to some advice in the mail
>> thread. During redo, create_tablespace_directories() needs to handle the
>> case where the $PGDATA/pg_tblspc/xxx is not a symlink but a directory
>> even
>> on UNIX/Linux. Please see TablespaceCreateDbspace is().
>> destroy_tablespace_directories() doesn't have to handle such situation.
>
> If create_tablespace_directories() needs to handle with directory both on
> Windows/Linux, then shouldn't it be a runtime check as in your first
> version rather than compile time check?
> Also isn't that the reason why destroy_tablespace_directories() have
> similar
> check?

I see..., and you are correct. The first version of my patch should be the
right fix. It seems that my head went somewhere when I submitted the second
revision.

What should I do? Should I re-submit the first revision as the latest fifth
revision and link the email from the CommitFest newest entry?

Regards
MauMau


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-21 07:26:42
Message-ID: CAA4eK1Jdw-ve9j8f1RS61CZZZRKp1PBQ2M3mKjj_qLumtukKRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2014 at 12:24 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> From: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
>> If create_tablespace_directories() needs to handle with directory both on
>> Windows/Linux, then shouldn't it be a runtime check as in your first
>> version rather than compile time check?
>> Also isn't that the reason why destroy_tablespace_directories() have
>> similar
>> check?
>
>
> I see..., and you are correct. The first version of my patch should be the
> right fix. It seems that my head went somewhere when I submitted the second
> revision.
>
> What should I do? Should I re-submit the first revision as the latest fifth
> revision and link the email from the CommitFest newest entry?

The comments in your first version needs to be improved, as there
you just mentioned a Windows specific comment:
+ /* On Windows, lstat()

I think you can change comments (make it somewhat similar to
destroy_tablespace_directories) and then submit it as a new version.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-21 08:58:19
Message-ID: A9C7FB95B38340CB8FFC3BF5419C5D90@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
> The comments in your first version needs to be improved, as there
> you just mentioned a Windows specific comment:
> + /* On Windows, lstat()
>
> I think you can change comments (make it somewhat similar to
> destroy_tablespace_directories) and then submit it as a new version.

OK, done. Please find the attached patch. I also rebased the patch to
HEAD.

I'll update the CommitFest entry soon.
Regards
MauMau

Attachment Content-Type Size
remove_tblspc_symlink_v5.patch application/octet-stream 1.8 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-24 06:33:09
Message-ID: CAA4eK1LbcrODt8vb1=w37Os52aH9tfmHT5Humc9j5hyJKRG5Hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 21, 2014 at 12:24 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> From: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
>> On Wed, Jan 15, 2014 at 8:12 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
>>>
>>> Thanks for reviewing and testing the patch. Yes, at first I did what you
>>> mentioned, but modified the patch according to some advice in the mail
>>> thread. During redo, create_tablespace_directories() needs to handle the
>>> case where the $PGDATA/pg_tblspc/xxx is not a symlink but a directory
>>> even
>>> on UNIX/Linux. Please see TablespaceCreateDbspace is().
>>> destroy_tablespace_directories() doesn't have to handle such situation.
>>
>>
>> If create_tablespace_directories() needs to handle with directory both on
>> Windows/Linux, then shouldn't it be a runtime check as in your first
>> version rather than compile time check?
>> Also isn't that the reason why destroy_tablespace_directories() have
>> similar
>> check?
>
>
> I see..., and you are correct.

I have thought about the above point again and it seems to me that
the above claim (create_tablespace_directories() needs to handle a
path which is not a symlink but directory) might not be true.
For Example, I could easily think of case where it is required for
destroy_tablespace_directories().

1. Assume a tablespace tbs already exists.
2. Create table t1(c1 int) tablespace tbs;
3. drop table t1;
4. Drop tablespace tbs;
5. Do immediate shutdown (pg_ctl stop -mi);
6. During recovery it will create a table in directory (in function
TablespaceCreateDbspace) which needs to be removed by
destroy_tablespace_directories().

I am neither aware of, nor could think of such a case for
create_tablespace_directories(). Do you have any such case in mind
which I might be missing?

By saying above, I don't mean that your current patch has any
problem; even if there is no such scenario, I think your code is
right as stat/isdir check seems to be okay to identify junction
points and it avoids ifdef WIN32 check (which I personally think
is bit annoying and we should try to avoid such code unless it
is must or provides any significant advantage).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-24 14:19:29
Message-ID: 48E2E45C0A7446D1AF4757B66B2D2D8B@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
> 1. Assume a tablespace tbs already exists.
> 2. Create table t1(c1 int) tablespace tbs;
> 3. drop table t1;
> 4. Drop tablespace tbs;
> 5. Do immediate shutdown (pg_ctl stop -mi);
> 6. During recovery it will create a table in directory (in function
> TablespaceCreateDbspace) which needs to be removed by
> destroy_tablespace_directories().
>
> I am neither aware of, nor could think of such a case for
> create_tablespace_directories(). Do you have any such case in mind
> which I might be missing?

A bit contrived example is:

1. After the directory is created by TablespaceCreateDbspace(), recovery is
stopped (e.g. due to power outage). The directory remains.
2. Restart the server, redoing CREATE TABLESPACE during recovery, which
executes create_tablespace_directories().

> By saying above, I don't mean that your current patch has any
> problem; even if there is no such scenario, I think your code is
> right as stat/isdir check seems to be okay to identify junction
> points and it avoids ifdef WIN32 check (which I personally think
> is bit annoying and we should try to avoid such code unless it
> is must or provides any significant advantage).

I think so, too.

Regards
MauMau


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-25 03:32:17
Message-ID: CAA4eK1J-+uVXtezaGRK9GOJP1dG4Vcyb-VxwRRrW=7YF5vTrkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 24, 2014 at 7:49 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> A bit contrived example is:
>
> 1. After the directory is created by TablespaceCreateDbspace(), recovery is
> stopped (e.g. due to power outage). The directory remains.
> 2. Restart the server, redoing CREATE TABLESPACE during recovery, which
> executes create_tablespace_directories().

I don't understand how after step-1, step-2 can occur in recovery for same
tablespace path.

Function TablespaceCreateDbspace() would have called for CREATE TABLE.
Now Step-2 can only occur if there is a Drop Tablespace command in-between
step-1 and step-2, else CREATE TABLESPACE can't be successful during
command execution so will not get recorded in WAL. Basically Create Table
cannot happen on a particular directory without having some
CREATE TABLESPACE before it, so in the above example taken by you,
there must be some Create TableSpace before step-1 or it's on default
tablespace location which means you cannot perform step-2 for same
tablespace path as step-1 without having DROP TABLESPACE in-between
step-1 and step-2.

If you think that above scenario is not possible, then you just need to
modify comment:
"! * Remove old symlink in recovery...."

One more minor point about patch:
+ struct stat st;

if (InRecovery)
{
struct stat st;

Defining stat struct two times in same function in different ways doesn't
seem to be good, we can do the same way for new usage as is already
done in code or may be declare it once.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-25 12:09:13
Message-ID: 994F3C8171ED42D7898F0795DCD6ACAA@maumau
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

From: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
> If you think that above scenario is not possible, then you just need to
> modify comment:
> "! * Remove old symlink in recovery...."

Hm, my scenario seems impossible. I reverted the comment.

> One more minor point about patch:
> + struct stat st;
>
> if (InRecovery)
> {
> struct stat st;
>
> Defining stat struct two times in same function in different ways doesn't
> seem to be good, we can do the same way for new usage as is already
> done in code or may be declare it once.

OK, I removed the second (existing) definition of st.

Regards
MauMau

Attachment Content-Type Size
remove_tblspc_symlink_v6.patch application/octet-stream 1.9 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Asif Naeem <anaeem(dot)it(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-03-26 03:31:44
Message-ID: CAA4eK1LvXr2BPQG3C68AQLP5SB6o68mOXDJzbqcc96N6LxjTeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 25, 2014 at 5:39 PM, MauMau <maumau307(at)gmail(dot)com> wrote:
> OK, I removed the second (existing) definition of st.

This patch version looks fine, I have verified the issue, regression tests
passed.

I had attached the latest version of patch in CF app and marked it as
"Ready For Committer".

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Asif Naeem <anaeem(dot)it(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-04-04 19:04:03
Message-ID: 20140404190403.GA27702@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-25 21:09:13 +0900, MauMau wrote:
> ! /*
> ! * Remove old symlink in recovery, in case it points to the wrong place.
> ! * On Windows, lstat() reports junction points as directories.
> ! */
> if (InRecovery)
> {
> ! if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))
> ! {
> ! if (rmdir(linkloc) < 0)
> ! ereport(ERROR,
> ! (errcode_for_file_access(),
> ! errmsg("could not remove directory \"%s\": %m",
> ! linkloc)));
> ! }
> ! else
> ! {
> ! if (unlink(linkloc) < 0 && errno != ENOENT)
> ! ereport(ERROR,
> ! (errcode_for_file_access(),
> ! errmsg("could not remove symbolic link \"%s\": %m",
> ! linkloc)));
> ! }
> }

if (..)
...
else
{
if (...)
...
}

is pretty odd code layout.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "MauMau" <maumau307(at)gmail(dot)com>
Cc: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Asif Naeem" <anaeem(dot)it(at)gmail(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation
Date: 2014-04-05 03:11:47
Message-ID: 20066.1396667507@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"MauMau" <maumau307(at)gmail(dot)com> writes:
> [ remove_tblspc_symlink_v6.patch ]

Committed, thanks.

regards, tom lane