Re: danger of stats_temp_directory = /dev/shm

Lists: pgsql-hackers
From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: danger of stats_temp_directory = /dev/shm
Date: 2013-04-24 23:12:05
Message-ID: CAMkU=1z9+7RsDODnT4=cDFBRBp8wYQbd_qsLcMtKEf-oFwuOdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

With the stats file split patch 187492b6c2e8cafc5 introduced in 9.3dev, now
after a crash the postmaster will try to delete all files in the directory
stats_temp_directory. When that is just a subdirectory of PGDATA, this is
fine. But it seems rather hostile when it is set to a shared directory,
like the popular /dev/shm.

Previously, it would only try to delete the one file /dev/shm/pgstat.stat,
so the danger was much smaller.

If /dev/shm is used, this only comes into play when postgres has crashed
but the OS has not (If the OS has crashed, /dev/shm it will be empty anyway
when it comes back) so perhaps this is not such a large exposure.

The docs don't discuss the issue of what happens if stats_temp_directory is
set to a non-private (to PGDATA) directory. The docs also do not
explicitly recommend using /dev/shm, but there are plenty of examples of
that usage that come up on google (and no examples of using a private
subdirectory of /dev/shm rather than /dev/shm itself).

Does this need to be fixed, or at least documented?

Cheers,

Jeff


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-04-25 01:42:45
Message-ID: 20130425014245.GR2169@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff Janes escribió:
> With the stats file split patch 187492b6c2e8cafc5 introduced in 9.3dev, now
> after a crash the postmaster will try to delete all files in the directory
> stats_temp_directory. When that is just a subdirectory of PGDATA, this is
> fine. But it seems rather hostile when it is set to a shared directory,
> like the popular /dev/shm.

> Does this need to be fixed, or at least documented?

I think we need it fixed so that it only deletes the files matching a
well-known pattern. In fact, there's an XXX comment about this in the
code:

/* XXX should we try to ignore files other than the ones we write? */

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-04-25 04:09:50
Message-ID: 12653.1366862990@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Jeff Janes escribi:
>> With the stats file split patch 187492b6c2e8cafc5 introduced in 9.3dev, now
>> after a crash the postmaster will try to delete all files in the directory
>> stats_temp_directory. When that is just a subdirectory of PGDATA, this is
>> fine. But it seems rather hostile when it is set to a shared directory,
>> like the popular /dev/shm.

>> Does this need to be fixed, or at least documented?

> I think we need it fixed so that it only deletes the files matching a
> well-known pattern.

I think we need it fixed to reject any stats_temp_directory that is not
postgres-owned with restrictive permissions. The problem here is not
with what it deletes, it's with the insanely insecure configuration.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-04-25 15:24:53
Message-ID: 51794AC5.9080709@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4/25/13 12:09 AM, Tom Lane wrote:
> I think we need it fixed to reject any stats_temp_directory that is not
> postgres-owned with restrictive permissions. The problem here is not
> with what it deletes, it's with the insanely insecure configuration.

Yeah, the requirements should be similar to what initdb requires for
PGDATA and pg_xlog.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-04-25 16:14:13
Message-ID: 51795655.9060508@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 04/25/2013 11:24 AM, Peter Eisentraut wrote:
> On 4/25/13 12:09 AM, Tom Lane wrote:
>> I think we need it fixed to reject any stats_temp_directory that is not
>> postgres-owned with restrictive permissions. The problem here is not
>> with what it deletes, it's with the insanely insecure configuration.
> Yeah, the requirements should be similar to what initdb requires for
> PGDATA and pg_xlog.
>
>

Right.

I do think that best practice suggests using a dedicated ram drive
rather than /dev/shm. Here's an fstab entry I have used at one client's
site:

tmpfs /var/lib/pgsql/stats_tmp tmpfs
size=5G,uid=postgres,gid=postgres 0 0

I guess if we put in the sort of restrictions being suggested above I'd
add a mode argument to the mount options.

(This drive might seem large, but total RAM on this machine is 512Gb.)

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-04-26 15:40:14
Message-ID: CA+TgmoYnNi6UCt_73swjQQDe-ktkhP9oQTz4f+2b7NKxgzdNXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 25, 2013 at 12:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> Jeff Janes escribió:
>>> With the stats file split patch 187492b6c2e8cafc5 introduced in 9.3dev, now
>>> after a crash the postmaster will try to delete all files in the directory
>>> stats_temp_directory. When that is just a subdirectory of PGDATA, this is
>>> fine. But it seems rather hostile when it is set to a shared directory,
>>> like the popular /dev/shm.
>
>>> Does this need to be fixed, or at least documented?
>
>> I think we need it fixed so that it only deletes the files matching a
>> well-known pattern.
>
> I think we need it fixed to reject any stats_temp_directory that is not
> postgres-owned with restrictive permissions. The problem here is not
> with what it deletes, it's with the insanely insecure configuration.

Only deleting files matching the relevant pattern might not be a bad
idea either, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-13 16:57:36
Message-ID: CAMkU=1y3cg0FqsHx9jt=O+zUhgEbQ-41Dq9=-m9LNbaqTuwmiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Apr 25, 2013 at 8:24 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 4/25/13 12:09 AM, Tom Lane wrote:
>> I think we need it fixed to reject any stats_temp_directory that is not
>> postgres-owned with restrictive permissions. The problem here is not
>> with what it deletes, it's with the insanely insecure configuration.
>
> Yeah, the requirements should be similar to what initdb requires for
> PGDATA and pg_xlog.

Is this a blocker for 9.3?

If it is a concern of not what is deleted but rather that someone can
inject a poisoned stats file into the directory, does it need to be
back-patched all the way, as that could be done before the split
patch?

Cheers,

Jeff


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-13 17:27:19
Message-ID: 520A6C77.4070908@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/13/2013 09:57 AM, Jeff Janes wrote:
> Is this a blocker for 9.3?

Why would it be? This issue doesn't originate with 9.3.

> If it is a concern of not what is deleted but rather that someone can
> inject a poisoned stats file into the directory, does it need to be
> back-patched all the way, as that could be done before the split
> patch?

I'd say it's a backpatch. We'll need to warn the heck out of users, though.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-14 04:00:57
Message-ID: CAMkU=1w_WoXXKTUhEGm-4bAyCj-py=gnB9wspLeOUHZyAqv-LQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday, August 13, 2013, Josh Berkus wrote:

> On 08/13/2013 09:57 AM, Jeff Janes wrote:
> > Is this a blocker for 9.3?
>
> Why would it be? This issue doesn't originate with 9.3.
>

Before 9.3, it would delete one specific file from a potentially shared
directory. In 9.3 it deletes the entire contents of a potentially shared
directory. That is a massive expansion in the surface area for
unintentional deletion. If we will disallow using shared directories
before the time 9.3 is released, that would fix it one way, but I don't
know if that is the plan or not.

Cheers,

Jeff


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-14 22:05:14
Message-ID: 520BFF1A.6070708@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jeff,

> Before 9.3, it would delete one specific file from a potentially shared
> directory. In 9.3 it deletes the entire contents of a potentially shared
> directory. That is a massive expansion in the surface area for
> unintentional deletion. If we will disallow using shared directories
> before the time 9.3 is released, that would fix it one way, but I don't
> know if that is the plan or not.

I can't see doing that. I can see adding the requirement for 9.3, and
then documenting it though.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-15 01:44:32
Message-ID: 19282.1376531072@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Before 9.3, it would delete one specific file from a potentially shared
>> directory. In 9.3 it deletes the entire contents of a potentially shared
>> directory. That is a massive expansion in the surface area for
>> unintentional deletion. If we will disallow using shared directories
>> before the time 9.3 is released, that would fix it one way, but I don't
>> know if that is the plan or not.

> I can't see doing that. I can see adding the requirement for 9.3, and
> then documenting it though.

I think we should change 9.3 to be restrictive about ownership/permissions
on the stats_temp_directory (ie, require owner = postgres user,
permissions = 0700, same as for the $PGDATA directory). I agree that
back-patching such a change to the older branches is probably not a good
plan. I can't quite parse what you say above, so I'm not sure if you're
fully agreeing with that position or not.

In addition to that, it might be a good idea to do what the comment in the
code suggests, namely do more than zero checking on each file name to try
to make sure it looks like a stats temp file name that we'd generate
before we delete it. The ownership/permissions test wouldn't be enough
to prevent you from pointing at, say, ~postgres and thereby losing some
files you'd rather not.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-15 01:57:46
Message-ID: 20130815015746.GE6351@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

> I think we should change 9.3 to be restrictive about ownership/permissions
> on the stats_temp_directory (ie, require owner = postgres user,
> permissions = 0700, same as for the $PGDATA directory). I agree that
> back-patching such a change to the older branches is probably not a good
> plan. I can't quite parse what you say above, so I'm not sure if you're
> fully agreeing with that position or not.
>
> In addition to that, it might be a good idea to do what the comment in the
> code suggests, namely do more than zero checking on each file name to try
> to make sure it looks like a stats temp file name that we'd generate
> before we delete it. The ownership/permissions test wouldn't be enough
> to prevent you from pointing at, say, ~postgres and thereby losing some
> files you'd rather not.

I will look into this.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-15 06:20:58
Message-ID: CABUevEzHdwBgapH8kT8h4anu7O2snYw6Megd_wo1v-JuhT9Apg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Aug 15, 2013 3:44 AM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
> >> Before 9.3, it would delete one specific file from a potentially shared
> >> directory. In 9.3 it deletes the entire contents of a potentially
shared
> >> directory. That is a massive expansion in the surface area for
> >> unintentional deletion. If we will disallow using shared directories
> >> before the time 9.3 is released, that would fix it one way, but I don't
> >> know if that is the plan or not.
>
> > I can't see doing that. I can see adding the requirement for 9.3, and
> > then documenting it though.
>
> I think we should change 9.3 to be restrictive about ownership/permissions
> on the stats_temp_directory (ie, require owner = postgres user,
> permissions = 0700, same as for the $PGDATA directory). I agree that
> back-patching such a change to the older branches is probably not a good

+1

>
> In addition to that, it might be a good idea to do what the comment in the
> code suggests, namely do more than zero checking on each file name to try
> to make sure it looks like a stats temp file name that we'd generate
> before we delete it. The ownership/permissions test wouldn't be enough
> to prevent you from pointing at, say, ~postgres and thereby losing some
> files you'd rather not.

+1 on that as well. It shouldn't be that hard to do.

/Magnus


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 17:50:38
Message-ID: 20130819175038.GE9264@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

> I think we should change 9.3 to be restrictive about ownership/permissions
> on the stats_temp_directory (ie, require owner = postgres user,
> permissions = 0700, same as for the $PGDATA directory).

Not an easy thing to do, this. It should be done as a GUC check hook,
ISTM, but this doesn't work because the first time those are run we
haven't yet changed to the data directory, and so any relative path
(which the default value is) will cause the check to fail (I *assume*
setting an absolute path would work, but I haven't tried). We could
skip the check on the first run, and verify the directory separately in
PostmasterMain() after changing CWD, but I don't see any way to detect
that we're in the initial run of GUC processing. Any thoughts? Maybe
the idea of using a GUC check hook is flawed, but I don't think so
because we also need to verify a directory when the setting changes on
SIGHUP.

The implementation I chose for the actual check was to separate the
permission checks from checkDataDir() into src/port/pgcheckdir.c that
returns different error codes for each case; see first attachment.
This part seems pretty reasonable, except that the code should be in
src/common rather than src/port, but I believe the entire pgcheckdir.c
file should be moved.

> In addition to that, it might be a good idea to do what the comment in the
> code suggests, namely do more than zero checking on each file name to try
> to make sure it looks like a stats temp file name that we'd generate
> before we delete it. The ownership/permissions test wouldn't be enough
> to prevent you from pointing at, say, ~postgres and thereby losing some
> files you'd rather not.

This seems pretty simple to do; see second attachment. (It would delete
files named, "db_1234.tmpfoobar", that is, valid names with suffixes,
but I can't see that being a problem). (I haven't really tested this
part at all.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
check-dir-perms.patch text/x-diff 6.4 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 18:04:38
Message-ID: 20130819180437.GF9264@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:

> > In addition to that, it might be a good idea to do what the comment in the
> > code suggests, namely do more than zero checking on each file name to try
> > to make sure it looks like a stats temp file name that we'd generate
> > before we delete it. The ownership/permissions test wouldn't be enough
> > to prevent you from pointing at, say, ~postgres and thereby losing some
> > files you'd rather not.
>
> This seems pretty simple to do; see second attachment. (It would delete
> files named, "db_1234.tmpfoobar", that is, valid names with suffixes,
> but I can't see that being a problem). (I haven't really tested this
> part at all.)

Here's the second attachment.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
skip-unknown-files.patch text/x-diff 1.7 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 18:13:50
Message-ID: 23344.1376936030@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Alvaro Herrera wrote:
>>> In addition to that, it might be a good idea to do what the comment in the
>>> code suggests, namely do more than zero checking on each file name to try
>>> to make sure it looks like a stats temp file name that we'd generate
>>> before we delete it. The ownership/permissions test wouldn't be enough
>>> to prevent you from pointing at, say, ~postgres and thereby losing some
>>> files you'd rather not.

>> This seems pretty simple to do; see second attachment. (It would delete
>> files named, "db_1234.tmpfoobar", that is, valid names with suffixes,
>> but I can't see that being a problem). (I haven't really tested this
>> part at all.)

> Here's the second attachment.

This looks good except that it can't tell "db_123.statfoo" isn't a match.
The scan limit/buffer size needs to be greater than the longest string
you care about, not only equal to. I think strcmp not strncmp would be
better coding, too. Please fix that and commit -- I think this part
is pretty noncontroversial.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 18:28:28
Message-ID: 23792.1376936908@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> The implementation I chose for the actual check was to separate the
> permission checks from checkDataDir() into src/port/pgcheckdir.c that
> returns different error codes for each case; see first attachment.
> This part seems pretty reasonable, except that the code should be in
> src/common rather than src/port, but I believe the entire pgcheckdir.c
> file should be moved.

s/CKDIR_TOOACCESIBLE/CKDIR_TOOACCESSIBLE/, and maybe use underscores
to separate the words in those names? Otherwise no objection. But
there's not much point in this unless we can figure out where to call
it from for the stat_directory case.

One possibility is to do the initial check somewhere shortly after
ChangeToDataDir(), and have the GUC check hook only attempt to make a
check in SIGHUP context. Unfortunately we aren't passing the context to
check hooks, only GucSource which isn't adequate for this. Not sure if we
want to go so far as to change the check-hook API at this point. We could
probably think of some other, klugy way to tell if it's initial startup.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 18:39:50
Message-ID: 20130819183950.GA26775@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-08-19 14:28:28 -0400, Tom Lane wrote:
> One possibility is to do the initial check somewhere shortly after
> ChangeToDataDir(), and have the GUC check hook only attempt to make a
> check in SIGHUP context. Unfortunately we aren't passing the context to
> check hooks, only GucSource which isn't adequate for this. Not sure if we
> want to go so far as to change the check-hook API at this point. We could
> probably think of some other, klugy way to tell if it's initial startup.

Is it even actually safe to have stats_temp_directory PGC_SIGHUP after
the per DB splitup? I haven't audited the code for it, but it seems
somewhat likely that we would end up with some files in the old and some
in the new directory?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 18:56:05
Message-ID: 24568.1376938565@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2013-08-19 14:28:28 -0400, Tom Lane wrote:
>> One possibility is to do the initial check somewhere shortly after
>> ChangeToDataDir(), and have the GUC check hook only attempt to make a
>> check in SIGHUP context. Unfortunately we aren't passing the context to
>> check hooks, only GucSource which isn't adequate for this. Not sure if we
>> want to go so far as to change the check-hook API at this point. We could
>> probably think of some other, klugy way to tell if it's initial startup.

> Is it even actually safe to have stats_temp_directory PGC_SIGHUP after
> the per DB splitup? I haven't audited the code for it, but it seems
> somewhat likely that we would end up with some files in the old and some
> in the new directory?

That's a good point. For the moment we could just change it to
PGC_POSTMASTER and eliminate this problem. Reducing it back to SIGHUP
would be a future feature, which is evidently less than trivial.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 19:06:48
Message-ID: 20130819190648.GB26775@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-08-19 13:50:38 -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
>
> > I think we should change 9.3 to be restrictive about ownership/permissions
> > on the stats_temp_directory (ie, require owner = postgres user,
> > permissions = 0700, same as for the $PGDATA directory).
>
> Not an easy thing to do, this. It should be done as a GUC check hook,
> ISTM, but this doesn't work because the first time those are run we
> haven't yet changed to the data directory, and so any relative path
> (which the default value is) will cause the check to fail (I *assume*
> setting an absolute path would work, but I haven't tried). We could
> skip the check on the first run, and verify the directory separately in
> PostmasterMain() after changing CWD, but I don't see any way to detect
> that we're in the initial run of GUC processing. Any thoughts? Maybe
> the idea of using a GUC check hook is flawed, but I don't think so
> because we also need to verify a directory when the setting changes on
> SIGHUP.

Hm. Is a check like that actually sufficient? The idea of setting
stats_temp_directory to /dev/shm/postgres or similar in all of several
clusters on one machine doesn't seem to be that far fetched.

The only idea I have to prevent that is writing some minimal pg_control
like file into the temp stats directory iff it's empty. Then, when
reusing a stats temp directory, refuse to work unless it has the same
ids.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 19:25:38
Message-ID: 25478.1376940338@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Hm. Is a check like that actually sufficient? The idea of setting
> stats_temp_directory to /dev/shm/postgres or similar in all of several
> clusters on one machine doesn't seem to be that far fetched.

Hm. We could fairly easily have the lockfile management code also
write a postmaster.pid file into the stats_temp_directory, thus claiming
it in the same way as we do the $PGDATA dir itself. Not sure it's worth
the trouble though, since we've not heard any field reports of this sort
of thing.

I note BTW that similar complaints could be lodged against the
log_directory setting. We've not worried about that one too much.

Maybe we're overreacting to this issue for stats_temp_directory,
and tightening up the deletion code is a sufficient fix.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 19:36:26
Message-ID: 20130819193626.GE26775@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-08-19 15:25:38 -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > Hm. Is a check like that actually sufficient? The idea of setting
> > stats_temp_directory to /dev/shm/postgres or similar in all of several
> > clusters on one machine doesn't seem to be that far fetched.
>
> Hm. We could fairly easily have the lockfile management code also
> write a postmaster.pid file into the stats_temp_directory, thus claiming
> it in the same way as we do the $PGDATA dir itself. Not sure it's worth
> the trouble though, since we've not heard any field reports of this sort
> of thing.

The reason I think it's more likely is that pg_stats_directory, to be
useful, really needs something like a ramdisk/tmpfs. Which is annoying
to create in a persistent fashion...
Very likely doing so would cause hard to diagnose planner issues using
completely absurd statistics. Not sure how often it would get properly
diagnosed.

But:

> Maybe we're overreacting to this issue for stats_temp_directory,
> and tightening up the deletion code is a sufficient fix.

You very well might be right. Just wanted to raise the issue.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 19:47:05
Message-ID: 52127639.20301@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

> I note BTW that similar complaints could be lodged against the
> log_directory setting. We've not worried about that one too much.

Actually, it does happen that when you change log_directory on a reload,
stuff takes an uneven amount of time to "cut over"; that is, there's a
few seconds while you're writing to both logs at once. Materially,
though, this isn't a serious operational issue (the logs are known to be
asynchronous), so beyond confusing newbies, it's not something we'd want
to fix.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 19:51:16
Message-ID: 27252.1376941876@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Tom,
>> I note BTW that similar complaints could be lodged against the
>> log_directory setting. We've not worried about that one too much.

> Actually, it does happen that when you change log_directory on a reload,
> stuff takes an uneven amount of time to "cut over"; that is, there's a
> few seconds while you're writing to both logs at once. Materially,
> though, this isn't a serious operational issue (the logs are known to be
> asynchronous), so beyond confusing newbies, it's not something we'd want
> to fix.

Yeah, the stats temp directory will act pretty much the same way: there
will be an interval where backends might not get the most up-to-date data,
but it's not clear that it's worth the trouble to get it to be more nearly
synchronous.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 19:54:21
Message-ID: 20130819195421.GG26775@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-08-19 12:47:05 -0700, Josh Berkus wrote:
> Tom,
>
> > I note BTW that similar complaints could be lodged against the
> > log_directory setting. We've not worried about that one too much.
>
> Actually, it does happen that when you change log_directory on a reload,
> stuff takes an uneven amount of time to "cut over"; that is, there's a
> few seconds while you're writing to both logs at once. Materially,
> though, this isn't a serious operational issue (the logs are known to be
> asynchronous), so beyond confusing newbies, it's not something we'd want
> to fix.

I think Tom was talking about the issue I noted on upthread which is
that two different clusters could write to the same
temp_stats_directory.
I've seen the logfile equivalent happen, but it was pretty easy to
diagnose...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 20:00:35
Message-ID: 20130819200034.GH26775@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-08-19 15:51:16 -0400, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
> > Tom,
> >> I note BTW that similar complaints could be lodged against the
> >> log_directory setting. We've not worried about that one too much.
>
> > Actually, it does happen that when you change log_directory on a reload,
> > stuff takes an uneven amount of time to "cut over"; that is, there's a
> > few seconds while you're writing to both logs at once. Materially,
> > though, this isn't a serious operational issue (the logs are known to be
> > asynchronous), so beyond confusing newbies, it's not something we'd want
> > to fix.
>
> Yeah, the stats temp directory will act pretty much the same way: there
> will be an interval where backends might not get the most up-to-date data,
> but it's not clear that it's worth the trouble to get it to be more nearly
> synchronous.

Won't it possibly cause stats being entirely lost?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 20:12:37
Message-ID: 28615.1376943157@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2013-08-19 15:51:16 -0400, Tom Lane wrote:
>> Yeah, the stats temp directory will act pretty much the same way: there
>> will be an interval where backends might not get the most up-to-date data,
>> but it's not clear that it's worth the trouble to get it to be more nearly
>> synchronous.

> Won't it possibly cause stats being entirely lost?

How would that happen? The directory is write-only as far as the stats
collector is concerned, no?

Admittedly it might take a long time for it to write out the data again
for some database that wasn't getting any updates. Possibly it'd be worth
teaching the SIGHUP code segment in the stats collector to check for a
change in the value of stats_temp_directory and schedule write-out for all
databases if that happens.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 21:59:47
Message-ID: 20130819215947.GH9264@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:

> > Here's the second attachment.
>
> This looks good except that it can't tell "db_123.statfoo" isn't a match.
> The scan limit/buffer size needs to be greater than the longest string
> you care about, not only equal to. I think strcmp not strncmp would be
> better coding, too. Please fix that and commit -- I think this part
> is pretty noncontroversial.

Pushed with those fixes, thanks. I noticed we also needed to match
files global.stat and global.tmp. Also I added another conversion to
the sscanf pattern, to ignore files named "db_1234.tmp.foo" and such
(i.e. stuff after whitespace).

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-19 23:41:13
Message-ID: 2548.1376955673@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Pushed with those fixes, thanks. I noticed we also needed to match
> files global.stat and global.tmp. Also I added another conversion to
> the sscanf pattern, to ignore files named "db_1234.tmp.foo" and such
> (i.e. stuff after whitespace).

After reading the sscanf man page, it seemed like this still wasn't
covering all the bases about where sscanf will silently skip whitespace,
so I hacked it a bit more.

regards, tom lane


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-20 03:02:14
Message-ID: CAMkU=1zGOsp=a_tSjKgSqAjOWDe-9JKzUf5MurTySyU+QB39aA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, August 19, 2013, Alvaro Herrera wrote:

> Tom Lane wrote:
> > Alvaro Herrera <alvherre(at)2ndquadrant(dot)com <javascript:;>> writes:
>
> > > Here's the second attachment.
> >
> > This looks good except that it can't tell "db_123.statfoo" isn't a match.
> > The scan limit/buffer size needs to be greater than the longest string
> > you care about, not only equal to. I think strcmp not strncmp would be
> > better coding, too. Please fix that and commit -- I think this part
> > is pretty noncontroversial.
>
> Pushed with those fixes, thanks. I noticed we also needed to match
> files global.stat and global.tmp. Also I added another conversion to
> the sscanf pattern, to ignore files named "db_1234.tmp.foo" and such
> (i.e. stuff after whitespace).
>

Thanks. I did not attack this with malice, but I did throw some casual
stupidity at it (setting the temp directory to the same as the absolute
path of the data directory) and it managed to go through crash recovery
successfully, while HEAD^ completely destroyed itself.

One oddity I observed is that the first time I tested it (by doing kill -9
on the checkpointer) it failed to come back up automatically, claiming the
start up process had been signalled with 9 during recovery. When I
manually restarted, it ran through recovery again and started successfully.
However I could not repeat this and can't see how this patch could
possibly cause a regression of this nature, so I'm going to chalk it up to
some bizarre race condition, or operator error.

Cheers,

Jeff


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-20 04:45:44
Message-ID: 20130820044544.GA6564@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Pushed with those fixes, thanks. I noticed we also needed to match
> > files global.stat and global.tmp. Also I added another conversion to
> > the sscanf pattern, to ignore files named "db_1234.tmp.foo" and such
> > (i.e. stuff after whitespace).
>
> After reading the sscanf man page, it seemed like this still wasn't
> covering all the bases about where sscanf will silently skip whitespace,
> so I hacked it a bit more.

Funnily enough, my own reading of that manpage got me a flawed
understanding of %n -- I interpreted it as returning the number of items
matched, not the number of chars read .. and I was precisely looking for
a way to determine the latter. Thanks.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: danger of stats_temp_directory = /dev/shm
Date: 2013-08-20 23:10:06
Message-ID: 20130820231006.GM6564@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2013-08-19 14:28:28 -0400, Tom Lane wrote:
> >> One possibility is to do the initial check somewhere shortly after
> >> ChangeToDataDir(), and have the GUC check hook only attempt to make a
> >> check in SIGHUP context. Unfortunately we aren't passing the context to
> >> check hooks, only GucSource which isn't adequate for this. Not sure if we
> >> want to go so far as to change the check-hook API at this point. We could
> >> probably think of some other, klugy way to tell if it's initial startup.
>
> > Is it even actually safe to have stats_temp_directory PGC_SIGHUP after
> > the per DB splitup? I haven't audited the code for it, but it seems
> > somewhat likely that we would end up with some files in the old and some
> > in the new directory?
>
> That's a good point. For the moment we could just change it to
> PGC_POSTMASTER and eliminate this problem. Reducing it back to SIGHUP
> would be a future feature, which is evidently less than trivial.

Here's a patchset for this. The first patch is the same as upthread,
with the enum members renamed; the second is the actual pgstats change.

(I considered the idea that checkDirectoryPermissions returned a bitmask
of the failed checks, instead of just returning a code for the first
check that fails. This might be useful if some caller wants to ignore
certain problems. But at the moment I didn't see many other places
where this could be used, so it's probably pointless ATM.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
stats-temp-dir-1.patch text/x-diff 6.4 KB
stats-temp-dir-2.patch text/x-diff 4.9 KB