Re: function side effects

Lists: pgsql-hackers
From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: function side effects
Date: 2010-02-23 03:51:49
Message-ID: 20100223.125149.28784519.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I'm wondering if we could detect a funcion has a side effect,
i.e. does a write to database. This is neccessary for pgpool to decide
if a qeury should to be sent to all of databases or not. If a query
includes functions which do writes to database, it should send the
query to all of databases, otherwise the contents of databases go into
inconsistent state.

Currently we have three properties of functions: IMMUTABLE, STABLE and
VOLATILE. According to docs IMMUTABLE or STABLE functions do not write
to database. VOLATILE functions *may* do writes to database. Maybe I
could regard VOLATILE functions always do write, but priblem is,
VOLATILE qfunctions such as random() and timeofday() apparently do not
write and sending those queries that include such functions is
overkill.

Can we VOLATILE property divide into two categories, say, VOLATILE
without write, and VOLATILE with write?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [SPAM]function side effects
Date: 2010-02-23 04:15:15
Message-ID: 20100223131515.A177.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"Tatsuo Ishii" <ishii(at)postgresql(dot)org> wrote:

> VOLATILE functions such as random() and timeofday() apparently do not
> write and sending those queries that include such functions is
> overkill.

> Can we VOLATILE property divide into two categories, say, VOLATILE
> without write, and VOLATILE with write?

I think it's possible. We might borrow words and semantics from
unctional programming languages for functions with side effects.
How do they handle the issue?

BTW, random() *writes* the random seed, though no one will mind it.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 04:49:42
Message-ID: 9226.1266900582@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> I'm wondering if we could detect a funcion has a side effect,
> i.e. does a write to database.

> Currently we have three properties of functions: IMMUTABLE, STABLE and
> VOLATILE. According to docs IMMUTABLE or STABLE functions do not write
> to database.

Those classifications are meant as planner directives; they are NOT
meant to be bulletproof. Hanging database integrity guarantees on
whether a "non volatile" function changes anything is entirely unsafe.
To give just one illustration of the problems, a nonvolatile function
is allowed to call a volatile one.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 14:44:48
Message-ID: 20100223144448.GB3672@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tatsuo Ishii wrote:
> Hi,
>
> I'm wondering if we could detect a funcion has a side effect,
> i.e. does a write to database. This is neccessary for pgpool to decide
> if a qeury should to be sent to all of databases or not. If a query
> includes functions which do writes to database, it should send the
> query to all of databases, otherwise the contents of databases go into
> inconsistent state.

I was talking about this to someone in Cuba and one conclusion we
reached was that this was a fairly difficult task -- consider that
someone may choose to define an innocent-looking operator using a
volatile function. If you only examine things that look like functions
in the query you will miss those. The only way to figure out whether a
query has a write effect is to ask the server about the whole query.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: alvherre(at)commandprompt(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 15:05:49
Message-ID: 20100224.000549.82836224.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > I'm wondering if we could detect a funcion has a side effect,
> > i.e. does a write to database. This is neccessary for pgpool to decide
> > if a qeury should to be sent to all of databases or not. If a query
> > includes functions which do writes to database, it should send the
> > query to all of databases, otherwise the contents of databases go into
> > inconsistent state.
>
> I was talking about this to someone in Cuba and one conclusion we
> reached was that this was a fairly difficult task -- consider that
> someone may choose to define an innocent-looking operator using a
> volatile function. If you only examine things that look like functions
> in the query you will miss those. The only way to figure out whether a
> query has a write effect is to ask the server about the whole query.

In general you are right. However in most database application
systems, it is possible that all functions are properly designed and
implemented (at least they want so). In this world, more or less
PostgreSQL functions are just a part of their applications. If they
trust their client side applications, why they cannot trust PostgreSQL
custom functions as well?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: function side effects
Date: 2010-02-23 15:40:34
Message-ID: 4B83A292020000250002F503@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Those classifications are meant as planner directives; they are
> NOT meant to be bulletproof. Hanging database integrity
> guarantees on whether a "non volatile" function changes anything
> is entirely unsafe. To give just one illustration of the
> problems, a nonvolatile function is allowed to call a volatile
> one.

Could it work to store a flag in each process to indicate when it is
executing a non-volatile function, and throw an error on any attempt
to call a volatile function or modify the database?

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 16:08:43
Message-ID: 22537.1266941323@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Those classifications are meant as planner directives; they are
>> NOT meant to be bulletproof. Hanging database integrity
>> guarantees on whether a "non volatile" function changes anything
>> is entirely unsafe. To give just one illustration of the
>> problems, a nonvolatile function is allowed to call a volatile
>> one.

> Could it work to store a flag in each process to indicate when it is
> executing a non-volatile function, and throw an error on any attempt
> to call a volatile function or modify the database?

It's *not an error* for a nonvolatile function to call a volatile one.
At least it's never been in the past, and I'm sure you'd break some
applications if you made it so in the future.

regards, tom lane


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: ishii(at)postgresql(dot)org
Cc: alvherre(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 16:29:40
Message-ID: 20100224.012940.70463133.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > I was talking about this to someone in Cuba and one conclusion we
> > reached was that this was a fairly difficult task -- consider that
> > someone may choose to define an innocent-looking operator using a
> > volatile function. If you only examine things that look like functions
> > in the query you will miss those. The only way to figure out whether a
> > query has a write effect is to ask the server about the whole query.
>
> In general you are right. However in most database application
> systems, it is possible that all functions are properly designed and
> implemented (at least they want so). In this world, more or less
> PostgreSQL functions are just a part of their applications. If they
> trust their client side applications, why they cannot trust PostgreSQL
> custom functions as well?

Still there could be "none honest functions" such as calling volatile
functions from non volatile function in the PostgreSQL system(I have
not made any investigation. But it's possible). Or vendor provided
functions (for example embedded in closed source packages) might fall
into this category. Probably it's enough for pgpool to have a "black
list" of such that function. Maintaining such a list is a boring task
but I cannot think of any good way at this point.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 16:39:09
Message-ID: 3073cc9b1002230839p17023cdbs3baacf36f1194dd4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 23, 2010 at 11:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> It's *not an error* for a nonvolatile function to call a volatile one.

it should be considered an error i think, someone think there is a use
cas for calling volatile functions
inside stable ones but i can see what that reason could be...

> At least it's never been in the past, and I'm sure you'd break some
> applications if you made it so in the future.
>

i'm sure of that too, but in this case seems reasonable to do so

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: function side effects
Date: 2010-02-23 16:52:36
Message-ID: 4B83B374020000250002F513@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:

>> throw an error on any attempt to call a volatile function or
>> modify the database?

> It's *not an error* for a nonvolatile function to call a volatile
> one.

Right, we all know it currently doesn't throw an error, but I can't
think of anywhere I'd like to have someone do that in a database for
which I have any responsibility. Does anyone have a sane use case
for a non-volatile function to call a volatile one or to update the
database?

-Kevin


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 18:18:32
Message-ID: 407d949e1002231018r191ad206ye01561f98caf6fc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 23, 2010 at 4:52 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Right, we all know it currently doesn't throw an error, but I can't
> think of anywhere I'd like to have someone do that in a database for
> which I have any responsibility.  Does anyone have a sane use case
> for a non-volatile function to call a volatile one or to update the
> database?

So consider for example a function which explicitly sets the timezone
and then uses timestamp without timezone functions (which are volatile
only because the GUC variable might change between calls).

Or somebody who uses the tsearch functions because they're planning to
not change their dictionaries.

Or builds a hash function by calling random after setting the seed to
a specific value -- this is actually a fairly popular strategy for
building good hash functions.

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <gsstark(at)mit(dot)edu>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: function side effects
Date: 2010-02-23 18:39:04
Message-ID: 4B83CC68020000250002F537@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> wrote:

> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

>> Does anyone have a sane use case for a non-volatile function to
>> call a volatile one or to update the database?
>
> So consider for example a function which explicitly sets the
> timezone and then uses timestamp without timezone functions (which
> are volatile only because the GUC variable might change between
> calls).

OK, I can see where that would be sane, but it seems more fragile
than using timestamp with time zone. But, OK, something sane and
functional could break on that.

> Or somebody who uses the tsearch functions because they're
> planning to not change their dictionaries.

I didn't realize tsearch functions were volatile. Should they
really be so?

> Or builds a hash function by calling random after setting the seed
> to a specific value -- this is actually a fairly popular strategy
> for building good hash functions.

I'd never seen that. I'm not sure I understand where that comes in
useful, but if you've seen it enough to call it "fairly popular" I
guess I have to accept it.

Thanks for the examples. They did make me consider a real-life type
of process which isn't currently implemented as a PostgreSQL
function, but conceivably could be -- randomizing a pool of jurors
to facilitate jury selection. My eyes are opened. :-)

-Kevin


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: function side effects
Date: 2010-02-23 18:50:23
Message-ID: 407d949e1002231050y3c7402d7t3b4aebb638058975@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 23, 2010 at 6:39 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> Or somebody who uses the tsearch functions because they're
>> planning to not change their dictionaries.
>
> I didn't realize tsearch functions were volatile.  Should they
> really be so?

Uhm, my mistake. They're stable. Ok, for that one I'll substitute a
function which uses pg_read_file knowing that the file in question
won't be changed. Perhaps it's a per-machine key or something like
that.

>> Or builds a hash function by calling random after setting the seed
>> to a specific value -- this is actually a fairly popular strategy
>> for building good hash functions.
>
> I'd never seen that.  I'm not sure I understand where that comes in
> useful, but if you've seen it enough to call it "fairly popular" I
> guess I have to accept it.

http://en.wikipedia.org/wiki/Universal_hashing

They have the useful property that it's hard for an attacker to
contrive data which has poor collision behaviour.

> Thanks for the examples.  They did make me consider a real-life type
> of process which isn't currently implemented as a PostgreSQL
> function, but conceivably could be -- randomizing a pool of jurors
> to facilitate jury selection.  My eyes are opened.  :-)

I'm not actually sure I follow what you're picturing.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 19:02:58
Message-ID: 424.1266951778@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Tue, Feb 23, 2010 at 6:39 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> I didn't realize tsearch functions were volatile. Should they
>> really be so?

> Uhm, my mistake. They're stable.

IMMUTABLE/STABLE/VOLATILE is not really about side effects, it is about
how long the function value can be expected to hold still for.

There are quite a lot of cases of functions that are marked
conservatively as stable (or even volatile) but could be considered
immutable in particular queries, because the application developer is
prepared to assume that values such as GUCs won't change in his usage.
The traditional way to deal with that is to wrap them in an immutable
wrapper function. There's actually code in the planner to make that
work --- we have to suppress inlining to avoid exposing the not-immutable
guts, else the planner will not do what's wanted.

There may be some value in inventing a "has no side effects" marker, but
that should not be confused with IMMUTABLE/STABLE.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 20:01:15
Message-ID: 603c8f071002231201n1213d648t11ce1cda7d61eb3e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 23, 2010 at 2:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> There may be some value in inventing a "has no side effects" marker, but
> that should not be confused with IMMUTABLE/STABLE.

Yeah, that's what I was thinking, too....

...Robert


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <gsstark(at)mit(dot)edu>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: function side effects
Date: 2010-02-23 20:01:47
Message-ID: 4B83DFCB020000250002F554@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> wrote:
> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

>> Thanks for the examples. They did make me consider a real-life
>> type of process which isn't currently implemented as a PostgreSQL
>> function, but conceivably could be -- randomizing a pool of
>> jurors to facilitate jury selection. My eyes are opened. :-)
>
> I'm not actually sure I follow what you're picturing.

Well, to facilitate people's rights to a jury of their peers, we
obtain lists of people in each county based on having a drivers
license or state ID, being registered to vote, etc., then (after
eliminating duplicates and those who have served on juries in recent
years) we randomly select a subset, who get questionnaires, from
which (at a later date) we randomly pick people to summon for jury a
juror panel, from which (on each day they appear) we randomly select
people for particular juries.

Any flaw in the randomness of selection could constitute grounds for
an appeal of the outcome of a case, so we have to be careful about
process. (Randomness being defined as the properties that nobody
with an interest in the case can control or predict who will be
selected from one group into the next, and there is no bias on
anything related to demographics, like age or last name [which could
correlate with ethnicity]). Sounds like fun, eh?

-Kevin


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 21:54:12
Message-ID: 3073cc9b1002231354u368f6dfdge43aec711fefc98b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 23, 2010 at 2:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> There may be some value in inventing a "has no side effects" marker, but
> that should not be confused with IMMUTABLE/STABLE.
>

a READONLY function?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 22:21:03
Message-ID: 1266963663.3752.4633.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2010-02-22 at 23:49 -0500, Tom Lane wrote:
> Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> > I'm wondering if we could detect a funcion has a side effect,
> > i.e. does a write to database.
>
> > Currently we have three properties of functions: IMMUTABLE, STABLE and
> > VOLATILE. According to docs IMMUTABLE or STABLE functions do not write
> > to database.
>
> Those classifications are meant as planner directives; they are NOT
> meant to be bulletproof.

You make them sound like "hints". (I thought we frowned on those?)

That isn't true, they don't just change the optimal plan in the way the
enable_* parameters do. Immutable functions are reduced in ways that
would give the wrong answer if the function is actually volatile.
Referring to function properties as "planner directives" hides their
critical importance to the output of a query that calls such functions.

> Hanging database integrity guarantees on
> whether a "non volatile" function changes anything is entirely unsafe.
> To give just one illustration of the problems, a nonvolatile function
> is allowed to call a volatile one.

So wrongly marking a function as something other than volatile *is* a
data integrity issue. Why is that OK? ISTM that this should work the way
Tatsuo wants it to work. Immutability should be passed down through the
call stack to ensure we can't get this wrong.

If people have been advising clients to set things immutable when they
are not that seems fairly questionable. We shouldn't avoid fixing an
integrity loophole just simply to preserve a planner backdoor,
especially since other backdoors are specifically avoided.

--
Simon Riggs www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 22:35:41
Message-ID: 1266964541.3752.4675.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2010-02-23 at 12:51 +0900, Tatsuo Ishii wrote:

> I'm wondering if we could detect a funcion has a side effect,
> i.e. does a write to database. This is neccessary for pgpool to decide
> if a qeury should to be sent to all of databases or not. If a query
> includes functions which do writes to database, it should send the
> query to all of databases, otherwise the contents of databases go into
> inconsistent state.
>
> Currently we have three properties of functions: IMMUTABLE, STABLE and
> VOLATILE. According to docs IMMUTABLE or STABLE functions do not write
> to database. VOLATILE functions *may* do writes to database. Maybe I
> could regard VOLATILE functions always do write, but priblem is,
> VOLATILE qfunctions such as random() and timeofday() apparently do not
> write and sending those queries that include such functions is
> overkill.
>
> Can we VOLATILE property divide into two categories, say, VOLATILE
> without write, and VOLATILE with write?

pgpool parses the query before deciding how to route it, yes?

Why not mark random() and timeofday() as stable in the pgpool catalog,
yet leave them as volatile on the database servers? It will "just work"
then.

--
Simon Riggs www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-23 23:04:55
Message-ID: 18382.1266966295@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> So wrongly marking a function as something other than volatile *is* a
> data integrity issue. Why is that OK? ISTM that this should work the way
> Tatsuo wants it to work.

Please read the rest of the thread.

regards, tom lane


From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
To: simon(at)2ndQuadrant(dot)com
Cc: ishii(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-24 00:56:29
Message-ID: 20100224.095629.00003185.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > I'm wondering if we could detect a funcion has a side effect,
> > i.e. does a write to database. This is neccessary for pgpool to decide
> > if a qeury should to be sent to all of databases or not. If a query
> > includes functions which do writes to database, it should send the
> > query to all of databases, otherwise the contents of databases go into
> > inconsistent state.
> >
> > Currently we have three properties of functions: IMMUTABLE, STABLE and
> > VOLATILE. According to docs IMMUTABLE or STABLE functions do not write
> > to database. VOLATILE functions *may* do writes to database. Maybe I
> > could regard VOLATILE functions always do write, but priblem is,
> > VOLATILE qfunctions such as random() and timeofday() apparently do not
> > write and sending those queries that include such functions is
> > overkill.
> >
> > Can we VOLATILE property divide into two categories, say, VOLATILE
> > without write, and VOLATILE with write?
>
> pgpool parses the query before deciding how to route it, yes?

Right.

> Why not mark random() and timeofday() as stable in the pgpool catalog,
> yet leave them as volatile on the database servers? It will "just work"
> then.

Please note that random() and timeofday() are just examples. What I'm
thinking about was, a function which directly or indirectly cause
write to database (thus lead writing to log).

Consider a function that calls those has-side-effect functions. We
need a property which is inherited to child function to parent
function.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: gsstark(at)mit(dot)edu, Kevin(dot)Grittner(at)wicourts(dot)gov, ishii(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-24 01:00:33
Message-ID: 20100224.100033.01018847.t-ishii@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> IMMUTABLE/STABLE/VOLATILE is not really about side effects, it is about
> how long the function value can be expected to hold still for.
>
> There are quite a lot of cases of functions that are marked
> conservatively as stable (or even volatile) but could be considered
> immutable in particular queries, because the application developer is
> prepared to assume that values such as GUCs won't change in his usage.
> The traditional way to deal with that is to wrap them in an immutable
> wrapper function. There's actually code in the planner to make that
> work --- we have to suppress inlining to avoid exposing the not-immutable
> guts, else the planner will not do what's wanted.

"IMMUTABLE indicates that the function cannot modify the database..."

"STABLE indicates that the function cannot modify the database..."

Apparently IMMUTABLE/STABLE should not write to database according to
docs. Are you saying that in the real world these are ignored? If so,
this is an important database intergrity issue as Simon pointed out.

> It's *not an error* for a nonvolatile function to call a volatile one.
> At least it's never been in the past, and I'm sure you'd break some
> applications if you made it so in the future.

If this is true, at least the docs should be corrected IMO.

> There may be some value in inventing a "has no side effects" marker, but
> that should not be confused with IMMUTABLE/STABLE.

It seems it is neccessary to invent new marker for not only pgpool but
HOT/SR (and may be Slony). They need to know if a query including
functions do writes or not *before* sending to backend. Otherwise they
got error because they sent a write query to standby.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: gsstark(at)mit(dot)edu, Kevin(dot)Grittner(at)wicourts(dot)gov, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-24 03:18:42
Message-ID: 23236.1266981522@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> Apparently IMMUTABLE/STABLE should not write to database according to
> docs. Are you saying that in the real world these are ignored? If so,
> this is an important database intergrity issue as Simon pointed out.

One more time: these markings are not about whether the function writes
to the database. They are about whether its result value can be
presumed to be unchanging in various circumstances. Trying to redefine
them for another purpose is going to lead to nothing but trouble.

And no, there is not an "integrity issue" here. If the planner thinks
something is stable or immutable, it might evaluate it fewer times than
the user would wish, but that doesn't render the database inconsistent.
It just means the user doesn't get the behavior he wanted. That's no
different from any other erroneously-written query.

> It seems it is neccessary to invent new marker for not only pgpool but
> HOT/SR (and may be Slony). They need to know if a query including
> functions do writes or not *before* sending to backend. Otherwise they
> got error because they sent a write query to standby.

Well, that's something we can consider adding in 9.1, but it's far too
late for 9.0. Personally I find that goal rather suspect anyway.
I think the chances of determining this reliably in pgpool are
negligible, even if functions were marked like that. You would need to
duplicate *all* of the backend's parsing and all of its state (eg schema
search path) in order to discover anything.

regards, tom lane


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, gsstark(at)mit(dot)edu, Kevin(dot)Grittner(at)wicourts(dot)gov, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-02-24 03:46:13
Message-ID: 3073cc9b1002231946l7d2f2861r3760a4412043c04@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 23, 2010 at 10:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Personally I find that goal rather suspect anyway.
> I think the chances of determining this reliably in pgpool are
> negligible, even if functions were marked like that.  You would need to
> duplicate *all* of the backend's parsing and all of its state (eg schema
> search path) in order to discover anything.
>

i agree with that, as Alvaro suggested maybe a way to ask the server
about the whole query is the way to go

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-01 19:56:30
Message-ID: 1267473390.7837.9.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tis, 2010-02-23 at 16:54 -0500, Jaime Casanova wrote:
> On Tue, Feb 23, 2010 at 2:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > There may be some value in inventing a "has no side effects" marker, but
> > that should not be confused with IMMUTABLE/STABLE.
> >
>
> a READONLY function?

SQL standard:

<SQL-data access indication> ::=
NO SQL
| CONTAINS SQL
| READS SQL DATA
| MODIFIES SQL DATA

Notice also that this is separate from

<deterministic characteristic> ::=
DETERMINISTIC
| NOT DETERMINISTIC

which is the SQL standard's variant of volatility.

So someone has already had the idea that these two should exist
separately.


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-01 20:30:20
Message-ID: 3073cc9b1003011230m13ba722ao2df305ba65b67dce@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 1, 2010 at 2:56 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On tis, 2010-02-23 at 16:54 -0500, Jaime Casanova wrote:
>> On Tue, Feb 23, 2010 at 2:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> >
>> > There may be some value in inventing a "has no side effects" marker, but
>> > that should not be confused with IMMUTABLE/STABLE.
>> >
>>
>> a READONLY function?
>
> SQL standard:
>
> <SQL-data access indication> ::=
> NO SQL
> | CONTAINS SQL
> | READS SQL DATA
> | MODIFIES SQL DATA
>

good!

> Notice also that this is separate from
>
> <deterministic characteristic> ::=
> DETERMINISTIC
> | NOT DETERMINISTIC
>

so IMMUTABLE = DETERMINISTIC NO SQL,
STABLE = DETERMINISTIC READS SQL DATA
VOLATILE = NOT DETERMINISTIC MODIFIES SQL DATA

> which is the SQL standard's variant of volatility.
>
> So someone has already had the idea that these two should exist
> separately.
>

seems something we should implement

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-01 21:29:56
Message-ID: 24513.1267478996@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> SQL standard:

> <SQL-data access indication> ::=
> NO SQL
> | CONTAINS SQL
> | READS SQL DATA
> | MODIFIES SQL DATA

Huh. I understand three of those, but what is the use of CONTAINS SQL?
Seems like that would have to be the same as the last one, or maybe
the next-to-last one if you're prepared to assume it's read-only SQL.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-01 21:40:40
Message-ID: 4B8C3458.80701@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jaime Casanova írta:
> On Mon, Mar 1, 2010 at 2:56 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
>> On tis, 2010-02-23 at 16:54 -0500, Jaime Casanova wrote:
>>
>>> On Tue, Feb 23, 2010 at 2:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>
>>>> There may be some value in inventing a "has no side effects" marker, but
>>>> that should not be confused with IMMUTABLE/STABLE.
>>>>
>>>>
>>> a READONLY function?
>>>
>> SQL standard:
>>
>> <SQL-data access indication> ::=
>> NO SQL
>> | CONTAINS SQL
>> | READS SQL DATA
>> | MODIFIES SQL DATA
>>
>>
>
> good!
>
>
>> Notice also that this is separate from
>>
>> <deterministic characteristic> ::=
>> DETERMINISTIC
>> | NOT DETERMINISTIC
>>
>>
>
> so IMMUTABLE = DETERMINISTIC NO SQL,
> STABLE = DETERMINISTIC READS SQL DATA
> VOLATILE = NOT DETERMINISTIC MODIFIES SQL DATA
>
>
>> which is the SQL standard's variant of volatility.
>>
>> So someone has already had the idea that these two should exist
>> separately.
>>
>>
>
> seems something we should implement
>

At least the combinations to recognize the current
IMMUTABLE/STABLE/VOLATILE features.
By definition, READS SQL DATA and MODIFIES SQL DATA
cannot be DETERMINISTIC. But I can imagine some C and
PL/Perl functions that are NOT DETERMINISTIC NO SQL.

And what does "CONTAINS SQL" mean? Is it distinct from
the other two READS/MODIFIES SQL DATA markers?
"SELECT CURRENT_TIMESTAMP" may be an example
but it doesn't seem to be significantly different from
$$SELECT $1 || $2;$$ LANGUAGE SQL or the same
written in PL/Perl or C.

Best regards,
Zoltán Böszörményi

--
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-01 22:36:08
Message-ID: 3073cc9b1003011436v73a94af2ve7903d38560c02be@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 1, 2010 at 4:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> SQL standard:
>
>> <SQL-data access indication> ::=
>> NO SQL
>> | CONTAINS SQL
>> | READS SQL DATA
>> | MODIFIES SQL DATA
>
> Huh.  I understand three of those, but what is the use of CONTAINS SQL?
> Seems like that would have to be the same as the last one
>

i guess the safer asumption is: treat it as MODIFIES SQL DATA

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>, "Jaime Casanova" <jcasanov(at)systemguards(dot)com(dot)ec>
Subject: Re: function side effects
Date: 2010-03-01 23:09:56
Message-ID: 4B8BF4E4020000250002F769@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> SQL standard:
>
>> <SQL-data access indication> ::=
>> NO SQL
>> | CONTAINS SQL
>> | READS SQL DATA
>> | MODIFIES SQL DATA
>
> Huh. I understand three of those, but what is the use of CONTAINS
> SQL? Seems like that would have to be the same as the last one,
> or maybe the next-to-last one if you're prepared to assume it's
> read-only SQL.

On a quick search of the spec, the best I was able to tell was that
you are required to use "CONTAINS SQL" if the language is SQL.
Perhaps it figures that the database engine can determine the
read/write behavior directly if the language is SQL, and you tell it
what it does if you're coding in some other language.

-Kevin


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-02 10:26:20
Message-ID: 1267525580.20952.5.camel@fsopti579.F-Secure.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On mån, 2010-03-01 at 15:30 -0500, Jaime Casanova wrote:
> so IMMUTABLE = DETERMINISTIC NO SQL,
> STABLE = DETERMINISTIC READS SQL DATA
> VOLATILE = NOT DETERMINISTIC MODIFIES SQL DATA

It might be tempting to create such a mapping, but there could be a
number of pitfalls, especially if you define it as a commutative
equivalence rather than say logical implications. For example, MODIFIES
SQL DATA ought to imply VOLATILE, but the reverse is not true.

When the volatility attribute was introduced, we briefly looked at the
standard "deterministic" attribute, but concluded that it would be
better to create settings that describe how the PostgreSQL
planner/executor works instead of some abstract setting that is
descriptive but doesn't actually help optimizing the query.

We might actually end up with all three groups of settings at some
point.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Greg Stark <gsstark(at)mit(dot)edu>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: function side effects
Date: 2010-03-02 10:28:26
Message-ID: 1267525706.20952.7.camel@fsopti579.F-Secure.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On mån, 2010-03-01 at 16:29 -0500, Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > SQL standard:
>
> > <SQL-data access indication> ::=
> > NO SQL
> > | CONTAINS SQL
> > | READS SQL DATA
> > | MODIFIES SQL DATA
>
> Huh. I understand three of those, but what is the use of CONTAINS SQL?

My reading is that CONTAINS SQL allows/indicates the use of non-data
statements such as CREATE or ALTER, whereas READS SQL DATA and MODIFIES
SQL DATA specifically refer to reading or writing table data.