Re: DETERMINISTIC as synonym for IMMUTABLE

Lists: pgsql-hackers
From: Troels Arvin <troels(at)arvin(dot)dk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-17 21:56:21
Message-ID: pan.2004.10.17.21.56.20.537301@arvin.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

While reviewing PostgreSQL 8's SQL:2003 conformance (see the documentation
mailing list), I've come across a - possibly - easy-to-implement
conformance improvement for CREATE FUNCTION:

SQL:2003 uses the word DETERMINISTIC where PostgreSQL uses IMMUTABLE.
And SQL:2003 uses "NOT DETERMINISTIC" where PostgreSQL uses STABLE
or VOLATILE.

I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that
"NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's
grammar for CREATE FUNCTION.

I might try creating a patch, but I'd rather spend my time finishing the
conformance review.

--
Greetings from Troels Arvin, Copenhagen, Denmark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Troels Arvin <troels(at)arvin(dot)dk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-17 22:08:00
Message-ID: 13380.1098050880@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Troels Arvin <troels(at)arvin(dot)dk> writes:
> I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that
> "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's
> grammar for CREATE FUNCTION.

These do NOT mean the same thing.

regards, tom lane


From: Troels Arvin <troels(at)arvin(dot)dk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-17 22:24:09
Message-ID: pan.2004.10.17.22.24.09.379121@arvin.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 17 Oct 2004 18:08:00 -0400, Tom Lane wrote:

>> I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that
>> "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's
>> grammar for CREATE FUNCTION.
>
> These do NOT mean the same thing.

I'm having a hard time seeing the difference between DETERMINISTIC and
IMMUTABLE.

My suggestion for "NOT DETERMINISTIC"==VOLATILE is because VOLATILE seems
to be the least strict of the three PostgreSQL volatility categories.

Do you disagree on both, or just the last one?

--
Greetings from Troels Arvin, Copenhagen, Denmark


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Troels Arvin <troels(at)arvin(dot)dk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-17 23:36:16
Message-ID: 14457.1098056176@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Troels Arvin <troels(at)arvin(dot)dk> writes:
> On Sun, 17 Oct 2004 18:08:00 -0400, Tom Lane wrote:
>> These do NOT mean the same thing.

> I'm having a hard time seeing the difference between DETERMINISTIC and
> IMMUTABLE.

Well, the spec is somewhat self-contradictory on the point, but I think
their intention is to model it after their notion of a deterministic
query:

A <query expression> or <query specification> is possibly non-
deterministic if an SQL-implementation might, at two different
times where the state of the SQL-data is the same, produce results
that differ by more than the order of the rows due to General Rules
that specify implementation-dependent behavior. [SQL99 4.17]

Notice that it is okay for a deterministic query to produce different
results when the content of the database changes; therefore this is not
IMMUTABLE in our terms. It is however stronger than our STABLE
condition (for example, "now()" is STABLE but is not deterministic per
the above definition). It appears to me that they are thinking of
functions like

SELECT value FROM table WHERE pkey = $1

which is deterministic per their definition and also according to (what
I think is) the common meaning of "deterministic". We could label this
function as STABLE, but not IMMUTABLE; however we have no category that
captures the notion that "it can't change as long as the database
content doesn't change".

What it actually says about deterministic functions in 4.23 is:

An SQL-invoked routine is either deterministic or possibly non-
deterministic. An SQL-invoked function that is deterministic always
returns the same return value for a given list of SQL argument
values. An SQL-invoked procedure that is deterministic always
returns the same values in its output and inout SQL parameters
for a given list of SQL argument values. An SQL-invoked routine
is possibly non-deterministic if, during invocation of that SQL-
invoked routine, an SQL-implementation might, at two different
times when the state of the SQL-data is the same, produce unequal
results due to General Rules that specify implementation-dependent
behavior.

This is clearly bogus as written since it claims that there are only two
possibilities when there are more than two. Any ordinary function that
selects from the database will satisfy neither their "deterministic" nor
their "possibly non-deterministic" definitions.

I think that they meant to define SQL functions as nondeterministic if
they act like or contain nondeterministic queries; for instance 13.5
says

3) An <SQL procedure statement> S is possibly non-deterministic if
and only if at least one of the following is satisfied:

a) S is a <select statement: single row> that is possibly non-
deterministic.

b) S contains a <routine invocation> whose subject routine is an
SQL-invoked routine that possibly modifies SQL-data.

c) S generally contains a <query specification> or a <query
expression> that is possibly non-deterministic.

d) S generally contains a <datetime value function>, CURRENT_
USER, CURRENT_ROLE, SESSION_USER, or SYSTEM_USER.

Anybody know whether the SQL2003 text clarifies the intent at all?

In any case, whether or not you think DETERMINISTIC means IMMUTABLE,
I don't think it's very helpful to identify NOT DETERMINISTIC with
VOLATILE. As a counterexample, now() is NOT DETERMINISTIC, but it
isn't VOLATILE.

regards, tom lane


From: Troels Arvin <troels(at)arvin(dot)dk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-18 09:52:01
Message-ID: pan.2004.10.18.09.52.00.650903@arvin.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 17 Oct 2004 19:36:16 -0400, Tom Lane wrote:

> Well, the spec is somewhat self-contradictory on the point, but I think
> their intention is to model it after their notion of a deterministic
> query:
>
> A <query expression> or <query specification> is possibly non-
> deterministic if an SQL-implementation might, at two different
> times where the state of the SQL-data is the same, produce results
> that differ by more than the order of the rows due to General Rules
> that specify implementation-dependent behavior. [SQL99 4.17]

This section has been removed in SQL:2003. Instead, a new section 4.16
("Determinism") has been added. The first paragraph of the new section
states:

In general, an operation is deterministic if that operation assuredly
computes identical results when repeated with identical input values.
For an SQL-invoked routine, the values in the argument list are
regarded as the input; otherwise, the SQL-data and the set of
privileges by which they are accessed is regarded as the input.

In my reading of the new section, there is nothing which indicates that
determinism is related to whether stored data are changed or not.

> What it actually says about deterministic functions in 4.23 is:
[...]
> An SQL-invoked routine
> is possibly non-deterministic if, during invocation of that SQL-
> invoked routine, an SQL-implementation might, at two different
> times when the state of the SQL-data is the same, produce unequal
> results
[...]

This paragraph has also been altered in SQL:2003. In SQL:2003's section
4.27.2, there is still a section on deterministic vs. possibly
non-deterministic routines; it doesn't say anything about the "state of
the SQL-data" any more. It says:

An SQL-invoked routine is either deterministic or possibly
non-deterministic. An SQL-invoked function that is deterministic
always returns the identical return value for a given list of SQL
argument values.
[... cut stuff about sql-invoked procedures which PostgreSQL doesn't
support yet...]
An SQL-invoked routine is possibly non-deterministic if it might
produce nonidentical results when invoked with the identical list of SQL
argument values.

> I think that they meant to define SQL functions as nondeterministic if
> they act like or contain nondeterministic queries; for instance 13.5
> says
>
> 3) An <SQL procedure statement> S is possibly non-deterministic if
> and only if at least one of the following is satisfied:
>
> a) S is a <select statement: single row> that is possibly non-
> deterministic.
>
> b) S contains a <routine invocation> whose subject routine is an
> SQL-invoked routine that possibly modifies SQL-data.
[...]

This has also been changed in SQL:2003:

4) [<SQL procedure statement>] is possibly non-deterministic if and only
if S is not an <SQL schema statement> and at least one of the following is
satisfied:
a) S is a <select statement: single row> that is possibly
non-deterministic.
b) S contains a <routine invocation> whose subject routine is an
SQL-invoked routine that is possibly non-deterministic.
c) S generally contains a <query specification> or a
<query expression> that is possibly non-deterministic.
d) S generally contains a <value expression> that is possibly
non-deterministic.

> Anybody know whether the SQL2003 text clarifies the intent at all?

I think it's fair to say that SQL:2003 is more clear on this subject, and
that PostgreSQL's IMMUTABLE is equivalent to SQL:2003's DETERMINISTIC. The
remaining problem now becomes if NOT DETERMINISTIC could be introduced as
an alias to VOLATILE:

> I don't think it's very helpful to identify NOT DETERMINISTIC with
> VOLATILE. As a counterexample, now() is NOT DETERMINISTIC, but it
> isn't VOLATILE.

But does this have any semantic significance? - I mean: It's still safe to
call a function including now() NOT DETERMINISTIC==VOLATILE; no unexpected
results should result from this, except - potentially - lower performance.
I think it's a common phenomenon that performance can sometimes be
increased by utilizing certain product-specific expressions in stead of
the standards-defined ones.

--
Greetings from Troels Arvin, Copenhagen, Denmark


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Troels Arvin" <troels(at)arvin(dot)dk>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-19 21:57:45
Message-ID: 013801c4b626$aae1be30$6400a8c0@Nightingale
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>Tom Lane wrote
> In any case, whether or not you think DETERMINISTIC means IMMUTABLE,

Tom, Your knowledge of the confusing bits of the standard puts us all to
shame.

Troels did have a point, which was to do with standards conformance and
compatibility. The main point at issue is whether someone can run some ANSI
compliant SQL against PostgreSQL and have it work. That's a worthy goal.

AFAICS, your info shows that the standard's definition of DETERMINISTIC is
confusing and contradictory. Most people's interpretation would be that
DETERMINISTIC was the same as IMMUTABLE, so we should make the former a
synonym for the latter and document the possible difference of
interpretation. Seriously, if you can't put a blade of grass between them
then they're OK to be equated.

My understanding is that DETERMINISTIC in Oracle would work the same as
IMMUTABLE in PostgreSQL...

> I don't think it's very helpful to identify NOT DETERMINISTIC with
> VOLATILE. As a counterexample, now() is NOT DETERMINISTIC, but it
> isn't VOLATILE.
>

You're spot on again with your info. NOT DETERMINISTIC means either STABLE
or VOLATILE in PostgreSQL terms, not just one of those.

IMHO we should allow the use of NOT DETERMINISTIC and document that although
it doesn't mean the same thing as VOLATILE, we should infer that meaning
because that is the mapping that is always correct. If the user wishes to
gain the possible performance advantages offered by STABLE, then they can
alter their code to do so. We're allowed to have performance enhancing
additions to the standard.

This is a similar situation to PostgreSQL's implementation of transaction
isolation levels. The implementation is both implemented according to the
standard and transactionally correct, yet READ UNCOMMITTED doesn't work
*exactly* as the standard says that level should, yet this is all clearly
documented and we are happy with that.

The standard ain't perfect, but we should get as close as possible and
document the difference - as long as there's no loss of correctness, which I
don't think is at issue here.

I'll submit a patch unless there is substantial disagreement.

Best Regards,

Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Troels Arvin" <troels(at)arvin(dot)dk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DETERMINISTIC as synonym for IMMUTABLE
Date: 2004-10-19 22:31:04
Message-ID: 26281.1098225064@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> AFAICS, your info shows that the standard's definition of DETERMINISTIC is
> confusing and contradictory. Most people's interpretation would be that
> DETERMINISTIC was the same as IMMUTABLE, so we should make the former a
> synonym for the latter and document the possible difference of
> interpretation. Seriously, if you can't put a blade of grass between them
> then they're OK to be equated.

The problem is that you *can* put a blade of grass between them, and
sooner or later we may wish to make the distinction. In particular
I can think of optimization possibilities that rely on being able to
identify a function as being a pure function of the database state.
(This would mostly come into play if we ever try to support function
result caching; we don't now, but certainly it's been asked for often
enough.) If we equate DETERMINISTIC with IMMUTABLE then we won't be
able to use DETERMINISTIC to describe cache-able functions.

Troels's later followup says that SQL2003 has redefined DETERMINISTIC
to in fact mean IMMUTABLE; if so I suppose we'll have to go with the
flow. I have not looked closely at that version to see if I agree with
his reading.

regards, tom lane