Re: Repeatable read and serializable transactions see data committed after tx start

From: Álvaro Hernández Tortosa <aht(at)8Kdata(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Repeatable read and serializable transactions see data committed after tx start
Date: 2014-11-05 00:11:40
Message-ID: 54596B3C.5010107@8Kdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 04/11/14 09:07, Craig Ringer wrote:
> On 11/04/2014 07:31 AM, Álvaro Hernández Tortosa wrote:
>> Thank you for your comment, Tom. However I think this behavior, as
>> seen from a user perspective, it's not the expected one.
> That may be the case, but I think it's the SQL-standard behaviour, so we
> can't really mess with it.
>
> The spec requires SET TRANSACTION ISOLATION, and you can't implement
> that if you take a snapshot at BEGIN.

It's true that the standard mandates SET TRANSACTION rather than
setting the isolation level with the BEGIN statement, and in any case
you can raise/lower the isolation level with SET regardless of what the
session or the begin command said. However, is it really a problem
taking a snapshot at BEGIN time --only if the tx is started with BEGIN
... (REPEATABLE READ | SERIALIZABLE)? AFAIK, and I may be missing some
internal details here, the worst that can happen is that you took one
extra, unnecessary snapshot. I don't see that as a huge problem.

The standard (92) says that transaction is initiated when a
transaction-initiating SQL-statement is executed. To be fair, that
sounds to me more of a "SELECT" rather than a "BEGIN", but I may be wrong.
>
>> If it is still the intended behavior, I think it should be clearly
>> documented as such, and a recommendation similar to "issue a 'SELECT 1'
>> right after BEGIN to freeze the data before any own query" or similar
>> comment should be added. Again, as I said in my email, the documentation
>> clearly says that "only sees data committed before the transaction
>> began". And this is clearly not the real behavior.
> It's more of a difference in when the transaction "begins".
>
> Arguably, "BEGIN" says "I intend to begin a new transaction with the
> next query" rather than "immediately begin executing a new transaction".
>
> This concept could be clearer in the docs.

If this is really how it should behave, I'd +1000 to make it
clearer in the docs, and to explicitly suggest the user to perform a
query discarding the results early after BEGIN if the user wants the
state freezed if there may span time between BEGIN and the real queries
to be executed (like doing a SELECT 1).

>
>> Sure, there are, that was the link I pointed out, but I found no
>> explicit mention to the fact that I'm raising here.
> I'm sure it's documented *somewhere*, in that I remember reading about
> this detail in the docs, but I can't find _where_ in the docs.
>
> It doesn't seem to be in:
>
> http://www.postgresql.org/docs/current/static/transaction-iso.html
>
> where I'd expect.

Yepp, there's no mention there.

>
> In any case, we simply cannot take the snapshot at BEGIN time, because
> it's permitted to:
>
> SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
>
> in a DB that has default serializable isolation or has a SET SESSION
> CHARACTERISTICS isolation mode of serializable. Note that SET
> TRANSACTION is SQL-standard.

As I said, AFAIK it shouldn't matter a lot to take the snapshot at
BEGIN. The worst that can happen is that you end up in read committed
and you need to take more snapshots, one per query.

>
> AFAIK deferring the snapshot that's consistent with other RDBMSes that
> use snapshots, too.

I tried Oracle and SQL Server. SQL Server seems to behave as
PostgreSQL, but just because it locks the table if accessed in a
serializable transaction, so it definitely waits until select to lock
it. However, Oracle behaved as I expected: data is frozen at BEGIN time.
I haven't tested others.

>
>
> The docs of that command allude to, but doesn't explicitly state, the
> behaviour you mention.
>
> http://www.postgresql.org/docs/current/static/sql-set-transaction.html
>
>

Should we improve then the docs stating this more clearly? Any
objection to do this?

Regards,

Álvaro

--
Álvaro Hernández Tortosa

-----------
8Kdata

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-11-05 01:13:47 Re: tracking commit timestamps
Previous Message Marko Tiikkaja 2014-11-05 00:04:28 Re: to_char_at_timezone()?