portable DBAPI auto-increment

Lists: psycopg
From: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
To: psycopg(at)postgresql(dot)org
Subject: portable DBAPI auto-increment
Date: 2011-04-07 20:40:29
Message-ID: 4D9E213D.7070804@stsci.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

Hi,

I have a portability issue that I'm not sure what to do with: I need an
automatically unique column, but I don't see a portable way to ask for
it in my code.

For my example, I'm going to use column k as the key and v as the value,
though in real life there are another 10 or so columns.

In sqlite and mysql, I can do
c.execute("insert into foo ( v ) values ( 'xyz' )")
k = c.lastrowid

In psycopg2, I find that lastrowid is always 0, even though it is making
rows in the table with k=1, k=2, etc.

I did some searching and found that the value of lastrowid is not taken
from the autoincrementing column, but is some non-unique entity called
an OID. The recommended workaround seems to be
c.execute("insert into foo ( v ) values ( 'xyz' ) returning k")

To do this, I would need to make a special case for inserting into
tables that have an autoincrement key. For example, I might use
k = my_db_hack.insert_with_autoinc(c, " insert into foo ( v )
values ('xyz')")

where insert_with_autoinc would recognize the postgres db interface and
add the "returning" and read the results out of the cursor. It seems
kind of klunky, though.

So, my question is: Is there a more portable way to do this that I'm
not thinking of?

Secondarily, I see the the postgres documentation recommends not using
the OID in your own tables. What use is the OID? Why doesn't
c.lastrowid to return the value of the serial column, or None if there
isn't one?

Mark S.


From: Federico Di Gregorio <federico(dot)digregorio(at)dndg(dot)it>
To: psycopg(at)postgresql(dot)org
Cc: sienkiew(at)stsci(dot)edu
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 07:51:20
Message-ID: 4D9EBE78.3080005@dndg.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On 07/04/11 22:40, Mark Sienkiewicz wrote:
> Hi,
>
> I have a portability issue that I'm not sure what to do with: I need an
> automatically unique column, but I don't see a portable way to ask for
> it in my code.
>
> For my example, I'm going to use column k as the key and v as the value,
> though in real life there are another 10 or so columns.
>
> In sqlite and mysql, I can do
> c.execute("insert into foo ( v ) values ( 'xyz' )")
> k = c.lastrowid

Declare k as "serial" and then do the following:

c.execute("INSERT INTO foo (v) VALUES ('xyz') RETURNING k")
k = c.fetchone()[0]

Hope this helps,
federico

--
Federico Di Gregorio federico(dot)digregorio(at)dndg(dot)it
Studio Associato Di Nunzio e Di Gregorio http://dndg.it
We should forget about small efficiencies, say about 97% of the
time: premature optimization is the root of all evil. -- D.E.Knuth


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 09:52:13
Message-ID: 20110408095213.GB2465@hermes.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On Thu, Apr 07, 2011 at 04:40:29PM -0400, Mark Sienkiewicz wrote:

> I have a portability issue that I'm not sure what to do with: I need
> an automatically unique column, but I don't see a portable way to ask
> for it in my code.

SELECT automatically_unique_column FROM table_in_question

> For my example, I'm going to use column k as the key and v as the
> value, though in real life there are another 10 or so columns.
>
> In sqlite and mysql, I can do
> c.execute("insert into foo ( v ) values ( 'xyz' )")
> k = c.lastrowid

.lastrowid is an implementation detail depending on the
database to keep a unique column on your rows behind your back.

PostgreSQL used to do that (the row OID) but that's been
deprecated because it serves no known purpose that cannot
easily be served otherwise.

> In psycopg2, I find that lastrowid is always 0, even though it is
> making rows in the table with k=1, k=2, etc.
>
> I did some searching and found that the value of lastrowid is not
> taken from the autoincrementing column, but is some non-unique entity
> called an OID.

OIDs *are* unique per table.

> So, my question is: Is there a more portable way to do this that I'm
> not thinking of?

Maybe second thougth needs to be given to whether .lastrowid
is the right tool for the job.

> Secondarily, I see the the postgres documentation recommends not
> using the OID in your own tables. What use is the OID? Why doesn't
> c.lastrowid to return the value of the serial column, or None if
> there isn't one?

Consider this table:

create table temp (
pk serial primary key,
pk2 serial unique,
pk3 integer
unique
not null
) with OIDS;

Which value would you expect psycopg2 to return as
.lastrowid ? They *all* fulfill the criteria: uniquely
identify the row (.oid, .pk, .pk2, .pk3).

Using RETURNING does not cut it either because one can
return nearly arbitary data.

Now, a case could be made to let the user *tell* psycopg2
what to use for .lastrowid on any given table. That's not as
trivial as it sounds, however.

Karsten
--
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
To:
Cc: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 16:23:30
Message-ID: 4D9F3682.5020601@stsci.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

Karsten Hilbert wrote:
>> In sqlite and mysql, I can do
>> c.execute("insert into foo ( v ) values ( 'xyz' )")
>> k = c.lastrowid
>>
>
> .lastrowid is an implementation detail depending on the
> database to keep a unique column on your rows behind your back.
>

Yes. From the available documentation, this looks like the obvious
purpose for .lastrowid existing at all.

But,

Joe Abbate wrote:
>
> As I recall, ROWID --which is only defined tautologically in PEP-249 as
> "the 'Row ID' column"-- is an Oracle feature. There it's a pseudo column
> that is normally hidden (does not appear in SELECT *).
> ...
>
> If MySQL and SQLite have interpreted ROWID as the value of an
> auto-increment column, that makes using it non-standard and therefore by
> definition non-portable.

That is totally non-obvious from the PEP, but _extremely_ useful to know.

( And, yes, both sqlite3 and MySQLdb return the value of the
autoincrement column in lastrowid. )

So,

Karsten Hilbert wrote:
> Maybe second thougth needs to be given to whether .lastrowid
> is the right tool for the job.
>

Evidently, you are right. Currently, I expect the database to create a
unique identifier for the record on insert. Instead, I need to make a
separate UID generator and insert the record with a previously generated
UID.

Federico Di Gregorio wrote:
> Declare k as "serial" and then do the following:
>
> c.execute("INSERT INTO foo (v) VALUES ('xyz') RETURNING k")
> k = c.fetchone()[0]
>

That basic code could be the core of the UID generation. It would also
need to deal with possibly non-unique numbers after it wraps (in
postgres, the value after 2147483647 is 1), but I probably have at least
5 years to figure that out.

(For comparison, mysql uses an unsigned 64 bit value for auto increment
and chokes when it runs out. sqlite wraps, but it automatically finds a
new key value that is not used.)

Thanks to all of you for the help!

Mark S.


From: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
To: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
Cc: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 16:42:57
Message-ID: BANLkTinVFK++rKYxifNFc+nLeO2ZLP-Kpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On Fri, Apr 8, 2011 at 5:23 PM, Mark Sienkiewicz <sienkiew(at)stsci(dot)edu> wrote:

> That basic code could be the core of the UID generation.  It would also need
> to deal with possibly non-unique numbers after it wraps (in postgres, the
> value after 2147483647 is 1), but I probably have at least 5 years to figure
> that out.

This is plain wrong. Where did you get this idea?

piro=> create table testser (id serial);
NOTICE: CREATE TABLE will create implicit sequence "testser_id_seq"
for serial column "testser.id"
CREATE TABLE
piro=> alter SEQUENCE testser_id_seq restart with 2147483647;
ALTER SEQUENCE
piro=> insert into testser values (default) returning id;
id
------------
2147483647
(1 row)

INSERT 0 1
piro=> insert into testser values (default) returning id;
ERROR: integer out of range

> (For comparison, mysql uses an unsigned 64 bit value for auto increment and
> chokes when it runs out.  sqlite wraps, but it automatically finds a new key
> value that is not used.)

Please, read the fine manual: PostgreSQL has a 64 bit bigserial data type too.

-- Daniele


From: David Blewett <david(at)dawninglight(dot)net>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
Cc: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>, psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 17:04:46
Message-ID: BANLkTinvuN-q1UeCYTwMyVKdgE1ByGtM8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On Fri, Apr 8, 2011 at 12:42 PM, Daniele Varrazzo
<daniele(dot)varrazzo(at)gmail(dot)com> wrote:
> On Fri, Apr 8, 2011 at 5:23 PM, Mark Sienkiewicz <sienkiew(at)stsci(dot)edu> wrote:
>> (For comparison, mysql uses an unsigned 64 bit value for auto increment and
>> chokes when it runs out. sqlite wraps, but it automatically finds a new key
>> value that is not used.)
>
> Please, read the fine manual: PostgreSQL has a 64 bit bigserial data type too.

In addition, if you absolutely have to have the full 64 bit range, you
can set the starting value of the sequence to -9223372036854775808.

--
Thanks,

David Blewett


From: Marko Kreen <markokr(at)gmail(dot)com>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Cc: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 17:36:25
Message-ID: BANLkTikpxcur-5dCAZ7dMNwNWAD3KffNGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On Fri, Apr 8, 2011 at 12:52 PM, Karsten Hilbert
<Karsten(dot)Hilbert(at)gmx(dot)net> wrote:
> On Thu, Apr 07, 2011 at 04:40:29PM -0400, Mark Sienkiewicz wrote:
>> In psycopg2, I find that lastrowid is always 0, even though it is
>> making rows in the table with k=1, k=2, etc.
>>
>> I did some searching and found that the value of lastrowid is not
>> taken from the autoincrementing column, but is some non-unique entity
>> called an OID.
>
> OIDs *are* unique per table.

Actually they are not - you need unique index on oid column for that.

That is also one of the reasons to avoid them...

--
marko


From: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
To: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
Cc: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 17:40:42
Message-ID: BANLkTikCW-dM6DnmQDPZpkd3nBCdzV4tFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On Thu, Apr 7, 2011 at 9:40 PM, Mark Sienkiewicz <sienkiew(at)stsci(dot)edu> wrote:

> I have a portability issue that I'm not sure what to do with:  I need an
> automatically unique column, but I don't see a portable way to ask for it in
> my code.
>
> For my example, I'm going to use column k as the key and v as the value,
> though in real life there are another 10 or so columns.
>
> In sqlite and mysql, I can do
>   c.execute("insert into foo ( v ) values ( 'xyz' )")
>   k = c.lastrowid
>
> In psycopg2, I find that lastrowid is always 0, even though it is making
> rows in the table with k=1, k=2, etc.
>
> I did some searching and found that the value of lastrowid is not taken from
> the autoincrementing column, but is some non-unique entity called an OID.
>  The recommended workaround seems to be
>   c.execute("insert into foo ( v ) values ( 'xyz' ) returning k")
>
> To do this, I would need to make a special case for inserting into tables
> that have an autoincrement key.  For example, I might use
>    k = my_db_hack.insert_with_autoinc(c, " insert into foo ( v ) values
> ('xyz')")
>
> where insert_with_autoinc would recognize the postgres db interface and add
> the "returning" and read the results out of the cursor.  It seems kind of
> klunky, though.
>
> So, my question is:  Is there a more portable way to do this that I'm not
> thinking of?

I would say no, "returning" is the best option, "select
currval(seq_name)" a second -vastly inferior- one.

PostgreSQL doesn't give any special value to a field of type serial,
nor to the primary key of a table. It's not possible for psycopg to
add back such value without a) parsing the query, b) querying the
system catalog, 3) add policies of its own on top of the database and
4) rewrite the query. None of these things are likely to happen in
psycopg, albeit you may find some of those in some higher level tool
using psycopg as its driver.

Looks like you are hoping to write "portable SQL queries". I don't
believe this approach will bring you very far: if you want to write
portable _queries_ you need a tool that generates the _SQL_ for you,
or you will have problems in any query but the simplest ones. Hard?
Yes, it is. Take a look at SQLAlchemy if you want portable queries, or
be prepared to work at a level higher than the SQL.

> Secondarily, I see the the postgres documentation recommends not using the
> OID in your own tables.  What use is the OID?  Why doesn't c.lastrowid to
> return the value of the serial column, or None if there isn't one?

The use of the OID is mostly historical and currently only limited to
the system catalog for the limitations you have probably read in the
docs.

c.lastrowid is not set to the value of the serial column because
PostgreSQL doesn't tell it. psycopg doesn't even know the query you
have sent is an insert, let alone that this insert was performed on a
table that happened to have exactly one serial column (that by the way
is not even a type: it's only a value for the default and its type may
actually be anything).

-- Daniele


From: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
Cc: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 17:43:28
Message-ID: 4D9F4940.8030206@stsci.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

Daniele Varrazzo wrote:
> On Fri, Apr 8, 2011 at 5:23 PM, Mark Sienkiewicz <sienkiew(at)stsci(dot)edu> wrote:
>
>
>> That basic code could be the core of the UID generation. It would also need
>> to deal with possibly non-unique numbers after it wraps (in postgres, the
>> value after 2147483647 is 1), but I probably have at least 5 years to figure
>> that out.
>>
>
> This is plain wrong. Where did you get this idea?
>

I did an experiment:

test=# create table foo ( k serial, v varchar(10) );
NOTICE: CREATE TABLE will create implicit sequence "foo_k_seq" for
serial column "foo.k"
CREATE TABLE
test=# insert into foo ( k, v ) values ( 2147483647, 'max' );
INSERT 0 1
test=# insert into foo ( v ) values ( 'a' ) ;
INSERT 0 1
test=# select * from foo;
2147483647 | max
1 | a

test=#

So, for a "serial" column with 2147483647 in it, I got 1 for the next
value inserted. I read your example to find out what you were doing
different and then went back to the postgres manual. What I see is that
"serial" is not the same thing as auto-increment -- it is different
thing entirely that you can use instead of auto-increment in some
cases. That is important to know, but was not immediately obvious to me
from what I have read so far.

>> (For comparison, mysql uses an unsigned 64 bit value for auto increment and
>> chokes when it runs out. sqlite wraps, but it automatically finds a new key
>> value that is not used.)
>>
>
> Please, read the fine manual: PostgreSQL has a 64 bit bigserial data type too.
>

Yes, I do know about the manual and I have spent considerable time
reading it. As you may know about manuals for a system of this size, it
takes time to find everything you need to know and to understand all the
implications.

Mark S.


From: Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>
To: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
Cc: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>, psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 18:09:29
Message-ID: 4D9F4F59.50701@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On 04/08/2011 10:43 AM, Mark Sienkiewicz wrote:
> Daniele Varrazzo wrote:

>
>
> I did an experiment:
>
> test=# create table foo ( k serial, v varchar(10) );
> NOTICE: CREATE TABLE will create implicit sequence "foo_k_seq" for
> serial column "foo.k"
> CREATE TABLE
> test=# insert into foo ( k, v ) values ( 2147483647, 'max' );
> INSERT 0 1
> test=# insert into foo ( v ) values ( 'a' ) ;
> INSERT 0 1
> test=# select * from foo;
> 2147483647 | max
> 1 | a
>
> test=#
>
> So, for a "serial" column with 2147483647 in it, I got 1 for the next
> value inserted. I read your example to find out what you were doing
> different and then went back to the postgres manual. What I see is that
> "serial" is not the same thing as auto-increment -- it is different
> thing entirely that you can use instead of auto-increment in some cases.
> That is important to know, but was not immediately obvious to me from
> what I have read so far.

Actually, it is the same thing. A SERIAL or BIGSERIAL type is just a
macro that sets up an auto-increment field.

>
>

>
> Mark S.
>
>

--
Adrian Klaver
adrian(dot)klaver(at)gmail(dot)com


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-08 21:51:28
Message-ID: 20110408215128.GB2406@hermes.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

On Fri, Apr 08, 2011 at 08:36:25PM +0300, Marko Kreen wrote:

> > On Thu, Apr 07, 2011 at 04:40:29PM -0400, Mark Sienkiewicz wrote:
> >> In psycopg2, I find that lastrowid is always 0, even though it is
> >> making rows in the table with k=1, k=2, etc.
> >>
> >> I did some searching and found that the value of lastrowid is not
> >> taken from the autoincrementing column, but is some non-unique entity
> >> called an OID.
> >
> > OIDs *are* unique per table.
>
> Actually they are not - you need unique index on oid column for that.
>
> That is also one of the reasons to avoid them...

Thanks for the correction.

Karsten
--
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Mark Sienkiewicz <sienkiew(at)stsci(dot)edu>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
Cc: psycopg(at)postgresql(dot)org
Subject: Re: portable DBAPI auto-increment
Date: 2011-04-11 18:29:35
Message-ID: 4DA3488F.3030506@stsci.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: psycopg

Daniele Varrazzo wrote:
> I would say no, "returning" is the best option, "select
> currval(seq_name)" a second -vastly inferior- one.
>

I can see the logic in that. "Returning" happens in one statement, but
select currval followed by an insert means you have to go to the
database twice.

> Looks like you are hoping to write "portable SQL queries". I don't
> believe this approach will bring you very far: if you want to write
> portable _queries_ you need a tool that generates the _SQL_ for you,
> or you will have problems in any query but the simplest ones. Hard?
> Yes, it is. Take a look at SQLAlchemy if you want portable queries, or
> be prepared to work at a level higher than the SQL.
>

I'm aware of the risk, but I've had reasonable enough success so far for
routine queries that I don't want to get into the extra cost of using an
sql generator. Most of what I do with a database is look things up in
tables, and it is pretty easy to write even fairly complicated select
statements that don't get into database-specifics. This auto-increment
field is the most unusual thing I do in my current application, so
working around the database differences there is still less work than
converting to an sql generator.

I appreciate the advice, though. In other circumstances, I would look
into it.

Mark S.