Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}

From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Date: 2014-11-21 06:58:58
Message-ID: 1416553138.8931.288.camel@TTY32
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 2014-11-20 at 13:42 -0800, Peter Geoghegan wrote:
> > I am a developer of the Django ORM. Django reports to the user whether a
> > row was inserted or updated. It is possible to know which rows were
> > inserted by returning the primary key value. If something is returned,
> > then it was an insert. If Django implements updated vs inserted checking
> > this way, then if PostgreSQL adds RETURNING for update case later on,
> > that would be a breaking change for Django.
>
> How does that actually work at the moment? Do you use RETURNING, or
> look at the command tag? Would you be happy to just know that certain
> rows were either inserted or updated in the context of an UPSERT (and
> not cancelled by a BEFORE ROW INSERT or UPDATE trigger returning
> NULL), or do you want to specifically know if there was an insert or
> an update in respect of each row/slot processed?

Django uses the command tag currently to check if a row was updated. We
also use RETURNING to get SERIAL values back from the database on
insert.

The most likely place to use this functionality in Django is
Model.save(). This method is best defined as "make sure this object's
state is either inserted or updated to the database by the primary key
of the object". The Model.save() method needs to also report if the
model was created or updated. The command tag is sufficient for this
case.

So, the proposed feature now has everything Django needs for
Model.save().

Django might add a bulk_merge(objs) command later on. This is defined as
"make sure each obj in objs is persisted to the database using the
fastest way available". The INSERT ON CONFLICT UPDATE command looks
excellent for this case. In this case it will be more problematic to
check which rows were inserted, which update, as we need information for
each primary key value separately for this case.

When I think of this feature outside of Django, it seems it is
completely reasonable to return database computed values on UPSERT. This
requires two queries with the proposed API. My opinion is that RETURNING
for the update case is better than not having it.

- Anssi

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2014-11-21 07:19:20 Re: WAL format and API changes (9.5)
Previous Message Michael Paquier 2014-11-21 05:12:20 Fillfactor for GIN indexes