UPDATE... FROM - will ORDER BY not respected?

Lists: pgsql-general
From: "Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: UPDATE... FROM - will ORDER BY not respected?
Date: 2009-04-28 17:48:03
Message-ID: gt7ffg$1nif$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

(FOR PG VERSION 8.3.6 running on Windows 2003 Server x64)

We have a function that assigns unique ID's (to use as row identifiers) to a
table via an UPDATE using nextval(). This table is imported from another
source, and there is a "sequencing" field to let the query know in which
order to assign the row identifiers. (Please do not confuse the sequencing
field with a sequence value from nextval())

The UPDATE command gets the order of the rows to update using a FROM clause,
which in turn reads from a sub-query to get the rows in the order of "seq".

The problem is that the UPDATE is NOT behaving as if it is receiving the
sequence identifiers in the order specified. In fact, it appears it is
returned in REVERSE order (assigning id's in reverse order based on the
values in seq)

Here is the essence of the query (further below you will find the full DDL
code of the function).

UPDATE impt_table
SET id = nextval(''id_seq'')
FROM
(SELECT seq
FROM impt_table
WHERE id IS NULL
ORDER BY seq
) AS empty_ids
WHERE
impt_table.seq = empty_ids.seq
AND impt_table.id IS NULL;

Was I wrong in assuming that the UPDATE would respect the order of rows
coming out of the sub-clause? Is there a better way to do this?

Thanks, Carlo

DDL CODE FOR FUNCTION
CREATE OR REPLACE FUNCTION "mdx_import"."impt_id_seed_from_impt_seq" (text)
RETURNS boolean AS
$body$
/* New function body */
declare
cmd varchar;
begin
cmd =
'update mdx_import.'||$1||'
set impt_id = nextval(''mdx_import.impt_id_seq'')
from
(select impt_seq
from mdx_import.'||$1||'
where impt_id is null
order by impt_seq
) as empty_impt_ids
where
'||$1||'.impt_seq = empty_impt_ids.impt_seq
and '||$1||'.impt_id is null;';
execute cmd;
return true;
end;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;


From: Adam Rich <adam(dot)r(at)sbcglobal(dot)net>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE... FROM - will ORDER BY not respected?
Date: 2009-04-28 18:19:57
Message-ID: 49F748CD.9010108@sbcglobal.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Carlo Stonebanks wrote:
> (FOR PG VERSION 8.3.6 running on Windows 2003 Server x64)
>
> We have a function that assigns unique ID's (to use as row identifiers)
> to a table via an UPDATE using nextval(). This table is imported from
> another source, and there is a "sequencing" field to let the query know
> in which order to assign the row identifiers. (Please do not confuse the
> sequencing field with a sequence value from nextval())
>
> The UPDATE command gets the order of the rows to update using a FROM
> clause, which in turn reads from a sub-query to get the rows in the
> order of "seq".
>
> The problem is that the UPDATE is NOT behaving as if it is receiving the
> sequence identifiers in the order specified. In fact, it appears it is
> returned in REVERSE order (assigning id's in reverse order based on the
> values in seq)
>
> Here is the essence of the query (further below you will find the full
> DDL code of the function).
>
> UPDATE impt_table
> SET id = nextval(''id_seq'')
> FROM
> (SELECT seq
> FROM impt_table
> WHERE id IS NULL
> ORDER BY seq
> ) AS empty_ids
> WHERE
> impt_table.seq = empty_ids.seq
> AND impt_table.id IS NULL;
>
> Was I wrong in assuming that the UPDATE would respect the order of rows
> coming out of the sub-clause? Is there a better way to do this?
>
> Thanks, Carlo
>

I think the ORDER BY is free to update the rows in any order it needs
to. The key is to put the sequence further down. How about this?

UPDATE impt_table
SET id = newid
FROM
SELECT seq, nextval('id_seq') as newid
FROM (SELECT seq
FROM impt_table
WHERE id IS NULL
ORDER BY seq
) AS pre_empty_ids ) as empty_ids
WHERE
impt_table.seq = empty_ids.seq
AND impt_table.id IS NULL;


From: "Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE... FROM - will ORDER BY not respected?
Date: 2009-04-28 19:29:29
Message-ID: gt7ldj$2sfm$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> I think the ORDER BY is free to update the rows in any order it needs
> to. The key is to put the sequence further down. How about this?

Adam - thanks. Unless I hear otherwise I will assume that you mean the
UPDATE is free to update the rows in any way it wants - irregardless of how
whether the data return in the FROM clause is ordered.

I also appreciate the time you took to re-write the query for me. Thank you
very much for taking the time to reply.

Carlo

"Adam Rich" <adam(dot)r(at)sbcglobal(dot)net> wrote in message
news:49F748CD(dot)9010108(at)sbcglobal(dot)net(dot)(dot)(dot)
> Carlo Stonebanks wrote:
>> (FOR PG VERSION 8.3.6 running on Windows 2003 Server x64)
>>
>> We have a function that assigns unique ID's (to use as row identifiers)
>> to a table via an UPDATE using nextval(). This table is imported from
>> another source, and there is a "sequencing" field to let the query know
>> in which order to assign the row identifiers. (Please do not confuse the
>> sequencing field with a sequence value from nextval())
>>
>> The UPDATE command gets the order of the rows to update using a FROM
>> clause, which in turn reads from a sub-query to get the rows in the order
>> of "seq".
>>
>> The problem is that the UPDATE is NOT behaving as if it is receiving the
>> sequence identifiers in the order specified. In fact, it appears it is
>> returned in REVERSE order (assigning id's in reverse order based on the
>> values in seq)
>>
>> Here is the essence of the query (further below you will find the full
>> DDL code of the function).
>>
>> UPDATE impt_table
>> SET id = nextval(''id_seq'')
>> FROM
>> (SELECT seq
>> FROM impt_table
>> WHERE id IS NULL
>> ORDER BY seq
>> ) AS empty_ids
>> WHERE
>> impt_table.seq = empty_ids.seq
>> AND impt_table.id IS NULL;
>>
>> Was I wrong in assuming that the UPDATE would respect the order of rows
>> coming out of the sub-clause? Is there a better way to do this?
>>
>> Thanks, Carlo
>>
>
> I think the ORDER BY is free to update the rows in any order it needs
> to. The key is to put the sequence further down. How about this?
>
>
> UPDATE impt_table
> SET id = newid
> FROM
> SELECT seq, nextval('id_seq') as newid
> FROM (SELECT seq
> FROM impt_table
> WHERE id IS NULL
> ORDER BY seq
> ) AS pre_empty_ids ) as empty_ids
> WHERE
> impt_table.seq = empty_ids.seq
> AND impt_table.id IS NULL;
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


From: Adam Rich <adam(dot)r(at)sbcglobal(dot)net>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE... FROM - will ORDER BY not respected?
Date: 2009-04-28 20:28:40
Message-ID: 49F766F8.6030402@sbcglobal.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Carlo Stonebanks wrote:
>> I think the ORDER BY is free to update the rows in any order it needs
>> to. The key is to put the sequence further down. How about this?
>
> Adam - thanks. Unless I hear otherwise I will assume that you mean the
> UPDATE is free to update the rows in any way it wants - irregardless of
> how whether the data return in the FROM clause is ordered.
>

Whoops - you're right, that's what I meant. Even though one of the
tables is sorted, when it joins the two tables, the planner might decide
it's cheaper to update impt_table by iteratively looking up values from
the sorted table that match the (unordered) rows from impt_table, or
by hashing the keys which I think also results in unsorted updates.
You can use "explain" to see the details.

Adam


From: Jasen Betts <jasen(at)xnet(dot)co(dot)nz>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE... FROM - will ORDER BY not respected?
Date: 2009-04-29 11:20:11
Message-ID: gt9d5b$rqd$2@reversiblemaps.ath.cx
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On 2009-04-28, Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca> wrote:
> (FOR PG VERSION 8.3.6 running on Windows 2003 Server x64)
>
> We have a function that assigns unique ID's (to use as row identifiers) to a
> table via an UPDATE using nextval(). This table is imported from another
> source, and there is a "sequencing" field to let the query know in which
> order to assign the row identifiers. (Please do not confuse the sequencing
> field with a sequence value from nextval())
>
> The UPDATE command gets the order of the rows to update using a FROM clause,
> which in turn reads from a sub-query to get the rows in the order of "seq".
>
> The problem is that the UPDATE is NOT behaving as if it is receiving the
> sequence identifiers in the order specified. In fact, it appears it is
> returned in REVERSE order (assigning id's in reverse order based on the
> values in seq)
>
> Here is the essence of the query (further below you will find the full DDL
> code of the function).
>
> UPDATE impt_table
> SET id = nextval(''id_seq'')
> FROM
> (SELECT seq
> FROM impt_table
> WHERE id IS NULL
> ORDER BY seq
> ) AS empty_ids
> WHERE
> impt_table.seq = empty_ids.seq
> AND impt_table.id IS NULL;
>
> Was I wrong in assuming that the UPDATE would respect the order of rows
> coming out of the sub-clause? Is there a better way to do this?

try this:

UPDATE impt_table
SET id = newid
FROM
(
SELECT foo.seq,
nextval(''id_seq'') AS newid
(SELECT seq
FROM impt_table
WHERE id IS NULL
ORDER BY seq
) AS foo
) ASempty_ids
WHERE
impt_table.seq = empty_ids.seq
AND impt_table.id IS NULL;