Re: LIMIT for UPDATE and DELETE

Lists: pgsql-hackers
From: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: LIMIT for UPDATE and DELETE
Date: 2014-08-14 21:18:12
Message-ID: CADB9FDf-Vh6RnKAMZ4Rrg_YP9p3THdPbji8qe4qkxRuiOwm=mg@mail.gmail.com

Greetings,

Based on the feedback on my previous patch, I've separated only the
LIMIT part into its own feature. This version plays nicely with
inheritance. The intended use is splitting up big UPDATEs and DELETEs
into batches more easily and efficiently.


Attachment: update_delete_limit_v1.diff (text/plain, 26.0 KB)

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-25 06:48:39
Message-ID: 53FADC47.2040507@lab.ntt.co.jp

Hi Rukh,

(2014/08/15 6:18), Rukh Meski wrote:
> Based on the feedback on my previous patch, I've separated only the
> LIMIT part into its own feature. This version plays nicely with
> inheritance. The intended use is splitting up big UPDATEs and DELETEs
> into batches more easily and efficiently.

Before looking into the patch, I'd like to know the use cases in more
details.

Thanks,

Best regards,
Etsuro Fujita


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-25 12:40:22
Message-ID: CAA4eK1KaZ2kzuDRzVWOeRzhfTkD261uLG7sFw_ceWmTkZZYbNQ@mail.gmail.com

On Mon, Aug 25, 2014 at 12:18 PM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:
> (2014/08/15 6:18), Rukh Meski wrote:
>>
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature. This version plays nicely with
>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
>
> Before looking into the patch, I'd like to know the use cases in more
> details.

You could check the previous commitfest thread [1] for this feature.
It covers some use cases and the difficulties in implementing the
feature, which might help with your review.

[1]
https://commitfest.postgresql.org/action/patch_view?id=1412

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-25 13:25:53
Message-ID: 1408973153.27724.YahooMailNeo@web122301.mail.ne1.yahoo.com

Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> (2014/08/15 6:18), Rukh Meski wrote:
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature.  This version plays nicely with
>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
> Before looking into the patch, I'd like to know the use cases in more
> details.

There have been a few times I wanted something like this, so that
it was easier to do an update that affects a very high percentage
of rows in a table, while making the old version of the row no
longer match the selection criteria for the UPDATE.  There are
workarounds using cursors or subselects returning ctid, but they
are kludgy and error prone.  Basically I wanted to alternate UPDATE
of a subset of the rows with table VACUUM so that subsequent
iterations can re-use space and avoid bloating the table.
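
As a rough illustration of the subselect-returning-ctid workaround
mentioned above, a single batch might be written as follows. This is
only a sketch: the table the_table, the column status and its values are
hypothetical, and the batch size is arbitrary.

```sql
-- Hypothetical schema: the_table(status text, ...).
-- Update one batch of rows that still match the selection criteria.
-- The subselect picks at most 50000 physical row locations (ctid);
-- updated rows stop matching, so the next iteration picks new ones.
UPDATE the_table
   SET status = 'done'
 WHERE status = 'pending'
   AND ctid IN (SELECT ctid
                  FROM the_table
                 WHERE status = 'pending'
                 LIMIT 50000);

-- Run between batches, outside a transaction block, so that later
-- batches can re-use the space freed by earlier ones.
VACUUM the_table;
```

The rows chosen by the subselect are not locked, so under concurrent
updates a batch may touch fewer rows than requested, which is part of
what makes this kind of workaround fragile.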

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-25 18:25:39
Message-ID: CAMkU=1w8B+y9ap9KywxBEXpLd9vwkN40ACqNjxHgC8=60aSyTg@mail.gmail.com

On Sun, Aug 24, 2014 at 11:48 PM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hi Rukh,
>
>
> (2014/08/15 6:18), Rukh Meski wrote:
>
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature. This version plays nicely with
>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>>
>
> Before looking into the patch, I'd like to know the use cases in more
> details.
>

There are two common use cases I can think of:

1)

I've just added a column to an existing table, and it is all NULL. I've
changed the code to populate that column appropriately for new or updated
rows, but I need to back fill the existing rows. I have a (slow) method to
compute the new value. (I've not yet changed the code to depend on that
column being populated)

The obvious solution is:

    update the_table set new_col = populate_new_col(whatever)
        where new_col is null;

But this will bloat the table because vacuum cannot intervene, and will
take a very long time. The first row to be updated will remain locked until
the last row gets updated, which is not acceptable. And if something goes
wrong before the commit, you've lost all the work.

With the limit clause, you can just do this:

    update the_table set new_col = populate_new_col(whatever)
        where new_col is null limit 50000;

In a loop with appropriate vacuuming and throttling.
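
One iteration of such a loop might look like the sketch below. The LIMIT
clause on UPDATE is the syntax proposed by the patch and is not accepted
by stock PostgreSQL; the one-second pause is an arbitrary choice for
throttling. A client would repeat these statements until the UPDATE
reports zero rows.

```sql
-- Proposed syntax from the patch; not valid in unpatched PostgreSQL.
update the_table set new_col = populate_new_col(whatever)
    where new_col is null limit 50000;

-- In autocommit mode each statement commits on its own, so row locks
-- are released per batch and vacuum can reclaim dead row versions.
vacuum the_table;

-- Arbitrary throttle between batches.
select pg_sleep(1);
```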

2)

I've introduced or re-designed partitioning, and need to migrate rows to
the appropriate partitions without long lived row locks.

create table pgbench_accounts2 () inherits (pgbench_accounts);

and then in a loop:

    with t as (
        delete from only pgbench_accounts
            where aid < 500000 limit 5000
            returning *)
    insert into pgbench_accounts2 select * from t;
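
For comparison, a sketch of how the same batch could be written without
the proposed LIMIT clause, using a ctid subselect. This is an assumed
equivalent for illustration, not something taken from the patch; it
relies only on writable CTEs and the ctid system column.

```sql
-- Move up to 5000 rows per iteration from the parent to the child table.
-- The subselect limits the batch; ONLY keeps both scans on the parent.
with t as (
    delete from only pgbench_accounts
        where ctid in (select ctid
                         from only pgbench_accounts
                        where aid < 500000
                        limit 5000)
        returning *)
insert into pgbench_accounts2 select * from t;
```

The extra subselect is the kind of boilerplate the LIMIT clause is meant
to remove.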

Cheers,

Jeff


From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-29 10:20:31
Message-ID: 540053EF.9090708@lab.ntt.co.jp

(2014/08/25 15:48), Etsuro Fujita wrote:
> (2014/08/15 6:18), Rukh Meski wrote:
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature. This version plays nicely with
>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
> Before looking into the patch, I'd like to know the use cases in more
> details.

Thanks for the input, Amit, Kevin and Jeff! I understand that the patch
is useful.

I've looked at the patch a bit more closely. Here are my initial thoughts
about it.

The patch places limit-counting inside ModifyTable, and works well for
inheritance trees, but I'm not sure that that is the right way to go. I
think that this feature should be implemented in a way that we can
naturally extend to the ORDER-BY-LIMIT case in future. But honestly,
the patch doesn't seem to take that into account, though I might be
missing something. What plans do you have for future extensibility?

I think that the approach discussed in [1] would be promising, so ISTM
that it would be better to implement this feature by allowing for plans
of the form, e.g., ModifyTable+Limit+Append.

Thanks,

[1] http://www.postgresql.org/message-id/26819.1291133045@sss.pgh.pa.us

Best regards,
Etsuro Fujita


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-29 11:53:16
Message-ID: 540069AC.6000409@joh.to

On 8/29/14 12:20 PM, Etsuro Fujita wrote:
> The patch places limit-counting inside ModifyTable, and works well for
> inheritance trees, but I'm not sure that that is the right way to go. I
> think that this feature should be implemented in the way that we can
> naturally extend it to the ORDER-BY-LIMIT case in future. But honestly
> the patch doesn't seem to take into account that, I might be missing
> something, though.

The LIMIT part *has* to happen after the rows have been locked or it
will work very surprisingly under concurrency (sort of like how FOR
SHARE / FOR UPDATE worked before 9.0). So either it has to be inside
ModifyTable or the ModifyTable has to somehow pass something to a Limit
node on top of it which would then realize that the tuples from
ModifyTable aren't supposed to be sent to the client (unless there's a
RETURNING clause). I think it's a lot nicer to do the LIMITing inside
ModifyTable, even though that duplicates a small portion of code that
already exists in the Limit node.

> What plan do you have for the future extensibility?
>
> I think that the approach discussed in [1] would be promising, so ISTM
> that it would be better to implement this feature by allowing for plans
> in the form of eg, ModifyTable+Limit+Append.

I don't see an approach discussed there, just a listing of problems with
no solutions.

This is just my personal opinion, but what I think should happen is:

1) We put the LIMIT inside ModifyTable like this patch does. This
doesn't prevent us from doing ORDER BY in the future, but helps numerous
people who today have to
2) We allow ORDER BY on tables with no inheritance children using
something similar to Rukh's previous patch.
3) Someone rewrites how UPDATE works based on Tom's suggestion here:
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
which allows us to support ORDER BY on all tables (or perhaps maybe not
FDWs, I don't know how those work). The LIMIT functionality in this
patch is unaffected.

Now, I know some people disagree with me on whether step #2 is worth
taking or not, but that's a separate discussion. My point w.r.t. this
patch still stands: I don't see any forwards compatibility problems with
this approach, nor do I really see any viable alternatives either.

.marko


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-29 11:56:13
Message-ID: 54006A5D.4000500@joh.to

On 8/29/14 1:53 PM, I wrote:
> This is just my personal opinion, but what I think should happen is:
>
> 1) We put the LIMIT inside ModifyTable like this patch does. This
> doesn't prevent us from doing ORDER BY in the future, but helps numerous
> people who today have to

Oops, looks like I didn't finish my thought here.

.. but helps numerous people who today have to achieve the same thing
via tedious, slow and problematic subqueries (or a choose-any-two
combination of these).

.marko


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-08-29 14:33:45
Message-ID: 17866.1409322825@sss.pgh.pa.us

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> The LIMIT part *has* to happen after the rows have been locked or it
> will work very surprisingly under concurrency (sort of like how FOR
> SHARE / FOR UPDATE worked before 9.0).

Good point.

> So either it has to be inside
> ModifyTable or the ModifyTable has to somehow pass something to a Limit
> node on top of it

... or we add a LockRows node below the Limit node. Yeah, that would make
UPDATE/LIMIT a tad slower, but I think that might be preferable to what
you're proposing anyway. Raw speed of what is fundamentally a fringe
feature ought not trump every other concern.
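
The lock-before-limit ordering discussed here can already be observed
for plain SELECT. A sketch, assuming a hypothetical table the_table with
a column new_col; on PostgreSQL 9.0 and later the plan should show a
LockRows node beneath the Limit node, so rows are locked and rechecked
before they are counted against the limit.

```sql
-- Hypothetical table and column, for illustration only.
-- Inspect where row locking happens relative to LIMIT.
explain (costs off)
select *
  from the_table
 where new_col is null
 limit 5
   for update;
```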

> This is just my personal opinion, but what I think should happen is:

> 1) We put the LIMIT inside ModifyTable like this patch does. This
> doesn't prevent us from doing ORDER BY in the future, but helps numerous
> people who today have to
> 2) We allow ORDER BY on tables with no inheritance children using
> something similar to Rukh's previous patch.
> 3) Someone rewrites how UPDATE works based on Tom's suggestion here:
> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
> which allows us to support ORDER BY on all tables (or perhaps maybe not
> FDWs, I don't know how those work). The LIMIT functionality in this
> patch is unaffected.

I still think we should skip #2 and go directly to work on #3. Getting
rid of the unholy mess that is inheritance_planner would be a very nice
thing.

regards, tom lane


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-01 12:06:41
Message-ID: 54046151.5050000@joh.to

On 8/29/14 4:33 PM, Tom Lane wrote:
>> So either it has to be inside
>> ModifyTable or the ModifyTable has to somehow pass something to a Limit
>> node on top of it
>
> ... or we add a LockRows node below the Limit node. Yeah, that would make
> UPDATE/LIMIT a tad slower, but I think that might be preferable to what
> you're proposing anyway. Raw speed of what is fundamentally a fringe
> feature ought not trump every other concern.

I don't consider this a fringe feature, but in any case, the main use
case for LIMIT without ORDER BY in UPDATE and DELETE is to split up
large transactions into smaller batches. And considering that, I think
raw speed should be a concern (though it shouldn't trump every other
concern, obviously).

More to the point, personally, I think the changes to nodeModifyTable.c
are very reasonable so it's not clear to me that the "extra
LockRows+Limit nodes" approach would be inherently better (even ignoring
performance concerns).

>> This is just my personal opinion, but what I think should happen is:
>
>> 2) We allow ORDER BY on tables with no inheritance children using
>> something similar to Rukh's previous patch.
>> 3) Someone rewrites how UPDATE works based on Tom's suggestion here:
>> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
>
> I still think we should skip #2 and go directly to work on #3. Getting
> rid of the unholy mess that is inheritance_planner would be a very nice
> thing.

Ideally? Yeah, that would be great. But I don't see anyone
volunteering to do that work, and I think holding back a useful feature
(ORDER BY with UPDATE/DELETE) in hopes of getting someone to volunteer
to do it is insane. Now, you're free to argue that ORDER BY with
UPDATE/DELETE isn't that useful, of course, but I'm sure there are lots
of people who agree with me.

.marko


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-03 14:46:07
Message-ID: CA+TgmoYLOf8tnM3meD1Q+4kQC0HWnpqGH+nHwfShKKm_4K5WaQ@mail.gmail.com

On Mon, Sep 1, 2014 at 8:06 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> Ideally? Yeah, that would be great. But I don't see anyone volunteering to
> do that work, and I think holding back a useful feature (ORDER BY with
> UPDATE/DELETE) in hopes of getting someone to volunteer to do it is insane.
> Now, you're free to argue that ORDER BY with UPDATE/DELETE isn't that
> useful, of course, but I'm sure there are lots of people who agree with me.

I still agree with Tom. Arbitrary restrictions on which features can
be used in combination with each other piss off and alienate users.
We've put quite a bit of effort into making table inheritance not suck
(e.g. statistics on inheritance trees, Merge Append, etc.). Making it
suck more because you don't think it's as important as your feature
is, in my opinion, not cool.

This is not to say that I don't like the feature. I like it a lot.
But I like a product where you can be sure that if walking works and
chewing gum works you can also walk and chew gum at the same time even
more.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-03 15:02:47
Message-ID: 54072D97.7040509@joh.to

On 9/3/14 4:46 PM, Robert Haas wrote:
> Making it
> suck more because you don't think it's as important as your feature
> is, in my opinion, not cool.

I really can't see how that would make inheritance suck *more*. You
can't do UPDATE .. ORDER BY now, and you wouldn't be able to do it after
the patch. Yeah, sure, perhaps people using inheritance might feel left
out, but surely that would just motivate them to work on improving it.

.marko


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-03 15:39:25
Message-ID: CA+TgmoYtRJcuLHfOpzWb8oJksNO5tK=mVKeqJuwf5qr6SayXkw@mail.gmail.com

On Wed, Sep 3, 2014 at 11:02 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> On 9/3/14 4:46 PM, Robert Haas wrote:
>> Making it
>> suck more because you don't think it's as important as your feature
>> is, in my opinion, not cool.
>
> I really can't see how that would make inheritance suck *more*. You can't
> do UPDATE .. ORDER BY now, and you wouldn't be able to do it after the
> patch. Yeah, sure, perhaps people using inheritance might feel left out,
> but surely that would just motivate them to work on improving it.

I think it's entirely reasonable for us to require that all new SQL
features work with or without inheritance. If we took the opposite
position, and said that the only things that
need to work with inheritance are the ones that existed at the time
inheritance was introduced, then we'd not need to worry about it not
only for this feature but for row-level security and SKIP LOCKED and
GROUPING SETS and, going back a bit further, materialized views and
security-barrier views and LATERAL and CTEs and on and on. Perhaps
not all of those require any special handling for inheritance
hierarchies, but some of them surely did, and if even 10% of the SQL
features that we have added since table inheritance were allowed to
opt out of supporting it, we'd have a broken and unusable feature
today.

Now some people might argue that we have that anyway, but the fact is
that a lot of people are using inheritance today, even with all its
flaws, and they wouldn't be if there were a long laundry list of
limitations that didn't apply to regular tables. We should be looking
to lift the limitations that currently exist, not add more.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-09 09:57:53
Message-ID: 540ECF21.1080302@vmware.com

On 09/03/2014 06:39 PM, Robert Haas wrote:
> On Wed, Sep 3, 2014 at 11:02 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:
>> On 9/3/14 4:46 PM, Robert Haas wrote:
>>> Making it
>>> suck more because you don't think it's as important as your feature
>>> is, in my opinion, not cool.
>>
>> I really can't see how that would make inheritance suck *more*. You can't
>> do UPDATE .. ORDER BY now, and you wouldn't be able to do it after the
>> patch. Yeah, sure, perhaps people using inheritance might feel left out,
>> but surely that would just motivate them to work on improving it.
>
> I think it's entirely reasonable for us to require that all new SQL
> features work with or without inheritance. If we took the opposite
> position, and said that the only things that
> need to work with inheritance are the ones that existed at the time
> inheritance was introduced, then we'd not need to worry about it not
> only for this feature but for row-level security and SKIP LOCKED and
> GROUPING SETS and, going back a bit further, materialized views and
> security-barrier views and LATERAL and CTEs and on and on. Perhaps
> not all of those require any special handling for inheritance
> hierarchies, but some of them surely did, and if even 10% of the SQL
> features that we have added since table inheritance were allowed to
> opt out of supporting it, we'd have a broken and unusable feature
> today.
>
> Now some people might argue that we have that anyway, but the fact is
> that a lot of people are using inheritance today, even with all its
> flaws, and they wouldn't be if there were a long laundry list of
> limitations that didn't apply to regular tables. We should be looking
> to lift the limitations that currently exist, not add more.

I agree. If we are to support UPDATE .. ORDER BY .. LIMIT, it should
work with inheritance. So the path forward is (using Marko's phrasing
upthread):

1) We put the LIMIT inside ModifyTable like this patch does. This
doesn't prevent us from doing ORDER BY in the future, but helps numerous
people who today have to
2) Someone rewrites how UPDATE works based on Tom's suggestion here:
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
which allows us to support ORDER BY on all tables (or perhaps maybe not
FDWs, I don't know how those work). The LIMIT functionality in this
patch is unaffected.

What's not clear to me is whether it makes sense to do 1) without 2). Is
UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
apply this patch now, how much of it needs to be rewritten after 2)? If
the answers are "yes" and "not much", then we should review this patch
now, and put 2) on the TODO list. Otherwise 2) should be done first.

Etsuro, KaiGai, please take a look at the patch and try to estimate
how much of it would still need to be rewritten if we do 2). If not
much, then please continue to review it, with the aim of getting it into
a committable state.

- Heikki


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-09 10:37:23
Message-ID: 540ED863.6020402@joh.to

On 9/9/14 11:57 AM, Heikki Linnakangas wrote:
> What's not clear to me is whether it makes sense to do 1) without 2). Is
> UPDATE .. LIMIT without support for an ORDER BY useful enough?

I'd say so; I could use it right now. I have to remove millions of
rows from a table, but I don't want the live transaction processing to
take a hit, so I have to do it in batches. Granted, some kind of rate
limiting would achieve the same thing for DELETE, but with UPDATE you
also have to consider row locking, and rate limiting wouldn't help with
that at all; it would, in fact, just make it worse. I'll also be
running a big UPDATE like that later today, so that's two uses today
alone for me. And no, these are not routine things so keep your "use
partitions" suggestions to yourselves.

> And if we
> apply this patch now, how much of it needs to be rewritten after 2)? If
> the answers are "yes" and "not much", then we should review this patch
> now, and put 2) on the TODO list. Otherwise 2) should be done first.

I'd say "not much, if anything at all".

.marko


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-09 10:46:49
Message-ID: 540EDA99.7050100@joh.to

On 9/9/14 12:37 PM, I wrote:
> And no, these are not routine things so keep your "use
> partitions" suggestions to yourselves.

My apologies. This was not directed at you personally, Heikki, and in
any case it was unnecessarily hostile.

.marko


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-09 11:13:08
Message-ID: 540EE0C4.5040603@vmware.com

On 09/09/2014 01:46 PM, Marko Tiikkaja wrote:
> On 9/9/14 12:37 PM, I wrote:
>> And no, these are not routine things so keep your "use
>> partitions" suggestions to yourselves.
>
> My apologies. This was not directed at you personally, Heikki, and in
> any case it was unnecessarily hostile.

No worries, it made me smile :-)

- Heikki


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-09 16:28:21
Message-ID: CAMkU=1xQ5bDvKSc=OsRtFvJfMYF3dSsDE-kg8cz_1PiDDbFbyw@mail.gmail.com

On Tue, Sep 9, 2014 at 2:57 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
wrote:

> I agree. If we are to support UPDATE .. ORDER BY .. LIMIT, it should work
> with inheritance. So the path forward is (using Marko's phrasing upthread):
>
> 1) We put the LIMIT inside ModifyTable like this patch does. This
> doesn't prevent us from doing ORDER BY in the future, but helps numerous
> people who today have to
> 2) Someone rewrites how UPDATE works based on Tom's suggestion here:
> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
> which allows us to support ORDER BY on all tables (or perhaps maybe not
> FDWs, I don't know how those work). The LIMIT functionality in this
> patch is unaffected.
>
> What's not clear to me is whether it makes sense to do 1) without 2). Is
> UPDATE .. LIMIT without support for an ORDER BY useful enough?

I've wanted LIMIT even without ORDER BY many times, so I'd vote yes.

> And if we apply this patch now, how much of it needs to be rewritten after
> 2)? If the answers are "yes" and "not much", then we should review this
> patch now, and put 2) on the TODO list. Otherwise 2) should be done first.
>

On that I can't give any useful feedback.

Cheers,

Jeff


From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-10 02:25:13
Message-ID: 540FB689.9050901@lab.ntt.co.jp

(2014/09/09 18:57), Heikki Linnakangas wrote:
> On 09/03/2014 06:39 PM, Robert Haas wrote:
>> Now some people might argue that we have that anyway, but the fact is
>> that a lot of people are using inheritance today, even with all its
>> flaws, and they wouldn't be if there were a long laundry list of
>> limitations that didn't apply to regular tables. We should be looking
>> to lift the limitations that currently exist, not add more.

> I agree. If we are to support UPDATE .. ORDER BY .. LIMIT, it should
> work with inheritance. So the path forward is (using Marko's phrasing
> upthread):
>
> 1) We put the LIMIT inside ModifyTable like this patch does. This
> doesn't prevent us from doing ORDER BY in the future, but helps numerous
> people who today have to
> 2) Someone rewrites how UPDATE works based on Tom's suggestion here:
> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
> which allows us to support ORDER BY on all tables (or perhaps maybe not
> FDWs, I don't know how those work). The LIMIT functionality in this
> patch is unaffected.
>
> What's not clear to me is whether it make sense to do 1) without 2) ? Is
> UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
> apply this patch now, how much of it needs to be rewritten after 2) ? If
> the answers are "yes" and "not much", then we should review this patch
> now, and put 2) on the TODO list. Otherwise 2) should be done first.

My answers are "yes" but "completely rewritten". So ideally I think we
should work on 2) first, but 2) seems like a large project, at least to me.
So I think it would be reasonable to implement UPDATE/DELETE .. LIMIT
based on Rukh's patch for now, and put 2) and the re-implementation of 1)
after 2) on the TODO list.

I don't have enough time to review it for a while, so I'd like to work
on it (and postgres_fdw's "update pushdown" enhancement related to it)
at the next CF (I think I can do it earlier). I must apologize to Rukh
for not having enough time for the patch review.

Thanks,

Best regards,
Etsuro Fujita


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-10 07:57:49
Message-ID: 5410047D.8010109@joh.to
Lists: pgsql-hackers

On 2014-09-10 04:25, Etsuro Fujita wrote:
> (2014/09/09 18:57), Heikki Linnakangas wrote:
>> What's not clear to me is whether it makes sense to do 1) without 2) ? Is
>> UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
>> apply this patch now, how much of it needs to be rewritten after 2) ? If
>> the answers are "yes" and "not much", then we should review this patch
>> now, and put 2) on the TODO list. Otherwise 2) should be done first.
>
> My answers are "yes" but "completely rewritten".

Any particular reason for you to say that? Because an UPDATE might have
a RETURNING clause, all the updated tuples have to go through the
ModifyTable node one at a time. I don't see why we couldn't LIMIT there
after implementing #2.
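
To make the argument concrete, here is a minimal sketch in Python, not
PostgreSQL code. The names modify_table, apply_update and limit are
hypothetical, chosen only for illustration. It assumes nothing more than
what is said above: the node pulls rows from its subplan one at a time, so
it can count them and stop.

```python
from typing import Callable, Iterable, Iterator, Optional

Row = dict


def modify_table(source: Iterable[Row],
                 apply_update: Callable[[Row], Row],
                 limit: Optional[int] = None) -> Iterator[Row]:
    """Toy stand-in for a ModifyTable node (hypothetical, not the real one).

    Pulls one row at a time from the subplan, updates it, and emits the
    new version, the way RETURNING would.
    """
    if limit is not None and limit <= 0:
        return
    done = 0
    for row in source:
        yield apply_update(row)
        done += 1
        if limit is not None and done >= limit:
            return  # stop pulling from the subplan once the limit is hit


rows = [{"id": i, "flag": False} for i in range(5)]
updated = list(modify_table(rows, lambda r: {**r, "flag": True}, limit=2))
```

Here only the first two rows the subplan happens to produce are updated;
which rows those are depends entirely on the order the subplan returns them.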

.marko


From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Marko Tiikkaja <marko(at)joh(dot)to>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-10 09:25:35
Message-ID: 5410190F.2040909@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/09/10 16:57), Marko Tiikkaja wrote:
> On 2014-09-10 04:25, Etsuro Fujita wrote:
>> (2014/09/09 18:57), Heikki Linnakangas wrote:
>>> What's not clear to me is whether it makes sense to do 1) without 2) ? Is
>>> UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
>>> apply this patch now, how much of it needs to be rewritten after 2) ? If
>>> the answers are "yes" and "not much", then we should review this patch
>>> now, and put 2) on the TODO list. Otherwise 2) should be done first.
>>
>> My answers are "yes" but "completely rewritten".
>
> Any particular reason for you to say that? Because an UPDATE might have
> a RETURNING clause, all the updated tuples have to go through the
> ModifyTable node one at a time. I don't see why we couldn't LIMIT there
> after implementing #2.

The reason is that I think that, after implementing #2, we should
re-implement this feature by extending the planner to produce a plan
tree such as ModifyTable+Limit+Append, maybe with LockRows below the
Limit node. Such an approach would prevent duplication of the LIMIT
code in ModifyTable, making the ModifyTable code simpler, I think.
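
As a rough illustration of that plan shape, here is a toy model in Python,
not planner or executor code. The functions append, limit_node and
modify_table are hypothetical stand-ins for the Append, Limit and
ModifyTable nodes, and LockRows is left out. The point is only that the
row-count logic lives in one generic node and ModifyTable needs none of its
own.

```python
from itertools import chain, islice


def append(*children):
    """Toy Append node: concatenate the row streams of all child tables."""
    return chain.from_iterable(children)


def limit_node(subplan, count):
    """Toy Limit node: pass through at most `count` rows."""
    return islice(subplan, count)


def modify_table(subplan, apply_update):
    """Toy ModifyTable node with no LIMIT logic of its own."""
    for row in subplan:
        yield apply_update(row)


# Hypothetical inheritance parent and child, each with two rows.
parent = [{"tbl": "parent", "id": 1}, {"tbl": "parent", "id": 2}]
child = [{"tbl": "child", "id": 3}, {"tbl": "child", "id": 4}]

# ModifyTable <- Limit <- Append
plan = modify_table(limit_node(append(parent, child), 3),
                    lambda r: {**r, "done": True})
result = list(plan)
```

In this sketch the limit applies across the parent and child together, since
the Limit sits above the Append.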

Thanks,

Best regards,
Etsuro Fujita


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-10 09:33:29
Message-ID: 54101AE9.8070402@joh.to
Lists: pgsql-hackers

On 9/10/14 11:25 AM, Etsuro Fujita wrote:
> The reason is that I think that, after implementing #2, we should
> re-implement this feature by extending the planner to produce a plan
> tree such as ModifyTable+Limit+Append, maybe with LockRows below the
> Limit node. Such an approach would prevent duplication of the LIMIT
> code in ModifyTable, making the ModifyTable code simpler, I think.

You can already change *this patch* to do ModifyTable <- Limit <-
LockRows. If we think that's what we want, we should rewrite this patch
right now. This isn't a reason not to implement LIMIT without ORDER BY.

Like I said upthread, I think LockRows is a bad idea, but I'll need to
run some performance tests first. But whichever method we decide to
implement for this patch shouldn't need to be touched when the changes
to UPDATE land, so your reasoning is incorrect.

.marko


From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Marko Tiikkaja <marko(at)joh(dot)to>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-10 10:05:25
Message-ID: 54102265.9040105@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/09/10 18:33), Marko Tiikkaja wrote:
> On 9/10/14 11:25 AM, Etsuro Fujita wrote:
>> The reason is that I think that, after implementing #2, we should
>> re-implement this feature by extending the planner to produce a plan
>> tree such as ModifyTable+Limit+Append, maybe with LockRows below the
>> Limit node. Such an approach would prevent duplication of the LIMIT
>> code in ModifyTable, making the ModifyTable code simpler, I think.

> You can already change *this patch* to do ModifyTable <- Limit <-
> LockRows. If we think that's what we want, we should rewrite this patch
> right now.

I think it might be relatively easy to do that for single-table cases,
but for inheritance cases, inheritance_planner needs work and I'm not
sure how much work it would take ...

> Like I said upthread, I think LockRows is a bad idea, but I'll need to
> run some performance tests first. But whichever method we decide to
> implement for this patch shouldn't need to be touched when the changes
> to UPDATE land, so your reasoning is incorrect.

Yeah, as you say, we need the performance tests, and I think it would
depend on those results whether LockRows is a bad idea or not.

Thanks,

Best regards,
Etsuro Fujita


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-10 10:18:41
Message-ID: 54102581.2020207@joh.to
Lists: pgsql-hackers

On 9/10/14 12:05 PM, Etsuro Fujita wrote:
> (2014/09/10 18:33), Marko Tiikkaja wrote:
>> You can already change *this patch* to do ModifyTable <- Limit <-
>> LockRows. If we think that's what we want, we should rewrite this patch
>> right now.
>
> I think it might be relatively easy to do that for single-table cases,
> but for inheritance cases, inheritance_planner needs work and I'm not
> sure how much work it would take ...

Uh. Yeah, I think I'm an idiot and you're right.

I'll try and get some benchmarking done and see what comes out.

.marko


From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-16 08:05:41
Message-ID: 5417EF55.7020702@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/08/15 6:18), Rukh Meski wrote:
> Based on the feedback on my previous patch, I've separated only the
> LIMIT part into its own feature. This version plays nicely with
> inheritance. The intended use is splitting up big UPDATEs and DELETEs
> into batches more easily and efficiently.

IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT. Is
that OK? When we support ORDER BY ... LIMIT/OFFSET, we will also be
allowing for OFFSET with UPDATE/DELETE ... LIMIT. So, ISTM it would be
better for the patch to support OFFSET at this point. No?

Thanks,

Best regards,
Etsuro Fujita


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-16 15:31:26
Message-ID: 1410881486.45923.YahooMailNeo@web122302.mail.ne1.yahoo.com
Lists: pgsql-hackers

Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> (2014/08/15 6:18), Rukh Meski wrote:
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature. This version plays nicely with
>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT. Is
> that OK? When we support ORDER BY ... LIMIT/OFFSET, we will also be
> allowing for OFFSET with UPDATE/DELETE ... LIMIT. So, ISTM it would be
> better for the patch to support OFFSET at this point. No?

Without ORDER BY you really would have no idea *which* rows the
OFFSET would be skipping. Even more dangerously, you might *think*
you do, and get a surprise when you see the results (if, for
example, a seqscan starts at a point other than the start of the
heap, due to a concurrent seqscan from an unrelated query). It
might be better not to provide an illusion of a degree of control
you don't have, especially for UPDATE and DELETE operations.
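
A toy model of that surprise, in Python rather than anything resembling the
real scan code: seqscan and offset_limit are hypothetical helpers, and the
"heap" is just a list. It assumes only that a sequential scan may begin
somewhere other than the first row and wrap around.

```python
from itertools import islice


def seqscan(heap, start=0):
    """Toy sequential scan that may start mid-heap and wrap around,
    loosely modelling a scan that joins one already in progress."""
    n = len(heap)
    for i in range(n):
        yield heap[(start + i) % n]


def offset_limit(rows, offset, limit):
    """Skip `offset` rows, then return at most `limit` rows."""
    return list(islice(rows, offset, offset + limit))


heap = list(range(10))

# Same OFFSET 2 LIMIT 3, but the scan starts at different positions.
a = offset_limit(seqscan(heap, start=0), offset=2, limit=3)  # [2, 3, 4]
b = offset_limit(seqscan(heap, start=6), offset=2, limit=3)  # [8, 9, 0]
```

The same OFFSET and LIMIT pick entirely different rows in the two runs,
which is the illusion of control I mean.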

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-16 16:58:15
Message-ID: CA+Tgmobr8g1i+RNXDNgKcEm9D7oV2jz4BPR2C0sx3CKF486mSA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> (2014/08/15 6:18), Rukh Meski wrote:
>>> Based on the feedback on my previous patch, I've separated only the
>>> LIMIT part into its own feature. This version plays nicely with
>>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>>> into batches more easily and efficiently.
>>
>> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT. Is
>> that OK? When we support ORDER BY ... LIMIT/OFFSET, we will also be
>> allowing for OFFSET with UPDATE/DELETE ... LIMIT. So, ISTM it would be
>> better for the patch to support OFFSET at this point. No?
>
> Without ORDER BY you really would have no idea *which* rows the
> OFFSET would be skipping. Even more dangerously, you might *think*
> you do, and get a surprise when you see the results (if, for
> example, a seqscan starts at a point other than the start of the
> heap, due to a concurrent seqscan from an unrelated query). It
> might be better not to provide an illusion of a degree of control
> you don't have, especially for UPDATE and DELETE operations.

Fair point, but I'd lean toward including it. I think we all agree
the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
meaning. If we don't include it now, we'll just end up adding it
later. It makes for fewer patches, and fewer changes for users, if we
do it all at once.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-24 02:22:15
Message-ID: 54222AD7.70708@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/09/17 1:58), Robert Haas wrote:
> On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>> Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>> (2014/08/15 6:18), Rukh Meski wrote:
>>>> Based on the feedback on my previous patch, I've separated only the
>>>> LIMIT part into its own feature. This version plays nicely with
>>>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>>>> into batches more easily and efficiently.
>>>
>>> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT. Is
>>> that OK? When we support ORDER BY ... LIMIT/OFFSET, we will also be
>>> allowing for OFFSET with UPDATE/DELETE ... LIMIT. So, ISTM it would be
>>> better for the patch to support OFFSET at this point. No?
>>
>> Without ORDER BY you really would have no idea *which* rows the
>> OFFSET would be skipping. Even more dangerously, you might *think*
>> you do, and get a surprise when you see the results (if, for
>> example, a seqscan starts at a point other than the start of the
>> heap, due to a concurrent seqscan from an unrelated query). It
>> might be better not to provide an illusion of a degree of control
>> you don't have, especially for UPDATE and DELETE operations.
>
> Fair point, but I'd lean toward including it. I think we all agree
> the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
> meaning. If we don't include it now, we'll just end up adding it
> later. It makes for fewer patches, and fewer changes for users, if we
> do it all at once.

I agree with Robert.

Rukh, what do you think as an author?

Thanks,

Best regards,
Etsuro Fujita


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Rukh Meski <rukh(dot)meski(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIMIT for UPDATE and DELETE
Date: 2014-09-27 07:16:19
Message-ID: 54266443.6070802@vmware.com
Lists: pgsql-hackers

On 09/24/2014 05:22 AM, Etsuro Fujita wrote:
> (2014/09/17 1:58), Robert Haas wrote:
>> On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>> Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> (2014/08/15 6:18), Rukh Meski wrote:
>>>>> Based on the feedback on my previous patch, I've separated only the
>>>>> LIMIT part into its own feature. This version plays nicely with
>>>>> inheritance. The intended use is splitting up big UPDATEs and DELETEs
>>>>> into batches more easily and efficiently.
>>>>
>>>> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT. Is
>>>> that OK? When we support ORDER BY ... LIMIT/OFFSET, we will also be
>>>> allowing for OFFSET with UPDATE/DELETE ... LIMIT. So, ISTM it would be
>>>> better for the patch to support OFFSET at this point. No?
>>>
>>> Without ORDER BY you really would have no idea *which* rows the
>>> OFFSET would be skipping. Even more dangerously, you might *think*
>>> you do, and get a surprise when you see the results (if, for
>>> example, a seqscan starts at a point other than the start of the
>>> heap, due to a concurrent seqscan from an unrelated query). It
>>> might be better not to provide an illusion of a degree of control
>>> you don't have, especially for UPDATE and DELETE operations.
>>
>> Fair point, but I'd lean toward including it. I think we all agree
>> the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
>> meaning. If we don't include it now, we'll just end up adding it
>> later. It makes for fewer patches, and fewer changes for users, if we
>> do it all at once.
>
> I agree with Robert.
>
> Rukh, what do you think as an author?

I have marked this as "returned with feedback" in the commitfest app.
What I'd like to see happen next is:

Rewrite how UPDATEs work, per Tom's suggestion here:
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us. Then
implement ORDER BY ... LIMIT on top of that.

A lot of people would also be willing to settle for just implementing
LIMIT without ORDER BY, as a stopgap measure. But the UPDATE rewrite is
what would make everyone most happy.

- Heikki