Re: TABLESAMPLE patch

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: TABLESAMPLE patch
Date: 2015-04-06 13:07:10
Message-ID: CAA4eK1JvyocCZiDb=DXJBVeON1ppvtN1VUWK-BChG8=D8vh8EA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 6, 2015 at 5:56 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>
> On 06/04/15 12:33, Amit Kapila wrote:
>>
>>
>> But I think the Update on target table with sample scan is
>> supported via views which doesn't seem to be the right thing
>> in case you just want to support it via FROM/USING, example
>>
>> postgres=# create view vw_test As select * from test_tablesample
>> TABLESAMPLE sys
>> tem(30);
>> postgres=# explain update vw_test set id = 4;
>> QUERY PLAN
>>
---------------------------------------------------------------------------
>> Update on test_tablesample (cost=0.00..4.04 rows=4 width=210)
>> -> Sample Scan on test_tablesample (cost=0.00..4.04 rows=4
width=210)
>> (2 rows)
>>
>
> Right, I'll make those views not auto-updatable.
>
>>
>> > Standard is somewhat useless for UPDATE and DELETE as it only defines
>> quite limited syntax there. From what I've seen when doing research
>> MSSQL also only supports it in their equivalent of FROM/USING list,
>> Oracle does not seem to support their SAMPLING clause outside of SELECTs
>> at all and if I got the cryptic DB2 manual correctly I think they don't
>> support it outside of (sub)SELECTs either.
>> >
>>
>> By the way, what is the usecase to support sample scan in
>> Update or Delete statement?
>>
>
> Well for the USING/FROM part the use-case is same as for SELECT -
providing sample of the data for the query (it can be useful also for
getting pseudo random rows fast). And if we didn't support it, it could
still be done using sub-select so why not have it directly.
>

I can understand why someone wants to read sample data via
SELECT, but not clearly able to understand, why some one wants
to Update or Delete random data in table and if there is a valid
case, then why just based on sub-selects used in where clause
or table reference in FROM/USING list. Can't we keep it simple
such that either we support to Update/Delete based on Tablesample
clause or prohibit it in all cases?

>> Also, isn't it better to mention in the docs for Update and
>> Delete incase we are going to support tablesample clause
>> for them?
>>
>
> Most of other clauses that we support in FROM are not mentioned in
UPDATE/DELETE docs, both of those commands just say something like "refer
to the SELECT FROM docs for more info". Do you think TABLESAMPLE deserves
special treatment in this regard?
>

Nothing too important, just as I got confused while using,
someone else can also get confused, but I think we can leave
it.

>>
>> > And we do this type of coercion even for table data (you can insert
>> -2.3 into integer column and it will work) so I don't see what's wrong
>> with it here.
>> >
>>
>> I am not sure we can compare it with column of a table. I think we
>> can support it within a valid range (similar to tablesample method) and
>> if user inputs value outside the range, then return error.
>>
>
> But that's not what standard says, it says any numeric value expression
is valid. The fact that Oracle limits it to some range should not make us
do the same. I think most important thing here is that using -2.3 will
produce same results if called repeatedly (if there are no changes to data,
vacuum etc). Yes passing -2 will produce same results, I don't know if that
is a problem. The main reason why I have the coercion there is so that
users don't have to explicitly typecast expression results.
>

Actually, not a big point, but I felt it will be clear if there is a valid
range and actually we are not doing anything with negative (part)
of seed input by the user.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-04-06 13:16:36 Re: initdb -S and tablespaces
Previous Message Alvaro Herrera 2015-04-06 13:01:59 Re: pg_rewind and log messages