Re: gSoC - ADD MERGE COMMAND - code patch submission

From: Boxuan Zhai <bxzhai2010(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: gSoC - ADD MERGE COMMAND - code patch submission
Date: 2010-07-17 12:31:44
Message-ID: AANLkTimmu8tng5Bkxw6bDlpIMoYZmBMlExEAVgb3nqq9@mail.gmail.com
Lists: pgsql-hackers

Hi,

I have just moved my modifications onto the latest git version and generated
a patch file with git diff as my second submission. I think the format is
much better than that of my last submission.

As I mentioned before, our work has now reached the executor. So far, the
executor can accept the top-level query and return tuples for it. The next
step is to add action qualification evaluation on the returned tuple slot.

Thanks

Boxuan

2010/7/17 Boxuan Zhai <bxzhai2010(at)gmail(dot)com>

>
>
> ---------- Forwarded message ----------
> From: Boxuan Zhai <bxzhai2010(at)gmail(dot)com>
> Date: 2010/7/17
> Subject: Re: [HACKERS] gSoC - ADD MERGE COMMAND - code patch submission
> To: Simon Riggs <simon(at)2ndquadrant(dot)com>
>
>
>
>
> 2010/7/17 Simon Riggs <simon(at)2ndquadrant(dot)com>
>
> On Fri, 2010-07-16 at 08:26 +0800, Boxuan Zhai wrote:
>> > The merge actions are transformed into lower-level queries. I create a
>> > Query node for each of them and append them to a newly created List
>> > field, mergeActQry. The action queries have different command types and
>> > specific target lists and qual lists, according to their declaration by
>> > the user. But they all share the same range table. This is because we
>> > don't need the action queries to be planned later. The joining
>> > strategy is decided by the top query. We are only interested in their
>> > specific action qualifications. In other words, these action queries
>> > are only containers for their target lists and qualifications.
>> >
>> > 2. When the query is ready, it will be sent to the rewriter. In this part,
>> > we can call RewriteQuery() to handle the action queries. The UPDATE
>> > action will trigger rules on UPDATE, and so on. Two things need to be
>> > noted: 1. Actions of the same type should not be rewritten
>> > repeatedly. If there are two UPDATE actions in the merge command, we
>> > should not trigger the ON UPDATE rules twice. 2. If an action type is
>> > fully replaced by rules, we should remove all actions of this type
>> > from the action list.
>> > The rewriter will also do some processing on the target list of each action.
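
The two rewriter constraints described above can be sketched roughly as
follows. This is only an illustration in Python, not the patch's C code:
the names Action and rewrite_merge_actions and the shape of the rules
dictionary are all hypothetical, and a rule is assumed to "fully replace"
an action type when it is flagged as an INSTEAD rule.

```python
from dataclasses import dataclass


@dataclass
class Action:
    cmd: str          # "UPDATE", "DELETE" or "INSERT"
    qual: str = ""    # additional action qualification (kept as text here)


def rewrite_merge_actions(actions, rules):
    """Fire rules once per action type; drop types fully replaced by rules.

    `rules` is a hypothetical mapping: cmd -> {"instead": bool, "queries": [...]}.
    Returns (remaining actions, extra rule queries to run separately).
    """
    rule_queries = []
    seen_types = set()
    replaced_types = set()
    for action in actions:
        if action.cmd in seen_types:
            continue                      # same type: do not trigger rules twice
        seen_types.add(action.cmd)
        rule = rules.get(action.cmd)
        if rule is None:
            continue
        rule_queries.extend(rule["queries"])
        if rule["instead"]:
            replaced_types.add(action.cmd)
    # Remove every action whose type was fully replaced by rules.
    remaining = [a for a in actions if a.cmd not in replaced_types]
    return remaining, rule_queries
```

With two UPDATE actions and one rule on UPDATE, the rule's queries appear
only once in the second return value, while both UPDATE actions stay in
the action list unless the rule is an INSTEAD rule.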
>>
>> IMHO it is a bad thing that we are attempting to execute each action
>> statement as a query. That means we need to execute an inner SQL
>> statement for each row returned by the top level query.
>>
>> That design makes MERGE similar in performance to an upsert PL/pgsql
>> function, which will perform terribly on large numbers of rows.
>>
> Dear Simon,
>
> Thanks for your feedback. I may not have presented my idea clearly.
> In my design, the merge actions are not executed as separate queries. Only
> the top-level query (that is, a query like "<source table> LEFT JOIN
> <target_table> ON <matching_qual>") will be planned and executed. For each
> tuple returned by this plan, we will choose a proper action for it and do the
> corresponding modification. The tables will only be scanned and joined
> once. A merge action will not do a full run of the table join and then modify
> the table as a standard UPDATE/DELETE/INSERT query would. (Is this what you
> are worried about?)
>
> In fact, for one action, we only need the following information: 1. the
> action type (UPDATE, DELETE or INSERT), 2. the target list, and 3. the
> additional qualifications. A Query node is a perfect container for this
> information. That's why I transform them into Query nodes. But all through
> the analyzer, rewriter, planner and executor, I just call the related
> functions to formalize the expressions in their target lists and qual lists.
> The range table and join tree are determined only by the top-level query;
> they will not be affected by the merge actions.
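
The design described above (one join, then one action chosen per returned
tuple) might look roughly like the sketch below. It is written in Python
for brevity and is not the patch's executor code. MergeAction, left_join
and execute_merge are hypothetical names, tables are plain lists of dicts
with unique keys, and two points are assumptions rather than statements
from this thread: that the first action whose qualification passes is the
one applied, and that INSERT actions apply only to unmatched source rows.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MergeAction:
    """Container only: command type, qualification and target list."""
    cmd: str                                              # "UPDATE", "DELETE", "INSERT"
    qual: Callable[[dict, Optional[dict]], bool]          # additional qualification
    target_list: Callable[[dict, Optional[dict]], dict]   # builds the new column values


def left_join(source, target, key):
    """<source> LEFT JOIN <target> ON key; the target side is None if unmatched."""
    index = {row[key]: row for row in target}
    for src in source:
        yield src, index.get(src[key])


def execute_merge(source, target, key, actions):
    # The join runs exactly once; its output is materialized before any change.
    joined = list(left_join(source, target, key))
    for src, tgt in joined:
        matched = tgt is not None
        for action in actions:
            if (action.cmd == "INSERT") == matched:
                continue                  # assumed: INSERT only for unmatched rows
            if not action.qual(src, tgt):
                continue
            if action.cmd == "INSERT":
                target.append(action.target_list(src, tgt))
            elif action.cmd == "UPDATE":
                tgt.update(action.target_list(src, tgt))
            else:                         # DELETE
                target.remove(tgt)
            break                         # assumed: first qualifying action wins
    return target
```

The actions never trigger a scan or join of their own. They are consulted
only for their command type, qualification and target list, which is the
sense in which they act as containers.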
>
>
>
>
>> This was exactly the point where I stopped implementation previously:
>> attempting to make MERGE work with rules is enough to prevent a tighter
>> in-executor implementation of the action list.
>>
> I am sorry that I don't understand your meaning here clearly.
> As I understand it, if there is a rule on the target table, the rewriter
> will add a new query to the execution queue (or replace the original
> query). I think the rule queries will not affect the processing within the
> original query, because they are totally separate queries which will be run
> before or after the original query. Are you suggesting that we should not
> allow rules on the MERGE command?
>
>
>
>> [To Boxuan, on a personal note, you seem to be coping quite well with
>> the code and the process; congratulations and keep going.]
>>
>>
> Thank you. Your encouragement is very important to me.
>
>
>> --
>>
>> Simon Riggs www.2ndQuadrant.com <http://www.2ndquadrant.com/>
>> PostgreSQL Development, 24x7 Support, Training and Services
>>
>>
>
>

Attachment Content-Type Size
merge_command_submission2.patch text/plain 43.4 KB
