From: | Kedar Potdar <kedar(dot)potdar(at)gmail(dot)com> |
---|---|
To: | Emmanuel Cecchet <manu(at)asterdata(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Amit Gupta <amit(dot)pc(dot)gupta(at)gmail(dot)com> |
Subject: | Re: Partitioning feature ... |
Date: | 2009-03-30 11:51:21 |
Message-ID: | bd8134a40903300451lfbc1cc5w5756140902f2689c@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Emmanuel,
Thanks for your time. This is a WIP patch and we will integrate your
suggestions/comments as appropriate.
Regards,
--
Kedar.
On Fri, Mar 27, 2009 at 3:38 AM, Emmanuel Cecchet <manu(at)asterdata(dot)com>wrote:
> Hi Kedar,
>
> First of all, congratulations for the excellent work.
> I have some comments and questions.
>
> In get_relevent_partition (btw, relevant is spelled with an a) you are
> maintaining 2 lists.
> Oops! 'a' typographical error.
I guess this is only useful for multi-column partitions, right?
> If you have a single column partition (without subpartitions), I think you
> could directly return on the first match (without maintaining any list)
> since you guarantee that there is no overlap between partitions.
> A simple but effective optimization for inserts consists of caching the
> last partition used (consecutive inserts often go to the same partition) and
> try it first before going through the whole loop.
> Yep.
>
>
> The update trigger should first check if the tuple needs to be moved. If
> the updated tuple still matches the constraints of the partitions it will
> not have to be moved and will save a lot of overhead.
> Yes. We agree on that.
>
>
> The COPY operation should probably be optimized to use the same code as the
> one in the insert trigger for partitioned tables. I guess some code could be
> factorized in COPY to make the inserts more efficient.
>
> The current trigger approach should prevent other triggers to be added to
> the table, or you should make sure that the partition trigger is always the
> one to execute last.
> As triggers are executed in order of their names, we've prefixed the
trigger names with "zz". This should work fine as long as no-one uses
trigger-name which starts with "zz".
>
> As we don't have automatic partition creation, it would be interesting to
> have an optional mechanism to deal with tuples that don't match any
> partition (very useful when you do bulk insert and some new data require a
> new partition). Having a simple overflow partition or an error logging
> mechanism would definitely help to identify these tuples and prevent things
> like large COPY operations to fail.
> Will get back on this.
>
>
>
> Looking forward to your responses,
> Emmanuel
>
>
>> We are implementing table partitioning feature to support Range and Hash
>> partitions. Please find attached, the WIP patch and test-cases.
>>
>> The syntax used conforms to most of the suggestions mentioned in
>> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php,
>> barring the following:
>> -- Specification of partition names is optional. System will be able to
>> generate partition names in such cases.
>> -- Sub partitioning
>>
>> We are maintaining a system catalog(pg_partition) for partition meta-data.
>> System will look-up this table to find appropriate partition to operate on.
>> System internally uses low-level 'C' triggers to row-movement.
>>
>> Regards,
>> --
>> Kedar.
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>>
>
>
> --
> Emmanuel Cecchet
> Aster Data Systems
> Web: http://www.asterdata.com
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2009-03-30 11:55:24 | Re: PQinitSSL broken in some use casesf |
Previous Message | Marko Kreen | 2009-03-30 10:11:48 | 8.3.5: Crash in CountActiveBackends() - lockless race? |