Re: On partitioning

From: "Amit Langote" <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>
Cc: "'Andres Freund'" <andres(at)2ndquadrant(dot)com>, "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>, "'Bruce Momjian'" <bruce(at)momjian(dot)us>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On partitioning
Date: 2014-11-20 03:27:30
Message-ID: 032601d00471$ea6f4280$bf4dc780$@lab.ntt.co.jp
Lists: pgsql-hackers


Robert,

>
> I thought putting the partition boundaries into pg_inherits was a
> strange choice. I'd put it in pg_class, or in pg_partition if we
> decide to create that.

Hmm, yes, I guess we are better off using pg_inherits only to record that a partition is an inheritance child. Other details should go elsewhere for sure.

> Maybe as anyarray, but I think pg_node_tree
> might even be better. That can also represent data of some arbitrary
> type, but it doesn't enforce that everything is uniform. So you could
> have a list of objects of the form {RANGEPARTITION :lessthan {CONST
> ...} :partition 16982} or similar. The relcache could load that up
> and convert the list to a C array, which would then be easy to
> binary-search.
>
> As you say, you also need to store the relevant operator somewhere,
> and the fact that it's a range partition rather than list or hash,
> say.
>
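
To make the binary-search idea above concrete, here is a minimal sketch of the lookup once the relcache has flattened the list into a sorted C array. It assumes integer keys and plain `<` comparison purely for illustration; a real implementation would go through the partitioning opclass's comparison function. The names RangeBound and range_partition_for are hypothetical, not anything that exists in the tree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical in-memory form of one {RANGEPARTITION :lessthan ... :partition ...}
 * entry. The key type is int only to keep the sketch short.
 */
typedef struct RangeBound
{
    int          lessthan;   /* exclusive upper bound of this partition */
    unsigned int partition;  /* OID of the partition's relation */
} RangeBound;

/*
 * Return the index of the partition that accepts "key", i.e. the first
 * entry whose lessthan bound is greater than key, or -1 if key is not
 * below any bound. "bounds" must be sorted ascending by lessthan.
 */
static int
range_partition_for(const RangeBound *bounds, int nbounds, int key)
{
    int lo = 0;
    int hi = nbounds;

    assert(nbounds >= 0);
    assert(bounds != NULL || nbounds == 0);

    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (key < bounds[mid].lessthan)
            hi = mid;       /* bounds[mid] accepts key; look for an earlier one */
        else
            lo = mid + 1;   /* key is at or above this bound; go right */
    }

    return (lo < nbounds) ? lo : -1;
}
```

With bounds of 10, 20 and 30, a key of 10 lands in the second partition (its bound 20 is the first one above 10), and a key of 30 finds no partition, which is where an overflow/default partition would come in.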

I'm wondering whether it would be better to store partition values per partition, using two catalogs: say, pg_partitioned_rel and pg_partition_def.

pg_partitioned_rel would store information such as the partition kind, the key (attribute number(s)?), and the key opclass(es). Optionally, it could also say whether a given record represents an actual top-level partitioned table or a partition that is itself sub-partitioned (in which case the record is just a dummy holding the sub-partitioning keys and such), using something like partisdummy.

pg_partition_def would store information about individual partitions (and sub-partitions, too?), such as each one's parent (either an actual top-level partitioned table or a sub-partitioning template), whether it is an overflow/default partition, and its partition values.
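
As a rough sketch only, the two catalogs described above might map onto row layouts like the following. Apart from partisdummy, every field name is made up for illustration, the Oid and AttrNumber typedefs are local stand-ins for the PostgreSQL types, MAX_PART_KEYS is an arbitrary limit, and keeping the partition values as a string is just a placeholder for whatever representation (pg_node_tree, anyarray, ...) is eventually chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;        /* local stand-in for PostgreSQL's Oid */
typedef short        AttrNumber; /* local stand-in for PostgreSQL's AttrNumber */

#define MAX_PART_KEYS 4          /* arbitrary limit, for the sketch only */

typedef enum PartitionKind
{
    PART_KIND_RANGE = 'r',
    PART_KIND_LIST  = 'l',
    PART_KIND_HASH  = 'h'
} PartitionKind;

/* One row of the hypothetical pg_partitioned_rel catalog. */
typedef struct PartitionedRel
{
    Oid           partrelid;                  /* partitioned table or sub-partitioning template */
    PartitionKind partkind;                   /* range, list or hash */
    int           partnatts;                  /* number of key columns in use */
    AttrNumber    partkey[MAX_PART_KEYS];     /* key attribute numbers */
    Oid           partopclass[MAX_PART_KEYS]; /* opclass of each key column */
    bool          partisdummy;                /* true if only a holder for sub-partitioning keys */
} PartitionedRel;

/* One row of the hypothetical pg_partition_def catalog. */
typedef struct PartitionDef
{
    Oid         partitionrelid; /* the partition itself */
    Oid         parentid;       /* refers to PartitionedRel.partrelid */
    bool        isdefault;      /* overflow/default partition? */
    const char *partvalues;     /* serialized partition values (placeholder) */
} PartitionDef;

/* Does this partition definition hang off the given partitioned relation? */
static bool
partition_def_has_parent(const PartitionDef *def, const PartitionedRel *rel)
{
    assert(def != NULL && rel != NULL);
    return def->parentid == rel->partrelid;
}
```

Each partition row carries its own values and points at its parent, so the top-level record only needs to hold the kind, keys and opclasses.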

Such a scheme would be similar to what Greenplum [1] has.

Admittedly, this perhaps duplicates what inheritance already records, and could be argued against on that basis.

Do you think keeping the partition-defining values with the top-level partitioned table would make some partitioning schemes (multi-key, sub-partitioning, etc.) a bit complicated to implement? Offhand, I cannot imagine the actual implementation difficulties that might be involved, but perhaps you have a better idea of such details and would have a say...

Thanks,
Amit

[1] http://gpdb.docs.pivotal.io/4330/index.html#ref_guide/system_catalogs/pg_partition_rule.html

http://gpdb.docs.pivotal.io/4330/index.html#ref_guide/system_catalogs/pg_partition.html
