Re: Syntax for partitioning

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Subject: Re: Syntax for partitioning
Date: 2011-11-12 13:22:52
Message-ID: 20111112132252.GB25874@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 10, 2011 at 10:19:02PM +0100, Dimitri Fontaine wrote:
> Now the aim would be to be able to implement the operation you describe
> by using the new segment map, which is an index pointing to sequential
> ranges of on-disk blocks where the data is known to share a common key
> range over the columns you're segmenting on. I would imagine this SQL:
>
> TRUNCATE foo WHERE year < 2009;
>
> As the on-disk location of the data that qualify this WHERE clause is
> known, it could be possible to (predicate) lock it and bulk remove it,
> unlinking whole segments (1GB) at a time when relevant.

While I agree that explicit partitioning is somewhat of a hack, it's a
really useful hack. But for me the most important use of partitioning
is "dropping a billion rows efficiently and getting the disk space
back". And the biggest problem is always that dropping blocks of a
table requires fixing all the indexes.

For fixing the index of the partition key it's a simpler problem, you
could probably prune the btree relatively efficiently. But for all
other indexes there's no better solution than walk the entire index.

However, in the very special case where the drop boundaries explicitly
match the dataset, you can simply drop all the indexes.

Now, if someone cames up with an efficient way to drop a huge number of
rows quickly, then I admit one of the major issues is fixed. But
recovering the disk space is much harder. Yes, recent versions of
Linux come with ways to punch holes in existing files, but that doesn't
make it quick or efficient.

> > While automatic clustering would be nice, it isn't the same thing as
> > partitioning.
>
> That has been my initial reaction to that kind of ideas too. After some
> more time brewing the ideas, I'm not convinced that the use cases that
> usually drives you to the latter can't be solved with the former.

I hope so, but I'm not sure I'd like partitioning support to wait on
someone hitting on the right idea.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Soudakov 2011-11-12 15:54:42 trivial patch: foreign table either defines row type
Previous Message hubert depesz lubaczewski 2011-11-12 08:58:07 Re: Strange problem with create table as select * from table;