Re: Sequence Access Method WIP

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Vik Fearing <vik(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: hlinnaka(at)iki(dot)fi, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequence Access Method WIP
Date: 2015-06-17 15:51:26
Message-ID: 5581977E.3080302@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-06-15 11:32, Vik Fearing wrote:
> I've been looking at these patches a bit and here are some comments:
>

Thanks for looking at this.

> Patch 1: seqam
>
> I would like to see an example in the docs for CREATE SEQUENCE. That's
> perhaps not possible (or desirable) with only the "local" seqam? Not sure.
>

It is possible to have example with local seqam, it might be confusing
though, given it produces same results as not putting USING in the query.

> In the docs for pg_class, there is no mention that relam refers to
> pg_seqam for sequences but pg_am for all other types.
>
> There are errant half-sentences in the documentation, for example "to
> the options in the CREATE SEQUENCE or ALTER SEQUENCE statement." in
> Sequence Access Method Functions.

I think that's the side effect of all the rebases and rewrites over the
2y(!) that this has been going forward and back. It can be easily fixed
by proof reading before final submission. I didn't pay too much
attention yet because it's not clear how the docs should look like if
there is no real agreement on the api. (This applies to other comments
about docs as well)

>
> I'd prefer a README instead of the long comment at the start of seqam.c.
> The other ams do that.
>

OK, since things have been moved to separate directory, README is
doable, I personally prefer the docs in the main .c file usually but I
know project uses README sometimes for this.

> As mentioned upthread, this patch isn't a seamless replacement for
> what's already there because of the amdata field. I wasn't part of the
> conversation of FOSDEM unfortunately, and there's not enough information
> in this thread to know why this solution is preferred over each seqam
> having its own table type with all the columns it needs. I see that
> Heikki is waffling a bit between the two, and I have a fairly strong
> opinion that amdata should be split into separate columns. The patch
> already destroys and recreates what it needs when changing access method
> via ALTER SEQUENCE, so I don't really see what the problem is.
>

FOSDEM was just about agreeing that amdata is simpler after we discussed
it with Heikki. Nothing too important you missed there I guess.

I can try to summarize what are the differences:
- amdata is somewhat simpler in terms of code for both init, alter and
DDL, since with custom columns you have to specify them somehow and deal
with them in catalog, also ALTER SEQUENCE USING means that there are
going to be colums marked as deleted which produces needless waste, etc
- amdata make it easier to change the storage model as the tuple
descriptor is same for all sequences
- the separate columns are much nicer from user point of view
- my opinion is that separate columns also more nicely separate state
from options and I think that if we move to separate storage model,
there can be state table per AM which solves the tuple descriptor issue
- there is probably some slight performance benefit to amdata but I
don't think it's big enough to be important

I personally have slight preference to separate columns design, but I am
ok with both ways honestly.

>
> There is no \d command for sequence access methods. Without querying
> pg_seqam directly, how does one discover what's available?
>

Good point.

>
> Patch 2: seqam ddl
>
> When defining a new access method for sequences, it is possible to list
> the arguments multiple times (last one wins). Other defel loops raise
> an error if the argument is specified more than once. I haven't looked
> at all of such loops to see if this is the only odd man out or not, but
> I prefer the error behavior.
>

Hmm yeah, there should be error. I think only tsearch doesn't enforce
errors from the existing stuff, should probably be fixed as well
(separately of course).

>
> Patch 3: gapless_seq
>
> I really like the idea of having a gapless sequence in contrib.
> Although it has big potential to be abused, doing it manually when it's
> needed (like for invoices, at least in France) is a major pain. So big
> +1 for including this.
>

Yeah, people make gapless sequences regardless, it's better to provide
them one that behaves correctly, also it's quite good test for the API.

>
> It would be nice to be able to specify an access method when declaring a
> serial or bigserial column. This could be a separate patch later on,
> though.
>

The patch originally had GUC for this, but Heikki didn't like it so it's
left for the future developments.

>
> On the whole, I think this is a pretty good patchset. Aside from the
> design decision of whether amdata is a single opaque column or a set of
> columns, there are only a few things that need to be changed before it's
> ready for committer, and those things are mostly documentation.
>

Unfortunately the amdata being opaque vs set of columns is the main
issue here.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-06-17 16:31:32 Re: Auto-vacuum is not running in 9.1.12
Previous Message Tomas Vondra 2015-06-17 14:47:28 Re: PATCH: adaptive ndistinct estimator v4