Re: portability of "designated initializers"

Lists: pgsql-hackers
From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: portability of "designated initializers"
Date: 2008-11-22 23:10:29
Message-ID: 20081122231029.GF3813@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I wonder how portable designated initializers are. As far as I can tell
they were only defined in C99. Can we use them in our source? If not,
is there a way to do this in C89?

I mean something like this:

typedef struct foo {
char type;
union {
int ival;
float fval;
} val;
} foo;

static foo foos[2] = {
{
.type = 'i',
.val.ival = 42
} , {
.type = 'f',
.val.fval = 2.78
}
};

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 00:05:57
Message-ID: 16747.1227398757@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I wonder how portable designated initializers are. As far as I can tell
> they were only defined in C99. Can we use them in our source?

I'd vote no. We're still targeting ANSI C (eg, no // comments).

> I mean something like this:

Where/why do you need to do that?

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 00:07:59
Message-ID: 20081123000759.GG3813@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:

> > I mean something like this:
>
> Where/why do you need to do that?

The reloptions patch uses three arrays, one for each type of option
(bool, int, float). I'm wondering if we could use a single array with
all options, and a union containing the values. The only problem with
that (AFAICS) is the initialization.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 00:58:08
Message-ID: 17135.1227401888@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Tom Lane wrote:
>> Where/why do you need to do that?

> The reloptions patch uses three arrays, one for each type of option
> (bool, int, float). I'm wondering if we could use a single array with
> all options, and a union containing the values. The only problem with
> that (AFAICS) is the initialization.

Hmm ... I'd not looked at that patch before, but now that I have I think
it's gone pretty seriously off on the overdesigned-and-inefficient end
of the spectrum. Turning RelationGetFillFactor and friends from simple
macros into functions that are probably *at least* a thousand times slower
than the macros doesn't seem like a good idea at all. Deconstructing a
reloptions datum is supposed to be done once by the relcache, not every
time one of the values is needed. Frankly I'd throw the entire thing
away and go back to a hardwired set of options feeding into a predefined
struct that's held by the relcache and examined by callers.

But as for your immediate point, I don't see that you'd buy enough
notational savings to justify upping our compiler requirement to C99.
All you really need here is to merge the three arrays into a single
array of pointer to struct relopt_gen. It'd be slightly
different-looking from guc.c but only because the pointer array
would be a constant rather than built on the fly.

static const struct relopt_bool relOptAutovacuumEnabled =
{
"autovacuum_enabled",
"Enables autovacuum in this relation",
RELOPT_TYPE_BOOL,
RELOPT_KIND_HEAP,
false
};

static const struct relopt_int relOptFillFactor =
{
"fillfactor",
"Packs table pages only to this percentage",
RELOPT_TYPE_INT,
RELOPT_KIND_HEAP | RELOPT_KIND_INDEX,
100,
10,
100
};

... more ...

static const struct relopt_gen * const relOpts[] =
{
(const struct relopt_gen *) &relOptAutovacuumEnabled,
(const struct relopt_gen *) &relOptFillFactor,
... more ... ,
NULL
};

regards, tom lane


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 01:24:12
Message-ID: 4928B0BC.9070908@timbira.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera escreveu:

> The reloptions patch uses three arrays, one for each type of option
> (bool, int, float). I'm wondering if we could use a single array with
> all options, and a union containing the values. The only problem with
> that (AFAICS) is the initialization.
>
I already tried that and don't use because of portability issues. But I
don't thing 3 (maybe 4 -- when we have string types in it) loops in
sequence are so ugly. IMHO, we can live with that.

--
Euler Taveira de Oliveira
http://www.timbira.com/


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 01:39:32
Message-ID: 4928B454.6030301@timbira.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escreveu:

> Hmm ... I'd not looked at that patch before, but now that I have I think
> it's gone pretty seriously off on the overdesigned-and-inefficient end
> of the spectrum. Turning RelationGetFillFactor and friends from simple
> macros into functions that are probably *at least* a thousand times slower
> than the macros doesn't seem like a good idea at all. Deconstructing a
> reloptions datum is supposed to be done once by the relcache, not every
> time one of the values is needed. Frankly I'd throw the entire thing
> away and go back to a hardwired set of options feeding into a predefined
> struct that's held by the relcache and examined by callers.
>
I think about that but I don't do any benchmarks after I finished the
patch. We can do an exception as the current code do for "oids" and
don't rip out the StdRdOptions just to maitain the
RelationGetFillFactor() and others like a macro. Honestly, I don't like
to bring RelationGet*() to reloptions.c but we can always refactor that
before committing it.

Alvaro, let me know if you want me to send another patch; or will you do
it? I have some small corrections too.

--
Euler Taveira de Oliveira
http://www.timbira.com/


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 03:43:22
Message-ID: 20081123034322.GI3813@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Euler Taveira de Oliveira wrote:
> Tom Lane escreveu:
>
> > Hmm ... I'd not looked at that patch before, but now that I have I think
> > it's gone pretty seriously off on the overdesigned-and-inefficient end
> > of the spectrum. Turning RelationGetFillFactor and friends from simple
> > macros into functions that are probably *at least* a thousand times slower
> > than the macros doesn't seem like a good idea at all. Deconstructing a
> > reloptions datum is supposed to be done once by the relcache, not every
> > time one of the values is needed. Frankly I'd throw the entire thing
> > away and go back to a hardwired set of options feeding into a predefined
> > struct that's held by the relcache and examined by callers.
> >
> I think about that but I don't do any benchmarks after I finished the
> patch. We can do an exception as the current code do for "oids" and
> don't rip out the StdRdOptions just to maitain the
> RelationGetFillFactor() and others like a macro. Honestly, I don't like
> to bring RelationGet*() to reloptions.c but we can always refactor that
> before committing it.

I think the point is not only not regressing in fillfactor's
performance, but also to get good performance for the other options.
The problem with this design is that to find any option you have to
trawl the arrays and do strcmp() on each to find whether it's the one
we're looking for, which seems a loser. I thought about keeping the
array sorted and then doing bsearch, but Tom's right that this is still
a lot slower than a predefined struct.

> Alvaro, let me know if you want me to send another patch; or will you do
> it? I have some small corrections too.

I've already modified your patch a bit ... please send your updated
patch so I can merge it into mine. However, my changes were also
relatively minor. Since Tom wants it to be entirely rewritten then
maybe merging minor fixes to it is a waste of time ...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 04:03:01
Message-ID: 19848.1227412981@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I've already modified your patch a bit ... please send your updated
> patch so I can merge it into mine. However, my changes were also
> relatively minor. Since Tom wants it to be entirely rewritten then
> maybe merging minor fixes to it is a waste of time ...

The thing I'm complaining about is having dropped the intermediate
struct that represents the fully decoded set of reloptions. If you
put that back I won't object to a table-driven approach to doing the
decoding.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-11-23 16:56:54
Message-ID: 25276.1227459414@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> The thing I'm complaining about is having dropped the intermediate
> struct that represents the fully decoded set of reloptions.

After looking at the patch a bit more I have a couple of other comments:

* I disagree with changing the argument of the RelationGetXXX
macros from Relation to bytea*, as in this example for instance:

* after initial build do not.
*/
gistdoinsert(index, itup,
! RelationGetTargetPageFreeSpace(index, GIST_DEFAULT_FILLFACTOR),
&buildstate->giststate);

buildstate->indtuples += 1;
--- 202,208 ----
* after initial build do not.
*/
gistdoinsert(index, itup,
! RelationGetTargetPageFreeSpace(index->rd_options, GIST_DEFAULT_FILLFACTOR),
&buildstate->giststate);

buildstate->indtuples += 1;

This hasn't got anything to recommend it. It forecloses the possibility
of RelationGetXXX looking aside at, say, relkind or relam; as it might
wish to do to select a default value for example. The flip side of that
coin is that the call sites now know more than they need to about where
the value comes from (ie, from rd_options and not any other part of a
relcache entry). And in any case a function named RelationGetFoo
ought to take a Relation.

* On the other hand, a change I'd approve of is to get rid of the
call-site knowledge about appropriate defaults (GIST_DEFAULT_FILLFACTOR
in this example). If we're going to try to make the thing table-driven
then it's natural to put the default values into the table. You'd need
a slightly more complex table to support defaults that vary across index
AMs, but you really need that anyway since not all options might apply
to all AMs in the first place. I think the place this would end up is
that an rd_options struct would always be created for every relcache
entry, containing default values if nothing was supplied in pg_class.
Or maybe we should just get rid of rd_options and put all the options
fields directly into RelationData? The existing design is predicated
on an assumption that rd_options would be omitted most of the time,
but if we're always going to have it there then it's not real clear
that a separate palloc step buys much.

regards, tom lane


From: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-12-05 03:43:58
Message-ID: 4938A37E.9080301@timbira.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera escreveu:

> I've already modified your patch a bit ... please send your updated
> patch so I can merge it into mine. However, my changes were also
> relatively minor. Since Tom wants it to be entirely rewritten then
> maybe merging minor fixes to it is a waste of time ...
>
Since Alvaro is involved in finishing another patches, I improved the first
patch based on discussion. What did I do?

(i) change from relOptXXX arrays to relOpts as Tom pointed out in [1];
(ii) fix some memory alloc/free problems;
(iii) add some code to uniform the output of boolean parameters (output will
be true/false);
(iv) change the *_base_thresh names to *_threshold. I know the names are not
appropriated but postgresql.conf use similar ones;
(v) rollback the RelationGetXXX changes as Tom pointed out in [2];
(vi) fillfactor is stored as a separate variable in RelationData;
(vi) include documentation.

I did some small benchmarks to figure out if storing the parameters as a bytea
(rd_options) has any performance penalty; i don't see any significant
difference but my laptop is not a good parameter. Comments?

PS> I don't include regression tests but if you think it's important to
exercise that code path, I will elaborate some tests.

[1] http://archives.postgresql.org/pgsql-hackers/2008-11/msg01494.php
[2] http://archives.postgresql.org/pgsql-hackers/2008-11/msg01500.php

--
Euler Taveira de Oliveira
http://www.timbira.com/

Attachment Content-Type Size
reloptv2.diff.gz application/x-gzip 22.0 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-12-05 13:19:55
Message-ID: 20081205131955.GC3755@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Euler Taveira de Oliveira wrote:
> Alvaro Herrera escreveu:
>
> > I've already modified your patch a bit ... please send your updated
> > patch so I can merge it into mine. However, my changes were also
> > relatively minor. Since Tom wants it to be entirely rewritten then
> > maybe merging minor fixes to it is a waste of time ...
> >
> Since Alvaro is involved in finishing another patches, I improved the first
> patch based on discussion. What did I do?

Thanks. Sorry for being unable to work on this.

> (iv) change the *_base_thresh names to *_threshold. I know the names are not
> appropriated but postgresql.conf use similar ones;

Oh, please spell the names in full (e.g. vac_threshold to
vacuum_threshold; anl -> analyze, etc).

I'll try to have a quick look at this version today.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-12-10 20:17:22
Message-ID: 20081210201722.GI5503@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > I've already modified your patch a bit ... please send your updated
> > patch so I can merge it into mine. However, my changes were also
> > relatively minor. Since Tom wants it to be entirely rewritten then
> > maybe merging minor fixes to it is a waste of time ...
>
> The thing I'm complaining about is having dropped the intermediate
> struct that represents the fully decoded set of reloptions. If you
> put that back I won't object to a table-driven approach to doing the
> decoding.

Do we need a struct, or can we get away with storing the values directly
in RelationData? Something like this:

typedef struct RelationData
{
...

/*
* Parsed values from reloptions. These are set whenever rd_rel is loaded
* into the relcache entry.
*/
bool autovacuum_enabled;
int fillfactor;
int autovacuum_vacuum_threshold;
int autovacuum_analyze_threshold;
int autovacuum_vacuum_cost_delay;
int autovacuum_vacuum_cost_limit;
int autovacuum_freeze_min_age;
int autovacuum_freeze_max_age;
float8 autovacuum_vacuum_scale_factor;
float8 autovacuum_analyze_scale_factor;

One problem is that this enlarges Relation by a fair bit for every
relation, even those that are never concerned with autovacuum, e.g.
indexes. I'm not sure how much is this a problem in practice. (It
could become a problem for people who has thousands of partitions, but
we don't actually support that.)

I have also modified the patch to include the default value for each
option in the table; to solve the problem of differing fillfactors for
the different index AMs, I've created separated "kinds" this way:

/* kind supported by reloptions */
#define RELOPT_KIND_NULL (1 << 0)
#define RELOPT_KIND_HEAP (1 << 1)
#define RELOPT_KIND_BTREE (1 << 2)
#define RELOPT_KIND_HASH (1 << 3)
#define RELOPT_KIND_GIN (1 << 4)
#define RELOPT_KIND_GIST (1 << 5)

so we can do something like this:

{
{
"fillfactor",
"Packs table pages only to this percentage",
RELOPT_TYPE_INT, RELOPT_KIND_HEAP
},
HEAP_DEFAULT_FILLFACTOR, HEAP_MIN_FILLFACTOR, 100
},
{
{
"fillfactor",
"Packs btree index pages only to this percentage",
RELOPT_TYPE_INT, RELOPT_KIND_BTREE
},
BTREE_DEFAULT_FILLFACTOR, BTREE_MIN_FILLFACTOR, 100
},
... more for the other AMs ...

Is that approach OK?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: portability of "designated initializers"
Date: 2008-12-10 20:50:32
Message-ID: 7708.1228942232@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Do we need a struct, or can we get away with storing the values directly
> in RelationData? Something like this:

The intention behind having a separate struct was that there could
possibly be different sets of reloptions for different types of
relations, different index AMs, etc. Pushing the fields directly
into Relation puts the kibosh on that permanently, without really
buying much AFAICS.

> One problem is that this enlarges Relation by a fair bit for every
> relation, even those that are never concerned with autovacuum, e.g.
> indexes. I'm not sure how much is this a problem in practice. (It
> could become a problem for people who has thousands of partitions, but
> we don't actually support that.)

Sure we do. Check the archives to find messages from people who are
using thousands of relations right now.

> I have also modified the patch to include the default value for each
> option in the table; to solve the problem of differing fillfactors for
> the different index AMs, I've created separated "kinds" this way:

Right, I was thinking along the same lines. Though this pretty much
destroys any notion that the set of index AMs can be extended without
modifying the core code ... can we adjust it to improve that?

regards, tom lane