Planning a change of representation in the planner

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Planning a change of representation in the planner
Date: 2003-02-07 04:35:48
Message-ID: 23775.1044592548@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Currently, the planner spends a good deal of time pushing around lists
of small integers, because it uses such lists to identify join
relations. For example, given SELECT ... FROM a, b, c WHERE ...
the list (1,2) (or equivalently (2,1)) would represent the join of
a and b.

This representation is pretty clearly a hangover from the days when the
Postgres planner was written in Lisp :-(. It's inefficient --- common
operations like union, intersection, and is-subset tests require O(N^2)
steps. And it's error-prone: I just had my nose rubbed once again in
the nasty things that happen if you accidentally get some duplicate
entries in a relation ID list. (It's nasty because some but not all of
the low-level list-as-set operations depend on the assumption that the
elements of a given list are distinct.)

I'm thinking of replacing this representation by a
variable-length-bitmap representation. Basically it would be like

struct bitmapset {
int nwords; /* number of words in array */
int array[1]; /* really [nwords] */
};

Each array element would hold 32 bits; the integer i is a member of
the set iff (array[i/32] >> (i%32)) & 1 == 1. For sets containing no
elements over 31 (which would account for the vast majority of queries)
only a single word would be needed in the array. Operations like set
union, intersection, and subset test could process 32 bits at a time ---
they'd reduce to trivial C operations like |=, &=, & ~, applied once per
word. There would be a few things that would be slower (like iterating
through the actual integer elements of a set) but AFAICT this
representation is much better suited to the planner's needs than the
list method.

I've been thinking of doing this for a while just on efficiency grounds,
but kept putting it off because I don't expect much of any performance
gain on simple queries. (You need a dozen or so tables in a query
before the inefficiencies of the list representation really start to
hurt.) But tonight I'm thinking I'll do it anyway, because it'll also
be impervious to duplicate-element bugs.

Comments?

regards, tom lane


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Planning a change of representation in the planner
Date: 2003-02-07 08:08:56
Message-ID: 1044605336.5672.2.camel@taru.tm.ee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane kirjutas R, 07.02.2003 kell 06:35:
> I've been thinking of doing this for a while just on efficiency grounds,
> but kept putting it off because I don't expect much of any performance
> gain on simple queries. (You need a dozen or so tables in a query
> before the inefficiencies of the list representation really start to
> hurt.) But tonight I'm thinking I'll do it anyway, because it'll also
> be impervious to duplicate-element bugs.
>
> Comments?

Maybe the quicker way to avoid duplicate-element bugs (and get faster
merges) is to keep the lists ordered, so instead of just appending the
next int, you scan to the proper place and put it there (if it is not
there already).

Ordered lists are also much faster ( just O(N) ) to
compare/union/intersect.

--
Hannu Krosing <hannu(at)tm(dot)ee>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Planning a change of representation in the planner
Date: 2003-02-07 14:41:15
Message-ID: 26603.1044628875@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hannu Krosing <hannu(at)tm(dot)ee> writes:
> Maybe the quicker way to avoid duplicate-element bugs (and get faster
> merges) is to keep the lists ordered, so instead of just appending the
> next int, you scan to the proper place and put it there (if it is not
> there already).

I had thought of doing that before it occurred to me to switch to a
bitmap representation, but I was always afraid to --- I think it would
be more buggy not less so. The compiler won't give any help in catching
places where plain-list operations are applied to what should be an
ordered list. If I change the struct type completely, then the compiler
will help me.

Also, the bitmap representation makes for a nice reduction in palloc()
traffic (typically one palloc per set, not one per set element). That
part of the performance gain won't be there if we just change to ordered
lists.

regards, tom lane