Re: embedded list

Lists: pgsql-hackers
From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: embedded list v2
Date: 2012-06-26 23:26:00
Message-ID: 201206270126.01388.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

To recapitulate why I think this sort of embedded list is worthwile:
* minimal memory overhead (16 bytes for double linked list heads/nodes on
64bit systems)
* no additional memory allocation overhead
* no additional dereference to access the contents of a list element
* most modifications are completely branchless
* the current dllist.h interface has double the memory overhead and much more
complex manipulation operators
* Multiple places in postgres have grown local single or double linked list
implementations
* I need it ;)

Attached are three patches:
1. embedded list implementation
2. make the list implementation usable without USE_INLINE
3. convert all callers to ilist.h away from dllist.h

For 1 I:
a. added more comments and some introduction, some more functions
b. moved the file from utils/ilist.h to lib/ilist.h
c. actually included the c file with the check functions
d. did *not* split it up into single/double linked list files, doesn't seem to
be worth the trouble given how small ilist.(c|h) are
e. did *not* try to get an interface similar to dllist.h. I don't think the
old one is better and it makes the breakage more obvious should somebody else
use the old implementation although I doubt it.

I can be convinced to do d. and e. but I don't think they are an improvement.

For 2 I used ugly macro hackery to avoid declaring every function twice, in a
c file and in a header.

Opinions on the state of the above patches?

I did not expect any performance difference in the current usage, but just to
be sure I ran the following tests:

connection heavy:
pgbench -n -S -p 5501 -h /tmp -U andres postgres -c 16 -j 16 -T 10 -C
master: 3109 3024 3012
ilist: 3097 3033 3024

somewhat SearchCatCache heavy:
pgbench -n -S -p 5501 -h /tmp -U andres postgres -T 100 -c 16 -j 1
master: 98979.453879 99554.485631 99393.587880
ilist: 98960.545559 99583.319870 99498.923273

As expected the differences are on the level of noise...

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Add-embedded-list-interface-header-only.patch text/x-patch 14.3 KB
0002-Rework-the-newly-added-ilist-interface-to-be-usable-.patch text/x-patch 10.3 KB
0003-Remove-usage-of-lib-dllist.h-and-replace-it-by-the-n.patch text/x-patch 23.4 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-06-28 16:23:05
Message-ID: CA+TgmoY66P=GMDXfKFyNquZR2yVattV7__uOBzOEKLcAo_opXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 26, 2012 at 7:26 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Attached are three patches:
> 1. embedded list implementation
> 2. make the list implementation usable without USE_INLINE
> 3. convert all callers to ilist.h away from dllist.h

This code doesn't follow PostgreSQL coding style guidelines; in a
function definition, the name must start on its own line. Also, stuff
like this is grossly unlike what's done elsewhere; use the same
formatting as e.g. foreach:

+#define ilist_d_reverse_foreach(name, ptr) for(name =
(ptr)->head.prev; \
+ name != &(ptr)->head; \
+ name = name->prev)

And this is definitely NOT going to survive pgindent:

+ for(cur = head->head.prev;
+ cur != &head->head;
+ cur = cur->prev){
+ if(!(cur) ||
+ !(cur->next) ||
+ !(cur->prev) ||
+ !(cur->prev->next == cur) ||
+ !(cur->next->prev == cur))
+ {
+ elog(ERROR, "double linked list is corrupted");
+ }
+ }

And this is another meme we don't use elsewhere; add an explicit NULL test:

+ while ((cur = last->next))

And then there's stuff like this:

+ if(!cur){
+ elog(ERROR, "single linked list is corrupted");
+ }

Aside from the formatting issues, I think it would be sensible to
preserve the terminology of talking about the "head" and "tail" of a
list that we use elsewhere, instead of calling them the "front" and
"back" as you've done here. I would suggest that instead of add_after
and add_before you use the names insert_after and insert_before, which
I think is more clear; also instead of remove, I think you should say
delete, for consistency with pg_list.h.

A number of these inlined functions could be rewritten as macros -
e.g. ilist_d_has_next, ilist_d_has_prev, ilist_d_is_empty. Since some
things are written as macros anyway maybe it's good to do all the
one-liners that way, although I guess it doesn't matter that much. I
also agree with your XXX comment that ilist_s_remove should probably
be out of line.

Apart from the above, mostly fairly superficial concerns I think this
makes sense.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-06-28 18:20:59
Message-ID: 201206282020.59376.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, June 28, 2012 06:23:05 PM Robert Haas wrote:
> On Tue, Jun 26, 2012 at 7:26 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:
> > Attached are three patches:
> > 1. embedded list implementation
> > 2. make the list implementation usable without USE_INLINE
> > 3. convert all callers to ilist.h away from dllist.h
>
> This code doesn't follow PostgreSQL coding style guidelines; in a
> function definition, the name must start on its own line.
Hrmpf. Yes.

> Also, stuff like this is grossly unlike what's done elsewhere; use the same
> formatting as e.g. foreach:
> +#define ilist_d_reverse_foreach(name, ptr) for(name =
> (ptr)->head.prev; \
> + name != &(ptr)->head; \
> + name = name->prev)
Its not only unlike the rest its also ugly... Fixed.

> And this is definitely NOT going to survive pgindent:
>
> + for(cur = head->head.prev;
> + cur != &head->head;
> + cur = cur->prev){
> + if(!(cur) ||
> + !(cur->next) ||
> + !(cur->prev) ||
> + !(cur->prev->next == cur) ||
> + !(cur->next->prev == cur))
> + {
> + elog(ERROR, "double linked list is corrupted");
> + }
> + }
I changed the for() contents to one line. I don't think I can write anything
that won't be changed by pgindent for the if()s contents.

> And this is another meme we don't use elsewhere; add an explicit NULL test:
>
> + while ((cur = last->next))
Fixed.

> And then there's stuff like this:
>
> + if(!cur){
> + elog(ERROR, "single linked list is corrupted");
> + }
Fixed the places that I found.

> Aside from the formatting issues, I think it would be sensible to
> preserve the terminology of talking about the "head" and "tail" of a
> list that we use elsewhere, instead of calling them the "front" and
> "back" as you've done here. I would suggest that instead of add_after
> and add_before you use the names insert_after and insert_before, which
> I think is more clear; also instead of remove, I think you should say
> delete, for consistency with pg_list.h.
Heh. Ive been poisoned from using c++ too much (thats partially stl names).

Changed.

> A number of these inlined functions could be rewritten as macros -
> e.g. ilist_d_has_next, ilist_d_has_prev, ilist_d_is_empty. Since some
> things are written as macros anyway maybe it's good to do all the
> one-liners that way, although I guess it doesn't matter that much.
I find inline functions preferrable because of multiple evaluation stuff. The
macros are macros because of the dynamic return type and other similar issues
which cannot be done in plain C.

> I also agree with your XXX comment that ilist_s_remove should probably
> be out of line.
Done.

Looks good now?

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Add-embedded-list-interface.patch text/x-patch 16.1 KB
0002-Remove-usage-of-lib-dllist.h-and-replace-it-by-the-n.patch text/x-patch 23.6 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-06-28 19:47:05
Message-ID: 1340911563-sup-2911@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Andres Freund's message of jue jun 28 14:20:59 -0400 2012:

> Looks good now?

The one thing I dislike about this code is the names you've chosen. I
mean, ilist_s_stuff and ilist_d_stuff. I mean, why not just Slist_foo
and Dlist_bar, say? As far as I can tell, you've chosen the "i" prefix
because it's "integrated" or "inline", but this seems to me a rather
irrelevant implementation detail that's of little use to the callers.

Also, I don't find so great an idea to have everything in a single file.
Is there anything wrong with separating singly and doubly linked lists
each to its own file? Other than you not liking it, I mean. As a
person who spends some time trying to untangle header dependencies, I
would appreciate keeping stuff as lean as possible. However, since
nobody else seems to have commented on this, maybe it's just me.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-06-28 19:57:26
Message-ID: CA+Tgmob9yJj8fHD4HtqbROo_EDcHSqH11ejvLQ7q+0o84Vft3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 28, 2012 at 3:47 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
>
> Excerpts from Andres Freund's message of jue jun 28 14:20:59 -0400 2012:
>
>> Looks good now?
>
> The one thing I dislike about this code is the names you've chosen.  I
> mean, ilist_s_stuff and ilist_d_stuff.  I mean, why not just Slist_foo
> and Dlist_bar, say?  As far as I can tell, you've chosen the "i" prefix
> because it's "integrated" or "inline", but this seems to me a rather
> irrelevant implementation detail that's of little use to the callers.
>
> Also, I don't find so great an idea to have everything in a single file.
> Is there anything wrong with separating singly and doubly linked lists
> each to its own file?  Other than you not liking it, I mean.  As a
> person who spends some time trying to untangle header dependencies, I
> would appreciate keeping stuff as lean as possible.  However, since
> nobody else seems to have commented on this, maybe it's just me.

Well, it's not JUST you. I agree entirely with all of your points.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-06-28 20:03:26
Message-ID: 201206282203.27109.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, June 28, 2012 09:47:05 PM Alvaro Herrera wrote:
> Excerpts from Andres Freund's message of jue jun 28 14:20:59 -0400 2012:
> > Looks good now?
>
> The one thing I dislike about this code is the names you've chosen. I
> mean, ilist_s_stuff and ilist_d_stuff. I mean, why not just Slist_foo
> and Dlist_bar, say? As far as I can tell, you've chosen the "i" prefix
> because it's "integrated" or "inline", but this seems to me a rather
> irrelevant implementation detail that's of little use to the callers.
Well, its not irrelevant because you actually need to change the contained
structs to use it. I find that a pretty relevant distinction.

> Also, I don't find so great an idea to have everything in a single file.
> Is there anything wrong with separating singly and doubly linked lists
> each to its own file? Other than you not liking it, I mean. As a
> person who spends some time trying to untangle header dependencies, I
> would appreciate keeping stuff as lean as possible. However, since
> nobody else seems to have commented on this, maybe it's just me.
Robert had the same comment, its not just you...

It would mean duplicating the ugliness around the conditional inlining, the
comment explaining how to use the stuff (because its basically used the same
way for single and double linked lists), you would need to #define
ilist_container twice or have a third file....
Just seems to much overhead for ~100 lines (the single linked list
implementation).

What I wonder is how hard it would be to remove catcache.h's structs into the
implementation. Thats the reason why the old and new list implementation
currently is included all over the backend...

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: embedded list v2
Date: 2012-06-28 21:06:49
Message-ID: 201206282306.49566.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, June 28, 2012 10:03:26 PM Andres Freund wrote:
> What I wonder is how hard it would be to remove catcache.h's structs into
> the implementation. Thats the reason why the old and new list
> implementation currently is included all over the backend...
Moving them into the implementation isn't possible, but catcache.h being
included just about everywhere simply isn't needed.

It being included everywhere was introduced by a series of commits from Bruce:
b85a965f5fc7243d0386085e12f7a6c836503b42
b43ebe5f83b28e06a3fd933b989aeccf0df7844a
e0522505bd13bc5aae993fc50b8f420665d78e96
and others

That looks broken. An implementation file not including its own header... A
minimal patch to fix this particular problem is attached (looks like there are
others in the series).

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Stop-including-catcache.h-from-syscache.h.patch text/x-patch 10.2 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-06-28 21:09:03
Message-ID: 1340917461-sup-6517@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Andres Freund's message of jue jun 28 16:03:26 -0400 2012:
>
> On Thursday, June 28, 2012 09:47:05 PM Alvaro Herrera wrote:
> > Excerpts from Andres Freund's message of jue jun 28 14:20:59 -0400 2012:
> > > Looks good now?
> >
> > The one thing I dislike about this code is the names you've chosen. I
> > mean, ilist_s_stuff and ilist_d_stuff. I mean, why not just Slist_foo
> > and Dlist_bar, say? As far as I can tell, you've chosen the "i" prefix
> > because it's "integrated" or "inline", but this seems to me a rather
> > irrelevant implementation detail that's of little use to the callers.
> Well, its not irrelevant because you actually need to change the contained
> structs to use it. I find that a pretty relevant distinction.

Sure, at that point it is relevant. Once you're past that, there's no
point. I mean, you would look up how it's used in the header comment of
the implementation, and then just refer to the API.

> > Also, I don't find so great an idea to have everything in a single file.
> > Is there anything wrong with separating singly and doubly linked lists
> > each to its own file? Other than you not liking it, I mean. As a
> > person who spends some time trying to untangle header dependencies, I
> > would appreciate keeping stuff as lean as possible. However, since
> > nobody else seems to have commented on this, maybe it's just me.
> Robert had the same comment, its not just you...
>
> It would mean duplicating the ugliness around the conditional inlining, the
> comment explaining how to use the stuff (because its basically used the same
> way for single and double linked lists), you would need to #define
> ilist_container twice or have a third file....
> Just seems to much overhead for ~100 lines (the single linked list
> implementation).

Well, then don't duplicate a comment -- create a README.lists and
refer to it in the comments. Not sure about the ilist_container stuff,
but in principle I don't have a problem with having a file with a single
#define that's included by two other headers.

> What I wonder is how hard it would be to remove catcache.h's structs into the
> implementation. Thats the reason why the old and new list implementation
> currently is included all over the backend...

Yeah, catcache.h is a pretty ugly beast. I'm sure there are ways to behead it.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: embedded list v2
Date: 2012-06-28 21:45:08
Message-ID: 1340919452-sup-6683@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Andres Freund's message of jue jun 28 17:06:49 -0400 2012:
> On Thursday, June 28, 2012 10:03:26 PM Andres Freund wrote:
> > What I wonder is how hard it would be to remove catcache.h's structs into
> > the implementation. Thats the reason why the old and new list
> > implementation currently is included all over the backend...
> Moving them into the implementation isn't possible, but catcache.h being
> included just about everywhere simply isn't needed.
>
> It being included everywhere was introduced by a series of commits from Bruce:
> b85a965f5fc7243d0386085e12f7a6c836503b42
> b43ebe5f83b28e06a3fd933b989aeccf0df7844a
> e0522505bd13bc5aae993fc50b8f420665d78e96
> and others
>
> That looks broken. An implementation file not including its own header... A
> minimal patch to fix this particular problem is attached (looks like there are
> others in the series).

Hmm, I think this is against project policy -- we do want each header to
be compilable separately. I would go instead the way of splitting
resowner.h in two or more pieces.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: embedded list v2
Date: 2012-06-28 21:55:24
Message-ID: 201206282355.24246.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, June 28, 2012 11:45:08 PM Alvaro Herrera wrote:
> Excerpts from Andres Freund's message of jue jun 28 17:06:49 -0400 2012:
> > On Thursday, June 28, 2012 10:03:26 PM Andres Freund wrote:
> > > What I wonder is how hard it would be to remove catcache.h's structs
> > > into the implementation. Thats the reason why the old and new list
> > > implementation currently is included all over the backend...
> >
> > Moving them into the implementation isn't possible, but catcache.h being
> > included just about everywhere simply isn't needed.
> >
> > It being included everywhere was introduced by a series of commits from
> > Bruce: b85a965f5fc7243d0386085e12f7a6c836503b42
> > b43ebe5f83b28e06a3fd933b989aeccf0df7844a
> > e0522505bd13bc5aae993fc50b8f420665d78e96
> > and others
> >
> > That looks broken. An implementation file not including its own header...
> > A minimal patch to fix this particular problem is attached (looks like
> > there are others in the series).
>
> Hmm, I think this is against project policy -- we do want each header to
> be compilable separately. I would go instead the way of splitting
> resowner.h in two or more pieces.
It was done nearly the same way in catcache.h before Bruce changed things. You
can see still the rememnants of that in syscache.h:
/* list-search interface. Users of this must import catcache.h too */
extern struct catclist *SearchSysCacheList(int cacheId, int nkeys,
Datum key1, Datum key2, Datum key3, Datum key4);

The only difference is that gcc warns if you declare a struct in a parameter -
thats why I forward declared it explicitly...

resowner.h still compiles standalone and is still usable. You can even call
ResourceOwnerRememberCatCacheListRef if you get the list parameter from
somewhere else.

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list v2
Date: 2012-07-05 01:49:36
Message-ID: CAEYLb_Uta4K_P1uHW7r=Aqo0QNYVLtp2VBpfqyGdijTxBRK+DA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 June 2012 19:20, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> <0001-Add-embedded-list-interface.patch>
>
> Looks good now?

I have a few gripes.

+ * there isn't much we can test in a single linked list except that its

There are numerous references to "single linked lists", where, I
believe, "singly linked list" is intended (the same can be said for
"double" and "doubly" respectively).

/* Functions we want to be inlined if possible */
+ #ifndef USE_INLINE
...
+ #endif /* USE_INLINE */

A minor quibble, but that last line probably be:

#endif /* ! defined USE_INLINE */

Another minor quibble. We usually try and keep these in alphabetical order:

*** a/src/backend/lib/Makefile
--- b/src/backend/lib/Makefile
*************** subdir = src/backend/lib
*** 12,17 ****
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global

! OBJS = dllist.o stringinfo.o

include $(top_srcdir)/src/backend/common.mk
--- 12,17 ----
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global

! OBJS = dllist.o stringinfo.o ilist.o

include $(top_srcdir)/src/backend/common.mk

New files generally don't require this:

+ * Portions Copyright (c) 1994, Regents of the University of California

This needs to be altered:

+ /*
+ * enable for extra debugging. This is rather expensive so its not enabled by
+ * default even with --enable-cassert
+ */
+ /* #define ILIST_DEBUG */

I'm not sure that this extra error-detection warrants its own macro.
syncrep.c similarly has its own rather expensive error-detection, with
entire function bodies only compiled when USE_ASSERT_CHECKING is
defined. Any of the other dedicated "debugging macros", like
LOCK_DEBUG or WAL_DEBUG seem to exist to dump LOG messages when
binaries were built with the macros defined (this can happen via
pg_config_manual.h. I note that what you have here lacks a
corresponding entry). I think that it would be more idiomatic to just
use USE_ASSERT_CHECKING, and rewrite the debugging functions such that
they are used directly within an Assert, in the style of syncrep.c .
Now, I know Tom wasn't too enthusiastic about having this sort of
thing within --enable-cassert builds, but I'm inclined to think that
there is little point in having this additional error checking if
they're not at least available when that configuration is used. Maybe
we should consider taking the sophisticated asserts out of
--enable-cassert builds, or removing them entirely, if and when they
prove to be annoying?

There is slight misalignment within the comments at the top of ilist.h.

Within ilist.c, the following block exists:

+ #ifndef USE_INLINE
+ #define ILIST_USE_DECLARATION
+ #endif
+
+ #include "lib/ilist.h"
+
+ #ifndef USE_INLINE
+ #undef ILIST_USE_DECLARATION
+ #endif

I see little reason for the latter "#undef" block within ilist.c. It
isn't that exactly the same code already exists at the end of ilist.h
(though there is that too) - it's mostly that it's unnecessary,
because there is no need or logical reason to #undef within ilist.c.

+ /*
+ * The following function declarations are only used if inlining is supported
+ * or when included from a file that explicitly declares USE_DECLARATION
+ */
+ #ifdef ILIST_USE_DECLARATION

Shouldn't that be "The following function definitions..." and
ILIST_USE_DEFINITIONS?

I think the fact that it's possible in principle for
ILIST_USE_DECLARATION to not be exactly equivalent to ! USE_INLINE is
strictly speaking dangerous, since USE_INLINE needs to be baked into
the ABI targeted by third-party module developers. What if a module
was built that called a function that didn't have an entry in the
procedure linkage table, due to ad-hoc usage of ILIST_USE_DECLARATION?
That'd result in a libdl error, if you're lucky. Now, that probably
sounds stupid - I'm pretty sure that you didn't intend that
ILIST_USE_DECLARATION be set by just any old client of this
infrastructure. Rather, you intended that ILIST_USE_DECLARATION be set
either when ilist.h was included while USE_INLINE, so that macro
expansion would make those functions inline, or within ilist.c, when
!USE_INLINE, so that the functions would not be inlined. This should
be much more explicit though. You simply describe the mechanism rather
than the intent at present.

I rather suspect that INLINE_IF_POSSIBLE should be a general purpose
utility, perhaps defined next to NON_EXEC_STATIC within c.h, with a
brief explanation there (rather than in any new header that needs to
do this). I think you'd be able to reasonably split singly and doubly
linked lists into their own headers without much repetition between
the two then. You could formalise the idea that where USE_INLINE,
another macro, defined alongside INLINE_IF_POSSIBLE (functionally
equivalent to ILIST_USE_DECLARATION in the extant code - say,
USE_INLINING_DEFINITIONS) is going to be generally defined everywhere
USE_INLINE is defined. You're then going to have to deal with the hack
whereby USE_INLINING_DEFINITIONS is set just "to suck the definitions
out of the header" to make !USE_INLINE work within .c files only (or
pretty close). That's kins of logical though - !USE_INLINE is hardly
ever used, so it deserves to get the slightly grottier code. The only
direct macro usage that your header files now need is "#ifdef
USE_INLINING_DEFINITIONS" (plus INLINE_IF_POSSIBLE rather than plain
"static inline", of course), as well as having extern declarations for
the !USE_INLINE case. The idea is that each header file has slightly
fewer eyebrow-raising macro hacks, and there is at least no obvious
redundancy between the two. In particular, this only has to happen
once, within c.h:

#ifdef USE_INLINE
#define INLINE_IF_POSSIBLE static inline
#define USE_INLINING_DEFINITIONS // actually spelt
"ILIST_USE_DECLARATION" in patch
#else
#define INLINE_IF_POSSIBLE
#endif

What's more, under this scheme, the following code does not need to
(and shouldn't) appear at the end of headers like ilist.h:

+ /*
+ * if we defined ILIST_USE_DECLARATION undef it again, its not interesting
+ * outside this file
+ */
+ #ifdef USE_INLINE
+ #undef USE_INLINING_DEFINITIONS // actually spelt
"ILIST_USE_DECLARATION" in patch
+ #endif

because only when considering the !USE_INLINE case do we have to
consider that we might need to manually set USE_INLINING_DEFINITIONS
to "suck in" definitions within C files (as I've said, unsetting it
very probably doesn't matter within a C file, so no need to do it
there either).

If that isn't quite clear, the simple version is:

We assume the USE_INLINE case until we learn otherwise, rather than
assuming neither USE_INLINE nor !USE_INLINE. It's cleaner to treat
!USE_INLINE as a sort of edge case with special handling, rather than
having ilist.h and friends worry about cleaning up after themselves
all the time, and worrying about initialising things for themselves
alone according to the present build configuration.

In any case, if we're going to manage the !USE_INLINE case like this
at all, we should probably formalise it along some lines.

I still don't think this is going to fly:

+ #ifdef __GNUC__
+ #define unused_attr __attribute__((unused))
+ #else
+ #define unused_attr
+ #endif

There is no reasonable way to prevent MSVC from giving a warning as a
result of this. The rationale for doing things this way is:

+ /*
+ * gcc warns about unused parameters if -Wunused-parameter is specified (part
+ * of -Wextra ). In the cases below its perfectly fine not to use those
+ * parameters because they are just passed to make the interface
consistent and
+ * to allow debugging in case of ILIST_DEBUG.
+ *
+ */

Seems to me that you'd be able to do a better job with a thin macro
shim on the existing (usually) inline functions. That'd also take care
of your concerns about multiple evaluation.

The comments could probably use some wordsmithing in various places.
That's no big deal though.

I agree with Alvaro on the naming of these functions. I'd prefer it if
they didn't have the "i" prefix too, though that's hardly very
important.

That's all I have for now. I'll take a look at the other patch
(0002-Remove-usage-of-lib-dllist.h-and-replace-it-by-the-n.patch)
soon.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list v2
Date: 2012-07-23 10:55:01
Message-ID: CAEYLb_WqBtXRe7p2SbqJQ4mErv0m5Bdev1AvCkDjPEq-3YrM5g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 5 July 2012 02:49, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> On 28 June 2012 19:20, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> <0001-Add-embedded-list-interface.patch>
>>
>> Looks good now?
>
> I have a few gripes.

We are passed the nominal deadline. Had you planned on getting back to
me this commitfest? If not, I'll shelve my review of
"0002-Remove-usage-of-lib-dllist.h-and-replace-it-by-the-n.patch"
until commitfest 2012-09, and mark this "returned with feedback".

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list v2
Date: 2012-07-23 11:06:28
Message-ID: 201207231306.28758.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, July 23, 2012 12:55:01 PM Peter Geoghegan wrote:
> On 5 July 2012 02:49, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> > On 28 June 2012 19:20, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> <0001-Add-embedded-list-interface.patch>
> >>
> >> Looks good now?
> >
> > I have a few gripes.
>
> We are passed the nominal deadline. Had you planned on getting back to
> me this commitfest? If not, I'll shelve my review of
> "0002-Remove-usage-of-lib-dllist.h-and-replace-it-by-the-n.patch"
> until commitfest 2012-09, and mark this "returned with feedback".
I was actually waiting for the second review ;). Sorry for the
misunderstanding.

There is no rule that review cannot happen outside the commitfest, so I would
appreciate if we could push this further...

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: embedded list v2
Date: 2012-09-04 21:56:56
Message-ID: 1346795713-sup-6069@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Andres Freund's message of jue jun 28 17:06:49 -0400 2012:
> On Thursday, June 28, 2012 10:03:26 PM Andres Freund wrote:
> > What I wonder is how hard it would be to remove catcache.h's structs into
> > the implementation. Thats the reason why the old and new list
> > implementation currently is included all over the backend...
> Moving them into the implementation isn't possible, but catcache.h being
> included just about everywhere simply isn't needed.

FWIW this got fixed during some header changes I made a couple of weeks
ago. If you have similar fixes to propose, let me know.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-06 16:09:35
Message-ID: 1346947325-sup-3130@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here's a prettified version of this stuff. I found one bug in the macro
ilist_s_head: the test was reversed. Also, curiously, the macro had the
same name as the struct, so I renamed the macro. I take it you haven't
used this macro, so maybe it shouldn't be there at all? Or maybe I
completely misread what the macro is supposed to do.

I also renamed all the structs and functions by changing ilist_s_foo to
Slist_foo. Similarly for ilist_d_foo. This is all mechanical so any
subsequent patch should be trivial to refresh for this change.

I think README.ilist (which is what you had in the comment at the top of
ilist.h) should be heavily expanded. I don't find it at all clear.

There were other cosmetic changes, but the implementation is pretty much
the same you submitted.

I didn't look at the other patch you posted, replacing dllist.c usage;
will do that next to verify that the list implementation works.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
ilist.patch application/octet-stream 16.7 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-06 17:01:01
Message-ID: CA+TgmoYGL_+MnBOqdg5A2o_iGgLE5qEQYE7kKBGwgsPGXkbe1g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 6, 2012 at 12:09 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Here's a prettified version of this stuff. I found one bug in the macro
> ilist_s_head: the test was reversed. Also, curiously, the macro had the
> same name as the struct, so I renamed the macro. I take it you haven't
> used this macro, so maybe it shouldn't be there at all? Or maybe I
> completely misread what the macro is supposed to do.
>
> I also renamed all the structs and functions by changing ilist_s_foo to
> Slist_foo. Similarly for ilist_d_foo. This is all mechanical so any
> subsequent patch should be trivial to refresh for this change.

I think this is a good direction, but why not just slist_foo and
dlist_foo? I don't see much value in capitalizing the first letter.
It's not like it's the beginning of a word or anything. Plus, that
way the new stuff will be more obviously different from Dllist, which
it will (I think) replace.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-09 15:36:43
Message-ID: 201209091736.43338.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Alvaro,

Thanks for the review!

On Thursday, September 06, 2012 06:09:35 PM Alvaro Herrera wrote:
> Here's a prettified version of this stuff. I found one bug in the macro
> ilist_s_head: the test was reversed.
Oh, good catch. I had only used the _unchecked version because my code checked
that there are elements just some lines before that...

> Also, curiously, the macro had the same name as the struct, so I renamed the
> macro. I take it you haven't used this macro, so maybe it shouldn't be
there at all? Or maybe I completely misread what the macro is supposed to do.
According to my patches here that got introduced by me whe renaming
_front/back to _head/tail according to Roberts wishes. Sorry for that.

> I also renamed all the structs and functions by changing ilist_s_foo to
> Slist_foo. Similarly for ilist_d_foo. This is all mechanical so any
> subsequent patch should be trivial to refresh for this change.
Ok. I concur with robert that a lower case first letter might be better
readable but again, I don't really care that much.

> I think README.ilist (which is what you had in the comment at the top of
> ilist.h) should be heavily expanded. I don't find it at all clear.
Hm. I agree :(. Let me have a go when you have a state you find acceptable
otherwise...

> There were other cosmetic changes, but the implementation is pretty much
> the same you submitted.
Good.

> I didn't look at the other patch you posted, replacing dllist.c usage;
> will do that next to verify that the list implementation works.
Thanks!

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-14 17:22:18
Message-ID: 1347642731-sup-2925@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Here's an updated version of both patches, as well as a third patch that
converts the cc_node list link in catcache.c into an slist.

There are very few changes here; in ilist.h a singleton slist was being
considered empty. Andres reported this to me privately. One other
change is that in catcache.c we no longer compute a new HASH_INDEX on a
CatCTup in order to remove it from its list; instead we store a pointer
to the list in the element itself. We weren't able to measure any
difference between these two approaches to the problem, so we chose the
approach that hasn't been previously vetoed -- see
http://archives.postgresql.org/message-id/2852.1174575239%40sss.pgh.pa.us

I also addressed the unused_attr thingy by taking it out and having the
non-debug version emit a cast to void of the argument.

I think I would get this committed during CF2, and then have a look at
changing some uses of SHM_QUEUE with ilists too.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-initial-ilist-stuff-from-Andres.patch application/octet-stream 16.5 KB
0002-Remove-usage-of-lib-dllist.h-and-replace-it-by-the-n.patch application/octet-stream 23.6 KB
0003-convert-catcache.h-cc_next-into-an-slist.patch application/octet-stream 5.2 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-14 19:50:37
Message-ID: 1347651393-sup-4543@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Alvaro Herrera's message of vie sep 14 14:22:18 -0300 2012:
>
> Here's an updated version of both patches, as well as a third patch that
> converts the cc_node list link in catcache.c into an slist.

One thing I would like more input in, is whether people think it's
worthwhile to split dlists and slists into separate files. Thus far
this has been mentioned by three people independently.

Another question is whether ilist_container() should actually be a more
general macro "containerof" defined in c.h. (ISTM it would be necessary
to have this macro if we want to split into two files; that way we don't
need to have two macros dlist_container and slist_container with
identical definition, or alternatively a third file that defines just
ilist_container)

Third question is about the INLINE_IF_POSSIBLE business as commented by
Peter. It seems to me that the simple technique used here to avoid
having two copies of the source could be used by memcxt.c, list.c,
sortsupport.c as well (maybe clean up fastgetattr too).

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-14 20:48:35
Message-ID: 21476.1347655715@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Here's an updated version of both patches, as well as a third patch that
> converts the cc_node list link in catcache.c into an slist.

There's a lot of stuff here that seems rather unfortunate and/or sloppy.

Does it even compile? The 0002 patch refers to a typedef ilist_d_head
that I don't see defined anywhere. (It would be good to shorten that
name by a couple of characters anyway, for tab-stop alignment reasons.)

The documentation (such as it is) claims that the lists are circularly
linked, which doesn't seem to be the case from the code; slist_foreach
for instance certainly won't work if that's true. (As far as the
docs go, I'd get rid of the README file and put the how-to-use-it
comments into the header file, where they have some chance of being
(a) seen and (b) maintained. But first they need to be rewritten.)

The distinction between head and node structs seems rather useless,
and it's certainly notationally tedious. Do we need it?

I find the need for this change quite unfortunate:

@@ -256,7 +258,7 @@ typedef struct
static AutoVacuumShmemStruct *AutoVacuumShmem;

/* the database list in the launcher, and the context that contains it */
-static Dllist *DatabaseList = NULL;
+static ilist_d_head DatabaseList;
static MemoryContext DatabaseListCxt = NULL;

/* Pointer to my own WorkerInfo, valid on each worker */
@@ -403,6 +405,9 @@ AutoVacLauncherMain(int argc, char *argv[])
/* Identify myself via ps */
init_ps_display("autovacuum launcher process", "", "", "");

+ /* initialize to be empty */
+ ilist_d_init(&DatabaseList);
+
ereport(LOG,
(errmsg("autovacuum launcher started")));

Instead let's provide a macro for an empty list value, so that you can
do something like

static ilist_d_head DatabaseList = EMPTY_DLIST;

and not require a new assumption that there will be a convenient place
to initialize static list variables. In general I think it'd be a lot
better if the lists were defined such that all-zeroes is a valid empty
list header, anyway.

The apparently random decisions to make some things inline functions
and other things macros (even though they evaluate their args multiple
times) will come back to bite us. I am much more interested in
unsurprising behavior of these constructs than I am in squeezing out
an instruction or two, so I'm very much against the unsafe macros.

I think we could do without such useless distinctions as slist_get_head
vs slist_get_head_unchecked, too. We've already got Assert and
ILIST_DEBUG, we do not need even more layers of check-or-not
distinctions.

Meanwhile, obvious checks that *should* be made, like slist_pop_head
asserting that there be something to pop, don't seem to be made.

Not a full review, just some things that struck me in a quick scan...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-14 20:57:54
Message-ID: 21666.1347656274@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> One thing I would like more input in, is whether people think it's
> worthwhile to split dlists and slists into separate files. Thus far
> this has been mentioned by three people independently.

They're small enough and similar enough that one header and one .c file
seem like plenty; but I don't really have a strong opinion about it.

> Another question is whether ilist_container() should actually be a more
> general macro "containerof" defined in c.h. (ISTM it would be necessary
> to have this macro if we want to split into two files; that way we don't
> need to have two macros dlist_container and slist_container with
> identical definition, or alternatively a third file that defines just
> ilist_container)

I'd vote for not having that at all, but rather two separate macros
dlist_container and slist_container. If we had a bunch of operations
that could work interchangeably on dlists and slists, it might be worth
having a concept of "ilist" --- but if we only have this, it would be
better to remove the concept from the API altogether.

> Third question is about the INLINE_IF_POSSIBLE business as commented by
> Peter. It seems to me that the simple technique used here to avoid
> having two copies of the source could be used by memcxt.c, list.c,
> sortsupport.c as well (maybe clean up fastgetattr too).

Yeah, looks reasonable ... material for a different patch of course.
But that would mean INLINE_IF_POSSIBLE should be defined someplace else,
perhaps c.h. Also, I'm not that thrilled with having the header file
define ILIST_USE_DEFINITION. I suggest that it might be better to do

#if defined(USE_INLINE) || defined(DEFINE_ILIST_FUNCTIONS)
... function decls here ...
#else
... extern decls here ...
#endif

where ilist.c defines DEFINE_ILIST_FUNCTIONS before including the
header.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-14 23:20:52
Message-ID: 1347664431-sup-6085@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Tom Lane's message of vie sep 14 17:48:35 -0300 2012:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Here's an updated version of both patches, as well as a third patch that
> > converts the cc_node list link in catcache.c into an slist.
>
> There's a lot of stuff here that seems rather unfortunate and/or sloppy.
>
> Does it even compile? The 0002 patch refers to a typedef ilist_d_head
> that I don't see defined anywhere. (It would be good to shorten that
> name by a couple of characters anyway, for tab-stop alignment reasons.)

Hm, I might have submitted the wrong 0002 file. Sorry about that. (The
correct file would have the right typedef names and a couple of bugfixes
but it'd be pretty similar to what you read.)

> [many useful comments]
>
> Not a full review, just some things that struck me in a quick scan...

Great stuff nonetheless, many thanks. I will see about submitting an
improved version.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-15 00:50:22
Message-ID: 201209150250.23045.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Friday, September 14, 2012 10:48:35 PM Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Here's an updated version of both patches, as well as a third patch that
> > converts the cc_node list link in catcache.c into an slist.
>
> There's a lot of stuff here that seems rather unfortunate and/or sloppy.
Most of that unfortunately my fault not Alvaro's...

> The documentation (such as it is) claims that the lists are circularly
> linked, which doesn't seem to be the case from the code; slist_foreach
> for instance certainly won't work if that's true. (As far as the
> docs go, I'd get rid of the README file and put the how-to-use-it
> comments into the header file, where they have some chance of being
> (a) seen and (b) maintained. But first they need to be rewritten.)
True, only dlist's are circular. The docs were in the header at first, then
somebody voted for moving them ;)

> The distinction between head and node structs seems rather useless,
> and it's certainly notationally tedious. Do we need it?
I think its useful. It makes the usage in structs its embedded to way much
clearer. The head struct actually has a different meaning than normal node
structs because its there even if the list is empty...

> I find the need for this change quite unfortunate:
>
> @@ -256,7 +258,7 @@ typedef struct
> static AutoVacuumShmemStruct *AutoVacuumShmem;
>
> /* the database list in the launcher, and the context that contains it */
> -static Dllist *DatabaseList = NULL;
> +static ilist_d_head DatabaseList;
> static MemoryContext DatabaseListCxt = NULL;
>
> /* Pointer to my own WorkerInfo, valid on each worker */
> @@ -403,6 +405,9 @@ AutoVacLauncherMain(int argc, char *argv[])
> /* Identify myself via ps */
> init_ps_display("autovacuum launcher process", "", "", "");
>
> + /* initialize to be empty */
> + ilist_d_init(&DatabaseList);
> +
> ereport(LOG,
> (errmsg("autovacuum launcher started")));
>
> Instead let's provide a macro for an empty list value, so that you can
> do something like
>
> static ilist_d_head DatabaseList = EMPTY_DLIST;
I don't think that can work with dlists because they are circular and need
their pointers to be adjusted. From my POV it seems to be better to keep those
in sync.

> and not require a new assumption that there will be a convenient place
> to initialize static list variables. In general I think it'd be a lot
> better if the lists were defined such that all-zeroes is a valid empty
> list header, anyway.
For the dlists thats a major efficiency difference in some use cases.
Unfortunately those are the ones I care about... Due to not needing to check
for NULLs several branches can be removed from the whole thing.

> The apparently random decisions to make some things inline functions
> and other things macros (even though they evaluate their args multiple
> times) will come back to bite us. I am much more interested in
> unsurprising behavior of these constructs than I am in squeezing out
> an instruction or two, so I'm very much against the unsafe macros.
At the moment the only thing that are macros are things that cannot be
expressed as inline functions because they return the actual contained type
and/or because they contain a for() loop. Do you have a trick in mind to
handle such cases?

Earlier on when talking with Alvaro I mentioned that I would like to add some
more functions that return the [sd]list_node's instead of the contained
elements. Those should be inline functions.

> I think we could do without such useless distinctions as slist_get_head
> vs slist_get_head_unchecked, too. We've already got Assert and
> ILIST_DEBUG, we do not need even more layers of check-or-not
> distinctions.
The _unchecked variants remove the check whether the list is already empty and
thus give up some safety for speed. Quite often that check is made before
calling dlist_get_head() or such anyway...
I wondered whether the solution for that would be to remove the variants that
check for an empty list (except an Assert).

> Not a full review, just some things that struck me in a quick scan...
Thanks!

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-15 05:32:45
Message-ID: 4226.1347687165@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Friday, September 14, 2012 10:48:35 PM Tom Lane wrote:
>> Instead let's provide a macro for an empty list value, so that you can
>> do something like
>> static ilist_d_head DatabaseList = EMPTY_DLIST;

> I don't think that can work with dlists because they are circular and need
> their pointers to be adjusted.

Well, actually, that just brings us to the main point which is: I do not
believe that circular links are a good design choice here. They prevent
the possibility of trivial list initialization, they make plain
iteration over the list more expensive, and you have provided no
evidence that there's any meaningful savings in other operations, much
less enough to justify those disadvantages.

>> The apparently random decisions to make some things inline functions
>> and other things macros (even though they evaluate their args multiple
>> times) will come back to bite us. I am much more interested in
>> unsurprising behavior of these constructs than I am in squeezing out
>> an instruction or two, so I'm very much against the unsafe macros.

> At the moment the only thing that are macros are things that cannot be
> expressed as inline functions because they return the actual contained type
> and/or because they contain a for() loop.

I don't really care. If you can't build it without multiple-evaluation
macros, it's too dangerous for a fundamental construct that's meant to
be used all over the place. Time to redesign.

One possible option, though it's a bit restrictive, is to insist that
the list node be the first element of whatever it's embedded in,
eliminating the need for ilist_container altogether. This would not
work if we meant to put these kinds of list links into Node-derived
structs, but I suspect we don't need that. Then for instance (given
the additional choice to not use circular links) dlist_get_head
would degenerate to ((type *) (ptr)->head.next), which eliminates
its multi-evaluation hazard and saves a few instructions as well.

Or if you don't want to do that, dlist_get_head(type, membername, ptr)
could be defined as
((type *) dlist_do_get_head(ptr, offsetof(type, membername)))
where dlist_do_get_head is an inline'able function, eliminating the
multi-evaluation-of-ptr hazard.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-15 08:08:50
Message-ID: 201209151008.51251.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Tom,

On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Friday, September 14, 2012 10:48:35 PM Tom Lane wrote:
> >> Instead let's provide a macro for an empty list value, so that you can
> >> do something like
> >> static ilist_d_head DatabaseList = EMPTY_DLIST;
> >
> > I don't think that can work with dlists because they are circular and
> > need their pointers to be adjusted.
>
> Well, actually, that just brings us to the main point which is: I do not
> believe that circular links are a good design choice here. They prevent
> the possibility of trivial list initialization, they make plain
> iteration over the list more expensive, and you have provided no
> evidence that there's any meaningful savings in other operations, much
> less enough to justify those disadvantages.
I think I have talked about the reasoning on the list before, but here it
goes: The cases where I primarily used them are cases where the list
*manipulation* is a considerable part of the expense. In those situations
having less branches is beneficial and, to my knowledge, cannot be done in
normal flat lists.
For the initial user of those lists - the slab allocator for postgres which I
intend to finish at some point - the move to circular lists instead of
classical lists was an improvement in the 12-15% range.

Inhowfar do they make iteration more expensive? ptr != head shouldn't be more
expensive than !ptr.

Imo, that leaves initialization where I admit that you have a case. I don't
find it a big problem in practise though.

> >> The apparently random decisions to make some things inline functions
> >> and other things macros (even though they evaluate their args multiple
> >> times) will come back to bite us. I am much more interested in
> >> unsurprising behavior of these constructs than I am in squeezing out
> >> an instruction or two, so I'm very much against the unsafe macros.
> >
> > At the moment the only thing that are macros are things that cannot be
> > expressed as inline functions because they return the actual contained
> > type and/or because they contain a for() loop.
>
> I don't really care. If you can't build it without multiple-evaluation
> macros, it's too dangerous for a fundamental construct that's meant to
> be used all over the place. Time to redesign.
Its not like pg doesn't use any other popularly used macros that have multiple
evaluation hazarards. Even in cases where they *could* be replaced by inline
functions without that danger.

> One possible option, though it's a bit restrictive, is to insist that
> the list node be the first element of whatever it's embedded in,
> eliminating the need for ilist_container altogether. This would not
> work if we meant to put these kinds of list links into Node-derived
> structs, but I suspect we don't need that.
I already had list elements which are in multiple lists at the same time and I
think replacing some of the pg_list.h usages might be a good idea even for
Node derived structures given the List manipulation overhead we have seen in
several profiles.

> Then for instance (given the additional choice to not use circular links)
> dlist_get_head would degenerate to ((type *) (ptr)->head.next), which
> eliminates its multi-evaluation hazard and saves a few instructions as well.
I still fail to see why not using cirular lists removes instructions in such a
situation:

Get the first element in a circular list:
(ptr)->head.next

->head.next is at a constant offset from the start of *ptr, just as a ->first
pointer is.

In iterations like:

for(name = (ptr)->head.next;
name != &(ptr)->head;
name = name->next)
{
}

you have one potentially additional indexed memory access (&ptr->head) for the
whole loop to an address which will normally lie on the same cacheline as the
already accessed ->next pointer.

> Or if you don't want to do that, dlist_get_head(type, membername, ptr)
> could be defined as
> ((type *) dlist_do_get_head(ptr, offsetof(type, membername)))
> where dlist_do_get_head is an inline'able function, eliminating the
> multi-evaluation-of-ptr hazard.
Thats a rather neat idea. Let me play with it for a second, it might make some
of the macros safe, although I don't see how it could work for for() loops.

Just to make that clear, purely accessing the first node can be done without
any macros at al, its just the combination of returning the first node and
getting the contained element that needs to be a macro because of the variadic
type issues (at times, I really wish for c++ style templates...)

I will take a stab at trying to improve Alvaro's current patch wrt to those
issues.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-15 17:21:44
Message-ID: 17398.1347729704@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote:
>> Well, actually, that just brings us to the main point which is: I do not
>> believe that circular links are a good design choice here.

> I think I have talked about the reasoning on the list before, but here it
> goes: The cases where I primarily used them are cases where the list
> *manipulation* is a considerable part of the expense. In those situations
> having less branches is beneficial and, to my knowledge, cannot be done in
> normal flat lists.
> For the initial user of those lists - the slab allocator for postgres which I
> intend to finish at some point - the move to circular lists instead of
> classical lists was an improvement in the 12-15% range.

Let me make my position clear: I will not accept performance as the sole
figure of merit for this datatype. It also has to be easy and reliable
to use, and the current design is a failure on those dimensions.
This question of ease and reliability of use isn't just academic, since
you guys had trouble converting catcache.c without introducing bugs.
That isn't exactly a positive showing for this design.

Furthermore, one datapoint for one use-case (probably measured on only
one CPU architecture) isn't even a very convincing case for the
performance being better. To take a handy example, if we were to
convert dynahash to use dlists, having to initialize each hash bucket
header this way would probably be a pretty substantial cost for a lot
of hashtable use-cases. We have a lot of short-lived dynahash tables.

> Inhowfar do they make iteration more expensive? ptr != head shouldn't be more
> expensive than !ptr.

That's probably true *as long as the head pointer is available in a
register*. But having to reserve a second register for the loop
mechanics can be a serious loss on register-poor architectures.

This also ties into the other problem, since it's hard to code such
loop control as a macro without multiple evaluation of the list-head
argument. To me that's a show stopper of the first order. I'm not
going to accept a replacement for foreach() that introduces bug hazards
that aren't in foreach().

>> I don't really care. If you can't build it without multiple-evaluation
>> macros, it's too dangerous for a fundamental construct that's meant to
>> be used all over the place. Time to redesign.

> Its not like pg doesn't use any other popularly used macros that have multiple
> evaluation hazarards.

There are certainly some multiple-evaluation macros, but I don't think
they are associated with core data types. You will not find any in
pg_list.h for instance. I think it's important that these new forms
of list be as easy and reliable to use as List is. I'm willing to trade
off some micro-performance to have that.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-16 14:23:14
Message-ID: 201209161623.14340.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Saturday, September 15, 2012 07:21:44 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote:
> >> Well, actually, that just brings us to the main point which is: I do not
> >> believe that circular links are a good design choice here.
> >
> > I think I have talked about the reasoning on the list before, but here it
> > goes: The cases where I primarily used them are cases where the list
> > *manipulation* is a considerable part of the expense. In those situations
> > having less branches is beneficial and, to my knowledge, cannot be done
> > in normal flat lists.
> > For the initial user of those lists - the slab allocator for postgres
> > which I intend to finish at some point - the move to circular lists
> > instead of classical lists was an improvement in the 12-15% range.
>
> Let me make my position clear: I will not accept performance as the sole
> figure of merit for this datatype. It also has to be easy and reliable
> to use, and the current design is a failure on those dimensions.
> This question of ease and reliability of use isn't just academic, since
> you guys had trouble converting catcache.c without introducing bugs.
> That isn't exactly a positive showing for this design.
Uhm. I had introduced a bug there, true (Maybe Alvaro as well, I can't remember
anything right now). But it was something like getting the tail instead of the
head element due to copy paste. Nothing will be able to protect the code from
me.

> Furthermore, one datapoint for one use-case (probably measured on only
> one CPU architecture) isn't even a very convincing case for the
> performance being better. To take a handy example, if we were to
> convert dynahash to use dlists, having to initialize each hash bucket
> header this way would probably be a pretty substantial cost for a lot
> of hashtable use-cases. We have a lot of short-lived dynahash tables.

What do you think about doing something like:

#define DLIST_INIT(name) {{&name.head, &name.head}}
static dlist_head DatabaseList = DLIST_INIT(DatabaseList);

That works, but obviously the initialization will have to be performed at some
point.

> This also ties into the other problem, since it's hard to code such
> loop control as a macro without multiple evaluation of the list-head
> argument. To me that's a show stopper of the first order. I'm not
> going to accept a replacement for foreach() that introduces bug hazards
> that aren't in foreach().
What do you think about something like:

typedef struct dlist_iter
{
/*
* Use a union with equivalent storage as dlist_node to make it possible to
* initialize the struct inside a macro without multiple evaluation.
*/
union {
struct {
dlist_node *cur;
dlist_node *end;
};
dlist_node init;
};
} dlist_iter;

typedef struct dlist_mutable_iter
{
union {
struct {
dlist_node *cur;
dlist_node *end;
};
dlist_node init;
};
dlist_node *next;
} dlist_mutable_iter;

#define dlist_iter_foreach(iter, ptr) \
for (iter.init = (ptr)->head; iter.cur != iter.end; \
iter.cur = iter.cur->next)

#define dlist_iter_foreach_modify(iter, ptr) \
for (iter.init = (ptr)->head, iter.next = iter.cur->next; \
iter.cur != iter.end \
iter.cur = iter.next, iter.next = iter.cur->next)

With that and some trivial changes *all* multiple evaluation possibilities are
gone.

(_iter_ in there would go, thats just so I can have both in the same file for
now).

> There are certainly some multiple-evaluation macros, but I don't think
> they are associated with core data types. You will not find any in
> pg_list.h for instance. I think it's important that these new forms
> of list be as easy and reliable to use as List is. I'm willing to trade
> off some micro-performance to have that.
Just from what I had in open emacs frames without switching buffers when I read
that email:
Min/Max in c.h and about half of the macros in htup.h (I had the 9.1 tree open
at that point)...

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-16 17:56:54
Message-ID: 201209161956.54346.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Sunday, September 16, 2012 04:23:14 PM Andres Freund wrote:
> What do you think about something like:
>
> typedef struct dlist_iter
> {
> /*
> * Use a union with equivalent storage as dlist_node to make it possible
> to * initialize the struct inside a macro without multiple evaluation. */
> union {
> struct {
> dlist_node *cur;
> dlist_node *end;
> };
> dlist_node init;
> };
> } dlist_iter;
>
> typedef struct dlist_mutable_iter
> {
> union {
> struct {
> dlist_node *cur;
> dlist_node *end;
> };
> dlist_node init;
> };
> dlist_node *next;
> } dlist_mutable_iter;
>
> #define dlist_iter_foreach(iter, ptr) \
> for (iter.init = (ptr)->head; iter.cur != iter.end; \
> iter.cur = iter.cur->next)
>
> #define dlist_iter_foreach_modify(iter, ptr)
\
> for (iter.init = (ptr)->head, iter.next = iter.cur->next; \
> iter.cur != iter.end \
> iter.cur = iter.next, iter.next = iter.cur->next)
>
> With that and some trivial changes *all* multiple evaluation possibilities
> are gone.
>
> (_iter_ in there would go, thats just so I can have both in the same file
> for now).

I am thinking whether a macro like:

#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
#define assert_compatible_types(a, b) _Static_assert( \
__builtin_types_compatible_p(a, __typeof__ (b) ), \
"variable `" #b "` is not compatible to type `" #a "`" )
#else
#define assert_compatible_types(a, b) (void)0
#endif

used like:

#define dlist_iter_foreach(iter, ptr) \
assert_compatible_types(dlist_iter, iter); \
for (iter.init = (ptr)->head; iter.cur != iter.end; \
iter.cur = iter.cur->next)

would be useful.

If you use the wrong type you get an error like:

error: static assertion failed: "variable `iter` is not compatible to type
`dlist_iter`"

Do people think this is something worthwile for some of the macros in pg? At
times the compiler errors that get generated in larger macros can be a bit
confusing and something like that would make it easier to see the originating
error.

I found __builtin_types_compatible while perusing the gcc docs to find whether
there is something like __builtin_constant_p for checking the pureness of an
expression ;)

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-28 22:48:43
Message-ID: 201209290048.43993.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Friday, September 14, 2012 10:57:54 PM Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > One thing I would like more input in, is whether people think it's
> > worthwhile to split dlists and slists into separate files. Thus far
> > this has been mentioned by three people independently.
>
> They're small enough and similar enough that one header and one .c file
> seem like plenty; but I don't really have a strong opinion about it.
>
> > Another question is whether ilist_container() should actually be a more
> > general macro "containerof" defined in c.h. (ISTM it would be necessary
> > to have this macro if we want to split into two files; that way we don't
> > need to have two macros dlist_container and slist_container with
> > identical definition, or alternatively a third file that defines just
> > ilist_container)
>
> I'd vote for not having that at all, but rather two separate macros
> dlist_container and slist_container. If we had a bunch of operations
> that could work interchangeably on dlists and slists, it might be worth
> having a concept of "ilist" --- but if we only have this, it would be
> better to remove the concept from the API altogether.
>
> > Third question is about the INLINE_IF_POSSIBLE business as commented by
> > Peter. It seems to me that the simple technique used here to avoid
> > having two copies of the source could be used by memcxt.c, list.c,
> > sortsupport.c as well (maybe clean up fastgetattr too).
>
> Yeah, looks reasonable ... material for a different patch of course.
> But that would mean INLINE_IF_POSSIBLE should be defined someplace else,
> perhaps c.h. Also, I'm not that thrilled with having the header file
> define ILIST_USE_DEFINITION. I suggest that it might be better to do
>
> #if defined(USE_INLINE) || defined(DEFINE_ILIST_FUNCTIONS)
> ... function decls here ...
> #else
> ... extern decls here ...
> #endif
>
> where ilist.c defines DEFINE_ILIST_FUNCTIONS before including the
> header.
I am preparing a new version of this right now. So, some last ditch questions
are coming up...

The reason I had the header declare DEFINE_ILIST_FUNCTIONS (or rather
ILIST_USE_DEFINITION back then) instead of reusing USE_INLINE directly is that
it makes it easier to locally change a "module" to not inlining which makes
testing the !USE_INLINE case easier. Does anybody think this is worth
something? I have no strong feelings but found it convenient.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-28 23:39:03
Message-ID: 17859.1348875543@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> The reason I had the header declare DEFINE_ILIST_FUNCTIONS (or rather
> ILIST_USE_DEFINITION back then) instead of reusing USE_INLINE directly is that
> it makes it easier to locally change a "module" to not inlining which makes
> testing the !USE_INLINE case easier. Does anybody think this is worth
> something? I have no strong feelings but found it convenient.

Right offhand it doesn't seem like it really gains that much even for
that use-case. You'd end up editing the include file either way, just
slightly differently.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-28 23:42:49
Message-ID: 201209290142.50065.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Saturday, September 29, 2012 01:39:03 AM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > The reason I had the header declare DEFINE_ILIST_FUNCTIONS (or rather
> > ILIST_USE_DEFINITION back then) instead of reusing USE_INLINE directly is
> > that it makes it easier to locally change a "module" to not inlining
> > which makes testing the !USE_INLINE case easier. Does anybody think this
> > is worth something? I have no strong feelings but found it convenient.
>
> Right offhand it doesn't seem like it really gains that much even for
> that use-case. You'd end up editing the include file either way, just
> slightly differently.
Well, with USE_INLINE you have to recompile the whole backend because you
otherwise easily end up with strange incompatibilities between files.

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-28 23:54:37
Message-ID: 18107.1348876477@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Saturday, September 29, 2012 01:39:03 AM Tom Lane wrote:
>> Right offhand it doesn't seem like it really gains that much even for
>> that use-case. You'd end up editing the include file either way, just
>> slightly differently.

> Well, with USE_INLINE you have to recompile the whole backend because you
> otherwise easily end up with strange incompatibilities between files.

Eh? You would anyway, or at least recompile every .o file depending on
that header, if what you want is to inline or de-inline the functions.
There's no magic shortcut for that.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-29 00:14:42
Message-ID: 201209290214.43020.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Current version is available at branch ilist in:
git://git.postgresql.org/git/users/andresfreund/postgres.git
ssh://git(at)git(dot)postgresql(dot)org/users/andresfreund/postgres.git

Based on Alvaro's last version I made several changes, including:
* naming is now [ds]list_*
* README is split back into the header
* README is rewritten
* much improved comments
* no non-error checking for empty lists anymore, Asserts added everywhere
* no multiple evaluation at all anymore
* introduction of [ds]list_iterator structs
* typechecking added to macros
* DLIST_STATIC_INIT added to initialize list elements at declaration time.
* added some more functions (symetry, new users)
* s/ILIST_USE_DEFINITION/ILIST_DEFINE_FUNCTIONS/
* don't declare ILIST_DEFINE_FUNCTIONS in the header, rely on USE_INLINE
* pgindent compatible styling

Patch 0001 contains a assert_compatible_types(a, b) and a
assert_compatible_types_bool(a, b) macro which I found very useful to make it
harder to misuse the api. I think its generally useful and possibly should be
used in more places.

Opinions?

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Add-compile-time-checked-assert_compatible_types-mac.patch text/x-patch 1.5 KB
0002-Add-ds-list-s-which-can-be-used-to-embed-lists-in-bi.patch text/x-patch 26.2 KB
0003-Remove-current-users-of-dllist.h.patch text/x-patch 29.1 KB
0004-Remove-dllist.-ch.patch text/x-patch 7.9 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v2
Date: 2012-09-29 00:15:43
Message-ID: 201209290215.43447.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Saturday, September 29, 2012 01:54:37 AM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Saturday, September 29, 2012 01:39:03 AM Tom Lane wrote:
> >> Right offhand it doesn't seem like it really gains that much even for
> >> that use-case. You'd end up editing the include file either way, just
> >> slightly differently.
> >
> > Well, with USE_INLINE you have to recompile the whole backend because you
> > otherwise easily end up with strange incompatibilities between files.
>
> Eh? You would anyway, or at least recompile every .o file depending on
> that header, if what you want is to inline or de-inline the functions.
> There's no magic shortcut for that.
Well, --enable-depend copes with changing that in the header fine. As long as
its only used in a low number of files thats shorter than a full rebuild ;)
Anyway, changed.

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-29 00:19:23
Message-ID: 201209290219.24075.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Add [ds]list's which can be used to embed lists in bigger data structures
> without additional memory management

> Alvaro, Andres, Review by Peter G. and Tom
This is missing Robert. Sorry for that.

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 16:57:32
Message-ID: 27446.1349024252@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Patch 0001 contains a assert_compatible_types(a, b) and a
> assert_compatible_types_bool(a, b) macro which I found very useful to make it
> harder to misuse the api. I think its generally useful and possibly should be
> used in more places.

This seems like basically a good idea, but the macro names are very
unfortunately chosen: they don't comport with our other names for
assertion macros, and they imply that the test is symmetric which it
isn't. It's also unclear what the point of the _bool version is
(namely, to be used in expression contexts in macros).

I suggest instead

AssertVariableIsOfType(varname, typename)

AssertVariableIsOfTypeMacro(varname, typename)

Or possibly we should leave off the "Assert" prefix, since this will be
a compile-time-constant check and thus not really all that much like
the existing run-time Assert mechanism. Or write "Check" instead of
"Assert", or some other verb.

Anybody got another color for this bikeshed?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 20:33:28
Message-ID: 5171.1349037208@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Current version is available at branch ilist in:
> git://git.postgresql.org/git/users/andresfreund/postgres.git
> ssh://git(at)git(dot)postgresql(dot)org/users/andresfreund/postgres.git

I'm still pretty desperately unhappy with your insistence on circularly
linked dlists. Not only does that make initialization problematic,
but now it's not even consistent with slists.

A possible compromise that would fix the initialization issue is to
allow an empty dlist to be represented as *either* two NULL pointers
or a pair of self-pointers. Conversion from one case to the other
could happen like this:

INLINE_IF_POSSIBLE void
dlist_push_head(dlist_head *head, dlist_node *node)
{
+ if (head->head.next == NULL)
+ dlist_init(head);
+
node->next = head->head.next;
node->prev = &head->head;
node->next->prev = node;
head->head.next = node;

dlist_check(head);
}

It appears to me that of the inline'able functions, only dlist_push_head
and dlist_push_tail would need to do this; the others presume nonempty
lists and so wouldn't have to deal with the NULL/NULL case.
dlist_is_empty would need one extra test too. dlist_foreach could be
something like

#define dlist_foreach(iter, ptr)
for (AssertVariableIsOfTypeMacro(iter, dlist_iter),
AssertVariableIsOfTypeMacro(ptr, dlist_head *),
iter.end = &(ptr)->head,
iter.cur = iter.end->next ? iter.end->next : iter.end;
iter.cur != iter.end;
iter.cur = iter.cur->next)

I remain unimpressed with the micro-efficiency of this looping code,
but since you're set on pessimizing list iteration to speed up list
alteration, maybe it's the best we can do.

BTW, the "fast path" in dlist_move_head can't be right can it?

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 20:33:47
Message-ID: 201209302233.47275.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sunday, September 30, 2012 06:57:32 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > Patch 0001 contains a assert_compatible_types(a, b) and a
> > assert_compatible_types_bool(a, b) macro which I found very useful to
> > make it harder to misuse the api. I think its generally useful and
> > possibly should be used in more places.
>
> This seems like basically a good idea, but the macro names are very
> unfortunately chosen: they don't comport with our other names for
> assertion macros, and they imply that the test is symmetric which it
> isn't. It's also unclear what the point of the _bool version is
> (namely, to be used in expression contexts in macros).
>
> I suggest instead
>
> AssertVariableIsOfType(varname, typename)
>
> AssertVariableIsOfTypeMacro(varname, typename)
>
> Or possibly we should leave off the "Assert" prefix, since this will be
> a compile-time-constant check and thus not really all that much like
> the existing run-time Assert mechanism. Or write "Check" instead of
> "Assert", or some other verb.
>
> Anybody got another color for this bikeshed?
No, happy with the new name.

Thanks for committing! Wondered for a minute what the point of autoconfiscation
is/was but I see that e.g. clang already works... Nice.

The bizarre syntactic placement requirements directly come from the standard
btw. No idea why they thought that would be a good idea... (check 6.7.1,
6.7.2.1, 6.7.10).

Perhaps we need to decouple _Static_assert support from compound statement
support at some point, but we will see.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 20:48:01
Message-ID: 5441.1349038081@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Perhaps we need to decouple _Static_assert support from compound statement
> support at some point, but we will see.

Yeah, possibly, but until we have an example of a non-gcc-compatible
compiler that can do something equivalent, it's hard to guess how we
might need to alter the autoconf tests. Anyway the important thing
for now is the external specification of the macros, and I think we're
good on that (modulo any better naming suggestions).

I'm fairly sure there are already a few cases of Asserts on
compile-time-constant expressions, so I made sure that there was a layer
allowing access to _Static_assert without the type-compatibility code.
I didn't go looking for anything to convert, but I think there's some
in the shared memory initialization code.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 21:34:09
Message-ID: 201209302334.09495.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Sunday, September 30, 2012 10:33:28 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > Current version is available at branch ilist in:
> > git://git.postgresql.org/git/users/andresfreund/postgres.git
> > ssh://git(at)git(dot)postgresql(dot)org/users/andresfreund/postgres.git
>
> I'm still pretty desperately unhappy with your insistence on circularly
> linked dlists. Not only does that make initialization problematic,
> but now it's not even consistent with slists.
The slist code originally grew out of the dlist code and thus was pretty
similar, but at some point (after your dislike of the circular lists?, not
sure) I thought about it again and found no efficiency differences for any of
the common manipulations in single linked lists because you don't need to deal
with prev and tail pointers, so I saw no point in insisting there. No external
user should care.

> A possible compromise that would fix the initialization issue is to
> allow an empty dlist to be represented as *either* two NULL pointers
> or a pair of self-pointers. Conversion from one case to the other
> could happen like this:

> It appears to me that of the inline'able functions, only dlist_push_head
> and dlist_push_tail would need to do this; the others presume nonempty
> lists and so wouldn't have to deal with the NULL/NULL case.
> dlist_is_empty would need one extra test too.
The problem is that dlist_push_head/tail and and dlist_is_empty are prety hot
code paths.

In transaction reassembly/wal decoding for every wal record that we need to
look at in the context of a transaction the code very roughly does something
like:

/* get change */
Change *change;

if (dlist_is_empty(&apply_cache->cached_changes))
change = dlist_container(..., dlist_pop_head_node(&apply_cache-
>cached_changes))
else
change = malloc(...);

/* get data from wal */
fill_change_change(current_wal_record, change);

/* remember change, till TX is complete */
dlist_push_tail(tx->changes, change->node);

(In reality more of those happen, but anyway)

We literally have tens of thousands list manipulation a second if the server is
busy. Iteration only happens once a XLOG_COMMIT/ABORT is found (in combination
with merging the subtransaction's changes).

In the slab allocator I originally built this for it was exactly the same. The
lists are manipulated for every palloc/pfree but only iterated at
MemoryContextReset.

I am really sorry for being stubborn here, but I changed to circular lists
after profiling and finding that pipeline stalls & misprediced branches where
the major thing I could change. Not sure how we can resolv this :(

> BTW, the "fast path" in dlist_move_head can't be right can it?
Yea, its crap, thanks for noticing. Shouldn't cause any issues except being
slower, thats why I probably didn't notice I broke it at some point. ->next is
missing.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-09-30 21:42:56
Message-ID: 201209302342.56993.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sunday, September 30, 2012 10:48:01 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > Perhaps we need to decouple _Static_assert support from compound
> > statement support at some point, but we will see.
>
> Yeah, possibly, but until we have an example of a non-gcc-compatible
> compiler that can do something equivalent, it's hard to guess how we
> might need to alter the autoconf tests. Anyway the important thing
> for now is the external specification of the macros, and I think we're
> good on that (modulo any better naming suggestions).
I thought msvc supported _Static_assert as well, but after a short search it
seems I misremembered and it only supports static_assert from C++11 (which is
plausible, because I've worked on a C++11 project which was ported to windows
). I don't know how C and C++ compilation modes are different in msvc.

I really don't understand why C and C++ standard development isn't coordinated
more... They seem to come up with annoying incompatibilities all the time.

> I'm fairly sure there are already a few cases of Asserts on
> compile-time-constant expressions, so I made sure that there was a layer
> allowing access to _Static_assert without the type-compatibility code.
> I didn't go looking for anything to convert, but I think there's some
> in the shared memory initialization code.
Definitely a sensible goal. I wasn't really sure how well the idea would be
received after I asked before and only heard echoes of my excitement and didn't
want to spend too much time on it...

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-10-01 13:19:35
Message-ID: 50699867.5050005@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 9/30/12 5:42 PM, Andres Freund wrote:
> I thought msvc supported _Static_assert as well, but after a short search it
> seems I misremembered and it only supports static_assert from C++11 (which is
> plausible, because I've worked on a C++11 project which was ported to windows
> ). I don't know how C and C++ compilation modes are different in msvc.

The issue is rather that the MSVC guys have decided not to bother
supporting C99.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-10-01 15:33:01
Message-ID: 25206.1349105581@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Sunday, September 30, 2012 10:33:28 PM Tom Lane wrote:
>> I'm still pretty desperately unhappy with your insistence on circularly
>> linked dlists. Not only does that make initialization problematic,
>> but now it's not even consistent with slists.

> We literally have tens of thousands list manipulation a second if the server is
> busy.

Tens of thousands, with maybe 1ns extra per call, adds up to what?

> I am really sorry for being stubborn here, but I changed to circular lists
> after profiling and finding that pipeline stalls & misprediced branches where
> the major thing I could change. Not sure how we can resolv this :(

I'm going to be stubborn too. I think you're allowing very small
micro-optimization arguments to contort the design of a fundamental data
structure, in a way that makes it harder to use. That's not a tradeoff
I like. Especially when the micro-optimization isn't even uniformly a
win. I remain of the opinion that the extra cycles spent on iteration
(which are real despite your denials) will outweigh any savings in list
alteration in many use-cases.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Subject: Re: embedded list v3
Date: 2012-10-01 16:53:47
Message-ID: 201210011853.47524.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, October 01, 2012 05:33:01 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Sunday, September 30, 2012 10:33:28 PM Tom Lane wrote:
> >> I'm still pretty desperately unhappy with your insistence on circularly
> >> linked dlists. Not only does that make initialization problematic,
> >> but now it's not even consistent with slists.
> >
> > We literally have tens of thousands list manipulation a second if the
> > server is busy.
>
> Tens of thousands, with maybe 1ns extra per call, adds up to what?
Well, a pipeline stall is a bit more than a ns.

> > I am really sorry for being stubborn here, but I changed to circular
> > lists after profiling and finding that pipeline stalls & misprediced
> > branches where the major thing I could change. Not sure how we can
> > resolv this :(
>
> I'm going to be stubborn too. I think you're allowing very small
> micro-optimization arguments to contort the design of a fundamental data
> structure, in a way that makes it harder to use. That's not a tradeoff
> I like.
Your usability problem is the initialization? Iteration?

dlist_initialiaized_(push_head|push_tail|is_empty)() + your hybrid approach of
checking for NULL at the plain functions?

> Especially when the micro-optimization isn't even uniformly a
> win. I remain of the opinion that the extra cycles spent on iteration
> (which are real despite your denials) will outweigh any savings in list
> alteration in many use-cases.
I am not denying that one more register used which possibly leading to one
more register spill can be an efficiency difference. Just that it is not as big
as the differences are for modification.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list v3
Date: 2012-10-08 19:43:51
Message-ID: CAEYLb_X=RUE0z0ZuYH8AE7+jx=ZXSDULEdyuFZHPpct0FxcxDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 September 2012 01:14, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Current version is available at branch ilist in:
> git://git.postgresql.org/git/users/andresfreund/postgres.git
> ssh://git(at)git(dot)postgresql(dot)org/users/andresfreund/postgres.git

I have taken a look at the ilist branch, merged into today's master.
This patch includes changes to core, so that certain call sites now
call this new doubly-linked list infrastructure rather than
infrastructure located in dllist.c, which the patch deprecates, per
Andres' proprosal. As a convenience to others, I attach a patch file
of same. I have taken the liberty of altering it so that it now uses
the static assert infrastructure that Tom split off, altered somewhat,
and committed about a week ago, with commits
ea473fb2dee7dfe61f8fbdfac9d9da87d84e564e and
0d0aa5d29175c539db1981be27dbbf059be6f3b1. I haven't altered anything
beyond what is needed to make the patch build against head, including
changes already suggested by Tom; this is still entirely Andres'
patch.

Pendantry: This should be in alphabetical order:

! OBJS = stringinfo.o ilist.o

I notice that the patch (my revision) produces a whole bunch of
warnings like this with Clang, though not GCC:

""""""""

[peter(at)peterlaptop cache]$ clang -O2 -g -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -I../../../../src/include
-D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -I/usr/include/libxml2 -c -o
catcache.o catcache.c -MMD -MP -MF .deps/catcache.Po
clang: warning: argument unused during compilation:
'-fexcess-precision=standard'
catcache.c:457:21: warning: expression result unused [-Wunused-value]
CatCache *ccp = slist_container(CatCache, cc_next,
cache_iter.cur);

^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../src/include/lib/ilist.h:721:3: note: expanded from macro
'slist_container'
(AssertVariableIsOfTypeMacro(ptr, slist_node *),...
^
../../../../src/include/c.h:735:2: note: expanded from macro
'AssertVariableIsOfTypeMacro'
StaticAssertExpr(__builtin_types_compatible_p(__typeof__(varname),
typename), \
^
../../../../src/include/c.h:710:46: note: expanded from macro 'StaticAssertExpr'
({ StaticAssertStmt(condition, errmessage); true; })
^
../../../../src/include/c.h:185:15: note: expanded from macro 'true'
#define true ((bool) 1)
^ ~

*** SNIP SNIP SNIP ***

catcache.c:1818:21: warning: expression result unused [-Wunused-value]
CatCache *ccp = slist_container(CatCache, cc_next, iter.cur);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../../src/include/lib/ilist.h:722:3: note: expanded from macro
'slist_container'
AssertVariableIsOfTypeMacro(((type*)NULL)->membername,
slist_node), \
^
../../../../src/include/c.h:735:2: note: expanded from macro
'AssertVariableIsOfTypeMacro'
StaticAssertExpr(__builtin_types_compatible_p(__typeof__(varname),
typename), \
^
../../../../src/include/c.h:710:46: note: expanded from macro 'StaticAssertExpr'
({ StaticAssertStmt(condition, errmessage); true; })
^
../../../../src/include/c.h:185:15: note: expanded from macro 'true'
#define true ((bool) 1)
^ ~
28 warnings generated.

""""""""

I suppose that this is something that's going to have to be addressed
sooner or later.

This seems kind of arbitrary:

/* The socket number we are listening for connections on */
int PostPortNumber;
+
/* The directory names for Unix socket(s) */
char *Unix_socket_directories;
+
/* The TCP listen address(es) */
char *ListenAddresses;

This looks funny:

+ #endif /* defined(USE_INLINE) ||
+ * defined(ILIST_DEFINE_FUNCTIONS) */

I tend to consider the 80-column thing a recommendation more than a requirement.

A further stylistic gripe is that this:

+ #define dlist_container(type, membername, ptr) \
+ (AssertVariableIsOfTypeMacro(ptr, dlist_node *), \
+ AssertVariableIsOfTypeMacro(((type *) NULL)->membername, dlist_node), \
+ ((type *)((char *)(ptr) - offsetof(type, membername))) \
+ )

Should probably look like this:

+ #define dlist_container(type, membername, ptr) \
+ (AssertVariableIsOfTypeMacro(ptr, dlist_node *), \
+ AssertVariableIsOfTypeMacro(((type *) NULL)->membername, dlist_node), \
+ ((type *)((char *)(ptr) - offsetof(type, membername))))

This is a little unclear:

+ * dlist_foreach (iter, &db->tables)
+ * {
+ * // inside an *_foreach the iterator's .cur field can be used to access
+ * // the current element

This comment:

+ * Singly linked lists are *not* circularly linked (how could they be?) in
+ * contrast to the doubly linked lists. As no pointer to the last list element

Isn't quite accurate, if I've understood you correctly; it is surely
possible in principle to have a circular singly-linked list.

This could be a little clearer; its "content"?:

+ * This is used to convert a dlist_node* returned by some list
+ * navigation/manipulation back to its content.

I don't think you should refer to --enable-cassert here; it is
better-principled to refer to USE_ASSERT_CHECKING:

+ /*
+ * enable for extra debugging. This is rather expensive, so it's not
enabled by
+ * default even with --enable-cassert.
+ */
+ /* #define ILIST_DEBUG */

As to the substantial issues, I have no strong opinion either way as
to whether or not the doubly-linked list should be circular. It might
be nice to see some numbers, if that's what is informing Andres'
thinking here.

The code appears to be reasonable to me. The example is generally illustrative.

That's all for now.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachment Content-Type Size
ilist.2012_10_08.patch application/octet-stream 75.5 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list v3
Date: 2012-10-08 20:03:52
Message-ID: 201210082203.52764.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Peter,

On Monday, October 08, 2012 09:43:51 PM Peter Geoghegan wrote:
> Pendantry: This should be in alphabetical order:
>
> ! OBJS = stringinfo.o ilist.o
Argh. Youve said that before. Somehow I reintroduced it...

> I notice that the patch (my revision) produces a whole bunch of
> warnings like this with Clang, though not GCC:
>
> """"""""
>
> [peter(at)peterlaptop cache]$ clang -O2 -g -Wall -Wmissing-prototypes
> -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
> -fwrapv -fexcess-precision=standard -g -I../../../../src/include
> -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -I/usr/include/libxml2 -c -o
> catcache.o catcache.c -MMD -MP -MF .deps/catcache.Po
> clang: warning: argument unused during compilation:
> '-fexcess-precision=standard'
> catcache.c:457:21: warning: expression result unused [-Wunused-value]
> CatCache *ccp = slist_container(CatCache, cc_next,
> cache_iter.cur);
>
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../../../src/include/lib/ilist.h:721:3: note: expanded from macro
> 'slist_container'
> (AssertVariableIsOfTypeMacro(ptr, slist_node *),...
> ^
> ../../../../src/include/c.h:735:2: note: expanded from macro
> 'AssertVariableIsOfTypeMacro'
> StaticAssertExpr(__builtin_types_compatible_p(__typeof__(varname),
> typename), \
> ^
> ../../../../src/include/c.h:710:46: note: expanded from macro
> 'StaticAssertExpr' ({ StaticAssertStmt(condition, errmessage); true; })
> ^
> ../../../../src/include/c.h:185:15: note: expanded from macro 'true'
> #define true ((bool) 1)
> ^ ~
>
> *** SNIP SNIP SNIP ***
>
> catcache.c:1818:21: warning: expression result unused [-Wunused-value]
> CatCache *ccp = slist_container(CatCache, cc_next,
> iter.cur); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../../../src/include/lib/ilist.h:722:3: note: expanded from macro
> 'slist_container'
> AssertVariableIsOfTypeMacro(((type*)NULL)->membername,
> slist_node), \
> ^
> ../../../../src/include/c.h:735:2: note: expanded from macro
> 'AssertVariableIsOfTypeMacro'
> StaticAssertExpr(__builtin_types_compatible_p(__typeof__(varname),
> typename), \
> ^
> ../../../../src/include/c.h:710:46: note: expanded from macro
> 'StaticAssertExpr' ({ StaticAssertStmt(condition, errmessage); true; })
> ^
> ../../../../src/include/c.h:185:15: note: expanded from macro 'true'
> #define true ((bool) 1)
> ^ ~
> 28 warnings generated.
>
> """"""""
>
> I suppose that this is something that's going to have to be addressed
> sooner or later.
That looks like an annoying warning. Split of StaticAssertExpr into
StaticAssertExpr and StaticAssertExprBool?

> This seems kind of arbitrary:
>
> /* The socket number we are listening for connections on */
> int PostPortNumber;
> +
> /* The directory names for Unix socket(s) */
> char *Unix_socket_directories;
> +
> /* The TCP listen address(es) */
> char *ListenAddresses;
Yep, no idea why I added the spaces.

> This looks funny:
>
> + #endif /* defined(USE_INLINE) ||
> + * defined(ILIST_DEFINE_FUNCTIONS) */
>
> I tend to consider the 80-column thing a recommendation more than a
> requirement.
Thats pgindent's doing. I couldn't convince it not to do that short of making
it a multiline comment with ----'s.

> A further stylistic gripe is that this:
>
> + #define dlist_container(type, membername, ptr)
\
> + (AssertVariableIsOfTypeMacro(ptr, dlist_node *), \
> + AssertVariableIsOfTypeMacro(((type *) NULL)->membername, dlist_node),
> \ + ((type *)((char *)(ptr) - offsetof(type, membername)))
\
> + )
>
> Should probably look like this:
>
> + #define dlist_container(type, membername, ptr)
\
> + (AssertVariableIsOfTypeMacro(ptr, dlist_node *), \
> + AssertVariableIsOfTypeMacro(((type *) NULL)->membername, dlist_node),
> \ + ((type *)((char *)(ptr) - offsetof(type, membername))))
Why? I find the former better readable.

> This is a little unclear:
>
> + * dlist_foreach (iter, &db->tables)
> + * {
> + * // inside an *_foreach the iterator's .cur field can be used to
> access + * // the current element

> This comment:
>
> + * Singly linked lists are *not* circularly linked (how could they be?)
> in + * contrast to the doubly linked lists. As no pointer to the last
> list element
>
> Isn't quite accurate, if I've understood you correctly; it is surely
> possible in principle to have a circular singly-linked list.
Its complete crap. One shouldn't write comments after a 2h delayed 6h train
ride. No idea what exactly warped my mind there.

> This could be a little clearer; its "content"?:
>
> + * This is used to convert a dlist_node* returned by some list
> + * navigation/manipulation back to its content.
Youre right.'containing element'? 'containing struct'?

> I don't think you should refer to --enable-cassert here; it is
> better-principled to refer to USE_ASSERT_CHECKING:
Fine with me.

> That's all for now.
Thanks.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-10 03:35:35
Message-ID: 20121010033534.GA11741@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here's a new version of this patch, updated to STATIC_IF_INLINE. It
addresses most stuff mentioned so far, excepting only the ugly clang
warning in the static assert macros as pointed out by Peter. The
explanatory comment at the top of ilist.h got reformatted a bit and
slightly reworded as well.

The first file, ilist-4.patch, follows the same semantics implemented by
Andres originally: doubly linked lists are always circular, even when
empty. If you apply the second file (ilist-4-circular.patch) on top of
that, you get lists that can be validly initialized to two NULL
pointers. catcache.c gets an example of such lists: instead of calling
dlist_init repeatedly for each cache, we just MemSet() the whole bunch.

I also included two new functions in that patch, dlisti_push_head and
dlisti_push_tail. These functions are identical to dlist_push_head and
dlist_push_tail, except that they do not accept non-circular lists.
What this means is that callers that find the non-circularity acceptable
can use the regular version, and will run dlist_init() on demand;
callers that want the super-tight code can use the other version.
I didn't bother with a new dlist_is_empty.

I imagine both sides will have much to say about this approach. Please
opine.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
ilist-4.patch text/x-diff 74.1 KB
ilist-4-circular.patch text/x-diff 7.7 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-11 13:23:12
Message-ID: 20121011132311.GA4997@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera escribió:

> I also included two new functions in that patch, dlisti_push_head and
> dlisti_push_tail. These functions are identical to dlist_push_head and
> dlist_push_tail, except that they do not accept non-circular lists.
> What this means is that callers that find the non-circularity acceptable
> can use the regular version, and will run dlist_init() on demand;
> callers that want the super-tight code can use the other version.
> I didn't bother with a new dlist_is_empty.

Is there any more input on this? At this point I would recommend
committing this patch _without_ these dlisti functions, i.e. we will
only have the functions that check for NULL-initialized dlists. We can
later discuss whether to include them or not (it would be a much smaller
patch and would not affect the existing functionality in any way.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-11 13:27:17
Message-ID: 201210111527.17188.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, October 11, 2012 03:23:12 PM Alvaro Herrera wrote:
> Alvaro Herrera escribió:
> > I also included two new functions in that patch, dlisti_push_head and
> > dlisti_push_tail. These functions are identical to dlist_push_head and
> > dlist_push_tail, except that they do not accept non-circular lists.
> > What this means is that callers that find the non-circularity acceptable
> > can use the regular version, and will run dlist_init() on demand;
> > callers that want the super-tight code can use the other version.
> > I didn't bother with a new dlist_is_empty.
>
> Is there any more input on this? At this point I would recommend
> committing this patch _without_ these dlisti functions, i.e. we will
> only have the functions that check for NULL-initialized dlists. We can
> later discuss whether to include them or not (it would be a much smaller
> patch and would not affect the existing functionality in any way.)
I can live with that. I didn't have a chance to look at the newest revision
yet, will do that after I finish my first pass through foreign key locks.

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-11 16:37:59
Message-ID: 201210111837.59692.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, October 11, 2012 03:27:17 PM Andres Freund wrote:
> On Thursday, October 11, 2012 03:23:12 PM Alvaro Herrera wrote:
> > Alvaro Herrera escribió:
> > > I also included two new functions in that patch, dlisti_push_head and
> > > dlisti_push_tail. These functions are identical to dlist_push_head and
> > > dlist_push_tail, except that they do not accept non-circular lists.
> > > What this means is that callers that find the non-circularity
> > > acceptable can use the regular version, and will run dlist_init() on
> > > demand; callers that want the super-tight code can use the other
> > > version. I didn't bother with a new dlist_is_empty.
> >
> > Is there any more input on this? At this point I would recommend
> > committing this patch _without_ these dlisti functions, i.e. we will
> > only have the functions that check for NULL-initialized dlists. We can
> > later discuss whether to include them or not (it would be a much smaller
> > patch and would not affect the existing functionality in any way.)
>
> I can live with that. I didn't have a chance to look at the newest revision
> yet, will do that after I finish my first pass through foreign key locks.
I looked at and I am happy enough ;)

One thing:
I think you forgot to adjust dlist_reverse_foreach to the NULL list header.

Thanks!

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-15 09:31:46
Message-ID: 201210151131.46443.andres@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday, October 11, 2012 06:37:59 PM Andres Freund wrote:
> On Thursday, October 11, 2012 03:27:17 PM Andres Freund wrote:
> > On Thursday, October 11, 2012 03:23:12 PM Alvaro Herrera wrote:
> > > Alvaro Herrera escribió:
> > > > I also included two new functions in that patch, dlisti_push_head and
> > > > dlisti_push_tail. These functions are identical to dlist_push_head
> > > > and dlist_push_tail, except that they do not accept non-circular
> > > > lists. What this means is that callers that find the non-circularity
> > > > acceptable can use the regular version, and will run dlist_init() on
> > > > demand; callers that want the super-tight code can use the other
> > > > version. I didn't bother with a new dlist_is_empty.
> > >
> > > Is there any more input on this? At this point I would recommend
> > > committing this patch _without_ these dlisti functions, i.e. we will
> > > only have the functions that check for NULL-initialized dlists. We can
> > > later discuss whether to include them or not (it would be a much
> > > smaller patch and would not affect the existing functionality in any
> > > way.)
> >
> > I can live with that. I didn't have a chance to look at the newest
> > revision yet, will do that after I finish my first pass through foreign
> > key locks.
>
> I looked at and I am happy enough ;)
>
> One thing:
> I think you forgot to adjust dlist_reverse_foreach to the NULL list header.

Tom, whats your thought about Alvaro's latest version (except the "bug"
mentioned above)? It looks like a somewhat fair compromise between our
positions and I don't think the external interface needs to change if we decide
to resolve any of our differences differently.

Greetings,

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-16 20:18:00
Message-ID: 20121016201800.GB10361@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here's the final version. I think this is ready to go in.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
ilist-5.patch text/x-diff 74.9 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)postgresql(dot)org>
Subject: Re: embedded list
Date: 2012-10-17 16:10:37
Message-ID: 20121017161037.GI5217@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera escribió:
> Here's the final version. I think this is ready to go in.

Committed.

There are several uses of SHM_QUEUE in the backend code which AFAICS can
be replaced with dlist. If someone's looking for an easy project,
here's one.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: embedded list
Date: 2012-10-18 20:15:47
Message-ID: 4201.1350591347@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Here's the final version. I think this is ready to go in.

I got around to reviewing this today. I'm pretty seriously annoyed at
the definition of dlist_delete: it should *not* require the list header.
The present coding simply throws away one of the primary advantages of
a doubly-linked list over a singly-linked list, namely that you don't
have to have your hands on the list header in order to unlink a node.
This isn't merely academic either, as I see that the patch to catcache
code actually added a field to struct catctup to support making the
list header available. That's a complete waste of 8 bytes (on a 64-bit
machine) per catalog cache entry. The only thing it buys for us is
the ability to run dlist_check, which is something that isn't even
compiled (not even in an Assert build), and which doesn't actually do
that much useful even if it is compiled --- for instance, there's no way
to verify that the nodes were actually in the list claimed.

I think we should remove the head argument at least from dlist_delete,
and probably also dlist_insert_after and dlist_insert_before.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: embedded list
Date: 2012-10-18 20:39:29
Message-ID: 20121018203929.GS3763@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escribió:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Here's the final version. I think this is ready to go in.
>
> I got around to reviewing this today. I'm pretty seriously annoyed at
> the definition of dlist_delete: it should *not* require the list header.
> The present coding simply throws away one of the primary advantages of
> a doubly-linked list over a singly-linked list, namely that you don't
> have to have your hands on the list header in order to unlink a node.
> This isn't merely academic either, as I see that the patch to catcache
> code actually added a field to struct catctup to support making the
> list header available. That's a complete waste of 8 bytes (on a 64-bit
> machine) per catalog cache entry. The only thing it buys for us is
> the ability to run dlist_check, which is something that isn't even
> compiled (not even in an Assert build), and which doesn't actually do
> that much useful even if it is compiled --- for instance, there's no way
> to verify that the nodes were actually in the list claimed.

Oops. I mentioned this explicitely somewhere in the discussion. I
assumed you had seen that, and that you would have complained had you
found it objectionable. (It's hard enough to figure out if people don't
respond because they don't have a problem with something, or just
because they didn't see it.)

On the other hand, it's convenient to remove them, because in
predicate.c there are plenty of SHM_QUEUE node removals which is clearly
easier to port over to dlist if we don't have to figure out exactly
which list each node is in. (Maybe in other SHM_QUEUE users as well,
but that's the most complex of the bunch.)

> I think we should remove the head argument at least from dlist_delete,
> and probably also dlist_insert_after and dlist_insert_before.

There are more functions that get the list head just to run the check.
Can I assume that you don't propose removing the argument from those?
(dlist_next_node, dlist_prev_node I think are the only ones).

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: embedded list
Date: 2012-10-18 20:56:07
Message-ID: 4957.1350593767@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Oops. I mentioned this explicitely somewhere in the discussion. I
> assumed you had seen that, and that you would have complained had you
> found it objectionable.

Sorry, I've been too busy to pay very much attention to this patch.

>> I think we should remove the head argument at least from dlist_delete,
>> and probably also dlist_insert_after and dlist_insert_before.

> There are more functions that get the list head just to run the check.
> Can I assume that you don't propose removing the argument from those?
> (dlist_next_node, dlist_prev_node I think are the only ones).

Yeah, I wondered whether to do the same for those. But it's less of an
issue there, because in practice the caller is almost certainly going to
also need to do dlist_has_next or dlist_has_prev respectively, and those
require the list header.

On the other hand, applying the same principle to slists, you could
argue that slist_has_next and slist_next_node should not require the
head pointer (since that's throwing away an advantage of slists).
If we wanted to remove the head pointer from those, there would be some
value in not having the head argument in dlist_next_node/dlist_prev_node
for symmetry with slist_next_node.

I'm not as excited about these since it seems relatively less likely to
matter. What do you think?

regards, tom lane