Variable-length FunctionCallInfoData

Lists: pgsql-hackers
From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Variable-length FunctionCallInfoData
Date: 2018-06-05 17:29:52
Message-ID: 20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

While prototyping codegen improvements for JITed expression evaluation,
I once more hit the issue that the FunctionCallInfoData structs are
really large (936 bytes), despite arguments beyond the fourth barely
every being used. I think we should fix that.

What I think we should do is convert
FunctionCallInfoData->{arg,argisnull} into an array of NullableDatum
(new type, a struct of Datum and bool), and then use a variable length
array for the arguments. In the super common case of 2 arguments that
reduces the size of the array from 936 to 64 bytes. Besides the size
reduction this also noticably reduces the number of cachelines accessed
- before it's absolutely guaranteed that the arg and argnull arrays for
the same argument aren't on the same cacheline, after it's almost
guaranteed to be the case.

Attached is a *PROTOTYPE* patch doing so. Note I was too lazy to fully
fix up the jit code, I didn't want to do the legwork before we've some
agreement on this. We also can get rid of FUNC_MAX_ARGS after this, but
there's surrounding code that still relies on it.

There's some added uglyness, which I hope we can polish a bit
further. Right now we allocate a good number of FunctionCallInfoData
struct on the stack - which doesn't quite work afterwards anymore. So
the stack allocations, for the majoroity cases where the argument number
is known, currently looks like:

union {
FunctionCallInfoData fcinfo;
char *fcinfo_data[SizeForFunctionCallInfoData(0)];
} fcinfodata;
FunctionCallInfo fcinfo = &fcinfodata.fcinfo;

that's not pretty, but also not that bad.

It's a bit unfortunate that this'll break some extensions, but I don't
really see a way around that. The current approach, to me, clearly
doesn't have a future. I wonder if we should add a bunch of accessor
macros / inline functions that we (or extension authors) can backpatch
to reduce the pain of maintaining different code paths.

Besides the change here, I think we should also go much further with the
conversion to NullableDatum. There's two main areas of change: I want
to move the execExpr.c related code so steps return data into
NullableDatums - that removes a good chunk of pointer dereferences and
allocations. Secondly I think we should move TupleTableSlot to this
format - the issue with nulls / datums being on separate cachelines is
noticeable in profiles, but more importantly the code looks more
consistent with it.

As an example for the difference in memory usage, here's the memory
consumption at ExecutorRun time, for TPCH's Q01:

master:
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
PortalContext: 1024 total in 1 blocks; 576 free (0 chunks); 448 used:
ExecutorState: 90744 total in 5 blocks; 31568 free (2 chunks); 59176 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 3488 free (0 chunks); 4704 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
Grand total: 149112 bytes in 13 blocks; 82976 free (2 chunks); 66136 used

patch:
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
PortalContext: 1024 total in 1 blocks; 576 free (0 chunks); 448 used:
ExecutorState: 65536 total in 4 blocks; 33536 free (6 chunks); 32000 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 5408 free (0 chunks); 2784 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
Grand total: 123904 bytes in 12 blocks; 86864 free (6 chunks); 37040 used

As you can see, the ExecutorState context uses nearly half the amount of
memory as before. In a lot of cases a good chunk of the benefit is going
to be hidden due to memory context sizing, but I'd expect that to matter
much less for more complex statements and plpgsql functions etc.

Comments?

Greetings,

Andres Freund

Attachment Content-Type Size
v1-0001-Variable-length-FunctionCallInfoData.patch text/x-diff 125.1 KB

From: serge(at)rielau(dot)com
To: "Andres Freund" <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: RE: Variable-length FunctionCallInfoData
Date: 2018-06-05 17:40:22
Message-ID: 20180605104022.9cf3801d03770ada01bb39dc8f52321d.4224f0183a.mailapi@email18.godaddy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Big +1 on this one.

Here is what we did. It's very crude, but minimized the amount of pain:

It helps that the C compiler treats arrays and pointers the same.

I can dig for the complete patch if you want...

Cheers
Serge
/*
* This struct is the data actually passed to an fmgr-called function.
* There are three flavors:
* FunctionCallInfoData:
* Used when the number of arguments is both known and fixed small
* This structure is used for direct function calls involving
* builtin functions
* This structure must be initialized with: InitFunctionCallInfoData()
* FunctionCallInfoDataVariable:
* Used when the number of arguments is unknown and possibly large
* This structure must be allocated with allocFCInfoVar() and initialized with
* InitFunctionCallInfoData().
* FunctionCallInfoDataLarge:
* Used when the number of arguments is unknown, possibly large and
* the struct is embedded somewhere where a variable length is not acceptable
* This structure must be initialized with: InitFunctionCallInfoData()
*
* All structures have the same header and the arg/argnull fields shoule not be
* accessed directly but via the below accessor macros.
*/
typedef struct FunctionCallInfoData
{
FmgrInfo *flinfo; /* ptr to lookup info used for this call */
fmNodePtr context; /* pass info about context of call */
fmNodePtr resultinfo; /* pass or return extra info about result */
Oid fncollation; /* collation for function to use */
bool isnull; /* function must set true if result is NULL */
bool isFixed; /* Must be true */
short nargs; /* # arguments actually passed */
Datum *arg; /* pointer to function arg array */
bool *argnull; /* pointer to null indicator array */
Datum __arg[FUNC_MAX_ARGS_FIX]; /* Arguments passed to function */
bool __argnull[FUNC_MAX_ARGS_FIX]; /* T if arg[i] is actually NULL */
} FunctionCallInfoData;
typedef struct FunctionCallInfoDataVariable
{
FmgrInfo *flinfo; /* ptr to lookup info used for this call */
fmNodePtr context; /* pass info about context of call */
fmNodePtr resultinfo; /* pass or return extra info about result */
Oid fncollation; /* collation for function to use */
bool isnull; /* function must set true if result is NULL */
bool isFixed; /* Must be false */
short nargs; /* # arguments actually passed */
Datum *arg; /* pointer to function arg array */
bool *argnull; /* pointer to null indicator array */
} FunctionCallInfoDataVariable;

typedef struct FunctionCallInfoDataLarge
{
FmgrInfo *flinfo; /* ptr to lookup info used for this call */
fmNodePtr context; /* pass info about context of call */
fmNodePtr resultinfo; /* pass or return extra info about result */
Oid fncollation; /* collation for function to use */
bool isnull; /* function must set true if result is NULL */
bool isFixed; /* Must be false */
short nargs; /* # arguments actually passed */
Datum *arg; /* pointer to function arg array */
bool *argnull; /* pointer to null indicator array */
Datum __arg[FUNC_MAX_ARGS]; /* Arguments passed to function */
bool __argnull[FUNC_MAX_ARGS]; /* T if arg[i] is actually NULL */
} FunctionCallInfoDataLarge;


From: Andres Freund <andres(at)anarazel(dot)de>
To: serge(at)rielau(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-06-05 17:46:12
Message-ID: 20180605174612.fzcw6rzjsrilf4sw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2018-06-05 10:40:22 -0700, serge(at)rielau(dot)com wrote:
> Big +1 on this one.

Cool.

> Here is what we did. It's very crude, but minimized the amount of pain:

I think I'd rather go for my approach in core though. Needlessly
carrying around a bunch of pointers, and adding the necessary
indirections on accesses, and more allocations don't seem to buy us that
much. Nor do I really like the hackyness...

Greetings,

Andres Freund


From: Serge Rielau <serge(at)rielau(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-06-05 17:53:58
Message-ID: a5659d64-6b29-4fd7-944a-f4e2d193c797@rielau.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Yeah, our approach had to mergeable. You can rip that bandaid.


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-06-05 19:08:33
Message-ID: 0ef82301-49b5-f56e-f626-c304379ebed0@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/5/18 13:29, Andres Freund wrote:
> Besides the change here, I think we should also go much further with the
> conversion to NullableDatum. There's two main areas of change: I want
> to move the execExpr.c related code so steps return data into
> NullableDatums - that removes a good chunk of pointer dereferences and
> allocations. Secondly I think we should move TupleTableSlot to this
> format - the issue with nulls / datums being on separate cachelines is
> noticeable in profiles, but more importantly the code looks more
> consistent with it.

There are also a variety of utility functions that return a Datum and
have an extra bool *isnull argument. What are your thoughts on using
NullableDatum more in those cases? Is returning structs a problem for
low-level performance?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-06-05 19:47:38
Message-ID: 20180605194738.amkarpipo7j26guf@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2018-06-05 15:08:33 -0400, Peter Eisentraut wrote:
> On 6/5/18 13:29, Andres Freund wrote:
> > Besides the change here, I think we should also go much further with the
> > conversion to NullableDatum. There's two main areas of change: I want
> > to move the execExpr.c related code so steps return data into
> > NullableDatums - that removes a good chunk of pointer dereferences and
> > allocations. Secondly I think we should move TupleTableSlot to this
> > format - the issue with nulls / datums being on separate cachelines is
> > noticeable in profiles, but more importantly the code looks more
> > consistent with it.
>
> There are also a variety of utility functions that return a Datum and
> have an extra bool *isnull argument.

I was thinking of, for now at least, not touching those.

> What are your thoughts on using NullableDatum more in those cases? Is
> returning structs a problem for low-level performance?

It vastly depends on architecture, compiler and ABI. It'd be a lot
better if we were using C++ (because the caller reserves the space for
such things, and then it's basically free, see return value optimization
/ copy elision). Thus I'm not wild in immediately changing all of them.
I think there aren't that many that aren't performance critical however,
so I guess an argument could be made to change this regardless.

Greetings,

Andres Freund


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-10-09 19:18:02
Message-ID: 20181009191802.ppt6lqcvkpjvkm76@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Here's an updated version of the patch. Besides a rebase the biggest
change is that I've wrapped:

On 2018-06-05 10:29:52 -0700, Andres Freund wrote:
> There's some added uglyness, which I hope we can polish a bit
> further. Right now we allocate a good number of FunctionCallInfoData
> struct on the stack - which doesn't quite work afterwards anymore. So
> the stack allocations, for the majoroity cases where the argument number
> is known, currently looks like:
>
> union {
> FunctionCallInfoData fcinfo;
> char *fcinfo_data[SizeForFunctionCallInfoData(0)];
> } fcinfodata;
> FunctionCallInfo fcinfo = &fcinfodata.fcinfo;
>
> that's not pretty, but also not that bad.

into a macro STACK_FCINFO_FOR_ARGS(varname, numargs). That makes the
code look much nicer.

I think we should go for this. If there's some agreement on that I'll
perform a bit more polishing.

I think it'd probably good to add accessors for value/nullness in
arguments that hide the difference between <v12 and v12, for the sake of
extension authors. Would probably mostly make sense if we backpatched
those for compatibility.

Also attached is a second patch that avoids all the duplication in
fmgr.[ch]. While the savings are nice, I'm a bit doubtful that the
amount of magic here is reasonable. Any opinions?

Greetings,

Andres Freund

Attachment Content-Type Size
v2-0001-Variable-length-FunctionCallInfoData.patch text/x-diff 120.9 KB
v2-0002-Replace-fmgr.-ch-duplication-with-macro-magic.patch text/x-diff 36.3 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-12-14 23:03:48
Message-ID: 20181214230348.zisbzqzj35aqbijj@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2018-10-09 12:18:02 -0700, Andres Freund wrote:
> Here's an updated version of the patch. Besides a rebase the biggest
> change is that I've wrapped:
>
> On 2018-06-05 10:29:52 -0700, Andres Freund wrote:
> > There's some added uglyness, which I hope we can polish a bit
> > further. Right now we allocate a good number of FunctionCallInfoData
> > struct on the stack - which doesn't quite work afterwards anymore. So
> > the stack allocations, for the majoroity cases where the argument number
> > is known, currently looks like:
> >
> > union {
> > FunctionCallInfoData fcinfo;
> > char *fcinfo_data[SizeForFunctionCallInfoData(0)];
> > } fcinfodata;
> > FunctionCallInfo fcinfo = &fcinfodata.fcinfo;
> >
> > that's not pretty, but also not that bad.
>
> into a macro STACK_FCINFO_FOR_ARGS(varname, numargs). That makes the
> code look much nicer.
>
> I think we should go for this. If there's some agreement on that I'll
> perform a bit more polishing.
>
> I think it'd probably good to add accessors for value/nullness in
> arguments that hide the difference between <v12 and v12, for the sake of
> extension authors. Would probably mostly make sense if we backpatched
> those for compatibility.
>
>
> Also attached is a second patch that avoids all the duplication in
> fmgr.[ch]. While the savings are nice, I'm a bit doubtful that the
> amount of magic here is reasonable. Any opinions?

Any comments? Otherwise I plan to press forward with this soon-ish.

Greetings,

Andres Freund


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-12-14 23:33:12
Message-ID: 87sgyzy9dp.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Andres" == Andres Freund <andres(at)anarazel(dot)de> writes:

>> I think it'd probably good to add accessors for value/nullness in
>> arguments that hide the difference between <v12 and v12, for the
>> sake of extension authors. Would probably mostly make sense if we
>> backpatched those for compatibility.

Speaking as an affected extension author: don't backpatch them.

The extension code has to cope with being compiled against a minor
release that doesn't have the backpatch, so the extension author has to
do their own workaround anyway. Having additional conditions based on
the minor release is just more pain.

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-12-15 15:45:21
Message-ID: 25654.1544888721@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Andres" == Andres Freund <andres(at)anarazel(dot)de> writes:
>> I think it'd probably good to add accessors for value/nullness in
>> arguments that hide the difference between <v12 and v12, for the
>> sake of extension authors. Would probably mostly make sense if we
>> backpatched those for compatibility.

> Speaking as an affected extension author: don't backpatch them.

Yeah, I agree.

Looking at the patch, it seems like a real pain that substituting
"STACK_FCINFO_FOR_ARGS(fcinfo, ...)" for "FunctionCallInfoData fcinfo"
has the effect of converting "fcinfo" into a pointer. Is there a way
to define the macro so that that doesn't happen, and the ensuing
minor-but-invasive code changes aren't needed? (It'd be easy in C++,
but not quite seeing how to do it in C, at least not without defining
an additional macro for "fcinfo" that we'd then need to #undef at the
end of the function.)

With or without that, I'm pretty sure you wanted the pad member to be
char fcinfo_data[SizeForFunctionCallInfoData(nargs)]; \
not
char *fcinfo_data[SizeForFunctionCallInfoData(nargs)]; \

I also wonder if we should rename the type FunctionCallInfoData,
perhaps to FunctionCallInfo_Data, so as to intentionally break
code that hasn't been converted. On the other hand, that might
introduce too much useless code churn --- not sure how many live
references to that struct type will remain in place.

One more naming thought: would "LOCAL_FCINFO(...)" be a better
name for that macro? I don't think FOR_ARGS is adding much in
any case.

Why does struct agg_strict_input_check now have *both*
NullableDatum and "bool *nulls"? If that's not a typo,
it needs to be documented what the fields are for.

Please try to avoid random changes of vertical whitespace, eg in
the first hunk in pltcl.c. pgindent isn't very good about
cleaning that up.

I do not think the 0002 patch is a good idea. It's going to add
cycles to function calls, and it's not buying anything I'd call
important.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2018-12-15 19:44:30
Message-ID: 20181215194430.mfynjg7qyoliqst5@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2018-12-15 10:45:21 -0500, Tom Lane wrote:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> > "Andres" == Andres Freund <andres(at)anarazel(dot)de> writes:
> >> I think it'd probably good to add accessors for value/nullness in
> >> arguments that hide the difference between <v12 and v12, for the
> >> sake of extension authors. Would probably mostly make sense if we
> >> backpatched those for compatibility.
>
> > Speaking as an affected extension author: don't backpatch them.
>
> Yeah, I agree.

Ok, convinced.

> Looking at the patch, it seems like a real pain that substituting
> "STACK_FCINFO_FOR_ARGS(fcinfo, ...)" for "FunctionCallInfoData fcinfo"
> has the effect of converting "fcinfo" into a pointer. Is there a way
> to define the macro so that that doesn't happen, and the ensuing
> minor-but-invasive code changes aren't needed? (It'd be easy in C++,
> but not quite seeing how to do it in C, at least not without defining
> an additional macro for "fcinfo" that we'd then need to #undef at the
> end of the function.)

It'd be nice, but I wasn't able to come up with anything either. While
I'd like that, I'm unfortunately doubtful others will agree to move to
c++ for this :P.

> I also wonder if we should rename the type FunctionCallInfoData,
> perhaps to FunctionCallInfo_Data, so as to intentionally break
> code that hasn't been converted. On the other hand, that might
> introduce too much useless code churn --- not sure how many live
> references to that struct type will remain in place.

Probably doable, there ought not to be many FunctionCallInfoData
references afterwards.

> One more naming thought: would "LOCAL_FCINFO(...)" be a better
> name for that macro? I don't think FOR_ARGS is adding much in
> any case.

Hm, that works.

> Why does struct agg_strict_input_check now have *both*
> NullableDatum and "bool *nulls"? If that's not a typo,
> it needs to be documented what the fields are for.

I'll check whether it can be simplified.

> I do not think the 0002 patch is a good idea. It's going to add
> cycles to function calls, and it's not buying anything I'd call
> important.

Yea, same conclusion I came to. I dislike those verbose copies of
functions a lot, but the approach I could find out to resolve that, seem
like a cure worse than the disease.

Greetings,

Andres Freund


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-01-25 20:51:02
Message-ID: 20190125205102.ey5muymsazf4yon2@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2018-12-15 11:44:30 -0800, Andres Freund wrote:
> On 2018-12-15 10:45:21 -0500, Tom Lane wrote:
> > I also wonder if we should rename the type FunctionCallInfoData,
> > perhaps to FunctionCallInfo_Data, so as to intentionally break
> > code that hasn't been converted. On the other hand, that might
> > introduce too much useless code churn --- not sure how many live
> > references to that struct type will remain in place.
>
> Probably doable, there ought not to be many FunctionCallInfoData
> references afterwards.

I did not like FunctionCallInfo_Data, the _ grated
me. FunctionCallInfoBaseData isn't great, but seems better?

> > With or without that, I'm pretty sure you wanted the pad member to be
> > char fcinfo_data[SizeForFunctionCallInfoData(nargs)]; \
> > not
> > char *fcinfo_data[SizeForFunctionCallInfoData(nargs)]; \

Indeed. And that hid a bug or two, where not enough space for
arguments was allocated. I changed a few on-stack infos to be of
FUNC_MAX_ARGS length, because they're not known at compile time. Seems
better than unnecesarily introducing a dynamic allocation, and they're
not that hot locations.

> > One more naming thought: would "LOCAL_FCINFO(...)" be a better
> > name for that macro? I don't think FOR_ARGS is adding much in
> > any case.
>
> Hm, that works.

Done.

> > Why does struct agg_strict_input_check now have *both*
> > NullableDatum and "bool *nulls"? If that's not a typo,
> > it needs to be documented what the fields are for.
>
> I'll check whether it can be simplified.

It can't trivially: For tuplesort cases the null check points into
TupleTableSlot's isnull array, but for other aggs it points into
FunctionCallInfoData->args. If we change TupleTableSlots to use
NullableDatum as well - probably a good idea for efficiency reasons -
we could change that, but that's a separate reasonably large sided
patch. Added a comment to that end.

Updated patch attached. Besides the above changes, there's a fair bit
of comment changes, and I've implemented the necessary JIT changes.

Greetings,

Andres Freund

Attachment Content-Type Size
v3-0001-Change-function-call-information-to-be-variable-l.patch text/x-diff 155.6 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-01-27 02:32:53
Message-ID: 20190127023253.if6755wcx6gkiln4@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2019-01-25 12:51:02 -0800, Andres Freund wrote:
> Updated patch attached. Besides the above changes, there's a fair bit
> of comment changes, and I've implemented the necessary JIT changes.

I pushed a further polished version of this.

The buildfarm largely seems to have had no problem with it, but handfish
failed:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2019-01-26%2022%3A57%3A19
but I have no idea what the error is, nor whether it's related to this
failure, because for reasons I do not understand, the stage message is
effectively empty. Going to wait for another run before investing
resources to debug...

Regards,

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-01-27 02:46:50
Message-ID: 22034.1548557210@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> The buildfarm largely seems to have had no problem with it, but handfish
> failed:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2019-01-26%2022%3A57%3A19
> but I have no idea what the error is, nor whether it's related to this
> failure, because for reasons I do not understand, the stage message is
> effectively empty.

Yeah, we've been seeing random failures with empty stage logs on multiple
machines. I suspect something weird with the buildfarm client script,
but no idea what.

regards, tom lane


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-01-27 04:20:59
Message-ID: b62abdcb-b2cf-f961-d845-8e493cac6071@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 1/26/19 9:46 PM, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> The buildfarm largely seems to have had no problem with it, but handfish
>> failed:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2019-01-26%2022%3A57%3A19
>> but I have no idea what the error is, nor whether it's related to this
>> failure, because for reasons I do not understand, the stage message is
>> effectively empty.
> Yeah, we've been seeing random failures with empty stage logs on multiple
> machines. I suspect something weird with the buildfarm client script,
> but no idea what.

It's happening on a handful of animals (see below) , and at that
intermittently. Some of the animals are leading or bleeding edge
distros. Without more info I'm inclined to suspect environmental factors.

cheers

andrew

pgbfprod=> select sysname, branch, snapshot, log_stage from
public.build_status_log where snapshot > now() - interval '90 days' and
length(log_text) = 0 order by 1,3,2;
  sysname   |    branch     |      snapshot       |           
log_stage           
------------+---------------+---------------------+---------------------------------
 caiman     | REL_11_STABLE | 2018-10-30 05:00:06 | configure.log
 caiman     | REL_10_STABLE | 2018-11-05 01:00:18 | check-pg_upgrade.log
 caiman     | REL_11_STABLE | 2018-11-05 01:05:16 | make.log
 caiman     | REL9_4_STABLE | 2018-11-06 02:57:28 | make.log
 caiman     | REL9_5_STABLE | 2018-11-06 03:00:08 | configure.log
 caiman     | REL_10_STABLE | 2018-11-06 03:12:50 | configure.log
 caiman     | HEAD          | 2018-11-06 03:25:58 | make.log
 caiman     | HEAD          | 2018-11-06 21:55:13 | make.log
 caiman     | HEAD          | 2018-11-09 06:15:27 | make.log
 caiman     | HEAD          | 2018-11-14 08:53:13 |
contrib-install-check-C.log
 caiman     | HEAD          | 2018-11-17 19:58:06 | make-install.log
 caiman     | REL9_6_STABLE | 2018-11-26 02:00:23 | make.log
 caiman     | REL_11_STABLE | 2018-11-26 02:12:53 | make.log
 caiman     | HEAD          | 2018-12-05 13:11:32 | configure.log
 caiman     | REL_11_STABLE | 2018-12-16 20:48:41 | make.log
 caiman     | REL9_5_STABLE | 2018-12-17 01:58:40 | check.log
 caiman     | REL9_6_STABLE | 2018-12-17 02:00:07 | make.log
 caiman     | REL_11_STABLE | 2018-12-17 02:10:13 | make.log
 caiman     | REL_10_STABLE | 2018-12-18 18:00:14 | configure.log
 caiman     | HEAD          | 2018-12-20 22:00:31 | make-contrib.log
 caiman     | REL_11_STABLE | 2018-12-24 11:54:44 | lastcomand.log
 caiman     | HEAD          | 2018-12-24 12:00:13 | configure.log
 caiman     | HEAD          | 2019-01-02 19:00:21 | configure.log
 caiman     | REL_11_STABLE | 2019-01-20 00:54:29 | make.log
 caiman     | REL_11_STABLE | 2019-01-22 06:00:25 | configure.log
 caiman     | REL9_6_STABLE | 2019-01-26 03:50:26 | make.log
 caiman     | HEAD          | 2019-01-26 04:14:53 | make.log
 eelpout    | HEAD          | 2018-11-26 00:35:13 | make.log
 eelpout    | HEAD          | 2018-11-26 00:35:13 | configure.log
 eelpout    | HEAD          | 2018-11-26 00:35:13 | make-contrib.log
 handfish   | REL_11_STABLE | 2018-10-31 00:46:56 | make.log
 handfish   | HEAD          | 2018-10-31 01:00:33 | configure.log
 handfish   | REL_11_STABLE | 2018-11-02 22:25:25 | configure.log
 handfish   | REL_11_STABLE | 2018-11-05 01:05:18 | make.log
 handfish   | REL_11_STABLE | 2018-11-06 03:58:58 | make.log
 handfish   | HEAD          | 2018-11-06 21:51:39 | make.log
 handfish   | REL_11_STABLE | 2018-11-07 01:03:16 | make.log
 handfish   | REL9_5_STABLE | 2018-11-09 05:59:24 | make.log
 handfish   | REL_11_STABLE | 2018-11-09 06:24:31 | make.log
 handfish   | REL_11_STABLE | 2018-11-12 01:00:15 | make.log
 handfish   | HEAD          | 2018-11-26 02:13:04 | make.log
 handfish   | REL_11_STABLE | 2018-12-05 13:11:12 | configure.log
 handfish   | REL_11_STABLE | 2018-12-16 20:45:08 | make.log
 handfish   | REL9_4_STABLE | 2018-12-17 01:55:31 | initdb-C.log
 handfish   | REL_10_STABLE | 2018-12-17 02:09:59 | make.log
 handfish   | HEAD          | 2018-12-18 18:25:55 | make.log
 handfish   | HEAD          | 2018-12-24 12:04:56 | make.log
 handfish   | REL9_6_STABLE | 2019-01-01 16:57:14 | ecpg-check.log
 handfish   | REL_10_STABLE | 2019-01-01 17:05:17 | configure.log
 handfish   | HEAD          | 2019-01-26 05:11:49 | make.log
 handfish   | HEAD          | 2019-01-26 22:57:19 |
testmodules-install-check-C.log
 queensnake | REL9_6_STABLE | 2018-11-26 01:57:27 | check.log
 queensnake | REL_10_STABLE | 2018-11-26 02:00:14 | configure.log
 queensnake | REL_11_STABLE | 2018-11-30 04:57:15 | check-pg_upgrade.log
 queensnake | REL_10_STABLE | 2018-12-05 12:57:28 | make-install.log
 queensnake | REL_11_STABLE | 2018-12-16 20:44:19 | make-contrib.log
 queensnake | REL9_5_STABLE | 2018-12-17 01:57:36 | check-pg_upgrade.log
 queensnake | HEAD          | 2018-12-20 22:09:20 | configure.log
 queensnake | HEAD          | 2019-01-20 00:23:49 | make-contrib.log
 queensnake | HEAD          | 2019-01-26 04:39:01 | make.log
 serinus    | HEAD          | 2018-11-10 23:28:34 | make.log
(61 rows)

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-01-27 07:03:17
Message-ID: CAM-w4HNFfvGYBUA5R9Nqrvrbc1_2m01t-YYBFX3urTx7Srm73A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I assume you already considered and rejected having a fixed size null
bitmap followed by a variable size array of datums. That seems like it
would be denser and work better with cpu cache.

I guess the reason you prefer the struct is because it can be used
elsewhere on its own?


From: Andres Freund <andres(at)anarazel(dot)de>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-01-27 07:13:15
Message-ID: 20190127071315.n5z6efwd5ybyrol6@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2019-01-27 08:03:17 +0100, Greg Stark wrote:
> I assume you already considered and rejected having a fixed size null
> bitmap followed by a variable size array of datums. That seems like it
> would be denser and work better with cpu cache.

It'd be more expensive to access individually (offset calculation +
masks, ~5 insn, not fully pipelineable), it'd not guarantee that the
null bit and datum are on the same cacheline, you could not pass the
null-bit to various functions accepting a bool*, you could not pass the
new equivalent NullableDatums to other functions (like both the past and
current solution allow).

Greetings,

Andres Freund


From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, pgsql-hackers(at)postgresql(dot)org, Filipe Rosset <rosset(dot)filipe(at)gmail(dot)com>
Subject: Re: Variable-length FunctionCallInfoData
Date: 2019-02-17 17:03:40
Message-ID: a2022e64-6c96-2f42-ad30-b90b8eb7ed1f@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 1/26/19 11:20 PM, Andrew Dunstan wrote:
> On 1/26/19 9:46 PM, Tom Lane wrote:
>> Andres Freund <andres(at)anarazel(dot)de> writes:
>>> The buildfarm largely seems to have had no problem with it, but handfish
>>> failed:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=handfish&dt=2019-01-26%2022%3A57%3A19
>>> but I have no idea what the error is, nor whether it's related to this
>>> failure, because for reasons I do not understand, the stage message is
>>> effectively empty.
>> Yeah, we've been seeing random failures with empty stage logs on multiple
>> machines. I suspect something weird with the buildfarm client script,
>> but no idea what.
>
> It's happening on a handful of animals (see below) , and at that
> intermittently. Some of the animals are leading or bleeding edge
> distros. Without more info I'm inclined to suspect environmental factors.
>

I think I have discovered the issue - small logic bug on my part :-(
Luckily it won't affect most users, especially if they are using
run_branches.pl.

Fixed in
<https://github.com/PGBuildFarm/client-code/commit/d4dd30e3b43981b51de8f5b572f131c759737813>
There will be a new release this week which will contain the fix.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services