Re: [HACKERS] Function structure in formatting.c

Lists: pgsql-hackerspgsql-patches
From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Function structure in formatting.c
Date: 2007-08-08 11:36:35
Message-ID: 37ed240d0708080436kc4250d5w405d076f9ea6fef@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hi hackers,

I'm currently poking around in backend/utils/adt/formatting.c with a
view to improving to_date() parsing (see thread at
http://archives.postgresql.org/pgsql-hackers/2007-07/msg00513.php),
and I've noticed that the way the functions are organised is pretty
weird.

The original author appears to have gone to great lengths to make the
various functions work both for conversions *to* string, and *from*
string. For each formatting "keyword" (DD, MM, etc), there is just
one processing function; dch_global, dch_date or dch_time. Each of
these takes an argument called "is_to_char". Since parsing a date out
of a string, and formatting a date into a string, are fundamentally
different objectives the functions end up reading a lot like this:

if (is_to_char)
{
// do something
}
else
{
// do something completely different
}

In fact, almost all of the actual formatting code in the file is
enclosed in one of these if .. else blocks.

To my mind, it would make a lot more sense (and make hacking the file
a lot easier) if the processing functions were split into to_char and
from_char variants. I'm not sure what, if any, advantage is gleaned
by having these functions combined.

I'd like to hear from someone who has more familiarity with
formatting.c on this. Is there some good reason for keeping the
functions unified?

Obviously there's a fair bit of work in splitting the functions up,
but I'd be willing to do it if only to spare my own sanity when
working on to_date parsing.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brendan Jurd" <direvus(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Function structure in formatting.c
Date: 2007-08-08 14:38:28
Message-ID: 225.1186583908@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Brendan Jurd" <direvus(at)gmail(dot)com> writes:
> To my mind, it would make a lot more sense (and make hacking the file
> a lot easier) if the processing functions were split into to_char and
> from_char variants. I'm not sure what, if any, advantage is gleaned
> by having these functions combined.

Yeah, I never liked that approach either. I suppose the idea was to
keep the to- and from- code for each format code close together, but
it blurs what's going on to a much greater extent than that's worth.

regards, tom lane


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Function structure in formatting.c
Date: 2007-08-08 16:07:11
Message-ID: 37ed240d0708080907p587c7713l4938afeeeddb32d7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 8/9/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Brendan Jurd" <direvus(at)gmail(dot)com> writes:
> > To my mind, it would make a lot more sense (and make hacking the file
> > a lot easier) if the processing functions were split into to_char and
> > from_char variants. I'm not sure what, if any, advantage is gleaned
> > by having these functions combined.
>
> Yeah, I never liked that approach either. I suppose the idea was to
> keep the to- and from- code for each format code close together, but
> it blurs what's going on to a much greater extent than that's worth.

Okay, I'll see what I can do.

Incidentally, anybody know what DCH is supposed to stand for?
Throughout the code DCH is used to refer to date/time formatting and
NUM to numeric formatting. Just curious.


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Function structure in formatting.c
Date: 2007-08-09 02:43:06
Message-ID: 37ed240d0708081943w607a96e9y71f891de12f7c1e3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Just quick update on this. It turns out (as it always does) that I
want to refactor a bit more intensively than I first suggested.

The functions dch_global, dch_time and dch_date seem to be wholly
pointless, since dch_global is effectively a one-liner for the FX
flag, and the other two merely wrap around a giant switch statement.

My thought was to split DCH_processor into a to_char flavour and a
from_char flavour, and have those functions do all the work; loop
through each of the FormatNodes and run the giant switch statement.

This means there is no need to put pointers to "action functions" in
each of the KeyWords. It makes the code simpler, more readable -- and
considerably shorter.

A patch will be on the way shortly.


From: "Jaime Casanova" <systemguards(at)gmail(dot)com>
To: "Brendan Jurd" <direvus(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Function structure in formatting.c
Date: 2007-08-09 04:51:39
Message-ID: c2d9e70e0708082151j2a7d61dfpfb41b5f5de1bdd4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 8/9/07, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> Just quick update on this. It turns out (as it always does) that I
> want to refactor a bit more intensively than I first suggested.
>
[...]
> This means there is no need to put pointers to "action functions" in
> each of the KeyWords. It makes the code simpler, more readable -- and
> considerably shorter.
>
> A patch will be on the way shortly.
>

take your time, this seems like it will be for 8.4 anyway

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
Richard Cook


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Jaime Casanova" <systemguards(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Function structure in formatting.c
Date: 2007-08-09 12:09:19
Message-ID: 37ed240d0708090509n6daf6a57qbcd4b8e1239a4589@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 8/9/07, Jaime Casanova <systemguards(at)gmail(dot)com> wrote:
>
> take your time, this seems like it will be for 8.4 anyway
>

I hear you, unfortunately "taking my time" usually means I forget
about it for eight months and by the time I come back to it I've
forgotten what I was doing =)

I wasn't really expecting this to make it into 8.3. I just need to
get it done so I can free up the headspace for other projects.


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: pgsql-patches(at)postgresql(dot)org
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, "Jaime Casanova" <systemguards(at)gmail(dot)com>
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2007-08-10 16:53:11
Message-ID: 37ed240d0708100953q60b09fcbu8d6f1132878288bf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hello,

As discussed on -hackers, I've done some refactoring work on
backend/utils/adt/formatting.c, in an attempt to make the code a bit
more intelligible before improving handling of bogus formats.

This is purely a refactor. The functionality of the file hasn't
changed; it does the same job as before, but it does it in ~200 fewer
lines and ~3.5k fewer characters. The clarity of code is greatly
improved. Sadly, performance appears to be unchanged.

Summary of changes:

* Did away with dch_global, dch_date and dch_time.
* Replaced DCH_processor with two new functions DCH_to_char and
DCH_from_char, which now do all the work previously done by
dch_{global,date,time}.
* Removed the 'action' field from the KeyWord struct as it is no longer useful.
* Changed the type of the 'character' field in the FormatNode struct
to char, because ... that's what it is. The original choice of 'int'
seems to have been an error.
* Removed commented-out function declaration for is_acdc. According
to CVS annotate, this hasn't been in use since sometime in the early
Cretaceous period, and in any case I don't know why you'd want to
check whether a string was the rock band AC/DC. =)
* Reworded some of the comments for clarity.
* Didn't touch any of the number formatting routines.

This compiles cleanly on x86 gentoo and passes check, installcheck and
installcheck-parallel.

Thanks for your time,
BJ

Attachment Content-Type Size
formatting-refactor.diff text/plain 68.1 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Jaime Casanova <systemguards(at)gmail(dot)com>
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2007-08-14 04:00:34
Message-ID: 200708140400.l7E40Yl21519@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Brendan Jurd wrote:
> Hello,
>
> As discussed on -hackers, I've done some refactoring work on
> backend/utils/adt/formatting.c, in an attempt to make the code a bit
> more intelligible before improving handling of bogus formats.
>
> This is purely a refactor. The functionality of the file hasn't
> changed; it does the same job as before, but it does it in ~200 fewer
> lines and ~3.5k fewer characters. The clarity of code is greatly
> improved. Sadly, performance appears to be unchanged.
>
> Summary of changes:
>
> * Did away with dch_global, dch_date and dch_time.
> * Replaced DCH_processor with two new functions DCH_to_char and
> DCH_from_char, which now do all the work previously done by
> dch_{global,date,time}.
> * Removed the 'action' field from the KeyWord struct as it is no longer useful.
> * Changed the type of the 'character' field in the FormatNode struct
> to char, because ... that's what it is. The original choice of 'int'
> seems to have been an error.
> * Removed commented-out function declaration for is_acdc. According
> to CVS annotate, this hasn't been in use since sometime in the early
> Cretaceous period, and in any case I don't know why you'd want to
> check whether a string was the rock band AC/DC. =)
> * Reworded some of the comments for clarity.
> * Didn't touch any of the number formatting routines.
>
> This compiles cleanly on x86 gentoo and passes check, installcheck and
> installcheck-parallel.
>
> Thanks for your time,
> BJ

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2007-10-02 08:35:21
Message-ID: 37ed240d0710020135k31b68e33yea63dad06ac04058@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I noticed an editing error in the patch I originally submitted; it
defined the same debugging macro twice.

I've attached a fresh copy of the patch against the current HEAD with
the fix included.

Cheers,
BJ

On 8/11/07, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> Hello,
>
> As discussed on -hackers, I've done some refactoring work on
> backend/utils/adt/formatting.c, in an attempt to make the code a bit
> more intelligible before improving handling of bogus formats.
>
> This is purely a refactor. The functionality of the file hasn't
> changed; it does the same job as before, but it does it in ~200 fewer
> lines and ~3.5k fewer characters. The clarity of code is greatly
> improved. Sadly, performance appears to be unchanged.
>
> Summary of changes:
>
> * Did away with dch_global, dch_date and dch_time.
> * Replaced DCH_processor with two new functions DCH_to_char and
> DCH_from_char, which now do all the work previously done by
> dch_{global,date,time}.
> * Removed the 'action' field from the KeyWord struct as it is no longer useful.
> * Changed the type of the 'character' field in the FormatNode struct
> to char, because ... that's what it is. The original choice of 'int'
> seems to have been an error.
> * Removed commented-out function declaration for is_acdc. According
> to CVS annotate, this hasn't been in use since sometime in the early
> Cretaceous period, and in any case I don't know why you'd want to
> check whether a string was the rock band AC/DC. =)
> * Reworded some of the comments for clarity.
> * Didn't touch any of the number formatting routines.
>
> This compiles cleanly on x86 gentoo and passes check, installcheck and
> installcheck-parallel.
>
> Thanks for your time,
> BJ
>
>

Attachment Content-Type Size
formatting-refactor_1.diff.gz application/x-gzip 10.0 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2007-10-09 00:33:26
Message-ID: 200710090033.l990XRY02455@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Brendan Jurd wrote:
> I noticed an editing error in the patch I originally submitted; it
> defined the same debugging macro twice.
>
> I've attached a fresh copy of the patch against the current HEAD with
> the fix included.
>
> Cheers,
> BJ
>
> On 8/11/07, Brendan Jurd <direvus(at)gmail(dot)com> wrote:
> > Hello,
> >
> > As discussed on -hackers, I've done some refactoring work on
> > backend/utils/adt/formatting.c, in an attempt to make the code a bit
> > more intelligible before improving handling of bogus formats.
> >
> > This is purely a refactor. The functionality of the file hasn't
> > changed; it does the same job as before, but it does it in ~200 fewer
> > lines and ~3.5k fewer characters. The clarity of code is greatly
> > improved. Sadly, performance appears to be unchanged.
> >
> > Summary of changes:
> >
> > * Did away with dch_global, dch_date and dch_time.
> > * Replaced DCH_processor with two new functions DCH_to_char and
> > DCH_from_char, which now do all the work previously done by
> > dch_{global,date,time}.
> > * Removed the 'action' field from the KeyWord struct as it is no longer useful.
> > * Changed the type of the 'character' field in the FormatNode struct
> > to char, because ... that's what it is. The original choice of 'int'
> > seems to have been an error.
> > * Removed commented-out function declaration for is_acdc. According
> > to CVS annotate, this hasn't been in use since sometime in the early
> > Cretaceous period, and in any case I don't know why you'd want to
> > check whether a string was the rock band AC/DC. =)
> > * Reworded some of the comments for clarity.
> > * Didn't touch any of the number formatting routines.
> >
> > This compiles cleanly on x86 gentoo and passes check, installcheck and
> > installcheck-parallel.
> >
> > Thanks for your time,
> > BJ
> >
> >

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brendan Jurd" <direvus(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2008-03-22 21:02:09
Message-ID: 7344.1206219729@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Brendan Jurd" <direvus(at)gmail(dot)com> writes:
> I noticed an editing error in the patch I originally submitted; it
> defined the same debugging macro twice.
> I've attached a fresh copy of the patch against the current HEAD with
> the fix included.

Working through this patch now. I found one thing that seems to be
a mistake (probably an overenthusiastic search&replace): the patch
changes
- {"iy", 2, dch_date, DCH_IY, TRUE},
to
+ {"iyear", 2, DCH_IY, TRUE},

The removal of dch_date is intended, but surely the keyword should
still be "iy". I'm proceeding on that assumption, but if this change
was actually intended, please explain.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brendan Jurd" <direvus(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2008-03-22 22:32:51
Message-ID: 9913.1206225171@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Brendan Jurd" <direvus(at)gmail(dot)com> writes:
>> As discussed on -hackers, I've done some refactoring work on
>> backend/utils/adt/formatting.c, in an attempt to make the code a bit
>> more intelligible before improving handling of bogus formats.

Applied with minor revisions.

regards, tom lane


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Function structure in formatting.c
Date: 2008-03-24 10:49:04
Message-ID: 37ed240d0803240349i690c6b9bn5bd530d2b5885bbb@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 23/03/2008, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Working through this patch now. I found one thing that seems to be
> a mistake (probably an overenthusiastic search&replace): the patch
> changes
> - {"iy", 2, dch_date, DCH_IY, TRUE},
> to
> + {"iyear", 2, DCH_IY, TRUE},
>
> The removal of dch_date is intended, but surely the keyword should
> still be "iy". I'm proceeding on that assumption, but if this change
> was actually intended, please explain.
>

Nice catch. Not sure how that got in there, but your theory about a
search & replace gone awry seems the most likely.

Now that the functions have been refactored, I'm looking forward to
getting back into improving the sanity checking in to_date.

Cheers,
BJ