ISO 8601 "Time Intervals" of the "format with time-unit deignators"

From: "Ron Mayer" <ron(at)intervideo(dot)com>
To: <pgsql-patches(at)postgresql(dot)org>
Cc: <ron(at)intervideo(dot)com>
Subject: ISO 8601 "Time Intervals" of the "format with time-unit deignators"
Date: 2003-09-08 04:50:49
Message-ID: POEDIPIPKGJJLDNIEMBEGEPADIAA.ron@intervideo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy pgsql-general pgsql-hackers pgsql-patches

Short summary:

This patch allows ISO 8601 "time intervals" using the "format
with time-unit designators" to specify postgresql "intervals".

Below I have (A) What these time intervals are, (B) What I
modified to support them, (C) Issues with intervals I want
to bring up, and (D) a patch supporting them.

It's helpful to me. Any feedback is appreciated. If you
did want to consider including it, let me know what to clean
up. If not, I thought I'd just put it here if anyone else finds
it useful too.

Thanks for your time,

Ron Mayer

Longer:

(A) What these intervals are.

ISO 8601, the standard from which PostgreSQL gets some of it's
time syntax, also has a specification for "time-intervals".

In particular, section 5.5.4.2 has a "Representation of
time-interval by duration only" which I believe maps
nicely to ISO intervals.

Compared to the ISO 8601 time interval specification, the
postgresql interval syntax is quite verbose. For example:

Postgresql interval: ISO8601 Interval
---------------------------------------------------
'1 year 6 months' 'P1Y6M'
'3 hours 25 minutes 42 seconds' 'PT3H25M42S'

Yeah, it's uglier, but it sure is short which can make
for quicker typing and shorter scripts, and if for some
strange reason you had an application using this format
it's nice not to have to translate.

The syntax is as follows:
Basic extended format: PnYnMnDTnHnMnS
PnW

Where everything before the "T" is a date-part and everything
after is a time-part. W is for weeks.
In the date-part, Y=Year, M=Month, D=Day
In the time-part, H=Hour, M=Minute, S=Second

Much more info can be found from the draft standard
ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
The final standard's only available for $$$ so I didn't
look at it. Some other sites imply that this part didn't
change from the last draft to the standard.

(B) This change was made by adding two functions to "datetime.c"
next to where DecodeInterval parses the normal interval syntax.

A total of 313 lines were added, including comments and sgml docs.
Of these only 136 are actual code, the rest, comments, whitespace, etc.

One new function "DecodeISO8601Interval" follows the style of
"DecodeInterval" below it, and trys to strictly follow the ISO
syntax. If it doesn't match, it'll return -1 and the old syntax
will be checked as before.

The first test (first character of the first field must be 'P',
and second character must be 'T' or '\0') should be fast so I don't
think this will impact performance of existing code.

The second function ("adjust_fval") is just a small helper-function
to remove some of the cut&paste style that DecodeInterval used.

It seems to work.
=======================================================================
betadb=# select 'P1M15DT12H30M7S'::interval;
interval
------------------------
1 mon 15 days 12:30:07
(1 row)

betadb=# select '1 month 15 days 12 hours 30 minutes 7 seconds'::interval;
interval
------------------------
1 mon 15 days 12:30:07
(1 row)
=====================================================================

(C) Open issues with intervals, and questions I'd like to ask.

1. DecodeInterval seems to have a hardcoded '.' for specifying
fractional times. ISO 8601 states that both '.' and ',' are
ok, but "of these, the comma is the preferred sign".

In DecodeISO8601Interval I loosened the test to allow
both but left it as it was in DecodeInterval. Should
both be changed to make them more consistant?

2. In "DecodeInterval", fractional weeks and fractional months
can produce seconds; but fractional years can not (rounded
to months). I didn't understand the reasoning for this, so
I left it the same, and followed the same convention for
ISO intervals. Should I change this?

3. I could save a bunch of copy-paste-lines-of-code from the
pre-existing DecodeInterval by calling the adjust_fval helper
function. The tradeoff is a few extra function-calls when
decoding an interval. However I didn't want to risk changes
to the existing part unless you guys encourage me to do so.

(D) The patch.

Index: doc/src/sgml/datatype.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/datatype.sgml,v
retrieving revision 1.123
diff -u -1 -0 -r1.123 datatype.sgml
--- doc/src/sgml/datatype.sgml 31 Aug 2003 17:32:18 -0000 1.123
+++ doc/src/sgml/datatype.sgml 8 Sep 2003 04:04:58 -0000
@@ -1735,20 +1735,71 @@
Quantities of days, hours, minutes, and seconds can be specified without
explicit unit markings. For example, <literal>'1 12:59:10'</> is read
the same as <literal>'1 day 12 hours 59 min 10 sec'</>.
</para>

<para>
The optional precision
<replaceable>p</replaceable> should be between 0 and 6, and
defaults to the precision of the input literal.
</para>
+
+
+ <para>
+ Alternatively, <type>interval</type> values can be written as
+ ISO 8601 time intervals, using the "Format with time-unit designators".
+ This format always starts with the character <literal>'P'</>, followed
+ by a string of values followed by single character time-unit designators.
+ A <literal>'T'</> separates the date and time parts of the interval.
+ </para>
+
+ <para>
+ Format: PnYnMnDTnHnMnS
+ </para>
+ <para>
+ In this format, <literal>'n'</> gets replaced by a number, and
+ <literal>Y</> represents years,
+ <literal>M</> (in the date part) months,
+ <literal>D</> months,
+ <literal>H</> hours,
+ <literal>M</> (in the time part) minutes,
+ and <literal>S</> seconds.
+ </para>
+
+
+ <table id="interval-example-table">
+ <title>Interval Example</title>
+ <tgroup cols="2">
+ <thead>
+ <row>
+ <entry>Traditional</entry>
+ <entry>ISO-8601 time-interval</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>1 month</entry>
+ <entry>P1M</entry>
+ </row>
+ <row>
+ <entry>1 hour 30 minutes</entry>
+ <entry>PT1H30M</entry>
+ </row>
+ <row>
+ <entry>2 years 10 months 15 days 10 hours 30 minutes 20 seconds</entry>
+ <entry>P2Y10M15DT10H30M20S</entry>
+ </row>
+ </tbody>
+ </thead>
+ </table>
+
+ </para>
</sect3>

<sect3>
<title>Special Values</title>

<indexterm>
<primary>time</primary>
<secondary>constants</secondary>
</indexterm>

Index: src/backend/utils/adt/datetime.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/utils/adt/datetime.c,v
retrieving revision 1.116
diff -u -1 -0 -r1.116 datetime.c
--- src/backend/utils/adt/datetime.c 27 Aug 2003 23:29:28 -0000 1.116
+++ src/backend/utils/adt/datetime.c 8 Sep 2003 04:04:59 -0000
@@ -30,20 +30,21 @@
struct tm * tm, fsec_t *fsec, int *is2digits);
static int DecodeNumberField(int len, char *str,
int fmask, int *tmask,
struct tm * tm, fsec_t *fsec, int *is2digits);
static int DecodeTime(char *str, int fmask, int *tmask,
struct tm * tm, fsec_t *fsec);
static int DecodeTimezone(char *str, int *tzp);
static datetkn *datebsearch(char *key, datetkn *base, unsigned int nel);
static int DecodeDate(char *str, int fmask, int *tmask, struct tm * tm);
static void TrimTrailingZeros(char *str);
+static int DecodeISO8601Interval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec);


int day_tab[2][13] = {
{31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0},
{31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0}};

char *months[] = {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec", NULL};

char *days[] = {"Sunday", "Monday", "Tuesday", "Wednesday",
@@ -2872,30 +2873,271 @@
default:
*val = tp->value;
break;
}
}

return type;
}


+void adjust_fval(double fval,struct tm * tm, fsec_t *fsec, int scale);
+{
+ int sec;
+ fval *= scale;
+ sec = fval;
+ tm->tm_sec += sec;
+#ifdef HAVE_INT64_TIMESTAMP
+ *fsec += ((fval - sec) * 1000000);
+#else
+ *fsec += (fval - sec);
+#endif
+}
+
+
+/* DecodeISO8601Interval()
+ *
+ * Check if it's a ISO 8601 Section 5.5.4.2 "Representation of
+ * time-interval by duration only."
+ * Basic extended format: PnYnMnDTnHnMnS
+ * PnW
+ * For more info.
+ * http://www.astroclark.freeserve.co.uk/iso8601/index.html
+ * ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
+ *
+ * Examples: P1D for 1 day
+ * PT1H for 1 hour
+ * P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min
+ *
+ * The first field is exactly "p" or "pt" it may be of this type.
+ *
+ * Returns -1 if the field is not of this type.
+ *
+ * It pretty strictly checks the spec, with the two exceptions
+ * that a week field ('W') may coexist with other units, and that
+ * this function allows decimals in fields other than the least
+ * significant units.
+ */
+int
+DecodeISO8601Interval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec)
+{
+ char *cp;
+ int fmask = 0,
+ tmask;
+ int val;
+ double fval;
+ int arg;
+ int datepart;
+
+ /*
+ * An ISO 8601 "time-interval by duration only" must start
+ * with a 'P'. If it contains a date-part, 'p' will be the
+ * only character in the field. If it contains no date part
+ * it will contain exactly to characters 'PT' indicating a
+ * time part.
+ * Anything else is illegal and will be treated like a
+ * traditional postgresql interval.
+ */
+ if (!(field[0][0] == 'p' &&
+ ((field[0][1] == 0) || (field[0][1] == 't' && field[0][2] == 0))))
+ {
+ return -1;
+ }
+
+
+ /*
+ * If the first field is exactly 1 character ('P'), it starts
+ * with date elements. Otherwise it's two characters ('PT');
+ * indicating it starts with a time part.
+ */
+ datepart = (field[0][1] == 0);
+
+ /*
+ * Every value must have a unit, so we require an even
+ * number of value/unit pairs. Therefore we require an
+ * odd nubmer of fields, including the prefix 'P'.
+ */
+ if ((nf & 1) == 0)
+ return -1;
+
+ /*
+ * Process pairs of fields at a time.
+ */
+ for (arg = 1 ; arg < nf ; arg+=2)
+ {
+ char * value = field[arg ];
+ char * units = field[arg+1];
+
+ /*
+ * The value part must be a number.
+ */
+ if (ftype[arg] != DTK_NUMBER)
+ return -1;
+
+ /*
+ * extract the number, almost exactly like the non-ISO interval.
+ */
+ val = strtol(value, &cp, 10);
+
+ /*
+ * One difference from the normal postgresql interval below...
+ * ISO 8601 states that "Of these, the comma is the preferred
+ * sign" so I allow it here for locales that support it.
+ * Note: Perhaps the old-style interval code below should
+ * allow for this too, but I didn't want to risk backward
+ * compatability.
+ */
+ if (*cp == '.' || *cp == ',')
+ {
+ fval = strtod(cp, &cp);
+ if (*cp != '\0')
+ return -1;
+
+ if (val < 0)
+ fval = -(fval);
+ }
+ else if (*cp == '\0')
+ fval = 0;
+ else
+ return -1;
+
+
+ if (datepart)
+ {
+ /*
+ * All the 8601 unit specifiers are 1 character, but may
+ * be followed by a 'T' character if transitioning between
+ * the date part and the time part. If it's not either
+ * one character or two characters with the second being 't'
+ * it's an error.
+ */
+ if (!(units[1] == 0 || (units[1] == 't' && units[2] == 0)))
+ return -1;
+
+ if (units[1] == 't')
+ datepart = 0;
+
+ switch (units[0]) /* Y M D W */
+ {
+ case 'd':
+ tm->tm_mday += val;
+ if (fval != 0)
+ adjust_fval(fval,tm,fsec, 86400);
+ tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY));
+ break;
+
+ case 'w':
+ tm->tm_mday += val * 7;
+ if (fval != 0)
+ adjust_fval(fval,tm,fsec,7 * 86400);
+ tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY));
+ break;
+
+ case 'm':
+ tm->tm_mon += val;
+ if (fval != 0)
+ adjust_fval(fval,tm,fsec,30 * 86400);
+ tmask = DTK_M(MONTH);
+ break;
+
+ case 'y':
+ /*
+ * Why can fractional months produce seconds,
+ * but fractional years can't? Well the older
+ * interval code below has the same property
+ * so this one follows the other one too.
+ */
+ tm->tm_year += val;
+ if (fval != 0)
+ tm->tm_mon += (fval * 12);
+ tmask = ((fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR));
+ break;
+
+ default:
+ return -1; /* invald date unit prefix */
+ }
+ }
+ else
+ {
+ /*
+ * ISO 8601 time part.
+ * In the time part, only one-character
+ * unit prefixes are allowed. If it's more
+ * than one character, it's not a valid ISO 8601
+ * time interval by duration.
+ */
+ if (units[1] != 0)
+ return -1;
+
+ switch (units[0]) /* H M S */
+ {
+ case 's':
+ tm->tm_sec += val;
+#ifdef HAVE_INT64_TIMESTAMP
+ *fsec += (fval * 1000000);
+#else
+ *fsec += fval;
+#endif
+ tmask = DTK_M(SECOND);
+ break;
+
+ case 'm':
+ tm->tm_min += val;
+ if (fval != 0)
+ adjust_fval(fval,tm,fsec,60);
+ tmask = DTK_M(MINUTE);
+ break;
+
+ case 'h':
+ tm->tm_hour += val;
+ if (fval != 0)
+ adjust_fval(fval,tm,fsec,3600);
+ tmask = DTK_M(HOUR);
+ break;
+
+ default:
+ return -1; /* invald time unit prefix */
+ }
+ }
+ fmask |= tmask;
+ }
+
+ if (*fsec != 0)
+ {
+ int sec;
+
+#ifdef HAVE_INT64_TIMESTAMP
+ sec = (*fsec / INT64CONST(1000000));
+ *fsec -= (sec * INT64CONST(1000000));
+#else
+ TMODULO(*fsec, sec, 1e0);
+#endif
+ tm->tm_sec += sec;
+ }
+ return (fmask != 0) ? 0 : -1;
+}
+
+
/* DecodeInterval()
* Interpret previously parsed fields for general time interval.
* Returns 0 if successful, DTERR code if bogus input detected.
*
* Allow "date" field DTK_DATE since this could be just
* an unsigned floating point number. - thomas 1997-11-16
*
* Allow ISO-style time span, with implicit units on number of days
* preceding an hh:mm:ss field. - thomas 1998-04-30
+ *
+ * Allow ISO-8601 style "Representation of time-interval by duration only"
+ * of the format 'PnYnMnDTnHnMnS' and 'PnW' - ron 2003-08-30
*/
+
int
DecodeInterval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec)
{
int is_before = FALSE;
char *cp;
int fmask = 0,
tmask,
type;
int i;
int dterr;
@@ -2906,20 +3148,37 @@

type = IGNORE_DTF;
tm->tm_year = 0;
tm->tm_mon = 0;
tm->tm_mday = 0;
tm->tm_hour = 0;
tm->tm_min = 0;
tm->tm_sec = 0;
*fsec = 0;

+ /*
+ * Check if it's a ISO 8601 Section 5.5.4.2 "Representation of
+ * time-interval by duration only."
+ * Basic extended format: PnYnMnDTnHnMnS
+ * PnW
+ * http://www.astroclark.freeserve.co.uk/iso8601/index.html
+ * ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF
+ * Examples: P1D for 1 day
+ * PT1H for 1 hour
+ * P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min
+ *
+ * The first field is exactly "p" or "pt" it may be of this type.
+ */
+ if (DecodeISO8601Interval(field,ftype,nf,dtype,tm,fsec) == 0) {
+ return 0;
+ }
+
/* read through list backwards to pick up units before values */
for (i = nf - 1; i >= 0; i--)
{
switch (ftype[i])
{
case DTK_TIME:
dterr = DecodeTime(field[i], fmask, &tmask, tm, fsec);
if (dterr)
return dterr;
type = DTK_DAY;
@@ -2983,20 +3242,21 @@
}
/* DROP THROUGH */

case DTK_DATE:
case DTK_NUMBER:
val = strtol(field[i], &cp, 10);

if (type == IGNORE_DTF)
type = DTK_SECOND;

+ /* should this allow ',' for locales that use it ? */
if (*cp == '.')
{
fval = strtod(cp, &cp);
if (*cp != '\0')
return DTERR_BAD_FORMAT;

if (val < 0)
fval = -(fval);
}
else if (*cp == '\0')

===================================================================

In response to

Responses

Browse pgsql-advocacy by date

  From Date Subject
Next Message Tom Lane 2003-09-08 05:47:17 Re: ISO 8601 "Time Intervals" of the "format with time-unit deignators"
Previous Message Anthony Chavez 2003-09-07 23:52:50 Re: Need a good slogan to use for ...

Browse pgsql-general by date

  From Date Subject
Next Message Sergey Suleymanov 2003-09-08 05:40:37 Re: default EXECUTE privilege
Previous Message Tom Lane 2003-09-08 04:24:50 Re: why does count take so long?

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-09-08 05:47:17 Re: ISO 8601 "Time Intervals" of the "format with time-unit deignators"
Previous Message Bruce Momjian 2003-09-08 04:25:50 Re: FreeBSD/i386 thread test

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2003-09-08 04:51:45 Re: Minor lmgr code cleanup
Previous Message Tom Lane 2003-09-08 04:42:47 Re: Minor lmgr code cleanup