Add protransform for numeric, varbit, and temporal types

Lists: pgsql-hackers
From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Add protransform for numeric, varbit, and temporal types
Date: 2012-01-01 00:36:19
Message-ID: 20120101003619.GA4395@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Building on commit 8f9fe6edce358f7904e0db119416b4d1080a83aa, this adds
protransform functions to the length coercions for numeric, varbit, timestamp,
timestamptz, time, timetz and interval. This mostly serves to make more ALTER
TABLE ALTER TYPE operations avoid a rewrite, including numeric(10,2) ->
numeric(12,2), varbit(4) -> varbit(8) and timestamptz(2) -> timestamptz(4).
The rules for varbit are exactly the same as for varchar. Numeric is slightly
more complex:

* Flatten calls to our length coercion function that solely represent
* increases in allowable precision. Scale changes mutate every datum, so
* they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an
* unconstrained numeric, so a change from an unconstrained numeric to any
* constrained numeric is also unoptimizable.

time{,stamp}{,tz} are similar to varchar for these purposes, except that, for
example, plain "timestamptz" is equivalent to "timestamptz(6)". interval has
a vastly different typmod format, but the principles applicable to length
coercion remain the same.

Under --disable-integer-datetimes, I'm not positive that timestamp_scale() is
always a no-op when one would logically expect as much. Does there exist a
timestamp such that v::timestamp(2) differs from v:timestamp(2)::timestamp(4)
due to floating point rounding? Even if so, I'm fairly comfortable calling it
a feature rather than a bug to avoid perturbing values that way.

After these patches, the only core length coercion casts not having
protransform functions are those for "bpchar" and "bit". For those, we could
only optimize trivial cases of no length change. I'm not planning to do so.

Thanks,
nm

Attachment Content-Type Size
transform-varbit-v1.patch text/plain 3.6 KB
transform-numeric-v1.patch text/plain 4.6 KB
transform-temporal-v1.patch text/plain 17.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-01-03 15:31:51
Message-ID: CA+TgmoZqpm+DDHM7_9VaPNzW9CsaLTwd=SXDdiu-35XdNX6y1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Dec 31, 2011 at 7:36 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> Building on commit 8f9fe6edce358f7904e0db119416b4d1080a83aa, this adds
> protransform functions to the length coercions for numeric, varbit, timestamp,
> timestamptz, time, timetz and interval.  This mostly serves to make more ALTER
> TABLE ALTER TYPE operations avoid a rewrite, including numeric(10,2) ->
> numeric(12,2), varbit(4) -> varbit(8) and timestamptz(2) -> timestamptz(4).
> The rules for varbit are exactly the same as for varchar.  Numeric is slightly
> more complex:
>
>  * Flatten calls to our length coercion function that solely represent
>  * increases in allowable precision.  Scale changes mutate every datum, so
>  * they are unoptimizable.  Some values, e.g. 1E-1001, can only fit into an
>  * unconstrained numeric, so a change from an unconstrained numeric to any
>  * constrained numeric is also unoptimizable.
>
> time{,stamp}{,tz} are similar to varchar for these purposes, except that, for
> example, plain "timestamptz" is equivalent to "timestamptz(6)".  interval has
> a vastly different typmod format, but the principles applicable to length
> coercion remain the same.
>
> Under --disable-integer-datetimes, I'm not positive that timestamp_scale() is
> always a no-op when one would logically expect as much.  Does there exist a
> timestamp such that v::timestamp(2) differs from v:timestamp(2)::timestamp(4)
> due to floating point rounding?  Even if so, I'm fairly comfortable calling it
> a feature rather than a bug to avoid perturbing values that way.
>
> After these patches, the only core length coercion casts not having
> protransform functions are those for "bpchar" and "bit".  For those, we could
> only optimize trivial cases of no length change.  I'm not planning to do so.

This is cool stuff. I will plan to review this once the CF starts.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-02-07 17:43:11
Message-ID: CA+TgmoaPGLz9qHenEE=k7QsrsfZcyjmkeGCWF1Tww+kb2f7i6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 3, 2012 at 10:31 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Dec 31, 2011 at 7:36 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> Building on commit 8f9fe6edce358f7904e0db119416b4d1080a83aa, this adds
>> protransform functions to the length coercions for numeric, varbit, timestamp,
>> timestamptz, time, timetz and interval.  This mostly serves to make more ALTER
>> TABLE ALTER TYPE operations avoid a rewrite, including numeric(10,2) ->
>> numeric(12,2), varbit(4) -> varbit(8) and timestamptz(2) -> timestamptz(4).
>> The rules for varbit are exactly the same as for varchar.  Numeric is slightly
>> more complex:
>>
>>  * Flatten calls to our length coercion function that solely represent
>>  * increases in allowable precision.  Scale changes mutate every datum, so
>>  * they are unoptimizable.  Some values, e.g. 1E-1001, can only fit into an
>>  * unconstrained numeric, so a change from an unconstrained numeric to any
>>  * constrained numeric is also unoptimizable.
>>
>> time{,stamp}{,tz} are similar to varchar for these purposes, except that, for
>> example, plain "timestamptz" is equivalent to "timestamptz(6)".  interval has
>> a vastly different typmod format, but the principles applicable to length
>> coercion remain the same.
>>
>> Under --disable-integer-datetimes, I'm not positive that timestamp_scale() is
>> always a no-op when one would logically expect as much.  Does there exist a
>> timestamp such that v::timestamp(2) differs from v:timestamp(2)::timestamp(4)
>> due to floating point rounding?  Even if so, I'm fairly comfortable calling it
>> a feature rather than a bug to avoid perturbing values that way.
>>
>> After these patches, the only core length coercion casts not having
>> protransform functions are those for "bpchar" and "bit".  For those, we could
>> only optimize trivial cases of no length change.  I'm not planning to do so.
>
> This is cool stuff.  I will plan to review this once the CF starts.

I've committed the numeric and varbit patches and will look at the
temporal one next. I did notice one odd thing, though: sometimes we
seem to get a rebuild on the toast index for no obvious reason:

rhaas=# set client_min_messages=debug1;
SET
rhaas=# create table foo (a serial primary key, b varbit);
NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for
serial column "foo.a"
DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"foo_pkey" for table "foo"
DEBUG: building index "foo_pkey" on table "foo"
CREATE TABLE
rhaas=# alter table foo alter column b set data type varbit(4);
DEBUG: rewriting table "foo"
DEBUG: building index "foo_pkey" on table "foo"
ALTER TABLE
rhaas=# alter table foo alter column b set data type varbit;
DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430"
ALTER TABLE

Strangely, it doesn't happen if I add another column to the table:

rhaas=# set client_min_messages=debug1;
SET
rhaas=# create table foo (a serial primary key, b varbit, c varbit);
NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for
serial column "foo.a"
DEBUG: building index "pg_toast_16481_index" on table "pg_toast_16481"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"foo_pkey" for table "foo"
DEBUG: building index "foo_pkey" on table "foo"
CREATE TABLE
rhaas=# alter table foo alter column b set data type varbit(4);
DEBUG: building index "pg_toast_16490_index" on table "pg_toast_16490"
DEBUG: rewriting table "foo"
DEBUG: building index "foo_pkey" on table "foo"
ALTER TABLE
rhaas=# alter table foo alter column b set data type varbit;
ALTER TABLE

There may not be any particular harm in a useless rebuild of the TOAST
table index (wasted effort excepted), but my lack of understanding of
why this is happening causes me to fear that there's a bug, not so
much in these patches as in the core alter table code that is enabling
skipping table and index rebuilds.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-02-08 01:45:01
Message-ID: 20120208014501.GA4073@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 07, 2012 at 12:43:11PM -0500, Robert Haas wrote:
> I've committed the numeric and varbit patches and will look at the
> temporal one next.

Thanks. The comment you added to numeric_transform() has a few typos,
"constained" -> "constrained" and "nodes" -> "notes".

> I did notice one odd thing, though: sometimes we
> seem to get a rebuild on the toast index for no obvious reason:
>
> rhaas=# set client_min_messages=debug1;
> SET
> rhaas=# create table foo (a serial primary key, b varbit);
> NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for
> serial column "foo.a"
> DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430"
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
> "foo_pkey" for table "foo"
> DEBUG: building index "foo_pkey" on table "foo"
> CREATE TABLE
> rhaas=# alter table foo alter column b set data type varbit(4);
> DEBUG: rewriting table "foo"
> DEBUG: building index "foo_pkey" on table "foo"
> ALTER TABLE

When we rebuild the table at this point, it has a small maximum tuple size.
Therefore, we omit the toast table entirely.

> rhaas=# alter table foo alter column b set data type varbit;
> DEBUG: building index "pg_toast_16430_index" on table "pg_toast_16430"
> ALTER TABLE

This command makes the tuple size unbounded again, so we create a toast table.

> Strangely, it doesn't happen if I add another column to the table:
>
> rhaas=# set client_min_messages=debug1;
> SET
> rhaas=# create table foo (a serial primary key, b varbit, c varbit);

With the extra unconstrained varbit column, the tuple size is continuously
unbounded and the table has a toast table at all stages. So, the difference
in behavior is correct, albeit unintuitive.

> NOTICE: CREATE TABLE will create implicit sequence "foo_a_seq" for
> serial column "foo.a"
> DEBUG: building index "pg_toast_16481_index" on table "pg_toast_16481"
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
> "foo_pkey" for table "foo"
> DEBUG: building index "foo_pkey" on table "foo"
> CREATE TABLE
> rhaas=# alter table foo alter column b set data type varbit(4);
> DEBUG: building index "pg_toast_16490_index" on table "pg_toast_16490"
> DEBUG: rewriting table "foo"
> DEBUG: building index "foo_pkey" on table "foo"
> ALTER TABLE
> rhaas=# alter table foo alter column b set data type varbit;
> ALTER TABLE


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-02-08 02:42:42
Message-ID: CA+TgmoaMSOgmgn2wTF9HTs5Oy3q2_QmF6xPj6PJ0iAnMQ=DE7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 7, 2012 at 8:45 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Feb 07, 2012 at 12:43:11PM -0500, Robert Haas wrote:
>> I've committed the numeric and varbit patches and will look at the
>> temporal one next.
>
> Thanks.  The comment you added to numeric_transform() has a few typos,
> "constained" -> "constrained" and "nodes" -> "notes".

Yuck, that's pretty bad. Thanks, fixed (I hope).

> When we rebuild the table at this point, it has a small maximum tuple size.
> Therefore, we omit the toast table entirely.

Ah, OK. The debug messages might be more clear if they mentioned
whether or not we were omitting the TOAST table, I suppose.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-02-08 14:37:01
Message-ID: CA+TgmoY+Dk4bzn88GtYHRLY-s48c40_MSmdwD2ADH=LR4XqhuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 7, 2012 at 12:43 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I've committed the numeric and varbit patches and will look at the
> temporal one next.

Committed, after changing the OIDs so they don't conflict.

Here's one more case for you to ponder:

rhaas=# create table noah (i interval day);
CREATE TABLE
rhaas=# alter table noah alter column i set data type interval second(3);
DEBUG: rewriting table "noah"
ALTER TABLE

Do we really need a rewrite in that case? The code acts like the
interval range and precision are separate beasts, but is that really
true?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-02-09 12:18:29
Message-ID: 20120209121828.GA3653@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 08, 2012 at 09:37:01AM -0500, Robert Haas wrote:
> On Tue, Feb 7, 2012 at 12:43 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > I've committed the numeric and varbit patches and will look at the
> > temporal one next.
>
> Committed, after changing the OIDs so they don't conflict.
>
> Here's one more case for you to ponder:
>
> rhaas=# create table noah (i interval day);
> CREATE TABLE
> rhaas=# alter table noah alter column i set data type interval second(3);
> DEBUG: rewriting table "noah"
> ALTER TABLE
>
> Do we really need a rewrite in that case? The code acts like the
> interval range and precision are separate beasts, but is that really
> true?

The code has a thinko; a given interval typmod ultimately implies a single
point from which we truncate rightward. The precision only matters if the
range covers SECOND. Thanks; the attached patch improves this.

Attachment Content-Type Size
transform-interval-precis-v1.patch text/plain 1.6 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add protransform for numeric, varbit, and temporal types
Date: 2012-02-09 17:25:17
Message-ID: CA+TgmoZJ0jqLiND6nXaDXOEyTK7jBXR97Bnr0JYyY7e==J8xzA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 9, 2012 at 7:18 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> The code has a thinko; a given interval typmod ultimately implies a single
> point from which we truncate rightward.  The precision only matters if the
> range covers SECOND.  Thanks; the attached patch improves this.

Thanks, committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company