Re: [PATCH] bugfix for int2vectorin

Lists: pgsql-hackers
From: Caleb Welton <cwelton(at)greenplum(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: [PATCH] bugfix for int2vectorin
Date: 2009-12-02 02:36:05
Message-ID: C73B1297.3D2D%cwelton@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I believe the int2vectorin function should handle invalid input more cleanly.

Current behavior:

-- Correct input format:
SELECT '1 2 3'::int2vector;
int2vector
------------
1 2 3
(1 row)

-- Three variations on incorrect input format
SELECT '1not 2really_an 3int2vector'::int2vector;
int2vector
------------
1 2 3
(1 row)

SELECT '1,2,3'::int2vector;
int2vector
------------
1
(1 row)

SELECT '1, 2,3'::int2vector;
int2vector
------------
1 2
(1 row)

-- What other types do:
SELECT '1,2'::oid;
ERROR: invalid input syntax for type oid: "1,2"

Behavior after attached patch:

-- Correct input format:
SELECT '1 2 3'::int2vector;
int2vector
------------
1 2 3
(1 row)

-- Three variations on incorrect input format
SELECT '1not 2really_an 3int2vector'::int2vector;
ERROR: invalid input for int2vector: "1not 2really_an 3int2vector"

SELECT '1,2,3'::int2vector;
ERROR: invalid input for int2vector: "1,2,3"

SELECT '1, 2,3'::int2vector;
ERROR: invalid input for int2vector: "1, 2,3"

-- What other types do:
SELECT '1,2'::oid;
ERROR: invalid input syntax for type oid: "1,2"

Regards,
Caleb

Attachment Content-Type Size
int2vector.patch application/octet-stream 1.7 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Caleb Welton <cwelton(at)greenplum(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-02 03:38:17
Message-ID: 617.1259725097@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Caleb Welton <cwelton(at)greenplum(dot)com> writes:
> I believe the int2vectorin function should handle invalid input more cleanly.

Why bother? Under what circumstances would users (or anyone at all)
be putting data into an int2vector? It's an internal type and is
certainly not meant for use in user tables.

regards, tom lane


From: Caleb Welton <cwelton(at)greenplum(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-02 03:52:29
Message-ID: C73B247D.3D32%cwelton@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,
Thanks for the comments.

Caleb Welton <cwelton(at)greenplum(dot)com> writes:
> I believe the int2vectorin function should handle invalid input more cleanly.

On 12/1/09 7:38 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Why bother?

Because correctness is good. Results that produce unexpected results from incorrect input formatting lead to incorrect data being loaded, which is bad.

>> Under what circumstances would users (or anyone at all) be putting data into an int2vector?

I agree in principle, but users do the darnedest things. Perhaps they simply want a compact int array that won't toast. Perhaps they liked that word "vector" in the name. I stopped trying to second guess them a long time ago.

>> It's an internal type and is certainly not meant for use in user tables.

But there is nothing preventing it from being used in user tables so calling it "internal" doesn't really mean a whole lot.

What exactly is your objection to having the int2arrayin parser handle its input conversion reasonably?

-Caleb


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Caleb Welton <cwelton(at)greenplum(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-02 03:52:32
Message-ID: 603c8f070912011952m33f67a7dm29a62944fc275396@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 1, 2009 at 10:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Caleb Welton <cwelton(at)greenplum(dot)com> writes:
>> I believe the int2vectorin function should handle invalid input more cleanly.
>
> Why bother?  Under what circumstances would users (or anyone at all)
> be putting data into an int2vector?  It's an internal type and is
> certainly not meant for use in user tables.

Maybe it's just me, but this seems like an unnecessarily dismissive
response to a report of some rather odd-looking behavior.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Caleb Welton <cwelton(at)greenplum(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-02 05:24:04
Message-ID: 3274.1259731444@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Caleb Welton <cwelton(at)greenplum(dot)com> writes:
> On 12/1/09 7:38 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Under what circumstances would users (or anyone at all) be putting data into an int2vector?

> What exactly is your objection to having the int2arrayin parser handle its input conversion reasonably?

I'm trying to gauge what the actual use-case is for having a slightly
nicer error behavior. The proposed patch adds another translatable
error string, which is no skin off my own nose but does create ongoing
work for our translation team. And presumably, if we're going to fix
this, we ought to fix the about-equally-stupid parsing logic in oidvectorin.
While we're at it, should we trouble to detect overflow in int2vectorin?
You could spend quite a bit of time and code making these functions more
bulletproof, but I'm not convinced it's worth any work.

regards, tom lane


From: Caleb Welton <cwelton(at)greenplum(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-02 19:59:54
Message-ID: C73C073B.3D4C%cwelton@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

New patch attached:

1. Does not add a new error message (though the pg_atoi's error message is a little goofy looking).
2. Handles int2 overflow cases.
3. oidvectorin does NOT suffer from the same problems as int2vectorin, someone already fixed it.

As for the use-case I'm not completely sure... I'm not an end-user, I'm just responding to a bug report.

My stance here is that returning an error (even a bad error) on trying to convert data in is better
doing something "wrong" with bogus input. In the first case a user scratches their head, maybe
files a bug report, you tell them the correct syntax and they go on. In the second case they input
a bunch of data and then start complaining about "data corruption", "loss of data", etc. and the
support case is 100x worse.

The amount of code we are talking about here is less than 5 lines of code...

Regards,
Caleb

On 12/1/09 9:24 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

Caleb Welton <cwelton(at)greenplum(dot)com> writes:
> On 12/1/09 7:38 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Under what circumstances would users (or anyone at all) be putting data into an int2vector?

> What exactly is your objection to having the int2arrayin parser handle its input conversion reasonably?

I'm trying to gauge what the actual use-case is for having a slightly
nicer error behavior. The proposed patch adds another translatable
error string, which is no skin off my own nose but does create ongoing
work for our translation team. And presumably, if we're going to fix
this, we ought to fix the about-equally-stupid parsing logic in oidvectorin.
While we're at it, should we trouble to detect overflow in int2vectorin?
You could spend quite a bit of time and code making these functions more
bulletproof, but I'm not convinced it's worth any work.

regards, tom lane

Attachment Content-Type Size
int2vector.patch application/octet-stream 1.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Caleb Welton <cwelton(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-28 08:22:14
Message-ID: 603c8f070912280022o1916ba1eh1ab0089e2dc81a2e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 2, 2009 at 2:59 PM, Caleb Welton <cwelton(at)greenplum(dot)com> wrote:
> New patch attached:
>
> 1. Does not add a new error message (though the pg_atoi's error message is a
> little goofy looking).
> 2. Handles int2 overflow cases.
> 3. oidvectorin does NOT suffer from the same problems as int2vectorin,
> someone already fixed it.
>
> As for the use-case I'm not completely sure... I'm not an end-user, I'm just
> responding to a bug report.
>
> My stance here is that returning an error (even a bad error) on trying to
> convert data in is better
> doing  something "wrong" with bogus input.  In the first case a user
> scratches their head, maybe
> files a bug report, you tell them the correct syntax and they go on.  In the
> second case they input
> a bunch of data and then start complaining about "data corruption", "loss of
> data", etc. and the
> support case is 100x worse.
>
> The amount of code we are talking about here is less than 5 lines of code...

I have scrutinized the latest version of this patch and I feel that it
is a modest improvement on the status quo and that there is really no
downside. Absent strong objections, I will commit it later this week.

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Caleb Welton <cwelton(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] bugfix for int2vectorin
Date: 2009-12-30 01:30:07
Message-ID: 603c8f070912291730l39a5a6c3qab3c00db0f23c06f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 28, 2009 at 3:22 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I have scrutinized the latest version of this patch and I feel that it
> is a modest improvement on the status quo and that there is really no
> downside.  Absent strong objections, I will commit it later this week.

Committed.

...Robert