Re: substring with this pattern works in 8.3.1; does not work in 8.3.4

Lists: pgsql-bugs
From: "Chris Wood" <chris(dot)wood(at)bookitnow(dot)ca>
To: pgsql-bugs(at)postgresql(dot)org
Subject: substring with this pattern works in 8.3.1; does not work in 8.3.4
Date: 2008-11-12 18:07:46
Message-ID: 84d154660811121007w764a02a7i6cf475ec7c253a2e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I apologize in advance for not testing on 8.3.5, but that would be very
difficult for me.

here is where it works in 8.3.1:
protocalte=> select version() ;
version
----------------------------------------------------------------------------------------
PostgreSQL 8.3.1 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.3
(Debian 4.2.3-2)
(1 row)

protocalte=> CREATE TABLE locn (
locn_key integer NOT NULL,
public_phone text NOT NULL,
CONSTRAINT public_phone_ch CHECK (((public_phone = ''::text) OR
("substring"(public_phone, '^[0-9]{10}(,[0-9]{10})*$'::text) IS NOT NULL)))
);
CREATE TABLE
protocalte=> insert into locn values(10, '1231231234') ;
INSERT 0 1

and here is where it does not work in 8.3.4:
postgresbugs=# select version() ;
version
--------------------------------------------------------------------------------------------
PostgreSQL 8.3.4 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian
4.3.2-1) 4.3.2
(1 row)

postgresbugs=# CREATE TABLE locn (
locn_key integer NOT NULL,
public_phone text NOT NULL,
CONSTRAINT public_phone_ch CHECK (((public_phone = ''::text) OR
("substring"(public_phone, '^[0-9]{10}(,[0-9]{10})*$'::text) IS NOT NULL)))
);
CREATE TABLE
postgresbugs=# insert into locn values(10, '1231231234') ;
ERROR: new row for relation "locn" violates check constraint
"public_phone_ch"


From: Tomasz Ostrowski <tometzky(at)batory(dot)org(dot)pl>
To: Chris Wood <chris(dot)wood(at)bookitnow(dot)ca>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: substring with this pattern works in 8.3.1; does not work in 8.3.4
Date: 2008-11-20 09:30:57
Message-ID: 49252E51.7080605@batory.org.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On 2008-11-12 19:07, Chris Wood wrote:

> here is where it works in 8.3.1:
> CONSTRAINT public_phone_ch CHECK (((public_phone = ''::text) OR
> ("substring"(public_phone, '^[0-9]{10}(,[0-9]{10})*$'::text) IS NOT NULL))));
> protocalte=> insert into locn values(10, '1231231234') ;

8.3.1 had a bug in substring function which was corrected in 8.3.2:

http://www.postgresql.org/docs/8.3/static/release-8-3-2.html
Fix a corner case in regular-expression substring matching
(substring(string from pattern)) (Tom)

The problem occurs when there is a match to the pattern overall but the
user has specified a parenthesized subexpression and that subexpression
hasn't got a match. An example is substring('foo' from 'foo(bar)?').
This should return NULL, since (bar) isn't matched, but it was
mistakenly returning the whole-pattern match instead (ie, foo).

> and here is where it does not work in 8.3.4:
> ERROR: new row for relation "locn" violates check constraint
> "public_phone_ch"

http://www.postgresql.org/docs/8.3/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
The substring function with two parameters, substring(string from
pattern), provides extraction of a substring that matches a POSIX
regular expression pattern. It returns null if there is no match,
otherwise the portion of the text that matched the pattern. But *if*
*the* *pattern* *contains* *any* *parentheses*, *the* *portion* *of*
*the* *text* *that* *matched* *the* *first* *parenthesized*
*subexpression* (the one whose left parenthesis comes first) *is*
*returned*. You can put parentheses around the whole expression if you
want to use parentheses within it without triggering this exception.

So your check should look like this:
CONSTRAINT public_phone_ch
CHECK (
(
(public_phone = ''::text)
OR
("substring"(public_phone, '(^[0-9]{10}(,[0-9]{10})*$)'::text)
IS NOT NULL))
)
);

Or much simpler and clear:
CONSTRAINT public_phone_ch
CHECK ( public_phone ~ '^$|^[0-9]{10}(,[0-9]{10})*$' )

> I apologize in advance for not testing on 8.3.5, but that would be
> very difficult for me.

It is not that difficult:

PGVERSION=2.3.5
mkdir /tmp/postgres
cd /tmp/postgres
wget \
ftp://ftp.postgresql.org/pub/source/v$PGVERSION/postgresql-$PGVERSION.tar.bz2
tar xjf postgresql-$PGVERSION.tar.bz2
cd postgresql-$PGVERSION
./configure --prefix=/tmp/postgres
make install
cd /tmp/postgres/bin
./initdb --no-locale -D /tmp/postgres/data
./postgres -p 54320 -D /tmp/postgres/data &
./psql -p 54320 postgres

postgres=# select version();
PostgreSQL 8.3.5 on i686-pc-linux-gnu,
compiled by GCC gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)

This works for any user (besides root) on any Unix-like system.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh