Lists: | pgsql-bugs |
---|
From: | "Eric Haszlakiewicz" <erh+pgsql(at)swapsimple(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #3645: regular expression back references seem broken |
Date: | 2007-10-01 00:43:49 |
Message-ID: | 200710010043.l910hn3H021900@wwwmaster.postgresql.org |
The following bug has been logged online:
Bug reference: 3645
Logged by: Eric Haszlakiewicz
Email address: erh+pgsql(at)swapsimple(dot)com
PostgreSQL version: 8.2.5
Operating system: NetBSD
Description: regular expression back references seem broken
Details:
I was attempting to create a simple regular expression that uses back
references and I noticed some very odd behaviour. This regexp is supposed
to match a string where all the characters are the same:
^(.)\1*$
If I try it, it doesn't work. I would expect this to return false:
template1=# select 'xyz' ~ E'^(.)\\1*$';
?column?
----------
t
(1 row)
But adding some extra parens does return false:
template1=# select 'xyz' ~ E'^(.)(\\1)*$';
?column?
----------
f
(1 row)
As does changing the "." to an "x":
template1=# select 'xyz' ~ E'^(x)\\1*$';
?column?
----------
f
(1 row)
As does forcing it to be an extended regular expression:
template1=# select 'xyz' ~ E'(?e)^(.)\\1*$';
?column?
----------
f
(1 row)
The docs claim: "A single non-zero digit, not followed by another digit, is
always taken as a back reference." (The note at the end of 9.7.3.3)
It's relatively easy to work around the problem, but it certainly led to a
fair bit of head scratching while trying to debug some code. :)
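
For reference, the intended semantics of the pattern can be cross-checked in another engine. The sketch below uses Python's `re` module, which is not the regexp library PostgreSQL uses, so it only illustrates what `^(.)\1*$` is expected to do; the helper name `all_same` is made up for this example.

```python
import re

# Pattern from the report: capture one character, then allow
# zero or more repetitions of that same captured character.
SAME_CHAR = re.compile(r"^(.)\1*$")


def all_same(s: str) -> bool:
    """Return True if s is non-empty and every character equals the first."""
    return SAME_CHAR.match(s) is not None


if __name__ == "__main__":
    # 'xyz' is the string from the bug report; the others are extra samples.
    for sample in ("xyz", "xxx", "x", "xxy"):
        print(repr(sample), all_same(sample))
```

Under Python's engine 'xyz' does not match, which agrees with the result the report expects from PostgreSQL.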
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Eric Haszlakiewicz" <erh+pgsql(at)swapsimple(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #3645: regular expression back references seem broken |
Date: | 2007-10-01 15:34:54 |
Message-ID: | 7405.1191252894@sss.pgh.pa.us |
"Eric Haszlakiewicz" <erh+pgsql(at)swapsimple(dot)com> writes:
> I would expect this to return false:
> template1=# select 'xyz' ~ E'^(.)\\1*$';
> ?column?
> ----------
> t
> (1 row)
Seems to be a bug in the Tcl regexp library we use. It's already
reported upstream:
https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894
regards, tom lane
From: | Eric Haszlakiewicz <erh+pgsql(at)swapsimple(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #3645: regular expression back references seem broken |
Date: | 2007-10-02 08:07:05 |
Message-ID: | 4701FC29.50609@swapsimple.com |
Tom Lane wrote:
> "Eric Haszlakiewicz" <erh+pgsql(at)swapsimple(dot)com> writes:
>> I would expect this to return false:
>
>> template1=# select 'xyz' ~ E'^(.)\\1*$';
>> ?column?
>> ----------
>> t
>> (1 row)
>
> Seems to be a bug in the Tcl regexp library we use. It's already
> reported upstream:
> https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894
>
> regards, tom lane
Er... it's been languishing there for over two years. That doesn't sound
very promising for getting it fixed. :(
eric
From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Eric Haszlakiewicz <erh+pgsql(at)swapsimple(dot)com> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #3645: regular expression back references seem broken |
Date: | 2008-03-25 00:00:35 |
Message-ID: | 200803250000.m2P00Z109617@momjian.us |
Added to TODO:
* Fix regular expression bug when using complex back-references
http://archives.postgresql.org/pgsql-bugs/2007-10/msg00000.php
---------------------------------------------------------------------------
Eric Haszlakiewicz wrote:
>
> The following bug has been logged online:
>
> Bug reference: 3645
> Logged by: Eric Haszlakiewicz
> Email address: erh+pgsql(at)swapsimple(dot)com
> PostgreSQL version: 8.2.5
> Operating system: NetBSD
> Description: regular expression back references seem broken
> Details:
>
> I was attempting to create a simple regular expression that uses back
> references and I noticed some very odd behaviour. This regexp is supposed
> to match a string where all the characters are the same:
>
> ^(.)\1*$
>
> If I try it, it doesn't work. I would expect this to return false:
>
> template1=# select 'xyz' ~ E'^(.)\\1*$';
> ?column?
> ----------
> t
> (1 row)
>
> But adding some extra parens does:
> template1=# select 'xyz' ~ E'^(.)(\\1)*$';
> ?column?
> ----------
> f
> (1 row)
>
> As does changing the "." to an "x":
>
> template1=# select 'xyz' ~ E'^(x)\\1*$';
> ?column?
> ----------
> f
> (1 row)
>
> As does forcing it to be an extended regular expression:
>
>
> template1=# select 'xyz' ~ E'(?e)^(.)\\1*$';
> ?column?
> ----------
> f
> (1 row)
>
> The docs claim: "A single non-zero digit, not followed by another digit, is
> always taken as a back reference." (The note at the end of 9.7.3.3)
>
> It's relatively easy to work around the problem, but it certainly led to a
> fair bit of head scratching while trying to debug some code. :)
>
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +