Re: Regex pattern with shorter back reference does NOT work as expected

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regex pattern with shorter back reference does NOT work as expected
Date: 2013-07-11 18:05:14
Message-ID: 26570.1373565914@sss.pgh.pa.us
Lists: pgsql-hackers

Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com> writes:
> The following example does not work as expected:

> -- Should return TRUE but returning FALSE
> SELECT 'Programmer' ~ '(\w).*?\1' as t;

This is clearly broken, but I'm uncomfortable with the proposed patch.
As written, it changes behavior for both the shortest-match-preferred
and longest-match-preferred cases; but you've offered no evidence that
the longest-match case is broken. Maybe it is --- it's sure not
obvious why it's okay to abandon the search early in this case. But I
think we'd have been likely to hear about it before now if there were
a matching failure in that path, since longest-preferred is so much
more widely used.

I think we should either convince ourselves that the longest-preferred
case is also broken (preferably with a test case), or understand why it
isn't. Such understanding would probably also teach us how to fix the
shortest-preferred case in a way that doesn't give up the early search exit.
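
As a reference point for what both cases ought to return, here is a minimal
sketch using Python's re module. This is an assumption-laden cross-check, not
a statement about the PostgreSQL regex code: Python uses a backtracking
engine, and its greedy and non-greedy quantifiers are only a rough analogue
of the longest-match-preferred and shortest-match-preferred cases. The names
SUBJECT, SHORTEST, LONGEST and first_match are made up for illustration.

```python
import re

# Subject string taken from the reported example.
SUBJECT = "Programmer"

# Non-greedy quantifier: rough analogue of the shortest-match-preferred case.
SHORTEST = re.compile(r"(\w).*?\1")
# Greedy quantifier: rough analogue of the longest-match-preferred case.
LONGEST = re.compile(r"(\w).*\1")


def first_match(pattern, text):
    """Return the text of the first match, or None if there is no match."""
    m = pattern.search(text)
    return m.group(0) if m else None


shortest_hit = first_match(SHORTEST, SUBJECT)
longest_hit = first_match(LONGEST, SUBJECT)

# Both patterns should match: the second character 'r' occurs again later.
print("shortest-preferred analogue:", shortest_hit)
print("longest-preferred analogue: ", longest_hit)
```

Under Python's semantics both patterns find a match, starting at the first
'r', and differ only in how far the match extends. A comparable pair of
queries run against PostgreSQL would be one way to produce the test case
asked for above.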

regards, tom lane
