Re: Regexp match not working.. (SQL help)

From: "David Johnston" <polobo(at)yahoo(dot)com>
To: "'Phoenix Kiula'" <phoenix(dot)kiula(at)gmail(dot)com>, "'PG-General Mailing List'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Regexp match not working.. (SQL help)
Date: 2011-05-11 18:59:51
Message-ID: 006901cc100d$9c9d1020$d5d73060$@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>
> I am using this SQL:
>
> SELECT id FROM table1
> WHERE mytext ~* E'sub1|sub2|sub3|sub4...'
> LIMIT 10;
>
> This is basically working, but some of the "mytext" columns being returned
> that do not contain any of these substrings. Am I doing the POSIX regexp
> wrongly? This same thing works when I try it in PHP with preg_match. But
not
> in Postgresql. I have tried several variations
> too:
>
> WHERE mytext ~* E'(sub1)(sub2)(sub3)(sub4)...'
>

When requesting help with RegEx you are strongly advised to supply the text
of the records that you are concerned about (in this case you say you are
getting false positives so provide the contents of "mytext" for those
records) AND the exact expression you are using.

You seem to indicate the contents of mytext contains a "text document" and
you are attempting to find specific words in that document. The expression
format supplied does not take into consideration word boundaries. If any
part of a word matches "subX" then the pattern will match.

You may want to consider finding one or more books on RegEx. The fact that
you consider E'(sub1)(sub2)...' to be a variation of E'sub1|sub2...'
indicates that the issue is likely not PostgreSQL itself but your
understanding on RegEx.

You may also want to try the "regexp_matches(...)" function in PostgreSQL.
Instead of just evaluating true/false it returns an array of all the matches
that were found. Using this you would be able to see exactly what text
PostgreSQL is matching with your expression.

Figuring out why something is matching that should not be (false positive)
is fairly easy since the engine itself will tell you what it matched. The
hard situation is the false-negative, where you think something should match
and it does not.

David J.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bborie Park 2011-05-11 19:09:21 Re: Returning NULL to a set returning C type function
Previous Message Tom Lane 2011-05-11 18:23:39 Re: Returning NULL to a set returning C type function