Re: WIP: index support for regexp search

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Erik Rijkers <er(at)xs4all(dot)nl>, pgsql-hackers(at)postgresql(dot)org, pavel(dot)stehule(at)gmail(dot)com
Subject: Re: WIP: index support for regexp search
Date: 2012-12-16 21:25:06
Message-ID: CAPpHfdsi5jUP_fHhfhtPt4VGagT+oSRnwxbpeb7tFLPj+ew_1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 14, 2012 at 1:34 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:

> On Mon, Dec 3, 2012 at 4:31 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:
>
>> Actually, I generally dislike path matrix for same reasons. But:
>> 1) Output graphs could contain trigrams which are completely useless for
>> search. For example, for regex /(abcdefgh)*ijk/ we need only "ijk" trigram
>> while graph would contain much more.Path matrix is a method to get rid of
>> all of them.
>> 2) If we use color trigrams then we need some criteria for which color
>> trigrams to expand into trigrams. Simultaneously, we shouldn't allow path
>> from initial state to the final by unexpanded trigrams. It seems much
>> harder to do with graph than with matrix.
>>
>
> Now, I have an idea about doing some not comprehensive but simple and fast
> simplification of graph. I'm doing experiments now. In case of success we
> could get rid of path matrix.
>

Attached patch have following changes:
1) Postphone expansion of colors. Graph are building on color trigrams.
2) Selective expansion of color trigrams into simple trigrams. All
non-expanded color trigrams are removed. Such removal leads to union of all
states pairs connected with corresponding arcs. Surely, this must no lead
to union of initial and final states: that could do all previous work
senseless.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
trgm-regexp-0.8.patch.gz application/x-gzip 16.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2012-12-16 21:25:32 Re: multiple CREATE FUNCTION AS items for PLs
Previous Message Peter Eisentraut 2012-12-16 21:23:39 Re: multiple CREATE FUNCTION AS items for PLs