From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> |
Cc: | NISHIYAMA Tomoaki <tomoakin(at)staff(dot)kanazawa-u(dot)ac(dot)jp>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Notes about fixing regexes and UTF-8 (yet again) |
Date: | 2012-02-18 23:45:10 |
Message-ID: | 7392.1329608710@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> Yeah, it's conceivable that we could implement something whereby
>> characters with codes above some cutoff point are handled via runtime
>> calls to iswalpha() and friends, rather than being included in the
>> statically-constructed DFA maps. The cutoff point could likely be a lot
>> less than U+FFFF, too, thereby saving storage and map build time all
>> round.
> It's been proposed to build a regexp type in PostgreSQL which would
> store the DFA directly and provide some way to run that DFA out of its
> storage without recompiling.
> Would such a mechanism be useful here?
No, this is about what goes into the DFA representation in the first
place, not about how we store it and reuse it.
regards, tom lane
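
To make the cutoff idea quoted above concrete, here is a minimal sketch in C. It is not taken from the PostgreSQL regex code: the names CLASS_CUTOFF, alpha_map, build_alpha_map and is_alpha_hybrid are hypothetical, the cutoff value 0x800 is an arbitrary choice for illustration, and the sketch assumes that wchar_t values correspond to character codes on the platform. Codes below the cutoff are answered from a precomputed bitmap; codes at or above it fall through to a runtime iswalpha() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <wctype.h>

/* Hypothetical cutoff: codes below it are served from the static map. */
#define CLASS_CUTOFF 0x800u

/* One bit per character code below the cutoff (256 bytes here). */
static uint8_t alpha_map[CLASS_CUTOFF / 8];

/* Mark one code as alphabetic in the static map. */
static void
map_set_bit(uint32_t c)
{
    assert(c < CLASS_CUTOFF);
    alpha_map[c / 8] |= (uint8_t) (1u << (c % 8));
}

/*
 * Fill the map once, for example when a regex is compiled.
 * The result reflects whatever locale is active at this point.
 */
static void
build_alpha_map(void)
{
    for (uint32_t c = 0; c < CLASS_CUTOFF; c++)
    {
        if (iswalpha((wint_t) c))
            map_set_bit(c);
    }
}

/*
 * Hybrid test: bitmap lookup below the cutoff,
 * runtime iswalpha() call at or above it.
 */
static bool
is_alpha_hybrid(uint32_t c)
{
    if (c < CLASS_CUTOFF)
        return ((alpha_map[c / 8] >> (c % 8)) & 1u) != 0;
    return iswalpha((wint_t) c) != 0;
}
```

A caller would run build_alpha_map() once and then use is_alpha_hybrid() per character. The map costs CLASS_CUTOFF / 8 bytes and one iswalpha() call per code at build time, so lowering the cutoff shrinks both, at the price of more runtime calls for characters above it. Note that the bitmap is only valid for the locale that was active when it was built.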