Re: NOT LIKE much faster than LIKE?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrea Arcangeli <andrea(at)cpushare(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: NOT LIKE much faster than LIKE?
Date: 2006-01-10 02:04:48
Message-ID: 24021.1136858688@sss.pgh.pa.us
Lists: pgsql-performance

Andrea Arcangeli <andrea(at)cpushare(dot)com> writes:
> It just makes no sense to me that the planner takes a different
> decision based on a "not".

Why in the world would you think that? In general a NOT will change the
selectivity of the WHERE condition tremendously. If the planner weren't
sensitive to that, *that* would be a bug. The only case where it's
irrelevant is if the selectivity of the base condition is exactly 50%,
which is not a very reasonable default guess for LIKE.
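
To make the arithmetic concrete: if a condition is estimated to match a
fraction s of the rows, its negation is estimated to match 1 - s (ignoring
NULLs). The sketch below uses made-up numbers; the table size and the 1%
guess are assumptions for illustration, not values taken from PostgreSQL
or from the original query.

```python
def not_selectivity(s: float) -> float:
    """Selectivity of NOT cond, given selectivity s of cond (NULLs ignored)."""
    return 1.0 - s


def estimated_rows(total_rows: int, selectivity: float) -> int:
    """Row-count estimate derived from a selectivity fraction."""
    return round(total_rows * selectivity)


TOTAL_ROWS = 100_000  # hypothetical table size
LIKE_GUESS = 0.01     # hypothetical selectivity guess for the LIKE pattern

like_rows = estimated_rows(TOTAL_ROWS, LIKE_GUESS)
not_like_rows = estimated_rows(TOTAL_ROWS, not_selectivity(LIKE_GUESS))

print(f"LIKE:     ~{like_rows} rows")
print(f"NOT LIKE: ~{not_like_rows} rows")
```

With estimates that far apart, picking a different plan for each query is
the expected behavior. Only at s = 0.5 do the two estimates coincide.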

It sounds to me that the problem is misestimation of the selectivity
of the LIKE pattern --- the planner is going to think that
LIKE '%% PREEMPT %%' is fairly selective because of the rather long
match text, when in reality it's probably not so selective on your
data. But we don't keep any statistics that would allow the actual
number of matching rows to be estimated well. You might want to think
about changing your data representation so that the pattern-match can be
replaced by a boolean column, or some such, so that the existing
statistics code can make a more reasonable estimate of how many rows are
selected.
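
As a rough sketch of that representation change: compute the flag once,
when the row is written, and filter on the flag instead of the pattern.
The example uses Python's sqlite3 module only so that it runs
self-contained; it is not PostgreSQL, and the table name, column names,
and sample strings are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE kernels (           -- hypothetical table name
        id INTEGER PRIMARY KEY,
        version TEXT NOT NULL,
        is_preempt INTEGER NOT NULL  -- 0/1 flag standing in for a boolean column
    )
    """
)

# Made-up sample data, not taken from the original poster's table.
versions = [
    "Linux 2.6.15 #1 SMP PREEMPT Mon Jan 9",
    "Linux 2.6.15 #1 SMP Mon Jan 9",
    "Linux 2.6.14 #2 PREEMPT Tue Jan 3",
]

# Evaluate the pattern once per row at insert time and store the result.
conn.executemany(
    "INSERT INTO kernels (version, is_preempt) VALUES (?, ?)",
    [(v, int(" PREEMPT " in v)) for v in versions],
)

# The pattern-match query and the flag query select the same rows here.
by_pattern = conn.execute(
    "SELECT count(*) FROM kernels WHERE version LIKE '% PREEMPT %'"
).fetchone()[0]
by_flag = conn.execute(
    "SELECT count(*) FROM kernels WHERE is_preempt = 1"
).fetchone()[0]

print(by_pattern, by_flag)
```

In PostgreSQL the column would presumably be declared boolean, and after
ANALYZE the planner could estimate the matching fraction from the
column's statistics instead of guessing from the pattern text.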

regards, tom lane
