Re: [dspam-users] Postgres vs. MySQL

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: [dspam-users] Postgres vs. MySQL
Date: 2004-11-27 18:43:15
Message-ID: m3u0rbxhdo.fsf@knuth.knuth.cbbrowne.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Martha Stewart called it a Good Thing when cshobe(at)osss(dot)net ("Casey Allen Shobe") wrote:
> I posted about this a couple days ago on dspam-dev...
>
> I am using DSpam with PostgreSQL, and like you discovered the horrible
> performance. The reason is because the default PostgreSQL query planner
> settings determine that a sequence scan will be more efficient than an
> index scan, which is wrong. To correct this behavior, adjust the query
> planner settings for the appropriate table/column with this command:
>
> alter table "dspam_token_data" alter "token" set statistics 200; analyze;
>
> Let me know if it help you. It worked wonders for me.

That makes a great deal of sense; the number of tokens are likely to
be rather larger than 10, and are likely to be quite unevenly
distributed. That fits with the need you found to collect more
statistics on that column.

Other cases where it seems plausible that it would be worthwhile to do
the same:

alter table dspam_signature_data alter signature set statistics 200;
alter table dspam_neural_data alter node set statistics 200;
alter table dspam_neural_decisions alter signature set statistics 200;

Lionel's suggestion of having a functional index on dspam_token_data
(innocent_hits + spam_hits) also seems likely to be helpful. Along
with that, it might prove necessary to alter stats on dspam_token_data
thus:

alter table dspam_token_data alter innocent_hits set statistics 200;
alter table dspam_token_data alter spam_hits set statistics 200;

None of these changes are likely to make things materially worse; if
they do help, they'll help rather a lot.
--
(format nil "~S(at)~S" "cbbrowne" "gmail.com")
http://www.ntlug.org/~cbbrowne/nonrdbms.html
Rules of the Evil Overlord #112. "I will not rely entirely upon
"totally reliable" spells that can be neutralized by relatively
inconspicuous talismans." <http://www.eviloverlord.com/>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2004-11-29 21:50:56 Re: Postgres vs. DSpam
Previous Message Lionel Bouton 2004-11-27 10:14:30 Re: [dspam-users] Postgres vs. MySQL