Re: Text Search vs MYSQL vs Lucene

From: David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com>
To: Steve Atkins <steve(at)blighty(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Text Search vs MYSQL vs Lucene
Date: 2004-09-09 15:33:11
Message-ID: 414077B7.5080808@zara.6.isreserved.com
Lists: pgsql-general pgsql-performance

Steve Atkins wrote:
>>What would be performance of pgSQL text search vs MySQL vs Lucene (flat
>>file) for a 2 terabyte db?
>>thanks for any comments.
>
> My experience with tsearch2 has been that indexing even moderately
> large chunks of data is too slow to be feasible. Moderately large
> meaning tens of megabytes.

My experience with MySQL's full-text search, as well as the various
MySQL-based text indexing programs (I forget the names; it's been a while),
for some 10-20GB of mail archives has been pretty disappointing too. My
biggest gripe is with the indexing speed. It literally took days to
index less than a million documents.

I ended up using Swish++. Microsoft's CHM compiler also has pretty
amazing indexing speed (though it crashes quite often when encountering
bad HTML).

--
dave
