Re: soundex and metaphone

Lists: pgsql-hackers
From: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: soundex and metaphone
Date: 2005-05-24 21:12:25
Message-ID: 429398B9.2080600@tvi.edu

Hey everyone,

I've been working with a couple people who didn't know that soundex and
metaphone were included in the distribution as contrib modules. While
it's their fault that they didn't check contrib, soundex is pretty
common among database systems and I was wondering if there was a reason
it is not included as a core function?

-Jonah


From: Douglas McNaught <doug(at)mcnaught(dot)org>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 09:27:46
Message-ID: m2is16e3ql.fsf@Douglas-McNaughts-Powerbook.local

"Jonah H. Harris" <jharris(at)tvi(dot)edu> writes:

> Hey everyone,
>
> I've been working with a couple people who didn't know that soundex
> and metaphone were included in the distribution as contrib modules.
> While it's their fault that they didn't check contrib, soundex is
> pretty common among database systems and I was wondering if there was
> a reason it is not included as a core function?

Because no one's taken ownership of the code and made a case for
integrating it into the core backend.

-Doug


From: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
To: Douglas McNaught <doug(at)mcnaught(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 13:07:34
Message-ID: 4295CA16.2020608@tvi.edu

At a minimum I think we should support soundex in the core. I'm willing
to move soundex and metaphone into the backend. Does anyone see a
reason not to do so?

Douglas McNaught wrote:

>"Jonah H. Harris" <jharris(at)tvi(dot)edu> writes:
>
>
>
>>Hey everyone,
>>
>>I've been working with a couple people who didn't know that soundex
>>and metaphone were included in the distribution as contrib modules.
>>While it's their fault that they didn't check contrib, soundex is
>>pretty common among database systems and I was wondering if there was
>>a reason it is not included as a core function?
>>
>>
>
>Because no one's taken ownership of the code and made a case for
>integrating it into the core backend.
>
>-Doug
>


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 14:00:23
Message-ID: 200505261600.25258.peter_e@gmx.net

Jonah H. Harris wrote:
> At a minimum I think we should support soundex in the core. I'm
> willing to move soundex and metaphone into the backend. Does anyone
> see a reason not to do so?

Soundex is really only useful for English names with English
pronunciation. If we were to adopt a phonetic algorithm into the core,
I'd like to see something more general.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
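
[Editor's illustration, not part of the original message.] To make the
point about English pronunciation concrete, here is a minimal sketch of
the commonly described American Soundex rules. It is an assumption-laden
illustration written for this note, not the contrib module's code, and
the function name and sample names are chosen only for the example.

```python
# Letter-to-digit table of the commonly described American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex code; only ASCII A-Z is considered."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate equally coded consonants
        code = CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return ("".join(out) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(soundex("Renault"), soundex("Reno"))   # R543 R500
```

The English pair collapses to one code, while "Renault" and "Reno",
which sound roughly alike in French, do not: the table encodes English
spelling habits, not pronunciation in general.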


From: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 14:15:07
Message-ID: 4295D9EB.3070502@tvi.edu

Peter,

I don't disagree with you that a more generalized function would also be
good, just that soundex is common and would be helpful if it were built-in.

Peter Eisentraut wrote:

>Jonah H. Harris wrote:
>
>
>>At a minimum I think we should support soundex in the core. I'm
>>willing to move soundex and metaphone into the backend. Does anyone
>>see a reason not to do so?
>>
>>
>
>Soundex is really only useful for English names with English
>pronunciation. If we were to adopt a phonetic algorithm into the core,
>I'd like to see something more general.
>
>
>


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 14:27:18
Message-ID: 4295DCC6.9060609@dunslane.net

Jonah H. Harris wrote:

> At a minimum I think we should support soundex in the core. I'm
> willing to move soundex and metaphone into the backend. Does anyone
> see a reason not to do so?
>
>
I take it you mean apart from the fact that soundex is horribly limited
and out of date and probably nobody should be using it?

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 14:58:25
Message-ID: 2188.1117119505@sss.pgh.pa.us

"Jonah H. Harris" <jharris(at)tvi(dot)edu> writes:
> At a minimum I think we should support soundex in the core. I'm willing
> to move soundex and metaphone into the backend. Does anyone see a
> reason not to do so?

Is it really ready for prime time? For one thing, a quick look shows no
evidence of being multibyte-ready. There's a fair amount of cleanup of
random private coding conventions (META_MALLOC!?) to be done too.

Doug's point is valid: there's some actual work needed here, not just
arguing to shove the code from point A to point B.

regards, tom lane
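
[Editor's illustration, not part of the original message.] For readers
unsure what "multibyte-ready" means here, the sketch below mimics a
byte-at-a-time loop over UTF-8 input that recognizes only ASCII letters.
It is a hypothetical stand-in written for this note, not the contrib
code, and the helper name is an assumption.

```python
import string


def letters_seen_bytewise(name):
    """Return the ASCII letters a byte-oriented loop would recognize.

    The input is encoded as UTF-8 and inspected one byte at a time, the
    way a plain C loop over a char array would see it.
    """
    raw = name.encode("utf-8")
    return "".join(chr(b) for b in raw if chr(b) in string.ascii_letters).upper()


# Six characters, but seven bytes: the two bytes of "ü" are skipped.
print(len("Müller"), len("Müller".encode("utf-8")))
print(letters_seen_bytewise("Müller"))   # MLLER
# The initial letter itself is multibyte, so it is lost entirely.
print(letters_seen_bytewise("Łukasz"))   # UKASZ
```

Any phonetic code built on such a loop would treat "Müller" like
"Mller" and would start the code for "Łukasz" with "U", which is the
kind of problem that has to be dealt with before the code is moved.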


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>, Douglas McNaught <doug(at)mcnaught(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: soundex and metaphone
Date: 2005-05-26 15:17:44
Message-ID: 4295E898.5060606@dunslane.net

Tom Lane wrote:

>"Jonah H. Harris" <jharris(at)tvi(dot)edu> writes:
>
>
>>At a minimum I think we should support soundex in the core. I'm willing
>>to move soundex and metaphone into the backend. Does anyone see a
>>reason not to do so?
>>
>>
>
>Is it really ready for prime time? For one thing, a quick look shows no
>evidence of being multibyte-ready. There's a fair amount of cleanup of
>random private coding conventions (META_MALLOC!?) to be done too.
>
>Doug's point is valid: there's some actual work needed here, not just
>arguing to shove the code from point A to point B.
>
>
>
>

Well, META_MALLOC occurs in part of the code that he didn't ask for ...
it was inherited from the perl module code that I adapted to do double
metaphone. And a minimal wrapper suited my purposes at the time just fine.

But the point is well taken nevertheless.

cheers

andrew


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Subject: Re: soundex and metaphone
Date: 2005-05-27 06:53:21
Message-ID: 4296C3E1.3090403@cheapcomplexdevices.com

Jonah H. Harris wrote:
> I'm willing to move soundex and metaphone into the backend.
> Does anyone see a reason not to do so?

As a kinda strange reason, I like them in contrib because
they demonstrate a nice simple example of how one can write a
contrib extension.

This module has simple functions that take a string or
two and return a string or number. Most of the other
contrib modules do tricky stuff with weird types or
indexes that make them rather complex to use as
a starting point.

If they were to be moved out of contrib, I think it'd be
really nice if someone added a "hello_world" contrib that
demonstrates a bunch of simple operations in C to be used
as such a model.