Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-06-17 21:28:53
Message-ID: 53A0B315.3020205@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/17/2014 01:59 PM, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Emitting a suggestion with a large distance seems like it could be
>> rather irritating. If the user types in SELECT prodct_id FROM orders,
>> and that column does not exist, suggesting "product_id", if such a
>> column exists, will likely be well-received. Suggesting a column
>> named, say, "price", however, will likely make at least some users say
>> "no I didn't mean that you stupid @%!#" - because probably the issue
>> there is that the user selected from the completely wrong table,
>> rather than getting 6 of the 9 characters they typed incorrect.
>
> Yeah, that's my point exactly. There's no very good reason to assume that
> the intended answer is in fact among the set of column names we can see;
> and if it *is* there, the Levenshtein distance to it isn't going to be
> all that large. I think that suggesting "foobar" when the user typed
> "glorp" is not only not helpful, but makes us look like idiots.

Well, there's two different issues:

(1) offering a suggestion which is too different from what the user
typed. This is easily limited by having a max distance (most likely a
distance/length ratio, with a max of say, 0.5). The only drawback of
this would be the extra cpu cycles to calculate it, and some arguments
about what the max distance should be. But for the sake of the
children, let's not have a GUC for it.

(2) If there are multiple columns with the same levenschtien distance,
which one do you suggest? The current code picks a random one, which
I'm OK with. The other option would be to list all of the columns.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2014-06-17 21:34:49 Re: Minmax indexes
Previous Message Kevin Grittner 2014-06-17 21:15:19 Re: Doing better at HINTing an appropriate column within errorMissingColumn()