Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-06-17 21:41:22
Message-ID: 53A0B602.30905@agliodbs.com
Lists: pgsql-hackers

On 06/17/2014 02:36 PM, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> (2) If there are multiple columns with the same Levenshtein distance,
>> which one do you suggest? The current code picks a random one, which
>> I'm OK with. The other option would be to list all of the columns.
>
> I objected to that upthread. I don't think that picking a random one is
> sane at all. Listing them all might be OK (I notice that that seems to be
> what both bash and git do).
>
> Another issue is whether to print only those having exactly the minimum
> observed Levenshtein distance, or to print everything less than some
> cutoff. The former approach seems to me to be placing a great deal of
> faith in something that's only a heuristic.

Well, that depends on what the cutoff is. If it's high, like 0.5, that
could be a LOT of columns. For example, I plan to test this feature with
a 3-table join that has a combined 300 columns. I can easily imagine
coming up with a string that is within 0.5, or even 0.3, of 40 column names.

So if we want to list everything below a cutoff, we'd need to make that
cutoff fairly narrow, like 0.2. But that means we'd miss a lot of
potential matches on short column names.
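
To make the tradeoff concrete, here is a rough Python sketch (not the
patch's C code). It assumes the cutoff is the edit distance divided by the
length of the longer string; that normalization, the helper names
(normalized, below_cutoff, closest), and the sample column names are all
made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized(a: str, b: str) -> float:
    # Assumption: normalize by the longer string's length, giving 0.0..1.0.
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def below_cutoff(typo: str, columns: list[str], cutoff: float) -> list[str]:
    """Every column whose normalized distance is at or under the cutoff."""
    return [c for c in columns if normalized(typo, c) <= cutoff]


def closest(typo: str, columns: list[str]) -> list[str]:
    """Only the columns tied for the minimum raw distance."""
    if not columns:
        return []
    best = min(levenshtein(typo, c) for c in columns)
    return [c for c in columns if levenshtein(typo, c) == best]


columns = ["id", "ip", "customer_name"]  # hypothetical column names
print(below_cutoff("ix", columns, 0.2))            # short name, narrow cutoff
print(below_cutoff("ix", columns, 0.5))            # short name, wide cutoff
print(below_cutoff("custmer_name", columns, 0.2))  # long name, narrow cutoff
print(closest("ix", columns))                      # ties at minimum distance
```

Under that assumed normalization, a one-character typo in a two-character
name already scores 0.5, so a 0.2 cutoff suggests nothing for it, while the
same typo in a 13-character name scores well under 0.1.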

I really think we're overthinking this: it is just a HINT, and we can
improve it in future PostgreSQL versions, and most of our users will
ignore it anyway because they'll be using a client which doesn't display
HINTs.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
