Re: [COMMITTERS] pgsql: Clean up jsonb code.

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>
Subject: Re: [COMMITTERS] pgsql: Clean up jsonb code.
Date: 2014-05-08 07:45:56
Message-ID: 536B3634.3070900@vmware.com
Lists: pgsql-committers pgsql-hackers

On 05/08/2014 02:25 AM, Peter Geoghegan wrote:
> findJsonbValueFromSuperHeader()'s lowbound argument
> previously served to establish a low bound when searching for
> multiple keys (so the second and subsequent user-supplied keys
> could skip much of the object).

Got that.
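
For illustration, a simplified, stand-alone sketch of what that lowbound
argument buys you (plain C, made-up names, not the actual jsonb code;
note it only pays off if the lookup keys are processed in sorted order,
hence the sort of the supplied array mentioned below):

#include <stdio.h>
#include <string.h>

/*
 * Simplified stand-in for the old lowbound mechanism: "keys" plays the
 * role of the sorted top-level keys of a jsonb object, and *lowbound is
 * carried from one lookup to the next, so later searches only cover the
 * tail of the array.
 */
static const char *
find_key_with_lowbound(const char *const *keys, int nkeys,
                       const char *target, int *lowbound)
{
    int lo = *lowbound;
    int hi = nkeys - 1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(keys[mid], target);

        if (cmp == 0)
        {
            *lowbound = mid + 1;    /* next key can skip this prefix */
            return keys[mid];
        }
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid - 1;
    }

    *lowbound = lo;                 /* remember how far we got */
    return NULL;
}

int
main(void)
{
    const char *keys[] = {"a", "c", "f", "k", "p", "z"};   /* sorted */
    const char *lookup[] = {"c", "k", "z"};                /* sorted too */
    int         lowbound = 0;

    for (int i = 0; i < 3; i++)
        printf("%s: %s (next search starts at %d)\n", lookup[i],
               find_key_with_lowbound(keys, 6, lookup[i], &lowbound)
               ? "found" : "missing", lowbound);
    return 0;
}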

> In the case of
> jsonb_exists_any(), say, if you only have a reasonable expectation
> that about 1 key exists, and that happens to be the last key that the
> user passed to the text[] argument (to the existence/? operator), then
> n - 1 calls to what is now findJsonbValueFromContainer() (which now
> does not accept a lowbound) are wasted.

Check.

> That's elem_count - 1
> top-level binary searches of the entire jsonb. Or elem_count such
> calls rather than 1 call (plus 1 sort of the supplied array) in the
> common case where jsonb_exists_any() will return false.

Ok, but I don't see any big difference in that regard. It still called
findJsonbValueFromContainer() elem_count times before this commit. Each
call was somewhat cheaper, because the lower bound of the binary search
was initialized to where the previous search ended, but you still had to
perform the search.
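
Spelled out as a hedged, self-contained sketch (plain C, invented names,
not the real findJsonbValueFromContainer() signature), the loop has the
same shape before and after the commit; the only difference is that each
iteration now binary-searches the whole container instead of starting
from where the previous search ended:

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static int
cmp_keys(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Full binary search over all nkeys entries; no low bound carried over. */
static const char *
find_key(const char *const *keys, int nkeys, const char *target)
{
    const char *const *hit =
        bsearch(&target, keys, nkeys, sizeof(*keys), cmp_keys);

    return hit ? *hit : NULL;
}

/* Shape of jsonb_exists_any(): true as soon as any lookup key is present. */
static bool
exists_any(const char *const *keys, int nkeys,
           const char *const *lookup, int nlookup)
{
    for (int i = 0; i < nlookup; i++)
    {
        /* each iteration is an independent top-level binary search */
        if (find_key(keys, nkeys, lookup[i]) != NULL)
            return true;
    }
    return false;               /* the "false" case costs nlookup searches */
}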

> Granted, that might not be that bad right now, given that it's only
> ever (say) elem_count or elem_count - 1 wasted binary searches through
> the *top* level, but that might not always be true.

If we are ever to perform a deep search, I think we'll want to do much
more to optimize that than just keep track of the lower bound. Like,
start an iterator over the tree and check for all of the keys in one go.
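
A hedged sketch of what that could look like, with a plain sorted array
standing in for the keys an iterator would visit in order (no real
JsonbIterator calls here): one merge-style pass over the container
answers all of the lookup keys at once.

#include <stdbool.h>
#include <string.h>

/*
 * One-pass membership check.  "keys" are the container's keys as an
 * iterator would produce them (sorted, unique); "lookup" is the caller's
 * sorted key array.  A single merge marks every lookup key that exists,
 * instead of doing one binary search per lookup key.
 */
static int
check_all_keys(const char *const *keys, int nkeys,
               const char *const *lookup, int nlookup, bool *found)
{
    int i = 0, j = 0, nfound = 0;

    while (i < nkeys && j < nlookup)
    {
        int cmp = strcmp(keys[i], lookup[j]);

        if (cmp < 0)
            i++;                    /* container key nobody asked about */
        else if (cmp > 0)
            found[j++] = false;     /* lookup key is missing */
        else
        {
            found[j++] = true;      /* match: advance both sides */
            i++;
            nfound++;
        }
    }
    while (j < nlookup)
        found[j++] = false;         /* anything left over was not found */

    return nfound;
}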

> And even today,
> sorting a presumably much smaller user-passed lookup array once has to
> be cheaper than searching through the entire jsonb perhaps elem_count
> times per call.

Well, the quick testing I did suggested otherwise. It's not a big
difference, but sorting has all kinds of overhead, like pallocing the
array to sort, copying the elements around etc. For a small array, the
startup cost of sorting trumps the savings in the binary searches.
Possibly the way the sorting was done was not optimal, and you could
reduce the copying and allocations involved in that, but it's hardly
worth the trouble.
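
To put rough, made-up numbers on that: with three lookup keys against a
hundred top-level keys, three independent binary searches are on the
order of 3 * log2(100), i.e. roughly 20 comparisons, while the sorting
path still pays for a palloc, copying the three datums, the qsort call
itself, and then the searches or merge on top of that. At that scale the
fixed costs easily eat whatever the sort saves.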

- Heikki
