Re: BRIN indexes - TRAP: BadArgument

From: Greg Stark <stark(at)mit(dot)edu>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Emanuel Calvo <3manuek(at)esdebian(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BRIN indexes - TRAP: BadArgument
Date: 2014-11-10 22:43:46
Message-ID: CAM-w4HNEFvDpiSmvRy5-X=nPnTJaM2HQPTFUYF7Rw4-zxiXboQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 10, 2014 at 9:31 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Every time the index is accessed, yeah. I'm not sure about figuring the
> initial creation details. Do you think we need another support
> procedure to help with that? We can add it if needed; minmax would just
> define it to InvalidOid.

I have a working bloom filter with a hard-coded filter size and a
hard-coded number of hash functions. Now I need to think about how to
make it more general. I think the answer is an index option that
specifies the target false positive rate and calculates the optimal
filter size and number of hash functions from it. It might need to
peek at the table statistics to estimate the population size, though.
Or perhaps I should bite the bullet and size the bloom filters based
on the actual number of rows in a chunk, since the BRIN
infrastructure does allow each summary to be a different size.
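
(For the record, the textbook formulas for n expected values at
target false positive rate p are m = -n*ln(p)/(ln 2)^2 bits and
k = (m/n)*ln 2 hash functions. A rough sketch of that calculation,
with names of my own invention rather than anything in the patch:

#include <math.h>

/*
 * Rough sketch only: compute the classical optimal bloom filter
 * parameters for n expected values at false positive rate p.
 * Function and variable names are illustrative, not from the patch.
 */
static void
bloom_optimal_params(double n, double p, int *m_bits, int *k_hashes)
{
    double      ln2 = log(2.0);

    /* m = -n * ln(p) / (ln 2)^2, rounded up to whole bits */
    *m_bits = (int) ceil(-n * log(p) / (ln2 * ln2));

    /* k = (m / n) * ln 2, but always at least one hash function */
    *k_hashes = (int) round((*m_bits / n) * ln2);
    if (*k_hashes < 1)
        *k_hashes = 1;
}

For example, n = 10000 and p = 0.01 gives roughly m = 95851 bits and
k = 7.)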

There's another API question I have. To implement Consistent I need
to call the hash function, which in the case of functions like
hashtext could be fairly expensive, and I even need to generate
multiple hash values (though currently I'm slicing them all from the
single integer hash value, so that's not too bad) and then test each
of those bits. It would be natural to call hashtext once at the start
of the scan, possibly build a bitmap, and compare all of the bits in
a single & operation. But afaict there's no way to hook the beginning
of the scan, and the opaque struct is not associated with the
specific scan, so I don't think I can safely cache the hash value of
the scan key there. Is there a good way to do it with the current
API?
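
(To illustrate what I mean by the single & operation, here's a
hypothetical sketch assuming a fixed-size filter and probe bits
derived Kirsch-Mitzenmacher-style from one 32-bit hash; the names and
layout are made up for illustration, not taken from the patch:

#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define FILTER_WORDS 8          /* 8 * 64 = 512-bit filter, say */

typedef struct BloomMask
{
    uint64_t    words[FILTER_WORDS];
} BloomMask;

/*
 * Derive k probe bits from one 32-bit hash, double-hashing style,
 * and set them in a query mask. Runs once per scan key.
 */
static void
bloom_make_query_mask(uint32_t hash, int k_hashes, BloomMask *mask)
{
    uint32_t    h2 = hash * 0x9e3779b9;     /* cheap second hash */

    memset(mask, 0, sizeof(*mask));
    for (int i = 0; i < k_hashes; i++)
    {
        uint32_t    bit = (hash + (uint32_t) i * h2) % (FILTER_WORDS * 64);

        mask->words[bit / 64] |= UINT64_C(1) << (bit % 64);
    }
}

/*
 * Per-range test: every probe bit in the query mask must also be
 * set in the range's summary filter.
 */
static bool
bloom_mask_matches(const BloomMask *summary, const BloomMask *query)
{
    for (int w = 0; w < FILTER_WORDS; w++)
    {
        if ((summary->words[w] & query->words[w]) != query->words[w])
            return false;       /* a probe bit is missing: no match */
    }
    return true;                /* value possibly present in range */
}

With something like that, bloom_make_query_mask would run once per
scan key and only bloom_mask_matches would run per range -- if there
were somewhere per-scan to keep the mask.)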

On a side note, I'm curious about something: stepping through my code
in gdb, I noticed that a single-row insert appeared to construct a
new summary and then union it into the existing summary, instead of
just calling AddValue on the existing summary. Is that intentional?
What led to that?

--
greg
