Gsoc2012 idea, tablesample

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Joshua Berkus <josh(at)agliodbs(dot)com>
Cc: Qi Huang <huangqiyx(at)hotmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de, alvherre(at)commandprompt(dot)com, neil conway <neil(dot)conway(at)gmail(dot)com>, daniel(at)heroku(dot)com, cbbrowne(at)gmail(dot)com, kevin grittner <kevin(dot)grittner(at)wicourts(dot)gov>
Subject: Gsoc2012 idea, tablesample
Date: 2012-04-17 06:16:29
Message-ID: 4F8D0ABD.4080403@enterprisedb.com
Lists: pgsql-hackers

On 24.03.2012 22:12, Joshua Berkus wrote:
> Qi,
>
> Yeah, I can see that. That's a sign that you had a good idea for a project, actually: your idea is interesting enough that people want to debate it. Make a proposal on Monday and our potential mentors will help you refine the idea.

Yep. The discussion withered, so let me try to summarize:

1. We probably don't want the SQL syntax to be added to the grammar.
This should be written as an extension, using custom functions as the
API, instead of extra SQL syntax.

2. It's not very useful if it's just a dummy replacement for "WHERE
random() < ?". It has to be more advanced than that. Quality of the
sample is important, as is performance. There was also an interesting
idea of implementing monetary unit sampling.

I think this would be a useful project if those two points are taken
care of.
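
To make the monetary unit sampling idea in point 2 concrete, here is a
minimal sketch, assuming the usual auditing definition: each row is
picked with probability proportional to a non-negative amount, using a
fixed interval and a random start. The function name
monetary_unit_sample and its parameters are hypothetical and only for
illustration. It is written in Python rather than as a PostgreSQL
extension, and nothing in it comes from an actual proposal.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def monetary_unit_sample(amounts, n, rng=None):
    """Return row indices chosen by systematic monetary unit sampling.

    amounts: non-negative values, one per row, with a positive total
             (assumption).
    n:       number of monetary units to draw.
    rng:     optional random.Random, for reproducible runs.
    """
    rng = rng or random.Random()
    # Running totals: row i owns the units in
    # [cumulative[i-1], cumulative[i]).
    cumulative = list(accumulate(amounts))
    total = cumulative[-1]
    interval = total / n
    # One random start, then every interval-th unit after it.
    start = rng.uniform(0, interval)
    picked = []
    for k in range(n):
        unit = start + k * interval
        # First row whose running total exceeds the sampled unit.
        idx = bisect_right(cumulative, unit)
        # Guard against floating-point rounding at the very end.
        picked.append(min(idx, len(amounts) - 1))
    return picked


if __name__ == "__main__":
    print(monetary_unit_sample([120.0, 5.0, 0.0, 900.0, 40.0], 3))
```

A row whose amount exceeds the interval can be picked more than once,
and a row with a zero amount is never picked. A plain
"WHERE random() < ?" filter gives every row the same chance, which is
the difference point 2 is about.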

Another idea that Robert Haas suggested was to add support for doing a
TID scan for a query like "WHERE ctid < '(501,1)'". That's not enough
work for a GSoC project on its own, but could certainly be a part of it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
