From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Robert Haas" <robertmhaas(at)gmail(dot)com> |
Subject: | Re: *_collapse_limit, geqo_threshold - example schema |
Date: | 2009-07-09 15:00:42 |
Message-ID: | 200907091700.43411.andres@anarazel.de |
Lists: | pgsql-hackers |
On Tuesday 07 July 2009 17:40:50 Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > I cannot reasonably plan some queries with join_collapse_limit set to 20.
> > At least not without setting the geqo limit very low and a geqo_effort to
> > a low value.
> > So I would definitely not agree that removing j_c_l is a good idea.
> Can you show some specific examples? All of this discussion seems like
> speculation in a vacuum ...
As similar requests have come up multiple times now, I started to create a schema
that I can present and that is sufficiently similar to the real one to show the
same effects.
I had to cut down the complexity of the schema considerably - both for easier
understanding and to make the demo schema easier to write.
I also have a moderately complex demo query similar to the ones really in use.
Autogenerated (GUI) queries do not use views like I did in the example one, but
it seemed easier to play around with the query size this way.
Also, the real queries often have far more conditions than the one I present
here.
I have not "tuned" the queries here in any way - the join order is not
optimized (unlike in the real application) - but I don't think that matters
for the purpose of this discussion.
The queries themselves only sketch what they are intended for and query many
fictional datapoints, but again I don't think this is a problem.
Is it helpful this way?
Some numbers about the query_2.sql are attached. Short overview:
- a low from_collapse_limit is deadly
- a high from_collapse_limit is not costly here
- geqo_effort basically changes nothing
- geqo changes basically nothing
- with a higher join_collapse_limit (12), geqo=on costs quite a bit (a factor of
20!). I double-checked. At other times I get 'failed to make a valid plan'
All numbers are from 8.5 as of today.
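To make the runs above reproducible, the planner settings varied between runs
look roughly like this (an illustrative sketch only - the concrete values and
combinations used are the ones in the attached numbers.txt):

```sql
-- Sketch of the GUCs varied per run (values here are illustrative):
SET from_collapse_limit = 20;  -- a low value here is deadly for this query
SET join_collapse_limit = 12;  -- 12 with geqo=on cost roughly a factor of 20
SET geqo = on;                 -- by itself changed basically nothing
SET geqo_effort = 5;           -- likewise changed basically nothing
-- then run the attached query:
-- \i query_2.sql
```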
Some explanations about the schema:
- It uses surrogate keys everywhere as the real schema employs some form of
row level, label based access checking (covert channel issues)
- The real schema uses partitions - I don't think they would be interesting
here?
- it's definitely not the most beautiful schema I have seen, but I have to admit
that I cannot think of a much nicer one that serves the different purposes as
well:
  - somewhat complex queries
  - new "information_set"s and "information"s are added frequently
  - automated and manual data entry has to work with such additions
  - the GUI query tool needs to work in the face of such changes
- I have seen similar schemas multiple times now.
- The original schema employs materialized views in parts (both for execution
and planning speed)
- The queries are crazy, but people definitely create/use them.
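For reference, the recurring pattern behind the points above looks roughly like
this (a hypothetical, heavily simplified sketch - the real definitions are in
the attached schema.sql; apart from "information_set" and "information" all
names here are made up):

```sql
-- Surrogate keys everywhere, as the real schema does row level,
-- label based access checking on top of these tables:
CREATE TABLE information_set (
    id   serial PRIMARY KEY,  -- surrogate key
    name text NOT NULL
);

CREATE TABLE information (
    id     serial PRIMARY KEY,  -- surrogate key
    set_id integer NOT NULL REFERENCES information_set(id),
    label  text NOT NULL        -- basis for the access checks
);
```

New "information_set"s and "information"s being added frequently is what makes
the autogenerated queries join so many of these tables at once.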
Andres
Attachment | Content-Type | Size |
---|---|---|
query_2.sql | text/x-sql | 2.1 KB |
schema.sql | text/x-sql | 6.3 KB |
numbers.txt | text/plain | 1.5 KB |
views.sql | text/x-sql | 55.2 KB |