Quick Links

Re: Bundle of patches

Lists:	pgsql-hackerspgsql-patches

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Bundle of patches
Date:	2006-12-04 14:10:25
Message-ID:	45742C51.9020602@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches
we (Oleg and me) developed during our work on porting of the '1C:Enterprise'
system to support the PostgreSQL database.

We would like to suggest to commit they to HEAD.

1) Typmod for user-defined types
http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
Patch is based on ideas from
http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php

Patch adds to type definition two optional function: typmodinput and
typmodoutput. That allows to develop user-defined types with type's
modificators. Built-in types use typmod input/output functions too.
Typmod internally is represented as non-negative int4 value,
but patch allows to point list of integer in type definition. So,
NUMERIC type works with a help of typmodin/typmodout function.

2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
http://www.sigaev.ru/misc/NULLS_82-0.5.gz
Allow to sort NULLs as greater or lesser than any value. The goal was to
simplificate migrations from MySQL/MS SQL which think that NULL is less.
Also, syntax conforms to SQL2003. It operate on gram.y level, and
adds 'field is [not] null' qualification to sortClause.
Note, to allow queries like 'select .. union .. order by f nulls first'
pgsql now can rewrite that query to
'select * from (select .. union ..) order by f nulls first'. This solves the
problem with 'resjunk' column in SelectStmt->sortClause.

3) Allow to use index for IS [NOT] NULL
http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
Initially patch was developed by Martijn van Oosterhout <kleptog(at)svana(dot)org>.
But it's reworked and support of searching NULLS to GiST too. Patch
adds new column named amsearchnull to pg_am. To recognize IS NULL clause
ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and
ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are
treated greater than any value. It might be look some odd that
for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST,
but if sk_flags contains SK_ISNULL then we never call user-defined functions.

4) OR clauses optimizations
http://www.sigaev.ru/misc/OR_82-0.6.gz
Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses
generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries

4.1) Allow any useful common restriction clauses to be extracted from
OR-of-AND quals. Also, it allows to combine several different
operations to one which can be used in index scan.
SELECT
a, b
FROM
tst
WHERE ( a = 50000 ) OR ( a > 50000 AND b > 50000 )
ORDER BY a, b
LIMIT 20;
Limit (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20
loops=1)
-> Index Scan using abidx on tst (cost=0.00..3671.26 rows=24878 width=8)
(actual time=0.265..0.611 rows=20 loops=1)
Index Cond: (a >= 50000)
Filter: ((a = 50000) OR ((a > 50000) AND (b > 50000)))
4.2) When OR clauses aren't intersect and use the same index, it's possible
to just concatenate results of indexscans. For that, now postgres may use
Append node. Append node is modified to have a pathkeys.

SELECT
a
FROM
tst
WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 )
ORDER BY ASC
LIMIT 20;
Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20
loops=1)
-> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824
rows=20 loops=1)
-> Append (cost=0.00..4001.55 rows=2008 width=4) (actual
time=0.349..0.742 rows=20 loops=1)
-> Index Scan using aidx on tst (cost=0.00..2000.42 rows=990
width=4) (actual time=0.346..0.684 rows=20 loops=1)
Index Cond: ((a > 20000) AND (a < 21000))
-> Index Scan using aidx on tst (cost=0.00..2001.12 rows=1018
width=4) (never executed)
Index Cond: ((a > 60000) AND (a < 61000))

Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys
ranges (compare plan with previous one).
SELECT
a
FROM
tst
WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 )
ORDER BY a DESC
LIMIT 20;
Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20
loops=1)
-> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589
rows=20 loops=1)
-> Append (cost=0.00..4001.55 rows=2008 width=4) (actual
time=0.149..0.511 rows=20 loops=1)
-> Index Scan Backward using aidx on tst (cost=0.00..2001.12
rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1)
Index Cond: ((a > 60000) AND (a < 61000))
-> Index Scan Backward using aidx on tst (cost=0.00..2000.42
rows=990 width=4) (never executed)
Index Cond: ((a > 20000) AND (a < 21000))

4.3) As side effect of previous point, overlapped clauses can be eliminated:

SELECT
a
FROM
tst
WHERE ( a > 50000 AND a < 61000 ) OR ( a > 60000 AND a < 60100 )
ORDER BY a
LIMIT 20;
Limit (cost=0.00..4.14 rows=20 width=4) (actual time=0.168..1.001 rows=20
loops=1)
-> Index Scan using aidx on tst (cost=0.00..2344.85 rows=11338 width=4)
(actual time=0.162..0.935 rows=20 loops=1)
Index Cond: ((a > 50000) AND (a < 61000))

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 18:19:03
Message-ID:	19324.1165256343@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
> http://www.sigaev.ru/misc/NULLS_82-0.5.gz

i haven't looked at the other patches yet, but this one is just horrid :-(
Instead of creating a proper parse-tree representation of the NULLS
FIRST/LAST option, you've kluged it to convert the syntax into a
double-column ordering using "foo IS [NOT] NULL" as the first column.
This has obvious semantic disdvantages (what if foo is an expensive
function?); but the real problem is that there's no way for the planner
to reason about ordering in this representation. This patch would
guarantee that an ORDER BY with the NULLS option couldn't use an
indexscan, even if the index sorts nulls at the correct end.

I think a reasonable implementation requires introducing an explicit
concept of nulls-first-or-last into the planner's model of sort order,
and probably into btree index opclasses as well. There is already some
discussion about this in the archives with respect to the problem of
handling descending-sort opclasses nicely. (If you just switch the
operators to make a descending-sort opclass, then nulls end up at the
"other end" from where they would otherwise, meaning that a backwards
index scan does something unexpected. We have to fix that or else
descending-sort opclasses can break mergejoins...)

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 18:35:21
Message-ID:	19450.1165257321@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> 3) Allow to use index for IS [NOT] NULL
> http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
> Initially patch was developed by Martijn van Oosterhout <kleptog(at)svana(dot)org>.
> But it's reworked and support of searching NULLS to GiST too. Patch
> adds new column named amsearchnull to pg_am. To recognize IS NULL clause
> ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and
> ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
> ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are
> treated greater than any value.

And what happens when we implement NULLS FIRST/LAST correctly? This is
really a poor choice of representation.

One thing I find questionable about this is the assumption that indexes
can support "foo IS NULL" and "foo IS NOT NULL" searches equally
conveniently. This is demonstrably false for, say, hash. (Hash could
store null keys by assigning them a fixed hashcode, say 0. Then it
would be able to handle IS NULL searches, but not IS NOT NULL, because
it can't do full-index scans.)

I am not real sure that there is any point in making IS NOT NULL an
indexable condition. We don't support <> as an indexable condition,
and no one's yelled about that. It might be best just to simplify
the patch to do IS NULL only. But if we are going to support both,
we probably have to have two pg_am flags not one.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 18:45:55
Message-ID:	19560.1165257955@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> 4) OR clauses optimizations
> http://www.sigaev.ru/misc/OR_82-0.6.gz

This seems kinda ugly --- it looks very expensive and unlikely to find
useful optimizations most of the time. Have you done any benchmarking
to find out what the cost is when the optimizations don't succeed?

I guess my big problem with it is that it's a huge expansion of the
single ugliest, most ad-hoc part of the planner, ie, orindxqual.c.
And the new code is just as ad-hoc as what was there before ...

Also, what's with the pull_tlist bit?

regards, tom lane

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 18:46:15
Message-ID:	45746CF7.1070801@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

> This has obvious semantic disdvantages (what if foo is an expensive
> function?);
Agree.

> but the real problem is that there's no way for the planner
> to reason about ordering in this representation. This patch would
> guarantee that an ORDER BY with the NULLS option couldn't use an
> indexscan, even if the index sorts nulls at the correct end.

create table foo ( i int);
insert into foo values (1), (5), (NULL);
create index fooidx on foo (i);
set enable_seqscan=off;
set enable_bitmapscan=off;
explain select i from foo order by i asc nulls last;
QUERY PLAN
-------------------------------------------------------------------
Index Scan using fooidx on foo (cost=0.00..12.05 rows=3 width=4)
explain select i from foo order by i desc nulls first;
QUERY PLAN
----------------------------------------------------------------------------
Index Scan Backward using fooidx on foo (cost=0.00..12.05 rows=3 width=4)

Patch is smart enough about "native" NULL's ordering, so it adds quals only if
it needed.

Index support of non-"native" NULL's ordering, IMHO, has some correlation with
suggested OR-patch. Sorting by ASC NULLS FIRST may done by two index scan with
append node:
Append
Index Scan
Cond: foo IS NULL
Index Scan
Cond: foo IS NOT NULL

> I think a reasonable implementation requires introducing an explicit
> concept of nulls-first-or-last into the planner's model of sort order,
Agree, but I tried to keep patches independent as possible...

If we will have agreement about ways to resolve, I'll will time to work
further in foreseeable future.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 19:04:26
Message-ID:	19734.1165259066@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> 1) Typmod for user-defined types
> http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
> Patch is based on ideas from
> http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
> http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php

This one seems generally workable, but I really dislike the approach
that's been used for passing typmod arguments to the typmod_in function.
Representing them with an "internal" parameter means it'll be forever
impossible to write typmod functions in anything but C, which seems an
ugly restriction. Perhaps an array of int4 would be better? How much
flexibility do we really want to provide for typmod arguments? Allowing
full "expr_list" in the grammar seems less than sane, considering the
result is still going to have to pack into 32 bits.

The patch needs more cleanup before applying, too, eg make comments
match code, get rid of unused keywords added to gram.y.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-04 19:26:28
Message-ID:	19958.1165260388@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> but the real problem is that there's no way for the planner
>> to reason about ordering in this representation. This patch would
>> guarantee that an ORDER BY with the NULLS option couldn't use an
>> indexscan, even if the index sorts nulls at the correct end.

> Patch is smart enough about "native" NULL's ordering, so it adds quals
> only if it needed.

[ looks again... ] Oh, so you've hard-wired the assumption that ASC/DESC
correlate to NULLS FIRST/LAST right into the grammar :-(

This is not a workable base for future extension. To be able to do
anything interesting with descending-order opclasses, we really have to
separate ASC/DESC from NULLS FIRST/LAST, not bind them permanently
together.

BTW, another sufficient reason to reject this approach is that a query
entered as "ORDER BY foo NULLS FIRST" would come out as something quite
different, if reverse-listed by ruleutils.c. We've paid the price of
that sort of shortsightedness too many times in the past: implementation
details should not get exposed in the reverse-listing. As a rule of
thumb, if you are putting a transformation involving semantic knowledge
into gram.y, you are probably putting it in the wrong place. (One of
the reasons I like the typmod patch is that it helps pull a lot of
inappropriate knowledge out of gram.y ...)

regards, tom lane

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 19:34:09
Message-ID:	45747831.1000506@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

> This seems kinda ugly --- it looks very expensive and unlikely to find
> useful optimizations most of the time. Have you done any benchmarking
> to find out what the cost is when the optimizations don't succeed?

Yep, it's a big win with order and limit specified.
Let table (a,b) has index over (a,b), so,
queries with ((a=3 AND b>10) OR a>3) ORDER BY a,b may be done with one pass of
index scan with condition a>=3 and filter ((a=3 and b>10) or a>3). And scan out
is already sorted. The single way to execute it without patch is bitmap or scan
over two index scans and following ordering. If limit is small enough then there
is a lot unnecessary work for executor. Thats case may be found by
find_common_quals() which is fast enough.

Simplest case of second kind is ( a<3 or a>5 ). If it's possible to prove that
set of rows for first condition and set for second are not intersected then
output of correlated index scans can be simply joined/appended. In this case, we
broaden applicability of Append node. What is more, it's possible to order index
scans by conditions, which allows do not use sort node.

I understand, that proving of non-intersecting and ordering by conditions is an
expensive operation because of using predicate_implied_by/predicate_refuted_by,
but in most cases they aren't even called. For using this optimization all
conditions should on single index - and it's first step.

Suggested plan's is very effective when a or b is large values like varchar, not
a just integers.

> Also, what's with the pull_tlist bit?
pull target list from subplan.

I've add it because query
select b,a from foo where a<3 or a>5 order by a limit 1;
with plan
Result
Append
Index Scan
Index Scan
fails because Result node thinks that it gets (b,a) tuple, but in fact it gets
(a,b). So, if pull_tlist is TRUE, then create_append_plan takes target list from
first subplan. Currently, that bit set only with OR-optimized plan. And
optimizer of OR-clause guarantee that target lists are the same. Sorry, but I
didn't find clearer solution...

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 19:42:03
Message-ID:	45747A0B.5090903@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

> This one seems generally workable, but I really dislike the approach
> that's been used for passing typmod arguments to the typmod_in function.
> Representing them with an "internal" parameter means it'll be forever
> impossible to write typmod functions in anything but C, which seems an
> ugly restriction. Perhaps an array of int4 would be better? How much
I don't think that is a problem - I'll change that

> flexibility do we really want to provide for typmod arguments? Allowing
> full "expr_list" in the grammar seems less than sane, considering the
> result is still going to have to pack into 32 bits.
As I remember, I tried to use some thing else but, I've got a lot conflicts
with
AexprConst:
func_name '(' expr_list ')' Sconst

>
> The patch needs more cleanup before applying, too, eg make comments
> match code, get rid of unused keywords added to gram.y.
Ok.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-04 20:03:37
Message-ID:	45747F19.3080506@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

> And what happens when we implement NULLS FIRST/LAST correctly? This is
> really a poor choice of representation.

If it's just appending of indexscan's it's not a problem...

>
> One thing I find questionable about this is the assumption that indexes
> can support "foo IS NULL" and "foo IS NOT NULL" searches equally
> conveniently. This is demonstrably false for, say, hash. (Hash could
> store null keys by assigning them a fixed hashcode, say 0. Then it
> would be able to handle IS NULL searches, but not IS NOT NULL, because
> it can't do full-index scans.)

Is there a guarantee that hash value of some not-null keys doesn't equal to
special hash code?

>
> the patch to do IS NULL only. But if we are going areto support both,
> we probably have to have two pg_am flags not one.

GiST isn't effective with single NOT NULL condition ... So, using two flags may
be useful.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-04 22:05:06
Message-ID:	20061204220506.GC24581@svana.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

On Mon, Dec 04, 2006 at 01:35:21PM -0500, Tom Lane wrote:
> > 3) Allow to use index for IS [NOT] NULL
> > http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
> > Initially patch was developed by Martijn van Oosterhout <kleptog(at)svana(dot)org>.
> > But it's reworked and support of searching NULLS to GiST too. Patch
> > adds new column named amsearchnull to pg_am. To recognize IS NULL clause
> > ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and
> > ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
> > ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are
> > treated greater than any value.
>
> I am not real sure that there is any point in making IS NOT NULL an
> indexable condition. We don't support <> as an indexable condition,
> and no one's yelled about that. It might be best just to simplify
> the patch to do IS NULL only. But if we are going to support both,
> we probably have to have two pg_am flags not one.

Originally I didn't have IS NOT NULL but added it because it was easy
and someone suggested a use case: for indexed columns that have a lot
of nulls, it allows you to create an index scan that stops as soon as
it reaches the first null entry. This is useful for the NULL FIRST/LAST
optimisation for example.

You're right, it doesn't work for hash indexes, but you can't do full
scans on them anyway, so it's not terribly important.

I'd say that ordered indexes like b-tree are the only ones that would
get any benefit from IS NOT NULL.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-04 22:12:04
Message-ID:	20061204221203.GD24581@svana.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

On Mon, Dec 04, 2006 at 02:04:26PM -0500, Tom Lane wrote:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> > 1) Typmod for user-defined types
> > http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
> > Patch is based on ideas from
> > http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
> > http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php
>
> This one seems generally workable, but I really dislike the approach
> that's been used for passing typmod arguments to the typmod_in function.
> Representing them with an "internal" parameter means it'll be forever
> impossible to write typmod functions in anything but C, which seems an
> ugly restriction. Perhaps an array of int4 would be better? How much
> flexibility do we really want to provide for typmod arguments? Allowing
> full "expr_list" in the grammar seems less than sane, considering the
> result is still going to have to pack into 32 bits.

People have been discussion passing character set names as typmod
parameters, so restricting them to int4 seems too tight.

I'd favour the approach where the arguments to the typmod_in function
determine the types required. This allows the system to do proper
checking and casting and most important of all, good error messages,
eg:

ERROR: Invalid argument to type: must be one of
numeric(), numeric(integer), numeric(integer, integer)

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-05 12:13:16
Message-ID:	4575625C.7070701@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

> useful optimizations most of the time. Have you done any benchmarking
> to find out what the cost is when the optimizations don't succeed?
Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled
with -O0 and --enable-debug --enable-cassert

Test results (ms)
| [1] | [2] | [3]
-----------------------------------------------------------
[I] 8.3 | 88490.275 | 46684.972 | 21.563
[II] 8.3 splited query | 0.492 | 0.560 | 0.635
[III] 8.3 rewrited query | 0.523 | 0.952 | 1.004
[IV] 8.3+patch | 0.444 | 0.410 | 0.679

Query description
All queries are variants of sinple OR clause with index support. In tests
for 8.3 and 8.3+patch usial queries are used. '8.2 splited query' test executes
separate queries for each OR-ed clause (in supposition that apllication should
concatenate results). '8.3 rewrited query' test uses rewritten queries with
the same result of execution and queries are fast as I could find.

Test result shows a big advantage of using such kind of optimization and
doesn't demonstrate performance gap on optimization stage.

Test suite:
# select * into foo from generate_series(1,100000) as f1, generate_series(1,100)
as f2;
# create index idx on foo (f1,f2);
# vacuum analyze foo;

Queries and plans in details:
==================[I] Without patch=======================
[1]
# explain analyze select * from foo where (f1=70000 and f2>95) or f1>70000 order
by f1, f2 limit 10;
Limit (cost=0.00..45.94 rows=10 width=8) (actual time=88490.044..88490.109
rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..14113503.78 rows=3071985
width=8) (actual time=88490.035..88490.072 rows=10 loops=1)
Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000))
Total runtime: 88490.275 ms

[2]
# explain analyze select * from foo where (f1>40000 and f1<50000) or (f1>60000
and f1<70000) order by f1, f2 limit 10;
Limit (cost=0.00..69.91 rows=10 width=8) (actual time=46684.725..46684.813
rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..14138503.78 rows=2022419
width=8) (actual time=46684.717..46684.768 rows=10 loops=1)
Filter: (((f1 > 40000) AND (f1 < 50000)) OR ((f1 > 60000) AND (f1 <
70000)))
Total runtime: 46684.972 ms

[3]
# explain analyze select * from foo where (f1>49980 and f1<50000) or (f1>69980
and f1<70000) order by f1, f2 limit 10;
Limit (cost=13581.04..13581.07 rows=10 width=8) (actual time=20.767..20.809
rows=10 loops=1)
-> Sort (cost=13581.04..13592.03 rows=4393 width=8) (actual
time=20.759..20.773 rows=10 loops=1)
Sort Key: f1, f2
-> Bitmap Heap Scan on foo (cost=102.12..13315.25 rows=4393 width=8)
(actual time=3.495..11.911 rows=3800 loops=1)
Recheck Cond: (((f1 > 49980) AND (f1 < 50000)) OR ((f1 > 69980) AND
(f1 < 70000)))
-> BitmapOr (cost=102.12..102.12 rows=4393 width=0) (actual
time=3.415..3.415 rows=0 loops=1)
-> Bitmap Index Scan on idx (cost=0.00..51.01 rows=2191
width=0) (actual time=1.887..1.887 rows=1900 loops=1)
Index Cond: ((f1 > 49980) AND (f1 < 50000))
-> Bitmap Index Scan on idx (cost=0.00..51.12 rows=2202
width=0) (actual time=1.519..1.519 rows=1900 loops=1)
Index Cond: ((f1 > 69980) AND (f1 < 70000))
Total runtime: 21.563 ms

=================[II] Without patch, Two queries=======================
Here plans are omitted because they are simple index scans.
[1]
# explain analyze select * from foo where f1=70000 and f2>95 order by f1, f2
limit 10;
Total runtime: 0.219 ms
# explain analyze select * from foo where f1>70000 order by f1, f2 limit 10;
Total runtime: 0.273 ms

[2]
# explain analyze select * from foo where f1>40000 and f1<50000 order by f1,
f2 limit 10;
Total runtime: 0.245 ms
# explain analyze select * from foo where f1>60000 and f1<70000 order by f1, f2
limit 10;
Total runtime: 0.315 ms

[3]
# explain analyze select * from foo where f1>49980 and f1<50000 order by f1, f2
limit 10;
Total runtime: 0.292 ms
# explain analyze seleCt * from foo where f1>69980 and f1<70000 order by f1, f2
limit 10;
Total runtime: 0.343 ms

==================[III]Without patch, rewrited query===================
[1]
# explain analyze select * from foo where ((f1=70000 and f2>95) or f1>70000) and
f1>=70000 order by f1, f2 limit 10;
Limit (cost=0.00..50.33 rows=10 width=8) (actual time=0.320..0.387 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..3974489.10 rows=789613
width=8) (actual time=0.313..0.348 rows=10 loops=1)
Index Cond: (f1 >= 70000)
Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000))
Total runtime: 0.523 ms

[2]
# explain analyze
select * from (
select * from (select * from foo where f1>40000 and f1<50000 order by f1,
f2 limit 10) as t1
union all
select * from (select * from foo where f1>60000 and f1<70000 order by f1,
f2 limit 10) as t2
) as t
order by f1, f2 limit 10;
Limit (cost=28.85..28.87 rows=10 width=8) (actual time=0.586..0.627 rows=10
loops=1)
-> Sort (cost=28.85..28.90 rows=20 width=8) (actual time=0.581..0.594
rows=10 loops=1)
Sort Key: t.f1, t.f2
-> Result (cost=0.00..28.42 rows=20 width=8) (actual
time=0.085..0.431 rows=20 loops=1)
-> Append (cost=0.00..28.42 rows=20 width=8) (actual
time=0.076..0.345 rows=20 loops=1)
-> Limit (cost=0.00..14.11 rows=10 width=8) (actual
time=0.074..0.127 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..1358815.69
rows=963067 width=8) (actual time=0.069..0.098 rows=10 loops=1)
Index Cond: ((f1 > 40000) AND (f1 < 50000))
-> Limit (cost=0.00..14.11 rows=10 width=8) (actual
time=0.101..0.154 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..1527039.50
rows=1082489 width=8) (actual time=0.097..0.121 rows=10 loops=1)
Index Cond: ((f1 > 60000) AND (f1 < 70000))
Total runtime: 0.952 ms

[3]
# explain analyze
select * from (
select * from (select * from foo where f1>49980 and f1<50000 order by f1,
f2 limit 10) as t1
union all
select * from (select * from foo where f1>69980 and f1<70000 order by f1,
f2 limit 10) as t2
) as t
order by f1, f2 limit 10;
Limit (cost=35.61..35.64 rows=10 width=8) (actual time=0.630..0.673 rows=10
loops=1)
-> Sort (cost=35.61..35.66 rows=20 width=8) (actual time=0.626..0.641
rows=10 loops=1)
Sort Key: t.f1, t.f2
-> Result (cost=0.00..35.18 rows=20 width=8) (actual
time=0.114..0.476 rows=20 loops=1)
-> Append (cost=0.00..35.18 rows=20 width=8) (actual
time=0.107..0.401 rows=20 loops=1)
-> Limit (cost=0.00..17.51 rows=10 width=8) (actual
time=0.103..0.156 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..3550.22
rows=2028 width=8) (actual time=0.099..0.126 rows=10 loops=1)
Index Cond: ((f1 > 49980) AND (f1 < 50000))
-> Limit (cost=0.00..17.47 rows=10 width=8) (actual
time=0.129..0.181 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..3804.01
rows=2177 width=8) (actual time=0.125..0.152 rows=10 loops=1)
Index Cond: ((f1 > 69980) AND (f1 < 70000))
Total runtime: 1.004 ms

=================[IV]With patch===================
[1]
# explain analyze select * from foo where (f1=70000 and f2>95) or f1>70000 order
by f1, f2 limit 1
Limit (cost=0.00..1.41 rows=1 width=8) (actual time=0.314..0.316 rows=1 loops=1)
-> Index Scan using idx on foo (cost=0.00..4210762.24 rows=2982440
width=8) (actual time=0.309..0.309 rows=1 loops=1)
Index Cond: (f1 >= 70000)
Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000))
Total runtime: 0.444 ms

[2]
# explain analyze select * from foo where (f1>40000 and f1<50000) or (f1>60000
and f1<70000) order by f1, f2 limit 10;
Limit (cost=0.00..14.16 rows=10 width=8) (actual time=0.087..0.192 rows=10 loops=1)
-> Result (cost=0.00..2932816.48 rows=2071539 width=8) (actual
time=0.081..0.160 rows=10 loops=1)
-> Append (cost=0.00..2932816.48 rows=2071539 width=8) (actual
time=0.074..0.123 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..1382937.86 rows=976723
width=8) (actual time=0.069..0.092 rows=10 loops=1)
Index Cond: ((f1 > 40000) AND (f1 < 50000))
-> Index Scan using idx on foo (cost=0.00..1549878.62
rows=1094816 width=8) (never executed)
Index Cond: ((f1 > 60000) AND (f1 < 70000))
Total runtime: 0.410 ms

[3]
# explain analyze select * from foo where (f1>49980 and f1<50000) or (f1>69980
and f1<70000) order by f1, f2 limit 10
Limit (cost=0.00..17.57 rows=10 width=8) (actual time=0.115..0.226 rows=10 loops=1)
-> Result (cost=0.00..6902.27 rows=3928 width=8) (actual
time=0.111..0.197 rows=10 loops=1)
-> Append (cost=0.00..6902.27 rows=3928 width=8) (actual
time=0.102..0.156 rows=10 loops=1)
-> Index Scan using idx on foo (cost=0.00..3427.16 rows=1950
width=8) (actual time=0.099..0.125 rows=10 loops=1)
Index Cond: ((f1 > 49980) AND (f1 < 50000))
-> Index Scan using idx on foo (cost=0.00..3475.11 rows=1978
width=8) (never executed)
Index Cond: ((f1 > 69980) AND (f1 < 70000))
Total runtime: 0.679 ms

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-05 15:32:23
Message-ID:	4918.1165332743@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> useful optimizations most of the time. Have you done any benchmarking
>> to find out what the cost is when the optimizations don't succeed?

> Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled
> with -O0 and --enable-debug --enable-cassert

This is not responding to my concern. What you presented was an
advertisement for the cases where the patch is able to find a better
plan. What I want to know about is how much planning time is added
for queries that it's *not* able to improve. EXPLAIN ANALYZE output
doesn't address that point because it doesn't show planning time.

regards, tom lane

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-06 18:41:43
Message-ID:	45770EE7.9010605@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

> This is not responding to my concern. What you presented was an
Sorry, I see your point now.

Timing is with the same test suite, the same notebook, the same compiling
options and patch (attached) which measures total time of
keybased_rewrite_index_paths() call.

Time in table is measured in microseconds (not milliseconds!):

Number of clauses: | 2 | 3 | 4
----------------------------------------------------
not improve | 123-333 | 209-566 | 359-876
improve | 138-486 | 962 | 1588

Details:
a) Patch isn't able to improve
# select * from foo where (f1>40000 and f1<50000) or (f1>45000 and f1<70000)
order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000333 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f1>45000 and
f1<70000) or (f1>65000 and f1<80000) order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000566 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f1>45000 and
f1<70000) or (f1>65000 and f1<80000) or ( f1>75000 and f1<90000 ) order by f1,
f2 limit 10;
NOTICE: Elapsed time 0.000876 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46)
order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000123 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46)
or (f1<30000 and f2<36) order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000209 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46)
or (f1<30000 and f2<36) or (f1>45000 and f2>5) order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000359 sec

b) Successful improving
# select * from foo where (f1=70000 and f2>95) or f1>70000 order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000414 sec
# select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000)
order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000486 sec
# select * from foo where (f1>49980 and f1<50000) or (f1>69980 and f1<70000)
order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000478 sec
# select * from foo where (f1=70000 and f2>95) or f1>40000 order by f1, f2
limit 10;
NOTICE: Elapsed time 0.000138 sec
2) tree clause
# select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) or
(f1>80000 and f1<90000)order by f1, f2 limit 10;
NOTICE: Elapsed time 0.000962 sec
3) 4 clauses
# select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) or
(f1>80000 and f1<90000) or (f1>95000 and f1<96000) order by f1, f2 limit 10;
NOTICE: Elapsed time 0.001588 sec

Unsuccessful

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment	Content-Type	Size
timing.patch	text/plain	1.4 KB

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-20 14:35:53
Message-ID:	45894A49.7030602@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

>> Perhaps an array of int4 would be better? How much

Done
http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz

>> The patch needs more cleanup before applying, too, eg make comments
>> match code, get rid of unused keywords added to gram.y.

Cleaned.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-20 14:46:30
Message-ID:	21357.1166625990@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>>> Perhaps an array of int4 would be better? How much

> Done
> http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz

>>> The patch needs more cleanup before applying, too, eg make comments
>>> match code, get rid of unused keywords added to gram.y.

> Cleaned.

OK. I'm up to my rear in the opclass/opfamily rewrite, but will take
another look at the typmod code as soon as I have a working HEAD again ;-)

regards, tom lane

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-21 18:29:30
Message-ID:	458AD28A.1080501@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

0.9 doesn't apply cleanly after Peter's changes, so, new version

http://www.sigaev.ru/misc/user_defined_typmod-0.10.gz

Teodor Sigaev wrote:
> >> Perhaps an array of int4 would be better? How much
>
> Done
> http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz
>
>>> The patch needs more cleanup before applying, too, eg make comments
>>> match code, get rid of unused keywords added to gram.y.
>
> Cleaned.
>
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-29 12:50:08
Message-ID:	45950F00.3080309@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Just a freshing for clean applying..
http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz

Is any objections to commit?
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2006-12-29 12:51:44
Message-ID:	45950F60.8070303@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

>> This is not responding to my concern. What you presented was an
> Sorry, I see your point now.

Is that test enough? Or I should make more?

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-29 17:21:56
Message-ID:	13803.1167412916@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> Just a freshing for clean applying..
> http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz
> Is any objections to commit?

There's still a lot I don't particularly care for here (lack of
documentation being the biggest), but I'll make a pass at cleaning it up.

One thought I had after a quick reading of the patch is that there
doesn't seem to be a whole lot of percentage in trying to move the
typmod handling for time/timestamp/interval types into callable
functions, because we still have to special-case the darn things
thanks to the SQL spec's oddball syntax requirements. For instance
in format_type_be() you've got

case TIMETZOID:
if (with_typemod)
! buf = psnprintf(50, "time(%d) with time zone",
! typemod);
else
buf = pstrdup("time with time zone");
break;

changed to

case TIMETZOID:
if (with_typemod)
! buf = printTypmod("time", typemod, typeform->typmodoutput);
else
buf = pstrdup("time with time zone");
break;

which hardly seems like much of an improvement. If we could have gotten
rid of the switch() branch for TIMETZOID entirely, then we'd be
somewhere, but there's no chance because the typmod-less case still has
to be special. So I'm sort of inclined to leave the processing of these
datatypes as-is, and only use the typmodin/typmodout infrastructure for
datatypes that follow the "canonical" syntax of type_name(int[,int ...]).
Thoughts?

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Bundle of patches
Date:	2006-12-30 21:23:02
Message-ID:	21416.1167513782@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> Just a freshing for clean applying..
> http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz

Applied with some revisions, and pg_dump support and regression tests
added.

regards, tom lane

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] [PATCHES] Bundle of patches
Date:	2007-01-10 10:47:22
Message-ID:	45A4C43A.3050505@sigaev.ru
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

Nice, thanks a lot.

Tom Lane wrote:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> Just a freshing for clean applying..
>> http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz
>
> Applied with some revisions, and pg_dump support and regression tests
> added.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: Bundle of patches
Date:	2007-02-04 02:38:07
Message-ID:	200702040238.l142c7S26284@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers pgsql-patches

I assume we have taken all the patches from here we want.

---------------------------------------------------------------------------

Teodor Sigaev wrote:
> The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches
> we (Oleg and me) developed during our work on porting of the '1C:Enterprise'
> system to support the PostgreSQL database.
>
> We would like to suggest to commit they to HEAD.
>
>
> 1) Typmod for user-defined types
> http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
> Patch is based on ideas from
> http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
> http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php
>
> Patch adds to type definition two optional function: typmodinput and
> typmodoutput. That allows to develop user-defined types with type's
> modificators. Built-in types use typmod input/output functions too.
> Typmod internally is represented as non-negative int4 value,
> but patch allows to point list of integer in type definition. So,
> NUMERIC type works with a help of typmodin/typmodout function.
>
>
> 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
> http://www.sigaev.ru/misc/NULLS_82-0.5.gz
> Allow to sort NULLs as greater or lesser than any value. The goal was to
> simplificate migrations from MySQL/MS SQL which think that NULL is less.
> Also, syntax conforms to SQL2003. It operate on gram.y level, and
> adds 'field is [not] null' qualification to sortClause.
> Note, to allow queries like 'select .. union .. order by f nulls first'
> pgsql now can rewrite that query to
> 'select * from (select .. union ..) order by f nulls first'. This solves the
> problem with 'resjunk' column in SelectStmt->sortClause.
>
> 3) Allow to use index for IS [NOT] NULL
> http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
> Initially patch was developed by Martijn van Oosterhout <kleptog(at)svana(dot)org>.
> But it's reworked and support of searching NULLS to GiST too. Patch
> adds new column named amsearchnull to pg_am. To recognize IS NULL clause
> ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and
> ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
> ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are
> treated greater than any value. It might be look some odd that
> for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST,
> but if sk_flags contains SK_ISNULL then we never call user-defined functions.
>
> 4) OR clauses optimizations
> http://www.sigaev.ru/misc/OR_82-0.6.gz
> Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses
> generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries
>
> 4.1) Allow any useful common restriction clauses to be extracted from
> OR-of-AND quals. Also, it allows to combine several different
> operations to one which can be used in index scan.
> SELECT
> a, b
> FROM
> tst
> WHERE ( a = 50000 ) OR ( a > 50000 AND b > 50000 )
> ORDER BY a, b
> LIMIT 20;
> Limit (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20
> loops=1)
> -> Index Scan using abidx on tst (cost=0.00..3671.26 rows=24878 width=8)
> (actual time=0.265..0.611 rows=20 loops=1)
> Index Cond: (a >= 50000)
> Filter: ((a = 50000) OR ((a > 50000) AND (b > 50000)))
> 4.2) When OR clauses aren't intersect and use the same index, it's possible
> to just concatenate results of indexscans. For that, now postgres may use
> Append node. Append node is modified to have a pathkeys.
>
> SELECT
> a
> FROM
> tst
> WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 )
> ORDER BY ASC
> LIMIT 20;
> Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20
> loops=1)
> -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824
> rows=20 loops=1)
> -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual
> time=0.349..0.742 rows=20 loops=1)
> -> Index Scan using aidx on tst (cost=0.00..2000.42 rows=990
> width=4) (actual time=0.346..0.684 rows=20 loops=1)
> Index Cond: ((a > 20000) AND (a < 21000))
> -> Index Scan using aidx on tst (cost=0.00..2001.12 rows=1018
> width=4) (never executed)
> Index Cond: ((a > 60000) AND (a < 61000))
>
> Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys
> ranges (compare plan with previous one).
> SELECT
> a
> FROM
> tst
> WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 )
> ORDER BY a DESC
> LIMIT 20;
> Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20
> loops=1)
> -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589
> rows=20 loops=1)
> -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual
> time=0.149..0.511 rows=20 loops=1)
> -> Index Scan Backward using aidx on tst (cost=0.00..2001.12
> rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1)
> Index Cond: ((a > 60000) AND (a < 61000))
> -> Index Scan Backward using aidx on tst (cost=0.00..2000.42
> rows=990 width=4) (never executed)
> Index Cond: ((a > 20000) AND (a < 21000))
>
> 4.3) As side effect of previous point, overlapped clauses can be eliminated:
>
> SELECT
> a
> FROM
> tst
> WHERE ( a > 50000 AND a < 61000 ) OR ( a > 60000 AND a < 60100 )
> ORDER BY a
> LIMIT 20;
> Limit (cost=0.00..4.14 rows=20 width=4) (actual time=0.168..1.001 rows=20
> loops=1)
> -> Index Scan using aidx on tst (cost=0.00..2344.85 rows=11338 width=4)
> (actual time=0.162..0.935 rows=20 loops=1)
> Index Cond: ((a > 50000) AND (a < 61000))
>
>
>
> --
> Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
> WWW: http://www.sigaev.ru/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +