Quick Links

Re: [GENERAL] PostgreSQL TPC-H test result?

Lists:	pgsql-generalpgsql-performance

From:	"Amber" <guxiaobo1982(at)hotmail(dot)com>
To:	"pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject:	PostgreSQL TPC-H test result?
Date:	2008-09-09 11:59:49
Message-ID:	BLU139-DS5AB90A6E11E3F439F0006CD540@phx.gbl
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

I read something from http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html saying that PostgreSQL can't give the correct result of the some TPC-H queries, I wonder is there any official statements about this, because it will affect our plane of using PostgreSQL as an alternative because it's usability. BTW I don't think PostgreSQL performances worse because the default configuration usually can't use enough resources of the computer, as as memory.

From:	Andrew Sullivan <ajs(at)commandprompt(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 12:39:47
Message-ID:	20080909123947.GA59092@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, Sep 09, 2008 at 07:59:49PM +0800, Amber wrote:

> I read something from
> http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html

Given that the point of that "study" is to prove something about
performance, one should be leery of any claims based on an "out of the
box" comparison. Particularly since the "box" their own product comes
out of is "compiled from CVS checkout". Their argument seems to be
that people can learn how to drive CVS and to compile software under
active development, but can't read the manual that comes with Postgres
(and a release of Postgres well over a year old, at that).

I didn't get any further in reading the claims, because it's obviously
nothing more than a marketing effort using the principle that deriding
everyone else will make them look better. Whether they have a good
product is another question entirely.

A
--
Andrew Sullivan
ajs(at)commandprompt(dot)com
+1 503 667 4564 x104
http://www.commandprompt.com/

From:	"Amber" <guxiaobo1982(at)hotmail(dot)com>
To:	"Andrew Sullivan" <ajs(at)commandprompt(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 14:06:01
Message-ID:	BLU139-DS28D9D90A25BD2AD8CDEF0CD540@phx.gbl
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

Yes, we don't care about the performance results, but we do care about the point that PostgreSQL can't give the correct results of TPC-H queries.

--------------------------------------------------
From: "Andrew Sullivan" <ajs(at)commandprompt(dot)com>
Sent: Tuesday, September 09, 2008 8:39 PM
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: [GENERAL] PostgreSQL TPC-H test result?

> On Tue, Sep 09, 2008 at 07:59:49PM +0800, Amber wrote:
>
>> I read something from
>> http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html
>
> Given that the point of that "study" is to prove something about
> performance, one should be leery of any claims based on an "out of the
> box" comparison. Particularly since the "box" their own product comes
> out of is "compiled from CVS checkout". Their argument seems to be
> that people can learn how to drive CVS and to compile software under
> active development, but can't read the manual that comes with Postgres
> (and a release of Postgres well over a year old, at that).
>
> I didn't get any further in reading the claims, because it's obviously
> nothing more than a marketing effort using the principle that deriding
> everyone else will make them look better. Whether they have a good
> product is another question entirely.
>
> A
> --
> Andrew Sullivan
> ajs(at)commandprompt(dot)com
> +1 503 667 4564 x104
> http://www.commandprompt.com/
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

From:	"Richard Broersma" <richard(dot)broersma(at)gmail(dot)com>
To:	Amber <guxiaobo1982(at)hotmail(dot)com>
Cc:	"Andrew Sullivan" <ajs(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 15:17:33
Message-ID:	396486430809090817n7b7cc3b5g21dd8ae0a0052c0f@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, Sep 9, 2008 at 7:06 AM, Amber <guxiaobo1982(at)hotmail(dot)com> wrote:
> Yes, we don't care about the performance results, but we do care about the point that PostgreSQL can't give the correct results of TPC-H queries.

It would be nice to know about the data, queries, and the expected
results of their tests just so we could see this for ourselves.

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug

From:	Robert Treat <robert(at)omniti(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Cc:	"Amber" <guxiaobo1982(at)hotmail(dot)com>, "Andrew Sullivan" <ajs(at)commandprompt(dot)com>
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 15:29:54
Message-ID:	200809091129.54467.robert@omniti.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tuesday 09 September 2008 10:06:01 Amber wrote:
> From: "Andrew Sullivan" <ajs(at)commandprompt(dot)com>
> Sent: Tuesday, September 09, 2008 8:39 PM
> To: <pgsql-general(at)postgresql(dot)org>
> Subject: Re: [GENERAL] PostgreSQL TPC-H test result?
>
> > On Tue, Sep 09, 2008 at 07:59:49PM +0800, Amber wrote:
> >> I read something from
> >> http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html
> >
> > Given that the point of that "study" is to prove something about
> > performance, one should be leery of any claims based on an "out of the
> > box" comparison. Particularly since the "box" their own product comes
> > out of is "compiled from CVS checkout". Their argument seems to be
> > that people can learn how to drive CVS and to compile software under
> > active development, but can't read the manual that comes with Postgres
> > (and a release of Postgres well over a year old, at that).
> >
> > I didn't get any further in reading the claims, because it's obviously
> > nothing more than a marketing effort using the principle that deriding
> > everyone else will make them look better. Whether they have a good
> > product is another question entirely.
> >
> > >Yes, we don't care about the performance results, but we do care
> > >about the
> > > point that PostgreSQL can't give the correct results of TPC-H queries.

Given the point of those benchmarks is to make other systems look bad, I think
you have to take them with a grain of salt. Since we don't know what the
errors/results were, and no information is giving, we are left to wonder if
this is a problem with the software or the tester. The site would have us
believe the former, but I think I would lean toward the latter... case in
point, I did a quick google and turned up this link:
http://www.it.iitb.ac.in/~chetanv/personal/acads/db/report_html/node10.html.
It isn't terribly informative, but it doesindicate one thing, someone else
was able to run query #6 correctly, while the above site claims it returns an
error. Now when I look at query#6 from that site, I notice it shows the
following syntax:

interval '1' year.

when I saw that, it jumped out at me as something that could be an issue, and
it is:

pagila=# select now() - interval '1' year, now() - interval '1 year';
?column? | ?column?
-------------------------------+-------------------------------
2008-09-09 11:28:46.938209-04 | 2007-09-09 11:28:46.938209-04
(1 row)

Now, I'm not sure if there is an issue that monet supports the first syntax
and so when they ran thier test on postgres this query produced wrong
results, but that seems possible. In this case I would wonder if the first
syntax is sql compliant, but it doesn't really matter, the tpc-h allows for
changes to queries to support syntax variations between databases; I'm pretty
sure I could make suttle changes to "break" other databases as well.

Incidentally, I poked Mark Wong, who used to work at the OSDL (big linux
kernel hacking shop), and he noted he has successfully run the tpc-h tests
before on postgres.

In the end, I can't speak to what the issues are wrt monet and postgres and
thier tpc-h benchmarks, but personally I don't think they are worth worring
about.

--
Robert Treat
http://www.omniti.com
Database: Scalability: Consulting:

From:	"Merlin Moncure" <mmoncure(at)gmail(dot)com>
To:	Amber <guxiaobo1982(at)hotmail(dot)com>
Cc:	"Andrew Sullivan" <ajs(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 15:35:16
Message-ID:	b42b73150809090835m6e7781f1uf4ba76fd889447d3@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, Sep 9, 2008 at 10:06 AM, Amber <guxiaobo1982(at)hotmail(dot)com> wrote:
> Yes, we don't care about the performance results, but we do care about the point that PostgreSQL can't give the correct results of TPC-H queries.

PostgreSQL, at least in terms of the open source databases, is
probably your best bet if you are all concerned about correctness. Do
not give any credence to a vendor published benchmark unless the test
is published and can be independently verifed.

merlin

From:	Andrew Sullivan <ajs(at)commandprompt(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 15:51:13
Message-ID:	20080909155113.GM59092@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, Sep 09, 2008 at 10:06:01PM +0800, Amber wrote:
> Yes, we don't care about the performance results, but we do care about the point that PostgreSQL can't give the correct results of TPC-H queries.
>

I have never heard a reputable source claim this. I have grave doubts
about their claim: they don't specify what implementation of TPC-H
they use. They don't actually have the right, AIUI, to claim they
tested under TPC-H, since their results aren't listed anywhere on
http://www.tpc.org/tpch/results/tpch_results.asp?orderby=dbms. It
could well be that they made up something that kinda does something
like TPC-H, tailored to how their database works, and then claimed
others can't do the job. That's nice marketing material, but it's not
a meaningful test result.

Without access to the methodology, you should be wary of accepting any
of the conclusions.

There is, I understand, an implementation of something like TPC-H that
you could use to test it yourself. http://osdldbt.sourceforge.net/.
DBT-3 is supposed to be that workload. Please note that the license
does not allow you to publish competitive tests for marketing
reasons. but you could see for yourself whether the claim is true
that way.

A
--
Andrew Sullivan
ajs(at)commandprompt(dot)com
+1 503 667 4564 x104
http://www.commandprompt.com/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Treat <robert(at)omniti(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org, "Amber" <guxiaobo1982(at)hotmail(dot)com>, "Andrew Sullivan" <ajs(at)commandprompt(dot)com>
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 20:26:55
Message-ID:	24765.1220992015@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

Robert Treat <robert(at)omniti(dot)com> writes:
> http://www.it.iitb.ac.in/~chetanv/personal/acads/db/report_html/node10.html.
> It isn't terribly informative, but it doesindicate one thing, someone else
> was able to run query #6 correctly, while the above site claims it returns an
> error. Now when I look at query#6 from that site, I notice it shows the
> following syntax:

> interval '1' year.

> when I saw that, it jumped out at me as something that could be an issue, and
> it is:

Yeah. This is SQL spec syntax, but it's not fully implemented in
Postgres: the grammar supports it but the info doesn't get propagated to
interval_in, and interval_in wouldn't know what to do even if it did
have the information that there was a YEAR qualifier after the literal.

That's probably not good because it *looks* like we support the syntax,
but in fact produce non-spec-compliant results. I think it might be
better if we threw an error.

Or someone could try to make it work, but given that no one has taken
the slightest interest since Tom Lockhart left the project, I wouldn't
hold my breath waiting for that.

regards, tom lane

From:	Greg Smith <gsmith(at)gregsmith(dot)com>
To:	Amber <guxiaobo1982(at)hotmail(dot)com>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>, J(dot)K(dot)Shah(at)Sun(dot)COM
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 21:42:50
Message-ID:	Pine.GSO.4.64.0809091723090.29002@westnet.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, 9 Sep 2008, Amber wrote:

> I read something from
> http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html
> saying that PostgreSQL can't give the correct result of the some TPC-H
> queries

Jignesh Shah at Sun ran into that same problem. It's mentioned briefly in
his presentation at
http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_postgresql on
pages 26 and 27. 5 of the 22 reference TCP-H queries (4, 5, 6, 10, 14)
returned zero rows immediately for his tests. Looks like the MonetDB crew
is saying it does that on queries 4,5,6,10,12,14,15 and that 20 takes too
long to run to generate a result. Maybe 12/15/20 were fixed by changes in
8.3, or perhaps there were subtle errors there that Jignesh didn't
catch--it's not like he did a formal submission run, was just kicking the
tires. I suspect the difference on 20 was that his hardware and tuning
was much better, so it probably did execute fast enough.

While some of the MonetDB bashing in this thread was unwarranted, it is
quite inappropriate that they published performance results here. Would
be nice if someone in the community were to grab ahold of the TPC-H
problems and try to shake them out.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

From:	"Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To:	Greg Smith <gsmith(at)gregsmith(dot)com>
Cc:	Amber <guxiaobo1982(at)hotmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-09 22:28:54
Message-ID:	48C6F8A6.6070308@sun.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

Greg Smith wrote:
> On Tue, 9 Sep 2008, Amber wrote:
>
>> I read something from
>> http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html
>> saying that PostgreSQL can't give the correct result of the some
>> TPC-H queries
>
> Jignesh Shah at Sun ran into that same problem. It's mentioned
> briefly in his presentation at
> http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_postgresql
> on pages 26 and 27. 5 of the 22 reference TCP-H queries (4, 5, 6, 10,
> 14) returned zero rows immediately for his tests. Looks like the
> MonetDB crew is saying it does that on queries 4,5,6,10,12,14,15 and
> that 20 takes too long to run to generate a result. Maybe 12/15/20
> were fixed by changes in 8.3, or perhaps there were subtle errors
> there that Jignesh didn't catch--it's not like he did a formal
> submission run, was just kicking the tires. I suspect the difference
> on 20 was that his hardware and tuning was much better, so it probably
> did execute fast enough.
>
> While some of the MonetDB bashing in this thread was unwarranted, it
> is quite inappropriate that they published performance results here.
> Would be nice if someone in the community were to grab ahold of the
> TPC-H problems and try to shake them out.
>

Hmm This is the second time MonetDB has published PostgreSQL numbers. I
think I will try to spend few days on TPC-H again on a much smaller
scale (to match what MonetDB used) and start discussions on solving the
problem.. Keep tuned.

Regards,
Jignesh

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Treat <robert(at)omniti(dot)com>, pgsql-general(at)postgresql(dot)org, Amber <guxiaobo1982(at)hotmail(dot)com>, Andrew Sullivan <ajs(at)commandprompt(dot)com>
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-10 09:20:32
Message-ID:	1221038432.3913.647.camel@ebony.2ndQuadrant
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, 2008-09-09 at 16:26 -0400, Tom Lane wrote:

> That's probably not good because it *looks* like we support the syntax,
> but in fact produce non-spec-compliant results. I think it might be
> better if we threw an error.

Definitely. If we accept SQL Standard syntax like this but then not do
what we should, it is clearly an ERROR. Our reputation will be damaged
if we don't, since people will think that we are blase about standards
compliance and about query correctness. Please lets move swiftly to plug
this hole, as if it were a data loss bug (it is, if it causes wrong
answers to queries for unsuspecting users).

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

From:	Andrew Sullivan <ajs(at)commandprompt(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-10 14:08:26
Message-ID:	20080910140825.GE61018@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Tue, Sep 09, 2008 at 05:42:50PM -0400, Greg Smith wrote:
>
> While some of the MonetDB bashing in this thread was unwarranted,

What bashing? I didn't see any bashing of them.

A
--
Andrew Sullivan
ajs(at)commandprompt(dot)com
+1 503 667 4564 x104
http://www.commandprompt.com/

From:	Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc:	Robert Treat <robert(at)omniti(dot)com>, pgsql-general(at)postgresql(dot)org, Amber <guxiaobo1982(at)hotmail(dot)com>, Andrew Sullivan <ajs(at)commandprompt(dot)com>, simon(at)2ndQuadrant(dot)com
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-11 22:48:54
Message-ID:	48C9A056.4010205@cheapcomplexdevices.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

Tom Lane wrote:
>> interval '1' year.
>
> ...is SQL spec syntax, but it's not fully implemented in Postgres...
>
> Or someone could try to make it work, but given that no one has taken
> the slightest interest since Tom Lockhart left the project, I wouldn't
> hold my breath waiting for that.

I have interest. For 5 years I've been maintaining a patch for a client
that allows the input of ISO-8601 intervals (like 'P1YT1M') rather than
the nonstandard shorthand ('1Y1M') that postgresql supports[1].

I'd be interested in working on this. Especially if supporting SQL
standard interval syntax could improve the chances of getting my
ISO-8601-interval-syntax replacing nonstandard-postgres-shorthand-intervals
patch accepted again, I'd be quite happy work on it.

Tom in 2003 said my code looked cleaner than the current code[2], and
the patch was accepted[3] for a while before being rejected - I believe
because Peter said he'd like to see the SQL standard intervals first.
I see it's still a TODO, though.

> the grammar supports it but the info doesn't get propagated to
> interval_in, and interval_in wouldn't know what to do even if it did
> have the information that there was a YEAR qualifier after the literal.

Any hints on how best to propagate the needed info from the grammar?
Or should it be obvious to me from reading the code?

[1] http://archives.postgresql.org/pgsql-patches/2003-09/msg00119.php
[2] http://archives.postgresql.org/pgsql-patches/2003-09/msg00121.php
[3] http://archives.postgresql.org/pgsql-patches/2003-12/msg00253.php

Ron Mayer
(formerly ron(at)intervideo(dot)com who
posted those ISO-8601 interval patches)

From:	Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To:	Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Treat <robert(at)omniti(dot)com>, pgsql-general(at)postgresql(dot)org, Amber <guxiaobo1982(at)hotmail(dot)com>, Andrew Sullivan <ajs(at)commandprompt(dot)com>, simon(at)2ndQuadrant(dot)com
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-11 23:09:24
Message-ID:	48C9A524.80407@cheapcomplexdevices.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

Ron Mayer wrote:
> Tom Lane wrote:
>> Or someone could try to make it work, but given that no one has taken
>> the slightest interest since Tom Lockhart left the project, I wouldn't
>> hold my breath waiting for that.
>
> I have interest. For 5 years I've been maintaining a patch for a client

Doh. Now that I catch up on emails I see Tom has a patch
in a different thread. I'll follow up there...

From:	"Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To:	Greg Smith <gsmith(at)gregsmith(dot)com>
Cc:	Amber <guxiaobo1982(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: PostgreSQL TPC-H test result?
Date:	2008-09-12 03:30:39
Message-ID:	48C9E25F.4070203@sun.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

Moving this thread to Performance alias as it might make more sense for
folks searching on this topic:

I redid a quick test with the same workload on one of my systems with SF
10 which is about 10GB
(I hope it comes out properly displayed)

Jignesh From Monet (8.3T/8.2.9)

Q Time PG8.3.3 Time PG8.2.9 Ratio

1 429.01 510 0.84

2 3.65 54 0.07

3 33.49 798 0.04

4 6.53 Empty 35 (E) 0.19

5 8.45 Empty 5.5(E) 1.54

6 32.84 Empty 172 (E) 0.19

7 477.95 439 1.09

8 58.55 251 0.23

9 781.96 2240 0.35

10 9.03 Empty 6.1(E) 1.48

11 3.57 Empty 25 0.14

12 56.11 Empty 179 (E) 0.31

13 61.01 140 0.44

14 30.69 Empty 169 (E) 0.18

15 32.81 Empty 168 (E) 0.2

16 23.98 115 0.21

17 Did not finish Did not finish

18 58.93 882 0.07

19 71.55 218 0.33

20 Did not finish Did not finish

21 550.51 477 1.15

22 6.21 Did not finish

All time is in seconds (sub seconds where availabe)
Ratio > 1 means 8.3.3 is slower and <1 means 8.3.3 is faster

My take on the results:

* I had to tweak the statement of Q1 in order to execute it.
(TPC-H kit does not directly support POSTGRESQL statements)

* Timings with 8.3.3 and bit of tuning gives much better time overall
This was expected (Some queries finish in 7% of the time than what
MonetDB reported. From the queries that worked only Q7 & Q21 seem to
have regressed)

* However Empty rows results is occuring consistently
(Infact Q11 also returned empty for me while it worked in their test)
Queries: 4,5,6,10,11,12,14,15
(ACTION ITEM: I will start separate threads for each of those queries in
HACKERS alias to figure out the problem since it looks like Functional
problem to me and should be interesting to hackers alias)

* Two queries 17,20 looks like will not finish (I let Q17 to run for 18
hrs and
yet it had not completed. As for Q20 I killed it as it was approaching
an hour.)
(ACTION ITEM: Not sure whether debugging for these queries will go in
hackers or
perform alias but I will start a thread on them too.)

* Looks like bit of tuning is required for Q1, Q7, Q9, Q21 to improve their
overall time. Specially understanding if PostgreSQL is missing a more
efficient
plan for them.
(ACTION ITEM: I will start separate threads on performance alias to
dig into
those queries)

I hope to start separate threads for each queries so we can track them
easier. I hope to provide explain analyze outputs for each one of them
and lets see if there are any problems.

Feedback welcome on what you want to see for each threads.

Regards,
Jignesh

--
Jignesh Shah http://blogs.sun.com/jkshah
Sun Microsystems,Inc http://sun.com/postgresql

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc:	Greg Smith <gsmith(at)gregsmith(dot)com>, Amber <guxiaobo1982(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: [GENERAL] PostgreSQL TPC-H test result?
Date:	2008-09-12 03:48:06
Message-ID:	24184.1221191286@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

"Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
> * However Empty rows results is occuring consistently
> (Infact Q11 also returned empty for me while it worked in their test)
> Queries: 4,5,6,10,11,12,14,15
> (ACTION ITEM: I will start separate threads for each of those queries in
> HACKERS alias to figure out the problem since it looks like Functional
> problem to me and should be interesting to hackers alias)

See discussion suggesting that this is connected to misinterpretation of
INTERVAL literals. If TPC-H relies heavily on syntax that we'd get
wrong, then pretty much every test result has to be under suspicion,
since we might be fetching many more or fewer rows than the test
intends.

I've recently committed fixes that I think would cover this, but you'd
really need to compare specific query rowcounts against other DBMSes
to make sure we're all on the same page.

regards, tom lane

From:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To:	"Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>
Cc:	"Greg Smith" <gsmith(at)gregsmith(dot)com>, Amber <guxiaobo1982(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: [GENERAL] PostgreSQL TPC-H test result?
Date:	2008-09-12 04:46:55
Message-ID:	36e682920809112146y2a1f43cdk5bc2b085de92e16d@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-performance

On Thu, Sep 11, 2008 at 11:30 PM, Jignesh K. Shah <J(dot)K(dot)Shah(at)sun(dot)com> wrote:
> Moving this thread to Performance alias as it might make more sense for
> folks searching on this topic:

You should be using DBT-3. Similarly, a scale factor of 10 is
pointless. How many data warehouses are only 10GB? Also, it's
well-known that MonetDB will quickly fall over when you run a test
larger than can fit in memory. In the real benchmark, the minimum
scale factor is 100GB; try it and see what you get.

If you have the resources and want to compare it to something, compare
it with Oracle on the exact same system. If tuned properly, Oracle
10g (Standard Edition with the exact same tables/indexes/queries as
Postgres) is ~5-10x faster and Enterprise Edition is ~50-100x faster.
To be fair, an Oracle Enterprise Edition configuration for TPC-H uses
advanced partitioning and materialized views, both of which Postgres
does not support, which makes it an apples-to-oranges comparison. I
haven't tried 11g, but I expect it will perform a bit better in this
area given several of the enhancements. Also, while it's not widely
known, if you wanted to compare systems and don't want to set it all
up yourself, Oracle released Oracle-compatible versions of OSDL's
Database Test Suite for DBT-2 (TPC-C) and DBT-3 (TPC-H) as part of
their Linux Test Kit which can be found at oss.oracle.com. Due to
Oracle's license, I can't give you exact timings, but I have confirmed
with several other benchmark professionals that the results mentioned
above have been confirmed by others as well.

To be clear, I'm not trying to bash on PG and I don't want to start a
flame-war. I just think that people should be aware of where we stand
in comparison to commercial systems and understand that there's quite
a bit of work to be done in the VLDB area. Specifically, I don't
think we should be striving for great TPC-H performance, but I believe
there is some areas we could improve based on it. Similarly, this is
an area where a properly-utilized fadvise may show some benefit.

As for running the TPC-H on Postgres, you need a
default_statistics_target of at least 250. IIRC, the last time I
checked (on 8.3), you really needed a statistics target around
400-500. For the most part, the planner is choosing a bad plan for
several of the queries. After you resolve that, you'll quickly notice
that Postgres' buffer manager design and the lack of a good
multi-block read quickly comes into play. The hash join
implementation also has a couple issues which I've recently seen
mentioned in other threads.

Use DBT-3, it will save you quite a few headaches :)

--
Jonah H. Harris, Senior DBA
myYearbook.com