Re: [PERFORM] MySQL vs PG TPC-H benchmarks

Lists: pgsql-advocacypgsql-performance
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-advocacy(at)postgresql(dot)org
Subject: Fwd: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-21 17:31:31
Message-ID: 200404211031.31007.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

---------- Forwarded Message ----------

Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: Wed, 21 Apr 2004 13:55:21 +0100
From: Paul Thomas <paul(at)tmsl(dot)demon(dot)co(dot)uk>
To: Nick Barr <nicky(at)chuckie(dot)co(dot)uk>
Cc: "pgsql-performance @ postgresql . org" <pgsql-performance(at)postgresql(dot)org>

On 21/04/2004 09:31 Nick Barr wrote:
> Hi,
>
> Has anyone had a look at:
>
> http://people.ac.upc.es/zgomez/
>
> I realize that MySQL & PG cannot really be compared (especially when you
> consider the issues that MySQL has with things like data integrity) but
> still surely PG would perform better than the stats show (i.e. #7 31.28
> seconds versus 42 minutes!!!).

Looks like he's using the default postgresql.conf settings in which case
I'm not suprised at pg looking so slow. His stated use of foreign keys
invalidates the tests anyway as MyISAM tables don't support FKs so we're
probably seeing FK check overheads in pg that are simply ignore by MySQL.
In an honest test, MySQL should be reported as failing those tests.

Perhaps one of the advocay team will pick up the batton?

> On a side note it certainly looks like linux kernel 2.6 is quite a bit
> faster in comparision to 2.4.

Yes, I've seen other benchmarks which also show that.

--
Paul Thomas
+------------------------------+---------------------------------------------
+

| Thomas Micro Systems Limited | Software Solutions for

Business |

| Computer Consultants |

http://www.thomas-micro-systems-ltd.co.uk |
+------------------------------+---------------------------------------------
+

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)

-------------------------------------------------------

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-advocacy(at)postgresql(dot)org
Cc: pgsql-performance(at)postgreSQL(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-21 17:47:03
Message-ID: 200404211047.03808.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Folks,

I've sent a polite e-mail to Mr. Gomez offering our help. Please, nobody
flame him!

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-21 19:38:47
Message-ID: 4086CDC7.5000204@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Josh Berkus wrote:

> Folks,
>
> I've sent a polite e-mail to Mr. Gomez offering our help. Please, nobody
> flame him!
>

Please keep in mind that the entire test has, other than a similar
database schema and query types maybe, nothing to do with a TPC-H. I
don't see any kind of SUT. Foreign key support on the DB level is not
required by any of the TPC benchmarks. But the System Under Test, which
is the combination of middleware application and database together with
all computers and network components these parts are running on, must
implement all the required semantics, like ACID properties, referential
integrity &c. One could implement a TPC-H with flat files, it's just a
major pain in the middleware.

A proper TPC benchmark implementation would for example be a complete
PHP+DB application, where the user interaction is done by an emulated
"browser" and what is measured is the http response times, not anything
going on between PHP and the DB. Assuming that all requirements of the
TPC specification are implemented by either using available DB features,
or including appropriate workarounds in the PHP code, that would very
well lead to something that can compare PHP+MySQL vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by timing such a system
after a considerable rampup time, giving the DB system a chance to
properly populate caches and so forth. Rebooting the machine just before
the test is the wrong thing here and will especially kill any advanced
cache algorithms like ARC.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Eduardo Almeida <edalmeida(at)yahoo(dot)com>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 12:53:18
Message-ID: 20040422125318.26053.qmail@web60601.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Folks,

Im doing the 100GB TPC-H and Ill show the previous
results to our community (Postgres) in 3 weeks before
finishing the study.

My intention is to carry through a test with a VLDB in
a low cost platform (PostgreSQL, Linux and cheap HW)
and not to compare with another DBMS.

So far I can tell you that the load time on PG 7.4.2
with kernel 2.6.5 on Opteron 64 model 240 in RAID 0
with 8 disks (960 GB) loaded the database in less than
24 hours.
About 7hs:30min to load the data and 16:09:25 to
create the indexes

The Power test still running and thats why Ill not
present anything so far. Now Ill just send to the
list my environment configuration.

- The configuration of the machine is:
Dual opteron 64 bits model 240
4GB RAM
960 GB on RAID 0
Mandrake Linux 64 with Kernel 2.6.5 (I compiled a
kernel for this test)
Java SDK java version "1.4.2_04"
PostgreSQL JDBC pg74.1jdbc3.jar

- The TPC-H configuration is:
TPC-H 2.0.0
100GB
load using flat files
Refresh functions using java

- The PostgreSQL 7.4.2 configuration is:

add_missing_from | on
australian_timezones | off
authentication_timeout | 60
check_function_bodies | on
checkpoint_segments | 128
checkpoint_timeout | 300
checkpoint_warning | 30
client_encoding | SQL_ASCII
client_min_messages | notice
commit_delay | 0
commit_siblings | 5
cpu_index_tuple_cost | 0.001
cpu_operator_cost | 0.0025
cpu_tuple_cost | 0.01
DateStyle | ISO, MDY
db_user_namespace | off
deadlock_timeout | 1000
debug_pretty_print | off
debug_print_parse | off
debug_print_plan | off
debug_print_rewritten | off
default_statistics_target | 10
default_transaction_isolation | read committed
default_transaction_read_only | off
dynamic_library_path | $libdir
effective_cache_size | 150000
enable_hashagg | on
enable_hashjoin | on
enable_indexscan | on
enable_mergejoin | on
enable_nestloop | on
enable_seqscan | on
enable_sort | on
enable_tidscan | on
explain_pretty_print | on
extra_float_digits | 0
from_collapse_limit | 8
fsync | off
geqo | on
geqo_effort | 1
geqo_generations | 0
geqo_pool_size | 0
geqo_selection_bias | 2
geqo_threshold | 11
join_collapse_limit | 8
krb_server_keyfile | unset
lc_collate | en_US
lc_ctype | en_US
lc_messages | C
lc_monetary | C
lc_numeric | C
lc_time | C
log_connections | off
log_duration | off
log_error_verbosity | default
log_executor_stats | off
log_hostname | off
log_min_duration_statement | -1
log_min_error_statement | panic
log_min_messages | notice
log_parser_stats | off
log_pid | off
log_planner_stats | off
log_source_port | off
log_statement | off
log_statement_stats | off
log_timestamp | off
max_connections | 10
max_expr_depth | 10000
max_files_per_process | 1000
max_fsm_pages | 20000
max_fsm_relations | 1000
max_locks_per_transaction | 64
password_encryption | on
port | 5432
pre_auth_delay | 0
preload_libraries | unset
random_page_cost | 1.25
regex_flavor | advanced
rendezvous_name | unset
search_path | $user,public
server_encoding | SQL_ASCII
server_version | 7.4.2
shared_buffers | 40000
silent_mode | off
sort_mem | 65536
sql_inheritance | on
ssl | off
statement_timeout | 10000000
stats_block_level | off
stats_command_string | off
stats_reset_on_server_start | on
stats_row_level | off
stats_start_collector | on
superuser_reserved_connections | 2
syslog | 0
syslog_facility | LOCAL0
syslog_ident | postgres
tcpip_socket | on
TimeZone | unknown
trace_notify | off
transaction_isolation | read committed
transaction_read_only | off
transform_null_equals | off
unix_socket_directory | unset
unix_socket_group | unset
unix_socket_permissions | 511
vacuum_mem | 65536
virtual_host | unset
wal_buffers | 32
wal_debug | 0
wal_sync_method | fdatasync
zero_damaged_pages | off
(113 rows)

suggestions, doubts and commentaries are very welcome

regards
______________________________
Eduardo Cunha de Almeida
Administrao de Banco de Dados
UFPR - CCE
+55-41-361-3321
eduardo(dot)almeida(at)ufpr(dot)br
edalmeida(at)yahoo(dot)com

--- Jan Wieck <JanWieck(at)Yahoo(dot)com> wrote:
> Josh Berkus wrote:
>
> > Folks,
> >
> > I've sent a polite e-mail to Mr. Gomez offering
> our help. Please, nobody
> > flame him!
> >
>
> Please keep in mind that the entire test has, other
> than a similar
> database schema and query types maybe, nothing to do
> with a TPC-H. I
> don't see any kind of SUT. Foreign key support on
> the DB level is not
> required by any of the TPC benchmarks. But the
> System Under Test, which
> is the combination of middleware application and
> database together with
> all computers and network components these parts are
> running on, must
> implement all the required semantics, like ACID
> properties, referential
> integrity &c. One could implement a TPC-H with flat
> files, it's just a
> major pain in the middleware.
>
> A proper TPC benchmark implementation would for
> example be a complete
> PHP+DB application, where the user interaction is
> done by an emulated
> "browser" and what is measured is the http response
> times, not anything
> going on between PHP and the DB. Assuming that all
> requirements of the
> TPC specification are implemented by either using
> available DB features,
> or including appropriate workarounds in the PHP
> code, that would very
> well lead to something that can compare PHP+MySQL
> vs. PHP+PostgreSQL.
>
> All TPC benchmarks I have seen are performed by
> timing such a system
> after a considerable rampup time, giving the DB
> system a chance to
> properly populate caches and so forth. Rebooting the
> machine just before
> the test is the wrong thing here and will especially
> kill any advanced
> cache algorithms like ARC.
>
>
> Jan
>
> --
>
#======================================================================#
> # It's easier to get forgiveness for being wrong
> than for being right. #
> # Let's break this rule - forgive me.
> #
> #==================================================
> JanWieck(at)Yahoo(dot)com #
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>
> http://www.postgresql.org/docs/faqs/FAQ.htmlIP 5:
> Have you checked our extensive FAQ?
>
>
http://www.postgresql.org/docs/faqs/FAQ.html



__________________________________
Do you Yahoo!?
Yahoo! Photos: High-quality 4x6 digital prints for 25
http://photos.yahoo.com/ph/print_splash


From: Grega Bremec <grega(dot)bremec(at)noviforum(dot)si>
To: Eduardo Almeida <edalmeida(at)yahoo(dot)com>
Cc: Jan Wieck <JanWieck(at)yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 13:42:49
Message-ID: 20040422134249.GA14342@elbereth.noviforum.si
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

...and on Thu, Apr 22, 2004 at 05:53:18AM -0700, Eduardo Almeida used the keyboard:
>
> - The configuration of the machine is:
> Dual opteron 64 bits model 240
> 4GB RAM
> 960 GB on RAID 0
> Mandrake Linux 64 with Kernel 2.6.5 (I compiled a
> kernel for this test)
> Java SDK java version "1.4.2_04"
> PostgreSQL JDBC pg74.1jdbc3.jar
>
> - The TPC-H configuration is:
> TPC-H 2.0.0
> 100GB
> load using flat files
> Refresh functions using java
>

I'll just add for the reference, to those that aren't aware of it, the Java
virtual machine for x86_64 only exists in the 1.5 branch so far, and it's so
utterly unstable that most every notable shuffling around in the memory
crashes it. :)

Hence the 1.4.2_04 is a 32-bit application running in 32-bit mode.

I won't be getting into how much this affects the benchmarks as I didn't
really get into how CPU- and memory-intensive the refresh functions are in
these, so as I said - let's keep it a reference.

Cheers,
--
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/


From: Eduardo Almeida <edalmeida(at)yahoo(dot)com>
To: Grega Bremec <grega(dot)bremec(at)noviforum(dot)si>
Cc: Jan Wieck <JanWieck(at)yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 13:59:10
Message-ID: 20040422135910.24475.qmail@web60607.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Grega,

Thats why I used java 32bits and needed to compile
the kernel 2.6.5 with the 32bits modules.
To reference, Sun has java 64bits just to IA64 and
Solaris Sparc 64 not to Opteron.

regards,
Eduardo
--- Grega Bremec <grega(dot)bremec(at)noviforum(dot)si> wrote:
> ...and on Thu, Apr 22, 2004 at 05:53:18AM -0700,
> Eduardo Almeida used the keyboard:
> >
> > - The configuration of the machine is:
> > Dual opteron 64 bits model 240
> > 4GB RAM
> > 960 GB on RAID 0
> > Mandrake Linux 64 with Kernel 2.6.5 (I compiled a
> > kernel for this test)
> > Java SDK java version "1.4.2_04"
> > PostgreSQL JDBC pg74.1jdbc3.jar
> >
> > - The TPC-H configuration is:
> > TPC-H 2.0.0
> > 100GB
> > load using flat files
> > Refresh functions using java
> >
>
> I'll just add for the reference, to those that
> aren't aware of it, the Java
> virtual machine for x86_64 only exists in the 1.5
> branch so far, and it's so
> utterly unstable that most every notable shuffling
> around in the memory
> crashes it. :)
>
> Hence the 1.4.2_04 is a 32-bit application running
> in 32-bit mode.
>
> I won't be getting into how much this affects the
> benchmarks as I didn't
> really get into how CPU- and memory-intensive the
> refresh functions are in
> these, so as I said - let's keep it a reference.
>
> Cheers,
> --
> Grega Bremec
> Senior Administrator
> Noviforum Ltd., Software & Media
> http://www.noviforum.si/
>

> ATTACHMENT part 2 application/pgp-signature



__________________________________
Do you Yahoo!?
Yahoo! Photos: High-quality 4x6 digital prints for 25
http://photos.yahoo.com/ph/print_splash


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Eduardo Almeida <edalmeida(at)yahoo(dot)com>
Cc: Jan Wieck <JanWieck(at)yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 15:54:48
Message-ID: 7444.1082649288@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Eduardo Almeida <edalmeida(at)yahoo(dot)com> writes:
> About 7hs:30min to load the data and 16:09:25 to
> create the indexes

You could probably improve the index-create time by temporarily
increasing sort_mem. It wouldn't be unreasonable to give CREATE INDEX
several hundred meg to work in. (You don't want sort_mem that big
normally, because there may be many sorts happening in parallel,
but in a data-loading context there'll just be one active sort.)

regards, tom lane


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Eduardo Almeida <edalmeida(at)Yahoo(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 16:19:47
Message-ID: 4087F0A3.6040009@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Eduardo Almeida wrote:

> Folks,
>
> I’m doing the 100GB TPC-H and I’ll show the previous
> results to our community (Postgres) in 3 weeks before
> finishing the study.
>
> My intention is to carry through a test with a VLDB in
> a low cost platform (PostgreSQL, Linux and cheap HW)
> and not to compare with another DBMS.

QphH and Price/QphH will be enought for us to see where in the list we
are. Unfortunately there are only Sybase and MS SQL results published in
the 100 GB category. The 300 GB has DB2 as well. Oracle starts at 1 TB
and in the 10 TB category Oracle and DB2 are the only players left.

Jan

>
> So far I can tell you that the load time on PG 7.4.2
> with kernel 2.6.5 on Opteron 64 model 240 in RAID 0
> with 8 disks (960 GB) loaded the database in less than
> 24 hours.
> About 7hs:30min to load the data and 16:09:25 to
> create the indexes
>
> The Power test still running and that’s why I’ll not
> present anything so far. Now I´ll just send to the
> list my environment configuration.
>
> - The configuration of the machine is:
> Dual opteron 64 bits model 240
> 4GB RAM
> 960 GB on RAID 0
> Mandrake Linux 64 with Kernel 2.6.5 (I compiled a
> kernel for this test)
> Java SDK java version "1.4.2_04"
> PostgreSQL JDBC pg74.1jdbc3.jar
>
> - The TPC-H configuration is:
> TPC-H 2.0.0
> 100GB
> load using flat files
> Refresh functions using java
>
> - The PostgreSQL 7.4.2 configuration is:
>
> add_missing_from | on
> australian_timezones | off
> authentication_timeout | 60
> check_function_bodies | on
> checkpoint_segments | 128
> checkpoint_timeout | 300
> checkpoint_warning | 30
> client_encoding | SQL_ASCII
> client_min_messages | notice
> commit_delay | 0
> commit_siblings | 5
> cpu_index_tuple_cost | 0.001
> cpu_operator_cost | 0.0025
> cpu_tuple_cost | 0.01
> DateStyle | ISO, MDY
> db_user_namespace | off
> deadlock_timeout | 1000
> debug_pretty_print | off
> debug_print_parse | off
> debug_print_plan | off
> debug_print_rewritten | off
> default_statistics_target | 10
> default_transaction_isolation | read committed
> default_transaction_read_only | off
> dynamic_library_path | $libdir
> effective_cache_size | 150000
> enable_hashagg | on
> enable_hashjoin | on
> enable_indexscan | on
> enable_mergejoin | on
> enable_nestloop | on
> enable_seqscan | on
> enable_sort | on
> enable_tidscan | on
> explain_pretty_print | on
> extra_float_digits | 0
> from_collapse_limit | 8
> fsync | off
> geqo | on
> geqo_effort | 1
> geqo_generations | 0
> geqo_pool_size | 0
> geqo_selection_bias | 2
> geqo_threshold | 11
> join_collapse_limit | 8
> krb_server_keyfile | unset
> lc_collate | en_US
> lc_ctype | en_US
> lc_messages | C
> lc_monetary | C
> lc_numeric | C
> lc_time | C
> log_connections | off
> log_duration | off
> log_error_verbosity | default
> log_executor_stats | off
> log_hostname | off
> log_min_duration_statement | -1
> log_min_error_statement | panic
> log_min_messages | notice
> log_parser_stats | off
> log_pid | off
> log_planner_stats | off
> log_source_port | off
> log_statement | off
> log_statement_stats | off
> log_timestamp | off
> max_connections | 10
> max_expr_depth | 10000
> max_files_per_process | 1000
> max_fsm_pages | 20000
> max_fsm_relations | 1000
> max_locks_per_transaction | 64
> password_encryption | on
> port | 5432
> pre_auth_delay | 0
> preload_libraries | unset
> random_page_cost | 1.25
> regex_flavor | advanced
> rendezvous_name | unset
> search_path | $user,public
> server_encoding | SQL_ASCII
> server_version | 7.4.2
> shared_buffers | 40000
> silent_mode | off
> sort_mem | 65536
> sql_inheritance | on
> ssl | off
> statement_timeout | 10000000
> stats_block_level | off
> stats_command_string | off
> stats_reset_on_server_start | on
> stats_row_level | off
> stats_start_collector | on
> superuser_reserved_connections | 2
> syslog | 0
> syslog_facility | LOCAL0
> syslog_ident | postgres
> tcpip_socket | on
> TimeZone | unknown
> trace_notify | off
> transaction_isolation | read committed
> transaction_read_only | off
> transform_null_equals | off
> unix_socket_directory | unset
> unix_socket_group | unset
> unix_socket_permissions | 511
> vacuum_mem | 65536
> virtual_host | unset
> wal_buffers | 32
> wal_debug | 0
> wal_sync_method | fdatasync
> zero_damaged_pages | off
> (113 rows)
>
>
> suggestions, doubts and commentaries are very welcome
>
> regards
> ______________________________
> Eduardo Cunha de Almeida
> Administração de Banco de Dados
> UFPR - CCE
> +55-41-361-3321
> eduardo(dot)almeida(at)ufpr(dot)br
> edalmeida(at)yahoo(dot)com
>
> --- Jan Wieck <JanWieck(at)Yahoo(dot)com> wrote:
>> Josh Berkus wrote:
>>
>> > Folks,
>> >
>> > I've sent a polite e-mail to Mr. Gomez offering
>> our help. Please, nobody
>> > flame him!
>> >
>>
>> Please keep in mind that the entire test has, other
>> than a similar
>> database schema and query types maybe, nothing to do
>> with a TPC-H. I
>> don't see any kind of SUT. Foreign key support on
>> the DB level is not
>> required by any of the TPC benchmarks. But the
>> System Under Test, which
>> is the combination of middleware application and
>> database together with
>> all computers and network components these parts are
>> running on, must
>> implement all the required semantics, like ACID
>> properties, referential
>> integrity &c. One could implement a TPC-H with flat
>> files, it's just a
>> major pain in the middleware.
>>
>> A proper TPC benchmark implementation would for
>> example be a complete
>> PHP+DB application, where the user interaction is
>> done by an emulated
>> "browser" and what is measured is the http response
>> times, not anything
>> going on between PHP and the DB. Assuming that all
>> requirements of the
>> TPC specification are implemented by either using
>> available DB features,
>> or including appropriate workarounds in the PHP
>> code, that would very
>> well lead to something that can compare PHP+MySQL
>> vs. PHP+PostgreSQL.
>>
>> All TPC benchmarks I have seen are performed by
>> timing such a system
>> after a considerable rampup time, giving the DB
>> system a chance to
>> properly populate caches and so forth. Rebooting the
>> machine just before
>> the test is the wrong thing here and will especially
>> kill any advanced
>> cache algorithms like ARC.
>>
>>
>> Jan
>>
>> --
>>
> #======================================================================#
>> # It's easier to get forgiveness for being wrong
>> than for being right. #
>> # Let's break this rule - forgive me.
>> #
>> #==================================================
>> JanWieck(at)Yahoo(dot)com #
>>
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 5: Have you checked our extensive FAQ?
>>
>>
>> http://www.postgresql.org/docs/faqs/FAQ.htmlIP 5:
>> Have you checked our extensive FAQ?
>>
>>
> http://www.postgresql.org/docs/faqs/FAQ.html
>
>
>
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Photos: High-quality 4x6 digital prints for 25¢
> http://photos.yahoo.com/ph/print_splash

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Eduardo Almeida <edalmeida(at)yahoo(dot)com>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 17:10:34
Message-ID: 20040422171034.91737.qmail@web60607.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Folks,

I forgot to mention that I used Shell scripts to load
the data and use Java just to run the refresh
functions.

Talking about sort_mem config, I used 65000 but in the
TPCH specification they said that you are not able to
change the configs when you start the benchmark, is
that a big problem to use 65000? In the TPCH 100GB we
run 5 streams in parallel for the throughput test! To
power test I think is not a problem because it runs
one query after another.

Another thing is that I put statement_timeout =
10000000

Some queries may exceed this timeout and Ill send the
EXPLAIN for this ones.

The last thing is that Jan forgets to mention that
Teradata doesnt show up now but in older lists shows
3TB and 10TB results.

regards
Eduardo
--- Jan Wieck <JanWieck(at)Yahoo(dot)com> wrote:
> Eduardo Almeida wrote:
>
> > Folks,
> >
> > Im doing the 100GB TPC-H and Ill show the
> previous
> > results to our community (Postgres) in 3 weeks
> before
> > finishing the study.
> >
> > My intention is to carry through a test with a
> VLDB in
> > a low cost platform (PostgreSQL, Linux and cheap
> HW)
> > and not to compare with another DBMS.
>
> QphH and Price/QphH will be enought for us to see
> where in the list we
> are. Unfortunately there are only Sybase and MS SQL
> results published in
> the 100 GB category. The 300 GB has DB2 as well.
> Oracle starts at 1 TB
> and in the 10 TB category Oracle and DB2 are the
> only players left.
>
>
> Jan
>
> >
> > So far I can tell you that the load time on PG
> 7.4.2
> > with kernel 2.6.5 on Opteron 64 model 240 in RAID
> 0
> > with 8 disks (960 GB) loaded the database in less
> than
> > 24 hours.
> > About 7hs:30min to load the data and 16:09:25 to
> > create the indexes
> >
> > The Power test still running and thats why Ill
> not
> > present anything so far. Now Ill just send to the
> > list my environment configuration.
> >
> > - The configuration of the machine is:
> > Dual opteron 64 bits model 240
> > 4GB RAM
> > 960 GB on RAID 0
> > Mandrake Linux 64 with Kernel 2.6.5 (I compiled a
> > kernel for this test)
> > Java SDK java version "1.4.2_04"
> > PostgreSQL JDBC pg74.1jdbc3.jar
> >
> > - The TPC-H configuration is:
> > TPC-H 2.0.0
> > 100GB
> > load using flat files
> > Refresh functions using java
> >
> > - The PostgreSQL 7.4.2 configuration is:
> >
> > add_missing_from | on
> > australian_timezones | off
> > authentication_timeout | 60
> > check_function_bodies | on
> > checkpoint_segments | 128
> > checkpoint_timeout | 300
> > checkpoint_warning | 30
> > client_encoding | SQL_ASCII
> > client_min_messages | notice
> > commit_delay | 0
> > commit_siblings | 5
> > cpu_index_tuple_cost | 0.001
> > cpu_operator_cost | 0.0025
> > cpu_tuple_cost | 0.01
> > DateStyle | ISO, MDY
> > db_user_namespace | off
> > deadlock_timeout | 1000
> > debug_pretty_print | off
> > debug_print_parse | off
> > debug_print_plan | off
> > debug_print_rewritten | off
> > default_statistics_target | 10
> > default_transaction_isolation | read committed
> > default_transaction_read_only | off
> > dynamic_library_path | $libdir
> > effective_cache_size | 150000
> > enable_hashagg | on
> > enable_hashjoin | on
> > enable_indexscan | on
> > enable_mergejoin | on
> > enable_nestloop | on
> > enable_seqscan | on
> > enable_sort | on
> > enable_tidscan | on
> > explain_pretty_print | on
> > extra_float_digits | 0
> > from_collapse_limit | 8
> > fsync | off
> > geqo | on
> > geqo_effort | 1
> > geqo_generations | 0
> > geqo_pool_size | 0
> > geqo_selection_bias | 2
> > geqo_threshold | 11
> > join_collapse_limit | 8
> > krb_server_keyfile | unset
> > lc_collate | en_US
> > lc_ctype | en_US
> > lc_messages | C
> > lc_monetary | C
> > lc_numeric | C
> > lc_time | C
> > log_connections | off
> > log_duration | off
> > log_error_verbosity | default
> > log_executor_stats | off
> > log_hostname | off
> > log_min_duration_statement | -1
> > log_min_error_statement | panic
> > log_min_messages | notice
> > log_parser_stats | off
> > log_pid | off
> > log_planner_stats | off
> > log_source_port | off
> > log_statement | off
> > log_statement_stats | off
> > log_timestamp | off
> > max_connections | 10
> > max_expr_depth | 10000
> > max_files_per_process | 1000
> > max_fsm_pages | 20000
> > max_fsm_relations | 1000
> > max_locks_per_transaction | 64
> > password_encryption | on
> > port | 5432
> > pre_auth_delay | 0
> > preload_libraries | unset
> > random_page_cost | 1.25
> > regex_flavor | advanced
> > rendezvous_name | unset
> > search_path | $user,public
> > server_encoding | SQL_ASCII
> > server_version | 7.4.2
> > shared_buffers | 40000
> > silent_mode | off
> > sort_mem | 65536
> > sql_inheritance | on
> > ssl | off
> > statement_timeout | 10000000
> > stats_block_level | off
> > stats_command_string | off
> > stats_reset_on_server_start | on
> > stats_row_level | off
> > stats_start_collector | on
> > superuser_reserved_connections | 2
> > syslog | 0
> > syslog_facility | LOCAL0
> > syslog_ident | postgres
> > tcpip_socket | on
> > TimeZone | unknown
> > trace_notify | off
> > transaction_isolation | read committed
> > transaction_read_only | off
> > transform_null_equals | off
> > unix_socket_directory | unset
> > unix_socket_group | unset
> > unix_socket_permissions | 511
> > vacuum_mem | 65536
> > virtual_host | unset
> > wal_buffers | 32
> > wal_debug | 0
> > wal_sync_method | fdatasync
> > zero_damaged_pages | off
> > (113 rows)
> >
> >
> > suggestions, doubts and commentaries are very
> welcome
> >
> > regards
> > ______________________________
> > Eduardo Cunha de Almeida
> > Administrao de Banco de Dados
> > UFPR - CCE
> > +55-41-361-3321
> > eduardo(dot)almeida(at)ufpr(dot)br
> > edalmeida(at)yahoo(dot)com
> >
> > --- Jan Wieck <JanWieck(at)Yahoo(dot)com> wrote:
> >> Josh Berkus wrote:
> >>
> >> > Folks,
> >> >
> >> > I've sent a polite e-mail to Mr. Gomez offering
> >> our help. Please, nobody
> >> > flame him!
> >> >
> >>
> >> Please keep in mind that the entire test has,
> other
>
=== message truncated ===



__________________________________
Do you Yahoo!?
Yahoo! Photos: High-quality 4x6 digital prints for 25
http://photos.yahoo.com/ph/print_splash


From: Markus Bertheau <twanger(at)bluetwanger(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Eduardo Almeida <edalmeida(at)yahoo(dot)com>, Jan Wieck <JanWieck(at)yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 18:20:47
Message-ID: 1082658047.4524.0.camel@yarrow.bertheau.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

В Чтв, 22.04.2004, в 17:54, Tom Lane пишет:
> Eduardo Almeida <edalmeida(at)yahoo(dot)com> writes:
> > About 7hs:30min to load the data and 16:09:25 to
> > create the indexes
>
> You could probably improve the index-create time by temporarily
> increasing sort_mem. It wouldn't be unreasonable to give CREATE INDEX
> several hundred meg to work in. (You don't want sort_mem that big
> normally, because there may be many sorts happening in parallel,
> but in a data-loading context there'll just be one active sort.)

Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

--
Markus Bertheau <twanger(at)bluetwanger(dot)de>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Bertheau <twanger(at)bluetwanger(dot)de>
Cc: Eduardo Almeida <edalmeida(at)yahoo(dot)com>, Jan Wieck <JanWieck(at)yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks
Date: 2004-04-22 19:22:51
Message-ID: 9413.1082661771@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

Markus Bertheau <twanger(at)bluetwanger(dot)de> writes:
>> You could probably improve the index-create time by temporarily
>> increasing sort_mem. It wouldn't be unreasonable to give CREATE INDEX
>> several hundred meg to work in. (You don't want sort_mem that big
>> normally, because there may be many sorts happening in parallel,
>> but in a data-loading context there'll just be one active sort.)

> Doesn't this provide a reason for CREATE INDEX not to honour sort_mem?

Already done for 7.5.

http://archives.postgresql.org/pgsql-committers/2004-02/msg00025.php

regards, tom lane


From: Grega Bremec <grega(dot)bremec(at)noviforum(dot)si>
To: Eduardo Almeida <edalmeida(at)yahoo(dot)com>
Cc: Jan Wieck <JanWieck(at)yahoo(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] MySQL vs PG TPC-H benchmarks
Date: 2004-05-06 08:59:13
Message-ID: 20040506085913.GA4102@elbereth.noviforum.si
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy pgsql-performance

...and on Thu, Apr 22, 2004 at 06:59:10AM -0700, Eduardo Almeida used the keyboard:

<snip>
>
> To reference, Sun has java 64bits just to IA64 and
> Solaris Sparc 64 not to Opteron.
>

As I mentioned, that is true for the 1.4.x release of the JVMs. We have been
testing some JCA builds of 1.5.0 on x86_64 so far, but it is too unstable for
any kind of serious work.

Cheers,
--
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/