Re: URGENT HELP about 'duration' stats

Lists: pgsql-hackers
From: Camilo Porto <camiloporto(at)hotmail(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: URGENT HELP about 'duration' stats
Date: 2007-10-27 02:46:42
Message-ID: BLU111-W521DC75D94C33D2F5034C0BC970@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I
have been doing my master degree research and I am using Postgresql
8.2.5. I would appreciate if anyone could help me, ASAP, concerning the
following question:

How can I obtain the time spent by the Postgresql EXECUTOR during a given time interval?

I am trying to get the above information in the following way::

Using
the BenchmarkSQL Software, I submit random transactions (by simulating
the TPC-C benchmark) during a time interval (say 30 seconds).

PostgreSQL then generates statistics for each phase that
a SQL Command is executed. In thje following, it is shown the time
duration of the EXECUTE Phase of some SQL Command:

.
.
.
.
2007-10-11 15:49:
05.221 BRT [46994] LOG: duration: 0.012 ms execute S_1: BEGIN
2007-10-11
15:49:05.430 BRT [46994] LOG: duration: 94.073 ms execute
<unnamed>: SELECT c_discount, c_last, c_credit, w_tax FROM
customer, warehouse WHERE

w_id = $1 AND w_id = c_w_id AND c_d_id = $2 AND c_id = $3
2007-10-11
15:49:05.462 BRT [46994] LOG: duration: 13.662 ms execute
<unnamed>: SELECT d_next_o_id, d_tax FROM district WHERE d_id =
$1 AND d_w_id = $2 FO

R UPDATE
2007-10-11 15:49:05.562 BRT [46994] LOG: duration:
98.818 ms execute <unnamed>: INSERT INTO NEW_ORDER (no_o_id,
no_d_id, no_w_id) VALUES ( $1, $2, $3)
2007-10-11 15:49:05.579 BRT [46994] LOG: duration:
15.459 ms execute <unnamed>: UPDATE district SET d_next_o_id = d_next_o_id + 1 WHERE d_id = $1 AND d_w
_id = $2
2007-10-11
15:49:05.748 BRT [46994] LOG: duration: 122.387 ms execute
<unnamed>: INSERT INTO OORDER (o_id, o_d_id, o_w_id, o_c_id,
o_entry_d, o_ol_cnt, o

_all_local) VALUES ($1, $2, $3, $4, $5, $6, $7)
2007-10-11
15:49:05.779 BRT [46994] LOG: duration: 29.787 ms execute
<unnamed>: SELECT i_price, i_name , i_data FROM item WHERE i_id =
$1
2007-10-11 15:49:
05.846 BRT [46994] LOG: duration: 64.963 ms execute <unnamed>:
SELECT s_quantity, s_data, s_dist_01, s_dist_02, s_dist_03, s_dist_04,
s_d
ist_05, s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10 FROM stock WHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE

2007-10-11 15:49:05.882 BRT [46994] LOG: duration: 33.865 ms
execute <unnamed>: SELECT i_price, i_name , i_data FROM item
WHERE i_id = $1
2007-10-11 15:49:05.931 BRT [46994] LOG:
duration: 48.075 ms execute <unnamed>: SELECT s_quantity,
s_data, s_dist_01, s_dist_02, s_dist_03, s_dist_04, s_d

ist_05, s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10 FROM stock WHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE
2007-10-11
15:49:05.962 BRT [46994] LOG: duration: 28.782 ms execute
<unnamed>: SELECT i_price, i_name , i_data FROM item WHERE i_id =
$1

2007-10-11
15:49:06.002 BRT [46994] LOG: duration: 38.026 ms execute
<unnamed>: SELECT s_quantity, s_data, s_dist_01, s_dist_02,
s_dist_03, s_dist_04, s_d
ist_05, s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10 FROM stock WHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE

2007-10-11 15:49:06.027 BRT [46994] LOG: duration: 23.220 ms
execute <unnamed>: SELECT i_price, i_name , i_data FROM item
WHERE i_id = $1
2007-10-11 15:49:06.043 BRT [46994] LOG:
duration: 14.308 ms execute <unnamed>: SELECT s_quantity,
s_data, s_dist_01, s_dist_02, s_dist_03, s_dist_04, s_d

ist_05, s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10 FROM stock WHERE s_i_id = $1 AND s_w_id = $2 FOR UPDATE
.
.
.
.

Trying to answer my initial question, I am doing something like this:

1. Through the statements TIMESTAMP, I calculate the total time
interval in which all the statments had been executed: (END
TIMESTAMP(2007-10-11 15:49: 06.043) - BEGIN TIMESTAMP (2007-10-11
15:49:05.221))
2. Then I sum all the 'duration' fields of all
EXECUTOR Sections of all statements executed. This, in thesis, give me
the amount of time that the executor had been used during the total
time interval in which the statements had been executed)

The problem I have encountered is that the sum of executor's
duration time is, *sometimes*, bigger than the total time interval in
which the statements had been executed!! And this makes no sense!

In some tests I have done, I have found these results:
- Total Time interval which BenchmarkSQL submitted queries = 30 seconds

- sum of executor's duration time during this period of time = 2 minutes and 36 seconds!! (5x longer than total time)

How could the sum of duration time of executor be longer than the sum of duration of the statements that use it?

Can anyone, please, help me to solve this? Many thanks, Camilo
_________________________________________________________________
Veja mapas e encontre as melhores rotas para fugir do trânsito com o Live Search Maps!
http://www.livemaps.com.br/index.aspx?tr=true


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Camilo Porto <camiloporto(at)hotmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-10-27 03:06:22
Message-ID: 6193.1193454382@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Camilo Porto <camiloporto(at)hotmail(dot)com> writes:
> The problem I have encountered is that the sum of executor's
> duration time is, *sometimes*, bigger than the total time interval in
> which the statements had been executed!! And this makes no sense!

Umm ... why not? If you have, say, two queries executing in parallel
for 1 second, they'll each report a duration: of 1 second, thus summing
to 2 seconds, but the elapsed time was only 1 second.

If you don't see that always, then your benchmark program isn't trying
very hard to run more than one query in parallel ...

regards, tom lane


From: Camilo Porto <camiloporto(at)hotmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-10-27 14:10:06
Message-ID: BLU111-W4F245D9D14469DA3F8AA8BC970@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[Camilo Porto]

> To: camiloporto(at)hotmail(dot)com
> CC: pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] URGENT HELP about 'duration' stats
> Date: Fri, 26 Oct 2007 23:06:22 -0400
> From: tgl(at)sss(dot)pgh(dot)pa(dot)us
>
> Camilo Porto <camiloporto(at)hotmail(dot)com> writes:
> > The problem I have encountered is that the sum of executor's
> > duration time is, *sometimes*, bigger than the total time interval in
> > which the statements had been executed!! And this makes no sense!
>
> Umm ... why not? If you have, say, two queries executing in parallel
> for 1 second, they'll each report a duration: of 1 second, thus summing
> to 2 seconds, but the elapsed time was only 1 second.
>
> If you don't see that always, then your benchmark program isn't trying
> very hard to run more than one query in parallel ...

This really make sense, but let me add some questions:

The parallelism happens even if my PC has only one processor?
Each query is executed in a separeted Thread?
I am simulating only 1 client with the Benchmark. Can 1 Client submit parallel queries, in single-processor enviroment?

Many Thanks
>
> regards, tom lane

_________________________________________________________________
Veja mapas e encontre as melhores rotas para fugir do trânsito com o Live Search Maps!
http://www.livemaps.com.br/index.aspx?tr=true


From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: "Camilo Porto" <camiloporto(at)hotmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-10-27 17:12:28
Message-ID: 92869e660710271012k560de08et84844a6503315520@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2007/10/27, Camilo Porto <camiloporto(at)hotmail(dot)com>:

> The parallelism happens even if my PC has only one processor?

PostgreSQL creates a separate process for each client connection -
whether you have one CPU or more.

> Each query is executed in a separeted Thread?
No threads, at least on Unix. Regular processes (sometimes called backends).

> I am simulating only 1 client with the Benchmark. Can 1 Client submit
> parallel queries, in single-processor enviroment?
If this client uses two connections, you can run two queries in paralell.

this chapter of documentation can be helpful:
http://www.postgresql.org/docs/8.2/static/overview.html


From: Camilo Porto <camiloporto(at)hotmail(dot)com>
To: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-10-30 14:16:39
Message-ID: BLU111-W11F14CA3300A0720458C25BC920@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[Camilo Porto]

> Date: Sat, 27 Oct 2007 18:12:28 +0100
> From: plk(dot)zuber(at)gmail(dot)com
> To: camiloporto(at)hotmail(dot)com
> Subject: Re: [HACKERS] URGENT HELP about 'duration' stats
> CC: pgsql-hackers(at)postgresql(dot)org
>
> 2007/10/27, Camilo Porto <camiloporto(at)hotmail(dot)com>:
>
> > The parallelism happens even if my PC has only one processor?
>
> PostgreSQL creates a separate process for each client connection -
> whether you have one CPU or more.
>
> > Each query is executed in a separeted Thread?
> No threads, at least on Unix. Regular processes (sometimes called backends).
>
> > I am simulating only 1 client with the Benchmark. Can 1 Client submit
> > parallel queries, in single-processor enviroment?
> If this client uses two connections, you can run two queries in paralell.
The client uses only 1 connection. In this situation is possible that the EXECUTOR's duration time become greater than the time period which the Query was observed? (as stated in my first topic)?
>
>
> this chapter of documentation can be helpful:
> http://www.postgresql.org/docs/8.2/static/overview.html

_________________________________________________________________
Receba GRÁTIS as mensagens do Messenger no seu celular quando você estiver offline. Conheça o MSN Mobile!
http://mobile.live.com/signup/signup2.aspx?lc=pt-br


From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: "Camilo Porto" <camiloporto(at)hotmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-10-30 16:09:01
Message-ID: 92869e660710300909n4ce1551am9ac105ab9cbea28f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2007/10/30, Camilo Porto <camiloporto(at)hotmail(dot)com>:

> > > I am simulating only 1 client with the Benchmark. Can 1 Client submit
> > > parallel queries, in single-processor enviroment?
> > If this client uses two connections, you can run two queries in paralell.
> The client uses only 1 connection. In this situation is possible that
> the EXECUTOR's duration time become greater than the time period which the
> Query was observed? (as stated in my first topic)?

I guess it's possible under some circumstances...
2007-10-30 16:07:00 GMT [123] LOG: duration: 99000.000 ms select longfunc()
2007-10-30 16:07:01 GMT [123] LOG: duration: 1000.000 ms select shortfunc()
interval is 1 second, sum of durations 100 seconds :)

AFAIK, timestamps in the front of each line are assigned by log
writer, ie. *in the moment of writing* to the log. I'd better trust
"duration: xxx ms " messages. they are calculated in backend directly.

In this log sample you showed us, the sum of durations is circa 625
ms. and the interval between first and last log entry is circa 822 ms.
If you have a test case which shows that much difference you speak of,
could you please present it here, along with your logging settings?

--
Filip Rembiałkowski


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Camilo Porto <camiloporto(at)hotmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-10-30 21:04:00
Message-ID: 20071030210400.GA2568@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Camilo Porto wrote:

> In some tests I have done, I have found these results:
> - Total Time interval which BenchmarkSQL submitted queries = 30 seconds
>
> - sum of executor's duration time during this period of time = 2
> minutes and 36 seconds!! (5x longer than total time)

Are you sure you're not confusing the units? The log you posted shows
milliseconds ...

Another thought is that some queries may count the time more than once.
For example if you submit an UPDATE that acts on a table with foreign
keys, and then the SELECT FOR UPDATE that the FK code does internally is
also logged, I think the UPDATE would include the time consumed in the
SELECT FOR UPDATE. So if you just sum both, the total obtained would
include the SELECT FOR UPDATE twice. This is just a wild guess; I'm not
sure if the queries for FKs checking are logged.

--
Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton."


From: Camilo Porto <camiloporto(at)hotmail(dot)com>
To: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: URGENT HELP about 'duration' stats
Date: 2007-11-01 15:49:32
Message-ID: BLU111-W5283EBA5222878EBF17393BC8C0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[Camilo Porto]

> Date: Tue, 30 Oct 2007 16:09:01 +0000
> From: plk(dot)zuber(at)gmail(dot)com
> To: camiloporto(at)hotmail(dot)com
> Subject: Re: [HACKERS] URGENT HELP about 'duration' stats
> CC: pgsql-hackers(at)postgresql(dot)org
>
> 2007/10/30, Camilo Porto <camiloporto(at)hotmail(dot)com>:
>
> > > > I am simulating only 1 client with the Benchmark. Can 1 Client submit
> > > > parallel queries, in single-processor enviroment?
> > > If this client uses two connections, you can run two queries in paralell.
> > The client uses only 1 connection. In this situation is possible that
> > the EXECUTOR's duration time become greater than the time period which the
> > Query was observed? (as stated in my first topic)?
>
> I guess it's possible under some circumstances...
> 2007-10-30 16:07:00 GMT [123] LOG: duration: 99000.000 ms select longfunc()
> 2007-10-30 16:07:01 GMT [123] LOG: duration: 1000.000 ms select shortfunc()
> interval is 1 second, sum of durations 100 seconds :)
>
>
> AFAIK, timestamps in the front of each line are assigned by log
> writer, ie. *in the moment of writing* to the log. I'd better trust
> "duration: xxx ms " messages. they are calculated in backend directly.
>
>
> In this log sample you showed us, the sum of durations is circa 625
> ms. and the interval between first and last log entry is circa 822 ms.
> If you have a test case which shows that much difference you speak of,
> could you please present it here, along with your logging settings?

I tried to generate a log file that could present the above situation, but was not sucessfully! This case happens when I wa using version 7.4 os PostgreSQL. Using 8.2.5, the problem (with 1 client and 1 connection) appear not to happen anymore! I've just tried with more than one client (1 connection per client) and the problem turn to appear. But now this make sense (considering the comments submitted to this topic).

Well, I will think about how can I calculate the % of time the executor passing executing queries along a period of time.

I would be grateful if someone could help me to resolve this...

Many Thanks!!!

Camilo
>
>
>
> --
> Filip Rembiałkowski

_________________________________________________________________
Conheça o Windows Live Spaces, a rede de relacionamentos conectada ao Messenger!
http://spaces.live.com/signup.aspx