Re: very slow when writing query to file

From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: pgadmin-support(at)postgresql(dot)org
Subject: very slow when writing query to file
Date: 2011-10-26 12:57:15
Message-ID: 4EA803AB.4090709@wsl.ch
Lists: pgadmin-support


I have a PostgreSQL 8.3 database on a server. When querying the data
with the pgAdmin SQL editor, I get an answer in about 10 s for
100'000 rows. When I press the button to execute the query to a
file, it takes more than 1 hour for the query results to be saved (it
writes about 10 MB in 45 minutes).

Am I doing something wrong?
Are there some parameters to set?

This happens with version 1.14 of pgAdmin, but it was also very slow
with previous versions (2-3 years ago).

OS: Arch Linux (but I had the same problem 1-2 years ago with Ubuntu)

Thank you,
Boris


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-10-28 17:58:18
Message-ID: 1319824698.2119.5.camel@localhost.localdomain
Lists: pgadmin-support

On Wed, 2011-10-26 at 14:57 +0200, boris pezzatti wrote:
> I have a postgresql database 8.3 on a server. When querying the data
> with the pgAdmin sql editor, I can get an answer in about 10 s, for
> 100'000 rows. When I'm pressing the button to execute the query to a
> file it takes more than 1 hour to get the query results saved (writes
> about 10 MB in 45 minutes).
>
> Am I doing something wrong?
> Are there some parameters to set?
>

Never heard of such a behaviour before. I'm not sure what happens on
your computer.

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-10-29 20:44:25
Message-ID: 4EAC65A9.4070208@wsl.ch
Lists: pgadmin-support

Does anyone have a suggestion for how I could further test where the
problem is? This is really strange behaviour, which I have noticed with
different versions of pgAdmin.
Could a firewall make a difference between querying data visually and
writing it to a file? (It should not ...)

Thank you,
Boris

Guillaume Lelarge wrote:
> On Wed, 2011-10-26 at 14:57 +0200, boris pezzatti wrote:
>> I have a postgresql database 8.3 on a server. When querying the data
>> with the pgAdmin sql editor, I can get an answer in about 10 s, for
>> 100'000 rows. When I'm pressing the button to execute the query to a
>> file it takes more than 1 hour to get the query results saved (writes
>> about 10 MB in 45 minutes).
>>
>> Am I doing something wrong?
>> Are there some parameters to set?
>>
> Never heard of such a behaviour before. I'm not sure what happens on
> your computer.
>
>


From: Francisco Leovey <fleovey(at)yahoo(dot)com>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-10-29 21:01:43
Message-ID: 1319922103.52174.YahooMailNeo@web39310.mail.mud.yahoo.com
Lists: pgadmin-support

Is the file where you write the query output located on the same server as the DB?

________________________________
From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: pgadmin-support(at)postgresql(dot)org
Sent: Saturday, October 29, 2011 5:44 PM
Subject: Re: [pgadmin-support] very slow when writing query to file

Does anyone have any proposition how I could further test where the problem is. This is really a strange behaviour, that I noticed with different versions of pgAdmin.
Could a firewall produce a difference when querying data visually or for a write to file? (should not ...)

Thank you,
Boris

Guillaume Lelarge wrote:
> On Wed, 2011-10-26 at 14:57 +0200, boris pezzatti wrote:
>> I have a postgresql database 8.3 on a server. When querying the data
>> with the pgAdmin sql editor, I can get an answer in about 10 s, for
>> 100'000 rows. When I'm pressing the button to execute the query to a
>> file it takes more than 1 hour to get the query results saved (writes
>> about 10 MB in 45 minutes).
>>
>> Am I doing something wrong?
>> Are there some parameters to set?
>>
> Never heard of such a behaviour before. I'm not sure what happens on
> your computer.
>
>



From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: Francisco Leovey <fleovey(at)yahoo(dot)com>
Cc: "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-10-30 21:28:49
Message-ID: 4EADC191.8050102@wsl.ch
Lists: pgadmin-support

No, actually it is on my machine ...

On 10/29/2011 11:01 PM, Francisco Leovey wrote:
> Is the file where you write the query output located on the same
> server as the DB?
>
> ------------------------------------------------------------------------
> *From:* boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
> *To:* Guillaume Lelarge <guillaume(at)lelarge(dot)info>
> *Cc:* pgadmin-support(at)postgresql(dot)org
> *Sent:* Saturday, October 29, 2011 5:44 PM
> *Subject:* Re: [pgadmin-support] very slow when writing query to file
>
> Does anyone have any proposition how I could further test where the
> problem is. This is really a strange behaviour, that I noticed with
> different versions of pgAdmin.
> Could a firewall produce a difference when querying data visually or
> for a write to file? (should not ...)
>
> Thank you,
> Boris
>
>
> Guillaume Lelarge wrote:
> > On Wed, 2011-10-26 at 14:57 +0200, boris pezzatti wrote:
> >> I have a postgresql database 8.3 on a server. When querying the data
> >> with the pgAdmin sql editor, I can get an answer in about 10 s, for
> >> 100'000 rows. When I'm pressing the button to execute the query to a
> >> file it takes more than 1 hour to get the query results saved (writes
> >> about 10 MB in 45 minutes).
> >>
> >> Am I doing something wrong?
> >> Are there some parameters to set?
> >>
> > Never heard of such a behaviour before. I'm not sure what happens on
> > your computer.
> >
> >
>
>
>
>

--
Boris Pezzatti
Swiss Federal Research Institute WSL
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland
phone direct +41 91 821 52 32
phone +41 91 821 52 30
fax +41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: Francisco Leovey <fleovey(at)yahoo(dot)com>
Cc: "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-10-31 13:04:25
Message-ID: 4EAE9CD9.5010602@wsl.ch
Lists: pgadmin-support

Actually I do not have access to the server file system.
What I cannot understand is:
1) if I can execute the query and see the data in the table viewer, the
data are already sent over the network;
2) how does the execute-to-file command differ from point 1)? Is each
row retrieved separately and written to the file location, or is it
written first to a temporary location and then copied?

Maybe I'll have to start looking into the source code ....
Thanks anyway,

Boris

On 10/31/2011 12:54 PM, Francisco Leovey wrote:
> I propose you output to a file located on the same server as the DB
> and then copy that file to your PC
> IMO your problem is network related.
>
> *From:* boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
> *To:* Francisco Leovey <fleovey(at)yahoo(dot)com>
> *Cc:* "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
> *Sent:* Sunday, October 30, 2011 6:28 PM
> *Subject:* Re: [pgadmin-support] very slow when writing query to file
> No, actually it is on my machine ...
>
> On 10/29/2011 11:01 PM, Francisco Leovey wrote:
>> Is the file where you write the query output located on the same
>> server as the DB?
>>
>> *From:* boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
>> <mailto:boris(dot)pezzatti(at)wsl(dot)ch>
>> *To:* Guillaume Lelarge <guillaume(at)lelarge(dot)info>
>> <mailto:guillaume(at)lelarge(dot)info>
>> *Cc:* pgadmin-support(at)postgresql(dot)org
>> <mailto:pgadmin-support(at)postgresql(dot)org>
>> *Sent:* Saturday, October 29, 2011 5:44 PM
>> *Subject:* Re: [pgadmin-support] very slow when writing query to file
>> Does anyone have any proposition how I could further test where the
>> problem is. This is really a strange behaviour, that I noticed with
>> different versions of pgAdmin. Could a firewall produce a difference
>> when querying data visually or for a write to file? (should not ...)
>> Thank you,
>> Boris
>>
>> [...]
>

--
Boris Pezzatti
Swiss Federal Research Institute WSL
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland
phone direct +41 91 821 52 32
phone +41 91 821 52 30
fax +41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch


From: Fernando Hevia <fhevia(at)gmail(dot)com>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: Francisco Leovey <fleovey(at)yahoo(dot)com>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-10-31 21:26:27
Message-ID: CAGYT1XRw_zUvNR09F0G56=Baq+Uwck_sGz9d6Hi7Kf9T-0PRSw@mail.gmail.com
Lists: pgadmin-support

Hello.

I could reproduce the issue on a fresh Windows 7 install with no other
apps running besides pgAdmin v1.14.0.
From what I could see, the execute-to-file function runs in 2 stages:
1. Rows are retrieved from the DB server to RAM
2. Rows are written from RAM to file

The delay clearly occurs in step 2.
With small datasets (<2000 rows) the write-to-disk delay is barely
perceptible, but as the number of rows grows (>10k) it becomes quite
clear that step 1 keeps completing in the expected time frame while step
2 takes much, much longer. If anything, it should be the other way around.

With 10k rows (2.3 MB) it took 3.5 seconds to retrieve the data from the
DB and 40 seconds to write the file to a SATA 7200 rpm disk with
write-through cache.
With 100k rows (23 MB) the DB retrieval took 35 seconds while the file
writing took over 4 minutes (I didn't wait for it to finish).

The file is being written at an average of 60 KB per second, which is
extremely slow.
Streaming the data from the saved file to a copy took under 2 seconds.

On Mon, Oct 31, 2011 at 10:04, boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch> wrote:

> Actually I do not have access to the server file system.
> What I can not understand is that:
> 1) if I can execute the query and see the data in the table viewer the
> data are also sent over the network
> 2) how does the execute to file command differ with respect to the point
> 1) ? Is each line retrieved separately and written to the file location?
> Or is written first to a temporary location and then copied?
>
> Maybe I'll have to start looking into the source code ....
> Thanks anyway,
>
> Boris
>
>
>
> On 10/31/2011 12:54 PM, Francisco Leovey wrote:
>
> I propose you output to a file located on the same server as the DB and
> then copy that file to your PC
> IMO your problem is network related.
>
> [...]
>


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Fernando Hevia <fhevia(at)gmail(dot)com>
Cc: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>, Francisco Leovey <fleovey(at)yahoo(dot)com>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-11-01 10:23:05
Message-ID: 1320142985.2122.13.camel@localhost.localdomain
Lists: pgadmin-support

On Mon, 2011-10-31 at 18:26 -0300, Fernando Hevia wrote:
> [...]
> I could reproduce the issue in a fresh Windows 7 install with no other apps
> running other than pgAdmin v1.14.0.
> From what I could see, the execute-to-file function runs in 2 stages:
> 1. Rows are retrieved from DB server to RAM
> 2. Rows are written from RAM to file
>

That's right.

> The delay clearly occurs in step 2.

That's also right.

> While with small datasets (<2000) the writing to disk delay is barely
> perceivable, when the number of rows is incremented (>10k) it is quite
> distinctive how step 1 keeps completing in the expected time frame but step
> 2 takes much much longer. In any case it should be the other way around.
>

Nope, step 2 does a lot of work.

> With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40
> seconds to write the file to an SATA 7200 disk with write-through cache.
> With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file
> writing part took over 4 minutes (didn't wait for it to finish).
>

I don't have the same numbers:

* 1000 : 1s
* 10000 : 1s
* 200000 : 7s
* 1000000 : 18s

> The file is being written at an avg 60 KB per second, which is extremely
> slow.

If the only thing pgAdmin did was write the file, I would agree. But,
actually, it does a lot more:

* for each row
  * for each column
    * adds the column separator, if needed
    * grabs one cell's value
    * quotes the value, if needed (which also means doubling the quote
      if it's within the value)
  * adds the line separator
  * converts it to the encoding, if needed
  * writes it to the file

That could take some time.

I checked whether some parts took much longer than others, but failed to
find one.
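For illustration only, here is a minimal, self-contained sketch of that
kind of per-row, per-column loop (the names and data structures are
invented for the example; this is not pgAdmin's actual code):

#include <fstream>
#include <string>
#include <vector>

// Quote a cell for export: wrap it in double quotes when it contains the
// separator, a quote or a newline, and double any embedded quotes.
static std::string quoteIfNeeded(const std::string &value, char sep)
{
    if (value.find_first_of(std::string("\"\n") + sep) == std::string::npos)
        return value;
    std::string quoted = "\"";
    for (char c : value)
        quoted += (c == '"') ? std::string("\"\"") : std::string(1, c);
    quoted += '"';
    return quoted;
}

// Write an in-memory result set (rows of string cells) to a file,
// cell by cell, roughly following the steps listed above.
void exportToFile(const std::vector<std::vector<std::string> > &rows,
                  const std::string &path, char sep = ',')
{
    std::ofstream out(path.c_str(), std::ios::binary);
    for (const std::vector<std::string> &row : rows)
    {
        for (std::size_t col = 0; col < row.size(); ++col)
        {
            if (col > 0)
                out << sep;                      // column separator
            out << quoteIfNeeded(row[col], sep); // quoted cell value
        }
        out << '\n';                             // line separator
        // (a real exporter would also convert the line to the target
        //  encoding before writing it)
    }
}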

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: Fernando Hevia <fhevia(at)gmail(dot)com>, Francisco Leovey <fleovey(at)yahoo(dot)com>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-11-02 07:58:22
Message-ID: 4EB0F81E.70708@wsl.ch
Lists: pgadmin-support

Thank you Fernando for reproducing this.
I suspect there must be some part of the code in the

* for each row
* for each column

loops that is inefficient only on some machines or OSes (I'm using
Arch Linux).
In fact, the extra time Fernando and I see cannot be attributed only to
adding commas, quotes and line feeds. The same file that took me more
than one hour to retrieve (32 MB) could be saved with OpenOffice in 10
seconds, changing column separators, text delimiters and encoding.

Could it be that the use of, e.g., error catching in the loops results
in poor performance on some systems? (I'm not a C++ programmer ...)

On 11/01/2011 11:23 AM, Guillaume Lelarge wrote:
> On Mon, 2011-10-31 at 18:26 -0300, Fernando Hevia wrote:
>> [...]
>> I could reproduce the issue in a fresh Windows 7 install with no other apps
>> running other than pgAdmin v1.14.0.
>> From what I could see, the execute-to-file function runs in 2 stages:
>> 1. Rows are retrieved from DB server to RAM
>> 2. Rows are written from RAM to file
>>
> That's right.
>
>> The delay clearly occurs in step 2.
> That's also right.
>
>> While with small datasets (<2000) the writing to disk delay is barely
>> perceivable, when the number of rows is incremented (>10k) it is quite
>> distinctive how step 1 keeps completing in the expected time frame but step
>> 2 takes much much longer. In any case it should be the other way around.
>>
> Nope, step 2 does a lot of work.
>
>> With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40
>> seconds to write the file to an SATA 7200 disk with write-through cache.
>> With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file
>> writing part took over 4 minutes (didn't wait for it to finish).
>>
> I don't have the same numbers:
>
> * 1000 : 1s
> * 10000 : 1s
> * 200000 : 7s
> * 1000000 : 18s
>
>> The file is being written at an avg 60 KB per second, which is extremely
>> slow.
> If the only thing pgAdmin does was writing, I would agree. But,
> actually, it does a lot more things:
>
> * for each row
> * for each column
> * adds the column separator, if needed
> * grabs one cell's value
> * quotes the value, if needed (which also means doubling the quote
> if it's within the value)
> * adds the line separator
> * converts it to the encoding, if needed
> * writes it to the file
>
> That could take some time.
>
> I searched if there were some parts that took much longer than others,
> but failed to find one.
>
>

--
Boris Pezzatti
Swiss Federal Research Institute WSL
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland
phone direct +41 91 821 52 32
phone +41 91 821 52 30
fax +41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-02 08:10:21
Message-ID: 4EB0FAED.2070306@wsl.ch
Lists: pgadmin-support


Or maybe the retrieved data in RAM are somehow lazily bound ... ?

On 11/02/2011 08:58 AM, boris pezzatti wrote:
> Thank you Fernando for reproducing this.
> I suspect there must be some part of code in the
> * for each row
> * for each column
> loops that result inefficient only on some machines or OS's (I'm using
> Archlinux).
> In fact the extra time I and Fernando get can not only be attributed
> to adding commas, " and line feeds. The same file which I retrived in
> more than one hour (32Mb), could be saved with openoffice in 10
> seconds, changing column separators, text delimiters and encoding.
>
> Could it be possible that the use of e.g. error catching in the loops
> results in poor performance on some systems? (I'm not a c++ programmer
> ...)
>
>
>
>
>
>
>
> On 11/01/2011 11:23 AM, Guillaume Lelarge wrote:
>> On Mon, 2011-10-31 at 18:26 -0300, Fernando Hevia wrote:
>>> [...]
>>> I could reproduce the issue in a fresh Windows 7 install with no other apps
>>> running other than pgAdmin v1.14.0.
>>> > From what I could see, the execute-to-file function runs in 2 stages:
>>> 1. Rows are retrieved from DB server to RAM
>>> 2. Rows are written from RAM to file
>>>
>> That's right.
>>
>>> The delay clearly occurs in step 2.
>> That's also right.
>>
>>> While with small datasets (<2000) the writing to disk delay is barely
>>> perceivable, when the number of rows is incremented (>10k) it is quite
>>> distinctive how step 1 keeps completing in the expected time frame but step
>>> 2 takes much much longer. In any case it should be the other way around.
>>>
>> Nope, step 2 does a lot of work.
>>
>>> With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40
>>> seconds to write the file to an SATA 7200 disk with write-through cache.
>>> With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file
>>> writing part took over 4 minutes (didn't wait for it to finish).
>>>
>> I don't have the same numbers:
>>
>> * 1000 : 1s
>> * 10000 : 1s
>> * 200000 : 7s
>> * 1000000 : 18s
>>
>>> The file is being written at an avg 60 KB per second, which is extremely
>>> slow.
>> If the only thing pgAdmin does was writing, I would agree. But,
>> actually, it does a lot more things:
>>
>> * for each row
>> * for each column
>> * adds the column separator, if needed
>> * grabs one cell's value
>> * quotes the value, if needed (which also means doubling the quote
>> if it's within the value)
>> * adds the line separator
>> * converts it to the encoding, if needed
>> * writes it to the file
>>
>> That could take some time.
>>
>> I searched if there were some parts that took much longer than others,
>> but failed to find one.
>>
>>
>
>
>

--
Boris Pezzatti
Swiss Federal Research Institute WSL
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland
phone direct +41 91 821 52 32
phone +41 91 821 52 30
fax +41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-02 08:21:40
Message-ID: 4EB0FD94.7010703@wsl.ch
Lists: pgadmin-support


... or (I promise this is my last guess) each row is appended to the
file inside the loop, and on some OSes there is a problem keeping the
file open (so that each time the file must be reopened ... adding a lot
of extra time). Maybe building the whole "virtual file" in a variable
and writing it all at once (or, say, in chunks of 100 MB) could solve
the problem ...
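A rough sketch of that idea (purely illustrative, with invented names;
not pgAdmin's code): keep the file open, collect the output in a memory
buffer, and flush it in large chunks, so each row costs only a string
append:

#include <fstream>
#include <string>
#include <vector>

// Append ready-made output lines to an in-memory buffer and flush the
// buffer to the (already open) file in large chunks, instead of doing
// one small write per row.
void writeBuffered(const std::vector<std::string> &lines,
                   const std::string &path,
                   std::size_t chunkSize = 8 * 1024 * 1024) // ~8 MB chunks
{
    std::ofstream out(path.c_str(), std::ios::binary);
    std::string buffer;
    buffer.reserve(chunkSize);
    for (const std::string &line : lines)
    {
        buffer += line;
        buffer += '\n';
        if (buffer.size() >= chunkSize)          // flush a full chunk
        {
            out.write(buffer.data(), buffer.size());
            buffer.clear();
        }
    }
    out.write(buffer.data(), buffer.size());     // flush the remainder
}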

On 11/02/2011 09:10 AM, boris pezzatti wrote:
>
> or maybe the retrieved data in RAM are somehow lazy bound ... ?
>
>
>
> On 11/02/2011 08:58 AM, boris pezzatti wrote:
>> Thank you Fernando for reproducing this.
>> I suspect there must be some part of code in the
>> * for each row
>> * for each column
>> loops that result inefficient only on some machines or OS's (I'm
>> using Archlinux).
>> In fact the extra time I and Fernando get can not only be attributed
>> to adding commas, " and line feeds. The same file which I retrived in
>> more than one hour (32Mb), could be saved with openoffice in 10
>> seconds, changing column separators, text delimiters and encoding.
>>
>> Could it be possible that the use of e.g. error catching in the loops
>> results in poor performance on some systems? (I'm not a c++
>> programmer ...)
>>
>>
>>
>>
>>
>>
>>
>> On 11/01/2011 11:23 AM, Guillaume Lelarge wrote:
>>> On Mon, 2011-10-31 at 18:26 -0300, Fernando Hevia wrote:
>>>> [...]
>>>> I could reproduce the issue in a fresh Windows 7 install with no other apps
>>>> running other than pgAdmin v1.14.0.
>>>> > From what I could see, the execute-to-file function runs in 2 stages:
>>>> 1. Rows are retrieved from DB server to RAM
>>>> 2. Rows are written from RAM to file
>>>>
>>> That's right.
>>>
>>>> The delay clearly occurs in step 2.
>>> That's also right.
>>>
>>>> While with small datasets (<2000) the writing to disk delay is barely
>>>> perceivable, when the number of rows is incremented (>10k) it is quite
>>>> distinctive how step 1 keeps completing in the expected time frame but step
>>>> 2 takes much much longer. In any case it should be the other way around.
>>>>
>>> Nope, step 2 does a lot of work.
>>>
>>>> With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40
>>>> seconds to write the file to an SATA 7200 disk with write-through cache.
>>>> With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file
>>>> writing part took over 4 minutes (didn't wait for it to finish).
>>>>
>>> I don't have the same numbers:
>>>
>>> * 1000 : 1s
>>> * 10000 : 1s
>>> * 200000 : 7s
>>> * 1000000 : 18s
>>>
>>>> The file is being written at an avg 60 KB per second, which is extremely
>>>> slow.
>>> If the only thing pgAdmin does was writing, I would agree. But,
>>> actually, it does a lot more things:
>>>
>>> * for each row
>>> * for each column
>>> * adds the column separator, if needed
>>> * grabs one cell's value
>>> * quotes the value, if needed (which also means doubling the quote
>>> if it's within the value)
>>> * adds the line separator
>>> * converts it to the encoding, if needed
>>> * writes it to the file
>>>
>>> That could take some time.
>>>
>>> I searched if there were some parts that took much longer than others,
>>> but failed to find one.
>>>
>>>
>>
>>
>
>
>

--
Boris Pezzatti
Swiss Federal Research Institute WSL
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland
phone direct +41 91 821 52 32
phone +41 91 821 52 30
fax +41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: Fernando Hevia <fhevia(at)gmail(dot)com>, Francisco Leovey <fleovey(at)yahoo(dot)com>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-11-02 21:25:32
Message-ID: 1320269133.2145.29.camel@localhost.localdomain
Lists: pgadmin-support

On Wed, 2011-11-02 at 08:58 +0100, boris pezzatti wrote:
> Thank you Fernando for reproducing this.
> I suspect there must be some part of code in the
>
> * for each row
> * for each column
>
> loops that result inefficient only on some machines or OS's (I'm using
> Archlinux).

I'm using Fedora, so we both use Linux.

> In fact the extra time I and Fernando get can not only be attributed to
> adding commas, " and line feeds. The same file which I retrived in more
> than one hour (32Mb), could be saved with openoffice in 10 seconds,
> changing column separators, text delimiters and encoding.
>

Actually, it does.

> Could it be possible that the use of e.g. error catching in the loops
> results in poor performance on some systems? (I'm not a c++ programmer ...)
>

Nope, I don't think so.

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-02 21:26:38
Message-ID: 1320269199.2145.30.camel@localhost.localdomain
Lists: pgadmin-support

On Wed, 2011-11-02 at 09:21 +0100, boris pezzatti wrote:
> ... or (I promise this is the last guess) each row is appended to the
> file in the the loop, and on some OS's there is a problem keeping open
> the file (so that each time the file must be opened again ... adding a
> lot of extra time). Maybe creating all the "virtual file" in a variable
> and storing all at once (or let say in chunks of 100 Mb) could solve the
> problem ...
>

Tried that yesterday, wasn't much faster (two seconds less, which isn't
much).

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Fernando Hevia <fhevia(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>, Francisco Leovey <fleovey(at)yahoo(dot)com>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Subject: Re: very slow when writing query to file
Date: 2011-11-02 22:58:16
Message-ID: CAGYT1XTnu0K8_N_ORzV+y1aLCLVrC+RtXtFyiOaTndrkM_xPHw@mail.gmail.com
Lists: pgadmin-support

On Tue, Nov 1, 2011 at 07:23, Guillaume Lelarge <guillaume(at)lelarge(dot)info> wrote:

> [...]
>
> > With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40
> > seconds to write the file to an SATA 7200 disk with write-through cache.
> > With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file
> > writing part took over 4 minutes (didn't wait for it to finish).
> >
>
> I don't have the same numbers:
>
> * 1000 : 1s
> * 10000 : 1s
> * 200000 : 7s
> * 1000000 : 18s
>
>
Interesting... seeing your explanation below, our timing difference
probably has to do with my testing on a table with lots of columns, many
of type timestamp and inet, which I assume are costlier to convert.
Boris, might that be the case for you as well?
I'll repeat my tests with a smaller and simpler table later.

> > The file is being written at an avg 60 KB per second, which is extremely
> > slow.
>
> If the only thing pgAdmin does was writing, I would agree. But,
> actually, it does a lot more things:
>
> * for each row
> * for each column
> * adds the column separator, if needed
> * grabs one cell's value
> * quotes the value, if needed (which also means doubling the quote
> if it's within the value)
> * adds the line separator
> * converts it to the encoding, if needed
> * writes it to the file
>
> That could take some time.
>

I see how I was completely underestimating the workload involved in the
file-writing process.

> I searched if there were some parts that took much longer than others,
> but failed to find one.
>

The effort is much appreciated. Thanks.

Regards,
Fernando.


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-03 10:22:30
Message-ID: 4EB26B66.7000403@wsl.ch
Lists: pgadmin-support

I tested by column type, and the column that appears to cause the very
slow behaviour is of type boolean. Retrieving only this 1 column for
3000 records takes about 1 minute.

Do you know any possible reason?

On 11/02/2011 11:58 PM, Fernando Hevia wrote:
>
> On Tue, Nov 1, 2011 at 07:23, Guillaume Lelarge
> <guillaume(at)lelarge(dot)info <mailto:guillaume(at)lelarge(dot)info>> wrote:
>
> [...]
>
> > With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB
> > and 40 seconds to write the file to an SATA 7200 disk with
> > write-through cache.
> > With 100k rows (23 MB) the DB retrieve went for 35 seconds while the
> > file writing part took over 4 minutes (didn't wait for it to finish).
> >
>
> I don't have the same numbers:
>
> * 1000 : 1s
> * 10000 : 1s
> * 200000 : 7s
> * 1000000 : 18s
>
>
> Interesting... seeing your explanation below, our timing difference
> probably has to do with me testing on a table with lots of columns,
> many of type timestamp and inet which I assume are costlier to
> convert. Boris, might that be the case with you also?
> I'll repeat my tests with a smaller and simpler table later.
>
> > The file is being written at an avg 60 KB per second, which is
> > extremely slow.
>
> If the only thing pgAdmin does was writing, I would agree. But,
> actually, it does a lot more things:
>
> * for each row
> * for each column
> * adds the column separator, if needed
> * grabs one cell's value
> * quotes the value, if needed (which also means doubling the quote
> if it's within the value)
> * adds the line separator
> * converts it to the encoding, if needed
> * writes it to the file
>
> That could take some time.
>
>
> I see how I was completely underrating the workload involved in the
> file writing process.
>
> I searched if there were some parts that took much longer than others,
> but failed to find one.
>
>
> The effort is much appreciated. Thanks.
>
> Regards,
> Fernando.

--
Boris Pezzatti
Swiss Federal Research Institute WSL
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland
phone direct +41 91 821 52 32
phone +41 91 821 52 30
fax +41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-03 11:02:22
Message-ID: 1320318143.2130.26.camel@localhost.localdomain
Lists: pgadmin-support

On Thu, 2011-11-03 at 11:22 +0100, boris pezzatti wrote:
> I tested after column types, and the colum that appears to create the
> very slow behaviour is of type boolean. Retrieving only 1 column for
> 3000 records delays at ca. 1 minute.
>
> Do you know any possible reason?
>

That's good to know. I tried with a boolean column, which gave me 7
seconds for 10000 rows. That doesn't seem like much, but it's actually 7
times more than without the boolean column.

It seems we keep executing the same SQL command 10000 times (actually,
as many times as you have rows). And that is weird.

I'll look into this.
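To picture what that means, here is a hypothetical libpq sketch of a
per-cell type lookup (invented function and query; not the actual
pgAdmin code). Every iteration is a full client-server round trip, which
easily dwarfs the cost of writing the file:

#include <libpq-fe.h>
#include <cstdio>
#include <string>

// Anti-pattern: resolve the column's type with a separate catalog query
// for every row. Exporting 10000 rows then means 10000 round trips.
void describeTypePerRow(PGconn *conn, int numRows, unsigned int typeOid)
{
    for (int row = 0; row < numRows; ++row)
    {
        std::string sql = "SELECT typname FROM pg_type WHERE oid = "
                          + std::to_string(typeOid);
        PGresult *res = PQexec(conn, sql.c_str());   // one round trip per row
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
            std::printf("row %d: %s\n", row, PQgetvalue(res, 0, 0));
        PQclear(res);
    }
}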

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
Cc: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-03 11:20:48
Message-ID: 1320319248.2130.29.camel@localhost.localdomain
Lists: pgadmin-support

On Thu, 2011-11-03 at 12:02 +0100, Guillaume Lelarge wrote:
> On Thu, 2011-11-03 at 11:22 +0100, boris pezzatti wrote:
> > I tested after column types, and the colum that appears to create the
> > very slow behaviour is of type boolean. Retrieving only 1 column for
> > 3000 records delays at ca. 1 minute.
> >
> > Do you know any possible reason?
> >
>
> That's good to know. I tried with a boolean column, which gave me 7
> seconds for 10000 rows. Which doesn't seem much, but it's actually 7
> times more than without the boolean column.
>
> It seems we keep executing the same SQL command, 10000 times (actually
> as much as your number of rows). And that is weird.
>
> I'll look into this.
>

OK, found it. Some time ago we introduced a cache for type lookups. It
seems the boolean type doesn't make it into the cache. The patch is
committed. It should be available with the release of 1.14.1.
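Schematically, the fix amounts to making sure every type, boolean
included, lands in a lookup cache like the one sketched below (invented
names and query, not the actual patch), so the catalog is queried once
per distinct type rather than once per cell:

#include <libpq-fe.h>
#include <map>
#include <string>

// OID -> type-name cache. A hit costs a map lookup; only a miss goes to
// the server. A type that never enters the cache (as boolean did here)
// silently degrades to one catalog query per cell.
class TypeNameCache
{
public:
    explicit TypeNameCache(PGconn *conn) : conn_(conn) {}

    std::string lookup(unsigned int oid)
    {
        auto it = cache_.find(oid);
        if (it != cache_.end())
            return it->second;                   // cache hit: no round trip

        std::string sql = "SELECT typname FROM pg_type WHERE oid = "
                          + std::to_string(oid);
        PGresult *res = PQexec(conn_, sql.c_str());
        std::string name;
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
            name = PQgetvalue(res, 0, 0);
        PQclear(res);

        cache_[oid] = name;                      // remember it for the next cell
        return name;
    }

private:
    PGconn *conn_;
    std::map<unsigned int, std::string> cache_;
};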

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-03 15:35:16
Message-ID: 4EB2B4B4.1070800@wsl.ch
Lists: pgadmin-support

Great,
thank you very much, Guillaume!

Cheers,
Boris

On 11/03/2011 12:20 PM, Guillaume Lelarge wrote:
> On Thu, 2011-11-03 at 12:02 +0100, Guillaume Lelarge wrote:
>> On Thu, 2011-11-03 at 11:22 +0100, boris pezzatti wrote:
>>> I tested after column types, and the colum that appears to create the
>>> very slow behaviour is of type boolean. Retrieving only 1 column for
>>> 3000 records delays at ca. 1 minute.
>>>
>>> Do you know any possible reason?
>>>
>> That's good to know. I tried with a boolean column, which gave me 7
>> seconds for 10000 rows. Which doesn't seem much, but it's actually 7
>> times more than without the boolean column.
>>
>> It seems we keep executing the same SQL command, 10000 times (actually
>> as much as your number of rows). And that is weird.
>>
>> I'll look into this.
>>
> OK, found it. We introduced some time ago a cache for type lookup. It
> seems the boolean type doesn't make it to the cache. The patch is
> commited. It should be available with the release of 1.14.1.
>
>


From: Fernando Hevia <fhevia(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>, pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-03 16:36:58
Message-ID: CAGYT1XSnbKC_KT7=fjm8DwDoWg9bv7wkzJC34ZSycVpnT0DtVA@mail.gmail.com
Lists: pgadmin-support

On Thu, Nov 3, 2011 at 08:20, Guillaume Lelarge <guillaume(at)lelarge(dot)info> wrote:

>
> OK, found it. We introduced some time ago a cache for type lookup. It
> seems the boolean type doesn't make it to the cache. The patch is
> commited. It should be available with the release of 1.14.1.
>
>
Nice.
Thanks for fixing this. Excellent response as usual.