Re: [PERFORM] psql -A (unaligned format) eats too much

Lists: pgsql-hackerspgsql-performance
From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: pgsql-performance(at)postgresql(dot)org
Subject: psql -A (unaligned format) eats too much memory
Date: 2006-06-04 22:01:24
Message-ID: 44835834.2020309@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Hi,

I just noticed that psql's unformatted output uses too much
memory. Is it normal? It seems that psql draws all records
of a query off the server before it displays or writes the output.
I would expect this only with formatted output.

Problem is, I have an export that produces 500'000+ records
which changes frequently. Several (20+) sites run this query
nightly with different parameters and download it. The SELECTs
that run in psql -A -t -c '...' may overlap and the query that runs
in less than 1.5 minutes if it's the only one at the time may take
3+ hours if ten such queries overlap. The time is mostly spent
in swapping, all psql processes take up 300+ MB, so the 1GB
server is brought to its knees quickly, peek swap usage is 1.8 GB.
I watched the progress in top and the postmaster processes finished
their work in about half an hour (that would still be acceptable)
then the psql processes started eating up memory as they read
the records.

PostgreSQL 8.1.4 was used on RHEL3.

Is there a way to convince psql to use less memory in unformatted
mode? I know COPY will be able to use arbitrary SELECTs
but until then I am still stuck with redirecting psql's output.

Best regards,
Zoltán Böszörményi


From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: psql -A (unaligned format) eats too much memory
Date: 2006-06-04 22:32:38
Message-ID: 44835F86.5040008@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Hi,

answering to myself. :-)

Zoltan Boszormenyi írta:
> Hi,
>
> I just noticed that psql's unformatted output uses too much
> memory. Is it normal? It seems that psql draws all records
> of a query off the server before it displays or writes the output.
> I would expect this only with formatted output.
>
> Problem is, I have an export that produces 500'000+ records
> which changes frequently. Several (20+) sites run this query
> nightly with different parameters and download it. The SELECTs
> that run in psql -A -t -c '...' may overlap and the query that runs
> in less than 1.5 minutes if it's the only one at the time may take
> 3+ hours if ten such queries overlap. The time is mostly spent
> in swapping, all psql processes take up 300+ MB, so the 1GB
> server is brought to its knees quickly, peek swap usage is 1.8 GB.
> I watched the progress in top and the postmaster processes finished
> their work in about half an hour (that would still be acceptable)
> then the psql processes started eating up memory as they read
> the records.
>
> PostgreSQL 8.1.4 was used on RHEL3.
>
> Is there a way to convince psql to use less memory in unformatted
> mode? I know COPY will be able to use arbitrary SELECTs
> but until then I am still stuck with redirecting psql's output.

The answer it to use SELECT INTO TEMP and then COPY.
Psql will use much less memory that way. But still...

Best regards,
Zoltán Böszörményi


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much memory
Date: 2006-06-05 15:22:59
Message-ID: 20060605152258.GT53487@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Moving to -hackers

On Mon, Jun 05, 2006 at 12:32:38AM +0200, Zoltan Boszormenyi wrote:
> >I just noticed that psql's unformatted output uses too much
> >memory. Is it normal? It seems that psql draws all records
> >of a query off the server before it displays or writes the output.
> >I would expect this only with formatted output.
> >
> >Problem is, I have an export that produces 500'000+ records
> >which changes frequently. Several (20+) sites run this query
> >nightly with different parameters and download it. The SELECTs
> >that run in psql -A -t -c '...' may overlap and the query that runs
> >in less than 1.5 minutes if it's the only one at the time may take
> >3+ hours if ten such queries overlap. The time is mostly spent
> >in swapping, all psql processes take up 300+ MB, so the 1GB
> >server is brought to its knees quickly, peek swap usage is 1.8 GB.
> >I watched the progress in top and the postmaster processes finished
> >their work in about half an hour (that would still be acceptable)
> >then the psql processes started eating up memory as they read
> >the records.
> >
> >PostgreSQL 8.1.4 was used on RHEL3.
> >
> >Is there a way to convince psql to use less memory in unformatted
> >mode? I know COPY will be able to use arbitrary SELECTs
> >but until then I am still stuck with redirecting psql's output.
>
> The answer it to use SELECT INTO TEMP and then COPY.
> Psql will use much less memory that way. But still...

I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
largetable' > /dev/null results in psql consuming vast quantities of
memory. Why is this? ISTM this is a bug...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much memory
Date: 2006-06-05 15:27:30
Message-ID: 20512.1149521250@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
> largetable' > /dev/null results in psql consuming vast quantities of
> memory. Why is this?

Is it different without the -A?

I'm reading this as just another uninformed complaint about libpq's
habit of buffering the whole query result. It's possible that there's
a memory leak in the -A path specifically, but nothing said so far
provided any evidence for that.

regards, tom lane


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much memory
Date: 2006-06-05 15:53:49
Message-ID: 20060605155349.GU53487@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> > I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
> > largetable' > /dev/null results in psql consuming vast quantities of
> > memory. Why is this?
>
> Is it different without the -A?

Nope.

> I'm reading this as just another uninformed complaint about libpq's
> habit of buffering the whole query result. It's possible that there's
> a memory leak in the -A path specifically, but nothing said so far
> provided any evidence for that.

Certainly seems like it. It seems like it would be good to allow for
libpq not to buffer, since there's cases where it's not needed...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much memory
Date: 2006-06-05 16:00:35
Message-ID: 20813.1149523235@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
>> I'm reading this as just another uninformed complaint about libpq's
>> habit of buffering the whole query result. It's possible that there's
>> a memory leak in the -A path specifically, but nothing said so far
>> provided any evidence for that.

> Certainly seems like it. It seems like it would be good to allow for
> libpq not to buffer, since there's cases where it's not needed...

See past discussions. The problem is that libpq's API says that when it
hands you back the completed query result, the command is complete and
guaranteed not to fail later. A streaming interface could not make that
guarantee, so it's not a transparent substitution.

I wouldn't have any strong objection to providing a separate API that
operates in a streaming fashion, but defining it is something no one's
bothered to do yet. In practice, if you have to code to a variant API,
it's not that much more trouble to use a cursor...

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Mark Woodward <pgsql(at)mohawksoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 16:40:38
Message-ID: 44845E86.3080001@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Mark Woodward wrote:
>> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>>
>>> On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
>>>
>>>> I'm reading this as just another uninformed complaint about libpq's
>>>> habit of buffering the whole query result. It's possible that there's
>>>> a memory leak in the -A path specifically, but nothing said so far
>>>> provided any evidence for that.
>>>>
>>> Certainly seems like it. It seems like it would be good to allow for
>>> libpq not to buffer, since there's cases where it's not needed...
>>>
>> See past discussions. The problem is that libpq's API says that when it
>> hands you back the completed query result, the command is complete and
>> guaranteed not to fail later. A streaming interface could not make that
>> guarantee, so it's not a transparent substitution.
>>
>> I wouldn't have any strong objection to providing a separate API that
>> operates in a streaming fashion, but defining it is something no one's
>> bothered to do yet. In practice, if you have to code to a variant API,
>> it's not that much more trouble to use a cursor...
>>
>>
>
> Wouldn't the "COPY (select ...) TO STDOUT" format being discussed solve
> this for free?
>
>
>

It won't solve it in the general case for clients that expect a result
set. ISTM that "use a cursor" is a perfectly reasonable answer, though.

cheers

andrew


From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 16:45:13
Message-ID: 44845F99.7050306@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Hi!

Tom Lane írta:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>
>> I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
>> largetable' > /dev/null results in psql consuming vast quantities of
>> memory. Why is this?
>>
>
> Is it different without the -A?
>
> I'm reading this as just another uninformed complaint about libpq's
> habit of buffering the whole query result. It's possible that there's
> a memory leak in the -A path specifically, but nothing said so far
> provided any evidence for that.
>
> regards, tom lane
>

So, is libpq always buffering the result? Thanks.
I thought psql buffers only because in its formatted output mode
it has to know the widest value for all the columns.

Then the SELECT INTO TEMP ; COPY TO STDOUT solution
I found is _the_ solution.

I guess then the libpq-based ODBC driver suffers
from the same problem? It certainly explains the
performance problems I observed: the server
finishes the query, the ODBC driver (or libpq underneath)
fetches all the records and the application receives
the first record after all these. Nice.

Best regards,
Zoltán Böszörményi


From: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Zoltan Boszormenyi" <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 16:48:12
Message-ID: 18006.24.91.171.78.1149526092.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>> On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
>>> I'm reading this as just another uninformed complaint about libpq's
>>> habit of buffering the whole query result. It's possible that there's
>>> a memory leak in the -A path specifically, but nothing said so far
>>> provided any evidence for that.
>
>> Certainly seems like it. It seems like it would be good to allow for
>> libpq not to buffer, since there's cases where it's not needed...
>
> See past discussions. The problem is that libpq's API says that when it
> hands you back the completed query result, the command is complete and
> guaranteed not to fail later. A streaming interface could not make that
> guarantee, so it's not a transparent substitution.
>
> I wouldn't have any strong objection to providing a separate API that
> operates in a streaming fashion, but defining it is something no one's
> bothered to do yet. In practice, if you have to code to a variant API,
> it's not that much more trouble to use a cursor...
>

Wouldn't the "COPY (select ...) TO STDOUT" format being discussed solve
this for free?


From: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Zoltan Boszormenyi" <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 17:01:45
Message-ID: 18067.24.91.171.78.1149526905.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

> Mark Woodward wrote:
>>> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>>>
>>>> On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
>>>>
>>>>> I'm reading this as just another uninformed complaint about libpq's
>>>>> habit of buffering the whole query result. It's possible that
>>>>> there's
>>>>> a memory leak in the -A path specifically, but nothing said so far
>>>>> provided any evidence for that.
>>>>>
>>>> Certainly seems like it. It seems like it would be good to allow for
>>>> libpq not to buffer, since there's cases where it's not needed...
>>>>
>>> See past discussions. The problem is that libpq's API says that when
>>> it
>>> hands you back the completed query result, the command is complete and
>>> guaranteed not to fail later. A streaming interface could not make
>>> that
>>> guarantee, so it's not a transparent substitution.
>>>
>>> I wouldn't have any strong objection to providing a separate API that
>>> operates in a streaming fashion, but defining it is something no one's
>>> bothered to do yet. In practice, if you have to code to a variant API,
>>> it's not that much more trouble to use a cursor...
>>>
>>>
>>
>> Wouldn't the "COPY (select ...) TO STDOUT" format being discussed solve
>> this for free?
>>
>>
>>
>
> It won't solve it in the general case for clients that expect a result
> set. ISTM that "use a cursor" is a perfectly reasonable answer, though.

I'm not sure I agree -- surprise!

psql is often used as a command line tool and using a cursor is not
acceptable.

Granted, with an unaligned output, perhaps psql should not buffer the
WHOLE result at once, but without rewriting that behavior, a COPY from
query may be close enough.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Mark Woodward <pgsql(at)mohawksoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 17:03:24
Message-ID: 448463DC.8030306@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Mark Woodward wrote:
>>>>
>>>>
>>>>
>>> Wouldn't the "COPY (select ...) TO STDOUT" format being discussed solve
>>> this for free?
>>>
>>>
>>>
>>>
>> It won't solve it in the general case for clients that expect a result
>> set. ISTM that "use a cursor" is a perfectly reasonable answer, though.
>>
>
> I'm not sure I agree -- surprise!
>
> psql is often used as a command line tool and using a cursor is not
> acceptable.
>
> Granted, with an unaligned output, perhaps psql should not buffer the
> WHOLE result at once, but without rewriting that behavior, a COPY from
> query may be close enough.
>
>

You have missed my point. Surprise!

I didn't say it wasn't OK in the psql case, I said it wasn't helpful in
the case of *other* libpq clients.

Expecting clients generally to split and interpret COPY output is not
reasonable, but if they want large result sets they should use a cursor.

cheers

andrew


From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Mark Woodward <pgsql(at)mohawksoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 17:17:31
Message-ID: 4484672B.8080606@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Andrew Dunstan írta:
> Mark Woodward wrote:
>>> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>>>
>>>> On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
>>>>
>>>>> I'm reading this as just another uninformed complaint about libpq's
>>>>> habit of buffering the whole query result. It's possible that
>>>>> there's
>>>>> a memory leak in the -A path specifically, but nothing said so far
>>>>> provided any evidence for that.
>>>>>
>>>> Certainly seems like it. It seems like it would be good to allow for
>>>> libpq not to buffer, since there's cases where it's not needed...
>>>>
>>> See past discussions. The problem is that libpq's API says that
>>> when it
>>> hands you back the completed query result, the command is complete and
>>> guaranteed not to fail later. A streaming interface could not make
>>> that
>>> guarantee, so it's not a transparent substitution.
>>>
>>> I wouldn't have any strong objection to providing a separate API that
>>> operates in a streaming fashion, but defining it is something no one's
>>> bothered to do yet. In practice, if you have to code to a variant API,
>>> it's not that much more trouble to use a cursor...
>>>
>>>
>>
>> Wouldn't the "COPY (select ...) TO STDOUT" format being discussed solve
>> this for free?

Yes, it would for me.

> It won't solve it in the general case for clients that expect a result
> set. ISTM that "use a cursor" is a perfectly reasonable answer, though.

The general case cannot be applied for all particular cases.
E.g. you cannot use cursors from shell scripts and just for
producing an "export file" it's not too reasonable either.
Redirecting psql's output or COPY is enough.

Best regards,
Zoltán Böszörényi


From: Neil Conway <neilc(at)samurai(dot)com>
To: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 17:52:25
Message-ID: 1149529945.5577.5.camel@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Mon, 2006-06-05 at 19:17 +0200, Zoltan Boszormenyi wrote:
> The general case cannot be applied for all particular cases.
> E.g. you cannot use cursors from shell scripts

This could be fixed by adding an option to psql to transparently produce
SELECT result sets via a cursor.

-Neil


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-05 18:10:24
Message-ID: 21820.1149531024@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Neil Conway <neilc(at)samurai(dot)com> writes:
> On Mon, 2006-06-05 at 19:17 +0200, Zoltan Boszormenyi wrote:
>> The general case cannot be applied for all particular cases.
>> E.g. you cannot use cursors from shell scripts

> This could be fixed by adding an option to psql to transparently produce
> SELECT result sets via a cursor.

Note of course that such a thing would push the incomplete-result
problem further upstream. For instance in (hypothetical --cursor
switch)
psql --cursor -c "select ..." | myprogram
there would be no very good way for myprogram to find out that it'd
been sent an incomplete result due to error partway through the SELECT.

regards, tom lane


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-06 12:30:12
Message-ID: 1149597013.3818.7.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Ühel kenal päeval, E, 2006-06-05 kell 14:10, kirjutas Tom Lane:
> Neil Conway <neilc(at)samurai(dot)com> writes:
> > On Mon, 2006-06-05 at 19:17 +0200, Zoltan Boszormenyi wrote:
> >> The general case cannot be applied for all particular cases.
> >> E.g. you cannot use cursors from shell scripts
>
> > This could be fixed by adding an option to psql to transparently produce
> > SELECT result sets via a cursor.

I think this is an excellent idea.

psql --cursor --fetchby 10000 -c "select ..." | myprogram

> Note of course that such a thing would push the incomplete-result
> problem further upstream. For instance in (hypothetical --cursor
> switch)
> psql --cursor -c "select ..." | myprogram
> there would be no very good way for myprogram to find out that it'd
> been sent an incomplete result due to error partway through the SELECT.

would it not learn about it at the point of error ?

even without --cursor there is still no very good way to find out when
something else goes wrong, like the result inside libpq taking up all
memory and so psql runs out of memory on formatting some longer lines.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-06 13:48:43
Message-ID: 15738.1149601723@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Hannu Krosing <hannu(at)skype(dot)net> writes:
> hel kenal peval, E, 2006-06-05 kell 14:10, kirjutas Tom Lane:
>> Note of course that such a thing would push the incomplete-result
>> problem further upstream. For instance in (hypothetical --cursor
>> switch)
>> psql --cursor -c "select ..." | myprogram
>> there would be no very good way for myprogram to find out that it'd
>> been sent an incomplete result due to error partway through the SELECT.

> would it not learn about it at the point of error ?

No, it would merely see EOF after some number of result rows. (I'm
assuming you're also using -A -t so that the output is unadorned.)

regards, tom lane


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Neil Conway <neilc(at)samurai(dot)com>, Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-06 14:39:19
Message-ID: 20060606143918.GA53487@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Tue, Jun 06, 2006 at 09:48:43AM -0400, Tom Lane wrote:
> Hannu Krosing <hannu(at)skype(dot)net> writes:
> > hel kenal peval, E, 2006-06-05 kell 14:10, kirjutas Tom Lane:
> >> Note of course that such a thing would push the incomplete-result
> >> problem further upstream. For instance in (hypothetical --cursor
> >> switch)
> >> psql --cursor -c "select ..." | myprogram
> >> there would be no very good way for myprogram to find out that it'd
> >> been sent an incomplete result due to error partway through the SELECT.
>
> > would it not learn about it at the point of error ?
>
> No, it would merely see EOF after some number of result rows. (I'm
> assuming you're also using -A -t so that the output is unadorned.)

So if an error occurs partway through reading a cursor, no error message
is generated? That certainly sounds like a bug to me...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Neil Conway <neilc(at)samurai(dot)com>, Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] psql -A (unaligned format) eats too much
Date: 2006-06-06 14:47:30
Message-ID: 17223.1149605250@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> On Tue, Jun 06, 2006 at 09:48:43AM -0400, Tom Lane wrote:
>>> psql --cursor -c "select ..." | myprogram
>>> there would be no very good way for myprogram to find out that it'd
>>> been sent an incomplete result due to error partway through the SELECT.

> So if an error occurs partway through reading a cursor, no error message
> is generated? That certainly sounds like a bug to me...

Sure an error is generated. But it goes to stderr. The guy at the
downstream end of the stdout pipe cannot see either the error message,
or the nonzero status that psql will (hopefully) exit with.

You can theoretically deal with this by having the shell script calling
this combination check psql exit status and discard the results of
myprogram on failure, but it's not easy or simple.

regards, tom lane