Re: Out of memory error on huge resultset

Lists: pgsql-general pgsql-hackers pgsql-jdbc
From: "Nick Fankhauser" <nickf(at)ontko(dot)com>
To: "pgsql-jdbc" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Out of memory error on huge resultset
Date: 2002-10-10 15:24:49
Message-ID: NEBBLAAHGLEEPCGOBHDGCEFIFPAA.nickf@ontko.com


I'm selecting a huge ResultSet from our database- about one million rows,
with one of the fields being varchar(500). I get an out of memory error from
java.

If the whole ResultSet gets stashed in memory, this isn't really surprising,
but I'm wondering why this happens (if it does), rather than a subset around
the current record being cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all be in memory, then
my question is whether there is a better approach that people typically use
in this situation. For now, I'm simply breaking up the select into smaller
chunks, but that approach won't be satisfactory in the long run.
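
The chunking approach described above can be sketched as follows (the table and column names are hypothetical, and an index on `id` is assumed; batching by key range keeps each batch cheap, unlike large OFFSETs):

```sql
-- Read the table in batches of 10000 by key range instead of one huge SELECT.
SELECT * FROM big_table WHERE id > 0 ORDER BY id LIMIT 10000;
-- Use the largest id returned as the lower bound for the next batch:
SELECT * FROM big_table WHERE id > 10000 ORDER BY id LIMIT 10000;
-- ...and so on, until a batch comes back empty.
```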

Thanks

-Nick

--------------------------------------------------------------------------
Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: Nick Fankhauser <nickf(at)ontko(dot)com>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-10 15:35:48
Message-ID: 1034264150.1796.21.camel@inspiron.cramers


Nick,

Use a cursor. The current driver doesn't support caching; the backend
gives you everything you ask for, and you can't simply ask for a limited
set.

So if you use cursors you can fetch a subset.

Dave
On Thu, 2002-10-10 at 11:24, Nick Fankhauser wrote:
>
> I'm selecting a huge ResultSet from our database- about one million rows,
> with one of the fields being varchar(500). I get an out of memory error from
> java.
>
> If the whole ResultSet gets stashed in memory, this isn't really surprising,
> but I'm wondering why this happens (if it does), rather than a subset around
> the current record being cached and other rows being retrieved as needed.
>
> If it turns out that there are good reasons for it to all be in memory, then
> my question is whether there is a better approach that people typically use
> in this situation. For now, I'm simply breaking up the select into smaller
> chunks, but that approach won't be satisfactory in the long run.
>
> Thanks
>
> -Nick
>
> --------------------------------------------------------------------------
> Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
> Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>
>


From: Barry Lind <barry(at)xythos(dot)com>
To: nickf(at)ontko(dot)com
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-10 16:40:19
Message-ID: 3DA5AD73.7090502@xythos.com

Nick,

This has been discussed before on this list many times. But the short
answer is that that is how the postgres server handles queries. If you
issue a query the server will return the entire result. (try the same
query in psql and you will have the same problem). To work around this
you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
sql commands for postgres).
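
A minimal sketch of those commands (the cursor and table names are placeholders; a cursor only lives inside a transaction):

```sql
BEGIN;
DECLARE bigcur CURSOR FOR SELECT * FROM a;
FETCH 1000 FROM bigcur;   -- returns the first 1000 rows
FETCH 1000 FROM bigcur;   -- returns the next 1000
MOVE 5000 IN bigcur;      -- skips ahead without returning rows
CLOSE bigcur;
COMMIT;
```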

thanks,
--Barry

Nick Fankhauser wrote:
> I'm selecting a huge ResultSet from our database- about one million rows,
> with one of the fields being varchar(500). I get an out of memory error from
> java.
>
> If the whole ResultSet gets stashed in memory, this isn't really surprising,
> but I'm wondering why this happens (if it does), rather than a subset around
> the current record being cached and other rows being retrieved as needed.
>
> If it turns out that there are good reasons for it to all be in memory, then
> my question is whether there is a better approach that people typically use
> in this situation. For now, I'm simply breaking up the select into smaller
> chunks, but that approach won't be satisfactory in the long run.
>
> Thanks
>
> -Nick
>
> --------------------------------------------------------------------------
> Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
> Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
>
>
>


From: snpe <snpe(at)snpe(dot)co(dot)yu>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 12:27:40
Message-ID: 200210111427.41955.snpe@snpe.co.yu

Barry,
Is it true? I created a table with one varchar(500) column and inserted 1
million rows of 10-20 characters each. The JDBC query 'select * from a'
gets an 'out of memory' error, but psql does not. I inserted 8 million
rows and psql still works (slowly, but it works).

The C library can execute a query without fetching all the rows; in JDBC,
executeQuery fetches everything, and that is the problem. I think
executeQuery should only prepare the query, and the fetch
(ResultSet.next or similar) should retrieve only fetchSize rows at a time.
I am not sure, but I think the problem is in JDBC, not PostgreSQL.
Hackers? Does psql fetch all rows, and if not, how many? Can I change the
fetch size in psql? CURSOR, FETCH and MOVE are not a solution: if I use
JDBC from a third-party IDE, I can't force that approach.

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> Nick,
>
> This has been discussed before on this list many times. But the short
> answer is that that is how the postgres server handles queries. If you
> issue a query the server will return the entire result. (try the same
> query in psql and you will have the same problem). To work around this
> you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> sql commands for postgres).
>
> thanks,
> --Barry
>
> Nick Fankhauser wrote:
> > I'm selecting a huge ResultSet from our database- about one million rows,
> > with one of the fields being varchar(500). I get an out of memory error
> > from java.
> >
> > If the whole ResultSet gets stashed in memory, this isn't really
> > surprising, but I'm wondering why this happens (if it does), rather than
> > a subset around the current record being cached and other rows being
> > retrieved as needed.
> >
> > If it turns out that there are good reasons for it to all be in memory,
> > then my question is whether there is a better approach that people
> > typically use in this situation. For now, I'm simply breaking up the
> > select into smaller chunks, but that approach won't be satisfactory in
> > the long run.
> >
> > Thanks
> >
> > -Nick
> >
> > -------------------------------------------------------------------------
> >- Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > http://www.ontko.com/
> >
> >


From: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: MySQL vs PostgreSQL.
Date: 2002-10-11 13:20:22
Message-ID: Pine.GSO.4.44.0210111535560.2057-100000@paju.oulu.fi


Check out:

http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

MySQL AB compares MySQL with PostgreSQL.

Quoted from one page
> Because we couldn't get vacuum() to work reliable with PostgreSQL 7.1.1,
> we haven't been able to generate a --fast version of the benchmarks yet
> (where we would have done a vacuum() at critical places in the benchmark
> to get better performance for PostgreSQL). We will do a new run of the
> benchmarks as soon as the PostgreSQL developers can point out what we
> have done wrong or have fixed vacuum() so that it works again.

and from another.

> Drawbacks with PostgreSQL compared to MySQL Server:
>
> VACUUM makes PostgreSQL hard to use in a 24/7 environment.

They also state that they have a more sophisticated ALTER TABLE...

The only usable feature in their ALTER TABLE that doesn't (yet) exist in
PostgreSQL is changing the column order (OK, an ORDER BY in table creation
could be nice), and that's still almost purely cosmetic. Anyway, I could
have used that command yesterday. Could this be added to pgsql?

MySQL supports data compression between front and back ends. This could be
easily implemented, or is it already supported?

I think all the other statements were misleading in the sense that they
compared their newest product with PostgreSQL 7.1.1.

There's also following line:

> PostgreSQL currently offers the following advantages over MySQL Server:

After which there's only one empty line.

> Note that because we know the MySQL road map, we have included in the
> following table the version when MySQL Server should support this
> feature. Unfortunately we couldn't do this for
> previous comparisons, because we don't know the PostgreSQL roadmap.

They could be provided one... ;-)

> Upgrading MySQL Server is painless. When you are upgrading MySQL Server,
> you don't need to dump/restore your data, as you have to do with most
> PostgreSQL upgrades.

Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my
linux box.

Of course PostgreSQL isn't yet as fast as it could be. ;)

--
Antti Haapala


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 13:36:52
Message-ID: 1034343413.5812.6.camel@jester

On Fri, 2002-10-11 at 09:20, Antti Haapala wrote:
>
> Check out:
>
> http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html
>
> MySQL AB compares MySQL with PostgreSQL.

I wouldn't look too far into these at all. I've tried to get
' " as identifier quote (ANSI SQL) ' corrected on the crash-me pages for
us a couple of times (they say we don't support it for some reason).

I've not looked, but I thought 7.1 supported rename table as well.

Anyway, the max table row length was wrong for 7.1 too, unless I'm
confused as to what a blob is (are text and varchar blobs -- what about
your own 10MB fixed-length datatype -- how about a huge array of
integers, if the previous are considered blobs?).

--
Rod Taylor


From: "Shridhar Daithankar" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 13:38:23
Message-ID: 3DA721A7.12077.A0A79DE@localhost

On 11 Oct 2002 at 16:20, Antti Haapala wrote:

> Check out:
> http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

Well, I guess there are many threads on this. You can dig around archives..
> > Upgrading MySQL Server is painless. When you are upgrading MySQL Server,
> > you don't need to dump/restore your data, as you have to do with most
> > PostgreSQL upgrades.
>
> Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my
> linux box.

Well, that remains a valid point. Imagine a 100GB database on a 150GB disk
array: how do you dump and reload? In-place conversion of data is an
absolutely necessary feature, and it's already on the TODO list.


> Of course PostgreSQL isn't yet as fast as it could be. ;)

Check a few posts I have made in the last three weeks. You will find that
PostgreSQL is fast enough to surpass MySQL in what are considered MySQL
strongholds. Of course it's not an easy win, but PostgreSQL is certainly
not slow.

And for the vacuum thing, I have written an autovacuum daemon that can
automatically vacuum databases depending upon their activity. Check it out
at gborg.postgresql.org. (I didn't mean this as an advertisement for
myself, but it looks like one.)

Let this thread rest. PostgreSQL certainly needs a marketing hand, but
refuting the claims in that article is not the best way to start. I guess
most hackers would agree with this.

Bye
Shridhar

--
Cat, n.: Lapwarmer with built-in buzzer.


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:30:30
Message-ID: 1034346630.1796.74.camel@inspiron.cramers

No disadvantage, in fact that is what we would like to do.

setFetchSize(size) turns on cursor support, otherwise fetch normally

Dave

On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:
> What would be the disadvantage of making the JDBC driver use a
> cursor under the covers (always)? Is it significantly slower or more
> resource-intensive than fetching all the data at once? Certainly it seems
> like it would save memory in some cases.
>
> Aaron
>
> On 10 Oct 2002, Dave Cramer wrote:
> > Nick,
> >
> > Use a cursor, the current driver doesn't support caching, the backend
> > gives you everything you ask for, you can't just say you want a limited
> > set.
> >
> > So if you use cursors you can fetch a subset
> >
> > Dave
> > On Thu, 2002-10-10 at 11:24, Nick Fankhauser wrote:
> > >
> > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > with one of the fields being varchar(500). I get an out of memory error from
> > > java.
> > >
> > > If the whole ResultSet gets stashed in memory, this isn't really surprising,
> > > but I'm wondering why this happens (if it does), rather than a subset around
> > > the current record being cached and other rows being retrieved as needed.
> > >
> > > If it turns out that there are good reasons for it to all be in memory, then
> > > my question is whether there is a better approach that people typically use
> > > in this situation. For now, I'm simply breaking up the select into smaller
> > > chunks, but that approach won't be satisfactory in the long run.
> > >
> > > Thanks
> > >
> > > -Nick
> > >
> > > --------------------------------------------------------------------------
> > > Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
> > > Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
> > >
> > >
>
>


From: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:30:49
Message-ID: Pine.LNX.4.44.0210111029160.695-100000@www.princetongames.org

What would be the disadvantage of making the JDBC driver use a
cursor under the covers (always)? Is it significantly slower or more
resource-intensive than fetching all the data at once? Certainly it seems
like it would save memory in some cases.

Aaron

On 10 Oct 2002, Dave Cramer wrote:
> Nick,
>
> Use a cursor, the current driver doesn't support caching, the backend
> gives you everything you ask for, you can't just say you want a limited
> set.
>
> So if you use cursors you can fetch a subset
>
> Dave
> On Thu, 2002-10-10 at 11:24, Nick Fankhauser wrote:
> >
> > I'm selecting a huge ResultSet from our database- about one million rows,
> > with one of the fields being varchar(500). I get an out of memory error from
> > java.
> >
> > If the whole ResultSet gets stashed in memory, this isn't really surprising,
> > but I'm wondering why this happens (if it does), rather than a subset around
> > the current record being cached and other rows being retrieved as needed.
> >
> > If it turns out that there are good reasons for it to all be in memory, then
> > my question is whether there is a better approach that people typically use
> > in this situation. For now, I'm simply breaking up the select into smaller
> > chunks, but that approach won't be satisfactory in the long run.
> >
> > Thanks
> >
> > -Nick
> >
> > --------------------------------------------------------------------------
> > Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
> > Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
> >
> >
>


From: Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 14:42:12
Message-ID: 1034347333.4862.64.camel@mouse.copelandconsulting.net

On Fri, 2002-10-11 at 08:20, Antti Haapala wrote:
> Quoted from one page
> > Because we couldn't get vacuum() to work reliable with PostgreSQL 7.1.1,

I have little respect for the MySQL advocacy guys. They purposely
spread misinformation. They always compare their leading edge alpha
software against Postgres' year+ old stable versions. In some cases,
I've seen them compare their alpha (4.x) software against 7.0. Very sad
that these people can't even attempt to be honest.

In the case above, since they are comparing 4.x, they should be
comparing it to 7.x at least. It's also very sad that their testers
don't seem to understand something as simple as cron; if they can't
understand cron, I fear any conclusions they arrive at in their testing
are destined to be incorrect or invalid.

> MySQL supports data compression between front and back ends. This could be
> easily implemented, or is it already supported?

Mammoth has such a feature...or at least it's been in development for a
while. If I understood them correctly, it will be donated back to core
sometime in the 7.5 or 7.7 series. Last I heard, their results were
absolutely wonderful.

>
> I think all the other statements were misleading in the sense, that they
> compared their newest product with PostgreSQL 7.1.1.

Ya, historically, they go out of their way to ensure unfair
comparisons. I have no respect for them.

>
> They could be provided one... ;-)

In other words, they need a list of features that they can one day hope
to add to MySQL.

>
> > Upgrading MySQL Server is painless. When you are upgrading MySQL Server,
> > you don't need to dump/restore your data, as you have to do with most
> > PostgreSQL upgrades.
>
> Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my
> linux box.
>
> Of course PostgreSQL isn't yet as fast as it could be. ;)
>

I consider this par for the course. This is something I've had to do
with Sybase, Oracle and MSSQL.

Greg


From: "Michael Paesold" <mpaesold(at)gmx(dot)at>
To: "Dave Cramer" <Dave(at)micro-automation(dot)net>, "Aaron Mulder" <ammulder(at)alumni(dot)princeton(dot)edu>
Cc: "pgsql-jdbc" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:44:36
Message-ID: 00e601c27134$b9260570$4201a8c0@beeblebrox

Dave Cramer <Dave(at)micro-automation(dot)net> wrote:

> No disadvantage, in fact that is what we would like to do.
>
>
> setFetchSize(size) turns on cursor support, otherwise fetch normally
>
> Dave
>
> On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:
> > What would be the disadvantage of making the JDBC driver use a
> > cursor under the covers (always)? Is it significantly slower or more
> > resource-intensive than fetching all the data at once? Certainly it
seems
> > like it would save memory in some cases.
> >
> > Aaron

Well, using a cursor-based result set *always* is not going to work.
Cursors will not be held over a commit, whereas a buffered result set
will. Hence the setFetchSize approach.
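
What Michael describes maps onto standard JDBC roughly like this. A hedged sketch only: the connection URL, credentials, and table name are placeholders, the driver of the time did not yet implement cursor-backed fetching, and it requires a live database. Autocommit must be off, and the whole read must happen inside one transaction, because commit destroys the cursor:

```java
import java.sql.*;

public class CursorFetch {
    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "password");
        try {
            conn.setAutoCommit(false);     // keep the cursor alive across fetches
            Statement st = conn.createStatement();
            st.setFetchSize(1000);         // ask for 1000 rows per round trip
            ResultSet rs = st.executeQuery("SELECT * FROM a");
            while (rs.next()) {
                String v = rs.getString(1);  // only ~one batch is held in memory
            }
            rs.close();
            st.close();
        } finally {
            conn.commit();                 // the cursor is closed here
            conn.close();
        }
    }
}
```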

Regards,
Michael Paesold


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: Michael Paesold <mpaesold(at)gmx(dot)at>
Cc: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:48:41
Message-ID: 1034347722.2777.81.camel@inspiron.cramers

Michael,

You are correct, commit will effectively close the cursor.

This is the only way to deal with large result sets however.

Dave
On Fri, 2002-10-11 at 10:44, Michael Paesold wrote:
> Dave Cramer <Dave(at)micro-automation(dot)net> wrote:
>
> > No disadvantage, in fact that is what we would like to do.
> >
> >
> > setFetchSize(size) turns on cursor support, otherwise fetch normally
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:
> > > What would be the disadvantage of making the JDBC driver use a
> > > cursor under the covers (always)? Is it significantly slower or more
> > > resource-intensive than fetching all the data at once? Certainly it
> seems
> > > like it would save memory in some cases.
> > >
> > > Aaron
>
> Well, using a cursor based result set *always* is not going to work. Cursors
> will not be held over a commit, whereas a buffer result set will. So the
> setFetchSize..
>
> Regards,
> Michael Paesold
>
>


From: nferrier(at)tapsellferrier(dot)co(dot)uk
To: Dave Cramer <Dave(at)micro-automation(dot)net>
Cc: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:52:18
Message-ID: u7kgpxc2l.fsf@tapsellferrier.co.uk

Dave Cramer <Dave(at)micro-automation(dot)net> writes:

> No disadvantage, in fact that is what we would like to do.
>
>
> setFetchSize(size) turns on cursor support, otherwise fetch
> normally

I love PostgreSQL's default behaviour: it's great to know that a
server-side cursor resource is probably not hanging around for simple
queries.

Nic


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: nferrier(at)tapsellferrier(dot)co(dot)uk
Cc: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:52:34
Message-ID: 1034347955.3461.83.camel@inspiron.cramers

Just so you know this isn't implemented yet. My reference to
setFetchSize below was just a suggestion as to how to implement it

Dave
On Fri, 2002-10-11 at 10:52, nferrier(at)tapsellferrier(dot)co(dot)uk wrote:
> Dave Cramer <Dave(at)micro-automation(dot)net> writes:
>
> > No disadvantage, in fact that is what we would like to do.
> >
> >
> > setFetchSize(size) turns on cursor support, otherwise fetch
> > normally
>
> I love PostgreSQL's default behaviour: it's great to know
> that a server side cursor resource is probably not hanging around for
> simple querys.
>
>
>
> Nic
>
>
>
>


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: nferrier(at)tapsellferrier(dot)co(dot)uk
Cc: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:55:54
Message-ID: 1034348155.3461.85.camel@inspiron.cramers

Nic,

That would be great!

Dave
On Fri, 2002-10-11 at 10:57, nferrier(at)tapsellferrier(dot)co(dot)uk wrote:
> Dave Cramer <Dave(at)micro-automation(dot)net> writes:
>
> > Just so you know this isn't implemented yet. My reference to
> > setFetchSize below was just a suggestion as to how to implement it
>
> I'll do it if you like.
>
>
> Nic
>
>
>
>


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Rod Taylor <rbt(at)rbt(dot)ca>
Cc: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 14:57:15
Message-ID: 3DA6E6CB.DE0BD1EE@Yahoo.com

Rod Taylor wrote:
>
> On Fri, 2002-10-11 at 09:20, Antti Haapala wrote:
> >
> > Check out:
> >
> > http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html
> >
> > MySQL AB compares MySQL with PostgreSQL.
>
> I wouldn't look too far into these at all. I've tried to get
> ' " as identifier quote (ANSI SQL) ' corrected on the crash-me pages for
> us a couple of times (they say we don't support it for some reason).

It's once again the typical MySQL propaganda. As usual they compare a
future version of MySQL against an old release of PostgreSQL, and they
compare only at the buzzword level. Do their foreign keys have
referential actions and deferrability? Is log-based master-slave
replication all there can be?

And we surely do have something that compares to *their* roadmap; that
they cannot find it is because it's named HISTORY.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: nferrier(at)tapsellferrier(dot)co(dot)uk
To: Dave Cramer <Dave(at)micro-automation(dot)net>
Cc: nferrier(at)tapsellferrier(dot)co(dot)uk, Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 14:57:49
Message-ID: u4rbtxbte.fsf@tapsellferrier.co.uk

Dave Cramer <Dave(at)micro-automation(dot)net> writes:

> Just so you know this isn't implemented yet. My reference to
> setFetchSize below was just a suggestion as to how to implement it

I'll do it if you like.

Nic


From: Doug Fields <dfields-postgres(at)pexicom(dot)com>
To: snpe <snpe(at)snpe(dot)co(dot)yu>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 15:44:23
Message-ID: 5.1.0.14.2.20021011114023.02871348@pop.pexicom.com

At 08:27 AM 10/11/2002, snpe wrote:
>Barry,
> Is it true ?
>I create table with one column varchar(500) and enter 1 milion rows with
>length 10-20 character.JDBC query 'select * from a' get error 'out of
>memory', but psql not.
>I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor, but
possibly mandated by the JDBC design specs.

It reads the entire result set from the database backend and caches it in
a horrible Vector (which should really be a List, and which should at
least attempt to learn the number of rows ahead of time to avoid all the
resizing).

Then, it doles it out from memory as you go through the ResultSet with the
next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
WHOLE THING - through the result set as each row is returned from the
backend, thus ensuring that you never use much more memory than one row,
EVEN IF you have to keep the connection locked.

The latter is what I expected it to do. The former is what it does. So it
forces you to turn EVERY SELECT query which you think has more than a few
rows (or which COULD have more than a few rows, "few" being defined by
your VM memory limits) into a cursor-based query. Really kludgy. I intend
to write a class to do that for every SELECT query for me automatically.
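
The buffered-versus-streamed contrast Doug describes can be illustrated without a database at all. The sketch below is ours, not the driver's (all names are hypothetical): `fetchAll` materializes every row in a Vector up front, as the driver does, while `stream` produces rows lazily so peak memory stays constant in the row count:

```java
import java.util.Iterator;
import java.util.Vector;

public class ResultBuffering {

    // Buffered: all rows live in memory before the first one is consumed.
    static Vector<String> fetchAll(int rows) {
        Vector<String> buffer = new Vector<>();
        for (int i = 0; i < rows; i++) {
            buffer.add("row-" + i);
        }
        return buffer;
    }

    // Streamed: each row is created only when the consumer asks for it.
    static Iterator<String> stream(final int rows) {
        return new Iterator<String>() {
            int next = 0;
            public boolean hasNext() { return next < rows; }
            public String next() { return "row-" + next++; }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    public static void main(String[] args) {
        int n = 100000;
        int buffered = fetchAll(n).size();   // n strings allocated at once

        int streamed = 0;                    // roughly one string alive at a time
        for (Iterator<String> it = stream(n); it.hasNext(); it.next()) {
            streamed++;
        }
        System.out.println(buffered == streamed); // same rows, very different peak memory
    }
}
```

Either way the consumer sees the same rows; the difference is when they are materialized, which is exactly why a million varchar(500) rows blow the heap in the buffered design.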

Cheers,

Doug

>In C library is 'execute query' without fetch - in jdbc execute fetch all
>rows
>and this is problem - I think that executequery must prepare query and fetch
>(ResultSet.next or ...) must fetch only fetchSize rows.
>I am not sure, but I think that is problem with jdbc, not postgresql
>Hackers ?
>Does psql fetch all rows and if not how many ?
>Can I change fetch size in psql ?
>CURSOR , FETCH and MOVE isn't solution.
>If I use jdbc in third-party IDE, I can't force this solution
>
>regards
>
>On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > Nick,
> >
> > This has been discussed before on this list many times. But the short
> > answer is that that is how the postgres server handles queries. If you
> > issue a query the server will return the entire result. (try the same
> > query in psql and you will have the same problem). To work around this
> > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > sql commands for postgres).
> >
> > thanks,
> > --Barry
> >
> > Nick Fankhauser wrote:
> > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > with one of the fields being varchar(500). I get an out of memory error
> > > from java.
> > >
> > > If the whole ResultSet gets stashed in memory, this isn't really
> > > surprising, but I'm wondering why this happens (if it does), rather than
> > > a subset around the current record being cached and other rows being
> > > retrieved as needed.
> > >
> > > If it turns out that there are good reasons for it to all be in memory,
> > > then my question is whether there is a better approach that people
> > > typically use in this situation. For now, I'm simply breaking up the
> > > select into smaller chunks, but that approach won't be satisfactory in
> > > the long run.
> > >
> > > Thanks
> > >
> > > -Nick
> > >
> > > -------------------------------------------------------------------------
> > >- Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > http://www.ontko.com/
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 5: Have you checked our extensive FAQ?
> > >
> > > http://www.postgresql.org/users-lounge/docs/faq.html
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: Doug Fields <dfields-postgres(at)pexicom(dot)com>
Cc: snpe <snpe(at)snpe(dot)co(dot)yu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 15:58:04
Message-ID: 1034351885.2777.101.camel@inspiron.cramers
Lists: pgsql-general pgsql-hackers pgsql-jdbc

This really is an artifact of the way that postgres gives us the data.

When you query the backend you get *all* of the results of the query,
and there is no indication of how many results you are going to get. In
simple selects it would be possible to get some idea by using
count(field), but this wouldn't work in enough cases to be useful. So
that leaves us with cursors, which still won't tell you how many rows
you are getting back, but at least you won't have the memory problems.
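At the SQL level, the cursor-based workaround looks roughly like this (a sketch; `myportal` and `bigtable` are illustrative names):

```sql
BEGIN;  -- cursors only live inside a transaction block
DECLARE myportal CURSOR FOR SELECT * FROM bigtable;
FETCH 1000 FROM myportal;  -- first batch of rows
FETCH 1000 FROM myportal;  -- next batch; repeat until no rows come back
MOVE 100 IN myportal;      -- skip ahead without transferring rows
CLOSE myportal;
COMMIT;
```

Each FETCH transfers only one batch, so the client never has to hold the whole result set in memory.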

This approach is far from trivial, which is why it hasn't been
implemented yet; keep in mind that result sets support operations like
move(n), first(), and last(), the last of which will be the trickiest.
Not to mention updateable result sets.

As it turns out there is a mechanism to move to the end of a cursor
(MOVE 0), but that behaviour is currently considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> At 08:27 AM 10/11/2002, snpe wrote:
> >Barry,
> > Is it true ?
> >I create table with one column varchar(500) and enter 1 milion rows with
> >length 10-20 character.JDBC query 'select * from a' get error 'out of
> >memory', but psql not.
> >I insert 8 milion rows and psql work fine yet (slow, but work)
>
> The way the code works in JDBC is, in my opinion, a little poor but
> possibly mandated by JDBC design specs.
>
> It reads the entire result set from the database backend and caches it in a
> horrible Vector (which should really be a List and which should at least
> make an attempt to get the # of rows ahead of time to avoid all the
> resizing problems).
>
> Then, it doles it out from memory as you go through the ResultSet with the
> next() method.
>
> I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> WHOLE THING - through the result set as each row is returned from the
> backend, thus ensuring that you never use much more memory than one line.
> EVEN IF you have to keep the connection locked.
>
> The latter is what I expected it to do. The former is what it does. So, it
> necessitates you creating EVERY SELECT query which you think has more than
> a few rows (or which you think COULD have more than a few rows, "few" being
> defined by our VM memory limits) into a cursor based query. Really klugy. I
> intend to write a class to do that for every SELECT query for me automatically.
>
> Cheers,
>
> Doug
>
>
> >In C library is 'execute query' without fetch - in jdbc execute fetch all
> >rows
> >and this is problem - I think that executequery must prepare query and fetch
> >(ResultSet.next or ...) must fetch only fetchSize rows.
> >I am not sure, but I think that is problem with jdbc, not postgresql
> >Hackers ?
> >Does psql fetch all rows and if not how many ?
> >Can I change fetch size in psql ?
> >CURSOR , FETCH and MOVE isn't solution.
> >If I use jdbc in third-party IDE, I can't force this solution
> >
> >regards
> >
> >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > Nick,
> > >
> > > This has been discussed before on this list many times. But the short
> > > answer is that that is how the postgres server handles queries. If you
> > > issue a query the server will return the entire result. (try the same
> > > query in psql and you will have the same problem). To work around this
> > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > sql commands for postgres).
> > >
> > > thanks,
> > > --Barry
> > >
> > > Nick Fankhauser wrote:
> > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > with one of the fields being varchar(500). I get an out of memory error
> > > > from java.
> > > >
> > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > a subset around the current record being cached and other rows being
> > > > retrieved as needed.
> > > >
> > > > If it turns out that there are good reasons for it to all be in memory,
> > > > then my question is whether there is a better approach that people
> > > > typically use in this situation. For now, I'm simply breaking up the
> > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > the long run.
> > > >
> > > > Thanks
> > > >
> > > > -Nick
> > > >
> > > > -------------------------------------------------------------------------
> > > >- Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > > http://www.ontko.com/
> > > >
> > > >


From: Dave Cramer <dave(at)fastcrypt(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: move 0 behaviour
Date: 2002-10-11 16:04:31
Message-ID: 1034352271.2777.105.camel@inspiron.cramers
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Currently there is a TODO list item to have MOVE 0 not position to the
end of the cursor.

Moving to the end of the cursor is useful; can we keep the behaviour but
spell it MOVE END, or should we just leave it the way it is?
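For reference, the disputed behaviour looks like this (a sketch; the cursor and table names are illustrative):

```sql
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM bigtable;
MOVE 0 IN c;              -- today: jumps to the end of the cursor
FETCH BACKWARD 1 FROM c;  -- so this should return the last row
CLOSE c;
COMMIT;
```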

Dave


From: snpe <snpe(at)snpe(dot)co(dot)yu>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 16:48:41
Message-ID: 200210111848.41337.snpe@snpe.co.yu
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Hello,
Does this mean that psql uses cursors?

regards
Haris Peco
On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> This really is an artifact of the way that postgres gives us the data.
>
> When you query the backend you get *all* of the results in the query,
> and there is no indication of how many results you are going to get. In
> simple selects it would be possible to get some idea by using
> count(field), but this wouldn't work nearly enough times to make it
> useful. So that leaves us with using cursors, which still won't tell you
> how many rows you are getting back, but at least you won't have the
> memory problems.
>
> This approach is far from trivial which is why it hasn't been
> implemented as of yet, keep in mind that result sets support things like
> move(n), first(), last(), the last of which will be the trickiest. Not
> to mention updateable result sets.
>
> As it turns out there is a mechanism to get to the end move 0 in
> 'cursor', which currently is being considered a bug.
>
> Dave
>
> On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > At 08:27 AM 10/11/2002, snpe wrote:
> > >Barry,
> > > Is it true ?
> > >I create table with one column varchar(500) and enter 1 milion rows with
> > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > >memory', but psql not.
> > >I insert 8 milion rows and psql work fine yet (slow, but work)
> >
> > The way the code works in JDBC is, in my opinion, a little poor but
> > possibly mandated by JDBC design specs.
> >
> > It reads the entire result set from the database backend and caches it in
> > a horrible Vector (which should really be a List and which should at
> > least make an attempt to get the # of rows ahead of time to avoid all the
> > resizing problems).
> >
> > Then, it doles it out from memory as you go through the ResultSet with
> > the next() method.
> >
> > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > WHOLE THING - through the result set as each row is returned from the
> > backend, thus ensuring that you never use much more memory than one line.
> > EVEN IF you have to keep the connection locked.
> >
> > The latter is what I expected it to do. The former is what it does. So,
> > it necessitates you creating EVERY SELECT query which you think has more
> > than a few rows (or which you think COULD have more than a few rows,
> > "few" being defined by our VM memory limits) into a cursor based query.
> > Really klugy. I intend to write a class to do that for every SELECT query
> > for me automatically.
> >
> > Cheers,
> >
> > Doug
> >
> > >In C library is 'execute query' without fetch - in jdbc execute fetch
> > > all rows
> > >and this is problem - I think that executequery must prepare query and
> > > fetch (ResultSet.next or ...) must fetch only fetchSize rows.
> > >I am not sure, but I think that is problem with jdbc, not postgresql
> > >Hackers ?
> > >Does psql fetch all rows and if not how many ?
> > >Can I change fetch size in psql ?
> > >CURSOR , FETCH and MOVE isn't solution.
> > >If I use jdbc in third-party IDE, I can't force this solution
> > >
> > >regards
> > >
> > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > Nick,
> > > >
> > > > This has been discussed before on this list many times. But the
> > > > short answer is that that is how the postgres server handles queries.
> > > > If you issue a query the server will return the entire result. (try
> > > > the same query in psql and you will have the same problem). To work
> > > > around this you can use explicit cursors (see the DECLARE CURSOR,
> > > > FETCH, and MOVE sql commands for postgres).
> > > >
> > > > thanks,
> > > > --Barry
> > > >
> > > > Nick Fankhauser wrote:
> > > > > I'm selecting a huge ResultSet from our database- about one million
> > > > > rows, with one of the fields being varchar(500). I get an out of
> > > > > memory error from java.
> > > > >
> > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > surprising, but I'm wondering why this happens (if it does), rather
> > > > > than a subset around the current record being cached and other rows
> > > > > being retrieved as needed.
> > > > >
> > > > > If it turns out that there are good reasons for it to all be in
> > > > > memory, then my question is whether there is a better approach that
> > > > > people typically use in this situation. For now, I'm simply
> > > > > breaking up the select into smaller chunks, but that approach won't
> > > > > be satisfactory in the long run.
> > > > >
> > > > > Thanks
> > > > >
> > > > > -Nick
> > > > >
> > > > > -------------------------------------------------------------------
> > > > >------ - Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > > > http://www.ontko.com/
> > > > >
> > > > >


From: Barry Lind <barry(at)xythos(dot)com>
To: Doug Fields <dfields-postgres(at)pexicom(dot)com>
Cc: snpe <snpe(at)snpe(dot)co(dot)yu>, pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 16:55:57
Message-ID: 3DA7029D.5080001@xythos.com
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Doug Fields wrote:

>
> It reads the entire result set from the database backend and caches it
> in a horrible Vector (which should really be a List and which should at
> least make an attempt to get the # of rows ahead of time to avoid all
> the resizing problems).
>

The problem here is that we would then need two completely different
implementations for jdbc1 and jdbc2/3 since List is not part of jdk1.1.
We could build our own List implementation that works on jdk1.1, but I
am not sure the gain in performance is worth it. If you could do some
testing and come back with some numbers of the differences in
performance between ResultSets implemented with Vectors and Lists that
would probably give us enough information to gauge how to proceed on
this suggested improvement.

> Then, it doles it out from memory as you go through the ResultSet with
> the next() method.
>
> I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
> THE WHOLE THING - through the result set as each row is returned from
> the backend, thus ensuring that you never use much more memory than one
> line. EVEN IF you have to keep the connection locked.
>

This had actually been tried in the past (just getting the records from
the server connection as requested), but this behavior violates the spec
and broke many people's applications. The problem is that if you don't
use cursors, you end up tying up the connection until you finish
fetching all rows. So code like the following no longer works:

ResultSet rs = stmt.executeQuery(query);  // get result set
while (rs.next()) {
    String value = rs.getString(1);       // get some values from the result
    ps.setString(1, value);               // use them to update/insert some other
    ps.executeUpdate();                   // table using a PreparedStatement
}

Since the connection is locked until all the results are fetched, you
can't use the connection to perform the update/insert you want to do for
each iteration of the loop.

> The latter is what I expected it to do. The former is what it does. So,
> it necessitates you creating EVERY SELECT query which you think has more
> than a few rows (or which you think COULD have more than a few rows,
> "few" being defined by our VM memory limits) into a cursor based query.
> Really klugy. I intend to write a class to do that for every SELECT
> query for me automatically.
>
> Cheers,
>
> Doug
>

--Barry


From: Aaron Mulder <ammulder(at)alumni(dot)princeton(dot)edu>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 17:05:37
Message-ID: Pine.LNX.4.44.0210111303060.1260-100000@www.princetongames.org
Lists: pgsql-general pgsql-hackers pgsql-jdbc

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous features
with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:
> This really is an artifact of the way that postgres gives us the data.
>
> When you query the backend you get *all* of the results in the query,
> and there is no indication of how many results you are going to get. In
> simple selects it would be possible to get some idea by using
> count(field), but this wouldn't work nearly enough times to make it
> useful. So that leaves us with using cursors, which still won't tell you
> how many rows you are getting back, but at least you won't have the
> memory problems.
>
> This approach is far from trivial which is why it hasn't been
> implemented as of yet, keep in mind that result sets support things like
> move(n), first(), last(), the last of which will be the trickiest. Not
> to mention updateable result sets.
>
> As it turns out there is a mechanism to get to the end move 0 in
> 'cursor', which currently is being considered a bug.
>
> Dave
>
> On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > At 08:27 AM 10/11/2002, snpe wrote:
> > >Barry,
> > > Is it true ?
> > >I create table with one column varchar(500) and enter 1 milion rows with
> > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > >memory', but psql not.
> > >I insert 8 milion rows and psql work fine yet (slow, but work)
> >
> > The way the code works in JDBC is, in my opinion, a little poor but
> > possibly mandated by JDBC design specs.
> >
> > It reads the entire result set from the database backend and caches it in a
> > horrible Vector (which should really be a List and which should at least
> > make an attempt to get the # of rows ahead of time to avoid all the
> > resizing problems).
> >
> > Then, it doles it out from memory as you go through the ResultSet with the
> > next() method.
> >
> > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > WHOLE THING - through the result set as each row is returned from the
> > backend, thus ensuring that you never use much more memory than one line.
> > EVEN IF you have to keep the connection locked.
> >
> > The latter is what I expected it to do. The former is what it does. So, it
> > necessitates you creating EVERY SELECT query which you think has more than
> > a few rows (or which you think COULD have more than a few rows, "few" being
> > defined by our VM memory limits) into a cursor based query. Really klugy. I
> > intend to write a class to do that for every SELECT query for me automatically.
> >
> > Cheers,
> >
> > Doug
> >
> >
> > >In C library is 'execute query' without fetch - in jdbc execute fetch all
> > >rows
> > >and this is problem - I think that executequery must prepare query and fetch
> > >(ResultSet.next or ...) must fetch only fetchSize rows.
> > >I am not sure, but I think that is problem with jdbc, not postgresql
> > >Hackers ?
> > >Does psql fetch all rows and if not how many ?
> > >Can I change fetch size in psql ?
> > >CURSOR , FETCH and MOVE isn't solution.
> > >If I use jdbc in third-party IDE, I can't force this solution
> > >
> > >regards
> > >
> > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > Nick,
> > > >
> > > > This has been discussed before on this list many times. But the short
> > > > answer is that that is how the postgres server handles queries. If you
> > > > issue a query the server will return the entire result. (try the same
> > > > query in psql and you will have the same problem). To work around this
> > > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > > sql commands for postgres).
> > > >
> > > > thanks,
> > > > --Barry
> > > >
> > > > Nick Fankhauser wrote:
> > > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > > with one of the fields being varchar(500). I get an out of memory error
> > > > > from java.
> > > > >
> > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > > a subset around the current record being cached and other rows being
> > > > > retrieved as needed.
> > > > >
> > > > > If it turns out that there are good reasons for it to all be in memory,
> > > > > then my question is whether there is a better approach that people
> > > > > typically use in this situation. For now, I'm simply breaking up the
> > > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > > the long run.
> > > > >
> > > > > Thanks
> > > > >
> > > > > -Nick
> > > > >
> > > > > -------------------------------------------------------------------------
> > > > >- Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > > > http://www.ontko.com/
> > > > >
> > > > >


From: Dror Matalon <dror(at)zapatec(dot)com>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 17:12:31
Message-ID: 20021011171231.GZ74492@four.zapatec.com
Lists: pgsql-general pgsql-hackers pgsql-jdbc


Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
perspective, if I want last(), I do a count(*) and then a fetch with
LIMIT; not quite the same, but all these methods of fetching the whole
data locally and manipulating it to a large extent defeat the purpose.
Let the backend do the work, instead of trying to replicate the
functionality in JDBC.
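Sketched in SQL (table and column names are illustrative), the application-side approach is a count followed by paged fetches:

```sql
SELECT count(*) FROM bigtable;  -- total number of rows, i.e. where last() is

-- then page through the results; ORDER BY keeps the pages stable
SELECT * FROM bigtable ORDER BY id LIMIT 100 OFFSET 200;  -- rows 201-300
```

One caveat: unless both statements run in the same transaction, the count can be stale by the time the pages are fetched.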

That said I do understand that some of these are required by the JDBC 2.0
spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> It wouldn't be bad to start with a naive implementation of
> last()... If the only problem we have is that last() doesn't perform
> well, we're probably making good progress. :)
> On the other hand, I would think the updateable result sets would
> be the most challenging; does the server provide any analogous features
> with its cursors?
>
> Aaron
>
> On 11 Oct 2002, Dave Cramer wrote:
> > This really is an artifact of the way that postgres gives us the data.
> >
> > When you query the backend you get *all* of the results in the query,
> > and there is no indication of how many results you are going to get. In
> > simple selects it would be possible to get some idea by using
> > count(field), but this wouldn't work nearly enough times to make it
> > useful. So that leaves us with using cursors, which still won't tell you
> > how many rows you are getting back, but at least you won't have the
> > memory problems.
> >
> > This approach is far from trivial which is why it hasn't been
> > implemented as of yet, keep in mind that result sets support things like
> > move(n), first(), last(), the last of which will be the trickiest. Not
> > to mention updateable result sets.
> >
> > As it turns out there is a mechanism to get to the end move 0 in
> > 'cursor', which currently is being considered a bug.
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > At 08:27 AM 10/11/2002, snpe wrote:
> > > >Barry,
> > > > Is it true ?
> > > >I create table with one column varchar(500) and enter 1 milion rows with
> > > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > > >memory', but psql not.
> > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > >
> > > The way the code works in JDBC is, in my opinion, a little poor but
> > > possibly mandated by JDBC design specs.
> > >
> > > It reads the entire result set from the database backend and caches it in a
> > > horrible Vector (which should really be a List and which should at least
> > > make an attempt to get the # of rows ahead of time to avoid all the
> > > resizing problems).
> > >
> > > Then, it doles it out from memory as you go through the ResultSet with the
> > > next() method.
> > >
> > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > > WHOLE THING - through the result set as each row is returned from the
> > > backend, thus ensuring that you never use much more memory than one line.
> > > EVEN IF you have to keep the connection locked.
> > >
> > > The latter is what I expected it to do. The former is what it does. So, it
> > > necessitates you creating EVERY SELECT query which you think has more than
> > > a few rows (or which you think COULD have more than a few rows, "few" being
> > > defined by our VM memory limits) into a cursor based query. Really klugy. I
> > > intend to write a class to do that for every SELECT query for me automatically.
> > >
> > > Cheers,
> > >
> > > Doug
> > >
> > >
> > > >In C library is 'execute query' without fetch - in jdbc execute fetch all
> > > >rows
> > > >and this is problem - I think that executequery must prepare query and fetch
> > > >(ResultSet.next or ...) must fetch only fetchSize rows.
> > > >I am not sure, but I think that is problem with jdbc, not postgresql
> > > >Hackers ?
> > > >Does psql fetch all rows and if not how many ?
> > > >Can I change fetch size in psql ?
> > > >CURSOR , FETCH and MOVE isn't solution.
> > > >If I use jdbc in third-party IDE, I can't force this solution
> > > >
> > > >regards
> > > >
> > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > Nick,
> > > > >
> > > > > This has been discussed before on this list many times. But the short
> > > > > answer is that that is how the postgres server handles queries. If you
> > > > > issue a query the server will return the entire result. (try the same
> > > > > query in psql and you will have the same problem). To work around this
> > > > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > > > sql commands for postgres).
> > > > >
> > > > > thanks,
> > > > > --Barry
> > > > >
> > > > > Nick Fankhauser wrote:
> > > > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > > > with one of the fields being varchar(500). I get an out of memory error
> > > > > > from java.
> > > > > >
> > > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > > > a subset around the current record being cached and other rows being
> > > > > > retrieved as needed.
> > > > > >
> > > > > > If it turns out that there are good reasons for it to all be in memory,
> > > > > > then my question is whether there is a better approach that people
> > > > > > typically use in this situation. For now, I'm simply breaking up the
> > > > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > > > the long run.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > -Nick
> > > > > >
> > > > > > -------------------------------------------------------------------------
> > > > > >- Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > > > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > > > > http://www.ontko.com/
> > > > > >
> > > > > >

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: Dror Matalon <dror(at)zapatec(dot)com>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 17:59:41
Message-ID: 1034359185.3461.114.camel@inspiron.cramers
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of caching where,
when the user sets the fetch size to 10 for instance, we do the select,
and when they ask for next() we check to see whether we have these rows
in the cache, and go get them if necessary 10 at a time, possibly
keeping one set of ten behind where we are and one set of ten ahead of
where we are. So, recalling that ResultSets have absolute positioning,
as well as first() and last() positioning, we need the ability to move
with the minimum number of trips to the backend.
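That windowing idea can be sketched in plain Java (a hypothetical illustration, not the driver's actual code: `WindowedRowCache` and its methods are invented names, and a List stands in for the server-side cursor so the sketch runs without a database):

```java
import java.util.*;

// Hypothetical sketch (invented names, NOT the actual driver code): a row
// cache that pulls rows from the backend in windows of `fetchSize`, keeping
// roughly one window behind and one ahead of the current position, so memory
// use stays bounded no matter how large the result set is.
class WindowedRowCache {
    private final List<String> backend;       // stands in for the server cursor
    private final int fetchSize;
    private final Map<Integer, String> cache = new HashMap<>();
    private int fetches = 0;                  // round trips to the backend

    WindowedRowCache(List<String> backend, int fetchSize) {
        this.backend = backend;
        this.fetchSize = fetchSize;
    }

    // What next()/absolute(n) would call internally.
    String get(int row) {
        if (!cache.containsKey(row)) {
            fetchWindow(row);
        }
        return cache.get(row);
    }

    // Simulates "FETCH fetchSize FROM cursor" for the window containing `row`,
    // then evicts rows more than one window away from it.
    private void fetchWindow(int row) {
        int start = (row / fetchSize) * fetchSize;
        for (int i = start; i < Math.min(start + fetchSize, backend.size()); i++) {
            cache.put(i, backend.get(i));
        }
        fetches++;
        cache.keySet().removeIf(k -> k < start - fetchSize || k >= start + 2 * fetchSize);
    }

    int roundTrips() { return fetches; }
    int cachedRows() { return cache.size(); }
}

public class Demo {
    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) rows.add("row-" + i);

        WindowedRowCache rs = new WindowedRowCache(rows, 10);
        for (int i = 0; i < 100; i++) rs.get(i);  // a sequential next() scan

        System.out.println(rs.roundTrips());      // 10 fetches for 100 rows
        System.out.println(rs.cachedRows());      // only 20 rows held at the end
    }
}
```

A real implementation would issue FETCH/MOVE against a cursor instead of indexing a List, but the bookkeeping is the same: memory stays bounded by a few windows, and a sequential scan costs one round trip per window.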

As it turns out the MOVE command in postgres does support moving to the
end (MOVE 0); at the moment this is considered a bug and is on the TODO
list to be removed. I expect we can get some sort of implementation
which allows us to move to the end (MOVE END).

Dave

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
>
> Hi,
>
> I'm jumping in late into this discussion but ...
>
> In my mind a lot of these features break the model. From an application
> prespective, if I want to do last, I do a count(*) and then I do a fetch
> with limit; Not quite the same, but all these methods of fetching the
> whole data locally and manipulating it to a large exten defeat the
> purpose. Let the backend do the work, instead of trying to replicate the
> functionality in JDBC.
>
> That said I do understand that some of these are required by the JDBC 2.0
> spec.
>
> Dror
>
> On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> > It wouldn't be bad to start with a naive implementation of
> > last()... If the only problem we have is that last() doesn't perform
> > well, we're probably making good progress. :)
> > On the other hand, I would think the updateable result sets would
> > be the most challenging; does the server provide any analogous features
> > with its cursors?
> >
> > Aaron
> >
> > On 11 Oct 2002, Dave Cramer wrote:
> > > This really is an artifact of the way that postgres gives us the data.
> > >
> > > When you query the backend you get *all* of the results in the query,
> > > and there is no indication of how many results you are going to get. In
> > > simple selects it would be possible to get some idea by using
> > > count(field), but this wouldn't work nearly enough times to make it
> > > useful. So that leaves us with using cursors, which still won't tell you
> > > how many rows you are getting back, but at least you won't have the
> > > memory problems.
> > >
> > > This approach is far from trivial which is why it hasn't been
> > > implemented as of yet, keep in mind that result sets support things like
> > > move(n), first(), last(), the last of which will be the trickiest. Not
> > > to mention updateable result sets.
> > >
> > > As it turns out there is a mechanism to get to the end move 0 in
> > > 'cursor', which currently is being considered a bug.
> > >
> > > Dave
> > >
> > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > >Barry,
> > > > > Is it true ?
> > > > >I create table with one column varchar(500) and enter 1 milion rows with
> > > > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > > > >memory', but psql not.
> > > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > > >
> > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > possibly mandated by JDBC design specs.
> > > >
> > > > It reads the entire result set from the database backend and caches it in a
> > > > horrible Vector (which should really be a List and which should at least
> > > > make an attempt to get the # of rows ahead of time to avoid all the
> > > > resizing problems).
> > > >
> > > > Then, it doles it out from memory as you go through the ResultSet with the
> > > > next() method.
> > > >
> > > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > > > WHOLE THING - through the result set as each row is returned from the
> > > > backend, thus ensuring that you never use much more memory than one line.
> > > > EVEN IF you have to keep the connection locked.
> > > >
> > > > The latter is what I expected it to do. The former is what it does. So, it
> > > > necessitates you creating EVERY SELECT query which you think has more than
> > > > a few rows (or which you think COULD have more than a few rows, "few" being
> > > > defined by our VM memory limits) into a cursor based query. Really klugy. I
> > > > intend to write a class to do that for every SELECT query for me automatically.
> > > >
> > > > Cheers,
> > > >
> > > > Doug
> > > >
> > > >
> > > > >In C library is 'execute query' without fetch - in jdbc execute fetch all
> > > > >rows
> > > > >and this is problem - I think that executequery must prepare query and fetch
> > > > >(ResultSet.next or ...) must fetch only fetchSize rows.
> > > > >I am not sure, but I think that is problem with jdbc, not postgresql
> > > > >Hackers ?
> > > > >Does psql fetch all rows and if not how many ?
> > > > >Can I change fetch size in psql ?
> > > > >CURSOR , FETCH and MOVE isn't solution.
> > > > >If I use jdbc in third-party IDE, I can't force this solution
> > > > >
> > > > >regards
> > > > >
> > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > Nick,
> > > > > >
> > > > > > This has been discussed before on this list many times. But the short
> > > > > > answer is that that is how the postgres server handles queries. If you
> > > > > > issue a query the server will return the entire result. (try the same
> > > > > > query in psql and you will have the same problem). To work around this
> > > > > > you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
> > > > > > sql commands for postgres).
> > > > > >
> > > > > > thanks,
> > > > > > --Barry
> > > > > >
> > > > > > Nick Fankhauser wrote:
> > > > > > > I'm selecting a huge ResultSet from our database- about one million rows,
> > > > > > > with one of the fields being varchar(500). I get an out of memory error
> > > > > > > from java.
> > > > > > >
> > > > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > > > surprising, but I'm wondering why this happens (if it does), rather than
> > > > > > > a subset around the current record being cached and other rows being
> > > > > > > retrieved as needed.
> > > > > > >
> > > > > > > If it turns out that there are good reasons for it to all be in memory,
> > > > > > > then my question is whether there is a better approach that people
> > > > > > > typically use in this situation. For now, I'm simply breaking up the
> > > > > > > select into smaller chunks, but that approach won't be satisfactory in
> > > > > > > the long run.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > -Nick
> > > > > > >
> > > > > > > -------------------------------------------------------------------------
> > > > > > >- Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > > > > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > > > > > http://www.ontko.com/
> > > > > > >
> > > > > > >
> > > > > > > ---------------------------(end of broadcast)---------------------------
> > > > > > > TIP 5: Have you checked our extensive FAQ?
> > > > > > >
> > > > > > > http://www.postgresql.org/users-lounge/docs/faq.html
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> >
> >
>
> --
> Dror Matalon
> Zapatec Inc
> 1700 MLK Way
> Berkeley, CA 94709
> http://www.zapatec.com
>
>


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: snpe <snpe(at)snpe(dot)co(dot)yu>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 18:13:05
Message-ID: 1034359986.1796.116.camel@inspiron.cramers
Lists: pgsql-general pgsql-hackers pgsql-jdbc

No,

It doesn't have to store them, only display them

Dave
On Fri, 2002-10-11 at 12:48, snpe wrote:
> Hello,
> Does it mean that psql uses cursors ?
>
> regards
> Haris Peco
> On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> > This really is an artifact of the way that postgres gives us the data.
> >
> > When you query the backend you get *all* of the results in the query,
> > and there is no indication of how many results you are going to get. In
> > simple selects it would be possible to get some idea by using
> > count(field), but this wouldn't work nearly enough times to make it
> > useful. So that leaves us with using cursors, which still won't tell you
> > how many rows you are getting back, but at least you won't have the
> > memory problems.
> >
> > This approach is far from trivial which is why it hasn't been
> > implemented as of yet, keep in mind that result sets support things like
> > move(n), first(), last(), the last of which will be the trickiest. Not
> > to mention updateable result sets.
> >
> > As it turns out there is a mechanism to get to the end move 0 in
> > 'cursor', which currently is being considered a bug.
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > At 08:27 AM 10/11/2002, snpe wrote:
> > > >Barry,
> > > > Is it true ?
> > > >I create table with one column varchar(500) and enter 1 milion rows with
> > > >length 10-20 character.JDBC query 'select * from a' get error 'out of
> > > >memory', but psql not.
> > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > >
> > > The way the code works in JDBC is, in my opinion, a little poor but
> > > possibly mandated by JDBC design specs.
> > >
> > > It reads the entire result set from the database backend and caches it in
> > > a horrible Vector (which should really be a List and which should at
> > > least make an attempt to get the # of rows ahead of time to avoid all the
> > > resizing problems).
> > >
> > > Then, it doles it out from memory as you go through the ResultSet with
> > > the next() method.
> > >
> > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
> > > WHOLE THING - through the result set as each row is returned from the
> > > backend, thus ensuring that you never use much more memory than one line.
> > > EVEN IF you have to keep the connection locked.
> > >
> > > The latter is what I expected it to do. The former is what it does. So,
> > > it necessitates you creating EVERY SELECT query which you think has more
> > > than a few rows (or which you think COULD have more than a few rows,
> > > "few" being defined by our VM memory limits) into a cursor based query.
> > > Really klugy. I intend to write a class to do that for every SELECT query
> > > for me automatically.
> > >
> > > Cheers,
> > >
> > > Doug
> > >
> > > >In C library is 'execute query' without fetch - in jdbc execute fetch
> > > > all rows
> > > >and this is problem - I think that executequery must prepare query and
> > > > fetch (ResultSet.next or ...) must fetch only fetchSize rows.
> > > >I am not sure, but I think that is problem with jdbc, not postgresql
> > > >Hackers ?
> > > >Does psql fetch all rows and if not how many ?
> > > >Can I change fetch size in psql ?
> > > >CURSOR , FETCH and MOVE isn't solution.
> > > >If I use jdbc in third-party IDE, I can't force this solution
> > > >
> > > >regards
> > > >
> > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > Nick,
> > > > >
> > > > > This has been discussed before on this list many times. But the
> > > > > short answer is that that is how the postgres server handles queries.
> > > > > If you issue a query the server will return the entire result. (try
> > > > > the same query in psql and you will have the same problem). To work
> > > > > around this you can use explicit cursors (see the DECLARE CURSOR,
> > > > > FETCH, and MOVE sql commands for postgres).
> > > > >
> > > > > thanks,
> > > > > --Barry
> > > > >
> > > > > Nick Fankhauser wrote:
> > > > > > I'm selecting a huge ResultSet from our database- about one million
> > > > > > rows, with one of the fields being varchar(500). I get an out of
> > > > > > memory error from java.
> > > > > >
> > > > > > If the whole ResultSet gets stashed in memory, this isn't really
> > > > > > surprising, but I'm wondering why this happens (if it does), rather
> > > > > > than a subset around the current record being cached and other rows
> > > > > > being retrieved as needed.
> > > > > >
> > > > > > If it turns out that there are good reasons for it to all be in
> > > > > > memory, then my question is whether there is a better approach that
> > > > > > people typically use in this situation. For now, I'm simply
> > > > > > breaking up the select into smaller chunks, but that approach won't
> > > > > > be satisfactory in the long run.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > -Nick
> > > > > >
> > > > > > -------------------------------------------------------------------
> > > > > >------ - Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax
> > > > > > 1.765.962.9788 Ray Ontko & Co. Software Consulting Services
> > > > > > http://www.ontko.com/
> > > > > >
> > > > > >
>
>


From: Doug Fields <dfields-postgres(at)pexicom(dot)com>
To: Barry Lind <barry(at)xythos(dot)com>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 18:13:07
Message-ID: 5.1.0.14.2.20021011140433.0260e858@pop.pexicom.com
Lists: pgsql-general pgsql-hackers pgsql-jdbc


>The problem here is that we would then need two completely different
>implementations for jdbc1 and jdbc2/3 since List is not part of
>jdk1.1. We could build our own List implementation that works on jdk1.1,
>but I am not sure the gain in performance is worth it. If you could do
>some testing and come back with some numbers of the differences in
>performance between ResultSets implemented with Vectors and Lists that
>would probably give us enough information to guage how to proceed on this
>suggested improvement.

In the past, I have done this sort of thing.

The "synchronized" overhead of a "synchronized method" is about 7 times the
overhead of a regular method call. I did many empirical tests of this on
JDK 1.3 and 1.4 on Linux (2.2 and 2.4) due to the high performance demands
of the software my firm uses. Now, that all depends on how many times you
invoke those methods and how fast they are otherwise. I'm unwilling to do
that for PostgreSQL, but I have to imagine that scrapping JDK 1.1 support
would not be a bad thing and may even be a good thing. Anyone still using
JDK 1.1 is also probably using it in conjunction with other products from
that era, so having a modern product compatible with a very out of date
product makes no sense in my estimation.
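
The 7x figure will vary by VM, but it is easy to re-measure. A crude
sketch of such a microbenchmark follows; SyncBench is my own throwaway
name, and a modern JIT can distort the numbers, so treat the timings as
rough indications only:

```java
// Crude microbenchmark comparing a synchronized method against a plain one.
// Absolute numbers depend heavily on the VM and hardware; the point is only
// the relative cost of acquiring the monitor on every call, which is what
// Vector pays and an unsynchronized List does not.
public class SyncBench {
    private long plain, locked;

    private void bumpPlain() { plain++; }
    private synchronized void bumpLocked() { locked++; }

    // Returns {plain ns, synchronized ns} for the given iteration count.
    public long[] run(int iterations) {
        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) bumpPlain();
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) bumpLocked();
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) {
        long[] ns = new SyncBench().run(10_000_000);
        System.out.println("plain: " + ns[0] + " ns, synchronized: " + ns[1] + " ns");
    }
}
```

Running it with a large iteration count gives a feel for the per-call
monitor overhead that a synchronized Vector imposes over a plain List.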

I don't make policy, though - that seems to be your job generally, Barry. :)

>This had actually been tried in the past (just getting the records from
>the server connection as requested), but this behavior violates the spec
>and broke many peoples applications. The problem is that if you don't use
>cursors, you end up tying up the connection until you finish fetching all
>rows. So code like the following no longer works:
>
>get result set
>while (rs.next()) {
> get some values from the result
> use them to update/insert some other table using a preparedstatement
>}
>
>Since the connection is locked until all the results are fetched, you
>can't use the connection to perform the update/insert you want to do for
>each itteration of the loop.

Agreed on this point. Nonetheless, even though this may break the spec
and should not be the default behavior, it should be available as an
option, because the current behavior of the driver is a horror for
anyone who has to deal with large result sets (as I do on a regular
basis).

I don't mind keeping a connection locked for ages. I do mind running out
of VM space for a large result set which is streamed FROM THE DATABASE
SERVER. Result sets should have the ability to be streamed end to end,
IMO, even if it's a non-standard extension or an option set when the
connection or the statement is created.

Again, I don't make policy, and I'm insufficiently motivated to do it
myself. That doesn't invalidate my opinion, but I won't kvetch about it
either. I just designed a class which does the same thing by taking a
query and turning it into a cursor-based query.
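
That wrapper class isn't shown here, but its core is just string
assembly plus a FETCH loop. A minimal sketch under those assumptions;
CursorQuery, the cursor name, and the batch size are all invented for
the example, not Doug's actual code:

```java
import java.sql.*;

// Minimal sketch of wrapping a SELECT in a server-side cursor, in the
// spirit of the wrapper class described above. The SQL-building helpers
// are pure string work; run() shows how they would be used over JDBC.
public class CursorQuery {
    public static String declareSql(String cursor, String select) {
        return "DECLARE " + cursor + " CURSOR FOR " + select;
    }

    public static String fetchSql(String cursor, int batch) {
        return "FETCH FORWARD " + batch + " FROM " + cursor;
    }

    // Stream the query batch by batch. Autocommit must be off, since a
    // cursor only lives inside a transaction.
    public static void run(Connection con, String select, int batch) throws SQLException {
        con.setAutoCommit(false);
        try (Statement st = con.createStatement()) {
            st.execute(declareSql("huge_cur", select));
            while (true) {
                ResultSet rs = st.executeQuery(fetchSql("huge_cur", batch));
                int n = 0;
                while (rs.next()) { n++; /* process one row here */ }
                rs.close();
                if (n < batch) break;      // last (possibly short) batch
            }
            st.execute("CLOSE huge_cur");
        }
        con.commit();
    }
}
```

The memory high-water mark is then one batch of rows rather than the
whole result set, at the cost of holding the transaction open for the
duration of the loop.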

Cheers,

Doug


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: snpe <snpe(at)snpe(dot)co(dot)yu>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 20:26:25
Message-ID: 1034367986.2777.126.camel@inspiron.cramers
Lists: pgsql-general pgsql-hackers pgsql-jdbc

AFAIK, it doesn't work unless the driver is running natively on Linux.
It used to use a native library, but I did see something that looks like
a net protocol?

dave
On Fri, 2002-10-11 at 16:33, snpe wrote:
> There is jxdbcon Postgresql jdbc driver with setFetchSize method.
> Last version don't wokr with pgsql 7.3 and I don't test more.
> I will try next day, when I download pgsql 7.2
>
> regards
> Haris Peco
> On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:
> > Agreed, but there are selects where count(*) won't work. Even so, what
> > we are talking about here is hiding the implementation of cursors behind
> > the result set. What I would envision is some sort of cacheing where
> > when the user set's the fetchsize to 10 for instance we do the select,
> > and when they ask for next() we check to see if we have these rows in
> > the cache, and go get them if necessary 10 at a time, possibly keeping
> > one set of ten behind where we are and one set of 10 ahead of where we
> > are. So recalling that resultSets have absolute positioning, as well as
> > first(), and last() positioning we need the ability to move with the
> > minimum number of trips to the backend.
> >
> > As it turns out the move command in postgres does support moving to the
> > end (move 0 ); at the moment this is considered a bug, and is on the
> > todo list to be removed. I expect we can get some sort of implementation
> > which allows us to move to the end ( move end )
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
> > > Hi,
> > >
> > > I'm jumping in late into this discussion but ...
> > >
> > > In my mind a lot of these features break the model. From an application
> > > prespective, if I want to do last, I do a count(*) and then I do a fetch
> > > with limit; Not quite the same, but all these methods of fetching the
> > > whole data locally and manipulating it to a large exten defeat the
> > > purpose. Let the backend do the work, instead of trying to replicate the
> > > functionality in JDBC.
> > >
> > > That said I do understand that some of these are required by the JDBC 2.0
> > > spec.
> > >
> > > Dror
> > >
> > > On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> > > > It wouldn't be bad to start with a naive implementation of
> > > > last()... If the only problem we have is that last() doesn't perform
> > > > well, we're probably making good progress. :)
> > > > On the other hand, I would think the updateable result sets would
> > > > be the most challenging; does the server provide any analogous features
> > > > with its cursors?
> > > >
> > > > Aaron
> > > >
> > > > On 11 Oct 2002, Dave Cramer wrote:
> > > > > This really is an artifact of the way that postgres gives us the
> > > > > data.
> > > > >
> > > > > When you query the backend you get *all* of the results in the query,
> > > > > and there is no indication of how many results you are going to get.
> > > > > In simple selects it would be possible to get some idea by using
> > > > > count(field), but this wouldn't work nearly enough times to make it
> > > > > useful. So that leaves us with using cursors, which still won't tell
> > > > > you how many rows you are getting back, but at least you won't have
> > > > > the memory problems.
> > > > >
> > > > > This approach is far from trivial which is why it hasn't been
> > > > > implemented as of yet, keep in mind that result sets support things
> > > > > like move(n), first(), last(), the last of which will be the
> > > > > trickiest. Not to mention updateable result sets.
> > > > >
> > > > > As it turns out there is a mechanism to get to the end move 0 in
> > > > > 'cursor', which currently is being considered a bug.
> > > > >
> > > > > Dave
> > > > >
> > > > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > > > >Barry,
> > > > > > > Is it true ?
> > > > > > >I create table with one column varchar(500) and enter 1 milion
> > > > > > > rows with length 10-20 character.JDBC query 'select * from a' get
> > > > > > > error 'out of memory', but psql not.
> > > > > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > > > > >
> > > > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > > > possibly mandated by JDBC design specs.
> > > > > >
> > > > > > It reads the entire result set from the database backend and caches
> > > > > > it in a horrible Vector (which should really be a List and which
> > > > > > should at least make an attempt to get the # of rows ahead of time
> > > > > > to avoid all the resizing problems).
> > > > > >
> > > > > > Then, it doles it out from memory as you go through the ResultSet
> > > > > > with the next() method.
> > > > > >
> > > > > > I would have hoped (but was wrong) that it streamed - WITHOUT
> > > > > > LOADING THE WHOLE THING - through the result set as each row is
> > > > > > returned from the backend, thus ensuring that you never use much
> > > > > > more memory than one line. EVEN IF you have to keep the connection
> > > > > > locked.
> > > > > >
> > > > > > The latter is what I expected it to do. The former is what it does.
> > > > > > So, it necessitates you creating EVERY SELECT query which you think
> > > > > > has more than a few rows (or which you think COULD have more than a
> > > > > > few rows, "few" being defined by our VM memory limits) into a
> > > > > > cursor based query. Really klugy. I intend to write a class to do
> > > > > > that for every SELECT query for me automatically.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Doug
> > > > > >
> > > > > > >In C library is 'execute query' without fetch - in jdbc execute
> > > > > > > fetch all rows
> > > > > > >and this is problem - I think that executequery must prepare query
> > > > > > > and fetch (ResultSet.next or ...) must fetch only fetchSize rows.
> > > > > > >I am not sure, but I think that is problem with jdbc, not
> > > > > > > postgresql Hackers ?
> > > > > > >Does psql fetch all rows and if not how many ?
> > > > > > >Can I change fetch size in psql ?
> > > > > > >CURSOR , FETCH and MOVE isn't solution.
> > > > > > >If I use jdbc in third-party IDE, I can't force this solution
> > > > > > >
> > > > > > >regards
> > > > > > >
> > > > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > > > Nick,
> > > > > > > >
> > > > > > > > This has been discussed before on this list many times. But
> > > > > > > > the short answer is that that is how the postgres server
> > > > > > > > handles queries. If you issue a query the server will return
> > > > > > > > the entire result. (try the same query in psql and you will
> > > > > > > > have the same problem). To work around this you can use
> > > > > > > > explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE sql
> > > > > > > > commands for postgres).
> > > > > > > >
> > > > > > > > thanks,
> > > > > > > > --Barry
> > > > > > > >
> > > > > > > > Nick Fankhauser wrote:
> > > > > > > > > I'm selecting a huge ResultSet from our database- about one
> > > > > > > > > million rows, with one of the fields being varchar(500). I
> > > > > > > > > get an out of memory error from java.
> > > > > > > > >
> > > > > > > > > If the whole ResultSet gets stashed in memory, this isn't
> > > > > > > > > really surprising, but I'm wondering why this happens (if it
> > > > > > > > > does), rather than a subset around the current record being
> > > > > > > > > cached and other rows being retrieved as needed.
> > > > > > > > >
> > > > > > > > > If it turns out that there are good reasons for it to all be
> > > > > > > > > in memory, then my question is whether there is a better
> > > > > > > > > approach that people typically use in this situation. For
> > > > > > > > > now, I'm simply breaking up the select into smaller chunks,
> > > > > > > > > but that approach won't be satisfactory in the long run.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > -Nick
> > > > > > > > >
> > > > > > > > > -------------------------------------------------------------
> > > > > > > > >------------ - Nick Fankhauser nickf(at)ontko(dot)com Phone
> > > > > > > > > 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co.
> > > > > > > > > Software Consulting Services http://www.ontko.com/
> > > > > > > > >
> > > > > > > > >
> > >
> > > --
> > > Dror Matalon
> > > Zapatec Inc
> > > 1700 MLK Way
> > > Berkeley, CA 94709
> > > http://www.zapatec.com
> > >
>
>
>
>


From: snpe <snpe(at)snpe(dot)co(dot)yu>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 20:33:24
Message-ID: 200210112233.24505.snpe@snpe.co.yu
Lists: pgsql-general pgsql-hackers pgsql-jdbc

There is the jxdbcon PostgreSQL JDBC driver, which has a setFetchSize
method. The last version doesn't work with pgsql 7.3, and I haven't
tested it further. I will try again in the next day or so, when I
download pgsql 7.2.

regards
Haris Peco
On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:
> Agreed, but there are selects where count(*) won't work. Even so, what
> we are talking about here is hiding the implementation of cursors behind
> the result set. What I would envision is some sort of cacheing where
> when the user set's the fetchsize to 10 for instance we do the select,
> and when they ask for next() we check to see if we have these rows in
> the cache, and go get them if necessary 10 at a time, possibly keeping
> one set of ten behind where we are and one set of 10 ahead of where we
> are. So recalling that resultSets have absolute positioning, as well as
> first(), and last() positioning we need the ability to move with the
> minimum number of trips to the backend.
>
> As it turns out the move command in postgres does support moving to the
> end (move 0 ); at the moment this is considered a bug, and is on the
> todo list to be removed. I expect we can get some sort of implementation
> which allows us to move to the end ( move end )
>
> Dave
>
> On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
> > Hi,
> >
> > I'm jumping in late into this discussion but ...
> >
> > In my mind a lot of these features break the model. From an application
> > prespective, if I want to do last, I do a count(*) and then I do a fetch
> > with limit; Not quite the same, but all these methods of fetching the
> > whole data locally and manipulating it to a large exten defeat the
> > purpose. Let the backend do the work, instead of trying to replicate the
> > functionality in JDBC.
> >
> > That said I do understand that some of these are required by the JDBC 2.0
> > spec.
> >
> > Dror
> >
> > On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> > > It wouldn't be bad to start with a naive implementation of
> > > last()... If the only problem we have is that last() doesn't perform
> > > well, we're probably making good progress. :)
> > > On the other hand, I would think the updateable result sets would
> > > be the most challenging; does the server provide any analogous features
> > > with its cursors?
> > >
> > > Aaron
> > >
> > > On 11 Oct 2002, Dave Cramer wrote:
> > > > This really is an artifact of the way that postgres gives us the
> > > > data.
> > > >
> > > > When you query the backend you get *all* of the results in the query,
> > > > and there is no indication of how many results you are going to get.
> > > > In simple selects it would be possible to get some idea by using
> > > > count(field), but this wouldn't work nearly enough times to make it
> > > > useful. So that leaves us with using cursors, which still won't tell
> > > > you how many rows you are getting back, but at least you won't have
> > > > the memory problems.
> > > >
> > > > This approach is far from trivial which is why it hasn't been
> > > > implemented as of yet, keep in mind that result sets support things
> > > > like move(n), first(), last(), the last of which will be the
> > > > trickiest. Not to mention updateable result sets.
> > > >
> > > > As it turns out there is a mechanism to get to the end move 0 in
> > > > 'cursor', which currently is being considered a bug.
> > > >
> > > > Dave
> > > >
> > > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > > >Barry,
> > > > > > Is it true ?
> > > > > >I create table with one column varchar(500) and enter 1 milion
> > > > > > rows with length 10-20 character.JDBC query 'select * from a' get
> > > > > > error 'out of memory', but psql not.
> > > > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > > > >
> > > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > > possibly mandated by JDBC design specs.
> > > > >
> > > > > It reads the entire result set from the database backend and caches
> > > > > it in a horrible Vector (which should really be a List and which
> > > > > should at least make an attempt to get the # of rows ahead of time
> > > > > to avoid all the resizing problems).
> > > > >
> > > > > Then, it doles it out from memory as you go through the ResultSet
> > > > > with the next() method.
> > > > >
> > > > > I would have hoped (but was wrong) that it streamed - WITHOUT
> > > > > LOADING THE WHOLE THING - through the result set as each row is
> > > > > returned from the backend, thus ensuring that you never use much
> > > > > more memory than one line. EVEN IF you have to keep the connection
> > > > > locked.
> > > > >
> > > > > The latter is what I expected it to do. The former is what it does.
> > > > > So, it necessitates you creating EVERY SELECT query which you think
> > > > > has more than a few rows (or which you think COULD have more than a
> > > > > few rows, "few" being defined by our VM memory limits) into a
> > > > > cursor based query. Really klugy. I intend to write a class to do
> > > > > that for every SELECT query for me automatically.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Doug
> > > > >
> > > > > >In C library is 'execute query' without fetch - in jdbc execute
> > > > > > fetch all rows
> > > > > >and this is problem - I think that executequery must prepare query
> > > > > > and fetch (ResultSet.next or ...) must fetch only fetchSize rows.
> > > > > >I am not sure, but I think that is problem with jdbc, not
> > > > > > postgresql Hackers ?
> > > > > >Does psql fetch all rows and if not how many ?
> > > > > >Can I change fetch size in psql ?
> > > > > >CURSOR , FETCH and MOVE isn't solution.
> > > > > >If I use jdbc in third-party IDE, I can't force this solution
> > > > > >
> > > > > >regards
> > > > > >
> > > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > > Nick,
> > > > > > >
> > > > > > > This has been discussed before on this list many times. But
> > > > > > > the short answer is that that is how the postgres server
> > > > > > > handles queries. If you issue a query the server will return
> > > > > > > the entire result. (try the same query in psql and you will
> > > > > > > have the same problem). To work around this you can use
> > > > > > > explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE sql
> > > > > > > commands for postgres).
> > > > > > >
> > > > > > > thanks,
> > > > > > > --Barry
> > > > > > >
> > > > > > > Nick Fankhauser wrote:
> > > > > > > > I'm selecting a huge ResultSet from our database- about one
> > > > > > > > million rows, with one of the fields being varchar(500). I
> > > > > > > > get an out of memory error from java.
> > > > > > > >
> > > > > > > > If the whole ResultSet gets stashed in memory, this isn't
> > > > > > > > really surprising, but I'm wondering why this happens (if it
> > > > > > > > does), rather than a subset around the current record being
> > > > > > > > cached and other rows being retrieved as needed.
> > > > > > > >
> > > > > > > > If it turns out that there are good reasons for it to all be
> > > > > > > > in memory, then my question is whether there is a better
> > > > > > > > approach that people typically use in this situation. For
> > > > > > > > now, I'm simply breaking up the select into smaller chunks,
> > > > > > > > but that approach won't be satisfactory in the long run.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > -Nick
> > > > > > > >
> > > > > > > > -------------------------------------------------------------
> > > > > > > >------------ - Nick Fankhauser nickf(at)ontko(dot)com Phone
> > > > > > > > 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co.
> > > > > > > > Software Consulting Services http://www.ontko.com/
> > > > > > > >
> > > > > > > >
> > > > > > > > ---------------------------(end of
> > > > > > > > broadcast)--------------------------- TIP 5: Have you checked
> > > > > > > > our extensive FAQ?
> > > > > > > >
> > > > > > > > http://www.postgresql.org/users-lounge/docs/faq.html
> > > > > > >
> > > > > > > ---------------------------(end of
> > > > > > > broadcast)--------------------------- TIP 6: Have you searched
> > > > > > > our list archives?
> > > > > > >
> > > > > > > http://archives.postgresql.org
> > > > > >
> > > > > >---------------------------(end of
> > > > > > broadcast)--------------------------- TIP 2: you can get off all
> > > > > > lists at once with the unregister command (send "unregister
> > > > > > YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
> > > > >
> > > > > ---------------------------(end of
> > > > > broadcast)--------------------------- TIP 6: Have you searched our
> > > > > list archives?
> > > > >
> > > > > http://archives.postgresql.org
> > > >
> > > > ---------------------------(end of
> > > > broadcast)--------------------------- TIP 1: subscribe and
> > > > unsubscribe commands go to majordomo(at)postgresql(dot)org
> > >
> > > ---------------------------(end of
> > > broadcast)--------------------------- TIP 3: if posting/reading through
> > > Usenet, please send an appropriate subscribe-nomail command to
> > > majordomo(at)postgresql(dot)org so that your message can get through to the
> > > mailing list cleanly
> >
> > --
> > Dror Matalon
> > Zapatec Inc
> > 1700 MLK Way
> > Berkeley, CA 94709
> > http://www.zapatec.com
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
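
The explicit-cursor workaround discussed above (DECLARE CURSOR, FETCH, MOVE) can be sketched as follows. This is a hypothetical helper, not driver code; the class, cursor, and table names are illustrative only:

```java
// Hypothetical helper that builds the cursor SQL discussed in the
// thread (DECLARE CURSOR / FETCH / CLOSE). Not part of the JDBC
// driver; names are illustrative.
final class CursorSql {
    // Wrap a query in a server-side cursor declaration.
    static String declare(String cursor, String query) {
        return "DECLARE " + cursor + " CURSOR FOR " + query;
    }

    // Pull the next n rows from the cursor.
    static String fetchForward(String cursor, int n) {
        return "FETCH FORWARD " + n + " FROM " + cursor;
    }

    // Release the cursor when done.
    static String close(String cursor) {
        return "CLOSE " + cursor;
    }
}
```

In use, one would open a transaction (cursors only live inside one), execute declare(...), then loop executing fetchForward(...) with a plain Statement until FETCH returns no rows, so at most n rows are held in memory at a time.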


From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: snpe <snpe(at)snpe(dot)co(dot)yu>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 20:38:35
Message-ID: 1034368716.2777.132.camel@inspiron.cramers
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Looking at their code, the default fetch size is 1000?

Anyway, I think there is sufficient interest in this that we should
have something running here soon.

Dave
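
The windowed caching Dave proposed earlier in the thread (pull rows fetchSize at a time, keeping only the current batch in memory) could look roughly like this. WindowedResultSet and the (offset, limit) backend function are hypothetical stand-ins for driver internals; a real implementation would issue MOVE/FETCH against a server-side cursor:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Sketch of windowed fetching: rows come from the "backend"
// fetchSize at a time, so at most one batch is ever held in memory.
// The backend is modeled as (offset, limit) -> rows.
final class WindowedResultSet {
    private final BiFunction<Integer, Integer, List<String>> backend;
    private final int fetchSize;
    private List<String> batch = new ArrayList<>();
    private int batchStart = 0;   // absolute index of batch.get(0)
    private int pos = -1;         // absolute index of current row

    WindowedResultSet(BiFunction<Integer, Integer, List<String>> backend,
                      int fetchSize) {
        this.backend = backend;
        this.fetchSize = fetchSize;
    }

    boolean next() {
        pos++;
        int i = pos - batchStart;
        if (i >= batch.size()) {          // batch exhausted: fetch the next one
            batchStart = pos;
            batch = backend.apply(batchStart, fetchSize);
            i = 0;
        }
        return i < batch.size();          // empty batch means end of results
    }

    String getString() {
        return batch.get(pos - batchStart);
    }
}
```

A fuller version would also keep a batch behind the cursor for previous()/absolute(), as suggested above, but the forward case already avoids the out-of-memory problem.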
On Fri, 2002-10-11 at 17:02, snpe wrote:
> I am tried with jxdbcon - it don't work with large table, too.
> 'out of memory' is when executeQuery()
>
> regards
> Haris Peco
> On Friday 11 October 2002 10:33 pm, snpe wrote:
> > There is jxdbcon Postgresql jdbc driver with setFetchSize method.
> > Last version don't wokr with pgsql 7.3 and I don't test more.
> > I will try next day, when I download pgsql 7.2
> >
> > regards
> > Haris Peco


From: snpe <snpe(at)snpe(dot)co(dot)yu>
To: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 21:02:03
Message-ID: 200210112302.03412.snpe@snpe.co.yu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-jdbc

I tried jxdbcon - it doesn't work with large tables either.
The 'out of memory' error occurs during executeQuery()

regards
Haris Peco
On Friday 11 October 2002 10:33 pm, snpe wrote:
> There is jxdbcon Postgresql jdbc driver with setFetchSize method.
> Last version don't wokr with pgsql 7.3 and I don't test more.
> I will try next day, when I download pgsql 7.2
>
> regards
> Haris Peco


From: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>
To: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 21:05:46
Message-ID: 200210111405.46467.list-pgsql-hackers@empires.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-jdbc

> They also state that they have more sophisticated ALTER TABLE...
>
> Only usable feature in their ALTER TABLE that doesn't (yet) exist in
> PostgreSQL was changing column order (ok, the order by in table creation
> could be nice), and that's still almost purely cosmetic. Anyway, I could
> have used that command yesterday. Could this be added to pgsql.
>

I agree with your message except for that statement. MySQL's ALTER TABLE
provides the ability to change a column's type and cast the existing records
automatically. That feature is really the only thing from MySQL
that I've ever missed.

Of course, it's not that wonderful in theory. During development you can
easily drop/recreate the tables and reload the test data; during production
you don't change the data types of your attributes.

But in practice, during development it's handy sometimes.
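
For reference, the manual PostgreSQL equivalent of MySQL's type-changing ALTER TABLE can be scripted. The helper below is hypothetical (table and column names are illustrative), and the generated statements assume a server version that supports DROP COLUMN:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the manual PostgreSQL workaround for
// MySQL's "ALTER TABLE ... MODIFY col type": add a new column,
// cast the data across, drop the old column, rename.
final class AlterColumnType {
    static List<String> statements(String table, String col, String newType) {
        String tmp = col + "_tmp";
        return Arrays.asList(
            "ALTER TABLE " + table + " ADD COLUMN " + tmp + " " + newType,
            "UPDATE " + table + " SET " + tmp + " = CAST(" + col + " AS " + newType + ")",
            "ALTER TABLE " + table + " DROP COLUMN " + col,
            "ALTER TABLE " + table + " RENAME COLUMN " + tmp + " TO " + col);
    }
}
```

Unlike MySQL's in-place conversion, this makes the cast explicit, so a failed cast aborts instead of silently mangling the data.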

Regards,
Jeff


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>
Cc: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 21:14:16
Message-ID: Pine.LNX.4.33.0210111513090.1778-100000@css120.ihs.com

On Fri, 11 Oct 2002, Jeff Davis wrote:

> > They also state that they have more sophisticated ALTER TABLE...
> >
> > Only usable feature in their ALTER TABLE that doesn't (yet) exist in
> > PostgreSQL was changing column order (ok, the order by in table creation
> > could be nice), and that's still almost purely cosmetic. Anyway, I could
> > have used that command yesterday. Could this be added to pgsql.
> >
>
> I agree with your message except for that statement. MySQL alter table
> provides the ability to change column types and cast the records
> automatically. I remember that feature as really the only thing from MySQL
> that I've ever missed.
>
> Of course, it's not that wonderful in theory. During development you can
> easily drop/recreate the tables and reload the test data; during production
> you don't change the data types of your attributes.
>
> But in practice, during development it's handy sometimes.

I still remember a post from somebody on the phpbuilder site who had
changed a field from varchar to date; all the dates he had got changed
to 0000-00-00.

He was most unimpressed, especially since he (being typical of a lot of
MySQL users) didn't have a backup.


From: snpe <snpe(at)snpe(dot)co(dot)yu>
To: Dave Cramer <Dave(at)micro-automation(dot)net>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 21:18:03
Message-ID: 200210112318.03841.snpe@snpe.co.yu

I tested Oracle JDeveloper, and the PostgreSQL JDBC driver works fine with it
now. Meanwhile, for production systems I have to have setFetchSize for large
tables; I think it is the same with any Java IDE.

The best solution would be to hold only n rows from the backend at a time, but
I don't know whether that is possible.
regards
Haris Peco

On Friday 11 October 2002 10:38 pm, Dave Cramer wrote:
> Looking at their code, default fetch size is 1000?
>
> Anyways, I think there is sufficient interest in this that we should
> have something running soon here
>
> Dave
>
> On Fri, 2002-10-11 at 17:02, snpe wrote:
> > I am tried with jxdbcon - it don't work with large table, too.
> > 'out of memory' is when executeQuery()
> >
> > regards
> > Haris Peco
> >
> > On Friday 11 October 2002 10:33 pm, snpe wrote:
> > > There is jxdbcon Postgresql jdbc driver with setFetchSize method.
> > > Last version don't wokr with pgsql 7.3 and I don't test more.
> > > I will try next day, when I download pgsql 7.2
> > >
> > > regards
> > > Haris Peco
> > >
> > > On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:
> > > > Agreed, but there are selects where count(*) won't work. Even so,
> > > > what we are talking about here is hiding the implementation of
> > > > cursors behind the result set. What I would envision is some sort of
> > > > cacheing where when the user set's the fetchsize to 10 for instance
> > > > we do the select, and when they ask for next() we check to see if we
> > > > have these rows in the cache, and go get them if necessary 10 at a
> > > > time, possibly keeping one set of ten behind where we are and one set
> > > > of 10 ahead of where we are. So recalling that resultSets have
> > > > absolute positioning, as well as first(), and last() positioning we
> > > > need the ability to move with the minimum number of trips to the
> > > > backend.
> > > >
> > > > As it turns out the move command in postgres does support moving to
> > > > the end (move 0 ); at the moment this is considered a bug, and is on
> > > > the todo list to be removed. I expect we can get some sort of
> > > > implementation which allows us to move to the end ( move end )
> > > >
> > > > Dave
> > > >
> > > > On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:
> > > > > Hi,
> > > > >
> > > > > I'm jumping in late into this discussion but ...
> > > > >
> > > > > In my mind a lot of these features break the model. From an
> > > > > application prespective, if I want to do last, I do a count(*) and
> > > > > then I do a fetch with limit; Not quite the same, but all these
> > > > > methods of fetching the whole data locally and manipulating it to a
> > > > > large exten defeat the purpose. Let the backend do the work,
> > > > > instead of trying to replicate the functionality in JDBC.
> > > > >
> > > > > That said I do understand that some of these are required by the
> > > > > JDBC 2.0 spec.
> > > > >
> > > > > Dror
> > > > >
> > > > > On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:
> > > > > > It wouldn't be bad to start with a naive implementation of
> > > > > > last()... If the only problem we have is that last() doesn't
> > > > > > perform well, we're probably making good progress. :)
> > > > > > On the other hand, I would think the updateable result sets
> > > > > > would be the most challenging; does the server provide any
> > > > > > analogous features with its cursors?
> > > > > >
> > > > > > Aaron
> > > > > >
> > > > > > On 11 Oct 2002, Dave Cramer wrote:
> > > > > > > This really is an artifact of the way that postgres gives us
> > > > > > > the data.
> > > > > > >
> > > > > > > When you query the backend you get *all* of the results in the
> > > > > > > query, and there is no indication of how many results you are
> > > > > > > going to get. In simple selects it would be possible to get
> > > > > > > some idea by using count(field), but this wouldn't work nearly
> > > > > > > enough times to make it useful. So that leaves us with using
> > > > > > > cursors, which still won't tell you how many rows you are
> > > > > > > getting back, but at least you won't have the memory problems.
> > > > > > >
> > > > > > > This approach is far from trivial which is why it hasn't been
> > > > > > > implemented as of yet, keep in mind that result sets support
> > > > > > > things like move(n), first(), last(), the last of which will be
> > > > > > > the trickiest. Not to mention updateable result sets.
> > > > > > >
> > > > > > > As it turns out there is a mechanism to get to the end move 0
> > > > > > > in 'cursor', which currently is being considered a bug.
> > > > > > >
> > > > > > > Dave
> > > > > > >
> > > > > > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > > > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > > > > > >Barry,
> > > > > > > > > Is it true ?
> > > > > > > > >I create table with one column varchar(500) and enter 1
> > > > > > > > > milion rows with length 10-20 character.JDBC query 'select
> > > > > > > > > * from a' get error 'out of memory', but psql not.
> > > > > > > > >I insert 8 milion rows and psql work fine yet (slow, but
> > > > > > > > > work)
> > > > > > > >
> > > > > > > > The way the code works in JDBC is, in my opinion, a little
> > > > > > > > poor but possibly mandated by JDBC design specs.
> > > > > > > >
> > > > > > > > It reads the entire result set from the database backend and
> > > > > > > > caches it in a horrible Vector (which should really be a List
> > > > > > > > and which should at least make an attempt to get the # of
> > > > > > > > rows ahead of time to avoid all the resizing problems).
> > > > > > > >
> > > > > > > > Then, it doles it out from memory as you go through the
> > > > > > > > ResultSet with the next() method.
> > > > > > > >
> > > > > > > > I would have hoped (but was wrong) that it streamed - WITHOUT
> > > > > > > > LOADING THE WHOLE THING - through the result set as each row
> > > > > > > > is returned from the backend, thus ensuring that you never
> > > > > > > > use much more memory than one line. EVEN IF you have to keep
> > > > > > > > the connection locked.
> > > > > > > >
> > > > > > > > The latter is what I expected it to do. The former is what it
> > > > > > > > does. So, it necessitates you creating EVERY SELECT query
> > > > > > > > which you think has more than a few rows (or which you think
> > > > > > > > COULD have more than a few rows, "few" being defined by our
> > > > > > > > VM memory limits) into a cursor based query. Really klugy. I
> > > > > > > > intend to write a class to do that for every SELECT query for
> > > > > > > > me automatically.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Doug
> > > > > > > >
> > > > > > > > >In C library is 'execute query' without fetch - in jdbc
> > > > > > > > > execute fetch all rows
> > > > > > > > >and this is problem - I think that executequery must prepare
> > > > > > > > > query and fetch (ResultSet.next or ...) must fetch only
> > > > > > > > > fetchSize rows. I am not sure, but I think that is problem
> > > > > > > > > with jdbc, not postgresql Hackers ?
> > > > > > > > >Does psql fetch all rows and if not how many ?
> > > > > > > > >Can I change fetch size in psql ?
> > > > > > > > >CURSOR , FETCH and MOVE isn't solution.
> > > > > > > > >If I use jdbc in third-party IDE, I can't force this
> > > > > > > > > solution
> > > > > > > > >
> > > > > > > > >regards
> > > > > > > > >
> > > > > > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > > > > > Nick,
> > > > > > > > > >
> > > > > > > > > > This has been discussed before on this list many times.
> > > > > > > > > > But the short answer is that that is how the postgres
> > > > > > > > > > server handles queries. If you issue a query the server
> > > > > > > > > > will return the entire result. (try the same query in
> > > > > > > > > > psql and you will have the same problem). To work around
> > > > > > > > > > this you can use explicit cursors (see the DECLARE
> > > > > > > > > > CURSOR, FETCH, and MOVE sql commands for postgres).
> > > > > > > > > >
> > > > > > > > > > thanks,
> > > > > > > > > > --Barry
> > > > > > > > > >
> > > > > > > > > > Nick Fankhauser wrote:
> > > > > > > > > > > I'm selecting a huge ResultSet from our database- about
> > > > > > > > > > > one million rows, with one of the fields being
> > > > > > > > > > > varchar(500). I get an out of memory error from java.
> > > > > > > > > > >
> > > > > > > > > > > If the whole ResultSet gets stashed in memory, this
> > > > > > > > > > > isn't really surprising, but I'm wondering why this
> > > > > > > > > > > happens (if it does), rather than a subset around the
> > > > > > > > > > > current record being cached and other rows being
> > > > > > > > > > > retrieved as needed.
> > > > > > > > > > >
> > > > > > > > > > > If it turns out that there are good reasons for it to
> > > > > > > > > > > all be in memory, then my question is whether there is
> > > > > > > > > > > a better approach that people typically use in this
> > > > > > > > > > > situation. For now, I'm simply breaking up the select
> > > > > > > > > > > into smaller chunks, but that approach won't be
> > > > > > > > > > > satisfactory in the long run.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > -Nick
> > > > > > > > > > >
> > > > > > > > > > > -------------------------------------------------------
> > > > > > > > > > >---- -- ------------ - Nick Fankhauser nickf(at)ontko(dot)com
> > > > > > > > > > > Phone 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko &
> > > > > > > > > > > Co. Software Consulting Services http://www.ontko.com/
> > > > > > > > > > >
> > > > > > > > > > >
> > > > >
> > > > > --
> > > > > Dror Matalon
> > > > > Zapatec Inc
> > > > > 1700 MLK Way
> > > > > Berkeley, CA 94709
> > > > > http://www.zapatec.com

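The explicit-cursor workaround that Barry and Dave describe above can be sketched in plain SQL using the DECLARE CURSOR, FETCH, and MOVE commands they mention. The table name `a` comes from snpe's example; the cursor name and fetch size are arbitrary:

```sql
BEGIN;  -- cursors only live inside a transaction

DECLARE big_cur CURSOR FOR SELECT * FROM a;

FETCH 1000 FROM big_cur;        -- first window of rows
FETCH 1000 FROM big_cur;        -- next window
MOVE BACKWARD 1000 IN big_cur;  -- reposition without returning rows

CLOSE big_cur;
COMMIT;
```

Only one window of rows is ever held by the client, which is why psql-style chunked fetching avoids the out-of-memory failure.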

From: snpe <snpe(at)snpe(dot)co(dot)yu>
To: Dave Cramer <Dave(at)micro-automation(dot)net>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 21:42:59
Message-ID: 200210112342.59826.snpe@snpe.co.yu

Can you do this:
We save the first 1000 (or fetchSize) rows.
If the table has fewer than 1000 rows we save them all; if it has more and the
user requests row 1001, we fetch 1000 again (from the beginning, skipping the
first 1000 rows, or perhaps continuing the fetch if that is possible).
When the user requests last() we fetch all rows but save only the last 1000,
and so on.

We save only fetchSize rows and seek from the beginning when the user moves
backward (or perhaps seek whenever the user requests rows outside our
fetchSize window).

This is slow for large tables, but it is a solution until the developers give
us something better from the backend. If a table has fewer than fetchSize rows
this behaves the same as the current solution, and we can set a minimum
fetchSize for better performance with small tables.

regards
Haris Peco
On Friday 11 October 2002 08:13 pm, Dave Cramer wrote:
> No,
>
> It doesn't have to store them, only display them
>
> Dave
>
> On Fri, 2002-10-11 at 12:48, snpe wrote:
> > Hello,
> > Does it mean that psql uses cursors ?
> >
> > regards
> > Haris Peco
> >
> > On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> > > This really is an artifact of the way that postgres gives us the data.
> > >
> > > When you query the backend you get *all* of the results in the query,
> > > and there is no indication of how many results you are going to get. In
> > > simple selects it would be possible to get some idea by using
> > > count(field), but this wouldn't work nearly enough times to make it
> > > useful. So that leaves us with using cursors, which still won't tell
> > > you how many rows you are getting back, but at least you won't have the
> > > memory problems.
> > >
> > > This approach is far from trivial which is why it hasn't been
> > > implemented as of yet, keep in mind that result sets support things
> > > like move(n), first(), last(), the last of which will be the trickiest.
> > > Not to mention updateable result sets.
> > >
> > > As it turns out there is a mechanism to get to the end move 0 in
> > > 'cursor', which currently is being considered a bug.
> > >
> > > Dave
> > >
> > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > >Barry,
> > > > > Is it true ?
> > > > >I create table with one column varchar(500) and enter 1 milion rows
> > > > > with length 10-20 character.JDBC query 'select * from a' get error
> > > > > 'out of memory', but psql not.
> > > > >I insert 8 milion rows and psql work fine yet (slow, but work)
> > > >
> > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > possibly mandated by JDBC design specs.
> > > >
> > > > It reads the entire result set from the database backend and caches
> > > > it in a horrible Vector (which should really be a List and which
> > > > should at least make an attempt to get the # of rows ahead of time to
> > > > avoid all the resizing problems).
> > > >
> > > > Then, it doles it out from memory as you go through the ResultSet
> > > > with the next() method.
> > > >
> > > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
> > > > THE WHOLE THING - through the result set as each row is returned from
> > > > the backend, thus ensuring that you never use much more memory than
> > > > one line. EVEN IF you have to keep the connection locked.
> > > >
> > > > The latter is what I expected it to do. The former is what it does.
> > > > So, it necessitates you creating EVERY SELECT query which you think
> > > > has more than a few rows (or which you think COULD have more than a
> > > > few rows, "few" being defined by our VM memory limits) into a cursor
> > > > based query. Really klugy. I intend to write a class to do that for
> > > > every SELECT query for me automatically.
> > > >
> > > > Cheers,
> > > >
> > > > Doug
> > > >
> > > > >In C library is 'execute query' without fetch - in jdbc execute
> > > > > fetch all rows
> > > > >and this is problem - I think that executequery must prepare query
> > > > > and fetch (ResultSet.next or ...) must fetch only fetchSize rows. I
> > > > > am not sure, but I think that is problem with jdbc, not postgresql
> > > > > Hackers ?
> > > > >Does psql fetch all rows and if not how many ?
> > > > >Can I change fetch size in psql ?
> > > > >CURSOR , FETCH and MOVE isn't solution.
> > > > >If I use jdbc in third-party IDE, I can't force this solution
> > > > >
> > > > >regards
> > > > >
> > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > Nick,
> > > > > >
> > > > > > This has been discussed before on this list many times. But the
> > > > > > short answer is that that is how the postgres server handles
> > > > > > queries. If you issue a query the server will return the entire
> > > > > > result. (try the same query in psql and you will have the same
> > > > > > problem). To work around this you can use explicit cursors (see
> > > > > > the DECLARE CURSOR, FETCH, and MOVE sql commands for postgres).
> > > > > >
> > > > > > thanks,
> > > > > > --Barry
> > > > > >
> > > > > > Nick Fankhauser wrote:
> > > > > > > I'm selecting a huge ResultSet from our database- about one
> > > > > > > million rows, with one of the fields being varchar(500). I get
> > > > > > > an out of memory error from java.
> > > > > > >
> > > > > > > If the whole ResultSet gets stashed in memory, this isn't
> > > > > > > really surprising, but I'm wondering why this happens (if it
> > > > > > > does), rather than a subset around the current record being
> > > > > > > cached and other rows being retrieved as needed.
> > > > > > >
> > > > > > > If it turns out that there are good reasons for it to all be in
> > > > > > > memory, then my question is whether there is a better approach
> > > > > > > that people typically use in this situation. For now, I'm
> > > > > > > simply breaking up the select into smaller chunks, but that
> > > > > > > approach won't be satisfactory in the long run.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > -Nick
> > > > > > >
> > > > > > > ---------------------------------------------------------------
> > > > > > >---- ------ - Nick Fankhauser nickf(at)ontko(dot)com Phone
> > > > > > > 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software
> > > > > > > Consulting Services http://www.ontko.com/
> > > > > > >
> > > > > > >

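The fetchSize-window cache proposed in the message above can be sketched in Java. Everything here is hypothetical scaffolding, not driver code: the `backend` function stands in for issuing MOVE/FETCH on a server-side cursor and returns the rows in [offset, offset + limit). Only one window of rows is kept in memory; moving outside it triggers a re-fetch.

```java
import java.util.List;
import java.util.function.BiFunction;

// Sketch of a windowed result-set cache: hold fetchSize rows, re-fetch
// from the (simulated) backend when the caller moves outside the window.
public class WindowedResultSet {
    private final int fetchSize;
    private final BiFunction<Integer, Integer, List<String>> backend;
    private int windowStart;
    private List<String> window;

    public WindowedResultSet(int fetchSize,
                             BiFunction<Integer, Integer, List<String>> backend) {
        this.fetchSize = fetchSize;
        this.backend = backend;
        this.windowStart = 0;
        this.window = backend.apply(0, fetchSize);
    }

    // Row at absolute position pos (0-based), re-fetching the window
    // when pos falls outside it, forward or backward.
    public String absolute(int pos) {
        if (pos < windowStart || pos >= windowStart + window.size()) {
            windowStart = pos - pos % fetchSize; // align window to a boundary
            window = backend.apply(windowStart, fetchSize);
        }
        return window.get(pos - windowStart);
    }
}
```

As snpe notes, backward movement is the slow case: a real implementation would pay a MOVE back to the window start, while forward scans simply keep fetching.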

From: Mike Mascari <mascarm(at)mascari(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>, Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-11 22:38:44
Message-ID: 3DA752F4.3060904@mascari.com

scott.marlowe wrote:
> On Fri, 11 Oct 2002, Jeff Davis wrote:
>
>>I agree with your message except for that statement. MySQL alter table
>>provides the ability to change column types and cast the records
>>automatically. I remember that feature as really the only thing from MySQL
>>that I've ever missed.
>>
>>Of course, it's not that wonderful in theory. During development you can
>>easily drop/recreate the tables and reload the test data; during production
>>you don't change the data types of your attributes.
>>
>>But in practice, during development it's handy sometimes.
>
>
> I still remember a post from somebody on the phpbuilder site that had
> changed a field from varchar to date and all the dates he had got changed
> to 0000-00-00.
>
> He most unimpressed, especially since he (being typical of a lot of MySQL
> users) didn't have a backup.

Couldn't he just do ROLLBACK? ;-)

(for the humor impaired, that's a joke...)

Mike Mascari
mascarm(at)mascari(dot)com


From: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-12 02:08:18
Message-ID: 200210111908.18170.list-pgsql-hackers@empires.org

>
> I still remember a post from somebody on the phpbuilder site that had
> changed a field from varchar to date and all the dates he had got changed
> to 0000-00-00.
>
> He most unimpressed, especially since he (being typical of a lot of MySQL
> users) didn't have a backup.
>

Ah, yes. Classic.

I was talking about a development scenario. Anyone who changes a huge amount
of important data to a new form without a clearly defined algorithm is not
making a wise choice. It's kind of like having a perl script operate on an
important file: you don't want it to just kill all your data, so you run a
few tests first.

And it really is a minor matter of convenience. I end up dropping and
recreating all my tables a lot in the early stages of development, which is
mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
that a feature does something safely, and it kills all your data.

So, you're right. It's probably better that it's never implemented.

Regards,
Jeff


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>
Cc: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-12 02:16:24
Message-ID: 20021012021624.GB4443@dcc.uchile.cl

On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:

> And it really is a minor matter of convenience. I end up dropping and
> recreating all my tables a lot in the early stages of development, which is
> mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
> that a feature does something safely, and it kills all your data.

Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
much need for such frequent drop/create of tables anymore.

And things just keep getting better and better. This is really amazing.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"We are who we choose to be", sang the goldfinch
when the sun is high (Sandman)


From: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>
To: "Pgsql-Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-12 04:18:23
Message-ID: 200210112118.23709.list-pgsql-hackers@empires.org


Oh yes, I agree. ALTER TABLE ... DROP COLUMN helps out a lot. I actually don't
use it for much yet because 7.3 is still in beta. However, I certainly can't
complain to the developers about it, since it's already developed :)

I am consistently amazed by every minor version release. If Postgres had a
marketing team it would be at version 37.3 by now. In my last email I agreed
with Scott Marlowe that Postgres is better off without casting an entire
column in place, since that's a dangerous procedure and can be accomplished in
a round-about (read: explicit) way in Postgres anyway, one that doesn't lose
your data until after you've had a chance to look at the new values.

Regards,
Jeff

On Friday 11 October 2002 07:16 pm, you wrote:
> On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:
> > And it really is a minor matter of convenience. I end up dropping and
> > recreating all my tables a lot in the early stages of development, which
> > is mildly annoying. Certainly not as bad, I suppose, as if you're led to
> > believe that a feature does something safely, and it kills all your data.
>
> Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
> any more the need to do such frequent drop/create of tables.
>
> And things just keep getting better and better. This is really amazing.

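The explicit round-about type change Jeff alludes to can be spelled out in plain SQL. A sketch with hypothetical table and column names; the old data stays visible until the cast results have been inspected:

```sql
ALTER TABLE events ADD COLUMN when_date date;   -- new column, desired type
UPDATE events SET when_date = when_text::date;  -- explicit cast; bad values error out
-- inspect when_date here, and only then:
ALTER TABLE events DROP COLUMN when_text;       -- needs 7.3's DROP COLUMN
```

Unlike MySQL's in-place ALTER, a failed cast aborts the UPDATE instead of silently writing 0000-00-00.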

From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-12 08:37:08
Message-ID: Pine.LNX.4.21.0210121833480.26079-100000@linuxworld.com.au

On 12 Oct 2002, Hannu Krosing wrote:

> Alvaro Herrera kirjutas L, 12.10.2002 kell 04:16:
> > On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:
> >
> > > And it really is a minor matter of convenience. I end up dropping and
> > > recreating all my tables a lot in the early stages of development, which is
> > > mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
> > > that a feature does something safely, and it kills all your data.
> >
> > Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
> > any more the need to do such frequent drop/create of tables.
>
> Did attlognum's (for changing column order) get implemented for 7.2 ?

I cannot think of any reason why changing column order should be
implemented in Postgres. It seems like a waste of time and more code bloat
for something that is strictly aesthetic.

Regardless, I do have colleagues/clients who ask when such a feature will
be implemented. Why is this useful?

Gavin


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-12 08:57:32
Message-ID: 1034413052.26027.7.camel@taru.tm.ee

Alvaro Herrera kirjutas L, 12.10.2002 kell 04:16:
> On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:
>
> > And it really is a minor matter of convenience. I end up dropping and
> > recreating all my tables a lot in the early stages of development, which is
> > mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
> > that a feature does something safely, and it kills all your data.
>
> Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
> any more the need to do such frequent drop/create of tables.

Did attlognum's (for changing column order) get implemented for 7.2 ?

------------
Hannu


From: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
To: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc: Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-12 09:43:37
Message-ID: Pine.GSO.4.44.0210121234220.23107-100000@paju.oulu.fi


> I cannot think of any reason why changing column order should be
> implemented in Postgres. Seems like a waste of time/more code bloat for
> something which is strictly asthetic.

What about COPY? AFAIK, COPY doesn't allow column names to be specified,
so it's not purely aesthetic...


From: Jeff Davis <list-pgsql-hackers(at)empires(dot)org>
To: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-12 09:54:20
Message-ID: 200210120254.20255.list-pgsql-hackers@empires.org

> >
> > Did attlognum's (for changing column order) get implemented for 7.2 ?
>
> I cannot think of any reason why changing column order should be
> implemented in Postgres. Seems like a waste of time/more code bloat for
> something which is strictly aesthetic.
>
> Regardless, I do have colleagues/clients who ask when such a feature will
> be implemented. Why is this useful?
>

I think even "aesthetic" might go too far. It seems mostly irrelevant except
for people who are obsessive-compulsive and work in interactive psql a
lot. It's marginally simpler to have the columns ordered the way you want, so
that you can just do "SELECT * ..." rather than "SELECT att0,att1,... ..." at
the interactive psql prompt and still get the columns in your favorite
order.

As far as I can tell, the order in which attributes are returned makes no
difference to a client application unless you're referencing attributes by
number. All the applications I've written or seen use the name instead, and
I've never heard otherwise, or heard of any advantage to referencing columns
by number.

When someone asks, ask them "why?". I'd be interested to know if they have
some other reason. I would think that if they absolutely wanted to fine-tune
the order of columns, they'd use a view (which seems a little easier than
continually changing the order around with individual SQL statements).
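As a sketch of that view approach (the table, column, and view names here are invented for illustration):

```sql
-- Suppose the table was created with an inconvenient column order:
CREATE TABLE people (nickname text, id integer, fullname text);

-- A view fixes the presentation order without touching the table:
CREATE VIEW people_ordered AS
    SELECT id, fullname, nickname
    FROM people;

-- "SELECT * FROM people_ordered" now returns id, fullname, nickname.
```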

Regards,
Jeff


From: "Shridhar Daithankar" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-12 10:50:03
Message-ID: 3DA84BB3.19627.31201C@localhost
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On 12 Oct 2002 at 2:54, Jeff Davis wrote:

> As far as I can tell, the order the attributes are returned makes no
> difference in a client application, unless you're referencing attributes by
> number. All applications that I've made or seen all use the name instead, and
> I've never heard otherwise, or heard any advantage to using numbers to
> reference columns.

Even in that case you can obtain the field number for a given name and vice versa.

> When someone asks, ask them "why?". I'd be interested to know if they have
> some other reason. I would think that if they absolutely wanted to fine-tune
> the order of columns they'd use a view (seems a little easier than
> continually changing order around by individual SQL statements).

Sounds fine, but what about that "continually changing"? A view needs a
change only if it alters the fields selected, the tables selected from, or the
selection criteria. Field order does not figure in there.

Bye
Shridhar

--
QOTD: "A child of 5 could understand this! Fetch me a child of 5."


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)tm(dot)ee>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Jeff Davis <list-pgsql-hackers(at)empires(dot)org>, "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL vs PostgreSQL.
Date: 2002-10-12 14:37:32
Message-ID: 200210121437.g9CEbWm15079@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Hannu Krosing wrote:
> Alvaro Herrera kirjutas L, 12.10.2002 kell 04:16:
> > On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:
> >
> > > And it really is a minor matter of convenience. I end up dropping and
> > > recreating all my tables a lot in the early stages of development, which is
> > > mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
> > > that a feature does something safely, and it kills all your data.
> >
> > Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
> > any more the need to do such frequent drop/create of tables.
>
> Did attlognums (for changing column order) get implemented for 7.2?

No, changing column order isn't even on the TODO list.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-12 15:45:13
Message-ID: 20021012154513.GA12337@wolff.to
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Sat, Oct 12, 2002 at 12:43:37 +0300,
Antti Haapala <antti(dot)haapala(at)iki(dot)fi> wrote:
>
> > I cannot think of any reason why changing column order should be
> > implemented in Postgres. Seems like a waste of time/more code bloat for
> > something which is strictly aesthetic.
>
> What about copy? AFAIK, copy doesn't allow column names to be specified,
> so it's not purely aesthetic...

The SQL COPY command does (at least in 7.3). The \copy psql command
doesn't seem to allow this though.
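For example, the 7.3 server-side syntax accepts an explicit column list, while the psql command at this point does not (the table and file names below are invented):

```sql
-- Server-side COPY in 7.3: load only two columns, in the stated order.
COPY mytable (col_b, col_a) FROM '/tmp/data.txt';

-- The psql \copy equivalent has no column-list support yet:
-- \copy mytable from '/tmp/data.txt'    -- works, but all columns only
```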


From: Darko Prenosil <darko(dot)prenosil(at)finteh(dot)hr>
To: shridhar_daithankar(at)persistent(dot)co(dot)in, "pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [HACKERS] MySQL vs PostgreSQL.
Date: 2002-10-12 18:58:53
Message-ID: 200210121758.53166.darko.prenosil@finteh.hr
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Saturday 12 October 2002 09:02, Shridhar Daithankar wrote:
> On 12 Oct 2002 at 11:36, Darko Prenosil wrote:
> > On Friday 11 October 2002 12:38, Shridhar Daithankar wrote:
> > > On 11 Oct 2002 at 16:20, Antti Haapala wrote:
> > > > Check out:
> > > > http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html
> > >
> > > Well, I guess there are many threads on this. You can dig around
> > > archives..
> > >
> > > > > Upgrading MySQL Server is painless. When you are upgrading MySQL
> > > > > Server, you don't need to dump/restore your data, as you have to do
> > > > > with most PostgreSQL upgrades.
> > > >
> > > > Ok... this is true, but not so hard - yesterday I installed 7.3b2
> > > > onto my linux box.
> > >
> > > Well, that remains as a point. Imagine a 100GB database on a 150GB disk
> > > array. How do you dump and reload? In place conversion of data is an
> > > absolute necessary feature and it's already on TODO.
> >
> > From PostgreSQL 7.3 Documentation :
> >
> > Use compressed dumps. Use your favorite compression program, for example
> > gzip. pg_dump dbname | gzip > filename.gz
>
> Yes, but that may not be enough. Stretch the situation: a 300GB database on
> 350GB of space. gzip can't compress better than about 3:1. And don't think
> it's imagination; I am preparing a 600GB database in the near future. I don't
> want to provision 1TB of space just to allow a redump.
>
Where do you store your regular backup (the one you keep for safety, not
for version changes)? Or are you not doing backups at all?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruno Wolff III <bruno(at)wolff(dot)to>
Cc: Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: \copy needs work (was Re: Changing Column Order)
Date: 2002-10-12 20:36:44
Message-ID: 14051.1034455004@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruno Wolff III <bruno(at)wolff(dot)to> writes:
> On Sat, Oct 12, 2002 at 12:43:37 +0300,
> Antti Haapala <antti(dot)haapala(at)iki(dot)fi> wrote:
>> What about copy? AFAIK, copy doesn't allow column names being specified,
>> so it's not purely aesthetic...

> The SQL COPY command does (at least in 7.3). The \copy psql command
> doesn't seem to allow this though.

That's an oversight; \copy should have been fixed for 7.3.

Do we want to look at this as a bug (okay to fix for 7.3) or a new
feature (wait for 7.4)?

I see something that I think is a must-fix omission in the same code:
it should allow a schema-qualified table name. So I'm inclined to fix
both problems now.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruno Wolff III <bruno(at)wolff(dot)to>, Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: \copy needs work (was Re: Changing Column Order)
Date: 2002-10-14 03:45:36
Message-ID: 200210140345.g9E3ja103663@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Tom Lane wrote:
> Bruno Wolff III <bruno(at)wolff(dot)to> writes:
> > On Sat, Oct 12, 2002 at 12:43:37 +0300,
> > Antti Haapala <antti(dot)haapala(at)iki(dot)fi> wrote:
> >> What about copy? AFAIK, copy doesn't allow column names being specified,
> >> so it's not purely aesthetic...
>
> > The SQL COPY command does (at least in 7.3). The \copy psql command
> > doesn't seem to allow this though.
>
> That's an oversight; \copy should have been fixed for 7.3.
>
> Do we want to look at this as a bug (okay to fix for 7.3) or a new
> feature (wait for 7.4)?
>
> I see something that I think is a must-fix omission in the same code:
> it should allow a schema-qualified table name. So I'm inclined to fix
> both problems now.

I don't think we can say the missing \copy column list is a bug; we never
had it in any previous release. It seems like a missing feature. The schema-name
support for COPY seems valid, though.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Bruno Wolff III <bruno(at)wolff(dot)to>, Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: \copy needs work (was Re: Changing Column Order)
Date: 2002-10-14 03:53:20
Message-ID: 5378.1034567600@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom Lane wrote:
>> Do we want to look at this as a bug (okay to fix for 7.3) or a new
>> feature (wait for 7.4)?

> I don't think we can say \copy missing columns is a bug; we never had
> it in previous release. Seems like a missing feature. The COPY schema
> names seems valid.

Well, we never had schema names in previous releases either. So I'm not
sure that I see a bright line between these items. The real issue is
that psql's \copy has failed to track the capabilities of backend COPY.
I think we should just fix it.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruno Wolff III <bruno(at)wolff(dot)to>, Antti Haapala <antti(dot)haapala(at)iki(dot)fi>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Hannu Krosing <hannu(at)tm(dot)ee>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: \copy needs work (was Re: Changing Column Order)
Date: 2002-10-14 04:12:10
Message-ID: 200210140412.g9E4CAI07835@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Do we want to look at this as a bug (okay to fix for 7.3) or a new
> >> feature (wait for 7.4)?
>
> > I don't think we can say \copy missing columns is a bug; we never had
> > it in previous release. Seems like a missing feature. The COPY schema
> > names seems valid.
>
> Well, we never had schema names in previous releases either. So I'm not
> sure that I see a bright line between these items. The real issue is
> that psql's \copy has failed to track the capabilities of backend COPY.
> I think we should just fix it.

OK, I added it to the open items list.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: "Shridhar Daithankar" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: "pgsql-general" <pgsql-general(at)postgresql(dot)org>, Darko Prenosil <darko(dot)prenosil(at)finteh(dot)hr>
Subject: Re: [HACKERS] MySQL vs PostgreSQL.
Date: 2002-10-14 06:13:46
Message-ID: 3DAAADF2.27234.8AB10FE@localhost
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On 12 Oct 2002 at 17:58, Darko Prenosil wrote:

> On Saturday 12 October 2002 09:02, Shridhar Daithankar wrote:
> > Yes, but that may not be enough. Stretch the situation: a 300GB database on
> > 350GB of space. gzip can't compress better than about 3:1. And don't think
> > it's imagination; I am preparing a 600GB database in the near future. I don't
> > want to provision 1TB of space just to allow a redump.
> >
> Where do you store your regular backup (the one you keep for safety, not
> for version changes)? Or are you not doing backups at all?

No regular backups. The data gets recycled at fixed intervals; it's not stored
permanently anyway. And this is not a single-machine database; it's a cluster
with redundant components like RAID, so the risk goes further down.

Lucky me... I didn't have to devise a regular backup scheme for such a database.

Bye
Shridhar

--
Real Time, adj.: Here and now, as opposed to fake time, which only occurs there
and then.


From: Alessio Bragadini <alessio(at)albourne(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-14 13:39:37
Message-ID: 1034602777.6205.203.camel@iris
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Sat, 2002-10-12 at 11:37, Gavin Sherry wrote:

> I cannot think of any reason why changing column order should be
> implemented in Postgres. Seems like a waste of time/more code bloat for
> something which is strictly aesthetic.
>
> Regardless, I do have colleagues/clients who ask when such a feature will
> be implemented. Why is this useful?

Does column ordering have any effect on the physical tuple layout? I've
heard discussions about keeping fixed-size fields at the beginning of
the tuple, and similar.

Sorry for the lame question. :-)

--
Alessio F. Bragadini alessio(at)albourne(dot)com
APL Financial Services http://village.albourne.com
Nicosia, Cyprus phone: +357-22-755750

"It is more complicated than you think"
-- The Eighth Networking Truth from RFC 1925


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alessio Bragadini <alessio(at)albourne(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-14 15:04:07
Message-ID: 200210141504.g9EF47W09439@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Alessio Bragadini wrote:
> On Sat, 2002-10-12 at 11:37, Gavin Sherry wrote:
>
> > I cannot think of any reason why changing column order should be
> > implemented in Postgres. Seems like a waste of time/more code bloat for
> > something which is strictly aesthetic.
> >
> > Regardless, I do have colleagues/clients who ask when such a feature will
> > be implemented. Why is this useful?
>
> Has column ordering any effect on the physical tuple disposition? I've
> heard discussions about keeping fixed-size fields at the beginning of
> the tuple and similar.
>
> Sorry for the lame question. :-)

Yes, column ordering matches the physical column ordering in the file, and
yes, there is a small penalty for accessing any column after the first
variable-length column (pg_type.typlen < 0). CHAR() used to be a fixed-length
column, but with TOAST (out-of-line storage for large values) it became
variable-length too. I don't think there is much of a performance hit, though.
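One way to see where the variable-length columns fall is to join pg_attribute against pg_type; a sketch (the table name is invented, and the ::regclass cast assumes 7.3 or later):

```sql
-- List a table's columns with their storage lengths; typlen = -1 marks
-- a variable-length column, after which offsets must be computed per row.
SELECT a.attnum, a.attname, t.typlen
FROM pg_attribute a
JOIN pg_type t ON t.oid = a.atttypid
WHERE a.attrelid = 'mytable'::regclass
  AND a.attnum > 0
ORDER BY a.attnum;
```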

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Alessio Bragadini <alessio(at)albourne(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-14 15:15:59
Message-ID: 20021014151559.GA2301@dcc.uchile.cl
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Mon, Oct 14, 2002 at 11:04:07AM -0400, Bruce Momjian wrote:
> Alessio Bragadini wrote:
> > On Sat, 2002-10-12 at 11:37, Gavin Sherry wrote:
> >
> > > I cannot think of any reason why changing column order should be
> > > implemented in Postgres. Seems like a waste of time/more code bloat for
> > > something which is strictly aesthetic.
> >
> > Has column ordering any effect on the physical tuple disposition? I've
> > heard discussions about keeping fixed-size fields at the beginning of
> > the tuple and similar.
>
> Yes, column ordering matches physical column ordering in the file, and
> yes, there is a small penalty for accessing any columns after the first
> variable-length column (pg_type.typlen < 0).

And note that if column ordering were implemented through the use
of attlognum or something similar, the physical ordering would not be
affected. The only way to physically reorder the columns would be to
completely rebuild the table.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Aprende a avergonzarte mas ante ti que ante los demas" (Democrito)


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Alessio Bragadini <alessio(at)albourne(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-15 14:37:41
Message-ID: 3DAC2835.2BA5CB60@Yahoo.com
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian wrote:
>
> Alessio Bragadini wrote:
> > On Sat, 2002-10-12 at 11:37, Gavin Sherry wrote:
> >
> > > I cannot think of any reason why changing column order should be
> > > implemented in Postgres. Seems like a waste of time/more code bloat for
> > > something which is strictly aesthetic.
> > >
> > > > Regardless, I do have colleagues/clients who ask when such a feature will
> > > be implemented. Why is this useful?
> >
> > Has column ordering any effect on the physical tuple disposition? I've
> > heard discussions about keeping fixed-size fields at the beginning of
> > the tuple and similar.
> >
> > Sorry for the lame question. :-)
>
> Yes, column ordering matches physical column ordering in the file, and
> yes, there is a small penalty for accessing any columns after the first
> variable-length column (pg_type.typlen < 0). CHAR() used to be a fixed
> length column, but with TOAST (large offline storage) it became variable
> length too. I don't think there is much of a performance hit, though.

When was char() ever fixed-size? We had fixed-size types like char, char2,
char4 ... char16. But char() is internally bpchar() and has always been
variable-length.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Alessio Bragadini <alessio(at)albourne(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-16 01:18:34
Message-ID: 200210160118.g9G1IYR18653@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Jan Wieck wrote:
> Bruce Momjian wrote:
> >
> > Alessio Bragadini wrote:
> > > On Sat, 2002-10-12 at 11:37, Gavin Sherry wrote:
> > >
> > > > I cannot think of any reason why changing column order should be
> > > > implemented in Postgres. Seems like a waste of time/more code bloat for
> > > > something which is strictly aesthetic.
> > > >
> > > > Regardless, I do have colleagues/clients who ask when such a feature will
> > > > be implemented. Why is this useful?
> > >
> > > Has column ordering any effect on the physical tuple disposition? I've
> > > heard discussions about keeping fixed-size fields at the beginning of
> > > the tuple and similar.
> > >
> > > Sorry for the lame question. :-)
> >
> > Yes, column ordering matches physical column ordering in the file, and
> > yes, there is a small penalty for accessing any columns after the first
> > variable-length column (pg_type.typlen < 0). CHAR() used to be a fixed
> > length column, but with TOAST (large offline storage) it became variable
> > length too. I don't think there is much of a performance hit, though.
>
> When was char() fixed size? We had fixed size things like char, char2,
> char4 ... char16. But char() is internally bpchar() and has always been
> variable-length.

char() was fixed-size only in that you could cache the column offsets
for char(), because it was always the same width on disk before TOAST.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Alessio Bragadini <alessio(at)albourne(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-16 03:15:11
Message-ID: 20553.1034738111@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Jan Wieck wrote:
>> When was char() fixed size?

> char() was fixed size only in that you could cache the column offsets
> for char(), because it was always the same width on disk before TOAST.

But that was already broken by MULTIBYTE.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Alessio Bragadini <alessio(at)albourne(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changing Column Order (Was Re: MySQL vs PostgreSQL.)
Date: 2002-10-16 03:19:09
Message-ID: 200210160319.g9G3J9d18325@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Jan Wieck wrote:
> >> When was char() fixed size?
>
> > char() was fixed size only in that you could cache the column offsets
> > for char(), because it was always the same width on disk before TOAST.
>
> But that was already broken by MULTIBYTE.

Yes, I think there was conditional code that enabled the optimization only
for non-multibyte builds. Of course, multibyte is now the default.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Dave Cramer <dave(at)fastcrypt(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-10-30 05:04:15
Message-ID: 200210300504.g9U54Fi16676@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Dave Cramer wrote:
> Currently there is a TODO list item to have move 0 not position to the
> end of the cursor.
>
> Moving to the end of the cursor is useful; can we keep the behaviour and
> change the syntax to MOVE END, or just leave it the way it is?

I did some research on this. It turns out the parser uses 0 for ALL, so
when you do a FETCH ALL it passes zero. So when you do MOVE 0, you are
really asking for FETCH ALL, with all the tuples thrown away because it
is a MOVE.

So that is why MOVE 0 goes to the end of the cursor. One idea would be
for MOVE 0 to actually move nothing, but JDBC and others need the
ability to move to the end of the cursor, perhaps to then back up a certain
amount and read from there. MOVE 0 seems the logical way to do that
(I can't think of another reasonable value).
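A sketch of that move-to-end pattern against a 7.2/7.3 server (the cursor and table names are invented, and the exact positioning at the end of a cursor is the behavior under discussion, so treat this as illustrative):

```sql
BEGIN;
DECLARE curs CURSOR FOR SELECT * FROM mytable;
MOVE 0 IN curs;              -- with the behavior above: runs to the end
MOVE BACKWARD 10 IN curs;    -- back up over the last ten rows
FETCH FORWARD 10 FROM curs;  -- re-read those ten rows
COMMIT;
```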

I have the following patch, which just documents the fact that MOVE 0
goes to the end of the cursor. It does not change any behavior; it only
documents it.

If/when I apply the patch, I will remove the TODO item. Another idea
would be to require MOVE END to move to the end.

Comments?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 2.8 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-10-30 18:19:27
Message-ID: 213.1036001967@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> I did some research on this. It turns out the parser uses 0 for ALL, so
> when you do a FETCH ALL it is passing zero. Now, when you do MOVE 0,
> you are really asking for FETCH ALL and all the tuples are thrown away
> because of the MOVE.

Yeah. I think this is a bug and "MOVE 0" ought to be a no-op ... but
changing it requires a different parsetree representation for MOVE ALL,
which is tedious enough that it hasn't gotten done yet.

> I have the following patch which just documents the fact that MOVE 0
> goes to the end of the cursor. It does not change any behavior, just
> document it.

It should be documented as behavior that is likely to change. Also,
I believe FETCH 0 has the same issue.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-10-30 18:32:12
Message-ID: Pine.LNX.4.44.0210301928450.2006-100000@localhost.localdomain
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian writes:

> So, that is why MOVE 0 goes to the end of the cursor. One idea would be
> for MOVE 0 to actually move nothing, but jdbc and others need the
> ability to move the end of the cursor, perhaps to then back up a certain
> amount and read from there. Seems MOVE 0 is the logical way to do that.
> (I can't think of another reasonable value).

It would seem more logical and reasonable for MOVE 0 to do nothing and
have some special syntax such as MOVE LAST to move to the end. (MOVE LAST
would actually be consistent with the standard syntax FETCH LAST.)

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-10-31 04:52:28
Message-ID: 200210310452.g9V4qTF02535@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Peter Eisentraut wrote:
> Bruce Momjian writes:
>
> > So, that is why MOVE 0 goes to the end of the cursor. One idea would be
> > for MOVE 0 to actually move nothing, but jdbc and others need the
> > ability to move the end of the cursor, perhaps to then back up a certain
> > amount and read from there. Seems MOVE 0 is the logical way to do that.
> > (I can't think of another reasonable value).
>
> It would seem more logical and reasonable for MOVE 0 to do nothing and
> have some special syntax such as MOVE LAST to move to the end. (MOVE LAST
> would actually be consistent with the standard syntax FETCH LAST.)

Yeah, I started thinking about this, and we need to get MOVE/FETCH to make
sense. The following patch makes FETCH/MOVE 0 do nothing and makes FETCH
LAST move to the end. I was going to use the word END, but if LAST is more
standard, we will use that. It uses INT_MAX in the grammar for FETCH
ALL/MOVE LAST, but maps that to zero so it is consistent in the
executor code.

I will keep this patch for 7.4.

JDBC folks, I realize you need this. It seems you will have to use MOVE 0
for 7.3 and MOVE LAST for 7.4.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 6.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-10-31 05:16:47
Message-ID: 4102.1036041407@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> The following patch makes FETCH/MOVE 0 do nothing, and FETCH LAST move
> to the end.

Do not hack up PerformPortalFetch; put the special case for INT_MAX in
utility.c's FetchStmt code, instead. As-is, you probably broke other
callers of PerformPortalFetch.

BTW, there's a comment in parsenodes.h that needs to be fixed too:

int howMany; /* amount to fetch ("ALL" --> 0) */

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: am(at)fx(dot)ro
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Cursors: getting the number of tuples; moving backwards
Date: 2002-11-01 08:23:29
Message-ID: 20021101082328.GA26000@svana.org
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Fri, Nov 01, 2002 at 12:43:48PM +0200, am(at)fx(dot)ro wrote:
> Hello everyone!
>
> I have 2 questions:
>
> --1-- Some days ago, I've been trying to get the number of tuples
> that FETCH ALL would return, *before* fetching anything.
> (the program is written in C++, using libpq ; PostgreSQL 7.2.3).

Well, to get an answer, the server needs to execute the entire query. It
won't do that unless you explicitly ask for it.

> The solution i've found was something like:
>
> int nr_tuples;
>
> res = PQexec(conn, "MOVE ALL in CURS");
> sscanf(PQcmdStatus(res),"MOVE %i",&nr_tuples);
> PQclear(res);

That would work. But why do you need to know the total beforehand? You could
just do a FETCH ALL and then use PQntuples to get the number. If you're
using it to decide whether to provide a Next link, just FETCH one more item
than you intend to display and, if you get it, display the link.

> I'm wondering: is there any better way to get that number?
>
> ( just an idea: maybe it would be useful to make PQcmdTuples
> work for MOVE commands ... ? )

Interesting idea. I'm not sure whether MOVE actually executes the query or
not.

> --2-- I found out that if i reach the end of the cursor, and want
> to move backwards, i have to increase the MOVE command's argument by 1:

No idea; the cursor has probably moved off the end to indicate the query is
done. So you need the extra one to move it back. That's just a guess, though.

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.


From: am(at)fx(dot)ro
To: pgsql-general(at)postgresql(dot)org
Subject: Cursors: getting the number of tuples; moving backwards
Date: 2002-11-01 10:43:48
Message-ID: 20021101124348.A344@coto
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Hello everyone!

I have 2 questions:

--1-- Some days ago, I've been trying to get the number of tuples
that FETCH ALL would return, *before* fetching anything.
(the program is written in C++, using libpq ; PostgreSQL 7.2.3).

The solution I've found was something like:

int nr_tuples;

res = PQexec(conn, "MOVE ALL in CURS");
sscanf(PQcmdStatus(res),"MOVE %i",&nr_tuples);
PQclear(res);

I'm wondering: is there any better way to get that number?

( just an idea: maybe it would be useful to make PQcmdTuples
work for MOVE commands ... ? )

--2-- I found out that if I reach the end of the cursor, and want
to move backwards, I have to increase the MOVE command's argument by 1:

MOVE ALL in CURS --> I get the number of tuples: 590

MOVE -590 in CURS
FETCH ALL --> I get all tuples except the first one

MOVE -591 in CURS
FETCH ALL --> I get all the tuples

MOVE -1 in CURS
FETCH ALL --> I get nothing!

MOVE -2 in CURS
FETCH ALL --> I get the last tuple

This happens only if the current position is at the end of the cursor.

Is this the normal behaviour?

Best regards,
Adrian Maier
(am(at)fx(dot)ro)


From: am(at)fx(dot)ro
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Cursors: getting the number of tuples; moving backwards
Date: 2002-11-01 18:14:33
Message-ID: 20021101201433.A568@coto
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Fri, Nov 01, 2002 at 07:23:29PM +1100, Martijn van Oosterhout wrote:
> > The solution i've found was something like:
> >
> > int nr_tuples;
> >
> > res = PQexec(conn, "MOVE ALL in CURS");
> > sscanf(PQcmdStatus(res),"MOVE %i",&nr_tuples);
> > PQclear(res);
>
> That would work. But why do you need to know the total beforehand? You could
> just do a FETCH ALL and then use PQntuples to get the number.

If the table has, let's say, 10000 rows, it's unlikely that the user
will ever browse all of them (my program permits the user to set some
filters; the interface is ncurses-based). Fetching everything
would be unnecessary.

So, for speed reasons, I prefer to fetch at most 500 rows.
But I want to display the total number of rows in the
screen's corner.

> > I'm wondering: is there any better way to get that number?
> >
> > ( just an idea: maybe it would be useful to make PQcmdTuples
> > work for MOVE commands ... ? )
>
> Interesting idea. I'm not sure whether MOVE actually executes the query or
> not.

I guess it doesn't execute the whole query. MOVE ALL is *much*
faster than FETCH ALL + PQcmdTuples

> > --2-- I found out that if i reach the end of the cursor, and want
> > to move backwards, i have to increase the MOVE command's argument by 1:
>
> No idea, the cursor has probably moved off the end to indicate the query is
> done. So you need the extra one to move it back. That's just a guess though.

Yeah, this could be the explanation.

Thanks for your answer

Adrian Maier


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: am(at)fx(dot)ro
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Cursors: getting the number of tuples; moving backwards
Date: 2002-11-02 02:39:16
Message-ID: 20021102023916.GC2571@svana.org
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Fri, Nov 01, 2002 at 08:14:33PM +0200, am(at)fx(dot)ro wrote:
> On Fri, Nov 01, 2002 at 07:23:29PM +1100, Martijn van Oosterhout wrote:
> > That would work. But why do you need to know the total beforehand? You could
> > just do a FETCH ALL and then use PQntuples to get the number.
>
> If the table has, let's say, 10000 rows, it's unlikely that the user
> will ever browse all of them ( my program permits the user to set some
> filters ; the interface is ncurses-based). Fetching everything
> would be unnecessary.
>
> So, for speed reasons, i prefer to fetch maximum 500 rows.
> But i want to display in the screen's corner the total number
> of rows .

Maybe do what Google does. If there are lots of rows, give an estimate. I
don't know how they do it, but if there are more than 1000 rows then the user
probably won't care whether you wrote 1000, 2000 or a million.

Maybe some whacky curve fitting. If there's still a 98% match after 100
matches, there must be around 5000 matches.

> > Interesting idea. I'm not sure whether MOVE actually executes the query or
> > not.
>
> I guess it doesn't execute the whole query. MOVE ALL is *much*
> faster than FETCH ALL + PQcmdTuples

Curious. I wonder how it does it then.

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: am(at)fx(dot)ro, pgsql-general(at)postgresql(dot)org
Subject: Re: Cursors: getting the number of tuples; moving backwards
Date: 2002-11-02 03:03:17
Message-ID: 17095.1036206197@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
>> I guess it doesn't execute the whole query. MOVE ALL is *much*
>> faster than FETCH ALL + PQcmdTuples

> Curious. I wonder how it does it then.

MOVE does execute the query, it just doesn't ship the tuples to the
client. This would save some formatting overhead (no need to run
the datatype I/O conversion procedures), but unless you have a slow
network link between client and server I would not expect it to be
"much" faster ...

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-11-02 03:14:56
Message-ID: 200211020314.gA23EuN07245@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > The following patch makes FETCH/MOVE 0 do nothing, and FETCH LAST move
> > to the end.
>
> Do not hack up PerformPortalFetch; put the special case for INT_MAX in
> utility.c's FetchStmt code, instead. As-is, you probably broke other
> callers of PerformPortalFetch.

I thought about that, but I need to fail if the cursor name is invalid.
Those tests are done in PerformPortalFetch(). The good news is that no
one else calls it. Other ideas?

> BTW, there's a comment in parsenodes.h that needs to be fixed too:
>
> int howMany; /* amount to fetch ("ALL" --> 0) */

Done.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-11-02 04:49:32
Message-ID: 17718.1036212572@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom Lane wrote:
>> Do not hack up PerformPortalFetch; put the special case for INT_MAX in
>> utility.c's FetchStmt code, instead. As-is, you probably broke other
>> callers of PerformPortalFetch.

> I thought about that, but I need to fail if the cursor name is invalid.

What has that got to do with it?

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Dave Cramer <dave(at)fastcrypt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-11-02 05:41:28
Message-ID: 200211020541.gA25fS329715@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Do not hack up PerformPortalFetch; put the special case for INT_MAX in
> >> utility.c's FetchStmt code, instead. As-is, you probably broke other
> >> callers of PerformPortalFetch.
>
> > I thought about that, but I need to fail if the cursor name is invalid.
>
> What has that got to do with it?

If I put the 'return' for 0 MOVE/FETCH in utility.c's FetchStmt code, I
will not get the checks for invalid cursor names, and I will not get the
proper return tag. I don't see how to do anything in utility.c. I
assume this is the code you want to move to utility.c:

+ /* If zero count, we are done */
+ if (count == 0)
+ return;
+
+ /* Internally, zero count processes all portal rows */
+ if (count == INT_MAX)
+ count = 0;
+

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-11-02 15:06:35
Message-ID: 21151.1036249595@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom Lane wrote:
>> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
>>> I thought about that, but I need to fail if the cursor name is invalid.
>>
>> What has that got to do with it?

> If I put the 'return' for 0 MOVE/FETCH in utility.c's FetchStmt code, I
> will not get the checks for invalid cursor names, and I will not get the
> proper return tag.

Oh, I see. Yeah, you're probably right, we have to change the calling
convention for PerformPortalFetch.

BTW, portalcmds.h also contains a comment that would need to be fixed.

regards, tom lane


From: am(at)fx(dot)ro
To: pgsql-general(at)postgresql(dot)org
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, am(at)fx(dot)ro, pgsql-general(at)postgresql(dot)org
Subject: Re: Cursors: getting the number of tuples; moving backwards
Date: 2002-11-02 19:33:43
Message-ID: 20021102213342.A292@coto
Lists: pgsql-general pgsql-hackers pgsql-jdbc

On Fri, Nov 01, 2002 at 10:03:17PM -0500, Tom Lane wrote:
> MOVE does execute the query, it just doesn't ship the tuples to the
> client. This would save some formatting overhead (no need to run
> the datatype I/O conversion procedures), but unless you have a slow
> network link between client and server I would not expect it to be
> "much" faster ...

It must be because the computer is quite old: a Cyrix 6x86 at 166 MHz
(this is not the deployment machine).

Using MOVE is about 5 times faster in my case:
for 150784 tuples in the table, FETCH-ing took about 1m30s,
while MOVE-ing took only about 17s.

| Real | User | Sys
-------------------------------------------------------------------
select * from PRODTEST | 1m30.843s | 0m42.960s | 0m1.720s
-------------------------------------------------------------------
declare cursor... + FETCH | 1m32.835s | 0m42.680s | 0m1.780s
-------------------------------------------------------------------
declare cursor... + MOVE | 0m17.215s | 0m0.030s | 0m0.030s
-------------------------------------------------------------------
(I used commands like: time psql -f test.sql db_rex
to get those timings.)

The difference must be smaller on fast machines.

So I guess that my computer is pretty good when it comes to finding
performance problems in applications ;-)

Bye,
Adrian Maier
(am(at)fx(dot)ro)


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move 0 behaviour
Date: 2002-11-03 01:21:11
Message-ID: 200211030121.gA31LBd00398@candle.pha.pa.us
Lists: pgsql-general pgsql-hackers pgsql-jdbc

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> >>> I thought about that, but I need to fail if the cursor name is invalid.
> >>
> >> What has that got to do with it?
>
> > If I put the 'return' for 0 MOVE/FETCH in utility.c's FetchStmt code, I
> > will not get the checks for invalid cursor names, and I will not get the
> > proper return tag.
>
> Oh, I see. Yeah, you're probably right, we have to change the calling
> convention for PerformPortalFetch.
>
> BTW, portalcmds.h also contains a comment that would need to be fixed.

Updated.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073