Efficiency question: VARCHAR with empty string vs NULL

Lists: pgsql-generalpgsql-sql
From: "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>
To: <pgsql-general(at)postgresql(dot)org>, <pgsql-sql(at)postgresql(dot)org>
Subject: Performance issues with compaq server
Date: 2002-05-07 22:40:32
Message-ID: 001501c1f618$32d0d780$110a010a@headquarters.wcgroup.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

I tried to load data using insert statement to our new database server (Compaq DL580, DL590, DL380 using SCSI driver) but it was very very slow. The database runs on Red Hat Linux 7.2 and PostgreSQL version 7.1.3

Does anybody know any idea to what cause this problem ? or things that I need to look or check (in the config. file)

I appreciate any help.

Regards,
Samuel


From: postgres(at)vrane(dot)com
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance issues with compaq server
Date: 2002-05-07 23:21:32
Message-ID: 20020507192132.A3944@amd.universe
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

One thing you can try is to use "-F"
switch which will turn off fsync.

$ man postmaster

There are other possibilities however

On Tue, May 07, 2002 at 06:40:32PM -0400, Samuel J. Sutjiono wrote:
> I tried to load data using insert statement to our new database server (Compaq DL580, DL590, DL380 using SCSI driver) but it was very very slow. The database runs on Red Hat Linux 7.2 and PostgreSQL version 7.1.3
>
> Does anybody know any idea to what cause this problem ? or things that I need to look or check (in the config. file)
>
> I appreciate any help.
>
> Regards,
> Samuel


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-sql(at)postgresql(dot)org
Subject: Re: Performance issues with compaq server
Date: 2002-05-08 01:25:27
Message-ID: 20020508112527.A17584@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Tue, May 07, 2002 at 06:40:32PM -0400, Samuel J. Sutjiono wrote:

> I tried to load data using insert statement to our new database server
> (Compaq DL580, DL590, DL380 using SCSI driver) but it was very very slow.
> The database runs on Red Hat Linux 7.2 and PostgreSQL version 7.1.3
>
> Does anybody know any idea to what cause this problem ? or things that I
> need to look or check (in the config. file)

Put them in a trasaction (begin/commit). Without that, each insert becomes
it's own transaction which is rather expensive. Postgresql 7.2 improves this
a bit but the transaction will help anyway.

HTH,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America


From: Holger Marzen <holger(at)marzen(dot)de>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, <pgsql-general(at)postgresql(dot)org>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-08 07:05:50
Message-ID: Pine.LNX.4.44.0205080903080.11632-100000@bluebell.marzen.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Wed, 8 May 2002, Martijn van Oosterhout wrote:

> On Tue, May 07, 2002 at 06:40:32PM -0400, Samuel J. Sutjiono wrote:
>
> > I tried to load data using insert statement to our new database server
> > (Compaq DL580, DL590, DL380 using SCSI driver) but it was very very slow.
> > The database runs on Red Hat Linux 7.2 and PostgreSQL version 7.1.3
> >
> > Does anybody know any idea to what cause this problem ? or things that I
> > need to look or check (in the config. file)
>
> Put them in a trasaction (begin/commit). Without that, each insert becomes
> it's own transaction which is rather expensive. Postgresql 7.2 improves this
> a bit but the transaction will help anyway.

ACK. On a given hardware I get about 150 inserts per second. Using a
begin/end transaction for a group of 100 inserts speeds it up to about
450 inserts per second.

But beware: if one insert fails (duplicate key, faulty data) then you
have to re-insert the remaining rows as single transactions, else all
rows of the previous transaction are discarded.

--
PGP/GPG Key-ID:
http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1


From: Doug McNaught <doug(at)wireboard(dot)com>
To: holger(at)marzen(dot)de
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, <pgsql-general(at)postgresql(dot)org>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-08 15:02:35
Message-ID: m3offqk7hw.fsf@varsoon.wireboard.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Holger Marzen <holger(at)marzen(dot)de> writes:

> ACK. On a given hardware I get about 150 inserts per second. Using a
> begin/end transaction for a group of 100 inserts speeds it up to about
> 450 inserts per second.

COPY is even faster as there is less query parsing to be done, plus
you get a transaction per COPY statement even without BEGIN/END.

> But beware: if one insert fails (duplicate key, faulty data) then you
> have to re-insert the remaining rows as single transactions, else all
> rows of the previous transaction are discarded.

Hmm don't you have to ROLLBACK and redo the whole transaction without
the offending row(s), since you can't commit while in ABORT state? Or
am I misunderstanding?

-Doug


From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance issues with compaq server
Date: 2002-05-08 17:53:08
Message-ID: 20020508105308.A14127@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Wed, May 08, 2002 at 11:02:35AM -0400, Doug McNaught wrote:
> Holger Marzen <holger(at)marzen(dot)de> writes:
>
> > ACK. On a given hardware I get about 150 inserts per second. Using a
> > begin/end transaction for a group of 100 inserts speeds it up to about
> > 450 inserts per second.
>
> COPY is even faster as there is less query parsing to be done, plus
> you get a transaction per COPY statement even without BEGIN/END.

Does anyone have any performance figures to hand on COPY vs BEGIN/INSERT/END,
on indexed or unindexed tables?

Cheers,
Steve


From: Holger Marzen <holger(at)marzen(dot)de>
To: Doug McNaught <doug(at)wireboard(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, <pgsql-general(at)postgresql(dot)org>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-08 20:01:05
Message-ID: Pine.LNX.4.44.0205082155550.7269-100000@bluebell.marzen.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On 8 May 2002, Doug McNaught wrote:

> Holger Marzen <holger(at)marzen(dot)de> writes:
>
> > ACK. On a given hardware I get about 150 inserts per second. Using a
> > begin/end transaction for a group of 100 inserts speeds it up to about
> > 450 inserts per second.
>
> COPY is even faster as there is less query parsing to be done, plus
> you get a transaction per COPY statement even without BEGIN/END.

Yes, but I wanted to change something in some rows, so I used perl and
insert.

> > But beware: if one insert fails (duplicate key, faulty data) then you
> > have to re-insert the remaining rows as single transactions, else all
> > rows of the previous transaction are discarded.
>
> Hmm don't you have to ROLLBACK and redo the whole transaction without
> the offending row(s), since you can't commit while in ABORT state? Or
> am I misunderstanding?

Postgres complains and doesn't accept the following inserts after a
failed one until end of transaction. I didn't have the time yet to
figure out if it rolls back the preceeding inserts.

Is there a rule in SQL standards that describes what should happen if
some statemens in a transaction fail and the program issues a commit?

--
PGP/GPG Key-ID:
http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1


From: Mike Diehl <jdiehl(at)sandia(dot)gov>
To: pgsql-sql(at)postgresql(dot)org
Subject: having trouble w/ having clause...
Date: 2002-05-08 20:58:59
Message-ID: 20020507215118.239013F33@dominion.dyndns.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Hi all,

This is a simple question, but for the life of me, I can't see what is wrong with this query:

select host,slot,port,count(mac) as d from cam group by host,slot,port having d>1;

I'm running 7.2.

Thanx in advance,

--
Mike Diehl
Network Tools Devl.
Sandia Labs
(505) 284-3137


From: Ian Barwick <barwick(at)gmx(dot)net>
To: jdiehl(at)sandia(dot)gov, pgsql-sql(at)postgresql(dot)org
Subject: Re: having trouble w/ having clause...
Date: 2002-05-08 23:13:22
Message-ID: 200205090113.22872.barwick@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Wednesday 08 May 2002 22:58, Mike Diehl wrote:
> Hi all,
>
> This is a simple question, but for the life of me, I can't see what is
> wrong with this query:
>
> select host,slot,port,count(mac) as d from cam group by host,slot,port
> having d>1;

Try:

select host,slot,port,count(mac) as d from cam group by host,slot,port
having count(mac) > 1

See:

http://www.postgresql.org/idocs/index.php?sql-select.html

"output_name

Specifies another name for an output column using the AS clause. This name is
primarily used to label the column for display. It can also be used to refer
to the column's value in ORDER BY and GROUP BY clauses. But the output_name
cannot be used in the WHERE or HAVING clauses; write out the expression
instead."

HTH

Ian Barwick


From: Doug McNaught <doug(at)wireboard(dot)com>
To: holger(at)marzen(dot)de
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-09 04:24:52
Message-ID: m3offqhrsb.fsf@varsoon.wireboard.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Holger Marzen <holger(at)marzen(dot)de> writes:

> On 8 May 2002, Doug McNaught wrote:
>
> > Hmm don't you have to ROLLBACK and redo the whole transaction without
> > the offending row(s), since you can't commit while in ABORT state? Or
> > am I misunderstanding?
>
> Postgres complains and doesn't accept the following inserts after a
> failed one until end of transaction. I didn't have the time yet to
> figure out if it rolls back the preceeding inserts.

I'm pretty sure that the transaction goes into ABORT state and all you
can do is rollback and start over, minus the offending row(s).

> Is there a rule in SQL standards that describes what should happen if
> some statemens in a transaction fail and the program issues a commit?

I think PG's is standard behavior; that's kind of the whole point of
having transactions.

-Doug


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Holger Marzen <holger(at)marzen(dot)de>
Cc: pgsql-general(at)postgresql(dot)org, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-09 07:11:59
Message-ID: Pine.NEB.4.43.0205091611370.454-100000@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Wed, 8 May 2002, Holger Marzen wrote:

> > COPY is even faster as there is less query parsing to be done, plus
> > you get a transaction per COPY statement even without BEGIN/END.
>
> Yes, but I wanted to change something in some rows, so I used perl and
> insert.

Why not use Perl and COPY?

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Holger Marzen <holger(at)marzen(dot)de>
To: Doug McNaught <doug(at)wireboard(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-09 11:23:17
Message-ID: Pine.LNX.4.44.0205091321110.4909-100000@bluebell.marzen.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On 9 May 2002, Doug McNaught wrote:

> Holger Marzen <holger(at)marzen(dot)de> writes:
>
> > Is there a rule in SQL standards that describes what should happen if
> > some statemens in a transaction fail and the program issues a commit?
>
> I think PG's is standard behavior; that's kind of the whole point of
> having transactions.

- rolling back the whole transaction if you do a COMMIT
or
- keeping the changes until the first failing statement and ignoring
everything that follows if you do a COMMIT

?

--
PGP/GPG Key-ID:
http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1


From: Holger Marzen <holger(at)marzen(dot)de>
To: Curt Sampson <cjs(at)cynic(dot)net>
Cc: pgsql-general(at)postgresql(dot)org, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-09 11:24:12
Message-ID: Pine.LNX.4.44.0205091323390.4909-100000@bluebell.marzen.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Thu, 9 May 2002, Curt Sampson wrote:

> On Wed, 8 May 2002, Holger Marzen wrote:
>
> > > COPY is even faster as there is less query parsing to be done, plus
> > > you get a transaction per COPY statement even without BEGIN/END.
> >
> > Yes, but I wanted to change something in some rows, so I used perl and
> > insert.
>
> Why not use Perl and COPY?

A good idea. I'll try it if the customer complains about the speed.

--
PGP/GPG Key-ID:
http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1


From: John Gray <jgray(at)azuli(dot)co(dot)uk>
To: holger(at)marzen(dot)de
Cc: Doug McNaught <doug(at)wireboard(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Performance issues with compaq server
Date: 2002-05-09 11:43:33
Message-ID: 1020944616.1592.103.camel@adzuki
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Thu, 2002-05-09 at 12:23, Holger Marzen wrote:
> On 9 May 2002, Doug McNaught wrote:
>
> > Holger Marzen <holger(at)marzen(dot)de> writes:
> >
> > > Is there a rule in SQL standards that describes what should happen if
> > > some statemens in a transaction fail and the program issues a commit?
> >
> > I think PG's is standard behavior; that's kind of the whole point of
> > having transactions.
>
> - rolling back the whole transaction if you do a COMMIT
> or
> - keeping the changes until the first failing statement and ignoring
> everything that follows if you do a COMMIT
>

I can't speak to the phrasing of the standard, but PG behaviour is:

Everything in the transaction is ignored. All the inserts made before
the error, since the BEGIN, are rolled back. All subsequent inserts are
ignored (generating an error that transaction is in ABORT state). The
subsequent COMMIT will end the transaction (and thus clear the abort
state) but *will not commit anything*.

This means that scripts will work as intended -namely that all or none
of the commands within a transaction block will succeed[*]. So if you
split your insert into blocks, and one block fails, you will only need
to correct that block and try to load it again.

Regards

John

[*] This is handy for doing things like reindexing or drop/recreate
table in a script. The drop table stands a good chance of succeeding,
whereas the prospect of a syntax error in the create table makes it more
likely to fail. A subsequent commit of the DROP TABLE would not produce
the end result you intended....

--
John Gray
Azuli IT
www.azuli.co.uk


From: Denis Gasparin <denis(at)edistar(dot)com>
To: "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>
Cc: Postgresql General Mailing List <pgsql-general(at)postgresql(dot)org>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Performance issues with compaq server
Date: 2002-05-09 12:29:50
Message-ID: 1020947390.2035.3.camel@edspctec12
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

The Compaq servers you indicated use the smart array controller which is
very very slow in storing data into hard disks. Are you using the RAID?

I suggest you to buy a controller with a larger write cache (the smart
array controller does not have w-cache)... You should note a large
performance progress...

Also do not forget to set fsync to false and to use the COPY command
whenever is possible...

Bye!
--
Doct. Eng. Denis Gasparin: denis(at)edistar(dot)com
---------------------------
Programmer & System Administrator - Edistar srl

Il mer, 2002-05-08 alle 00:40, Samuel J. Sutjiono ha scritto:
> I tried to load data using insert statement to our new database server (Compaq DL580, DL590, DL380 using SCSI driver) but it was very very slow. The database runs on Red Hat Linux 7.2 and PostgreSQL version 7.1.3
>
> Does anybody know any idea to what cause this problem ? or things that I need to look or check (in the config. file)
>
> I appreciate any help.
>
> Regards,
> Samuel


From: Mike Diehl <jdiehl(at)sandia(dot)gov>
To: "Ian Barwick" <barwick(at)gmx(dot)net>, pgsql-sql(at)postgresql(dot)org
Subject: Re: having trouble w/ having clause...
Date: 2002-05-09 19:21:42
Message-ID: 20020508201406.6FC763FEF@dominion.dyndns.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Yep, this fixed it. Thanx,

On Wednesday 08 May 2002 05:13 pm, Ian Barwick wrote:
> On Wednesday 08 May 2002 22:58, Mike Diehl wrote:
> > Hi all,
> >
> > This is a simple question, but for the life of me, I can't see what
> > is wrong with this query:
> >
> > select host,slot,port,count(mac) as d from cam group by
> > host,slot,port having d>1;
>
> Try:
>
> select host,slot,port,count(mac) as d from cam group by host,slot,port
> having count(mac) > 1
>
> See:
>
> http://www.postgresql.org/idocs/index.php?sql-select.html
>
> "output_name
>
> Specifies another name for an output column using the AS clause. This
> name is primarily used to label the column for display. It can also be
> used to refer to the column's value in ORDER BY and GROUP BY clauses.
> But the output_name cannot be used in the WHERE or HAVING clauses;
> write out the expression instead."
>
> HTH
>
> Ian Barwick

--
Mike Diehl
Network Tools Devl.
Sandia Labs
(505) 284-3137


From: Curt Sampson <cjs(at)cynic(dot)net>
To: Denis Gasparin <denis(at)edistar(dot)com>
Cc: "Samuel J(dot) Sutjiono" <ssutjiono(at)wc-group(dot)com>, Postgresql General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance issues with compaq server
Date: 2002-05-10 07:49:29
Message-ID: Pine.NEB.4.43.0205101648540.541-100000@angelic.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On 9 May 2002, Denis Gasparin wrote:

> The Compaq servers you indicated use the smart array controller which is
> very very slow in storing data into hard disks. Are you using the RAID?
>
> I suggest you to buy a controller with a larger write cache (the smart
> array controller does not have w-cache)... You should note a large
> performance progress...

And take your transaction log off of RAID entirely and put it on
a mirrored pair of disks instead.

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC


From: Steve Lane <slane(at)fmpro(dot)com>
To: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>, <pgsql-general(at)postgresql(dot)org>
Cc: <jlx(at)commandprompt(dot)com>
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-21 01:59:32
Message-ID: B90F1435.D909%slane@fmpro.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On 5/20/02 9:25 PM, "Command Prompt, Inc." <pgsql-general(at)commandprompt(dot)com>
wrote:

> Good day,
>
> Does anyone have any experience in increasing the amount of shared memory
> buffers available to Mac OS X/Darwin? When starting with more than 128
> backend connections on a test G3 I encounter:
>
> ------------------------------------------------------------------------------
> -----
> IpcMemoryCreate: shmget(key=5432001, size=33357824, 03600) failed: Invalid
> argument
>
> This error usually means that PostgreSQL's request for a shared memory
> segment exceeded your kernel's SHMMAX parameter. You can either
> reduce the request size or reconfigure the kernel with larger SHMMAX.
> ------------------------------------------------------------------------------
> -----
>
> However, I can find no helpful information looking around for how Darwin
> handles this configuration. Hopefully it does not require a recompile of
> the Kernel. ;)

Actually --- :-> I'm no expert on this topic, but Darwin is BSD and I do
recall a post on this very topic, to the effect that the poster believed
some flavors of BSD, possibly MOSX included, do require rebuilding the
kernel to change this value.

As a MOSX user I'd certainly love to be found wrong.

-- sgl

=======================================================
Steve Lane

Vice President
Chris Moyer Consulting, Inc.
833 West Chicago Ave Suite 203

Voice: (312) 433-2421 Email: slane(at)fmpro(dot)com
Fax: (312) 850-3930 Web: http://www.fmpro.com
=======================================================


From: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Cc: <jlx(at)commandprompt(dot)com>
Subject: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-21 02:25:11
Message-ID: Pine.LNX.4.30.0205201919440.902-100000@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Good day,

Does anyone have any experience in increasing the amount of shared memory
buffers available to Mac OS X/Darwin? When starting with more than 128
backend connections on a test G3 I encounter:

-----------------------------------------------------------------------------------
IpcMemoryCreate: shmget(key=5432001, size=33357824, 03600) failed: Invalid argument

This error usually means that PostgreSQL's request for a shared memory
segment exceeded your kernel's SHMMAX parameter. You can either
reduce the request size or reconfigure the kernel with larger SHMMAX.
-----------------------------------------------------------------------------------

However, I can find no helpful information looking around for how Darwin
handles this configuration. Hopefully it does not require a recompile of
the Kernel. ;)

Any knowledge on this topic would be greatly appreciated.

Best Regards,
Jw.
--
jlx(at)commandprompt(dot)com
by way of pgsql-general(at)commandprompt(dot)com
http://www.postgresql.info/
http://www.commandprompt.com/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, jlx(at)commandprompt(dot)com
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-21 03:09:23
Message-ID: 17483.1021950563@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

"Command Prompt, Inc." <pgsql-general(at)commandprompt(dot)com> writes:
> Does anyone have any experience in increasing the amount of shared memory
> buffers available to Mac OS X/Darwin?

A little bit of googling turns up Apple bug reports stating that OSX
has a hardwired limit on shared memory --- so no, short of modifying
and recompiling the kernel, you're outta luck.

regards, tom lane


From: Gregory Seidman <gss+pg(at)cs(dot)brown(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-22 17:02:51
Message-ID: 20020522130251.B7586@jamaica.cs.brown.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Command Prompt, Inc. sez:
[...]
} [...actually, a minute later of looking at the source...]
}
} Okay, so it appears to be when it's checking sockets for data in
} src/backend/postmaster/pgstat.c, line 1579...I just tried re-building the
} kernel with a higher value for kern.ipc.maxsockets but that didn't have
} any effect, still get that error.
}
} Anyone have any ideas on what specifically would cause this part of pgstat
} to fail?

Recompiling the kernel was necessary to change the shared memory limit, but
kern.ipc.maxsockets can be changed on the fly with sysctl. It might be
getting changed in one of the scripts (perhaps SystemTuning) in
/System/Library/StartupItems (there are also sometimes scripts in
/Library/StartupItems; both directories are used in the startup process).
When your system is up and running, try sysctl kern.ipc.maxsockets and see
if the value is what you expected. If not, dig around in the StartupItems
directories and see if something is changing it.

} Best regards,
} Jw.
--Greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>
Cc: Steve Lane <slane(at)fmpro(dot)com>, jlx(at)commandprompt(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-22 17:20:32
Message-ID: 11281.1022088032@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

"Command Prompt, Inc." <pgsql-general(at)commandprompt(dot)com> writes:
> I was able to pull down the source for the Kernel, and increased the
> SHMMAX by a factor of 16. Upgraded to the new Kernel and I am now able to
> get safely 512 connections, even up around 900, but bumping it up to 1024,
> I run into the following error:

> PGSTATBUFF: recvfrom(2): Resource temporarily unavailable
> DEBUG: statistics collector process (pid 1988) exited with exit code 1

> ...which then repeats itself infinitely until the calling process is
> stopped. ;)

[ scratches head... ] "Resource temporarily unavailable" is EAGAIN
according to /usr/include/sys/errno.h on my OSX machine. But the man
page for recvfrom doesn't mention any plausible reasons for EAGAIN to
be signaled. select() just told us there was data available on the
socket, so WTF? Could this be a kernel bug?

You could try modifying pgstat.c to continue its loop rather than
exiting after it gets a recvfrom error. But if the error condition
recurs that'll just put pgstat.c into an infinite loop, so I'm not
sure this is any solution --- just a way of gathering more data.

regards, tom lane


From: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>
To: Steve Lane <slane(at)fmpro(dot)com>
Cc: <jlx(at)commandprompt(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-22 17:30:00
Message-ID: Pine.LNX.4.30.0205220939590.14696-100000@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Mon, 20 May 2002, Steve Lane wrote:
>Actually --- :-> I'm no expert on this topic, but Darwin is BSD and I do
>recall a post on this very topic, to the effect that the poster believed
>some flavors of BSD, possibly MOSX included, do require rebuilding the
>kernel to change this value.
>
>As a MOSX user I'd certainly love to be found wrong.

Well, after doing some digging in the OS X Darwin Developer archives, it
definitely appears that at this stage in Darwin's design it DOES require a
re-build of the Kernel -- thanks to those of you who pointed me in that
direction. ;)

Fortunately it's a fairly painless process I've found; you just need to
sign up with the Apple Open Source site, and you have instant CVS access.

I was able to pull down the source for the Kernel, and increased the
SHMMAX by a factor of 16. Upgraded to the new Kernel and I am now able to
get safely 512 connections, even up around 900, but bumping it up to 1024,
I run into the following error:

PGSTATBUFF: recvfrom(2): Resource temporarily unavailable
DEBUG: statistics collector process (pid 1988) exited with exit code 1

...which then repeats itself infinitely until the calling process is
stopped. ;)

I've upped the limit on the number of open files with ulimit, but I was
wondering if anyone knows what specifically invokes this "resource
temporarily unavailable" message from PGSTATBUFF? I don't see it in the
source casually grepping around, so is it I take it the result of the
recvfrom() function?

[...actually, a minute later of looking at the source...]

Okay, so it appears to be when it's checking sockets for data in
src/backend/postmaster/pgstat.c, line 1579...I just tried re-building the
kernel with a higher value for kern.ipc.maxsockets but that didn't have
any effect, still get that error.

Anyone have any ideas on what specifically would cause this part of pgstat
to fail?

Best regards,
Jw.
--
jlx(at)commandprompt(dot)com
by way of pgsql-general(at)commandprompt(dot)com
http://www.postgresql.info/
http://www.commandprompt.com/


From: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>
To: Gregory Seidman <gss+pg(at)cs(dot)brown(dot)edu>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-22 18:57:43
Message-ID: Pine.LNX.4.30.0205221152310.18514-100000@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Wed, 22 May 2002, Gregory Seidman wrote:
>Command Prompt, Inc. sez:
>[...]
>}Okay, so it appears to be when it's checking sockets for data in
>}src/backend/postmaster/pgstat.c, line 1579...I just tried re-building the
>}kernel with a higher value for kern.ipc.maxsockets but that didn't have
>}any effect, still get that error.
>}
>}Anyone have any ideas on what specifically would cause this part of pgstat
>}to fail?
>Recompiling the kernel was necessary to change the shared memory limit, but
>kern.ipc.maxsockets can be changed on the fly with sysctl.

Hm. I tried that, with "sysctl -w kern.ipcmaxsockets=2048", and got:

sysctl: oid 'kern.ipc.maxsockets' is read only

>It might be getting changed in one of the scripts (perhaps SystemTuning) in
>/System/Library/StartupItems (there are also sometimes scripts in
>/Library/StartupItems; both directories are used in the startup process).
>When your system is up and running, try sysctl kern.ipc.maxsockets and see
>if the value is what you expected. If not, dig around in the StartupItems
>directories and see if something is changing it.

After recompile it is actually the value I expected. It appears that the
problem is something less clear. :/

Jw.
--
jlx(at)commandprompt(dot)com
by way of pgsql-general(at)commandprompt(dot)com
http://www.postgresql.info/
http://www.commandprompt.com/


From: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>, Steve Lane <slane(at)fmpro(dot)com>, jlx(at)commandprompt(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-23 14:09:53
Message-ID: Pine.LNX.4.21.0205231506380.12663-100000@ponder.fairway2k.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

On Wed, 22 May 2002, Tom Lane wrote:

> "Command Prompt, Inc." <pgsql-general(at)commandprompt(dot)com> writes:
> > I was able to pull down the source for the Kernel, and increased the
> > SHMMAX by a factor of 16. Upgraded to the new Kernel and I am now able to
> > get safely 512 connections, even up around 900, but bumping it up to 1024,
> > I run into the following error:
>
> > PGSTATBUFF: recvfrom(2): Resource temporarily unavailable
> > DEBUG: statistics collector process (pid 1988) exited with exit code 1
>
> > ...which then repeats itself infinitely until the calling process is
> > stopped. ;)
>
> [ scratches head... ] "Resource temporarily unavailable" is EAGAIN
> according to /usr/include/sys/errno.h on my OSX machine. But the man
> page for recvfrom doesn't mention any plausible reasons for EAGAIN to
> be signaled. select() just told us there was data available on the
> socket, so WTF? Could this be a kernel bug?
>
> You could try modifying pgstat.c to continue its loop rather than
> exiting after it gets a recvfrom error. But if the error condition
> recurs that'll just put pgstat.c into an infinite loop, so I'm not
> sure this is any solution --- just a way of gathering more data.

My 'standard' way of doing this is to loop while the error is EAGAIN but with a
small retry counter to limit the number of attempts made. Hopefully the second
time through the call isn't interrupted again. Therefore I'm usually quite
happy saying that after something like 5 calls to a routine then there really
is an error.

Isn't this how people normally handle this?

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
Cc: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>, Steve Lane <slane(at)fmpro(dot)com>, jlx(at)commandprompt(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: MacOS X Shared Buffers (SHMMAX)?
Date: 2002-05-23 14:49:13
Message-ID: 5589.1022165353@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

"Nigel J. Andrews" <nandrews(at)investsystems(dot)co(dot)uk> writes:
> My 'standard' way of doing this is to loop while the error is EAGAIN
> but with a small retry counter to limit the number of attempts
> made. Hopefully the second time through the call isn't interrupted
> again.

I think you're thinking of EINTR, not EAGAIN.

regards, tom lane


From: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Efficiency question: VARCHAR with empty string vs NULL
Date: 2002-05-24 17:20:43
Message-ID: 5.1.0.14.2.20020524131720.02c09fa0@mail.pexicom.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Hello,

I have a table with about 15-20 VARCHAR items. I use VARCHAR on the
assumption that it is more space efficient and not too much slower than
CHAR, especially when you expect many of those VARCHAR to be null or empty
strings.

Is there a performance difference with VARCHAR elements of value NULL and
zero-length string? If so, can I assume that NULL is the better performer,
and thus re-code my INSERTS to replace empty strings with NULLs (or vice
versa)?

Now that I've had to re-write all my data to substring it to fit in the
VARCHARs, one more "thing" won't matter much.

Many Thanks,

Doug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Efficiency question: VARCHAR with empty string vs NULL
Date: 2002-05-24 18:54:14
Message-ID: 13510.1022266454@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

Doug Fields <dfields-pg-general(at)pexicom(dot)com> writes:
> Is there a performance difference with VARCHAR elements of value NULL and
> zero-length string?

These are not semantically equivalent (if you think they are, you've
been using Oracle too long). You will almost certainly regret it
if you try recoding your app to make them equivalent. An example:

regression=# select 'foo'::varchar || ''::varchar;
?column?
----------
foo
(1 row)

regression=# select 'foo'::varchar || null::varchar;
?column?
----------

(1 row)

But to answer your question, NULLs might save a couple bytes if there
are multiple NULLs per row. I think that replacing a single empty
varchar with a NULL would net out to no change (you lose the 4-byte
varchar length word, but have to store a null-values bitmap instead),
depending on alignment issues and how many columns there are in the
table.

regards, tom lane


From: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Efficiency question: VARCHAR with empty string
Date: 2002-05-24 19:10:45
Message-ID: 5.1.0.14.2.20020524150418.01ecdeb8@mail.pexicom.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-sql

At 02:54 PM 5/24/2002, you wrote:
>Doug Fields <dfields-pg-general(at)pexicom(dot)com> writes:
> > Is there a performance difference with VARCHAR elements of value NULL and
> > zero-length string?
>
>These are not semantically equivalent (if you think they are, you've
>been using Oracle too long). You will almost certainly regret it
>if you try recoding your app to make them equivalent. An example:

Sorry, I haven't used Oracle in quite some time so I don't have any bad
Oracle-isms. :)

Thanks, Tom. Yes, I am aware of this already and my application largely
doesn't care (i.e. in the places it does it asks = '' as well as OR IS
NULL, but in most places it doesn't bother because it doesn't matter).

I'm merely trying to wring the last ounce of performance from the database,
especially given that most of these columns are rarely searched and
certainly not indexed.

>But to answer your question, NULLs might save a couple bytes if there
>are multiple NULLs per row. I think that replacing a single empty
>varchar with a NULL would net out to no change (you lose the 4-byte
>varchar length word, but have to store a null-values bitmap instead),
>depending on alignment issues and how many columns there are in the
>table.

So, in essence, having a zero-length VARCHAR requires no additional page
loads (from the VARCHAR heap, if such a thing is used) and does not
increase the record size over having NULLs? My intent is to make this query
as fast as possible and require as few disk hits: SELECT * FROM table with
this many VARCHAR'd table.

Thanks,

Doug