Re: requested shared memory size overflows size_t

Lists: pgsql-performance
From: Bob Lunney <bob_lunney(at)yahoo(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-11 06:25:47
Message-ID: 584872.74177.qm@web39702.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Tom,

First off, I wouldn't use a VM if I could help it, however, sometimes you have to make compromises. With a 16 Gb machine running 64-bit Ubuntu and only PostgreSQL, I'd start by allocating 4 Gb to shared_buffers. That should leave more than enough room for the OS and file system cache. Then I'd begin testing by measuring response times of representative queries with significant amounts of data.

Also, what is the disk setup for the box? Filesystem? Can WAL files have their own disk? Is the workload OLTP or OLAP, or a mixture of both? There is more that goes into tuning a PG server for good performance than simply installing the software, setting a couple of GUCs and running it.

Bob

--- On Thu, 6/10/10, Tom Wilcox <hungrytom(at)gmail(dot)com> wrote:

> From: Tom Wilcox <hungrytom(at)gmail(dot)com>
> Subject: Re: [PERFORM] requested shared memory size overflows size_t
> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com>
> Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
> Date: Thursday, June 10, 2010, 10:45 AM
> Thanks guys. I am currently
> installing Pg64 onto a Ubuntu Server 64-bit installation
> running as a VM in VirtualBox with 16GB of RAM accessible.
> If what you say is true then what do you suggest I do to
> configure my new setup to best use the available 16GB (96GB
> and native install eventually if the test goes well) of RAM
> on Linux.
>
> I was considering starting by using Enterprise DBs tuner to
> see if that optimises things to a better quality..
>
> Tom
>
> On 10/06/2010 15:41, Bob Lunney wrote:
> > True, plus there are the other issues of increased
> checkpoint times and I/O, bgwriter tuning, etc.  It may
> be better to let the OS cache the files and size
> shared_buffers to a smaller value.
> >
> > Bob Lunney
> >
> > --- On Wed, 6/9/10, Robert Haas<robertmhaas(at)gmail(dot)com> 
> wrote:
> >
> >   
> >> From: Robert Haas<robertmhaas(at)gmail(dot)com>
> >> Subject: Re: [PERFORM] requested shared memory
> size overflows size_t
> >> To: "Bob Lunney"<bob_lunney(at)yahoo(dot)com>
> >> Cc: pgsql-performance(at)postgresql(dot)org,
> "Tom Wilcox"<hungrytom(at)googlemail(dot)com>
> >> Date: Wednesday, June 9, 2010, 9:49 PM
> >> On Wed, Jun 2, 2010 at 9:26 PM, Bob
> >> Lunney<bob_lunney(at)yahoo(dot)com>
> >> wrote:
> >>     
> >>> Your other option, of course, is a nice 64-bit
> linux
> >>>       
> >> variant, which won't have this problem at all.
> >>
> >> Although, even there, I think I've heard that
> after 10GB
> >> you don't get
> >> much benefit from raising it further.  Not
> sure if
> >> that's accurate or
> >> not...
> >>
> >> -- Robert Haas
> >> EnterpriseDB: http://www.enterprisedb.com
> >> The Enterprise Postgres Company
> >>
> >>     
> >
> >
> >   
>
>


From: Tom Wilcox <hungrytom(at)gmail(dot)com>
To: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-14 18:53:01
Message-ID: 4C167A8D.1060002@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


Hi Bob,

Thanks a lot. Here's my best attempt to answer your questions:

The VM is setup with a virtual disk image dynamically expanding to fill
an allocation of 300GB on a fast, local hard drive (avg read speed =
778MB/s ).
WAL files can have their own disk, but how significantly would this
affect our performance?
The filesystem of the host OS is NTFS (Windows Server 2008 OS 64), the
guest filesystem is Ext2 (Ubuntu 64).
The workload is OLAP (lots of large, complex queries on large tables run
in sequence).

In addition, I have reconfigured my server to use more memory. Here's a
detailed blow by blow of how I reconfigured my system to get better
performance (for anyone who might be interested)...

In order to increase the shared memory on Ubuntu I edited the System V
IPC values using sysctl:

sysctl -w kernel.shmmax=16106127360*
*sysctl -w kernel.shmall=2097152

I had some fun with permissions as I somehow managed to change the
owner of the postgresql.conf to root where it needed to be postgres,
resulting in failure to start the service.. (Fixed with chown
postgres:postgres ./data/postgresql.conf and chmod u=rwx ./data -R).

I changed the following params in my configuration file..

default_statistics_target=10000
maintenance_work_mem=512MB
work_mem=512MB
shared_buffers=512MB
wal_buffers=128MB

With this config, the following command took 6,400,000ms:

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;

With plan:
"Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278 width=232)
(actual time=0.379..464270.682 rows=27777961 loops=1)"
"Total runtime: 6398238.890 ms"

With these changes to the previous config, the same command took
5,610,000ms:

maintenance_work_mem=4GB
work_mem=4GB
shared_buffers=4GB
effective_cache_size=4GB
wal_buffers=1GB

Resulting plan:

"Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572 width=232)
(actual time=0.094..452793.430 rows=27777961 loops=1)"
"Total runtime: 5614140.786 ms"

Then I performed these changes to the postgresql.conf file:

max_connections=3
effective_cache_size=15GB
maintenance_work_mem=5GB
shared_buffers=7000MB
work_mem=5GB

And ran this query (for a quick look - can't afford the time for the
previous tests..):

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id
< 100000;

Result:

"Index Scan using match_data_pkey1 on match_data (cost=0.00..15662.17
rows=4490 width=232) (actual time=27.055..1908.027 rows=99999 loops=1)"
" Index Cond: (match_data_id < 100000)"
"Total runtime: 25909.372 ms"

I then ran EntrepriseDB's Tuner on my postgres install (for a dedicated
machine) and got the following settings and results:

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id
< 100000;

"Index Scan using match_data_pkey1 on match_data (cost=0.00..13734.54
rows=4495 width=232) (actual time=0.348..2928.844 rows=99999 loops=1)"
" Index Cond: (match_data_id < 100000)"
"Total runtime: 1066580.293 ms"

For now, I will go with the config using 7000MB shared_buffers. Any
suggestions on how I can further optimise this config for a single
session, 64-bit install utilising ALL of 96GB RAM. I will spend the next
week making the case for a native install of Linux, but first we need to
be 100% sure that is the only way to get the most out of Postgres on
this machine.

Thanks very much. I now feel I am at a position where I can really
explore and find the optimal configuration for my system, but would
still appreciate any suggestions.

Cheers,
Tom

On 11/06/2010 07:25, Bob Lunney wrote:
> Tom,
>
> First off, I wouldn't use a VM if I could help it, however, sometimes you have to make compromises. With a 16 Gb machine running 64-bit Ubuntu and only PostgreSQL, I'd start by allocating 4 Gb to shared_buffers. That should leave more than enough room for the OS and file system cache. Then I'd begin testing by measuring response times of representative queries with significant amounts of data.
>
> Also, what is the disk setup for the box? Filesystem? Can WAL files have their own disk? Is the workload OLTP or OLAP, or a mixture of both? There is more that goes into tuning a PG server for good performance than simply installing the software, setting a couple of GUCs and running it.
>
> Bob
>
> --- On Thu, 6/10/10, Tom Wilcox <hungrytom(at)gmail(dot)com> wrote:
>
>
>> From: Tom Wilcox <hungrytom(at)gmail(dot)com>
>> Subject: Re: [PERFORM] requested shared memory size overflows size_t
>> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com>
>> Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
>> Date: Thursday, June 10, 2010, 10:45 AM
>> Thanks guys. I am currently
>> installing Pg64 onto a Ubuntu Server 64-bit installation
>> running as a VM in VirtualBox with 16GB of RAM accessible.
>> If what you say is true then what do you suggest I do to
>> configure my new setup to best use the available 16GB (96GB
>> and native install eventually if the test goes well) of RAM
>> on Linux.
>>
>> I was considering starting by using Enterprise DBs tuner to
>> see if that optimises things to a better quality..
>>
>> Tom
>>
>> On 10/06/2010 15:41, Bob Lunney wrote:
>>
>>> True, plus there are the other issues of increased
>>>
>> checkpoint times and I/O, bgwriter tuning, etc. It may
>> be better to let the OS cache the files and size
>> shared_buffers to a smaller value.
>>
>>> Bob Lunney
>>>
>>> --- On Wed, 6/9/10, Robert Haas<robertmhaas(at)gmail(dot)com>
>>>
>> wrote:
>>
>>>
>>>
>>>> From: Robert Haas<robertmhaas(at)gmail(dot)com>
>>>> Subject: Re: [PERFORM] requested shared memory
>>>>
>> size overflows size_t
>>
>>>> To: "Bob Lunney"<bob_lunney(at)yahoo(dot)com>
>>>> Cc: pgsql-performance(at)postgresql(dot)org,
>>>>
>> "Tom Wilcox"<hungrytom(at)googlemail(dot)com>
>>
>>>> Date: Wednesday, June 9, 2010, 9:49 PM
>>>> On Wed, Jun 2, 2010 at 9:26 PM, Bob
>>>> Lunney<bob_lunney(at)yahoo(dot)com>
>>>> wrote:
>>>>
>>>>
>>>>> Your other option, of course, is a nice 64-bit
>>>>>
>> linux
>>
>>>>>
>>>>>
>>>> variant, which won't have this problem at all.
>>>>
>>>> Although, even there, I think I've heard that
>>>>
>> after 10GB
>>
>>>> you don't get
>>>> much benefit from raising it further. Not
>>>>
>> sure if
>>
>>>> that's accurate or
>>>> not...
>>>>
>>>> -- Robert Haas
>>>> EnterpriseDB: http://www.enterprisedb.com
>>>> The Enterprise Postgres Company
>>>>
>>>>
>>>>
>>>
>>>
>>
>
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-14 23:08:14
Message-ID: AANLkTinl1XU1IYUv2lmBXeKPKk37TKutBSNbcGvPmjJI@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Mon, Jun 14, 2010 at 2:53 PM, Tom Wilcox <hungrytom(at)gmail(dot)com> wrote:
> maintenance_work_mem=4GB
> work_mem=4GB
> shared_buffers=4GB
> effective_cache_size=4GB
> wal_buffers=1GB

It's pretty easy to drive your system into swap with such a large
value for work_mem - you'd better monitor that carefully.

The default value for wal_buffers is 64kB. I can't imagine why you'd
need to increase that by four orders of magnitude. I'm not sure
whether it will cause you a problem or not, but you're allocating
quite a lot of shared memory that way that you might not really need.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Tom Wilcox <hungrytom(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-14 23:21:35
Message-ID: 4C16B97F.4080201@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Thanks. I will try with a more sensible value of wal_buffers.. I was
hoping to keep more in memory and therefore reduce the frequency of disk
IOs..

Any suggestions for good monitoring software for linux?

On 15/06/2010 00:08, Robert Haas wrote:
> On Mon, Jun 14, 2010 at 2:53 PM, Tom Wilcox<hungrytom(at)gmail(dot)com> wrote:
>
>> maintenance_work_mem=4GB
>> work_mem=4GB
>> shared_buffers=4GB
>> effective_cache_size=4GB
>> wal_buffers=1GB
>>
> It's pretty easy to drive your system into swap with such a large
> value for work_mem - you'd better monitor that carefully.
>
> The default value for wal_buffers is 64kB. I can't imagine why you'd
> need to increase that by four orders of magnitude. I'm not sure
> whether it will cause you a problem or not, but you're allocating
> quite a lot of shared memory that way that you might not really need.
>
>


From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 00:26:51
Message-ID: AANLkTikNEA6bxWY1GZRhpuiK5hLzim0wgmpeGsuD06dj@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Tom

I always prefer to choose apps based on business needs, then the OS based on
the needs for the app.

Cynically, I often feel that the best answer to "we have a policy that says
we're only allowed to use operating system x" is to ignore the policy ....
the kind of people ignorant enough to be that blinkered are usually not
tech-savvy enough to notice when it gets flouted :-)

More seriously, is the policy "Windows only on the metal" or could you run
e.g. VMware ESX server? I/O is the area that takes the biggest hit in
virtualization, and ESX server has far less overhead loss than either
Hyper-V (which I presume you are using) or VMWare Workstation for NT
(kernels).

If it's a Windows-only policy, then perhaps you can run those traps in
reverse, and switch to a Windows database, i.e. Microsoft SQL Server.

Cheers
Dave

On Mon, Jun 14, 2010 at 1:53 PM, Tom Wilcox <hungrytom(at)gmail(dot)com> wrote:

>
> Hi Bob,
>
> Thanks a lot. Here's my best attempt to answer your questions:
>
> The VM is setup with a virtual disk image dynamically expanding to fill an
> allocation of 300GB on a fast, local hard drive (avg read speed = 778MB/s ).
> WAL files can have their own disk, but how significantly would this affect
> our performance?
> The filesystem of the host OS is NTFS (Windows Server 2008 OS 64), the
> guest filesystem is Ext2 (Ubuntu 64).
> The workload is OLAP (lots of large, complex queries on large tables run in
> sequence).
>
> In addition, I have reconfigured my server to use more memory. Here's a
> detailed blow by blow of how I reconfigured my system to get better
> performance (for anyone who might be interested)...
>
> In order to increase the shared memory on Ubuntu I edited the System V IPC
> values using sysctl:
>
> sysctl -w kernel.shmmax=16106127360*
> *sysctl -w kernel.shmall=2097152
>
> I had some fun with permissions as I somehow managed to change the owner
> of the postgresql.conf to root where it needed to be postgres, resulting in
> failure to start the service.. (Fixed with chown postgres:postgres
> ./data/postgresql.conf and chmod u=rwx ./data -R).
>
> I changed the following params in my configuration file..
>
> default_statistics_target=10000
> maintenance_work_mem=512MB
> work_mem=512MB
> shared_buffers=512MB
> wal_buffers=128MB
>
> With this config, the following command took 6,400,000ms:
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;
>
> With plan:
> "Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278 width=232)
> (actual time=0.379..464270.682 rows=27777961 loops=1)"
> "Total runtime: 6398238.890 ms"
>
> With these changes to the previous config, the same command took
> 5,610,000ms:
>
> maintenance_work_mem=4GB
> work_mem=4GB
> shared_buffers=4GB
> effective_cache_size=4GB
> wal_buffers=1GB
>
> Resulting plan:
>
> "Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572 width=232)
> (actual time=0.094..452793.430 rows=27777961 loops=1)"
> "Total runtime: 5614140.786 ms"
>
> Then I performed these changes to the postgresql.conf file:
>
> max_connections=3
> effective_cache_size=15GB
> maintenance_work_mem=5GB
> shared_buffers=7000MB
> work_mem=5GB
>
> And ran this query (for a quick look - can't afford the time for the
> previous tests..):
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id <
> 100000;
>
> Result:
>
> "Index Scan using match_data_pkey1 on match_data (cost=0.00..15662.17
> rows=4490 width=232) (actual time=27.055..1908.027 rows=99999 loops=1)"
> " Index Cond: (match_data_id < 100000)"
> "Total runtime: 25909.372 ms"
>
> I then ran EntrepriseDB's Tuner on my postgres install (for a dedicated
> machine) and got the following settings and results:
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id <
> 100000;
>
> "Index Scan using match_data_pkey1 on match_data (cost=0.00..13734.54
> rows=4495 width=232) (actual time=0.348..2928.844 rows=99999 loops=1)"
> " Index Cond: (match_data_id < 100000)"
> "Total runtime: 1066580.293 ms"
>
> For now, I will go with the config using 7000MB shared_buffers. Any
> suggestions on how I can further optimise this config for a single session,
> 64-bit install utilising ALL of 96GB RAM. I will spend the next week making
> the case for a native install of Linux, but first we need to be 100% sure
> that is the only way to get the most out of Postgres on this machine.
>
> Thanks very much. I now feel I am at a position where I can really explore
> and find the optimal configuration for my system, but would still appreciate
> any suggestions.
>
> Cheers,
> Tom
>
>
> On 11/06/2010 07:25, Bob Lunney wrote:
>
>> Tom,
>>
>> First off, I wouldn't use a VM if I could help it, however, sometimes you
>> have to make compromises. With a 16 Gb machine running 64-bit Ubuntu and
>> only PostgreSQL, I'd start by allocating 4 Gb to shared_buffers. That
>> should leave more than enough room for the OS and file system cache. Then
>> I'd begin testing by measuring response times of representative queries with
>> significant amounts of data.
>>
>> Also, what is the disk setup for the box? Filesystem? Can WAL files have
>> their own disk? Is the workload OLTP or OLAP, or a mixture of both? There
>> is more that goes into tuning a PG server for good performance than simply
>> installing the software, setting a couple of GUCs and running it.
>>
>> Bob
>>
>> --- On Thu, 6/10/10, Tom Wilcox <hungrytom(at)gmail(dot)com> wrote:
>>
>>
>>
>>> From: Tom Wilcox <hungrytom(at)gmail(dot)com>
>>> Subject: Re: [PERFORM] requested shared memory size overflows size_t
>>> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com>
>>> Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>,
>>> pgsql-performance(at)postgresql(dot)org
>>> Date: Thursday, June 10, 2010, 10:45 AM
>>> Thanks guys. I am currently
>>> installing Pg64 onto a Ubuntu Server 64-bit installation
>>> running as a VM in VirtualBox with 16GB of RAM accessible.
>>> If what you say is true then what do you suggest I do to
>>> configure my new setup to best use the available 16GB (96GB
>>> and native install eventually if the test goes well) of RAM
>>> on Linux.
>>>
>>> I was considering starting by using Enterprise DBs tuner to
>>> see if that optimises things to a better quality..
>>>
>>> Tom
>>>
>>> On 10/06/2010 15:41, Bob Lunney wrote:
>>>
>>>
>>>> True, plus there are the other issues of increased
>>>>
>>>>
>>> checkpoint times and I/O, bgwriter tuning, etc. It may
>>> be better to let the OS cache the files and size
>>> shared_buffers to a smaller value.
>>>
>>>
>>>> Bob Lunney
>>>>
>>>> --- On Wed, 6/9/10, Robert Haas<robertmhaas(at)gmail(dot)com>
>>>>
>>> wrote:
>>>
>>>
>>>>
>>>>
>>>>> From: Robert Haas<robertmhaas(at)gmail(dot)com>
>>>>> Subject: Re: [PERFORM] requested shared memory
>>>>>
>>>>>
>>>> size overflows size_t
>>>
>>>
>>>> To: "Bob Lunney"<bob_lunney(at)yahoo(dot)com>
>>>>> Cc: pgsql-performance(at)postgresql(dot)org,
>>>>>
>>>>>
>>>> "Tom Wilcox"<hungrytom(at)googlemail(dot)com>
>>>
>>>
>>>> Date: Wednesday, June 9, 2010, 9:49 PM
>>>>> On Wed, Jun 2, 2010 at 9:26 PM, Bob
>>>>> Lunney<bob_lunney(at)yahoo(dot)com>
>>>>> wrote:
>>>>>
>>>>>
>>>>>> Your other option, of course, is a nice 64-bit
>>>>>>
>>>>>>
>>>>> linux
>>>
>>>
>>>>
>>>>>>
>>>>> variant, which won't have this problem at all.
>>>>>
>>>>> Although, even there, I think I've heard that
>>>>>
>>>>>
>>>> after 10GB
>>>
>>>
>>>> you don't get
>>>>> much benefit from raising it further. Not
>>>>>
>>>>>
>>>> sure if
>>>
>>>
>>>> that's accurate or
>>>>> not...
>>>>>
>>>>> -- Robert Haas
>>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>> The Enterprise Postgres Company
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


From: Tom Wilcox <hungrytom(at)gmail(dot)com>
To: Dave Crooke <dcrooke(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 00:41:01
Message-ID: 4C16CC1D.5070108@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi Dave,

I am definitely able to switch OS if it will get the most out of
Postgres. So it is definitely a case of choosing the OS on the needs if
the app providing it is well justified.

Currently, we are running Ubuntu Server 64-bit in a VirtualBox VM.

Cheers,
Tom

Dave Crooke wrote:
> Tom
>
> I always prefer to choose apps based on business needs, then the OS
> based on the needs for the app.
>
> Cynically, I often feel that the best answer to "we have a policy that
> says we're only allowed to use operating system x" is to ignore the
> policy .... the kind of people ignorant enough to be that blinkered
> are usually not tech-savvy enough to notice when it gets flouted :-)
>
> More seriously, is the policy "Windows only on the metal" or could you
> run e.g. VMware ESX server? I/O is the area that takes the biggest hit
> in virtualization, and ESX server has far less overhead loss than
> either Hyper-V (which I presume you are using) or VMWare Workstation
> for NT (kernels).
>
> If it's a Windows-only policy, then perhaps you can run those traps in
> reverse, and switch to a Windows database, i.e. Microsoft SQL Server.
>
> Cheers
> Dave
>
> On Mon, Jun 14, 2010 at 1:53 PM, Tom Wilcox <hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>> wrote:
>
>
> Hi Bob,
>
> Thanks a lot. Here's my best attempt to answer your questions:
>
> The VM is setup with a virtual disk image dynamically expanding to
> fill an allocation of 300GB on a fast, local hard drive (avg read
> speed = 778MB/s ).
> WAL files can have their own disk, but how significantly would
> this affect our performance?
> The filesystem of the host OS is NTFS (Windows Server 2008 OS 64),
> the guest filesystem is Ext2 (Ubuntu 64).
> The workload is OLAP (lots of large, complex queries on large
> tables run in sequence).
>
> In addition, I have reconfigured my server to use more memory.
> Here's a detailed blow by blow of how I reconfigured my system to
> get better performance (for anyone who might be interested)...
>
> In order to increase the shared memory on Ubuntu I edited the
> System V IPC values using sysctl:
>
> sysctl -w kernel.shmmax=16106127360*
> *sysctl -w kernel.shmall=2097152
>
> I had some fun with permissions as I somehow managed to change the
> owner of the postgresql.conf to root where it needed to be
> postgres, resulting in failure to start the service.. (Fixed with
> chown postgres:postgres ./data/postgresql.conf and chmod u=rwx
> ./data -R).
>
> I changed the following params in my configuration file..
>
> default_statistics_target=10000
> maintenance_work_mem=512MB
> work_mem=512MB
> shared_buffers=512MB
> wal_buffers=128MB
>
> With this config, the following command took 6,400,000ms:
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;
>
> With plan:
> "Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278
> width=232) (actual time=0.379..464270.682 rows=27777961 loops=1)"
> "Total runtime: 6398238.890 ms"
>
> With these changes to the previous config, the same command took
> 5,610,000ms:
>
> maintenance_work_mem=4GB
> work_mem=4GB
> shared_buffers=4GB
> effective_cache_size=4GB
> wal_buffers=1GB
>
> Resulting plan:
>
> "Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572
> width=232) (actual time=0.094..452793.430 rows=27777961 loops=1)"
> "Total runtime: 5614140.786 ms"
>
> Then I performed these changes to the postgresql.conf file:
>
> max_connections=3
> effective_cache_size=15GB
> maintenance_work_mem=5GB
> shared_buffers=7000MB
> work_mem=5GB
>
> And ran this query (for a quick look - can't afford the time for
> the previous tests..):
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
> match_data_id < 100000;
>
> Result:
>
> "Index Scan using match_data_pkey1 on match_data
> (cost=0.00..15662.17 rows=4490 width=232) (actual
> time=27.055..1908.027 rows=99999 loops=1)"
> " Index Cond: (match_data_id < 100000)"
> "Total runtime: 25909.372 ms"
>
> I then ran EntrepriseDB's Tuner on my postgres install (for a
> dedicated machine) and got the following settings and results:
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
> match_data_id < 100000;
>
> "Index Scan using match_data_pkey1 on match_data
> (cost=0.00..13734.54 rows=4495 width=232) (actual
> time=0.348..2928.844 rows=99999 loops=1)"
> " Index Cond: (match_data_id < 100000)"
> "Total runtime: 1066580.293 ms"
>
> For now, I will go with the config using 7000MB shared_buffers.
> Any suggestions on how I can further optimise this config for a
> single session, 64-bit install utilising ALL of 96GB RAM. I will
> spend the next week making the case for a native install of Linux,
> but first we need to be 100% sure that is the only way to get the
> most out of Postgres on this machine.
>
> Thanks very much. I now feel I am at a position where I can really
> explore and find the optimal configuration for my system, but
> would still appreciate any suggestions.
>
> Cheers,
> Tom
>
>
> On 11/06/2010 07:25, Bob Lunney wrote:
>
> Tom,
>
> First off, I wouldn't use a VM if I could help it, however,
> sometimes you have to make compromises. With a 16 Gb machine
> running 64-bit Ubuntu and only PostgreSQL, I'd start by
> allocating 4 Gb to shared_buffers. That should leave more
> than enough room for the OS and file system cache. Then I'd
> begin testing by measuring response times of representative
> queries with significant amounts of data.
>
> Also, what is the disk setup for the box? Filesystem? Can
> WAL files have their own disk? Is the workload OLTP or OLAP,
> or a mixture of both? There is more that goes into tuning a
> PG server for good performance than simply installing the
> software, setting a couple of GUCs and running it.
>
> Bob
>
> --- On Thu, 6/10/10, Tom Wilcox <hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>> wrote:
>
>
>
> From: Tom Wilcox <hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>>
> Subject: Re: [PERFORM] requested shared memory size
> overflows size_t
> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>>
> Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>,
> pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>
> Date: Thursday, June 10, 2010, 10:45 AM
> Thanks guys. I am currently
> installing Pg64 onto a Ubuntu Server 64-bit installation
> running as a VM in VirtualBox with 16GB of RAM accessible.
> If what you say is true then what do you suggest I do to
> configure my new setup to best use the available 16GB (96GB
> and native install eventually if the test goes well) of RAM
> on Linux.
>
> I was considering starting by using Enterprise DBs tuner to
> see if that optimises things to a better quality..
>
> Tom
>
> On 10/06/2010 15:41, Bob Lunney wrote:
>
>
> True, plus there are the other issues of increased
>
>
> checkpoint times and I/O, bgwriter tuning, etc. It may
> be better to let the OS cache the files and size
> shared_buffers to a smaller value.
>
>
> Bob Lunney
>
> --- On Wed, 6/9/10, Robert Haas<robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>
>
> wrote:
>
>
>
>
> From: Robert Haas<robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>
> Subject: Re: [PERFORM] requested shared memory
>
>
> size overflows size_t
>
>
> To: "Bob Lunney"<bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>>
> Cc: pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>,
>
>
> "Tom Wilcox"<hungrytom(at)googlemail(dot)com
> <mailto:hungrytom(at)googlemail(dot)com>>
>
>
> Date: Wednesday, June 9, 2010, 9:49 PM
> On Wed, Jun 2, 2010 at 9:26 PM, Bob
> Lunney<bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>>
> wrote:
>
>
> Your other option, of course, is a nice 64-bit
>
>
> linux
>
>
>
>
> variant, which won't have this problem at all.
>
> Although, even there, I think I've heard that
>
>
> after 10GB
>
>
> you don't get
> much benefit from raising it further. Not
>
>
> sure if
>
>
> that's accurate or
> not...
>
> -- Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
>
>
>
>
>
>
>
>
>
>
> --
> Sent via pgsql-performance mailing list
> (pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
>


From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 00:56:16
Message-ID: AANLkTilht6MjUhOm0Dv3yQ2lrYoIw59oVb0PeBL7iAbv@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

With that clarification, I stand squarely behind what others are saying ...
if performance is important to you, then you should always run databases on
dedicated hardware, with the OS running on bare metal with no
virtualization. VirtualBox has even more I/O losses than Hyper-V. It's
simply not designed for this, and you're giving away a ton of performance.

If nothing else, my confusion should indicate to you how unconventional and
poorly performing this virtualizaed setup is ... I simply assumed that the
only plausible reason you were piggybacking on virtualization on Windows was
a mandated lack of alternative options.

Reload the hardware with an OS which PGSQL supports well, and get rid of the
VirtualBox and Windows layers. If you have hardware that only Windows
supports well, then you may need to make some hardware changes.

I haven't said anything about which Unix-like OS .... you may find people
arguing passionately for BSD vs. Linux .... however, the difference between
these is negligible compared to "virtualized vs. real system", and at this
point considerations like support base, ease of use and familiarity also
come into play.

IMHO Ubuntu would be a fine choice, and PGSQL is a "first-class" supported
package from the distributor ... however, at customer sites, I've typically
used Red Hat AS because they have a corporate preference for it, even though
it is less convenient to install and manage.

On Mon, Jun 14, 2010 at 7:41 PM, Tom Wilcox <hungrytom(at)gmail(dot)com> wrote:

> Hi Dave,
>
> I am definitely able to switch OS if it will get the most out of Postgres.
> So it is definitely a case of choosing the OS on the needs if the app
> providing it is well justified.
>
> Currently, we are running Ubuntu Server 64-bit in a VirtualBox VM.
>
> Cheers,
> Tom
>
>
> Dave Crooke wrote:
>
>> Tom
>>
>> I always prefer to choose apps based on business needs, then the OS based
>> on the needs for the app.
>>
>> Cynically, I often feel that the best answer to "we have a policy that
>> says we're only allowed to use operating system x" is to ignore the policy
>> .... the kind of people ignorant enough to be that blinkered are usually not
>> tech-savvy enough to notice when it gets flouted :-)
>>
>> More seriously, is the policy "Windows only on the metal" or could you run
>> e.g. VMware ESX server? I/O is the area that takes the biggest hit in
>> virtualization, and ESX server has far less overhead loss than either
>> Hyper-V (which I presume you are using) or VMWare Workstation for NT
>> (kernels).
>>
>> If it's a Windows-only policy, then perhaps you can run those traps in
>> reverse, and switch to a Windows database, i.e. Microsoft SQL Server.
>>
>> Cheers
>> Dave
>>
>> On Mon, Jun 14, 2010 at 1:53 PM, Tom Wilcox <hungrytom(at)gmail(dot)com <mailto:
>> hungrytom(at)gmail(dot)com>> wrote:
>>
>>
>> Hi Bob,
>>
>> Thanks a lot. Here's my best attempt to answer your questions:
>>
>> The VM is setup with a virtual disk image dynamically expanding to
>> fill an allocation of 300GB on a fast, local hard drive (avg read
>> speed = 778MB/s ).
>> WAL files can have their own disk, but how significantly would
>> this affect our performance?
>> The filesystem of the host OS is NTFS (Windows Server 2008 OS 64),
>> the guest filesystem is Ext2 (Ubuntu 64).
>> The workload is OLAP (lots of large, complex queries on large
>> tables run in sequence).
>>
>> In addition, I have reconfigured my server to use more memory.
>> Here's a detailed blow by blow of how I reconfigured my system to
>> get better performance (for anyone who might be interested)...
>>
>> In order to increase the shared memory on Ubuntu I edited the
>> System V IPC values using sysctl:
>>
>> sysctl -w kernel.shmmax=16106127360*
>> *sysctl -w kernel.shmall=2097152
>>
>> I had some fun with permissions as I somehow managed to change the
>> owner of the postgresql.conf to root where it needed to be
>> postgres, resulting in failure to start the service.. (Fixed with
>> chown postgres:postgres ./data/postgresql.conf and chmod u=rwx
>> ./data -R).
>>
>> I changed the following params in my configuration file..
>>
>> default_statistics_target=10000
>> maintenance_work_mem=512MB
>> work_mem=512MB
>> shared_buffers=512MB
>> wal_buffers=128MB
>>
>> With this config, the following command took 6,400,000ms:
>>
>> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;
>>
>> With plan:
>> "Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278
>> width=232) (actual time=0.379..464270.682 rows=27777961 loops=1)"
>> "Total runtime: 6398238.890 ms"
>>
>> With these changes to the previous config, the same command took
>> 5,610,000ms:
>>
>> maintenance_work_mem=4GB
>> work_mem=4GB
>> shared_buffers=4GB
>> effective_cache_size=4GB
>> wal_buffers=1GB
>>
>> Resulting plan:
>>
>> "Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572
>> width=232) (actual time=0.094..452793.430 rows=27777961 loops=1)"
>> "Total runtime: 5614140.786 ms"
>>
>> Then I performed these changes to the postgresql.conf file:
>>
>> max_connections=3
>> effective_cache_size=15GB
>> maintenance_work_mem=5GB
>> shared_buffers=7000MB
>> work_mem=5GB
>>
>> And ran this query (for a quick look - can't afford the time for
>> the previous tests..):
>>
>> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
>> match_data_id < 100000;
>>
>> Result:
>>
>> "Index Scan using match_data_pkey1 on match_data
>> (cost=0.00..15662.17 rows=4490 width=232) (actual
>> time=27.055..1908.027 rows=99999 loops=1)"
>> " Index Cond: (match_data_id < 100000)"
>> "Total runtime: 25909.372 ms"
>>
>> I then ran EntrepriseDB's Tuner on my postgres install (for a
>> dedicated machine) and got the following settings and results:
>>
>> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
>> match_data_id < 100000;
>>
>> "Index Scan using match_data_pkey1 on match_data
>> (cost=0.00..13734.54 rows=4495 width=232) (actual
>> time=0.348..2928.844 rows=99999 loops=1)"
>> " Index Cond: (match_data_id < 100000)"
>> "Total runtime: 1066580.293 ms"
>>
>> For now, I will go with the config using 7000MB shared_buffers.
>> Any suggestions on how I can further optimise this config for a
>> single session, 64-bit install utilising ALL of 96GB RAM. I will
>> spend the next week making the case for a native install of Linux,
>> but first we need to be 100% sure that is the only way to get the
>> most out of Postgres on this machine.
>>
>> Thanks very much. I now feel I am at a position where I can really
>> explore and find the optimal configuration for my system, but
>> would still appreciate any suggestions.
>>
>> Cheers,
>> Tom
>>
>>
>> On 11/06/2010 07:25, Bob Lunney wrote:
>>
>> Tom,
>>
>> First off, I wouldn't use a VM if I could help it, however,
>> sometimes you have to make compromises. With a 16 Gb machine
>> running 64-bit Ubuntu and only PostgreSQL, I'd start by
>> allocating 4 Gb to shared_buffers. That should leave more
>> than enough room for the OS and file system cache. Then I'd
>> begin testing by measuring response times of representative
>> queries with significant amounts of data.
>>
>> Also, what is the disk setup for the box? Filesystem? Can
>> WAL files have their own disk? Is the workload OLTP or OLAP,
>> or a mixture of both? There is more that goes into tuning a
>> PG server for good performance than simply installing the
>> software, setting a couple of GUCs and running it.
>>
>> Bob
>>
>> --- On Thu, 6/10/10, Tom Wilcox <hungrytom(at)gmail(dot)com
>> <mailto:hungrytom(at)gmail(dot)com>> wrote:
>>
>>
>> From: Tom Wilcox <hungrytom(at)gmail(dot)com
>> <mailto:hungrytom(at)gmail(dot)com>>
>>
>> Subject: Re: [PERFORM] requested shared memory size
>> overflows size_t
>> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com
>> <mailto:bob_lunney(at)yahoo(dot)com>>
>>
>> Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com
>> <mailto:robertmhaas(at)gmail(dot)com>>,
>> pgsql-performance(at)postgresql(dot)org
>> <mailto:pgsql-performance(at)postgresql(dot)org>
>>
>> Date: Thursday, June 10, 2010, 10:45 AM
>> Thanks guys. I am currently
>> installing Pg64 onto a Ubuntu Server 64-bit installation
>> running as a VM in VirtualBox with 16GB of RAM accessible.
>> If what you say is true then what do you suggest I do to
>> configure my new setup to best use the available 16GB (96GB
>> and native install eventually if the test goes well) of RAM
>> on Linux.
>>
>> I was considering starting by using Enterprise DBs tuner to
>> see if that optimises things to a better quality..
>>
>> Tom
>>
>> On 10/06/2010 15:41, Bob Lunney wrote:
>>
>> True, plus there are the other issues of increased
>>
>> checkpoint times and I/O, bgwriter tuning, etc. It may
>> be better to let the OS cache the files and size
>> shared_buffers to a smaller value.
>>
>> Bob Lunney
>>
>> --- On Wed, 6/9/10, Robert Haas<robertmhaas(at)gmail(dot)com
>> <mailto:robertmhaas(at)gmail(dot)com>>
>>
>> wrote:
>>
>>
>> From: Robert Haas<robertmhaas(at)gmail(dot)com
>> <mailto:robertmhaas(at)gmail(dot)com>>
>>
>> Subject: Re: [PERFORM] requested shared memory
>>
>> size overflows size_t
>>
>> To: "Bob Lunney"<bob_lunney(at)yahoo(dot)com
>> <mailto:bob_lunney(at)yahoo(dot)com>>
>>
>> Cc: pgsql-performance(at)postgresql(dot)org
>> <mailto:pgsql-performance(at)postgresql(dot)org>,
>>
>>
>> "Tom Wilcox"<hungrytom(at)googlemail(dot)com
>> <mailto:hungrytom(at)googlemail(dot)com>>
>>
>>
>> Date: Wednesday, June 9, 2010, 9:49 PM
>> On Wed, Jun 2, 2010 at 9:26 PM, Bob
>> Lunney<bob_lunney(at)yahoo(dot)com
>> <mailto:bob_lunney(at)yahoo(dot)com>>
>>
>> wrote:
>>
>> Your other option, of course, is a nice 64-bit
>>
>> linux
>>
>>
>> variant, which won't have this problem at all.
>>
>> Although, even there, I think I've heard that
>>
>> after 10GB
>>
>> you don't get
>> much benefit from raising it further. Not
>>
>> sure if
>>
>> that's accurate or
>> not...
>>
>> -- Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise Postgres Company
>>
>>
>>
>>
>>
>>
>>
>> -- Sent via pgsql-performance mailing list
>> (pgsql-performance(at)postgresql(dot)org
>> <mailto:pgsql-performance(at)postgresql(dot)org>)
>>
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>>
>>
>>
>


From: Tom Wilcox <hungrytom(at)gmail(dot)com>
To: Dave Crooke <dcrooke(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 01:12:31
Message-ID: 4C16D37F.8090705@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Thanks a lot Dave,

That's exactly the kind of answer I can use to justify the OS switch.
Motivation for the previous setup was based on the fact that we will use
the same machine for other projects that will use SQL Server and most of
our experience lies within the MS domain. However, these projects are
not a high priority currently and therefore I have been focusing on the
best solution for a Postgres-focused setup.

This does however mean that I will need to have the other projects
running in a VM on Linux. However, they are less demanding in terms of
resources.

Cheers,
Tom

Dave Crooke wrote:
> With that clarification, I stand squarely behind what others are
> saying ... if performance is important to you, then you should always
> run databases on dedicated hardware, with the OS running on bare metal
> with no virtualization. VirtualBox has even more I/O losses than
> Hyper-V. It's simply not designed for this, and you're giving away a
> ton of performance.
>
> If nothing else, my confusion should indicate to you how
> unconventional and poorly performing this virtualizaed setup is ... I
> simply assumed that the only plausible reason you were piggybacking on
> virtualization on Windows was a mandated lack of alternative options.
>
> Reload the hardware with an OS which PGSQL supports well, and get rid
> of the VirtualBox and Windows layers. If you have hardware that only
> Windows supports well, then you may need to make some hardware changes.
>
> I haven't said anything about which Unix-like OS .... you may find
> people arguing passionately for BSD vs. Linux .... however, the
> difference between these is negligible compared to "virtualized vs.
> real system", and at this point considerations like support base, ease
> of use and familiarity also come into play.
>
> IMHO Ubuntu would be a fine choice, and PGSQL is a "first-class"
> supported package from the distributor ... however, at customer sites,
> I've typically used Red Hat AS because they have a corporate
> preference for it, even though it is less convenient to install and
> manage.
>
> On Mon, Jun 14, 2010 at 7:41 PM, Tom Wilcox <hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>> wrote:
>
> Hi Dave,
>
> I am definitely able to switch OS if it will get the most out of
> Postgres. So it is definitely a case of choosing the OS on the
> needs if the app providing it is well justified.
>
> Currently, we are running Ubuntu Server 64-bit in a VirtualBox VM.
>
> Cheers,
> Tom
>
>
> Dave Crooke wrote:
>
> Tom
>
> I always prefer to choose apps based on business needs, then
> the OS based on the needs for the app.
>
> Cynically, I often feel that the best answer to "we have a
> policy that says we're only allowed to use operating system x"
> is to ignore the policy .... the kind of people ignorant
> enough to be that blinkered are usually not tech-savvy enough
> to notice when it gets flouted :-)
>
> More seriously, is the policy "Windows only on the metal" or
> could you run e.g. VMware ESX server? I/O is the area that
> takes the biggest hit in virtualization, and ESX server has
> far less overhead loss than either Hyper-V (which I presume
> you are using) or VMWare Workstation for NT (kernels).
>
> If it's a Windows-only policy, then perhaps you can run those
> traps in reverse, and switch to a Windows database, i.e.
> Microsoft SQL Server.
>
> Cheers
> Dave
>
> On Mon, Jun 14, 2010 at 1:53 PM, Tom Wilcox
> <hungrytom(at)gmail(dot)com <mailto:hungrytom(at)gmail(dot)com>
> <mailto:hungrytom(at)gmail(dot)com <mailto:hungrytom(at)gmail(dot)com>>> wrote:
>
>
> Hi Bob,
>
> Thanks a lot. Here's my best attempt to answer your questions:
>
> The VM is setup with a virtual disk image dynamically
> expanding to
> fill an allocation of 300GB on a fast, local hard drive
> (avg read
> speed = 778MB/s ).
> WAL files can have their own disk, but how significantly would
> this affect our performance?
> The filesystem of the host OS is NTFS (Windows Server 2008
> OS 64),
> the guest filesystem is Ext2 (Ubuntu 64).
> The workload is OLAP (lots of large, complex queries on large
> tables run in sequence).
>
> In addition, I have reconfigured my server to use more memory.
> Here's a detailed blow by blow of how I reconfigured my
> system to
> get better performance (for anyone who might be interested)...
>
> In order to increase the shared memory on Ubuntu I edited the
> System V IPC values using sysctl:
>
> sysctl -w kernel.shmmax=16106127360*
> *sysctl -w kernel.shmall=2097152
>
> I had some fun with permissions as I somehow managed to
> change the
> owner of the postgresql.conf to root where it needed to be
> postgres, resulting in failure to start the service..
> (Fixed with
> chown postgres:postgres ./data/postgresql.conf and chmod u=rwx
> ./data -R).
>
> I changed the following params in my configuration file..
>
> default_statistics_target=10000
> maintenance_work_mem=512MB
> work_mem=512MB
> shared_buffers=512MB
> wal_buffers=128MB
>
> With this config, the following command took 6,400,000ms:
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;
>
> With plan:
> "Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278
> width=232) (actual time=0.379..464270.682 rows=27777961
> loops=1)"
> "Total runtime: 6398238.890 ms"
>
> With these changes to the previous config, the same command
> took
> 5,610,000ms:
>
> maintenance_work_mem=4GB
> work_mem=4GB
> shared_buffers=4GB
> effective_cache_size=4GB
> wal_buffers=1GB
>
> Resulting plan:
>
> "Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572
> width=232) (actual time=0.094..452793.430 rows=27777961
> loops=1)"
> "Total runtime: 5614140.786 ms"
>
> Then I performed these changes to the postgresql.conf file:
>
> max_connections=3
> effective_cache_size=15GB
> maintenance_work_mem=5GB
> shared_buffers=7000MB
> work_mem=5GB
>
> And ran this query (for a quick look - can't afford the
> time for
> the previous tests..):
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
> match_data_id < 100000;
>
> Result:
>
> "Index Scan using match_data_pkey1 on match_data
> (cost=0.00..15662.17 rows=4490 width=232) (actual
> time=27.055..1908.027 rows=99999 loops=1)"
> " Index Cond: (match_data_id < 100000)"
> "Total runtime: 25909.372 ms"
>
> I then ran EntrepriseDB's Tuner on my postgres install (for a
> dedicated machine) and got the following settings and results:
>
> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
> match_data_id < 100000;
>
> "Index Scan using match_data_pkey1 on match_data
> (cost=0.00..13734.54 rows=4495 width=232) (actual
> time=0.348..2928.844 rows=99999 loops=1)"
> " Index Cond: (match_data_id < 100000)"
> "Total runtime: 1066580.293 ms"
>
> For now, I will go with the config using 7000MB shared_buffers.
> Any suggestions on how I can further optimise this config for a
> single session, 64-bit install utilising ALL of 96GB RAM. I
> will
> spend the next week making the case for a native install of
> Linux,
> but first we need to be 100% sure that is the only way to
> get the
> most out of Postgres on this machine.
>
> Thanks very much. I now feel I am at a position where I can
> really
> explore and find the optimal configuration for my system, but
> would still appreciate any suggestions.
>
> Cheers,
> Tom
>
>
> On 11/06/2010 07:25, Bob Lunney wrote:
>
> Tom,
>
> First off, I wouldn't use a VM if I could help it, however,
> sometimes you have to make compromises. With a 16 Gb
> machine
> running 64-bit Ubuntu and only PostgreSQL, I'd start by
> allocating 4 Gb to shared_buffers. That should leave more
> than enough room for the OS and file system cache.
> Then I'd
> begin testing by measuring response times of representative
> queries with significant amounts of data.
>
> Also, what is the disk setup for the box? Filesystem? Can
> WAL files have their own disk? Is the workload OLTP or
> OLAP,
> or a mixture of both? There is more that goes into
> tuning a
> PG server for good performance than simply installing the
> software, setting a couple of GUCs and running it.
>
> Bob
>
> --- On Thu, 6/10/10, Tom Wilcox <hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>
> <mailto:hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>>> wrote:
>
>
> From: Tom Wilcox <hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>
> <mailto:hungrytom(at)gmail(dot)com
> <mailto:hungrytom(at)gmail(dot)com>>>
>
> Subject: Re: [PERFORM] requested shared memory size
> overflows size_t
> To: "Bob Lunney" <bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>
> <mailto:bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>>>
>
> Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>
> <mailto:robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>>,
> pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>
> <mailto:pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>>
>
> Date: Thursday, June 10, 2010, 10:45 AM
> Thanks guys. I am currently
> installing Pg64 onto a Ubuntu Server 64-bit
> installation
> running as a VM in VirtualBox with 16GB of RAM
> accessible.
> If what you say is true then what do you suggest I
> do to
> configure my new setup to best use the available
> 16GB (96GB
> and native install eventually if the test goes
> well) of RAM
> on Linux.
>
> I was considering starting by using Enterprise DBs
> tuner to
> see if that optimises things to a better quality..
>
> Tom
>
> On 10/06/2010 15:41, Bob Lunney wrote:
>
> True, plus there are the other issues of increased
>
> checkpoint times and I/O, bgwriter tuning, etc. It may
> be better to let the OS cache the files and size
> shared_buffers to a smaller value.
>
> Bob Lunney
>
> --- On Wed, 6/9/10, Robert
> Haas<robertmhaas(at)gmail(dot)com <mailto:robertmhaas(at)gmail(dot)com>
> <mailto:robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>>
>
> wrote:
>
>
> From: Robert Haas<robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>
> <mailto:robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>>
>
> Subject: Re: [PERFORM] requested shared memory
>
> size overflows size_t
>
> To: "Bob Lunney"<bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>
> <mailto:bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>>>
>
> Cc: pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>
> <mailto:pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>>,
>
>
> "Tom Wilcox"<hungrytom(at)googlemail(dot)com
> <mailto:hungrytom(at)googlemail(dot)com>
> <mailto:hungrytom(at)googlemail(dot)com
> <mailto:hungrytom(at)googlemail(dot)com>>>
>
>
> Date: Wednesday, June 9, 2010, 9:49 PM
> On Wed, Jun 2, 2010 at 9:26 PM, Bob
> Lunney<bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>
> <mailto:bob_lunney(at)yahoo(dot)com
> <mailto:bob_lunney(at)yahoo(dot)com>>>
>
> wrote:
>
> Your other option, of course, is a nice
> 64-bit
>
> linux
>
>
> variant, which won't have this problem at all.
>
> Although, even there, I think I've heard that
>
> after 10GB
>
> you don't get
> much benefit from raising it further. Not
>
> sure if
>
> that's accurate or
> not...
>
> -- Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
>
>
>
>
>
>
> -- Sent via pgsql-performance mailing list
> (pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>
> <mailto:pgsql-performance(at)postgresql(dot)org
> <mailto:pgsql-performance(at)postgresql(dot)org>>)
>
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
>
>
>


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 02:06:49
Message-ID: 4C16E039.10803@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Tom Wilcox wrote:
> default_statistics_target=10000
> wal_buffers=1GB
> max_connections=3
> effective_cache_size=15GB
> maintenance_work_mem=5GB
> shared_buffers=7000MB
> work_mem=5GB

That value for default_statistics_target means that every single query
you ever run will take a seriously long time to generate a plan for.
Even on an OLAP system, I would consider 10,000 an appropriate setting
for a column or two in a particularly troublesome table. I wouldn't
consider a value of even 1,000 in the postgresql.conf to be a good
idea. You should consider making the system default much lower, and
increase it only on columns that need it, not for every column on every
table.

There is no reason to set wal_buffers larger than 16MB, the size of a
full WAL segment. Have you read
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server yet?
checkpoint_segments is the main parameter you haven't touched yet you
should consider increasing. Even if you have a low write load, when
VACUUM runs it will be very inefficient running against a large set of
tables without the checkpoint frequency being decreased some. Something
in the 16-32 range would be plenty for an OLAP setup.

At 3 connections, a work_mem of 5GB is possibly reasonable. I would
normally recommend that you make the default much smaller than that
though, and instead just increase to a large value for queries that
benefit from it. If someone later increases max_connections to
something higher, your server could run completely out of memory if
work_mem isn't cut way back as part of that change.

You could consider setting effective_cache_size to something even larger
than that,

> EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE
> match_data_id < 100000;

By the way--repeatedly running this form of query to test for
improvements in speed is not going to give you particularly good
results. Each run will execute a bunch of UPDATE statements that leave
behind dead rows. So the next run done for comparison sake will either
have to cope with that additional overhead, or it will end up triggering
autovacuum and suffer from that. If you're going to use an UPDATE
statement as your benchmark, at a minimum run a manual VACUUM ANALYZE in
between each test run, to level out the consistency of results a bit.
Ideally you'd restore the whole database to an initial state before each
test.

> I will spend the next week making the case for a native install of
> Linux, but first we need to be 100% sure that is the only way to get
> the most out of Postgres on this machine.

I really cannot imagine taking a system as powerful as you're using here
and crippling it by running through a VM. You should be running Ubuntu
directly on the hardware, ext3 filesystem without LVM, split off RAID-1
drive pairs dedicated to OS and WAL, then use the rest of them for the
database.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Bob Lunney <bob_lunney(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 03:27:11
Message-ID: A1A6FB57-537F-40C1-8A2A-563703DD7BE8@richrelevance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


On Jun 14, 2010, at 11:53 AM, Tom Wilcox wrote:

>
>
> max_connections=3
> effective_cache_size=15GB
> maintenance_work_mem=5GB
> shared_buffers=7000MB
> work_mem=5GB
>

maintenance_work_mem doesn't need to be so high, it certainly has no effect on your queries below. It would affect vacuum, reindex, etc.

With fast disk like this (assuming your 700MB/sec above was not a typo) make sure you tune autovacuum up to be much more aggressive than the default (increase the allowable cost per sleep by at least 10x).

A big work_mem like above is OK if you know that no more than a couple sessions will be active at once. Worst case, a single connection ... probably ... won't use more than 2x that ammount.

> For now, I will go with the config using 7000MB shared_buffers. Any
> suggestions on how I can further optimise this config for a single
> session, 64-bit install utilising ALL of 96GB RAM. I will spend the next
> week making the case for a native install of Linux, but first we need to
> be 100% sure that is the only way to get the most out of Postgres on
> this machine.
>

Getting the most from the RAM does *_NOT_* mean making Postgres use all the RAM. Postgres relies on the OS file cache heavily. If there is a lot of free RAM for the OS to use to cache files, it will help the performance. Both Windows and Linux aggressively cache file pages and do a good job at it.


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Tom Wilcox <hungrytom(at)gmail(dot)com>, Bob Lunney <bob_lunney(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 03:49:40
Message-ID: 08A32E7F-5FFC-4341-A543-E0BFB6420B7E@richrelevance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


On Jun 14, 2010, at 7:06 PM, Greg Smith wrote:

> I really cannot imagine taking a system as powerful as you're using here
> and crippling it by running through a VM. You should be running Ubuntu
> directly on the hardware, ext3 filesystem without LVM, split off RAID-1
> drive pairs dedicated to OS and WAL, then use the rest of them for the
> database.
>

Great points. There is one other option that is decent for the WAL:
If splitting out a volume is not acceptable for the OS and WAL -- absolutely split those two out into their own partitions. It is most important to make sure that WAL and data are not on the same filesystem, especially if ext3 is involved.

> --
> Greg Smith 2ndQuadrant US Baltimore, MD
> PostgreSQL Training, Services and Support
> greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Wilcox <hungrytom(at)gmail(dot)com>, Bob Lunney <bob_lunney(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-15 03:57:11
Message-ID: 26740.1276574231@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Scott Carey <scott(at)richrelevance(dot)com> writes:
> Great points. There is one other option that is decent for the WAL:
> If splitting out a volume is not acceptable for the OS and WAL -- absolutely split those two out into their own partitions. It is most important to make sure that WAL and data are not on the same filesystem, especially if ext3 is involved.

Uh, no, WAL really needs to be on its own *spindle*. The whole point
here is to have one disk head sitting on the WAL and not doing anything
else except writing to that file. Pushing WAL to a different partition
but still on the same physical disk is likely to be a net pessimization,
because it'll increase the average seek distance whenever the head does
have to move between WAL and everything-else-in-the-database.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Wilcox <hungrytom(at)gmail(dot)com>, Bob Lunney <bob_lunney(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-16 20:53:52
Message-ID: 1276721410-sup-6820@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Excerpts from Tom Lane's message of lun jun 14 23:57:11 -0400 2010:
> Scott Carey <scott(at)richrelevance(dot)com> writes:
> > Great points. There is one other option that is decent for the WAL:
> > If splitting out a volume is not acceptable for the OS and WAL -- absolutely split those two out into their own partitions. It is most important to make sure that WAL and data are not on the same filesystem, especially if ext3 is involved.
>
> Uh, no, WAL really needs to be on its own *spindle*. The whole point
> here is to have one disk head sitting on the WAL and not doing anything
> else except writing to that file.

However, there's another point here -- probably what Scott is on about:
on Linux (at least ext3), an fsync of any file does not limit to
flushing that file's blocks -- it flushes *ALL* blocks on *ALL* files in
the filesystem. This is particularly problematic if you have pgsql_tmp
in the same filesystem and do lots of disk-based sorts.

So if you have it in the same spindle but on a different filesystem, at
least you'll avoid that extra fsync work, even if you have to live with
the extra seeking.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-17 21:41:54
Message-ID: 4C1A96A2.2050607@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Tom Wilcox wrote:
> Any suggestions for good monitoring software for linux?

By monitoring, do you mean for alerting purposes or for graphing
purposes? Nagios is the only reasonable choice for the former, while
doing at best a mediocre job at the latter. For the later, I've found
that Munin does a good job of monitoring Linux and PostgreSQL in its out
of the box configuration, in terms of providing useful activity graphs.
And you can get it to play nice with Nagios.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: Tom Wilcox <hungrytom(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-17 23:46:11
Message-ID: 4C1AB3C3.9090606@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On 17/06/2010 22:41, Greg Smith wrote:
> Tom Wilcox wrote:
>> Any suggestions for good monitoring software for linux?
>
> By monitoring, do you mean for alerting purposes or for graphing
> purposes? Nagios is the only reasonable choice for the former, while
> doing at best a mediocre job at the latter. For the later, I've found
> that Munin does a good job of monitoring Linux and PostgreSQL in its
> out of the box configuration, in terms of providing useful activity
> graphs. And you can get it to play nice with Nagios.
>
Thanks Greg. Ill check Munin and Nagios out. It is very much for
graphing purposes. I would like to be able to perform objective,
platform-independent style performance comparisons.

Cheers,
Tom


From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Tom Wilcox <hungrytom(at)gmail(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-18 12:48:26
Message-ID: 20100618124826.GW23529@aart.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Fri, Jun 18, 2010 at 12:46:11AM +0100, Tom Wilcox wrote:
> On 17/06/2010 22:41, Greg Smith wrote:
>> Tom Wilcox wrote:
>>> Any suggestions for good monitoring software for linux?
>>
>> By monitoring, do you mean for alerting purposes or for graphing purposes?
>> Nagios is the only reasonable choice for the former, while doing at best
>> a mediocre job at the latter. For the later, I've found that Munin does a
>> good job of monitoring Linux and PostgreSQL in its out of the box
>> configuration, in terms of providing useful activity graphs. And you can
>> get it to play nice with Nagios.
>>
> Thanks Greg. Ill check Munin and Nagios out. It is very much for graphing
> purposes. I would like to be able to perform objective,
> platform-independent style performance comparisons.
>
> Cheers,
> Tom
>
Zabbix-1.8+ is also worth taking a look at and it can run off our
favorite database. It allows for some very flexible monitoring and
trending data collection.

Regards,
Ken


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Tom Wilcox <hungrytom(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-18 18:11:53
Message-ID: 4C1BB6E9.1090105@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Kenneth Marshall wrote:
> Zabbix-1.8+ is also worth taking a look at and it can run off our
> favorite database. It allows for some very flexible monitoring and
> trending data collection.
>

Note that while Zabbix is perfectly reasonable general solution, the
number of things it monitors out of the box for PostgreSQL:
http://www.zabbix.com/wiki/howto/monitor/db/postgresql is only a
fraction of what Munin shows you. The main reason I've been suggesting
Munin lately is because it seems to get all the basics right for new
users without them having to do anything but activate the PostgreSQL
plug-in.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us


From: Scott Carey <scott(at)richrelevance(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Tom Wilcox <hungrytom(at)gmail(dot)com>, Bob Lunney <bob_lunney(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-19 02:59:14
Message-ID: C81FB781-FA32-462F-9C97-741E57A3DC76@richrelevance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


On Jun 16, 2010, at 1:53 PM, Alvaro Herrera wrote:

> Excerpts from Tom Lane's message of lun jun 14 23:57:11 -0400 2010:
>> Scott Carey <scott(at)richrelevance(dot)com> writes:
>>> Great points. There is one other option that is decent for the WAL:
>>> If splitting out a volume is not acceptable for the OS and WAL -- absolutely split those two out into their own partitions. It is most important to make sure that WAL and data are not on the same filesystem, especially if ext3 is involved.
>>
>> Uh, no, WAL really needs to be on its own *spindle*. The whole point
>> here is to have one disk head sitting on the WAL and not doing anything
>> else except writing to that file.
>
> However, there's another point here -- probably what Scott is on about:
> on Linux (at least ext3), an fsync of any file does not limit to
> flushing that file's blocks -- it flushes *ALL* blocks on *ALL* files in
> the filesystem. This is particularly problematic if you have pgsql_tmp
> in the same filesystem and do lots of disk-based sorts.
>
> So if you have it in the same spindle but on a different filesystem, at
> least you'll avoid that extra fsync work, even if you have to live with
> the extra seeking.

yes, especially with a battery backed up caching raid controller the whole "own spindle" thing doesn't really matter, the WAL log writes fairly slowly and linearly and any controller with a damn will batch those up efficiently.

By FAR, the most important thing is to have WAL on its own file system. If using EXT3 in a way that is safe for your data (data = ordered or better), even with just one SATA disk, performance will improve a LOT if data and xlog are separated into different file systems. Yes, an extra spindle is better.

However with a decent RAID card or caching storage, 8 spindles for it all in one raid 10, with a partition for xlog and one for data, is often better performing than a mirrored pair for OS/xlog and 6 for data so long as the file systems are separated. With a dedicated xlog and caching reliable storage, you can even mount it direct to avoid polluting OS page cache.

>
> --
> Álvaro Herrera <alvherre(at)commandprompt(dot)com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Craig James <craig_james(at)emolecules(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-24 23:03:00
Message-ID: 4C23E424.8040605@emolecules.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Can anyone tell me what's going on here? I hope this doesn't mean my system tables are corrupt...

Thanks,
Craig

select relname, pg_relation_size(relname) from pg_class
where pg_get_userbyid(relowner) = 'emol_warehouse_1'
and relname not like 'pg_%'
order by pg_relation_size(relname) desc;
ERROR: relation "rownum_temp" does not exist

emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
relname
----------------------
rownum_temp
(1 row)

emol_warehouse_1=> \d rownum_temp
Did not find any relation named "rownum_temp".
emol_warehouse_1=> create table rownum_temp(i int);
CREATE TABLE
emol_warehouse_1=> drop table rownum_temp;
DROP TABLE
emol_warehouse_1=> select relname, pg_relation_size(relname) from pg_class
where pg_get_userbyid(relowner) = 'emol_warehouse_1'
and relname not like 'pg_%'
order by pg_relation_size(relname) desc;
ERROR: relation "rownum_temp" does not exist

emol_warehouse_1=> select relname, pg_relation_size(relname) from pg_class;
ERROR: relation "tables" does not exist


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Craig James <craig_james(at)emolecules(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-24 23:19:25
Message-ID: 1277421485-sup-3743@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:

> select relname, pg_relation_size(relname) from pg_class
> where pg_get_userbyid(relowner) = 'emol_warehouse_1'
> and relname not like 'pg_%'
> order by pg_relation_size(relname) desc;
> ERROR: relation "rownum_temp" does not exist
>
> emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
> relname
> ----------------------
> rownum_temp
> (1 row)

What's the full row? I'd just add a "WHERE relkind = 'r'" to the above
query anyway.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Craig James <craig_james(at)emolecules(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: System tables screwed up? (WAS requested shared memory size overflows size_t)
Date: 2010-06-24 23:24:44
Message-ID: 4C23E93C.7010504@emolecules.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On 6/24/10 4:19 PM, Alvaro Herrera wrote:
> Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:
>
>> select relname, pg_relation_size(relname) from pg_class
>> where pg_get_userbyid(relowner) = 'emol_warehouse_1'
>> and relname not like 'pg_%'
>> order by pg_relation_size(relname) desc;
>> ERROR: relation "rownum_temp" does not exist
>>
>> emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
>> relname
>> ----------------------
>> rownum_temp
>> (1 row)
>
> What's the full row? I'd just add a "WHERE relkind = 'r'" to the above
> query anyway.

Thanks, in fact that works. But my concern is that these are system tables and system functions and yet they seem to be confused. I've used this query dozens of times and never seen this behavior before. It makes me really nervous...

Craig

P.S. Sorry I got the Subject wrong the first time by hitting the REPLY key mindlessly, I've changed it now.


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Craig James <craig_james(at)emolecules(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: System tables screwed up? (WAS requested shared memory size overflows size_t)
Date: 2010-06-24 23:55:01
Message-ID: 1277423407-sup-289@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Excerpts from Craig James's message of jue jun 24 19:24:44 -0400 2010:
> On 6/24/10 4:19 PM, Alvaro Herrera wrote:
> > Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:
> >
> >> select relname, pg_relation_size(relname) from pg_class
> >> where pg_get_userbyid(relowner) = 'emol_warehouse_1'
> >> and relname not like 'pg_%'
> >> order by pg_relation_size(relname) desc;
> >> ERROR: relation "rownum_temp" does not exist
> >>
> >> emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
> >> relname
> >> ----------------------
> >> rownum_temp
> >> (1 row)
> >
> > What's the full row? I'd just add a "WHERE relkind = 'r'" to the above
> > query anyway.
>
> Thanks, in fact that works. But my concern is that these are system tables and system functions and yet they seem to be confused. I've used this query dozens of times and never seen this behavior before. It makes me really nervous...

I think you're being bitten by lack of schema qualification. Perhaps
you ought to pass pg_class.oid to pg_relation_size instead of relname.
What did you do to make pg_relation_size to work on type name?

Why is this a -performance question anyway?

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Craig James <craig_james(at)emolecules(dot)com>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-25 03:05:06
Message-ID: AANLkTikWnvbzpgUr17fWe44lQkTmmIdqYwfx323FQ14K@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Thu, Jun 24, 2010 at 7:19 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:
>
>> select relname, pg_relation_size(relname) from pg_class
>>          where pg_get_userbyid(relowner) = 'emol_warehouse_1'
>>          and relname not like 'pg_%'
>>          order by pg_relation_size(relname) desc;
>> ERROR:  relation "rownum_temp" does not exist
>>
>> emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
>>         relname
>> ----------------------
>>   rownum_temp
>> (1 row)
>
> What's the full row?  I'd just add a "WHERE relkind = 'r'" to the above
> query anyway.

Yeah - also, it would probably be good to call pg_relation_size on
pg_class.oid rather than pg_class.relname, to avoid any chance of
confusion over which objects are in which schema.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Jim Montgomery <monty1967(at)hotmail(dot)com>
To: <robertmhaas(at)gmail(dot)com>, <alvherre(at)commandprompt(dot)com>
Cc: <craig_james(at)emolecules(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: requested shared memory size overflows size_t
Date: 2010-06-25 04:59:33
Message-ID: SNT123-W49BF46CDDFB62AD8A15444A7C70@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


Remove me from your email traffic.

> Date: Thu, 24 Jun 2010 23:05:06 -0400
> Subject: Re: [PERFORM] requested shared memory size overflows size_t
> From: robertmhaas(at)gmail(dot)com
> To: alvherre(at)commandprompt(dot)com
> CC: craig_james(at)emolecules(dot)com; pgsql-performance(at)postgresql(dot)org
>
> On Thu, Jun 24, 2010 at 7:19 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
> > Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:
> >
> >> select relname, pg_relation_size(relname) from pg_class
> >> where pg_get_userbyid(relowner) = 'emol_warehouse_1'
> >> and relname not like 'pg_%'
> >> order by pg_relation_size(relname) desc;
> >> ERROR: relation "rownum_temp" does not exist
> >>
> >> emol_warehouse_1=> select relname from pg_class where relname = 'rownum_temp';
> >> relname
> >> ----------------------
> >> rownum_temp
> >> (1 row)
> >
> > What's the full row? I'd just add a "WHERE relkind = 'r'" to the above
> > query anyway.
>
> Yeah - also, it would probably be good to call pg_relation_size on
> pg_class.oid rather than pg_class.relname, to avoid any chance of
> confusion over which objects are in which schema.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

_________________________________________________________________
http://clk.atdmt.com/UKM/go/197222280/direct/01/
We want to hear all your funny, exciting and crazy Hotmail stories. Tell us now