Re: PostgreSQL win32 fragmentation issue

Lists: pgsql-hackers
From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 06:58:44
Message-ID: 45712424.5060501@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I ran a simple test... I ran defrag on my drives. Then I initialized
pgbench with a -s 1000.

11% fragmentation. I dropped the bench database, and my fragmentation is 1%.

I know this isn't *our* fault :) but I am curious if there is anything
we can do about the way postgresql writes files to help limit fragmentation.

Essentially, this makes win32 impossible in a 24x7 environment (jokes
aside about Win32 in general) because we *have* to defrag on Windows and
Windows won't defrag open files (thus anything PostgreSQL is using).

Sincerely,

Joshua D. Drake


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 13:31:32
Message-ID: 20061202133132.GB26365@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
> I know this isn't *our* fault :) but I am curious if there is anything
> we can do about the way postgresql writes files to help limit fragmentation.
>
> Essentially, this makes win32 impossible in a 24x7 environment (jokes
> aside about Win32 in general) because we *have* to defrag on Windows and
> Windows won't defrag open files (thus anything PostgreSQL is using).

I thought fragmentation was something that disappeared with the FAT
filesystem. Isn't NTFS smart enought o avoid fragmentation in the first
place?

The way you avoid fragmentation is by preallocating larger blocks, but
most filesystems are smart enough to handle that somehwta
automatically.

BTW, do you know what 11% fragmentation means? Does that mean each file
is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
all that bad.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:17:27
Message-ID: 4571A717.3030207@zeut.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martijn van Oosterhout wrote:
> On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
>
>> I know this isn't *our* fault :) but I am curious if there is anything
>> we can do about the way postgresql writes files to help limit fragmentation.
>>
>> Essentially, this makes win32 impossible in a 24x7 environment (jokes
>> aside about Win32 in general) because we *have* to defrag on Windows and
>> Windows won't defrag open files (thus anything PostgreSQL is using).
>>
>
> BTW, do you know what 11% fragmentation means? Does that mean each file
> is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
> all that bad.

Or more specifically do you have any idea what it means in terms of
performance? And why do you say that you *have to* defrag under
windows, isn't this only a performance issue and not necessarily a huge
one at that? Also, as a solution, I would think that CLUSTER might help.

Matt


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:34:04
Message-ID: 1165077244.18923.3.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2006-12-02 at 14:31 +0100, Martijn van Oosterhout wrote:
> On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
> > I know this isn't *our* fault :) but I am curious if there is anything
> > we can do about the way postgresql writes files to help limit fragmentation.
> >
> > Essentially, this makes win32 impossible in a 24x7 environment (jokes
> > aside about Win32 in general) because we *have* to defrag on Windows and
> > Windows won't defrag open files (thus anything PostgreSQL is using).
>
> I thought fragmentation was something that disappeared with the FAT
> filesystem. Isn't NTFS smart enought o avoid fragmentation in the first
> place?

Nope... you still have the good old defrag command (well button now).

> BTW, do you know what 11% fragmentation means? Does that mean each file
> is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
> all that bad.

No, the report had Total Fragmentation, and File Fragmentation. I was
reporting on Total, which I assume is some aggregation.

My concern is that this is over a single bench run. I could imagine that
after a week or two weeks of stead PostgreSQL use, the IO would
gradually get worse and worse.

Joshua D. Drake

>
> Have a nice day,
--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:34:53
Message-ID: 20061202163453.GC11525@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:

> My concern is that this is over a single bench run. I could imagine that
> after a week or two weeks of stead PostgreSQL use, the IO would
> gradually get worse and worse.

Well, we mostly do random reads (seek) all the time anyway, so is this
really a concern?

Are you using NTFS or FAT? You didn't answer that question.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:36:04
Message-ID: 1165077364.18923.5.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Or more specifically do you have any idea what it means in terms of
> performance? And why do you say that you *have to* defrag under
> windows, isn't this only a performance issue and not necessarily a huge
> one at that? Also, as a solution, I would think that CLUSTER might help.

Yes it is a performance issue. And all performance issues if they can
not be resolved within the database or application, eventually become an
outage.

A CLUSTER may, but that puts us back to an outage. A CLUSTER is an
exclusive lock. Usually for long periods of time.

Joshua D. Drake

>
> Matt
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:41:18
Message-ID: 1165077678.18923.9.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2006-12-02 at 13:34 -0300, Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>
> > My concern is that this is over a single bench run. I could imagine that
> > after a week or two weeks of stead PostgreSQL use, the IO would
> > gradually get worse and worse.
>
> Well, we mostly do random reads (seek) all the time anyway, so is this
> really a concern?

Wouldn't it be for writes?

>
> Are you using NTFS or FAT? You didn't answer that question.

NTFS of course :) (sorry).

Joshua D. Drake

>
--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:43:27
Message-ID: 2463.24.211.165.134.1165077807.squirrel@www.dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>
>> My concern is that this is over a single bench run. I could imagine that
>> after a week or two weeks of stead PostgreSQL use, the IO would
>> gradually get worse and worse.
>
> Well, we mostly do random reads (seek) all the time anyway, so is this
> really a concern?
>
> Are you using NTFS or FAT? You didn't answer that question.
>

If it's FAT then you deserve what you get.

If you really need this, a Slony setup would probably work well. Shut down
replica, defrag, start up replica, wait for sync completion, switchover
and repeat. The new Slony release is said to support Windows.

cheers

andrew


From: "Thomas H(dot)" <me(at)alternize(dot)com>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:46:43
Message-ID: 0ea701c71631$73e87050$0201a8c0@iwing
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Martijn van Oosterhout wrote:
>> On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
>>
>>> I know this isn't *our* fault :) but I am curious if there is anything
>>> we can do about the way postgresql writes files to help limit
>>> fragmentation.
>>>
>>> Essentially, this makes win32 impossible in a 24x7 environment (jokes
>>> aside about Win32 in general) because we *have* to defrag on Windows and
>>> Windows won't defrag open files (thus anything PostgreSQL is using).
>>>
>>
>> BTW, do you know what 11% fragmentation means? Does that mean each file
>> is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
>> all that bad.

in our win32/ntfs environment, only 6 pgsql data-files are fragmented. but
they are heavily fragmented. fragmentiation ranges from 1369 fragments for a
14mb file to 4548 fragments for a 628mb one... the database is only 1 week
old.

- thomas


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 16:56:35
Message-ID: 1165078595.18923.24.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> > Well, we mostly do random reads (seek) all the time anyway, so is this
> > really a concern?
> >
> > Are you using NTFS or FAT? You didn't answer that question.
> >
>
> If it's FAT then you deserve what you get.

I am investigating this for the wider community. I do not, use
PostgreSQL Win32 for anything but testing, but guess what -- our silent
majority does. I have a customer with 5000 Win32 installations and they
are not the only ones with those types of deployments.

I am trying to get as much information as possible so that:

1. We can update the documentation to correctly reflect issues with
Win32.

2. We can hopefully become the dominant database on Win32.

>
> If you really need this, a Slony setup would probably work well.

The idea of Slony with Windows is a bit terryfying considering the
target Admin audience and the performance hit one would take.

> Shut down
> replica, defrag, start up replica, wait for sync completion, switchover
> and repeat. The new Slony release is said to support Windows.

O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
The fact that this was suggested as anything but a last resort isn't to
inspiring.

Sincerely,

Joshua D. Drake

>
> cheers
>
> andrew
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
>
--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 17:05:40
Message-ID: 2491.24.211.165.134.1165079140.squirrel@www.dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
>
> O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
> The fact that this was suggested as anything but a last resort isn't to
> inspiring.
>

Before you start ringing alarm bells, you need to instrument the actual
performance effect. So far all I have seen is an assumption that the
effect will be serious. Let's see some performance metrics that
demonstrate the problem you think might exist.

The Windows port has been out for nearly 2 years. The fact that we have
not seen complaints about this leads me to be somewhat skeptical. You
could be right, but I want hard evidence. (And why wouldn't this be a
problem for any DBMS running on Windows? There are just huge numbers of
24/7 Windows servers running SQLServer or Oracle.)

cheers

andrew


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 17:22:11
Message-ID: 1165080131.18923.33.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2006-12-02 at 11:05 -0600, Andrew Dunstan wrote:
> Joshua D. Drake wrote:
> >
> > O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
> > The fact that this was suggested as anything but a last resort isn't to
> > inspiring.
> >
>
> Before you start ringing alarm bells, you need to instrument the actual
> performance effect.

Alarm bells? I saw a potential issue, I brought it to the community.
That is all.

> So far all I have seen is an assumption that the
> effect will be serious. Let's see some performance metrics that
> demonstrate the problem you think might exist.

It is not an assumption that defragmentation causes performance issues.
The assumption is that the above fact will cause problems with
PostgreSQL performance.

Yes, the PostgreSQL portion is an assumption which is why I have brought
it to the community for further analysis.

>
> The Windows port has been out for nearly 2 years. The fact that we have
> not seen complaints about this leads me to be somewhat skeptical.

Fair enough but I don't hold too much weight to this argument as even
the remote idea that anything but the slimmest <1% of our Wndows users
even read this list (let alone post to it) beyond a possible Google
search is far fetched.

> You
> could be right, but I want hard evidence.

Sure, which is why I opened the dialog.

> (And why wouldn't this be a
> problem for any DBMS running on Windows? There are just huge numbers of
> 24/7 Windows servers running SQLServer or Oracle.)

Well I don't know about SQLServer but Oracle uses a single file (at
least it does on Unix). I don't know if that would make a difference or
not.

Sincerely,

Joshua D. Drake

>
> cheers
>
> andrew
>
>
--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 17:45:47
Message-ID: b42b73150612020945g57ccb798k26328c99b2e364f6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/2/06, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
> Well I don't know about SQLServer but Oracle uses a single file (at
> least it does on Unix). I don't know if that would make a difference or
> not.

ditto sql server. I agree with Andrew though, let's determine there
to be negative performance impact before assuming the worst.

merlin


From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 18:23:53
Message-ID: 1077.24.211.165.134.1165083833.squirrel@www.dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
> On Sat, 2006-12-02 at 11:05 -0600, Andrew Dunstan wrote:
>> Joshua D. Drake wrote:
>> >
>> > O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
>> > The fact that this was suggested as anything but a last resort isn't
>> to
>> > inspiring.
>> >
>>
>> Before you start ringing alarm bells, you need to instrument the actual
>> performance effect.
>
> Alarm bells? I saw a potential issue, I brought it to the community.
> That is all.

You said:

"Essentially, this makes win32 impossible in a 24x7 environment (jokes
aside about Win32 in general) because we *have* to defrag on Windows and
Windows won't defrag open files (thus anything PostgreSQL is using)."

What I am asking for is evidence to back up that assertion. At the moment
it is, as Tom likes to say, "a fact not in evidence".

(I should have thought a single file DB would be MORE liable to
fragmentation, BTW).

cheers

andrew


From: "Thomas H(dot)" <me(at)alternize(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 18:35:31
Message-ID: 0f5b01c71640$a65ab930$0201a8c0@iwing
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> (I should have thought a single file DB would be MORE liable to
> fragmentation, BTW).

AFAIK what mssql does against fragmentation is: it preallocates its
tablespace in chunks of several mb/gb and then starts filling the free
allocated space until it has to reallocate more. the fragmentation for such
a single file (actually in most times 2 files for mssql) is practically
ignorable.

- thomas


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 18:47:26
Message-ID: 1165085246.3778.991.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2006-12-02 at 09:22 -0800, Joshua D. Drake wrote:
> On Sat, 2006-12-02 at 11:05 -0600, Andrew Dunstan wrote:
> > Joshua D. Drake wrote:
> > >
> > > O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
> > > The fact that this was suggested as anything but a last resort isn't to
> > > inspiring.
> > >
> >
> > Before you start ringing alarm bells, you need to instrument the actual
> > performance effect.
>
> Alarm bells? I saw a potential issue, I brought it to the community.
> That is all.

That's understood and appreciated. In case no one else says it: thanks.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 20:17:38
Message-ID: 1165090658.18923.67.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Starting point: 1% Fragmentation
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 46.001927 (including connections establishing)
tps = 46.045008 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 50.507082 (including connections establishing)
tps = 50.558191 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 1000
number of transactions actually processed: 100000/100000
tps = 45.976461 (including connections establishing)
tps = 46.026986 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

Initialize with scaling of 100

Total Fragmentation: 2%
File Fragmentation: 5%

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 43.054164 (including connections establishing)
tps = 43.090163 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

O.k. my guess here is that autovacuum is keeping up with the actual
pgbench runs and thus causing reusable tuples? Does that make sense?

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Thomas H(dot)" <me(at)alternize(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 20:21:26
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCEA0FD94@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> >>> I know this isn't *our* fault :) but I am curious if there is
> >>> anything we can do about the way postgresql writes files to help
> >>> limit fragmentation.
> >>>
> >>> Essentially, this makes win32 impossible in a 24x7 environment
> >>> (jokes aside about Win32 in general) because we *have* to
> defrag on
> >>> Windows and Windows won't defrag open files (thus
> anything PostgreSQL is using).
> >>>
> >>
> >> BTW, do you know what 11% fragmentation means? Does that mean each
> >> file is on average split in 9 pieces, because for a 1GB file, 9
> >> pieces isn't all that bad.
>
> in our win32/ntfs environment, only 6 pgsql data-files are
> fragmented. but they are heavily fragmented. fragmentiation
> ranges from 1369 fragments for a 14mb file to 4548 fragments
> for a 628mb one... the database is only 1 week old.

Which relations do these files represent?

//Magnus


From: "Thomas H(dot)" <me(at)alternize(dot)com>
To: "Magnus Hagander" <mha(at)sollentuna(dot)net>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 23:07:19
Message-ID: 102c01c71666$9f347940$0201a8c0@iwing
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> in our win32/ntfs environment, only 6 pgsql data-files are
>> fragmented. but they are heavily fragmented. fragmentiation
>> ranges from 1369 fragments for a 14mb file to 4548 fragments
>> for a 628mb one... the database is only 1 week old.

> Which relations do these files represent?

all 6 of them tables are tables. makes sense as reindex would recreate the
index file more or less at once and thus without much fragmentation...

- thomas


From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 23:09:18
Message-ID: 4572079E.9020204@paradise.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:

>
> (I should have thought a single file DB would be MORE liable to
> fragmentation, BTW).
>

The likes of Oracle/SQLServer preallocate the files(s) at database
creation time, so there will be little or no initial fragmentation.

However, they are typically allowed to grow to accommodate extra data -
this means that fragmentation over time *can* become a problem for these
guys too - but looks like we can suffer it from day 1.

Cheers

Mark


From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 23:31:37
Message-ID: 45720CD9.70809@paradise.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Kirkwood wrote:
> Andrew Dunstan wrote:
>
>>
>> (I should have thought a single file DB would be MORE liable to
>> fragmentation, BTW).
>>
>
> The likes of Oracle/SQLServer preallocate the files(s) at database
> creation time, so there will be little or no initial fragmentation.
>

... To clarify - preallocated with free space in the files :-) ...

A related point comes to mind, consider the single "tablespace" file for
example: It may have no fragmentation from a *file* perspective, but
supposing you have several busy relations being inserted or updated then
pages (or groups of pages) for these could be "fragmented" throughout
the tablespace file. This can reduce scan performance, since seeks are
now involved to "jump over" the page chunks for relations you are not
interested in.

Some Oracle DBA's I knew used to regularly export|rebuild|import their
databases precisely to "defragment" their relation|file page distribution.

I'm not trying to say that we don't have an issue, just pointing out
that preallocated files are not necessarily a complete or ideal solution.

Cheers

Mark


From: "Thomas H(dot)" <me(at)alternize(dot)com>
To: "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-02 23:58:46
Message-ID: 10e701c7166d$cef5f3a0$0201a8c0@iwing
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

by the way... at least on windows 2003 file fragmentation is *not* really an
issue. thanks to the vssvc (shadowcopy) even files that are in use are being
defragmented. as far as i know, vssvc is also there in xp, but i can't
verify.

running a quick defrag on the db data partition defragmented all the
beforementioned heavily fragmented relation files without any problems,
eventhough the tables are actually being updated and in use while
defragmenting.

a regular defrag will be sufficient to keep the performance hit low - if
there is any noticable in first place, that is.

- thomas


From: Shane Ambler <pgsql(at)007Marketing(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 03:03:59
Message-ID: 45723E9F.2080307@007Marketing.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:

> I am trying to get as much information as possible so that:
>

Well I have a small postgresql setup on my girlfriends Win XP Pro
machine that only had a little testing use a few months ago.

The data folder is 84MB and even though there is 25% total fragmentation
and 45% file fragmentation on the drive - the largest being the
pc-cillin log with 75,104 fragments - none of the fragmented files are
postgresql data files. (Been a while since I defraged this)

If we stop and think about how windows (and mac os x) allocate file
space as compared to *nix file systems then we can assume a few things
that could be tested to verify.

The first test you mentioned you defraged with an existing db then ran
the tests and then checked fragmentation again. Try this instead -
remove the contents of the data folder - defrag - initdb - run test and
see what fragmentation you get.

Try installing the data folder on a separate drive with nothing else on
it and run the test and then look at fragmentation.

How about some performance tests to measure the difference between badly
fragmented data files and un-fragmented data files. It may not make a
significant difference.

--

Shane Ambler
pgSQL(at)007Marketing(dot)com

Get Sheeky @ http://Sheeky.Biz


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 04:36:15
Message-ID: 12796.1165120575@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> writes:
> A related point comes to mind, consider the single "tablespace" file for
> example: It may have no fragmentation from a *file* perspective, but
> supposing you have several busy relations being inserted or updated then
> pages (or groups of pages) for these could be "fragmented" throughout
> the tablespace file.

+1 ... what was said upthread sounds to me like those other databases
are just hiding the fragmentation issue within their
huge-files-you-can't-see-into. I would very much like to see some proof
of performance problems before we worry about this.

regards, tom lane


From: "Gurjeet Singh" <singh(dot)gurjeet(at)gmail(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 08:56:34
Message-ID: 65937bea0612030056s2816d005ibac887ef0e8f5dae@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/2/06, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
>
> On Sat, 2006-12-02 at 13:34 -0300, Alvaro Herrera wrote:
> > Joshua D. Drake wrote:
> >
> > > My concern is that this is over a single bench run. I could imagine
> that
> > > after a week or two weeks of stead PostgreSQL use, the IO would
> > > gradually get worse and worse.
> >
> > Well, we mostly do random reads (seek) all the time anyway, so is this
> > really a concern?
>
> Wouldn't it be for writes?

Also, huge index range scans can be badly affected by file level
fragmentation.

Regards,

--
gurjeet[(dot)singh](at)EnterpriseDB(dot)com
singh(dot)gurjeet(at){ gmail | hotmail | yahoo }.com


From: "Gurjeet Singh" <singh(dot)gurjeet(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 09:03:12
Message-ID: 65937bea0612030103n76b755bv2bd219e9c7aac7f4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/3/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> writes:
> > A related point comes to mind, consider the single "tablespace" file for
> > example: It may have no fragmentation from a *file* perspective, but
> > supposing you have several busy relations being inserted or updated then
> > pages (or groups of pages) for these could be "fragmented" throughout
> > the tablespace file.
>
> +1 ... what was said upthread sounds to me like those other databases
> are just hiding the fragmentation issue within their
> huge-files-you-can't-see-into. I would very much like to see some proof
> of performance problems before we worry about this.
>
>
Does this raise the need for an in-postgres de-fragmenter?

--
gurjeet[(dot)singh](at)EnterpriseDB(dot)com
singh(dot)gurjeet(at){ gmail | hotmail | yahoo }.com
GTODO


From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Martijn van Oosterhout" <kleptog(at)svana(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 10:54:05
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCEA0FD97@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > A related point comes to mind, consider the single
> "tablespace" file
> > for
> > example: It may have no fragmentation from a *file*
> perspective, but
> > supposing you have several busy relations being inserted or updated
> > then pages (or groups of pages) for these could be "fragmented"
> > throughout the tablespace file.
>
> +1 ... what was said upthread sounds to me like those other databases
> are just hiding the fragmentation issue within their
> huge-files-you-can't-see-into. I would very much like to see
> some proof of performance problems before we worry about this.

Yes, they definitly do. That's why for exapmle SQLServer provides the
commands "DBCC CHECKCONTIG" and "DBCC INDEXDEFRAG". Along with comments
like this:

"DBCC SHOWCONTIG determines whether the table is heavily fragmented.
Table fragmentation occurs through the process of data modifications
(INSERT, UPDATE, and DELETE statements) made against the table. Because
these modifications are not ordinarily distributed equally among the
rows of the table, the fullness of each page can vary over time. For
queries that scan part or all of a table, such table fragmentation can
cause additional page reads. This hinders parallel scanning of data."

//Magnus


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Magnus Hagander <mha(at)sollentuna(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 13:57:08
Message-ID: 20061203135708.GA22386@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Dec 03, 2006 at 11:54:05AM +0100, Magnus Hagander wrote:
> Yes, they definitly do. That's why for exapmle SQLServer provides the
> commands "DBCC CHECKCONTIG" and "DBCC INDEXDEFRAG". Along with comments
> like this:
>
> "DBCC SHOWCONTIG determines whether the table is heavily fragmented.
> Table fragmentation occurs through the process of data modifications
> (INSERT, UPDATE, and DELETE statements) made against the table. Because
> these modifications are not ordinarily distributed equally among the
> rows of the table, the fullness of each page can vary over time. For
> queries that scan part or all of a table, such table fragmentation can
> cause additional page reads. This hinders parallel scanning of data."

But that's measuring something else I think. That's not looking at how
the pages are physically mapped on disk, but at how tuples are spread
across pages.. Maybe in sqlserver tuples can span pages?

That's not a kind of fragmentation relevent to postgresql. This is what
the FSM is for.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Mark Kirkwood" <markir(at)paradise(dot)net(dot)nz>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2006-12-03 14:22:20
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCEA0FD9D@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > "DBCC SHOWCONTIG determines whether the table is heavily fragmented.
> > Table fragmentation occurs through the process of data
> modifications
> > (INSERT, UPDATE, and DELETE statements) made against the table.
> > Because these modifications are not ordinarily distributed equally
> > among the rows of the table, the fullness of each page can
> vary over
> > time. For queries that scan part or all of a table, such table
> > fragmentation can cause additional page reads. This hinders
> parallel scanning of data."
>
> But that's measuring something else I think. That's not
> looking at how the pages are physically mapped on disk, but
> at how tuples are spread across pages.. Maybe in sqlserver
> tuples can span pages?

I don't beleive they can (except for IMAGE and TEXT data, which is
handled like our TOAST data).

That said, it returns two numbers:
Scan Density, which shows how many more pages it needs to hit than what
would be ideal
and
Locical Scan Fragmentation, which shows the percentage of "out-of-order
pages" it hits. And isn't "out-of-order pages" exactly what file system
fragmentation would leave us? The difference between the physical page
and the logical page location.

Given that they preallocate files, they only have this kind of
fragmentation at one level. Since we don't, we can have this both inside
the file and in the fliesystem. But it's still the same thing, isn't it?

//Magnus


From: jhaile(at)gmail(dot)com
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL win32 fragmentation issue
Date: 2007-01-16 20:51:06
Message-ID: 1168980665.522392.45920@a75g2000cwd.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have a production database that is running on Win32 NTFS with
PostgreSQL 8.2.1. The drive that the data is stored on is dedicated to
PostgreSQL data files (transaction log stored on a separate disk) The
data partition is currently 44% fragmented, which I believe is
resulting in performance degradation. I don't have any benchmarks
though.

"Joshua D. Drake" wrote:
> Hello,
>
> I ran a simple test... I ran defrag on my drives. Then I initialized
> pgbench with a -s 1000.
>
> 11% fragmentation. I dropped the bench database, and my fragmentation is 1%.
>
> I know this isn't *our* fault :) but I am curious if there is anything
> we can do about the way postgresql writes files to help limit fragmentation.
>
> Essentially, this makes win32 impossible in a 24x7 environment (jokes
> aside about Win32 in general) because we *have* to defrag on Windows and
> Windows won't defrag open files (thus anything PostgreSQL is using).
>
> Sincerely,
>
> Joshua D. Drake
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend