Re: Can postgres create a file with physically continuous blocks.

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Rob Wultsch <wultsch(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, flyusa2010 fly <flyusa2010(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Can postgres create a file with physically continuous blocks.
Date: 2010-12-22 07:15:17
Message-ID: 4D11A585.2080706@enterprisedb.com
Lists: pgsql-hackers

On 22.12.2010 03:45, Rob Wultsch wrote:
> On Tue, Dec 21, 2010 at 4:49 AM, Robert Haas<robertmhaas(at)gmail(dot)com> wrote:
>> On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasby<jim(at)nasby(dot)net> wrote:
>>> On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:
>>>> Does postgres make an effort to create a file with physically continuous blocks?
>>>
>>> AFAIK all files are expanded as needed. I don't think there are any flags you can pass to the filesystem to tell it "this file will eventually be 1GB in size". So, we're basically at the mercy of the FS to try and keep things contiguous.
>>
>> There have been some reports that we would do better on some
>> filesystems if we extended the file more than a block at a time, as we
>> do today. However, AFAIK, no one is pursuing this ATM.
>
> This has been found to be the case in the MySQL world, particularly
> when ext3 is in use:
> http://forge.mysql.com/worklog/task.php?id=4925
> http://www.facebook.com/note.php?note_id=194501560932

These seem to be about extending the transaction log, and we already
pre-allocate the WAL. The WAL is repeatedly fsync'd, so I can understand
that extending that in small chunks would hurt performance a lot, as the
filesystem needs to flush the metadata changes to disk at every commit.
However, that's not an issue with extending data files, because they are
only fsync'd at checkpoints.
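
To illustrate the pre-allocation idea, here is a minimal sketch in Python, not the actual WAL code: it creates a file at its full size up front by writing zeros and calls fsync once, so later writes into the file do not change its length. The function name, the 16MB segment size and the 8kB write size are assumptions chosen for the example.

```python
import os
import tempfile

SEGMENT_SIZE = 16 * 1024 * 1024  # assumed segment size, for illustration only
BLOCK_SIZE = 8192                # assumed size of each zero-filled write


def preallocate_segment(path, size=SEGMENT_SIZE, block=BLOCK_SIZE):
    """Create `path` at its full size by writing zeros, then fsync once.

    Hypothetical helper: returns the number of bytes written.
    """
    zeros = bytes(block)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        written = 0
        while written < size:
            # The last write may be shorter than a full block.
            written += os.write(fd, zeros[: min(block, size - written)])
        # One flush covers both the data and the file-size change.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        segment = os.path.join(tmpdir, "segment")
        preallocate_segment(segment)
        print(os.path.getsize(segment))
```

Once the file exists at full size, a later write followed by fsync overwrites existing bytes rather than growing the file, which is the property the paragraph above relies on.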

It might well be advantageous to extend data files in larger chunks too,
but it's probably nowhere near as important as with the WAL.
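
A rough sketch of what extending in larger chunks could look like, compared with extending one block at a time. This is not PostgreSQL's implementation; `extend_to`, `count_extensions`, the 8kB block size and the 8MB chunk size are all assumptions made for the example. It uses `os.posix_fallocate` where available and falls back to `os.ftruncate`, which only changes the file length and may leave the file sparse.

```python
import os
import tempfile

BLOCK_SIZE = 8192             # assumed block size
CHUNK_SIZE = 8 * 1024 * 1024  # assumed extension chunk size


def extend_to(fd, needed, chunk=CHUNK_SIZE):
    """Grow the file behind `fd` so that it holds at least `needed` bytes.

    The new size is rounded up to a multiple of `chunk`. Returns True if
    the file was extended, False if it was already large enough.
    """
    current = os.fstat(fd).st_size
    if needed <= current:
        return False
    target = -(-needed // chunk) * chunk  # round up to a chunk boundary
    try:
        # Ask the filesystem to allocate the new range in one call.
        os.posix_fallocate(fd, current, target - current)
    except (AttributeError, OSError):
        # Not available or not supported here: only change the length.
        os.ftruncate(fd, target)
    return True


def count_extensions(total_blocks, chunk):
    """Count how many times a file grows while `total_blocks` are added."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        extensions = 0
        for blk in range(1, total_blocks + 1):
            if extend_to(fd, blk * BLOCK_SIZE, chunk):
                extensions += 1
        return extensions


if __name__ == "__main__":
    print(count_extensions(256, BLOCK_SIZE))   # one block at a time
    print(count_extensions(256, 1024 * 1024))  # 1MB chunks
```

The trade-off is that a chunked file can temporarily hold up to one chunk of unused space at its end.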

> Also, InnoDB has an option for how much data should be allocated at
> the end of a tablespace when it needs to grow:
> http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path

Hmm, innodb_autoextend_increment seems more like what we're discussing
here
(http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment).
If I'm reading that correctly, InnoDB defaults to extending files in 8MB
chunks.
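
For a sense of scale, the arithmetic can be sketched as follows. The 1GB file size comes from the example earlier in the thread; the 8kB block size is an assumption for the one-block-at-a-time case, and the 8MB chunk is the InnoDB figure mentioned above.

```python
FILE_SIZE = 1024 ** 3         # 1GB file, as in the example earlier in the thread
BLOCK_SIZE = 8 * 1024         # assumed 8kB block
CHUNK_SIZE = 8 * 1024 * 1024  # 8MB chunk


def extensions_needed(file_size, step):
    """Number of extension operations to reach `file_size` in `step`-sized pieces."""
    return -(-file_size // step)  # ceiling division


if __name__ == "__main__":
    print(extensions_needed(FILE_SIZE, BLOCK_SIZE))  # 131072
    print(extensions_needed(FILE_SIZE, CHUNK_SIZE))  # 128
```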

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
