Re: WAL logging volume and CREATE TABLE

Lists: pgsql-hackers
From: Bruce Momjian <bruce(at)momjian(dot)us>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: WAL logging volume and CREATE TABLE
Date: 2011-08-02 13:34:56
Message-ID: 201108021334.p72DYuK08048@momjian.us

Our docs suggest an optimization to reduce WAL logging when you are
creating and populating a table:

http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS

In minimal level, WAL-logging of some bulk operations, like CREATE
INDEX, CLUSTER and COPY on a table that was created or truncated in the
same transaction can be safely skipped, which can make those operations
much faster (see Section 14.4.7). But minimal WAL does not contain
enough information to reconstruct the data from a base backup and the
WAL logs, so either archive or hot_standby level must be used to enable
WAL archiving (archive_mode) and streaming replication.

I am confused why we issue significant WAL traffic for CREATE INDEX?
Isn't the index either created or removed if the transaction fails?
What crash recovery activity state do we need WAL logging for? I
realize we have to do WAL logging for streaming replication, but CREATE
TABLE isn't going to affect that. I also realize the index has to be
on disk on commit, but the same is true for doing the CREATE TABLE in
the same transaction block.

Does this optimization work for INSERT ... SELECT? Is this optimization
automatic for CREATE TABLE AS (SELECT INTO)?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-02 14:02:05
Message-ID: CAHyXU0zetcSka569P7DVk_SY6CUu7=FGkJqiEnpRahs3to_c5g@mail.gmail.com

On Tue, Aug 2, 2011 at 8:34 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Our docs suggest an optimization to reduce WAL logging when you are
> creating and populating a table:
>
>        http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
>
>        In minimal level, WAL-logging of some bulk operations, like CREATE
>        INDEX, CLUSTER and COPY on a table that was created or truncated in the
>        same transaction can be safely skipped, which can make those operations
>        much faster (see Section 14.4.7). But minimal WAL does not contain
>        enough information to reconstruct the data from a base backup and the
>        WAL logs, so either archive or hot_standby level must be used to enable
>        WAL archiving (archive_mode) and streaming replication.
>
> I am confused why we issue significant WAL traffic for CREATE INDEX?
> Isn't the index either created or removed if the transaction fails?
> What crash recovery activity state do we need WAL logging for?  I
> realize we have to do WAL logging for streaming replication, but CREATE
> TABLE isn't going to affect that.   I also realize the index has to be
> on disk on commit, but the same is true for doing the CREATE TABLE in
> the same transaction block.
>
> Does this optimization work for INSERT ... SELECT?

I don't think so -- insert/select doesn't take a full table lock and
it writes to the heap. The optimization only works when other
backends will never see/touch the data being written out until it is
finished and it doesn't matter if the data is scrambled due to a
crash. CREATE INDEX might work though.

merlin


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-02 14:27:13
Message-ID: 4E380941.8080901@enterprisedb.com

On 02.08.2011 16:34, Bruce Momjian wrote:
> Our docs suggest an optimization to reduce WAL logging when you are
> creating and populating a table:
>
> http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
>
> In minimal level, WAL-logging of some bulk operations, like CREATE
> INDEX, CLUSTER and COPY on a table that was created or truncated in the
> same transaction can be safely skipped, which can make those operations
> much faster (see Section 14.4.7). But minimal WAL does not contain
> enough information to reconstruct the data from a base backup and the
> WAL logs, so either archive or hot_standby level must be used to enable
> WAL archiving (archive_mode) and streaming replication.
>
> I am confused why we issue significant WAL traffic for CREATE INDEX?
> Isn't the index either created or removed if the transaction fails?
> What crash recovery activity state do we need WAL logging for? I
> realize we have to do WAL logging for streaming replication, but CREATE
> TABLE isn't going to affect that. I also realize the index has to be
> on disk on commit, but the same is true for doing the CREATE TABLE in
> the same transaction block.

I'm confused about what you're confused about. Crash recovery doesn't
need the WAL for CREATE INDEX, but WAL archiving does.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-02 14:52:44
Message-ID: 19345.1312296764@sss.pgh.pa.us

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Our docs suggest an optimization to reduce WAL logging when you are
> creating and populating a table:

> http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS

> In minimal level, WAL-logging of some bulk operations, like CREATE
> INDEX, CLUSTER and COPY on a table that was created or truncated in the
> same transaction can be safely skipped, which can make those operations
> much faster (see Section 14.4.7). But minimal WAL does not contain
> enough information to reconstruct the data from a base backup and the
> WAL logs, so either archive or hot_standby level must be used to enable
> WAL archiving (archive_mode) and streaming replication.

> I am confused why we issue significant WAL traffic for CREATE INDEX?

The point is that in minimal level we *don't*. We just fsync the index
file before committing. In higher levels we have to write the whole
index contents to the WAL, not only the disk file, so that the info
reaches the archive or standby slaves.

Same for the other cases.

regards, tom lane
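
The trade-off described in this reply can be written down as a small lookup table. The Python below is only a sketch of the three 9.0-era wal_level values named in this thread; `WalLevelTraits` and `traits_for` are hypothetical names chosen for illustration and are not part of PostgreSQL.

```python
from dataclasses import dataclass

# Illustration only: models the wal_level values discussed in this thread
# (PostgreSQL 9.0). WalLevelTraits and traits_for are hypothetical names.


@dataclass(frozen=True)
class WalLevelTraits:
    # True when eligible bulk operations can skip WAL and just fsync their file.
    bulk_ops_may_skip_wal: bool
    # True when the WAL carries enough to feed an archive or a standby.
    supports_wal_archiving: bool


_TRAITS = {
    "minimal": WalLevelTraits(bulk_ops_may_skip_wal=True, supports_wal_archiving=False),
    "archive": WalLevelTraits(bulk_ops_may_skip_wal=False, supports_wal_archiving=True),
    "hot_standby": WalLevelTraits(bulk_ops_may_skip_wal=False, supports_wal_archiving=True),
}


def traits_for(wal_level: str) -> WalLevelTraits:
    """Return the trade-off for a wal_level value, as described in the thread."""
    try:
        return _TRAITS[wal_level]
    except KeyError:
        raise ValueError(f"unknown wal_level: {wal_level!r}") from None


if __name__ == "__main__":
    for level in _TRAITS:
        print(level, traits_for(level))
```

The two flags are opposites for every level listed here, which is the point of the reply: the WAL volume saved in minimal level is exactly the information an archive or standby would need.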


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-02 15:30:30
Message-ID: 201108021530.p72FUUl29245@momjian.us

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Our docs suggest an optimization to reduce WAL logging when you are
> > creating and populating a table:
>
> > http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
>
> > In minimal level, WAL-logging of some bulk operations, like CREATE
> > INDEX, CLUSTER and COPY on a table that was created or truncated in the
> > same transaction can be safely skipped, which can make those operations
> > much faster (see Section 14.4.7). But minimal WAL does not contain
> > enough information to reconstruct the data from a base backup and the
> > WAL logs, so either archive or hot_standby level must be used to enable
> > WAL archiving (archive_mode) and streaming replication.
>
> > I am confused why we issue significant WAL traffic for CREATE INDEX?
>
> The point is that in minimal level we *don't*. We just fsync the index
> file before committing. In higher levels we have to write the whole
> index contents to the WAL, not only the disk file, so that the info
> reaches the archive or standby slaves.
>
> Same for the other cases.

I realize the need for WAL logging CREATE INDEX for non-'minimal'
wal_level values.

But the documentation states the WAL logging is reduced for CREATE INDEX
by doing CREATE TABLE in the same transaction block. Why is this true?
Why would the CREATE TABLE affect the "CREATE INDEX" WAL volume?

I am wondering if the documentation is correct about CLUSTER and COPY, but
incorrect for CREATE INDEX.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-02 15:32:57
Message-ID: 201108021532.p72FWvQ29530@momjian.us

Merlin Moncure wrote:
> On Tue, Aug 2, 2011 at 8:34 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Our docs suggest an optimization to reduce WAL logging when you are
> > creating and populating a table:
> >
> >        http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
> >
> >        In minimal level, WAL-logging of some bulk operations, like CREATE
> >        INDEX, CLUSTER and COPY on a table that was created or truncated in the
> >        same transaction can be safely skipped, which can make those operations
> >        much faster (see Section 14.4.7). But minimal WAL does not contain
> >        enough information to reconstruct the data from a base backup and the
> >        WAL logs, so either archive or hot_standby level must be used to enable
> >        WAL archiving (archive_mode) and streaming replication.
> >
> > I am confused why we issue significant WAL traffic for CREATE INDEX?
> > Isn't the index either created or removed if the transaction fails?
> > What crash recovery activity state do we need WAL logging for?  I
> > realize we have to do WAL logging for streaming replication, but CREATE
> > TABLE isn't going to affect that.   I also realize the index has to be
> > on disk on commit, but the same is true for doing the CREATE TABLE in
> > the same transaction block.
> >
> > Does this optimization work for INSERT ... SELECT?
>
> I don't think so -- insert/select doesn't take a full table lock and
> it writes to the heap. The optimization only works when other

My question is whether INSERT ... SELECT is/could be optimized when the
CREATE TABLE happens in the same transaction block.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-02 15:54:24
Message-ID: 2715.1312300464@sss.pgh.pa.us

Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>> In minimal level, WAL-logging of some bulk operations, like CREATE
>>> INDEX, CLUSTER and COPY on a table that was created or truncated in the
>>> same transaction can be safely skipped, which can make those operations
>>> much faster (see Section 14.4.7).

> But the documentation states the WAL logging is reduced for CREATE INDEX
> by doing CREATE TABLE in the same transaction block. Why is this true?

It's not true, and it doesn't say that, or at least doesn't intend to
say that. That sentence is meant to be read as:

1. The optimization applies to CREATE INDEX.
2. The optimization applies to CLUSTER or COPY on a table that was
created or truncated in the current transaction.

I now see your point, which is that the sentence is easily misparsed.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-03 02:30:39
Message-ID: CA+TgmoZbp4Up2i=JbWTja3CK6sB+iaN0JuGsX-szg9M5V1dX5Q@mail.gmail.com

On Tue, Aug 2, 2011 at 11:30 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Tom Lane wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> > Our docs suggest an optimization to reduce WAL logging when you are
>> > creating and populating a table:
>>
>> >     http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
>>
>> >     In minimal level, WAL-logging of some bulk operations, like CREATE
>> >     INDEX, CLUSTER and COPY on a table that was created or truncated in the
>> >     same transaction can be safely skipped, which can make those operations
>> >     much faster (see Section 14.4.7). But minimal WAL does not contain
>> >     enough information to reconstruct the data from a base backup and the
>> >     WAL logs, so either archive or hot_standby level must be used to enable
>> >     WAL archiving (archive_mode) and streaming replication.
>>
>> > I am confused why we issue significant WAL traffic for CREATE INDEX?
>>
>> The point is that in minimal level we *don't*.  We just fsync the index
>> file before committing.  In higher levels we have to write the whole
>> index contents to the WAL, not only the disk file, so that the info
>> reaches the archive or standby slaves.
>>
>> Same for the other cases.
>
> I realize the need for WAL logging CREATE INDEX for non-'minimal'
> wal_level values.
>
> But the documentation states the WAL logging is reduced for CREATE INDEX
> by doing CREATE TABLE in the same transaction block.  Why is this true?
> Why would the CREATE TABLE affect the "CREATE INDEX" WAL volume?
>
> I am wondering if the documentation is correct about CLUSTER and COPY, but
> incorrect for CREATE INDEX.

I think the problem here might be ambiguous wording. I believe that
the modifier "on a table that was created or truncated in the same
transaction" is intended to apply only to "COPY", but the way it's
written, someone (such as you) might be forgiven for thinking that it
applied to the larger phrase "CREATE INDEX, CLUSTER, or COPY".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-03 02:46:55
Message-ID: 201108030246.p732kto14874@momjian.us

Robert Haas wrote:
> On Tue, Aug 2, 2011 at 11:30 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Tom Lane wrote:
> >> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> >> > Our docs suggest an optimization to reduce WAL logging when you are
> >> > creating and populating a table:
> >>
> >> >     http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS
> >>
> >> >     In minimal level, WAL-logging of some bulk operations, like CREATE
> >> >     INDEX, CLUSTER and COPY on a table that was created or truncated in the
> >> >     same transaction can be safely skipped, which can make those operations
> >> >     much faster (see Section 14.4.7). But minimal WAL does not contain
> >> >     enough information to reconstruct the data from a base backup and the
> >> >     WAL logs, so either archive or hot_standby level must be used to enable
> >> >     WAL archiving (archive_mode) and streaming replication.
> >>
> >> > I am confused why we issue significant WAL traffic for CREATE INDEX?
> >>
> >> The point is that in minimal level we *don't*.  We just fsync the index
> >> file before committing.  In higher levels we have to write the whole
> >> index contents to the WAL, not only the disk file, so that the info
> >> reaches the archive or standby slaves.
> >>
> >> Same for the other cases.
> >
> > I realize the need for WAL logging CREATE INDEX for non-'minimal'
> > wal_level values.
> >
> > But the documentation states the WAL logging is reduced for CREATE INDEX
> > by doing CREATE TABLE in the same transaction block.  Why is this true?
> > Why would the CREATE TABLE affect the "CREATE INDEX" WAL volume?
> >
> > I am wondering if the documentation is correct about CLUSTER and COPY, but
> > incorrect for CREATE INDEX.
>
> I think the problem here might be ambiguous wording. I believe that
> the modifier "on a table that was created or truncated in the same
> transaction" is intended to apply only to "COPY", but the way it's
> written, someone (such as you) might be forgiven for thinking that it
> applied to the larger phrase "CREATE INDEX, CLUSTER, or COPY".

I have created a documentation patch to clarify this, and to mention
CREATE TABLE AS which also has this optimization.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment: /pgpatches/wal_level (text/x-diff, 1.6 KB)

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-03 03:53:12
Message-ID: 1312343425-sup-9110@alvh.no-ip.org

Excerpts from Bruce Momjian's message of Tue Aug 02 22:46:55 -0400 2011:

> I have created a documentation patch to clarify this, and to mention
> CREATE TABLE AS which also has this optimization.

It doesn't seem particularly better to me. How about something like

In minimal level, WAL-logging of some operations can be safely skipped,
which can make those operations much faster (see <blah>). Operations on
which this optimization can be applied include:
<simplelist>
<item>CREATE INDEX</item>
<item>CLUSTER</item>
<item>CREATE TABLE AS</item>
<item>COPY, on tables that were created or truncated in the same
transaction</item>
</simplelist>

Minimal WAL does not contain enough information to reconstruct the data
from a base backup and the WAL logs, so either <literal>archive</> or
<literal>hot_standby</> level must be used to enable ...

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
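
The list proposed in this message reads naturally as a decision rule. The following Python is a minimal sketch of that rule and nothing more: `can_skip_wal` and its parameters are hypothetical names, not a PostgreSQL API, and the real server makes this decision internally.

```python
# Illustration only: encodes the operation list proposed in this message.
# can_skip_wal and its parameter names are hypothetical, not PostgreSQL code.

# Operations listed without any extra condition.
UNCONDITIONAL_OPS = frozenset({"CREATE INDEX", "CLUSTER", "CREATE TABLE AS"})


def can_skip_wal(operation: str, wal_level: str,
                 table_created_or_truncated_in_txn: bool = False) -> bool:
    """Return True if the proposed doc text says WAL-logging can be skipped."""
    # The optimization is described only for the minimal level.
    if wal_level != "minimal":
        return False

    # Normalize case and whitespace so "create  index" matches "CREATE INDEX".
    op = " ".join(operation.upper().split())

    if op in UNCONDITIONAL_OPS:
        return True
    if op == "COPY":
        # COPY qualifies only for a table created or truncated in the
        # same transaction.
        return table_created_or_truncated_in_txn

    # Anything not on the list (for example INSERT ... SELECT, which an
    # earlier reply in the thread doubts is optimized) is treated as logged.
    return False


if __name__ == "__main__":
    print(can_skip_wal("COPY", "minimal", table_created_or_truncated_in_txn=True))
    print(can_skip_wal("COPY", "minimal"))
    print(can_skip_wal("CREATE INDEX", "archive"))
```

Written this way, the condition on table creation attaches visibly to COPY alone, which is the ambiguity the reworded documentation is meant to remove.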


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-03 18:52:01
Message-ID: 201108031852.p73Iq1h23402@momjian.us

Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of Tue Aug 02 22:46:55 -0400 2011:
>
> > I have created a documentation patch to clarify this, and to mention
> > CREATE TABLE AS which also has this optimization.
>
> It doesn't seem particularly better to me. How about something like
>
> In minimal level, WAL-logging of some operations can be safely skipped,
> which can make those operations much faster (see <blah>). Operations on
> which this optimization can be applied include:
> <simplelist>
> <item>CREATE INDEX</item>
> <item>CLUSTER</item>
> <item>CREATE TABLE AS</item>
> <item>COPY, on tables that were created or truncated in the same
> transaction</item>
> </simplelist>
>
> Minimal WAL does not contain enough information to reconstruct the data
> from a base backup and the WAL logs, so either <literal>archive</> or
> <literal>hot_standby</> level must be used to enable ...

Good idea --- updated patch attached.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment: /pgpatches/wal_level (text/x-diff, 1.7 KB)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging volume and CREATE TABLE
Date: 2011-08-04 16:07:00
Message-ID: 201108041607.p74G70i13662@momjian.us


Patch applied.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > Excerpts from Bruce Momjian's message of Tue Aug 02 22:46:55 -0400 2011:
> >
> > > I have created a documentation patch to clarify this, and to mention
> > > CREATE TABLE AS which also has this optimization.
> >
> > It doesn't seem particularly better to me. How about something like
> >
> > In minimal level, WAL-logging of some operations can be safely skipped,
> > which can make those operations much faster (see <blah>). Operations on
> > which this optimization can be applied include:
> > <simplelist>
> > <item>CREATE INDEX</item>
> > <item>CLUSTER</item>
> > <item>CREATE TABLE AS</item>
> > <item>COPY, on tables that were created or truncated in the same
> > transaction</item>
> > </simplelist>
> >
> > Minimal WAL does not contain enough information to reconstruct the data
> > from a base backup and the WAL logs, so either <literal>archive</> or
> > <literal>hot_standby</> level must be used to enable ...
>
> Good idea --- updated patch attached.
>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> EnterpriseDB http://enterprisedb.com
>
> + It's impossible for everything to be true. +


--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +