Re: NOLOGGING option, or ?

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: NOLOGGING option, or ?
Date: 2005-05-31 23:01:43
Message-ID: 1117580503.3844.785.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Recent test results have shown a substantial performance improvement
(+25%) if WAL logging is disabled for large COPY statements. This is to
be expected, though has a price attached: losing the ability to crash
recover data loaded in this manner.

There are two parts to this proposal. First, when and whether to do this
at all. Second, syntax and invocation.

Why?

Performance.

The performance gain has a price and so should only be enabled if
requested explicitly by the user. It is up to the user whether they
accept this price, since in many useful cases it is a small price
against a huge saving.

The price is that if a crash occurs, then any table that was not empty
to begin with would not be in a transactionally consistent state
following crash recovery. It may have data in it, but it would be up to
the user to determine whether that was satisfactory or not. It could be
possible to sense what to do in this situation automatically, by putting
the table into a needs-recovery type state... I don't propose to handle
this *at this stage*.

Syntax and invocation:

Previously I had discussed adding a NOLOGGING option onto both COPY and
CREATE TABLE AS SELECT that would bypass the creation of wal logging
data. That is still a possibility, but would require manual code changes
to much of the SQL submitted.

Now, I would like to discuss adding an enable_logging USERSET GUC, that
would apply *only* to COPY and CREATE TABLE AS SELECT. The default of
this would be false.

How can we gain this performance benefit for those willing to accept the
restrictions imposed?

Your comments are sought and are most welcome.

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 02:47:30
Message-ID: 9361.1117594050@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> Recent test results have shown a substantial performance improvement
> (+25%) if WAL logging is disabled for large COPY statements.

How much of that is left after we fix the 64-bit-CRC issue?

> Now, I would like to discuss adding an enable_logging USERSET GUC,

[ fear and loathing ... ]

I don't like the idea of a GUC at all, and USERSET is right out.
I think it would have to be system-wide (cf fsync) to be even
implementable let alone somewhat predictable. Even if it could
be done per-backend with reasonable semantics, random users should
not get to make that decision --- it should be the DBA's call,
which means it needs at least SUSET permissions.

BTW, I'm sure you are the last one who needs to be reminded that
any such thing breaks PITR completely. Which is surely sufficient
reason not to let it be USERSET.

regards, tom lane


From: Alvaro Herrera <alvherre(at)surnet(dot)cl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 04:40:07
Message-ID: 20050601044007.GD19034@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, May 31, 2005 at 10:47:30PM -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > Recent test results have shown a substantial performance improvement
> > (+25%) if WAL logging is disabled for large COPY statements.

> BTW, I'm sure you are the last one who needs to be reminded that
> any such thing breaks PITR completely. Which is surely sufficient
> reason not to let it be USERSET.

This doesn't work for COPY, but maybe for CREATE TABLE AS we could log
the fact that the command was executed, so the replayer could execute
the same command again.

Of course, this handwaving doesn't explain how the system in recovery
mode would be able to execute a full query to reconstruct the table, and
also it doesn't say a lot about the extra complexity at the source level
to implement this option.

For people loading big files into the database, maybe we could think
about a command to let a file be loaded directly as initial table
content. So all that we'd need is a program to write the file, which
could be done externally (The filewriter would have to have access to
the catalog and input functions for the involved types, though I think
for simple types it would be straighforward ... we could write frozen
tuples to avoid TransactionId problems.)

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
www.google.com: interfaz de línea de comando para la web.


From: Neil Conway <neilc(at)samurai(dot)com>
To: Alvaro Herrera <alvherre(at)surnet(dot)cl>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 05:00:04
Message-ID: 1117602004.6678.70.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 00:40 -0400, Alvaro Herrera wrote:
> This doesn't work for COPY, but maybe for CREATE TABLE AS we could log
> the fact that the command was executed, so the replayer could execute
> the same command again.
>
> Of course, this handwaving doesn't explain how the system in recovery
> mode would be able to execute a full query to reconstruct the table

There's also the typical problem with this kind of approach: how do you
handle non-deterministic queries? (e.g. "CREATE TABLE ... AS SELECT
random(), gettimeofday(), some_func(t1.x) FROM t1 LIMIT 5")

-Neil


From: Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 06:25:42
Message-ID: 429D54E6.4010005@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Personally I don't think that it is a good idea to do that.
People will tend to corrupt their systems because they want speed
(sometimes without thinking about the consequences).

I can only think of one scenario where nologging would actually make
sense: Many people use session tables to keep track of user level
information on a website. corrupting a session table (usually not very
large) would not cause a lot of problems.

Doing it for COPY would be fatal. I can tell you from experience that
80% of all users will use that if the manual says that PostgreSQL will
beform better this way. This is a key feature to make people think that
PostgreSQL is reliable.

Best regards,

Hans

Simon Riggs wrote:
> Recent test results have shown a substantial performance improvement
> (+25%) if WAL logging is disabled for large COPY statements. This is to
> be expected, though has a price attached: losing the ability to crash
> recover data loaded in this manner.
>
> There are two parts to this proposal. First, when and whether to do this
> at all. Second, syntax and invocation.
>
> Why?
>
> Performance.
>
> The performance gain has a price and so should only be enabled if
> requested explicitly by the user. It is up to the user whether they
> accept this price, since in many useful cases it is a small price
> against a huge saving.
>
> The price is that if a crash occurs, then any table that was not empty
> to begin with would not be in a transactionally consistent state
> following crash recovery. It may have data in it, but it would be up to
> the user to determine whether that was satisfactory or not. It could be
> possible to sense what to do in this situation automatically, by putting
> the table into a needs-recovery type state... I don't propose to handle
> this *at this stage*.
>
> Syntax and invocation:
>
> Previously I had discussed adding a NOLOGGING option onto both COPY and
> CREATE TABLE AS SELECT that would bypass the creation of wal logging
> data. That is still a possibility, but would require manual code changes
> to much of the SQL submitted.
>
> Now, I would like to discuss adding an enable_logging USERSET GUC, that
> would apply *only* to COPY and CREATE TABLE AS SELECT. The default of
> this would be false.
>
> How can we gain this performance benefit for those willing to accept the
> restrictions imposed?
>
> Your comments are sought and are most welcome.
>
> Best Regards, Simon Riggs
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend

--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/664/393 39 74
www.cybertec.at, www.postgresql.at


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 08:16:46
Message-ID: 1117613806.3844.866.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2005-05-31 at 22:47 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > Recent test results have shown a substantial performance improvement
> > (+25%) if WAL logging is disabled for large COPY statements.
>
> How much of that is left after we fix the 64-bit-CRC issue?

Well, I don't know. The I/O is the main thing I'm trying to avoid.

> > Now, I would like to discuss adding an enable_logging USERSET GUC,
>
> [ fear and loathing ... ]

OK. I needed to say the idea, to make sure we had considered it. I now
pronounce it dead and buried.

> BTW, I'm sure you are the last one who needs to be reminded that
> any such thing breaks PITR completely. Which is surely sufficient
> reason not to let it be USERSET.

You're right, thank you. But I do need to be reminded to say "this would
only apply when archiving is not enabled" just as the other two existing
WAL-avoidance optimisations do.

In answer to the other points raised, the main use case for this
optimisation is to improve data load performance into an empty table.
This is a very timeconsuming stage on a big warehouse and needs
performance optimization. I agree with Hans-Jurgen that this is a
dangerous option for use on general COPY commands, since these can be
used on empty and already populated tables. I seek a way to improve the
main use case though without leaving any danger in other situations.

I have two suggested approaches:
1. Introduce a new LOAD command that only works on empty tables.
Following a crash, the table is dropped and the user accepts that the
action-on-recovery is to reload the table. (Though in PITR mode, the
data would be logged).

2. Introduce NOT LOGGED INITIALLY mode, as DB2 has done. The first COPY
into an empty table would avoid WAL logging, if the user invokes that
option on the specific COPY command.

There are some other arguments in favour of a LOAD command.... Alon?

Best Regards, Simon Riggs


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Alvaro Herrera <alvherre(at)surnet(dot)cl>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 08:21:39
Message-ID: 87acmav65o.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Neil Conway <neilc(at)samurai(dot)com> writes:

> On Wed, 2005-06-01 at 00:40 -0400, Alvaro Herrera wrote:
> > This doesn't work for COPY, but maybe for CREATE TABLE AS we could log
> > the fact that the command was executed, so the replayer could execute
> > the same command again.
> >
> > Of course, this handwaving doesn't explain how the system in recovery
> > mode would be able to execute a full query to reconstruct the table
>
> There's also the typical problem with this kind of approach: how do you
> handle non-deterministic queries? (e.g. "CREATE TABLE ... AS SELECT
> random(), gettimeofday(), some_func(t1.x) FROM t1 LIMIT 5")

For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
records at all. If it fails in the middle you just drop the table. When it
completes you do a checkpoint before acknowledging the COMMIT.

I think this is already done for CREATE INDEX/REINDEX, also only in the
non-PITR case.

--
greg


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 08:34:38
Message-ID: 429D731E.1030404@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> There are some other arguments in favour of a LOAD command.... Alon?

We already have LOAD, so you'll have to choose something else :)

Chris


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Greg Stark <gsstark(at)MIT(dot)EDU>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 08:44:24
Message-ID: 874qciv53r.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Greg Stark <gsstark(at)MIT(dot)EDU> writes:

> For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
> records at all. If it fails in the middle you just drop the table. When it
> completes you do a checkpoint before acknowledging the COMMIT.
>
> I think this is already done for CREATE INDEX/REINDEX, also only in the
> non-PITR case.

Sorry to followup to my own message, but it occurs to me that COPY could be
made to automatically do this for the case of an empty destination table too.

I'm not sure if it should automatically check for an empty table or if there
should be an option for the user to indicate he wants COPY to replace the
current contents entirely. The latter might actually be more useful. .

But either way, you just WAL log a record indicating that the table should be
entirely empty. Then you fill it up without logging anything. Do a checkpoint
and then WAL log that the COPY is finished. If any failure occurs replay
leaves it empty.

Again this sadly only works in the non-PITR case.

--
greg


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 09:54:58
Message-ID: 1117619698.4772.14.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On K, 2005-06-01 at 09:16 +0100, Simon Riggs wrote:
> On Tue, 2005-05-31 at 22:47 -0400, Tom Lane wrote:
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > > Recent test results have shown a substantial performance improvement
> > > (+25%) if WAL logging is disabled for large COPY statements.
> >
> > How much of that is left after we fix the 64-bit-CRC issue?
>
> Well, I don't know. The I/O is the main thing I'm trying to avoid.

While avoiding IO is a good thing in general, WAL IO traffic can at
least easily made parallel to other IO by allocating own disk for WAL.

> > > Now, I would like to discuss adding an enable_logging USERSET GUC,
> >
> > [ fear and loathing ... ]
>
> OK. I needed to say the idea, to make sure we had considered it. I now
> pronounce it dead and buried.
>
> > BTW, I'm sure you are the last one who needs to be reminded that
> > any such thing breaks PITR completely.

I don't think we do any WAlling of TEMP tables, so it may be easy to
extend this to any table with 'NO_WAL' bit set.

That would create kind of 'extended temp table' - unsafe but fast ;)

--
Hannu Krosing <hannu(at)skype(dot)net>


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 11:54:19
Message-ID: 1117626859.3844.945.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 16:34 +0800, Christopher Kings-Lynne wrote:
> > There are some other arguments in favour of a LOAD command.... Alon?
>
> We already have LOAD, so you'll have to choose something else :)

Its annoying, I grant you. :-)

LOAD 'library' would still need to be the default.

LOAD LIBRARY 'library' would be the new recommended usage.

LOAD DATA... would be the new command... with most other options hanging
off of that. There's no problem with that, since that is then the same
as Oracle syntax for the load utility.

Best Regards, Simon Riggs


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 12:03:25
Message-ID: 1117627405.3844.950.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 04:44 -0400, Greg Stark wrote:
> Greg Stark <gsstark(at)MIT(dot)EDU> writes:
>
> > For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
> > records at all. If it fails in the middle you just drop the table. When it
> > completes you do a checkpoint before acknowledging the COMMIT.
> >
> > I think this is already done for CREATE INDEX/REINDEX, also only in the
> > non-PITR case.
>
> Sorry to followup to my own message, but it occurs to me that COPY could be
> made to automatically do this for the case of an empty destination table too.
>
> I'm not sure if it should automatically check for an empty table or if there
> should be an option for the user to indicate he wants COPY to replace the
> current contents entirely. The latter might actually be more useful. .
>
> But either way, you just WAL log a record indicating that the table should be
> entirely empty. Then you fill it up without logging anything. Do a checkpoint
> and then WAL log that the COPY is finished. If any failure occurs replay
> leaves it empty.
>
> Again this sadly only works in the non-PITR case.

Yes, all of the above could work.

It would use essentially the same functionality that Manfred suggested
for handling truncated tables. Ignore the first LOAD DATA started
message until recovery completes, then truncate table if the LOAD DATA
complete message was not logged in wal.

Best Regards, Simon Riggs


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 12:10:55
Message-ID: 1117627855.4772.21.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On K, 2005-06-01 at 00:01 +0100, Simon Riggs wrote:
> Recent test results have shown a substantial performance improvement
> (+25%) if WAL logging is disabled for large COPY statements. This is to
> be expected, though has a price attached: losing the ability to crash
> recover data loaded in this manner.

Not only recover the DB itself but also having a hot standby (and
hopefully a read-only replica some time in the future).

> There are two parts to this proposal. First, when and whether to do this
> at all. Second, syntax and invocation.

I think this should be a decision done when creating a table, just like
TEMP tables. So you always know if a certain table is or is not
safe/replicated/recoverable.

This has also the advantage of requiring no changes to actual COPY and
INSERT commands.

--
Hannu Krosing <hannu(at)skype(dot)net>


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 13:39:07
Message-ID: 200506011339.j51Dd7W14937@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Wed, 2005-06-01 at 04:44 -0400, Greg Stark wrote:
> > Greg Stark <gsstark(at)MIT(dot)EDU> writes:
> >
> > > For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
> > > records at all. If it fails in the middle you just drop the table. When it
> > > completes you do a checkpoint before acknowledging the COMMIT.
> > >
> > > I think this is already done for CREATE INDEX/REINDEX, also only in the
> > > non-PITR case.
> >
> > Sorry to followup to my own message, but it occurs to me that COPY could be
> > made to automatically do this for the case of an empty destination table too.
> >
> > I'm not sure if it should automatically check for an empty table or if there
> > should be an option for the user to indicate he wants COPY to replace the
> > current contents entirely. The latter might actually be more useful. .
> >
> > But either way, you just WAL log a record indicating that the table should be
> > entirely empty. Then you fill it up without logging anything. Do a checkpoint
> > and then WAL log that the COPY is finished. If any failure occurs replay
> > leaves it empty.
> >
> > Again this sadly only works in the non-PITR case.
>
> Yes, all of the above could work.
>
> It would use essentially the same functionality that Manfred suggested
> for handling truncated tables. Ignore the first LOAD DATA started
> message until recovery completes, then truncate table if the LOAD DATA
> complete message was not logged in wal.

Well, why not just add this functionality to COPY rather than create a
new command? One optimization is to write the dirty shared buffers to
the kernel then fsync that relation, rather than do a checkpoint.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 13:58:53
Message-ID: 13424.1117634333@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> Sorry to followup to my own message, but it occurs to me that COPY could be
> made to automatically do this for the case of an empty destination table too.

Not unless you are proposing to change COPY to acquire a lock strong
enough to lock out other writers to the table for the duration ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 14:01:06
Message-ID: 13447.1117634466@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Wed, 2005-06-01 at 16:34 +0800, Christopher Kings-Lynne wrote:
> There are some other arguments in favour of a LOAD command.... Alon?
>>
>> We already have LOAD, so you'll have to choose something else :)

> Its annoying, I grant you. :-)

> LOAD 'library' would still need to be the default.

> LOAD LIBRARY 'library' would be the new recommended usage.

> LOAD DATA... would be the new command... with most other options hanging
> off of that. There's no problem with that, since that is then the same
> as Oracle syntax for the load utility.

Uh, what's wrong with adding an option to COPY? Not like it hasn't got
a ton of 'em already. The Oracle-compatibility angle doesn't interest
me at all, mainly because I find it highly improbable that we'd be exactly
compatible anyway.

regards, tom lane


From: Jochem van Dieten <jochemd(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 14:07:37
Message-ID: f96a9b8305060107074de65307@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01 Jun 2005 04:44:24 -0400, Greg Stark wrote:
> Greg Stark writes:
>>
>> For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
>> records at all. If it fails in the middle you just drop the table. When it
>> completes you do a checkpoint before acknowledging the COMMIT.
>>
>> I think this is already done for CREATE INDEX/REINDEX, also only in the
>> non-PITR case.

Checkpoint or fsync?

> Sorry to followup to my own message, but it occurs to me that COPY could be
> made to automatically do this for the case of an empty destination table too.

Why only on an empty table? What is the problem with bypassing WAL on
any table as long as all files of that table are fsync'ed before
commit?

> Again this sadly only works in the non-PITR case.

Apart from that problem of course :)

Jochem


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 14:12:18
Message-ID: 200506011412.j51ECIT20344@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > Sorry to followup to my own message, but it occurs to me that COPY could be
> > made to automatically do this for the case of an empty destination table too.
>
> Not unless you are proposing to change COPY to acquire a lock strong
> enough to lock out other writers to the table for the duration ...

Well, if the table is initally empty, what harm is there in locking the
table? How many people query the table while it is being loaded, and
because the transaction isn't committed, the table is empty to everyone
else anyway.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jochem van Dieten <jochemd(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 14:14:42
Message-ID: 200506011414.j51EEge20648@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jochem van Dieten wrote:
> On 01 Jun 2005 04:44:24 -0400, Greg Stark wrote:
> > Greg Stark writes:
> >>
> >> For CREATE TABLE AS in the non-PITR case you don't really need to WAL log the
> >> records at all. If it fails in the middle you just drop the table. When it
> >> completes you do a checkpoint before acknowledging the COMMIT.
> >>
> >> I think this is already done for CREATE INDEX/REINDEX, also only in the
> >> non-PITR case.
>
> Checkpoint or fsync?
>
>
> > Sorry to followup to my own message, but it occurs to me that COPY could be
> > made to automatically do this for the case of an empty destination table too.
>
> Why only on an empty table? What is the problem with bypassing WAL on
> any table as long as all files of that table are fsync'ed before
> commit?

Because adding rows to a table might modify existing pages, and if the
COPY fails, you have to restore those pages to a consistent state, and
make sure they are recovered for partial page writes, which we can't do
without WAL. With an initially empty table, you can just throw away the
file system file.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 14:18:25
Message-ID: 13658.1117635505@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom Lane wrote:
>> Not unless you are proposing to change COPY to acquire a lock strong
>> enough to lock out other writers to the table for the duration ...

> Well, if the table is initally empty, what harm is there in locking the
> table?

You cannot *know* whether it is empty unless you lock the table before
you look. So your argument is circular.

I think this only makes sense as an explicit option to COPY, one of the
effects of which would be to take a stronger lock than COPY normally does.

regards, tom lane


From: Dawid Kuroczko <qnex42(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 14:43:44
Message-ID: 758d5e7f050601074371de6390@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/1/05, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> wrote:
> Personally I don't think that it is a good idea to do that.
> People will tend to corrupt their systems because they want speed
> (sometimes without thinking about the consequences).
>
> I can only think of one scenario where nologging would actually make
> sense: Many people use session tables to keep track of user level
> information on a website. corrupting a session table (usually not very
> large) would not cause a lot of problems.

Well, from what I know, TEMPORARY tables are not WAL-logged, since
they won't exist after "restart" (since none of the current sessions
would exist). The problem with TEMPORARY tables is that they are not
globally visible.

I think it would be neat to be able to create "server-life-time" tables.
I.e. table which is initially empty, can be used by all users, but is
guaranteed to be empty (truncated) upon server restart. A place
to store global temporary variables. A table type to put on a RAM-disk.

Potential uses? Imagine a site which has a busy 'session' table and
a relatively 'static' other tables. Most of WALs would consist on
'useless' updates to session table. And recovery using WAL files
would take longer (as PostgreSQL would have to dump and restore
whole a lot of session data). Having a "global temporary" table
would be helpful in such a situation. And theoretically it wouldn't
need to "spill to disk" at all, provided it was small enough.

Regards,
Dawid


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 15:21:07
Message-ID: 1117639267.3844.985.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 10:18 -0400, Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Not unless you are proposing to change COPY to acquire a lock strong
> >> enough to lock out other writers to the table for the duration ...
>
> > Well, if the table is initally empty, what harm is there in locking the
> > table?
>
> You cannot *know* whether it is empty unless you lock the table before
> you look. So your argument is circular.
>
> I think this only makes sense as an explicit option to COPY, one of the
> effects of which would be to take a stronger lock than COPY normally does.

Locking the table is OK. We're loading it for the first time, so we are
expecting to be the only users at this time.

Here's a more fully worked out plan for wal/concurrency:

1. Lock table, scan until we find our first live row (not deleted,
insert not aborted), in which case throw an error. If no error, then ok
to proceed to next phase.

2. If there are any dead row versions we truncate the file, since we do
not wish to see those rows ever again. No existing transaction can have
a lock on them, since we hold it, so its ok for them to go. We record
that we have done this in WAL with a short message to say that that we
are loading an empty table, whether or not archiving is not enabled.

3. We load the table, with all TransactionIds set as FROZEN. We do not
generate WAL for each row unless archiving is enabled.

4. Optionally, we set a flag on the table showing the whole table is
frozen. Anybody writing to this table subsequently will spoil this flag.
If the flag is set, all forms of VACUUM will return success immediately
without performing a scan (since it is already in a perfect VACUUM FULL
and VACUUM FREEZE state).

If the server crashes, we replay WAL. If we see a load start message, we
truncate the relation and note that a load has started. If there is WAL
data for the tuples, we replay it. If WAL replay ends without the load
transaction having successfully committed, then we truncate the table.

The above applies to both LOAD/COPY whatever-yer-call-it and in modified
form for CREATE TABLE AS SELECT. For CTAS, no scan is required in (1),
and no truncate is required in (2), otherwise the same.

I'm expecting Alon Goldshuv to join this discussion soon to explain some
other ideas, new developments and discuss the potential for a new
command, so lets wait for him...

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 15:31:31
Message-ID: 14333.1117639891@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hannu Krosing <hannu(at)skype(dot)net> writes:
> I think this should be a decision done when creating a table, just like
> TEMP tables. So you always know if a certain table is or is not
> safe/replicated/recoverable.
> This has also the advantage of requiring no changes to actual COPY and
> INSERT commands.

That doesn't seem right to me; the scenario I envision is that you are
willing to do the initial data loading over again (since you presumably
still have the source data available). But once you've got it loaded
you want full protection.

Perhaps it could work to use an ALTER TABLE command to flip the state.
But I'm not really seeing the point compared to treating it as a COPY
option. I do not believe that anyone needs this to work on individual
INSERT commands --- if you are after max speed, why aren't you using
COPY? And treating it as an ALTER property opens the possibility of
forgetting to ALTER the table back to normal behavior, which would be
a foot-gun of large caliber indeed :-(

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Jochem van Dieten <jochemd(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 15:43:24
Message-ID: 14461.1117640604@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Jochem van Dieten wrote:
>> Why only on an empty table? What is the problem with bypassing WAL on
>> any table as long as all files of that table are fsync'ed before
>> commit?

> Because adding rows to a table might modify existing pages, and if the
> COPY fails, you have to restore those pages to a consistent state, and
> make sure they are recovered for partial page writes, which we can't do
> without WAL. With an initially empty table, you can just throw away the
> file system file.

You have also got to think about the effects on the table's indexes ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 15:55:46
Message-ID: 14606.1117641346@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> 4. Optionally, we set a flag on the table showing the whole table is
> frozen. Anybody writing to this table subsequently will spoil this flag.
> If the flag is set, all forms of VACUUM will return success immediately
> without performing a scan (since it is already in a perfect VACUUM FULL
> and VACUUM FREEZE state).

This bit strikes me as dangerous and not related to the original
proposal. I don't care for the load-already-frozen-data part at all,
either. That's not just giving up WAL protection, that's an outright
MVCC semantics violation, in return for which we get ... not much.
Certainly not any speedup in the LOAD itself.

regards, tom lane


From: Jochem van Dieten <jochemd(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 16:00:28
Message-ID: f96a9b8305060109001b157a0d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 6/1/05, Bruce Momjian wrote:
> Jochem van Dieten wrote:
>>
>> Why only on an empty table? What is the problem with bypassing WAL on
>> any table as long as all files of that table are fsync'ed before
>> commit?
>
> Because adding rows to a table might modify existing pages, and if the
> COPY fails, you have to restore those pages to a consistent state, and
> make sure they are recovered for partial page writes, which we can't do
> without WAL. With an initially empty table, you can just throw away the
> file system file.

Thank you for the explanation, but I am afraid I still don't get it.

COPY can either fail and do a normal rollback, in which case there is
no problem because the xid never made it to the xlog. So I take it you
are talking about a hard crash (pull the plug) somewhere during the
actual writing to disk. In that case you have updated several pages
and overwritten the free space with new tuples. But you have not
overwritten live tuples, so why would you need to restore them? I
mean, didn't PostgreSQL < 7.1 work without a WAL at all?

Jochem


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 16:18:09
Message-ID: 1117642690.4830.4.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On K, 2005-06-01 at 11:31 -0400, Tom Lane wrote:
> Hannu Krosing <hannu(at)skype(dot)net> writes:
> > I think this should be a decision done when creating a table, just like
> > TEMP tables. So you always know if a certain table is or is not
> > safe/replicated/recoverable.
> > This has also the advantage of requiring no changes to actual COPY and
> > INSERT commands.
>
> That doesn't seem right to me; the scenario I envision is that you are
> willing to do the initial data loading over again (since you presumably
> still have the source data available). But once you've got it loaded
> you want full protection.

What I mean, was that as it can't be safely replicated using log-
shipping, It should be visible as such.

> Perhaps it could work to use an ALTER TABLE command to flip the state.

No. It would be the same as flipping a TEMP table to an ordinary table,
which we don't support, and IMHO for a good reason

> But I'm not really seeing the point compared to treating it as a COPY
> option.

The point is having a separate (sub)type of storage - non-WAL/non-
replicated table and its indexes.

> I do not believe that anyone needs this to work on individual
> INSERT commands --- if you are after max speed, why aren't you using
> COPY? And treating it as an ALTER property opens the possibility of
> forgetting to ALTER the table back to normal behavior, which would be
> a foot-gun of large caliber indeed :-(

That's what I'm trying to avoid - If it is obvious, that the whole table
is quasi-stable (in PITR/log-shipping sense) it is more clearly a user
choice what kinds of data can be stored there. Same as TEMP tables
again.

--
Hannu Krosing <hannu(at)tm(dot)ee>


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 17:00:28
Message-ID: 1117645228.3844.1016.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 11:31 -0400, Tom Lane wrote:
> Hannu Krosing <hannu(at)skype(dot)net> writes:
> > I think this should be a decision done when creating a table, just like
> > TEMP tables. So you always know if a certain table is or is not
> > safe/replicated/recoverable.
> > This has also the advantage of requiring no changes to actual COPY and
> > INSERT commands.
>
> That doesn't seem right to me; the scenario I envision is that you are
> willing to do the initial data loading over again (since you presumably
> still have the source data available). But once you've got it loaded
> you want full protection.

Yes, thats the scenario.

Believe me, I prefer less code, but I think general feeling now is that
we must provide a data safe solution to the performance challenge.

> Perhaps it could work to use an ALTER TABLE command to flip the state.
> But I'm not really seeing the point compared to treating it as a COPY
> option. I do not believe that anyone needs this to work on individual
> INSERT commands --- if you are after max speed, why aren't you using
> COPY? And treating it as an ALTER property opens the possibility of
> forgetting to ALTER the table back to normal behavior, which would be
> a foot-gun of large caliber indeed :-(

Oh no, not the foot gun again. I surrender.

Best Regards, Simon Riggs


From: Alvaro Herrera <alvherre(at)surnet(dot)cl>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 17:18:55
Message-ID: 20050601171855.GD23141@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 01, 2005 at 06:00:28PM +0100, Simon Riggs wrote:
> On Wed, 2005-06-01 at 11:31 -0400, Tom Lane wrote:

> > Perhaps it could work to use an ALTER TABLE command to flip the state.
> > But I'm not really seeing the point compared to treating it as a COPY
> > option. I do not believe that anyone needs this to work on individual
> > INSERT commands --- if you are after max speed, why aren't you using
> > COPY? And treating it as an ALTER property opens the possibility of
> > forgetting to ALTER the table back to normal behavior, which would be
> > a foot-gun of large caliber indeed :-(
>
> Oh no, not the foot gun again. I surrender.

Sorry, what are you surrendering from/for/of/to? I think the proposal
is doing a reasonable headways. So far we have

- it is an option to COPY and CREATE TABLE AS, not GUC, not ALTER TABLE
- it'd only work on empty tables
- it'd only work it PITR is off
- it needs to follow MVCC semantics, i.e. the tuples need to be written
with the correct Xids
- after the command is done, an fsync is applied to the table file(s)

Another point that needs thought is what to do if the table has any
indexes. Are operations on said indexes logged or not? Maybe we should
just say that indexes are verbotten and the user needs to create them
afterwards.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)surnet(dot)cl>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 17:35:25
Message-ID: 17053.1117647325@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)surnet(dot)cl> writes:
> - it is an option to COPY and CREATE TABLE AS, not GUC, not ALTER TABLE

AFAICS it could just happen automatically for CREATE TABLE AS; there's
no need for an option there, any more than there is for CREATE INDEX.

The only reason it needs to be an explicitly specified option for COPY
is that it would require taking a sole-writer lock on the table, which
COPY does not now do (and I believe I've heard of people using parallel
COPYs to load a table faster, so changing the lock type wouldn't be
transparent for everyone).

> Another point that needs thought is what to do if the table has any
> indexes. Are operations on said indexes logged or not? Maybe we should
> just say that indexes are verbotten and the user needs to create them
> afterwards.

That seems pretty reasonable to me. Again, that's what you'd do anyway
if you are after the fastest possible load time, so why should we work
much harder to support an inefficient approach?

Thinking about it, maybe the user-visible option should be defined thus:

LOCK
Causes COPY FROM to acquire Exclusive lock on the target
table, rather than RowExclusive lock as it normally does.
This ensures that no other process is modifying the table
while the COPY proceeds. In some cases this can allow
significantly faster operation.

and then the checks on PITR mode, no indexes, and empty starting table
could be internal implementation details rather than part of the
user-visible spec (ie, we just fall through and do it normally if any of
those conditions don't hold). I like this a little better because there
might be application-level reasons to want exclusive lock, independently
of implementation details.

Also: AFAICS the starting table need not be empty, if we arrange for all
inserts done by the COPY to be done into freshly-appended blocks. The
initial WAL entry could note the current table length, and instead of
"truncate to 0 length" the recovery action is "truncate to noted
length". So really the constraints are just "no PITR" and "no indexes".

regards, tom lane


From: "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 17:35:30
Message-ID: BEC33FF2.50CC%agoldshuv@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have been working on improving the COPY command performance and as a
result also came up with other thoughts that may possibly be better off
implemented in a new command (i.e LOAD DATA) rather than adding them to the
existing COPY.

The improvements I made were in the COPY parsing logic - changing a
char-by-char parsing to a multi char buffered fast parsing, while using a
bytebuffer which is faster than StringInfoData and minimizing the number of
loads into the buffer. The data conversion and insertion parts of COPY I
left untouched. As a result the parsing performance increased by about 550%,
and the overall COPY performance increased by:

Around 40% for 15 column (mixed types) table.
Around 90% for 1 column table.

(the difference betweeen the two is caused by data conversion overhead).

I will post the patch and more numbers to the list later today with more
details. I'll just comment now that it is only available for delimited ASCII
input data when client and server encodings are the same. CSV and encoding
conversions may be added later, this is merely to show that data could be
loaded much faster.

Here are some things that make me think a new LOAD command is a good idea:

1) There seem to be a possibility that many COPY modifications/improvements
may be problematic to incorporate in the current postgres COPY code. Further
more, it may be desired to keep the COPY command as is and also have a way
to run an improved COPY command for purposes of backwards compatibility.

2) A modified command syntax for introducing a direct single row error
handling. By direct I mean - a row that if rejected from within the COPY
command context does not throw an error and rollsback the whole transaction.
Instead the error is caught and recorded elsewhere, maybe in some error
table, with some more information that can later on be retrieved. The
following rows continue to be processed. This way there is barely any error
handling overhead. Having a recursive row isolation into smaller batches is
extremely expensive for non-small data sets. It's not an option for serious
users.

3) maybe have an option to indicate the EOL (end of line) format in the
command syntax. Current COPY code detects the line-end according to the
first data line, this is problematic is 1st data line is mal formatted, and
also this doesn't allow having any CR's for example in the data file when
EOL is only a linefeed. That causes extra data errors in COPY processing. A
CR is a valid data character. Specifying the line end in command syntax will
save all this badness.

4) Data integrity and escaping improvements. My patch changes now treats all
characters as data (unless it's an escaped delim or EOL) and therefore data
integrity is preserved (take for example the following valid data field
"file:\new\bang" that after COPY into the database, querying for it from
psql will result is wrong data -- backslashes are gone, there are 2 data
lines, and a bell will ring for \b!) However, some people that already got
used to the postgres COPY escaping way may want to keep it. They could do so
by still using the old COPY.

5) allow an ERRORLIMIT to allow control of aborting a load after a certain
number of errors (and a pre-requisite for this is point number 2 above).

6) allow LIMIT and OFFSET, for files with header rows for example (could be
done in COPY too).

7) Allow the blocks to be directly written to the table, rather than via
the buffer cache.

8) Allow a bulk index insertion operation at the end of the LOAD step, if
the data has been loaded in sorted order. Use something like the SORTED
INDEXES statement on Oracle sql*loader to specify the sort order of the
incoming data, so that the index build step can bypass another external
sort before loading directly into the index.

9) allow for Simon's WAL bypass.

I have surely missed some problems that hide behind the idea, but these
points make me believe that LOAD DATA is a good idea.

Alon.

On 5/31/05 7:47 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>> Recent test results have shown a substantial performance improvement
>> (+25%) if WAL logging is disabled for large COPY statements.
>
> How much of that is left after we fix the 64-bit-CRC issue?
>
>> Now, I would like to discuss adding an enable_logging USERSET GUC,
>
> [ fear and loathing ... ]
>
> I don't like the idea of a GUC at all, and USERSET is right out.
> I think it would have to be system-wide (cf fsync) to be even
> implementable let alone somewhat predictable. Even if it could
> be done per-backend with reasonable semantics, random users should
> not get to make that decision --- it should be the DBA's call,
> which means it needs at least SUSET permissions.
>
> BTW, I'm sure you are the last one who needs to be reminded that
> any such thing breaks PITR completely. Which is surely sufficient
> reason not to let it be USERSET.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 17:55:46
Message-ID: 1117648546.3844.1033.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 11:55 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > 4. Optionally, we set a flag on the table showing the whole table is
> > frozen. Anybody writing to this table subsequently will spoil this flag.
> > If the flag is set, all forms of VACUUM will return success immediately
> > without performing a scan (since it is already in a perfect VACUUM FULL
> > and VACUUM FREEZE state).
>
> This bit strikes me as dangerous and not related to the original
> proposal. I don't care for the load-already-frozen-data part at all,
> either. That's not just giving up WAL protection, that's an outright
> MVCC semantics violation, in return for which we get ... not much.
> Certainly not any speedup in the LOAD itself.

I agree it *sounds* dangerous, but is it? If so, how? I will immediately
withdraw any idea that proves dangerous.

We're holding the table lock and will continue to do so until end of
transaction. No transaction with an earlier id will ever see the data we
load because of the lock. Later transactions will see the data only when
we commit. At that stage, all they care about is that the data is
visible...it doesn't matter that we have frigged it to look like its
been there a real long time.

We're not giving up WAL protection. The PITR case is covered. Non-PITR
cases have exactly the same transactional consistency across as crash as
they do now. Nothing is lost through this proposal.

Doing those two additional actions gives us the ability to avoid:
1. avoid writing the whole table twice because of
SetBufferCommitInfoNeedsSave
2. avoid pointless VACUUMs of very large tables
3. load the table direct to read-only media, or copy it to read-only
media at some point in the future without needing to do a VACUUM FREEZE

All of that means we write the table once, rather than 3 times over the
lifetime of the table. And we only need scan it for user SQL, not for
wrap-around avoiding VACUUMs.

I know you have a solution to the second one in mind. I would not argue
against the needs-vacuuming bitmap idea when that comes back around.

Best Regards, Simon Riggs


From: Alvaro Herrera <alvherre(at)surnet(dot)cl>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 18:10:45
Message-ID: 20050601181045.GA26471@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 01, 2005 at 06:55:46PM +0100, Simon Riggs wrote:
> On Wed, 2005-06-01 at 11:55 -0400, Tom Lane wrote:
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > > 4. Optionally, we set a flag on the table showing the whole table is
> > > frozen. Anybody writing to this table subsequently will spoil this flag.
> > > If the flag is set, all forms of VACUUM will return success immediately
> > > without performing a scan (since it is already in a perfect VACUUM FULL
> > > and VACUUM FREEZE state).
> >
> > This bit strikes me as dangerous and not related to the original
> > proposal. I don't care for the load-already-frozen-data part at all,
> > either. That's not just giving up WAL protection, that's an outright
> > MVCC semantics violation, in return for which we get ... not much.
> > Certainly not any speedup in the LOAD itself.
>
> I agree it *sounds* dangerous, but is it? If so, how? I will immediately
> withdraw any idea that proves dangerous.
>
> We're holding the table lock and will continue to do so until end of
> transaction. No transaction with an earlier id will ever see the data we
> load because of the lock.

Suppose you load half the tuples and the plug is pulled. After
recovery, you have half-load of tuples that are visible to everyone.
This is a no-no. Plus, what is the benefit of having the tuples in
frozen state to start with? If this is a data warehouse application,
I'd expect the table to be dropped or truncated rather before the
billion-transactions barrier comes to pass.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 18:14:27
Message-ID: 17318.1117649667@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> If the server crashes, we replay WAL. If we see a load start message, we
> truncate the relation and note that a load has started. If there is WAL
> data for the tuples, we replay it. If WAL replay ends without the load
> transaction having successfully committed, then we truncate the table.

On further thought, this seems both risky and unnecessary.

The reason it's risky is this scenario:

* Backend 1 makes a LOAD-start WAL entry.

* Backend 1 loads some data, extending the table beyond its
former end.

* Backend 1 errors out without committing its transaction.

* Backend 2 inserts some data into the no-longer-locked table.
It uses free space in one of the added pages, or maybe even
adds new pages of its own.

* Backend 2 commits.

* System crashes, and we have to replay the above actions.

In this scenario you cannot truncate at the end of replay without losing
backend 2's committed data.

You can think of various ways to avoid this risk (for instance, maybe
*any* WAL-logged operation on the table should cause the pending
TRUNCATE to be discarded) but they all seem expensive and/or still
somewhat unsafe.

The reason it's unnecessary is what's the point? All you're doing by not
truncating is leaving some uncommitted tuples in the table. It's not
the job of WAL recovery to get rid of such things; that's VACUUM's job.

So what I'm thinking is we need no special WAL entries for this. What
we need is just an operating mode of COPY in which it doesn't WAL-log
its inserts, but instead fsyncs before completion, much like index build
does. For safety it must do all its inserts into freshly-added pages;
this is not to ensure truncatability, because we aren't going to do that
anyway, but to ensure that we don't have unlogged operations changing
pages that might contain committed tuples. (That would pose a risk of
losing committed data to incomplete writes in case of system crash
partway through. The same reason is why we need exclusive lock: else
we might end up with pages containing a mix of logged and unlogged
tuples.) Also there can be no indexes, since we don't want index
entries pointing to unlogged tuples. And PITR can't be enabled.
Otherwise no problem.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)surnet(dot)cl>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 18:24:39
Message-ID: 17405.1117650279@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)surnet(dot)cl> writes:
> On Wed, Jun 01, 2005 at 06:55:46PM +0100, Simon Riggs wrote:
>> We're holding the table lock and will continue to do so until end of
>> transaction. No transaction with an earlier id will ever see the data we
>> load because of the lock.

> Suppose you load half the tuples and the plug is pulled. After
> recovery, you have half-load of tuples that are visible to everyone.
> This is a no-no.

Simon is expecting that the loaded tuples are guaranteed to be erased
(by table truncation) during recovery. As I just noted I'm unconvinced
of the safety of doing truncations during recovery, so I'd prefer not
to depend on that.

The scenario I was thinking of was different: you load pre-frozen
tuples, commit, and thereby release the table lock. Now the tuples
are visible to transactions that started before you did; that's what
violates MVCC.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)surnet(dot)cl>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 20:17:39
Message-ID: 1117657059.3844.1042.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 14:24 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)surnet(dot)cl> writes:
> > On Wed, Jun 01, 2005 at 06:55:46PM +0100, Simon Riggs wrote:
> >> We're holding the table lock and will continue to do so until end of
> >> transaction. No transaction with an earlier id will ever see the data we
> >> load because of the lock.
>
> > Suppose you load half the tuples and the plug is pulled. After
> > recovery, you have half-load of tuples that are visible to everyone.
> > This is a no-no.
>
> Simon is expecting that the loaded tuples are guaranteed to be erased
> (by table truncation) during recovery. As I just noted I'm unconvinced
> of the safety of doing truncations during recovery, so I'd prefer not
> to depend on that.
>
> The scenario I was thinking of was different: you load pre-frozen
> tuples, commit, and thereby release the table lock. Now the tuples
> are visible to transactions that started before you did; that's what
> violates MVCC.

Agreed. MVCC violation. OK, back to the drawing board.

Best Regards, Simon Riggs


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, pgsql-hackers(at)postgresql(dot)org, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 20:33:24
Message-ID: 1117658004.3844.1059.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 14:14 -0400, Tom Lane wrote:
> So what I'm thinking is we need no special WAL entries for this. What
> we need is just an operating mode of COPY in which it doesn't WAL-log
> its inserts, but instead fsyncs before completion, much like index build
> does. For safety it must do all its inserts into freshly-added pages;
> this is not to ensure truncatability, because we aren't going to do that
> anyway, but to ensure that we don't have unlogged operations changing
> pages that might contain committed tuples. (That would pose a risk of
> losing committed data to incomplete writes in case of system crash
> partway through. The same reason is why we need exclusive lock: else
> we might end up with pages containing a mix of logged and unlogged
> tuples.) Also there can be no indexes, since we don't want index
> entries pointing to unlogged tuples. And PITR can't be enabled.
> Otherwise no problem.

What you describe above is a coherent set of features that provide most
of the benefits I sought, plus some others. We also don't mess with WAL,
which is grand thing. We gain the ability to load into tables with rows
already in them.

I don't agree with all of your other points, but given time schedules, I
think that we win with the above, so forget the rest.

The main COPY/LOAD DATA discussion is on another thread of this from
Alon, who has some interesting ideas and some really cool performance
results to share.

Best Regards, Simon Riggs


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jochem van Dieten <jochemd(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 21:27:01
Message-ID: 200506012127.j51LR1b27129@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jochem van Dieten wrote:
> On 6/1/05, Bruce Momjian wrote:
> > Jochem van Dieten wrote:
> >>
> >> Why only on an empty table? What is the problem with bypassing WAL on
> >> any table as long as all files of that table are fsync'ed before
> >> commit?
> >
> > Because adding rows to a table might modify existing pages, and if the
> > COPY fails, you have to restore those pages to a consistent state, and
> > make sure they are recovered for partial page writes, which we can't do
> > without WAL. With an initially empty table, you can just throw away the
> > file system file.
>
> Thank you for the explanation, but I am afraid I still don't get it.
>
> COPY can either fail and do a normal rollback, in which case there is
> no problem because the xid never made it to the xlog. So I take it you
> are talking about a hard crash (pull the plug) somewhere during the
> actual writing to disk. In that case you have updated several pages
> and overwritten the free space with new tuples. But you have not
> overwritten live tuples, so why would you need to restore them? I
> mean, didn't PostgreSQL < 7.1 work without a WAL at all?

What if you are adding rows to an existing page --- in that case you are
writing a page that also contained valid tuples before the COPY.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 21:56:07
Message-ID: 200506012156.j51Lu7h01730@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alon Goldshuv wrote:
> 9) allow for Simon's WAL bypass.
>
> I have surely missed some problems that hide behind the idea, but these
> points make me believe that LOAD DATA is a good idea.

The community is unlikely to add a new LOAD DATA command that does
_almost_ everything COPY does. We are much more likely to incrementally
improve COPY.

The problem with a new command is that it becomes unclear when you
should use COPY and when LOAD DATA, and it confuses users, and has
maintenance overhead. If Bizgres wants a new command name, go for it,
but it is unlikely that the community release is going to go in that
direction, unless there is a fundamental agreement that COPY is broken
and needs a major revamp, and I have heard no talk of that.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:05:13
Message-ID: 200506012205.j51M5DF03365@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom Lane wrote:
> >> Not unless you are proposing to change COPY to acquire a lock strong
> >> enough to lock out other writers to the table for the duration ...
>
> > Well, if the table is initally empty, what harm is there in locking the
> > table?
>
> You cannot *know* whether it is empty unless you lock the table before
> you look. So your argument is circular.
>
> I think this only makes sense as an explicit option to COPY, one of the
> effects of which would be to take a stronger lock than COPY normally does.

One idea would be to look at the table file size first. If it has zero
blocks, lock the table and if it still has zero blocks, do the no-WAL
copy.

I hate to add a flag to a command when we can automatically handle it
ourselves.

Now, you mentioned the idea of doing the optimization in tables that
already have data, and if we do that, we would need a flag because the
lock is stronger than what we have now.

What we could do is to do no-WAL automatically for empty tables (like
when a database is first loaded), and use the flag for cases where the
tables is not zero pages. The fact is that database loads are a prefect
case for this optimization and old dumps are not going to have that flag
anyway, and automatic is better if we can do it.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:18:41
Message-ID: BEC38251.6C0E%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce,

> The problem with a new command is that it becomes unclear when you
> should use COPY and when LOAD DATA, and it confuses users, and has
> maintenance overhead. If Bizgres wants a new command name, go for it,
> but it is unlikely that the community release is going to go in that
> direction, unless there is a fundamental agreement that COPY is broken
> and needs a major revamp, and I have heard no talk of that.

The question of whether COPY should be improved or whether the changes
should take the form of a new command is separate from the question of
whether the performance of the load path in PostgreSQL needs improvement.

The 90% performance increase (from 12 MB/s to 21 MB/s) that Alon reported
comes from replacing the parsing logic within COPY. I believe that the
parsing logic in COPY is fundamentally broken from a performance
perspective, and may be broken from a functionality perspective WRT embedded
backslashes.

One of the reasons to consider a LOAD DATA command is that we can isolate
the need for performance improvements and special syntax from the concerns
of preserving the legacy behavior of COPY for use as the primary mechanism
for DUMP and RESTORE.

- Luke


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:32:32
Message-ID: 21427.1117665152@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> One idea would be to look at the table file size first. If it has zero
> blocks, lock the table and if it still has zero blocks, do the no-WAL
> copy.

I think that's a bad idea. It would make the behavior unpredictable
--- sometimes a COPY will take an exclusive lock, and other times not;
and the reason why is at a lower semantic level than the user is
supposed to know about.

Before you say "this is not important", consider the nontrivial risk
that the stronger lock will cause a deadlock failure. I don't think
that it's acceptable for lock strength to be unpredictable.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:37:07
Message-ID: 21480.1117665427@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Luke Lonergan" <llonergan(at)greenplum(dot)com> writes:
> One of the reasons to consider a LOAD DATA command is that we can isolate
> the need for performance improvements and special syntax from the concerns
> of preserving the legacy behavior of COPY for use as the primary mechanism
> for DUMP and RESTORE.

... and instead, define some new behavior that will soon be considered
broken legacy code itself?

There isn't any demand for changing the semantics of COPY, as far as
I've noticed. If we can make it faster with the same semantics that's
great, but I'm not in favor of inventing an alternate that does almost
the same thing but (eg) breaks backslash handling in the name of speed.

regards, tom lane


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:37:51
Message-ID: 1117665472.4830.12.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On K, 2005-06-01 at 18:05 -0400, Bruce Momjian wrote:

> What we could do is to do no-WAL automatically for empty tables (like
> when a database is first loaded),

You forget that some databases use WAL for PITR / replication and doing
it automatically there would surely mess up their replica.

How is index creation handeled if it is not logged in WAL ?
- is it not automatically WAL'ed ?
- Must one recreate indexes after PITR or failover ?

> and use the flag for cases where the
> tables is not zero pages. The fact is that database loads are a prefect
> case for this optimization and old dumps are not going to have that flag
> anyway, and automatic is better if we can do it.

--
Hannu Krosing <hannu(at)tm(dot)ee>


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:39:59
Message-ID: 1117665599.3844.1119.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 10:35 -0700, Alon Goldshuv wrote:
> I have been working on improving the COPY command performance

> Around 40% for 15 column (mixed types) table.
> Around 90% for 1 column table.

Thats very cool.

> 2) A modified command syntax for introducing a direct single row error
> handling. By direct I mean - a row that if rejected from within the COPY
> command context does not throw an error and rollsback the whole transaction.
> Instead the error is caught and recorded elsewhere, maybe in some error
> table, with some more information that can later on be retrieved. The
> following rows continue to be processed. This way there is barely any error
> handling overhead. Having a recursive row isolation into smaller batches is
> extremely expensive for non-small data sets. It's not an option for serious
> users.

Can we call this the ERRORTABLE clause?

> 5) allow an ERRORLIMIT to allow control of aborting a load after a certain
> number of errors (and a pre-requisite for this is point number 2 above).

The default for which would be ERRORLIMIT 0 to give backwards
compatibility.

2) and 5) seem critical for combined usability & performance with real
world data.

I'm not clear from all of those options whether we still need a LOAD
command, based upon other issues/comments raised on this thread.

However, there are some other arguments for why it might be a good idea
to have a LOAD DATA command separate from COPY. Certainly long term
features would be easier to add with two commands. Trying to maintain
backwards compatibility just because we use COPY seems like an uphill
struggle and is going to mean we have to handle sensible new additions
as options so we don't break existing applications. The most important
one is the lock type held.

[Oracle compatibility isn't one of them, even if it did provide the
command name.]

But things will be clearer when we see the patch.

Best Regards, Simon Riggs


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 22:47:17
Message-ID: 1117666037.3844.1127.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-06-01 at 18:32 -0400, Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > One idea would be to look at the table file size first. If it has zero
> > blocks, lock the table and if it still has zero blocks, do the no-WAL
> > copy.
>
> I think that's a bad idea. It would make the behavior unpredictable
> --- sometimes a COPY will take an exclusive lock, and other times not;
> and the reason why is at a lower semantic level than the user is
> supposed to know about.
>
> Before you say "this is not important", consider the nontrivial risk
> that the stronger lock will cause a deadlock failure. I don't think
> that it's acceptable for lock strength to be unpredictable.

While I agree with Bruce's sentiment, the locking is an issue with
COPY.

It is particularly important to be able to plan the concurrency within
an application. That is why DB2's interesting ability to perform lock
upgrades in a relatively unpredictable manner is not good.

For CREATE TABLE AS SELECT, I think we could do this Bruce's way and
win, since we would take the same grade of lock in both cases.

Best Regards, Simon Riggs


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 23:00:34
Message-ID: 200506012300.j51N0Yp21000@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Luke Lonergan wrote:
> Bruce,
>
> > The problem with a new command is that it becomes unclear when you
> > should use COPY and when LOAD DATA, and it confuses users, and has
> > maintenance overhead. If Bizgres wants a new command name, go for it,
> > but it is unlikely that the community release is going to go in that
> > direction, unless there is a fundamental agreement that COPY is broken
> > and needs a major revamp, and I have heard no talk of that.
>
> The question of whether COPY should be improved or whether the changes
> should take the form of a new command is separate from the question of
> whether the performance of the load path in PostgreSQL needs improvement.
>
> The 90% performance increase (from 12 MB/s to 21 MB/s) that Alon reported
> comes from replacing the parsing logic within COPY. I believe that the
> parsing logic in COPY is fundamentally broken from a performance
> perspective, and may be broken from a functionality perspective WRT embedded
> backslashes.

COPY works as designed. The idea that some guy we have never heard of
is going to appear and rewrite COPY's processing and tell us that the
existing code is actually broken seems pretty arrogant to me. If it is
broken (meaning doesn't work as designed), please show us facts rather
than conjecture.

Oh, and the "Our COPY improvements are so fundamental that they deserve
a new command name" also has a similar flavor.

(Please explain how you handle literal delimiters and nulls with no
escape processing.)

> One of the reasons to consider a LOAD DATA command is that we can isolate
> the need for performance improvements and special syntax from the concerns
> of preserving the legacy behavior of COPY for use as the primary mechanism
> for DUMP and RESTORE.

This seems like a case where GreenPlum's priorities and the community's
priorities might not match. There is much more work required on your
part if you are going to convince the community it needs a new data
loading command, and starting out with the assumption in emails that it
is going to be a newly named command isn't the best approach. That is
my fundamental point.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Neil Conway <neilc(at)samurai(dot)com>, Alvaro Herrera <alvherre(at)surnet(dot)cl>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 23:02:17
Message-ID: 200506012302.j51N2H921211@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > One idea would be to look at the table file size first. If it has zero
> > blocks, lock the table and if it still has zero blocks, do the no-WAL
> > copy.
>
> I think that's a bad idea. It would make the behavior unpredictable
> --- sometimes a COPY will take an exclusive lock, and other times not;
> and the reason why is at a lower semantic level than the user is
> supposed to know about.
>
> Before you say "this is not important", consider the nontrivial risk
> that the stronger lock will cause a deadlock failure. I don't think
> that it's acceptable for lock strength to be unpredictable.

Yea, but you are only doing the lock if the table is zero pages.
Doesn't that help? Maybe not.

I do like the LOCK keyword if we have to use one to enable this
functionality, but I am suspecting people will want this functionality
in pg_dump output. How do we do that? Just make it the default for
pg_dump output?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 23:06:07
Message-ID: 200506012306.j51N67E21732@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> I'm not clear from all of those options whether we still need a LOAD
> command, based upon other issues/comments raised on this thread.
>
> However, there are some other arguments for why it might be a good idea
> to have a LOAD DATA command separate from COPY. Certainly long term
> features would be easier to add with two commands. Trying to maintain
> backwards compatibility just because we use COPY seems like an uphill
> struggle and is going to mean we have to handle sensible new additions
> as options so we don't break existing applications. The most important
> one is the lock type held.

Well, we have had a pretty much unmodified COPY format since like the
Berkeley days (I added \N and \.). Please tell us exactly what you want
do to that requires a format change, and we can talk about it, but
showing up with no proof and expecting a new command is the _wrong_
approach. It actually reminds me of the "our company developed it so it
must be great" approach, which doesn't work well in the community.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Joe Conway <mail(at)joeconway(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 23:06:09
Message-ID: 429E3F61.8040003@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
> On Wed, 2005-06-01 at 10:35 -0700, Alon Goldshuv wrote:
>>2) A modified command syntax for introducing a direct single row error
>>handling. By direct I mean - a row that if rejected from within the COPY

>>5) allow an ERRORLIMIT to allow control of aborting a load after a certain
>>number of errors (and a pre-requisite for this is point number 2 above).

> 2) and 5) seem critical for combined usability & performance with real
> world data.

I'll second that! This would be a huge win for one of my real world
applications, whether implemented as a new command, or as added
capability on top of COPY. The other performance enhancements would
certainly be nice to have also, but in my experience not nearly as
important as these two.

Joe


From: "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-01 23:29:30
Message-ID: BEC392EA.5140%agoldshuv@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce,

The patch is not there to show that something is "broken" is it there to
show how things could be done in another way, which may or may not be more
desireable.

> COPY works as designed. The idea that some guy we have never heard of
> is going to appear and rewrite COPY's processing and tell us that the
> existing code is actually broken seems pretty arrogant to me. If it is
> broken (meaning doesn't work as designed), please show us facts rather
> than conjecture.

I am sure that the code works as designed. In my previous email I was
referring to the fact that using COPY with a data field that happens to have
backslashes in it, and then querying this data field you will get different
results. For example do COPY of a field "c:\\one\ten\nine" and query for it.
The result will be much different. This is a problem with clickstream data
for example. That's all. It's very possible that there are reasons that I
missed for why things are the way they are.

> (Please explain how you handle literal delimiters and nulls with no
> escape processing.)
Escape processing is done only for these 2 cases.

Thx,
Alon.


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 00:05:29
Message-ID: BEC39B59.6C2B%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,
>
> ... and instead, define some new behavior that will soon be considered
> broken legacy code itself?

I'll not argue further about whether to have a separate LOAD command.
That's not as important as fixing the performance issues in the data load
path in PostgreSQL to me.

However, I find it compelling that the underlying problem confronting high
performance data loading in the engine is the slow parse code in psql/copy,
etc., and not the WAL or other issues. Next on the list are huge gains from
removing things like repetitive calls to strlen(TZ) in the attribute
conversion code.

> There isn't any demand for changing the semantics of COPY, as far as
> I've noticed. If we can make it faster with the same semantics that's
> great, but I'm not in favor of inventing an alternate that does almost
> the same thing but (eg) breaks backslash handling in the name of speed.

A 540% increase in parsing speed while performing substantially the same
logic is what the patch accomplishes. The patch is now available on
pgsql-patches. In this case the speed did not come from breaking backslash
handling, but rather from using faster processing for escape processing and
delimiter parsing. We *could* recreate identical escape semantics to COPY
at nearly the same speed, but frankly we're puzzled as to why character
sequences with "\" automatically imply escape processing.

We've found that there are many cases where more sophisticated escape
processing options are necessary, including multi-byte delimiters and escape
sequences, and the default use of "\" for escaping breaks many data load
cases.

- Luke


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>
Cc: Luke Lonergan <llonergan(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 01:01:56
Message-ID: 200506020101.j5211uj02770@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alon Goldshuv wrote:
> Bruce,
>
> The patch is not there to show that something is "broken" is it there to
> show how things could be done in another way, which may or may not be more
> desireable.

Sure, we are always looking for ways to improve things.

> > COPY works as designed. The idea that some guy we have never heard of
> > is going to appear and rewrite COPY's processing and tell us that the
> > existing code is actually broken seems pretty arrogant to me. If it is
> > broken (meaning doesn't work as designed), please show us facts rather
> > than conjecture.
>
> I am sure that the code works as designed. In my previous email I was
> referring to the fact that using COPY with a data field that happens to have
> backslashes in it, and then querying this data field you will get different
> results. For example do COPY of a field "c:\\one\ten\nine" and query for it.
> The result will be much different. This is a problem with clickstream data
> for example. That's all. It's very possible that there are reasons that I
> missed for why things are the way they are.

Yep, you have to double backslahses coming in as data so we can use
backslash for marking null, delimiters, etc. I see no way around that,
and no one since Berkeley has come up with one either. If you have an
idea, we would _love_ to hear it.

> > (Please explain how you handle literal delimiters and nulls with no
> > escape processing.)
> Escape processing is done only for these 2 cases.

OK, how is that done? How about for newlines in the data?

Right now I think that is the only escapes we do. We support more but
basically the only required ones are delimiter and nulls. The others
are around just for convenience.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 01:05:11
Message-ID: 200506020105.j5215B703414@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Luke Lonergan wrote:
> Tom,
> >
> > ... and instead, define some new behavior that will soon be considered
> > broken legacy code itself?
>
> I'll not argue further about whether to have a separate LOAD command.
> That's not as important as fixing the performance issues in the data load
> path in PostgreSQL to me.
>
> However, I find it compelling that the underlying problem confronting high
> performance data loading in the engine is the slow parse code in psql/copy,
> etc., and not the WAL or other issues. Next on the list are huge gains from
> removing things like repetitive calls to strlen(TZ) in the attribute
> conversion code.

Yep, we would _love_ those improvements.

> > There isn't any demand for changing the semantics of COPY, as far as
> > I've noticed. If we can make it faster with the same semantics that's
> > great, but I'm not in favor of inventing an alternate that does almost
> > the same thing but (eg) breaks backslash handling in the name of speed.
>
> A 540% increase in parsing speed while performing substantially the same
> logic is what the patch accomplishes. The patch is now available on
> pgsql-patches. In this case the speed did not come from breaking backslash
> handling, but rather from using faster processing for escape processing and
> delimiter parsing. We *could* recreate identical escape semantics to COPY
> at nearly the same speed, but frankly we're puzzled as to why character
> sequences with "\" automatically imply escape processing.

I am confused why you are confused. :-)

> We've found that there are many cases where more sophisticated escape
> processing options are necessary, including multi-byte delimiters and escape
> sequences, and the default use of "\" for escaping breaks many data load
> cases.

Uh, how do you do the escapes if you don't double the escape character
on input so you can distinguish a literal escape from one use to mark
special data like a literal delimiter or a null?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 01:50:23
Message-ID: BEC3B3EF.6C3E%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce,

> Yep, you have to double backslahses coming in as data so we can use
> backslash for marking null, delimiters, etc. I see no way around that,
> and no one since Berkeley has come up with one either. If you have an
> idea, we would _love_ to hear it.

Apropos to the previous thread on escape processing, IMEO it needs to be
switchable (e.g. No default of "\" as the escape character, escape
processing not done by default). Otherwise lots of data loading scenarios
(e.g. virtually all weblog sources) break in nasty ways.

The remedy to date has been to perform extended preprocessing of the data to
"pre-escape process" the data so that it will pass through the default
escape processing of the PostgreSQL copy processing.

I propose an extended syntax to COPY with a change in semantics to remove
the default of "WITH ESCAPE '\'".

- Luke


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 01:54:51
Message-ID: BEC3B4FB.6C42%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


>
> Yep, we would _love_ those improvements.
>

Coming soon, probably from the guy you've never heard of :-)

>
> I am confused why you are confused. :-)
>
> Uh, how do you do the escapes if you don't double the escape character
> on input so you can distinguish a literal escape from one use to mark
> special data like a literal delimiter or a null?

Escape processing would proceed as before, but the semantics would change to
allow the use of different characters as the escape character, in addition
to the special characters for delimiter and newline. Also, escape
processing would be "false" as the default, so that the only special
characters by default would be the newline and delimiter characters.

Also of importance is the specification of newline and delimiter as
arbitrary double byte or 8-bit characters.

- Luke


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 02:13:38
Message-ID: 429E6B52.1050006@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Luke Lonergan wrote:

> I propose an extended syntax to COPY with a change in semantics to remove
> the default of "WITH ESCAPE '\'".

Er, doesn't this break existing database dumps?

-O


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 02:27:51
Message-ID: 200506020227.j522RpG22470@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Luke Lonergan wrote:
>
> >
> > Yep, we would _love_ those improvements.
> >
>
> Coming soon, probably from the guy you've never heard of :-)

LOL

> >
> > I am confused why you are confused. :-)
> >
> > Uh, how do you do the escapes if you don't double the escape character
> > on input so you can distinguish a literal escape from one use to mark
> > special data like a literal delimiter or a null?
>
> Escape processing would proceed as before, but the semantics would change to
> allow the use of different characters as the escape character, in addition
> to the special characters for delimiter and newline. Also, escape
> processing would be "false" as the default, so that the only special
> characters by default would be the newline and delimiter characters.
>
> Also of importance is the specification of newline and delimiter as
> arbitrary double byte or 8-bit characters.

I am still confused how you have reliable, never-break semantics without
special escaping.

How do you distinguis an escape-delimiter used to escape a delimiter in
the data from a literal escape-delimiter in the data being loaded --- it
seems impossible to do.

The idea of allowing a different escape character is interesting,
however, and certainly possible. Right now we allow ESCAPE to be
changed only in CSV mode, but I suppose it is possible to allow it to be
changed in non-CSV mode as well.

Or are you saying there would be no escape at all. If you make '@' the
escape, you can't just say @n is a newline because you need to make '@'
output as '@@' so you can distinguish @-n from a newline.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Luke Lonergan <llonergan(at)greenplum(dot)com>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 02:28:41
Message-ID: 200506020228.j522SfT22580@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oliver Jowett wrote:
> Luke Lonergan wrote:
>
> > I propose an extended syntax to COPY with a change in semantics to remove
> > the default of "WITH ESCAPE '\'".
>
> Er, doesn't this break existing database dumps?

Right, we will not change the default.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Oliver Jowett" <oliver(at)opencloud(dot)com>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 02:35:33
Message-ID: BEC3BE85.6C52%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oliver,

On 6/1/05 7:13 PM, "Oliver Jowett" <oliver(at)opencloud(dot)com> wrote:

> Luke Lonergan wrote:
>
>> I propose an extended syntax to COPY with a change in semantics to remove
>> the default of "WITH ESCAPE '\'".
>
> Er, doesn't this break existing database dumps?

Yes, one of the previously stated reasons to create another command for
loading data.

Another possible approach is to keep the default, but allow the escape
processing to be turned off.

- Luke


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 02:40:06
Message-ID: BEC3BF96.6C56%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> How do you distinguis an escape-delimiter used to escape a delimiter in
> the data from a literal escape-delimiter in the data being loaded --- it
> seems impossible to do.

The usual approach is fine - a pair of escapes is equivalent to a literal
escape.

> Or are you saying there would be no escape at all. If you make '@' the
> escape, you can't just say @n is a newline because you need to make '@'
> output as '@@' so you can distinguish @-n from a newline.

Yup - which is why when we've implemented this set of semantics in the past
we've allowed for the use of arbitrary 8-bit escape characters.

Binary characters (128-255) will appear fairly regularly in web log data,
and as Alon pointed out previously, many text fields include backslashes
(particularly Windows friendly ones). More flexibility in string handling
is needed.

Luke


From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 02:47:26
Message-ID: 20050602024726.GA12894@gp.word-to-the-wise.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 01, 2005 at 07:35:33PM -0700, Luke Lonergan wrote:
> >> I propose an extended syntax to COPY with a change in semantics to remove
> >> the default of "WITH ESCAPE '\'".
> >
> > Er, doesn't this break existing database dumps?
>
> Yes, one of the previously stated reasons to create another command for
> loading data.
>
> Another possible approach is to keep the default, but allow the escape
> processing to be turned off.

I've been following this thread, and I'm a little confused. Could you
possibly clarify what you mean, by providing a couple of lines of
input as it would be formatted with escape processing turned off -
containing a text field with an embedded newline and tab and a null field.

Cheers,
Steve


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Steve Atkins" <steve(at)blighty(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 03:35:57
Message-ID: BEC3CCAD.6C60%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Steve,

> I've been following this thread, and I'm a little confused. Could you
> possibly clarify what you mean, by providing a couple of lines of
> input as it would be formatted with escape processing turned off -
> containing a text field with an embedded newline and tab and a null field.

Using an extended command syntax for a "copy-like" command named LOAD:

LOAD [schema.]tablename [(column1,column2,...)]
FROM {'filename' | STDIN}
[ [WITH]
[DELIMITER [AS] 'delimiter']
[NULL [AS] 'null string']
[ESCAPE [AS] Œescape¹] ] ;

If you intend to support embedded newlines (0x0a) in your character data
without escapes, you will need to choose an alternative newline character
for formatting the data. An example that uses '0xaa' as the newline might
be:

Control statement:
LOAD webform (formdata) FROM /home/sample/sample.txt WITH DELIMITER Œ|¹ NULL
Œ¹ NEWLINE '0xaa';

Sample with 2 identical rows (with binary representations depicted between
<>):
Blahblah<0xaa>blahblah<0x09>blahblah<0x00>blahblah<0xaa>Blahblah<0xaa>blahbl
ah<0x09>blahblah<0x00>blahblah<0xaa>

- Luke


From: Alvaro Herrera <alvherre(at)surnet(dot)cl>
To: Alon Goldshuv <agoldshuv(at)greenplum(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 03:41:29
Message-ID: 20050602034129.GA32286@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 01, 2005 at 10:35:30AM -0700, Alon Goldshuv wrote:

> 2) A modified command syntax for introducing a direct single row error
> handling. By direct I mean - a row that if rejected from within the COPY
> command context does not throw an error and rollsback the whole transaction.
> Instead the error is caught and recorded elsewhere, maybe in some error
> table, with some more information that can later on be retrieved. The
> following rows continue to be processed. This way there is barely any error
> handling overhead.

Is there any idea on exactly how would this be done? Do you plan on
using savepoints to implement it? I fail to see how is this "barely any
overhead". Savepoints are not that expensive but they are not free either.
(No, I haven't measured it.)

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)


From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 05:16:34
Message-ID: BEC3E442.6C81%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Steve,

Oops. Example below should have read differently:

> Sample with 2 identical rows (with binary representations depicted between
> <>):
> Blahblah<0xaa>blahblah<0x09>blahblah<0x00>blahblah<0xaa>Blahblah<0xaa>blahbl
> ah<0x09>blahblah<0x00>blahblah<0xaa>

Blahblah<0x0a>blahblah<0x09>blahblah<0x00>blahblah<0xaa>Blahblah<0x0a>blahbl
ah<0x09>blahblah<0x00>blahblah<0xaa>

This would result in the load of two records each of which would look like
this (when printed on a typical terminal):

Blahblah
blahblah blahblah

Luke


From: "Alon Goldshuv" <agoldshuv(at)greenplum(dot)com>
To: "Alvaro Herrera" <alvherre(at)surnet(dot)cl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NOLOGGING option, or ?
Date: 2005-06-02 16:54:56
Message-ID: BEC487F0.51BA%agoldshuv@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> 2) A modified command syntax for introducing a direct single row error
>> handling. By direct I mean - a row that if rejected from within the COPY
>> command context does not throw an error and rollsback the whole transaction.
>> Instead the error is caught and recorded elsewhere, maybe in some error
>> table, with some more information that can later on be retrieved. The
>> following rows continue to be processed. This way there is barely any error
>> handling overhead.
>
> Is there any idea on exactly how would this be done? Do you plan on
> using savepoints to implement it? I fail to see how is this "barely any
> overhead". Savepoints are not that expensive but they are not free either.
> (No, I haven't measured it.)

Good question, I am not entirely sure if this is possible yet, as I didn't
think it through entirely yet.

I guess data errors could be divided into 2 main categories: mal-formed data
where error is detected even before forming a tuple, and the other is errors
that are caused by some constraint violation, that is, after the tuple is
formed and inserted.

From what I hear and experience the big majority of errors are of the first
type. In that case the error could be caught, the data line + line number +
error description could be inserted into an ERROR table (all TEXT fields),
and then COPY can skip forming a tuple, and move to parsing the next. In
this process there is barely any overhead.

The more difficult part obviously is handling the second error type, which I
haven't looked at yet deeply. Hopefully it is not impossible to do while
keeping transaction integrity (Any ideas anyone?). The overhead for this one
will probably be larger, but again, we expect those to happen less (in most
cases at least). Nevertheless, it is surely much faster than recursively
narrowing down batch sizes.

Alon.