Re: CREATE TABLE, load and freezing

Lists: pgsql-hackers
From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: CREATE TABLE, load and freezing
Date: 2008-02-28 09:08:51
Message-ID: 2e78013d0802280108s59e52e14hae032e1e67409d6c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I had this idea sometime back. Not sure if this has been discussed before

In a typical scenario, user might create a table and load data in the table as
part of a single transaction (e.g pg_restore). In this case, it would help if we
create the tuples in the *frozen* state to avoid any wrap-around related issues
with the table. Without this, very large read-only tables would
require one round of
complete freezing if there are lot of transactional activities in the
other parts
of the database. And when that happens, it would generate lots of unnecessary
IOs on these large tables.

I don't know if this is a real problem for anybody, but I could think
of its use case, at least in theory.

Is it worth doing ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 09:35:38
Message-ID: 47C6806A.3010505@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavan Deolasee wrote:
> In a typical scenario, user might create a table and load data in the
> table as part of a single transaction (e.g pg_restore). In this case,
> it would help if we create the tuples in the *frozen* state to avoid
> any wrap-around related issues with the table. Without this, very
> large read-only tables would require one round of complete freezing
> if there are lot of transactional activities in the other parts of
> the database. And when that happens, it would generate lots of
> unnecessary IOs on these large tables.
If that works, then we might also want to set the visibility hint bits.
Not because lookup of that information is expensive - the tuples all
came from the same transaction, virtually guaranteeing that the relevent
pg_clog page stays in memory after the first few pages.
But by setting them immediatly we'd save some IO, since we won't dirty
all pages during the first scan.

> I don't know if this is a real problem for anybody, but I could think
> of its use case, at least in theory.
A cannot speak for freeze-on-restore, but in a project I'm currently
working on, the IO caused (I guess) by hint-bit updates during the
first scan of the table is at least noticeably...

regards, Florian Pflug


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 09:55:07
Message-ID: 20080228183958.5F5A.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> wrote:

> In a typical scenario, user might create a table and load data in the table as
> part of a single transaction (e.g pg_restore). In this case, it would help if we
> create the tuples in the *frozen* state to avoid any wrap-around related issues
> with the table.

Sounds cool. I recommended users to do VACUUM FREEZE just after initial
loading, but we can avoid it with your method.

> Without this, very large read-only tables would require one round of
> complete freezing if there are lot of transactional activities in the other parts
> of the database. And when that happens, it would generate lots of unnecessary
> IOs on these large tables.

To make things worse, the freezing day comes at once because the first restore
is done in a single or near transactions; The wraparound timings of many
tables are aligned at the same time. Freezing copy will be the solution.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 10:01:00
Message-ID: 2e78013d0802280201j52b597fpcabf5bba298ff942@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 28, 2008 at 3:05 PM, Florian G. Pflug <fgp(at)phlo(dot)org> wrote:
>
> If that works, then we might also want to set the visibility hint bits.

Oh yes. Especially because random time-scattered index scans on
the table can actually generate multiple writes of a page of a
read-only table.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 10:16:50
Message-ID: 2e78013d0802280216of79c5fcw9a04fbedf14b2e1d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 28, 2008 at 3:25 PM, ITAGAKI Takahiro
<itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>
>
> Sounds cool. I recommended users to do VACUUM FREEZE just after initial
> loading, but we can avoid it with your method.
>

Yeah, and the additional step of VACUUM FREEZE adds up to the restore
time.

>
> To make things worse, the freezing day comes at once because the first restore
> is done in a single or near transactions; The wraparound timings of many
> tables are aligned at the same time. Freezing copy will be the solution.
>

If we can start with a freezed table and even if the table is
subsequently updated,
hopefully DSM (or something of that sort) will help us reduce the vacuum freeze
time whenever its required.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 10:20:21
Message-ID: 47C68AE5.1020707@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavan Deolasee wrote:
> On Thu, Feb 28, 2008 at 3:05 PM, Florian G. Pflug <fgp(at)phlo(dot)org> wrote:
>> If that works, then we might also want to set the visibility hint bits.
>
> Oh yes. Especially because random time-scattered index scans on
> the table can actually generate multiple writes of a page of a
> read-only table.

I remember that Simon tried to set hint bits as well when he wrote the
"skip WAL on new table" optimization, but there was some issues with it.
I can't remember the details, but I think it was related to commands in
the same transaction seeing the tuples too early. Like triggers, or
portals opened before the COPY.

Hint bits is the critical part of the issue. If you can set the hint
bits, then you can freeze as well, but freezing without setting hint
bits doesn't buy you much.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Florian Pflug <fgp(dot)phlo(dot)org(at)gmail(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 13:42:49
Message-ID: 47C6BA59.2030505@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro wrote:
>> Without this, very large read-only tables would require one round of
>> complete freezing if there are lot of transactional activities in the other parts
>> of the database. And when that happens, it would generate lots of unnecessary
>> IOs on these large tables.
>
> To make things worse, the freezing day comes at once because the first restore
> is done in a single or near transactions; The wraparound timings of many
> tables are aligned at the same time. Freezing copy will be the solution.

Hm.. Couldn't we eliminate that particular concern easily by adding some
randomness to the freeze_age?

regards, Florian Pflug


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-28 22:42:29
Message-ID: 47C6E474.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>> On Thu, Feb 28, 2008 at 3:08 AM, in message
<2e78013d0802280108s59e52e14hae032e1e67409d6c(at)mail(dot)gmail(dot)com>, "Pavan Deolasee"
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> I had this idea sometime back. Not sure if this has been discussed before

There was a thread discussing the problems you're looking to address:

http://archives.postgresql.org/pgsql-performance/2007-12/msg00230.php

> I don't know if this is a real problem for anybody, but I could think
> of its use case, at least in theory.

Yeah, it's real. We are now doing a VACUUM FREEZE of a table or
database which has been freshly loaded. If you can load them
frozen and/or with hint bits, that would reduce the time to bring
a database online. It would be much appreciated here.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-29 03:45:24
Message-ID: 21282.1204256724@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> In a typical scenario, user might create a table and load data in the
> table as part of a single transaction (e.g pg_restore). In this case,
> it would help if we create the tuples in the *frozen* state to avoid
> any wrap-around related issues with the table.

We've heard that idea before, and it's just as bad as it was when
proposed before. "Pre-frozen" tuples eliminate any possibility of
tracking when a tuple was inserted; which is extremely important to know
when you are trying to do forensic analysis of a broken table. The
point of the current design is to not throw away information about tuple
insertion time until the tuple is old enough that the info is (probably)
not interesting anymore.

regards, tom lane


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-29 04:06:27
Message-ID: 2e78013d0802282006w69f8f032y1a1af3b376ec4aec@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 29, 2008 at 9:15 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>
> We've heard that idea before, and it's just as bad as it was when
> proposed before. "Pre-frozen" tuples eliminate any possibility of
> tracking when a tuple was inserted; which is extremely important to know
> when you are trying to do forensic analysis of a broken table. The
> point of the current design is to not throw away information about tuple
> insertion time until the tuple is old enough that the info is (probably)
> not interesting anymore.
>

Understood. But if we consider a special case of creation and loading
of a table in a single transaction, we can possibly save the information
that the table was loaded with pre-frozen tuples with xmin equals to the
transaction creating the table.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CREATE TABLE, load and freezing
Date: 2008-02-29 11:28:49
Message-ID: 1204284529.4223.207.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2008-02-28 at 14:38 +0530, Pavan Deolasee wrote:

> I had this idea sometime back. Not sure if this has been discussed before

Check the archives for my post to hackers in Jan 2007 and subsequent
discussion. It's possible, just a little fiddly.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk