Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Fabrízio Mello <fabriziomello(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement
Date: 2015-07-08 06:20:01
Message-ID: CAFjFpRc00NcuASBOVhyxNs0s1iFrGeQ9FQZCfnZd=QH7peGYVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 8, 2015 at 1:27 AM, Fabrízio de Royes Mello <
fabriziomello(at)gmail(dot)com> wrote:

>
> On Fri, Jul 3, 2015 at 9:29 AM, Ashutosh Bapat <
> ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> >
> >
> > On Fri, Jul 3, 2015 at 4:48 PM, Fabrízio de Royes Mello <
> fabriziomello(at)gmail(dot)com> wrote:
> >>
> >>
> >> On Thu, Apr 2, 2015 at 3:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> >> >
> >> > On Wed, Mar 25, 2015 at 9:46 PM, Fabrízio de Royes Mello
> >> > <fabriziomello(at)gmail(dot)com> wrote:
> >> > >
> http://www.postgresql.org/message-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com
> >> >
> >> > I like the idea of the feature a lot, but the proposal to which you
> >> > refer here mentions some problems, which I'm curious how you think you
> >> > might solve. (I don't have any good ideas myself, beyond what I
> >> > mentioned there.)
> >> >
> >>
> >> You're right, and we have another design (steps 1 and 2 was implemented
> last year):
> >>
> >>
> >> *** ALTER TABLE changes
> >>
> >> 1) ATController
> >> 1.1) Acquire an AccessExclusiveLock
> (src/backend/commands/tablecmds.c - AlterTableGetLockLevel:3023)
> >>
> >>
> >> 2) Prepare to change relpersistence (src/backend/commands/tablecmds.c -
> ATPrepCmd:3249-3270)
> >> • check temp table (src/backend/commands/tablecmds.c -
> ATPrepChangePersistence:11074)
> >> • check foreign key constraints (src/backend/commands/tablecmds.c -
> ATPrepChangePersistence:11102)
> >>
> >>
> >> 3) FlushRelationBuffers, DropRelFileNodeBuffers and smgrimmedsync
> (MAIN_FORKNUM, FSM_FORKNUM, VISIBILITYMAP_FORKNUM and INIT_FORKNUM if
> exists)
> >>
> >>
> >> 4) Create a new fork called "TRANSIENT INIT FORK":
> >>
> >> • from Unlogged to Logged (create _initl fork) (INIT_TO_LOGGED_FORKNUM)
> >> ∘ new forkName (src/common/relpath.c) called "_initl"
> >> ∘ insert XLog record to drop it if transaction abort
> >>
> >> • from Logged to Unlogged (create _initu fork) (INIT_TO_UNLOGGED_FORKUM)
> >> ∘ new forkName (src/common/relpath.c) called "_initu"
> >> ∘ insert XLog record to drop it if transaction abort
> >
> >
> > AFAIU, while reading WAL, the server doesn't know whether the
> transaction that produced that WAL record aborted or committed. It's only
> when it sees a COMMIT/ABORT record down the line, it can confirm whether
> the transaction committed or aborted. So, one can only "redo" the things
> that WAL tells have been done. We can not "undo" things based on the WAL.
> We might record this fact "somewhere" while reading the WAL and act on it
> once we know the status of the transaction, but I do not see that as part
> of this idea. This comment applies to all the steps inserting WALs for
> undoing things.
> >
> >
>
> Even if I "xlog" the create/drop of the transient fork?
>
>
Yes.

>
>
> >> 5) Change the relpersistence in catalog (pg_class->relpersistence)
> (heap, toast, indexes)
> >>
> >>
> >> 6) Remove/Create INIT_FORK
> >>
> >> • from Unlogged to Logged
> >> ∘ remove the INIT_FORK and INIT_TO_LOGGED_FORK adding to the
> pendingDeletes queue
> >> • from Logged to Unlogged
> >> ∘ remove the INIT_TO_UNLOGGED_FORK adding to the pendingDeletes queue
> >> ∘ create the INIT_FORK using "heap_create_init_fork"
> >> ∘ insert XLog record to drop init fork if the transaction abort
> >>
> >>
> >>
> >> *** CRASH RECOVERY changes
> >>
> >> 1) During crash recovery
> (src/backend/access/transam/xlog.c:6507:ResetUnloggedRelations)
> >>
> >
> > This operation is carried out in two phases: one before replaying WAL
> records and second after that. Are you sure that the WALs generated for the
> unlogged or logged forks, as described above, have been taken care of?
> >
>
> Hummm... actually no...
>
>
>
> >>
> >> • if the transient fork "_initl" exists then
> >> ∘ drop the transient fork "_initl"
> >> ∘ if the init fork doesn't exist then create it
> >> ∘ reset relation
> >> • if the transient fork "_initu" exists then
> >> ∘ drop the transient fork "_initl"
> >> ∘ if the init fork exists then drop it
> >> ∘ don't reset the relation
> >>
> >
> > Consider case of converting unlogged to logged. The server crashes after
> 6th step when both the forks are removed. During crash recovery, it will
> not see any of the fork and won't reset the unlogged table. Remember the
> table is still unlogged since the transaction was aborted due to the crash.
> So, it has to have an init fork to be reset on crash recovery.
> >
> > Similarly while converting from logged to unlogged. The server crashes
> in the 6th step after creating the INIT_FORK, during crash recovery the
> init fork will be seen and a supposedly logged table will be trashed.
> >
>
>
> My intention for the 6th step is all recorded to wal, so if a crash occurs
> the recovery process clean the mess.
>
>
AFAIU, PostgreSQL recovery is based on "redo"ing WAL. What you described
earlier, "undo"ing based on the WAL does not fit in the current framework.

>
> > The ideas in 1 and 2 might be better than having yet another init fork.
> >
>
> The link for idea 2 is missing...
>

Sorry for that. 2nd idea is what Robert described using control file.

>
>
> >
> > 1. http://www.postgresql.org/message-id/533D457A.4030007@vmware.com
> >
>
> Unfortunately I completely missed this idea, but is also interesting. But
> I'll need to "stamp" in both ways (from unlogged to logged and vice-versa)
>
> During the recovery to check the status of a transaction can I use
> src/backend/access/transam/clog.c:TransactionIdGetStatus ? I'm asking
> because I don't know this part of code to much.
>

AFAIU, not unless all WALs (or atleast WALs till the commit/abort record of
the transaction in question) are replayed.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2015-07-08 06:35:42 Re: PL/pgSQL, RAISE and error context
Previous Message Amit Kapila 2015-07-08 05:39:59 Re: Freeze avoidance of very large table.