Re: HOT for PostgreSQL 8.3

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Hannu Krosing" <hannu(at)skype(dot)net>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, mark(at)mark(dot)mielke(dot)cc, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Pavan Deolasee" <pavan(dot)deolasee(at)enterprisedb(dot)com>, "Nikhil S" <nikhil(dot)sontakke(at)enterprisedb(dot)com>
Subject: Re: HOT for PostgreSQL 8.3
Date: 2007-02-16 15:49:24
Message-ID: 2e78013d0702160749h3606b3a5ne9815535309cf5f3@mail.gmail.com
Lists: pgsql-hackers

On 2/16/07, Hannu Krosing <hannu(at)skype(dot)net> wrote:
>
> On one fine day, Wed, 2007-02-14 at 10:41, Tom Lane wrote:
> > Hannu Krosing <hannu(at)skype(dot)net> writes:
> > > OTOH, for same page HOT tuples, we have the command and trx ids stored
> > > twice first as cmax,xmax of the old tuple and as cmin,xmin of the
> > > updated tuple. One of these could probably be used for in-page HOT
> > > tuple pointer.
> >
> > This proposal seems awfully fragile, because the existing
> > tuple-chain-following logic *depends for correctness* on comparing each
> > tuple's xmin to prior xmax.
>
> What kinds of correctness guarantees does this give for same-page tuples?

I agree with Tom that the xmin/xmax check does help to guarantee correctness.
I myself have used it often during HOT development to find/fix bugs. But
ISTM that we don't need it, at least for in-page tuple chains, if we are
careful. So if removing this buys us something important, I am all for it.
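
For reference, the cross-check under discussion can be sketched roughly as
below. This is only an illustration with a simplified, hypothetical tuple
layout: ToyTuple, follow_chain, INVALID_XID and NO_NEXT are made-up names,
not PostgreSQL's actual structures or functions. The walk stops as soon as
the next tuple's xmin does not match the current tuple's xmax, which is how
a reused slot would be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real definitions. */
typedef uint32_t Xid;
#define INVALID_XID 0u
#define NO_NEXT (-1)

typedef struct {
    Xid xmin;   /* transaction that created this tuple version */
    Xid xmax;   /* transaction that updated it, or INVALID_XID */
    int next;   /* array index of the next version, or NO_NEXT */
} ToyTuple;

/*
 * Follow the update chain starting at `start` and return the index of the
 * last version that passes the cross-check. The walk stops when the next
 * tuple's xmin differs from the current tuple's xmax, i.e. when the slot
 * being pointed at no longer holds the successor version.
 */
static int follow_chain(const ToyTuple *tuples, size_t ntuples, int start)
{
    assert(start >= 0 && (size_t) start < ntuples);
    int cur = start;

    while (tuples[cur].next != NO_NEXT) {
        int nxt = tuples[cur].next;
        assert(nxt >= 0 && (size_t) nxt < ntuples);

        /* The xmin-to-prior-xmax comparison. */
        if (tuples[cur].xmax == INVALID_XID ||
            tuples[nxt].xmin != tuples[cur].xmax)
            break;

        cur = nxt;
    }
    return cur;
}
```

Dropping the check for in-page chains would mean the loop trusts the `next`
link unconditionally, so the page-level code would have to guarantee that a
slot in the chain is never reused while something still points at it.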

> The comparing of each tuple's xmin to prior xmax should stay for
> inter-page ctid links.

Agree.

> Mostly you can think of the same-page HOT chain as one extended tuple
> when looking at it from outside of that page.
>
> > I don't think you can just wave your hands and say we don't need that
> > cross-check.
>
> > Furthermore it seems to me you
> > haven't fixed the problem, which is that you can't remove the chain
> > member that is being pointed at by off-page links (either index entries
> > or a previous generation of the same tuple).
>
> You can't remove any tuples before they are invisible for all
> transactions (i.e. dead). And being dead implies that all previous
> versions are dead as well. So if I can remove a tuple, I can also remove
> all its previous versions as well. Or are you trying to say that VACUUM
> follows ctid links of dead tuples for some purpose ?

The only exception to this would be the case of aborted updates. In that
case a tuple is dead, but the one pointing to it is still live. But I don't
see any reason why somebody would want to follow a chain past a live tuple.
I'm not sure about the VACUUM FULL code path though. That's the only place
other than EvalPlanQual where we follow the ctid chain.

> The problem I am trying to fix is reusing in-page space without need to
> touch indexes.

Can we do some kind of indirection from the root line pointer? I haven't
thought it through completely yet, but the basic idea is to release the
actual space consumed by the root tuple once it becomes dead, but store the
offnum of the new root in the line pointer of the original root tuple. We
may need to flag the line pointer for that, but if I am not wrong, LP_DELETE
is not used for heap tuples.

We would waste 4 bytes of line pointer until the tuple is COLD updated and
the entire chain and the associated index entry are removed.
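
To make the indirection idea a bit more concrete, here is a rough sketch.
Everything in it is an assumption for illustration: ToyLinePointer, the
TOY_LP_* flag values, toy_redirect_root and toy_resolve are hypothetical
names, and the bit-field layout is only loosely modelled on a 4-byte line
pointer. Offset numbers are 1-based here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; not the real lp_flags bits. */
enum { TOY_LP_UNUSED = 0, TOY_LP_NORMAL = 1, TOY_LP_REDIRECT = 2 };

/* Hypothetical line pointer, loosely modelled on an offset/flags/length triple. */
typedef struct {
    unsigned off   : 15;  /* NORMAL: byte offset of tuple; REDIRECT: offnum of new root */
    unsigned flags : 2;   /* one of the TOY_LP_* values */
    unsigned len   : 15;  /* NORMAL: tuple length; REDIRECT: 0, space released */
} ToyLinePointer;

/*
 * The root tuple at `root` is dead: give up its tuple space and make its
 * line pointer refer to `new_root` instead. Index entries keep pointing at
 * `root`, so no index needs to be touched.
 */
static void toy_redirect_root(ToyLinePointer *lps, size_t nlps,
                              uint16_t root, uint16_t new_root)
{
    assert(root >= 1 && root <= nlps);
    assert(new_root >= 1 && new_root <= nlps && new_root != root);
    assert(lps[new_root - 1].flags == TOY_LP_NORMAL);

    lps[root - 1].flags = TOY_LP_REDIRECT;
    lps[root - 1].off = new_root;
    lps[root - 1].len = 0;
}

/*
 * Map the offnum stored in an index entry to the offnum that actually
 * holds a tuple. Returns 0 if the slot is unused.
 */
static uint16_t toy_resolve(const ToyLinePointer *lps, size_t nlps,
                            uint16_t offnum)
{
    assert(offnum >= 1 && offnum <= nlps);
    const ToyLinePointer *lp = &lps[offnum - 1];

    if (lp->flags == TOY_LP_REDIRECT)
        return (uint16_t) lp->off;
    if (lp->flags == TOY_LP_NORMAL)
        return offnum;
    return 0;
}
```

In this sketch the redirected slot stays allocated until the whole chain
goes away, which is the line pointer overhead mentioned above.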

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com
