Re: Incorrect cursor behaviour with gist index

Lists: pgsql-hackers
From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
Subject: Incorrect cursor behaviour with gist index
Date: 2008-10-17 18:33:20
Message-ID: 48F8DA70.7010804@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'm back, sorry for a long absence.

About this bug: http://archives.postgresql.org/pgsql-bugs/2008-09/msg00086.php

Unfortunately, GiST index doesn't work with change direction of scan. I.e. it
can't move forward then move backward and this behaviour was from the beginning.

I think it's fixable, although GiST doesn't store on page left links (only right
links) and doesn't store parent page pointer. That's needed because matched
entries in GiST isn't stored consecutively unlike to BTree. So, price for it
will be storing in memory or stack all numbers of visited pages in index even
they will not be used. In worst case it's 2^32 of GISTSearchStack. Right now it
occupies 24-32 bytes depending on sizeof(void*), and it will be needed to add
one more pointer to make double-linked stack. So, about 32Mb. Is it acceptable?

Nevertheless, it doesn't solve problem with concurrent page splitting what can
cause different order between forward and backward scan in one cursor.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
Subject: Re: Incorrect cursor behaviour with gist index
Date: 2008-10-17 19:21:08
Message-ID: 28085.1224271268@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> I'm back, sorry for a long absence.
> About this bug: http://archives.postgresql.org/pgsql-bugs/2008-09/msg00086.php

> Unfortunately, GiST index doesn't work with change direction of scan. I.e. it
> can't move forward then move backward and this behaviour was from the beginning.

> I think it's fixable, although GiST doesn't store on page left links (only right
> links) and doesn't store parent page pointer. That's needed because matched
> entries in GiST isn't stored consecutively unlike to BTree. So, price for it
> will be storing in memory or stack all numbers of visited pages in index even
> they will not be used. In worst case it's 2^32 of GISTSearchStack. Right now it
> occupies 24-32 bytes depending on sizeof(void*), and it will be needed to add
> one more pointer to make double-linked stack. So, about 32Mb. Is it acceptable?

> Nevertheless, it doesn't solve problem with concurrent page splitting what can
> cause different order between forward and backward scan in one cursor.

Seems like a lotta work for a partial solution :-(. Probably the path
of least resistance is to teach the planner that only some index AMs can
do backwards scan. That would result in a Materialize buffer getting
placed in front of the query if the user demanded scroll capability,
but it would cost nothing in the more typical case where backwards scan
isn't needed.

It should be sufficient to specify this in pg_am, right? Or could the
opclass or indexkey details affect it?

BTW, can GIN do backwards scan?

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
Subject: Re: Incorrect cursor behaviour with gist index
Date: 2008-10-17 20:41:58
Message-ID: 48F8F896.8030409@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Seems like a lotta work for a partial solution :-(. Probably the path
> of least resistance is to teach the planner that only some index AMs can
> do backwards scan. That would result in a Materialize buffer getting
> placed in front of the query if the user demanded scroll capability,
> but it would cost nothing in the more typical case where backwards scan
> isn't needed.

Probably, it will be a better solution. In this case GiST code could be
simplified - remove support of backward scan (in any case not fully workable)

>
> It should be sufficient to specify this in pg_am, right? Or could the
> opclass or indexkey details affect it?

I don't see any examples where it depends on opclass or index key, that's is a
AM property.

>
> BTW, can GIN do backwards scan?
No, at all.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
Subject: Re: Incorrect cursor behaviour with gist index
Date: 2008-10-17 20:51:01
Message-ID: 10496.1224276661@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> Seems like a lotta work for a partial solution :-(. Probably the path
>> of least resistance is to teach the planner that only some index AMs can
>> do backwards scan. That would result in a Materialize buffer getting
>> placed in front of the query if the user demanded scroll capability,
>> but it would cost nothing in the more typical case where backwards scan
>> isn't needed.

> Probably, it will be a better solution. In this case GiST code could be
> simplified - remove support of backward scan (in any case not fully workable)

Okay. I'll go fix the core code, and you can take out whatever you want
in GiST/GIN.

BTW, I think that the support for ammarkpos/amrestrpos in these two AMs
is pretty much useless code as well. We only use mark/restore in merge
joins, and so it only needs to be supported by plan nodes that can
deliver sorted output, which GiST/GIN indexscans can't. Furthermore,
even if we had another use for the facility, it'd be pretty questionable
to use it when the AM can't guarantee to return the same sequence of
tuples after backing up. So I think it would be sufficient to have
gistmarkpos et al throw error if called.

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
Subject: Re: Incorrect cursor behaviour with gist index
Date: 2008-10-17 21:04:08
Message-ID: 48F8FDC8.1070705@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> to use it when the AM can't guarantee to return the same sequence of
> tuples after backing up. So I think it would be sufficient to have
> gistmarkpos et al throw error if called.

Why not to remove gistrestrpos/gistmarkpos/ginrestrpos/ginmarkpos from pg_am table?

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
Subject: Re: Incorrect cursor behaviour with gist index
Date: 2008-10-17 21:21:25
Message-ID: 18172.1224278485@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> to use it when the AM can't guarantee to return the same sequence of
>> tuples after backing up. So I think it would be sufficient to have
>> gistmarkpos et al throw error if called.

> Why not to remove gistrestrpos/gistmarkpos/ginrestrpos/ginmarkpos from pg_am table?

First, because that would mean adding code to the indexam.c functions to
avoid crashing, and second because then we'd have to force initdb to
change our minds about this. I think having stub functions that throw
errors, rather than no catalog entry at all, is cheap future-proofing.

regards, tom lane


From: Martin Schäfer <Martin(dot)Schaefer(at)cadcorp(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Teodor Sigaev" <teodor(at)sigaev(dot)ru>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Incorrect cursor behaviour with gist index
Date: 2008-10-20 07:26:16
Message-ID: 38D7001B29E62F469A4CB27B323584EF9A18@dev011_ex.Dev.cadcorp.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Okay. I'll go fix the core code, and you can take out
> whatever you want in GiST/GIN.

Which PostgreSQL versions will contain the fix?

Regards,

Martin Schaefer