Re: Get the offset of a tuple inside a table

Lists: pgsql-hackers
From: Pei He <hepeimail(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Get the offset of a tuple inside a table
Date: 2010-09-21 21:02:53
Message-ID: AANLkTiksU2kiKXfyePjuGLYSLGiaK7Uwenif26W_uSSX@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,
When I using an index scan, can I get the offset of the tuple in the table?

Thanks
--
Pei


From: Szymon Guz <mabewlun(at)gmail(dot)com>
To: Pei He <hepeimail(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get the offset of a tuple inside a table
Date: 2010-09-21 21:44:54
Message-ID: AANLkTim3hSLL91bZHiBKJji_Er0OLhB9oGAe-0C8U-xH@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 21 September 2010 23:02, Pei He <hepeimail(at)gmail(dot)com> wrote:

> Hi,
> When I using an index scan, can I get the offset of the tuple in the table?
>
> Thanks
> --
> Pei
>

What do you mean by "the offset in the table"?


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Pei He" <hepeimail(at)gmail(dot)com>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Get the offset of a tuple inside a table
Date: 2010-09-21 22:47:53
Message-ID: 4C98EFC90200002500035AA1@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pei He <hepeimail(at)gmail(dot)com> wrote:

> When I using an index scan, can I get the offset of the tuple in
> the table?

Well, regardless of the type of scan, you can select the ctid
column, which gives the block number and the tuple number within the
block.

-Kevin


From: Pei He <hepeimail(at)gmail(dot)com>
To: Szymon Guz <mabewlun(at)gmail(dot)com>, Kevin(dot)Grittner(at)wicourts(dot)gov
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get the offset of a tuple inside a table
Date: 2010-09-22 00:37:42
Message-ID: AANLkTinzKoTUKEJhYgOSPjuamxKNwcz+mJzOu2Z4x3cx@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Szymon, Kevin,
The offset is the order of a tuple in a Sequential Scan.

What I want to do is, for a given key return the tuples that Index scan can
find, and return the rest tuples by a seq scan. So, I need to know which
tuples have been returned by maintain a bitmap, and to avoid return the same
tuple twice.

If I can know the offset of a tuple in the order of file scan, then I can
force the seq scan to skip it.

By ctid, can I know how much tuples are in a block, is it an constant?

I think a more general solution would be build a hash table on ctid.
What do you think?

Thanks
--
Pei

On Tue, Sep 21, 2010 at 5:44 PM, Szymon Guz <mabewlun(at)gmail(dot)com> wrote:

>
>
> On 21 September 2010 23:02, Pei He <hepeimail(at)gmail(dot)com> wrote:
>
>> Hi,
>> When I using an index scan, can I get the offset of the tuple in the
>> table?
>>
>> Thanks
>> --
>> Pei
>>
>
> What do you mean by "the offset in the table"?
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pei He <hepeimail(at)gmail(dot)com>
Cc: Szymon Guz <mabewlun(at)gmail(dot)com>, Kevin(dot)Grittner(at)wicourts(dot)gov, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get the offset of a tuple inside a table
Date: 2010-09-22 01:32:12
Message-ID: 13464.1285119132@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pei He <hepeimail(at)gmail(dot)com> writes:
> What I want to do is, for a given key return the tuples that Index scan can
> find, and return the rest tuples by a seq scan. So, I need to know which
> tuples have been returned by maintain a bitmap, and to avoid return the same
> tuple twice.

> If I can know the offset of a tuple in the order of file scan, then I can
> force the seq scan to skip it.

As pointed out, "offset" is an unworkable concept here. That's why the
tidbitmap code doesn't work with offsets; it works with tids. You don't
really need to reinvent this wheel. Go read the bitmapscan code.

(One wonders though what you think you are going to save if you have to
do a seqscan anyway. Where's the advantage over just doing a seqscan?)

regards, tom lane


From: Pei He <hepeimail(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Szymon Guz <mabewlun(at)gmail(dot)com>, Kevin(dot)Grittner(at)wicourts(dot)gov, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get the offset of a tuple inside a table
Date: 2010-09-22 02:58:49
Message-ID: AANLkTimZf-mvT1ZPqYmosnbSwZRe_5EsC9tC0Lh4wGvJ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Tom,
The bitmapset works for me.

I want to implement the operator for the following query:

Select * from a left join b on a.id = b.id order by b.id;

In a left outer join, I want the tuples that have matches in the inner table
appear first. So, the order by clause is need.

If there is a index on a.id, I can use the tuples in b to probe the index.

After return all the tuples retrieved through index, it needs to return the
rest tuples in a, because it is a left outer join in the query.

What I need to do is remember what have been returned by the index, and
avoid to return it twice.

The bitmapscan needs to remember what have to been retrieved later, so it
used the tidbitmap. But, for me, I need the bitmapset.

Thanks, your reply helps me to find the bitmapset.

Regards
--
Pei

On Tue, Sep 21, 2010 at 9:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Pei He <hepeimail(at)gmail(dot)com> writes:
> > What I want to do is, for a given key return the tuples that Index scan
> can
> > find, and return the rest tuples by a seq scan. So, I need to know which
> > tuples have been returned by maintain a bitmap, and to avoid return the
> same
> > tuple twice.
>
> > If I can know the offset of a tuple in the order of file scan, then I can
> > force the seq scan to skip it.
>
> As pointed out, "offset" is an unworkable concept here. That's why the
> tidbitmap code doesn't work with offsets; it works with tids. You don't
> really need to reinvent this wheel. Go read the bitmapscan code.
>
> (One wonders though what you think you are going to save if you have to
> do a seqscan anyway. Where's the advantage over just doing a seqscan?)
>
> regards, tom lane
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pei He <hepeimail(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Szymon Guz <mabewlun(at)gmail(dot)com>, Kevin(dot)Grittner(at)wicourts(dot)gov, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get the offset of a tuple inside a table
Date: 2010-09-22 10:31:13
Message-ID: AANLkTi=5YO49BCjay+E07cKFH=23_J3KA-gnO1Y7Q_p1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Sep 21, 2010 at 10:58 PM, Pei He <hepeimail(at)gmail(dot)com> wrote:
> Hi Tom,
> The bitmapset works for me.
>
> I want to implement the operator for the following query:
>
> Select * from a left join b on a.id = b.id order by b.id;
>
> In a left outer join, I want the tuples that have matches in the inner table
> appear first. So, the order by clause is need.

Why can't you just write SELECT * FROM a LEFT JOIN b ON a.id = b.id
ORDER BY b.id NULLS FIRST? "I want my query results in a different
order" is almost never something that requires modifying the source
code.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company