Quick Links

Re: A thought on Index Organized Tables

From:	Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: A thought on Index Organized Tables
Date:	2010-02-23 13:47:42
Message-ID:	9362e74e1002230547i39e232fdj8bd64474135d4af@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Feb 23, 2010 at 10:42 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > Instead, how about excluding columns in primary keys from table data?
>
> How will you implement "select * from mytable" ? Or even
> "select * from mytable where non_primary_key = something" ?
> If you can't do either of those without great expense, I think
> a savings on primary-key lookups is not going to be adequate
> recompense.
>

Tom,
I am talking things more from the perspective of how things have got
implemented in Oracle/SQL Server. Both Oracle and SQL Server store the
snapshot info with indexes and hence can do index-only scans with their
indexes. But still they have the concept of Index Organized Tables /
Clustered Indexes. Apart from the disk footprint, it will have an impact on
the cache efficiency also.
In Oracle IOT and SQL Server Clustered Indexes, you have an option to
store some of the columns in the leaf pages( but not in the non-leaf pages)
and hence the tuples won't get sorted based on them, but you don't require
an extra i/o to access them. This optimization is again to reduce the size
of IOT. Oracle IOT has a concept called overflow regions, which is more like
a heap and will store a few columns. There will be a pointer from main
b-tree structure to this secondary structure. Accessing these columns are
costly, but the idea is that the database designer has taken this into
account while deciding on the columns to be put in the overflow regions.
We can design secondary indexes to make the access faster for
non-primary key based searches. But since the Secondary indexes store
primary key in the place of HeapTuple Pointer, the access will usually take
2-3 more i/os. But the idea is that the IOT is for those kind of data. which
will be 99% queried based on primary keys. The database provides that extra
performance for that kind of access patterns. So to answer your question,
full table scans(if overflow regions are involved) and search based on
non-primary keys will be slow in an IOT.
I looked at the postgres nbtree code. From my analysis(which might
be wrong!), we can implement IOTs, provided we make a decision on broken
data types issue.

Thanks,
Gokul.

In response to

Re: A thought on Index Organized Tables at 2010-02-23 05:12:46 from Tom Lane

Responses

Re: A thought on Index Organized Tables at 2010-02-23 14:58:17 from Andrew Dunstan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alvaro Herrera	2010-02-23 14:34:13	Re: tie user processes to postmaster was:(Re: [HACKERS] scheduler in core)
Previous Message	Csaba Nagy	2010-02-23 12:00:47	Re: A thought on Index Organized Tables