Re: Index only scans

Lists: pgsql-hackers
From: Shrish Purohit <shrish_purohit(at)persistent(dot)co(dot)in>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Sharmila Jothirajah <sjothirajah(at)regenstrief(dot)org>, Mahesh Nalkande <mahesh_nalkande(at)persistent(dot)co(dot)in>, Arvind Hulgeri <arvind_hulgeri(at)persistent(dot)co(dot)in>, Sameer Pendharkar <sameer_pendharkar(at)persistent(dot)co(dot)in>
Subject: Index only scans
Date: 2010-05-31 15:09:18
Message-ID: 14B8F27D7CE40C4C9E481B2B845C2E0D06445FDCDC@EXCHANGE.persistent.co.in
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Heikki, Pgsql-Hackers,

Sometime back you have started with "Separate Heap Fetch from Index Scan" which was planned to support partial index only scans. Are you still working on it or do you know someone still working on it?

We did some development with Gokul's Index Only Patch and have got good performance numbers which are as follows:

Test table constitutes 0.5 billion records with thick index on (id,aid) on three machines {pg_normal , pg_enhanced( PGSQL with thick index feature ), Oracle} each having 16 Gb Ram. Disk I/O obtained using sar.

testdb=# \d test
Table "public.test"
Column | Type | Modifiers
-----------+------------------+-----------
id | integer |
startdate | date |
enddate | date |
charge | double precision |
firstname | text |
lastname | text |
aid | double precision |
bid | double precision |
Indexes:
"taid" THICK btree (id, aid) CLUSTER

Index size
On oracle 15.20 Gb
On Pg-normal 14.73 Gb
Pg_enhanced 23.17 Gb (16bytes*0.5billion = ~7.6 GB)

PFA excel sheet for details. In general we saw fair amount of performance improvement, but one thing that surprises us is that after around 20% tuples updated we found oracle taking more time.

Regards,
Shrish Purohit |Senior Software Engineer|Persistent Systems
shrish_purohit(at)persistent(dot)co(dot)in |Cell:+91-9850-959-940|Tel:+91(20)302-34493
Innovation in software product design, development and delivery- www.persistentsys.com

DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.

Attachment Content-Type Size
Thick_index_perf_numbers.xlsx application/vnd.openxmlformats-officedocument.spreadsheetml.sheet 36.9 KB

From: Sharmila Jothirajah <sjothirajah(at)regenstrief(dot)org>
To: 'Shrish Purohit' <shrish_purohit(at)persistent(dot)co(dot)in>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Sharmila Jothirajah" <sjothirajah(at)regenstrief(dot)org>, Mahesh Nalkande <mahesh_nalkande(at)persistent(dot)co(dot)in>, Arvind Hulgeri <arvind_hulgeri(at)persistent(dot)co(dot)in>, Sameer Pendharkar <sameer_pendharkar(at)persistent(dot)co(dot)in>, Cyril Colvard <ccolvard(at)regenstrief(dot)org>
Subject: Re: Index only scans
Date: 2010-05-31 15:19:15
Message-ID: 673107EFBDBE53419C100C0A56E118CA0EF2372532@rg-exchange.RII.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Shrish,
I saw the excel that you sent to Heikki. Here are my comments.

1. Don't worry about comparing the results with Oracle. Just take pg-normal and pg-enhanced. That's what the community cares about. Later we can also add Oracle's performance.
2. In the excel sheet you have 'simple queries' and 'simple queries repeated'...you have to compile them together.
3. Also for pg-normal, the queries are run with seq scan and not forcing 'index scan' . If you compare that run with pg-enhanced it will definitely look bad. You have to compare pg-enh with pg-normal's indexscan (and also seq scan). That will give a complete picture.

These performance numbers that we sent them is very important. Make sure the excel is formatted and is very clear before your sent the numbers. Its hard to get the community to respond if our reports are not clear

Thanks
Sharmila

-----Original Message-----
From: Shrish Purohit [mailto:shrish_purohit(at)persistent(dot)co(dot)in]
Sent: Monday, May 31, 2010 11:09 AM
To: Heikki Linnakangas
Cc: pgsql-hackers(at)postgresql(dot)org; Sharmila Jothirajah; Mahesh Nalkande; Arvind Hulgeri; Sameer Pendharkar
Subject: Index only scans

Hi Heikki, Pgsql-Hackers,

Sometime back you have started with "Separate Heap Fetch from Index Scan" which was planned to support partial index only scans. Are you still working on it or do you know someone still working on it?

We did some development with Gokul's Index Only Patch and have got good performance numbers which are as follows:

Test table constitutes 0.5 billion records with thick index on (id,aid) on three machines {pg_normal , pg_enhanced( PGSQL with thick index feature ), Oracle} each having 16 Gb Ram. Disk I/O obtained using sar.

testdb=# \d test
Table "public.test"
Column | Type | Modifiers
-----------+------------------+-----------
id | integer |
startdate | date |
enddate | date |
charge | double precision |
firstname | text |
lastname | text |
aid | double precision |
bid | double precision |
Indexes:
"taid" THICK btree (id, aid) CLUSTER

Index size
On oracle 15.20 Gb
On Pg-normal 14.73 Gb
Pg_enhanced 23.17 Gb (16bytes*0.5billion = ~7.6 GB)

PFA excel sheet for details. In general we saw fair amount of performance improvement, but one thing that surprises us is that after around 20% tuples updated we found oracle taking more time.

Regards,
Shrish Purohit |Senior Software Engineer|Persistent Systems shrish_purohit(at)persistent(dot)co(dot)in |Cell:+91-9850-959-940|Tel:+91(20)302-34493
Innovation in software product design, development and delivery- www.persistentsys.com

DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Shrish Purohit <shrish_purohit(at)persistent(dot)co(dot)in>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Sharmila Jothirajah <sjothirajah(at)regenstrief(dot)org>, Mahesh Nalkande <mahesh_nalkande(at)persistent(dot)co(dot)in>, Arvind Hulgeri <arvind_hulgeri(at)persistent(dot)co(dot)in>, Sameer Pendharkar <sameer_pendharkar(at)persistent(dot)co(dot)in>
Subject: Re: Index only scans
Date: 2010-06-01 07:34:02
Message-ID: 4C04B7EA.9010904@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 31/05/10 18:09, Shrish Purohit wrote:
> Sometime back you have started with "Separate Heap Fetch from Index Scan" which was planned to support partial index only scans. Are you still working on it or do you know someone still working on it?

I'm not working on it right now, but hopefully I'll have the time to
revisit that for 9.1. If anyone else wants to pick it up, be my guest,
I'll be glad to review and join the discussions.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com