Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)

From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 07:40:48
Message-ID: 9362e74e0712162340j294c37e7q69d0c52b17acb614@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

Hi,
Currently we check for the existence of NULL values in the tuple and
set the has_null flag. If the has_null flag is set, the tuple stores
a null bitmap. What I propose is:

a) By modifying the functions, heap_form_tuple and heap_fill_tuple, we can
check whether all the nulls are trailing nulls. If all the nulls are
trailing nulls, then we will not set the has_null flag and we will not have
the null bitmap with the tuple.

b) While selecting the tuple, we will check whether the tuple offset equals
/ exceeds the length of the tuple and then mark the remaining attributes of
the tuple as null. To be exact, we need to modify the slot_deform_tuple in
order to achieve the same.
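
To make (a) and (b) concrete, here is a minimal sketch in Python. It is a toy
model, not backend code: a tuple is modelled as a list in which None stands
for NULL, and count_stored_attrs and needs_null_bitmap are hypothetical names
chosen for illustration.

```python
from typing import Optional, Sequence


def count_stored_attrs(values: Sequence[Optional[object]]) -> int:
    """Number of leading attributes left after dropping trailing NULLs."""
    n = len(values)
    while n > 0 and values[n - 1] is None:
        n -= 1
    return n


def needs_null_bitmap(values: Sequence[Optional[object]]) -> bool:
    """A bitmap is needed only if a NULL remains among the stored attributes."""
    stored = count_stored_attrs(values)
    return any(v is None for v in values[:stored])


print(count_stored_attrs([1, "a", None, None]))  # 2
print(needs_null_bitmap([1, "a", None, None]))   # False
print(needs_null_bitmap([1, None, "b", None]))   # True
```

In this model a four-column row (1, 'a', NULL, NULL) would be stored with two
attributes and no bitmap, while a row with a NULL between values still needs one.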

This may not give huge performance benefits, but as you may know, it will
help in reducing the disk footprint.

Looking forward to your comments.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 08:02:54
Message-ID: 9362e74e0712170002h25c5249ev43831f80e1d475af@mail.gmail.com

We can also implement the same for index tuples.

On Dec 17, 2007 1:10 PM, Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
wrote:

> Hi,
> Currently we check for the existence of NULL values in the tuple and
> set the has_null flag. If the has_null flag is set, the tuple stores
> a null bitmap. What I propose is:
>
> a) By modifying the functions, heap_form_tuple and heap_fill_tuple, we can
> check whether all the nulls are trailing nulls. If all the nulls are
> trailing nulls, then we will not set the has_null flag and we will not have
> the null bitmap with the tuple.
>
> b) While selecting the tuple, we will check whether the tuple offset
> equals / exceeds the length of the tuple and then mark the remaining
> attributes of the tuple as null. To be exact, we need to modify the
> slot_deform_tuple in order to achieve the same.
>
> This may not give huge performance benefits, but as you may know, it will
> help in reducing the disk footprint.
>
>
> Looking forward to your comments.
>
> --
> Thanks,
> Gokul.
> CertoSQL Project,
> Allied Solution Group.
> (www.alliedgroups.com)

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
Cc: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 09:23:44
Message-ID: 87k5ndptcv.fsf@oxford.xeocode.com

"Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com> writes:

> a) By modifying the functions, heap_form_tuple and heap_fill_tuple, we can
> check whether all the nulls are trailing nulls. If all the nulls are
> trailing nulls, then we will not set the has_null flag and we will not have
> the null bitmap with the tuple.

I think that would work. The only question is whether it's worth bothering
since we would have to check it on every heap_form_tuple. But I suspect it
might be possible to do it pretty cheaply or perhaps even for free. The extra
complexity would be pretty localized so I don't think that's a big downside.

> b) While selecting the tuple, we will check whether the tuple offset equals
> / exceeds the length of the tuple and then mark the remaining attributes of
> the tuple as null. To be exact, we need to modify the slot_deform_tuple in
> order to achieve the same.

Actually this already works. *_deform_tuple has to be able to deal with tables
to which people have added columns. In that case tuples inserted before the
columns were added will look just as you describe, with trailing columns
missing.
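
A rough Python model of that behaviour, assuming a tuple simply records how
many attributes it was stored with. The deform function here is a hypothetical
stand-in for illustration, not the real *_deform_tuple code, and None stands
for NULL.

```python
def deform(stored_values, table_natts):
    """Expand a stored tuple to the table's current column count.

    Attributes beyond the stored count read as NULL (None), which is how a
    row written before a column was added would look.
    """
    missing = table_natts - len(stored_values)
    if missing < 0:
        raise ValueError("tuple has more attributes than the table")
    return list(stored_values) + [None] * missing


# A row stored with two attributes, read from a table that now has four.
print(deform([1, "a"], 4))  # [1, 'a', None, None]
```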

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
Cc: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 09:58:59
Message-ID: 1197885539.12912.99.camel@ebony.site

On Mon, 2007-12-17 at 13:10 +0530, Gokulakannan Somasundaram wrote:

> Currently we check for the existence of NULL values in the tuple
> and set the has_null flag. If the has_null flag is set, the
> tuple stores a null bitmap. What I propose is:

Will this work for ALTER TABLE when adding and dropping columns?

Another idea is to store the bitmap from the first nullable column.

Some of these ideas have been discussed before, so I would check the
archives thoroughly. Most everything has if you look closely enough.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 13:23:25
Message-ID: 9362e74e0712170523x15b168c9ve65cf83d1956283a@mail.gmail.com

On Dec 17, 2007 3:28 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> On Mon, 2007-12-17 at 13:10 +0530, Gokulakannan Somasundaram wrote:
>
> > Currently we check for the existence of NULL values in the tuple
> > and set the has_null flag. If the has_null flag is set, the
> > tuple stores a null bitmap. What I propose is:
>
> Will this work for ALTER TABLE when adding and dropping columns?

Dropping columns is not an issue at all. When we add columns, they have
null values by default. If we want to set a default, Postgres applies it
only to new inserts. Can you think of any specific case?

>
>
> Another idea is to store the bitmap from the first nullable column.

This is a different idea. I like this. I will think about this also.

>
>
> Some of these ideas have been discussed before, so I would check the
> archives thoroughly. Most everything has if you look closely enough.

I have done a fair amount of searching in the archives, but if you remember
any relevant thread, please point me to it.

>
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>
>

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 13:24:34
Message-ID: 9362e74e0712170524o43b1c5b4nbf9a4461e6b7e60c@mail.gmail.com

Thanks. I agree with you.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 13:47:54
Message-ID: 47667E0A.4040008@dunslane.net

Gregory Stark wrote:
> "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com> writes:
>
>
>> a) By modifying the functions, heap_form_tuple and heap_fill_tuple, we can
>> check whether all the nulls are trailing nulls. If all the nulls are
>> trailing nulls, then we will not set the has_null flag and we will not have
>> the null bitmap with the tuple.
>>
>
> I think that would work. The only question is whether it's worth bothering
> since we would have to check it on every heap_form_tuple.
>
This strikes me as such a corner case that it's likely not to be worth it.

If you really want to save space along these lines, one better place to
start might be with mutable column ordering - see
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php . That
would mean that we would be able to move nullable columns physically to
the tail which in turn might help this suggestion have more effect.

cheers

andrew


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2007-12-17 14:12:57
Message-ID: 1197900777.12912.110.camel@ebony.site

On Mon, 2007-12-17 at 08:47 -0500, Andrew Dunstan wrote:

> This strikes me as such a corner case that it's likely not to be worth it.
>
> If you really want to save space along these lines, one better place to
> start might be with mutable column ordering - see
> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php . That
> would mean that we would be able to move nullable columns physically to
> the tail which in turn might help this suggestion have more effect.

Could be a good idea.

Currently on a 64-bit system we occupy 23 bytes for the row header, so any
table with more than 8 columns will cause the null bitmap to overflow
and for us to use another 8 bytes.

OP's idea could avoid that in many cases, so the saving isn't 1 byte; it
is fairly frequently going to be an 8 byte saving.
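
The arithmetic can be sketched as follows. This is a back-of-the-envelope
Python model that assumes one bitmap bit per column, rounded up to whole
bytes, and a header padded to the platform's maximum alignment (8 bytes on
64-bit, 4 bytes on 32-bit). header_size is a made-up helper, not a backend
function.

```python
HEADER_FIXED_BYTES = 23  # fixed part of the row header, as quoted above


def maxalign(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def bitmap_bytes(natts):
    """Bytes needed for a null bitmap with one bit per attribute."""
    return (natts + 7) // 8


def header_size(natts, has_bitmap, alignment):
    """Padded header size, with or without a null bitmap."""
    raw = HEADER_FIXED_BYTES + (bitmap_bytes(natts) if has_bitmap else 0)
    return maxalign(raw, alignment)


print(header_size(8, True, 8))   # 24: the bitmap fits in the padding byte
print(header_size(9, True, 8))   # 32: a second bitmap byte costs 8 bytes
print(header_size(9, False, 8))  # 24: no bitmap stored at all
print(header_size(9, True, 4))   # 28: on 32-bit the overflow costs 4 bytes
```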

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-17 14:52:59
Message-ID: 87r6hlnzjo.fsf@oxford.xeocode.com


"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:

> On Mon, 2007-12-17 at 08:47 -0500, Andrew Dunstan wrote:
>
>> This strikes me as such a corner case that it's likely not to be worth it.
>>
>> If you really want to save space along these lines, one better place to
>> start might be with mutable column ordering - see
>> http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php . That
>> would mean that we would be able to move nullable columns physically to
>> the tail which in turn might help this suggestion have more effect.

That would only be one factor in deciding how to arrange columns but you have
to decide what order to store them when you're creating the table. You can't
move them around tuple by tuple. Only when rewriting the whole table would you
be able to move them around.

My first thought on how to arrange the columns would be:

fixed-size not nullable
fixed-size nullable
all variable-sized

With this additional tweak you would want to change that to:

fixed-size not nullable
fixed-size nullable
variable-size not nullable
variable-size nullable

I don't think you would want to store variable-sized not nullable columns
before fixed-sized nullable columns because in the cases where they're not
null you want to be able to use the cached offsets.

There could be some other factors to the decision when it comes to alignment
though. It might be worth putting a nullable column before a not null column
if it lets you fix the alignment and it's rarely actually null.
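
As a sketch only, the second ordering amounts to a two-part sort key. The
Python below assumes each column is described by just a name, a fixed-size
flag and a nullable flag (the Column class is invented for illustration), and
it ignores the alignment considerations mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    fixed_size: bool
    nullable: bool


def storage_order(columns):
    """Fixed-size before variable-size; within each, NOT NULL before nullable."""
    return sorted(columns, key=lambda c: (not c.fixed_size, c.nullable))


cols = [
    Column("note", fixed_size=False, nullable=True),
    Column("id", fixed_size=True, nullable=False),
    Column("title", fixed_size=False, nullable=False),
    Column("score", fixed_size=True, nullable=True),
]
print([c.name for c in storage_order(cols)])  # ['id', 'score', 'title', 'note']
```

The sort is stable, so columns that fall in the same group keep their declared
order.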

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-17 18:35:43
Message-ID: 9362e74e0712171035q4c5638eajbb3be6b9bd3ec9bb@mail.gmail.com

Hi,
I made the fix and tested it today. It involved some 10-15 lines of code
changes. I will mail it tomorrow. Feel free to give suggestions on making the
fix more maintainable.
I have followed Gregory's advice in the fix: instead of changing
slot_deform_tuple, I have reduced the number-of-attributes field of the
HeapTupleHeader (during insertion), so that the trailing nulls are treated
the same as newly added columns. Thanks, Gregory.
Regarding the arrangement of the columns, my take is to leave it to the
user. Maybe we can put some kind of tuning hint in our documentation. I have
made this statement without thinking about other advantages, if any.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-18 19:17:34
Message-ID: 9362e74e0712181117ufe54b33y6ffcbcb28cc6eecc@mail.gmail.com

Hi,
I have currently completed the following:
a) If there are only trailing nulls in the heap, no null-bitmap gets stored
b) If there are trailing nulls in addition to nulls in between values in the
heap, then the trailing nulls are not added to the null-bitmap. I wouldn't
have done it, but it came almost free of cost
c) If there are only trailing nulls in the index, no null-bitmap gets stored
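
A toy Python illustration of cases (a) and (b), assuming the convention that
a set bit means the attribute is present and a clear bit means NULL.
build_null_bitmap is a hypothetical helper, None stands for NULL, and none of
this is taken from the patch itself.

```python
def build_null_bitmap(values):
    """Return (stored_natts, bitmap) with trailing NULLs left out.

    bitmap is None when no NULL remains among the stored attributes.
    """
    natts = len(values)
    while natts > 0 and values[natts - 1] is None:
        natts -= 1
    stored = values[:natts]
    if all(v is not None for v in stored):
        return natts, None
    bitmap = bytearray((natts + 7) // 8)
    for i, v in enumerate(stored):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)
    return natts, bytes(bitmap)


# Case (a): only trailing NULLs, so no bitmap is stored.
print(build_null_bitmap([1, 2, None, None]))
# Case (b): ten columns, but only three are stored, so one bitmap byte
# suffices instead of the two bytes that ten attributes would need.
print(build_null_bitmap([1, None, 3] + [None] * 7))
```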

The index part gave some issues and I hope I have fixed them. I am still
testing it (feeling sleepy :)), so I will post the patch as soon as I
complete testing.

Thanks,
Gokul.

On Dec 18, 2007 12:05 AM, Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
wrote:

> Hi,
> I made the fix and tested it today. It involved some 10-15 lines of
> code changes. I will mail it tomorrow. Feel free to give suggestions on
> making the fix more maintainable.
> I have followed Gregory's advice in the fix: instead of changing
> slot_deform_tuple, I have reduced the number-of-attributes field of the
> HeapTupleHeader (during insertion), so that the trailing nulls are treated
> the same as newly added columns. Thanks, Gregory.
> Regarding the arrangement of the columns, my take is to leave it to the
> user. Maybe we can put some kind of tuning hint in our documentation. I
> have made this statement without thinking about other advantages, if any.
>
>
>
> --
> Thanks,
> Gokul.
> CertoSQL Project,
> Allied Solution Group.
> (www.alliedgroups.com)
>

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-18 21:15:25
Message-ID: 5031.1198012525@sss.pgh.pa.us

"Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com> writes:
> I have currently completed the following:
> a) If there are only trailing nulls in the heap, no null-bitmap gets stored
> b) If there are trailing nulls in addition to nulls in between values in the
> heap, then the trailing nulls are not added to the null-bitmap. I wouldn't
> have done it, but it came almost free of cost
> c) If there are only trailing nulls in the index, no null-bitmap gets stored

> The index part gave some issues and I hope I have fixed them.

I doubt you have fixed it; I doubt it's *possible* to fix it without
significant rejiggering of IndexTuple representation. The problem is
that IndexTuple lacks a number-of-fields field, so there is no place
to indicate how many null bitmap bits you have actually stored.
I would suggest forgetting that part and submitting the part that
has some chance of getting accepted.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-19 00:40:19
Message-ID: 87lk7rjz4c.fsf@oxford.xeocode.com

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com> writes:
>> I have currently completed the following:
>> a) If there are only trailing nulls in the heap, no null-bitmap gets stored
>> b) If there are trailing nulls in addition to nulls in between values in the
>> heap, then the trailing nulls are not added to the null-bitmap. I wouldn't
>> have done it, but it came almost free of cost
>> c) If there are only trailing nulls in the index, no null-bitmap gets stored
>
>> The index part gave some issues and I hope I have fixed them.
>
> I doubt you have fixed it; I doubt it's *possible* to fix it without
> significant rejiggering of IndexTuple representation. The problem is
> that IndexTuple lacks a number-of-fields field, so there is no place
> to indicate how many null bitmap bits you have actually stored.
> I would suggest forgetting that part and submitting the part that
> has some chance of getting accepted.

I suspect there's also an awkward case that *does* need to be handled when you
insert a tuple which has a null column which you're leaving out of the tuple
but which appears in an index. You would have to make sure that the index
tuple has that datum listed as NULL even though it's entirely missing from the
heap tuple.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Gregory Stark" <stark(at)enterprisedb(dot)com>
Subject: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-19 18:00:57
Message-ID: 9362e74e0712191000t4257eea7kabcf97a1d99ef5f5@mail.gmail.com

I have submitted the first working patch for the trailing null optimization.
It currently does the following:
a) Doesn't store the null bitmap if the heap tuple / index tuple contains
only trailing nulls
b) In the heap tuple, the trailing nulls won't occupy space in the null bitmap.

The general design is as follows:
a) After checking for trailing nulls, I reduce the number-of-attributes
field, which gets stored in each heap tuple.
b) For the index, I have changed index_form_tuple to store the unaligned
total size in the size mask. While navigating through the index tuple, if
the offset exceeds the unaligned total size stored, then a null is returned.

Please review it and provide suggestions.

> >
> > I doubt you have fixed it; I doubt it's *possible* to fix it without
> > significant rejiggering of IndexTuple representation. The problem is
> > that IndexTuple lacks a number-of-fields field, so there is no place
> > to indicate how many null bitmap bits you have actually stored.
>

Actually I have made one change to the structure of IndexTupleData. Instead
of storing the aligned size in the size mask, I store the unaligned size,
that is, the size before the final MAXALIGN. The interface remains
unchanged: IndexTupleSize applies MAXALIGN before returning the size value.
The advantage of storing the unaligned size is that we can get both the
aligned size and the unaligned size (as
you may know). I have created two more macros to return the un-aligned size.
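
The idea can be sketched in Python as follows. This assumes that the low 13
bits of t_info hold the tuple size and that the maximum alignment is 8 bytes.
The helper names, in particular index_tuple_unaligned_size and
attr_is_missing, are invented for illustration and are not the macros from
the patch.

```python
INDEX_SIZE_MASK = 0x1FFF  # assumed: low 13 bits of t_info hold the size
MAXIMUM_ALIGNOF = 8       # assumed platform alignment


def maxalign(size: int) -> int:
    """Round size up to a multiple of MAXIMUM_ALIGNOF."""
    return (size + MAXIMUM_ALIGNOF - 1) & ~(MAXIMUM_ALIGNOF - 1)


def index_tuple_unaligned_size(t_info: int) -> int:
    """Size as stored, before the final alignment padding."""
    return t_info & INDEX_SIZE_MASK


def index_tuple_size(t_info: int) -> int:
    """Existing interface: callers still see the aligned size."""
    return maxalign(index_tuple_unaligned_size(t_info))


def attr_is_missing(attr_offset: int, t_info: int) -> bool:
    """An attribute starting at or past the stored data is read as NULL."""
    return attr_offset >= index_tuple_unaligned_size(t_info)


t_info = 21  # a tuple whose data ends at byte 21
print(index_tuple_unaligned_size(t_info))  # 21
print(index_tuple_size(t_info))            # 24
print(attr_is_missing(21, t_info))         # True
```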

>
> > I would suggest forgetting that part and submitting the part that
> > has some chance of getting accepted.
>

Actually I want to submit the patch that I think is best.

>
>
> I suspect there's also an awkward case that *does* need to be handled when
> you insert a tuple which has a null column which you're leaving out of the
> tuple but which appears in an index. You would have to make sure that the
> index tuple has that datum listed as NULL even though it's entirely missing
> from the heap tuple.

Actually this is taken care of because of your suggestion. When you add a new
column, it doesn't appear in the heap tuple, but if you create an index on
that column afterwards, the case is handled. There is a field in HeapTuple
which records the number of attributes in the tuple. If we request an
attribute number greater than this number, it is returned as null. So
that problem is taken care of.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)

Attachment               Content-Type        Size
trailing-nulls.patch.gz  application/x-gzip  1.8 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
Cc: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-19 18:46:15
Message-ID: 476966F7.6030007@dunslane.net

Gokulakannan Somasundaram wrote:
> > I would suggest forgetting that part and submitting the part that
> > has some chance of getting accepted.
>
> Actually I want to submit the patch that I think is best.

That's not an attitude that is likely to succeed - you need to take
suggestions from Tom very seriously.

Also, please submit patches as context diffs, as set out in the
Developer FAQ, which you should probably read carefully:
http://www.postgresql.org/docs/faqs.FAQ_DEV.html

cheers

andrew


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-19 18:54:53
Message-ID: 20071219105453.78edbd3b@commandprompt.com

On Wed, 19 Dec 2007 13:46:15 -0500
Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> > > I would suggest forgetting that part and submitting the part
> > > that has some chance of getting accepted.
> >
> >
> > Actually I want to submit the patch that I think is best.
> >

You do need to be able to feel that your work is up to a
standard that you find redeemable. However...

> That's not an attitude that is likely to succeed - you need to take
> suggestions from Tom very seriously.

Andrew is absolutely correct here. If you do not agree with Tom, you
best prove why. Otherwise your patch will likely be ignored on
submission.

Sincerely,

Joshua D. Drake

--
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'



From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Subject: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-20 08:36:49
Message-ID: 9362e74e0712200036l29081b1eq3ad0e815f07a3ee8@mail.gmail.com

Thanks for the suggestions. I am re-submitting the patch in context diff
format.

As far as storage savings are concerned, the patch does what it claims.
I checked it by creating a table with 10 columns on a 32-bit machine. I
inserted 100,000 rows with trailing nulls and observed savings of
400 KB.
I did a similar test for the index and found a similar space saving.

I have run the regression tests on both 32-bit and 64-bit systems.

Please go through the patch and provide further suggestions.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)

Attachment               Content-Type        Size
trailing_nulls.patch.gz  application/x-gzip  3.4 KB

From: Decibel! <decibel(at)decibel(dot)org>
To: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
Cc: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Subject: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-21 23:13:54
Message-ID: 8488A285-D027-4467-8F66-A788887D0B3F@decibel.org

On Dec 20, 2007, at 2:36 AM, Gokulakannan Somasundaram wrote:
> I checked it by creating a table with 10 columns on a 32-bit
> machine. I inserted 100,000 rows with trailing nulls and observed
> savings of 400 KB.

That doesn't really tell us anything... how big was the table
originally? Also, testing on 64 bit would be interesting.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: Decibel! <decibel(at)decibel(dot)org>
Cc: "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Subject: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Date: 2007-12-25 18:04:38
Message-ID: 9362e74e0712251004p6115193dncd6e65326bdc2291@mail.gmail.com

Hi,
Back from the holidays. I have tried to present proof that the
null bitmap was absent in the table with the trailing nulls.

On Dec 22, 2007 4:43 AM, Decibel! <decibel(at)decibel(dot)org> wrote:

> On Dec 20, 2007, at 2:36 AM, Gokulakannan Somasundaram wrote:
> > I checked it by creating a table with 10 columns on a 32-bit
> > machine. I inserted 100,000 rows with trailing nulls and observed
> > savings of 400 KB.
>
>
> That doesn't really tell us anything...

As I said, the patch removes the null bitmap if the tuple has only trailing
nulls. Our tuple header size without a null bitmap is 23 bytes. Currently, as
long as the table has up to 8 columns (with nulls), the heap tuple header size
will be 24 bytes. But if the tuple has more than 8 columns, then it will
occupy 4 more bytes on a 32-bit system and 8 more bytes on a 64-bit system.
This patch attempts to save that extra space if the tuple has only trailing
nulls.

> how big was the table
> originally?

I think it was 5.5M before and 5.1M after applying the patch. But how
is this relevant? The patch saves 4 bytes per tuple on a 32-bit system,
irrespective of the size of the tuple.

> Also, testing on 64 bit would be interesting.

I tested the patch on 64 bit system also for regression. The saving was 8
bytes per tuple.
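
As a quick sanity check of these figures, here is a small Python sketch. The
row count and per-tuple savings are the numbers quoted in this thread; the
64-bit total is only a projection for the same 100,000-row table, not a
measured result.

```python
ROWS = 100_000


def total_saving_bytes(rows, saved_per_tuple):
    """Total bytes saved when every row loses the same header overhead."""
    return rows * saved_per_tuple


print(total_saving_bytes(ROWS, 4))  # 400000 bytes, about the 400 KB seen on 32-bit
print(total_saving_bytes(ROWS, 8))  # 800000 bytes projected for 64-bit
```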

I have attempted to provide an explanation, but I don't know whether I have
answered your questions exactly.
Please let me know if anything is still unclear.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
Cc: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2008-03-18 18:41:24
Message-ID: 200803181841.m2IIfO914285@momjian.us


Added to TODO:

* Consider not storing a NULL bitmap on disk if all the NULLs are
trailing

http://archives.postgresql.org/pgsql-hackers/2007-12/msg00624.php
http://archives.postgresql.org/pgsql-patches/2007-12/msg00109.php

Tom's comments are:

What this lacks is some performance testing to measure the cost of the
extra tests in heap_form_tuple. If that can be shown to be negligible
then it's probably worth doing .... though I don't like any part of the
actually submitted patch ;-). All this should need is a bit more logic
in heap_form_tuple and heap_formtuple.

---------------------------------------------------------------------------

Gokulakannan Somasundaram wrote:
> Hi,
> Currently we check for the existence of NULL values in the tuple and
> set the has_null flag. If the has_null flag is set, the tuple stores
> a null bitmap. What I propose is:
>
> a) By modifying the functions, heap_form_tuple and heap_fill_tuple, we can
> check whether all the nulls are trailing nulls. If all the nulls are
> trailing nulls, then we will not set the has_null flag and we will not have
> the null bitmap with the tuple.
>
> b) While selecting the tuple, we will check whether the tuple offset equals
> / exceeds the length of the tuple and then mark the remaining attributes of
> the tuple as null. To be exact, we need to modify the slot_deform_tuple in
> order to achieve the same.
>
> This may not give huge performance benefits, but as you may know, it will
> help is reducing the disk footprint.
>
>
> Expecting your comments..
>
> --
> Thanks,
> Gokul.
> CertoSQL Project,
> Allied Solution Group.
> (www.alliedgroups.com)

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2008-03-21 17:59:04
Message-ID: 9362e74e0803211059p1333287ewd888c15cf891a2d7@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

I will work on this and try to present the performance test results.
I will also examine whether the logic can be added to heap_form_tuple
by any means.

Thanks,
Gokul.



From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal for Null Bitmap Optimization(for Trailing NULLs)
Date: 2008-03-25 19:36:18
Message-ID: 9362e74e0803251236s673b5a70kff1a9262f581dbd1@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

Hi,
As promised, I am attaching the performance test results. The same patch
posted earlier in this thread works with the latest CVS head.
I am actually seeing a slight performance improvement with the patch, which
I think is due either to noise or to fewer pages being used. I ran the tests
with the default settings. I have tested only inserts and selects, because
that is where the code change has happened.

Regarding Tom's comments:
The patch changes the following functions and macros:
a) heap_fill_tuple
b) nocachegetattr
c) heap_form_tuple
d) index_form_tuple
e) nocache_index_getattr
f) the macros index_getattr, IndexTupleSize and IndexTupleDSize
g) a new macro, IndexTupleActualSize

The patch introduces the following changes to the storage of tuples:
1) If there are only trailing nulls, it doesn't store the null bitmap.
2) If there are both non-trailing and trailing nulls, it stores the
null bitmap only up to the last non-null value, which reduces the storage
required for the null bitmap. This is expected to have very few use cases.
3) It doesn't store the null bitmap for trailing nulls in indexes either.
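
Rules 1) and 2) can be sketched roughly as follows. This is only an
illustration in Python, not the patch itself. The function name
plan_tuple_storage and the list-of-booleans bitmap are made up for the
example, with True meaning "value present".

```python
def plan_tuple_storage(values):
    """Decide what to store for a row; None stands for SQL NULL.

    Returns (stored_natts, needs_bitmap, bitmap_bits).
    """
    # Drop trailing NULLs: only attributes up to the last non-null are stored.
    stored_natts = len(values)
    while stored_natts > 0 and values[stored_natts - 1] is None:
        stored_natts -= 1

    kept = values[:stored_natts]

    # A bitmap is needed only if a NULL remains among the stored attributes.
    needs_bitmap = any(v is None for v in kept)
    bitmap_bits = [v is not None for v in kept] if needs_bitmap else []
    return stored_natts, needs_bitmap, bitmap_bits
```

For a row like [1, 2, None, None] the sketch stores two attributes and no
bitmap. For [1, None, 3, None, None] it stores three attributes and a
three-bit bitmap instead of a five-bit one.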

The changes in d), e), f) and g) are required for the index null-bitmap
handling. I don't think we can handle that with heap_form_tuple alone.
Please correct me if I am wrong.

To provide 2), we have to touch heap_fill_tuple. I did this by making it
use the number of attributes passed in, instead of taking it from the
tupdesc. Again, please advise me on how to implement this with only
heap_form_tuple.
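
To make the idea concrete, here is a small sketch of a fill routine that
takes the attribute count as an argument, together with the matching read
side, which treats every attribute past the stored count as NULL. The names
fill_tuple and deform_tuple and the dict layout are hypothetical and are
not the actual C functions or the on-disk format.

```python
def fill_tuple(values, natts):
    """Pack only the first natts attributes; None stands for SQL NULL."""
    kept = values[:natts]
    has_nulls = any(v is None for v in kept)
    # The bitmap covers only the stored attributes (True = value present).
    bitmap = [v is not None for v in kept] if has_nulls else None
    data = [v for v in kept if v is not None]
    return {"natts": natts, "bitmap": bitmap, "data": data}


def deform_tuple(stored, descriptor_natts):
    """Rebuild the full row, reading missing trailing attributes as NULL."""
    values = []
    data = iter(stored["data"])
    for attnum in range(descriptor_natts):
        if attnum >= stored["natts"]:
            # Past the stored attributes: this is a trailing NULL.
            values.append(None)
        elif stored["bitmap"] is not None and not stored["bitmap"][attnum]:
            # Marked as NULL in the (shortened) bitmap.
            values.append(None)
        else:
            values.append(next(data))
    return values
```

The caller decides how many attributes to pass, for example the position of
the last non-null value, and the reader needs only the descriptor's
attribute count to restore the trailing NULLs.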

Looking forward to comments/suggestions.

Thanks,
Gokul.

Attachment: Trailing null - results.ods
Content-Type: application/vnd.oasis.opendocument.spreadsheet
Size: 11.5 KB