Re: page macros cleanup (ver 04)

Lists: pgsql-patches
From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: page macros cleanup
Date: 2008-06-13 16:08:35
Message-ID: 48529B83.9010907@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I attached code cleanup which is related to in-place upgrade. I replace direct
access to PageHeader structure with already existing macros and I removed also
unnecessary retyping. There still lot of places which directly access to
PageHeader structure which require new macros/fuction in page API. I will fix it
in next patch.

Zdenek

Attachment Content-Type Size
page.patch text/x-patch 33.0 KB

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-03 11:50:52
Message-ID: 2e78013d0807030450m4cab60d0jd99d1acce59fae56@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, Jun 13, 2008 at 9:38 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
> I attached code cleanup which is related to in-place upgrade. I replace
> direct access to PageHeader structure with already existing macros and I
> removed also unnecessary retyping.

A quick review comment:

One thing I noticed is that the modified definition of HashMaxItemSize
now does not account for the size of ItemId which may not be the right
thing. Please recheck that.

Apart from that the patch looks good to me. As you said, we should
probably replace the other direct usage of PageHeader with appropriate
macros.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-03 14:04:51
Message-ID: 486CDC83.7060609@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Just one quick note:

Zdenek Kotala wrote:
> *** pgsql.orig.da8c485e0e2a/src/backend/access/gist/gistutil.c pá črn 13 18:00:35 2008
> --- pgsql.orig/src/backend/access/gist/gistutil.c pá črn 13 18:00:35 2008
> ***************
> *** 592,598 ****
> /*
> * Additionally check that the special area looks sane.
> */
> ! if (((PageHeader) (page))->pd_special !=
> (BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData))))
> ereport(ERROR,
> (errcode(ERRCODE_INDEX_CORRUPTED),
> --- 592,598 ----
> /*
> * Additionally check that the special area looks sane.
> */
> ! if ( PageGetSpecialPointer(page) - page !=
> (BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData))))
> ereport(ERROR,
> (errcode(ERRCODE_INDEX_CORRUPTED),

Should probably use PageGetSpecialSize here. Much simpler, and doesn't
assume that the special area is always at the end of page (not that I
see us changing that anytime soon).

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 07:31:29
Message-ID: 486DD1D1.50604@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan Deolasee napsal(a):
> On Fri, Jun 13, 2008 at 9:38 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>> I attached code cleanup which is related to in-place upgrade. I replace
>> direct access to PageHeader structure with already existing macros and I
>> removed also unnecessary retyping.
>
> A quick review comment:

Thanks you for your review,

>
> One thing I noticed is that the modified definition of HashMaxItemSize
> now does not account for the size of ItemId which may not be the right
> thing. Please recheck that.

Good catch. I lost in basic arithmetic. What I see now that original definition
count sizeof(ItemIdData) twice and on other side it does not take care about
MAXALING correctly. I think correct formula is:

#define HashMaxItemSize(page) \
(PageGetPageSize(page) - \
( MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))+ \
MAXALIGN(sizeof(HashPageOpaqueData)) \
)\
)

What do you think?

Thanks for your comments Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 08:38:04
Message-ID: 486DE16C.3070705@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Heikki Linnakangas napsal(a):
> Just one quick note:

> Should probably use PageGetSpecialSize here. Much simpler, and doesn't
> assume that the special area is always at the end of page (not that I
> see us changing that anytime soon).
>

Thanks for review,
good catch. I found similar example also in nbtree and hash. I attached updated
patch version which also contains fix for Pavan's observation. Patch modifies
HashMaxItemSize. It should require reindex on all hash indexes. Should we bump
catalog version?

thanks Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

Attachment Content-Type Size
page_02.patch text/x-patch 33.2 KB

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 08:50:12
Message-ID: 2e78013d0807040150w393ad72codaf06a6ac12a83ce@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, Jul 4, 2008 at 1:01 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>
>
> Good catch. I lost in basic arithmetic. What I see now that original
> definition count sizeof(ItemIdData) twice and on other side it does not take
> care about MAXALING correctly. I think correct formula is:
>
> #define HashMaxItemSize(page) \
> (PageGetPageSize(page) - \
> ( MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))+ \
> MAXALIGN(sizeof(HashPageOpaqueData)) \
> )\
> )
>
> What do you think?
>

Yes. I think that's the correct way.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 09:03:41
Message-ID: 2e78013d0807040203m268ae3a7kdbe65a9695aa7365@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, Jul 4, 2008 at 2:08 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
> Patch
> modifies HashMaxItemSize. It should require reindex on all hash indexes.
> Should we bump catalog version?
>

Do we really need that ? Even if the new HashMaxItemSize comes out to
be lower than the current limit (because of MAXALIGN), I don't see how
existing hash indexes can have a larger item than the new limit.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 03)
Date: 2008-07-04 09:57:47
Message-ID: 486DF41B.1060000@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan Deolasee napsal(a):
> On Fri, Jul 4, 2008 at 2:08 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>> Patch
>> modifies HashMaxItemSize. It should require reindex on all hash indexes.
>> Should we bump catalog version?
>>
>
> Do we really need that ? Even if the new HashMaxItemSize comes out to
> be lower than the current limit (because of MAXALIGN), I don't see how
> existing hash indexes can have a larger item than the new limit.

Yeah, I performed calculation and situation is following (for MAXALIGN 4 and 8):

4 8
Orig 8144 8144
New 8148 8144

It should be OK for all cases.

I fixed also one typo (ItemId -> ItemIdData) and new revision is attached.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

Attachment Content-Type Size
page_03.patch text/x-patch 33.2 KB

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>, "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 10:07:00
Message-ID: 486DF644.4070604@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan Deolasee wrote:
> On Fri, Jul 4, 2008 at 1:01 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>>
>> Good catch. I lost in basic arithmetic. What I see now that original
>> definition count sizeof(ItemIdData) twice and on other side it does not take
>> care about MAXALING correctly. I think correct formula is:
>>
>> #define HashMaxItemSize(page) \
>> (PageGetPageSize(page) - \
>> ( MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))+ \
>> MAXALIGN(sizeof(HashPageOpaqueData)) \
>> )\
>> )
>>
>> What do you think?
>
> Yes. I think that's the correct way.

Doesn't look right to me. There's no padding after the first line
pointer, hence the first MAXALIGN shouldn't be there.

BTW, looking at hashinsert.c where it's used, we're actually passing a
pointer to the meta page to HashMaxItemSize(). So the PageGetPageSize()
call on that is quite bogus, since it's not the meta page that the tuple
is going to be inserted to. It's academical, because all pages are the
same size anyway, but doesn't look right. I think I'd go with BLKCSZ
instead.

I think this is the way it should be:

#define HashMaxItemSize \
(BLCKSZ - \
SizeOfPageHeaderData - \
MAXALIGN(sizeof(HashPageOpaqueData)) - \
sizeof(ItemIdData))

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 10:50:03
Message-ID: 2e78013d0807040350m3b2913cfkceaa70d8f9d697c2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, Jul 4, 2008 at 3:37 PM, Heikki Linnakangas
<heikki(at)enterprisedb(dot)com> wrote:
>
>
> I think this is the way it should be:
>
> #define HashMaxItemSize \
> (BLCKSZ - \
> SizeOfPageHeaderData - \
> MAXALIGN(sizeof(HashPageOpaqueData)) - \
> sizeof(ItemIdData))
>

I am wondering if this would fail for corner case if HashMaxItemSize
happened to be unaligned. For example, if (itemsz < HashMaxItemSize <
MAXALIGN(itemsz), PageAddItem() would later fail with a not-so-obvious
error. Should we just MAXALIGN_DOWN the HashMaxItemSize ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 10:50:55
Message-ID: 486E008F.3080304@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Heikki Linnakangas napsal(a):
> Pavan Deolasee wrote:
>> On Fri, Jul 4, 2008 at 1:01 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>
>> wrote:
>>>
>>> Good catch. I lost in basic arithmetic. What I see now that original
>>> definition count sizeof(ItemIdData) twice and on other side it does
>>> not take
>>> care about MAXALING correctly. I think correct formula is:
>>>
>>> #define HashMaxItemSize(page) \
>>> (PageGetPageSize(page) - \
>>> ( MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))+ \
>>> MAXALIGN(sizeof(HashPageOpaqueData)) \
>>> )\
>>> )
>>>
>>> What do you think?
>>
>> Yes. I think that's the correct way.
>
> Doesn't look right to me. There's no padding after the first line
> pointer, hence the first MAXALIGN shouldn't be there.

Are you sure? I expecting that tupleheader must be aligned to MAXALIGN. If it is
not required than you are right.

Look on PageAddItem:

00226 alignedSize = MAXALIGN(size);
00227
00228 upper = (int) phdr->pd_upper - (int) alignedSize;

By my opinion first place where tuple should be placed is:

MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))

> BTW, looking at hashinsert.c where it's used, we're actually passing a
> pointer to the meta page to HashMaxItemSize(). So the PageGetPageSize()
> call on that is quite bogus, since it's not the meta page that the tuple
> is going to be inserted to. It's academical, because all pages are the
> same size anyway, but doesn't look right. I think I'd go with BLKCSZ
> instead.
>

Yeah, BLKCSZ looks good. Anyway, I plan to reorganize all *MaxItemSize staff to
be compatible with in-place upgrade.

thanks Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>, "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 10:55:44
Message-ID: 486E01B0.9050100@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan Deolasee wrote:
> On Fri, Jul 4, 2008 at 3:37 PM, Heikki Linnakangas
> <heikki(at)enterprisedb(dot)com> wrote:
>>
>> I think this is the way it should be:
>>
>> #define HashMaxItemSize \
>> (BLCKSZ - \
>> SizeOfPageHeaderData - \
>> MAXALIGN(sizeof(HashPageOpaqueData)) - \
>> sizeof(ItemIdData))
>>
>
> I am wondering if this would fail for corner case if HashMaxItemSize
> happened to be unaligned. For example, if (itemsz < HashMaxItemSize <
> MAXALIGN(itemsz), PageAddItem() would later fail with a not-so-obvious
> error. Should we just MAXALIGN_DOWN the HashMaxItemSize ?

No, there's a itemsz = MAXALIGN(itemsz) call before the check against
HashMaxItemSize.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 11:04:15
Message-ID: 2e78013d0807040404q5ff7f7faq9f87a39e0f0552fa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, Jul 4, 2008 at 4:25 PM, Heikki Linnakangas
<heikki(at)enterprisedb(dot)com> wrote:
>
>
> No, there's a itemsz = MAXALIGN(itemsz) call before the check against
> HashMaxItemSize.
>

Ah, right. Still should we just not MAXALIGN_DOWN the Max size to
reflect the practical limit on the itemsz ? It's more academical
though, so not a big deal.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 11:04:15
Message-ID: 486E03AF.2040402@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Heikki Linnakangas napsal(a):
> Pavan Deolasee wrote:
>> On Fri, Jul 4, 2008 at 3:37 PM, Heikki Linnakangas
>> <heikki(at)enterprisedb(dot)com> wrote:
>>>
>>> I think this is the way it should be:
>>>
>>> #define HashMaxItemSize \
>>> (BLCKSZ - \
>>> SizeOfPageHeaderData - \
>>> MAXALIGN(sizeof(HashPageOpaqueData)) - \
>>> sizeof(ItemIdData))
>>>
>>
>> I am wondering if this would fail for corner case if HashMaxItemSize
>> happened to be unaligned. For example, if (itemsz < HashMaxItemSize <
>> MAXALIGN(itemsz), PageAddItem() would later fail with a not-so-obvious
>> error. Should we just MAXALIGN_DOWN the HashMaxItemSize ?
>
> No, there's a itemsz = MAXALIGN(itemsz) call before the check against
> HashMaxItemSize.

It is tru, but id somebody will use HashMaxItemSize in different place in the
future he could omit to use same approach. See tuptoaster.h how TOAST_MAX_CHUNK
is defined or BTMaxItemSize in nbtree.h.

Pavan's comments is correct. It should give same result as my implementation
(because BLCKSZ is aligned), but it is better readable and consistent with other.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 11:05:36
Message-ID: 2e78013d0807040405k4c738cfsf7699baec35f7d3e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Fri, Jul 4, 2008 at 4:20 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>
>
>
> By my opinion first place where tuple should be placed is:
>
> MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))
>

Tuple actually starts from the other end of the block.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 11:13:03
Message-ID: 486E05BF.1090704@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan Deolasee napsal(a):
> On Fri, Jul 4, 2008 at 4:20 PM, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com> wrote:
>>
>>
>> By my opinion first place where tuple should be placed is:
>>
>> MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))
>>
>
> Tuple actually starts from the other end of the block.

Yes, but I meant first from offset point of view which is opposite to insert order.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 11:18:37
Message-ID: 486E070D.8010101@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Zdenek Kotala wrote:
> By my opinion first place where tuple should be placed is:
>
> MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))

Yeah, but we don't enforce that directly. We enforce it by MAXALIGNing
size in PageAddItem, and with the rule that special-size is MAXALIGNed.

To put it another way, it's possible that there's an unaligned amount of
free space on a page, as returned by PageGetExactFreeSpace. But since
item size is always MAXALIGNed, it's impossible to use it all.

Come to think of it, why do we require MAXALIGN alignment of tuples? I
must be missing something, but AFAICS the widest fields in
HeapTupleHeaderData are 4-bytes wide, so it should be possible to get
away with 4-byte alignment.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup
Date: 2008-07-04 11:47:00
Message-ID: 486E0DB4.9030400@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Heikki Linnakangas napsal(a):
> Zdenek Kotala wrote:
>> By my opinion first place where tuple should be placed is:
>>
>> MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))
>
> Yeah, but we don't enforce that directly. We enforce it by MAXALIGNing
> size in PageAddItem, and with the rule that special-size is MAXALIGNed.

Yeah, it is implication from other rules. But how Pavan mentioned and I agree
with him. We should have *MaxItemSize value maxaligned, because tuple is
maxaligned anyway and *MaxItemSize defines real limit.

>
> Come to think of it, why do we require MAXALIGN alignment of tuples? I
> must be missing something, but AFAICS the widest fields in
> HeapTupleHeaderData are 4-bytes wide, so it should be possible to get
> away with 4-byte alignment.
>

I think that you have problem to make correct decision about padding between
header and first data item on platform where MAXALIGN is 8. Padding will depends
on align of offset - if it is max aligned or not. It will have effect for
example on SPARC but it adds extra complexity to code. ... Maybe only set hoff
correctly when new tuple is created ...

Another improvement should be reorder column to got best space allocation during
create table (of course it affect select * from).

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-04 12:11:59
Message-ID: 486E138F.4060900@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan Deolasee napsal(a):
> On Fri, Jul 4, 2008 at 4:25 PM, Heikki Linnakangas
> <heikki(at)enterprisedb(dot)com> wrote:
>>
>> No, there's a itemsz = MAXALIGN(itemsz) call before the check against
>> HashMaxItemSize.
>>
>
> Ah, right. Still should we just not MAXALIGN_DOWN the Max size to
> reflect the practical limit on the itemsz ? It's more academical
> though, so not a big deal.

Finally I use following formula:

#define HashMaxItemSize(page) \
MAXALIGN_DOWN(PageGetPageSize(page) - \
( SizeOfPageHeaderData + sizeof(ItemIdData) ) - \
MAXALIGN(sizeof(HashPageOpaqueData)) )

I did not replace PageGetPageSize(page), because other *MaxItemSize has same
interface.

Revised patch is attached.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

Attachment Content-Type Size
page_04.patch text/x-patch 33.4 KB

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-08 19:28:27
Message-ID: 4873BFDB.7020702@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Pavan, Heikki,

Is it OK now or do you have any comments?

Thanks Zdenek

Zdenek Kotala napsal(a):
> Pavan Deolasee napsal(a):
>> On Fri, Jul 4, 2008 at 4:25 PM, Heikki Linnakangas
>> <heikki(at)enterprisedb(dot)com> wrote:
>>>
>>> No, there's a itemsz = MAXALIGN(itemsz) call before the check against
>>> HashMaxItemSize.
>>>
>>
>> Ah, right. Still should we just not MAXALIGN_DOWN the Max size to
>> reflect the practical limit on the itemsz ? It's more academical
>> though, so not a big deal.
>
> Finally I use following formula:
>
> #define HashMaxItemSize(page) \
> MAXALIGN_DOWN(PageGetPageSize(page) - \
> ( SizeOfPageHeaderData + sizeof(ItemIdData) ) - \
> MAXALIGN(sizeof(HashPageOpaqueData)) )
>
>
> I did not replace PageGetPageSize(page), because other *MaxItemSize has
> same interface.
>
> Revised patch is attached.
>
> Zdenek
>
>
> ------------------------------------------------------------------------
>
>


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Zdenek Kotala" <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-09 11:54:08
Message-ID: 4874A6E0.9070909@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Zdenek Kotala wrote:
> Pavan Deolasee napsal(a):
>> On Fri, Jul 4, 2008 at 4:25 PM, Heikki Linnakangas
>> <heikki(at)enterprisedb(dot)com> wrote:
>>>
>>> No, there's a itemsz = MAXALIGN(itemsz) call before the check against
>>> HashMaxItemSize.
>>>
>>
>> Ah, right. Still should we just not MAXALIGN_DOWN the Max size to
>> reflect the practical limit on the itemsz ? It's more academical
>> though, so not a big deal.
>
> Finally I use following formula:
>
> #define HashMaxItemSize(page) \
> MAXALIGN_DOWN(PageGetPageSize(page) - \
> ( SizeOfPageHeaderData + sizeof(ItemIdData) ) - \
> MAXALIGN(sizeof(HashPageOpaqueData)) )
>
>
> I did not replace PageGetPageSize(page), because other *MaxItemSize has
> same interface.

Ok, fair enough.

There's another academical discrepancy between these two hunks:

> *** src/backend/access/hash/hashpage.c 12 May 2008 00:00:44 -0000 1.75
> --- src/backend/access/hash/hashpage.c 9 Jul 2008 11:30:09 -0000
> ***************
> *** 407,413 ****
> for (i = _hash_log2(metap->hashm_bsize); i > 0; --i)
> {
> if ((1 << i) <= (metap->hashm_bsize -
> ! (MAXALIGN(sizeof(PageHeaderData)) +
> MAXALIGN(sizeof(HashPageOpaqueData)))))
> break;
> }
> --- 407,413 ----
> for (i = _hash_log2(metap->hashm_bsize); i > 0; --i)
> {
> if ((1 << i) <= (metap->hashm_bsize -
> ! (MAXALIGN(SizeOfPageHeaderData) +
> MAXALIGN(sizeof(HashPageOpaqueData)))))
> break;
> }

and

> *** src/include/access/hash.h 19 Jun 2008 00:46:05 -0000 1.88
> --- src/include/access/hash.h 9 Jul 2008 11:30:10 -0000
> ***************
> *** 192,198 ****
> #define BMPG_SHIFT(metap) ((metap)->hashm_bmshift)
> #define BMPG_MASK(metap) (BMPGSZ_BIT(metap) - 1)
> #define HashPageGetBitmap(pg) \
> ! ((uint32 *) (((char *) (pg)) + MAXALIGN(sizeof(PageHeaderData))))
>
> /*
> * The number of bits in an ovflpage bitmap word.
> --- 191,197 ----
> #define BMPG_SHIFT(metap) ((metap)->hashm_bmshift)
> #define BMPG_MASK(metap) (BMPGSZ_BIT(metap) - 1)
> #define HashPageGetBitmap(pg) \
> ! ((uint32 *) (((char *) (pg)) + MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData))))
>
> /*
> * The number of bits in an ovflpage bitmap word.

I admit I don't understand what that bitmap is, it looks like we're
assuming it can take up all the space between page header and the
special portion, without any line pointers, but in HashPageGetBitmap,
we're reserving space for one pointer. It looks like the actual size of
the bitmap is only the largest power of 2 below the maximum size, so
that won't be an issue in practice. That macro is actually doing the
same thing as PageGetContents, so I switched to using that. As that
moves the data sligthly on those bitmap pages, I guess we'll need a
catversion bump.

Attached is an updated patch. I also fixed some whitespace and comments
that were no longer valid. How does it look to you now?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
page_04-heikki-1.patch text/x-diff 40.9 KB

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-09 13:43:59
Message-ID: 4874C09F.4000302@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Heikki Linnakangas napsal(a):
> Zdenek Kotala wrote:
>> Pavan Deolasee napsal(a):

> There's another academical discrepancy between these two hunks:

<snip>

> I admit I don't understand what that bitmap is, it looks like we're
> assuming it can take up all the space between page header and the
> special portion, without any line pointers, but in HashPageGetBitmap,
> we're reserving space for one pointer. It looks like the actual size of
> the bitmap is only the largest power of 2 below the maximum size, so
> that won't be an issue in practice. That macro is actually doing the
> same thing as PageGetContents, so I switched to using that. As that
> moves the data sligthly on those bitmap pages, I guess we'll need a
> catversion bump.

Good catch. if we have to bump catalog version then I have there small patch
which introduce following macro and remove PageHeaderData stucture from
HashMetaPageData:

#define HashPageGetMeta(page) ((HashMetaPage) (PageGetContents(page)))

It also needs bump a catalog version and if we do it now I think it is better to
connect both these patches and do bump only once.

> Attached is an updated patch. I also fixed some whitespace and comments
> that were no longer valid. How does it look to you now?

Perfect. Thanks

Zdenek

Attachment Content-Type Size
hash.patch text/x-patch 7.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-13 18:05:44
Message-ID: 12344.1215972344@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> Good catch. if we have to bump catalog version then I have there small patch
> which introduce following macro and remove PageHeaderData stucture from
> HashMetaPageData:

Seems like a bad idea --- PageGetContents is not guaranteed to deliver
a maxaligned pointer, so at least in principle this could result in
alignment violations. It would accidentally fail to fail as of CVS
HEAD, but any future rearrangement of PageHeaderData or HashMetaPageData
could create a problem.

(Possibly PageGetContents *should* guarantee a maxaligned result,
but the current coding does not.)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Zdenek Kotala" <Zdenek(dot)Kotala(at)Sun(dot)COM>, "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "PostgreSQL-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-13 19:02:19
Message-ID: 13370.1215975739@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> ... That macro is actually doing the
> same thing as PageGetContents, so I switched to using that. As that
> moves the data sligthly on those bitmap pages, I guess we'll need a
> catversion bump.

I'm amazed that Zdenek didn't scream bloody murder about that. You're
creating a work item for in-place-upgrade that would not otherwise
exist, in exchange for a completely trivial bit of code beautification.
(The same can be said of his proposed change to hash meta pages.)

I'm planning to go over this patch today and apply it sans the parts
that would require catversion bump. We can argue later about whether
those are really worth doing, but I'm leaning to "not" --- unless Zdenek
says that he has no intention of making in-place-upgrade handle hash
indexes any time soon.

BTW, after further thought about the PageGetContents() situation:
right now we can change it to guarantee maxalignment "for free",
since SizeOfPageHeaderData happens to be maxaligned on all platforms
(this wasn't the case as recently as 8.2). So I'm thinking we should
do that. There's at least one place that thinks that PageGetContents
is the same as page + SizeOfPageHeaderData, but that's easily fixed.
On balance it seems like hidden assumptions about alignment are a bigger
risk than assumptions about that offset --- anyone want to argue the
contrary?

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-21 19:19:25
Message-ID: 4884E13D.3070109@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane napsal(a):
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>> ... That macro is actually doing the
>> same thing as PageGetContents, so I switched to using that. As that
>> moves the data sligthly on those bitmap pages, I guess we'll need a
>> catversion bump.
>
> I'm amazed that Zdenek didn't scream bloody murder about that. You're
> creating a work item for in-place-upgrade that would not otherwise
> exist, in exchange for a completely trivial bit of code beautification.
> (The same can be said of his proposed change to hash meta pages.)

:-) Yeah, These changes break in-place-upgrade on hash indexes and invokes
reindexing request. I have had several reasons why I didn't complaint about it:

1) IIRC, hash function for double has been change
2) there is ongoing project to improve hash index performance -> completely
redesigned content
3) hash index is not much used (by my opinion) and it affect only small group of
users

> I'm planning to go over this patch today and apply it sans the parts
> that would require catversion bump. We can argue later about whether
> those are really worth doing, but I'm leaning to "not" --- unless Zdenek
> says that he has no intention of making in-place-upgrade handle hash
> indexes any time soon.

Thanks for applying patch. I think that hash index "upgradebility" is currently
broken or it will be with new hash index improvement. But if I think about it it
does not make sense to break compatibility by this patch first. I will prepare
patch which will be upgrade friendly. And if we will reimplement hash index
soon, than we can clean a code.

> BTW, after further thought about the PageGetContents() situation:
> right now we can change it to guarantee maxalignment "for free",
> since SizeOfPageHeaderData happens to be maxaligned on all platforms
> (this wasn't the case as recently as 8.2). So I'm thinking we should
> do that. There's at least one place that thinks that PageGetContents
> is the same as page + SizeOfPageHeaderData, but that's easily fixed.
> On balance it seems like hidden assumptions about alignment are a bigger
> risk than assumptions about that offset --- anyone want to argue the
> contrary?

I think it is OK and I seen that you already applied a patch.

Thanks Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: page macros cleanup (ver 04)
Date: 2008-07-21 20:16:13
Message-ID: 17231.1216671373@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> Thanks for applying patch. I think that hash index "upgradebility" is
> currently broken or it will be with new hash index improvement. But if
> I think about it it does not make sense to break compatibility by this
> patch first.

Right, my point exactly. Those necessarily-incompatible changes might
or might not ever get applied --- if they are, we can do cosmetic
cleanups afterwards.

regards, tom lane