WIP: Pg_upgrade - page layout converter (PLC) hook

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: WIP: Pg_upgrade - page layout converter (PLC) hook
Date: 2008-04-15 10:46:35
Message-ID: 4804878B.3040709@sun.com
Lists: pgsql-hackers


I have attached a patch that implements a page layout converter (PLC) hook. It
is the cornerstone for in-place upgrade.

How it works:

When the PLC module is loaded, the conversion routine is called for each page
that does not have the native page version. The buffer is marked dirty and the
upgraded page is inserted into WAL.
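
The flow above can be sketched in a few lines of C. This is only an
illustration, not the code from the attached patch: DemoPage, plc_hook,
demo_convert_page, demo_read_page and NATIVE_LAYOUT_VERSION are hypothetical
names, and the dirty/wal_logged flags merely stand in for marking the buffer
dirty and writing the converted page to WAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed "native" layout version for this sketch only. */
#define NATIVE_LAYOUT_VERSION 4

/* Simplified stand-in for a buffer page; not the real PostgreSQL layout. */
typedef struct DemoPage
{
    uint16_t layout_version; /* page layout version found on disk */
    bool     dirty;          /* stand-in for marking the buffer dirty */
    bool     wal_logged;     /* stand-in for logging the page to WAL */
} DemoPage;

/* Hook type: a loadable module would set plc_hook when it is loaded. */
typedef void (*plc_hook_type) (DemoPage *page);

plc_hook_type plc_hook = NULL;

/* Hypothetical converter a PLC module might install. */
void
demo_convert_page(DemoPage *page)
{
    page->layout_version = NATIVE_LAYOUT_VERSION;
}

/*
 * Called after a page has been read. Returns true if the page was converted.
 * Without a loaded module, or for a page that is already native, nothing
 * happens.
 */
bool
demo_read_page(DemoPage *page)
{
    assert(page != NULL);

    if (plc_hook == NULL || page->layout_version == NATIVE_LAYOUT_VERSION)
        return false;

    plc_hook(page);          /* convert to the native layout */
    page->dirty = true;      /* the converted page must be written back */
    page->wal_logged = true; /* and the whole page goes into WAL */
    return true;
}
```

With plc_hook left at NULL an old page passes through untouched; once a module
assigns demo_convert_page to plc_hook, the first read converts the page and
later reads see it as native.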

Performance:

I executed "select count(*) from table" on a 2.2 GB table (4671039 rows)
without any tuning. With conversion it took 2033 s (34 min); after conversion
and a server restart it took 31 s (0.5 min).

Request for comments:

1) I am not sure if calling log_newpage is correct.

a) Calling something in an access method from storage seems to me a bad thing.
I'm thinking of moving log_newpage to storage, but that raises more questions
about placement, RM ...

b) log_newpage is used for logging new pages, but I use it for storing a
converted page. It seems to me that this is safe and that heap_xlog_newpage
works correctly for both new and converted pages. My only doubt is about the
assert macro in mdextend/mdwrite which checks extend vs. write.

2) PLC module placement. I'm looking for the best place (directory) to put the
PLC code. One possibility is to put it under contrib/pg_upgrade; another is to
put it into backend/storage/upgrade/, but in that location it will not be
possible to build it as a module.

3) data structures version tracking

For the PLC I need to have old versions of data structures such as the page
header, tuple header and so on. This is also useful for external tools that
need to handle multiple versions of postgresql easily (e.g. pg_control should
show data from all supported postgresql versions).

My idea is to keep a separate header for each version of a structure, e.g.
bufpage_03.h, bufpage_04.h ..., each containing typedef struct
PageHeaderData_03 ..., plus a generic bufpage.h with the following content:

...
#include "bufpage_04.h"
...
typedef PageHeaderData_04 PageHeaderData;

#define PageGetPageSize(page) PageGetPageSize_04(page)
...
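
To make the idea concrete, here is a single-file sketch of how the versioned
headers and the generic alias could fit together. The field lists are
deliberately reduced and partly invented (pd_flags in the _04 struct is only a
placeholder for "something the newer layout added"), demo_convert_03_to_04 is
a hypothetical helper, and the sketch assumes the page size is kept in the
high byte and the layout version in the low byte of pd_pagesize_version.

```c
#include <assert.h>
#include <stdint.h>

/* ---- would live in bufpage_03.h (reduced, illustrative field set) ---- */
typedef struct PageHeaderData_03
{
    uint16_t pd_lower;
    uint16_t pd_upper;
    uint16_t pd_pagesize_version;
} PageHeaderData_03;

#define PageGetPageSize_03(page) \
    ((uint16_t) ((page)->pd_pagesize_version & 0xFF00))
#define PageGetPageLayoutVersion_03(page) \
    ((uint16_t) ((page)->pd_pagesize_version & 0x00FF))

/* ---- would live in bufpage_04.h (reduced, illustrative field set) ---- */
typedef struct PageHeaderData_04
{
    uint16_t pd_flags;          /* placeholder for a field added in _04 */
    uint16_t pd_lower;
    uint16_t pd_upper;
    uint16_t pd_pagesize_version;
} PageHeaderData_04;

#define PageGetPageSize_04(page) \
    ((uint16_t) ((page)->pd_pagesize_version & 0xFF00))
#define PageGetPageLayoutVersion_04(page) \
    ((uint16_t) ((page)->pd_pagesize_version & 0x00FF))

/* ---- generic bufpage.h: alias the native version ---- */
typedef PageHeaderData_04 PageHeaderData;

#define PageGetPageSize(page) PageGetPageSize_04(page)
#define PageGetPageLayoutVersion(page) PageGetPageLayoutVersion_04(page)

/* Hypothetical converter: the old struct stays available to PLC code. */
void
demo_convert_03_to_04(const PageHeaderData_03 *src, PageHeaderData *dst)
{
    /* Shift line pointers by the header growth between the two layouts. */
    uint16_t growth = (uint16_t) (sizeof(PageHeaderData_04) -
                                  sizeof(PageHeaderData_03));

    dst->pd_flags = 0;
    dst->pd_lower = (uint16_t) (src->pd_lower + growth);
    dst->pd_upper = src->pd_upper;
    dst->pd_pagesize_version =
        (uint16_t) (PageGetPageSize_03(src) | 4);
}
```

Ordinary backend code would only ever see PageHeaderData and
PageGetPageSize, while conversion code and external tools could include the
versioned headers directly and name the exact layout they want.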

4) How to handle a corrupted page? If a page is corrupted, it could trigger a
false call of the conversion routine. That could hide problems, and the
conversion could "fix" the page in the wrong way. We probably need to have
PageHeaderIsValid for every page layout version.
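
One possible shape for such per-version checks is sketched below, under heavy
assumptions: DemoHeader is a simplified header, the header sizes (20 and 24
bytes) and the fixed 8192-byte page size are invented for the example, and
demo_header_is_valid_03/_04 and demo_classify_page are hypothetical names. The
point is only that a page is handed to the converter after the validator for
its claimed version has accepted it, and anything else is reported as corrupt
instead of being "fixed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 8192     /* assumed page size for this sketch */

/* Simplified header; not the real PostgreSQL page header. */
typedef struct DemoHeader
{
    uint16_t pd_lower;
    uint16_t pd_upper;
    uint16_t pd_special;
    uint16_t pd_pagesize_version; /* size in high byte, version in low byte */
} DemoHeader;

typedef enum
{
    DEMO_PAGE_NATIVE,
    DEMO_PAGE_NEEDS_CONVERSION,
    DEMO_PAGE_CORRUPT
} DemoPageStatus;

/* Sanity checks shared by all versions; only the header size differs. */
static bool
demo_header_sane(const DemoHeader *h, uint16_t header_size)
{
    return (h->pd_pagesize_version & 0xFF00) == DEMO_PAGE_SIZE &&
           h->pd_lower >= header_size &&
           h->pd_lower <= h->pd_upper &&
           h->pd_upper <= h->pd_special &&
           h->pd_special <= DEMO_PAGE_SIZE;
}

bool
demo_header_is_valid_03(const DemoHeader *h)
{
    return demo_header_sane(h, 20);  /* hypothetical v3 header size */
}

bool
demo_header_is_valid_04(const DemoHeader *h)
{
    return demo_header_sane(h, 24);  /* hypothetical v4 header size */
}

/* Decide what to do with a page before any conversion is attempted. */
DemoPageStatus
demo_classify_page(const DemoHeader *h)
{
    switch (h->pd_pagesize_version & 0x00FF)
    {
        case 4:
            return demo_header_is_valid_04(h) ? DEMO_PAGE_NATIVE
                                              : DEMO_PAGE_CORRUPT;
        case 3:
            return demo_header_is_valid_03(h) ? DEMO_PAGE_NEEDS_CONVERSION
                                              : DEMO_PAGE_CORRUPT;
        default:
            /* Unknown version byte: never guess, never convert. */
            return DEMO_PAGE_CORRUPT;
    }
}
```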


Thanks for your comments

Attachment Content-Type Size
plc_hook_01.diff text/x-patch 2.1 KB
