Re: In-place upgrade: catalog side

Lists: pgsql-hackers
From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>
Subject: In-place upgrade: catalog side
Date: 2008-12-03 07:21:53
Message-ID: Pine.GSO.4.64.0812022349130.28660@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Since this whole in-place upgrade thing is going nowhere until there's
also a good solution for ye olde "The database cluster was initialized
with CATALOG_VERSION_NO ..." error, I spent some time today touring
through what everybody else has done there and have some initial review
commentary and questions.

Zdenek provided the following code:
http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/postgres/postgresql-upgrade/

The main thing you'll find there is a ksh script that handles most of the
upgrade, presuming there's no page format changes. It looks like it was
originally aimed at 8.1->8.2 upgrades (easy as long as you don't use
INET/CIDR in your database) and still has some hard-coded stuff from that
in there to clean up.

I spent some time reading the code to figure out exactly what it's doing,
but come to find there's an easier way to ramp up. It looks like this
script is a fairly straight shell port (presumably to make things more
flexible) of EDB's pg_migrator:
http://pgfoundry.org/projects/pg-migrator/

If you want to understand the basics of what happens in either program,
the pg_migrator download has a nice outline in doc/intro.txt The basic
concept seems workable, albeit a bit more brute-force than I was
expecting: don't bother trying to figure out what actually changed in the
catalog, just start with a new cluster, restore schema, then slam
renumbered data pages into it. At a high level it works like this:

-Run pg_dumpall against the old database to grab its schema
-Extract some catalog information from the old database
-Export the pg_control information
-Create a new cluster
-Copy the old pg_clog over
-With the help of pg_resetxlog, move over the timeline and other ids
-Restore the schema dump
-Update TOAST info
-Join the new database relid information against the old set so you can
easily map old and new relids for each relation
-Move the underlying database files from their old location (the original
relid) to the new one
-Adjust tablespace information

I'd like to start doing something constructive to help move this part
forward, so here's an initial set of questions I've got for mapping that
work out:

-A ksh script for something this critical is a non-starter on Windows in
particular, so a good plan would be to use Zdenek's script as a malleable
prototype, confirm it works, then update pg_migrator with anything it's
missing, right?

-Are there already any specific tasks done by Zdenek's script that are
already known to be unique to only its implementation? Eventually I
expect I'll figure that out for sure myself just by comparing the code,
was curious what the already known divergences were.

-There are 10 TODO items listed for the pg_migrator project, most or all
of which look like should be squashed before this is really complete.
Any chance somebody (Korry?) has an improved version of this floating
around beyond what's in the pgfoundry CVS already?

-Am I really the only person who is frantic that this program isn't being
worked on actively?

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-03 07:41:01
Message-ID: 4936380D.4000404@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith napsal(a):
>
> The main thing you'll find there is a ksh script that handles most of
> the upgrade, presuming there's no page format changes. It looks like it
> was originally aimed at 8.1->8.2 upgrades (easy as long as you don't use
> INET/CIDR in your database) and still has some hard-coded stuff from
> that in there to clean up.

Yes, It is correct. It is only for 8.1->8.2. It works fine for 8.3->8.4 too, but
I'm working on cleanup and fixing bugs. I hope that I will send updated version
to community today.

<snip>

> I'd like to start doing something constructive to help move this part
> forward, so here's an initial set of questions I've got for mapping that
> work out:
>
> -A ksh script for something this critical is a non-starter on Windows in
> particular, so a good plan would be to use Zdenek's script as a
> malleable prototype, confirm it works, then update pg_migrator with
> anything it's missing, right?

It is more workaround or temporary solution. This approach is easy but it has
lot of limitation. Problem with toast tables is one, but biggest problem is with
dropped columns. And maybe there will be more issues. Problem with dump is that
you lost a internal data.

I personally prefer to have special mode (like boostrap) which converts data
from old catalog to new format.

I think pg_upgrade.sh is good starter, before we will implement direct catalog
upgrade.

> -Are there already any specific tasks done by Zdenek's script that are
> already known to be unique to only its implementation? Eventually I
> expect I'll figure that out for sure myself just by comparing the code,
> was curious what the already known divergences were.

If you compare with pg_migrator, there is better handling of locale and I think
vacuum freeze is used correctly. Also shuffling with tablespaces is little bit
different (it should avoid to move data outside of mountpoint). But in principal
the idea is same.

> -There are 10 TODO items listed for the pg_migrator project, most or all
> of which look like should be squashed before this is really complete.
> Any chance somebody (Korry?) has an improved version of this floating
> around beyond what's in the pgfoundry CVS already?

As I mentioned before pg_migrator and pg_upgrade.sh is not good way. It is
workaround. It does not make sense to continue in this way.

Zdenek


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-03 07:45:44
Message-ID: 49363928.8030309@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zdenek Kotala wrote:
> Greg Smith napsal(a):
>> -There are 10 TODO items listed for the pg_migrator project, most or
>> all of which look like should be squashed before this is really
>> complete. Any chance somebody (Korry?) has an improved version of this
>> floating around beyond what's in the pgfoundry CVS already?
>
> As I mentioned before pg_migrator and pg_upgrade.sh is not good way. It
> is workaround. It does not make sense to continue in this way.

Why not?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-03 23:36:48
Message-ID: 200812032336.mB3Namo06409@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zdenek Kotala wrote:
> If you compare with pg_migrator, there is better handling of locale and I think
> vacuum freeze is used correctly. Also shuffling with tablespaces is little bit
> different (it should avoid to move data outside of mountpoint). But in principal
> the idea is same.
>
> > -There are 10 TODO items listed for the pg_migrator project, most or all
> > of which look like should be squashed before this is really complete.
> > Any chance somebody (Korry?) has an improved version of this floating
> > around beyond what's in the pgfoundry CVS already?
>
> As I mentioned before pg_migrator and pg_upgrade.sh is not good way. It is
> workaround. It does not make sense to continue in this way.

As the author of the original shell script, which was in
/contrib/pg_upgrade, I think the code has grown to the point where it
should be reimplemented in something like Perl.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 07:21:04
Message-ID: 493784E0.20101@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas napsal(a):
> Zdenek Kotala wrote:
>> Greg Smith napsal(a):
>>> -There are 10 TODO items listed for the pg_migrator project, most or
>>> all of which look like should be squashed before this is really
>>> complete. Any chance somebody (Korry?) has an improved version of
>>> this floating around beyond what's in the pgfoundry CVS already?
>>
>> As I mentioned before pg_migrator and pg_upgrade.sh is not good way.
>> It is workaround. It does not make sense to continue in this way.
>
> Why not?
>

Problem is the pg_dump does not export all important data for upgrade. For
example relfileid and so on. However, biggest problem here are dropped columns
(thanks to point me on this issue). Dropped column does not have information
about type. It could be possible to fake it somehow during a dump and drop them
again after restore, but I'm not convinced that it is what we want.

The solution is good now as a starter, but it is far from final solution.

Zdenek


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 07:35:27
Message-ID: Pine.GSO.4.64.0812040144050.27355@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 3 Dec 2008, Zdenek Kotala wrote:

> It works fine for 8.3->8.4 too, but I'm working on cleanup and fixing
> bugs. I hope that I will send updated version to community today.

That would be great. It didn't feel like you were quite done with it yet.
I'll be glad to help test it out, just didn't want to jump into that if it
was known to still have issues that were being worked on. Please let us
know what the remaining bugs you know about are at that point, I really
don't want this part of things to get ignored just because the page format
stuff is the harder part.

> It is more workaround or temporary solution. This approach is easy but it has
> lot of limitation. Problem with toast tables is one, but biggest problem is
> with dropped columns. And maybe there will be more issues. Problem with dump
> is that you lost a internal data.

Can you be a bit more specific about what the problems with TOAST and
dropped columns are? If those are covered in your presentation or came up
already and I missed it, just point me that way; I'm still working my way
through parts of this and don't expect to ever have it all in my head like
you do at this point. Obviously this approach is going to be somewhat
traumatic even if perfectly executed because of things like losing table
statistics.

As we move closer to final crunch time here, what I am trying to keep
clear in my head is which bits are absolutely required to do any type of
in-place upgrade, whether or not the page format changes in 8.4. What's
nice is that those parts I can be testing right now just by trying to
upgrade from 8.3 to 8.4. Barring things like the TOAST problem you
mention getting in the way, the fundamental approach taken by these
upgrade scripts seems workable for the job even it's not optimal, and
that's a whole lot better than nothing at all.

> I personally prefer to have special mode (like boostrap) which converts data
> from old catalog to new format.

That's a perfectly fine idea I would like to see too. But if we have to
write such a thing from scratch right now, I'm afraid that may be too late
to implement and still ship the next release on schedule. And if such
bootstrap code is needed, we sure need to make sure the prototype it's
going to be built on is solid ASAP. That's what I want to help you look
into if you can catch me up a bit here.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 08:58:33
Message-ID: 49379BB9.6030200@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith napsal(a):
> On Wed, 3 Dec 2008, Zdenek Kotala wrote:
>
>> It works fine for 8.3->8.4 too, but I'm working on cleanup and fixing
>> bugs. I hope that I will send updated version to community today.
>
> That would be great. It didn't feel like you were quite done with it
> yet. I'll be glad to help test it out, just didn't want to jump into
> that if it was known to still have issues that were being worked on.
> Please let us know what the remaining bugs you know about are at that
> point, I really don't want this part of things to get ignored just
> because the page format stuff is the harder part.
>
>> It is more workaround or temporary solution. This approach is easy but
>> it has lot of limitation. Problem with toast tables is one, but
>> biggest problem is with dropped columns. And maybe there will be more
>> issues. Problem with dump is that you lost a internal data.
>
> Can you be a bit more specific about what the problems with TOAST and
> dropped columns are? If those are covered in your presentation or came
> up already and I missed it, just point me that way; I'm still working my
> way through parts of this and don't expect to ever have it all in my
> head like you do at this point. Obviously this approach is going to be
> somewhat traumatic even if perfectly executed because of things like
> losing table statistics.

The TOAST problem is already addressed and script should handle it correctly.
But I don't like it much, because it is kind of magic.

Dropped column is another story. Heikki pointed me this issue in Prato and
current published version of script does not handle it. Problem is that dropped
columns are only mark as a deleted and data are still stored in tuples. Catalog
contains related information about position and length, but when you perform
dump and restore, this information is lost and columns are shifted ...

I already added to check for dropped column and now I'm going to test how
upgrade works with visibility map. I'll send this version when I finish tests.

<snip>

>
>> I personally prefer to have special mode (like boostrap) which
>> converts data from old catalog to new format.
>
> That's a perfectly fine idea I would like to see too. But if we have to
> write such a thing from scratch right now, I'm afraid that may be too
> late to implement and still ship the next release on schedule. And if
> such bootstrap code is needed, we sure need to make sure the prototype
> it's going to be built on is solid ASAP. That's what I want to help you
> look into if you can catch me up a bit here.

I agree. This is a starter for 8.3 -> 8.4, but we need more robust solution in
the future.

Zdenek


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 10:11:06
Message-ID: Pine.GSO.4.64.0812040401260.27355@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 4 Dec 2008, Zdenek Kotala wrote:

> The TOAST problem is already addressed and script should handle it correctly.
> But I don't like it much, because it is kind of magic.

I just read through the whole "toast by chunk-end" thread again and it
does seem pretty complicated. What is the magic part you're still not
happy with?

> I'll send this version when I finish tests.

You really should feel free to forward these things over as soon as you've
got something working, even if you're still running your own tests. With
all due respect to how much you've done here, the sooner we can get more
people working with and on something closer to candidate code the better.
I've have started a couple of days ago but couldn't find anything but the
old script. If some parts have comments like "this is an awful check for
dropped columns that probably doesn't even work yet", that's OK. We need
to get other people helping out with this besides you.

> Problem is that dropped columns are only mark as a deleted and data are
> still stored in tuples. Catalog contains related information about
> position and length, but when you perform dump and restore, this
> information is lost and columns are shifted ...

Here's a good example; that seems a perfect problem for somebody else to
work on. I understand it now well enough to float ideas without even
needing to see your code. Stop worring about it, I'll grab responsibility
for making sure it gets done by someone.

So, for everyone else who isn't Zdenek: when columns are dropped,
pg_attribute.attisdropped turns true and atttypid goes to 0. pg_dump
skips over them, and even if it didn't pg_restore doesn't know how to put
them back. I can think of a couple of hacks to work around this, and one
of them might even work:

1) Create a dummy type that exists only to flag these during conversion.
Re-add all the deleted columns by turning off attisdropped and flag them
with that type. Dump. Restore. Re-delete the columns. My first pass
through poking holes in this idea wonders how the dump will go crazy if it
finds rows that were created after the column was dropped, that therefore
have no value for it.

2) Query the database to find all these deleted columns and store the
information we need about them, save that into some text files (similary
to how relids are handled by the script right now). After the schema
restore, read that list in, iterating over the missing ones. For each
column that was gone, increment attnum for everything above that position
to renumber a place for it. Put a dummy column entry back in that's
already marked as deleted.

3) Wander back into pre-upgrade land by putting together something that
wanders through every table updating any row that contains data for a
dropped column. Since dropping columns isn't really common in giant data
warehouses, something that had to wander over all the tuples related to a
table that has lost a column should only need to consider a pretty small
subset of the database. You might even make it off-line without getting
too many yelps from the giant DW crowd, seems like it would be easy to
write something to estimate the amount of work needed in advance of doing
it even (before you take the system down, run a check utility that says
"The server currently has 65213 rows of data for tables with deleted
columns").

Who wants to show off how much more they know about this than me by saying
what's right or wrong with these various ideas?

If we care about the fact that columns never go away and are using (1) or
(2), could also consider adding some additional meta-data to 8.4 such that
something like vacuum can flag when a column no longer exists in any part
of the data. All deleted columns move from 8.3 to 8.4, but one day the
8.5 upgrade could finally blow them away. There's already plenty of
per-table catalog data being proposed to push into 8.4 for making future
upgrades easier, this seems like a possible candidate for something to
make space for there. As I just came to appreciate the problem I'm not
sure about that.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 11:21:10
Message-ID: 871vwo2nkp.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:

> Here's a good example; that seems a perfect problem for somebody else to work
> on. I understand it now well enough to float ideas without even needing to see
> your code. Stop worring about it, I'll grab responsibility for making sure it
> gets done by someone.
>
> So, for everyone else who isn't Zdenek: when columns are dropped,
> pg_attribute.attisdropped turns true and atttypid goes to 0. pg_dump skips
> over them, and even if it didn't pg_restore doesn't know how to put them back.
> I can think of a couple of hacks to work around this, and one of them might
> even work:
>
> 1) Create a dummy type that exists only to flag these during conversion. Re-add
> all the deleted columns by turning off attisdropped and flag them with that
> type. Dump. Restore. Re-delete the columns. My first pass through poking
> holes in this idea wonders how the dump will go crazy if it finds rows that
> were created after the column was dropped, that therefore have no value for it.

No, those records would work fine, they will have the column set NULL. But in
any case it doesn't matter, you don't need to dump out the data at all --
that's kind of the whole point of the exercise after all :)

> Who wants to show off how much more they know about this than me by saying
> what's right or wrong with these various ideas?

*blush* :)

They all seem functional ideas. But it seems to me they're all ideas that
would be appropriate if this was a pgfoundry add-on for existing releases.
But if this is an integrated feature targeting future releases we have more
flexibility and there are more integrated approaches possible.

How about adding a special syntax for CREATE TABLE which indicates to include
a dropped column in that position? Then pg_dump could have a -X option to
include those columns as placeholders. Something like:

CREATE TABLE foo (
col1 integer,
NULL COLUMN,
col2 integer
);

> If we care about the fact that columns never go away and are using (1) or (2),
> could also consider adding some additional meta-data to 8.4 such that something
> like vacuum can flag when a column no longer exists in any part of the data.
> All deleted columns move from 8.3 to 8.4, but one day the 8.5 upgrade could
> finally blow them away. There's already plenty of per-table catalog data being
> proposed to push into 8.4 for making future upgrades easier, this seems like a
> possible candidate for something to make space for there. As I just came to
> appreciate the problem I'm not sure about that.

Hm, that's an interesting idea but I think it would work differently. If the
column is dropped but there are tuples where the column is present then vacuum
could squeeze the column out and set the null bit on each tuple instead. But
that would involve a lot of I/O so it wouldn't be something we would want to
do on a regular vacuum.

Actually removing the attribute is downright hard. You would have to have the
table locked, and squeeze the null bitmap -- and if you crash in the middle
your data will be toast. I don't see much reason to worry about dropping the
attribute though. The only cases where it matters are if you're near
MaxAttrNum (1600 columns IIRC) or if it's the only null column (and in a table
with more than 8 columns).

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 12:31:05
Message-ID: 4937CD89.2080403@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark napsal(a):

> How about adding a special syntax for CREATE TABLE which indicates to include
> a dropped column in that position? Then pg_dump could have a -X option to
> include those columns as placeholders. Something like:
>
> CREATE TABLE foo (
> col1 integer,
> NULL COLUMN,
> col2 integer
> );

You need to know a size of the attribute for fetchattr function.

Zdenke


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 13:48:05
Message-ID: 87hc5k127e.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:

> Gregory Stark napsal(a):
>
>> How about adding a special syntax for CREATE TABLE which indicates to include
>> a dropped column in that position? Then pg_dump could have a -X option to
>> include those columns as placeholders. Something like:
>>
>> CREATE TABLE foo (
>> col1 integer,
>> NULL COLUMN,
>> col2 integer
>> );
>
> You need to know a size of the attribute for fetchattr function.

True. and the alignment. I guess those would have to be in the syntax as well
since we don't even know what type those columns were previously. The type
might not even exist any more.

This is an internal syntax so I don't see any reason to bother making new
keywords just to pretty up the syntax. I don't see a problem with just doing
something like "NULL COLUMN (2,1)"

The only constraint it seems to me is that it should be an unlikely string for
someone to come up with accidentally from a simple syntax error. That's why I
originally suggested two reserved keywords. But even NULL(2,1) seems like it
would be fine.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 22:08:20
Message-ID: Pine.GSO.4.64.0812041630530.26740@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 4 Dec 2008, Gregory Stark wrote:

> They all seem functional ideas. But it seems to me they're all ideas that
> would be appropriate if this was a pgfoundry add-on for existing releases.

I was mainly trying to target things that would be achievable within the
context of the existing shell script. I think that we need such a script
that does 100% of the job and can be tested ASAP. If it's possible to
slice the worst of the warts off later, great, but don't drop focus from
getting a potential candidate release done first.

> How about adding a special syntax for CREATE TABLE which indicates to
> include a dropped column in that position? Then pg_dump could have a -X
> option to include those columns as placeholders...This is an internal
> syntax so I don't see any reason to bother making new keywords just to
> pretty up the syntax. I don't see a problem with just doing something
> like "NULL COLUMN (2,1)"

It's a little bit ugly, but given the important of in-place upgrade I
would think this is a reasonable hack to consider; two questions:

-Is there anyone whose clean code sensibilities are really opposed to
adding such a syntax into the 8.4 codebase?

-If nobody has a beef about it, is this something you could draft a patch
for? I'm going to be busy with the upgrade script stuff and don't know
much about extending in this area anyway.

> Actually removing the attribute is downright hard. You would have to have the
> table locked, and squeeze the null bitmap -- and if you crash in the middle
> your data will be toast.

Not being familiar with the code, my assumption was that it would be
possible to push all the tuples involved off to another page as if they'd
been updated, with WAL logging and everything, similarly to the ideas that
keep getting kicked around for creating extra space for header expansion.
Almost the same code really, just with the target of moving everything
that references the dead column rather than moving just enough to create
the space needed. Actually doing the upgrade on the page itself does seem
quite perilous.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Greg Smith" <gsmith(at)gregsmith(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Zdenek Kotala" <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 22:27:12
Message-ID: 603c8f070812041427r2ab3eedevf33d4fb337104367@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Not being familiar with the code, my assumption was that it would be
> possible to push all the tuples involved off to another page as if they'd
> been updated, with WAL logging and everything, similarly to the ideas that
> keep getting kicked around for creating extra space for header expansion.
> Almost the same code really, just with the target of moving everything that
> references the dead column rather than moving just enough to create the
> space needed. Actually doing the upgrade on the page itself does seem quite
> perilous.

For in-place upgrade, you can tell which pages have been converted and
which have not by looking at the page header, so you can put a switch
into the code to handle each version appropriately. I don't think
that would be possible in this case without purpose-built
infrastructure. It might be possible to lock out writers only and
rewrite the table in a new file though, disk space permitting.

...Robert


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-04 23:05:26
Message-ID: 87vdtzy215.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:

> On Thu, 4 Dec 2008, Gregory Stark wrote:
>
>> They all seem functional ideas. But it seems to me they're all ideas that
>> would be appropriate if this was a pgfoundry add-on for existing releases.
>
> I was mainly trying to target things that would be achievable within the
> context of the existing shell script. I think that we need such a script that
> does 100% of the job and can be tested ASAP. If it's possible to slice the
> worst of the warts off later, great, but don't drop focus from getting a
> potential candidate release done first.

I suggested it because I thought it would be easier and less messy though.

>> How about adding a special syntax for CREATE TABLE which indicates to include
>> a dropped column in that position? Then pg_dump could have a -X
>> option to include those columns as placeholders...This is an internal syntax
>> so I don't see any reason to bother making new keywords just to pretty up the
>> syntax. I don't see a problem with just doing something like "NULL COLUMN
>> (2,1)"
>
> It's a little bit ugly, but given the important of in-place upgrade I would
> think this is a reasonable hack to consider; two questions:

Well it's ugly but it seems to me that it would be a whole lot more ugly to
have a whole pile of code which tries to adjust the table definitions to
insert "dropped" columns after the fact.

> -Is there anyone whose clean code sensibilities are really opposed to adding
> such a syntax into the 8.4 codebase?

Incidentally I got this wrong in my previous email. If we're aiming at
8.4->8.5 as the first in-place update then we actually don't need this in 8.4
at all. The recommended path is always to use the *new* pg_dump to dump the
old database even for regular updates and there would be no problem making
that mandatory for an in-place update. So as long as the 8.5 pg_dump knows how
to dump this syntax and the 8.5 create parser knows what to do with it then
that would be sufficient.

> -If nobody has a beef about it, is this something you could draft a patch for?
> I'm going to be busy with the upgrade script stuff and don't know much about
> extending in this area anyway.

It doesn't sound hard off the top of my head. CREATE TABLE is a bit tricky
though, I'll check to make sure it works.

>> Actually removing the attribute is downright hard. You would have to have the
>> table locked, and squeeze the null bitmap -- and if you crash in the middle
>> your data will be toast.
>
> Not being familiar with the code, my assumption was that it would be possible
> to push all the tuples involved off to another page as if they'd been updated,
> with WAL logging and everything, similarly to the ideas that keep getting
> kicked around for creating extra space for header expansion. Almost the same
> code really, just with the target of moving everything that references the dead
> column rather than moving just enough to create the space needed. Actually
> doing the upgrade on the page itself does seem quite perilous.

I'm sorry, I think I misunderstood the original idea, what you're talking
about makes a lot more sense now. You want to save the space of the dead
column by replacing it with NULL, not remove it from the table definition.

That should be possible to do in vacuum or some other operation that has the
vacuum lock without locking the table or introducing new tuples. Whatever does
it does need the tuple descriptor which Vacuum normally doesn't need though.
The page conversion time might be a good time since it'll need to deform all
the tuples and re-form them anyways.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-05 03:52:22
Message-ID: Pine.GSO.4.64.0812042156260.22750@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 3 Dec 2008, Bruce Momjian wrote:

> As the author of the original shell script, which was in
> /contrib/pg_upgrade, I think the code has grown to the point where it
> should be reimplemented in something like Perl.

Ah, there's the common ancestor I couldn't find. Sheesh, you learn Perl
last month, and already you're a zealot. That was fast.

As I see it the priorities for this part are:

1) Finish the shell-based pg_upgrade. The latest one we just got from
Zdenek looks almost there, just have to sort out the dropped column bits.
Start testing that in parallel with the below to figure out if there's
anything that was missed.

2) Re-evaluate what's in there vs. what's already implemented in the
C-based pg_migrator to determine the effort needed to upgrade that to
fully functional.

3) Figure out who is available to do the expected work on schedule.

4) Determine what language they're going to do it in and whether the task
is:
a) Re-implementing the script in a more portable and powerful scripting
language like Perl or Python
b) Adding the additional features needed to complete pg_migrator
c) Writing an implementation into core via some bootstrap process

You suggested (a), I was the latest in a series of people to suggest (b),
and Zdenek suggested (c). They all have different trade-offs in terms of
development time and expected quality of the resulting tool. At this
point we'll be lucky to get anything done, and I think we may have to
settle for whichever of the above options seems most likely to finish
based on the skills of the person(s) doing the job.

I think (c) may be out just because there will be increasing pressure for
a final code freeze on anything in core, so that beta testing can begin on
Jan 1. Whereas something that's going to end up in contrib like either
the final pg_upgrade or an improved pg_migrator might get cut a bit more
slack for starting beta less polished than the core code. (Insert retort
from Tom about how that's not the way it's done here)

Additionally, it may be the case that putting the appropriate hooks in to
support 8.4->8.5 upgrades that have been suggested (dedicated free space
management, catalog support, etc.) is the only thing that ships on time,
and that the 8.4 announcement just mentions "a community tool for in-place
upgrades from 8.3->8.4 is in currently in beta and can be downloaded from
pgforge". That retreat position goes away if you've commited to putting
the whole thing in core.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-05 07:47:30
Message-ID: Pine.GSO.4.64.0812050134460.22750@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 4 Dec 2008, Gregory Stark wrote:

> Incidentally I got this wrong in my previous email. If we're aiming at
> 8.4->8.5 as the first in-place update then we actually don't need this in 8.4
> at all.

I don't know about everybody else, but I haven't give up on putting
together something that works for 8.3. That's why I'm still trying to
chase down an approach, even if it's not as elegant as would be possible
for 8.4->8.5.

> It doesn't sound hard off the top of my head. CREATE TABLE is a bit tricky
> though, I'll check to make sure it works.

Support for NULL bits in CREATE TABLE might be a helpful trick to have
available for this at some point. I'm not sure if it's actually worth
doing after the rest of your comments though; see below.

> I'm sorry, I think I misunderstood the original idea, what you're talking
> about makes a lot more sense now. You want to save the space of the dead
> column by replacing it with NULL, not remove it from the table definition.

Not so much to save the space, it's more about breaking its need for the
soon to be removed pg_attribute that used to lead to the dropped column.
If you think it's possible to slip that change into the page conversion
task, that's probably the ideal way to deal with this.

If 8.4 has the appropriate catalog support to finally execute the full
page upgrade vision Zdenek and everybody else has mapped out, it would
make me feel better about something that just hacked around this problem
crudely for 8.3->8.4. Knowing that a future 8.5 update could finally blow
away the bogus dropped columns makes leaving them in there for this round
not as bad, and it would avoid needing to mess with the whole
pg_dump/CREATE TABLE with NULL bit.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-05 08:48:37
Message-ID: 87myfbxb16.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:

>> I'm sorry, I think I misunderstood the original idea, what you're talking
>> about makes a lot more sense now. You want to save the space of the dead
>> column by replacing it with NULL, not remove it from the table definition.
>
> Not so much to save the space, it's more about breaking its need for the soon
> to be removed pg_attribute that used to lead to the dropped column. If you
> think it's possible to slip that change into the page conversion task, that's
> probably the ideal way to deal with this.

Removing the dropped column attribute is what I think we cannot do. If you
crashed while during that it would leave you with half the table with the
columns in the old attribute position and half the columns in the new
attribute position. There would be no way to tell which was which.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: In-place upgrade: catalog side
Date: 2008-12-05 12:15:31
Message-ID: 200812051215.mB5CFVs11094@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith wrote:
> On Wed, 3 Dec 2008, Bruce Momjian wrote:
>
> > As the author of the original shell script, which was in
> > /contrib/pg_upgrade, I think the code has grown to the point where it
> > should be reimplemented in something like Perl.
>
> Ah, there's the common ancestor I couldn't find. Sheesh, you learn Perl
> last month, and already you're a zealot. That was fast.

Yes, I think that is accurate. ;-)

I think the conversion to a more portable and powerful language is a
long-term goal, rather than something we have to do now.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +