TODO item: teach pg_dump about sparsely-stored large objects

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: TODO item: teach pg_dump about sparsely-stored large objects
Date: 2012-10-09 02:40:51
Message-ID: 18789.1349750451@sss.pgh.pa.us
Lists: pgsql-hackers

The backend code for large objects goes to some lengths to be
intelligent about sparsely-written blobs: if you seek out to the middle
of nowhere and write a few bytes, you don't end up allocating space in
pg_largeobject for all the byte positions you skipped over. However,
pg_dump knows nothing of this. If you pg_dump a sparsely-stored large
object, it will tediously transfer all those nonexistent zeros from
server to client, and write them into the resulting archive. And then
when you restore, the blob isn't sparse anymore ... those zeroes become
non-virtual on the database side too.
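
For concreteness, here's a minimal libpq sketch that manufactures such a
sparse blob (error checking elided; the empty conninfo just means default
connection parameters):

    /* sparse_blob.c -- create a sparsely-stored large object.
       Build with: cc sparse_blob.c -lpq
       (plus -I/-L paths from pg_config, as needed) */
    #include <stdio.h>
    #include <libpq-fe.h>
    #include <libpq/libpq-fs.h>     /* INV_WRITE */

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("");

        /* large-object calls must run inside a transaction */
        PQclear(PQexec(conn, "begin"));

        Oid         blob = lo_creat(conn, INV_WRITE);
        int         fd = lo_open(conn, blob, INV_WRITE);

        /* seek 100MB out and write a few bytes; none of the skipped
           byte positions get pages in pg_largeobject */
        lo_lseek(conn, fd, 100 * 1024 * 1024, SEEK_SET);
        lo_write(conn, fd, "hello", 5);

        lo_close(conn, fd);
        PQclear(PQexec(conn, "commit"));
        PQfinish(conn);
        return 0;
    }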

Admittedly, this is no different than what happens when you try to back
up a sparsely-stored Unix file, at least with simpler backup tools.
But it seems to me we should try a bit harder.

There are a couple of stumbling blocks to making that happen:

* How should pg_dump find out where there are holes? It would be easy
if it were to look into pg_largeobject, but that would destroy the
ability to use pg_dump as non-superuser. I think we'd really have to
provide some API to read from a blob in a sparse-storage-aware manner.
The first idea that comes to mind is some way to tell lo_read to stop
reading when it hits a gap (instead of manufacturing zeroes) and then
a new "whence" option for lo_lseek that tells it to seek to the next
non-dummy data in the blob. (A sketch of a dump loop built on that
sort of API follows the second item below.)

* How do we get pg_dump to make use of the knowledge once it's got it?
The current code in that area is a masterpiece of ugly unreadability;
it's near impossible to tell what connects to what else, and there are
assorted magic switches that completely change the behavior of major
interface functions. I'd kind of want to rewrite the whole mess before
trying to change its behavior. I'm also pretty certain that we'd need
an archive format change, though we've certainly done those before so
that isn't a fatal objection.

I'm not planning to do anything about this myself, but if someone is
looking for a project, here's one.

regards, tom lane


From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO item: teach pg_dump about sparsely-stored large objects
Date: 2012-10-09 15:44:44
Message-ID: CAM-w4HOvd-OKJ=hy72YsRV72SusSMX4rYpbAvt+RvfkJDYtQDw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 9, 2012 at 3:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Admittedly, this is no different than what happens when you try to back
> up a sparsely-stored Unix file, at least with simpler backup tools.
> But it seems to me we should try a bit harder.

Fwiw both GNU tar and GNU cp support creating sparse files. They do it
by just detecting blocks of NULs and skipping over them. pg_restore
could do that today without any API changes.
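
The same trick would work on our side with today's API: when a block of
the dump data is all zeroes, lo_lseek over it instead of lo_write-ing it,
and the restored blob comes out sparse. A sketch (BLOCKSZ and the
function names are arbitrary):

    /* uses only existing libpq calls */
    #include <string.h>
    #include <stdio.h>              /* SEEK_CUR */
    #include <libpq-fe.h>

    #define BLOCKSZ 8192

    static int
    all_zeroes(const char *buf, int len)
    {
        return len > 0 && buf[0] == '\0' &&
            memcmp(buf, buf + 1, len - 1) == 0;
    }

    static void
    restore_blob_block(PGconn *conn, int fd, const char *buf, int len)
    {
        if (len == BLOCKSZ && all_zeroes(buf, len))
            lo_lseek(conn, fd, len, SEEK_CUR);  /* leave a hole */
        else
            lo_write(conn, fd, buf, len);
    }

One wrinkle: a blob that ends in zeroes would need a final lo_truncate
(or a one-byte write at the last offset) so the restored blob keeps its
full length.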

That said, an API to save pg_dump the time and space of reading the
fake zeros out of the database doesn't sound like a bad thing.

--
greg