Re: parallel pg_restore - WIP patch

Lists: pgsql-hackers
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: parallel pg_restore - WIP patch
Date: 2008-09-25 03:02:51
Message-ID: 48DAFF5B.7070208@dunslane.net


Attached is my WIP patch for parallel pg_restore. It's still very rough,
but seems to work.

Anyone who can test this with high-end equipment would be helping some.

cheers

andrew

Attachment Content-Type Size
parallel_restore_1.patch text/x-patch 28.2 KB

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 08:27:13
Message-ID: 48DC9CE1.3010707@kaltenbrunner.cc

Andrew Dunstan wrote:
>
> Attached is my WIP patch for parallel pg_restore. It's still very rough,
> but seems to work.
>
> Anyone who can test this with high-end equipment would be helping some.

tried playing with this (on a 22GB compressed dump using 4 connections)
but it does not seem to work at all for me:

pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid block type
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid stored
block lengths
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid
distance too far back
pg_restore: [custom archiver] could not uncompress data: invalid
distances set
pg_restore: [custom archiver] could not uncompress data: invalid code
lengths set
pg_restore: [custom archiver] could not uncompress data: incorrect data
check
pg_restore: [custom archiver] could not uncompress data: invalid code
lengths set
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] out of memory
pg_restore: [custom archiver] could not uncompress data: invalid
literal/length code
pg_restore: [custom archiver] could not uncompress data: invalid
literal/length code
pg_restore: [custom archiver] could not uncompress data: invalid block type

each pg_restore process seems to eat a few gigabytes of memory in a few
seconds.

Stefan


From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 10:55:17
Message-ID: 48DCBF95.2060206@pws.com.au

Hi,

As I'm interested in this topic, I thought I'd take a look at the
patch. I have no capability to test it on high end hardware but did
some basic testing on my workstation and basic review of the patch.

I somehow had the impression that instead of creating a new connection
for each restore item we would create the processes at the start and
then send them the dumpId's they should be restoring. That would allow
the controller to batch dumpId's together and expect the worker to
process them in a transaction. But this is probably just an idea I
created in my head.

Do we know why we experience "tuple concurrently updated" errors if we
spawn threads too fast?

I completed some test restores using the pg_restore from head with the
patch applied. The dump was a custom dump created with pg 8.2 and
restored to an 8.2 database. To confirm this would work, I completed a
restore using the standard single-threaded mode. The schema restored
successfully. The only errors reported involved non-existent roles.

When I attempt to restore using parallel restore I get out of memory
errors reported from _PrintData. The code returning the error is;

_PrintData(...
    while (blkLen != 0)
    {
        if (blkLen + 1 > ctx->inSize)
        {
            free(ctx->zlibIn);
            ctx->zlibIn = NULL;
            ctx->zlibIn = (char *) malloc(blkLen + 1);
            if (!ctx->zlibIn)
                die_horribly(AH, modulename, " out of memory\n");

            ctx->inSize = blkLen + 1;
            in = ctx->zlibIn;
        }

It appears from my debugging and looking at the code that in _PrintData;
lclContext *ctx = (lclContext *) AH->formatData;

the memory context is shared across all threads. Which means that it's
possible the memory contexts are stomping on each other. My GDB skills
are not up to reproducing this in a gdb session, as there are
forks going on all over the place. And if you process them in a serial
fashion, there aren't any errors. I'm not sure of the fix for this.
But in a parallel environment it doesn't seem possible to store the
memory context in the AH.

I also receive messages saying "pg_restore: [custom archiver] could not
read from input file: end of file". I have not investigated these
further as my current guess is they are linked to the out of memory error.

Given I ran into this error at my first testing attempt I haven't
evaluated much else at this point in time. Now all this could be
because I'm using the 8.2 archive, but it works fine in single restore
mode. The dump file is about 400M compressed and an entire archive
schema was removed from the restore path with a custom restore list.

Command line used; PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4
--truncate-before-load -v -d tt2 -L tt.list
/home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt

I originally sent the log with this email to the list, but I think the attachment was too large, so I've resent it without any attachments. Since my initial testing, Stefan has confirmed the problem I am having.

If you have any questions, would like me to run other tests or anything,
feel free to contact me.

Regards

Russell


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 12:29:04
Message-ID: 48DCD590.4000207@dunslane.net

Russell Smith wrote:
> Hi,
>
> As I'm interested in this topic, I thought I'd take a look at the
> patch. I have no capability to test it on high end hardware but did
> some basic testing on my workstation and basic review of the patch.
>
> I somehow had the impression that instead of creating a new connection
> for each restore item we would create the processes at the start and
> then send them the dumpId's they should be restoring. That would allow
> the controller to batch dumpId's together and expect the worker to
> process them in a transaction. But this is probably just an idea I
> created in my head.
>

Yes it is. To do that I would have to invent a protocol for talking to
the workers, etc., and there is not the slightest chance I would get that
done by November.
And I don't see the virtue in processing them all in a transaction. I've
provided a much simpler means of avoiding WAL logging of the COPY.
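
(For illustration: the patch's --truncate-before-load option presumably
leans on the existing optimization that a COPY into a table truncated in
the same transaction need not be WAL-logged when archiving is off. A
minimal libpq sketch of that transaction shape - "mytable" is a
placeholder, and PGresult cleanup and error checks are elided:)

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=tt2");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* truncate and load in one transaction so the COPY data
     * need not be WAL-logged (sketch; results not checked) */
    PQexec(conn, "BEGIN");
    PQexec(conn, "TRUNCATE mytable");
    PQexec(conn, "COPY mytable FROM stdin");
    /* ... stream rows with PQputCopyData(), then PQputCopyEnd() ... */
    PQexec(conn, "COMMIT");

    PQfinish(conn);
    return 0;
}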

> Do we know why we experience "tuple concurrently updated" errors if we
> spawn threads too fast?
>

No. That's an open item.

> I completed some test restores using the pg_restore from head with the
> patch applied. The dump was a custom dump created with pg 8.2 and
> restored to an 8.2 database. To confirm this would work, I completed a
> restore using the standard single-threaded mode. The schema restored
> successfully. The only errors reported involved non-existent roles.
>
> When I attempt to restore using parallel restore I get out of memory
> errors reported from _PrintData. The code returning the error is;
>
> _PrintData(...
>     while (blkLen != 0)
>     {
>         if (blkLen + 1 > ctx->inSize)
>         {
>             free(ctx->zlibIn);
>             ctx->zlibIn = NULL;
>             ctx->zlibIn = (char *) malloc(blkLen + 1);
>             if (!ctx->zlibIn)
>                 die_horribly(AH, modulename, " out of memory\n");
>
>             ctx->inSize = blkLen + 1;
>             in = ctx->zlibIn;
>         }
>
>
> It appears from my debugging and looking at the code that in _PrintData;
> lclContext *ctx = (lclContext *) AH->formatData;
>
> the memory context is shared across all threads. Which means that it's
> possible the memory contexts are stomping on each other. My GDB skills
> are not up to reproducing this in a gdb session, as there are
> forks going on all over the place. And if you process them in a serial
> fashion, there aren't any errors. I'm not sure of the fix for this.
> But in a parallel environment it doesn't seem possible to store the
> memory context in the AH.
>

There are no threads, hence nothing is shared. fork() creates a new
process, not a new thread, and all they share are file descriptors.

> I also receive messages saying "pg_restore: [custom archiver] could not
> read from input file: end of file". I have not investigated these
> further as my current guess is they are linked to the out of memory error.
>
> Given I ran into this error at my first testing attempt I haven't
> evaluated much else at this point in time. Now all this could be
> because I'm using the 8.2 archive, but it works fine in single restore
> mode. The dump file is about 400M compressed and an entire archive
> schema was removed from the restore path with a custom restore list.
>
> Command line used; PGPORT=5432 ./pg_restore -h /var/run/postgresql -m4
> --truncate-before-load -v -d tt2 -L tt.list
> /home/mr-russ/pg-index-test/timetable.pgdump 2> log.txt
>
> I've attached the log.txt file so you can review the errors that I saw.
> I have adjusted the "out of memory" error to include a number to work
> out which one was being triggered. So you'll see "5 out of memory" in
> the log file, which corresponds to the code above.
>

However, there does seem to be something odd happening with the
compression lib, which I will investigate. Thanks for the report.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 12:37:26
Message-ID: 48DCD786.2020105@dunslane.net

Stefan Kaltenbrunner wrote:
> Andrew Dunstan wrote:
>>
>> Attached is my WIP patch for parallel pg_restore. It's still very
>> rough, but seems to work.
>>
>> Anyone who can test this with high-end equipment would be helping some.
>
> tried playing with this (on a 22GB compressed dump using 4 connections)
> but it does not seem to work at all for me:
>
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] could not uncompress data: invalid block
> type
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] could not uncompress data: invalid
> stored block lengths
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] could not uncompress data: invalid
> distance too far back
> pg_restore: [custom archiver] could not uncompress data: invalid
> distances set
> pg_restore: [custom archiver] could not uncompress data: invalid code
> lengths set
> pg_restore: [custom archiver] could not uncompress data: incorrect
> data check
> pg_restore: [custom archiver] could not uncompress data: invalid code
> lengths set
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] out of memory
> pg_restore: [custom archiver] could not uncompress data: invalid
> literal/length code
> pg_restore: [custom archiver] could not uncompress data: invalid
> literal/length code
> pg_restore: [custom archiver] could not uncompress data: invalid block
> type
>
> each pg_restore process seems to eat a few gigabytes of memory in a few
> seconds.

Ouch. Ok, Thanks for the report. I will investigate.

cheers

andrew


From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 12:56:24
Message-ID: 48DCDBF8.1020501@pws.com.au

Andrew Dunstan wrote:
>> Do we know why we experience "tuple concurrently updated" errors if we
>> spawn threads too fast?
>>
>
> No. That's an open item.

Okay, I'll see if I can have a little more of a look into it. No
promises, as the restore isn't playing nicely.
>
>>
>> the memory context is shared across all threads. Which means that it's
>> possible the memory contexts are stomping on each other. My GDB skills
>> are not up to reproducing this in a gdb session, as there are
>> forks going on all over the place. And if you process them in a serial
>> fashion, there aren't any errors. I'm not sure of the fix for this.
>> But in a parallel environment it doesn't seem possible to store the
>> memory context in the AH.
>>
>
>
> There are no threads, hence nothing is shared. fork() creates a new
> process, not a new thread, and all they share are file descriptors.
>
> However, there does seem to be something odd happening with the
> compression lib, which I will investigate. Thanks for the report.

I'm sorry, I meant processes there. I'm aware there are no threads.
But my feeling was that when you forked with open files you got all of
the open file properties, including positions, and as you dupped the
descriptor, you share all that it's pointing to with every other copy of
the descriptor. My brief research on that shows that in 2005 there was
a kernel mailing list discussion on this issue.
http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite
informative for me. I could be wrong again, but it's worth a read. If it is
true, then the file needs to be reopened by each child; it can't use the
duplicated descriptor. I haven't had a chance to implementation-test this
as it's late here, but I'd take a stab that it will solve the
compression library problems.
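
The shared position is easy to demonstrate with a standalone sketch (not
from the patch): a descriptor inherited across fork() shares one file
offset with the parent, so "concurrent" readers interleave on it:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/wait.h>

int
main(void)
{
    char buf[4];
    int  fd = open("/etc/hostname", O_RDONLY);  /* any readable file */

    if (fork() == 0)
    {
        read(fd, buf, sizeof(buf));   /* child reads bytes 0-3 */
        _exit(0);
    }
    wait(NULL);

    /* the child's read moved the parent's position too: prints 4 */
    printf("parent offset: %ld\n", (long) lseek(fd, 0, SEEK_CUR));
    return 0;
}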

I hope this helps, not hinders

Russell.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 13:25:04
Message-ID: 48DCE2B0.4060101@dunslane.net

Russell Smith wrote:
> I'm sorry, I meant processes there. I'm aware there are no threads.
> But my feeling was that when you forked with open files you got all of
> the open file properties, including positions, and as you dupped the
> descriptor, you share all that it's pointing to with every other copy of
> the descriptor. My brief research on that shows that in 2005 there was
> a kernel mailing list discussion on this issue.
> http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite
> informative for me. I could be wrong again, but it's worth a read. If it is
> true, then the file needs to be reopened by each child; it can't use the
> duplicated descriptor. I haven't had a chance to implementation-test this
> as it's late here, but I'd take a stab that it will solve the
> compression library problems.
>
> I hope this helps, not hinders
>
>
>

I'm sure that's the problem. Should be fairly easily fixable, I believe.
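
Presumably each child just needs to reopen the archive by pathname so it
gets a private file position. Something like this sketch (not the actual
patch; AH->fSpec and AH->FH are the archiver's file spec and handle, and
restore_one_item() is a hypothetical stand-in for the worker body):

if (fork() == 0)
{
    /* child: drop the inherited handle and take a private one,
     * so our reads don't move anyone else's file position */
    fclose(AH->FH);
    AH->FH = fopen(AH->fSpec, PG_BINARY_R);
    if (!AH->FH)
        die_horribly(AH, modulename, "could not reopen archive file\n");

    restore_one_item(AH, te);   /* hypothetical worker body */
    _exit(0);
}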

Thanks for the info.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 15:20:56
Message-ID: 48DCFDD8.4090702@dunslane.net

This version of the patch should fix the "shared file descriptor" bug
Russell Smith noticed. It also disables the 1/2 second sleep between
forks, so the performance on a small db (regression) is vastly improved.

cheers

andrew

Attachment Content-Type Size
parallel_restore_2.patch text/x-patch 32.4 KB

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 21:03:51
Message-ID: 48DD4E37.3090905@kaltenbrunner.cc

Andrew Dunstan wrote:
>
>
> This version of the patch should fix the "shared file descriptor" bug
> Russell Smith noticed. It also disables the 1/2 second sleep between
> forks, so the performance on a small db (regression) is vastly improved.

this works better but there is something fishy still - using the same
dump file I get a proper restore using pg_restore normally. If I however
use -m for a parallel one I only get parts (in this case only 243 of the
709 tables) of the database restored ...

Stefan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 21:10:44
Message-ID: 48DD4FD4.6090907@dunslane.net

Stefan Kaltenbrunner wrote:
> Andrew Dunstan wrote:
>>
>>
>> This version of the patch should fix the "shared file descriptor" bug
>> Russell Smith noticed. It also disables the 1/2 second sleep between
>> forks, so the performance on a small db (regression) is vastly improved.
>
> this works better but there is something fishy still - using the same
> dump file I get a proper restore using pg_restore normally. If I
> however use -m for a parallel one I only get parts (in this case only
> 243 of the 709 tables) of the database restored ...
>
>
>

Yes, there are several funny things going on, including some stuff with
dependencies. I'll have a new patch tomorrow with luck. Thanks for testing.

cheers

andrew


From: Joshua Drake <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 21:21:34
Message-ID: 20080926142134.5e40ffcb@jd-laptop

On Fri, 26 Sep 2008 17:10:44 -0400
Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> Yes, there are several funny things going on, including some stuff
> with dependencies. I'll have a new patch tomorrow with luck. Thanks
> for testing.

O.k. I took a look at the patch itself and although I don't understand
all of it, there were a couple of red flags to me:

+ if (ropt->create)
+     die_horribly(AH, modulename,
+                  "parallel restore is incompatible with --create\n");
+

This seems like an odd limitation. In my mind, the schema would not be
restored in parallel. The schema before data would restore as a single
thread. Even the largest schemas would only take minutes (if that).
Thus something like --create should never be a problem.

I also noticed you check if we have zlib? Is it even possible to use
the custom format without it? (That would be new to me.)

I noticed this line:

+ while ((next_work_item = get_next_work_item(AH)) != NULL)
+ {
+     /* XXX need to improve this test in case there is no table data */
+     /* need to test for indexes, FKs, PK, Unique, etc */
+     if (strcmp(next_work_item->desc, "TABLE DATA") == 0)
+         break;
+     (void) _restore_one_te(AH, next_work_item, ropt, false);
+
+     next_work_item->prestored = true;
+
+     _reduce_dependencies(AH, next_work_item);
+ }

Instead of the TABLE DATA compare, perhaps it makes sense to back-patch
pg_dump to have a line delimiter in the TOC? That way even if there is
no TABLE DATA there would be a delimiter that says:

--- BEGIN TABLE DATA
--- END TABLE DATA

Thus if nothing is there... nothing is there?

+ /* delay just long enough between forks to give the catalog some
+  * breathing space. Without this sleep I got
+  * "tuple concurrently updated" errors.
+  */
+ pg_usleep(500000);
+ continue;    /* in case the slots are not yet full */
+ }

Could that be solved with a lock instead? Once the lock is released....

Anyway... just some thoughts. I apologize if I misunderstood the patch.

Sincerely,

Joshua D. Drake

>
> cheers
>
> andrew
>

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Joshua Drake <jd(at)commandprompt(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 22:23:30
Message-ID: 48DD60E2.5080708@dunslane.net

Joshua Drake wrote:
> On Fri, 26 Sep 2008 17:10:44 -0400
> Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>
>> Yes, there are several funny things going on, including some stuff
>> with dependencies. I'll have a new patch tomorrow with luck. Thanks
>> for testing.
>>
>
> O.k. I took a look at the patch itself and although I don't understand
> all of it, there were a couple of red flags to me:
>
> + if (ropt->create)
> +     die_horribly(AH, modulename,
> +                  "parallel restore is incompatible with --create\n");
> +
>
> This seems like an odd limitation. In my mind, the schema would not be
> restored in parallel. The schema before data would restore as a single
> thread. Even the largest schemas would only take minutes (if that).
> Thus something like --create should never be a problem.
>

Originally I had everything restoring in parallel. Now I am in fact (as
the patch should have shown you) restoring the first part in a single
thread like you say. Thus I can probably relax that restriction. I will
look and see.

> I also noticed you check if we have zlib? Is it even possible to use
> the custom format without it? (That would be new to me.)
>
> I noticed this line:
>
>
> + while ((next_work_item = get_next_work_item(AH)) != NULL)
> + {
> +     /* XXX need to improve this test in case there is no table data */
> +     /* need to test for indexes, FKs, PK, Unique, etc */
> +     if (strcmp(next_work_item->desc, "TABLE DATA") == 0)
> +         break;
> +     (void) _restore_one_te(AH, next_work_item, ropt, false);
> +
> +     next_work_item->prestored = true;
> +
> +     _reduce_dependencies(AH, next_work_item);
> + }
>
>
> Instead of the TABLE DATA compare, perhaps it makes sense to back-patch
> pg_dump to have a line delimiter in the TOC? That way even if there is
> no TABLE DATA there would be a delimiter that says:
>
> --- BEGIN TABLE DATA
> --- END TABLE DATA
>
> Thus if nothing is there... nothing is there?
>

The TOC isn't stored as a text file. So we'll need to look by entry
tags. It's no big deal - there aren't a huge number.

> + /* delay just long enough between forks to give the catalog some
> +  * breathing space. Without this sleep I got
> +  * "tuple concurrently updated" errors.
> +  */
> + pg_usleep(500000);
> + continue;    /* in case the slots are not yet full */
> + }
>
> Could that be solved with a lock instead? Once the lock is released....
>

That sleep is now gone.

> Anyway... just some thoughts. I apologize if I misunderstood the patch.
>
>
>

No problem. Thanks for looking.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 02:58:24
Message-ID: 48E04450.9080501@dunslane.net

Andrew Dunstan wrote:
>
>
>>
>> this works better but there is something fishy still - using the same
>> dump file I get a proper restore using pg_restore normally. If I
>> however use -m for a parallel one I only get parts (in this case only
>> 243 of the 709 tables) of the database restored ...
>>
>>
>>
>
> Yes, there are several funny things going on, including some stuff
> with dependencies. I'll have a new patch tomorrow with luck. Thanks
> for testing.
>
>

OK, in this version a whole heap of bugs are fixed, mainly those to do
with dependencies and saved state. I get identical row counts in the
source and destination now, quite reliably.

cheers

andrew

Attachment Content-Type Size
parallel_restore_3.patch text/x-patch 36.4 KB

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 11:55:03
Message-ID: 48E0C217.4070007@kaltenbrunner.cc

Andrew Dunstan wrote:
>
>
> Andrew Dunstan wrote:
>>
>>
>>>
>>> this works better but there is something fishy still - using the same
>>> dump file I get a proper restore using pg_restore normally. If I
>>> however use -m for a parallel one I only get parts (in this case only
>>> 243 of the 709 tables) of the database restored ...
>>>
>>>
>>>
>>
>> Yes, there are several funny things going on, including some stuff
>> with dependencies. I'll have a new patch tomorrow with luck. Thanks
>> for testing.
>>
>>
>
> OK, in this version a whole heap of bugs are fixed, mainly those to do
> with dependencies and saved state. I get identical row counts in the
> source and destination now, quite reliably.

this looks much better (for a restore that usually takes 180min I can
get down to 72min using -m 4) - however, especially with higher
concurrency, I'm sometimes running into restore failures due to deadlocks
during constraint restoration (slightly redacted):

pg_restore: [archiver (db)] Error from TOC entry 7765; 2606 1460743180
FK CONSTRAINT fk_av_relations_av db_owner
pg_restore: [archiver (db)] could not execute query: ERROR: deadlock
detected
DETAIL: Process 18100 waits for AccessExclusiveLock on relation
1460818342 of database 1460815284; blocked by process 18103.
Process 18103 waits for AccessExclusiveLock on relation 1460818336 of
database 1460815284; blocked by process 18100.
HINT: See server log for query details.

ALTER TABLE ONLY foo
ADD CONSTRAINT fk_av_relations_av FOREIGN KEY (vs_id) REFERENCES
bar ...


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 12:06:23
Message-ID: 9592.1222689983@sss.pgh.pa.us

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> pg_restore: [archiver (db)] could not execute query: ERROR: deadlock
> detected
> DETAIL: Process 18100 waits for AccessExclusiveLock on relation
> 1460818342 of database 1460815284; blocked by process 18103.
> Process 18103 waits for AccessExclusiveLock on relation 1460818336 of
> database 1460815284; blocked by process 18100.
> HINT: See server log for query details.

> ALTER TABLE ONLY foo
> ADD CONSTRAINT fk_av_relations_av FOREIGN KEY (vs_id) REFERENCES
> bar ...

Hmm, I'll bet the restore code doesn't realize that this can't run in
parallel with index creation on either table ...

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 12:32:31
Message-ID: 48E0CADF.8000009@dunslane.net

Tom Lane wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>
>> pg_restore: [archiver (db)] could not execute query: ERROR: deadlock
>> detected
>> DETAIL: Process 18100 waits for AccessExclusiveLock on relation
>> 1460818342 of database 1460815284; blocked by process 18103.
>> Process 18103 waits for AccessExclusiveLock on relation 1460818336 of
>> database 1460815284; blocked by process 18100.
>> HINT: See server log for query details.
>>
>
>
>> ALTER TABLE ONLY foo
>> ADD CONSTRAINT fk_av_relations_av FOREIGN KEY (vs_id) REFERENCES
>> bar ...
>>
>
> Hmm, I'll bet the restore code doesn't realize that this can't run in
> parallel with index creation on either table ...
>
>
>

Yeah. Of course, it's never needed to bother with stuff like that till now.

The very simple fix is probably to run a separate parallel cycle just
for FKs, after the index creation.

A slightly more elegant fix would probably be to add dependencies from
each index that might cause this to the FK constraint.

I'll work on the first for now.

Is there any chance that the locks we're taking here are too strong?
Intuitively it looks a bit like it.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 12:39:24
Message-ID: 10177.1222691964@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> Hmm, I'll bet the restore code doesn't realize that this can't run in
>> parallel with index creation on either table ...

> Yeah. Of course, it's never needed to bother with stuff like that till now.

> The very simple fix is probably to run a separate parallel cycle just
> for FKs, after the index creation.

Um, FKs could conflict with each other too, so that by itself isn't
gonna fix anything.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 13:02:38
Message-ID: 48E0D1EE.9010504@dunslane.net

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> Tom Lane wrote:
>>
>>> Hmm, I'll bet the restore code doesn't realize that this can't run in
>>> parallel with index creation on either table ...
>>>
>
>
>> Yeah. Of course, it's never needed to bother with stuff like that till now.
>>
>
>
>> The very simple fix is probably to run a separate parallel cycle just
>> for FKs, after the index creation.
>>
>
> Um, FKs could conflict with each other too, so that by itself isn't
> gonna fix anything.
>
>
>

Good point. Looks like we'll need to make a list of "can't run in
parallel with" items as well as strict dependencies.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 13:17:12
Message-ID: 10663.1222694232@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> Um, FKs could conflict with each other too, so that by itself isn't
>> gonna fix anything.

> Good point. Looks like we'll need to make a list of "can't run in
> parallel with" items as well as strict dependencies.

Yeah, I was just thinking about that. The current archive format
doesn't really carry enough information for this. I think there
are two basic solutions we could adopt:

* Extend the archive format to provide some indication that "restoring
this object requires exclusive access to these dependencies".

* Hardwire knowledge into pg_restore that certain types of objects
require exclusive access to their dependencies.

The former seems more flexible, as well as more in tune with the basic
design assumption that pg_restore shouldn't have a lot of knowledge
about individual archive object types. But it would mean that you
couldn't use parallel restore with any pre-8.4 dumps. In the long run
that's no big deal, but in the short run it's annoying.

Another angle is that it's not clear what happens if the need for
exclusive access changes over time. You were just speculating about
reducing the lock strength required for ALTER TABLE ADD FOREIGN KEY.
I don't know if that's workable or not, but certainly reducing the
lock strength for some types of ALTER TABLE might be in our future.
Contrarily, we don't currently try hard to lock any non-table objects
(schemas, functions, etc) while building dependent objects; but that's
obviously not really right, and someday we might decide to fix it.
So having pg_dump prepare the list of exclusive dependencies at dump
time might be the wrong thing --- it would reflect the behavior of
the source server version, not the target which is what matters.

Thoughts?

regards, tom lane


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 13:51:40
Message-ID: 200809291551.42729.dfontaine@hi-media.com

On Monday 29 September 2008, Tom Lane wrote:
> * Extend the archive format to provide some indication that "restoring
> this object requires exclusive access to these dependencies".
>
> * Hardwire knowledge into pg_restore that certain types of objects
> require exclusive access to their dependencies.

Well, it seems to me that currently the FK needs in terms of existing indexes
and locks, and some other objects' lock needs, are all hardwired. Is it even
safe to consider having the locks needed for certain commands not be
hardwired?

Provided I'm not all wrong here, I don't see how having something more
flexible at restore time than at build time is a win. The drawback is that
whenever you change a lock need in commands, you have to remember to teach
pg_restore about it too.

So my vote here is in favor of hardwired knowledge of pg_restore, matching
target server code assumptions and needs.

Regards,
--
dim


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 14:25:10
Message-ID: 48E0E546.9030204@dunslane.net

Dimitri Fontaine wrote:
> On Monday 29 September 2008, Tom Lane wrote:
>
>> * Extend the archive format to provide some indication that "restoring
>> this object requires exclusive access to these dependencies".
>>
>> * Hardwire knowledge into pg_restore that certain types of objects
>> require exclusive access to their dependencies.
>>
>
> Well, it seems to me that currently the FK needs in terms of existing indexes
> and locks, and some other objects' lock needs, are all hardwired. Is it even
> safe to consider having the locks needed for certain commands not be
> hardwired?
>
> Provided I'm not all wrong here, I don't see how having something more
> flexible at restore time than at build time is a win. The drawback is that
> whenever you change a lock need in commands, you have to remember to teach
> pg_restore about it too.
>
> So my vote here is in favor of hardwired knowledge of pg_restore, matching
> target server code assumptions and needs.
>
>

Well, I've had to use some knowledge of various item types already, and
I have been trying not to disturb pg_dump also, so I'm inclined to build
this knowledge into pg_restore.

ISTM that "things that will have lock conflicts" are different and more
target version dependent than "things that logically depend on other
things", so we can still rely on pg_dump to some extent to provide the
latter while building the former at restore time.

cheers

andrew


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 19:54:42
Message-ID: 48E13282.5030605@kaltenbrunner.cc

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> Tom Lane wrote:
>>> Um, FKs could conflict with each other too, so that by itself isn't
>>> gonna fix anything.
>
>> Good point. Looks like we'll need to make a list of "can't run in
>> parallel with" items as well as strict dependencies.
>
> Yeah, I was just thinking about that. The current archive format
> doesn't really carry enough information for this. I think there
> are two basic solutions we could adopt:
>
> * Extend the archive format to provide some indication that "restoring
> this object requires exclusive access to these dependencies".
>
> * Hardwire knowledge into pg_restore that certain types of objects
> require exclusive access to their dependencies.
>
> The former seems more flexible, as well as more in tune with the basic
> design assumption that pg_restore shouldn't have a lot of knowledge
> about individual archive object types. But it would mean that you
> couldn't use parallel restore with any pre-8.4 dumps. In the long run
> that's no big deal, but in the short run it's annoying.

hmm not sure how much of a problem that really is - we usually recommend
using the pg_dump version of the target database anyway.

Stefan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-29 23:38:58
Message-ID: 48E16712.8020301@dunslane.net

Stefan Kaltenbrunner wrote:
> Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> Tom Lane wrote:
>>>> Um, FKs could conflict with each other too, so that by itself isn't
>>>> gonna fix anything.
>>
>>> Good point. Looks like we'll need to make a list of "can't run in
>>> parallel with" items as well as strict dependencies.
>>
>> Yeah, I was just thinking about that. The current archive format
>> doesn't really carry enough information for this. I think there
>> are two basic solutions we could adopt:
>>
>> * Extend the archive format to provide some indication that "restoring
>> this object requires exclusive access to these dependencies".
>>
>> * Hardwire knowledge into pg_restore that certain types of objects
>> require exclusive access to their dependencies.
>>
>> The former seems more flexible, as well as more in tune with the basic
>> design assumption that pg_restore shouldn't have a lot of knowledge
>> about individual archive object types. But it would mean that you
>> couldn't use parallel restore with any pre-8.4 dumps. In the long run
>> that's no big deal, but in the short run it's annoying.
>
> hmm not sure how much of a problem that really is - we usually
> recommend using the pg_dump version of the target database anyway.
>
>
>
>

We don't really need a huge amount of hardwiring as it turns out. Here
is a version of the patch that tries to do what's needed in this area.

cheers

andrew

Attachment Content-Type Size
parallel_restore_4.patch text/x-patch 38.9 KB

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-30 03:26:00
Message-ID: 48E19C48.7010402@rhyme.com.au

>
> + if (strcmp(te->desc, "CONSTRAINT") == 0 ||
> +     strcmp(te->desc, "FK CONSTRAINT") == 0 ||
> +     strcmp(te->desc, "CHECK CONSTRAINT") == 0 ||
> +     strcmp(te->desc, "TRIGGER") == 0 ||
> +     strcmp(slots[i].te->desc, "CONSTRAINT") == 0 ||
> +     strcmp(slots[i].te->desc, "FK CONSTRAINT") == 0 ||
> +     strcmp(slots[i].te->desc, "CHECK CONSTRAINT") == 0 ||
> +     strcmp(slots[i].te->desc, "TRIGGER") == 0)
>
Really just an observation from the peanut gallery here, but every time
pg_restore hard-codes this kind of thing, it introduces yet another
possible side-effect bug when someone, e.g., adds a new TOC type.

Would it substantially decrease the benefits of the patch to skip *any*
TOC entry that shares dependencies with another? (rather than just those
listed above).
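
Concretely, the conservative rule I have in mind would be something like
this (a hypothetical helper, not code from the patch; dependencies and
nDeps are the fields pg_restore's TocEntry already carries):

static bool
shares_dependency(TocEntry *a, TocEntry *b)
{
    int i, j;

    /* refuse to schedule a and b together if they reference any
     * common dependency ID, regardless of entry type */
    for (i = 0; i < a->nDeps; i++)
        for (j = 0; j < b->nDeps; j++)
            if (a->dependencies[i] == b->dependencies[j])
                return true;
    return false;
}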


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-30 03:59:23
Message-ID: 48E1A41B.4070207@dunslane.net

Philip Warner wrote:
>> + if (strcmp(te->desc, "CONSTRAINT") == 0 ||
>> +     strcmp(te->desc, "FK CONSTRAINT") == 0 ||
>> +     strcmp(te->desc, "CHECK CONSTRAINT") == 0 ||
>> +     strcmp(te->desc, "TRIGGER") == 0 ||
>> +     strcmp(slots[i].te->desc, "CONSTRAINT") == 0 ||
>> +     strcmp(slots[i].te->desc, "FK CONSTRAINT") == 0 ||
>> +     strcmp(slots[i].te->desc, "CHECK CONSTRAINT") == 0 ||
>> +     strcmp(slots[i].te->desc, "TRIGGER") == 0)
>>
>>
> Really just an observation from the peanut gallery here, but every time
> pg_restore hard-codes this kind of thing, it introduces yet another
> possible side-effect bug when someone, eg, adds a new TOC type.
>
> Would it substantially decrease the benefits of the patch to skip *any*
> TOC entry that shares dependencies with another? (rather than just those
> listed above).
>
>
>

Unfortunately, it quite possibly would. You would not be able to build
two indexes on the same table in parallel, even though they wouldn't
have conflicting locks.

cheers

andrew


From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-30 04:13:06
Message-ID: 48E1A752.2060807@rhyme.com.au

Andrew Dunstan wrote:
> Unfortunately, it quite possibly would. You would not be able to build
> two indexes on the same table in parallel, even though they wouldn't
> have conflicting locks.
I suppose so, but:

1. By the same logic it might speed things up; it might build two
completely separate indexes and thereby avoid (some kind of) contention.
In any case, it would most likely do *something* else. It should only
reduce performance if (a) it can do nothing or (b) there is a benefit in
building multiple indexes on the same table at the same time.

2. Perhaps if there are a limited number of items that share
dependencies but which are known to be OK (i.e. indexes), maybe list them
in the inner loop as exceptions and allow them to run in parallel. This
would mean a failure to list a new TOC item type would result in worse
performance rather than a crash.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-30 12:55:47
Message-ID: 48E221D3.2040701@dunslane.net

Philip Warner wrote:
> Andrew Dunstan wrote:
>
>> Unfortunately, it quite possibly would. You would not be able to build
>> two indexes on the same table in parallel, even though they wouldn't
>> have conflicting locks.
>>
> I suppose so, but:
>
> 1. By the same logic it might speed things up; it might build two
> completely separate indexes and thereby avoid (some kind of) contention.
> In any case, it would most likely do *something* else. It should only
> reduce performance if (a) it can do nothing or (b) there is a benefit in
> building multiple indexes on the same table at the same time.
>
> 2. Perhaps if there are a limited number of items that share
> dependencies but which are known to be OK (i.e. indexes), maybe list them
> in the inner loop as exceptions and allow them to run in parallel. This
> would mean a failure to list a new TOC item type would result in worse
> performance rather than a crash.
>
>
>

I will look at it in due course. Right now my concern is simply to get
something that works that we can do some testing with. I think that's
what we have now (fingers crossed). Some parts of it are jury-rigged.

BTW, though, building indexes for the same table together is likely to
be a win AIUI, especially given the recent work on synchronised scans.

cheers

andrew


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-10-03 08:51:27
Message-ID: 48E5DD0F.5070700@kaltenbrunner.cc

Andrew Dunstan wrote:
>
>
> Stefan Kaltenbrunner wrote:
>> Tom Lane wrote:
>>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>>> Tom Lane wrote:
>>>>> Um, FKs could conflict with each other too, so that by itself isn't
>>>>> gonna fix anything.
>>>
>>>> Good point. Looks like we'll need to make a list of "can't run in
>>>> parallel with" items as well as strict dependencies.
>>>
>>> Yeah, I was just thinking about that. The current archive format
>>> doesn't really carry enough information for this. I think there
>>> are two basic solutions we could adopt:
>>>
>>> * Extend the archive format to provide some indication that "restoring
>>> this object requires exclusive access to these dependencies".
>>>
>>> * Hardwire knowledge into pg_restore that certain types of objects
>>> require exclusive access to their dependencies.
>>>
>>> The former seems more flexible, as well as more in tune with the basic
>>> design assumption that pg_restore shouldn't have a lot of knowledge
>>> about individual archive object types. But it would mean that you
>>> couldn't use parallel restore with any pre-8.4 dumps. In the long run
>>> that's no big deal, but in the short run it's annoying.
>>
>> hmm not sure how much of a problem that really is - we usually
>> recommend to use the pg_dump version of the target database anyway.
>>
>>
>>
>>
>
> We don't really need a huge amount of hardwiring as it turns out. Here
> is a version of the patch that tries to do what's needed in this area.

this one is much better - however I still seem to be able to create
deadlock scenarios with strange FK relations - i.e. FKs going in both
directions between two tables.

for those interested these are the timings on my 8 core testbox for my
test database:

single process restore: 169min
-m2: 101min
-m6: 64min
-m8: 63min
-m16: 56min

Stefan