Re: Parallel Sort

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parallel Sort
Date: 2013-05-13 14:57:39
Message-ID: 3859.1368457059@sss.pgh.pa.us
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> Each worker needs to make SnapshotNow visibility decisions coherent with the
> master. For sorting, this allows us to look up comparison functions, even
> when the current transaction created or modified those functions. This will
> also be an essential building block for any parallelism project that consults
> user tables. Implementing this means copying the subtransaction stack and the
> combocid hash to each worker.

> [ ... and GUC settings, and who knows what else ... ]

This approach seems to me to be likely to guarantee that the startup
overhead for any parallel sort is so large that only fantastically
enormous sorts will come out ahead.

I think you need to think in terms of restricting the problem space
enough so that the worker startup cost can be trimmed to something
reasonable. One obvious suggestion is to forbid the workers from
doing any database access of their own at all --- the parent would
have to do any required catalog lookups for sort functions etc.
before forking the children.

I think we should also seriously think about relying on fork() and
copy-on-write semantics to launch worker subprocesses, instead of
explicitly copying so much state over to them. Yes, this would
foreclose ever having parallel query on Windows, but that's okay
with me (hm, now where did I put my asbestos longjohns ...)

Both of these lines of thought suggest that the workers should *not*
be full-fledged backends.
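
A minimal sketch of the two suggestions above, not taken from any PostgreSQL code: the parent resolves the comparison function before forking, and each fork()ed worker only sorts memory it inherited copy-on-write, with no lookups of its own. The names fork_sort and lookup_comparator, the fixed count of two workers, and the use of pipes to return the sorted runs are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*cmp_fn)(const void *, const void *);

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Hypothetical stand-in for the parent's catalog lookup of a sort function. */
static cmp_fn lookup_comparator(void) { return cmp_int; }

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t k = write(fd, p, len);
        if (k <= 0) return -1;
        p += k; len -= (size_t) k;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t k = read(fd, p, len);
        if (k <= 0) return -1;
        p += k; len -= (size_t) k;
    }
    return 0;
}

/* Sort data[0..n) with two forked workers. Returns 0 on success, -1 on failure. */
int fork_sort(int *data, size_t n)
{
    assert(data != NULL);
    if (n < 2) return 0;

    cmp_fn cmp = lookup_comparator();   /* all lookups happen before fork() */
    size_t off[2] = {0, n / 2};
    size_t len[2] = {n / 2, n - n / 2};
    int rfd[2];
    pid_t pid[2];

    for (int w = 0; w < 2; w++) {
        int p[2];
        if (pipe(p) != 0) return -1;
        pid[w] = fork();
        if (pid[w] < 0) return -1;
        if (pid[w] == 0) {
            /* Worker: data and cmp are inherited; sorting touches only the
             * worker's private copy-on-write pages, not the parent's. */
            close(p[0]);
            qsort(data + off[w], len[w], sizeof(int), cmp);
            _exit(write_all(p[1], data + off[w], len[w] * sizeof(int)) == 0 ? 0 : 1);
        }
        close(p[1]);
        rfd[w] = p[0];
    }

    /* Parent: collect each worker's sorted run, then reap the worker. */
    int *runs = malloc(n * sizeof(int));
    int ok = (runs != NULL);
    for (int w = 0; w < 2; w++) {
        int status = 0;
        if (ok && read_all(rfd[w], runs + off[w], len[w] * sizeof(int)) != 0)
            ok = 0;
        close(rfd[w]);
        if (waitpid(pid[w], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    if (ok) {
        /* Merge the two sorted runs back into the caller's array. */
        size_t i = 0, j = off[1], k = 0;
        while (i < off[1] && j < n)
            data[k++] = (cmp(&runs[i], &runs[j]) <= 0) ? runs[i++] : runs[j++];
        while (i < off[1]) data[k++] = runs[i++];
        while (j < n) data[k++] = runs[j++];
    }
    free(runs);
    return ok ? 0 : -1;
}
```

A caller would invoke fork_sort(array, count) and check for a zero return. Shipping the sorted runs back through pipes is only one possible choice here; shared memory would avoid that copy, at the cost of more setup than a sketch like this warrants.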

regards, tom lane
