Re: Parallell Optimizer

From:	Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To:	Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc:	simon(at)2ndQuadrant(dot)com, michael(dot)paquier(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, fred(at)nti(dot)ufop(dot)br, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Parallell Optimizer
Date:	2013-06-11 20:04:06
Message-ID:	51B782B6.7070703@2ndQuadrant.com
Lists:	pgsql-hackers

On 06/11/2013 04:53 PM, Tatsuo Ishii wrote:
>> On 11 June 2013 01:45, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>>>> On Sat, Jun 8, 2013 at 5:04 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>>
>>>>> On 7 June 2013 20:23, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>>
>>>>>> As for other databases, I suspect that ones that have parallel execution
>>>>>> are probably doing it with a thread model not a process model.
>>>>> Separate processes are more common because it covers the general case
>>>>> where query execution is spread across multiple nodes. Threads don't
>>>>> work across nodes and parallel queries predate (working) threading
>>>>> models.
>>>>>
>>>> Indeed. Parallelism based on processes would be more convenient for
>>>> master-master
>>>> type of applications. Even if no master-master feature is implemented
>>>> directly in core,
>>>> at least a parallelism infrastructure based on processes could be used for
>>>> this purpose.
>>> As long as "true" synchronous replication is not implemented in core,
>>> I am not sure there's a value for parallel execution spreading across
>>> multiple nodes because of the delay of data update propagation.
>> Please explain what you mean by the word "true" used here.
> In another word, "eager replication".
Do you mean something along these lines:

"Most synchronous or eager replication solutions do conflict prevention,
while asynchronous solutions have to do conflict resolution. For instance,
if a record is changed on two nodes simultaneously, an eager replication
system would detect the conflict before confirming the commit and abort
one of the transactions. A lazy replication system would allow both
transactions to commit and run a conflict resolution during
resynchronization. "
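
To make the distinction in the quote concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical (the names `eager_commit`, `lazy_resolve`, and `ConflictError` are made up for illustration, and last-writer-wins is just one assumed resolution policy); it is not the API of any real replication system.

```python
class ConflictError(Exception):
    """Raised when an eager commit is rejected (hypothetical name)."""


def eager_commit(pending, key, txn_id):
    """Conflict prevention: refuse to confirm the commit if another
    transaction already has an unconfirmed change to the same key."""
    holder = pending.get(key)
    if holder is not None and holder != txn_id:
        raise ConflictError(f"{txn_id} aborted: {key} is held by {holder}")
    pending[key] = txn_id


def lazy_resolve(versions):
    """Conflict resolution: both transactions already committed locally,
    so pick a winner during resynchronization.
    Assumed policy: last-writer-wins on (timestamp, node, value) tuples."""
    return max(versions, key=lambda v: v[0])


# Eager: the second writer to the same record is aborted before commit.
pending = {}
eager_commit(pending, "row1", "T1")
try:
    eager_commit(pending, "row1", "T2")
    second_writer_aborted = False
except ConflictError:
    second_writer_aborted = True

# Lazy: both versions exist; the later timestamp wins on resync.
winner = lazy_resolve([(1, "nodeA", "x"), (2, "nodeB", "y")])
```

The eager path pays the coordination cost before the commit is confirmed; the lazy path defers it to resynchronization and has to discard one of two already committed changes.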

IMO it is possible to do this "easily" once BDR has reached the state where
you can do streaming apply. That is, you replay actions on other hosts as
they are logged, not after the transaction commits. Doing it this way, you
can wait for any action to successfully complete a full circle before
committing it at the source.
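
A rough sketch of that idea in Python, purely as an illustration: the `Peer` class and `commit_after_full_circle` function are hypothetical names, "full circle" is modelled simply as every peer acknowledging every action, and nothing here reflects how BDR is actually implemented.

```python
class Peer:
    """Hypothetical remote node that applies streamed actions."""

    def __init__(self, locked_keys=()):
        self.locked = set(locked_keys)  # keys with a conflicting local change
        self.applied = []

    def apply(self, action):
        key, _value = action
        if key in self.locked:
            return False  # conflict detected on this peer
        self.applied.append(action)
        return True


def commit_after_full_circle(actions, peers):
    """Stream each action to all peers as it is logged, and commit at the
    source only if every peer applied every action."""
    for action in actions:
        for peer in peers:
            if not peer.apply(action):
                return "ROLLBACK"
    return "COMMIT"


ok = commit_after_full_circle([("a", 1), ("b", 2)], [Peer(), Peer()])
blocked = commit_after_full_circle([("a", 1)], [Peer(), Peer(locked_keys={"a"})])
```

A real system would also have to undo the partially applied actions on the peers when the source rolls back; the sketch leaves that out.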

Currently the main missing part in doing this is autonomous transactions.
It can in theory be done by opening an extra backend for each incoming
transaction, but you will need a really big number of backends, and you
also have extra overhead from interprocess communication.

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ

In response to

Re: Parallell Optimizer at 2013-06-11 14:53:57 from Tatsuo Ishii

Responses

Re: Parallell Optimizer at 2013-06-11 23:01:48 from Tatsuo Ishii
