Re: Parallell Optimizer

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: simon(at)2ndQuadrant(dot)com, michael(dot)paquier(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, fred(at)nti(dot)ufop(dot)br, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parallell Optimizer
Date: 2013-06-11 20:04:06
Message-ID: 51B782B6.7070703@2ndQuadrant.com
Lists: pgsql-hackers

On 06/11/2013 04:53 PM, Tatsuo Ishii wrote:
>> On 11 June 2013 01:45, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>>>> On Sat, Jun 8, 2013 at 5:04 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>>
>>>>> On 7 June 2013 20:23, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>>
>>>>>> As for other databases, I suspect that ones that have parallel execution
>>>>>> are probably doing it with a thread model not a process model.
>>>>> Separate processes are more common because it covers the general case
>>>>> where query execution is spread across multiple nodes. Threads don't
>>>>> work across nodes and parallel queries predate (working) threading
>>>>> models.
>>>>>
>>>> Indeed. Parallelism based on processes would be more convenient for
>>>> master-master
>>>> type of applications. Even if no master-master feature is implemented
>>>> directly in core,
>>>> at least a parallelism infrastructure based on processes could be used for
>>>> this purpose.
>>> As long as "true" synchronous replication is not implemented in core,
>>> I am not sure there's a value for parallel execution spreading across
>>> multiple nodes because of the delay of data update propagation.
>> Please explain what you mean by the word "true" used here.
> In other words, "eager replication".
Do you mean something along these lines:

"Most synchronous or eager replication solutions do conflict prevention,
while asynchronous solutions have to do conflict resolution. For instance,
if a record is changed on two nodes simultaneously, an eager replication
system would detect the conflict before confirming the commit and abort
one of the transactions. A lazy replication system would allow both
transactions to commit and run a conflict resolution during
resynchronization."

?
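
To make the distinction in that quoted passage concrete, here is a minimal
toy sketch in Python. It is not how any real replication system works: the
Node class, the function names and the "highest value wins" resolver are all
assumptions made up for illustration.

```python
class Node:
    """Hypothetical replica holding a plain key -> value store."""
    def __init__(self, name):
        self.name = name
        self.data = {}


def eager_commit(nodes, writes):
    """Conflict prevention: detect the conflict before confirming the commit.

    writes is a list of (node, key, value) attempted "simultaneously".
    The first writer of a key wins; later writers of that key are aborted.
    """
    claimed = set()
    results = []
    for node, key, value in writes:
        if key in claimed:
            results.append((node.name, "aborted"))
            continue
        claimed.add(key)
        for n in nodes:          # the change reaches every node before commit
            n.data[key] = value
        results.append((node.name, "committed"))
    return results


def lazy_commit(writes):
    """Lazy replication: every transaction commits locally, no checks."""
    results = []
    for node, key, value in writes:
        node.data[key] = value
        results.append((node.name, "committed"))
    return results


def resynchronize(nodes, resolve):
    """Conflict resolution after the fact; resolve picks one of the values."""
    keys = set().union(*(n.data.keys() for n in nodes))
    for key in keys:
        candidates = [n.data[key] for n in nodes if key in n.data]
        winner = resolve(candidates)
        for n in nodes:
            n.data[key] = winner
```

In the eager case one of two conflicting transactions never commits; in the
lazy case both commit and the nodes disagree until resynchronize() runs with
some resolution rule (here any function such as max, purely as a stand-in).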

IMO it is possible to do this "easily" once BDR has reached the state
where you can do streaming apply. That is, you replay actions on other
hosts as they are logged, not after the transaction commits. Doing it
this way, you can wait for any action to successfully complete a full
circle before committing it at the source.
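
As a rough illustration of the "full circle" idea, the sketch below streams
each action to the peers as it is logged and only commits at the source if
every peer managed to apply every action. This is a toy model in Python, not
BDR code: Peer, run_transaction and the reject set that simulates a conflict
are hypothetical names invented for this example.

```python
class Peer:
    """Hypothetical remote host that applies streamed actions."""
    def __init__(self, name, reject=()):
        self.name = name
        self.reject = set(reject)   # keys this peer cannot apply (simulated conflict)
        self.staged = []            # applied but not yet committed
        self.committed = []

    def apply(self, action):
        key, _value = action
        if key in self.reject:
            return False            # peer reports failure for this action
        self.staged.append(action)
        return True

    def finish(self, commit):
        # Make staged actions permanent on commit, discard them on abort.
        if commit:
            self.committed.extend(self.staged)
        self.staged.clear()


def run_transaction(actions, peers):
    """Stream actions as they are logged; commit only after all peers succeed."""
    ok = True
    for action in actions:
        # Wait for this action to complete on every peer before moving on.
        if not all(peer.apply(action) for peer in peers):
            ok = False
            break
    for peer in peers:
        peer.finish(commit=ok)
    return "committed" if ok else "aborted"
```

The point of the sketch is only the ordering: the conflict shows up while the
transaction is still open, so the source can abort instead of resolving the
conflict later.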

Currently the main missing part for doing this is autonomous transactions.
It can in theory be done by opening an extra backend for each incoming
transaction, but you will need a really big number of backends, and you
also have extra overhead from interprocess communication.

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ
