Re: performance problem of Failover Datasource?

From: Chen Huajun <chenhj(at)cn(dot)fujitsu(dot)com>
To: Scott Harrington <scotth01(at)sns-usa(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: performance problem of Failover Datasource?
Date: 2012-12-25 13:45:57
Message-ID: 50D9AE15.3000807@cn.fujitsu.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc


> But could you (and perhaps Mikko Tiihonen who originally proposed the "Simple connection failover support") remind the rest of us why we want this complexity inside the pgjdbc driver, rather than in a
> more robust and featureful layer like pgpool-II?

I saw some mails discussed about this subject, but I don't know all.
I think the performance and simple is the reason.Is it right?

> Taking a step back, it seems you have implemented a DNS-like static (JVM-global) helper which performs lazy-caching of information about servers. I would argue PGJDBC itself should only do simple
> single-host connections, but perhaps provide a well-documented HostChooser interface and a JVM-global (static) method such as Driver.setHostChooser(), similar to Driver.setLogLevel(), so that
> applicaitons that need to override the default "DNS lookups" (or "host choosing") may do so.

I think the problem is if some simple HostChoosers are useful.
If yes and they are not so complex for a jdbc drvier,why not providing them.
By the way,if HostChooser implement choosing is needed,a JVM-global property(such as -Dxx.xx.HostChooser=XXXXHostChooser)
or a connection parameter may be more suitable,because they can avoid modifying applicaiton's source while change HostChooser implement.

> Applications that want the load balancing would use something like "host=myvirtualpool" which would would obviously fail unless the you've installed some sort of LoadBalanceHostChooser, which knows
> about all the "realservers" that comprise the "myvirtualpool" and their master/slave/OK/dead status. (Starts to sound more and more like pgpool-II or other projects that already exist for this).

Really I know some products just do as "host=myvirtualpool".
And sometime the form is more convenient and easy to customizing,
the shortcoming may be more complex just as you said.

Now ,Could you explain the the detail of following issues?

> At first glance, there are a couple of issues:
>
> 1. Double-Checked Locking in reportHostStatus, which is bad form

I knows a problem of Double-Checked Locking,while used in singleton pattern as following.

public static Singleton getInstance() {
if (instance == null) {
synchronized (Singleton.class) {
if (instance == null) {
instance = new Singleton();
}
}
}
return instance;
}

because JVM would run "instance = new Singleton(); " as that :

mem = allocate();
instance = mem;
ctorSingleton(instance);

Do you think my code has the same problem or just it looks ugly?

> 2. Synchronized code in a subclass that locks the base class

Synchronized code is for hostStatusCache which shared by all subclass,
so locks the base class.Is there any problems?

> 3. No need for 'volatile' if you're also using 'synchronized'

'synchronized' is only used for write,'volatile' is for read.
It's for performance and is a bit complex.(It may be a excessive design)
I worry about JVM will optimize the following code

HashMap<String,HostStatus> newHostStatusMap = (HashMap<String, HostStatus>) hostStatusCache.clone();
newHostStatusMap.put(hostSpecKey, hostStatus);
hostStatusCache=newHostStatusMap;
as:
hostStatusCache=(HashMap<String, HostStatus>) hostStatusCache.clone();
hostStatusCache.put(hostSpecKey, hostStatus);

do you know if it will happen?

(2012/12/25 4:10), Scott Harrington wrote:
> Hmm, there's some neat stuff in there, slave-only, slavefirst, etc.
>
> But could you (and perhaps Mikko Tiihonen who originally proposed the "Simple connection failover support") remind the rest of us why we want this complexity inside the pgjdbc driver, rather than in a
> more robust and featureful layer like pgpool-II?
>
> At first glance, there are a couple of issues:
>
> 1. Double-Checked Locking in reportHostStatus, which is bad form
>
> 2. Synchronized code in a subclass that locks the base class
>
> 3. No need for 'volatile' if you're also using 'synchronized'
>
> Taking a step back, it seems you have implemented a DNS-like static (JVM-global) helper which performs lazy-caching of information about servers. I would argue PGJDBC itself should only do simple
> single-host connections, but perhaps provide a well-documented HostChooser interface and a JVM-global (static) method such as Driver.setHostChooser(), similar to Driver.setLogLevel(), so that
> applicaitons that need to override the default "DNS lookups" (or "host choosing") may do so.
>
> Applications that want the load balancing would use something like "host=myvirtualpool" which would would obviously fail unless the you've installed some sort of LoadBalanceHostChooser, which knows
> about all the "realservers" that comprise the "myvirtualpool" and their master/slave/OK/dead status. (Starts to sound more and more like pgpool-II or other projects that already exist for this).
>
> Side benefit is your LoadBalanceHostChooser could be designed to do "eager" connection probing on worker threads so that when an application thread needs a PGJDBC connection, you would avoid any of
> the slow connection / dead server issues you were originally trying to solve.
>
>
> On Mon, 24 Dec 2012, Chen Huajun wrote:
>
>> Hi
>>
>> I have made a new patch(with my test). Please give a look.
>> It support the following features
>> 1) performance improve for fail over by avoiding dead hosts.
>> 2) simple load balance by picking up the first host from multiple valid hosts randomly.
>> 3) ability of choosing master or slave to connect to.
>>
>> And in the patch, three connection parameters were added.
>>
>> targetServerType = String
>> Specifies what kind of server to connect.The value should be one of the following:
>> any
>> master
>> slave
>> slavefirst (Try connecting to the slaves first.If failed try the master)
>> The default is 'any'.
>>
>> enableLoadBalance = boolean
>> Enable or disable load balance when multiple hosts were specified;If load balance is enabled,specified multiple hosts will be picked up randomly.
>> The default is false.
>>
>> failedHostCheckPeriod = int
>> Specifies period(seconds) to check whether the failed hosts had been repaired, when load balance is enabled; 0 means never check.
>> The default is 600 seconds.
>>
>>
>> (2012/12/18 8:05), Chen Huajun wrote:
>>>
>>> Thanks for you advise.
>>> I will try to made a new patch and add load balance supporting.
>>>
>>>
>>
>> --
>> Best Regards,
>> Chen Huajun
>>
>>
>
>

--
Best Regards,
Chen Huajun

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Florent Guillaume 2012-12-26 11:17:14 Re: performance problem of Failover Datasource?
Previous Message Scott Harrington 2012-12-24 20:10:54 Re: performance problem of Failover Datasource?