Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-08-24 06:02:35
Message-ID: CAA4eK1+Pu-09syF=re74=jFwSTKkeRWZRne1_9jJrwq+ZasQ3g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 19, 2014 at 4:27 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> Few more comments:
>
Some more comments:

1. I see one shortcoming in the way the patch currently parallelizes the
work for --analyze-in-stages. For each stage, the patch processes multiple
tables over concurrent connections. That seems okay when the number of
parallel connections is less than or equal to the number of tables, but
when the user asks for more connections than there are tables, I think
this strategy will not be able to use the extra connections.

2. Similarly, in the case of multiple databases, it currently cannot use
more connections than the number of tables in each database, because the
parallelizing strategy is to use the concurrent connections only for
tables inside a single database.

I am not completely sure whether the current strategy is good enough or
whether we should try to address the above problems. What do you think?
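
To make points 1 and 2 concrete, here is a minimal sketch of the limit
being described. The helper name usable_connections is hypothetical and
does not appear in the patch; it only assumes, as described above, that
each connection works on one table at a time within a single database.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch: the number of connections
 * that can actually be kept busy when each connection handles one table
 * at a time inside a single database.
 */
int
usable_connections(int requested, int ntables)
{
    assert(requested > 0 && ntables >= 0);

    /* Connections beyond the number of tables have nothing to do. */
    return (requested < ntables) ? requested : ntables;
}
```

For example, usable_connections(8, 3) returns 3: with eight requested
connections and a database of three tables, five connections stay idle
under this strategy.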

3.
+ do
+ {
+ i = select_loop(maxFd, &slotset);
+ Assert(i != 0);

Could you explain the reason for using this loop? I think you want to
wait for data on a socket descriptor, but why for maxFd? It would also
be better to explain this logic in comments.

4.
+ for (i = 0; i < max_slot; i++)
+ {
+ if (!FD_ISSET(pSlot[i].sock, &slotset))
+ continue;
+
+ PQconsumeInput(pSlot[i].connection);
+ if (PQisBusy(pSlot[i].connection))
+ continue;

I think it is better to call PQconsumeInput() only if you find that the
connection is busy.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
