Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-10-16 20:01:19
Message-ID: CA+U5nMJKQWQ2V-mybTF9JmRDn8wBom+TPeCeW0VmGp4sAM4tzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 16 October 2014 15:09, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

>> Just send a message to autovacuum to request an immediate action. Let
>> it manage the children and the tasks.
>>
>> SELECT pg_autovacuum_immediate(nworkers = N, list_of_tables);
>>
>> Request would allocate an additional N workers and immediately begin
>> vacuuming the stated tables.
>
> I think doing anything on the server side can have higher complexity like:
> a. Does this function return immediately after sending request to
> autovacuum, if yes then the behaviour of this new functionality
> will be different as compare to vacuumdb which user might not
> expect?
> b. We need to have some way to handle errors that can occur in
> autovacuum (may be need to find a way to pass back to user),
> vacuumdb or Vacuum can report error to user.
> c. How does nworkers input relates to autovacuum_max_workers
> which is needed at start for shared memory initialization and in calc
> of maxbackends.
> d. How to handle database wide vacuum which is possible via vacuumdb
> e. What is the best UI (a command like above or via config parameters)?

c) seems like the only issue that needs any thought. I don't think its
going to be that hard.

I don't see any problems with the other points. You can make a
function wait, if you wish.

> I think we can find a way for above and may be if any other similar things
> needs to be taken care, but still I think it is better that we have this
> feature
> via vacuumdb rather than adding complexity in server code. Also the
> current algorithm used in patch is discussed and agreed upon in this
> thread and if now we want to go via some other method (auto vacuum),
> it might take much more time to build consensus on all the points.

Well, I read Alvaro's point from earlier in the thread and agreed with
it. All we really need is an instruction to autovacuum to say "be
aggressive".

Just because somebody added something to the TODO list doesn't make it
a good idea. I apologise to Dilip for saying this, it is not anything
against him, just the idea.

Perhaps we just accept the patch and change AV in the future.

>> vacuumdb can still issue the request, but the guts of this are done by
>> the server, not a heavily modified client.
>>
>> If we are going to heavily modify a client then it needs to be able to
>> run more than just one thing. Parallel psql would be nice. pg_batch?
>
> Could you be more specific in this point, I am not able to see how
> vacuumdb utility has anything to do with parallel psql?

That's my point. All this code in vacuumdb just for this one isolated
use case? Twice the maintenance burden.

A more generic utility to run commands in parallel would be useful.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2014-10-16 20:12:11 Re: strip nulls functions for json and jsonb
Previous Message Jim Nasby 2014-10-16 19:59:24 Superuser connect during smart shutdown