Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-09-26 13:36:46
Message-ID: 20140926133646.GC5311@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Amit Kapila wrote:

> Today while again thinking about the strategy used in patch to
> parallelize the operation (vacuum database), I think we can
> improve the same for cases when number of connections are
> lesser than number of tables in database (which I presume
> will normally be the case). Currently we are sending command
> to vacuum one table per connection, how about sending multiple
> commands (example Vacuum t1; Vacuum t2) on one connection.
> It seems to me there is extra roundtrip for cases when there
> are many small tables in database and few large tables. Do
> you think we should optimize for any such cases?

I don't think this is a good idea; at least not in a first cut of this
patch. It's easy to imagine that a table you initially think is small
enough turns out to have grown much larger since last analyze. In that
case, putting one worker to process that one together with some other
table could end up being bad for parallelism, if later it turns out that
some other worker has no table to process. (Table t2 in your example
could grow between the time the command is sent and t1 is vacuumed.)

It's simpler to have workers do one thing at a time only.
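
To put rough numbers on that, here is a minimal sketch (not code from the
patch) comparing the two dispatch strategies. The function names, the cost
figures, and the assumption that a worker must finish its whole batch before
taking more work are all hypothetical; actual vacuum times obviously depend
on much more than one number per table.

```python
import heapq

def makespan_one_at_a_time(costs, n_workers):
    """Total elapsed time when each idle worker takes the next single table."""
    # Min-heap of the times at which each worker becomes idle.
    free_at = [0] * n_workers
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)        # earliest idle worker
        heapq.heappush(free_at, start + cost)
    return max(free_at)

def makespan_batched(costs, n_workers, batch_size):
    """Total elapsed time when tables are sent in fixed-size batches.

    Assumption: a worker cannot take other work until its batch is done,
    so each batch behaves like one job whose cost is the sum of its tables.
    """
    batches = [sum(costs[i:i + batch_size])
               for i in range(0, len(costs), batch_size)]
    return makespan_one_at_a_time(batches, n_workers)

# Hypothetical case: the first two tables were believed small but have
# grown to cost 10 units each; the other two really are small.
costs = [10, 10, 1, 1]
print(makespan_one_at_a_time(costs, n_workers=2))           # 11
print(makespan_batched(costs, n_workers=2, batch_size=2))   # 20
```

With one table per command, the two large tables land on different workers
and the small ones fill in afterwards. With batches of two, both large tables
end up on the same connection while the other worker sits idle.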

I don't think it's a very good idea to call pg_relation_size() on every
table in the database from vacuumdb.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
