Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-12-24 10:30:47
Message-ID: 4205E661176A124FAF891E0A6BA9135266398365@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 19 December 2014 16:41, Amit Kapila wrote:

>One idea is to send all the stages and corresponding Analyze commands
>to server in one go which means something like
>"BEGIN; SET default_statistics_target=1; SET vacuum_cost_delay=0;
> Analyze t1; COMMIT;"
>"BEGIN; SET default_statistics_target=10; RESET vacuum_cost_delay;
> Analyze t1; COMMIT;"
>"BEGIN; RESET default_statistics_target;
> Analyze t1; COMMIT;"

Case 1: complete database:

In the base code, all tables are processed in stage 1 first, then in stage 2, and so on, so that at any point in time every table has been analyzed up to at least a certain stage.

But if we instead process all the stages for one table before starting stage 1 of the next table, it can happen that some tables have all their stages processed while others are still waiting for even their first stage; that breaks the intent of analyze-in-stages.
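
To make the two orderings concrete, an untested sketch (run_analyze() is
a hypothetical helper that issues one stage's commands for one table):

    int     stage, i;

    /* Stage-major (base code, desired): every table reaches stage N
     * before any table starts stage N+1. */
    for (stage = 0; stage < 3; stage++)
        for (i = 0; i < ntables; i++)
            run_analyze(stage, tables[i]);

    /* Table-major (the problem described above): the first table runs
     * through all stages while the others wait for stage 1. */
    for (i = 0; i < ntables; i++)
        for (stage = 0; stage < 3; stage++)
            run_analyze(stage, tables[i]);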

Case 2: independent tables given explicitly, like -t "t1" -t "t2":

In the base code we currently process all the stages for the first table, then do the same for the next table, and so on.

I think that if the user gives multiple tables together, the intent is probably to analyze those tables together stage by stage; but in our code we run t1 through all stages before even considering the next table.

So for explicitly listed tables the order should likewise be:

Stage 1:
T1
T2
...
Stage 2:
T1
T2
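
That is, for -t "t1" -t "t2" the dispatcher would walk a schedule like
the following (untested sketch; the per-stage settings are copied from
your example above):

    /* stage-major schedule for two explicitly listed tables */
    const char *schedule[] = {
        /* stage 1: both tables first */
        "BEGIN; SET default_statistics_target=1;"
        " SET vacuum_cost_delay=0; ANALYZE t1; COMMIT;",
        "BEGIN; SET default_statistics_target=1;"
        " SET vacuum_cost_delay=0; ANALYZE t2; COMMIT;",
        /* stage 2 */
        "BEGIN; SET default_statistics_target=10;"
        " RESET vacuum_cost_delay; ANALYZE t1; COMMIT;",
        "BEGIN; SET default_statistics_target=10;"
        " RESET vacuum_cost_delay; ANALYZE t2; COMMIT;",
        /* stage 3 */
        "BEGIN; RESET default_statistics_target; ANALYZE t1; COMMIT;",
        "BEGIN; RESET default_statistics_target; ANALYZE t2; COMMIT;",
    };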

Thoughts?

>Now, still parallel operations in other backends could lead to
>page misses, but I think the impact will be minimized.

Regards,
Dilip
