Re: Is analyze_new_cluster.sh still useful?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christoph Berg <christoph(dot)berg(at)credativ(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is analyze_new_cluster.sh still useful?
Date: 2014-06-18 19:51:52
Message-ID: CAMkU=1wOkAzO+s_68iU1DN+3ZkX3=SFrjzaEMZEwuKETUw9sGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 18, 2014 at 10:58 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-06-18 13:54:16 -0400, Tom Lane wrote:

>> I think we're not on the same page. My point is that someone might want
>> to automate the whole sequence: stop old postmaster, run pg_upgrade, start
>> the updated postmaster normally (hence it *is* open for business), kick
>> off the analyze runs. If you're concerned about minimal downtime you
>> would not want to be waiting around for the admin to issue a perfectly
>> predictable series of commands.
>
> Oh, yea. Definitely. I think that's what I've seen happen in pretty much
> *all* usages of pg_upgrade.

I think it is a popular way to do it not because it is a particularly
good way, but because the better alternatives are not readily
available.

If your database needs statistics badly enough that you want to do a
coarse pre-pass with default_statistics_target=1, why would you want
that pass to be done on an open database? Surely you don't want 100
open connections all doing giant seq scans (that should be single-row
look up, but without stats they are not) competing with the analyze.

Having a database which is "open" to queries but they have such
deranged execution plans that they never actually finish is not truly
open, and the attempts to service those futile queries just delays the
true opening even further.

If you really need a multi pass ANALYZE, you probably need the first
pass to be before the database opens because otherwise the open will
be a disaster, and the 2nd pass to be after the database opens but
before your bulk queries (mass deletes, EOM reports, etc.) kick in.
Having both passes be on the same "side" of the opening seems unlikely
to do much good for most use cases. Fortunately it probably doesn't
do much harm to most people, either, simple because most databases are
not terribly sensitive to the issue.

Cheers,

Jeff

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-06-18 19:52:49 Re: [COMMITTERS] pgsql: Reduce the number of semaphores used under --disable-spinlocks.
Previous Message Robert Haas 2014-06-18 19:50:14 Re: pg_control is missing a field for LOBLKSIZE