Re: In progress INSERT wrecks plans on table

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: In progress INSERT wrecks plans on table
Date: 2013-05-02 21:06:03
Message-ID: 5182D53B.4050708@archidevsys.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 03/05/13 00:27, Simon Riggs wrote:
> On 2 May 2013 01:49, Mark Kirkwood<mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
>> On 02/05/13 02:06, Tom Lane wrote:
>>> Mark Kirkwood<mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> writes:
>>>> I am concerned that the deafening lack of any replies to my original
>>>> message is a result of folk glancing at your original quick reply and
>>>> thinking... incomplete problem spec...ignore... when that is not that
>>>> case - yes I should have muttered "9.2" in the original email, but we
>>>> have covered that now.
>>> No, I think it's more that we're trying to get to beta, and so anything
>>> that looks like new development is getting shuffled to folks' "to
>>> look at later" queues. The proposed patch is IMO a complete nonstarter
>>> anyway; but I'm not sure what a less bogus solution would look like.
>>>
>> Yeah, I did think that beta might be consuming everyone's attention (of
>> course immediately *after* sending the email)!
>>
>> And yes, the patch was merely to illustrate the problem rather than any
>> serious attempt at a solution.
> I think we need a problem statement before we attempt a solution,
> which is what Tom is alluding to.
>
> ISTM that you've got a case where the plan is very sensitive to a
> table load. Which is a pretty common situation and one that can be
> solved in various ways. I don't see much that Postgres can do because
> it can't know ahead of time you're about to load rows. We could
> imagine an optimizer that set thresholds on plans that caused the
> whole plan to be recalculated half way thru a run, but that would be a
> lot of work to design and implement and even harder to test. Having
> static plans at least allows us to discuss what it does after the fact
> with some ease.
>
> The plan is set using stats that are set when there are very few
> non-NULL rows, and those increase massively on load. The way to cope
> is to run the ANALYZE immediately after the load and then don't allow
> auto-ANALYZE to reset them later.
>
> --
> Simon Riggshttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Would be practicable to have a facility for telling Postgres in advance
what you intend to do, so it can create plans accordingly?

I won't try and invent syntax, but it would be good to tell the system
that you intend to:
insert a million rows sequentially
or
insert ten million rows randomly
or
update two million rows in the primary key range 'AA00000' to 'PP88877'
etc.

Though, sometime it may be useful to give a little more
detail,especially if you have a good estimate of the distribution of
primary keys, e.g.:
AA000000
20%
AA000456
0%
KE700999
30%
NN400005
35%
PA000001
15%
PP808877

I figure that if the planner had more information about what one intends
to do, then it could combine that with the statistics it knows, to come
up with a more realistic plan.

Cheers,
Gavin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2013-05-02 21:09:21 Re: 9.3 Beta1 status report
Previous Message Bruce Momjian 2013-05-02 19:41:59 Re: 9.3 Beta1 status report

Browse pgsql-performance by date

  From Date Subject
Next Message mark.kirkwood 2013-05-02 22:19:18 Re: In progress INSERT wrecks plans on table
Previous Message Claudio Freire 2013-05-02 15:00:28 Re: "WHERE 1 = 2 OR ..." makes planner choose a very inefficient plan