Re: checkpointer continuous flushing

From: Andres Freund <andres(at)anarazel(dot)de>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: checkpointer continuous flushing
Date: 2015-06-02 15:16:59
Message-ID: 20150602151659.GP30287@alap3.anarazel.de
Lists: pgsql-hackers

On 2015-06-02 17:01:50 +0200, Fabien COELHO wrote:
> >The actual problem is sorting & fsyncing in a way that deals efficiently
> >with tablespaces, i.e. doesn't write to tablespaces one-by-one.
> >Not impossible, but it requires some thought.
>
> Hmmm... I would have neglected this point in a first approximation,
> but I agree that not interleaving tablespaces could indeed lose some
> performance.

I think it'll be a hard-to-diagnose performance regression, so we'll
have to fix it. That argument actually was the blocker in previous
attempts...
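
To illustrate the interleaving point, here is a minimal sketch, not the
patch's code. It assumes dirty buffers are hypothetical
(tablespace, relfile, block) tuples, sorts them within each tablespace so
each file is written sequentially, and then takes one buffer from each
tablespace in turn, so that no tablespace sits idle while another is
flushed. The function name and tuple layout are made up for this example.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave_by_tablespace(dirty_buffers):
    """Order writes so tablespaces are serviced round-robin.

    dirty_buffers: iterable of (tablespace, relfile, block) tuples
    (hypothetical layout, chosen only for this sketch).
    """
    # Group buffers by tablespace.
    per_tablespace = defaultdict(list)
    for buf in dirty_buffers:
        per_tablespace[buf[0]].append(buf)

    # Sort inside each tablespace so writes to a file are sequential.
    queues = [sorted(bufs) for _, bufs in sorted(per_tablespace.items())]

    # Take one buffer from each tablespace per round; shorter queues
    # simply drop out once they are exhausted.
    missing = object()
    order = []
    for one_round in zip_longest(*queues, fillvalue=missing):
        order.extend(buf for buf in one_round if buf is not missing)
    return order
```

Plain round-robin is the simplest choice; it does not balance
tablespaces with very different numbers of dirty buffers, which a real
implementation would presumably want to handle.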

> >IMO this feature, if done correctly, should result in better performance
> >in 95+% of the workloads
>
> To demonstrate that would require time...

Well, that's part of the contribution process. Obviously you can't test
100% of the problems, but you can work hard at coming up with very
adversarial scenarios and evaluate performance for those.

> >and be enabled by default.
>
> I did not have such an ambition with the submitted patch :-)

I don't think we want yet another tuning knob that's hard to tune
because it's critical for one factor (latency) but bad for another
(throughput), especially when that trade-off is completely unnecessary.

> >And that'll not be possible without actually writing mostly sequentially.
>
> >It's also not just the sequential writes making this important, it's also
> >that it allows doing the final fsync() of the individual segments as soon
> >as their last buffer has been written out.
>
> Hmmm... I'm not sure this would have a large impact. The writes are
> throttled as much as possible, so fsync will catch plenty of other writes
> anyway, if there are some.

That might be the case in a database with a single small table;
i.e. where all the writes go to a single file. But as soon as you have
large tables (i.e. many segments) or multiple tables, a significant part
of the writes issued independently from checkpointing will be outside
the processing of the individual segment.
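
As a rough sketch of the "fsync each segment as soon as its last buffer
is written" idea (an illustration under stated assumptions, not the
actual checkpointer logic): dirty buffers are hypothetical
(relfile, block) pairs, segments hold 131072 blocks (1 GB segments with
8 kB blocks, the PostgreSQL defaults), and the function only produces a
plan of write and fsync steps rather than performing I/O.

```python
from collections import Counter

# 1 GB segment / 8 kB block, the PostgreSQL defaults.
BLOCKS_PER_SEGMENT = 131072


def segment_of(buf):
    """Map a (relfile, block) pair to its (relfile, segment number)."""
    relfile, block = buf
    return (relfile, block // BLOCKS_PER_SEGMENT)


def plan_checkpoint(dirty_buffers):
    """Return write/fsync steps, fsyncing a segment right after its
    last dirty buffer has been written.

    dirty_buffers: iterable of (relfile, block) pairs (hypothetical
    layout, chosen only for this sketch).
    """
    ordered = sorted(dirty_buffers)  # sequential order within each file
    remaining = Counter(segment_of(buf) for buf in ordered)

    plan = []
    for buf in ordered:
        plan.append(("write", buf))
        seg = segment_of(buf)
        remaining[seg] -= 1
        if remaining[seg] == 0:
            # No more checkpoint writes are pending for this segment,
            # so its fsync does not have to wait for the end of the
            # checkpoint.
            plan.append(("fsync", seg))
    return plan
```

With unsorted writes, by contrast, a segment's last buffer may only be
written near the end of the checkpoint, so its fsync cannot be issued
early.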

Greetings,

Andres Freund
