Re: Why we are going to have to go DirectIO

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Jonathan Corbet <corbet(at)lwn(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Why we are going to have to go DirectIO
Date: 2013-12-04 17:45:22
Message-ID: 529F6A32.50901@kaltenbrunner.cc
Lists: pgsql-hackers

On 12/04/2013 04:33 PM, Jonathan Corbet wrote:
> On Tue, 03 Dec 2013 10:44:15 -0800
> Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> It seems clear that Kernel.org, since 2.6, has been in the business of
>> pushing major, hackish, changes to the IO stack without testing them or
>> even thinking too hard about what the side-effects might be. This is
>> perhaps unsurprising given that two of the largest sponsors of the
>> Kernel -- who, incidentally, do 100% of the performance testing -- don't
>> use the IO stack.
>>
>> This says to me that Linux will clearly be an undependable platform in
>> the future with the potential to destroy PostgreSQL performance without
>> warning, leaving us scrambling for workarounds. Too bad the
>> alternatives are so unpopular.
>
> Wow, Josh, I'm surprised to hear this from you.
>
> The active/inactive list mechanism works great for the vast majority of
> users. The second-use algorithm prevents a lot of pathological behavior,
> like wiping out your entire cache by copying a big file or running a
> backup. We *need* that kind of logic in the kernel.
>
> Now, back in 2012, Johannes (working for one of those big contributors)
> hit upon an issue where second-use falls down. So he set out to fix it:
>
> https://lwn.net/Articles/495543/
>
> This code has been a bit slow getting into the mainline for a few reasons,
> but one of the chief ones is this: nobody is saying from the sidelines
> that they need it! If somebody were saying "Postgres would work a lot
> better with this code in place" and had some numbers to demonstrate that,
> we'd be far more likely to see it get into an upcoming release.
>
> In the end, Linux is quite responsive to the people who participate in its
> development, even as testers and bug reporters. It responds rather less
> well to people who find problems in enterprise kernels years later,
> granted.
>
> The amount of automated testing, including performance testing, has
> increased markedly in the last couple of years. I bet that it would not
> be hard at all to get somebody like Fengguang Wu to add some
> Postgres-oriented I/O tests to his automatic suite:
>
> https://lwn.net/Articles/571991/
>
> Then we would all have a much better idea of how kernel releases are
> affecting one of our most important applications; developers would pay
> attention to that information.

Hmm, interesting tool. I can see how that would be very useful for
"early warning" style detection on the kernel development side, using a
small set of PostgreSQL "benchmarks". That would basically help with
part of what Josh complained about, namely that it takes ages for
regressions to be detected.
From PostgreSQL's point of view we would also need additional long-term
and more complex testing, spanning different PostgreSQL versions on
various distribution platforms (because that is what people deploy in
production; hand-built, git-fetched kernels are rare), using tests that
might have extended runtimes and/or require external infrastructure.

>
> Or you could go off and do your own thing, but I believe that would leave
> us all poorer.

fully agreed

Stefan
