Re: Weird XFS WAL problem

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Craig James <craig_james(at)emolecules(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Weird XFS WAL problem
Date: 2010-06-04 09:27:04
Message-ID: alpine.DEB.2.00.1006041009540.4083@aragorn.flymine.org
Lists: pgsql-performance

On Thu, 3 Jun 2010, Greg Smith wrote:
> And it's also quite reasonable for a RAID controller to respond to that
> "flush the whole cache" call by flushing its cache.

Remember that the RAID controller is presenting itself to the OS as a
large disc, and hiding the individual discs from the OS. Why should the OS
care what has actually happened to the individual discs' caches, as long
as that "flush the whole cache" command guarantees that the data is
persistent? Taking the RAID array as a whole, that happens when the data
hits the write-back cache.

The only circumstance where you actually need to flush the data to the
individual discs is when you need to take that disc away somewhere else
and read it on another system. That's quite a rare use case for a RAID
array (http://thedailywtf.com/Articles/RAIDing_Disks.aspx
notwithstanding).

> If the controller had some logic that said "it's OK to not flush the
> cache when that call comes in if my battery is working fine", that would
> make this whole problem go away.

The only place this can be properly sorted is the RAID controller.
Anywhere else would be crazy.
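
To make the policy in the quoted paragraph concrete, here is a rough sketch
of that decision as a toy model. Everything in it (the Controller class, the
battery_ok flag, the method names) is made up for illustration and does not
correspond to any real controller firmware or driver interface.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Toy model of a RAID controller with a write-back cache (hypothetical)."""

    battery_ok: bool = True
    cache: list = field(default_factory=list)  # blocks held in the controller cache
    discs: list = field(default_factory=list)  # blocks written out to the discs

    def write(self, block):
        # Writes always land in the write-back cache first.
        self.cache.append(block)

    def flush_cache(self):
        """Handle the OS's "flush the whole cache" request.

        Returns True once the data counts as persistent for the array as a whole.
        """
        if self.battery_ok:
            # Assumption: a battery-backed cache survives power loss, so the
            # data is already persistent and nothing needs to be written out.
            return True
        # Without a healthy battery the cache is volatile, so push it to disc.
        self.discs.extend(self.cache)
        self.cache.clear()
        return True


healthy = Controller(battery_ok=True)
healthy.write("block-1")
healthy.flush_cache()   # returns immediately, block stays in cache

degraded = Controller(battery_ok=False)
degraded.write("block-2")
degraded.flush_cache()  # block is written through to the discs
```

The point of the sketch is only that the check lives inside the controller,
which is the one component that knows the state of its own battery.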

Matthew

--
"To err is human; to really louse things up requires root
privileges." -- Alexander Pope, slightly paraphrased
