Re: Linux/PostgreSQL scalability issue - problem with 8 cores

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Jakub Ouhrabka <kuba(at)comgate(dot)cz>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Linux/PostgreSQL scalability issue - problem with 8 cores
Date: 2008-01-08 00:54:25
Message-ID: 28297.1199753665@sss.pgh.pa.us
Lists: pgsql-performance

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Perhaps it would make sense to try to take the "fast path" in
> SIDelExpiredDataEntries with only a shared lock rather than exclusive.

I think the real problem here is that sinval catchup processing is well
designed to create contention :-(. Once we've decided that the message
queue is getting too full, we SIGUSR1 all the backends at once (or as
fast as the postmaster can do it anyway), then they all go off and try
to touch the sinval queue. Backends that haven't awoken even once
since the last catchup will have to process the entire queue contents,
and they're all trying to do that at the same time. What's worse, they
take and release the SInvalLock once for each message they take off the
queue. This isn't so horrid for one-core machines (since each process
will monopolize the CPU for probably less than one timeslice while it's
catching up) but it's pretty obvious where all the contention is coming
from on an 8-core.
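
To put a rough number on the lock traffic described above, here is a minimal toy model. It is not PostgreSQL code: `CountingLock` and `catch_up` are hypothetical names invented for illustration, the queue is a plain list, and the lock only counts acquire/release cycles rather than blocking anyone. The extra cycle in which a backend discovers the queue is empty is an assumption of this model.

```python
class CountingLock:
    """Hypothetical stand-in for SInvalLock: it only counts lock cycles."""

    def __init__(self):
        self.cycles = 0

    def __enter__(self):
        self.cycles += 1  # one acquire/release pair per "with" block
        return self

    def __exit__(self, *exc):
        return False


def catch_up(queue, next_msg, lock):
    """Read all pending messages, taking the lock once per message.

    Returns the new read position and the messages that were read.
    """
    processed = []
    while True:
        with lock:
            if next_msg >= len(queue):
                break  # this final cycle only finds the queue empty
            msg = queue[next_msg]
            next_msg += 1
        processed.append(msg)  # message is handled outside the lock
    return next_msg, processed


if __name__ == "__main__":
    lock = CountingLock()
    queue = list(range(100))      # 100 pending invalidation messages
    for _ in range(8):            # 8 backends, all starting from position 0
        catch_up(queue, 0, lock)
    print(lock.cycles)
```

In this model each backend costs one lock cycle per pending message plus one more to see that nothing is left, so the total grows as backends times messages. On a single core those cycles are mostly uncontended; with 8 cores they collide.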

Some ideas for improving matters:

1. Try to avoid having all the backends hit the queue at once. Instead
of SIGUSR1'ing everybody at the same time, maybe hit only the process
with the oldest message pointer, and have him hit the next oldest after
he's done reading the queue.

2. Try to take more than one message off the queue per SInvalLock cycle.
(There is a tuning tradeoff here, since it would mean holding the lock
for longer at a time.)

3. Try to avoid having every backend run SIDelExpiredDataEntries every
time through ReceiveSharedInvalidMessages. It's not critical to delete
entries until the queue starts getting full --- maybe we could rejigger
the logic so it only happens once when somebody notices the queue is
getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum
do it, or something like that?
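
The three ideas above can be sketched in the same toy style. Again this is only an illustration under stated assumptions, not a patch: `Backend`, `read_batch`, `chained_catch_up`, and `should_clean_up` are hypothetical names, the 70 percent threshold is an arbitrary placeholder, and signals and locking are simulated by a plain loop and a counter.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    next_msg: int  # index of the next queue entry this backend must read


def read_batch(queue, backend, batch_size):
    """Idea 2: take up to batch_size messages in one simulated lock cycle."""
    batch = queue[backend.next_msg:backend.next_msg + batch_size]
    backend.next_msg += len(batch)
    return batch


def chained_catch_up(queue, backends, batch_size):
    """Idea 1: wake the furthest-behind backend first, then the next oldest.

    Backends are processed one after another instead of all at once.
    Returns the wake-up order and the number of simulated lock cycles.
    """
    order = []
    lock_cycles = 0
    for backend in sorted(backends, key=lambda b: b.next_msg):
        order.append(backend.name)
        while backend.next_msg < len(queue):
            read_batch(queue, backend, batch_size)
            lock_cycles += 1
    return order, lock_cycles


def should_clean_up(queue_len, capacity, threshold_pct=70):
    """Idea 3: delete expired entries only once the queue is getting full.

    threshold_pct is a made-up placeholder, not a PostgreSQL setting.
    """
    return queue_len * 100 >= capacity * threshold_pct


if __name__ == "__main__":
    queue = list(range(10))
    backends = [Backend("A", 7), Backend("B", 0), Backend("C", 4)]
    print(chained_catch_up(queue, backends, batch_size=4))
```

The batch size is where the tradeoff mentioned in point 2 shows up: a larger batch means fewer lock cycles but a longer hold time per cycle. Chaining does not reduce the total work, since every backend still reads its whole backlog; it only keeps the backends from doing it simultaneously.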

regards, tom lane
