Re: Sequence Access Method WIP

From: José Luis Tallón <jltallon(at)adv-solutions(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Subject: Re: Sequence Access Method WIP
Date: 2014-12-03 14:50:00
Message-ID: 547F2318.6080207@adv-solutions.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/03/2014 11:24 AM, Andres Freund wrote:
> On 2014-12-03 10:59:50 +0100, José Luis Tallón wrote:
>> snip]
> I don't think the WAL logging would need to change much in comparison to
> the current solution. We'd just add the page number to the WAL record.
>
> The biggest advantage would be to require fewer heavyweight locks,
> because the pg_sequence one could be a single fastpath lock. Currently
> we have to take the sequence's relation lock (heavyweight) and then the
> the page level lock (lwlock) for every single sequence used.

Got it, thank you for the explanation.

>> May I possibly suggest a file-per-schema model instead? This approach would
>> certainly solve the excessive i-node consumption problem that --I guess--
>> Andres is trying to address here.
> I don't think that really has any advantages.

Just spreading the I/O load, nothing more, it seems:

Just to elaborate a bit on the reasoning, for completeness' sake:
Given that a relation's segment maximum size is 1GB, we'd have
(1048576/8)=128k sequences per relation segment.
Arguably, not many real use cases will have that many sequences.... save
for *massively* multi-tenant databases.

The downside being that all that random I/O --- in general, it can't
really be sequential unless there are very very few sequences--- can't
be spread to other spindles. Create a "sequence_default_tablespace" GUC
+ ALTER SEQUENCE SET TABLESPACE, to use an SSD for this purpose maybe?
(I could take a shot at the patch, if deemed worthwhile)

>> [snip]
> The overhead of a file is much more than wasting the remainder of a
> page. Alone the requirement of separate fsyncs and everything is pretty
> bothersome. The generated IO patterns are also much worse...

Yes, you are right. I stand corrected.

> [snip]
> I think there's pretty much no chance of accepting more than one page
> per sequence

Definitively.

Thanks,

J.L.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-12-03 14:54:22 pgsql: Keep track of transaction commit timestamps
Previous Message Robert Haas 2014-12-03 14:39:18 Re: B-Tree support function number 3 (strxfrm() optimization)