Re: MultiXacts & WAL

Lists: pgsql-hackers
From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: pgsql-hackers(at)postgresql(dot)org
Subject: MultiXacts & WAL
Date: 2006-06-16 23:35:46
Message-ID: 20060616233546.65754.qmail@web27814.mail.ukl.yahoo.com

I am working on a possible extension of PostgreSQL's MVCC to support very timely failure masking in the context of three-tier applications, so I am currently studying PostgreSQL internals...

I am wondering why both the MultiXactIds and the corresponding OFFSETs and MEMBERs are currently persisted.
The documentation at the top of multixact.c contains the following statement:
"...This allows us to completely rebuild the data entered since the last checkpoint during XLOG replay..."

I can see the need to persist (not eagerly) MultiXactIds to avoid wraparound. Essentially, mass storage is used to extend the limited capacity of the SLRU data structures in shared memory.

The point I am missing is the need to be able to completely recover multixact offsets and members data. These carry information about current transactions holding shared locks on db tuples, which should not be essential for recovery purposes. After a crash you want to recover the content of your data, not the presence of shared locks on any tuple. AFAICS, this seems true both for committed/aborted transactions (which, being concluded, no longer care whether they held any shared lock) and for prepared transactions (which only need to recover their exclusive locks).

I have tried to dig through the comments within the main multixact.c functions and came across this comment (in CreateMultiXactId()):

"...The only way for the MXID to be referenced from any data page is for heap_lock_tuple() to have put it there, and heap_lock_tuple() generates an XLOG record that must follow ours... "

But I still cannot see the need to recover complete shared lock info (i.e. not only MultiXactIds but also the corresponding registered transaction IDs that were holding the lock)...

Maybe this is needed to support savepoints/subtransactions? Or is it something else that I am missing?

Thanks for your precious help!

Paolo
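
As an illustration of the structures discussed in this message, here is a minimal Python sketch of the idea behind the OFFSETs and MEMBERs areas: each MultiXactId maps to a starting offset in a flat members array, and one multixact's member list ends where the next one begins. The class and method names are hypothetical, ids start at 0 for simplicity, and the model ignores SLRU paging, wraparound and WAL entirely. It is a toy model under those assumptions, not the multixact.c implementation.

```python
class MultiXactStore:
    """Toy model of the offsets/members layout (hypothetical, not multixact.c)."""

    def __init__(self):
        self.offsets = []  # offsets[mxid] = index of the first member in self.members
        self.members = []  # flat list of member transaction ids

    def create(self, xids):
        """Register a new multixact for the given transaction ids; return its id."""
        mxid = len(self.offsets)
        self.offsets.append(len(self.members))
        self.members.extend(xids)
        return mxid

    def get_members(self, mxid):
        """Return the transaction ids that belong to mxid."""
        start = self.offsets[mxid]
        # The member list ends where the next multixact's members begin.
        if mxid + 1 < len(self.offsets):
            end = self.offsets[mxid + 1]
        else:
            end = len(self.members)
        return self.members[start:end]


store = MultiXactStore()
first = store.create([100, 101])
second = store.create([100, 101, 105])
print(store.get_members(first))   # [100, 101]
print(store.get_members(second))  # [100, 101, 105]
```

In this model, losing either array after a crash would make it impossible to tell which transactions a MultiXactId stored in a tuple refers to, which is the recoverability question raised above.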



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 00:09:56
Message-ID: 9308.1150502996@sss.pgh.pa.us

paolo romano <paolo(dot)romano(at)yahoo(dot)it> writes:
> The point i am missing is the need to be able to completely recover
> multixacts offsets and members data. These carry information about
> current transactions holding shared locks on db tuples, which should
> not be essential for recovery purposes.

This might be optimizable if we want to assume that multixacts will never
be used for any purpose except holding locks, but that seems a bit short
sighted. Is there any actually significant advantage to not logging
this information?

regards, tom lane


From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 06:15:49
Message-ID: Pine.OSF.4.61.0606170914060.231307@kosh.hut.fi

On Sat, 17 Jun 2006, paolo romano wrote:

> May be this is needed to support savepoints/subtransactions? Or is it
> something else that i am missing?

It's for two-phase commit. A prepared transaction can hold locks that need
to be recovered.

- Heikki


From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 06:47:25
Message-ID: 20060617064725.22507.qmail@web27811.mail.ukl.yahoo.com


Heikki Linnakangas wrote:
>> May be this is needed to support savepoints/subtransactions? Or is it
>> something else that i am missing?
>
> It's for two-phase commit. A prepared transaction can hold locks that need
> to be recovered.

When a transaction successfully enters the prepared state, it only retains its exclusive locks and releases any shared locks (i.e. multixacts)... or, at least, that is how it should be in principle according to serialization theory; I haven't yet checked whether this is what PostgreSQL actually does.



From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 07:02:05
Message-ID: 20060617070205.1615.qmail@web27807.mail.ukl.yahoo.com

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> paolo romano writes:
>> The point i am missing is the need to be able to completely recover
>> multixacts offsets and members data. These carry information about
>> current transactions holding shared locks on db tuples, which should
>> not be essential for recovery purposes.
>
> This might be optimizable if we want to assume that multixacts will never
> be used for any purpose except holding locks, but that seems a bit short
> sighted. Is there any actually significant advantage to not logging
> this information?
>
> regards, tom lane

I can see two main advantages:

* Reduced I/O activity during transaction processing: current workloads are typically dominated by reads (rather than updates)... and reads give rise to multixacts (if at least two transactions read the same page or if an explicit lock request is performed through heap_lock_tuple). And (long) transactions can read a lot of tuples, which sooner or later translates directly into (long) multixact logging. Accurately estimating the possible performance gain would require some profiling, but at first glance ISTM that there is good potential.

* Reduced recovery time: because of shorter logs and fewer data structures to rebuild... and reducing recovery time helps improve system availability, so it should not be overlooked.

Regards,

Paolo



From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 11:05:16
Message-ID: Pine.OSF.4.61.0606171349020.307302@kosh.hut.fi

On Sat, 17 Jun 2006, paolo romano wrote:

> When a transaction enters (successfully) the prepared state it only
> retains its exclusive locks and releases any shared locks (i.e.
> multixacts)... or, at least, that's how it should be in principle
> according to serializiaton theory, i haven't yet checked out if this is
> what is done in postgresql .

In PostgreSQL, shared locks are not taken when just reading data. They're
used to enforce foreign key constraints. When inserting a row to a table
with a foreign key, the row in the parent table is locked to
keep another transaction from deleting it. It's not safe to release the
lock before end of transaction.

- Heikki


From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 11:17:47
Message-ID: Pine.OSF.4.61.0606171406160.307302@kosh.hut.fi

On Sat, 17 Jun 2006, paolo romano wrote:

> * Reduced I/O Activity: during transaction processing: current workloads
> are typically dominated by reads (rather than updates)... and reads give
> rise to multixacts (if there are at least two transactions reading the
> same page or if an explicit lock request is performed through
> heap_lock_tuple). And (long) transactions can read a lot of tuples,
> which directly translates into (long) multixact logging sooner or later.
> To accurately estimate the possible performance gain one should perform
> some profiling, but at first glance ISTM that there are good
> potentialities.

Read-only transactions don't acquire shared locks. And updating
transactions emit WAL records anyway; the additional I/O caused by
multixact records is negligible.

Also, multixacts are only used when two transactions hold a shared lock
on the same row.

> * Reduced Recovery Time: because of shorter logs & less data
> structures to rebuild... and reducing recovery time helps improving
> system availability so should not be overlooked.

I doubt the multixact stuff makes much difference compared to all other
WAL traffic.

In fact, logging the multixact stuff could be skipped when no two-phase
transactions are involved. The problem is, you don't know if a transaction is one
phase or two phase before you see COMMIT or PREPARE TRANSACTION.

- Heikki
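
To make the point about when multixacts come into play concrete, here is a small Python sketch, under loudly simplified assumptions: a row remembers its current shared lockers, and a multixact record is logged only when a second transaction joins an existing locker. The function name, the row representation and the record names "heap_lock" and "multixact_create" are hypothetical labels for this illustration; lock release and checks on whether earlier lockers are still running are ignored.

```python
def lock_shared(row, xid, wal):
    """Take a shared row lock for transaction xid, appending toy WAL records.

    row is a dict with a "lockers" list; wal is a list of (record, payload).
    """
    if row["lockers"] and xid not in row["lockers"]:
        # A second concurrent locker: the row now has to refer to a
        # multixact listing every holder, and that member list is logged
        # before the lock record that references it.
        row["lockers"].append(xid)
        wal.append(("multixact_create", tuple(row["lockers"])))
    elif not row["lockers"]:
        # A single locker needs no multixact at all.
        row["lockers"].append(xid)
    wal.append(("heap_lock", xid))


demo_wal = []
demo_row = {"lockers": []}
lock_shared(demo_row, 10, demo_wal)  # first locker: no multixact record
lock_shared(demo_row, 11, demo_wal)  # second locker: one extra record
for record in demo_wal:
    print(record)
```

In this toy model the extra logging appears only on rows that are share-locked by more than one transaction at once, which matches the argument above that it should be a small part of total WAL traffic. Only a real measurement can confirm that for a given workload.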


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: paolo romano <paolo(dot)romano(at)yahoo(dot)it>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 16:10:12
Message-ID: 15544.1150560612@sss.pgh.pa.us

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> Also, multixacts are only used when two transactions hold a shared lock
> on the same row.

Yeah, it's difficult to believe that multixact stuff could form a
noticeable fraction of the total WAL load, except perhaps under really
pathological circumstances, because the code just isn't supposed to be
exercised often. So I don't think this is worth pursuing. Paolo's free
to try to prove the opposite of course ... but I'd want to see numbers
not speculation.

regards, tom lane


From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 17:37:18
Message-ID: 20060617173718.63055.qmail@web27809.mail.ukl.yahoo.com


Heikki Linnakangas wrote:
> In PostgreSQL, shared locks are not taken when just reading data. They're
> used to enforce foreign key constraints. When inserting a row to a table
> with a foreign key, the row in the parent table is locked to
> keep another transaction from deleting it. It's not safe to release the
> lock before end of transaction.

Releasing shared locks (whether used for plain reading or for enforcing foreign keys) before transaction end would clearly be wrong.
The point I was originally raising is whether there is any concrete reason (which I still can't see) to require multixact recoverability (by means of logging).
Concerning the prepared state of two-phase commit, as I pointed out in my previous post, shared locks can safely be released once a transaction gets precommitted, hence they do not have to be made durable.



From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: paolo romano <paolo(dot)romano(at)yahoo(dot)it>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 17:43:34
Message-ID: 20060617174334.15471.qmail@web27807.mail.ukl.yahoo.com


Tom Lane wrote:
> Yeah, it's difficult to believe that multixact stuff could form a
> noticeable fraction of the total WAL load, except perhaps under really
> pathological circumstances, because the code just isn't supposed to be
> exercised often. So I don't think this is worth pursuing. Paolo's free
> to try to prove the opposite of course ... but I'd want to see numbers
> not speculation.
>
> regards, tom lane

Tom is right, mine are indeed just plain speculations, motivated by my original doubt about whether there were hidden reasons for requiring multixact recoverability.
I don't know if I'll find the time to do some performance tests, at least in the short term, but I've enjoyed exchanging views with you all, so thanks a lot for your feedback!

Just out of curiosity, what kind of benchmarks would you use to evaluate this effect? I am quite familiar with TPC-C and TPC-W, but I am a newcomer to the PostgreSQL community, so I was wondering whether you use any reference benchmark...



From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 19:09:01
Message-ID: Pine.OSF.4.61.0606172113310.312139@kosh.hut.fi

On Sat, 17 Jun 2006, paolo romano wrote:

> The original point I was moving is if there were any concrete reason
> (which still I can't see) to require Multixacts recoverability (by means
> of logging).
> Concerning the prepare state of two phase commit, as I was pointing out
> in my previous post, shared locks can safely be released once a
> transaction gets precommitted, hence they do not have to be made
> durable.

No, it's not safe to release them until 2nd phase commit.

Imagine table foo and table bar. Table bar has a foreign key reference to
foo.

1. Transaction A inserts a row to bar, referencing row R in foo. This
acquires a shared lock on R.
2. Transaction A precommits, releasing the lock.
3. Transaction B deletes R. The new row inserted by A is not visible to
B, so the delete succeeds.
4. Transaction A and B commit. Oops, the new row in bar references R that
doesn't exist anymore.

Holding the lock until the true end of transaction, the 2nd phase
of commit, blocks B from deleting R.

- Heikki
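
The four-step schedule above can be replayed in a few lines of Python. This is only a sketch under simplifying assumptions: tables are plain Python collections, transaction B's blocking is modelled as "the delete does not happen", and the function name and its parameter are hypothetical. It is not how PostgreSQL implements row locks or two-phase commit.

```python
def run_schedule(release_at_prepare):
    """Replay the four steps; return True if bar ends up referencing a missing row."""
    foo = {"R"}                   # parent table: set of existing keys
    bar = []                      # child table: list of referenced keys
    shared_locks = {"R": set()}   # row -> transactions holding a shared lock

    # 1. A inserts a row into bar referencing R, taking a shared lock on R.
    shared_locks["R"].add("A")
    pending_insert = "R"

    # 2. A prepares (first phase). Under the unsafe policy it drops the lock here.
    if release_at_prepare:
        shared_locks["R"].discard("A")

    # 3. B tries to delete R. B cannot see A's uncommitted row, so only the
    #    shared lock can stop it. While the lock is held, B is blocked.
    if not shared_locks["R"]:
        foo.discard("R")

    # 4. A commits (second phase): its insert becomes visible, locks go away.
    bar.append(pending_insert)
    shared_locks["R"].discard("A")

    # A dangling reference means the foreign key has been violated.
    return any(key not in foo for key in bar)


print(run_schedule(release_at_prepare=True))   # True: bar references a deleted row
print(run_schedule(release_at_prepare=False))  # False: B was blocked, R still exists
```

What B does once A has finally committed is outside this sketch. The only point illustrated is that the lock has to survive the window between prepare and commit, and therefore also a crash inside that window.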


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 19:21:09
Message-ID: 22553.1150572069@sss.pgh.pa.us

paolo romano <paolo(dot)romano(at)yahoo(dot)it> writes:
> Concerning the prepare state of two phase commit, as I was pointing out in my previous post, shared locks can safely be released once a transaction gets precommitted, hence they do not have to be made durable.

The above statement is plainly wrong. It would for example allow
violation of FK constraints.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: paolo romano <paolo(dot)romano(at)yahoo(dot)it>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 19:22:41
Message-ID: 200606171222.41942.josh@agliodbs.com

Tom, Paolo,

> Yeah, it's difficult to believe that multixact stuff could form a
> noticeable fraction of the total WAL load, except perhaps under really
> pathological circumstances, because the code just isn't supposed to be
> exercised often.  So I don't think this is worth pursuing.  Paolo's free
> to try to prove the opposite of course ... but I'd want to see numbers
> not speculation.

I would like to see some checking of this, though. Currently I'm doing
testing of PostgreSQL under very large numbers of connections (2000+) and am
finding that there's a huge volume of xlog output ... far more than
comparable RDBMSes. So I think we are logging stuff we don't really have
to.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, paolo romano <paolo(dot)romano(at)yahoo(dot)it>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 19:40:57
Message-ID: 22679.1150573257@sss.pgh.pa.us

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> I would like to see some checking of this, though. Currently I'm doing
> testing of PostgreSQL under very large numbers of connections (2000+) and am
> finding that there's a huge volume of xlog output ... far more than
> comparable RDBMSes. So I think we are logging stuff we don't really have
> to.

Please dump some of the WAL segments with xlogdump so we can get a
feeling for what's in there.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, paolo romano <paolo(dot)romano(at)yahoo(dot)it>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 20:02:15
Message-ID: 200606171302.15284.josh@agliodbs.com

Tom,

> Please dump some of the WAL segments with xlogdump so we can get a
> feeling for what's in there.

OK, will do on Monday's test run. Is it possible for me to run this at the
end of the test run, or do I need to freeze it in the middle to get useful
data?

Also, we're toying with the idea of testing full_page_writes=off for Solaris.
The Solaris engineers claim that it should be safe on Sol10 + Sun hardware.
I'm not entirely sure that's true; is there a destruction test of the bug
that caused us to remove that option?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, paolo romano <paolo(dot)romano(at)yahoo(dot)it>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: MultiXacts & WAL
Date: 2006-06-17 20:37:45
Message-ID: 23001.1150576665@sss.pgh.pa.us

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Please dump some of the WAL segments with xlogdump so we can get a
>> feeling for what's in there.

> OK, will do on Monday's test run. Is it possible for me to run this at the
> end of the test run, or do I need to freeze it in the middle to get useful
> data?

I'd just copy off a random sample of WAL segment files while the run is
proceeding. You don't need very many, half a dozen at most.

> Also, we're toying with the idea of testing full_page_writes=off for Solaris.
> The Solaris engineers claim that it should be safe on Sol10 + Sun hardware.
> I'm not entirely sure that's true; is there a destruction test of the bug
> that caused us to remove that option?

The bug that made us turn it off in the 8.1 branch had nothing to do
with hardware reliability or the lack thereof. As for testing, will
they let you yank the power cord?

regards, tom lane
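
The suggestion to copy off a random sample of WAL segment files during the run could be scripted roughly as follows. This is a hedged Python sketch, not a PostgreSQL tool: the function name, its parameters and the default of six files are choices made for this illustration, and it assumes segment files can be recognised by their names of 24 uppercase hexadecimal digits inside the WAL directory.

```python
import random
import re
import shutil
from pathlib import Path

# Assumption: WAL segment files are named with exactly 24 hex digits.
SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def sample_wal_segments(wal_dir, dest_dir, count=6, seed=None):
    """Copy up to `count` randomly chosen WAL segments; return their names."""
    wal_dir, dest_dir = Path(wal_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Ignore subdirectories and anything that does not look like a segment.
    segments = sorted(
        p for p in wal_dir.iterdir()
        if p.is_file() and SEGMENT_NAME.match(p.name)
    )

    rng = random.Random(seed)
    chosen = rng.sample(segments, min(count, len(segments)))
    for path in chosen:
        shutil.copy2(path, dest_dir / path.name)
    return sorted(p.name for p in chosen)
```

A call might look like `sample_wal_segments("/path/to/data/pg_xlog", "/tmp/wal_sample")`, where both paths are placeholders. The copied files can then be inspected with a WAL dump tool, as requested above.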


From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-18 13:10:03
Message-ID: 20060618131003.71367.qmail@web27807.mail.ukl.yahoo.com


Heikki Linnakangas wrote:
> No, it's not safe to release them until 2nd phase commit.
>
> Imagine table foo and table bar. Table bar has a foreign key reference to
> foo.
>
> 1. Transaction A inserts a row to bar, referencing row R in foo. This
> acquires a shared lock on R.
> 2. Transaction A precommits, releasing the lock.
> 3. Transaction B deletes R. The new row inserted by A is not visible to
> B, so the delete succeeds.
> 4. Transaction A and B commit. Oops, the new row in bar references R that
> doesn't exist anymore.
>
> Holding the lock until the true end of transaction, the 2nd phase
> of commit, blocks B from deleting R.
>
> - Heikki


Heikki, thanks for the clarifications. I was not considering the additional issues that arise with referential integrity constraints... in fact I was citing a known result from theory books on 2PC, which did not include FKs in their reasoning... But as usual, things always look much simpler in theory than in practice!

Anyway, again in theory, if one wanted to minimize logging overhead for shared locks, one might treat differently (i) regular shared locks (i.e. locks due to plain reads, which do not require durability in case of 2PC) and (ii) shared locks held because some SQL command references a tuple via a FK, which have to be persisted until the second 2PC phase. (There is no other scenario in which you *must* persist shared locks, is there?)

Of course, in practice distinguishing the two situations above may not be so simple, and it still has to be shown whether such an optimization is really worthwhile...
By the way, PostgreSQL logs *every* single shared lock in detail, even though this is actually needed only if (i) the transaction turns out to be a distributed one (i.e. PREPARE is issued on that transaction), AND (ii) the shared lock is there to ensure the validity of a FK. AFAICS, in most practical workloads (i) local transactions dominate distributed ones and (ii) shared locks due to plain reads dominate locks due to FKs, so the current implementation does not seem to optimize for the most frequent scenario.

regards,

paolo



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-18 15:17:44
Message-ID: 28458.1150643864@sss.pgh.pa.us

paolo romano <paolo(dot)romano(at)yahoo(dot)it> writes:
> Anyway, again in theory, if one wanted to minimize logging overhead for shared locks, one might adopt a different treatment for (i) regular shared locks (i.e. locks due to plain reads not requiring durability in case of 2PC) and (ii) shared locks held because some SQL command is referencing a tuple via a FK, which have to be persisted until the 2-nd 2PC phase (There is no any other scenario in which you *must* persist shared locks, is there?)

I can't see any basis at all for asserting that you don't need to
persist particular types of locks. In the current system, a multixact
lock might arise from either FK locking, or a user-issued SELECT FOR SHARE.
In either case it's possible that the lock was taken to guarantee the
integrity of a data change made somewhere else. So we can't release it
before commit.

regards, tom lane


From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-18 18:30:11
Message-ID: Pine.OSF.4.61.0606182110040.46925@kosh.hut.fi

On Sun, 18 Jun 2006, paolo romano wrote:

> Anyway, again in theory, if one wanted to minimize logging overhead for
> shared locks, one might adopt a different treatment for (i) regular
> shared locks (i.e. locks due to plain reads not requiring durability in
> case of 2PC) and (ii) shared locks held because some SQL command is
> referencing a tuple via a FK, which have to be persisted until the 2-nd
> 2PC phase (There is no any other scenario in which you *must* persist
> shared locks, is there?)

There are no "regular shared locks" in postgres in that sense. Shared locks
are only used for maintaining FK integrity. Or by manually issuing a
SELECT FOR SHARE, but that's also for maintaining integrity. MVCC
rules take care of the "plain reads". If you're not familiar with MVCC,
it's explained in chapter 12 of the manual.

The source code in heapam.c also mentions that Point In Time Recovery
requires logging the locks, though I'm not sure why.

> By the way, postgresql is detailedly logging *every* single shared
> lock, even though this is actually needed only if (i) the transaction
> turns out to be a distributed one (i.e. prepare is issued on that
> transactions), AND (ii) the shared lock is due to ensure validity of a
> FK. AFAICS, in most practical workloads (i) local transactions dominate
> distributed ones and (ii) shared locks due to plain reads dominate locks
> due to FK, so the current implementaion does not seem to be optimizing
> the most frequent scenario.

The problem is that we don't know beforehand if a transaction is a
distributed one or not.

Feel free to write a benchmark to see how much difference the logging
makes! If it's significant, I'm sure we can figure out ways to improve it.

- Heikki


From: paolo romano <paolo(dot)romano(at)yahoo(dot)it>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXacts & WAL
Date: 2006-06-18 21:35:25
Message-ID: 20060618213525.74301.qmail@web27810.mail.ukl.yahoo.com

>There is no "regular shared locks" in postgres in that sense. Shared locks
>are only used for maintaining FK integrity. Or by manually issuing a
>SELECT FOR SHARE, but that's also for maintaining integrity. MVCC
>rules take care of the "plain reads". If you're not familiar with MVCC,
>it's explained in chapter 12 of the manual.
>
>The source code in heapam.c also mentions Point In Time Recovery to
>require logging the locks, though I'm not sure why.

Thanks for your explanations; now I can see what was confusing me.

> The problem with is that we don't know beforehand if a transaction is a
> distributed one or not.
>
> Feel free to write a benchmark to see how much difference the logging
> makes! If it's significant, I'm sure we can figure out ways to improve it.

Now that I finally see that multixacts are due only to explicit shared lock requests or to FKs, I tend to agree with Tom's original doubts about the actual impact of multixact-related logging. Of course, in practice that impact would vary from application to application, so for some classes of workloads it may still make sense to avoid multixact logging, assuming they contain no distributed transactions and that a hack can be found to know beforehand whether a transaction is distributed or not... BTW, if I manage to find some free time to do some performance tests, I'll be sure to let you know!

Thanks again,

Paolo