The writings of Merlin Moncure, professional database developer, about work, life, family, and everything else.

Wednesday, December 13, 2006

Advisory Locks Part 2

Previously, I wrote about PostgreSQL Advisory Locks and how important and useful they are. One thing I forgot to mention is how in very specifc cases they can improve concurrency in your application. This is because they are both long-term (lock duration can exceed a single transaction) and short-term (lock duration can be shorter than a single transaction). This entry will describe the short-term aspect and what it can be used for.

PostgreSQL uses a high performance, high concurrency MVCC engine to manage mutiple sessions reading and writing to the database. All locks in PostgreSQL use this engine and must obey transaction sementics -- except for advisory locks. While advisory locks can be used for various tricks, their most important role is to provide a database managed mutex of precisely controlled duration irresepctive of what else is going on in the system at the time. The obvious use for this is for long term pessimistic locks, but how else might they be used?

One scenario is to manage contention to resources that also do not follow transactional semantics. One such resource is a sequence -- even if in a serializable transaction other sessions can access that resource and alter your expected results. For example, if you pulled 10 values from a sequence in a pl/pgsql FOR loop, there is no guarantee that those values are contiguous, except to lock a resource before the first value is fetched and hold that lock for the duration of the transaction. Advisory locks provide the tools to give that guarantee with a shorter term lock.

Imagine you had a pl/pgsql function that was long or was part of a transaction that was long and had the following requirements:
  • the sequence s must be incremented
  • after incrementing s, execute an operation with a possible (handled) failure condition. If that fails, return s to the value before incrementing
  • until the failure state of the operation is checked, no other sessions must increment s
The third requirement is basically saying s must be locked inside this transaction. The standard way to do this is to lock something, either a table or a record. However, since all locks, once acquired, must be held until transaction terminatation (via COMMIT/ROLLBACK), our concurrency is poor especially if the lock is acuired early on and the transaction is long running, as there is no facility to unlock resources in-transaction.

Our transaction time line is:
B-TS----------C
B=Begin, S=increment S, C=Commit, T=Standard SQL transaction lock
using the lock, we have a mutex guarding S to C.

With advisory locks, a better solution emerges: lock some number that represents s and unlock that number once the failure state is checked and handled. Now our transaction timeline is:
B-LSU---------C
L=Advisory lock, U=Advisory unlock

Obviously, this lowers contention and improves concurrency. In certain rare cases, this can provide enormous benefits.

Sunday, December 10, 2006

About the author

Who am I?

My name is Merlin Moncure and I am a computer programmer living in Florida. I have a beautiful wife and two beautiful children and am pretty successful by most reasonable measures. This blog is going to be moslty about things related to computer and technology but also about the various things I am concerned about. Particular areas of focus will be the economy, PostgreSQL, a smattering of policitcs and all kinds of other long winded pontificating about whatever else comes to mind.

I previously had a some PostgreSQL related information in planetpostgresql, but I found the blogging software in use there, Serendipity, very frustrating and difficult to use. Also, Blogger integrates with my GMail account in two ways: first, I can login with my Google credentials, and secondly I can publish entries from Google Docs which is a much nicer interface than the inline editor I use here. So, here I am. My first real entry on this blog is actually a post that Serendipity ate.

I would like to give a quick shout out to a few of favorite places on the internet. If you are in the programming business, do yourself a favor and check out PostgeSQL. Also, if you like to fret about the world economy like me, you absolutely have to read Mish's blog. Brilliant stuff. If you live in West Michigan and need furniture, be sure to get it from Longstreet Furniture and tell them Merlin sent you, and you will get a good deal. Also, if you are into RF engineering, check out RadioSoft, one of the biggest names in the business.


merlin

Saturday, December 09, 2006

PostgreSQL 8.2 Advisory Locks

Recently released PostgreSQL 8.2 is a treasure trove of helpful features for the database developer. Dubbed a performance release by the advocacy team, the most powerful open source database engine has once again shown that it is greater than the sum of its parts, and the breadth and scope of the project is greater than its developers. Many internal structural improvements were made, and several user level features were sprinkled in for good measure. One such feature was a overhaul of one of the more useful (and overlooked) contrib modules. Previously known as userlocks, the module was re-branded 'advisory' locks and brought into the core core PostgreSQL architecture.

Advisory locks are 'long-term, cooperative' locks. They are long-term in the sense that they do not follow mvcc rules, and are in fact completely orthogonal to transactional locks. This is a slightly long-winded way of saying that advisory locks are not released upon transaction commit/rollback and can be held outside of a transaction. This is noteworthy because advisory locks are one of the very few things that can be used to maintain session state without using persistent storage (prepared statements, temporary tables, and listen/notify also qualify here), and especially noteworthy because they are the only mechanism in PostgreSQL that I know that can affect the behavior of another session without consideration of either session's transactional state. Advisory locks are cooperative in that all users of them are on the honor system -- this is quite different from standard mvcc locks which follow very strict rules. Anything can be locked at any time. Advisory locks can be thought of as a low level interface to the internal locking engine and will generally need some scaffolding to make them useful.

These properties make advisory locks practical for a number of useful things:
  • Emulation of 'pessimistic locks' common in non-MVCC databases
  • High-performance simple session IPC (like setting a boolean in a table, but faster and MVCC penalty free)
  • Allow cooperation locking of resources without using long duration transactions (which are evil!)

Advisory locks are very simple to use:

Administrator=# select pg_advisory_lock(1);
pg_advisory_lock
------------------
(1 row)

Administrator=# select locktype, classid, objid, pid, mode, granted from pg_locks where locktype = 'advisory';
locktype | classid | objid | pid | mode | granted
----------+---------+-------+------+---------------+---------
advisory | 0 | 1 | 3896 | ExclusiveLock | t
(1 row)

Administrator=# select pg_advisory_unlock(1);
pg_advisory_unlock
--------------------
t
(1 row)



The above commands acquired an advisory lock, demonstrated it in the pg_locks view, and released it using the built in interface. The meaning of the number '1' is completely application defined. It could represent the ID of a table, a special application mutex, a session boolean flag, or just about anything. Keep in mind that if all sessions do not agree on what the number one represents, problems are bound to follow. Regardless of how the session terminates, the locks are released immediately upon disconnect.

If you want to lock an ID of a table, the lock command can be inlined into the select statement:

Administrator=# select pg_advisory_lock(id), * from foo where id = 1;
pg_advisory_lock | id | f
------------------+----+-----
| 1 | abc
(1 row)


And the unlock is just as easy. Advisory locks are extremely fast:

Administrator=# explain analyze select pg_advisory_lock(v) from generate_series(1,1000) v;
QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Function Scan on generate_series v (cost=0.00..17.50 rows=1000 width=4) (actual time=0.304..4.365 rows=1000 loops=1) Total runtime: 6.244 ms
(2 rows)


If I read that right the time on my iMac to acquire a lock is about 6 microseconds....not bad! A note of caution here: advisory locks are drawn out of shared memory and if they are abused the results can be catastrophic:

Administrator=# select pg_advisory_lock(v) from generate_series(1,10000) v;

WARNING: out of shared memory
ERROR: out of shared memory

HINT: You may need to increase max_locks_per_transaction.


Administrator=# select * from pg_locks;
WARNING: out of shared memory

When shared memory is used up, the server basically cannot do anything at all. Quitting the session returned everything to normal however. All the same, extreme care must be taken not to leak locks. Also, they are an extremely obvious Denial of Service target following an SQL injection attack. If this is an issue you may want to consider locking them down:

Administrator=# revoke execute on function pg_advisory_lock(bigint) from public;
REVOKE

Advisory locks provide a 64 bit space for locks. You have the choice of locking two 32 bit integers or a single 64 bit integer (these are drawn from the same lock pool and are thus only superficially different). In addition, there are the 'shared' flavors of the advisory locks which do not block other shared lockers but block and are blocked by exclusive lockers. Best of all, there are 'try' flavors of the functions which do not wait for the lock to be held if another session hold it -- this is similar to the NOWAIT clause of some other SQL statements that can hold locks. The full listing of all the advisory lock functions is here.

Merlin