Concurrent Data Store introduction

    To create Berkeley DB Concurrent Data Store applications, you must first initialize an environment by calling DB_ENV->open(). You must specify the DB_INIT_CDB and DB_INIT_MPOOL flags to that method. It is an error to specify any of the other DB_ENV->open() subsystem or recovery configuration flags, for example, DB_INIT_LOCK, DB_INIT_TXN, or DB_RECOVER. All databases must, of course, be created in this environment by using the db_create() function or the Db constructor, and specifying the environment as an argument.

    Berkeley DB performs appropriate locking so that safe enforcement of the deadlock-free, multiple-reader/single-writer semantic is transparent to the application. However, a basic understanding of Berkeley DB Concurrent Data Store locking behavior is helpful when writing Berkeley DB Concurrent Data Store applications.

    Berkeley DB Concurrent Data Store avoids deadlocks without the need for a deadlock detector by performing all locking on an entire database at once (or on an entire environment in the case of the DB_CDB_ALLDB flag), and by ensuring that at any given time only one thread of control is allowed to simultaneously hold a read (shared) lock and attempt to acquire a write (exclusive) lock.

    To enforce the rule that only one thread of control at a time can attempt to upgrade a read lock to a write lock, however, Berkeley DB must forbid multiple cursors from attempting to write concurrently. This is done using the DB_WRITECURSOR flag to the DB->cursor() method. This is the only difference between access method calls in Berkeley DB Concurrent Data Store and in the other Berkeley DB products. The DB_WRITECURSOR flag causes the newly created cursor to be a “write” cursor; that is, a cursor capable of performing writes as well as reads. Only cursors thus created are permitted to perform write operations (either deletes or puts), and only one such cursor can exist at any given time.

    Any attempt to create a second write cursor or to perform a non-cursor write operation while a write cursor is open will block until that write cursor is closed. Read cursors may open and perform reads without blocking while a write cursor is extant. However, any attempts to actually perform a write, either using the write cursor or directly using the DB->put() or DB->del() methods, will block until all read cursors are closed. This is how the multiple-reader/single-writer semantic is enforced, and prevents reads from seeing an inconsistent database state that may be an intermediate stage of a write operation.

    By default, Berkeley DB Concurrent Data Store does locking on a per-database basis. For this reason, using cursors to access multiple databases in different orders in different threads or processes, or leaving cursors open on one database while accessing another database, can cause an application to hang. If this behavior is a requirement for the application, Berkeley DB should be configured to do locking on an environment-wide basis. See the DB_CDB_ALLDB flag of the DB_ENV->set_flags() method for more information.

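    As a consequence of the Berkeley DB Concurrent Data Store locking model, the following sequences of operations will cause a thread to block itself indefinitely: keeping a cursor open while issuing a DB->put() or DB->del() access method call, and attempting to open a write cursor while another cursor is already held open by the same thread of control. Both sequences are correct when the cursor and the write belong to different threads of control; they are a problem only within a single thread, which will block and never be released.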

    If the application needs to open multiple cursors in a single thread to perform an operation, it can indicate to Berkeley DB that the cursor locks should not block each other by creating a Berkeley DB Concurrent Data Store group, using DB_ENV->cdsgroup_begin(). This creates a locker ID that is shared by all cursors opened in the group.

    Berkeley DB Concurrent Data Store groups use a DB_TXN handle to indicate the shared locker ID to Berkeley DB calls, and call DB_TXN->commit() to end the group. This is a convenient way to pass the locker ID to the calls where it is needed, but should not be confused with the real transactional semantics provided by Berkeley DB Transactional Data Store. In particular, Berkeley DB Concurrent Data Store groups do not provide any abort or recovery facilities, and have no impact on durability of operations.