首先是创建Column Family,这里注意我们可以通过两种方式来创建Column Family,一种是在Open DB的时候通过传递需要创建的Column Family,一种是当DB创建并打开之后, 通过直接的CreateColumnFamily来创建Column Family.
这里可以看到不管是哪一种方式最终都会返回一个ColumnFamilyHandle给调用者来使用.
然后就是删除Column Family的方式,这里很简单就是传递之前创建的ColumnFamilyHandle给RocksDB,然后用以删除.
所有的Column Family都是通过一个叫做ColumnFamilySet的结构来管理的,而每一个Column Family都是一个ColumnFamilyData.
class ColumnFamilySet {
public:
// ColumnFamilySet supports iteration
public:
.................................
ColumnFamilyData* CreateColumnFamily(const std::string& name, uint32_t id,
Version* dummy_version,
const ColumnFamilyOptions& options);
iterator begin() { return iterator(dummy_cfd_->next_); }
iterator end() { return iterator(dummy_cfd_); }
...............................
private:
friend class ColumnFamilyData;
// helper function that gets called from cfd destructor
// REQUIRES: DB mutex held
void RemoveColumnFamily(ColumnFamilyData* cfd);
// column_families_ and column_family_data_ need to be protected:
// * when mutating both conditions have to be satisfied:
// 1. DB mutex locked
// 2. thread currently in single-threaded write thread
// * when reading, at least one condition needs to be satisfied:
// 1. DB mutex locked
// 2. accessed from a single-threaded write thread
std::unordered_map<std::string, uint32_t> column_families_;
std::unordered_map<uint32_t, ColumnFamilyData*> column_family_data_;
uint32_t max_column_family_;
ColumnFamilyData* dummy_cfd_;
// We don't hold the refcount here, since default column family always exists
// We are also not responsible for cleaning up default_cfd_cache_. This is
// just a cache that makes common case (accessing default column family)
// faster
ColumnFamilyData* default_cfd_cache_;
..................................
};
然后来看ColumnFamilyData,这个数据结构就是用来表示一个ColumnFamily,保存了对应的信息,我们可以看到有ID/name以及当前ColumnFamily对应的所有的version(dummy_versions_). 其中这里的next_/prev_就是在ColumnFamilySet中用来表示所有ColumnFamily的双向链表.
然后就是返回给调用者的ColumnFamilyHandleImpl结构,这个结构主要是封装了ColumnFamilyData.
// ColumnFamilyHandleImpl is the class that clients use to access different
class ColumnFamilyHandleImpl : public ColumnFamilyHandle {
public:
// create while holding the mutex
ColumnFamilyHandleImpl(
ColumnFamilyData* cfd, DBImpl* db, InstrumentedMutex* mutex);
// destroy without mutex
virtual ~ColumnFamilyHandleImpl();
virtual ColumnFamilyData* cfd() const { return cfd_; }
......................................
private:
ColumnFamilyData* cfd_;
DBImpl* db_;
InstrumentedMutex* mutex_;
};
接下来我们就来从ColumnFamily的创建以及删除来分析ColumnFamily的实现.我们从DBImpl::CreateColumnFamilyImpl开始.在这个函数 中首先就是通过调用GetNextColumnFamilyID来得到当前创建的ColumnFamily对应的ID(自增).然后再调用LogAndApply来对ColumnFamily 进行对应的操作.最后再返回封装好的ColumnFamilyHandle给调用者.
Status DBImpl::CreateColumnFamilyImpl(const ColumnFamilyOptions& cf_options,
const std::string& column_family_name,
ColumnFamilyHandle** handle) {
.......................................
{
...................................
VersionEdit edit;
edit.AddColumnFamily(column_family_name);
uint32_t new_id = versions_->GetColumnFamilySet()->GetNextColumnFamilyID();
edit.SetColumnFamily(new_id);
edit.SetLogNumber(logfile_number_);
edit.SetComparatorName(cf_options.comparator->Name());
// LogAndApply will both write the creation in MANIFEST and create
// ColumnFamilyData object
{ // write thread
WriteThread::Writer w;
write_thread_.EnterUnbatched(&w, &mutex_);
// LogAndApply will both write the creation in MANIFEST and create
// ColumnFamilyData object
s = versions_->LogAndApply(nullptr, MutableCFOptions(cf_options), &edit,
&mutex_, directories_.GetDbDir(), false,
&cf_options);
write_thread_.ExitUnbatched(&w);
........................................
*handle = new ColumnFamilyHandleImpl(cfd, this, &mutex_);
ROCKS_LOG_INFO(immutable_db_options_.info_log,
"Created column family [%s] (ID %u)",
column_family_name.c_str(), (unsigned)cfd->GetID());
}
.............................................
} // InstrumentedMutexLock l(&mutex_)
.................................
return s;
}
最终会在LogAndApply调用ColumnFamilySet的CreateColumnFamily函数(通过VersionSet::CreateColumnFamily),这个函数我们可看到主要做了下面三件事情
- 创建ColumnFamilyData对象
- 将新的创建好的CFD加入到双向链表
-
然后来看如何删除ColumnFamily,这里所有的删除最终都会调用ColumnFamilySet::RemoveColumnFamily函数,这个函数是是从两个Map中删除对应的ColumnFamily. 这里或许我们要问了,为什么管理的双向链表不需要删除呢。这里原因是这样的,由于ColumnFamilyData是通过引用计数管理的,因此只有当所有的引用计数都清零之后, 才需要真正的函数ColumnFamilyData(也就是会从双向链表中删除数据).
// under a DB mutex AND from a write thread
void ColumnFamilySet::RemoveColumnFamily(ColumnFamilyData* cfd) {
auto cfd_iter = column_family_data_.find(cfd->GetID());
assert(cfd_iter != column_family_data_.end());
column_family_data_.erase(cfd_iter);
column_families_.erase(cfd->GetName());
}
因此我们来看ColumnFamilyData的析构函数.可以看到析构函数中会从双向链表中删除对应的数据,以及处理对应的Version(corrent_).
// DB mutex held
ColumnFamilyData::~ColumnFamilyData() {
assert(refs_.load(std::memory_order_relaxed) == 0);
// remove from linked list
auto prev = prev_;
auto next = next_;
prev->next_ = next;
next->prev_ = prev;
if (!dropped_ && column_family_set_ != nullptr) {
// If it's dropped, it's already removed from column family set
// If column_family_set_ == nullptr, this is dummy CFD and not in
// ColumnFamilySet
column_family_set_->RemoveColumnFamily(this);
}
if (current_ != nullptr) {
current_->Unref();
}
..............................
最后我们来看一下在磁盘上ColumnFamily是如何保存的,首先需要明确的是ColumnFamily是保存在MANIFEST文件中的,信息的保存比较简单(之前的文章有介绍), 和MANIFEST中其他的信息没什么区别,因此这里我们主要来看数据的读取以及初始化,这里所有的操作都是包含在VersionSet::Recover中,我们来看这个函数.
函数主要的逻辑就是读取MANIFEST然后来再来将磁盘上读取的ColumnFamily的信息初始化(初始化ColumnFamilySet结构),可以看到这里相当于将之前的create/drop 的操作全部回放一遍,也就是会调用CreateColumnFamily/DropColumnFamily来将磁盘的信息初始化到内存.