Brewing Models
In this overview we introduce brew, a lightweight collection of helper functions for building your model. We start by explaining the key concepts of Ops versus Helper Functions. Then we show how to use brew, how it acts as an interface to the ModelHelper object, and the arg_scope syntax sugar. Finally, we discuss the motivation for introducing brew.
Before we dig into brew, we should review some conventions in Caffe2 and how layers of a neural network are represented. Deep learning networks in Caffe2 are built from operators. Typically these operators are written in C++ for maximum performance. Caffe2 also provides a Python API that wraps these C++ operators, so you can experiment and prototype more flexibly. In Caffe2, operators are always written in CamelCase, whereas Python helper functions with a similar name are in lowercase. Examples follow.
We often refer to an operator as an “Op” and to a collection of operators as “Ops”. For example, the FC Op represents a Fully-Connected operator that has weighted connections to every neuron in the previous layer and to every neuron in the next layer. You can create an FC Op with:
- model.net.FC([blob_in, weights, bias], blob_out)
Or you can create a Copy Op with:
- model.net.Copy(blob_in, blob_out)
Note that you can also create an operator without spelling out net. For example, just as we created a Copy Op in the previous example, the following code also creates a Copy operator on model.net:
- model.Copy(blob_in, blob_out)
Building your model/network from single operators alone can be painstaking, since you have to handle parameter initialization and device/engine choice all by yourself (but this is also why Caffe2 is so fast!). For example, to build an FC layer you need several lines of code to prepare the weight and bias blobs, which are then fed to the Op.
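The manual route just described might look like the sketch below. It is illustrative rather than canonical: the function name add_fc_by_hand, the blob-name suffixes _w and _b, and the choice of the XavierFill and ConstantFill initializers are assumptions, and model is expected to behave like a Caffe2 ModelHelper (that is, to expose param_init_net and net).

```python
def add_fc_by_hand(model, blob_in, blob_out, dim_in, dim_out):
    """Build one fully-connected layer from raw operators.

    Assumes `model` exposes `param_init_net` and `net`, as a Caffe2
    ModelHelper does. Blob names and initializers are illustrative.
    """
    # Initialize the weight blob (shape: dim_out x dim_in).
    weight = model.param_init_net.XavierFill(
        [], blob_out + "_w", shape=[dim_out, dim_in]
    )
    # Initialize the bias blob (one entry per output unit).
    bias = model.param_init_net.ConstantFill(
        [], blob_out + "_b", shape=[dim_out]
    )
    # Finally, feed input, weight, and bias to the FC operator.
    return model.net.FC([blob_in, weight, bias], blob_out)
```

Three operator calls and two naming decisions for a single layer: this is the bookkeeping that helper functions are meant to absorb.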
Luckily, Caffe2 helper functions are here to help. Helper functions are wrapper functions that create a complete layer for a model. A helper function typically handles parameter initialization, operator definition, and engine selection. Caffe2's default helper functions follow the Python PEP8 naming convention for functions. For example, implementing an FC Op via the helper function fc is much simpler:
- fcLayer = fc(model, blob_in, blob_out, **kwargs) # returns a blob reference
Check out the repo for more cool helper functions!
Now that you’ve been introduced to Ops and Helper Functions, let’s cover how brew can make model building even easier. brew is a smart collection of helper functions. You can use all of Caffe2's helper functions with a single import of the brew module. You can now add an FC layer using:
- from caffe2.python import brew
- brew.fc(model, blob_in, blob_out, ...)
That’s pretty much the same as using the helper function directly; however, brew really starts to shine once your models get more complicated. The following is a LeNet model building example.
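A LeNet-style builder along these lines might look like the sketch below. It assumes 28x28 single-channel inputs (as in MNIST) and ten output classes; the function name add_lenet_model, the layer sizes, and the blob names are assumptions for illustration and not necessarily the original listing.

```python
def add_lenet_model(model, data):
    """Sketch of a LeNet-style network built with brew.

    Assumes 1x28x28 inputs and 10 output classes; sizes are illustrative.
    """
    # Imported here so the sketch can be loaded without Caffe2 installed.
    from caffe2.python import brew

    # Two conv + max-pool stages: 1 -> 20 -> 50 channels, 5x5 kernels.
    conv1 = brew.conv(model, data, "conv1", dim_in=1, dim_out=20, kernel=5)
    pool1 = brew.max_pool(model, conv1, "pool1", kernel=2, stride=2)
    conv2 = brew.conv(model, pool1, "conv2", dim_in=20, dim_out=50, kernel=5)
    pool2 = brew.max_pool(model, conv2, "pool2", kernel=2, stride=2)
    # 28x28 shrinks to 4x4 after the stages above, hence 50 * 4 * 4 inputs.
    fc3 = brew.fc(model, pool2, "fc3", dim_in=50 * 4 * 4, dim_out=500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, "pred", dim_in=500, dim_out=10)
    return brew.softmax(model, pred, "softmax")
```

Note that no weight or bias blob is created by hand anywhere in the function; each brew call takes care of its own parameters.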
Each layer is created using brew, which in turn uses its operator hooks to instantiate each Op.
arg_scope is syntax sugar that sets default argument values for helper functions within its context. For example, to try a different weight initialization in a very deep network, you can either edit every call by hand:
- # change all weight_init here
- brew.conv(model, ..., weight_init=('XavierFill', {}),...)
- # repeat 150 times
- ...
- brew.conv(model, ..., weight_init=('XavierFill', {}),...)
Or, with the help of arg_scope, you can set the default once:
- with brew.arg_scope([brew.conv], weight_init=('XavierFill', {})):
-     brew.conv(model, ...)  # no weight_init needed here!
-     brew.conv(model, ...)
-     ...
As you use brew more often, you may find a need for an Op not currently covered by brew, and you will want to write your own helper function. You can register your helper function with brew to enjoy unified management and syntax sugar.
Simply define your new helper function, register it with brew using the .Register function, then call it with brew.new_helper_function.
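The define, register, call sequence can be illustrated with a toy registry in plain Python. This is a sketch of the idea only, not brew's actual code: HelperRegistry, toy_brew, and my_super_layer are hypothetical names, and a plain list stands in for the model. With the real module, the corresponding steps would be brew.Register(my_super_layer) followed by brew.my_super_layer(model, blob_in, blob_out).

```python
class HelperRegistry:
    """Toy stand-in for a brew-like module: helpers are looked up by name."""

    def __init__(self):
        self._registry = {}

    def Register(self, helper):
        # Use the function's own name as the attribute it will be called by.
        name = helper.__name__
        if name in self._registry:
            raise AttributeError("Helper {} is already registered".format(name))
        self._registry[name] = helper

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            return self.__dict__["_registry"][name]
        except KeyError:
            raise AttributeError(
                "Helper {} is not registered".format(name)
            ) from None


def my_super_layer(model, blob_in, blob_out, **kwargs):
    # Hypothetical helper: record the layer on the toy model and
    # return the output blob name, as helper functions typically do.
    model.append(("my_super_layer", blob_in, blob_out, kwargs))
    return blob_out


toy_brew = HelperRegistry()
toy_brew.Register(my_super_layer)

model = []  # a plain list standing in for a model object
out = toy_brew.my_super_layer(model, "data", "out", scale=2)
```

Because every helper goes through one lookup point, features such as a scope of default arguments can be applied to user-defined helpers and built-in ones alike.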
If you think your helper function might be helpful to the rest of the Caffe2 community, remember to share it, and create a pull request.
To get more details about each of the following helper functions, visit the Operators Catalogue.
- add_weight_decay
- concat
- conv_nd
- depth_concat
- fc
- fc_sparse
- group_conv_deprecated
- instance_norm
- lrn
- max_pool_with_index
- prelu
- spatial_bn
- sum
Thanks for reading the whole overview of brew! Congratulations, you have finally made it here! Long story short, we want to separate the model building process from model storage. In our view, the ModelHelper class should only contain the network definition and parameter information. The brew module holds the functions that build the network and initialize its parameters.
Compared with the previous gigantic CNNModelHelper, which does both model storage and model building, the ModelHelper + brew way of model building is much more modular and easier to extend. In terms of naming, it is also much less confusing, as the Caffe2 family supports a variety of networks, including MLP, RNN and CNN. We hope this tutorial helps you build models faster and more easily while also getting to know Caffe2 in more depth. There is a detailed example of brew usage in python/brew_test.py. If you have any questions about brew, please feel free to contact us by opening an Issue on the repo. Thank you again for embracing the new brew API.