Brewing Models

    In this overview we introduce brew, a lightweight collection of helper functions for building your model. We start by explaining the key concepts of Ops versus Helper Functions. Then we show brew usage, how it acts as an interface to the ModelHelper object, and the arg_scope syntax sugar. Finally, we discuss the motivation for introducing brew.

    Before we dig into brew we should review some conventions in Caffe2 and how layers of a neural network are represented. Deep learning networks in Caffe2 are built up with operators. Typically these operators are written in C++ for maximum performance. Caffe2 also provides a Python API that wraps these C++ operators, so you can more flexibly experiment and prototype. In Caffe2, operators are always presented in a CamelCase fashion, whereas Python helper functions with a similar name are in lowercase. Examples of this are to follow.

    We often refer to an operator as an “Op” and to a collection of operators as “Ops”. For example, the FC Op represents a Fully-Connected operator that has weighted connections to every neuron in the previous layer and to every neuron in the next layer. An Op is created by calling it on a net. For example, you can create a Copy Op with:

    model.net.Copy(blob_in, blob_out)

    You can also create an operator without naming net explicitly. The following code creates the same Copy Op on model.net as the previous example:

    model.Copy(blob_in, blob_out)
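    The shorthand above can be pictured as attribute forwarding. The following is a toy sketch in plain Python, not Caffe2 code: ToyNet and ToyModel are hypothetical stand-ins, and the sketch only assumes that the model hands operator names it does not know to its net.

```python
class ToyNet:
    """Hypothetical stand-in for a net: records every operator call."""

    def __init__(self):
        self.ops = []

    def __getattr__(self, op_type):
        # Only reached when normal lookup fails; treat the name as an operator.
        if op_type.startswith("_"):
            raise AttributeError(op_type)

        def add_op(blob_in, blob_out):
            self.ops.append((op_type, blob_in, blob_out))
            return blob_out

        return add_op


class ToyModel:
    """Hypothetical stand-in for a model that owns a single net."""

    def __init__(self):
        self.net = ToyNet()

    def __getattr__(self, op_type):
        # Forward unknown names to the net, so model.Copy acts on model.net.
        if op_type.startswith("_"):
            raise AttributeError(op_type)
        return getattr(self.net, op_type)


model = ToyModel()
model.net.Copy("a", "b")  # explicit form
model.Copy("b", "c")      # shorthand form, recorded on the same net
print(model.net.ops)
```

    Both calls end up in the same list of recorded operators, which is the behaviour the two Copy examples describe.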

    Building your model with nothing but single operators can be painstaking, since you have to handle parameter initialization and device/engine choice all by yourself (but this is also why Caffe2 is so fast!). For example, to build an FC layer you need several lines of code to prepare the weight and bias, which are then fed to the Op.

    Luckily, Caffe2 helper functions are here to help. Helper functions are wrapper functions that create a complete layer for a model. A helper function typically handles parameter initialization, operator definition, and engine selection. Caffe2's default helper functions follow the Python PEP 8 function naming convention. For example, implementing an FC layer via the helper function fc is much simpler:

    fcLayer = fc(model, blob_in, blob_out, **kwargs)  # returns a blob reference
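    To make the difference concrete, here is a toy sketch in plain Python of what a helper bundles together. It is not Caffe2 code: ToyModel and toy_fc are hypothetical names, the blob names and dimensions are made up, and operators are only recorded as tuples rather than executed.

```python
class ToyModel:
    """Hypothetical stand-in for a model: two lists of recorded operator calls."""

    def __init__(self):
        self.param_init_net = []  # operators that create parameters
        self.net = []             # operators that make up the forward pass


def toy_fc(model, blob_in, blob_out, dim_in, dim_out):
    """Toy helper: parameter initialization and operator definition in one call."""
    weight, bias = blob_out + "_w", blob_out + "_b"
    model.param_init_net.append(("XavierFill", weight, (dim_out, dim_in)))
    model.param_init_net.append(("ConstantFill", bias, (dim_out,)))
    model.net.append(("FC", (blob_in, weight, bias), blob_out))
    return blob_out


# Manual way: prepare the weight and bias yourself, then feed them to the Op.
manual = ToyModel()
manual.param_init_net.append(("XavierFill", "fc1_w", (10, 784)))
manual.param_init_net.append(("ConstantFill", "fc1_b", (10,)))
manual.net.append(("FC", ("data", "fc1_w", "fc1_b"), "fc1"))

# Helper way: a single call records the same three operators.
helped = ToyModel()
out = toy_fc(helped, "data", "fc1", dim_in=784, dim_out=10)

same = manual.net == helped.net and manual.param_init_net == helped.param_init_net
print(out, same)
```

    The helper does nothing the manual version cannot do; it just keeps the parameter naming and initialization in one place so every FC layer is built the same way.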

    Check out the repo for more cool helper functions!

    Now that you’ve been introduced to Ops and Helper Functions, let’s cover how brew can make model building even easier. brew is a smart collection of helper functions. You can use all of Caffe2's helper functions with a single import of the brew module. You can now add an FC layer using:

    from caffe2.python import brew

    brew.fc(model, blob_in, blob_out, ...)

    That’s pretty much the same as using the helper function directly; however, brew really starts to shine once your models get more complicated. A LeNet model building example illustrates this: each layer is created using brew, which in turn uses its operator hooks to instantiate each Op.

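    arg_scope is syntax sugar that sets default argument values for helper functions within its context. Say you want to change the weight initialization of every convolution in a very deep network. You can either edit every call:

    # change all weight_init here
    brew.conv(model, ..., weight_init=('XavierFill', {}), ...)
    # repeat 150 times
    ...
    brew.conv(model, ..., weight_init=('XavierFill', {}), ...)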

    Or, with the help of arg_scope, you can set the default once:

    with brew.arg_scope([brew.conv], weight_init=('XavierFill', {})):
        brew.conv(model, ...)  # no weight_init needed here!
        brew.conv(model, ...)
        ...

    As you use brew more often and find a need for an Op not currently covered by brew, you will want to write your own helper function. You can register your helper function with brew to enjoy unified management and the same syntax sugar.

    Simply define your new helper function, register it with brew using the brew.Register function, then call it as brew.new_helper_function, where new_helper_function is the name of the function you registered.
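    The define, register, call cycle can be sketched with a small registry. This is a plain-Python toy under stated assumptions, not the brew source: ToyBrew is a hypothetical class, new_helper_function is a placeholder helper, "MyOp" is a made-up operator name, and the model is reduced to a list of recorded calls.

```python
class ToyBrew:
    """Hypothetical registry: helpers become attributes once registered."""

    def __init__(self):
        self._registry = {}

    def Register(self, helper):
        name = helper.__name__
        if name in self._registry:
            raise AttributeError("Helper {} already registered".format(name))
        self._registry[name] = helper

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for helper names.
        if name.startswith("_") or name not in self._registry:
            raise AttributeError("Helper function {} not registered".format(name))
        return self._registry[name]


def new_helper_function(model, blob_in, blob_out):
    """Placeholder helper: records a made-up operator and returns the output blob."""
    model.append(("MyOp", blob_in, blob_out))
    return blob_out


toy_brew = ToyBrew()
toy_brew.Register(new_helper_function)

model = []  # toy model: just a list of recorded operator calls
result = toy_brew.new_helper_function(model, "data", "out")
print(result, model)
```

    Keying the registry on the function's own name is what lets the helper be called under the same name it was defined with.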

    If you think your helper function might be helpful to the rest of the Caffe2 community, remember to share it, and create a pull request.

    To get more details about each of these functions, visit the Operators Catalogue.

    Thanks for reading the whole overview of brew! Congratulations, you are finally here! Long story short, we want to separate the model building process from model storage. In our view, the ModelHelper class should only contain the network definition and parameter information. The brew module has the functions that build the network and initialize parameters.

    Compared with the previous gigantic CNNModelHelper, which did both model storage and model building, the ModelHelper + brew way of model building is much more modular and easier to extend. In terms of naming, it is also much less confusing, as the Caffe2 family supports a variety of networks, including MLP, RNN and CNN. We hope this tutorial makes your model building faster and easier while also helping you get to know Caffe2 in more depth. There is a detailed example of brew usage in python/brew_test.py. If you have any questions about brew, please feel free to contact us by opening an Issue on the repo. Thank you again for embracing the new brew API.