layers

This module contains many layer classes that we might be interested in using in our models. These layers complement the default PyTorch layers, which we can also use as predefined layers.

Custom fastai modules

class AdaptiveConcatPool2d

Layer that concats AdaptiveAvgPool2d and AdaptiveMaxPool2d.

The output will be 2*sz, or just 2 if sz is None.

The layer applies adaptive average pooling and adaptive max pooling and concatenates the two results. We use this because it gives the model the information of both methods and improves performance. The pooling is called adaptive because it lets us decide what output dimensions we want, instead of choosing the input’s dimensions to fit a desired output size.
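To make the shape behavior concrete, here is a minimal sketch of what such a concat-pooling layer computes, written directly in plain PyTorch rather than using the library class (the ConcatPoolSketch name is just for illustration):

  import torch
  import torch.nn as nn

  class ConcatPoolSketch(nn.Module):
      "Concatenate adaptive max pooling and adaptive average pooling along the channel dimension."
      def __init__(self, sz=1):
          super().__init__()
          self.mp = nn.AdaptiveMaxPool2d(sz)
          self.ap = nn.AdaptiveAvgPool2d(sz)

      def forward(self, x):
          return torch.cat([self.mp(x), self.ap(x)], dim=1)

  x = torch.randn(4, 16, 8, 8)         # a batch of 4 feature maps with 16 channels
  print(ConcatPoolSketch(1)(x).shape)  # torch.Size([4, 32, 1, 1]) -- channels are doubled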

Let’s try training with Adaptive Max Pooling first, then with Adaptive Average Pooling, and finally with the concatenation of the two to see how they fare in performance.

We will first define a simple_cnn variant by changing the source code a bit.

  def simple_cnn_max(actns:Collection[int], kernel_szs:Collection[int]=None,
                     strides:Collection[int]=None) -> nn.Sequential:
      "CNN with `conv_layer` layers defined by `actns`, `kernel_szs` and `strides`"
      nl = len(actns)-1
      kernel_szs = ifnone(kernel_szs, [3]*nl)
      strides    = ifnone(strides, [2]*nl)
      layers = [conv_layer(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
                for i in range(len(strides))]
      layers.append(nn.Sequential(nn.AdaptiveMaxPool2d(1), Flatten()))
      return nn.Sequential(*layers)

  model = simple_cnn_max((3,16,16,2))
  learner = Learner(data, model, metrics=[accuracy])
  learner.fit(1)

Total time: 00:02

Now let’s try with Adaptive Average Pooling.

  def simple_cnn_avg(actns:Collection[int], kernel_szs:Collection[int]=None,
                     strides:Collection[int]=None) -> nn.Sequential:
      "CNN with `conv_layer` layers defined by `actns`, `kernel_szs` and `strides`"
      nl = len(actns)-1
      kernel_szs = ifnone(kernel_szs, [3]*nl)
      strides    = ifnone(strides, [2]*nl)
      layers = [conv_layer(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
                for i in range(len(strides))]
      layers.append(nn.Sequential(nn.AdaptiveAvgPool2d(1), Flatten()))
      return nn.Sequential(*layers)

  model = simple_cnn_avg((3,16,16,2))
  learner = Learner(data, model, metrics=[accuracy])
  learner.fit(1)

Total time: 00:02

Finally we will try with the concatenation of the two. We will see that, in fact, it increases our accuracy and decreases our loss considerably!

  def simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None,
                 strides:Collection[int]=None) -> nn.Sequential:
      "CNN with `conv_layer` layers defined by `actns`, `kernel_szs` and `strides`"
      nl = len(actns)-1
      kernel_szs = ifnone(kernel_szs, [3]*nl)
      strides    = ifnone(strides, [2]*nl)
      layers = [conv_layer(actns[i], actns[i+1], kernel_szs[i], stride=strides[i])
                for i in range(len(strides))]
      layers.append(nn.Sequential(AdaptiveConcatPool2d(1), Flatten()))
      return nn.Sequential(*layers)

  model = simple_cnn((3,16,16,2))
  learner = Learner(data, model, metrics=[accuracy])
  learner.fit(1)

Total time: 00:02

class Lambda

Lambda(func:LambdaFunc) :: PrePostInitMeta

Create a layer that simply calls func with x

This is very useful when we want to use functions as layers in our networks inside a Sequential object. So, for example, say we want to apply a loss function and we need to change the shape of our output batches to be able to use it. We can add a layer that applies the necessary change in shape by calling:

Lambda(lambda x: x.view(x.size(0),-1))

Let’s see an example of how the shape of our output can change when we add this layer.

Without the Lambda layer, the same model would end at nn.AdaptiveAvgPool2d(1) and its output would keep the trailing unit dimensions:

  torch.Size([64, 10, 1, 1])

Adding the Lambda layer flattens them away:

  model = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.AdaptiveAvgPool2d(1),
      Lambda(lambda x: x.view(x.size(0),-1))
  )
  model.cuda()

  for xb, yb in data.train_dl:
      out = (model(*[xb]))
      print(out.size())
      break

  torch.Size([64, 10])

class Flatten

Flatten(full:bool=False) :: PrePostInitMeta

Flatten x to a single dimension, often used at the end of a model. Pass full=True to flatten to a rank-1 tensor.

The Lambda layer we built above is actually implemented in our library as Flatten. We can see that it returns the same size when we run it.

  model = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.AdaptiveAvgPool2d(1),
      Flatten(),
  )
  model.cuda()

  for xb, yb in data.train_dl:
      out = (model(*[xb]))
      print(out.size())
      break

  torch.Size([64, 10])

PoolFlatten

PoolFlatten() → Sequential

Apply nn.AdaptiveAvgPool2d to x and then flatten the result.

We can combine these two final layers (AdaptiveAvgPool2d and Flatten) by using PoolFlatten.

  model = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      PoolFlatten()
  )
  model.cuda()

  for xb, yb in data.train_dl:
      out = (model(*[xb]))
      print(out.size())
      break

Another use we give to the Lambda function is to resize batches with ResizeBatch when we have a layer that expects a different input than what comes from the previous one.

class ResizeBatch

ResizeBatch(*size:int) :: Module

Reshape x to size, keeping batch dim the same size

  a = torch.tensor([[1., -1.], [1., -1.]])[None]
  print(a)

  tensor([[[ 1., -1.],
           [ 1., -1.]]])

  out = ResizeBatch(4)
  print(out(a))

  tensor([[ 1., -1., 1., -1.]])

class Debugger

Debugger() :: Module

A module to debug inside a model.

The debugger module allows us to peek inside a network while it’s training and see in detail what is going on. We can see inputs, outputs and sizes at any point in the network.

For instance, if you run the following:

  model = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      Debugger(),
      nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
      nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1), nn.ReLU(),
  )
  model.cuda()
  learner = Learner(data, model, metrics=[accuracy])
  learner.fit(5)

  /home/ubuntu/fastai/fastai/layers.py(74)forward()
       72     def forward(self,x:Tensor) -> Tensor:
       73         set_trace()
  ---> 74         return x
       75
       76 class StdUpsample(nn.Module):

  ipdb>

class PixelShuffle_ICNR

PixelShuffle_ICNR(ni:int, nf:int=None, scale:int=2, blur:bool=False, norm_type=<NormType.Weight: 3>, leaky:float=None) :: Module

Upsample by scale from ni filters to nf (default ni), using nn.PixelShuffle, icnr init, and weight norm.

class MergeLayer

MergeLayer(dense:bool=False) :: PrePostInitMeta

Merge a shortcut with the result of the module by adding them or concatenating them if dense=True.

class PartialLayer

PartialLayer(func, **kwargs) :: PrePostInitMeta

Layer that applies partial(func, **kwargs).
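For instance, we could wrap a functional activation with extra keyword arguments already bound (a sketch; F is assumed to be torch.nn.functional):

  import torch.nn.functional as F

  layer = PartialLayer(F.leaky_relu, negative_slope=0.1)  # forward(x) calls F.leaky_relu(x, negative_slope=0.1)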

class SigmoidRange

SigmoidRange(low, high) :: PrePostInitMeta

Sigmoid module with range (low, high)

class SequentialEx

SequentialEx(*layers) :: PrePostInitMeta

Like nn.Sequential, but with ModuleList semantics, and can access module input
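For example, MergeLayer is meant to be used inside a container that keeps a reference to its original input, which is exactly what SequentialEx provides. A hedged sketch of a small residual block built this way (assuming a 16-channel input and the conv_layer helper documented below):

  res = SequentialEx(conv_layer(16, 16),
                     conv_layer(16, 16),
                     MergeLayer(dense=False))  # adds the block's input back to its output (a residual connection)

This is essentially what res_block (documented below) builds for us.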

class SelfAttention

Self-attention layer for n-dimensional inputs.

class BatchNorm1dFlat

BatchNorm1dFlat(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

nn.BatchNorm1d, but first flattens leading dimensions

Loss functions

class FlattenedLoss

FlattenedLoss(func, *args, axis:int=-1, floatify:bool=False, is_2d:bool=True, **kwargs)

Same as func, but flattens input and target.

Create an instance of func with args and kwargs. When passing an output and target, it

  • puts axis first in output and target with a transpose
  • casts the target to float if floatify=True
  • squeezes the output to two dimensions if is_2d, otherwise one dimension, squeezes the target to one dimension
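As an illustration, a plain nn.CrossEntropyLoss would reject a 4-d segmentation-style output, while its flattened counterpart below handles it by moving the class axis last and flattening; a sketch with made-up shapes:

  loss_func = CrossEntropyFlat(axis=1)        # class scores live on the channel axis
  output = torch.randn(8, 5, 32, 32)          # batch of 8, 5 classes, 32x32 predictions
  target = torch.randint(0, 5, (8, 32, 32))   # one class index per pixel
  loss = loss_func(output, target)            # both tensors are flattened internally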

BCEFlat

BCEFlat(*args, axis:int=-1, floatify:bool=True, **kwargs)

Same as nn.BCELoss, but flattens input and target.

BCEWithLogitsFlat

BCEWithLogitsFlat(*args, axis:int=-1, floatify:bool=True, **kwargs)

Same as nn.BCEWithLogitsLoss, but flattens input and target.

CrossEntropyFlat

CrossEntropyFlat(*args, axis:int=-1, **kwargs)

Same as nn.CrossEntropyLoss, but flattens input and target.

MSELossFlat

MSELossFlat(*args, axis:int=-1, floatify:bool=True, **kwargs)

Same as nn.MSELoss, but flattens input and target.

class NoopLoss

NoopLoss() :: PrePostInitMeta

Just returns the mean of the output.

class WassersteinLoss

WassersteinLoss() :: PrePostInitMeta

For WGAN.

Helper functions to create modules

bn_drop_lin

The bn_drop_lin function returns a sequence of batch normalization, dropout and a linear layer. This custom layer is usually used at the end of a model.

n_in represents the size of the input, n_out the size of the output, bn whether we want batch norm or not, p how much dropout, and actn (optional parameter) adds an activation function at the end.
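For instance, a small classification head mapping 512 pooled features to 10 classes could look like this (a sketch; the sizes and dropout probability are arbitrary, and we wrap the returned layers in nn.Sequential):

  head = nn.Sequential(*bn_drop_lin(512, 10, bn=True, p=0.5, actn=None))
  # BatchNorm1d(512) -> Dropout(p=0.5) -> Linear(512, 10)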

conv2d

conv2d(ni:int, nf:int, ks:int=3, stride:int=1, padding:int=None, bias=False, init:LayerFunc='kaiming_normal_') → Conv2d

Create and initialize an nn.Conv2d layer. padding defaults to ks//2.

conv2d_trans

Create nn.ConvTranspose2d layer.

conv_layer

conv_layer(ni:int, nf:int, ks:int=3, stride:int=1, padding:int=None, bias:bool=None, is_1d:bool=False, norm_type:Optional[NormType]=<NormType.Batch: 1>, use_activ:bool=True, leaky:float=None, transpose:bool=False, init:Callable='kaiming_normal_', self_attention:bool=False)

The conv_layer function returns a sequence of nn.Conv2d, batch norm and a ReLU or leaky ReLU activation function.

ni represents the number of input channels, nf the number of output channels, ks the kernel size, and stride the stride with which we want to apply the convolutions. bias decides whether the convolution has a bias or not (if None, it defaults to True unless batch norm is used). norm_type selects the type of normalization (or None). If leaky is None, the activation is a standard ReLU, otherwise it’s a LeakyReLU of slope leaky. Finally, if transpose=True, the convolution is replaced by a ConvTranspose2d.
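For example (a sketch; the channel counts are arbitrary):

  down = conv_layer(3, 16, ks=3, stride=2)              # Conv2d -> BatchNorm -> ReLU, halves the spatial size
  up   = conv_layer(16, 16, transpose=True, leaky=0.1)  # ConvTranspose2d variant with a LeakyReLU of slope 0.1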

embedding

embedding(ni:int, nf:int) → Module

Create an embedding layer with input size ni and output size nf.

relu

relu(inplace:bool=False, leaky:float=None)

Return a relu activation, maybe leaky and inplace.

res_block

res_block(nf, dense:bool=False, norm_type:Optional[NormType]=<NormType.Batch: 1>, bottle:bool=False, **conv_kwargs)

Resnet block of nf features. conv_kwargs are passed to conv_layer.

sigmoid_range

sigmoid_range(x, low, high)

Sigmoid function with range (low, high)
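In other words, it rescales a standard sigmoid onto the requested interval; a minimal sketch of an equivalent computation:

  def sigmoid_range_sketch(x, low, high):
      # squash x into (0, 1) with a sigmoid, then rescale the result into (low, high)
      return torch.sigmoid(x) * (high - low) + low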

simple_cnn

simple_cnn(actns:Collection[int], kernel_szs:Collection[int]=None, strides:Collection[int]=None, bn=False) → Sequential

CNN with conv_layer layers defined by actns, kernel_szs and strides, plus batchnorm if bn.

Initialization of modules

batchnorm_2d

batchnorm_2d(nf:int, norm_type:NormType=<NormType.Batch: 1>)

A batchnorm2d layer with nf features initialized depending on norm_type.

icnr

icnr(x, scale=2, init='kaiming_normal_')

ICNR init of x, with scale and init function.
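A typical use is to initialize the convolution that feeds a pixel shuffle upsampler, as PixelShuffle_ICNR does. A hedged sketch (the channel counts are arbitrary, and icnr is assumed to modify the weight in place):

  scale = 2
  conv = nn.Conv2d(32, 32 * scale**2, kernel_size=1)  # output channels = input channels * scale^2
  icnr(conv.weight, scale=scale)                       # ICNR init reduces checkerboard artifacts after nn.PixelShuffle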

trunc_normal_

trunc_normal_(x:Tensor, mean:float=0.0, std:float=1.0) → Tensor

Truncated normal initialization.

NormType

An enumeration of the normalization types (e.g. NormType.Batch, NormType.Weight) used by the norm_type arguments above.