Block-based Benchmark#

Edge devices are usually tightly constrained in terms of resources. Some devices execute certain layers faster than others, and some do not support every layer. This makes it difficult to compare the performance of different devices.

To solve this problem, NeurIO provides a block-based benchmarking framework. The framework compares devices by measuring the time each one takes to execute each layer of a model.
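The idea of timing each layer separately can be illustrated in plain Python. The sketch below is not NeurIO's implementation: `time_blocks` and the toy blocks are hypothetical names, and the workloads run on the host rather than on an edge device. It only shows the principle of measuring each block in isolation and reporting a robust statistic (the median) per block.

```python
import statistics
import time


def time_blocks(blocks, repeats=5):
    """Return the median wall-clock time in seconds for each named callable.

    `blocks` maps a block name to a zero-argument callable (hypothetical
    stand-ins for layers; not part of the NeurIO API).
    """
    timings = {}
    for name, run in blocks.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run()
            samples.append(time.perf_counter() - start)
        timings[name] = statistics.median(samples)
    return timings


# Toy workloads standing in for two different layers.
toy_blocks = {
    "small_block": lambda: sum(i * i for i in range(1_000)),
    "large_block": lambda: sum(i * i for i in range(100_000)),
}

timings = time_blocks(toy_blocks)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache on the first iteration) from dominating the reported figure.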

Layer-based benchmarking#

The basic blocks consist of popular layers used in neural networks. These blocks are:

  • Convolutional 1D layer

  • Convolutional 2D layer

  • Depthwise 2D convolutional layer

  • Convolutional 3D layer

  • Fully connected layer

  • Pooling layer

  • Activation layer

  • Batch normalization layer

  • Reshape layer

  • Padding

The available configurations for each block are as follows:

| Block | Config-1 | Config-2 | Config-3 | Config-4 |
|---|---|---|---|---|
| Convolutional 1D | filters 32, stride 1 | filters 32, stride 2 | filters 64, stride 1 | filters 64, stride 2 |
| Convolutional 2D | filters 32, stride 1 | filters 32, stride 2 | filters 64, stride 1 | filters 64, stride 2 |
| Convolutional 2D | filters 32, 1x1 | filters 32, 3x3 | filters 32, 5x5 | filters 64, 7x7 |
| Depthwise 2D conv | filters 32, stride 1 | filters 32, stride 2 | filters 64, stride 1 | filters 64, stride 2 |
| Convolutional 3D | filters 32, stride 1 | filters 32, stride 2 | filters 64, stride 1 | filters 64, stride 2 |
| Fully connected | 128-32 neurons | 256-64 neurons | 512-128 neurons | 1024-256 neurons |
| AvgPooling1D | 2, stride 1 | 2, stride 2 | 4, stride 2 | 7, stride 1 |
| AvgPooling2D | 2x2, stride 1 | 2x2, stride 2 | 4x4, stride 2 | 7x7, stride 1 |
| MaxPooling1D | 2, stride 1 | 2, stride 2 | 4, stride 2 | 7, stride 1 |
| MaxPooling2D | 2x2, stride 1 | 2x2, stride 2 | 4x4, stride 2 | 7x7, stride 1 |
| Activation | ReLU | Sigmoid | ReLU6 | Tanh |
| Batch norm | 1D | 2D | 3D | 4D |
| Reshape | 1D | 2D | 3D | 4D |
| Padding 1D | Zero | Constant | Reflect | Symmetric |
| Padding 2D | Zero | Constant | Reflect | Symmetric |
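Most convolution rows in the table are the four combinations of two filter counts and two strides. As an illustration, the sketch below enumerates that grid and computes the output length each 1D pooling configuration would produce, using the standard formula floor((n - k) / s) + 1 for unpadded windows. The input length of 32, the no-padding assumption and the helper names are assumptions of this example; this page does not state which input shapes or padding NeurIO uses.

```python
from itertools import product

# Config-1..Config-4 for the stride-based convolution rows:
# filters in (32, 64) combined with stride in (1, 2).
conv_configs = [
    {"filters": filters, "stride": stride}
    for filters, stride in product((32, 64), (1, 2))
]

# (window, stride) pairs for the 1D pooling rows, in table order.
pool_configs = [(2, 1), (2, 2), (4, 2), (7, 1)]


def pooled_length(n, window, stride):
    """Output length of an unpadded pooling window sliding over n samples."""
    return (n - window) // stride + 1


for index, config in enumerate(conv_configs, start=1):
    print(f"Config-{index}: {config}")

# Assumed input length of 32 samples (not specified by this page).
pool_lengths = [pooled_length(32, w, s) for w, s in pool_configs]
print(pool_lengths)  # [31, 16, 15, 26]
```

The stride-2 configurations roughly halve the output length, which is one reason the same block can show very different latencies across configurations.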

Usage#

To use the block-based benchmarking framework, create a LayerBasedBenchmark object, which is used to benchmark the different blocks. The LayerBasedBenchmark object can be created as follows:

from neurio.benchmark import LayerBasedBenchmark

benchmark = LayerBasedBenchmark()
device = ... # Create a device object
results = benchmark.run_on(device)

Once the LayerBasedBenchmark object is created, you can run the benchmark on a connected device by calling run_on, as in the last line of the snippet above. The run_on method runs all the benchmarks on the device and returns a BenchmarkResult object, which contains the results of the benchmark.

Cell-based benchmarking#

Cells are blocks composed of multiple layers. Usually, they are the building blocks of popular neural network topologies. The available cells are:

  • Skip connection (Concatenate and Add)

  • ResNet block

  • DenseNet block

  • Inception block

  • MobileNet block

  • Classifier block

The available configurations for each block are as follows:

| Block | Config-1 | Config-2 | Config-3 | Config-4 |
|---|---|---|---|---|
| Skip connection | Concatenate-2 | Add-2 | Concatenate-4 | Add-4 |
| ResNet block | ConvBlock 64 filters | IID Block 64 filters | ConvBlock 128 filters | IID Block 128 filters |
| DenseNet block | Conv1D 32 filters | Conv1D 64 filters | Conv2D 32 filters | Conv2D 64 filters |
| Inception block | Naive-1D | With reduction-1D | Naive-2D | With reduction-2D |
| MobileNet block | Block 32 filters | Block 64 filters | Block 128 filters | Block 256 filters |
| Classifier block | Avg2D-Flatten-Dense10 | GlAvg-Flatten-Dense10 | | |
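The two skip-connection variants differ in how they merge their inputs: Add sums tensors element-wise and keeps the channel count, while Concatenate stacks them along the channel axis. The shape-only sketch below assumes channels-last feature maps of shape (height, width, channels) and reads the "-2" and "-4" suffixes as the number of merged branches; that reading is an interpretation, since this page does not define the suffixes. The function names are hypothetical and not part of NeurIO.

```python
def add_shape(shapes):
    """Output shape of an element-wise Add: all inputs must match exactly."""
    first = shapes[0]
    if any(shape != first for shape in shapes):
        raise ValueError("Add requires identical input shapes")
    return first


def concatenate_shape(shapes):
    """Output shape of a channel-axis Concatenate (channels-last layout)."""
    spatial = shapes[0][:-1]
    if any(shape[:-1] != spatial for shape in shapes):
        raise ValueError("Concatenate requires matching spatial dimensions")
    return spatial + (sum(shape[-1] for shape in shapes),)


# Hypothetical 16x16 feature map with 32 channels per branch.
branch = (16, 16, 32)

print(add_shape([branch] * 2))          # (16, 16, 32)
print(concatenate_shape([branch] * 2))  # (16, 16, 64)
print(concatenate_shape([branch] * 4))  # (16, 16, 128)
```

Because Concatenate grows the channel dimension with every branch, any layer that follows it sees a larger input than it would after Add, which can affect latency and memory use on constrained devices.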

Usage#

To use the cell-based benchmarking framework, create a CellBasedBenchmark object, which is used to benchmark the different cells. The CellBasedBenchmark object can be used as follows:

from neurio.benchmark import CellBasedBenchmark

benchmark = CellBasedBenchmark()
device = ... # Create a device object
results = benchmark.run_on(device)

Once the CellBasedBenchmark object is created, you can run the benchmark on a connected device by calling run_on, as in the last line of the snippet above. The run_on method runs all the benchmarks on the device and returns a BenchmarkResult object, which contains the results of the benchmark.
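Because some devices may not support every block, a comparison across devices has to cope with missing entries. The sketch below assumes the per-block latencies have already been extracted from each BenchmarkResult into plain dictionaries; how to do that depends on the BenchmarkResult API, which this page does not describe. The function name, block names and latency values are made up for illustration.

```python
def compare_latencies(baseline, candidate):
    """Return the candidate/baseline latency ratio per block.

    Blocks missing from either device (for example, unsupported layers)
    map to None instead of a ratio.
    """
    ratios = {}
    for block in sorted(set(baseline) | set(candidate)):
        if block in baseline and block in candidate:
            ratios[block] = candidate[block] / baseline[block]
        else:
            ratios[block] = None
    return ratios


# Made-up latencies in milliseconds, for illustration only.
device_a = {"conv2d_f32_s1": 4.0, "dense_128_32": 1.0, "conv3d_f32_s1": 20.0}
device_b = {"conv2d_f32_s1": 2.0, "dense_128_32": 1.5}  # no Conv3D entry

ratios = compare_latencies(device_a, device_b)
for block, ratio in ratios.items():
    label = "missing on one device" if ratio is None else f"{ratio:.2f}x"
    print(f"{block}: {label}")
```

A ratio below 1 means the candidate device ran that block faster than the baseline; reporting missing blocks explicitly avoids silently dropping layers that one device cannot run.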