R2Inference - NCSDK
Make sure you also check R2Inference's companion project: GstInference
The Intel® Movidius™ Neural Compute SDK (Intel® Movidius™ NCSDK) enables the deployment of deep neural networks on compatible devices such as the Intel® Movidius™ Neural Compute Stick. The NCSDK includes a set of software tools to compile, profile, and validate DNNs (Deep Neural Networks), as well as C/C++ and Python APIs for application development.
The NCSDK has two general uses:
- Profiling, tuning, and compiling DNN models.
- Prototyping user applications that run accelerated on neural compute device hardware, using the NCAPI.
Installation
You can install the NCSDK directly on a system running Linux, in a Docker container, on a virtual machine, or in a Python virtual environment. All the possible installation paths are documented in the official Movidius installation guide.
RidgeRun also provides a wiki page, Intel Movidius NCSDK Installation, covering installation and troubleshooting.
Note: It is recommended to take the Docker container route for the NCSDK installation. Other routes may affect your Python environment, because the installer sometimes uninstalls and reinstalls Python and some common packages such as NumPy or TensorFlow. The Docker installation is very simple and it doesn't affect your environment at all. Please refer to Installation and Configuration with Docker on the Movidius GitHub to jump directly to the Docker section of the installation guide.
Generating a model for R2I
When you use the NCSDK backend you will need a compiled NCS graph file. You can obtain this file from a TensorFlow protobuf (frozen graph) file, or from Caffe prototxt and caffemodel files. mvNCCompile is a tool included with the NCSDK installation that compiles a network and produces a graph file that is compatible with the NCAPI and with the Gst-Inference plugins using the NCSDK backend.
From Caffe model
For example, given a Caffe model (googlenet.caffemodel) and a network description (deploy.prototxt):
mvNCCompile -w googlenet.caffemodel -s 12 deploy.prototxt
From TensorFlow model
For example, you will need a frozen TensorFlow graph (inception_v4_frozen.pb) and the names of the input and output layers of the model:
mvNCCompile -s 12 inception_v4_frozen.pb -in=input -on=InceptionV4/Predictions/Reshape_1
This command outputs the graph and output_expected.npy files, which can be used later with the Gst-Inference plugins.
If you need help generating a frozen TensorFlow model check the Create a model using saved weights from a .ckpt file section on the TensorFlow wiki.
TensorBoard can be used to determine the input and output layer names of an unknown model.
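Once you have the compiled graph, R2Inference can load it through its generic factory/loader/engine flow. The following is a minimal sketch, assuming the umbrella header r2i/r2i.h and the r2i::FrameworkCode::NCSDK enum value (names may differ slightly between R2Inference versions; check the R2Inference API reference):

#include <r2i/r2i.h>

r2i::RuntimeError error;

/* Create a factory for the NCSDK backend (enum value assumed) */
auto factory = r2i::IFrameworkFactory::MakeFactory (r2i::FrameworkCode::NCSDK, error);

/* Load the NCS graph file produced by mvNCCompile */
auto loader = factory->MakeLoader (error);
auto model = loader->Load ("graph", error);

/* Create the engine and start it; device, graph, and FIFO handling
 * happen inside R2Inference */
auto engine = factory->MakeEngine (error);
error = engine->SetModel (model);
error = engine->Start ();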
Tools
mvNCCheck
Checks the validity of a Caffe or TensorFlow model on a neural compute device. The check is done by running inference both on the device and in software, and then comparing the results to determine whether the network passes or fails. This tool works best with image classification networks. You can check all the available options in the official Movidius GitHub documentation.
For example, let's test the GoogLeNet Caffe model downloaded from the Movidius GitHub ncappzoo repo:
mvNCCheck -w bvlc_googlenet.caffemodel -i ../../data/images/nps_electric_guitar.png -s 12 -id 546 deploy.prototxt -S 255 -M 110
- -w indicates the weights file
- -i the input image
- -s the number of shaves
- -id the expected label id for the input image (you can find the id for any ImageNet model in imagenet1000_clsidx_to_labels.txt)
- -S is the scaling size
- -M is the mean subtracted from the input after scaling
Most of these parameters are available from the model documentation. The command produces the following result:
Blob generated
USB: Transferring Data...
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.
Result: (1000,)
1) 546 0.99609
2) 402 0.0038853
3) 420 8.9228e-05
4) 327 0.0
5) 339 0.0
Expected: (1000,)
1) 546 0.99609
2) 402 0.0039177
3) 420 9.0837e-05
4) 889 1.2875e-05
5) 486 5.3644e-06
------------------------------------------------------------
Obtained values
------------------------------------------------------------
Obtained Min Pixel Accuracy: 0.0032552085031056777% (max allowed=2%), Pass
Obtained Average Pixel Accuracy: 7.264380030846951e-06% (max allowed=1%), Pass
Obtained Percentage of wrong values: 0.0% (max allowed=0%), Pass
Obtained Pixel-wise L2 error: 0.00011369892179413199% (max allowed=1%), Pass
Obtained Global Sum Difference: 7.236003875732422e-05
------------------------------------------------------------
mvNCCompile
Compiles the network and weights files from Caffe or TensorFlow models into a graph file that is compatible with the NCAPI. For examples, check the Generating a model for R2I section.
mvNCProfile
Compiles a network, runs it on a connected neural compute device, and outputs profiling info to the terminal and to an HTML file. The profiling data contains layer performance and execution time of the model. The HTML version of the report also contains a graphical representation of the model. For example, to profile the GoogLeNet network:
mvNCProfile deploy.prototxt -s 12
The output looks like:
mvNCProfile v02.00, Copyright @ Intel Corporation 2017

****** WARNING: using empty weights ******
Layer inception_3b/1x1 forced to im2col_v2, because its output is used in concat
/usr/local/bin/ncsdk/Controllers/FileIO.py:65: UserWarning: You are using a large type. Consider reducing your data sizes for best performance
Blob generated
USB: Transferring Data...
Time to Execute : 115.95 ms
USB: Myriad Execution Finished
Time to Execute : 98.03 ms
USB: Myriad Execution Finished
USB: Myriad Connection Closing.
USB: Myriad Connection Closed.

Network Summary

Detailed Per Layer Profile
                                        Bandwidth   time
#    Name                      MFLOPs   (MB/s)      (ms)
=======================================================================
0    data                         0.0   55877.1     0.005
1    conv1/7x7_s2               236.0    2453.0     5.745
2    pool1/3x3_s2                 1.8    1346.8     1.137
3    pool1/norm1                  0.0     711.3     0.538
4    conv2/3x3_reduce            25.7     471.6     0.828
5    conv2/3x3                  693.6     305.9    11.957
6    conv2/norm2                  0.0     771.6     1.488
7    pool2/3x3_s2                 1.4    1403.3     0.818
8    inception_3a/1x1            19.3     554.6     0.560
9    inception_3a/3x3_reduce     28.9     458.3     0.703
10   inception_3a/3x3           173.4     319.2     4.716
11   inception_3a/5x5_reduce      4.8    1035.8     0.283
12   inception_3a/5x5            20.1     716.0     0.872
13   inception_3a/pool            1.4     648.5     0.443
14   inception_3a/pool_proj       9.6     657.0     0.455
15   inception_3b/1x1            51.4     446.0     0.999
16   inception_3b/3x3_reduce     51.4     445.1     1.001
17   inception_3b/3x3           346.8     261.0     8.228
18   inception_3b/5x5_reduce     12.8     879.9     0.453
19   inception_3b/5x5           120.4     536.8     2.510
20   inception_3b/pool            1.8     678.7     0.564
21   inception_3b/pool_proj      25.7     631.2     0.656
22   pool3/3x3_s2                 0.8    1213.8     0.591
23   inception_4a/1x1            36.1     364.0     0.977
24   inception_4a/3x3_reduce     18.1     490.3     0.545
25   inception_4a/3x3            70.4     306.0     2.187
26   inception_4a/5x5_reduce      3.0     763.2     0.254
27   inception_4a/5x5             7.5     455.1     0.414
28   inception_4a/pool            0.8     604.6     0.297
29   inception_4a/pool_proj      12.0     613.0     0.389
30   inception_4b/1x1            32.1     349.6     0.995
31   inception_4b/3x3_reduce     22.5     385.6     0.780
32   inception_4b/3x3            88.5     280.9     2.888
33   inception_4b/5x5_reduce      4.8     576.7     0.373
34   inception_4b/5x5            15.1     339.7     0.885
35   inception_4b/pool            0.9     617.8     0.310
36   inception_4b/pool_proj      12.8     579.5     0.438
37   inception_4c/1x1            25.7     415.5     0.762
38   inception_4c/3x3_reduce     25.7     410.3     0.771
39   inception_4c/3x3           115.6     288.2     3.462
40   inception_4c/5x5_reduce      4.8     574.7     0.374
41   inception_4c/5x5            15.1     339.7     0.885
42   inception_4c/pool            0.9     615.3     0.311
43   inception_4c/pool_proj      12.8     577.3     0.440
44   inception_4d/1x1            22.5     382.9     0.786
45   inception_4d/3x3_reduce     28.9     489.2     0.679
46   inception_4d/3x3           146.3     402.9     2.981
47   inception_4d/5x5_reduce      6.4     728.9     0.305
48   inception_4d/5x5            20.1     408.5     0.979
49   inception_4d/pool            0.9     629.5     0.304
50   inception_4d/pool_proj      12.8     630.8     0.403
51   inception_4e/1x1            53.0     297.7     1.531
52   inception_4e/3x3_reduce     33.1     277.0     1.294
53   inception_4e/3x3           180.6     290.3     4.902
54   inception_4e/5x5_reduce      6.6     492.8     0.466
55   inception_4e/5x5            40.1     378.6     1.322
56   inception_4e/pool            0.9     633.0     0.312
57   inception_4e/pool_proj      26.5     446.8     0.731
58   pool4/3x3_s2                 0.4    1245.4     0.250
59   inception_5a/1x1            20.9     616.4     0.786
60   inception_5a/3x3_reduce     13.0     569.7     0.582
61   inception_5a/3x3            45.2     570.7     1.786
62   inception_5a/5x5_reduce      2.6     329.2     0.391
63   inception_5a/5x5            10.0     459.6     0.601
64   inception_5a/pool            0.4     531.7     0.146
65   inception_5a/pool_proj      10.4     514.9     0.546
66   inception_5b/1x1            31.3     607.0     1.133
67   inception_5b/3x3_reduce     15.7     612.0     0.625
68   inception_5b/3x3            65.0     606.1     2.366
69   inception_5b/5x5_reduce      3.9     375.0     0.410
70   inception_5b/5x5            15.1     475.0     0.866
71   inception_5b/pool            0.4     531.7     0.146
72   inception_5b/pool_proj      10.4     513.7     0.547
73   pool5/7x7_s1                 0.1     405.5     0.236
74   loss3/classifier             0.0    2559.7     0.764
75   prob                         0.0      10.0     0.192
---------------------------------------------------------------------------------------------
Total inference time                   93.66
---------------------------------------------------------------------------------------------
Generating Profile Report 'output_report.html'...
API
You can find the full documentation of the C API and Python API at:
- Intel® Movidius™ Neural Compute SDK C API v2
- Intel® Movidius™ Neural Compute SDK Python API v2
Gst-Inference uses only the C API, and R2Inference takes care of devices, graphs, models, and FIFOs. Because of this, we will only look at the options that you can change when using the C API through R2Inference.
R2Inference changes the framework options via the "IParameters" class. First, you need to create a parameters object:
r2i::RuntimeError error;
std::shared_ptr<r2i::IParameters> parameters = factory->MakeParameters (error);
Then call the "Set" or "Get" virtual functions:
parameters->Set (<option>, <value>);
parameters->Get (<option>, <value>);
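For example, here is a hedged sketch that reads one of the read-only device options and tunes the global log level, using the option names listed in the tables below. It assumes integer and string overloads of Set and Get, and that the parameters object has already been configured with an engine and model as the backend requires:

int log_level = 0;
std::string device_name;

/* Read a read-only device option */
error = parameters->Get ("device-name", device_name);

/* Write and read back a global R/W option */
error = parameters->Set ("log-level", 2);
error = parameters->Get ("log-level", log_level);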
Device Options
All the device options are read-only.
Property | C API Counterpart | Value | Description |
---|---|---|---|
thermal-throttling-level | NC_RO_DEVICE_THERMAL_THROTTLING_LEVEL | Integer (0,1,2) | Thermal throttling level: 0 = no throttling, 1 = lower temperature guard threshold reached (throttling in action between inferences), 2 = upper temperature guard threshold reached (aggressive throttling in action between inferences). |
device-state | NC_RO_DEVICE_STATE | Integer (0,1,2,3) | The current state of the device, from the ncDeviceState_t enum: 0 = created, 1 = opened, 2 = closed, 3 = destroyed. |
current-memory-used | NC_RO_DEVICE_CURRENT_MEMORY_USED | Integer | Current memory used on the device. |
memory-size | NC_RO_DEVICE_MEMORY_SIZE | Integer | Total memory available on the device. |
max-fifo-num | NC_RO_DEVICE_MAX_FIFO_NUM | Integer | Max number of FIFOs. |
allocated-fifo-num | NC_RO_DEVICE_ALLOCATED_FIFO_NUM | Integer | Number of FIFOs currently allocated. |
max-graph-num | NC_RO_DEVICE_MAX_GRAPH_NUM | Integer | Max number of graphs. |
allocated-graph-num | NC_RO_DEVICE_ALLOCATED_GRAPH_NUM | Integer | Number of graphs currently allocated. |
option-class-limit | NC_RO_DEVICE_OPTION_CLASS_LIMIT | Integer | Highest option class supported. |
device-name | NC_RO_DEVICE_NAME | String | Device name. |
FIFO Options
Most of the R/W options on the FIFO can only be modified between creation and allocation. R2Inference performs both steps in a single method (Engine->Start()), so it is not possible to write to these options. R2Inference also fixes those options to our specific implementation, so they are not exposed to the plugin.
Global Options
Pay special attention to the log level enumeration, because it is ordered counter-intuitively: 1 is actually the highest log level, 4 is the lowest, and 0 is the default.
Property | C API Counterpart | Value | Description |
---|---|---|---|
log-level | NC_RW_LOG_LEVEL | Integer | NCSDK debug log level, from the ncLogLevel_t enum. |
Graph Options
Property | C API Counterpart | Value | Description |
---|---|---|---|
graph-state | NC_RO_GRAPH_STATE | Integer | The current state of the graph, from the ncGraphState_t enum (created, allocated, waiting for buffers, running). |
graph-input-count | NC_RO_GRAPH_INPUT_TENSOR_DESCRIPTORS | Integer | The C API returns an array of graph input tensor descriptors; R2Inference returns the size of that array instead of the array itself. |
graph-output-count | NC_RO_GRAPH_OUTPUT_TENSOR_DESCRIPTORS | Integer | The C API returns an array of graph output tensor descriptors; R2Inference returns the size of that array instead of the array itself. |
graph-debug-info | NC_RO_GRAPH_DEBUG_INFO | String | Debug information. |
graph-name | NC_RO_GRAPH_NAME | String | Graph name. |
graph-option-class-limit | NC_RO_GRAPH_OPTION_CLASS_LIMIT | Integer | The highest option class supported. |
graph-version | NC_RO_GRAPH_VERSION | String | The version ([major, minor]) of the compiled graph. |