Build for RKNN

This tutorial is based on Ubuntu 18.04 and the Rockchip RK3588 NPU. Other NPU devices may require different RKNN packages. The table below shows which packages go with which device:

Device                  Python Package   C/C++ SDK
RK1808/RK1806           rknn-toolkit     rknpu
RV1109/RV1126           rknn-toolkit     rknpu
RK3566/RK3568/RK3588    rknn-toolkit2    rknpu2
RV1103/RV1106           rknn-toolkit2    rknpu2
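
The table can also be kept as a small lookup, which is handy in setup scripts that need to pick the right package for a board. This is only a sketch: the dictionary is transcribed from the table above, and the helper name `packages_for` is hypothetical, not part of MMDeploy or RKNN.

```python
# Device -> (Python package, C/C++ SDK), transcribed from the table above.
RKNN_PACKAGES = {
    "RK1808": ("rknn-toolkit", "rknpu"),
    "RK1806": ("rknn-toolkit", "rknpu"),
    "RV1109": ("rknn-toolkit", "rknpu"),
    "RV1126": ("rknn-toolkit", "rknpu"),
    "RK3566": ("rknn-toolkit2", "rknpu2"),
    "RK3568": ("rknn-toolkit2", "rknpu2"),
    "RK3588": ("rknn-toolkit2", "rknpu2"),
    "RV1103": ("rknn-toolkit2", "rknpu2"),
    "RV1106": ("rknn-toolkit2", "rknpu2"),
}


def packages_for(device):
    """Return (python_package, cpp_sdk) for a device name, case-insensitively."""
    key = device.strip().upper()
    if key not in RKNN_PACKAGES:
        raise ValueError(f"unknown device {device!r}; known: {sorted(RKNN_PACKAGES)}")
    return RKNN_PACKAGES[key]


print(packages_for("rk3588"))
```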

Installation

It is recommended to create a virtual environment for the project.

  1. Get RKNN-Toolkit2 or RKNN-Toolkit through git. For example, for RKNN-Toolkit2:

    git clone git@github.com:rockchip-linux/rknn-toolkit2.git
    
  2. Install the RKNN Python package following the rknn-toolkit2 doc or the rknn-toolkit doc. When installing it, append --no-deps to the pip command to avoid dependency conflicts. For example, for the RKNN-Toolkit2 package:

    pip install packages/rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl --no-deps
    
  3. Install ONNX==1.8.0 before reinstalling MMDeploy from source following the instructions. Note that the pip dependencies of MMDeploy and RKNN conflict with each other. The suggested package versions for Python 3.6 are:

    protobuf==3.19.4
    onnx==1.8.0
    onnxruntime==1.8.0
    torch==1.8.0
    torchvision==0.9.0
    
  4. Install torch and torchvision using conda. For example:

    conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge

To work with models from MMPretrain, you may need to install it additionally.
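
Because the MMDeploy and RKNN dependencies conflict, it can help to compare what is actually installed against the suggested versions from step 3. The following is a minimal sketch, not an official tool: the function names are hypothetical, and on Python 3.6 it assumes the `importlib_metadata` backport is installed, since `importlib.metadata` only ships with Python 3.8 and later.

```python
# Suggested versions copied from step 3 above.
SUGGESTED = {
    "protobuf": "3.19.4",
    "onnx": "1.8.0",
    "onnxruntime": "1.8.0",
    "torch": "1.8.0",
    "torchvision": "0.9.0",
}


def find_mismatches(installed, suggested=SUGGESTED):
    """Return {name: (installed_version_or_None, suggested_version)} for differing packages."""
    mismatches = {}
    for name, wanted in suggested.items():
        found = installed.get(name)
        # Ignore local build tags such as "+cu111" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Assumption: the importlib_metadata backport is available on older Pythons.
        from importlib_metadata import PackageNotFoundError, version
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, (found, wanted) in find_mismatches(installed_versions(SUGGESTED)).items():
        print(f"{name}: installed {found}, suggested {wanted}")
```

The comparison is kept separate from the lookup so that it can be reused with a version list obtained some other way, for example from `pip freeze`.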

Usage

Example:

python tools/deploy.py \
    configs/mmpretrain/classification_rknn-fp16_static-224x224.py \
    /mmpretrain_dir/configs/resnet/resnet50_8xb32_in1k.py \
    https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth \
    /mmpretrain_dir/demo/demo.JPEG \
    --work-dir ../resnet50 \
    --device cpu

Deployment config

In the deployment config, you can adjust backend_config to suit your needs. An example backend_config for mmpretrain is shown below:

backend_config = dict(
    type='rknn',
    common_config=dict(
        mean_values=None,
        std_values=None,
        target_platform='rk3588',
        optimization_level=3),
    quantization_config=dict(do_quantization=False, dataset=None),
    input_size_list=[[3, 224, 224]])

The contents of common_config are the arguments for rknn.config(). The contents of quantization_config control rknn.build(). Set target_platform to match your own device.
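
To make that mapping concrete, the sketch below drives an RKNN object through config, load, build, and export using the two sub-dicts. It is an illustration of the relationship described above, not MMDeploy's actual conversion code. The function name and file paths are hypothetical, and `input_size_list` is deliberately left out of the `load_onnx` call to keep the sketch small.

```python
# Same values as the example backend_config above.
EXAMPLE_BACKEND_CONFIG = dict(
    type='rknn',
    common_config=dict(
        mean_values=None,
        std_values=None,
        target_platform='rk3588',
        optimization_level=3),
    quantization_config=dict(do_quantization=False, dataset=None),
    input_size_list=[[3, 224, 224]])


def convert_with_backend_config(rknn, backend_config, onnx_path, output_path):
    """Apply a backend_config to an RKNN-like object (hypothetical helper)."""
    # common_config -> rknn.config()
    rknn.config(**backend_config['common_config'])
    steps = [
        ('load_onnx', lambda: rknn.load_onnx(model=onnx_path)),
        # quantization_config -> rknn.build()
        ('build', lambda: rknn.build(**backend_config['quantization_config'])),
        ('export_rknn', lambda: rknn.export_rknn(output_path)),
    ]
    for name, step in steps:
        # The RKNN toolkit methods return 0 on success.
        if step() != 0:
            raise RuntimeError(f'rknn.{name} failed')


# With the toolkit installed, usage might look like this (paths are placeholders):
#   from rknn.api import RKNN
#   rknn = RKNN()
#   convert_with_backend_config(rknn, EXAMPLE_BACKEND_CONFIG,
#                               '/path/to/model.onnx', '/path/to/model.rknn')
#   rknn.release()
```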

Build SDK with Rockchip NPU

Build SDK with RKNPU2

  1. Get rknpu2 through git:

    git clone git@github.com:rockchip-linux/rknpu2.git
    
  2. For Linux, download the gcc cross compiler. The download link in the official rknpu2 user guide is deprecated, so you may use another verified link. After downloading and unpacking the compiler, open a terminal and set RKNN_TOOL_CHAIN and RKNPU2_DEVICE_DIR:

    export RKNN_TOOL_CHAIN=/path/to/gcc/usr
    export RKNPU2_DEVICE_DIR=/path/to/rknpu2/runtime/RK3588

  3. After the above preparation, run the following commands:

cd /path/to/mmdeploy
mkdir -p build && rm -rf build/CM* && cd build
export LD_LIBRARY_PATH=$RKNN_TOOL_CHAIN/lib64:$LD_LIBRARY_PATH
cmake \
    -DCMAKE_TOOLCHAIN_FILE=/path/to/mmdeploy/cmake/toolchains/rknpu2-linux-gnu.cmake \
    -DMMDEPLOY_BUILD_SDK=ON \
    -DCMAKE_BUILD_TYPE=Debug \
    -DOpenCV_DIR=${RKNPU2_DEVICE_DIR}/../../examples/3rdparty/opencv/opencv-linux-aarch64/share/OpenCV \
    -DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
    -DMMDEPLOY_TARGET_DEVICES="cpu" \
    -DMMDEPLOY_TARGET_BACKENDS="rknn" \
    -DMMDEPLOY_CODEBASES=all \
    -DMMDEPLOY_BUILD_TEST=ON \
    -DMMDEPLOY_BUILD_EXAMPLES=ON \
    ..
make && make install
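
The build above depends on two environment variables and on directories derived from them, so a quick check before running cmake can save a failed configure step. The following is a sketch under the assumption that the toolchain and rknpu2 checkout follow the layout implied by the commands above. The function name `preflight` is hypothetical.

```python
import os


def preflight(env=None):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems = [
        f"{name} is not set"
        for name in ("RKNN_TOOL_CHAIN", "RKNPU2_DEVICE_DIR")
        if not env.get(name)
    ]
    if problems:
        return problems

    # Directories that the build commands above refer to.
    expected = {
        "toolchain lib64": os.path.join(env["RKNN_TOOL_CHAIN"], "lib64"),
        "OpenCV config dir": os.path.normpath(os.path.join(
            env["RKNPU2_DEVICE_DIR"], "..", "..", "examples", "3rdparty",
            "opencv", "opencv-linux-aarch64", "share", "OpenCV")),
    }
    for label, path in expected.items():
        if not os.path.isdir(path):
            problems.append(f"{label} not found: {path}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```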

Run the demo with SDK

First make sure that --dump-info was used when converting the model, so that the working directory contains the files required by the SDK, such as pipeline.json.
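
A short check like the one below can confirm that the working directory is complete before pushing it to the device. It is only a sketch: the text above names pipeline.json, while deploy.json and detail.json are an assumption about what --dump-info writes, so adjust the tuple to match your MMDeploy version. The helper name is hypothetical.

```python
import os

# Assumption: only pipeline.json is named in the text above; the other two
# file names are what --dump-info is expected to produce.
SDK_META_FILES = ("deploy.json", "pipeline.json", "detail.json")


def missing_sdk_files(work_dir, required=SDK_META_FILES):
    """Return the required file names that are absent from work_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(work_dir, name))
    ]


if __name__ == "__main__":
    # "../resnet50" is the --work-dir used in the Usage example.
    missing = missing_sdk_files("../resnet50")
    if missing:
        print("missing:", ", ".join(missing), "(re-run deploy.py with --dump-info)")
```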

Use adb push to copy the model directory, the executable, and the shared libraries (.so) to the device.

cd /path/to/mmdeploy
adb push resnet50  /data/local/tmp/resnet50
adb push /mmpretrain_dir/demo/demo.JPEG /data/local/tmp/resnet50/demo.JPEG
cd build
adb push lib /data/local/tmp/lib
adb push bin/image_classification /data/local/tmp/image_classification

Set the environment variable and run the sample.

adb shell
cd /data/local/tmp
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/data/local/tmp/lib
./image_classification cpu ./resnet50  ./resnet50/demo.JPEG
..
label: 65, score: 0.95

Troubleshooting

  • MMDet models.

    YOLOV3 & YOLOX: you may paste the following partition configuration into detection_rknn_static-320x320.py:

    # yolov3, yolox for rknn-toolkit and rknn-toolkit2
    partition_config = dict(
        type='rknn',  # the partition policy name
        apply_marks=True,  # should always be set to True
        partition_cfg=[
            dict(
                save_file='model.onnx',  # name to save the partitioned onnx
                start=['detector_forward:input'],  # [mark_name:input, ...]
                end=['yolo_head:input'],  # [mark_name:output, ...]
                output_names=[f'pred_maps.{i}' for i in range(3)]) # output names
        ])
    

    RTMDet: you may paste the following partition configuration into detection_rknn-int8_static-640x640.py:

    # rtmdet for rknn-toolkit and rknn-toolkit2
    partition_config = dict(
        type='rknn',  # the partition policy name
        apply_marks=True,  # should always be set to True
        partition_cfg=[
            dict(
                save_file='model.onnx',  # name to save the partitioned onnx
                start=['detector_forward:input'],  # [mark_name:input, ...]
                end=['rtmdet_head:output'],  # [mark_name:output, ...]
                output_names=[f'pred_maps.{i}' for i in range(6)]) # output names
        ])
    

    RetinaNet & SSD & FSAF with rknn-toolkit2: you may paste the following partition configuration into detection_rknn_static-320x320.py. Users of rknn-toolkit can use the default config directly.

    # retinanet, ssd for rknn-toolkit2
    partition_config = dict(
        type='rknn',  # the partition policy name
        apply_marks=True,
        partition_cfg=[
            dict(
                save_file='model.onnx',
                start='detector_forward:input',
                end=['BaseDenseHead:output'],
                output_names=[f'BaseDenseHead.cls.{i}' for i in range(5)] +
                [f'BaseDenseHead.loc.{i}' for i in range(5)])
        ])
    
  • The SDK only supports int8 RKNN models, which requires do_quantization=True when converting models.

  • Latency problem.

    For devices running RKNPU, such as rv1126, set pre_compile=True in quantization_config when converting models. Otherwise the latency may be too high for your needs.
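
The last two items both come down to changes in quantization_config. The sketch below derives an int8 variant from an existing backend_config without modifying the original. The helper name is hypothetical, the dataset path is a placeholder for a text file listing calibration images, and `target_platform='rv1126'` is an assumed value for an RKNPU device.

```python
import copy


def with_int8_quantization(backend_config, dataset, pre_compile=False):
    """Return a copy of backend_config with int8 quantization switched on."""
    cfg = copy.deepcopy(backend_config)
    quant = cfg.setdefault('quantization_config', {})
    quant['do_quantization'] = True  # required by the SDK
    quant['dataset'] = dataset       # calibration image list
    if pre_compile:
        quant['pre_compile'] = True  # suggested above for RKNPU devices such as rv1126
    return cfg


base = dict(
    type='rknn',
    common_config=dict(
        mean_values=None,
        std_values=None,
        target_platform='rv1126',  # assumed platform name for illustration
        optimization_level=3),
    quantization_config=dict(do_quantization=False, dataset=None),
    input_size_list=[[3, 224, 224]])

# '/path/to/dataset.txt' is a placeholder, not a file shipped with MMDeploy.
int8_cfg = with_int8_quantization(base, '/path/to/dataset.txt', pre_compile=True)
print(int8_cfg['quantization_config'])
```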
