apis¶
- mmdeploy.apis.build_task_processor(model_cfg: mmengine.config.config.Config, deploy_cfg: mmengine.config.config.Config, device: str) → mmdeploy.codebase.base.task.BaseTask[source]¶
Build a task processor to manage the deployment pipeline.
- Parameters
model_cfg (str | mmengine.Config) – Model config file.
deploy_cfg (str | mmengine.Config) – Deployment config file.
device (str) – A string specifying device type.
- Returns
A task processor.
- Return type
BaseTask
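Example
A minimal sketch of building a task processor; the config paths below are illustrative placeholders and must point to existing MMDetection and MMDeploy config files.
>>> from mmengine import Config
>>> from mmdeploy.apis import build_task_processor
>>> model_cfg = Config.fromfile('mmdetection/configs/fcos/'
...                             'fcos_r50_caffe_fpn_gn-head_1x_coco.py')
>>> deploy_cfg = Config.fromfile('configs/mmdet/detection/'
...                              'detection_onnxruntime_dynamic.py')
>>> task_processor = build_task_processor(model_cfg, deploy_cfg, 'cpu')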
- mmdeploy.apis.create_calib_input_data(calib_file: str, deploy_cfg: Union[str, mmengine.config.config.Config], model_cfg: Union[str, mmengine.config.config.Config], model_checkpoint: Optional[str] = None, dataset_cfg: Optional[Union[str, mmengine.config.config.Config]] = None, dataset_type: str = 'val', device: str = 'cpu') → None[source]¶
Create a calibration dataset for post-training quantization.
- Parameters
calib_file (str) – The output calibration data file.
deploy_cfg (str | Config) – Deployment config file or Config object.
model_cfg (str | Config) – Model config file or Config object.
model_checkpoint (str) – A checkpoint path of the PyTorch model. Defaults to None.
dataset_cfg (Optional[Union[str, Config]], optional) – Model config to provide calibration dataset. If none, use model_cfg as the dataset config. Defaults to None.
dataset_type (str, optional) – The dataset type. Defaults to ‘val’.
device (str, optional) – Device to create dataset. Defaults to ‘cpu’.
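Example
A hedged usage sketch; the calibration file name, config paths and checkpoint below are placeholders, and the deploy config is assumed to be an int8/quantization deployment config.
>>> from mmdeploy.apis import create_calib_input_data
>>> create_calib_input_data(
...     'work_dir/calib_data.h5',
...     'configs/mmdet/detection/detection_tensorrt-int8_dynamic-320x320-1344x1344.py',
...     'mmdetection/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py',
...     model_checkpoint='checkpoints/fcos_r50_caffe_fpn_gn-head_1x_coco-821213aa.pth',
...     dataset_type='val',
...     device='cpu')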
- mmdeploy.apis.extract_model(model: Union[str, onnx.onnx_ml_pb2.ModelProto], start_marker: Union[str, Iterable[str]], end_marker: Union[str, Iterable[str]], start_name_map: Optional[Dict[str, str]] = None, end_name_map: Optional[Dict[str, str]] = None, dynamic_axes: Optional[Dict[str, Dict[int, str]]] = None, save_file: Optional[str] = None) → onnx.onnx_ml_pb2.ModelProto[source]¶
Extract a partition model from an ONNX model.
The partition model is defined exactly by the names of its input and output tensors.
Examples
>>> from mmdeploy.apis import extract_model
>>> model = 'work_dir/fastrcnn.onnx'
>>> start_marker = 'detector:input'
>>> end_marker = ['extract_feat:output', 'multiclass_nms[0]:input']
>>> dynamic_axes = {
...     'input': {
...         0: 'batch',
...         2: 'height',
...         3: 'width'
...     },
...     'scores': {
...         0: 'batch',
...         1: 'num_boxes',
...     },
...     'boxes': {
...         0: 'batch',
...         1: 'num_boxes',
...     }
... }
>>> save_file = 'partition_model.onnx'
>>> extract_model(model, start_marker, end_marker,
...               dynamic_axes=dynamic_axes, save_file=save_file)
- Parameters
model (str | onnx.ModelProto) – Input ONNX model to be extracted.
start_marker (str | Sequence[str]) – Start marker(s) to extract.
end_marker (str | Sequence[str]) – End marker(s) to extract.
start_name_map (Dict[str, str]) – A mapping of start names, defaults to None.
end_name_map (Dict[str, str]) – A mapping of end names, defaults to None.
dynamic_axes (Dict[str, Dict[int, str]]) – A dictionary to specify dynamic axes of input/output, defaults to None.
save_file (str) – A file to save the extracted model, defaults to None.
- Returns
The extracted model.
- Return type
onnx.ModelProto
- mmdeploy.apis.get_predefined_partition_cfg(deploy_cfg: mmengine.config.config.Config, partition_type: str)[source]¶
Get the predefined partition config.
Notes
Currently, only the mmdet codebase is supported.
- Parameters
deploy_cfg (mmengine.Config) – Deployment config used to determine the codebase and task type.
partition_type (str) – A string specifying partition type.
- Returns
A dictionary of partition config.
- Return type
dict
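Example
A minimal sketch; the deploy config path and partition type below are illustrative placeholders and must match a partition_config defined for the mmdet codebase.
>>> from mmengine import Config
>>> from mmdeploy.apis import get_predefined_partition_cfg
>>> deploy_cfg = Config.fromfile(
...     'configs/mmdet/detection/single-stage_partition_onnxruntime_static.py')  # placeholder path
>>> partition_cfgs = get_predefined_partition_cfg(deploy_cfg, 'single_stage_base')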
- mmdeploy.apis.inference_model(model_cfg: Union[str, mmengine.config.config.Config], deploy_cfg: Union[str, mmengine.config.config.Config], backend_files: Sequence[str], img: Union[str, numpy.ndarray], device: str) → Any[source]¶
Run inference with a PyTorch or backend model and return the results.
Examples
>>> from mmdeploy.apis import inference_model
>>> model_cfg = ('mmdetection/configs/fcos/'
...              'fcos_r50_caffe_fpn_gn-head_1x_coco.py')
>>> deploy_cfg = ('configs/mmdet/detection/'
...               'detection_onnxruntime_dynamic.py')
>>> backend_files = ['work_dir/fcos.onnx']
>>> img = 'demo.jpg'
>>> device = 'cpu'
>>> model_output = inference_model(model_cfg, deploy_cfg,
...                                backend_files, img, device)
- Parameters
model_cfg (str | mmengine.Config) – Model config file or Config object.
deploy_cfg (str | mmengine.Config) – Deployment config file or Config object.
backend_files (Sequence[str]) – Input backend model file(s).
img (str | np.ndarray) – Input image file or numpy array for inference.
device (str) – A string specifying device type.
- Returns
The inference results.
- Return type
Any
- mmdeploy.apis.torch2onnx(img: Any, work_dir: str, save_file: str, deploy_cfg: Union[str, mmengine.config.config.Config], model_cfg: Union[str, mmengine.config.config.Config], model_checkpoint: Optional[str] = None, device: str = 'cuda:0')[source]¶
Convert PyTorch model to ONNX model.
Examples
>>> from mmdeploy.apis import torch2onnx
>>> img = 'demo.jpg'
>>> work_dir = 'work_dir'
>>> save_file = 'fcos.onnx'
>>> deploy_cfg = ('configs/mmdet/detection/'
...               'detection_onnxruntime_dynamic.py')
>>> model_cfg = ('mmdetection/configs/fcos/'
...              'fcos_r50_caffe_fpn_gn-head_1x_coco.py')
>>> model_checkpoint = ('checkpoints/'
...                     'fcos_r50_caffe_fpn_gn-head_1x_coco-821213aa.pth')
>>> device = 'cpu'
>>> torch2onnx(img, work_dir, save_file, deploy_cfg,
...            model_cfg, model_checkpoint, device)
- Parameters
img (str | np.ndarray | torch.Tensor) – Input image used to assist in converting the model.
work_dir (str) – A working directory to save files.
save_file (str) – Filename to save onnx model.
deploy_cfg (str | mmengine.Config) – Deployment config file or Config object.
model_cfg (str | mmengine.Config) – Model config file or Config object.
model_checkpoint (str) – A checkpoint path of the PyTorch model. Defaults to None.
device (str) – A string specifying device type, defaults to ‘cuda:0’.
- mmdeploy.apis.torch2torchscript(img: Any, work_dir: str, save_file: str, deploy_cfg: Union[str, mmengine.config.config.Config], model_cfg: Union[str, mmengine.config.config.Config], model_checkpoint: Optional[str] = None, device: str = 'cuda:0')[source]¶
Convert a PyTorch model to a TorchScript model.
- Parameters
img (str | np.ndarray | torch.Tensor) – Input image used to assist in converting the model.
work_dir (str) – A working directory to save files.
save_file (str) – Filename to save torchscript model.
deploy_cfg (str | mmengine.Config) – Deployment config file or Config object.
model_cfg (str | mmengine.Config) – Model config file or Config object.
model_checkpoint (str) – A checkpoint path of the PyTorch model. Defaults to None.
device (str) – A string specifying device type, defaults to ‘cuda:0’.
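Example
A usage sketch mirroring the torch2onnx example above; the deploy config path is a placeholder and is assumed to be a TorchScript deployment config.
>>> from mmdeploy.apis import torch2torchscript
>>> img = 'demo.jpg'
>>> work_dir = 'work_dir'
>>> save_file = 'end2end.pt'
>>> deploy_cfg = 'configs/mmdet/detection/detection_torchscript.py'
>>> model_cfg = ('mmdetection/configs/fcos/'
...              'fcos_r50_caffe_fpn_gn-head_1x_coco.py')
>>> model_checkpoint = ('checkpoints/'
...                     'fcos_r50_caffe_fpn_gn-head_1x_coco-821213aa.pth')
>>> torch2torchscript(img, work_dir, save_file, deploy_cfg,
...                   model_cfg, model_checkpoint, device='cpu')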
- mmdeploy.apis.visualize_model(model_cfg: Union[str, mmengine.config.config.Config], deploy_cfg: Union[str, mmengine.config.config.Config], model: Union[str, Sequence[str]], img: Union[str, numpy.ndarray, Sequence[str]], device: str, backend: Optional[mmdeploy.utils.constants.Backend] = None, output_file: Optional[str] = None, show_result: bool = False, **kwargs)[source]¶
Run inference with a PyTorch or backend model and show the results.
Examples
>>> from mmdeploy.apis import visualize_model
>>> model_cfg = ('mmdetection/configs/fcos/'
...              'fcos_r50_caffe_fpn_gn-head_1x_coco.py')
>>> deploy_cfg = ('configs/mmdet/detection/'
...               'detection_onnxruntime_dynamic.py')
>>> model = 'work_dir/fcos.onnx'
>>> img = 'demo.jpg'
>>> device = 'cpu'
>>> visualize_model(model_cfg, deploy_cfg, model,
...                 img, device, show_result=True)
- Parameters
model_cfg (str | mmengine.Config) – Model config file or Config object.
deploy_cfg (str | mmengine.Config) – Deployment config file or Config object.
model (str | Sequence[str]) – Input model or file(s).
img (str | np.ndarray | Sequence[str]) – Input image file or numpy array for inference.
device (str) – A string specifying device type.
backend (Backend) – Specifying backend type, defaults to None.
output_file (str) – Output file to save visualized image, defaults to None. Only valid if show_result is set to False.
show_result (bool) – Whether to show the plotted image in a window. Defaults to False.
apis/tensorrt¶
- mmdeploy.apis.tensorrt.from_onnx(onnx_model: Union[str, onnx.onnx_ml_pb2.ModelProto], output_file_prefix: str, input_shapes: Dict[str, Sequence[int]], max_workspace_size: int = 0, fp16_mode: bool = False, int8_mode: bool = False, int8_param: Optional[dict] = None, device_id: int = 0, log_level: tensorrt.Logger.Severity = tensorrt.Logger.ERROR, **kwargs) → tensorrt.ICudaEngine[source]¶
Create a TensorRT engine from an ONNX model.
- Parameters
onnx_model (str or onnx.ModelProto) – Input ONNX model to convert.
output_file_prefix (str) – The path prefix used to save the output TensorRT engine file.
input_shapes (Dict[str, Sequence[int]]) – The min/opt/max shape of each input.
max_workspace_size (int) – Maximum workspace size of the TensorRT engine; some tactics and layers require a large workspace. Defaults to 0.
fp16_mode (bool) – Specifying whether to enable fp16 mode. Defaults to False.
int8_mode (bool) – Specifying whether to enable int8 mode. Defaults to False.
int8_param (dict) – A dict of parameters for int8 mode. Defaults to None.
device_id (int) – The device ID on which to create the engine. Defaults to 0.
log_level (trt.Logger.Severity) – The log level of TensorRT. Defaults to trt.Logger.ERROR.
- Returns
The TensorRT engine created from onnx_model.
- Return type
tensorrt.ICudaEngine
Example
>>> import tensorrt as trt
>>> from mmdeploy.apis.tensorrt import from_onnx
>>> engine = from_onnx(
...     'onnx_model.onnx',
...     'work_dir/end2end',
...     {'input': {'min_shape': [1, 3, 160, 160],
...                'opt_shape': [1, 3, 320, 320],
...                'max_shape': [1, 3, 640, 640]}},
...     log_level=trt.Logger.WARNING,
...     fp16_mode=True,
...     max_workspace_size=1 << 30,
...     device_id=0)
- mmdeploy.apis.tensorrt.is_available(with_custom_ops: bool = False) → bool¶
Check whether the backend is installed.
- Parameters
with_custom_ops (bool) – Whether to also check that the custom ops are available.
- Returns
True if backend package is installed.
- Return type
bool
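Example
A simple guard for TensorRT-specific code paths.
>>> from mmdeploy.apis.tensorrt import is_available
>>> if is_available(with_custom_ops=True):
...     print('TensorRT backend and custom ops are available')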
- mmdeploy.apis.tensorrt.load(path: str, allocator: Optional[Any] = None) → tensorrt.ICudaEngine[source]¶
Deserialize TensorRT engine from disk.
- Parameters
path (str) – The disk path to read the engine.
allocator (Any) – GPU allocator. Defaults to None.
- Returns
The TensorRT engine loaded from disk.
- Return type
tensorrt.ICudaEngine
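Example
A minimal sketch of deserializing a previously saved engine; the engine path is a placeholder.
>>> from mmdeploy.apis.tensorrt import load
>>> engine = load('work_dir/end2end.engine')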
- mmdeploy.apis.tensorrt.onnx2tensorrt(work_dir: str, save_file: str, model_id: int, deploy_cfg: Union[str, mmengine.config.config.Config], onnx_model: Union[str, onnx.onnx_ml_pb2.ModelProto], device: str = 'cuda:0', partition_type: str = 'end2end', **kwargs)[source]¶
Convert ONNX to TensorRT.
Examples
>>> from mmdeploy.backend.tensorrt.onnx2tensorrt import onnx2tensorrt
>>> work_dir = 'work_dir'
>>> save_file = 'end2end.engine'
>>> model_id = 0
>>> deploy_cfg = ('configs/mmdet/detection/'
...               'detection_tensorrt_dynamic-320x320-1344x1344.py')
>>> onnx_model = 'work_dir/end2end.onnx'
>>> onnx2tensorrt(work_dir, save_file, model_id,
...               deploy_cfg, onnx_model, 'cuda:0')
- Parameters
work_dir (str) – A working directory.
save_file (str) – The base name of the file to save the TensorRT engine, e.g. ‘end2end.engine’.
model_id (int) – Index of input model.
deploy_cfg (str | mmengine.Config) – Deployment config.
onnx_model (str | onnx.ModelProto) – Input ONNX model.
device (str) – A string specifying the CUDA device, defaults to ‘cuda:0’.
partition_type (str) – Specifying partition type of a model, defaults to ‘end2end’.
apis/onnxruntime¶
- mmdeploy.apis.onnxruntime.is_available(with_custom_ops: bool = False) → bool¶
Check whether the backend is installed.
- Parameters
with_custom_ops (bool) – Whether to also check that the custom ops are available.
- Returns
True if backend package is installed.
- Return type
bool
apis/ncnn¶
- mmdeploy.apis.ncnn.from_onnx(onnx_model: Union[onnx.onnx_ml_pb2.ModelProto, str], output_file_prefix: str)[source]¶
Convert ONNX to ncnn.
The inputs of ncnn include a model file and a weight file. An executable program is used to convert the .onnx file to a .param file and a .bin file. The output files are saved to the path specified by output_file_prefix.
Example
>>> from mmdeploy.apis.ncnn import from_onnx
>>> onnx_path = 'work_dir/end2end.onnx'
>>> output_file_prefix = 'work_dir/end2end'
>>> from_onnx(onnx_path, output_file_prefix)
- Parameters
onnx_model (ModelProto | str) – The ONNX model or the path to it.
output_file_prefix (str) – The path to save the output ncnn file.
- mmdeploy.apis.ncnn.is_available(with_custom_ops: bool = False) → bool¶
Check whether the backend is installed.
- Parameters
with_custom_ops (bool) – Whether to also check that the custom ops are available.
- Returns
True if backend package is installed.
- Return type
bool