    ONNXRuntime-Extensions

    What's ONNXRuntime-Extensions

    Introduction: ONNXRuntime-Extensions is a C/C++ library that extends the capability of ONNX models and of inference with ONNX Runtime, via the ONNX Runtime Custom Operator ABIs. It includes a set of ONNX Runtime custom operators that support common pre- and post-processing steps for vision, text, and NLP models. It supports multiple languages and platforms, such as Python on Windows/Linux/macOS, mobile platforms such as Android and iOS, and WebAssembly. The basic workflow is to first enhance an ONNX model and then run inference with ONNX Runtime and the ONNXRuntime-Extensions package.

    Quickstart

    The library can be used either as a C/C++ library or as a package for higher-level languages such as Python, Java, and C#. To build it as a shared library, use the build.bat or build.sh script in the root folder. The CMake build definition is in the CMakeLists.txt file and can be customized by appending options to build.bat or build.sh, for example build.bat -DOCOS_BUILD_SHARED_LIB=OFF. For more details, please refer to the C API documentation.

    Python installation

    pip install onnxruntime-extensions
    

    A nightly build is also available for the latest features; please refer to the nightly build documentation.

    Usage

    1. Generation of Pre-/Post-Processing ONNX Model

    The onnxruntime-extensions Python package provides a convenient way to generate the ONNX processing graph by converting Hugging Face transformers data processing classes into ONNX format. For more detailed information, please refer to the API help:

    import onnxruntime_extensions
    help(onnxruntime_extensions.gen_processing_models)
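As a rough illustration of the generation step, the sketch below wraps gen_processing_models in a small helper that saves whichever graphs were requested. The helper name export_tokenizer_graphs, its parameters, and the output file names are hypothetical. The sketch also assumes that an empty dict asks for a graph with default options while None skips that graph, and that the function returns a (pre, post) pair; check help() for the version you have installed.

```python
def export_tokenizer_graphs(tokenizer, pre_path=None, post_path=None):
    """Save pre-/post-processing ONNX graphs for a Hugging Face tokenizer.

    Pass a file path for each graph you want; at least one is required.
    (Hypothetical helper, not part of onnxruntime-extensions.)
    """
    if pre_path is None and post_path is None:
        raise ValueError("Request at least one of pre_path or post_path.")

    # Imported lazily: both packages are only needed once graphs are generated.
    import onnx
    from onnxruntime_extensions import gen_processing_models

    # Assumption: {} requests a graph with default options, None skips it.
    pre_model, post_model = gen_processing_models(
        tokenizer,
        pre_kwargs={} if pre_path is not None else None,
        post_kwargs={} if post_path is not None else None,
    )

    saved = []
    for model, path in ((pre_model, pre_path), (post_model, post_path)):
        if model is not None and path is not None:
            onnx.save_model(model, path)
            saved.append(path)
    return saved


# Hypothetical usage (needs the `transformers` package and a tokenizer download):
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained("gpt2")
# export_tokenizer_graphs(tok, pre_path="gpt2_tokenizer_pre.onnx")
```

Which tokenizer and processor classes can be converted depends on the installed onnxruntime-extensions version, so treat the commented usage as a starting point rather than a guaranteed recipe.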
    

    NOTE:

    Generating the processing models requires the ONNX package to be installed. The data processing models generated in this manner can be merged with other models using onnx.compose if needed.

    2. Using Extensions for ONNX Runtime inference

    Python

    There are individual packages for the following languages; please install the one that matches your build.

    import onnxruntime as _ort
    from onnxruntime_extensions import get_library_path as _lib_path
    
    so = _ort.SessionOptions()
    so.register_custom_ops_library(_lib_path())
    
    # Run the ONNX Runtime session as described in the ONNX Runtime docs.
    # sess = _ort.InferenceSession(model, so)
    # sess.run(...)
    

    C++

      // Load the custom op library into the ONNX Runtime engine so it can load ONNX models that use the custom ops.
      Ort::ThrowOnError(Ort::GetApi().RegisterCustomOpsLibrary((OrtSessionOptions*)session_options, custom_op_library_filename, &handle));
    
      // Invoke ONNX Runtime as usual to run the model.
      Ort::Session session(env, model_uri, session_options);
      RunSession(session, inputs, outputs);
    

    Java

    var env = OrtEnvironment.getEnvironment();
    var sess_opt = new OrtSession.SessionOptions();
    
    /* Register the custom ops from onnxruntime-extensions */
    sess_opt.registerCustomOpLibrary(OrtxPackage.getLibraryPath());
    

    C#

    SessionOptions options = new SessionOptions();
    options.RegisterOrtExtensions();
    var session = new InferenceSession(model, options);