TensorRT configurations can be set via execution provider options. This is useful when each model and inference session needs its own configuration. In this case, execution provider option settings override any environment variable settings. All configurations should be set explicitly; otherwise, default values are used.

* trt_max_workspace_size: maximum workspace size for the TensorRT engine.
* trt_max_partition_iterations: maximum number of iterations allowed in model partitioning for TensorRT. If the target model cannot be successfully partitioned when the maximum number of iterations is reached, the whole model falls back to other execution providers such as CUDA or CPU.
* trt_min_subgraph_size: minimum node size of a subgraph after partitioning. Subgraphs smaller than this fall back to other execution providers.
* trt_fp16_enable: enable FP16 mode in TensorRT. Note: not all Nvidia GPUs support FP16 precision.

Note: for bool-type options, assign them True/False in Python, or 1/0 in C++.

Related environment variables include:

* ORT_TENSORRT_CONTEXT_MEMORY_SHARING_ENABLE
* ORT_TENSORRT_FORCE_SEQUENTIAL_ENGINE_BUILD
* ORT_TENSORRT_INT8_USE_NATIVE_CALIBRATION_TABLE
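A minimal sketch of passing these options when creating a session. The option names come from the list above; the model path `model.onnx` and the specific values chosen are placeholders, and the session-creation call is shown commented out since it requires an onnxruntime build with TensorRT support.

```python
# Hypothetical example: TensorRT execution provider options for ONNX Runtime.
# Bool-type options are assigned True/False in Python (1/0 in C++).
trt_options = {
    "trt_max_workspace_size": 2 * 1024 * 1024 * 1024,  # placeholder: 2 GB, in bytes
    "trt_fp16_enable": True,          # requires a GPU with FP16 support
    "trt_max_partition_iterations": 1000,
    "trt_min_subgraph_size": 1,
}

# Provider priority list: TensorRT first, then CUDA and CPU as the
# fallbacks for subgraphs TensorRT cannot take.
providers = [
    ("TensorrtExecutionProvider", trt_options),
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

# With a TensorRT-enabled onnxruntime installed, the session would be:
# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=providers)
```

Because the options are passed per session, each model can carry its own configuration, and these explicit settings override any environment variables.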