== Introduction to Triton Inference Server ==
From the official [https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/index.html NVIDIA Triton Inference Server documentation]: The Triton Inference Server provides a cloud inferencing solution optimized for both CPUs and GPUs. The server provides an inference service via an HTTP or GRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server. For edge deployments, Triton Server is also available as a shared library with an API that allows the full functionality of the server to be included directly in an application.
Here you will find how to run and test the server on JetPack, step by step.
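
As a quick illustration of the HTTP endpoint mentioned above, the sketch below checks that a running Triton server is live and that a given model is ready, using the <code>tritonclient</code> Python package. This is only a minimal example under assumptions: it assumes the server is reachable at <code>localhost:8000</code> (Triton's default HTTP port) and uses <code>my_model</code> as a placeholder model name; adjust both to match your setup.

<syntaxhighlight lang="python">
import tritonclient.http as httpclient

# Connect to the Triton HTTP endpoint (default port 8000; adjust host/port as needed)
client = httpclient.InferenceServerClient(url="localhost:8000")

# Basic health checks exposed by the server
print("Server live: ", client.is_server_live())
print("Server ready:", client.is_server_ready())

# Check whether a specific model from the model repository is loaded and ready
# ("my_model" is a placeholder; use the name of a model you have deployed)
print("Model ready: ", client.is_model_ready("my_model"))
</syntaxhighlight>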