René's URL Explorer Experiment


Title: TensorRT - Get Started | NVIDIA Developer

Open Graph Title: TensorRT - Get Started

X Title: NVIDIA TensorRT - Get Started

Description: Learn more about NVIDIA TensorRT, get the quick start guide, and check out the latest codes and tutorials.

Open Graph Description: Learn more about NVIDIA TensorRT and check out the latest codes and tutorials.

X Description: Learn more about NVIDIA TensorRT, get the quick start guide, and check out the latest codes and tutorials.

Keywords: (none)

Open Graph URL: https://developer.nvidia.com/tensorrt-getting-started

X: @NVIDIA

Direct link

Domain: developer.nvidia.com
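The fields above (title, Open Graph, and X/Twitter metadata) come straight out of the page's `<head>`. A minimal, stdlib-only sketch of how such fields can be extracted, using a reduced stand-in snippet rather than the page's real markup:

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta> name/property -> content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses property=, Twitter/X cards and plain meta use name=.
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content") is not None:
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Reduced stand-in for the page's <head>, not the actual NVIDIA markup.
html = """<html><head>
<title>TensorRT - Get Started | NVIDIA Developer</title>
<meta property="og:title" content="TensorRT - Get Started">
<meta name="twitter:title" content="NVIDIA TensorRT - Get Started">
<meta property="og:url" content="https://developer.nvidia.com/tensorrt-getting-started">
</head><body></body></html>"""

parser = MetaExtractor()
parser.feed(html)
print(parser.title)             # TensorRT - Get Started | NVIDIA Developer
print(parser.meta["og:title"])  # TensorRT - Get Started
```

Checking both `property` and `name` is what lets one pass pick up Open Graph and Twitter/X card tags together, which is how the two title/description variants above end up side by side.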


Hey, it has a JSON-LD script:
  {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "NVIDIA Developer",
    "url": "https://developer.nvidia.com",
    "logo": "https://www.nvidia.com/en-us/about-nvidia/legal-info/logo-brand-usage/_jcr_content/root/responsivegrid/nv_container_392921705/nv_container_412055486/nv_image.coreimg.100.630.png/1703060329095/nvidia-logo-horz.png",
    "sameAs": [
      "https://github.com/nvidia",
      "https://www.linkedin.com/company/nvidia/",
      "https://x.com/nvidiadeveloper"
    ]
  }
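A block like that can be pulled out and parsed with the stdlib alone. A sketch, where the embedded snippet is a trimmed stand-in for the script quoted above:

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Parse every <script type="application/ld+json"> block into a Python object."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = None  # None means: not currently inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buf is not None:
            self.blocks.append(json.loads("".join(self._buf)))
            self._buf = None

# Trimmed stand-in for the JSON-LD block quoted above.
html = '''<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "NVIDIA Developer", "sameAs": ["https://github.com/nvidia"]}
</script>'''

parser = JsonLdExtractor()
parser.feed(html)
org = parser.blocks[0]
print(org["@type"], org["name"])  # Organization NVIDIA Developer
```

`HTMLParser` hands script bodies through `handle_data` untouched, so the buffered text can go straight into `json.loads`.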

csrf-param: authenticity_token
csrf-token: pzp3OI02ynSH0qTp15pkUte5BF3EPLzCz_PniSq63tBl03Lp-wr4UJ91ZGNCZ1tAx-NhtrTBLaOSiGqSGnXyOg
og:site_name: NVIDIA Developer
og:type: website
og:image: https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/tensorrt-getting-started-og-1200x630.jpg
twitter:image: https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/tensorrt-getting-started-og-1200x630.jpg
twitter:card: summary_large_image
twitter:creator: @NVIDIA
interest: Models / Libraries / Frameworks

Links:

What's New: https://developer.nvidia.com/tensorrt-getting-started#whats-new
Get Started With TensorRT: https://developer.nvidia.com/tensorrt-getting-started#tensorrt
Get Started With TensorRT Frameworks: https://developer.nvidia.com/tensorrt-getting-started#frameworks
Additional Resources: https://developer.nvidia.com/tensorrt-getting-started#resources
NVIDIA Developer Program: https://developer.nvidia.com/developer-program
Download Now: https://developer.nvidia.com/nvidia-tensorrt-download
Documentation: https://docs.nvidia.com/deeplearning/tensorrt/
NVIDIA NIM: https://developer.nvidia.com/blog/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale/
NVIDIA Triton™ Inference Server: https://www.nvidia.com/en-us/ai-data-science/products/triton-inference-server/
NVIDIA AI Enterprise: https://www.nvidia.com/en-us/data-center/products/ai-enterprise/
NVIDIA NGC™: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt
Download Now: https://developer.nvidia.com/nvidia-tensorrt-download
Pull Container From NGC: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt
Documentation: https://docs.nvidia.com/deeplearning/tensorrt/
Getting Started with NVIDIA TensorRT: https://www.youtube.com/watch?v=SlUouzxBldU
Introductory blog: https://developer.nvidia.com/blog/speeding-up-deep-learning-inference-using-tensorrt-updated/
Getting started notebooks: https://github.com/NVIDIA/TensorRT/tree/main/quickstart/IntroNotebooks
Quick-start guide: https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html
Sample code (C++): https://github.com/NVIDIA/TensorRT/tree/main/samples
BERT: https://github.com/NVIDIA/TensorRT/tree/main/demo/BERT
EfficientDet: https://github.com/NVIDIA/TensorRT/tree/main/demo/EfficientDet/notebooks
blog: https://developer.nvidia.com/blog/optimizing-and-serving-models-with-nvidia-tensorrt-and-nvidia-triton/
docs: https://github.com/NVIDIA/TensorRT/tree/main/quickstart/deploy_to_triton
Using quantization aware training (QAT) with TensorRT: https://developer.nvidia.com/blog/achieving-fp32-accuracy-for-int8-inference-using-quantization-aware-training-with-tensorrt/
PyTorch-quantization toolkit: https://github.com/NVIDIA/TensorRT/tree/main/tools/pytorch-quantization
TensorFlow quantization toolkit: https://developer.nvidia.com/blog/accelerating-quantized-networks-with-qat-toolkit-and-tensorrt/
Sparsity with TensorRT: https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/
GitHub: https://github.com/NVIDIA/TensorRT-LLM/tree/rel
Download Now: https://github.com/NVIDIA/TensorRT-LLM/tree/rel
Documentation: https://nvidia.github.io/TensorRT-LLM
Introduction on how TensorRT-LLM supercharges inference: https://developer.nvidia.com/blog/nvidia-tensorrt-llm-supercharges-large-language-model-inference-on-nvidia-h100-gpus
How to get started with TensorRT-LLM: https://developer.nvidia.com/blog/optimizing-inference-on-llms-with-tensorrt-llm-now-publicly-available/
Sample code: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples
Performance benchmarks: https://nvidia.github.io/TensorRT-LLM/performance.html
RAG chatbot on Windows reference project: https://github.com/NVIDIA/trt-llm-rag-windows/tree/release/1.0
GitHub: https://github.com/NVIDIA/TensorRT-Model-Optimizer
Download Now: https://github.com/NVIDIA/TensorRT-Model-Optimizer
Documentation: https://nvidia.github.io/TensorRT-Model-Optimizer/
TensorRT Model Optimizer Quick-Start Guide: https://nvidia.github.io/TensorRT-Model-Optimizer/
Introduction on Model Optimizer: https://developer.nvidia.com/blog/accelerate-generative-ai-inference-performance-with-nvidia-tensorrt-model-optimizer-now-publicly-available/
Optimize Generative AI Inference With Quantization: https://www.nvidia.com/en-us/on-demand/session/gtc24-s63213/
Optimizing Diffusion models with 8-bit quantization: https://developer.nvidia.com/blog/tensorrt-accelerates-stable-diffusion-nearly-2x-faster-with-8-bit-post-training-quantization/
Example code: https://github.com/NVIDIA/TensorRT-Model-Optimizer
NVIDIA AI Enterprise: https://www.nvidia.com/en-us/data-center/products/ai-enterprise/
Contact sales: https://www.nvidia.com/en-us/data-center/products/ai-enterprise/contact-sales/
90-day NVIDIA AI Enterprise evaluation license: https://enterpriseproductregistration.nvidia.com/?LicType=EVAL&ProductFamily=NVAIEnterprise
PyTorch container from the NGC catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
Pull Container From NGC: https://ngc.nvidia.com/catalog/containers/nvidia:pytorch
Documentation: https://nvidia.github.io/Torch-TensorRT/
Getting started with NVIDIA Torch-TensorRT: https://www.youtube.com/watch?v=TU5BMU6iYZ0
Accelerate inference up to 6X in PyTorch: https://developer.nvidia.com/blog/accelerating-inference-up-to-6x-faster-in-pytorch-with-torch-tensorrt/
Object detection with SSD: https://github.com/NVIDIA/Torch-TensorRT/blob/master/notebooks/ssd-object-detection-demo.ipynb
Post-training quantization with Hugging Face BERT: https://pytorch.org/TensorRT/_notebooks/Hugging-Face-BERT.html
Quantization aware training: https://pytorch.org/TensorRT/_notebooks/vgg-qat.html
blog: https://developer.nvidia.com/blog/optimizing-and-serving-models-with-nvidia-tensorrt-and-nvidia-triton/
docs: https://pytorch.org/TensorRT/tutorials/serving_torch_tensorrt_with_triton.html
Using dynamic shapes: https://pytorch.org/TensorRT/_notebooks/dynamic-shapes.html
TensorFlow container from the NGC catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow
Pull Container From NGC: https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow
Documentation: https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html
Getting started with TensorFlow-TensorRT: https://www.youtube.com/watch?v=w7871kMiAs8
Leverage TF-TRT Integration for Low-Latency Inference: https://blog.tensorflow.org/2021/01/leveraging-tensorflow-tensorrt-integration.html
Image classification with TF-TRT: https://www.youtube.com/watch?v=O-_K42EAlP0
Quantization with TF-TRT: https://github.com/tensorflow/tensorrt/blob/master/tftrt/examples-py/PTQ_example.ipynb
blog: https://developer.nvidia.com/blog/optimizing-and-serving-models-with-nvidia-tensorrt-and-nvidia-triton/
docs: https://github.com/tensorflow/tensorrt/tree/master/tftrt/triton
Using dynamic shapes: https://github.com/tensorflow/tensorrt/blob/master/tftrt/examples-py/dynamic_shapes.ipynb
TensorRT-LLM Helps Sweep MLPerf Inference Benchmarks: https://blogs.nvidia.com/blog/2023/09/11/grace-hopper-inference-mlperf/
TensorRT-LLM Supercharges Inference: https://developer.nvidia.com/blog/nvidia-tensorrt-llm-supercharges-large-language-model-inference-on-nvidia-h100-gpus
How to Get Started with TensorRT-LLM: https://developer.nvidia.com/blog/optimizing-inference-on-llms-with-tensorrt-llm-now-publicly-available/
Real-Time NLP With BERT: https://developer.nvidia.com/blog/real-time-nlp-with-bert-using-tensorrt-updated/
Optimizing T5 and GPT-2: https://developer.nvidia.com/blog/optimizing-t5-and-gpt-2-for-real-time-inference-with-tensorrt/
Quantize BERT with PTQ and QAT for INT8 Inference: https://github.com/NVIDIA/FasterTransformer/tree/main/examples/pytorch/bert/bert-quantization-sparsity
ASR With TensorRT: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechRecognition/QuartzNet#inference-process
How to Deploy Real-Time TTS: https://devblogs.nvidia.com/how-to-deploy-real-time-text-to-speech-applications-on-gpus-using-tensorrt/
NLU With BERT Notebook: https://github.com/NVIDIA/TensorRT/tree/main/demo/BERT
Real-Time Text-to-Speech: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/tensorrt
Building an RNN Network Layer by Layer: https://github.com/NVIDIA/TensorRT/tree/main/samples/sampleCharRNN
Optimize Object Detection: https://github.com/NVIDIA/TensorRT/blob/master/demo/EfficientDet/notebooks/EfficientDet-TensorRT8.ipynb
Estimating Depth With ONNX Models and Custom Layers: https://developer.nvidia.com/blog/estimating-depth-beyond-2d-using-custom-layers-on-tensorrt-and-onnx-models/
Speeding Up Inference Using TensorFlow, ONNX, and TensorRT: https://developer.nvidia.com/blog/speeding-up-deep-learning-inference-using-tensorflow-onnx-and-tensorrt/
EfficientDet: https://github.com/NVIDIA/TensorRT/tree/main/samples/python/efficientdet
YOLOv3: https://github.com/NVIDIA/TensorRT/tree/main/samples/python/yolov3_onnx
Using NVIDIA Ampere Architecture and TensorRT: https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/
Achieving FP32 Accuracy in INT8 using Quantization-Aware Training: https://developer.nvidia.com/blog/achieving-fp32-accuracy-for-int8-inference-using-quantization-aware-training-with-tensorrt/
Sign Up: https://www.nvidia.com/en-us/deep-learning-ai/triton-tensorrt-newsletter/
Page URL: https://developer.nvidia.com/tensorrt-getting-started
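The anchor-text/URL pairs above are what a link extractor produces when the two parts are kept separate. A stdlib sketch; the HTML snippet is a hypothetical fragment, not the page's real markup:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect (anchor text, href) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # None means: not currently inside an <a>
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

# Hypothetical fragment standing in for the page's link markup.
html = ('<a href="https://developer.nvidia.com/nvidia-tensorrt-download">Download Now</a>'
        '<a href="https://docs.nvidia.com/deeplearning/tensorrt/">Documentation</a>')

parser = LinkExtractor()
parser.feed(html)
print(parser.links[0])  # ('Download Now', 'https://developer.nvidia.com/nvidia-tensorrt-download')
```

Storing the pair as a tuple rather than concatenating text and URL is what avoids the fused `TextURL` strings that raw dumps tend to produce.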

Viewport: width=device-width,initial-scale=1


URLs of crawlers that visited me.
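One way a crawler log like this could classify visitors is by matching each request's User-Agent against known crawler markers. The marker table and helper below are hypothetical and deliberately incomplete:

```python
# Hypothetical marker table: lowercase substrings that identify well-known crawlers.
KNOWN_CRAWLERS = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "duckduckbot": "DuckDuckGo",
    "gptbot": "OpenAI",
}

def crawler_name(user_agent: str):
    """Return the crawler's name if the user-agent matches a known marker, else None."""
    ua = user_agent.lower()
    for marker, name in KNOWN_CRAWLERS.items():
        if marker in ua:
            return name
    return None

print(crawler_name("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # Google
print(crawler_name("Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"))  # None
```

Substring matching is crude (user-agents can be spoofed), but it is the common first pass before reverse-DNS verification of the crawler's IP.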