
Using Free Resources to Prepare Data for Training Object Detection Models with NVIDIA TAO Toolkit

Huajing Shi
6 min read · Jan 16, 2024


Photo by Florian Roost on Unsplash

Object detection is a core task in computer vision. NVIDIA’s Train Adapt Optimize (TAO) Toolkit, a Python-based AI toolkit, lets users train, fine-tune, prune, and export AI models customized with their own data. NVIDIA also provides sample Python notebooks, such as detectnet_v2.ipynb (REF). DetectNet_v2 is an object detection model within TAO, and it requires training data in KITTI format.

This article shows how to prepare KITTI training data using free resources, so that you can train models on your own datasets with the NVIDIA TAO Toolkit.

KITTI Format

Using the KITTI format requires data to be organized in this structure (REF):

  • The images directory contains the images to train on.
  • The labels directory contains the labels to the corresponding images.

Each image and its corresponding label must share the same file ID (the file name before the extension). This matters because the file name is the only link between an image and its label. (REF)
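The pairing rule is easy to check before training. Below is a minimal sketch, not part of the TAO Toolkit itself, that compares file IDs across the two directories. The function name `find_unpaired`, the list of image extensions, and the assumption that labels are stored as `.txt` files are illustrative choices; adjust them to match your dataset.

```python
from pathlib import Path

# Assumed image extensions (hypothetical default); adjust to your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_unpaired(images_dir, labels_dir, label_ext=".txt"):
    """Compare file IDs (names without extension) in the two directories.

    Returns a tuple of two sorted lists:
      - IDs of images that have no matching label file
      - IDs of labels that have no matching image file
    """
    image_ids = {
        p.stem
        for p in Path(images_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }
    label_ids = {
        p.stem
        for p in Path(labels_dir).iterdir()
        if p.is_file() and p.suffix.lower() == label_ext
    }
    return sorted(image_ids - label_ids), sorted(label_ids - image_ids)
```

Calling `find_unpaired("images", "labels")` on a dataset laid out as described above returns two lists. If both are empty, every image has a label with the same file ID, and every label has an image.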
