Using Free Resources to Prepare Data for Training Object Detection Models with NVIDIA TAO Toolkit
Object detection is a crucial component of computer vision. NVIDIA’s Train Adapt Optimize (TAO) Toolkit, a Python-based AI toolkit, allows users to train, fine-tune, prune, and export AI models customized with their own data. NVIDIA supports its users by providing helpful sample Python notebooks, such as detectnet_v2.ipynb (REF). DetectNet_v2 is an object detection model within TAO, and it requires training data in KITTI format.
This article is a guide to preparing KITTI training data using free resources, so that users can train with their own datasets in NVIDIA TAO Toolkit.
KITTI Format
Using the KITTI format requires data to be organized in this structure (REF):
- The images directory contains the images to train on.
- The labels directory contains the labels to the corresponding images.
Each image and its corresponding label must share the same file ID (the name before the file extension), because the image-to-label correspondence is maintained through this file name. (REF) For example, an image named 000001.png pairs with a label file named 000001.txt.
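A quick way to verify the file ID rule before training is to compare the file names in the two directories. The sketch below is only an illustration, not part of TAO: the function name find_unmatched and the list of image extensions are assumptions, and it assumes that label files use the .txt extension.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unmatched(images_dir, labels_dir):
    """Return (image IDs without a label, label IDs without an image).

    A file ID is the file name without its extension (Path.stem).
    This helper is hypothetical and not part of the TAO Toolkit.
    """
    image_ids = {
        p.stem
        for p in Path(images_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    }
    label_ids = {
        p.stem
        for p in Path(labels_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".txt"
    }
    # Set differences give the IDs present on one side only.
    return sorted(image_ids - label_ids), sorted(label_ids - image_ids)


# Example usage (the paths are placeholders):
# missing_labels, missing_images = find_unmatched("data/images", "data/labels")
# print("Images without labels:", missing_labels)
# print("Labels without images:", missing_images)
```

If both returned lists are empty, every image has a label with the same file ID and vice versa.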