NVIDIA Jetson Nano - Part 2: Image Classification with Machine Learning

By ShawnHymel

The NVIDIA Jetson Nano is a single-board computer based on the NVIDIA Tegra X1 processor, which combines CPU and GPU capabilities. As a result, it is a great starting platform for doing Edge AI.

If you have not done so already, please follow the steps in the previous tutorial to install Linux and configure your Jetson Nano: Getting Started with the NVIDIA Jetson Nano - Part 1: Setup.

Note that for the following demos, you will want to use a keyboard, mouse, and monitor connected directly to the Jetson Nano. Otherwise, the camera feed will be extremely slow over a network connection.

Live Detection Demo

If you downloaded the COCO models in the previous episode, then you can use them to identify and detect objects in real time. To do that, we use the DetectNet-Console tool.

First, make sure you have a camera plugged into your Jetson Nano. This can be a CSI camera (the Raspberry Pi Camera Module V2 reportedly works well) or a USB webcam (the Logitech C920 worked for me).
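
Before starting the demo, it can help to confirm that Linux actually sees your camera. The following is a minimal sketch and not part of jetson-inference: list_video_devices is a hypothetical helper name, and it assumes the camera shows up as a /dev/video* device node (which is what a USB webcam typically does).

```shell
#!/bin/sh
# Hypothetical helper (not part of jetson-inference): print every video
# device node found in a directory, which defaults to /dev.
list_video_devices() {
    dev_dir="${1:-/dev}"
    found=1
    for node in "$dev_dir"/video*; do
        # An unmatched glob stays as literal text, so check that it exists.
        if [ -e "$node" ]; then
            echo "$node"
            found=0
        fi
    done
    # Return non-zero when no device node was found.
    return "$found"
}

# Example: list cameras, or print a hint if none are attached.
list_video_devices /dev || echo "No /dev/video* nodes found; check the camera connection."
```

If nothing is listed, try re-seating the cable or a different USB port before moving on.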

Open a terminal and navigate to the bin directory in aarch64:

cd ~/jetson-inference/build/aarch64/bin/


From there, run the live camera tool. Note that you will need to set the camera parameter to your connected camera: a USB webcam will likely be the device file /dev/video0, or if you’re using a CSI camera, it will be just 0 or 1.

./ --network=coco-dog --camera=/dev/video0


This will start a live feed from your camera. It will also attempt to locate any objects in the frame that match the dog model found in the coco-dog network.

Live image detection and classification

If you use an image of a dog (or a real dog), the program should be able to detect it, label it as a dog, and put a blue bounding box around it.

Training a Model

Because the Nano is an embedded device, it is not nearly as powerful as a modern desktop or server with a dedicated graphics card. As a result, if you plan to train a deep neural network (or other large model) from scratch, we recommend doing so on a laptop, desktop, or server.

NVIDIA has a training interface called DIGITS that makes training networks much easier. This guide will walk you through training deep neural networks from scratch.

That being said, we can do something called “transfer learning” to retrain an existing network. When we do this, we just tweak the model’s parameters to optimize it to our own training data.

To begin, we first need to set up swap space on our SD card so that the system has more memory to work with than the Nano's physical RAM alone. Make sure you have at least 4 GB available on your SD card by running the following command:

df -h
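
If you would rather have the shell do the comparison for you, here is a minimal sketch. The has_free_gb function is a hypothetical helper of my own naming, and I am assuming the swap file will live on the root filesystem (/), which is where /mnt normally sits on a default install.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding PATH has at least
# the requested number of gigabytes (GiB) available.
has_free_gb() {
    path="$1"
    need_gb="$2"
    # df -P prints one POSIX-format line per filesystem; -k reports 1 KiB
    # blocks. Column 4 of the second line is the available space.
    avail_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
    need_kb=$((need_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}

# Example: check the root filesystem for 4 GB of free space.
if has_free_gb / 4; then
    echo "Enough free space for a 4 GB swap file."
else
    echo "Less than 4 GB free; free up space first."
fi
```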


Next, create and enable the swap file:

sudo fallocate -l 4G /mnt/4GB.swap
sudo chmod 0600 /mnt/4GB.swap
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap


If you want the swap file to mount on boot, you will need to modify fstab:

sudo vi /etc/fstab


Scroll to the bottom of this file and press ‘o’ to insert a new line and begin editing. Enter the following line:

/mnt/4GB.swap  none swap sw 0  0
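
If you are not comfortable in vi, the same entry can be appended from the shell. This is only a sketch: append_line_once is a hypothetical helper, and the example below runs against a scratch file rather than the real /etc/fstab, which needs root privileges to modify and is worth backing up first.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless an identical line exists.
append_line_once() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example on a scratch file; the real target would be /etc/fstab.
scratch="$(mktemp)"
append_line_once "$scratch" "/mnt/4GB.swap  none swap sw 0  0"
append_line_once "$scratch" "/mnt/4GB.swap  none swap sw 0  0"
cat "$scratch"    # the entry appears only once
```

Checking for the line first means that running the step twice does not leave a duplicate swap entry behind.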


You can check to see if the swap space mounted with:

swapon -s


You should see the 4GB.swap file listed.
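
Another way to check is to read the total swap size that the kernel reports in /proc/meminfo. The sketch below uses a hypothetical helper named swap_total_mb. Note that the number is the total across every swap device currently enabled, so it may be larger than the 4 GB file alone.

```shell
#!/bin/sh
# Hypothetical helper: print the total swap size in MiB, read from a
# meminfo-style file (defaults to /proc/meminfo on Linux).
swap_total_mb() {
    meminfo="${1:-/proc/meminfo}"
    # The SwapTotal line looks like "SwapTotal:  4194300 kB".
    awk '/^SwapTotal:/ {print int($2 / 1024)}' "$meminfo"
}

# Example: report the swap the kernel currently has enabled.
if [ -r /proc/meminfo ]; then
    echo "Swap enabled: $(swap_total_mb) MiB"
fi
```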

Next, we need to capture images to create our datasets. I’ll be using 3 different sets of images, as I want my network to identify these categories:

  • Background
  • Fork
  • Spoon

Note that if you are training the network to identify objects, you should take pictures of them against similar backgrounds. With so little data, the network will be sensitive to new backgrounds, new lighting, etc.

To use the jetson-inference capture tool, first create our datasets directory and labels file:

cd ~
mkdir datasets
cd ~/datasets
mkdir utensils
cd utensils
touch labels.txt
echo "background" >> labels.txt
echo "fork" >> labels.txt
echo "spoon" >> labels.txt

Note that the categories in the labels file need to be on separate lines and in alphabetical order! You can check them with:

cat labels.txt
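
Checking the order by eye is fine for three labels, but a quick script helps as the list grows. Here is a minimal sketch. The labels_ok function is a hypothetical helper, and it assumes plain byte-order sorting is what you want (so all-lowercase labels, as used here, sort as expected).

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE has no blank lines and its lines are
# in strictly ascending order (sorted, with no duplicates).
labels_ok() {
    file="$1"
    # Reject empty lines.
    if grep -q '^$' "$file"; then
        return 1
    fi
    # sort -c checks order without printing; adding -u also rejects repeats.
    # LC_ALL=C forces plain byte-order comparison.
    LC_ALL=C sort -c -u "$file" 2>/dev/null
}

# Example on a scratch file with the three classes from this tutorial.
labels="$(mktemp)"
printf '%s\n' background fork spoon > "$labels"
labels_ok "$labels" && echo "labels look good"
```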


Next, run the camera-capture tool. If you’re using a USB webcam, you will want to use the /dev/video0 device file. If you’re using a CSI camera, change the camera parameter to 0 or 1 (whichever one works, such as --camera=0). I’m also using a much lower resolution, as it allows for faster training and later classification:

camera-capture --camera=/dev/video0 --width=640 --height=480


In the capture tool, first point the Dataset Path to your ~/datasets/utensils directory. Then, point the Class Labels to the ~/datasets/utensils/labels.txt file. Select your Current Class (e.g. start with “background”). For Current Set, select train. Use the spacebar or button to capture at least 30 images of your intended background.

Capture background images with Jetson nano

Next, change the Set to val (for validation), and take at least 10 more photos of the same background. Then change the Set to test and take another 10 photos of the background.

Repeat this process for your fork and spoon images, each time holding up the desired utensil to the camera. You can move the utensil around slightly, but don’t move it too much, or the model will not be able to train on the images properly.

Taking photos of a fork with Jetson nano

In the end, you should have the following set of images:

  • Background
    • Train: 30 (or more) images
    • Val: 10 (or more) images
    • Test: 10 (or more) images
  • Fork
    • Train: 30 (or more) images
    • Val: 10 (or more) images
    • Test: 10 (or more) images
  • Spoon
    • Train: 30 (or more) images
    • Val: 10 (or more) images
    • Test: 10 (or more) images
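
To double-check these counts without opening every folder, you can tally the files from the shell. This is a sketch under an assumption: I am assuming the capture tool stores images as <dataset>/<set>/<class>/*.jpg (for example ~/datasets/utensils/train/fork/), so adjust the path pattern if your folders look different. The count_images function is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: count the .jpg files stored for one set and class.
# ASSUMPTION: images are laid out as <dataset>/<set>/<class>/*.jpg.
count_images() {
    dir="$1/$2/$3"
    if [ -d "$dir" ]; then
        find "$dir" -type f -name '*.jpg' | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Example: print a small report for the dataset used in this tutorial.
dataset="$HOME/datasets/utensils"
for class in background fork spoon; do
    for split in train val test; do
        echo "$class/$split: $(count_images "$dataset" "$split" "$class") images"
    done
done
```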

Now, it’s time to train! Navigate to the classification directory and run the training program:


cd ~/jetson-inference/python/training/classification/
python --model-dir=utensils ~/datasets/utensils

This can take up to 30 minutes, so be patient (or go get some coffee). When it’s done, we will need to export the model to the Open Neural Network Exchange (ONNX) format:

python --model-dir=utensils

Test It!

With the model trained, we can use it to make classification predictions! Run the following program (changing /dev/video0 for your particular camera):

imagenet-camera --model=utensils/resnet18.onnx --labels=/home/sgmustadio/datasets/utensils/labels.txt --camera=/dev/video0 --width=640 --height=480 --input_blob=input_0 --output_blob=output_0

It can take around 5 minutes for the engine to start up, so be patient with this one, too. Once you get a live stream of your camera, make sure it is facing the background that you trained it on. Then, hold up a fork or spoon in front of the camera. It should be able to identify the utensil!

Classifying spoon with Jetson Nano

Note that it probably won’t be very accurate, since we used a very small training set!

Going Further

Try training the network on different objects! NVIDIA also has a number of other demos in their Hello AI World documentation that we recommend working through.

Key Parts and Components

  • 1597-1732-ND
  • 993-1343-ND
  • P122042-ND
  • 1690-1011-ND