Product
December 22, 2023

Introducing Enhanced Datasets: Viam's Breakthrough in Simplified Data Curation and Model Training

Written by
Natalia Jacobwitz
Product Manager

Curating data and training models in Viam just got a whole lot more powerful! Now with datasets in Viam, you can organize the data you want to train models on and get valuable information about the dataset before you train. Kicking off a training (or retraining) session has never been easier with the button built into your dataset view. 

To train a model the previous way, you had to filter down to the exact data you wanted to train on, whether it was data within a time range, with certain tags applied, or from a specific component. 

We got feedback that this was confusing, so we wanted to streamline the flow and make sure it was clear to our users exactly which data their model would be training on.

In addition to the confusing filtering-to-train flow, there was no option to manually select which images to train on. Say you had 20 specific images that were clear and each demonstrated something new: you couldn’t choose to train your model on just those 20 images.

What's new with datasets

Data Curation with Datasets

Explore app.viam.com/data/view to see all your organization's images that you synced to the cloud via Viam’s data capture and sync capabilities. 

Open the image details panel by clicking on any image, and use the dropdown for datasets to add the image to existing datasets or create a new named dataset. 

Notice the Datasets dropdown on the bottom right of this image.

Repeat this process for each image, curating the perfect dataset for your needs.

Effortless Labeling and Real-Time Analysis

Label each image during the dataset curation process, or, once you have assembled the dataset, navigate to app.viam.com/data/datasets to label its images individually.

The dataset provides real-time insights into label characteristics, detailing how many images carry each classification tag and each bounding box label.

Especially helpful, the dataset stats show whether any images are missing labels and let you easily filter down to those images.
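To make the idea of these stats concrete, here is a minimal Python sketch of how per-label counts and a list of unlabeled images could be computed. It is an illustration only, not Viam's implementation or SDK: the `ImageRecord` type, its fields, and the `dataset_stats` function are hypothetical names invented for this example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical stand-in for one image in a dataset (not a Viam SDK type)."""
    image_id: str
    tags: list = field(default_factory=list)         # classification tags
    bbox_labels: list = field(default_factory=list)  # one label per bounding box


def dataset_stats(images):
    """Count images per tag and per bounding box label, and find unlabeled images."""
    tag_counts = Counter()
    bbox_counts = Counter()
    unlabeled = []
    for img in images:
        # Use set() so an image with two boxes of the same label counts once.
        tag_counts.update(set(img.tags))
        bbox_counts.update(set(img.bbox_labels))
        if not img.tags and not img.bbox_labels:
            unlabeled.append(img.image_id)
    return {
        "total_images": len(images),
        "images_per_tag": dict(tag_counts),
        "images_per_bbox_label": dict(bbox_counts),
        "unlabeled_image_ids": unlabeled,
    }


# Example with made-up data: three images, one of them missing labels.
example_images = [
    ImageRecord("img-1", tags=["lit"], bbox_labels=["candle", "candle"]),
    ImageRecord("img-2", tags=["unlit"]),
    ImageRecord("img-3"),
]
example_stats = dataset_stats(example_images)
```

Here `example_stats["unlabeled_image_ids"]` plays the role of the "missing labels" filter: it tells you which images still need attention before training.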

Simplified Model Training

Once your dataset is curated, initiating model training is a breeze. 

Click "train model" from within the specific dataset and you'll be guided to a page where you can choose to train a new model or a new version of an existing one.

The 'train a model' page within app.viam.com.

Specify the model type and labels for training, and then give your model a name. 

As you select your model type and training labels, you are proactively informed if your dataset doesn’t have enough images overall, enough images with a specific label, or enough bounding boxes.

The message you'll receive if the number of images you're training against is too small.
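A pre-training check of this kind can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how Viam performs the check: the function name `training_warnings` and the thresholds `min_images` and `min_per_label` are hypothetical, and the actual minimums that Viam enforces are not stated in this post.

```python
def training_warnings(total_images, label_counts, selected_labels,
                      min_images=10, min_per_label=5):
    """Return human-readable warnings if the dataset looks too small to train on.

    min_images and min_per_label are assumed placeholder thresholds,
    not values taken from Viam.
    """
    warnings = []
    if total_images < min_images:
        warnings.append(
            f"dataset has {total_images} images; at least {min_images} needed"
        )
    for label in selected_labels:
        # A label that never appears in the dataset counts as zero images.
        count = label_counts.get(label, 0)
        if count < min_per_label:
            warnings.append(
                f"label '{label}' appears on {count} images; "
                f"at least {min_per_label} needed"
            )
    return warnings


# Example with made-up numbers: a tiny dataset and one label that is never used.
example_warnings = training_warnings(
    total_images=3,
    label_counts={"candle": 3},
    selected_labels=["candle", "flame"],
)
```

An empty list means no warnings; surfacing the messages before training starts is the point, since it avoids spending a training run on a dataset that is too small.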

Then you can kick off the training with the click of a button.

Later, when you look at your trained models, each one links to the dataset it was trained on, so it is clear which data went into the model and easy to understand and replicate training runs.

Easy Repeated Training of Additional Models on the Same Data

Should you find that the resulting model isn’t as good as you would like, or you want to train a slightly different model on the same data, you can navigate back to that named dataset and train or retrain on it.

This is a big improvement over the previous flow, where you had to filter down to the exact data all over again.

Collaborating with Datasets

Because datasets live in your organization, you can share a dataset with anyone who is also an organization owner. You can also add images to a dataset together and split up the tagging and labeling, so you don’t have to do it all yourself.

Taking action and moving forward 

I have personally been using this feature for my robots. In fact, I just trained a machine learning model within minutes using a dataset of only 20 images and a bunch of bounding boxes.

The dataset I trained with for my MenoRobot 2.0 project.

Get started today by following our up-to-date tutorial on how to capture data and train a model using Try Viam. This tutorial will show you how to create a new dataset, add captured images to the dataset, and train on this dataset, all with the updated functionality described in this post. 

Happy curation and training!

