In robotics, collecting training data dominates discussions about building truly autonomous machines. Companies build expensive custom infrastructure—domain randomization pipelines, synthetic data generation, elaborate sensor rigs—in order to produce simulated data in controlled environments. There are use cases where this level of investment makes sense: autonomous vehicles testing dangerous scenarios, surgical robots, or robotic systems in space that can't (or shouldn’t) be validated in orbit.
At the same time, companies often underinvest in operational data gathered from live robots deployed with real customers, even though it is cheaper to collect and more relevant to real-world performance. Most teams understand that learning from deployed robots is valuable. The challenge is how to collect, query, and act on that data without building an entire data infrastructure from scratch to support it.
Operational data: learning from robots in the field
Unlike training data collected in controlled lab environments or generated through simulation, operational data comes from your robots during their normal operation in the real world. This includes performance metrics under varying conditions, failure modes you didn't anticipate, usage patterns from actual customers, and environmental variables that affect behavior.
Because this data comes from machines already in the field, it's inherently cheaper to collect than building specialized training data pipelines. More importantly, it's directly relevant to the deployment scenarios you care about—real customers, real environments, real edge cases—making it immediately applicable to improving product performance, providing efficient support, and executing predictive maintenance.
What 'good' looks like: infrastructure that gets out of your way
The infrastructure that enables operational intelligence has specific characteristics:
- Automatic synchronization from edge to cloud. Data flows from deployed robots to your analysis environment without manual intervention. No SSH-ing into individual machines. No physical media retrieval. The system handles intermittent connectivity gracefully—robots operate offline, and data syncs automatically when the connection is restored.
- Fleet-wide query capabilities. You need to ask questions across your entire deployment, such as: Show me all instances where motor temperature exceeded X in the past week. Or: What's the correlation between firmware version and collection success rate? You should be able to answer these with familiar tools like SQL rather than custom scripts, and with out-of-the-box visualization tools so the results are easy to interpret and share across a team.
- Correlation across hardware, software, and environmental variables. The most valuable insights emerge from understanding relationships: How does battery voltage affect performance degradation? Which environmental conditions correlate with specific failure modes? Your infrastructure should make these explorations natural rather than requiring custom data pipeline work.
- Low engineering overhead. If you need dedicated engineers to collect and analyze operational data, you'll never do it consistently. The best infrastructure is configured once and works automatically in the background while your team focuses on product development.
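The first characteristic above, syncing through intermittent connectivity, is often handled with a store-and-forward buffer: readings are queued locally and uploaded in a batch once the link returns. The following is a minimal sketch of that idea, not any particular platform's API. The class name `StoreAndForwardBuffer`, the `upload` callable, and the convention that it raises `ConnectionError` while offline are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Reading = Dict[str, float]


class StoreAndForwardBuffer:
    """Queue readings locally and upload them when a connection is available."""

    def __init__(self, upload: Callable[[List[Reading]], None], max_size: int = 10_000):
        # `upload` is a hypothetical callable that raises ConnectionError when offline.
        self._upload = upload
        # With maxlen set, the oldest readings are dropped first if the robot
        # stays offline long enough to fill the buffer.
        self._pending: Deque[Reading] = deque(maxlen=max_size)

    def capture(self, reading: Reading) -> None:
        """Record a reading locally; never blocks on the network."""
        self._pending.append(reading)

    def sync(self) -> int:
        """Try to upload everything pending; return how many readings were sent."""
        if not self._pending:
            return 0
        batch = list(self._pending)
        try:
            self._upload(batch)
        except ConnectionError:
            return 0  # Still offline: keep the data and retry on the next call.
        for _ in batch:
            self._pending.popleft()  # Discard only what was actually uploaded.
        return len(batch)

    @property
    def pending(self) -> int:
        return len(self._pending)


# Simulated cloud endpoint and network link (illustrative only).
cloud: List[Reading] = []
link = {"online": False}


def fake_upload(batch: List[Reading]) -> None:
    if not link["online"]:
        raise ConnectionError("no network")
    cloud.extend(batch)


buffer = StoreAndForwardBuffer(fake_upload)
buffer.capture({"motor_temp_c": 61.5})
buffer.capture({"motor_temp_c": 63.0})

sent_offline = buffer.sync()  # Link is down, nothing is sent or lost.
link["online"] = True
sent_online = buffer.sync()   # Link is back, the queued batch goes out.
```

A production system would also persist the queue to disk so data survives a reboot; that is left out here to keep the sketch short.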
Case study: Tennibot’s weekly development cycles
Tennibot is a robotics and AI company managing thousands of robotic ball machines and ball collectors for tennis, padel, and pickleball. The team, under the direction of CEO and founder Haitham Eletrabi, captures operational data from every machine to refine their ball collection and shooting algorithms, optimize robot behavior for different court surfaces and weather conditions, and identify performance patterns across thousands of training sessions.
Tennibot uses Viam's data capture and synchronization capabilities to automatically collect sensor data, performance metrics, and operational logs from their fleet. Data flows from robots to cloud storage without manual intervention. Viam's query interface lets the team ask fleet-wide questions using SQL: At X RPM, what's the success rate for ball collection on clay courts?
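To make the shape of such a question concrete, here is a rough sketch of the clay-court query run against an in-memory SQLite table. The table name `collection_attempts`, its columns, and the sample rows are invented for illustration; they are not Tennibot's schema or data, and the sketch does not use Viam's query interface.

```python
import sqlite3
from typing import Optional, Tuple

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per ball collection attempt.
conn.execute(
    """
    CREATE TABLE collection_attempts (
        robot_id      TEXT,
        court_surface TEXT,
        roller_rpm    INTEGER,
        success       INTEGER  -- 1 if the ball was collected, 0 otherwise
    )
    """
)

# Made-up sample rows standing in for synced fleet data.
rows = [
    ("bot-1", "clay", 1200, 1),
    ("bot-1", "clay", 1200, 0),
    ("bot-2", "clay", 1200, 1),
    ("bot-2", "clay", 1200, 1),
    ("bot-2", "hard", 1200, 1),
    ("bot-3", "clay", 1500, 0),
]
conn.executemany("INSERT INTO collection_attempts VALUES (?, ?, ?, ?)", rows)


def success_rate(
    conn: sqlite3.Connection, surface: str, rpm: int
) -> Tuple[Optional[float], int]:
    """Return (success rate, number of attempts) across the whole fleet."""
    rate, attempts = conn.execute(
        """
        SELECT AVG(success), COUNT(*)
        FROM collection_attempts
        WHERE court_surface = ? AND roller_rpm = ?
        """,
        (surface, rpm),
    ).fetchone()
    return rate, attempts  # rate is None when no attempts match


rate, attempts = success_rate(conn, "clay", 1200)
```

Because the query filters on surface and RPM rather than on a robot ID, the answer covers every machine that has synced data, which is the point of fleet-wide querying.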
They can correlate hardware performance metrics (motor temperature, battery voltage, wheel speed) with software behavior (algorithm decisions, trajectory calculations) and environmental variables (court surface, lighting conditions, player skill level) to identify optimization opportunities and predict performance across different scenarios.
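One simple way to start this kind of exploration is a Pearson correlation between a hardware metric and a performance metric. The sketch below assumes per-session averages of battery voltage and collection success rate have already been pulled from the fleet. The numbers are invented for illustration and are not Tennibot data.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-session averages (illustrative values only).
battery_voltage = [12.6, 12.1, 11.8, 11.4, 11.0]
collection_success = [0.95, 0.93, 0.90, 0.84, 0.80]

r = pearson(battery_voltage, collection_success)
```

A coefficient near 1 or -1 only flags a relationship worth investigating. It does not show that one variable causes the other, so a result like this would typically be followed by a controlled test.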
When they identify an optimization (say, adjusting ball collection timing based on court surface type), they can deploy the algorithm update via over-the-air (OTA) updates through Viam's fleet management, validate the improvement by querying performance metrics across all affected robots, and iterate again. "With Viam, you flip a switch to capture data in real time," says Haitham. "That gives our team the insights needed to roll out updates weekly instead of monthly."
Tennibot can complete an entire development cycle in a week because their data infrastructure is in place by default, rather than being an engineering project the team has to rebuild for every cycle.
