Modular capabilities make a smarter camera

Create a security camera that allows for highly customizable detections, triggers, and actions without coding for each scenario

I have a basic security camera that I don’t use - I live in a rural area where mail and packages get delivered to a PO Box, I usually work from home so the house is not empty, and the camera offers only live view and motion detection capabilities. I’ve looked at well-designed solutions like Ring and Blink, but they don’t provide the features I want most. I also would not love having live video constantly streaming from my house to the cloud, and don’t want more monthly fees in my life.

What would make a security camera useful for me? Well, my list may be an odd one but here it is:

We have egg-laying backyard ducks that have been attacked by bears and foxes: I’d like to point a camera at their coop area and have the camera alert when any wild animals approach.

*My ducks enjoying their time in the backyard.*

Our dog Luna is a “southern girl” - she was rescued from a shelter down in Tennessee, and even after living with us for 5 years in the north she does not prefer being outside for long in the winter. Sometimes we let her in the back yard and mistakenly leave her out for more than a few minutes; I’d love for a camera to watch for a dog near the back door to remind us to let her back in.

*Luna anytime after she is left outside the house for more than five minutes.*

Our neighbor has a big bad Maine Coon cat named “Boo” who loves to come stand near our porch window to taunt our indoor cat, saying “What did you do to deserve living your entire life indoors; a pity you can’t be out in the glory of nature with me, you miserable fool!” So, if a camera could tell me that Boo was on my porch and maybe scare him away that would be nice.

*Boo, my neighbor's cat, taunting my cat from afar.*

The chances of you having the same requirements for a security camera as I do are close to nil. Yet I’d wager that you probably would have your own list that is not fully covered by the basic motion detecting and person detecting capabilities that “well equipped” security cameras have today.

Maybe you run a home business that relies on UPS deliveries so you want to know specifically when a UPS delivery person shows up at your door (but not the postal person).

Perhaps your kids are constantly forgetting their keys and you want them to be automatically let in if the camera sees their faces.

You might own an apartment building with a pool - in order to ensure safety you want a camera to alert you when children are in the pool area unattended.

The possibilities are endless, but they all come down to a single question:
Can we create a security camera that allows for highly customizable detections, triggers, and actions without coding for each scenario?

The answer is yes. I’ll show you how I built it with Viam in a couple of days, and you can then use it yourself for whatever scenario you might have.

Getting started

For this project, I needed two things:

Any computer that can run viam-server
Any camera that works with viam-server

To try it out without buying any hardware, you can install viam-server on a Mac and use the built-in laptop webcam. I used an inexpensive RTSP IP camera purchased from Amazon and a Raspberry Pi.

To start setting up your project, install viam-server on the computer of your choice as per the Viam installation instructions and then head over to the Viam App and add a new machine instance. You can name it whatever you’d like; I called mine “smart-cameras”.

Adding capabilities

*Creating my smart machine within app.viam.com.*

We can now start adding capabilities to our project. These will fall into two categories:

Those that are core to the project (like the camera).
Those that are contextual to our use cases (like the specific ML models you want to use). These can be added to or changed over time without writing any new code.

Let’s start with a camera. One camera is essential to the project but we could add any number of cameras. If you are using a webcam (like the one on your laptop), this is as simple as choosing the model ‘webcam’, selecting the camera from the component dropdown, and saving your configuration. If you are using another type of camera, configuration is the same process but you would choose a different model and might need to add some other attributes. I configured my RTSP camera using these instructions.

*Adding the camera onto my device within Viam's app.*

Now, let’s add some pre-trained machine learning (ML) models to work in conjunction with the Viam Computer Vision Service. For my purposes, I added:

A mobilenet wildlife detector that is trained on hundreds of thousands of wild animal images. There are many types of wild animals where I live, and this model should cover them all.
An efficientnet detector that has been trained on the popular COCO dataset. I’ll use the “dog” label to detect when my dog Luna is waiting outside my back door and the “cat” label to detect when my neighbor’s cat Boo is on my porch.

*The mobilenet wildlife detector in action.*

Both of these models can be found on the Viam registry and can be used with the Viam computer vision service. On the registry you can find other ML Models as well as other sorts of detectors and classifiers like:

Face identification
Motion detection
Feature matching
Integrations with 3rd party AI services like DirectAI and AWS Sagemaker

Of course, you can also upload your own ML models or even use Viam to collect data and train your own custom model. (I might train a model to detect Boo specifically, since he's the only cat causing problems!)

Adding each of these capabilities is as simple as a few clicks in the Viam app.

The event manager

Now that we’ve added a camera, the ML models, and Vision Services of our choice, we can start using them. You can go ahead and create something very custom to your use case using any of the Viam SDKs. Or, you can find modules on the Viam Registry that leverage these capabilities for higher-level functionality.

In this case, I wanted something that both I and others could use to:

Watch for configured “events” to trigger.

For example, an event called “Poor Luna got left outside” that detects if there is a dog seen by my back door camera.

When events trigger, act on configured notifications.

For example, when an event called “Poor Luna got left outside” is triggered an SMS message is sent to my phone.

Nothing existed on the registry that did these things, so I went ahead and built it using the Viam Python SDK, then uploaded it to the registry as a public module. The code is open source and you can take a peek, but the flow is summarized below:

An event loop starts
On each pass of the loop, each rule for each event is evaluated
If an event is triggered, any configured notifications are then acted on
A series of images (that can be viewed as a video clip) leading up to the event is captured in a folder with a timestamp and some metadata.
If an event is triggered, don’t let it trigger again for a configurable amount of time
Repeat
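To make that flow concrete, here is a minimal Python sketch of one pass of such a loop. It is not the module's actual code: the `Event` class, `run_pass`, and the `grab_frame` and `save_clip` callables are hypothetical names invented for illustration, and a fixed-size in-memory buffer stands in for however the module really keeps the frames leading up to an event.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Event:
    """Hypothetical stand-in for one configured event (not the module's class)."""
    name: str
    rules: List[Callable[[], bool]]        # each rule returns True when it matches
    notify: Callable[[str], None]          # called with the event name on trigger
    debounce_secs: float = 300.0           # quiet period after a trigger
    last_triggered: float = float("-inf")  # "never triggered yet"


def run_pass(
    events: List[Event],
    frames: Deque,
    grab_frame: Callable[[], object],
    now: float,
    save_clip: Callable[[str, float, list], None],
) -> None:
    """One pass of the loop: buffer a frame, evaluate rules, notify, debounce."""
    # A deque created with maxlen=N silently drops its oldest frame when full,
    # so it always holds the most recent N frames leading up to "now".
    frames.append(grab_frame())
    for event in events:
        if now - event.last_triggered < event.debounce_secs:
            continue  # still inside the quiet period for this event
        # This sketch requires every rule to match (the "AND" case).
        if event.rules and all(rule() for rule in event.rules):
            event.notify(event.name)
            save_clip(event.name, now, list(frames))  # frames leading up to it
            event.last_triggered = now
```

A driver would call `run_pass` repeatedly with a shared `deque(maxlen=...)` and a clock such as `time.monotonic()`, sleeping briefly between passes. Keeping the debounce timestamp on each event means one noisy event cannot silence the others.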

The README for this module describes how to configure rules and notifications. Here’s the configuration I used for the “Poor Luna got left outside” scenario I mentioned.

{
    "mode": "home",
    "events": [
        {
            "name": "Poor Luna got left outside",
            "modes": ["home", "away"],
            "debounce_interval_secs": 300,
            "rule_logic_type": "AND",
            "notifications": [
                {
                    "type": "sms",
                    "carrier": "att",
                    "phone": "9175550100"
                }
            ],
            "rules": [
                {
                    "type": "detection",
                    "detector": "coco-detector",
                    "confidence_pct": 0.7,
                    "class_regex": "dog",
                    "cameras": ["backdoor-camera"]
                }
            ]
        }
    ]
}
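
As a rough illustration of how a configuration like this could be evaluated, here is a minimal Python sketch. It is an assumption-laden reading of the fields above rather than the module's real logic: the `Detection` class and both functions are hypothetical, `confidence_pct` is treated as a fraction between 0 and 1, an event is assumed to be active only when the current mode appears in its `modes` list, and any `rule_logic_type` other than `"AND"` is treated as "any rule may match".

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Detection:
    """Hypothetical stand-in for one result returned by a detector."""
    class_name: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def rule_matches(rule: dict, detections_by_camera: Dict[str, List[Detection]]) -> bool:
    """True if any listed camera has a detection matching class and confidence."""
    if rule["type"] != "detection":
        raise NotImplementedError(f"sketch only handles detection rules, got {rule['type']!r}")
    pattern = re.compile(rule["class_regex"])
    for camera in rule["cameras"]:
        for det in detections_by_camera.get(camera, []):
            if det.confidence >= rule["confidence_pct"] and pattern.search(det.class_name):
                return True
    return False


def event_triggered(
    event: dict, mode: str, detections_by_camera: Dict[str, List[Detection]]
) -> bool:
    """Apply mode gating, then combine the rule results with AND or OR."""
    if mode not in event["modes"]:
        return False  # assumed: events only fire in the modes they list
    results = [rule_matches(rule, detections_by_camera) for rule in event["rules"]]
    if not results:
        return False  # an event with no rules never triggers in this sketch
    combine = all if event["rule_logic_type"] == "AND" else any
    return combine(results)
```

One design detail worth noting: because this sketch uses `re.search`, a pattern like `dog` would also match a COCO label such as `hot dog`. Anchoring the pattern as `^dog$` avoids that here; whether the real module needs the same care is something to check in its README.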

Hopefully at this point I am starting to convey why modular capabilities are so flexible and useful. With no code, I added support for any number and types of cameras, machine learning and computer vision resources.

Next, I wrote code to use these modular capabilities in a flexible, generic way. Then I put that code into the Viam Registry, and now anyone can use that higher-level capability with no code.

Note: Unless you want to code! Everything I’ve shown is open source, and anyone can contribute to the registry.

Taking it further

Being able to leverage machine learning with my camera to get notifications is great, and I spent less than a day getting there (if you use my event manager it can be minutes). But there were some other things I wanted:

The ability to see and manage triggered event video replays
The ability to create and modify rules and notifications with an easy-to-use interface (there are things that I’d rather be doing than typing JSON into a textbox)

I knew there was a Viam Flutter SDK, so I decided to challenge myself. I’d never worked with Flutter and had not done much in the way of mobile app development, but I knew that the same underlying APIs would exist for working with Viam capabilities. In other words, working with cameras and computer vision detectors would feel very similar in Flutter to how they work in Python or TypeScript.

After taking some time to wrap my head around Dart (the programming language Flutter uses - everything is a widget? OK.), I made a mobile app I call SAVCAM (Smart AI Viam Camera). Just like any security camera app, you can see a live feed of any of your cameras. Just like many security camera apps, you can see notifications and video replay of the events. Unlike any security camera app I’ve seen, this Viam-based camera gives you much more flexibility in terms of what it's looking for in the world and how it reacts.

I’ve got more ideas of how to take this project further, at the moment centered around other types of notifications:

I’ve added IFTTT support and will use it to take actions like turning on lights when we are getting home at night (maybe based on seeing a car or a person, but not rain or snow movement).
I need to actually scare away any wild animals (including Boo), so I may use an ESP32 with Viam’s micro-RDK to activate erratic buzzers and LEDs when they are seen.
A colleague of mine just created a Viam module that controls Yale Smart locks - should I use it to let my teenage kids in the door when they forget their keys?