Hacky Hour 15: Making Sense of the RealSense
What is the RealSense Depth Camera?
The Intel RealSense camera is designed to give machines depth perception of a video stream. This enables applications that require perceiving the world in 3D, such as determining whether two individuals are following social-distancing guidelines or, as in last week’s Hacky Hour, building a dart-throwing game.
The focus of this Hacky Hour, Making Sense of the RealSense, was to demonstrate how to use a human pose model along with the RealSense camera to track limb motion in 3D space. To demonstrate this, alwaysAI’s Principal Software Engineer, Eric VanBuhler, built a fun dart-throwing app. Click here for the GitHub repo of the app and try it out! Please note that this app uses an experimental, unreleased update to the edgeIQ RealSense API. You will be able to run the app using the development image in the Dockerfile, but integrate this API into your own apps with caution.
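The core idea of combining a pose model with a depth camera can be sketched in a few lines: take a 2D keypoint (say, the wrist) from the pose model, look up its depth, and deproject it into 3D with the pinhole camera model. This is a simplified illustration, not the app's actual code; the function name and the intrinsics values (`fx`, `fy`, `cx`, `cy`) are made up for the example.

```python
def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert a pixel (u, v) with depth in meters to a 3D point (x, y, z),
    using standard pinhole camera intrinsics."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Example: a wrist keypoint at pixel (400, 300), 1.2 m from the camera,
# with illustrative intrinsics for a 640x480 sensor.
wrist_3d = deproject(400, 300, 1.2, fx=615.0, fy=615.0, cx=320.0, cy=240.0)
```

In practice the depth camera SDK provides the calibrated intrinsics, so you would not hard-code them as done here.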
For this app, Eric connected the RealSense camera to the NVIDIA Jetson Xavier NX, which offers excellent inference performance. Combining the two makes it possible to capture high-performance 3D data for the dart-throwing motion. Eric rapidly prototyped the app (in two days) using inline and batch processing along with a finite state machine architecture. He also wanted to demonstrate how easy it is to build prototypes and fully functioning apps with alwaysAI.
About the Application
Using the finite state machine architecture, Eric isolated the various states of the application (see image below). This lets the heavier processing run in tandem with the other states rather than blocking them. See below for the states of the application.
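A finite state machine of this kind can be sketched with an `Enum` and a transition function. The state names, the 15-frame window, and the transition conditions below are illustrative guesses at the app's structure, not taken from its source.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()        # waiting for a person to appear in frame
    TRACKING = auto()    # collecting wrist keypoints frame by frame
    PROCESSING = auto()  # batch-computing the throw from collected frames
    RESULT = auto()      # displaying the outcome, then resetting

def step(state, person_detected, frames_collected, done_processing):
    """One tick of a simple FSM; a real app would dispatch per-state handlers."""
    if state is State.IDLE and person_detected:
        return State.TRACKING
    if state is State.TRACKING and frames_collected >= 15:
        return State.PROCESSING
    if state is State.PROCESSING and done_processing:
        return State.RESULT
    if state is State.RESULT:
        return State.IDLE
    return state
```

Keeping transitions in one small function makes each state's work independent, which is what allows the processing state to be developed and tested in isolation.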
Class Architecture of the App
QUESTION: Have you tested the RealSense in outdoor conditions with full sun, radiation over the objects in the image? I have some issues with the IR projector in those conditions so what is the proper configuration of a RealSense camera to get decent raw data?
ANSWER (Steve): RealSense cameras perform better indoors than outdoors; stationing the camera outdoors may degrade the output quality.
QUESTION: How did you calculate the release frame? Is it fixed at 15 frames?
ANSWER (Eric): To keep it simple, I just used a set amount of time. I hard-coded 15 frames for the motion.
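With a fixed 15-frame window, an average 3D wrist velocity can be estimated by finite differences over the collected positions. This is a hypothetical sketch of that idea (the function, frame rate, and sample data are assumptions, not the app's code):

```python
def average_velocity(points, fps=30.0):
    """Average 3D velocity (m/s) over a list of (x, y, z) wrist positions,
    one position per frame, captured at the given frame rate."""
    dt = (len(points) - 1) / fps  # elapsed time across the window
    return tuple((points[-1][i] - points[0][i]) / dt for i in range(3))

# Example: wrist moving 4 cm toward the camera each frame over 15 frames.
pts = [(0.0, 0.0, 1.2 - 0.04 * k) for k in range(15)]
vx, vy, vz = average_velocity(pts)
```

Averaging over the whole window also smooths out per-frame jitter in the pose keypoints.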
QUESTION: What's the max difference in wrist-depth?
ANSWER (Eric): 0.6 meters
QUESTION: Have you tried to link multiple models together?
ANSWER (Eric): Yes! We have many example apps that link multiple models, such as our age classifier, which pairs a face detector with an age classification model.
QUESTION: Would it be possible to track a real dart in motion and calculate the motion from these images?
ANSWER (Eric): Yes! But you would need to train a model to detect where the dart is. You may also need a camera with a high enough frame rate to track it accurately. All the pieces can come together with the RealSense and the Xavier NX.
See below for the full video of last week's Hacky Hour:
Join us every Thursday at 2 PM PST for weekly Hacky Hour! Whether you are new to the community or an experienced user of alwaysAI, you are welcome to join, ask questions, and provide the community with information about what you're working on. Register here.