This is a guest post from Quenton Hall, AI System Architect for Industrial, Vision, Healthcare and Sciences Markets.
Is data the end goal or just the beginning?
Globally, tens of millions of IP cameras are installed each year. If we assume that there are 100 million IP cameras installed worldwide (which may be a conservative number), and if each of these cameras were to unintelligently stream H.264-encoded HD video at 30fps, 24/7/365, the required total bandwidth would be ~859 Tbps, or ~3.4 Zettabytes annually. If even half of these cameras were connected to the cloud, IP camera traffic may currently account for upwards of 1/12th of total global internet traffic. Clearly, these are ballpark numbers, but they serve to illustrate the scale of the problem.
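For the curious, the back-of-envelope math works out like this (the ~8.6 Mbps per-camera bitrate is an assumed figure for an H.264 1080p30 stream, chosen to match the totals above):

```python
# Back-of-envelope check of the aggregate bandwidth and storage figures.
CAMERAS = 100_000_000            # assumed worldwide install base
BITRATE_BPS = 8.59e6             # assumed H.264 1080p30 stream, ~8.6 Mbps
SECONDS_PER_YEAR = 365 * 24 * 3600

total_bps = CAMERAS * BITRATE_BPS
total_tbps = total_bps / 1e12                                 # ~859 Tbps
annual_zettabytes = total_bps * SECONDS_PER_YEAR / 8 / 1e21   # ~3.4 ZB

print(f"Aggregate bandwidth: {total_tbps:.0f} Tbps")
print(f"Annual volume: {annual_zettabytes:.1f} ZB")
```

Run it and you get roughly 859 Tbps and 3.4 ZB per year, matching the figures quoted above.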
This data can only provide useful insights if it is either stored for future retrieval and examination or if a human or algorithm is monitoring the video, 24/7/365. Only pixels that are monitored and provide useful insights have been collected with purpose. This suggests that naïvely streaming and storing this massive volume of visual data is, shall we say, inefficient? This reminds me of the words of Bruce Cockburn, “If a tree falls in the forest, does anybody hear? Anybody hear the forest fall?”.
For the past several years there has been a trend towards enabling autonomous AI algorithms to monitor video and send alerts. Whether the goal is to ensure the safety of your home or child, detect criminal acts, detect traffic accidents or chemical spills, ensure patient safety, interpret sign language, or replace biometric timeclocks with facial recognition to minimize contact transmission of viruses, there likely exists a trained AI model that will enable your application.
Just as the industry migrated from H.264 to H.265 to reduce storage costs, it is now considering how to solve the power and bandwidth costs associated with cloud AI processing. By enabling high-efficiency AI inference at the network edge, we can garner useful insights from this visual data, and decide precisely when, how, and what to encode and transmit for storage. Wouldn’t that be “Smarter”?
As just one example, consider that our new SmartCamera+ demo platform, developed in partnership with On Semiconductor, enables low-latency face detection and video encoding at up to 200fps while consuming less than 10W of power (Ta = 60 °C).
This platform delivers enough AI inference horsepower to process not just a single video stream, but multiple video streams. The on-chip VCU in the MPSoC device supports simultaneous encode and decode of multiple streams and, in this context, can provide AI inference for legacy IP camera installations.
If you couple this functionality with Xilinx’s ROI (Region-of-Interest) encoding and multi-crop IPs, our technology enables you to transmit to the cloud only the data that should be stored or reviewed. If a tree falls in the forest, someone will hear.
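To see why ROI-based transmission matters, here is a minimal sketch of the idea: given face-detection bounding boxes, estimate what fraction of each HD frame actually needs to leave the edge device. The function name, box format, and crop sizes are illustrative assumptions, not the interface of the Xilinx ROI or multi-crop IP:

```python
# Hypothetical sketch: how much of a 1080p frame do detection crops cover?
FRAME_W, FRAME_H = 1920, 1080

def roi_fraction(detections):
    """Fraction of frame pixels covered by (x, y, w, h) detection crops
    (ignoring overlap, so this is a rough upper bound)."""
    frame_area = FRAME_W * FRAME_H
    crop_area = sum(w * h for (_, _, w, h) in detections)
    return min(crop_area / frame_area, 1.0)

# Two 256x256 face crops in one frame:
faces = [(100, 200, 256, 256), (900, 400, 256, 256)]
frac = roi_fraction(faces)
print(f"Transmitting crops instead of frames sends ~{frac:.1%} of the pixels")
```

In this toy example the crops cover only about 6% of the frame, which is the intuition behind encoding and transmitting regions of interest rather than full frames.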
To return to the original question, “is data the end goal or just the beginning?”, my response is that data is unquestionably just the beginning: a means to an end. How manufacturers get “Smarter” about transmitting and storing this data will determine their destiny. Contact your local Xilinx or On Semiconductor FAE or sales office to request a demo of SmartCamera+. It may be a little “Edgy” of me to say this, but we will make you “Smarter”!