The closer processing resources are to the data source, the better. The latency introduced by sending data over networks and waiting for responses hinders performance and can make real-time applications impractical. Furthermore, any time data is sent over a network, there is a chance that it could be accessed inappropriately, raising serious privacy-related concerns.
It is not always easy to handle all processing locally, however. This is especially true when it comes to machine learning, where the algorithms may be extremely resource-intensive, requiring a large cluster of powerful computers for processing. Recent advances have led to the development of new hardware and algorithmic optimizations that now allow many more machine learning applications to run on relatively low-power computing platforms very near to where data is collected.
It’s time for sensors and processors to get cozy
This is a step in the right direction, but there is still an opportunity to get just a little closer to the source of data collection. An emerging technology called in-sensor processing blends sensing and processing together, even allowing machine learning algorithms to run directly on the sensor itself. A team led by engineers at the Innovation Academy Mila has just demonstrated a method that enables a complex machine learning algorithm to run on Intelligent Sensor Processing Units (ISPUs), despite their tiny processing and memory budgets.
The ISPU architecture (📷: A. Benmessaoud et al.)
The research team has developed a Human Activity Recognition (HAR) model that pushes the boundaries of what is possible on ultra-constrained hardware. Their model, which operates on an ISPU with less than 8KB of memory, successfully classifies 24 different human activities — such as running, washing one’s face, or using tools — by analyzing accelerometer and gyroscope data. Impressively, the model achieves 85% accuracy while using only 850 bytes of stack memory.
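To make this concrete, here is a minimal sketch of how an in-sensor HAR pipeline of this kind typically works: compress a window of raw accelerometer samples into a handful of statistical features, then classify in feature space. The feature set and the nearest-centroid classifier below are illustrative assumptions, not the team's actual model.

```python
import math

def window_features(samples):
    """Compress one window of axis readings into (mean, std, min, max).

    Reducing a raw sample window to a few statistics is a common way to
    keep both RAM use and compute tiny on sensor-class hardware.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return (mean, math.sqrt(var), min(samples), max(samples))

def classify(features, centroids):
    """Pick the activity label whose centroid is nearest in feature space."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(features, centroids[label]))

# Toy example on one accelerometer axis: a near-still window vs. a
# high-variance "running" window. Centroid values are made up.
still = window_features([0.01, -0.02, 0.00, 0.01])
running = window_features([0.9, -1.1, 1.2, -0.8])
centroids = {
    "still": (0.0, 0.05, -0.1, 0.1),
    "running": (0.0, 1.0, -1.5, 1.5),
}
```

In a real deployment the same idea runs per-axis for both the accelerometer and gyroscope, with the classifier replaced by whatever compact model fits the memory budget.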
Honey, I shrunk the model
Traditional neural networks require substantial memory and processing power, making them difficult to implement on such small-scale hardware. To address this issue, the research team applied a number of techniques, including incremental class injection and feature optimization, to maximize the model’s efficiency while maintaining high accuracy.
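The article does not detail how the team's techniques are implemented, but one standard trick for fitting a model into kilobytes, offered here purely as an illustration, is post-training quantization: storing weights as 8-bit integers plus a scale factor instead of 32-bit floats, cutting weight storage by roughly 4x.

```python
def quantize_int8(weights):
    """Map float weights to int8 values plus one float scale factor.

    Storage drops from 4 bytes per weight to 1 byte (plus the shared
    scale), at the cost of a small rounding error per weight.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid div-by-zero
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for use at inference time."""
    return [v * scale for v in q]

# Toy example: three weights quantized and recovered.
q, scale = quantize_int8([0.5, -1.0, 0.25])
approx = dequantize(q, scale)
```

The recovered weights differ from the originals by at most half a quantization step, which is usually an acceptable trade for a 4x memory saving on hardware this constrained.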
By processing data directly on the sensor, the model eliminates the need to transmit raw information to cloud servers or external microcontrollers, reducing latency, enhancing data privacy, and significantly cutting power consumption. The system draws just 0.5 mA of current, making it highly energy efficient — an important factor for IoT and wearable applications, in particular.
Implications for IoT and beyond
To further advance the field, the team has released a publicly available dataset featuring 24 distinct HAR gestures recorded over 12.5 hours, with data collected from multiple individuals. This dataset provides a valuable resource for training and evaluating new machine learning models on constrained hardware.
Moving forward, the researchers plan to explore advanced compression techniques and broader IoT integration to push the limits of what is possible with TinyML and in-sensor processing. With continued advancements such as these, the future of intelligent sensing appears to be on track to become even more powerful, efficient, and privacy-conscious.