RF Machine Learning at Rowden
Enduring challenges in the RF domain
To successfully navigate the RF spectrum, frontline operators need to monitor it across three dimensions: frequency, time and location. The obvious way to collect more information is to deploy more sensors to capture the RF picture in real time across an environment. This approach requires sensors capable of spanning the growing frequency and geographical ranges across which RF spectrum usage is proliferating, such as super high frequencies, which provide short-range communications.
The combination of these three dimensions creates an abundance of information that needs to be processed and fused. Currently, this information is either post-processed or orchestrated by a central node, both of which require a coherent database that can be used to store, label, and access the data needed for analysis, SIGINT, and decision-making.
The option to collect and store locally suits collection tasks where information lag is not important, but it increases the reliance on data storage, management, and processing on the sensor. Given that a one-minute collect of a single Wi-Fi channel is 31 GB, post-processing leaves analysts with a significant amount of data to sift through.
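As a rough illustration of how quickly raw captures grow, the sketch below estimates the size of an IQ recording from its sample rate, duration and sample format. The 20 MS/s rate and the three sample formats are assumptions chosen for illustration. The capture settings behind the figure quoted above are not stated, so these numbers are not expected to match it.

```python
def capture_size_bytes(sample_rate_hz: float, duration_s: float,
                       bytes_per_component: int = 2) -> float:
    """Size of a raw IQ recording; each complex sample stores one I and one Q value."""
    bytes_per_sample = 2 * bytes_per_component
    return sample_rate_hz * duration_s * bytes_per_sample


# Hypothetical settings: one minute of capture at 20 MS/s.
for label, width in [("int16", 2), ("float32", 4), ("float64", 8)]:
    size_gb = capture_size_bytes(20e6, 60, width) / 1e9
    print(f"{label}: {size_gb:.1f} GB per minute")
```

Size scales linearly with sample rate, duration and sample width, so wider channels or higher-precision formats multiply the storage and backhaul burden directly.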
Alternatively, some sensors collect and pass the data back to a central node for processing and analysis. This approach can reduce the need for processing and storage at the edge, but requires a resilient and large backhaul network. This is hard to maintain in congested environments (although agile radios are one avenue for solving this).
In both approaches, users depend on suitable training to know when to collect and record. They might then be responsible for analysing and collating the data, or for passing the information on to signal analysts to understand what useful information is contained within the spectrum, such as detecting, identifying, or intercepting communications, or monitoring changes in spectrum density.
The ability to do this in real time requires considerable skill, knowledge and processing power at the sensor nodes. This is where Machine Learning can help.
RF Machine Learning at the Edge
As modern sensors are loaded with more processing power, there are opportunities to push ML models and AI applications to the edge to enable near-real-time operations in the RF domain.
Pre-processing at the edge reduces the need to store and pass back large amounts of data. This saves precious storage and data bandwidth, allowing nodes to connect and share useful information in manageable quantities and removing the need for a central processing node.
ML can be used to learn, detect and identify specific signals of interest, and as a mechanism to collect only signals from specific devices. In post-processing, it can extract signals of interest from a larger collect, which can dramatically reduce the analysis time spent searching for specific signals or identifying patterns of life.
By modelling normal RF behaviour in a local area, ML can be used to detect deceptive targets, such as those hiding in the noise floor, or devices that spoof the transmissions, protocols and identities of others.
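One simple way to picture "modelling normal RF behaviour" is a per-frequency-bin baseline of the averaged power spectrum, with new captures flagged where they rise well above it. The sketch below is a statistical stand-in rather than a learned model, and it is not Rowden's method. The function names, the FFT size, the threshold `k` and the synthetic noise and tone are all assumptions.

```python
import numpy as np


def power_spectrum_db(iq: np.ndarray, n_fft: int = 256) -> np.ndarray:
    """Average power per frequency bin (dB) over consecutive windowed FFT frames."""
    n_frames = len(iq) // n_fft
    frames = iq[: n_frames * n_fft].reshape(n_frames, n_fft)
    spectrum = np.fft.fftshift(np.fft.fft(frames * np.hanning(n_fft), axis=1), axes=1)
    power = np.mean(np.abs(spectrum) ** 2, axis=0)
    return 10 * np.log10(power + 1e-12)


def fit_baseline(captures):
    """Per-bin mean and standard deviation of the spectrum across 'normal' captures."""
    spectra = np.stack([power_spectrum_db(c) for c in captures])
    return spectra.mean(axis=0), spectra.std(axis=0)


def anomalous_bins(iq, mean_db, std_db, k=6.0):
    """Indices of bins whose power exceeds the baseline by more than k deviations."""
    z = (power_spectrum_db(iq) - mean_db) / (std_db + 1e-6)
    return np.flatnonzero(z > k)


rng = np.random.default_rng(0)


def noise(n):
    """Unit-power complex white noise standing in for an empty band."""
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)


n = 256 * 64
mean_db, std_db = fit_baseline([noise(n) for _ in range(20)])

# Synthetic weak emitter: a tone about 14 dB below the total noise power.
tone = 0.2 * np.exp(2j * np.pi * 0.125 * np.arange(n))
flagged = anomalous_bins(noise(n) + tone, mean_db, std_db)
print("flagged bins:", flagged)
```

Averaging many frames is what lets a signal that is invisible in the time domain stand out in individual frequency bins. A deployed system would also need to track how the baseline drifts over time.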
RF ML Development
At Rowden, a significant amount of our ML work is focused on understanding and addressing legacy challenges in the RF and SIGINT space.
To accelerate joint research in this area, we identified early on the need for large volumes of training data and developed a methodology for collecting signals en masse, enabled by an RF-controlled environment in our on-site semi-anechoic chamber.
Our ability to access and control elements of the set-up allows us to conduct rapid collection, testing, training, and modification of our pipeline, applying continuous development and new techniques to our methodology.
Specific Emitter Identification
One of our focus areas in RF ML is the development and deployment of Specific Emitter Identification (SEI) for wireless RF devices, such as push-to-talk radios like PMRs and DMRs.
SEI is the challenge of pairing a received signal to a particular piece of hardware based purely on information captured at the RF physical layer. In ML terms, this is framed as a classification problem in which a model must assign each received packet to one of 10 possible classes.
To make its decision, the model is given a single data packet in the form of the raw IQ values as received by a software-defined radio (SDR). The ML model can either use these values directly or extract higher-level features, such as statistical properties or an FFT of the signal.
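To make the idea of higher-level features concrete, the sketch below turns one packet of complex IQ samples into a small feature vector of amplitude and phase statistics plus a normalised FFT magnitude. The function name `extract_features`, the choice of statistics, the 64-point FFT and the synthetic packet are illustrative assumptions, not the features used in the work described here.

```python
import numpy as np


def extract_features(iq: np.ndarray, n_fft: int = 64) -> np.ndarray:
    """Build a feature vector from one packet of complex IQ samples."""
    amplitude = np.abs(iq)
    # Phase change between consecutive samples, which tracks frequency offset.
    phase_step = np.angle(iq[1:] * np.conj(iq[:-1]))
    stats = np.array([
        amplitude.mean(), amplitude.std(),
        phase_step.mean(), phase_step.std(),
    ])
    # Magnitude spectrum of the first n_fft samples, normalised to sum to one
    # so that the feature does not depend on received signal strength.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft(iq[:n_fft], n=n_fft)))
    spectrum = spectrum / (spectrum.sum() + 1e-12)
    return np.concatenate([stats, spectrum])


# Synthetic stand-in for a received packet: a noisy complex tone.
rng = np.random.default_rng(0)
t = np.arange(512)
packet = np.exp(2j * np.pi * 0.05 * t) + 0.1 * (
    rng.standard_normal(512) + 1j * rng.standard_normal(512)
)
features = extract_features(packet)
print(features.shape)  # (68,)
```

A model that consumes raw IQ directly would skip this step and typically take the real and imaginary parts as two input channels.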
We have experimented with a range of models, from classical methods such as KNNs and SVMs to Deep Learning approaches such as CNNs and LSTMs, with a focus on deploying these models to run in real time on edge-based hardware.
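As a minimal sketch of the classical end of that range, the example below implements a k-nearest-neighbours classifier in NumPy and applies it to a 10-class problem. The feature vectors are synthetic clusters standing in for per-device fingerprints. The dimensions, cluster spread and value of `k` are assumptions, and the accuracy it prints says nothing about performance on real emitters.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=5):
    """Label each query vector by majority vote among its k nearest training vectors."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    n_classes = int(train_y.max()) + 1
    return np.array(
        [np.bincount(train_y[row], minlength=n_classes).argmax() for row in nearest]
    )


rng = np.random.default_rng(0)
n_devices, n_features, per_device = 10, 8, 60

# Hypothetical fingerprints: each device's packets cluster around its own centre.
centres = 3.0 * rng.standard_normal((n_devices, n_features))
labels = np.repeat(np.arange(n_devices), per_device)
features = centres[labels] + rng.standard_normal((len(labels), n_features))

# Hold out a quarter of the packets to estimate accuracy.
order = rng.permutation(len(labels))
train, test = order[:450], order[450:]
predicted = knn_predict(features[train], labels[train], features[test])
accuracy = float(np.mean(predicted == labels[test]))
print(f"held-out accuracy: {accuracy:.2f}")
```

KNN needs no training step but stores every training packet and compares against all of them at inference time, which is one reason memory and latency budgets matter when choosing a model for edge hardware.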
IoT Anomaly Detection
Another application of ML is providing additional physical layer security by fingerprinting the unique signature of trusted devices. This approach can be used to assess signals received between communicating devices, adding confidence that a received signal is from a trusted device.
Our Anomaly Detection work uses an ML model to check whether a received transmission really came from the emitter it claims to come from. The premise behind this technique is to establish whether transmissions within the local network originate from a trusted source.
For the task of anomaly detection, the model’s training data only contains packets from known trusted devices. The model’s job is to use this data to build up an understanding of what is “normal”, such that it can identify anomalies during the testing phase.
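The training setup described above, with trusted data only, can be sketched with a simple one-class model: fit a Gaussian to the trusted feature vectors and flag packets whose Mahalanobis distance exceeds a threshold taken from the training scores. This is one of many possible approaches and is not presented as the model used here. The class name `TrustedDeviceModel`, the 0.99 quantile and the synthetic feature vectors are assumptions.

```python
import numpy as np


class TrustedDeviceModel:
    """One-class model fitted only on feature vectors from trusted devices."""

    def fit(self, features: np.ndarray, quantile: float = 0.99):
        self.mean = features.mean(axis=0)
        # Small ridge term keeps the covariance matrix invertible.
        cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
        self.inv_cov = np.linalg.inv(cov)
        # Threshold chosen so that most trusted training packets score below it.
        self.threshold = np.quantile(self.score(features), quantile)
        return self

    def score(self, features: np.ndarray) -> np.ndarray:
        """Squared Mahalanobis distance of each row from the trusted distribution."""
        d = features - self.mean
        return np.einsum("ij,jk,ik->i", d, self.inv_cov, d)

    def is_anomaly(self, features: np.ndarray) -> np.ndarray:
        return self.score(features) > self.threshold


rng = np.random.default_rng(0)
trusted_train = rng.normal(0.0, 1.0, size=(500, 4))  # synthetic trusted packets
trusted_test = rng.normal(0.0, 1.0, size=(200, 4))
rogue_test = rng.normal(4.0, 1.0, size=(100, 4))     # synthetic untrusted emitter

model = TrustedDeviceModel().fit(trusted_train)
false_alarm_rate = float(model.is_anomaly(trusted_test).mean())
detection_rate = float(model.is_anomaly(rogue_test).mean())
print(f"false alarms: {false_alarm_rate:.2f}, detections: {detection_rate:.2f}")
```

The quantile sets the trade-off between false alarms on trusted devices and missed detections. To check a claimed identity, one such model could be fitted per trusted device and the packet scored against the model for the identity it claims.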
Rowden has conducted experimentation and successfully demonstrated this on edge hardware for IoT devices, with variation in performance depending on the Machine Learning and Deep Learning method applied.
RF ML is an exciting area of research that Rowden will continue to invest in to better understand the limitations and operational potential for users at the edge.
To develop this field and further scrutinise the applicability and robustness of AI and ML models in this domain, specialists need to be able to legally collect representative real-world data or have access to controlled environments in which to conduct experiments.
Creating a multi-layered RF environment is highly complex. The real world has a tremendous impact on the way RF signals propagate: multipath, fading and noise all affect how signals are modified and received. Developing such an environment requires significant time and investment.
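To give a feel for how these effects alter a signal, the sketch below passes clean IQ samples through a static multipath channel and adds white Gaussian noise at a chosen signal-to-noise ratio. It is a deliberately simplified model: the tap delays and gains are hypothetical, and time-varying fading, Doppler and interference from other emitters are not represented.

```python
import numpy as np


def apply_channel(iq: np.ndarray, taps: np.ndarray, snr_db: float,
                  rng: np.random.Generator) -> np.ndarray:
    """Apply a static multipath channel, then add complex white Gaussian noise."""
    # Each tap is a delayed, scaled and phase-rotated copy of the signal.
    faded = np.convolve(iq, taps, mode="full")[: len(iq)]
    signal_power = np.mean(np.abs(faded) ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.sqrt(noise_power / 2) * (
        rng.standard_normal(len(iq)) + 1j * rng.standard_normal(len(iq))
    )
    return faded + noise


rng = np.random.default_rng(0)
clean = np.exp(2j * np.pi * 0.05 * np.arange(1024))

# Hypothetical channel: a direct path plus two weaker, delayed reflections.
taps = np.array([1.0, 0, 0.4 * np.exp(0.7j), 0, 0, 0.2 * np.exp(-1.9j)])
received = apply_channel(clean, taps, snr_db=10.0, rng=rng)
print(received.shape, received.dtype)
```

Simulation of this kind can augment training data, but it does not replace measurements from a controlled environment or representative real-world collects.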