NanoEdge AI Studio¶
I. What is NanoEdge AI Library?¶
NanoEdge AI Library is an artificial intelligence static library developed by Cartesiam, for embedded C software running on ARM Cortex microcontrollers (MCUs). It comes in the form of a precompiled .a file that provides building blocks to implement smart features into any C code.
When embedded on microcontrollers, the NanoEdge AI Library gives them the ability to “understand” sensor patterns automatically, by themselves, without the need for the user to have additional skills in Mathematics, Machine Learning, or Data science.
Each NanoEdge AI static library contains an AI model (as a bundle of signal preprocessing, machine learning model, optimally tuned hyperparameters, etc.) designed to bring machine learning capabilities to any C code, in the form of easily implementable functions, such as learn, detect, or classify.
There are two different types of NanoEdge AI Libraries:
- NanoEdge AI Library for Anomaly Detection: it is used to detect anomalous behaviors on a machine, after an initial in-situ training phase, where only the “nominal” behaviors are learned.
- NanoEdge AI Library for Classification: it is used to distinguish and recognize different types of behaviors, anomalous or not, and classify them into pre-established categories.
Each library, tailored for different goals, has unique features:
Anomaly Detection Library | Classification Library
Comes optimized yet untrained | Comes optimized and pre-trained
Gathers knowledge incrementally in the MCU | Includes static knowledge
Learns dynamically and unsupervised in-situ | Doesn’t require any additional learning
Detects anomalies in the MCU | Classifies signals in the MCU
However, all NanoEdge AI Libraries:
- are ultra optimised to run on MCUs (any ARM Cortex-M)
- are ultra memory efficient (1-20kB RAM/Flash)
- are ultra fast (1-20ms inference on M4 80MHz)
- run directly within the microcontroller
- can be integrated into existing code / hardware
- consume very little energy
- preserve the stack (static allocation only)
- are inherently independent from the cloud
- transmit or save no data
- require no Machine Learning expertise to be deployed
All NanoEdge AI Libraries are created by using NanoEdge AI Studio.
II. Purpose of NanoEdge AI Studio¶
1. What the Studio does¶
NanoEdge AI Libraries contain a range of machine learning models, and each of these models can be optimized by tuning a wide range of hyperparameters. This results in a very large number of potential combinations, each one being tailored for a specific use-case (one static library for each combination). We therefore need a tool to find the best possible Library for each project.
- NanoEdge AI Studio is:
- a search engine for AI Libraries,
- built for embedded developers,
- that abstracts away all aspects of Machine Learning and Data Science,
- to enable quick and easy development of Machine Learning capabilities into any C code.
NanoEdge AI Studio takes as input project parameters (MCU type, RAM, sensor type…) and some signal examples, and outputs the most relevant NanoEdge AI Library (that will then do all the learning in-situ, when embedded).
Each NanoEdge AI static library is the result of the benchmark of virtually all possible AI libraries (combinations of signal treatment, ML model, tuned hyperparameters), tested against the minimal data given by the user. It therefore contains the best possible model, for a given project, given the signal examples provided as input.
2. What the Studio doesn’t do¶
In a nutshell, NanoEdge AI Studio takes user data as input (in the form of sensor signal examples), and produces a static library (.a) file as output. This is a straightforward and relatively quick procedure.
However, the Studio doesn’t provide any input data. The user needs to have qualified data in order to obtain satisfactory results from the Studio. These data can be raw sensor signals, or pre-treated signals, and need to be formatted properly (see below). For example, for anomaly detection on a machine, the user needs to collect signal examples of “normal” behavior on this machine, as well as a few examples (non-exhaustive) of “anomalies”. This data collection process is crucial, and can be tedious, as some expertise will be needed to design the correct signal acquisition and sampling methodology, which can vary dramatically from one project to the other.
Additionally, NanoEdge AI Studio doesn’t provide any ready-to-use C code to implement in your final project. This code, which will include some of the NanoEdge AI Library’s smart functions (such as initialize, learn and detect), needs to be written and compiled by the user. The user is free to call these functions as needed, and implement all the smart features imaginable.
In summary, the static (.a) library file, outputted by the Studio from user-generated input data, will have to be linked to some C code written by the user, and compiled/flashed by the user on the target microcontroller.
III. Getting started¶
1. Running NanoEdge AI Studio for the first time¶
When running NanoEdge AI Studio for the first time, you will be prompted for:
Your proxy settings: if you’re using a proxy, use the settings below, otherwise, click NO.
The port you want to use.
It can be changed to any port available on your machine (port 5000 by default).
Your license key. If you don’t know your license key, log in to the Cryptlex licensing platform to retrieve it. If you have lost your login credentials, reset your password using the email address used to download NanoEdge AI Studio.
If you don’t have an Internet connection, offline activation is available:
- Choose Offline activation and enter your license key.
- Copy the long string of characters that appears.
- Log in to the Cryptlex licensing platform.
- Reset your password using the email address provided when downloading NanoEdge AI Studio.
- Log into your Cryptlex dashboard using your new password.
- Click on your license key, then Activations and Offline activation
- Click ACTIVATION, then paste the string of characters copied in step 2, and click Download response.
- In NanoEdge AI Studio, click Import file and open the downloaded .dat file.
2. Preparing signal files¶
During the library selection process, NanoEdge AI Studio uses user data (input files containing signal examples) to test and benchmark many machine learning models and parameters. The way those input files are structured and formatted, and the way the signals were recorded, are therefore very important.
i. Anomaly Detection vs. Classification¶
Here is an overview of the two different approaches:
Anomaly Detection:
- Studio’s purpose: finding the best Library (no knowledge, model is untrained)
- 2 import steps (Step 2 and Step 3): import Regular and Abnormal files
- 2 signal types: Regular and Abnormal
- all regimes are in the same file (nominal: all in “nominal” file; abnormal: all in “abnormal” file)
Classification (N classes):
- Studio’s purpose: finding the best Library and learning some knowledge (static knowledge, model is pre-trained)
- Embedded functions (MCU): no learning in-situ; classifier for in-situ classification
- 1 import step (Step 2): import all N “class” files
- N signal types: one for each class / behavior / regime
- only one regime per “class” file (to distinguish among N regimes, create N “class” files)
File contents: all files contain several examples (or repetitions, or variations) of the same signal, be it “nominal”, “abnormal”, or corresponding to a given “class” or regime.
The following sub-sections are applicable to both anomaly detection and classification.
ii. General approach for formatting input files properly¶
For both Anomaly Detection (AD) and Classification (Cl), the expected input file format is the same:
- TXT / CSV file
- at least 20-50 lines representing independent signal examples
- multiple numerical values on each line
- same number of numerical values per line
- numerical values only
- uniform separators (comma, semicolon, etc)
Each input file contains several signals, each made of many samples.
Each line in the input file corresponds to an observation of a signal during a given time. It is a signal example.
Each of these lines comes from a sampling process that involves reading values from a signal at defined intervals, generally constant, producing a series of discrete values called samples.
In NanoEdge AI Studio, each line will be taken into account independently, iteratively, so it must represent a meaningful snapshot in time of the signal to be measured. It is therefore crucial to set a coherent sampling frequency and a proper buffer size (see next section).
The number of lines determines how many signal examples our search algorithms will treat in total. Never use fewer than 20-50 lines per file, and avoid files containing millions of lines. A realistic range would be from ~100 to ~1000 lines, although ~100 are usually enough.
During sampling, the signal’s values are read, generally at regular intervals. We recommend keeping a constant sampling frequency during data acquisition, regardless of the final application.
The number of samples per signal, or buffer size, is set by the user depending on this sampling frequency (see next section), and depends on each particular use case. You need enough samples to cover the whole physical phenomenon studied, to get a proper and meaningful signal snapshot.
- 1-axis sensor (e.g. pressure) with buffer size 256; file has 256 values per line.
- 3-axis sensor (e.g. vibration) with buffer size 256; file has 256 x3 = 768 values per line.
Samples are the numerical values that constitute a signal (a line). The number of samples per line, or buffer size, must stay constant across data imported in the Studio that you wish to consider together (e.g. a normal / abnormal signals couple, see Studio: Importing signals).
Whenever possible, please use buffer sizes that are powers of two (e.g. 128, 1024…). The allowed separators include single spaces, commas (,) and semicolons (;). Please format decimal values using a period (.) and not a comma (,).
Here is an example of signal file corresponding to a 3-axis sensor, e.g. a collection of m signal examples (m readings, or lines) on a 3-axis accelerometer with a buffer size of 256 on each axis, where each numerical value is separated by a single space:
Note: Depending on project constraints, buffer sizes, signal lengths, and sampling frequencies will vary. For example, a buffer size of 256 could mean that:
- we needed to capture 0.25-second signals, with a sampling frequency of 1 kHz, so we chose a buffer size of 256 (256/1000 = 0.256).
- we needed to sample at a higher frequency (4 kHz), so with a buffer size of 256, our signals will be much shorter, 64 ms (256/4000 = 0.064).
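The file layout described above can be sketched in a few lines of Python. This is only an illustration, not a tool shipped with the Studio: the helper names (`format_signal_line`, `build_signal_file`) are hypothetical, the random values stand in for real accelerometer readings, and the sample-by-sample ordering (x1 y1 z1 x2 y2 z2 …) is an assumption based on the generic N-axis description later in this page.

```python
import random


def format_signal_line(samples, separator=" "):
    """Flatten per-sample axis tuples into one line of text.

    samples: list of (x, y, z) tuples, one tuple per sample.
    Assumed ordering: x1 y1 z1 x2 y2 z2 ... (sample by sample).
    """
    values = [value for sample in samples for value in sample]
    # Fixed-point formatting keeps a period as the decimal mark.
    return separator.join(f"{value:.4f}" for value in values)


def build_signal_file(signals, separator=" "):
    """Return the file contents, one signal example per line."""
    return "\n".join(format_signal_line(s, separator) for s in signals) + "\n"


# Placeholder data: 100 signal examples, buffer size 256, 3 axes.
BUFFER_SIZE = 256
NUM_AXES = 3
NUM_SIGNALS = 100
rng = random.Random(0)
signals = [
    [tuple(rng.uniform(-1.0, 1.0) for _ in range(NUM_AXES))
     for _ in range(BUFFER_SIZE)]
    for _ in range(NUM_SIGNALS)
]

contents = build_signal_file(signals)
lines = contents.splitlines()
print(len(lines), "lines,", len(lines[0].split(" ")), "values per line")
```

Writing `contents` to a .txt or .csv file then gives an input file with 100 lines of 256 x 3 = 768 values each.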
iii. Variant: Aggregating multiple variables into “states”¶
The above procedure described how to treat signals that contain a succession of samples, often resulting from real time acquisition of a temporal signal.
However, it is also possible to create artificial signals from higher-level features, resulting from the aggregation of multiple variables, possibly coming from multiple sensors. Such signals represent instantaneous states rather than time-evolving signals.
For this purpose, “Multi-sensor” can be selected in the Studio, on the project creation screen, where you would normally select your sensors:
Make sure that you select the correct number of variables (e.g. creating a 5-variable state using a 3-axis magnetometer + temperature sensor + pressure sensor).
- Multi-sensor is not intended for the simultaneous use of several temporal buffers coming from different sensor sources, i.e. for time series, or to monitor buffers of temporal data (e.g. accelerometer data, rapidly varying currents/voltages…). For this purpose, use Multi-library or a Generic N-axis sensor (see “Which sensor to use for which case”).
- Multi-sensor works very differently, compared to other traditional “single” sensors. It is intended for very precise use cases, when the user needs to read and monitor multi-sensor machine states.
- Multi-sensor can be used with higher-level features (e.g. mean, min, max, stdev…), extracted from these buffers, to aggregate them into states that can be read from time to time, but not continuously.
- When using Multi-sensor, each line (or signal example) only contains as many values as you have variables.
- In other words, each line represents one single sample, or one single state (whereas for “single” sensors, each line represented a succession in time of many samples).
- In summary:
- In mono-sensor, you have NUMBER_OF_AXES * BUFFER_LENGTH values per line.
- In mono-sensor, a line represents a signal snapshot consisting of several samples.
- In multi-sensor, you only have NUMBER_OF_VARIABLES values per line (effectively, a buffer length of 1).
- In multi-sensor, a line represents a state, not a full temporal signal snapshot anymore.
For example, for an application where a state can be represented via a combination of magnetism, temperature, and pressure: we can aggregate data from a 3-axis magnetometer, a (1-axis) thermometer, and a (1-axis) pressure sensor. Temperature and pressure, if they vary slowly, can be read directly, but magnetometer data needs to be summarized using (for example) average values across a 50 millisecond window along all 3 axes.
This would result in 3 extracted magnetic features, followed by temperature, followed by pressure, to represent a 5-variable state.
We could also imagine building a more complex state from our 50 millisecond magnetometer buffer, including not only average magnetometer values, but also minimums and maximums, for all 3 axes. This would result in 3x3 = 9 extracted magnetometer values (3 each for average, minimum, maximum), followed by temperature and pressure, to represent an 11-variable state.
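As a rough sketch of the aggregation described above, the following Python function turns a short magnetometer buffer plus one temperature and one pressure reading into a single state line. The function name `build_state` and the sample values are hypothetical, and the ordering of the 9 magnetometer features (averages, then minimums, then maximums) is an assumption; only "magnetometer features, then temperature, then pressure" comes from the text.

```python
def build_state(mag_buffer, temperature, pressure, include_min_max=False):
    """Aggregate a magnetometer buffer and two slow readings into one state.

    mag_buffer: list of (mx, my, mz) samples from the 50 ms window.
    Returns 5 values, or 11 values when include_min_max is True.
    """
    axes = list(zip(*mag_buffer))  # one tuple of values per axis
    state = [sum(axis) / len(axis) for axis in axes]  # 3 averages
    if include_min_max:
        state += [min(axis) for axis in axes]  # 3 minimums
        state += [max(axis) for axis in axes]  # 3 maximums
    state += [temperature, pressure]
    return state


# Placeholder readings, for illustration only.
mag_buffer = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0)]
state_5 = build_state(mag_buffer, temperature=21.5, pressure=1013.0)
state_11 = build_state(mag_buffer, 21.5, 1013.0, include_min_max=True)

# One state is one line of the input file.
print(" ".join(str(value) for value in state_5))
```

Each call produces one line of the Multi-sensor input file, so the file contains one state per line rather than one temporal buffer per line.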
iv. Choosing a relevant sampling frequency and buffer size (except multi-sensor)¶
To prepare input data (except when using multi-sensor), it is crucial to choose the most adequate sampling frequency and buffer size for your sensors.
The sampling frequency corresponds to the number of samples measured per second. For some sensors, the sampling frequency can be directly set by the user, but in other cases, a timer needs to be set up for constant time intervals between each sample.
The speed at which the samples are taken must allow the signal to be accurately described, or “reconstructed”; the sampling frequency must be high enough to account for the rapid variations of the signal. The question of choosing the sampling frequency therefore naturally arises:
- If the sampling frequency is too low, the readings will be too far apart; if the signal contains relevant features between two samples, they will be lost.
- If the sampling frequency is too high, it may negatively impact the costs, in terms of processing power, transmission capacity, storage space, etc.
To choose the sampling frequency, prior knowledge of the signal is useful in order to know its maximum frequency component. Indeed, to accurately reconstruct an output signal from an input signal, the sampling frequency should be at least twice as high as the maximum frequency that you wish to detect within the input signal.
Without any prior knowledge of the signal, we recommend testing several sampling frequencies and refining them according to the results obtained via NanoEdge AI Studio / Library (e.g. 200 Hz, 500 Hz, 1000 Hz, etc.).
The issues related to the choice of sampling frequency and the number of samples are illustrated below:
Case 1: the sampling frequency and the number of samples make it possible to reproduce the variations of the signal.
Case 2: the sampling frequency is not sufficient to reproduce the variations of the signal.
Case 3: the sampling frequency is sufficient but the number of samples is not sufficient to reproduce the entire signal (i.e. only part of the input signal is reproduced).
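The effect of an insufficient sampling frequency (Case 2) can be estimated with the standard frequency-folding rule. The sketch below is a generic signal-processing illustration, not something the Studio computes; the function names are hypothetical.

```python
def min_sampling_frequency(max_signal_frequency_hz):
    """Lowest sampling frequency that satisfies the 'at least twice' rule."""
    return 2.0 * max_signal_frequency_hz


def apparent_frequency(signal_hz, sampling_hz):
    """Frequency at which a pure tone appears once sampled at sampling_hz.

    A tone above half the sampling frequency folds back (aliases) to a
    lower frequency, so its real variations are lost.
    """
    folded = signal_hz % sampling_hz
    return min(folded, sampling_hz - folded)


# A 1000 Hz phenomenon sampled fast enough keeps its frequency...
print(apparent_frequency(1000, 2500))
# ...but sampled at 1500 Hz it shows up as a lower-frequency tone.
print(apparent_frequency(1000, 1500))
```

Trying a few candidate sampling frequencies this way is one quick check before recording real data.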
The buffer size corresponds to the total number of samples recorded per signal, per axis. Together with the sampling frequency, they put a constraint on the effective signal temporal length.
In summary, there are 3 important parameters to consider:
n: buffer size
f: sampling frequency
L: signal length
They are linked together via:
n = f * L. In other words, by choosing two (according to your use case), the third one will be constrained.
Here are general recommendations. Make sure that:
- the sampling frequency is high enough to catch all desired signal features. To sample a 1000 Hz phenomenon, you must at least double the frequency (i.e. sample at 2000 Hz at least).
- your signal is long (or short) enough to be coherent with the phenomenon to be sampled. For example, if you want your signals to be 0.25 seconds long (L), you must have n / f = 0.25. To achieve this, choose a buffer size of 256 with a frequency of 1024 Hz, or a buffer of 1024 with a frequency of 4096 Hz, and so on.
For best performance, always use a buffer size n that is a power of two (e.g. 128, 512…).
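The relation n = f * L and the power-of-two recommendation can be checked with a short calculation. The helper names below are hypothetical; the numbers are the ones used in the examples of this section.

```python
def signal_length_s(buffer_size, sampling_hz):
    """L = n / f, the temporal length of one signal in seconds."""
    return buffer_size / sampling_hz


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (bit trick on positive integers)."""
    return n > 0 and (n & (n - 1)) == 0


def buffer_size_for(sampling_hz, length_s):
    """Smallest power-of-two buffer size covering at least length_s seconds."""
    needed = sampling_hz * length_s
    n = 1
    while n < needed:
        n *= 2
    return n


print(signal_length_s(256, 1024))   # 0.25 s
print(signal_length_s(1024, 4096))  # 0.25 s
print(buffer_size_for(1000, 0.25))  # 250 samples needed, rounded up to 256
```

Rounding up to the next power of two slightly lengthens the signal (256 samples at 1 kHz last 0.256 s instead of 0.25 s), which is usually acceptable.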
v. Which signals to put in which files¶
For Classification, each category of signal examples that you wish to classify / separate must be put into a distinct file and imported separately into a different “Class”.
For Anomaly Detection, the general guideline is to concatenate all signal examples corresponding to the same category into the same file (e.g. “nominal”).
I want to detect anomalies on a 3-speed fan by monitoring its vibration patterns using an accelerometer. I recorded many signals corresponding to different behaviors, both “nominal” and “abnormal”. I have the following signal examples (numbers are arbitrary):
- 30 examples for “Speed 1”, which I consider nominal,
- 25 examples for “Speed 2”, which I consider nominal,
- 35 examples for “Speed 3”, which I consider nominal,
- 30 examples for “Fan turned off”, which I also consider nominal,
- Some of these signals contain “transients”, e.g. fan speeding up, or slowing down.
- 30 examples for “fan air flow obstructed at speed 1”, which I consider abnormal,
- 35 examples for “fan orientation tilted by 90 degrees”, which I consider abnormal,
- 25 examples for “tapping on the fan with my finger”, which I consider abnormal,
- 25 examples for “touching the rotating fan with my finger”, which I consider abnormal.
Here, I will create
- Only 1 nominal input file containing all 120 signal examples (30+25+35+30) covering 4 nominal regimes + transients.
- Only 1 abnormal input file containing all 115 signal examples (30+35+25+25) covering 4 abnormal regimes.
And start a benchmark using only this couple of input files.
- Note that not all speeds are necessarily represented in the “abnormal” behaviors.
- This is not a problem. Later on, unseen anomalies can still be detected, because the learning happens in-situ, and not in the Studio.
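Concatenating several recordings into one “nominal” (or “abnormal”) file amounts to appending their lines, provided every line has the same number of values. A minimal sketch, assuming each recording is already available as text with one signal per line; the function name and the tiny 3-value signals are placeholders.

```python
def concatenate_signals(file_contents, separator=" "):
    """Merge the lines of several signal files into a single file body.

    Raises ValueError if the number of values per line is not constant,
    since signals considered together must share the same buffer size.
    """
    merged = []
    expected = None
    for content in file_contents:
        for line in content.splitlines():
            line = line.strip()
            if not line:
                continue  # ignore blank lines between recordings
            count = len(line.split(separator))
            if expected is None:
                expected = count
            elif count != expected:
                raise ValueError(
                    f"found {count} values on a line, expected {expected}"
                )
            merged.append(line)
    return "\n".join(merged) + "\n"


# Placeholder recordings for three nominal regimes of the fan.
speed_1 = "0.1 0.2 0.3\n0.2 0.3 0.4\n"
speed_2 = "1.1 1.2 1.3\n"
fan_off = "0.0 0.0 0.0\n0.0 0.0 0.1\n"

nominal = concatenate_signals([speed_1, speed_2, fan_off])
print(len(nominal.splitlines()), "nominal signal examples")
```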
For anomaly detection, the Studio gives the possibility to add several signal couples, which seems contrary to the instructions above. In fact, adding signal couples is used when creating a general AI Library that will adapt to different types of machines.
I want to detect anomalies on industrial pumps of different brands / types. My detection algorithms need to be adaptable, instead of specialized. I recorded different nominal behaviors (e.g. pump running at max capacity, pump running at half capacity…) on three different pumps (Pump A, Pump B and Pump C). I also recorded one type of anomaly (e.g. minor leak) for each of the 3 pump types, so I have 3 batches of abnormal signals.
Therefore I will:
- Concatenate all nominal behaviors for Pump A into one nominal file “Nominal A”,
- Concatenate all nominal behaviors for Pump B into a separate nominal file “Nominal B”,
- Concatenate all nominal behaviors for Pump C into a separate nominal file “Nominal C”,
- Also import my anomalies into 3 separate files, “Abnormal A”, “Abnormal B” and “Abnormal C”.
And start a benchmark using 3 couples of signal files:
- “Nominal A” + “Abnormal A”
- “Nominal B” + “Abnormal B”
- “Nominal C” + “Abnormal C”
IV. Using NanoEdge AI Studio¶
In order to generate a static library, NanoEdge AI Studio walks the user through several steps:
- Creating a new project and setting up its parameters,
- Importing “signal examples” into the studio for context,
- Running the library selection process,
- Testing the best library found by the Studio,
- Compiling and downloading the library [Full version / Featured boards only].
For anomaly detection:
1. Creating a new project¶
In the main window, you can:
- Create a new project
- Load an existing project
Projects can be imported from and exported to
The 3 projects most recently opened are listed first.
ii. Project creation¶
Choose project type: Anomaly Detection or Classification:
Enter name and description;
Choose the target (microcontroller type);
ARM Cortex M MCUs currently supported: M0, M0+, M1, M3, M4, M23, M33 and M7.
Selecting an authorized board enables library download and compilation when using NanoEdge AI Studio TRIAL version.
Here is the list of the authorized boards currently available:
Choose the maximum amount of RAM allocated to the library (maximum: 10000 kB);
(optional) Choose the maximum amount of FLASH allocated to the library;
Choose the sensor type used to collect data (with the correct number of axes);
iii. Which sensor to use for which case¶
Single-sensor temporal buffer:
This is the typical single-sensor use case where a signal example is composed of a buffer made out of several samples (e.g. current, magnetometer, accelerometer…). Valid sensors to use would be:
Multi-sensor temporal buffer:
To work with temporal buffers coming from multiple sensors of different types, there are 2 different approaches:
Separate each sensor’s signal, and create a Library for each one, using Multi-library.
Each signal will be decoupled and treated on its own by a different library. See the Multi-library section. This is very similar to the Single-sensor buffer case above, but this time there is a need to monitor several buffers coming from several sensors, concurrently, in the same device.
Combine all signals into a single buffer, by using a generic N-axis sensor.
All signals will be treated concurrently by the same Library. The machine learning algorithms will therefore build a model based on the combination of these inputs (unlike option #1).
To combine accelerometer (3 axes) + gyroscope (3 axes) + current (1 axis) signals, you would select a generic 7-axis sensor.
The buffers in the input files are formatted just like a generic 3-axis accelerometer (see this section), but each sample now has 7 variables. In addition to the 3 linear accelerations [X Y Z], the 7-axis sample includes 3 angular velocities [Gx Gy Gz] from the gyroscope, and 1 current value [C] from the current sensor.
This would result in 7-axis samples [X Y Z Gx Gy Gz C], meaning that for a buffer size of 256, each line would be composed of 7x256 = 1792 numerical values.
- For this option (option #2), all the signals to be combined need to be sampled at the same rate (one single sampling frequency).
The generic N-axis sensor is limited to a maximum of 1000 axes (or variables).
Multi-sensor non-temporal state:
See the Multi-variable section. This is a niche use case, where machine states need to be monitored at defined intervals. Each state is composed of several “variables” (there is no buffer anymore), possibly coming from sensors of different types.
Multi-sensor is limited to a maximum of 1000 variables.
2. Importing signal files¶
i. Anomaly detection¶
For anomaly detection, two types of signal examples are required. They are imported respectively in Step 2: Regular signals and Step 3: Abnormal signals.
The Regular signals correspond to nominal machine behavior, i.e. data acquired by sensors during normal use, when everything is functioning as expected.
Please include data corresponding to all the different regimes, or behaviors, that you wish to consider as “nominal”. For example, when monitoring a fan, you may need to log vibration data corresponding to different speeds, possibly including the transients.
The Abnormal signals correspond to abnormal machine behavior, i.e. data acquired by sensors during a phase of anomaly.
The anomalies don’t have to be exhaustive. In practice, it would be impossible to predict (and include) all the different kinds of anomalies that could happen on your machine. Just include examples of some anomalies that you’ve already encountered, or that you suspect could happen. If needed, don’t hesitate to create “anomalies” manually.
However, if the Library is expected to be sensitive enough to detect very “subtle anomalies”, we recommend that the data provided as abnormal signals includes at least some examples of subtle anomalies as well, and not only very gross, obvious ones.
To import a new file, simply click “Choose Signals”.
These signals files are only necessary to give the benchmark algorithms some context, in order to select the best library possible.
At this stage, no learning is taking place yet. In later stages, after the optimal library is selected, compiled, and downloaded, it will be completely fresh, brand new, untrained, and will have no established knowledge.
The learning process that will then be performed, either via NanoEdge AI Emulator, or in your embedded hardware application, will be completely unsupervised.
ii. Classification¶
For Classification, you need as many signal types as you have classes to distinguish.
For example, for the identification of types of failures on a motor, we could imagine 5 classes, each corresponding to a behavior, such as:
- normal behavior
- bearing failure
- excessive vibration
Which would result in the creation of 5 distinct classes (import one txt/csv file for each), each containing a minimum of 20-50 signal examples of said behavior.
To add a new class, simply click “Add Class” and then “Choose Signals”.
iii. Importing signals from file¶
Please make sure that your input files are formatted properly (see Studio: Formatting input files).
- Click Select file, and choose a valid input file.
- Select the separator you are using.
- Validate import.
If your input file is valid, you will be able to import it. Otherwise, please double check your data (numerical values, uniform separators, constant number of samples per line…).
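Before importing, the points listed above (numerical values, uniform separators, constant number of samples per line) can be pre-checked offline. The function below is a hedged sketch with a hypothetical name; it does not reproduce the Studio's own checks.

```python
def validate_signal_text(text, separator=","):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    expected = None
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        problems.append("no signal lines found")
    for number, line in enumerate(lines, start=1):
        fields = line.strip().split(separator)
        for field in fields:
            try:
                float(field)  # also rejects empty fields from doubled separators
            except ValueError:
                problems.append(f"line {number}: non-numerical value {field!r}")
                break
        if expected is None:
            expected = len(fields)
        elif len(fields) != expected:
            problems.append(
                f"line {number}: {len(fields)} values, expected {expected}"
            )
    return problems


good = "1.0,2.0,3.0\n4.0,5.0,6.0\n"
bad = "1.0,2.0,3.0\n4.0,abc,6.0\n7.0,8.0\n"
print(validate_signal_text(good))
for problem in validate_signal_text(bad):
    print(problem)
```

A decimal comma such as `1,5` in a semicolon-separated file is reported as a non-numerical value, which matches the advice to use a period for decimals.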
Any imported signal can be plotted and displayed by clicking the icon:
Plots can then be saved to .png format.
iv. Importing signals “live” from Serial port (USB)¶
It is possible to import signals directly within the Studio, by logging them through your computer’s Serial port (USB).
If you need to open a serial / COM port on Linux, check the FAQ (Section II. 9) for instructions.
You need a USB data logger in order to do it. For instructions on how to make a simple data logger, check our tutorials: Smart vibration sensor and Smart current sensor, under section III. Making a data logger.
- Select your Serial / COM port. Refresh if needed.
- Choose your preferred baudrate.
- If needed, select a maximum number of lines to be recorded.
- Click the red “Record” round button to start the data logging.
- Click the grey “Stop” square button to interrupt the logging.
- Choose your delimiter.
- Validate import.
If your data is valid, you will be able to import it. Otherwise, please double check your data logger parameters.
The data logged via serial will be plotted in real-time at the bottom of the screen:
This serial plotter window can be toggled on/off by clicking the signal icon, above the numerical data preview:
If any error is found during the real-time data logging via serial, a pop-up window will appear, letting you delete or edit the lines where issues were detected.
All imported signals can easily be plotted, displayed and saved to .png by clicking the blue eye icon at the right side of the screen, above the preview tables.
- Your data logger must output lines containing a constant number of samples per line, all separated by the same separator.
- Your data logger must also output signals one line at a time to be coherent with the way input files are formatted, see the Formatting section.
v. Checks, errors and warnings¶
Supported file formats are .txt / .csv. Recommended separators are single spaces, commas or semicolons. Please make sure that your input file is correctly formatted.
In this example, we have an input file containing 200 examples of nominal data (200 lines), for a 3-axis accelerometer that uses a buffer size of 256 (which gives 256x3 = 768 numerical values per line).
The “Check for RAM” and the next 5 checks are blocking, meaning that you will need to fix any error in your input file before proceeding further.
Click “Run optional checks” to scan your input file and run additional checks, e.g. to search for duplicate signals, equal consecutive values, random values, outliers, etc. Failing these additional checks gives warnings that suggest possible modifications on your input files. Click any warning for more information and advice.
If you imported data live using a data logger through your Serial port, you can download the resulting .csv file by clicking the icon, at the top-right of the checks.
vi. Data plots¶
On the right hand side of the screen, you will see a preview of data contained in your input files.
These graphs show a summary of the data contained in each line of your input files. There are as many graphs as sensor axes.
The graph’s x-axis corresponds to the columns of your input file. The y-values contain an indication of the mean value of each column (across all lines, or signals), their min-max values, and their standard deviation.
Here, our accelerometer sampled 256 values per line (per axis), so we see 256 points on the graphs’ x-axis. These graphs do not represent a temporal evolution of the behavior of your machine as a whole, but rather a snapshot of the actual physical signals, averaged across all lines.
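The per-column summary shown in these graphs can be approximated offline as follows. This is a sketch for a single axis, with a hypothetical function name; whether the Studio uses the population or the sample standard deviation is not stated, so the use of `pstdev` here is an assumption.

```python
import statistics


def column_summary(signals):
    """Summarize each column (sample index) across all lines (signals).

    signals: list of equal-length lists, one list per line of the file.
    """
    summary = []
    for column in zip(*signals):
        summary.append({
            "mean": statistics.mean(column),
            "min": min(column),
            "max": max(column),
            "stdev": statistics.pstdev(column),  # population stdev (assumed)
        })
    return summary


# Two placeholder signals of two samples each.
signals = [[1.0, 2.0], [3.0, 6.0]]
for index, stats in enumerate(column_summary(signals)):
    print(index, stats)
```

For a multi-axis file, the columns belonging to each axis would first have to be separated so that one summary is computed per axis.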
The FFT plots can be displayed by toggling “Show FFT” at the top-right of the graphs.
Several input files can be loaded (they are shown on the left side of the screen, see below), either for “regular” or “abnormal” signals, but only one (for each) at a time will be used for library selection.
If FFT was manually activated via Advanced settings (see below), the FFT plots of your signals will also appear under signal previews, for each axis.
vii. Frequency filtering¶
Since v2.1, it is possible to alter the imported signals by filtering out unwanted frequencies. Click Advanced settings above the list of imported signals.
In the Advanced settings menu, you may toggle Activate FFT to force the Library to include a fast Fourier transform in its signal pre-processing step. The Library will treat imported signals in the frequency domain rather than the time domain.
Input the sampling frequency used to collect your signals (e.g. here 6660 Hz);
Select the range of frequencies to exclude (e.g. here we exclude everything under 500 Hz and above 2000 Hz);
Activating FFT will apply the chosen FFT settings to all signals within the current project.
The FFT plots of your signals will be displayed under signal previews, with greyed out zones corresponding to the excluded frequencies (e.g. below, only the frequencies between 500 Hz and 2000 Hz are kept).
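To get a feel for what excluding a frequency range means, the sketch below computes a plain discrete Fourier transform of one buffer and keeps only the bins between 500 Hz and 2000 Hz, using the 6660 Hz sampling frequency of the example. It illustrates the idea only: how the Studio and the Library actually implement the filtering is not described here, and the function names and the synthetic two-tone signal are hypothetical.

```python
import cmath
import math


def bin_frequency(k, n, sampling_hz):
    """Frequency in Hz represented by DFT bin k of an n-sample buffer."""
    return k * sampling_hz / n


def kept_bins(n, sampling_hz, low_hz, high_hz):
    """Bins of the one-sided spectrum that fall inside [low_hz, high_hz]."""
    return [k for k in range(n // 2 + 1)
            if low_hz <= bin_frequency(k, n, sampling_hz) <= high_hz]


def dft_magnitude(samples, k):
    """Magnitude of DFT bin k (direct sum, fine for small buffers)."""
    n = len(samples)
    total = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples))
    return abs(total)


def band_spectrum(samples, sampling_hz, low_hz, high_hz):
    """Map kept frequencies to magnitudes; excluded bins are dropped."""
    n = len(samples)
    return {bin_frequency(k, n, sampling_hz): dft_magnitude(samples, k)
            for k in kept_bins(n, sampling_hz, low_hz, high_hz)}


SAMPLING_HZ = 6660
N = 256
# Synthetic buffer: a 100 Hz tone (to be excluded) plus a 1000 Hz tone (kept).
signal = [math.sin(2 * math.pi * 100 * t / SAMPLING_HZ)
          + math.sin(2 * math.pi * 1000 * t / SAMPLING_HZ)
          for t in range(N)]

spectrum = band_spectrum(signal, SAMPLING_HZ, 500, 2000)
peak_hz = max(spectrum, key=spectrum.get)
print(f"{len(spectrum)} bins kept, strongest near {peak_hz:.0f} Hz")
```

The 100 Hz component falls in the excluded zone, so only the component near 1000 Hz stands out in the kept band.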
3. Running the library selection process¶
Here (Step: Optimize and Benchmark), you will start and monitor the library benchmark. NanoEdge AI Studio will search for the best possible library given the contextual signal examples provided in the previous step(s).
i. Starting the benchmark¶
Click START to open the signal selection window:
[Anomaly detection]:Select a couple of signal files (regular + abnormal signals) that you wish to use for benchmark.Those signals can be compared visually across all sensor axes by clicking Compare these signals.
You can select several couples that will be used to test the performance of all candidate libraries. For example, if you have logged data of similar type on 3 different machines, you should import them and select them here, to add 3 normal/abnormal signal couples, one corresponding to each machine.
For more information about adding signal couples (vs. concatenating signals in the same file), check “Which signals should I put in which files” or the FAQ, Section I. A. 7. Should I concatenate data into single files, or use “signal couples”?
[Classification]: Select the classes to take into consideration for benchmarking.
Then, select the number of CPU cores from your computer that you wish to dedicate to the benchmark process (see below). Selecting more CPU cores will parallelize the workload of our algorithms, and greatly speed up the process. Please use as many as you can, but be aware that using all available CPU cores might temporarily slow down your computer’s performance.
When you are ready to start the benchmark, click Validate.
ii. Library performance indicators¶
NanoEdge AI Studio uses 3 indicators to translate the performance and relevance of candidate libraries, in the following order of priority:
(Balanced) Accuracy (most important by far)
Balanced Accuracy (anomaly detection):
- Library’s ability to correctly identify regular signals as regular, and abnormal signals as abnormal.
- 100% balanced accuracy = all signals are correctly identified.
Accuracy (classification):
- Library’s ability to attribute each signal to the correct class.
- 100% accuracy = all signals are attributed to the correct class.
Optimizing this indicator is the top priority of our algorithms.
Confidence (anomaly detection):
- Library’s ability to mathematically separate abnormal signals from regular ones.
- 100% confidence = all regular signals are at 100% similarity, all abnormal ones at 0% (see graphs below).
Confidence (classification):
- Library’s ability to (correctly) attribute signals with high certainty.
- 100% confidence = all classes are correctly attributed with a probability of 100%.
Increasing this indicator is the algorithms’ second priority.
RAM and FLASH (anomaly detection and classification): This is the maximum amount of RAM and FLASH memory space needed by the library after you integrate it on your microcontroller. There is also an indication of the extra (RAM) memory needed for saving signal buffers. The amounts of FLASH and RAM used by the libraries are optimized last.
[Anomaly detection]: Along with those 3 indicators, a graph plots all data points against a percentage of similarity (on the y-axis). Similarity is a measure of how much a given data point fits in with (how much it resembles) the existing knowledge base of the library.
Regular signals are shown as blue dots, and abnormal signals as red dots. The x-axis represents the number of the signal example in the corresponding file, and the y-axis represents the similarity score (%). The threshold (decision boundary between the two classes, “nominal” and “anomaly”), set at 90% similarity, is shown as a gray dashed line.
- The blue dot below refers to a signal from the imported “nominal signals” input file, and it is ranked at 67% similarity. It is (incorrectly) detected as anomaly.
- The red dot below refers to a signal from the imported “abnormal signals” input file, and it is ranked at 60% similarity. It is (correctly) detected as anomaly.
- 100% balanced accuracy means that all blue dots are above the 90% threshold, and all red points are below.
- 100% confidence means that all blue dots are at 100% similarity, while all red dots are at 0% similarity.
[Classification]: The graph is subdivided into sections, one for each class. The x-axis shows the number of the signal example in the corresponding class file, while the y-axis represents the probability associated with this signal (the % certainty associated with the class detected).
All signal examples are represented as dots, either green if their associated probability is higher than 50%, or red otherwise.
The red dot below, pertaining to the class / input file “Obstructed” was (incorrectly) classified as “Nominal”.
All other dots (green) were correctly identified as their respective classes.
As the benchmark progresses, confidence may decrease slightly, and RAM / FLASH may vary dramatically, but (balanced) accuracy will keep improving so that at any time, you always get the best library found so far.
iii. Benchmark progress and summary¶
As soon as the library selection process is initiated, a graph will be displayed on the right hand side of the screen (see below), showing the evolution of the 3 performance indicators (see above section) over time, as thousands of candidate libraries are tested.
If the benchmark seems stuck at 5% and nothing happens within a minute (no plot, or axes with no data points), please stop it and start a new benchmark. If the issue persists, please relaunch the Studio.
The selection algorithms will first try to maximise balanced accuracy, then confidence, and finally to decrease the RAM / FLASH needed as much as possible.
The benchmark process may take from tens of minutes to several hours, depending on your CPU’s speed and number of cores. Please be patient; have a break, grab a drink, and let the benchmark complete.
Benchmarks may be paused and resumed at any time.
A paused benchmark will appear in your benchmark list with the corresponding “pause” icon.
To resume it, or stop it completely, click resume or stop in the benchmark information, to the right side of the screen.
In Windows 10, the benchmark’s progress will be shown on the NanoEdge AI Studio taskbar icon.
Important: At any time during a benchmark, you can test the current library without pausing / stopping the benchmark. While a benchmark is running, just move on to Step: Emulator to use an Emulator to test the best Library found so far, or move on to Step: Deploy to compile and deploy it [Paid version / Featured boards only].
While a benchmark is running, a small log window will display notable information / events such as benchmark status, search speed per thread, new libraries found, etc.
When the benchmark is complete, the progress graph is replaced by a summary.
Only the best Library (in terms of Accuracy) is shown. However, several “candidates” are saved for each benchmark.
You may select a different library by clicking “N libraries” (see above, “16 libraries”). This feature is useful if you’d like to use a library that performs better on a secondary indicator (e.g. you want to prioritize low RAM, or high Confidence…).
Just select a different library by clicking the crown icon, under “Lib selected”, and validate your change of result by clicking OK.
Several successive benchmarks can be run; all results will be saved. They can be loaded by clicking them on the left hand side of the screen.
[Anomaly Detection only]: After the benchmark is complete, a plot of the library’s learning behavior is shown:
This graph shows the number of learning iterations needed to obtain optimal performance from the library, when it is embedded in your final hardware application. In this particular example, NanoEdge AI Studio recommended that learn() be called 70 times, at the very minimum.
- Never use fewer iterations than the recommended number, but feel free to use more (e.g. 3 to 10 times more).
- This iteration number corresponds to the number of lines to use in your input file, as a bare minimum.
- These iterations must include the whole range of all kinds of nominal behaviors that you want to consider on your machine.
iv. Possible cause for poor benchmark results¶
If you keep getting poor benchmark results, you may try the following:
- Increase the “Max RAM” or “Max Flash” parameters (e.g. 32 kB or more).
- Adjust your sampling frequency; make sure it is coherent with the phenomenon you want to capture.
- Change your buffer size (and hence, signal length); make sure it is coherent with the phenomenon to sample.
- Make sure your buffer size (number of values per line) is a power of two (except for multi-sensor).
- If using a multi-axis sensor, treat each axis individually by running several benchmarks with a single-axis sensor.
- Include more signal examples (lines) in your input files.
- Check the quality of your signal examples; make sure they contain the relevant features / characteristics.
- Check that your input files don’t contain (too many) parasite signals (e.g. no anomalous signals in the nominal file, for anomaly detection, and no signals belonging to another class, for classification).
- Increase the variety of your signal examples (e.g. more nominal regimes, or more anomalies, or more classes).
- Decrease the variety of your signal examples (e.g. fewer nominal regimes, or fewer anomalies, or fewer classes).
- Check that the sampling methodology and sensor parameters are kept constant throughout the project for all signal examples recorded (in all input files; nominal, abnormal or class files).
- Check that your signals are not too noisy, too low intensity, too similar, or unrepeatable.
- Remember that microcontrollers are resource-constrained (audio/video, image and voice recognition won’t be supported).
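The power-of-two recommendation for buffer sizes in the list above can be checked in your data logging code before any signal is recorded. The following C helper is a minimal sketch; its name is hypothetical and it is not part of the NanoEdge AI Library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if n is a power of two (1, 2, 4, ..., 256, 512, ...).
 * A power of two has exactly one bit set, so clearing its lowest set bit
 * with n & (n - 1) must give zero. */
bool is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

For example, `is_power_of_two(256)` is true, while `is_power_of_two(300)` is false.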
Low confidence scores are not necessarily an indication of poor benchmark performance, if the (balanced) accuracy is sufficiently high (>80-90%). Always use the associated Emulator to determine the performance of a Library, preferably using data that has not been used before (for the benchmark).
If still unable to get good benchmark results, and you’re running out of ideas, don’t hesitate to contact us at email@example.com.
Signal confirmation procedure:
Even with lower (balanced) accuracy scores, detection results can often be greatly improved by implementing a simple confirmation mechanism in the final algorithm / C code. This approach may prove extremely useful, depending on the use case, to limit the number of false positives (or false negatives).
In practice, it consists of validating anomalies before raising alerts, instead of taking the detection results directly. For example, anomalies may be counted as “true anomalies” only after N successive validations using consecutive (distinct) data buffers. The same approach can of course be used to confirm “nominal” signals. Validations can be made using counters, or any statistical tool such as means, modes, etc.
The same approach can be used to confirm that a signal pertains to the correct class, in classification projects. This is useful to minimize classification errors, or eliminate transient regimes.
[Classification only]: in classification projects, this confirmation feature is available natively when using the Emulator to test libraries with serial data (see this section).
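The counter-based confirmation mechanism described above could look like the following C sketch. It only handles the bookkeeping: each call receives the outcome of one detection on a new data buffer, however that outcome was obtained. The type and function names are hypothetical and are not part of the NanoEdge AI Library.

```c
#include <stdbool.h>

/* State of a simple "N consecutive anomalies" confirmation counter. */
typedef struct {
    unsigned required;     /* N: consecutive anomalous buffers needed */
    unsigned consecutive;  /* length of the current run of anomalies */
} anomaly_confirmer_t;

/* Prepare the counter; a value of 0 is treated as 1 (no confirmation). */
void confirmer_init(anomaly_confirmer_t *c, unsigned required)
{
    c->required = (required > 0) ? required : 1;
    c->consecutive = 0;
}

/* Feed the result of one detection (true = anomaly detected on this buffer).
 * Returns true once `required` consecutive anomalies have been seen, and
 * keeps returning true until a nominal buffer resets the counter. */
bool confirmer_update(anomaly_confirmer_t *c, bool is_anomaly)
{
    if (is_anomaly) {
        if (c->consecutive < c->required) {
            ++c->consecutive;
        }
    } else {
        c->consecutive = 0;
    }
    return c->consecutive >= c->required;
}
```

An alert would then be raised only when `confirmer_update()` returns true, so that a single isolated anomalous buffer does not trigger it. The same structure can be mirrored to confirm “nominal” signals.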
4. Testing the NanoEdge AI Library¶
Here (Step: Emulator), you will be able to test the Library that was selected during the benchmark process (Step: Optimize and Benchmark) using NanoEdge AI Emulator.
NanoEdge AI Emulator is a clone of the Library that emulates its behavior, and is directly usable within the Studio’s interface. There is no need to embed a Library in order to test its performance with real, “unseen” data. Each Library, among hundreds of thousands of possibilities, comes with its own Emulator.
The Emulator can also be downloaded as a standalone .exe (Windows) or .deb (Linux) to be used in the terminal through the command line interface.
This screen gives a summary of the selected benchmark (progress, performance, input files used):
Select the benchmark to use, on the left side of the screen, to load the associated emulator.
When you are ready to start testing, click Initialize Emulator.
i. Anomaly Detection¶
Here are the functions of the Anomaly Detection Library that are available through its Emulator:
- Initialization: run first before learning/detecting, or to reset the knowledge of the library/emulator
- Sensitivity: adjust the pre-set, internal detection sensitivity (does not affect learning, only returned similarity scores)
- Learning (learn()): start a number of learning iterations (to establish an initial knowledge, or enrich an existing one)
- Detection (detect()): start a number of detection iterations (inference), once a minimum knowledge base has been established
The testing procedure goes as follows:
When building a smart device, the final features will heavily depend on the way those functions are called. It is entirely up to the developer to design relevant learning and detection strategies, depending on the project’s specificities and constraints.
For example for a hypothetical machine, one strategy could be to:
- initialize the model;
- establish an initial knowledge base by calling learn() every minute for 24 hours on that machine;
- switch to inference mode by calling detect() 10 times every hour (and averaging the returned scores), each day;
- blink an LED and ring alarms whenever detect() returns any anomaly (average score < 90%);
- run another learning cycle to enrich the existing knowledge, if temperature rises above 60°C (and the machine is still OK);
- send a daily report (average number of anomalies per hour, with date, time, machine ID…) using Bluetooth or LoRa.
In summary, those smart functions can be triggered by external data (e.g. from sensors, buttons, to account for and adapt to environment changes). The scores returned by the smart functions can trigger all kinds of behaviors on your device. The possibilities are endless.
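The “average the returned scores, then compare to 90%” step of the hypothetical strategy above can be sketched in C as follows. The sketch assumes that the similarity scores (in %) have already been collected by the caller from successive detection calls and stored in an array; it does not call the Library itself, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the mean of n similarity scores (in %) falls below
 * threshold_percent, i.e. when the averaged result indicates an anomaly.
 * Illustrative helper, not a NanoEdge AI function. */
bool average_indicates_anomaly(const double *scores, size_t n,
                               double threshold_percent)
{
    double sum = 0.0;

    if (n == 0) {
        return false;  /* no data: do not raise an alert */
    }
    for (size_t i = 0; i < n; ++i) {
        sum += scores[i];
    }
    return (sum / (double)n) < threshold_percent;
}
```

Averaging over several buffers means that one low score among ten does not necessarily raise an alarm, whereas a consistently low series does. Whether this is desirable depends on the use case.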
After initialization, no knowledge base exists yet. It needs to be acquired in-situ, using real signals. Your Library won’t be pre-trained with the signals imported before benchmark, in Steps 2 and 3. Therefore, you need to learn some signals.
A learning phase corresponds to several iterations of the learn() function. You should use at least the minimum number of iterations recommended in the benchmark summary from Step 4. This learning will be incremental and unsupervised.
To learn some signals from a file, click Select file and open the file containing your training data.
To learn some signals “live” from your Serial port, using your own data logger, click Serial data. Then, select your Serial / COM port (refresh if needed), choose your preferred baudrate, and Start recording by clicking the red button.
As soon as some signals are learned, the number of learned signals will be indicated.
Click Go to detection after all relevant signals (nominal, by definition) have been learned.
When a first knowledge base has been established, you can run Detection on any signals, to check whether they would be classified as nominal or anomaly by the Library, and make sure this Library performs as intended.
As usual, the signals to use for detection can be imported from file, or from Serial port using a data logger.
Select the signals that you wish to use, and adjust the sensitivity if needed. A pie chart will summarize the detection results.
When detecting using live data from the Serial port, a graph will show how the detection performance (similarity percentage) evolves in real time.
All details of all learning and detection iterations (such as similarity and signal status) are available on the terminal window embedded on the right side of the screen.
Feel free to repeat as many times as needed, adjusting the sensitivity or running additional Learning cycles in the process.
- If the results obtained are satisfactory, move on to the next step, and Deploy your library on your microcontroller.
- Otherwise, it is time to review your data logging procedure (sampling frequency, buffer size, signal length…), import other sets of signals, and start a new benchmark. Also see the section below, Possible causes of poor emulator results.
You will probably not land your ideal library the first time. Using NanoEdge AI Studio is an iterative process. Try, learn, adjust, and repeat!
ii. Classification¶
Here are the functions of the Classification Library that are available through its Emulator:
- Initialization: run first to initialize the knowledge
- Classification: run an inference iteration (detect which class the input signal belongs to)
Just like in Anomaly Detection (see “Important” section above), the classifier function can be called dynamically whenever needed. It can be triggered by external data (e.g. from sensors, buttons, to account for and adapt to environment changes), and the class / probabilities returned can trigger all kinds of behaviors on your device.
To classify signals from a file, click Select file and open the file containing the signal examples to classify. You will see a pie chart summarizing the classification.
The image below shows data from a 3-speed fan:
- 3 signals were detected at “speed 1”,
- 7 at “speed 2”,
- 17 at “speed 3”,
- 6 when the fan air flow was obstructed,
- … and so on.
To classify signals “live” from your Serial port, using your own data logger, click Serial data. Then, select your Serial / COM port (refresh if needed), choose your preferred baudrate, and Start recording by clicking the red button.
You will see a pie chart summarizing the classification, as well as a graph showing the probabilities associated with each classification iteration (the image below shows data from a 3-speed fan, and “speed 1” is currently being detected).
To minimize classification errors, or eliminate transient regimes during your detections, you may choose to validate signals by increasing the number of consecutive confirmations.
In the example above, the number of confirmations is set to 2, meaning that a signal will only be validated as pertaining to a given class after 2 consecutive data buffers have been successfully classified.
In this example, out of a total of 96 signals seen, 40 verified classifications have been counted (8+8+5+5+4+5+5), out of a possible maximum of 96/2 = 48.
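To illustrate how consecutive confirmations reduce the number of validated classifications, here is a minimal C sketch that counts confirmed classes over a stream of class IDs. It assumes that the counter restarts after each validation (non-overlapping runs), which is consistent with the maximum of 96/2 = 48 mentioned above, but the Emulator’s exact counting rule is not documented here. The function name and the use of small non-negative integers as class IDs are hypothetical.

```c
#include <stddef.h>

/* Count validated classifications in a sequence of n class IDs, where a
 * class is validated after `confirmations` consecutive identical results.
 * per_class (optional, may be NULL) must hold n_classes counters, zeroed
 * by the caller. Illustrative helper, not a NanoEdge AI function. */
size_t count_confirmed(const int *class_ids, size_t n, unsigned confirmations,
                       size_t *per_class, size_t n_classes)
{
    size_t total = 0;
    int run_class = -1;    /* class of the current run (-1: none yet) */
    unsigned run_len = 0;  /* identical results seen in the current run */

    if (confirmations == 0) {
        confirmations = 1;  /* every single result is validated */
    }
    for (size_t i = 0; i < n; ++i) {
        if (class_ids[i] == run_class) {
            ++run_len;
        } else {
            run_class = class_ids[i];
            run_len = 1;
        }
        if (run_len == confirmations) {
            ++total;
            if (per_class != NULL && class_ids[i] >= 0
                && (size_t)class_ids[i] < n_classes) {
                ++per_class[class_ids[i]];
            }
            run_len = 0;  /* start a fresh, non-overlapping run */
        }
    }
    return total;
}
```

Under this assumption, the returned count can never exceed n divided by the number of confirmations, and every change of class in the stream (for instance during a transient regime) discards the run in progress.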
See also this note about confirmation procedures.
All details of all classification iterations (such as class IDs and class probabilities) are available on the terminal window embedded on the right side of the screen.
iii. Possible causes of poor emulator results¶
Here are possible reasons for poor anomaly detection or classification results:
- The data used for library selection (benchmark) is not consistent with the data you’re using for testing via Emulator/Library. The regular/abnormal or class signals imported in the Studio should correspond to the same machine behaviors, regimes, and physical phenomena as the ones used for testing.
- Your (balanced) accuracy score was well below 90% or your confidence score was too low to provide sufficient data separation.
- You used an insufficient number of signals in either regular/abnormal or class signal files. Make sure that you used enough lines in your input files (minimum 20-50). For anomaly detection, make sure that you use at least the minimum number recommended by the Studio, and possibly more.
- The sampling method is inadequate for the physical phenomena studied, in terms of frequency, buffer size, duration, etc.
- The sampling method has changed between Benchmark and Emulator tests. The same parameters (frequency, signal lengths, buffer sizes…) must be kept constant throughout the whole project.
- [Anomaly Detection]: you haven’t run enough learning iterations (your machine learning model is not rich enough), or this data is not representative of the signal examples used for benchmark. Don’t hesitate to run several learning cycles, as long as they all use nominal data as input (only normal, expected behavior should be learned).
- [Classification]: the machine’s status or working conditions have drifted between Benchmark and Emulator tests, and classes aren’t recognized anymore. In that case, please update the imported “class” files, and start a new benchmark.
If still unable to get good emulator results, and you’re running out of ideas, don’t hesitate to contact us at firstname.lastname@example.org.
5. Downloading the NanoEdge AI Library¶
This feature is only available:
- in the Trial version of NanoEdge AI Studio, limited to the featured boards which can be selected during project creation;
- in the Paid version of NanoEdge AI Studio.
i. General case¶
In this step (Step: Deploy), the library will be compiled and downloaded, ready to be used on your microcontroller for your embedded application.
Before compiling the library, several compilation flags are available:
If you ran several benchmarks, make sure that the correct benchmark is selected. Then, when you are ready to download the NanoEdge AI Library, click Compile.
Select Development version to get a library that is intended for testing and prototyping. If you would like to start producing your device, integrating NanoEdge AI Library, please contact us for more details and to get the proper Library version.
After a short delay, a .zip file will be downloaded to your computer.
It contains all relevant documentation, the NanoEdge AI Emulator (both Windows and Linux versions), the NanoEdge AI header file (C and C++), a .json file containing some library details, and the model’s knowledge (for classification only).
You can also re-download any previously compiled library, via the archived libraries list:
In this final step (Step: Deploy) you also have the possibility to add a suffix to the library you’re about to compile and download.
This is useful if you’d like to integrate multiple libraries into the same device / code, when there is a need to:
- monitor several signal sources coming from different sensor types, concurrently, independently,
- train machine learning models and gather knowledge from these different input sources,
- take decisions based on the outputs of the machine learning algorithms for each signal type.
For instance, one library can be created for 3-axis vibration analysis, and suffixed
Later on, a second library can be created for 1-axis electric current analysis, and suffixed
All the NanoEdge AI functions in the corresponding libraries (as well as the header files, variables, and knowledge files if any) will be suffixed appropriately, and will be usable independently in your code. See below the header files and the suffixed functions and variables corresponding to this example:
For more info, please check the Library documentation (AD Library, Cl Library), as well as the code snippets on the right side of the screen, which provide general guidelines about how your code could be structured, and how the NanoEdge AI Library’s functions should be called.