i.MX8 Deep Learning Reference Designs - High Level Design Project Architecture
This page gives an overview of the project's main functionality and of how the project is structured from a design point of view.
The following diagram shows a high-level overview of the system architecture:
Main Framework
This subsystem encapsulates all the modules that constitute the general framework of the project. Each framework component is independent of any specific technology or application, so these modules can be reused regardless of the context in which the reference design is used. This main infrastructure drives the application state and logic and may be reused in a wide variety of AI-powered applications without modification. The framework is made up of the following modules:
Camera Capture
This subsystem manages the entities that receive the input data, which can be a live video stream or a video file. By design, this module depends neither on a specific camera nor on the transmission protocol used to deliver the data. Regardless of the custom module used, Camera Capture manages the flow of information by interacting with entities called media. Each media hides its specific implementation behind a common interface and provides the basic control operations: create, start, stop, and delete.
Camera Capture provides customization options when creating media entities. As long as the behavior established by the interface is respected, a media can be based on different multimedia frameworks and libraries, such as GStreamer or OpenCV, among others.
Camera Capture is also responsible for handling the errors that may appear during data transmission. If a failure is detected, the module executes the actions needed to keep the system operating stably.
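As an illustration only, the relationship between Camera Capture and its media could be sketched as follows. The class names (Media, FakeMedia, CameraCapture), the method signatures, and the recovery behavior are assumptions made for this example and are not taken from the project's code; the only elements from the text above are the four control operations and the idea that Camera Capture handles failures.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Media(ABC):
    """Hypothetical media interface exposing the four control operations."""

    name: str

    @abstractmethod
    def create(self) -> None: ...

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def delete(self) -> None: ...


class FakeMedia(Media):
    """Stand-in media that only tracks its state; it handles no real video."""

    def __init__(self, name: str, fail_on_start: bool = False) -> None:
        self.name = name
        self.state = "new"
        self.fail_on_start = fail_on_start

    def create(self) -> None:
        self.state = "created"

    def start(self) -> None:
        if self.fail_on_start:
            raise RuntimeError(f"{self.name}: could not start stream")
        self.state = "playing"

    def stop(self) -> None:
        self.state = "stopped"

    def delete(self) -> None:
        self.state = "deleted"


class CameraCapture:
    """Drives any Media through the interface only, never its internals."""

    def __init__(self) -> None:
        self.medias: Dict[str, Media] = {}
        self.failed: List[str] = []

    def add_media(self, media: Media) -> None:
        media.create()
        self.medias[media.name] = media

    def start_all(self) -> None:
        for media in self.medias.values():
            try:
                media.start()
            except RuntimeError:
                # Assumed recovery policy: stop the failing media and
                # keep the remaining ones running.
                media.stop()
                self.failed.append(media.name)
```

Because CameraCapture only calls create, start, stop, and delete, a GStreamer-based or OpenCV-based media could replace FakeMedia without changes to the capture module.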
AI Manager
This module receives the data coming from Camera Capture and executes the AI operations needed to obtain useful information from it. As part of its responsibilities, the AI Manager interacts with entities called engines, which abstract specific implementations of how to perform AI video analysis to infer business rules, depending on the current context of the application. Each engine provides methods to manage the flow of processed information through basic operations such as create, start, stop, and delete.
The implementation of the engines is based on the R2Inference and GstInference frameworks, which provide a set of tools to perform AI operations during streaming analysis. These R2Inference and GstInference implementations are part of the project's framework, so the module is reusable and independent of any custom applications that need to use its services.
Although the AI Manager performs AI operations, it is not responsible for executing actions based on the analyzed information. The output of this module is sent to the Action Dispatcher component. Even so, the AI Manager does monitor the flow of results while the engines process the streams, through a component called the Inference Listener. The Inference Listener provides a well-defined interface that abstracts its implementations, so it does not depend on any specific technology or protocol to handle that information.
Action Dispatcher
This block executes specific actions on the media entities defined by the user, based on the inference information received from the internal engine, which in turn depends on the inference model used. It relies on triggers to determine whether the actions need to be executed, based on the provided policies, which work as filters. In other words, this block takes the results of the policies evaluated by the triggers and, depending on those results, executes the actions defined by the user.
Trigger
This component checks the inference information received. If the inference information complies with the policies, this block executes the actions. Policies, actions, and triggers are set up by the user in the application, which allows custom configuration of the data that will be processed. A trigger is composed of groups of policies and actions, so each stream source can have a different behavior.
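The relationship between triggers, policies, and actions described above could look roughly like the following sketch. The Trigger class, the process method, and the dictionary-based prediction format are hypothetical names chosen for this example. The rule that every policy must pass before the actions run is also an assumption, since the text only says the information must comply with the policies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a prediction is a plain dictionary, a policy is a
# filter over it, and an action is a side effect applied to it.
Prediction = Dict[str, object]
Policy = Callable[[Prediction], bool]
Action = Callable[[Prediction], None]


@dataclass
class Trigger:
    """Groups policies and actions (illustrative, not the project's class)."""

    name: str
    policies: List[Policy] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)

    def process(self, prediction: Prediction) -> bool:
        # Assumption: all policies must accept the prediction.
        if not all(policy(prediction) for policy in self.policies):
            return False
        for action in self.actions:
            action(prediction)
        return True


# Usage: one trigger per stream source allows different behavior per source.
events: List[Prediction] = []
person_alert = Trigger(
    name="person_alert",
    policies=[
        lambda p: p["label"] == "person",
        lambda p: p["score"] >= 0.5,
    ],
    actions=[events.append],
)
person_alert.process({"label": "person", "score": 0.9})  # actions run
person_alert.process({"label": "car", "score": 0.9})     # filtered out
```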
Config Parser
This module is responsible for building the configuration that the application will use, by loading the project configuration information. The user is in charge of setting up the necessary information such as policies, actions, triggers, and source stream information, depending on the desired application behavior.
Decoupling Interfaces
Interface modules are responsible for establishing the connection between framework components and custom application modules. The existence of these boundaries is important to avoid mixing application-specific business logic with common infrastructure. In addition, this design allows the project to decouple and rearrange components, without affecting its functionality.
Custom Application
As seen in the diagram, custom application blocks represent any type of implementation or specific technology that can be included in the system design. The diagram presents some examples; however, the i.MX8 Deep Learning Reference Designs project is not limited to those particular modules. The possibility of extending the design and incorporating specific business rules for each application is what gives this project a high degree of flexibility.
The following sections explain which modules can include a custom implementation, with code added by the user, and give examples of the technologies that can be used to build these components.
Custom Media
This component represents the entity in charge of transporting the data received from the camera. The goal is that the Camera Capture module can use every media instance, regardless of how the video frames are processed. For example, a media could receive data through the RTSP protocol, which can receive and control the flow of both audio and video in a synchronized manner and in real time. This type of communication requires a connection between the host server, which transmits the information, and the media that captures it. In a restricted zone application, for instance, this type of camera is a popular choice because it integrates flexibly with CCTV systems.
In addition, if the cameras in use expose well-defined, hardware-specific interfaces, the user can add a compatible custom media. One example is GigE Vision, which is used to transmit video from high-performance industrial cameras and suits applications that require strict monitoring, such as those in the aerospace industry. Another option is the MIPI Camera Serial Interface (CSI), an interface widely used in embedded systems for communication between digital cameras and target processors, to run tasks on the edge. As long as the media module interface is respected, many specific implementations can be used to process the video frames without affecting the behavior of the general Camera Capture module.
Custom Deep Learning Models
The inference logic is also encapsulated in a separate, independent module, called the engine, with a well-established interface. This module bases its operation on the R2Inference and GstInference frameworks and can be configured to use different inference models according to the application being developed. For instance, a restricted zone system could use a network that detects people.
This configuration will vary from application to application. A speed limit enforcer will likely use a car detector and a tracker. A neuromarketing-powered billboard will use a face detector and a gaze tracker. As you can see, having the inference logic in an independent module allows you to highly customize your deep learning pipeline without modifying the rest of the architecture.
Custom Inference Listener
The inference listener component is responsible for transmitting the metadata obtained at the output of the neural networks used in the GstInference pipeline. In a system that detects pedestrians trespassing in restricted areas, the inference metadata will contain information about the people detected around the area of interest in the center of the zone. If the application is a parking lot security system, the inference results will contain essential data about the detected vehicle and its license plate.
This component therefore has the task of transmitting the metadata as it is obtained, in real time. An example of this type of component could be the GStreamer Daemon Python API, which allows listening to the target pipeline signal and is integrated into the application flow. However, since GStreamer Daemon is based on the GStreamer framework, users are free to create their own custom inference listener element, which can be added to the media pipelines, or to use another component that is already developed.
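To make the listener's role concrete, the sketch below shows one possible shape for such an interface. Everything in it is hypothetical: the InferenceListener and ReplayListener classes, the register_callback method, and the JSON metadata layout are assumptions for illustration. It deliberately replays canned messages instead of calling GStreamer Daemon, so it does not depict that library's API.

```python
import json
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

Metadata = Dict[str, object]
Callback = Callable[[Metadata], None]


class InferenceListener(ABC):
    """Hypothetical listener interface: forwards metadata to callbacks."""

    def __init__(self) -> None:
        self._callbacks: List[Callback] = []

    def register_callback(self, callback: Callback) -> None:
        self._callbacks.append(callback)

    def _notify(self, metadata: Metadata) -> None:
        for callback in self._callbacks:
            callback(metadata)

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...


class ReplayListener(InferenceListener):
    """Replays pre-recorded JSON strings instead of listening to a pipeline."""

    def __init__(self, messages: List[str]) -> None:
        super().__init__()
        self._messages = messages
        self._running = False

    def start(self) -> None:
        self._running = True
        for raw in self._messages:
            if not self._running:
                break
            # Assumed format: each message is one JSON object.
            self._notify(json.loads(raw))

    def stop(self) -> None:
        self._running = False


# Usage: the consumer only sees parsed metadata, not its transport.
received: List[Metadata] = []
listener = ReplayListener(
    ['{"media": "cam0", "predictions": [{"label": "person", "score": 0.9}]}']
)
listener.register_callback(received.append)
listener.start()
```

A listener backed by a real pipeline would implement the same start and stop methods and call _notify whenever new metadata arrives.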
Custom Policy
Policies represent the business rules that make your application unique. They take the predictions made by the inference process and make informed decisions based on them. How each decision is made is left entirely to the user. The important point is that you can add or remove as many policies as you want without affecting the general behavior of the system.
In the example of detecting pedestrians trespassing in restricted areas, the business rules receive a raw prediction containing a list of objects and their locations around the area of interest. These business rules check whether the predicted object is a person, whether the prediction exceeds a threshold, and whether the person is inside the restricted zone.
Take, for example, a parking lot system. You can implement several business rules to process the predictions. For instance, to minimize the probability of an erroneous license plate read, you can implement a low-pass filter technique that only reports a successful read after receiving N matching sequential predictions. Another rule could report a car detected at the entrance, at the exit, or in any stall only once. Yet another could flag a vehicle moving between stalls as suspicious.
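A minimal sketch of such a restricted zone policy is shown below. The Box and Detection types, the in_restricted_zone function, and the field names are hypothetical. The sketch also assumes that the threshold applies to a confidence score and that a person counts as inside the zone when the center of the bounding box falls within it; neither detail is specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical type)."""

    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass(frozen=True)
class Detection:
    """One predicted object (hypothetical layout)."""

    label: str
    score: float
    box: Box


def in_restricted_zone(det: Detection, zone: Box, threshold: float = 0.5) -> bool:
    """Apply the three checks: is a person, above threshold, inside zone."""
    if det.label != "person":
        return False
    if det.score <= threshold:
        return False
    # Assumption: use the bounding box center as the person's location.
    center_x = det.box.x + det.box.width / 2
    center_y = det.box.y + det.box.height / 2
    return zone.contains(center_x, center_y)


# Usage with made-up coordinates.
zone = Box(100, 100, 200, 200)
intruder = Detection("person", 0.9, Box(150, 150, 50, 100))
passerby = Detection("person", 0.9, Box(0, 0, 50, 50))
alerts = [d for d in (intruder, passerby) if in_restricted_zone(d, zone)]
```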
As you can see, business rules take predictions as inputs and maintain a running state of the system to perform different tasks accordingly.
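The idea of a policy that keeps running state can be illustrated with the license plate rule from the parking lot example. The sketch below uses a simple consecutive-match counter as one possible way to realize the "N matching sequential predictions" rule. The ConsecutiveReadFilter class and its update method are hypothetical names, and reporting each confirmed plate only once per run of matches is an assumption of this example.

```python
from typing import Optional


class ConsecutiveReadFilter:
    """Reports a plate only after N identical reads in a row."""

    def __init__(self, required: int) -> None:
        if required < 1:
            raise ValueError("required must be at least 1")
        self.required = required
        self._last: Optional[str] = None
        self._count = 0

    def update(self, plate: str) -> Optional[str]:
        """Feed one prediction; return the plate when it becomes confirmed."""
        if plate == self._last:
            self._count += 1
        else:
            # A different read restarts the run.
            self._last = plate
            self._count = 1
        # Report exactly when the run reaches N, so further identical
        # reads in the same run are not reported again.
        if self._count == self.required:
            return plate
        return None


# Usage: a single misread ("ABC128") resets the run of matching reads.
plate_filter = ConsecutiveReadFilter(required=3)
reads = ["ABC123", "ABC123", "ABC128", "ABC123", "ABC123", "ABC123", "ABC123"]
confirmed = [plate_filter.update(read) for read in reads]
```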
Custom Action
This type of module represents the actions that are executed once the business rules established by the policies are activated. As with the previous component, users have complete freedom over what type of actions they want to implement and what tools they use to develop them.
It is important to mention that different policies will trigger different actions. In a restricted zone detector example, you can have several policy-action relationships like:
- If a person enters the restricted zone, a picture is taken
- If a person is detected in the restricted zone, the event is logged into a database
- If a person is detected in the restricted zone, the event is displayed in a dashboard to the user
Most of these actions can be reused in other applications. Again, because actions are decoupled into independent modules behind the action interface, they can be highly customized without interfering with the overall architecture.
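Two of the actions listed above could be sketched as follows. The Action interface, its execute method, the event dictionary layout, and the table schema are assumptions made for this example. An in-memory SQLite database and a list of strings stand in for a real database and dashboard.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, List

Event = Dict[str, str]


class Action(ABC):
    """Hypothetical action interface with a single entry point."""

    @abstractmethod
    def execute(self, event: Event) -> None: ...


class LogToDatabase(Action):
    """Stores each event in a SQLite table (schema is illustrative)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS events (media TEXT, description TEXT)"
        )

    def execute(self, event: Event) -> None:
        self._conn.execute(
            "INSERT INTO events VALUES (?, ?)",
            (event["media"], event["description"]),
        )
        self._conn.commit()


class ShowOnDashboard(Action):
    """Stand-in dashboard that keeps formatted lines in memory."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def execute(self, event: Event) -> None:
        self.lines.append(f"[{event['media']}] {event['description']}")


# Usage: the same event fans out to every action attached to a policy.
connection = sqlite3.connect(":memory:")
dashboard = ShowOnDashboard()
actions: List[Action] = [LogToDatabase(connection), dashboard]
event = {"media": "cam0", "description": "person in restricted zone"}
for action in actions:
    action.execute(event)
```

Since neither action knows anything about restricted zones, both could be attached to policies of a different application unchanged.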
Custom Config
The config module defines how the general configuration of the project is loaded prior to execution. The user can decide whether the configuration is obtained from a file, from the command line, or over the network. Each custom implementation must define the parameters it considers necessary, such as URLs, media identifiers, policies, actions, paths, and any other required parameters.
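As a rough illustration, a file-based config implementation might parse and validate a JSON document like the one below. The schema (the medias, policies, actions, and triggers sections and their fields), the parse_config function, and the example URI are all hypothetical; the project's actual configuration format may differ.

```python
import json
from typing import Any, Dict

# Hypothetical top-level sections of the configuration.
REQUIRED_SECTIONS = ("medias", "policies", "actions", "triggers")

EXAMPLE_CONFIG = """
{
  "medias": {"cam0": {"uri": "rtsp://example.invalid/stream"}},
  "policies": {"person_in_zone": {"label": "person", "threshold": 0.5}},
  "actions": {"log_event": {"type": "log"}},
  "triggers": [
    {"name": "zone_alert", "media": "cam0",
     "policies": ["person_in_zone"], "actions": ["log_event"]}
  ]
}
"""


def parse_config(text: str) -> Dict[str, Any]:
    """Parse JSON text and check that triggers reference known entries."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_SECTIONS if key not in config]
    if missing:
        raise ValueError(f"missing configuration sections: {', '.join(missing)}")
    for trigger in config["triggers"]:
        if trigger["media"] not in config["medias"]:
            raise ValueError(f"unknown media: {trigger['media']}")
        for section in ("policies", "actions"):
            for name in trigger[section]:
                if name not in config[section]:
                    raise ValueError(f"unknown entry in {section}: {name}")
    return config


config = parse_config(EXAMPLE_CONFIG)
```

The text could equally come from the command line or a network request, because parse_config only depends on the string it receives.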