AI Based Object Redaction/Examples/Library Examples

From RidgeRun Developer Wiki
In this section, we walk through an example of face redaction running on the GPU.

=== Backend ===
First, we create the backend object. The backend provides factories to create the redaction algorithm and the buffers, letting the user select the desired backend for executing the object redaction algorithm. The backend can be CPU or GPU.


<syntaxhighlight lang=cpp>
std::shared_ptr<rd::IBackend> backend = std::make_shared<rd::gpu::Backend>();
</syntaxhighlight>

If the input buffer is not already in GPU memory, we also need a CPU backend to allocate the buffer in CPU memory.

<syntaxhighlight lang=cpp>
std::shared_ptr<rd::IBackend> backend_cpu = std::make_shared<rd::cpu::Backend>();
</syntaxhighlight>


=== Get Algorithm ===
The <code>getAlgorithm</code> method is used to obtain the redaction algorithm that processes the input buffer.

<syntaxhighlight lang=cpp>
std::shared_ptr<rd::IRedaction> algorithm = backend->getAlgorithm();
</syntaxhighlight>

=== Get Model ===
The <code>getModel</code> method is used to obtain the AI model that detects the desired object; in this case, faces.

<syntaxhighlight lang=cpp>
std::shared_ptr<rd::IModel> model = backend->getModel(rd::Model::FACE_DETECTION);
</syntaxhighlight>

=== Buffers ===
Buffers are the structures used to manipulate and load the data corresponding to the video frames. A buffer consists of a resolution and a format.


==== Resolution ====
Resolution is a structure with two parameters: width and height. The resolution of the input video/image may differ from the resolution accepted by the AI model, so both resolutions must be created; the model resolution is 640x480 for the ONNX face detector.


<syntaxhighlight lang=cpp>
#define INPUT_WIDTH 1080
#define INPUT_HEIGHT 720
#define CONVERT_WIDTH 640
#define CONVERT_HEIGHT 480
rd::Resolution input_resolution = rd::Resolution(INPUT_WIDTH, INPUT_HEIGHT);
rd::Resolution convert_resolution = rd::Resolution(CONVERT_WIDTH, CONVERT_HEIGHT);
</syntaxhighlight>
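Purely to illustrate how the two resolutions relate, here is a standalone sketch (plain C++, not part of the rd API; the <code>Resolution</code> struct below is a stand-in) that maps an x coordinate from the 640x480 model frame back to the 1080x720 input frame:

<syntaxhighlight lang=cpp>
#include <cassert>

/* Stand-in for rd::Resolution, just for this illustration */
struct Resolution {
  int width;
  int height;
};

/* Map an x coordinate from one resolution's space to another's */
int scale_x(int x, const Resolution &from, const Resolution &to) {
  return x * to.width / from.width;
}

int main() {
  Resolution input{1080, 720};
  Resolution convert{640, 480};
  /* A point at x=320 in the 640-wide model frame lands at x=540
     in the 1080-wide input frame */
  assert(scale_x(320, convert, input) == 540);
  return 0;
}
</syntaxhighlight>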
==== Format ====
Format is an enumeration of the supported pixel formats: RGBA, RGB, GREY, and YUV. The format of the input video/image may differ from the format accepted by the AI model, so both formats must be created; the model format is RGB for the ONNX face detector.
<syntaxhighlight lang=cpp>
rd::Format format = rd::Format::RGB;
rd::Format input_format = rd::Format::YUY2; /* format of the input video/image */
</syntaxhighlight>
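For packed single-plane formats, the buffer size follows from width × height × bytes per pixel. The following standalone sketch (plain C++, not part of the rd API) shows the arithmetic behind the <code>INPUT_BPP</code> define used in the full example, where YUY2 packs 2 bytes per pixel and RGB 3:

<syntaxhighlight lang=cpp>
#include <cassert>
#include <cstddef>

/* Size in bytes of one packed, single-plane frame */
std::size_t frame_size(std::size_t width, std::size_t height,
                       std::size_t bytes_per_pixel) {
  return width * height * bytes_per_pixel;
}

int main() {
  /* A 1080x720 YUY2 input frame (2 bytes per pixel) */
  assert(frame_size(1080, 720, 2) == 1555200);
  /* A 640x480 RGB frame for the face detector (3 bytes per pixel) */
  assert(frame_size(640, 480, 3) == 921600);
  return 0;
}
</syntaxhighlight>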
==== Allocate Buffers ====
With the resolutions and formats defined, the buffer objects can be created.
With the CPU backend, create input and output buffers in CPU memory using the input video/image resolution and format. The input buffer must contain the image/frame data laid out as an array holding a pointer to each color plane of the data.
<syntaxhighlight lang=cpp>
std::shared_ptr<rd::io::IBuffer> input = backend_cpu->getBuffer(imageData, input_resolution, input_format);
std::shared_ptr<rd::io::IBuffer> output_final = backend_cpu->getBuffer(input_resolution, input_format);
</syntaxhighlight>
With the GPU backend, create input and output buffers in GPU memory using the input video/image resolution and format. For the AI model to work properly, also create a GPU buffer with the resolution and format the model supports.
<syntaxhighlight lang=cpp>
std::shared_ptr<rd::io::IBuffer> input_gpu = backend->getBuffer(input_resolution, input_format);
std::shared_ptr<rd::io::IBuffer> output = backend->getBuffer(input_resolution, input_format);
std::shared_ptr<rd::io::IBuffer> input_convert = backend->getBuffer(convert_resolution, format);
</syntaxhighlight>
When using the GPU, the input data must be uploaded from CPU memory to GPU memory. Use the <code>copyFromHost</code> method to upload the input buffer:
<syntaxhighlight lang=cpp>
input_gpu->copyFromHost(input);
</syntaxhighlight>


=== Redaction Algorithm ===
The Object Redaction library is composed of the following stages: convert, detect, track (optional), and redact. These stages can be performed in a single step using the apply method, or one at a time in a step-by-step process.
The library uses a vector of Rectangle structures to store the coordinates of the detected and tracked faces in an image; the redaction algorithm uses these coordinates to modify the output buffer. The vector must be initialized before performing the detect stage.
<syntaxhighlight lang=cpp>
std::vector<rd::Rectangle> faces;
</syntaxhighlight>
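The exact fields of <code>rd::Rectangle</code> are not documented here; assuming a typical x/y/width/height layout (an assumption, not the library's API), the following standalone sketch illustrates why detection rectangles are usually clamped to the frame bounds before redaction:

<syntaxhighlight lang=cpp>
#include <algorithm>
#include <cassert>

/* Stand-in for rd::Rectangle; the real field names may differ */
struct Rectangle {
  int x;
  int y;
  int width;
  int height;
};

/* Clamp a detection rectangle to the frame bounds so the redaction
   region never falls outside the image */
Rectangle clamp_to_frame(Rectangle r, int frame_w, int frame_h) {
  r.x = std::max(0, r.x);
  r.y = std::max(0, r.y);
  r.width = std::min(r.width, frame_w - r.x);
  r.height = std::min(r.height, frame_h - r.y);
  return r;
}

int main() {
  /* A detection spilling past the right edge of a 1080x720 frame */
  Rectangle r = clamp_to_frame({1000, 100, 200, 200}, 1080, 720);
  assert(r.width == 80);
  assert(r.height == 200);
  return 0;
}
</syntaxhighlight>

Whether the library performs this clamping internally is not stated here; the sketch only illustrates the concept.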
==== Step-by-step ====
* The first step is to preprocess the input image so it is accepted by the AI model.
<syntaxhighlight lang=cpp>
algorithm->convert(input_gpu, input_convert);
</syntaxhighlight>
* The second step is to detect the faces in the preprocessed image and save their coordinates in the vector of rectangles.
<syntaxhighlight lang=cpp>
algorithm->detect(model, input_convert, &faces);
</syntaxhighlight>
* The final step is to redact the detected faces at the given coordinates.
<syntaxhighlight lang=cpp>
algorithm->redact(input_gpu, output, faces, rd::RedactionAlgorithm::BLURRING);
</syntaxhighlight>
==== Apply method ====
To apply the redaction algorithm in a single step, use the apply method. This method takes the backend, the model, the input and output buffers, the vector of rectangles, and the redaction algorithm, performing the convert, detect, and redact stages in a single call.
<syntaxhighlight lang=cpp>
rd::IRedaction::apply(backend, model, input_gpu, output, &faces, rd::RedactionAlgorithm::BLURRING);
</syntaxhighlight>
=== Download buffer to CPU memory ===
When using the GPU, the output buffer must be copied back to CPU memory. Use the <code>copyToHost</code> method to download the output buffer:
<syntaxhighlight lang=cpp>
output->copyToHost(output_final);
</syntaxhighlight>
The final output buffer contains the modified image in which the detected faces have been redacted.
=== Full example ===
The full example script should look like:
<syntaxhighlight lang=cpp>
#include "rd/common/datatypes.hpp"
#include "rd/common/ibackend.hpp"
#include "rd/common/ibuffer.hpp"
#include "rd/common/imodel.hpp"
#include "rd/common/runtimeerror.hpp"
/*Backend*/
#include "cpu/backend.hpp"
#include "gpu/backend.hpp"
/*io*/
#include "cpu/onnxfacedetect.hpp"
#include "cpu/redaction.hpp"
#include "gpu/onnxfacedetect.hpp"
#include "gpu/redaction.hpp"
#include "rd/common/ivideoinput.hpp"
#include "rd/io/v4l2/v4l2capture.hpp"
#include <unistd.h>
#include <fstream>
#include <iostream>
#include <memory>
#include <string>
#define INPUT_WIDTH 1080
#define INPUT_HEIGHT 720
#define INPUT_BPP 2
#define CONVERT_WIDTH 640
#define CONVERT_HEIGHT 480
static void save_buffer(std::shared_ptr<rd::io::IBuffer> buffer,
                        std::string name) {
  /*Save the buffer*/
  std::vector<unsigned char*> data = buffer->data();
  size_t size = buffer->stride()[0] * buffer->size().height;
  FILE* file = fopen(name.c_str(), "wb");
  fwrite(data[0], size, 1, file);
  fclose(file);
}
int main() {
  /* Open the image file using fstream. SEVEN_FACES holds the path to the
     raw input image and is expected to be defined elsewhere (for example,
     at build time). */
  std::ifstream file(SEVEN_FACES, std::ios::binary);
  if (!file.is_open()) {
    std::cerr << "Error: Unable to open the image file." << std::endl;
    return -1;
  }
  /* Expected size of one 1080x720 YUY2 frame (2 bytes per pixel) */
  int file_size = INPUT_WIDTH * INPUT_HEIGHT * INPUT_BPP;
  /* Read the image data into a vector */
  unsigned char* data_ptr = new unsigned char[file_size];
  std::vector<unsigned char*> imageData;
  file.read(reinterpret_cast<char*>(data_ptr), file_size);
  imageData.push_back(data_ptr);
  /* Create GPU Backend */
  std::shared_ptr<rd::IBackend> backend = std::make_shared<rd::gpu::Backend>();
  /* Create CPU backend to save the final image */
  std::shared_ptr<rd::IBackend> backend_cpu =
      std::make_shared<rd::cpu::Backend>();
  /* Get Algorithm */
  std::shared_ptr<rd::IRedaction> algorithm = backend->getAlgorithm();
  std::shared_ptr<rd::IModel> model =
      backend->getModel(rd::Model::FACE_DETECTION);
  /* Buffers */
  rd::Resolution input_resolution = rd::Resolution(INPUT_WIDTH, INPUT_HEIGHT);
  rd::Resolution convert_resolution =
      rd::Resolution(CONVERT_WIDTH, CONVERT_HEIGHT);
  rd::Format format = rd::Format::RGB;
  rd::Format input_format = rd::Format::YUY2;
  /* Allocate Buffers */
  std::shared_ptr<rd::io::IBuffer> input =
      backend_cpu->getBuffer(imageData, input_resolution, input_format);
  std::shared_ptr<rd::io::IBuffer> output_final =
      backend_cpu->getBuffer(input_resolution, input_format);
  std::shared_ptr<rd::io::IBuffer> input_convert =
      backend->getBuffer(convert_resolution, format);
  std::shared_ptr<rd::io::IBuffer> output =
      backend->getBuffer(input_resolution, input_format);
  std::shared_ptr<rd::io::IBuffer> input_gpu =
      backend->getBuffer(input_resolution, input_format);
  input_gpu->copyFromHost(input);
  /* Preprocess image to be accepted by the face detection model */
  algorithm->convert(input_gpu, input_convert);
  /* Detect Faces */
  std::vector<rd::Rectangle> faces;
  algorithm->detect(model, input_convert, &faces);
  /* Print out detected faces */
  std::cout << faces.size() << std::endl;
  for (size_t i = 0; i < faces.size(); i++) {
    std::cout << faces[i] << std::endl;
  }
  /* Redact detected faces */
  algorithm->redact(input_gpu, output, faces, rd::RedactionAlgorithm::BLURRING);
  /* Download buffer to CPU memory */
  output->copyToHost(output_final);
  /* Save redacted image */
  save_buffer(output_final, "output_final");
  delete[] data_ptr;
  std::cout << "Exit!!" << std::endl;
  return 0;
}
</syntaxhighlight>
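The <code>save_buffer</code> helper above writes the first plane of the buffer to disk as raw bytes. The same pattern can be sketched standalone (plain C++, no rd types), which also shows how the byte count follows from stride × height:

<syntaxhighlight lang=cpp>
#include <cassert>
#include <cstdio>
#include <vector>

/* Write `size` bytes of raw pixel data to `name`; returns bytes written */
static std::size_t save_raw(const unsigned char *data, std::size_t size,
                            const char *name) {
  std::FILE *file = std::fopen(name, "wb");
  if (!file) return 0;
  std::size_t written = std::fwrite(data, 1, size, file);
  std::fclose(file);
  return written;
}

int main() {
  /* A dummy 4x2 GREY frame (1 byte per pixel, stride == width) */
  std::vector<unsigned char> frame(4 * 2, 0x80);
  assert(save_raw(frame.data(), frame.size(), "frame.raw") == 8);
  std::remove("frame.raw"); /* clean up the test file */
  return 0;
}
</syntaxhighlight>

The resulting file is headerless raw video; its dimensions and format must be supplied to whatever tool reads it back.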


<noinclude>
{{AI Based Object Redaction/Foot|Examples|Examples/GStreamer Pipelines}}
</noinclude>

Revision as of 23:18, 29 December 2023

