Coloring Module Concepts

Overview

RGB cameras provide a source of color information. Although the Stereolabs ZED cameras also have stereoscopic capability, the pipeline avoids any depth processing due to latency concerns. Instead, the cameras are used purely for color information.

../../_images/zed2.png

Stereolabs ZED2 Camera used on 24a

Goal of the Coloring Module

The Coloring Module applies RGB color information from the cameras to classify, by color, the cones coming in from the LiDAR Module. Color classification delineates the sides of the track and allows Path Planning and Controls to do higher-level processing.

Info

Input: a set of uncolored cone centroids.

  • /cpp_cones

    • interfaces::msg::PPMConeArray

Output: a set of colored cone centroids passed down the pipeline to Path Planning and Controls.

  • /perc_cones

    • interfaces::msg::ConeArray
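
A minimal sketch of the node's I/O wiring, assuming a Python rclpy node (illustrative only; the real implementation and the exact field layouts of the team-specific messages may differ):

import rclpy
from rclpy.node import Node

from interfaces.msg import PPMConeArray, ConeArray  # team-specific message types


class ColoringNode(Node):
    def __init__(self):
        super().__init__('coloring_node')
        # Uncolored cone centroids in from the LiDAR Module.
        self.sub = self.create_subscription(
            PPMConeArray, '/cpp_cones', self.on_cones, 10)
        # Colored cones out to Path Planning and Controls.
        self.pub = self.create_publisher(ConeArray, '/perc_cones', 10)

    def on_cones(self, msg: PPMConeArray):
        colored = ConeArray()  # populated by the point-to-pixel algorithm below
        self.pub.publish(colored)


def main():
    rclpy.init()
    rclpy.spin(ColoringNode())
    rclpy.shutdown()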

Algorithm

../../_images/coloring_algo_diagram.svg

Algorithm diagram

Simplified Point to Pixel Mapping Algorithm

for each (set of cone centroids, camera image) pair do:
    run YOLO v5 cone detection inference to get bounding boxes of different color/size classes

    for each cone centroid do:
        transform the centroid to image space via the transform matrix (see Direct Linear Transform below)

        if the point lies within exactly one bounding box:
            classify the cone as that box's color

        else if the point lies within multiple boxes:
            use a rough depth heuristic to pick one box

        else:
            label the cone as unknown
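
A minimal Python sketch of the inner loop above, assuming the centroid has already been projected to pixel coordinates (see Direct Linear Transform below). The (x1, y1, x2, y2, color) box layout and helper names are illustrative, not the actual implementation:

def boxes_containing(u, v, boxes):
    # Every YOLO box whose extent contains pixel (u, v).
    return [b for b in boxes if b[0] <= u <= b[2] and b[1] <= v <= b[3]]


def classify_centroid(u, v, boxes, lidar_depth):
    hits = boxes_containing(u, v, boxes)
    if len(hits) == 1:
        return hits[0][4]          # unambiguous: take that box's color class
    if len(hits) > 1:
        # Ambiguous: fall back to the rough depth heuristic (sketched under Notes).
        return pick_box_by_depth(hits, lidar_depth)[4]
    return 'unknown'               # the projected point missed every box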

Notes

  • Point to Pixel Mapping assumes that the cameras remain rigid with respect to the LiDAR at all times.

  • Our YOLO v5 model is trained on data from the Formula Student Objects in Context (FSOCO) dataset.

  • The depth heuristic relies on the observation that bounding-box area roughly corresponds to depth: closer cones occupy larger boxes (a sketch follows this list).
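
One plausible form of that heuristic, as a sketch only (the reference constants are hypothetical, not measured values): box area falls off roughly as 1/depth², so each candidate box implies a depth, and we keep the box whose implied depth best matches the LiDAR range to the centroid.

A_REF, D_REF = 4000.0, 5.0   # hypothetical reference: a 4000 px^2 cone box at 5 m


def pick_box_by_depth(boxes, lidar_depth):
    # area ~ 1/depth^2  =>  implied depth = D_REF * sqrt(A_REF / area)
    def implied_depth(box):
        x1, y1, x2, y2 = box[:4]
        area = max((x2 - x1) * (y2 - y1), 1e-6)
        return D_REF * (A_REF / area) ** 0.5

    return min(boxes, key=lambda b: abs(implied_depth(b) - lidar_depth))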

Direct Linear Transform (DLT)

What is DLT?

Direct Linear Transform (DLT) is the technique we use to solve for the static transform matrix from LiDAR to camera space. It consists of a calibration procedure that uses a series of (at least six) points identified by hand in both the LiDAR and camera frames, together with a least-squares formulation that solves for the matrix.
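
For reference, the standard homogeneous DLT solve as a numpy sketch (the textbook formulation, not necessarily our exact calibration code): each 3D-2D correspondence contributes two linear equations in the twelve entries of the 3x4 matrix, which has eleven degrees of freedom up to scale, hence the six-point minimum.

import numpy as np


def solve_dlt(points_3d, points_2d):
    # points_3d: (N, 3) LiDAR-frame points; points_2d: (N, 2) pixels; N >= 6.
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    # Least-squares solution of A p = 0 with ||p|| = 1: the right singular
    # vector associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 4)


def project(P, xyz):
    # Map a LiDAR point into pixel coordinates (homogeneous divide).
    x = P @ np.append(xyz, 1.0)
    return x[:2] / x[2]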

Why use DLT?

It is insufficient to use geometric approaches, e.g., measuring or CAD, to estimate the static transformation matrix from camera to LiDAR. This is due to a multitude of reasons, but primarily to the discrepancy between the designed and the as-built sensor placement. DLT instead allows an accurate transform to be calculated via calibration.

../../_images/ppm_calibration.JPG

Figure: calibration setup consisting of many cones spread throughout the scene at different heights and depths

Warning

This method is heavily dependent on a good calibration. If the sensors move relative to each other, or if the calibration points were not picked at a variety of depths and heights, accuracy drops off steeply.