SSL Detector: Objects Detection and Position Estimation at the RoboCup Small Size League (SSL)

Hi!

We made this project to enable our omnidirectional robot to execute soccer tasks autonomously:

In the RoboCup Small Size League (SSL), teams are encouraged to propose solutions for enabling robots to execute basic soccer tasks inside the field using only embedded sensing information for the Vision Blackout Challenge. We propose an embedded monocular vision approach for detecting objects and estimating relative positions inside the soccer field.

During soccer matches, and especially during the Vision Blackout challenge, SSL objects mostly lie on the ground. We exploit this prior knowledge to propose a monocular vision solution for detecting objects and estimating their positions relative to the robot:

  • Camera is fixed to the robot and its intrinsic and extrinsic parameters are obtained offline using calibration and pose computation techniques from the Open Computer Vision Library (OpenCV);
  • SSD MobileNet v2 is used for detecting objects’ 2D bounding boxes on camera frames;
  • Linear regression is applied to the bounding box’s coordinates, assigning a point on the field that corresponds to the object’s bottom center; the pre-calibrated camera parameters are then used to estimate that point’s position relative to the camera, and therefore to the robot;
  • The system achieves real-time performance, with an average processing speed of 30 frames per second at 10.8 watts of power consumption.
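The bottom-center step above can be sketched with standard pinhole-camera geometry: since the object sits on the ground plane, back-projecting its bottom-center pixel and intersecting the viewing ray with z = 0 recovers its position. This is only an illustrative sketch with toy calibration values (a camera 0.25 m above the ground looking straight down); the real `K`, `R`, `t` would come from OpenCV’s `calibrateCamera` / `solvePnP`.

```python
import numpy as np

def ground_position(u, v, K, R, t):
    """Back-project pixel (u, v) onto the ground plane z = 0.

    K: 3x3 intrinsics; R, t: world-to-camera extrinsics (x_cam = R @ x_world + t).
    Returns the 3D point on the ground in the world (robot) frame.
    """
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    d_world = R.T @ d_cam                             # ray direction, world frame
    c_world = -R.T @ t                                # camera center, world frame
    lam = -c_world[2] / d_world[2]                    # scale so the ray hits z = 0
    return c_world + lam * d_world

# Toy calibration: 700 px focal length, camera 0.25 m up, looking straight down.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0,   0.0,   1.0]])
R = np.diag([1.0, -1.0, -1.0])       # optical axis pointing at the floor
t = np.array([0.0, 0.0, 0.25])

p = ground_position(390, 240, K, R, t)  # bbox bottom-center, 70 px right of center
```

With these toy parameters the pixel maps to a point 0.025 m from the camera’s ground projection; on the robot the same math runs with the tilted, forward-facing extrinsics obtained offline.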

The following figure illustrates a scheme for the proposed method:

The Nano sends target positions and kicking commands to the robot’s microcontroller through a UDP socket, enabling it to execute basic soccer tasks autonomously:

  1. Grabbing a Stationary Ball:
    [stage1 demo video]

  2. Scoring on an Empty Goal:
    [stage2 demo video]
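The Nano-to-microcontroller link can be sketched in a few lines of Python. The packet layout, address, and port below are made up for illustration; the real firmware protocol is team-specific, and only the UDP send itself is what the post describes.

```python
import socket
import struct

# Hypothetical packet: target x, y (m), heading (rad), kick flag.
# Address/port are placeholders, not the robot's real endpoint.
ROBOT_ADDR = ("127.0.0.1", 9000)

def send_command(sock, x, y, theta, kick):
    """Pack a target pose plus kick flag and fire it at the microcontroller."""
    payload = struct.pack("<fff?", x, y, theta, kick)
    sock.sendto(payload, ROBOT_ADDR)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_command(sock, 1.2, 0.3, 0.0, False)  # drive to (1.2, 0.3), no kick
```

UDP fits here because command packets are small, frequent, and stale ones are worthless: dropping a packet is cheaper than retransmitting an outdated target.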

Code and Documentation are available at:

Our full paper presented at the 2022’s RoboCup Symposium can be found at arXiv: [2207.09851] An Embedded Monocular Vision Approach for Ground-Aware Objects Detection and Position Estimation

Thanks!
João


Very impressive!

I’m curious about the 3.5m limitation mentioned in your paper: “..detecting balls, robots, and goals with distances up to 3.5 meters.” What prevents the system from working for distances greater than 3.5 m?

Detecting balls at distances beyond 3.5 m is hard due to the ball’s size (42.67 mm diameter), which makes it too small in the image to detect reliably. Other objects are still detected, but their position estimates degrade with distance.

Thank you for that explanation. That makes sense.

Would a higher-resolution camera address this issue?

FWIW: I’m looking to build something similar, but unrelated to RoboCup. Orin Nano-based. The general idea is to work outdoors with a full-sized (#5) ball. You pass the ball in the general direction of the bot, it computes the trajectory and moves to block/intercept the ball (movement is roughly perpendicular to the path of the ball). It then positions itself to “punch” the ball back to the kicker. Essentially, a scaled-up version of what you’ve built, but likely with a different drive technology (e.g., Swerve drive instead of Omni wheel).
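One way to sketch the interception geometry described in that reply (all names and numbers are mine, not from the project): assume the ball travels in a straight line and move the robot to the closest point on that line, which is exactly the roughly-perpendicular intercept.

```python
import numpy as np

def intercept_point(ball_pos, ball_vel, robot_pos):
    """Closest point on the ball's (assumed straight) path to the robot.

    Inputs are 2D field coordinates in metres / metres-per-second.
    Projects the robot onto the ball's trajectory line, clamped so the
    robot never aims behind the ball's current position.
    """
    p = np.asarray(ball_pos, dtype=float)
    v = np.asarray(ball_vel, dtype=float)
    r = np.asarray(robot_pos, dtype=float)
    s = max(np.dot(r - p, v) / np.dot(v, v), 0.0)  # param along the path
    return p + s * v

target = intercept_point([0.0, 0.0], [1.0, 0.0], [3.0, 2.0])
```

A real blocker would also check that the robot can reach the point before the ball does (compare `s * |v|` against robot speed times distance), but the projection above is the core of the perpendicular move.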

I don’t think that solves the issue, because the CNN input resolution stays the same. Increasing the camera resolution alone would only increase computation time in the pre/post-processing steps. One thing you can try is decreasing the confidence threshold for ball detection; it might not be reliable, but it is the most straightforward option.

Also, pairing a higher camera resolution with a network that has a higher input resolution can help. Currently, I would suggest trying the different YOLOv8 variants, since deploying them on the Orin Nano with TensorRT is extremely easy right now.

Yes, the higher camera resolution would have to be paired with an increase in CNN resolution.

Currently looking at DeepStream SDK + YOLO.

Probably helps that the ball will be bigger, too :-) I want to be able to identify and track the ball from at least 20 m away (preferably 30).

Thinking about this some more, a #5 football/soccer ball is over 5 times the diameter of a golf ball, so in my scenario, your h/w+s/w “as is” should have a range close to 20 m.