Introduction to Hand Gesture Recognition Project
This article focuses on a hand gesture recognition project using Python and various libraries. The main goal is to enable the computer to recognize different hand gestures through the camera and perform corresponding operations.
Setting Up the Project in VS Code
- Create a New File: Open Visual Studio Code and create a new file named `Hand_Pos.py`.
- Import Libraries:
  - Import `cv2`, the OpenCV module, which provides the image-processing functions.
  - Import `mediapipe`, the model suite that provides advanced functions such as hand detection and face recognition.
Initializing the Camera
- Start the Camera: Use `cv2.VideoCapture(0)` to start the computer's default camera. The parameter `0` represents the first camera device. Assign this camera object to the variable `cap`.
  - If you want to use a second camera, change the parameter `0` to `1` or `2`.
- Set Up a Loop to Capture Frames:
  - Create a `while` loop to continuously capture frames from the camera.
  - Inside the loop, use `cap.read()` to read a frame. It returns two values: `ret` (whether the frame was read successfully) and `img` (the captured frame data).
  - If `ret` is `False` (e.g., the camera is not connected or the device is malfunctioning), print an error message and exit the loop.
- Display the Frame: Use `cv2.imshow()` to display the frame in a window named "Joyous img".
- Check for Keyboard Input: Use `cv2.waitKey(1)` to check for keyboard input every 1 millisecond. If the user presses the 'q' key (make sure the input method is set to English), terminate the loop and end the program.
Adding MediaPipe Hand Detection
- Get the Hand Detection Tool: Through the `mp.solutions.hands` module, get the MediaPipe hand detection tool and name it `mp_hands`.
- Create a Hand Detection Model Workspace:
  - Use the `with` syntax to create a workspace for the hand detection model.
  - Pass several parameters to set the model's behavior:
    - `model_complexity=0` selects a simpler model suitable for devices with limited performance.
    - `max_num_hands=1` to detect one hand initially.
    - `min_detection_confidence` (range 0-1) is the confidence the model requires before it decides something is a hand.
    - `min_tracking_confidence` (range 0-1) is the stability requirement for key-point tracking after a hand is detected.
- Convert the Image Format: The camera reads images in BGR format, but MediaPipe needs RGB, so convert the image to RGB and store it in the variable `img2`.
- Process the Image with the Model: Pass the image to the `hands.process()` function. It analyzes the image for hands and detects the coordinates of 21 hand key points, which are stored in the `results` variable.
- Print Key Point Information: Print `results.multi_hand_landmarks` to see the key-point information. When there is no hand in the frame, the result is `None`. When a hand is present, a set of x, y, z coordinates is displayed (x is the horizontal axis, y is the vertical axis, and z is the depth).
Drawing Hand Key Points on the Image
- Import Drawing Tools: Import MediaPipe's drawing utilities and drawing styles modules. The drawing utility draws the nodes and skeleton, and the styles module changes their colors and shapes.
- Draw Key Points and Skeleton:
  - Use an `if` statement to check whether a hand is detected. If so, use a `for` loop over each detected hand and its 21 key points.
  - Call the `draw_landmarks()` function with five parameters:
    - `img`: the image on which to draw.
    - `hand_landmarks`: the 21 key-point positions on the hand.
    - `HAND_CONNECTIONS`: the connections used to draw the hand skeleton.
  - Optionally, adjust the fourth and fifth parameters to change the size, color, and transparency of the nodes and the color and thickness of the lines.
Adjusting MediaPipe Hand Detection Parameters
- Max Number of Hands:
  - Initially, `max_num_hands` is set to 1, so when two hands are raised, only one is detected.
  - Change it to 2 and both hands are detected. Adjust this according to the application scenario.
- Hand Detection Confidence:
  - Set `min_detection_confidence` to 1.0 and the system becomes too strict; even a real hand may not be detected.
  - Set it to 0.1 and the detection sensitivity increases, but there is a higher chance of false positives, such as detecting a non-hand object (e.g., a toy) as a hand.
- Key Point Tracking Confidence:
  - Set `min_tracking_confidence` to 1.0 and, even when a hand is detected, key points may not be displayed.
  - Set it to 0.1 and key points appear more readily, but tracking may be unstable or incorrect.
Implementing Hand Gesture Recognition
- Principle of Hand Gesture Recognition:
  - Observe how much each finger is curled to determine the gesture. For example, for the middle finger, calculate the angle between two vectors formed by its key points to determine whether it is straight or bent.
- Writing the Angle Calculation Function:
  - Define a function `vector_2d_angle()` to calculate the angle between two 2D vectors.
  - Use the dot-product formula to calculate the angle, and handle division by zero by setting the angle to 180 degrees.
- Getting Finger Skeleton Vectors:
  - Define a function `hand_angle()` to get the vectors of the finger skeletons.
  - For each finger, build two vectors from specific key points and use `vector_2d_angle()` to calculate the angle between them. Store the angles of all five fingers in a list and return it.
- Defining Hand Gesture Judging Conditions:
  - Define a function `hand_pos()` to judge the hand gesture from the finger angles.
  - Set a judgment standard: a finger whose angle is less than 50 degrees is considered straight; 50 degrees or more means it is bent.
  - Write logical judgment statements for different hand gestures, such as 'good' (thumb straight, other fingers bent), 'no!!!' (only the middle finger straight), 'ROCK!' (thumb, index finger, and little finger straight), '0' (all fingers bent), and '5' (all fingers straight).
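Putting the three functions together, a minimal sketch of the angle and gesture logic might look like this. The 50-degree threshold and gesture names follow the description above; the exact key-point index pairs are an assumption based on MediaPipe's 21-point hand layout:

```python
import math

def vector_2d_angle(v1, v2):
    """Angle in degrees between two 2D vectors, via the dot-product formula."""
    try:
        angle = math.degrees(math.acos(
            (v1[0] * v2[0] + v1[1] * v2[1]) /
            (math.hypot(v1[0], v1[1]) * math.hypot(v2[0], v2[1]))))
    except (ZeroDivisionError, ValueError):
        angle = 180  # zero-length vector: treat the finger as fully bent
    return angle

def hand_angle(hand_):
    """hand_ holds 21 (x, y) key points; returns one angle per finger."""
    angle_list = []
    # two skeleton vectors per finger: wrist-to-knuckle and fingertip segment
    # (these index pairs are an assumption about the 21-point layout)
    for a, b, c, d in ((0, 2, 3, 4),     # thumb
                       (0, 6, 7, 8),     # index finger
                       (0, 10, 11, 12),  # middle finger
                       (0, 14, 15, 16),  # ring finger
                       (0, 18, 19, 20)): # little finger
        v1 = (hand_[a][0] - hand_[b][0], hand_[a][1] - hand_[b][1])
        v2 = (hand_[c][0] - hand_[d][0], hand_[c][1] - hand_[d][1])
        angle_list.append(vector_2d_angle(v1, v2))
    return angle_list

def hand_pos(finger_angle):
    """Map the five finger angles to a gesture name (<50 degrees = straight)."""
    f1, f2, f3, f4, f5 = finger_angle  # thumb, index, middle, ring, little
    if f1 < 50 and f2 >= 50 and f3 >= 50 and f4 >= 50 and f5 >= 50:
        return 'good'   # thumb straight, other fingers bent
    if f1 >= 50 and f2 >= 50 and f3 < 50 and f4 >= 50 and f5 >= 50:
        return 'no!!!'  # only the middle finger straight
    if f1 < 50 and f2 < 50 and f3 >= 50 and f4 >= 50 and f5 < 50:
        return 'ROCK!'  # thumb, index, and little finger straight
    if f1 >= 50 and f2 >= 50 and f3 >= 50 and f4 >= 50 and f5 >= 50:
        return '0'      # all fingers bent
    if f1 < 50 and f2 < 50 and f3 < 50 and f4 < 50 and f5 < 50:
        return '5'      # all fingers straight
    return ''

print(hand_pos([30, 170, 170, 170, 170]))  # good
```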
Displaying Hand Gesture Results
- Setting Up the Display:
  - Move the camera-opening code for better organization.
  - After identifying the hand gesture, use `cv2.putText()` to display the gesture name on the image. Set parameters such as font style, size, color, and thickness.
- Preparing Key Point Coordinates:
  - Create three lists: `finger_point` to store the actual (x, y) coordinates of each key point, and `fx` and `fy` for the x and y coordinates used by the mosaic.
  - Use a `for` loop to extract the coordinates of each key point. Note that MediaPipe returns normalized values, so multiply them by the actual width `w` and height `h` of the image.
- Calculating and Displaying Gestures:
  - Pass the `finger_point` data to the `hand_angle()` function to calculate the finger angles.
  - Then pass the angles to the `hand_pos()` function to determine the hand gesture.
  - Use `cv2.putText()` to display the gesture name on the image.
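The coordinate-preparation step can be sketched independently of MediaPipe by using stand-in landmark objects; `SimpleNamespace` here is just a hypothetical substitute for MediaPipe's normalized landmark records:

```python
from types import SimpleNamespace

# stand-ins for MediaPipe's normalized landmarks (hypothetical values)
landmarks = [SimpleNamespace(x=0.5, y=0.25), SimpleNamespace(x=0.1, y=0.9)]
w, h = 640, 480  # actual width and height of the camera image

finger_point = []  # actual (x, y) pixel coordinates of each key point
fx, fy = [], []    # separate x and y lists, used later by the mosaic
for lm in landmarks:
    x = int(lm.x * w)  # MediaPipe values are normalized, so scale by w
    y = int(lm.y * h)  # and by h
    finger_point.append((x, y))
    fx.append(x)
    fy.append(y)

print(finger_point)  # [(320, 120), (64, 432)]
```

In the real program, `finger_point` then goes to `hand_angle()` and the resulting angles to `hand_pos()`.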
Adding Mosaic for Inappropriate Gestures
- Detecting the 'No' Gesture:
  - When the recognized gesture is 'no!!!' (only the middle finger straight), calculate the region the middle finger occupies in the image.
  - Use the `max()` and `min()` functions on the `fx` and `fy` coordinates to find the outermost points, then expand the box by 10 pixels on each side (subtract 10 from the minimums, add 10 to the maximums).
  - Perform boundary checks to ensure the mosaic area does not exceed the image boundaries.
- Creating the Mosaic:
  - Extract the mosaic area from the original image `img` based on the calculated range.
  - Resize that area down to a small 8x8-pixel image to blur away the detail.
  - Resize the small image back to the original size using nearest-neighbor interpolation to create the blocky mosaic effect.
  - Paste the mosaic back into its original position in `img`.
  - Use an `else` statement to handle cases where the gesture is not 'no!!!'.
Conclusion and Future Applications
This hand gesture recognition project allows for real-time gesture recognition and additional processing for specific gestures. It has practical applications in privacy protection, video games, and interactive design. If this tutorial is helpful, please like, subscribe, and share.
About MediaPipe
MediaPipe is an open-source cross-platform framework by Google for real-time visual and machine learning processing, especially suitable for image recognition, hand tracking, and face detection tasks. It simplifies complex image processing tasks by providing easy-to-use modules.
Testing the Installation
Before starting the project, ensure that three key libraries are installed: `mediapipe`, `tensorflow`, and `opencv-python`. You can test the installation by running simple import commands in the Python environment. If there are no errors, the installation is successful.
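A small script along these lines reports which of the three libraries import cleanly (note that `opencv-python` is imported as `cv2`):

```python
import importlib

status = {}
for name in ("cv2", "mediapipe", "tensorflow"):
    try:
        importlib.import_module(name)  # raises ImportError if missing
        status[name] = "OK"
    except ImportError:
        status[name] = "missing"
print(status)
```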