Eye-tracking-based mouse control is an innovative way to interact with your computer, particularly helpful in accessibility scenarios. In this tutorial, we’ll guide you step by step through setting up a system to control the mouse cursor using eye movements and blinking, leveraging Python libraries such as OpenCV, MediaPipe, and PyAutoGUI.
We’ll break this down into several steps: detecting your face and eyes, tracking eye movement, and mapping eye position to screen coordinates for mouse control.
Before starting, ensure you have the required libraries installed in your Python environment. This project depends on the following:
OpenCV (opencv-python): captures and processes webcam frames.
MediaPipe: detects face and eye landmarks.
PyAutoGUI: moves the mouse cursor and performs clicks.

Install the necessary libraries using pip. Open your terminal or command prompt and run:

pip install opencv-python mediapipe pyautogui
This will install all the packages needed for this project.
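If you'd like to confirm the installation before moving on, an optional one-line check like this should print your OpenCV version without raising an import error:

python -c "import cv2, mediapipe, pyautogui; print(cv2.__version__)"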
Once the required libraries are installed, you need to set up your environment by initializing the camera and face mesh detection model. We'll also fetch the screen dimensions to map eye positions to the screen coordinates later.
Here’s how you can import the libraries and set up the camera:
import cv2
import mediapipe as mp
import pyautogui
# Initialize camera and face mesh model
cam = cv2.VideoCapture(0)
face_mesh = mp.solutions.face_mesh.FaceMesh(refine_landmarks=True)

# Get screen dimensions
screen_w, screen_h = pyautogui.size()
cv2.VideoCapture(0): Opens the default camera (webcam) to capture video frames.
mp.solutions.face_mesh.FaceMesh(refine_landmarks=True): Initializes MediaPipe's face mesh model for facial landmark detection, with refine_landmarks=True ensuring more detailed detection, especially around the eyes.
pyautogui.size(): Gets the width and height of the computer screen to map eye positions to the correct screen coordinates.
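To make the mapping concrete, here is a small worked example (the screen resolution and landmark values are illustrative, not taken from the tutorial's code):

# MediaPipe landmark coordinates are normalized to the range [0, 1],
# so multiplying by the screen size converts them to pixel coordinates
screen_w, screen_h = 1920, 1080        # example screen resolution
landmark_x, landmark_y = 0.5, 0.25     # example normalized landmark
screen_x = int(screen_w * landmark_x)  # 960: horizontal center
screen_y = int(screen_h * landmark_y)  # 270: upper quarter of the screen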
In this step, we'll capture frames from the webcam and detect facial landmarks using MediaPipe's face mesh model. We'll focus specifically on the eye landmarks, which we will later use to control the mouse.
while True:
    # Capture frame
    _, frame = cam.read()
    frame = cv2.flip(frame, 1)  # Flip the frame horizontally
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # Convert to RGB

    # Process the frame to find landmarks
    output = face_mesh.process(rgb_frame)
    landmark_points = output.multi_face_landmarks
    frame_h, frame_w, _ = frame.shape

    # Check if landmarks are detected
    if landmark_points:
        landmarks = landmark_points[0].landmark

        # Draw eye landmarks
        for id, landmark in enumerate(landmarks[474:478]):
            x = int(landmark.x * frame_w)
            y = int(landmark.y * frame_h)
            cv2.circle(frame, (x, y), 3, (0, 255, 0))  # Draw landmark

            # Map landmark coordinates to screen coordinates
            if id == 1:  # Typically the right eye landmark
                # Scale the normalized coordinates to the screen size
                screen_x = int(screen_w * landmark.x)
                screen_y = int(screen_h * landmark.y)

                # Move mouse to the new position
                pyautogui.moveTo(screen_x, screen_y)
output.multi_face_landmarks: Holds the detected facial landmarks. This contains facial features, including the eyes.
pyautogui.moveTo(screen_x, screen_y): The eye movements are mapped to screen coordinates to move the mouse.
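One practical note: PyAutoGUI pauses briefly after every call by default and raises a fail-safe exception if the cursor hits a screen corner. Both behaviors are controlled by documented module attributes, which you can tune if cursor updates feel sluggish (the values below are only suggestions; keep the fail-safe enabled as an emergency stop):

pyautogui.PAUSE = 0        # drop the default 0.1 s pause after each call
pyautogui.FAILSAFE = True  # moving the cursor to a corner aborts the script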
We also detect blinking to perform mouse click actions. Here's how you can implement this.
# Detect blinking (click action)
left_eye = [landmarks[145], landmarks[159]]  # Landmarks for left eye
for landmark in left_eye:
    x = int(landmark.x * frame_w)
    y = int(landmark.y * frame_h)
    cv2.circle(frame, (x, y), 3, (0, 255, 255))  # Draw left eye landmarks

# Click if the eyes are closed (or some other condition you define)
if (left_eye[0].y - left_eye[1].y) < 0.004:  # Adjust threshold as necessary
    pyautogui.click()
    pyautogui.sleep(1)  # Sleep to avoid multiple clicks
If the vertical distance between the upper and lower eyelid landmarks drops below the threshold (0.004), we assume a blink has occurred. This triggers a mouse click using pyautogui.click(). We then call pyautogui.sleep(1) to prevent multiple consecutive clicks, providing a one-second delay between clicks.
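The 0.004 threshold depends on your camera distance and resolution, so it may need calibration. One simple aid (an illustrative addition, not part of the original script) is to print the measured eyelid gap inside the loop and note the values you see while blinking:

# Illustrative calibration aid: run inside the main loop, then pick a
# blink threshold just above the values printed while your eye is closed
gap = left_eye[0].y - left_eye[1].y
print(f"eyelid gap: {gap:.4f}")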
After implementing the above steps, you can run the script to test the eye-controlled mouse system. To execute the script, run:
python eye_mouse_control.py
To gracefully exit the program and release all resources, press the ‘q’ key on your keyboard. Here's the final part of the code that handles this:
# Show the frame with landmarks (still inside the while loop)
cv2.imshow('Eye Controlled Mouse', frame)

# Exit loop on 'q' key press
if cv2.waitKey(1) & 0xFF == ord('q'):
    break

# Release resources (after the loop ends)
cam.release()
cv2.destroyAllWindows()
cv2.imshow(): Displays the video feed with the detected eye landmarks.
cv2.waitKey(1): Monitors for key presses. If 'q' is pressed, the loop exits.
cam.release() and cv2.destroyAllWindows(): Ensure the webcam is released and the OpenCV windows are closed gracefully.

In this tutorial, you learned how to:

Set up OpenCV, MediaPipe, and PyAutoGUI in a Python environment.
Detect facial and eye landmarks in webcam frames with MediaPipe's face mesh model.
Map eye movements to screen coordinates to move the mouse cursor.
Detect blinks to trigger mouse clicks.
This system opens up possibilities for accessibility solutions or hands-free interaction with a computer. Feel free to adjust the thresholds and landmarks to fine-tune the eye-tracking sensitivity for better control.
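For example, one way to increase sensitivity (a minimal sketch, not part of the script above; GAIN is a hypothetical name) is to amplify eye movement around the center of the frame so smaller eye motions sweep the full screen:

# Illustrative sensitivity tweak: amplify motion around the frame center,
# then clamp to [0, 1] so the cursor stays within the screen bounds
GAIN = 2.0  # hypothetical amplification factor; tune to taste
nx = min(max((landmark.x - 0.5) * GAIN + 0.5, 0.0), 1.0)
ny = min(max((landmark.y - 0.5) * GAIN + 0.5, 0.0), 1.0)
screen_x = int(screen_w * nx)
screen_y = int(screen_h * ny)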