Mouse Control Using Eye Tracking with Python

Eye-tracking-based mouse control is an innovative way to interact with your computer, particularly helpful in accessibility scenarios. In this tutorial, we’ll guide you step by step through setting up a system to control the mouse cursor using eye movements and blinking, leveraging Python libraries such as OpenCV, MediaPipe, and PyAutoGUI.

We’ll break this down into several steps: detecting your face and eyes, tracking eye movement, and mapping eye position to screen coordinates for mouse control.

Prerequisites

Before starting, ensure you have the required libraries installed in your Python environment. This project depends on the following:

Required Libraries

  - opencv-python (OpenCV): captures frames from the webcam and handles image processing and display.
  - mediapipe: provides the Face Mesh model used to detect facial and iris landmarks.
  - pyautogui: moves the mouse cursor, issues clicks, and reports the screen resolution.

Installation Instructions

Install the necessary libraries using pip. Open your terminal or command prompt and run:

pip install opencv-python mediapipe pyautogui

This will install all the packages needed for this project.
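If you want to confirm the installation before going further, a quick sanity check is to import each package and print its version (the exact numbers will vary on your machine):

import cv2
import mediapipe
import pyautogui

print(cv2.__version__)        # e.g. 4.x
print(mediapipe.__version__)  # e.g. 0.10.x
print(pyautogui.__version__)
print(pyautogui.size())       # Your screen resolution, e.g. Size(width=1920, height=1080)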

Setup

Setting Up the Environment

Once the required libraries are installed, set up your environment by initializing the camera and the face mesh detection model. We'll also fetch the screen dimensions so that eye positions can later be mapped to screen coordinates.

Importing Libraries and Initializing the Camera

Here’s how you can import the libraries and set up the camera:

import cv2
import mediapipe as mp
import pyautogui

# Initialize camera and face mesh model
cam = cv2.VideoCapture(0)
# refine_landmarks=True enables the iris landmarks (indices 468-477)
face_mesh = mp.solutions.face_mesh.FaceMesh(refine_landmarks=True)

# Get screen dimensions
screen_w, screen_h = pyautogui.size()

Explanation:

  1. Camera: cv2.VideoCapture(0) opens the default webcam (index 0); change the index if you have more than one camera.
  2. Face Mesh: FaceMesh(refine_landmarks=True) loads MediaPipe's face mesh model with the refined iris landmarks enabled, which we need for eye tracking.
  3. Screen Size: pyautogui.size() returns the screen resolution (width, height), which we'll use to map normalized landmark coordinates to actual screen positions.
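Two practical notes before moving on: cv2.VideoCapture can fail silently if the camera is unavailable, and PyAutoGUI ships with a fail-safe that aborts the script (raising pyautogui.FailSafeException) when you slam the cursor into a screen corner, which is a handy escape hatch for a program that controls your mouse. A minimal guard, added here as a suggestion rather than part of the original code:

if not cam.isOpened():
    raise RuntimeError("Could not open webcam; check the camera index and permissions.")

# Keep PyAutoGUI's fail-safe enabled: moving the mouse into a screen corner
# will abort the script with pyautogui.FailSafeException.
pyautogui.FAILSAFE = True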

Face and Eye Detection

In this step, we’ll capture frames from the webcam and detect facial landmarks using MediaPipe’s face mesh model. We’ll focus specifically on the eye landmarks, which we will later use to control the mouse.

Code for Detecting Faces and Eyes

while True:
    # Capture frame
    ret, frame = cam.read()
    if not ret:
        break  # Stop the loop if the camera stops delivering frames
    frame = cv2.flip(frame, 1)  # Flip the frame horizontally (mirror view)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB

    # Process the frame to find landmarks
    output = face_mesh.process(rgb_frame)
    landmark_points = output.multi_face_landmarks
    frame_h, frame_w, _ = frame.shape

    # Check if landmarks are detected
    if landmark_points:
        landmarks = landmark_points[0].landmark

        # Draw the iris landmarks (indices 474-477; available because refine_landmarks=True)
        for id, landmark in enumerate(landmarks[474:478]):
            x = int(landmark.x * frame_w)
            y = int(landmark.y * frame_h)
            cv2.circle(frame, (x, y), 3, (0, 255, 0))  # Draw landmark

            # Map one iris point to screen coordinates
            if id == 1:  # Landmark 475, one of the four points ringing the iris
                # Normalize the coordinates to the screen size
                screen_x = int(screen_w * landmark.x)
                screen_y = int(screen_h * landmark.y)

                # Move mouse to the new position
                pyautogui.moveTo(screen_x, screen_y)

Explanation:

  1. Frame Capture: The camera captures a frame, which is then flipped horizontally to simulate a mirror effect (this is often more intuitive when controlling with eye movement).
  2. Landmark Detection: The face mesh model detects the landmarks in the frame, which are returned as multi_face_landmarks. This contains facial features, including eyes.
  3. Eye Landmarks: The iris landmarks (indices 474 to 477; the slice end 478 is exclusive) are used for tracking eye movement. These indices only exist because the model was created with refine_landmarks=True. Circles are drawn around them for visualization purposes.
  4. Mouse Movement: For one of the iris landmarks (index 1 within the slice, i.e., landmark 475), the normalized coordinates are scaled by the screen dimensions and passed to pyautogui.moveTo(screen_x, screen_y) to move the cursor. A jitter-smoothing variant is sketched below.
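Raw landmark positions jitter slightly from frame to frame, which makes the cursor shaky. A common remedy is to exponentially smooth the target position before moving the mouse. The sketch below is an optional addition, not part of the original script; the smoothing factor ALPHA and the prev_x/prev_y state are assumed names with values you should tune:

ALPHA = 0.3  # Smoothing factor: lower = smoother but laggier (assumed value)
prev_x, prev_y = screen_w / 2, screen_h / 2  # Start the cursor state at screen center

def smooth_move(raw_x, raw_y):
    """Exponentially smooth the target position, then move the cursor."""
    global prev_x, prev_y
    prev_x = ALPHA * raw_x + (1 - ALPHA) * prev_x
    prev_y = ALPHA * raw_y + (1 - ALPHA) * prev_y
    pyautogui.moveTo(int(prev_x), int(prev_y))

# Inside the loop, replace pyautogui.moveTo(screen_x, screen_y) with:
# smooth_move(screen_w * landmark.x, screen_h * landmark.y)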

Mouse Control Based on Eye Movement

Eye movements are now mapped to screen coordinates to move the mouse; next, we detect blinking to perform click actions. The snippet below continues inside the if landmark_points: block from the previous step, which is why it is indented.

Adding Mouse Click Control via Blinking

        # Detect blinking (click action)
        left_eye = [landmarks[145], landmarks[159]]  # Lower (145) and upper (159) eyelid of the left eye
        for landmark in left_eye:
            x = int(landmark.x * frame_w)
            y = int(landmark.y * frame_h)
            cv2.circle(frame, (x, y), 3, (0, 255, 255))  # Draw left eye landmarks

        # Click when the eyelids nearly touch (eye closed)
        if (left_eye[0].y - left_eye[1].y) < 0.004:  # Adjust threshold as necessary
            pyautogui.click()
            pyautogui.sleep(1)  # Pause so one blink doesn't register as several clicks

Explanation:

  1. Blink Landmarks: Landmarks 145 and 159 sit on the lower and upper eyelid of the left eye. The vertical distance between them (in normalized coordinates) shrinks toward zero as the eye closes.
  2. Click Trigger: When that distance drops below the threshold (0.004 here), a blink is assumed and pyautogui.click() fires. The threshold is empirical; adjust it for your camera and seating distance.
  3. Debounce: pyautogui.sleep(1) pauses for a second so a single blink doesn't produce multiple clicks.
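The single-distance test works, but it is sensitive to how far you sit from the camera. A more robust alternative, not part of the original script, is a simplified eye aspect ratio that divides the eyelid gap by the eye width so the measure stays scale-invariant. The corner indices 33 and 133 and the 0.2 threshold below are assumptions to verify and tune for your setup:

import math

def eye_aspect_ratio(landmarks):
    """Simplified eye aspect ratio for the left eye: lid gap / eye width."""
    def dist(a, b):
        return math.hypot(a.x - b.x, a.y - b.y)
    vertical = dist(landmarks[159], landmarks[145])    # Upper to lower eyelid
    horizontal = dist(landmarks[33], landmarks[133])   # Eye corner to eye corner
    return vertical / horizontal

# Inside the loop, click when the ratio drops below an empirical threshold:
# if eye_aspect_ratio(landmarks) < 0.2:  # 0.2 is a starting guess
#     pyautogui.click()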

Running the Code

After implementing the above steps, save the complete script as eye_mouse_control.py and run it to test the eye-controlled mouse:

python eye_mouse_control.py

Exiting the Program

To gracefully exit the program and release all resources, press the ‘q’ key on your keyboard. Here's the final part of the code that handles this:

    # Show the frame with landmarks
    cv2.imshow('Eye Controlled Mouse', frame)

    # Exit loop on 'q' key press
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cam.release()
cv2.destroyAllWindows()
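Since the script grabs the webcam and takes over the mouse, it is worth guaranteeing that cleanup runs even if an exception (including PyAutoGUI's fail-safe) interrupts the loop. One way to structure this, sketched here as an optional refinement rather than part of the original code, is a try/finally around the main loop:

try:
    while True:
        ...  # Frame capture, tracking, and click logic from the steps above
finally:
    # Runs on normal exit, Ctrl+C, or pyautogui.FailSafeException
    cam.release()
    cv2.destroyAllWindows()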

Conclusion

In this tutorial, you learned how to:

  1. Set up an environment for eye tracking using OpenCV, MediaPipe, and PyAutoGUI.
  2. Detect facial landmarks and focus on eye landmarks for mouse control.
  3. Map eye movement to screen coordinates for controlling the mouse cursor.
  4. Detect blinks to trigger mouse clicks.

This system opens up possibilities for accessibility solutions or hands-free interaction with a computer. Feel free to adjust the thresholds and landmarks to fine-tune the eye-tracking sensitivity for better control.
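If you do extend the script, one convenient first step is to lift the magic numbers into named constants at the top of the file so they are easy to experiment with. The names below are suggestions; the defaults are the values used in this tutorial:

CAMERA_INDEX = 0         # Which webcam cv2.VideoCapture opens
BLINK_THRESHOLD = 0.004  # Max normalized eyelid gap that counts as a blink
CLICK_COOLDOWN = 1       # Seconds to wait after a click before clicking again
IRIS_LANDMARKS = slice(474, 478)  # Iris points used for cursor control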