Eye-tracking-based mouse control is an innovative way to interact with your computer, particularly helpful in accessibility scenarios. In this tutorial, we’ll guide you step by step through setting up a system to control the mouse cursor using eye movements and blinking, leveraging Python libraries such as OpenCV, MediaPipe, and PyAutoGUI.
We’ll break this down into several steps: detecting your face and eyes, tracking eye movement, and mapping eye position to screen coordinates for mouse control.
Before starting, ensure you have the required libraries installed in your Python environment. This project depends on the following:
OpenCV (opencv-python): captures and processes webcam frames.
MediaPipe: detects face and eye landmarks.
PyAutoGUI: moves the mouse cursor and performs clicks.

Install the necessary libraries using pip. Open your terminal or command prompt and run:

pip install opencv-python mediapipe pyautogui
This will install all the packages needed for this project.
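If you'd like to confirm the installation before moving on, an optional one-line check like this should print your OpenCV version without raising an import error:

python -c "import cv2, mediapipe, pyautogui; print(cv2.__version__)"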
Once the required libraries are installed, you need to set up your environment by initializing the camera and face mesh detection model. We'll also fetch the screen dimensions to map eye positions to the screen coordinates later.
Here’s how you can import the libraries and set up the camera:
import cv2
import mediapipe as mp
import pyautogui
# Initialize camera and face mesh model
cam = cv2.VideoCapture(0)
face_mesh = mp.solutions.face_mesh.FaceMesh(refine_landmarks=True)

# Get screen dimensions
screen_w, screen_h = pyautogui.size()
cv2.VideoCapture(0): Opens the default camera (webcam) to capture video frames.
mp.solutions.face_mesh.FaceMesh(refine_landmarks=True): Initializes MediaPipe's face mesh model for facial landmark detection, with refine_landmarks=True ensuring more detailed detection, especially around the eyes.
pyautogui.size(): Gets the width and height of the computer screen to map eye positions to the correct screen coordinates.
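To make the mapping concrete, here is a small worked example (the screen resolution and landmark values are illustrative, not taken from the tutorial's code):

# MediaPipe landmark coordinates are normalized to the range [0, 1],
# so multiplying by the screen size converts them to pixel coordinates
screen_w, screen_h = 1920, 1080        # example screen resolution
landmark_x, landmark_y = 0.5, 0.25     # example normalized landmark
screen_x = int(screen_w * landmark_x)  # 960: horizontal center
screen_y = int(screen_h * landmark_y)  # 270: upper quarter of the screen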
In this step, we'll capture frames from the webcam and detect facial landmarks using MediaPipe's face mesh model. We'll focus specifically on the eye landmarks, which we will later use to control the mouse.
while True:
    # Capture frame
    _, frame = cam.read()
    frame = cv2.flip(frame, 1)  # Flip the frame horizontally
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # Convert to RGB

    # Process the frame to find landmarks
    output = face_mesh.process(rgb_frame)
    landmark_points = output.multi_face_landmarks
    frame_h, frame_w, _ = frame.shape

    # Check if landmarks are detected
    if landmark_points:
        landmarks = landmark_points[0].landmark

        # Draw eye landmarks
        for id, landmark in enumerate(landmarks[474:478]):
            x = int(landmark.x * frame_w)
            y = int(landmark.y * frame_h)
            cv2.circle(frame, (x, y), 3, (0, 255, 0))  # Draw landmark

            # Map landmark coordinates to screen coordinates
            if id == 1:  # Typically the right eye landmark
                # Scale the normalized coordinates to the screen size
                screen_x = int(screen_w * landmark.x)
                screen_y = int(screen_h * landmark.y)

                # Move mouse to the new position
                pyautogui.moveTo(screen_x, screen_y)
output.multi_face_landmarks: Holds the detected facial landmarks. This contains facial features, including the eyes.
pyautogui.moveTo(screen_x, screen_y): The eye movements are mapped to screen coordinates to move the mouse.
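One practical note: PyAutoGUI pauses briefly after every call by default and raises a fail-safe exception if the cursor hits a screen corner. Both behaviors are controlled by documented module attributes, which you can tune if cursor updates feel sluggish (the values below are only suggestions; keep the fail-safe enabled as an emergency stop):

pyautogui.PAUSE = 0        # drop the default 0.1 s pause after each call
pyautogui.FAILSAFE = True  # moving the cursor to a corner aborts the script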
We also detect blinking to perform mouse click actions. Here's how you can implement this.
# Detect blinking (click action)
left_eye = [landmarks[145], landmarks[159]]  # Landmarks for left eye
for landmark in left_eye:
    x = int(landmark.x * frame_w)
    y = int(landmark.y * frame_h)
    cv2.circle(frame, (x, y), 3, (0, 255, 255))  # Draw left eye landmarks

# Click if the eyes are closed (or some other condition you define)
if (left_eye[0].y - left_eye[1].y) < 0.004:  # Adjust threshold as necessary
    pyautogui.click()
    pyautogui.sleep(1)  # Sleep to avoid multiple clicks
If the vertical distance between the upper and lower eyelid landmarks drops below the threshold (0.004), we assume a blink has occurred. This triggers a mouse click using pyautogui.click(). We then call pyautogui.sleep(1) to prevent multiple consecutive clicks, providing a one-second delay between clicks.
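The 0.004 threshold depends on your camera distance and resolution, so it may need calibration. One simple aid (an illustrative addition, not part of the original script) is to print the measured eyelid gap inside the loop and note the values you see while blinking:

# Illustrative calibration aid: run inside the main loop, then pick a
# blink threshold just above the values printed while your eye is closed
gap = left_eye[0].y - left_eye[1].y
print(f"eyelid gap: {gap:.4f}")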
After implementing the above steps, you can run the script to test the eye-controlled mouse system. To execute the script, run:
python eye_mouse_control.py
To gracefully exit the program and release all resources, press the ‘q’ key on your keyboard. Here's the final part of the code that handles this:
# Show the frame with landmarks (still inside the while loop)
cv2.imshow('Eye Controlled Mouse', frame)

# Exit loop on 'q' key press
if cv2.waitKey(1) & 0xFF == ord('q'):
    break

# Release resources (after the loop ends)
cam.release()
cv2.destroyAllWindows()
cv2.imshow(): Displays the video feed with the detected eye landmarks.
cv2.waitKey(1): Monitors for key presses. If 'q' is pressed, the loop exits.
cam.release() and cv2.destroyAllWindows(): Ensure the webcam is released and the OpenCV windows are closed gracefully.

In this tutorial, you learned how to:

Set up OpenCV, MediaPipe, and PyAutoGUI in a Python environment.
Detect facial and eye landmarks in webcam frames with MediaPipe's face mesh model.
Map eye movements to screen coordinates to move the mouse cursor.
Detect blinks to trigger mouse clicks.
This system opens up possibilities for accessibility solutions or hands-free interaction with a computer. Feel free to adjust the thresholds and landmarks to fine-tune the eye-tracking sensitivity for better control.
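For example, one way to increase sensitivity (a minimal sketch, not part of the script above; GAIN is a hypothetical name) is to amplify eye movement around the center of the frame so smaller eye motions sweep the full screen:

# Illustrative sensitivity tweak: amplify motion around the frame center,
# then clamp to [0, 1] so the cursor stays within the screen bounds
GAIN = 2.0  # hypothetical amplification factor; tune to taste
nx = min(max((landmark.x - 0.5) * GAIN + 0.5, 0.0), 1.0)
ny = min(max((landmark.y - 0.5) * GAIN + 0.5, 0.0), 1.0)
screen_x = int(screen_w * nx)
screen_y = int(screen_h * ny)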