17. Wildcard Week

Machine Vision Facial Expression Interaction for Jelamp

Project Title: Facial Expression Recognition for Jelamp Using ESP32-S3 Camera and Edge Impulse

Assignment

The Wildcard Week assignment is to design and produce something with a digital process that incorporates computer-aided design and manufacturing, but is not covered in another assignment. The documentation should explain the requirements that the work meets and include enough information for someone else to reproduce it.

For this week, I chose to explore machine vision and embedded machine learning. I used an ESP32-S3 camera board and Edge Impulse to train a simple facial expression recognition model. The model recognizes basic facial expressions and changes the color of a NeoPixel LED ring based on the detected expression.

This experiment directly supports my final project, Jelamp, a 3-DOF interactive robotic desk lamp inspired by the shape of a giraffe.

What I Made

I made a small interactive machine vision system for Jelamp. The system uses a camera to capture the user's face, runs a trained machine learning model on the ESP32-S3, and changes the NeoPixel LED color based on the detected facial expression.

User face
  |
  v
ESP32-S3 camera
  |
  v
Edge Impulse facial expression model
  |
  v
Expression classification
  |
  v
NeoPixel LED color response

Example Responses

Facial Expression | NeoPixel Response
Happy             | Warm yellow / rainbow animation
Neutral           | Soft white / calm blue
Sad / Surprised   | Blue or purple breathing light

Why I Chose This

Jelamp is designed to be more than a normal desk lamp. It is an expressive robotic lamp that can move, sense, and respond to the user. For Wildcard Week, I wanted to add a visual interaction feature. By using a camera and machine learning model, Jelamp can detect simple facial expressions and respond with light.

This makes the lamp feel more alive and interactive.

How This Meets the Assignment Requirement

This project meets the Wildcard Week requirement because it uses a digital process not covered in my other assignments:

Data collection -> Model training -> Embedded deployment -> Physical interaction

System Overview

+------------------------+
|       User Face        |
+-----------+------------+
            |
            v
+------------------------+
| ESP32-S3 Camera Board  |
| Captures image input   |
+-----------+------------+
            |
            v
+------------------------+
|   Edge Impulse Model   |
| Classifies expression  |
+-----------+------------+
            |
            v
+------------------------+
| ESP32-S3 Logic         |
| Chooses LED response   |
+-----------+------------+
            |
            v
+------------------------+
| NeoPixel LED Ring      |
| Shows color feedback   |
+------------------------+

Materials and Components

Component                                  | Purpose
ESP32-S3 camera board / XIAO ESP32S3 Sense | Camera input and model inference
NeoPixel ring / WS2812B LEDs               | Programmable LED output
USB-C cable                                | Programming and power
Jumper wires                               | Connections
Computer                                   | Edge Impulse training and Arduino programming
Arduino IDE                                | Uploading code
Edge Impulse                               | Dataset, training, and deployment

Step-by-Step Process

Step 1: Define the Scope

The first step was to decide what kind of expressions the system should recognize. To keep the project manageable, I started with simple expression classes:

Happy
Neutral
Sad / Surprised

If the model was not accurate enough, I planned to simplify it to two classes:

Happy
Not Happy

This helped keep the project realistic and achievable.

Step 2: Set Up the ESP32-S3 Camera

I connected the ESP32-S3 camera board to my computer and tested whether the camera could capture images correctly. The camera board served as the input device both for collecting facial expression data and for later running the trained model. Bringing it up involved:

  • Connecting the board by USB-C
  • Checking that the board was recognized by the computer
  • Testing basic camera capture
  • Confirming the image quality and lighting conditions
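
To check these steps, I used a minimal capture-test sketch like the one below. This is a sketch of the idea rather than a verified build: it assumes the Arduino-ESP32 core with its esp_camera driver, and the pin numbers follow the CAMERA_MODEL_XIAO_ESP32S3 mapping from the CameraWebServer example's camera_pins.h, so they should be checked against the actual board.

#include "esp_camera.h"

// Pin mapping assumed from the CAMERA_MODEL_XIAO_ESP32S3 entry in the
// Arduino CameraWebServer example (camera_pins.h) -- verify for your board.
#define PWDN_GPIO_NUM   -1
#define RESET_GPIO_NUM  -1
#define XCLK_GPIO_NUM   10
#define SIOD_GPIO_NUM   40
#define SIOC_GPIO_NUM   39
#define Y9_GPIO_NUM     48
#define Y8_GPIO_NUM     11
#define Y7_GPIO_NUM     12
#define Y6_GPIO_NUM     14
#define Y5_GPIO_NUM     16
#define Y4_GPIO_NUM     18
#define Y3_GPIO_NUM     17
#define Y2_GPIO_NUM     15
#define VSYNC_GPIO_NUM  38
#define HREF_GPIO_NUM   47
#define PCLK_GPIO_NUM   13

void setup() {
  Serial.begin(115200);

  camera_config_t config = {};
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer   = LEDC_TIMER_0;
  config.pin_pwdn     = PWDN_GPIO_NUM;
  config.pin_reset    = RESET_GPIO_NUM;
  config.pin_xclk     = XCLK_GPIO_NUM;
  // On older ESP32 Arduino cores these two fields are spelled pin_sscb_sda/scl.
  config.pin_sccb_sda = SIOD_GPIO_NUM;
  config.pin_sccb_scl = SIOC_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href  = HREF_GPIO_NUM;
  config.pin_pclk  = PCLK_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.pixel_format = PIXFORMAT_JPEG;  // JPEG is enough for a capture test
  config.frame_size   = FRAMESIZE_QVGA;  // 320x240 keeps memory use low
  config.jpeg_quality = 12;
  config.fb_count     = 1;

  if (esp_camera_init(&config) != ESP_OK) {
    Serial.println("Camera init failed");
    return;
  }
  Serial.println("Camera ready");
}

void loop() {
  camera_fb_t *fb = esp_camera_fb_get();  // grab one frame
  if (fb) {
    Serial.printf("Captured %ux%u frame, %u bytes\n",
                  (unsigned)fb->width, (unsigned)fb->height, (unsigned)fb->len);
    esp_camera_fb_return(fb);             // always return the frame buffer
  }
  delay(2000);
}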

Step 3: Collect the Dataset

I collected images for each expression class.

Class           | Target Number of Images | Notes
Happy           | 30-50                   | Smiling face
Neutral         | 30-50                   | Normal face
Sad / Surprised | 30-50                   | Stronger facial expression

To improve the dataset, I tried to vary:

  • Lighting
  • Face angle
  • Distance from the camera
  • Background
  • Expression intensity

This helped the model learn more reliable patterns instead of only memorizing one image condition.

Step 4: Train the Model in Edge Impulse

After collecting the dataset, I uploaded the images to Edge Impulse and labeled them by expression. Then I created an image classification model.

  1. Upload image data
  2. Label each class
  3. Create an image classification impulse
  4. Train the model
  5. Check the accuracy
  6. Review the confusion matrix
  7. Test the model with new images

The confusion matrix helped me see which expressions were easy or difficult for the model to recognize.

Step 5: Deploy the Model to ESP32-S3

After training, I exported the Edge Impulse model as an Arduino library, installed it in the Arduino IDE, and uploaded the model inference code to the ESP32-S3 board.

The board captured an image, ran inference, and printed the predicted expression and confidence score in the Serial Monitor, for example:

Prediction: happy
Confidence: 0.86
Inference time: ___ ms
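
The sketch below shows the shape of the code that produces this output, assuming the Edge Impulse Arduino export. The header name follows Edge Impulse's <project-name>_inferencing.h pattern, so jelamp_inferencing.h is a placeholder, and get_camera_data() is only a stub: the real version has to copy the captured frame (cropped and resized to the model's input size, with each pixel packed as 0xRRGGBB cast to float) into the buffer, as the camera example bundled with the exported library does.

#include <jelamp_inferencing.h>  // placeholder name: <project>_inferencing.h

// Stub data callback. The real version must fill out_ptr with pixels from
// the last camera frame, packed as 0xRRGGBB cast to float per pixel.
static int get_camera_data(size_t offset, size_t length, float *out_ptr) {
  for (size_t i = 0; i < length; i++) {
    out_ptr[i] = 0.0f;  // TODO: copy real pixel data here
  }
  return 0;
}

void classifyFrame() {
  signal_t signal;
  signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT;
  signal.get_data = &get_camera_data;

  ei_impulse_result_t result;
  if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) {
    Serial.println("Classifier failed");
    return;
  }

  // Print every label with its confidence, like the output shown above.
  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
    Serial.printf("%s: %.2f\n",
                  result.classification[ix].label,
                  result.classification[ix].value);
  }
  Serial.printf("Inference time: %d ms\n", result.timing.classification);
}

void setup() { Serial.begin(115200); }
void loop()  { classifyFrame(); delay(1000); }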

Step 6: Connect the NeoPixel Ring

Next, I connected the NeoPixel ring to the ESP32-S3.

NeoPixel Pin | Connects To
5V           | 5V
GND          | GND
DIN          | ESP32-S3 GPIO pin

The ESP32-S3 and NeoPixel must share a common ground.

Before connecting the machine learning result to the LEDs, I first tested simple colors, using the sketch shown after this list:

Red
Green
Blue
White
Rainbow
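
The test used the Adafruit NeoPixel library. In the sketch below, LED_PIN and NUM_PIXELS are placeholders that have to match the actual wiring and ring size:

#include <Adafruit_NeoPixel.h>

#define LED_PIN    2   // placeholder: the GPIO wired to DIN
#define NUM_PIXELS 12  // placeholder: pixels on the ring

Adafruit_NeoPixel pixels(NUM_PIXELS, LED_PIN, NEO_GRB + NEO_KHZ800);

// Set every pixel to one color and latch it.
void fillRing(uint32_t color) {
  pixels.fill(color);
  pixels.show();
}

void setup() {
  pixels.begin();
  pixels.setBrightness(50);  // keep current draw low while testing
}

void loop() {
  fillRing(pixels.Color(255, 0, 0));      delay(1000);  // red
  fillRing(pixels.Color(0, 255, 0));      delay(1000);  // green
  fillRing(pixels.Color(0, 0, 255));      delay(1000);  // blue
  fillRing(pixels.Color(255, 255, 255));  delay(1000);  // white

  // Rainbow: sweep the hue wheel once around the ring.
  for (uint16_t step = 0; step < 256; step++) {
    for (uint16_t i = 0; i < NUM_PIXELS; i++) {
      pixels.setPixelColor(i,
        pixels.ColorHSV((uint16_t)(step * 256 + i * 65536L / NUM_PIXELS)));
    }
    pixels.show();
    delay(10);
  }
}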

Step 7: Map Expressions to LED Responses

After confirming that the NeoPixel worked, I connected the model result to LED behavior.

If expression is Happy:
  show warm yellow or rainbow animation

If expression is Neutral:
  show soft white or calm blue

If expression is Sad or Surprised:
  show blue or purple breathing light

This made the system interactive because the physical LED output changed according to the user's facial expression.
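
As a minimal sketch of that mapping: the function below assumes the same placeholder pin and ring size as the test sketch above, and the label strings ("happy", "neutral") are assumptions that must match the class names used in the Edge Impulse project. In the combined program it would be called from the inference loop with the highest-confidence label; the setup() here is just a standalone check.

#include <Adafruit_NeoPixel.h>

#define LED_PIN    2   // placeholder, as in the test sketch above
#define NUM_PIXELS 12

Adafruit_NeoPixel pixels(NUM_PIXELS, LED_PIN, NEO_GRB + NEO_KHZ800);

void fillRing(uint32_t color) {
  pixels.fill(color);
  pixels.show();
}

// Crude breathing effect: ramp brightness up, then back down.
void breathe(uint32_t color) {
  for (int b = 10; b <= 100; b += 5) { pixels.setBrightness(b); fillRing(color); delay(20); }
  for (int b = 100; b >= 10; b -= 5) { pixels.setBrightness(b); fillRing(color); delay(20); }
}

// Map the predicted label to a light response. The strings are assumed
// to match the class names defined in the Edge Impulse project.
void showExpression(const char *label) {
  if (strcmp(label, "happy") == 0) {
    fillRing(pixels.Color(255, 160, 0));    // warm yellow
  } else if (strcmp(label, "neutral") == 0) {
    fillRing(pixels.Color(200, 200, 255));  // soft white / calm blue
  } else {
    breathe(pixels.Color(80, 0, 255));      // sad/surprised: blue-purple breathing
  }
}

void setup() {
  pixels.begin();
  showExpression("happy");  // quick standalone check
}

void loop() {}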

Step 8: Test the Interaction

I tested the system by showing different facial expressions in front of the camera. For each test, I recorded:

  • Expression shown
  • Predicted label
  • Confidence score
  • LED response
  • Whether the result was correct

Test | Real Expression | Predicted Expression | Confidence | LED Response | Result
1    | Happy           | Happy                | ___        | Yellow       | Correct
2    | Neutral         | Neutral              | ___        | White        | Correct
3    | Sad             | Neutral              | ___        | White        | Needs improvement

Problems and Solutions

Problem                                        | Solution
The model confused neutral and sad expressions | I added more training images with clearer expressions
Camera image was too dark                      | I improved the lighting
LED color did not change correctly             | I checked the GPIO pin and NeoPixel wiring
The model was slow                             | I reduced the number of classes and simplified the model
Prediction confidence was low                  | I collected more balanced data for each class

Final Result

The final system can use an ESP32-S3 camera to classify simple facial expressions and change the NeoPixel LED color based on the result. This demonstrates how machine vision can be embedded into a physical interactive object.

For my final project, this feature can become part of Jelamp's emotional response system. Jelamp can use visual input to understand the user's expression and respond through light, movement, or both.

Minimum Viable Version

If time is limited, the project can still meet the assignment requirement with two expression classes:

Happy
Not Happy

The output behavior can be:

Happy -> yellow NeoPixel
Not Happy -> blue NeoPixel

This still demonstrates the complete digital process:

Data collection -> Model training -> Embedded deployment -> Physical interaction

Files to Include

To make the project reproducible, I should include:

  • The Arduino sketch for camera capture, inference, and LED control
  • The Edge Impulse model export (Arduino library) or a link to the Edge Impulse project
  • The image dataset, or notes on how it was collected and labeled
  • Wiring notes for the ESP32-S3 camera board and NeoPixel ring

Reflection

This week helped me understand how machine learning can be used as part of a physical interactive product. Instead of only controlling LEDs manually, I trained a model that allows the lamp to respond to visual input.

The most important lesson was that the quality of the dataset strongly affects the quality of the result. Good lighting, clear labels, and enough examples are important for a reliable model.

This experiment is useful for Jelamp because it adds a new layer of interaction. In the future, I can combine facial expression detection with servo movement so the lamp can respond with both motion and light.
