17. Wildcard Week
Complete Guide: Facial Expression Detection with XIAO ESP32S3 Sense
Project Overview
Goal: Detect facial expressions (Happy, Sad, Neutral, Angry)
Board: XIAO ESP32S3 Sense (with OV2640 camera)
Platform: Edge Impulse + Arduino IDE
Method: Capture real images with YOUR camera → Train model → Deploy
Phase 1: Hardware Setup
What You Need
✅ XIAO ESP32S3 Sense board (with camera module attached)
✅ USB-C cable
✅ Computer with Chrome/Edge browser
✅ WiFi network (2.4GHz)
✅ Arduino IDE installed
Install Arduino IDE Support for XIAO
1. Open Arduino IDE
2. File → Preferences
3. In "Additional Board Manager URLs" add:
   https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json
4. Click OK
5. Tools → Board → Board Manager
6. Search "esp32" → Install "esp32 by Espressif Systems"
7. Wait for installation to complete
Board Settings in Arduino IDE
Tools → Board: XIAO_ESP32S3
Tools → USB CDC On Boot: Enabled
Tools → PSRAM: OPI PSRAM ← VERY IMPORTANT
Tools → Flash Size: 8MB
Tools → Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
Tools → Port: (select your COM port)
Phase 2: Capture Images with Your XIAO Camera
Step 1: Open CameraWebServer Example
Arduino IDE → File → Examples → ESP32 → Camera → CameraWebServer
This opens multiple files. You need to edit two of them.
Step 2: Edit CameraWebServer.ino
Find and change these lines:
// ============================================
// CHANGE 1: Select your camera model
// ============================================
// Comment out ALL other camera models, then add XIAO:
// #define CAMERA_MODEL_WROVER_KIT
// #define CAMERA_MODEL_ESP_EYE
// #define CAMERA_MODEL_ESP32S3_EYE
// #define CAMERA_MODEL_M5STACK_PSRAM
// #define CAMERA_MODEL_M5STACK_V2_PSRAM
// #define CAMERA_MODEL_M5STACK_WIDE
// #define CAMERA_MODEL_M5STACK_ESP32CAM
// #define CAMERA_MODEL_M5STACK_UNITCAM
// #define CAMERA_MODEL_AI_THINKER
// #define CAMERA_MODEL_TTGO_T_JOURNAL
#define CAMERA_MODEL_XIAO_ESP32S3  // ← ADD THIS LINE

// ============================================
// CHANGE 2: Enter your WiFi credentials
// ============================================
const char *ssid = "YOUR_WIFI_NAME";          // ← Your WiFi name
const char *password = "YOUR_WIFI_PASSWORD";  // ← Your WiFi password
Step 3: Check camera_pins.h
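If CAMERA_MODEL_XIAO_ESP32S3 is not listed in camera_pins.h, add this block at the end of the #if/#elif chain, just before the final #else (the "Camera model not selected" #error), so that it stays inside the chain that the closing #endif ends: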
#elif defined(CAMERA_MODEL_XIAO_ESP32S3)
#define PWDN_GPIO_NUM  -1
#define RESET_GPIO_NUM -1
#define XCLK_GPIO_NUM  10
#define SIOD_GPIO_NUM  40
#define SIOC_GPIO_NUM  39
#define Y9_GPIO_NUM    48
#define Y8_GPIO_NUM    11
#define Y7_GPIO_NUM    12
#define Y6_GPIO_NUM    14
#define Y5_GPIO_NUM    16
#define Y4_GPIO_NUM    18
#define Y3_GPIO_NUM    17
#define Y2_GPIO_NUM    15
#define VSYNC_GPIO_NUM 38
#define HREF_GPIO_NUM  47
#define PCLK_GPIO_NUM  13
Step 4: Upload the Sketch
1. Connect XIAO to computer via USB-C
2. Select correct Port in Tools menu
3. Click Upload (→ button)
4. Wait for "Done uploading"
Step 5: Get the Camera IP Address
1. Open Serial Monitor (Tools → Serial Monitor)
2. Set baud rate to 115200
3. Press the small RESET button on XIAO board
4. Wait for this message:
   WiFi connected
   Camera Ready! Use 'http://192.168.1.XXX' to connect
5. Copy that IP address
Step 6: Open Camera in Browser
1. Open Chrome or Edge on your computer
2. Type the IP address: http://192.168.1.XXX
3. You should see the camera control page
4. Set Resolution to QVGA (320x240) or lower
5. Click "Start Stream"
6. You should see live video from your XIAO camera
Step 7: Capture Expression Images
Now capture images for each expression. Start with only 2 classes first!
Round 1: Start with 2 Classes (Happy vs Neutral)
📸 CAPTURING HAPPY IMAGES (50-100 photos)
──────────────────────────────────────────
1. Sit 30cm from camera
2. Make a BIG smile (show teeth!)
3. Click "Save" button in browser to download image
4. Slightly change:
   - Head angle (straight, slight left, slight right)
   - Lighting (lamp on, lamp off, different room)
   - Smile variation (teeth showing, closed mouth smile)
5. Save each image to a folder called "Happy"

📸 CAPTURING NEUTRAL IMAGES (50-100 photos)
──────────────────────────────────────────
1. Sit 30cm from camera (same distance!)
2. Keep face completely relaxed, no expression
3. Click "Save" to download
4. Slightly change angle, lighting
5. Save each image to a folder called "Neutral"
Expression Tips — Make Them OBVIOUS
😊 HAPPY:
   ✅ Big wide smile
   ✅ Show teeth
   ✅ Squint eyes slightly (natural smile)
   ❌ Don't do subtle polite smile

😐 NEUTRAL:
   ✅ Completely relaxed
   ✅ Mouth closed, natural
   ✅ Eyes looking at camera
   ❌ Don't accidentally smile

😢 SAD (add later):
   ✅ Corners of mouth pulled DOWN
   ✅ Head tilted slightly down
   ✅ Droopy eyes
   ❌ Don't look too similar to neutral

😠 ANGRY (add later):
   ✅ Furrow eyebrows HARD
   ✅ Clench jaw
   ✅ Squint eyes aggressively
   ❌ Don't look too similar to sad
Organize Your Folders
📁 facial_expressions/
├── 📁 Happy/
│   ├── img_001.jpg
│   ├── img_002.jpg
│   ├── img_003.jpg
│   └── ... (50-100 images)
│
└── 📁 Neutral/
    ├── img_001.jpg
    ├── img_002.jpg
    ├── img_003.jpg
    └── ... (50-100 images)
Phase 3: Edge Impulse — Create Project & Upload Data
Step 1: Create Edge Impulse Account & Project
1. Go to https://studio.edgeimpulse.com
2. Sign up (free account)
3. Click "Create new project"
4. Name it: "Facial Expression XIAO"
5. Select: "Images" as project type
6. Select: "Classify single image" (not object detection)
Step 2: Upload Your Images
1. Go to "Data acquisition" (left menu)
2. Click "+ Add data" button (top right)
3. Click "Upload data"
4. Settings:
   - Category: "Split automatically between training and testing"
   - Label: "Happy"
5. Click "Choose files" → Select ALL images from your Happy folder
6. Click "Upload data"
7. Wait for upload to complete
8. Repeat for Neutral:
   - Click "+ Add data" → "Upload data"
   - Label: "Neutral"
   - Choose all files from Neutral folder
   - Upload
Step 3: Verify Your Data
After uploading, you should see:

Data acquisition page:
┌─────────────────────────────┐
│ Training data:              │
│   Happy: ~80 images         │
│   Neutral: ~80 images       │
│                             │
│ Test data:                  │
│   Happy: ~20 images         │
│   Neutral: ~20 images       │
└─────────────────────────────┘
(Edge Impulse auto-splits 80/20)
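The counts in the verification box are plain arithmetic on the number of images uploaded per label. A minimal sketch, assuming an exact 80/20 split; `split_counts` is a hypothetical helper written for this guide, not an Edge Impulse API, and the real split may differ by a few images per class:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: expected {training, testing} image counts for one
// label, assuming a fixed percentage of the images goes to training.
std::pair<std::size_t, std::size_t> split_counts(std::size_t total,
                                                 unsigned train_percent = 80) {
  assert(train_percent <= 100);
  std::size_t train = total * train_percent / 100;  // integer math, rounds down
  return {train, total - train};
}
```

With 100 Happy images, `split_counts(100)` gives 80 for training and 20 for testing, which matches the numbers shown above.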
Phase 4: Edge Impulse — Design the Impulse
Step 1: Create Impulse
1. Go to "Create impulse" (left menu)
2. Set these blocks:

┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ IMAGE DATA   │ → │ PROCESSING   │ → │ LEARNING     │
│              │   │              │   │              │
│ Image width: │   │ Image        │   │ Transfer     │
│ 96           │   │ (built-in)   │   │ Learning     │
│ Image height:│   │              │   │ (Images)     │
│ 96           │   │              │   │              │
│ Resize mode: │   │              │   │              │
│ Fit shortest │   │              │   │              │
└──────────────┘   └──────────────┘   └──────────────┘
Detailed steps:
1. Image data block (already added):
   - Image width: 96
   - Image height: 96
   - Resize mode: Fit shortest axis
2. Click "+ Add a processing block" → Choose "Image"
3. Click "+ Add a learning block" → Choose "Transfer Learning (Images)"
4. Click "Save Impulse"
Step 2: Configure Image Processing
1. Click "Image" in the left menu (under Impulse design)
2. Set Color depth: Grayscale
   (Grayscale works better for expressions + saves memory on XIAO)
3. Click "Save parameters"
4. Click "Generate features"
5. Wait for processing to complete
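The memory saving from Grayscale comes from feeding the model one value per pixel instead of three. A small sketch of that arithmetic for the 96x96 input used here; `input_value_count` is a hypothetical helper, and it counts model input values only, not the full RAM the model needs:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of values in one model input image.
// channels is 1 for Grayscale and 3 for RGB.
std::size_t input_value_count(std::size_t width, std::size_t height,
                              std::size_t channels) {
  assert(channels == 1 || channels == 3);
  return width * height * channels;
}
```

For this impulse, Grayscale gives 96 x 96 x 1 = 9216 values per image, while RGB would give 96 x 96 x 3 = 27648.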
Step 3: Check Feature Explorer 🔍
After generating features, check the Feature Explorer:

GOOD ✅ (SEPARATED)
   🟠🟠🟠
   🟠🟠🟠

   🟢🟢🟢
   🟢🟢🟢

BAD ❌ (ALL MIXED)
   🟠🟢🟠🟢🟠🟢
   🟢🟠🟢🟠🟢🟠
   🟠🟢🟠🟢🟠
   🟢🟠🟢🟠🟢🟠

If GOOD → Continue to training
If BAD → Go back and capture better images
Phase 5: Edge Impulse — Train the Model
Step 1: Configure Training
1. Click "Transfer learning" in left menu
2. Set these parameters:
Number of training cycles: 50
Learning rate: 0.0005
Data augmentation: ON ✅
Neural network architecture: MobileNetV2 96x96 0.1
(smallest option — fits on XIAO!)
IMPORTANT: Choose MobileNetV2 96x96 0.1. It is the smallest architecture in the list, which makes it the safest choice for fitting in your XIAO's memory!
Step 2: Start Training
1. Click "Start training"
2. Wait 2-5 minutes
3. Check results:

GOOD RESULTS (2 classes):
┌──────────────────────────┐
│ Accuracy: 85-95% ✅      │
│ Loss: < 0.5 ✅           │
│                          │
│ Confusion Matrix:        │
│           Happy  Neutral │
│ Happy:    92%    8%      │
│ Neutral:  5%     95%     │
└──────────────────────────┘
What Your Results Mean
Accuracy > 85%  → ✅ Great! Continue to deployment
Accuracy 70-85% → ⚠️ OK, but try more/better images
Accuracy < 70%  → ❌ Need better data, check image quality
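The accuracy figure and the confusion matrix are two views of the same test results: accuracy is the share of images that land on the matrix diagonal. A minimal sketch of that relationship and of the three bands above. Assumptions: the matrix holds raw image counts (Edge Impulse displays percentages), and `overall_accuracy` and `verdict` are hypothetical helpers:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Overall accuracy from a square confusion matrix of raw counts.
// Rows are the actual class, columns are the predicted class.
double overall_accuracy(const std::vector<std::vector<int>> &m) {
  long correct = 0;
  long total = 0;
  for (std::size_t r = 0; r < m.size(); ++r) {
    assert(m[r].size() == m.size());  // matrix must be square
    for (std::size_t c = 0; c < m.size(); ++c) {
      total += m[r][c];
      if (r == c) correct += m[r][c];  // diagonal = correctly classified
    }
  }
  return total > 0 ? static_cast<double>(correct) / total : 0.0;
}

// Maps an accuracy in the range 0.0-1.0 to the three bands of this guide.
std::string verdict(double accuracy) {
  if (accuracy > 0.85) return "continue to deployment";
  if (accuracy >= 0.70) return "try more or better images";
  return "need better data";
}
```

As an illustration with made-up counts, 18 of 20 Happy and 19 of 20 Neutral test images classified correctly gives 37/40 = 92.5%, which falls in the "continue to deployment" band.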
Phase 6: Edge Impulse — Configure Target Device
Step 1: Set Deployment Target
1. Go to "Deployment" in left menu
2. OR go to Dashboard → Target device settings
3. Enter these settings:
   Target device: Espressif ESP-EYE (ESP32 240MHz)
   Processor family: ESP32
   Clock rate: 240 MHz
   (Scroll down for Application Budget)
   Available RAM: 256 KB
   Available ROM: 2800 KB
   Maximum latency: 500 ms
4. Click "Save"
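The Application Budget acts as three upper limits that a trained model should stay under. A sketch of that check, using the budget values from this step; the `Budget` and `ModelEstimate` types and the `fits_budget` function are hypothetical, and the model numbers you compare against would be whatever estimates Edge Impulse reports for your own model:

```cpp
#include <cassert>
#include <cstddef>

// Application budget entered in the target device settings.
struct Budget {
  std::size_t ram_kb;
  std::size_t rom_kb;
  unsigned latency_ms;
};

// Hypothetical container for one model's estimated on-device cost.
struct ModelEstimate {
  std::size_t peak_ram_kb;
  std::size_t flash_kb;
  unsigned inference_ms;
};

// True when every estimate stays within its budget limit.
bool fits_budget(const ModelEstimate &m, const Budget &b) {
  assert(b.ram_kb > 0 && b.rom_kb > 0);  // a zero budget is almost surely a typo
  return m.peak_ram_kb <= b.ram_kb &&
         m.flash_kb <= b.rom_kb &&
         m.inference_ms <= b.latency_ms;
}
```

With the values from this guide the budget is `Budget{256, 2800, 500}`; a model that exceeds any one of the three limits is a signal to pick a smaller architecture or a lower input size.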
Step 2: Build Arduino Library
1. Go to "Deployment" (left menu)
2. Select "Arduino library"
3. Select "Quantized (int8)" ← Smaller, faster, works on XIAO
4. Click "Build"
5. Wait for build to complete
6. A .zip file will download automatically
   📦 ei-facial-expression-xiao-arduino-1.0.1.zip
Phase 7: Deploy to XIAO ESP32S3
Step 1: Install the Library in Arduino IDE
1. Open Arduino IDE
2. Sketch → Include Library → Add .ZIP Library
3. Select the downloaded .zip file
4. Wait for "Library installed" message
Step 2: Create the Inference Sketch
Create a new sketch and paste this code:
/* ============================================
 * Facial Expression Detection
 * XIAO ESP32S3 Sense
 * ============================================ */

// ---- CHANGE THIS to match your library name ----
#include <Facial_Expression_XIAO_inferencing.h>
// Check: Arduino/libraries/ folder for exact name

#include "esp_camera.h"
#include "img_converters.h"  // declares fmt2rgb888()

// ----- XIAO ESP32S3 Camera Pins -----
#define PWDN_GPIO_NUM  -1
#define RESET_GPIO_NUM -1
#define XCLK_GPIO_NUM  10
#define SIOD_GPIO_NUM  40
#define SIOC_GPIO_NUM  39
#define Y9_GPIO_NUM    48
#define Y8_GPIO_NUM    11
#define Y7_GPIO_NUM    12
#define Y6_GPIO_NUM    14
#define Y5_GPIO_NUM    16
#define Y4_GPIO_NUM    18
#define Y3_GPIO_NUM    17
#define Y2_GPIO_NUM    15
#define VSYNC_GPIO_NUM 38
#define HREF_GPIO_NUM  47
#define PCLK_GPIO_NUM  13

// ----- Settings -----
// The frame size must match the impulse input size (96x96)
#define CAMERA_FRAME_SIZE FRAMESIZE_96X96
#define IMAGE_WIDTH  96
#define IMAGE_HEIGHT 96

// ----- Global Variables -----
camera_fb_t *fb = NULL;

// =============================================
// SETUP
// =============================================
void setup() {
  Serial.begin(115200);
  while (!Serial && millis() < 3000);

  Serial.println("=================================");
  Serial.println(" Facial Expression Detection");
  Serial.println(" XIAO ESP32S3 Sense");
  Serial.println("=================================");

  // Initialize camera
  camera_config_t config;
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer = LEDC_TIMER_0;
  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;
  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;
  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;
  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;
  config.pin_xclk = XCLK_GPIO_NUM;
  config.pin_pclk = PCLK_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href = HREF_GPIO_NUM;
  config.pin_sccb_sda = SIOD_GPIO_NUM;
  config.pin_sccb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn = PWDN_GPIO_NUM;
  config.pin_reset = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.pixel_format = PIXFORMAT_JPEG;
  config.frame_size = CAMERA_FRAME_SIZE;
  config.grab_mode = CAMERA_GRAB_WHEN_EMPTY;
  config.fb_location = CAMERA_FB_IN_PSRAM;
  config.jpeg_quality = 10;
  config.fb_count = 1;

  esp_err_t err = esp_camera_init(&config);
  if (err != ESP_OK) {
    Serial.printf("Camera init FAILED! Error: 0x%x\n", err);
    Serial.println("Check: Is camera module attached?");
    while (1) { delay(1000); }
  }

  Serial.println("Camera initialized OK!");
  Serial.println("Starting detection...\n");
}

// =============================================
// Placeholder helper (not called anywhere in this sketch).
// classify_image() below does the real capture and the
// JPEG-to-RGB conversion for Edge Impulse.
// =============================================
bool get_camera_data(size_t offset, size_t length, float *out_ptr) {
  fb = esp_camera_fb_get();
  if (!fb) {
    Serial.println("Camera capture failed!");
    return false;
  }
  // This stub only grabs a frame and releases it again;
  // it never decodes the JPEG or writes to out_ptr.
  esp_camera_fb_return(fb);
  return true;
}

// =============================================
// Capture and classify
// =============================================
void classify_image() {
  Serial.println("📸 Capturing image...");

  // Capture frame
  fb = esp_camera_fb_get();
  if (!fb) {
    Serial.println("❌ Camera capture failed!");
    return;
  }

  // Allocate buffer for RGB888 image
  uint8_t *rgb_buf = (uint8_t *)malloc(IMAGE_WIDTH * IMAGE_HEIGHT * 3);
  if (!rgb_buf) {
    Serial.println("❌ Memory allocation failed!");
    esp_camera_fb_return(fb);
    return;
  }

  // Convert JPEG to RGB888
  bool converted = fmt2rgb888(fb->buf, fb->len, PIXFORMAT_JPEG, rgb_buf);
  esp_camera_fb_return(fb);
  if (!converted) {
    Serial.println("❌ Image conversion failed!");
    free(rgb_buf);
    return;
  }

  // Prepare signal for Edge Impulse
  ei_impulse_result_t result = { 0 };

  // Create signal from RGB buffer
  signal_t signal;
  signal.total_length = IMAGE_WIDTH * IMAGE_HEIGHT;
  signal.get_data = [rgb_buf](size_t offset, size_t length, float *out_ptr) -> int {
    for (size_t i = 0; i < length; i++) {
      size_t pixel = offset + i;
      // Convert RGB to single float (same format as Edge Impulse expects)
      uint8_t r = rgb_buf[pixel * 3 + 0];
      uint8_t g = rgb_buf[pixel * 3 + 1];
      uint8_t b = rgb_buf[pixel * 3 + 2];
      // Pack as 0xRRGGBB float (Edge Impulse format)
      out_ptr[i] = (r << 16) + (g << 8) + b;
    }
    return 0;
  };

  // Run classifier
  EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);

  // Free RGB buffer
  free(rgb_buf);

  if (res != EI_IMPULSE_OK) {
    Serial.printf("❌ Classifier failed! Error: %d\n", res);
    return;
  }

  // ---- Print Results ----
  Serial.println("┌─────────────────────────────┐");
  Serial.println("│ DETECTION RESULTS │");
  Serial.println("├─────────────────────────────┤");

  float max_value = 0;
  int max_index = 0;
  for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
    Serial.printf("│ %-12s : %5.1f%% │\n",
                  result.classification[i].label,
                  result.classification[i].value * 100);
    if (result.classification[i].value > max_value) {
      max_value = result.classification[i].value;
      max_index = i;
    }
  }

  Serial.println("├─────────────────────────────┤");

  // Show detected expression with emoji
  const char* expression = result.classification[max_index].label;
  if (strcmp(expression, "Happy") == 0) {
    Serial.println("│ 😊 HAPPY! │");
  } else if (strcmp(expression, "Sad") == 0) {
    Serial.println("│ 😢 SAD │");
  } else if (strcmp(expression, "Angry") == 0) {
    Serial.println("│ 😠 ANGRY │");
  } else if (strcmp(expression, "Neutral") == 0) {
    Serial.println("│ 😐 NEUTRAL │");
  } else {
    Serial.printf("│ → %s\n", expression);
  }

  Serial.printf("│ Confidence: %.1f%% │\n", max_value * 100);
  Serial.println("└─────────────────────────────┘\n");
}

// =============================================
// MAIN LOOP
// =============================================
void loop() {
  classify_image();
  delay(2000);  // Classify every 2 seconds
}
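
The part of the sketch that is easiest to get wrong is the `get_data` callback, which hands each pixel to the classifier as a single float holding 0xRRGGBB. The same logic can be pulled out into plain functions and checked on a desktop compiler before flashing. This is a sketch under assumptions: `pack_rgb888` and `fill_signal` are hypothetical names, not part of the Edge Impulse library, and the bounds check is an addition that the sketch above does not have:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Packs one pixel as 0xRRGGBB stored in a float. The largest value,
// 0xFFFFFF, is below 2^24, so a 32-bit float represents it exactly.
float pack_rgb888(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
  std::uint32_t packed = (static_cast<std::uint32_t>(r) << 16) |
                         (static_cast<std::uint32_t>(g) << 8) |
                         static_cast<std::uint32_t>(b);
  return static_cast<float>(packed);
}

// Fills out[0..length) from an interleaved RGB888 buffer, starting at
// pixel index `offset`. Returns 0 on success, -1 if the request would
// read past the end of the image.
int fill_signal(const std::uint8_t *rgb, std::size_t pixel_count,
                std::size_t offset, std::size_t length, float *out) {
  assert(rgb != nullptr && out != nullptr);
  if (offset + length > pixel_count) return -1;
  for (std::size_t i = 0; i < length; ++i) {
    const std::uint8_t *p = rgb + (offset + i) * 3;
    out[i] = pack_rgb888(p[0], p[1], p[2]);
  }
  return 0;
}
```

In the sketch, the lambda body could call `fill_signal(rgb_buf, IMAGE_WIDTH * IMAGE_HEIGHT, offset, length, out_ptr)` and return its result, so an out-of-range request fails cleanly instead of reading past the buffer.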
Step 3: Fix the Library Include Name
IMPORTANT! Find the correct library name:
1. Go to: Arduino/libraries/ folder on your computer
2. Find the Edge Impulse library folder name
3. Open: src/ folder inside it
4. Find the .h file that looks like: Facial_Expression_XIAO_inferencing.h
5. Update the #include line in your sketch to match EXACTLY:
   #include <YOUR_EXACT_LIBRARY_NAME_inferencing.h>
Step 4: Upload to XIAO
Arduino IDE Settings (same as before):
Board: XIAO_ESP32S3
PSRAM: OPI PSRAM
Flash Size: 8MB
Partition Scheme: Huge APP (3MB No OTA/1MB SPIFFS)
Port: (your COM port)

Click Upload → Wait for completion
Step 5: Open Serial Monitor & Test
1. Tools → Serial Monitor
2. Set baud rate: 115200
3. Point camera at your face
4. You should see:

=================================
 Facial Expression Detection
 XIAO ESP32S3 Sense
=================================
Camera initialized OK!
Starting detection...

📸 Capturing image...
┌─────────────────────────────┐
│ DETECTION RESULTS │
├─────────────────────────────┤
│ Happy        :  92.3% │
│ Neutral      :   7.7% │
├─────────────────────────────┤
│ 😊 HAPPY! │
│ Confidence: 92.3% │
└─────────────────────────────┘
Phase 8: Add More Classes (After 2-Class Success)
Once Happy vs Neutral works well:
Round 2: Add SAD
─────────────────
1. Go back to CameraWebServer sketch
2. Upload it to XIAO again
3. Capture 50-100 Sad expression images
4. Upload to Edge Impulse with label "Sad"
5. Re-generate features → Check feature explorer
6. Re-train model
7. Re-deploy

Round 3: Add ANGRY
─────────────────
1. Same process: capture Angry images
2. Upload → Re-train → Re-deploy
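
Each new class also needs a matching branch in the sketch's if/else chain that prints the emoji. One possible alternative is a small lookup table, so that a new label becomes a one-line change. A sketch, assuming the labels are spelled exactly as in Edge Impulse; `ExpressionIcon`, `kIcons` and `icon_for` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// One table row per expression label used in Edge Impulse.
struct ExpressionIcon {
  const char *label;
  const char *icon;
};

static const ExpressionIcon kIcons[] = {
  {"Happy",   "😊"},
  {"Neutral", "😐"},
  {"Sad",     "😢"},
  {"Angry",   "😠"},
};

// Returns the emoji for a label, or "?" when the label is not in the table.
const char *icon_for(const char *label) {
  assert(label != nullptr);
  for (std::size_t i = 0; i < sizeof(kIcons) / sizeof(kIcons[0]); ++i) {
    if (std::strcmp(kIcons[i].label, label) == 0) return kIcons[i].icon;
  }
  return "?";
}
```

Labels are case-sensitive here, so a class uploaded as "sad" instead of "Sad" would fall through to "?", which is a quick hint that the label spelling differs between the sketch and the project.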
🔧 Troubleshooting
| Problem | Solution |
|---|---|
| Camera init failed | Check camera module is attached firmly; Enable PSRAM in board settings |
| WiFi not connecting | Check SSID/password; Use 2.4GHz WiFi only (not 5GHz) |
| No IP address shown | Press RESET button; Check Serial Monitor baud is 115200 |
| Browser shows nothing | Make sure phone/computer is on SAME WiFi network |
| Library include error | Check exact .h filename in Arduino/libraries/ folder |
| Out of memory | Use MobileNetV2 96x96 0.1; Use Grayscale; Use int8 quantization |
| Low accuracy | Capture MORE images; Make expressions MORE exaggerated |
| Feature explorer mixed | Images too similar; Clean bad images; Reduce classes |
📊 Complete Project Checklist
Phase 1: Hardware
□ XIAO ESP32S3 Sense ready
□ Arduino IDE configured
□ Board settings correct (PSRAM enabled!)

Phase 2: Data Capture
□ CameraWebServer uploaded
□ WiFi connected, IP address obtained
□ Happy images captured (50-100)
□ Neutral images captured (50-100)
□ Images organized in folders

Phase 3: Edge Impulse Setup
□ Project created (Image classification)
□ Images uploaded with correct labels
□ 80/20 train/test split

Phase 4: Impulse Design
□ Image block: 96x96, Grayscale
□ Processing: Image block added
□ Learning: Transfer Learning added
□ Features generated
□ Feature explorer shows SEPARATED clusters ✅

Phase 5: Training
□ MobileNetV2 96x96 0.1 selected
□ Training cycles: 50
□ Data augmentation: ON
□ Accuracy > 85%

Phase 6: Deployment
□ Target device configured (256KB RAM, 2800KB ROM)
□ Arduino library built (int8 quantized)
□ .zip downloaded

Phase 7: Run on XIAO
□ Library installed in Arduino IDE
□ Inference sketch uploaded
□ Serial monitor showing results
□ Correct expressions detected! 🎉

Phase 8: Expand
□ Add Sad class → retrain → redeploy
□ Add Angry class → retrain → redeploy
□ Final 4-class model working! 🎯