Week 17 Wildcard Week

Individual assignment

TinyML AI vision with Seeed XIAO ESP32S3 Sense for classification and recognition

TinyML

TinyML means running machine learning models on very small, low-power devices such as microcontrollers.

Traditional AI usually runs on powerful computers, GPUs, or cloud servers. TinyML is different because it allows AI models to run directly on small embedded hardware with limited memory, limited processing power, and low energy consumption.

TinyML is useful for:

| Application | Example |
| --- | --- |
| Smart sensors | Detecting sound, motion, or objects |
| Wearable devices | Gesture or activity recognition |
| Industrial monitoring | Detecting abnormal machine behavior |
| Smart homes | Local object or person detection |
| Fab Lab projects | Interactive tools and intelligent devices |

In this project, TinyML is used for image classification.

AI vision

AI vision uses machine learning models to interpret the content of an image, for example to decide which object or scene it shows.

In this project, the camera captures an image, and the model predicts which category the image belongs to.
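As a minimal sketch of that prediction step (not the code that runs on the board), a classifier typically outputs one raw score per category, and the category with the highest probability is reported. The label names, the scores, and the 0.6 confidence threshold below are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels, threshold=0.6):
    """Return (label, confidence); the label is "uncertain" below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "uncertain", probs[best]
    return labels[best], probs[best]


# Hypothetical labels and raw scores, for illustration only.
labels = ["statue_a", "statue_b", "background"]
label, confidence = classify([2.5, 0.3, 0.1], labels)
print(label, round(confidence, 2))
```

Rejecting low-confidence results is one simple way to avoid reporting a category when the camera sees something the model was never trained on.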

Seeed XIAO ESP32S3 Sense

I chose the XIAO ESP32S3 Sense because it is small, powerful, and suitable for TinyML vision applications.

It includes:

| Feature | Description |
| --- | --- |
| Microcontroller | ESP32-S3 |
| Camera | Onboard camera module |
| Wireless | Wi-Fi and Bluetooth Low Energy |
| Memory | PSRAM and Flash memory |
| Size | Very compact |
| Extra sensor | Digital microphone |
| Power | USB-C or battery |

The XIAO ESP32S3 Sense is a good platform for this project because it combines a camera, wireless communication, and enough processing capability for simple embedded AI tasks.

Applications: image processing, speech recognition, video monitoring, wearable devices, smart homes, health monitoring, education, low-power (LP) networking, and rapid prototyping.

Comparison: XIAO ESP32-C3 vs XIAO ESP32S3 Sense

| Feature | XIAO ESP32-C3 | XIAO ESP32S3 Sense | Key difference |
| --- | --- | --- | --- |
| Processor | Single-core RISC-V processor | Dual-core Xtensa LX7 processor | ESP32-S3 has more processing power and is better for complex tasks. |
| Clock speed | Up to 160 MHz | Up to 240 MHz | ESP32-S3 runs faster than ESP32-C3. |
| Flash memory | 4 MB Flash | 8 MB Flash | ESP32-S3 provides more storage for firmware, libraries, and model files. |
| PSRAM | No external PSRAM | 8 MB PSRAM | ESP32-S3 Sense can handle larger buffers, camera images, and AI models. |
| Wireless connectivity | Wi-Fi 4 and Bluetooth 5 | Wi-Fi 4 and Bluetooth 5 | Both support wireless communication, but ESP32-S3 can handle heavier wireless and processing tasks more efficiently. |
| AI capability | Suitable for simple IoT logic and lightweight tasks | More suitable for TinyML and AI vision applications | ESP32-S3 Sense is better for embedded AI projects. |
| Built-in sensors | No built-in camera, microphone, or IMU | Includes camera module and microphone | ESP32-S3 Sense has more integrated sensing capabilities. |
| Camera support | Not designed as a camera-based AI board | Includes camera module | ESP32-S3 Sense is much better for AI vision projects. |
| Audio support | Limited; external hardware needed | Built-in digital microphone | ESP32-S3 Sense can be used for voice, sound, or audio recognition projects. |
| Power consumption | Better for low-power and simple applications | Higher performance, but may consume more power | ESP32-C3 is better when low power is the main priority. |
| Typical use cases | Simple IoT devices, sensors, wireless control, low-power nodes | AI vision, TinyML, image classification, audio recognition, advanced IoT | ESP32-C3 is simpler and lower-power; ESP32-S3 Sense is more powerful and sensor-rich. |
| Best for | Lightweight IoT and cost-sensitive projects | AI, camera-based projects, motion/audio sensing, advanced connected devices | Choose based on whether the project needs AI and camera processing. |

Summary

The XIAO ESP32-C3 is suitable for simple, low-power IoT applications such as sensor nodes, wireless switches, and basic connected devices.

The XIAO ESP32S3 Sense is better suited for more demanding projects, especially those involving TinyML, AI vision, image classification, audio processing, and sensor-based interaction. Its dual-core processor, higher clock speed, larger memory, PSRAM, camera, and microphone make it a stronger choice for AI and interactive projects.
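To make the memory argument concrete, the rough calculation below estimates how many bytes a single uncompressed camera frame occupies and compares it with the 8 MB of PSRAM. The frame sizes and pixel formats are illustrative assumptions, not the settings used in this project.

```python
# Bytes needed to store one pixel in a few common formats.
BYTES_PER_PIXEL = {"GRAYSCALE": 1, "RGB565": 2, "RGB888": 3}

PSRAM_BYTES = 8 * 1024 * 1024  # 8 MB PSRAM, as listed in the comparison above


def frame_bytes(width, height, pixel_format):
    """Return the size in bytes of one uncompressed frame."""
    return width * height * BYTES_PER_PIXEL[pixel_format]


# Illustrative frame sizes (assumptions, not the settings used in this project).
EXAMPLE_FRAMES = {
    "96x96 grayscale model input": (96, 96, "GRAYSCALE"),
    "320x240 RGB565 preview": (320, 240, "RGB565"),
    "1600x1200 RGB565 full frame": (1600, 1200, "RGB565"),
}

for name, (width, height, pixel_format) in EXAMPLE_FRAMES.items():
    size = frame_bytes(width, height, pixel_format)
    print(f"{name}: {size} bytes, {size / PSRAM_BYTES:.1%} of 8 MB PSRAM")
```

A small model input needs only a few kilobytes, while a single full-resolution frame already needs several megabytes, which is why a board without external PSRAM is a poor fit for camera work.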

Edge Impulse

Edge Impulse is an end-to-end development platform for Edge AI and TinyML. It helps developers build machine learning models and deploy them to edge devices such as microcontrollers, sensors, cameras, gateways, CPUs, GPUs, and NPUs.

A typical Edge Impulse workflow includes:

  • Collecting and labeling data
  • Designing a machine learning pipeline
  • Extracting features from sensor or image data
  • Training and testing a model
  • Optimizing it for constrained hardware
  • Deploying it to real embedded devices

Its main concept is called an Impulse, which is basically a machine learning pipeline:

data input → signal processing / feature extraction → model training → testing → deployment
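The stages of such a pipeline can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: it does not use the Edge Impulse API, the function names are hypothetical, and a nearest-centroid classifier stands in for the neural network that a real impulse would train.

```python
def extract_features(image, block=2):
    """Feature extraction: average each block x block tile, scaled to 0..1."""
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            tile = [image[r + dr][c + dc]
                    for dr in range(block) for dc in range(block)]
            features.append(sum(tile) / len(tile) / 255.0)
    return features


def train_centroids(samples):
    """Training: compute the mean feature vector of each label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {label: [sum(column) / len(column) for column in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, features):
    """Inference: return the label whose centroid is closest to the features."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=squared_distance)


# Data input: tiny made-up 4x4 grayscale images (pixel values 0..255).
dark = [[20] * 4 for _ in range(4)]
bright = [[230] * 4 for _ in range(4)]
samples = [(extract_features(dark), "dark"), (extract_features(bright), "bright")]

centroids = train_centroids(samples)

# Testing: classify an image that was not part of the training data.
unseen = [[200] * 4 for _ in range(4)]
prediction = predict(centroids, extract_features(unseen))
print(prediction)
```

Keeping feature extraction separate from the classifier mirrors the idea of an impulse: the same processing step runs during training and later on the device.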

Edge Impulse is commonly used for:

  • Sensor AI, such as gesture recognition, vibration monitoring, anomaly detection, and audio classification.
  • Computer vision, such as image classification and object detection.
  • Industrial and IoT applications, such as predictive maintenance, smart agriculture, smart devices, and on-device monitoring.

For embedded developers, its biggest value is that it simplifies the TinyML workflow. Instead of manually handling data collection, model training, quantization, conversion, and deployment, Edge Impulse provides a more integrated process.
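Quantization is the least obvious of those steps, so here is a minimal sketch of the affine int8 scheme that is common in TinyML toolchains, where a real value is approximated as scale × (q − zero_point). The function names and example weights are assumptions for illustration; this is not Edge Impulse's actual implementation.

```python
def quant_params(min_val, max_val, qmin=-128, qmax=127):
    """Derive scale and zero point so [min_val, max_val] maps onto int8."""
    min_val = min(min_val, 0.0)  # the range must contain 0 so that
    max_val = max(max_val, 0.0)  # zero is represented exactly
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0:
        scale = 1.0  # degenerate case: every value is zero
    zero_point = round(qmin - min_val / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes, clamping to the representable range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]


# Made-up weights, for illustration only.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = quant_params(min(weights), max(weights))
codes = quantize(weights, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
print(codes)
print([round(v, 3) for v in restored])
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of at most about one quantization step.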

In one sentence: Edge Impulse makes it easier to build and deploy machine learning models on real-world edge devices.

Vision AI with XIAO ESP32S3 Sense for face recognition

Step 1 — Connect XIAO ESP32S3 with a laptop

Step 2 — Log in to Edge Impulse and select vision training

Step 3 — Collect raw data and connect the board

Collect raw data for training, select XIAO ESP32S3 Sense, and connect the board to Edge Impulse.

To set up the Seeed XIAO ESP32S3 Sense, follow this guide: Seeed XIAO ESP32S3 Sense. I first used Cursor to help install the Edge Impulse CLI and configure the Arduino IDE, but after about 30 minutes the environment was still not fully configured. I then switched to the SenseCraft platform for training, because its configuration was already complete and I could connect the device directly.

Open SenseCraft, select SenseCraft AI, and start AI training.

Step 4 — Log in and set up the category

Step 5 — Collect raw data for each type

Step 6 — Training

Start the training. It finished after about 2 minutes.

Step 7 — Testing

See the video below. In this test, the TinyML classification model appeared to classify the objects with high accuracy, although I did not measure the accuracy numerically.
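One way to put a number on the test result would be to log the true and predicted label for each test image and compute the accuracy from that log. The sketch below shows the calculation; the labels are placeholders, not measurements from this project.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


# Placeholder test log, for illustration only (not real results).
true_labels = ["statue_a", "statue_a", "statue_b", "statue_b", "background"]
predicted = ["statue_a", "statue_b", "statue_b", "statue_b", "background"]

print(f"accuracy: {accuracy(true_labels, predicted):.0%}")
for (truth, guess), count in sorted(confusion_counts(true_labels, predicted).items()):
    print(f"true={truth} predicted={guess}: {count}")
```

The per-pair counts show which categories get confused with each other, which is more useful than a single accuracy figure when deciding what extra training images to collect.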

Step 8 — Deploy the trained AI classification model to ESP32-S3 Sense

After deployment, the board works as an AI camera that can recognize different statues or characters.