Week 18 - Applications and Implications
- assignment
- Propose a final project masterpiece that integrates the range of units covered, answering:
- What will it do?
- Who's done what beforehand?
- What will you design?
- What materials and components will be used?
- Where will they come from?
- How much will they cost?
- What parts and systems will be made?
- What processes will be used?
- What questions need to be answered?
- How will it be evaluated?
- Your project should incorporate 2D and 3D design, additive and subtractive fabrication processes, electronics design and production, embedded microcontroller interfacing and programming, system integration and packaging
- Where possible, you should make rather than buy the parts of your project
- Projects can be separate or joint, but need to show individual mastery of the skills, and be independently operable
Propose a final project masterpiece that integrates the range of units covered
What will it do?
This project is a smart glasses kit that attaches to a regular pair of glasses to add intelligent features. It supports voice interaction and visual understanding. Example use cases include:
- Asking “What’s the weather like today?” and receiving a spoken response.
- Asking “What’s in front of me?” and getting a description of the scene.
Who’s done what beforehand?
Commercial smart glasses such as Google Glass and Meta Ray-Ban provide integrated smart features, but they are often expensive and closed-source. This project aims to create a low-cost, open-source alternative focused on voice and vision capabilities, tailored to prototyping and educational use.
This project is based on the xiaozhi-esp32 platform. On top of this foundation, I made the following key contributions:
- Hardware Driver Integration: I developed and integrated drivers for the custom hardware modules I designed.
- Hardware Adaptation: I adapted the system to ensure compatibility with the newly designed hardware components.
- Image Recognition Functionality: I added support for image recognition, enabling the device to perform visual analysis.
These additions significantly extended the original functionality of xiaozhi-esp32, making it more suitable for our smart glasses kit application.
What will you design?
- A custom PCB (SmartGlassesKit) with minimal components (one button and one LED) for control and feedback.
- A 3D printed enclosure that can be attached to various glasses frames.
- Integration of microphone, speaker, camera, display, and prism into a wearable form.
What materials and components will be used?
- Microcontroller: XIAO ESP32S3 with Sense (camera + microphone) — $14.99
- Custom SmartGlassesKit PCB: ¥ 20 for 5 pieces
- GC9107 0.85” LCD: ¥ 29
- MAX98357 audio amplifier: ¥ 10
- Speaker: ¥ 5
- LED x5: ¥ 5
- Button x5: ¥ 5
- Beam splitter prism: ¥ 20
- Li-ion battery: ¥ 10
- Enclosure: 3D printed using PLA/TPU
Where will they come from?
- Most electronic components will be sourced from local Chinese electronics markets or online platforms like Taobao.
- The 3D printed parts will be fabricated in-house using a personal or FabLab 3D printer.
How much will they cost?
Approximate total cost (for one unit):
- XIAO ESP32S3 + Sense: $14.99
- PCB and components: ¥ 35
- LCD: ¥ 29
- Audio and speaker: ¥ 15
- Prism: ¥ 20
- Enclosure (3D printed): negligible material cost
Total: around $30–35
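The per-unit total mixes two currencies, so the arithmetic can be checked with a short sketch. The prices are copied from the list above. The exchange rate (`CNY_PER_USD = 7.2`) is an assumption and not a figure from this project. The battery (¥ 10) appears in the materials list but not in the per-unit cost list, so it is left out here; adding it raises the result by a little over one dollar.

```python
# Per-unit cost check. Prices come from the cost list above;
# the exchange rate is an ASSUMPTION, not a value from this project.
CNY_PER_USD = 7.2  # assumed rate, adjust to the current one

items_usd = {"XIAO ESP32S3 + Sense": 14.99}
items_cny = {
    "PCB and components": 35,
    "LCD": 29,
    "Audio and speaker": 15,
    "Prism": 20,
}


def total_usd(usd_items, cny_items, cny_per_usd):
    """Sum the USD items plus the CNY items converted to USD."""
    cny_subtotal = sum(cny_items.values())
    return sum(usd_items.values()) + cny_subtotal / cny_per_usd


cny_subtotal = sum(items_cny.values())
total = total_usd(items_usd, items_cny, CNY_PER_USD)

print(f"CNY subtotal: ¥ {cny_subtotal}")
print(f"Estimated total: ${total:.2f}")
```

At the assumed rate the listed items come to roughly $29, which is consistent with the "around $30–35" estimate once the battery and small consumables are included.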
What parts and systems will be made?
- A custom PCB (SmartGlassesKit)
- A 3D printed enclosure
- Full system integration of voice capture, camera input, audio output, display, and control logic.
What processes will be used?
- 2D design (GIMP)
- 3D design (FreeCAD)
- PCB design (JLC PCB)
- 3D printing (additive manufacturing)
- Soldering and electronics assembly
- Embedded programming (ESP-IDF or Arduino)
- Application development (Python)
- System integration and testing
What questions need to be answered?
- How to optimize latency and performance between the device and server?
- How to ensure a comfortable and stable fit on various glasses frames?
- How to manage power and battery life in a compact wearable form?
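One way to approach the latency question is to time each stage of the voice pipeline separately and see where the delay accumulates. The following is a minimal sketch, not the project's implementation: `speech_to_text`, `answer_query`, and `text_to_speech` are hypothetical stand-ins for the real device and server calls, and only the timing helper is meant to carry over.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> elapsed seconds


@contextmanager
def timed(stage):
    """Record the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Hypothetical stand-ins for the real device/server calls.
def speech_to_text(audio):
    return "what is the weather like today"


def answer_query(text):
    return "It is sunny."


def text_to_speech(text):
    return b"\x00" * 16  # placeholder audio bytes


with timed("stt"):
    text = speech_to_text(b"")
with timed("query"):
    reply = answer_query(text)
with timed("tts"):
    audio = text_to_speech(reply)

# Print a per-stage breakdown in milliseconds.
for stage, seconds in timings.items():
    print(f"{stage}: {seconds * 1000:.2f} ms")
```

Replacing the stand-ins with the real calls would give a per-stage breakdown, which helps decide whether to optimize the network round trip or the processing on either side.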
How will it be evaluated?
- The device should independently perform speech recognition (STT), query processing, and TTS response.
- It should capture an image and describe the scene when asked.
- All components should work in an integrated, wearable form factor.
- The system should be reproducible and documented.