Week 18 Applications and Implications

Individual assignment

Lucky Bot — Applications and Implications

1. What will it do?

Lucky Bot is a small desktop voice companion. It listens for the wake phrase "Hey Lucky," converts the user's speech to text, sends the text to an LLM, and replies in a cloned version of Lucky's voice.

The goal is not just another voice assistant but a warmer, more personal interaction. Lucky Bot has a physical enclosure, a small display UI, a microphone, and a web dashboard. The display shows the current state: listening, thinking, speaking, or task done.

The main interaction flow is:

Wake word → ASR → LLM → TTS (cloned voice) → Speaker
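
The flow above can be read as a chain of swappable stages. The following is a minimal Python sketch under that reading, not the project's actual code: the `Pipeline` class, its field names, and the stand-in lambdas in the demo are all hypothetical, and real backends (Whisper, an LLM API, a cloned-voice TTS) would replace the stubs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Chains the stages; each one is a plain function so backends can be swapped."""
    wake: Callable[[bytes], bool]   # audio -> True if the wake word was heard
    asr: Callable[[bytes], str]     # audio -> transcript
    llm: Callable[[str], str]       # transcript -> reply text
    tts: Callable[[str], bytes]     # reply text -> audio in the cloned voice
    play: Callable[[bytes], None]   # audio -> speaker

    def handle(self, audio: bytes) -> Optional[str]:
        """Run one interaction; return the reply text, or None if not woken."""
        if not self.wake(audio):
            return None             # no wake word: stay idle
        text = self.asr(audio)
        reply = self.llm(text)
        self.play(self.tts(reply))
        return reply


if __name__ == "__main__":
    # Stand-in stages (assumptions for illustration only).
    played = []
    bot = Pipeline(
        wake=lambda audio: audio.startswith(b"HEY"),
        asr=lambda audio: "what time is it",
        llm=lambda text: f"You asked: {text}",
        tts=lambda text: text.encode("utf-8"),
        play=played.append,
    )
    print(bot.handle(b"HEY what time is it"))
```

Keeping each stage behind a plain callable makes it possible to compare local and cloud options later without touching the rest of the flow.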

The final result should be a standalone intelligent voice buddy that combines hardware design, electronics, embedded programming, AI software, voice cloning, UI design, and system integration.

2. Who has done what beforehand?

There are many existing voice assistants, such as smart speakers and mobile assistants. These products can answer questions, control devices, and use cloud AI services. However, most of them are commercial, closed-source, and not very personal.

My project is different because I am not just assembling a software chatbot. I am designing and fabricating a complete physical object: enclosure, stand, PCB/electronics integration, display UI, voice interaction pipeline, and web dashboard. What sets it apart is the cloned Lucky voice, which makes the assistant feel like a familiar voice buddy.

3. What will I design?

I will design the following parts and systems:

Hardware design

  • 3D model of the Lucky Bot enclosure
  • 2D design for stand/storage box
  • Laser-cut transparent acrylic stand/base
  • Internal layout for PCB and wires
  • Assembly

Electronics

  • XIAO ESP32C3
  • Microphone input
  • 1.28 inch display
  • Custom PCB or prototype PCB for connecting components

Software

  • Wake word "Hey Lucky"
  • ASR using Whisper
  • LLM conversation using local LLM or OpenAI API
  • TTS / voice cloning pipeline (F5-TTS or iFlytek API)
  • Display UI states: listening, thinking, speaking, task done
  • Web dashboard for checking status
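
The display states listed above behave like a small state machine. Here is a hedged Python sketch of one way to model it on the server side; the `IDLE` state and all event names (`wake_word`, `speech_end`, and so on) are assumptions added for illustration and are not defined in this plan.

```python
from enum import Enum


class UiState(Enum):
    IDLE = "idle"              # assumed resting state, not named in the plan
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"
    TASK_DONE = "task done"


# Allowed transitions, keyed by (current state, event). Event names are hypothetical.
TRANSITIONS = {
    (UiState.IDLE, "wake_word"): UiState.LISTENING,
    (UiState.LISTENING, "speech_end"): UiState.THINKING,
    (UiState.THINKING, "reply_ready"): UiState.SPEAKING,
    (UiState.SPEAKING, "playback_end"): UiState.TASK_DONE,
    (UiState.TASK_DONE, "timeout"): UiState.IDLE,
}


def next_state(state: UiState, event: str) -> UiState:
    """Return the next state, or the current one if the event is not valid here."""
    return TRANSITIONS.get((state, event), state)
```

Ignoring events that are not valid in the current state keeps the display from jumping to a state the software is not actually in, which is one of the open questions in section 7.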

AI and interaction

  • Clone Lucky's voice and build a warm conversational personality
  • Optimize timing, response quality, and naturalness
  • Test local and cloud TTS options:
      • Local F5-TTS
      • F5-TTS + T4 server
      • iFlytek API
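
To compare the TTS options on speed, a small timing harness is enough. This sketch assumes each backend is wrapped in a Python function that takes a sentence and returns audio bytes; `time_backend` and `rank_backends` are hypothetical helper names, and no real TTS API is called here.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple

Synthesizer = Callable[[str], bytes]


def time_backend(synthesize: Synthesizer, sentences: List[str],
                 clock: Callable[[], float] = time.perf_counter) -> float:
    """Return the median synthesis time in seconds per sentence."""
    samples = []
    for sentence in sentences:
        start = clock()
        synthesize(sentence)
        samples.append(clock() - start)
    return statistics.median(samples)


def rank_backends(backends: Dict[str, Synthesizer],
                  sentences: List[str]) -> List[Tuple[str, float]]:
    """Time every backend on the same sentences and sort fastest first."""
    results = {name: time_backend(fn, sentences) for name, fn in backends.items()}
    return sorted(results.items(), key=lambda item: item[1])
```

The median is used instead of the mean so that a single slow request, for example a cold start on a remote server, does not dominate the comparison. Voice quality still has to be judged by listening.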

4. What materials and components will be used?

Components

Category | Component
Input | Microphone, INMP441
Output | Display, GC9A01 1.28 inch LCD SPI display
MCU | XIAO ESP32C3
PCB | PCB designed earlier and produced by JLC

  • Mic: INMP441, purchased from Taobao
  • Display: GC9A01 1.28 inch, recommended by Gemini, purchased from Taobao
  • MCU: XIAO ESP32C3

Estimated total prototype cost: about 10 USD. The OpenAI API is provided by my company (free to use), and iFlytek is free for 30 days.

5. What parts and systems will be made?

I will make the following parts myself:

  • a. 2D design for stand and laser cutting
  • b. 3D design for enclosure and 3D printing
  • c. PCB design and milling on a small CNC machine
  • d. Prototyping
  • e. Software development with Cursor

Software pipeline: Wake word → ASR → LLM → TTS (cloned voice) → display and web UI

6. What processes will be used?

This project integrates many Fab Academy skills:

Fab Academy Skill | How I use it in Lucky Bot
2D design | Stand, laser-cut acrylic base
3D design | Enclosure
Additive fabrication | 3D printing the enclosure
Subtractive fabrication | Laser cutting the transparent stand/base
Electronics design | PCB/prototype board for connecting microcontroller, mic, display
Electronics production | Soldering, wiring, PCB assembly
Embedded programming | XIAO ESP32C3 control, display states, communication
Input device | Microphone / wake word input
Output device | Display and speaker
Networking / communication | Device-to-server or device-to-web-dashboard communication
Interface/application programming | Web dashboard and display UI
System integration | Combining enclosure, electronics, AI software, and voice output
Project development | BOM, schedule, testing, documentation, final slide/video

7. What questions need to be answered?

The main open questions are:

  • Voice latency: Which TTS solution gives the best balance between voice quality and speed?
  • Local vs cloud: Should the final demo use local TTS/LLM, cloud API, or a hybrid solution?
  • Wake word sensitivity: Can "Hey Lucky" be detected reliably and smoothly?
  • UI synchronization: Can the display state match the real software state accurately?
  • Thermal and space design: Can all electronics fit inside the enclosure safely?
  • User experience: Are latency and LLM response quality good enough for the conversation to feel natural?
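
For the wake word question, one simple approach worth testing is tolerant matching on the ASR transcript, so that small transcription errors such as "hey lucki" still trigger. This is only a sketch of that idea: it assumes wake detection runs on text, and the `is_wake_phrase` helper and the 0.8 threshold are illustrative choices, not measured values.

```python
import re
from difflib import SequenceMatcher

WAKE_PHRASE = "hey lucky"


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and spaces."""
    return re.sub(r"[^a-z ]", "", text.lower()).strip()


def is_wake_phrase(transcript: str, threshold: float = 0.8) -> bool:
    """Fuzzy-compare the first two words of the transcript to the wake phrase."""
    words = normalize(transcript).split()
    candidate = " ".join(words[:2])
    similarity = SequenceMatcher(None, candidate, WAKE_PHRASE).ratio()
    return similarity >= threshold
```

Raising the threshold reduces false wake-ups but makes the bot harder to wake, so the value would need tuning against real recordings.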

8. How will it be evaluated?

Basic: All parts are completed on time, and all functional parts run correctly.

Excellent:

  • The 3D design looks nice
  • The voice interaction experience is good
  • Latency has been optimized to a good level, such as:
      • Wake-up: easy to trigger, generally within 1–2 s
      • Thinking: within 1–3 s
  • The display UI stays in sync with the actual system state
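
The latency targets above can be checked automatically during testing. A minimal sketch, assuming measured timings are collected in a dictionary of seconds per stage; the stage keys `wake` and `thinking` and the `check_latency` helper are hypothetical names, and the limits are the upper ends of the ranges listed above.

```python
# Upper bounds in seconds, taken from the targets listed above.
BUDGETS = {"wake": 2.0, "thinking": 3.0}


def check_latency(measured: dict) -> dict:
    """Return the stages that are missing or exceed their budget."""
    failures = {}
    for stage, limit in BUDGETS.items():
        value = measured.get(stage)
        if value is None or value > limit:
            failures[stage] = value
    return failures
```

An empty result means every measured stage met its target; anything else names the stage to optimize next.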