Week 10 Output Devices

Group assignment

Week 10 Chaihuo group assignment.

Individual assignment

In my final project, the device captures voice and sends it to the LLM. The LLM replies, and the UI on the display changes accordingly. There are therefore three outputs: the web UI, the on-device display, and text-to-speech (TTS) played through the laptop.

Web UI

The basic functions are listed below. I sent the same list to Cursor as the prompt:

  1. A switch that starts the Mac server in one step (Cursor questioned whether this is necessary)
  2. A status bar that shows the Wi-Fi connection status of the ESP32 and the laptop
  3. A box that displays the text of what was said, and another box that displays the response
  4. A status bar with two states: speaking (shown while a person is speaking) and thinking (shown while the AI bot is thinking)
  5. Overall page style: Apple style
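
The requirements above imply a small piece of state that the server passes to the page. A minimal sketch in Python, assuming the server sends that state as JSON; the class name and every field name here are illustrative and do not come from the project's actual code:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class UiStatus:
    """Hypothetical status payload for the web UI (all names are assumptions)."""
    server_running: bool = False  # item 1: Mac server switch
    esp32_wifi: bool = False      # item 2: ESP32 Wi-Fi connection
    laptop_wifi: bool = False     # item 2: laptop Wi-Fi connection
    heard_text: str = ""          # item 3: box with the text of what was said
    reply_text: str = ""          # item 3: box with the response
    activity: str = "idle"        # item 4: "idle", "speaking" or "thinking"

    def to_json(self) -> str:
        # ensure_ascii=False keeps Chinese text readable in the payload
        return json.dumps(asdict(self), ensure_ascii=False)


status = UiStatus(server_running=True, esp32_wifi=True, laptop_wifi=True,
                  heard_text="hello", activity="thinking")
print(status.to_json())
```

Keeping all five items in one object means the page can redraw every box and status bar from a single message.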

Gemini gave me the diagram:

The first version

Restart the Mac server:

lsof -tiTCP:8765 -sTCP:LISTEN | xargs kill
"/Users/jerryrong/Fablab/Final project/Final- coding/server/start.sh"

Flash new firmware:

cd "/Users/jerryrong/Fablab/Final project/Final- coding"
/Users/jerryrong/Library/Python/3.9/bin/platformio run -t upload

Pages to check in the browser:

  • Launcher page: http://127.0.0.1:8764/
  • Web UI: http://127.0.0.1:8765/
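
Before opening the browser, it can help to confirm that both local ports are accepting connections. A minimal sketch in Python, using the two port numbers listed above; the function name is my own and the script is not part of the project:

```python
import socket


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections and timeouts both mean nothing usable is there
        return False


if __name__ == "__main__":
    for port in (8764, 8765):
        state = "up" if is_listening("127.0.0.1", port) else "down"
        print(f"port {port}: {state}")
```

If a port reports "down", the server on that port has probably not started yet.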

Optimization

  • Add a button that starts the Mac server
  • Change the icon of the robot

Display UI

My idea:

  • Use my own images
  • Display the text on the screen in a Chinese cursive script style
  • Use three images to cover all 9 states
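
Covering 9 states with three images comes down to a lookup table. A sketch in Python for readability (the firmware itself is built with PlatformIO and would express this in C or C++). The nine state names and their grouping are placeholders, since the states are not listed here:

```python
from enum import Enum


class Image(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


# Placeholder state names and grouping: assumptions for illustration only.
STATE_TO_IMAGE = {
    "booting": Image.FIRST,
    "wifi_connecting": Image.FIRST,
    "wifi_lost": Image.FIRST,
    "standby": Image.FIRST,
    "woken": Image.SECOND,
    "listening": Image.SECOND,
    "thinking": Image.THIRD,
    "replying": Image.THIRD,
    "error": Image.THIRD,
}


def image_for(state: str) -> Image:
    """Look up which of the three images a state should show."""
    return STATE_TO_IMAGE[state]


print(image_for("woken").name)
```

A single table keeps the state-to-image decision in one place, so changing which image a state uses is a one-line edit.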

The current setup, with some images generated by code, is shown below:

I sent the images to Cursor and asked it to map each status to an image, so that different statuses show different images:

I got the initial firmware and flashed it to the ESP32:

cd "/Users/jerryrong/Fablab/Final project/Final- coding"
/Users/jerryrong/Library/Python/3.9/bin/platformio run -t upload

The UI has been updated. The test is shown in the video (file "UI"):

I made some further optimizations to the UI so that the statuses better match daily use. For example, the display shows a Wi-Fi disconnected status and keeps that screen until the connection is restored.
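
The Wi-Fi rule can be read as a priority check: while the link is down, the disconnected screen wins over any other status. A logic sketch in Python, not the actual firmware code; the function name and the state strings are assumptions:

```python
def screen_to_show(wifi_connected: bool, requested: str) -> str:
    """Pick the screen to draw; the Wi-Fi-lost screen overrides everything else."""
    if not wifi_connected:
        # Hold the disconnected screen for as long as the link is down
        return "wifi_lost"
    return requested


print(screen_to_show(False, "thinking"))  # wifi_lost
print(screen_to_show(True, "thinking"))   # thinking
```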

Showing text on the display turned out to be very hard on the XIAO ESP32-C3, because it has no PSRAM, unlike the XIAO ESP32-S3. I had to give up this function.

A video showing the UI and Web:

Voice output

I am using Cursor for the software design. My idea (the software logic) and the output are as follows:

  1. After power-on, once the ESP32 connects successfully, play the audio file /Users/jerryrong/Fablab/Final project/音频/开机.m4a. The display keeps showing the first image.
  2. Each time the device is woken up, play a second audio file, /Users/jerryrong/Fablab/Final project/音频/干嘛.m4a, so that users know the bot has received their input. Show the second image during this playback.
  3. In standby mode, when there is no conversation between the user and the AI bot, the bot plays a third audio file, /Users/jerryrong/Fablab/Final project/音频/笑声.m4a, at random times with no fixed interval. Note: this audio plays only in standby and must not play during an active conversation. The display stays unchanged and keeps showing the first image.
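
The three rules above can be sketched as a cue lookup plus a random delay. This is a minimal Python sketch, not the code Cursor produced. The file names come from the list above, while the event names, function names, and the 60 to 600 second bounds are assumptions:

```python
import random

# File names are from the prompt; the event keys are hypothetical.
CUES = {
    "connected": "开机.m4a",      # rule 1: after the ESP32 connects
    "woken": "干嘛.m4a",          # rule 2: each time the device is woken up
    "standby_laugh": "笑声.m4a",  # rule 3: random laugh in standby
}


def cue_for_event(event: str, in_conversation: bool):
    """Return the audio file to play for an event, or None if nothing should play."""
    if event == "standby_laugh" and in_conversation:
        return None  # the laugh must never interrupt an active conversation
    return CUES.get(event)


def next_laugh_delay(rng: random.Random, low_s: float = 60.0, high_s: float = 600.0) -> float:
    """Pick a random wait in seconds so the laugh has no fixed interval."""
    return rng.uniform(low_s, high_s)


delay = next_laugh_delay(random.Random())
print(f"next standby laugh in {delay:.0f} s")
```

Drawing a fresh delay after every laugh, and discarding the pending one whenever a conversation starts, keeps the sound unpredictable without letting it overlap speech.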

Finally, whether to compress the audio files depends on the hardware specifications.

I sent the same prompt to Cursor, but in Chinese:

Got the initial firmware.

Updated Mac server:

"/Users/jerryrong/Fablab/Final project/Final- coding/server/launcher.sh"

New firmware:

cd "/Users/jerryrong/Fablab/Final project/Final- coding"
/Users/jerryrong/Library/Python/3.9/bin/platformio run -t upload

With the updated server and firmware, it runs well, as shown in the video (+voice interaction/file):