Skip to content

Final Project

Requirements

Project Overview

Project Name: Capybara

Objective: Talking Capybara

Target User: for people who need an extra companion

Functional Requirements (mvp)

  1. Core Function: talk
  2. Input: mic
  3. Processing: speech to text - large language model - text to speech pipeline on external server
  4. Output: sound

Functional Requirements (v2)

  1. Core Function: see and talk
  2. Input: mic, camera
  3. Processing: speech to text - large multimodal language model - text to speech pipeline on external server
  4. Output: sound

Functional Requirements (v3)

  1. Core Function: see, talk and move
  2. Input: mic, camera
  3. Processing: speech to text - large multimodal language model with tool calls - text to speech pipeline on external server
  4. Output: sound, motor movements

Sketch

I used chatgpt to generate some sketches

v1 Sketch 1 prompt: generate the shape of a capybara that is easy to design in cad and add color

v2 Sketch 2 prompt: generate an image of how it would look if it were 3d printed and had components like a mic, speaker, battery, camera for a talking llm toy

v3 Sketch 3 prompt: generate an image with motors

Reference implementations

lanturn

pipecat-esp32

speechmatics

quadruped robot

fish

more fish