
Final Project and Past Things ✅

Mr. M - A local personal assistant powered by an LLM

Presentation

Presentation Slide:

Presentation Video:

Overview

Basic workflow of Mr. M:

  1. It receives the audio message (raw audio converted to a WAV file) from the mobile module built with the INMP441 and XIAO ESP32C3, over the same Wi-Fi network.
  2. It then converts the audio to text and tries to understand it. For tasks that have already been prompted, the calendar information (a JSX file) is generated automatically.
  3. The corresponding calendar message, or other prompted information, is sent to another mobile module via MQTT and shown on an ILI9341 display.

Overview images:

image

Featured images:

Microphone Input:

image

info

This input module might not look very good. Here is its specific structure: image

The PCB fits the 3D printed part well, but it is more reassuring to add a post that holds it in place, like the one on the left.

Display Output:

image

Operation UI image (I am really proud of this):

image

Implementation Details

Materials and Components:

Updated on 7.8th, I have updated my project, with both input and output.

| Component | Quantity |
| --- | --- |
| reComputer J4012 | x1 |
| reRouter | x1 |
| Monitor | x1 |
| XIAO ESP32C3 | x2 |
| INMP441 | x1 |
| 3 Pin Header SMD | x2 |
| Grove Female Header | x1 |
| ILI9341 | x1 |
| Grove RGB LED Ring | x1 |
| Power Supply Extension | x1 |
| Network Cable | x1 |
| Type-C Male Connector to Three Type-C Female Connector | x1 |
| M5*60 screw | x4 |
| M5 nut | x8 |

Here is the overview of the materials:

image

Files Sharing

Embedded Microcontroller Interfacing and Programming:

First is my software testing part:

(Input) Mobile Module - Raw Audio Data to WAV file

The function: once the board is powered up, it records audio, generates a WAV file, and serves it from a temporary server whose IP address is assigned by the network.

The reComputer can read the IP and download the file from the temporary server.
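To show what "raw audio data to WAV file" amounts to, here is a minimal host-side sketch in Python. It is not the firmware running on the XIAO ESP32C3; it only illustrates how raw PCM samples are wrapped in a WAV container. The 16 kHz, 16-bit, mono settings and the name `pcm_to_wav` are assumptions for illustration.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes.

    The defaults (16 kHz, mono, 16-bit) are assumed values, not the
    settings confirmed for the INMP441 module in this project.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)            # the header sizes are filled in on close
    return buffer.getvalue()
```

Calling `pcm_to_wav(raw_bytes)` returns bytes that can be written straight to a `.wav` file or served over HTTP.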

More details on the (input) Mobile Module - Raw Audio Data to Words

WAV file to text on reComputer

Convert WAV file to text using canary-1b and input the text into the LLM(Ollama API):

  • The WAV file is on the top right corner of the monitor.
  • The lower left corner is where the interactive page (docusaurus pages) runs
  • The upper left corner is where the audio file is downloaded
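The hand-off from transcript to LLM could look roughly like the sketch below. It assumes Ollama is listening on its default local port (11434) and uses its `/api/generate` endpoint with streaming turned off. The speech-to-text step with canary-1b is not shown, and the function names are hypothetical.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(transcript: str, model: str = "llama3") -> urllib.request.Request:
    """Build the HTTP request that sends the transcript to Ollama."""
    body = {"model": model, "prompt": transcript, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_llm(transcript: str, model: str = "llama3", timeout: float = 120.0) -> str:
    """Send the transcript and return the generated text (needs a running Ollama)."""
    request = build_ollama_request(transcript, model)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

Only the standard library is used here, so nothing extra has to be installed on the reComputer for this step.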

More details on the (input) Mobile Module - Raw Audio Data to Words

info

To achieve this part, I need to set up an LLM-powered bot on my reComputer:

More details on the Chat Bot(Local Server)

To make this bot look better, I used GPT itself to generate the code (js, css, tsx):

More details on the Operating UI Setting

Audio Text to Designed calendar standards(JSX)

I need to be able to translate my words (the audio text file) into a standard calendar JSX file that the Docusaurus platform can render:

info

It was tested on a Mac. macOS and the Linux system on the reComputer are both Unix-like, hence the same code can be shared.

More details on the Local LLM and Auto-generation tsx file

info

To use the LLM correctly, I need to learn and apply prompt engineering well:

More details on the Prompt Setup
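As an illustration of the idea, one way to keep the LLM output machine-readable is to ask for JSON only and validate the reply before using it. This is a sketch under assumptions, not the prompt actually used in this project: the prompt wording, the `date` and `title` keys, and the function names are all hypothetical.

```python
import json

# Hypothetical prompt; the real prompt setup is described on the linked page.
PROMPT_TEMPLATE = (
    "Extract the calendar event from the text below. "
    "Reply with JSON only, using the keys \"date\" (YYYY-MM-DD) and \"title\".\n"
    "Text: {transcript}"
)


def build_prompt(transcript: str) -> str:
    """Insert the transcribed text into the prompt template."""
    return PROMPT_TEMPLATE.format(transcript=transcript.strip())


def parse_event(reply: str) -> dict:
    """Pull the JSON object out of the model's reply and check its keys."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in reply")
    event = json.loads(reply[start:end + 1])
    missing = {"date", "title"} - event.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return {"date": str(event["date"]), "title": str(event["title"])}
```

Slicing from the first `{` to the last `}` tolerates a model that adds a friendly sentence around the JSON, which local models often do.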

(Output) Mobile Module - Receive and Display Calendar Information

After the reComputer generates the standard JSX calendar files from the processed text file, I transmit the message from a text file to a mobile module, which displays it and may control something as well.

image

More details on the Mobile Module - Receive and Display Calendar Information

info

This requires MQTT, so I installed mosquitto on the reComputer and let it act as the broker. image

More details on the week 15 - MQTT connect with XIAO boards and Docusaurus-website
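On the reComputer side, publishing one calendar entry to the broker might look like the following sketch. It assumes the third-party `paho-mqtt` package is installed. The topic name `mrm/calendar`, the JSON payload layout, and the 40-character title limit are hypothetical choices for illustration, and the broker address has to be supplied by the caller.

```python
import json


def calendar_message(date: str, title: str, max_title: int = 40) -> tuple[str, bytes]:
    """Return the (topic, payload) pair for one calendar entry."""
    topic = "mrm/calendar"  # hypothetical topic name
    # Shorten long titles so they fit on a small display (limit is an assumption).
    short_title = title if len(title) <= max_title else title[:max_title - 3] + "..."
    payload = json.dumps({"date": date, "title": short_title}).encode("utf-8")
    return topic, payload


def publish_entry(date: str, title: str, broker_host: str, port: int = 1883) -> None:
    """Publish one entry; needs paho-mqtt and a reachable broker such as mosquitto."""
    import paho.mqtt.publish as publish  # third-party: pip install paho-mqtt

    topic, payload = calendar_message(date, title)
    publish.single(topic, payload=payload, qos=1, hostname=broker_host, port=port)
```

The display module would subscribe to the same topic and decode the JSON before drawing it.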

I also use the Docusaurus page to open a WebSocket connection and implement the MQTT function on the 192.168.6.1 network (reRouter).

More details on the week 15 - MQTT connect with XIAO boards and Docusaurus-website

Interfacing

I designed the bot using my assignment website page and put all the wireless functions together. These can all run automatically.

info

The automation is great and convenient for users, but they can always control things manually as well, under the FAB Academy MQTT. For example:

image

image

Electronics Design and Production:

Updated on 7.8th, I have updated my project, with both input and output. Thus, the functions are upgraded, using my own PCB.

(Input) Mobile Module - Raw Audio Data to Words

I want this module, while receiving my voice, to light up some RGB LEDs to show that it is working, or do other things. I therefore added two 3 Pin Header SMD to it.

image

More details on the (input) Mobile Module - PCB design

(Output) Mobile Module - Receive and Display Calendar Information

I want this module, while receiving the calendar information, to be able to control something, like a relay. I therefore added a Grove port and an 8 PinHeader P2.54mm to my PCB:

image

More details on the (Output) Mobile Module - PCB design

Fabrication Processes:

Updated on 7.8th, I have updated my project, with both input and output. Thus, my design has changed here:

The case for input module:

Top cover(3D printing):

Middle part(3D printing):

image1

And the RGB light cover(2D design):

image1

Eventually:

image1

Extend the RGB lights out, along with the XIAO antenna:

image1

Finally, put the acrylic plate in to cover it up:

image1

The case for output module:

Display holder and Grove Relay holder(3D printing):

Eventually, it looks fine:

The system case design - 2D laser cutting

This part is to make the work look good, tidy and logical.

  1. The RGB display looks best with a filter layer, so I cut some acrylic sheets according to the blueprints:

image1

image1

  2. Since I used some big devices (reComputer and reRouter), I am going to use two large acrylic sheets to hold everything together:

Thus the laser cutting is necessary:

image1

image1

The system case design - 3D printing

This part is to make the work look good, tidy and logical.

  1. My devices need to be supported and fixed so that they do not move around. Thus I need to design some 3D parts to hold them:

image1

For other considerations, I removed the original case.

image1

System Integration and Packaging:

The whole system basic idea:

image1

  • Main computing device: reComputer, providing the MQTT broker and AI computing, and running the LLM and the browser website.
  • Networking device (I want to ensure everything is local): reRouter, providing Wi-Fi and wired connections.
  • One mobile module with an INMP441 takes the voice input.
  • One mobile module with an ILI9341 displays the processed information.

Carrying the whole system:

The package is big, even though I removed the original cases of the computing devices. It is still big, not to mention the power adapter. But it can be carried:

image1

Updated with monitor and power adapter

Continuing with the monitor: to keep the system portable, I had to consider a portable monitor. I bought one, and its size as given by the manufacturer is:

image1

There are four screw holes in the back, and then I measured the distance from the screw holes to the edge and the distance between the holes:

image1

image1

Then I need to calculate the length of the display board, which must be neither so long that it affects the appearance nor too short to connect:

image1

Later I designed the board attached to the monitor in Onshape:

image1

Laser cutting and done, looking fine on both sides:

image1

image1

And the power adapter needs to be tied down as well:

image1

And considering the monitor, I need to design a gap in each plate to hold it:

The bottom:

image1

The top:

image1

Cut the board and insert the modules into it:

image1

Screw it in. Almost done

image1

Adding the mobile module:

image1

Move them outside:

image1

Settling Phase - ✅ on 5.17th

What I am going to achieve from 5.17th

A personal assistant paired with a portable recording module, saving all my data and helping in my daily life:

  1. For now, converting my chosen data to a standard format (JSX) and then presenting my work as a calendar.
  2. Interoperable with a local LLM.
  3. Conversations can be saved.

What I should design

  1. A portable voice recognition module, with wireless charging and eye-catching feedback.
  2. An LLM-powered AI computing device, running a local LLM and serving the websites.
  3. A design that holds everything together.

Here is the basic idea:

And the basic connection:

For the final look, the design should be a 420mm x 210mm x 60mm box.

But this is too big to look good, hence putting everything in a case would be really good:

Thus, my final design of my idea is:

For the main parts

For the portable module:

The Final

This is the current status I'm at:

And the progress to achieve them is:

Weekly assignments

Neil's classes that I missed: week5, week7, week8, week9, week10, week16, week17, week18, week19, week20.

Assignments that I am missing: week3, week5, week7, week8, week10, week12, week13, week16, week17, week18, week19, week20.

info

Written on 2024.5.13th, some memo.

I have updated my week 2 assignment today because I realized handwriting never looks good, and I need professional tools to help me sort it out.

I found the size parameters on the official website of each product listed below, which can help me build the 3D model in the future.

image0

info

Written on 2024.5.10th, some memo.

I have updated my pages today, wrote down some thoughts, and finished my personal details. It is a fun journey indeed, especially since I have to deal with a lot of stress from my new responsibilities at work. I have a new me.

info

Written on 2024.4.28th, a little update about my work.

I was thrilled by receiving my reComputer J4012 several days ago:

And I have set it up right on top of my working position. This will be the prototype of my final project: a local, highly AI-enabled computing device running a smart personal assistant, ready to help me everywhere I go (the calendar is just one simple function of it).

But I ran into big trouble with the device: I couldn't run the Docusaurus page on the Jetson. Here is some of my progress:

  1. One requirement for running Docusaurus pages is Node.js, and its minimum version is 18, but the version available by default on the Jetson is only 10.
  2. I learnt that I can use the apt-get tool to install nodejs 10, but that's it. So I had to find another way from the official Node.js website.
  3. I tried to download the binaries for the latest version, but I accidentally downloaded the wrong one: the reComputer is an ARM-architecture Linux computer, and I had downloaded the x86-architecture Linux version.
  4. After facing some problems with the environment settings (the echo and ln commands, etc., which cost me so much time), I finally found the right one.
  5. It turns out that I only need to go to the extracted binaries folder in my terminal and execute sudo cp -R * /usr/local/ to do the work:
    • cp is for copy, and -R is the option for recursively copying directories and their contents.
    • * is the wildcard character that represents all files and folders in the current directory.
    • /usr/local/ is the install location I was looking for, since /usr/local/bin is normally on the PATH. Copying to this location means that Node.js will be installed system-wide and accessible to all users.
  6. Finally I can run npm init docusaurus to install the initial Docusaurus pages and replace the files with my fablab folder. Then I run npm run start and finally have my website, and there goes my calendar (so not easy...):
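The architecture mix-up in step 3 can be avoided by deriving the download name from the machine itself. Below is a small sketch that maps the value reported by `platform.machine()` to the suffix used in the official Node.js Linux tarball names. The helper name `node_tarball_url` is hypothetical, and the version number passed in is only an example.

```python
import platform
from typing import Optional

# Maps platform.machine() values to the suffix used in Node.js Linux tarball names.
NODE_ARCH = {
    "x86_64": "x64",
    "amd64": "x64",
    "aarch64": "arm64",  # what a Jetson typically reports
    "arm64": "arm64",
    "armv7l": "armv7l",
}


def node_tarball_url(version: str, machine: Optional[str] = None) -> str:
    """Return the nodejs.org download URL matching this machine's architecture."""
    machine = (machine or platform.machine()).lower()
    if machine not in NODE_ARCH:
        raise ValueError(f"unsupported architecture: {machine}")
    name = f"node-v{version}-linux-{NODE_ARCH[machine]}.tar.xz"
    return f"https://nodejs.org/dist/v{version}/{name}"
```

For example, `node_tarball_url("18.20.2")` run on the device itself picks the matching build without having to read the download page by eye.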

But I did it! So interesting! The next steps (running Ollama and auto-generating the jsx files) will be easy.

Install Ollama and run Llama3:

The curl tool is required to pull the latest Ollama, and here is the way to install it on the Jetson (ARM architecture):

wget https://curl.se/download/curl-7.51.0.tar.bz2
tar xvf curl-7.51.0.tar.bz2
cd curl-7.51.0
sudo ./configure
sudo make
sudo make install

And then I can use the official command curl -fsSL https://ollama.com/install.sh | sh to install Ollama. I then use ollama run llama3 to pull the model onto my device.

info

Written on 2024.4.27th, a little update about my work.

Continuing with the AI recitation... I still have half of the video to go through...

Combining techniques to do something personal:

Retrieval Augmented Generation (RAG): provide domain knowledge documents, where each document is split into chunks (different blocks) and the content is semantically vectorized (turned into number vectors). It requires curation, but it does not require a large amount of documents.

Fine-tuning: requires a large amount of documents, and the data size is big.
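To make the chunk-and-vectorize idea behind RAG concrete, here is a toy sketch. It uses plain word counts and cosine similarity as a stand-in for real semantic embeddings, so it only illustrates the retrieval step and is not how a production RAG system vectorizes text. All names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 40) -> list[str]:
    """Split a document into blocks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def vectorize(text: str) -> Counter:
    """Toy vector: word counts (a real system would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the chunks most similar to the question."""
    q = vectorize(question)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:top_k]
```

The retrieved chunks would then be pasted into the prompt so that the LLM answers from the provided documents.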

AI tools:

  • Whisper Model: speech-to-text, can transcribe audio files in 100 languages
  • WhisperWriter is a small speech-to-text app that uses OpenAI's Whisper model to auto-transcribe recordings from a user's microphone.
  • Local LLM Apps: there are different ones. I explored them a little bit and decided to use a local LLM (Ollama) and the Docusaurus framework to build my website (UI).
info

Written on 2024.4.26th, a little update about my work.

While I was busy developing on my reComputer (which arrived a couple of days ago), sad about the progress and worried about my other assignments... I just found the recitation about AI (LLM) tools XD Oh My God - that is what I wanted! So much! I have the idea, I have the knowledge about LLMs, but at the same time I have the job.... - which means I don't have much time to start from the beginning, and this is the reason I couldn't sleep well these days...

I will watch this video and finish it before I go to bed today... And... Here are some notes:

LLM-powered CAD:

  • Method 1: Using DSL(Domain Specific Language)
    • prompt: give a function box(x, y, w, h, d) and ask GPT to generate
  • Method 2: OpenJSCAD: there are multiple ones, not sure which one (this is a GitHub repo)
    • The idea is using JS to describe the CAD design, and the JS code can be generated by LLM
  • Method 3: Python - PyVista
    • The idea is using Python to describe the CAD design, and the Python code can be generated by LLM
  • Method 4: text-to-cad.zoo.dev
    • Directly using text to generate CAD files

LLM-powered Laser Cutting:

  • Method 1: not sure right now

LLM-powered CNC:

Analyze the models using ChatGPT (which I might not be able to use), and analyze the performance with the models.

CAD - Strengths and Limitations

Strengths:

  • Respects high-level spatial constraints, such as a design element's absolute size or its position relative to another element of the design.
  • Iteration Support
  • Parametric design: ability to create parameters, bounds and constraints for text-based designs and already existing ones, and to interpolate and change designs
  • Modularity and Hierarchy: works better when submodules are formed first.

Limitations:

  • Lack of spatial awareness created difficulties with constraint handling
  • Problem with Scalability

LLM-powered 3d Model:

LLM-powered Electronics Design:

  • Method 1: chat with any PDF: ask the datasheet about the components, microcontrollers, other modules. Which pin can connect, what code might be, etc.
  • Method 2: BoardDesignerGPT: need gpt API

Example by student OpenAir

info

Written on 2024.4.17th, a little update about my work.

I achieved the function of generating local content from the LLM and converting it into a formal, standard jsx file: here. And I have my own calendar UI....

info

Written on 2024.4.15th, a little update about my work.

I think I need to make a UI that presents both the calendar and the communication with the LLM, to know whether it is workable. I have searched through multiple open-source calendar software packages and they all seem to meet my requirements, with customizable parameters. But they all seem complicated - they do have many functions and support various extensions, but it takes too much time to understand and learn them one by one. I don’t have enough time, so I plan to study them later.

  • But I still need an automated calendar UI to help me convert my recorded content into what I need to do and display it so that I can see it at any time, and it has to be deployed locally.
  • I suddenly realized that for locally deployed web pages, maybe "Docusaurus" has this scalability, so I started looking. Then I did find it - there is a date picker component for React called "DayPicker" that works with Docusaurus.

I then have the idea of doing it:

  1. Ensure the DayPicker is functional on my FabLab pages
  2. Format the output content from the LLM as a JS file that includes the important information, stored as a text file
    • I also need a continuous monitoring capability that watches the folder containing the converted files. "chokidar" might be a good idea.
  3. Then use "Handlebars" and node.js to generate React components with basically the same structure - the only differences will be the information from my recorded contents
    • These components will be stored in the path "/src/components/calendar", named one by one
  4. Finally, create a TSX page that imports all the components one by one - then I will have my work contents and the timeline
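The generation step can be sketched as below. This sketch swaps in Python's built-in `string.Template` for "Handlebars" and node.js purely for illustration, and it leaves out the folder watching that "chokidar" would handle. The component layout, the `Entry1.tsx` naming scheme, and the `date`/`title` fields are assumptions.

```python
import json
from pathlib import Path
from string import Template

# Every generated component shares this structure; only the data differs.
COMPONENT = Template(
    "import React from 'react';\n\n"
    "export default function $name() {\n"
    "  return (\n"
    "    <div className=\"calendar-entry\">\n"
    "      <h3>{$date}</h3>\n"
    "      <p>{$title}</p>\n"
    "    </div>\n"
    "  );\n"
    "}\n"
)


def render_component(index: int, date: str, title: str) -> tuple[str, str]:
    """Return (filename, source) for one calendar entry component."""
    name = f"Entry{index}"  # hypothetical naming scheme
    # json.dumps turns the text into a quoted string literal, so characters
    # such as < or { cannot break the generated JSX.
    source = COMPONENT.substitute(name=name, date=json.dumps(date), title=json.dumps(title))
    return f"{name}.tsx", source


def write_components(events: list[dict], out_dir: Path) -> list[Path]:
    """Write one component file per event and return the created paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i, event in enumerate(events, start=1):
        filename, source = render_component(i, event["date"], event["title"])
        path = out_dir / filename
        path.write_text(source, encoding="utf-8")
        written.append(path)
    return written
```

Pointing `out_dir` at the site's `src/components/calendar` folder would place the files where the final TSX page can import them.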

This will be the function of "ensuring that the standard/formal content output by large language models is correctly converted into task tracking, presented as a calendar UI".


By the way, the Google Calendar sounds also good, and might be easier:

  1. I can first deploy the Google Workspace in my computer with python.
  2. Then use python as well to create the JS file with standards to upload the event/my working to the cloud.
    • The JS file can be indeed generated automatically
  3. For the UI itself - I go to the Google Calendar and all will be fine.

But it requires cloud services, and the UI is not very good. I like local services. The Docusaurus method might not look good at first, but I can customize it, which means I can build the UI from my own ideas. I choose the DayPicker. :D


info

Written on 2024.4.6th, my work is finally on track and it is time to do this lab right.

As a content and web manager for my company, my role involves extensive coordination with internal team members, as well as overseeing an additional project that requires engaging with numerous external stakeholders. Amid these responsibilities, I am also dedicating time to learning and developing within the fablab space. Given the multifaceted nature of my work, effective time and project management are not just beneficial—they are essential.

Recognizing the need for assistance in managing this complex array of duties, I am exploring the potential of integrating a personal AI assistant into my workflow. The recent advancements in large language models (LLMs) have caught my attention, presenting a promising solution that could offer the precision and efficiency that physical assistance may not always be able to provide. I am interested in leveraging the capabilities of an AI to enhance my productivity, especially in the context of the fablab project, where innovation and meticulous coordination are key.

Conceptualization Phase - ✅ on 4.6th

My Final Project Idea - Task Management Calendar

I plan for an AI assistant with a physical calendar interface, powered by an NVIDIA Jetson unit running a large language model, with an external display and a portable, voice-activated module.

Define Objectives

  • It is a task management AI assistant with scheduling capability.
  • In the future, it might have reminders, remote support, and wireless functions as well.

Current User - Matthew

  • Manage different types of products, and ensure the descriptions and other content meet the standards.
  • Output the priority of the tasks, and ensure they meet the timeline, by day/week/month.

Requirements Gathering

  • NVIDIA Jetson Orin NX module-powered devices, LLM integrated.
  • Displays supporting the devices above, with the suitable size.
  • Microphone and voice stored module, voice recognition.

Design Phase

Hardware Design

  • NVIDIA Jetson device - reComputer J4012, with mechanical dimensions: 130mm x 120mm x 58.5mm.

  • The display is 7-inch, with an HDMI connector.

  • The voice recognition module is DIY, with a Seeed Studio XIAO board, recorder chip, button, SD card and USB connector.

  • The assembled and DIY-cut parts for integrating everything.

Software Design

  1. The LLM model supported on the reComputer.
  2. UI design on the display.
  3. The code for pressing a button and recording voice, with the XIAO.
  4. The program that automatically copies the voice files onto the Jetson platform.
  5. The program that converts voice into words (LLM), then converts them to the standard command and stores the csv files.
  6. The program that reads the csv files and outputs them on the display.

The Development Phase

Since I was pretty busy before 2024.4.6th, I should catch up from week11, in three portions:

  • The main courses
  • The courses to catch up with
  • Development of my own final project

The previous

June 5th will be the final and I still have 8 weeks to catch up. Here are some things I am currently finishing and the assignments that I need to catch up with:

The plan from 2024.4.6th

As a project manager at my company, I should have a plan for doing the work and catching up: