Final Project and Past Things ✅
Mr. M - The local personal assistant powered by an LLM
Presentation
Presentation Slide:
Presentation Video:
Overview
How Mr. M basically works:
- It receives an audio message (raw audio converted to a WAV file) from the mobile module built on the INMP441 microphone and XIAO ESP32C3, over the same Wi-Fi network.
- It then converts the audio to text and tries to understand it: for tasks that are already covered by the prompt, the calendar information (a JSX file) is generated automatically.
- The corresponding calendar message, or other information that is also covered by the prompt, is sent via MQTT to another mobile module and shown on an ILI9341 display.
Overview images:
Featured images:
Microphone Input:
This input module might not look very good; here is its specific structure:
The PCB fits the 3D-printed part well, but it is more reassuring to have a post holding it in place, as on the left.
Display Output:
Operation UI image (I am really, really proud of this):
Implementation Details
Materials and Components:
Updated on 7.8th: I have updated my project with both input and output.
Here is the overview of the materials:
Files Sharing
Chat Bot Related
- ChatBot UI Interface
- Mr.M's Prompt modelfile
- Code of Ollama Usage and integrating Docusaurus-typed Page
- Code of Text file(.txt) to Calendar Information(.jsx)
Input Related
- (Input)Recording Module, Iron Man Reactor Core - Top
- (Input)Recording Module, Iron Man Reactor Core - White transparent acrylic
- (Input)Recording Module, Iron Man Reactor Core - Middle
- Code of Raw Audio Data to Words
- PCB Design of Recording Module KiCAD Design
- Week 12 - Input KiCAD Design
Output Related
- (Output)ILI9341 Display Support
- Receive and Display Calendar Information
- ILI9341 Display Calendar Information KiCAD Design
Integration Related
- reComputer Support
- reRouter Support
- Monitor Attached Board
- Top Cover - Iron Man Logo reference by FreeDXF
- Top Cover
- Bottom Base
Embedded Microcontroller Interfacing and Programming:
First is my software testing part:
(Input) Mobile Module - Raw Audio Data to WAV file
Function: once the board is powered up, it records audio, generates a WAV file, and transmits it to a server whose IP address is assigned by the network.
The reComputer can read the IP and download the file from the temporary server.
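As a rough illustration of the raw-audio-to-WAV step, the sketch below wraps raw PCM samples in a WAV container using Python's standard `wave` module. The 16 kHz sample rate, the 16-bit mono format, and the function name `pcm_to_wav` are assumptions for illustration; the firmware on the XIAO ESP32C3 may use different settings.

```python
import io
import math
import struct
import wave


def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = 16000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)   # bytes per sample (2 = 16-bit)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


# Stand-in for microphone data: 0.1 s of a 440 Hz tone (assumed test signal).
samples = [int(12000 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(1600)]
raw_pcm = struct.pack("<%dh" % len(samples), *samples)
wav_bytes = pcm_to_wav(raw_pcm)
```

The resulting bytes can be written to disk or served over HTTP as they are, since the header already carries the sample rate and length.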
More details on the (input) Mobile Module - Raw Audio Data to Words
WAV file to text on reComputer
Convert the WAV file to text using canary-1b and feed the text into the LLM (Ollama API):
- The WAV file is on the top right corner of the monitor.
- The lower left corner is where the interactive page (docusaurus pages) runs
- The upper left corner is where the audio file is downloaded
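For readers who want to try the hand-off from transcript to LLM, here is a minimal sketch of a call to Ollama's local REST API. The endpoint and the `model`, `prompt`, and `stream` fields follow Ollama's documented `/api/generate` interface. The model name `llama3`, the function names, and the example sentence are assumptions, and the project's own prompt modelfile may register a different model name.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(transcript: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": transcript, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_llm(transcript: str, model: str = "llama3") -> str:
    """Send the transcript to a running Ollama server and return its reply."""
    with urllib.request.urlopen(build_request(transcript, model)) as response:
        return json.loads(response.read())["response"]


# Hypothetical transcript; calling ask_llm() needs Ollama running locally.
request = build_request("Meeting with the design team tomorrow at 10 am")
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server.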
More details on the (input) Mobile Module - Raw Audio Data to Words
To achieve this part, I need to set up an LLM-powered bot on my reComputer:
More details on the Chat Bot(Local Server)
To make this bot look better, I used GPT itself to generate the code (js, css, tsx):
More details on the Operating UI Setting
Audio Text to Designed calendar standards(JSX)
I need to translate my words (the transcribed audio text file) into a standard calendar JSX file that the Docusaurus platform can render:
It was tested on a Mac. The reComputer runs Linux and macOS is also Unix-like, so the same code is shared between them.
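The conversion described above can be sketched roughly as follows: one event is rendered as the source text of a small React component. The event fields (`date`, `title`), the component name, and the markup are assumptions for illustration, not the project's actual calendar standard.

```python
import json


def event_to_jsx(component_name: str, event: dict) -> str:
    """Render one calendar event as the source of a small React component."""
    # json.dumps yields valid JS string literals and escapes quotes safely.
    date = json.dumps(event["date"])
    title = json.dumps(event["title"])
    return (
        "import React from 'react';\n\n"
        f"export const event = {{ date: {date}, title: {title} }};\n\n"
        f"export default function {component_name}() {{\n"
        "  return (\n"
        "    <div className=\"calendar-event\">\n"
        "      <strong>{event.date}</strong>: {event.title}\n"
        "    </div>\n"
        "  );\n"
        "}\n"
    )


# Hypothetical event used only to show the output shape.
jsx_source = event_to_jsx("Event001", {"date": "2024-05-17", "title": "Settle final design"})
```

Writing `jsx_source` to a `.jsx` file under the site's component folder would let Docusaurus pick it up on the next rebuild.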
More details on the Local LLM and Auto-generation tsx file
To use the LLM correctly, I need to learn and apply prompt engineering well:
More details on the Prompt Setup
(Output) Mobile Module - Receive and Display Calendar Information
After the reComputer generates the standard JSX calendar files from the processed text file, I transmit the message from a text file to a mobile module, which displays it and may control something as well.
More details on the Mobile Module - Receive and Display Calendar Information
This requires MQTT, so I installed mosquitto on the reComputer and let it act as the broker.
More details on the week 15 - MQTT connect with XIAO boards and Docusaurus-website
I also use a Docusaurus page to open a WebSocket connection and implement the MQTT function on the 192.168.6.1 network (reRouter).
More details on the week 15 - MQTT connect with XIAO boards and Docusaurus-website
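To make the MQTT step more concrete, here is a small sketch of preparing a calendar message and publishing it to the mosquitto broker. The broker address, the topic name, the 26-character limit, and the function names are all hypothetical; `publish.single` comes from the third-party `paho-mqtt` package and is imported lazily so the rest of the sketch runs without it.

```python
import json

BROKER_HOST = "192.168.6.2"   # hypothetical reComputer address on the reRouter network
TOPIC = "mrm/calendar"        # hypothetical topic name


def build_message(date: str, title: str, max_chars: int = 26) -> str:
    """Serialize one calendar entry, trimming the title for a small display."""
    if len(title) > max_chars:
        title = title[: max_chars - 3] + "..."
    return json.dumps({"date": date, "title": title})


def publish_message(message: str) -> None:
    """Publish to the broker (requires the paho-mqtt package and a reachable broker)."""
    import paho.mqtt.publish as publish  # lazy import: only needed when publishing
    publish.single(TOPIC, payload=message, hostname=BROKER_HOST, port=1883)


message = build_message("2024-05-17", "Review the final project presentation slides")
```

Trimming on the sender side keeps the display firmware simple, at the cost of hiding long titles; the limit would need tuning to the font actually used on the ILI9341.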
Interfacing
I designed the bot on my assignment website page and brought all the wireless functions together. All of this can run automatically.
The automation is convenient for users, but they can always take manual control through the FAB Academy MQTT. For example:
Electronics Design and Production:
Updated on 7.8th: I have updated my project with both input and output. Thus, the functionality is upgraded, now using my own PCB.
(Input) Mobile Module - Raw Audio Data to Words
While this module is receiving my voice, I want it to light some RGB LEDs to show that it is working, or to do other things. I therefore added two 3-pin SMD headers to it.
More details on the (input) Mobile Module - PCB design
(Output) Mobile Module - Receive and Display Calendar Information
While this module is receiving the calendar information, I want it to be able to control something, such as a relay. I therefore added a Grove port and an 8-pin header (2.54 mm pitch) to my PCB:
More details on the (Output) Mobile Module - PCB design
Fabrication Processes:
Updated on 7.8th: I have updated my project with both input and output. Thus, my design has changed here:
The case for input module:
Top cover(3D printing):
Middle part(3D printing):
And the RGB light cover(2D design):
Eventually:
Extend the RGB lights out, along with the XIAO antenna:
Finally, put the acrylic plate in to cover it up:
The case for output module:
Display holder and Grove Relay holder(3D printing):
Eventually, it looks fine:
The system case design - 2D laser cutting
This part is to make the work look good, tidy and logical.
- The RGB display looks best with a filter layer, so I cut some acrylic sheets according to the blueprints:
- Since I used some large devices (reComputer and reRouter), I'm going to have two large acrylic sheets to hold everything together:
Thus the laser cutting is necessary:
The system case design - 3D printing
This part is to make the work look good, tidy and logical.
- My devices need to be supported and fixed so that they do not move around. Thus I need to design some 3D-printed parts to hold them:
For other considerations, I removed the original cases.
System Integration and Packaging:
The basic idea of the whole system:
- Main computing device: reComputer, providing the MQTT broker, AI computing, the local LLM, and the browser-based website.
- Networking device (I want to ensure everything is local): reRouter, providing Wi-Fi and wired connections.
- One mobile module with an INMP441 captures the voice input.
- One mobile module with an ILI9341 displays the processed information.
Carrying the whole system:
The packaging is big, even though I removed the original cases of the computing devices, and that is without mentioning the power adapter. But it can still be carried:
Updated with monitor and power adapter
As for the monitor, to meet the portability goal I have to consider a portable monitor. I bought one, and its dimensions are given by the manufacturer:
There are four screw holes in the back, and then I measured the distance from the screw holes to the edge and the distance between the holes:
Then I need to calculate the length of the display board, which must be neither so long that it affects the appearance nor too short to connect:
Later I designed the attachment board for the monitor in Onshape:
Laser cut and done; it looks fine on both sides:
The power adapter also needs to be held in place:
And to hold the monitor, I need to design a slot on each plate:
The bottom:
The top:
Cut the board and insert the modules into it:
Screw it in. Almost done:
Adding the mobile module:
Move them outside:
Settling Phase - ✅ on 5.17th
What I am going to achieve from 5.17th
A personal assistant paired with a portable recording module, saving all my data and helping with my daily life:
- For now, converting my selected data to a standard format (JSX), then presenting my work as a calendar.
- Interoperable with the local LLM.
- Conversations can be saved.
What I should design
- A portable voice recognition module, with wireless charging and eye-catching feedback.
- An LLM-powered AI computing device, running the local LLM and serving the websites.
- A design that holds everything together.
Here is the basic idea:
And the basic connection:
For the final look, the design should be a 420 mm x 210 mm x 60 mm box.
But this is too big to look good, so putting everything in a case would be much better:
Thus, my final design of my idea is:
For the main parts
For the carriable module:
Specific Portions and Related Plan
The Final
This is my current status:
And the progress toward achieving them is:
Weekly assignments
Neil's classes that I missed: week5, week7, week8, week9, week10, week16, week17, week18, week19, week20.
Assignments that I am missing: week3, week5, week7, week8, week10, week12, week13, week16, week17, week18, week19, week20.
Written on 2024.5.13th, some memos.
I have updated my week 2 assignment today because I realized handwriting never looks good, and I need professional tools to help me sort it out.
I found the size parameters on the official website of each product listed below, which can help me build the 3D model in the future.
- 2.4 inch TFT Display
- 13.1 inch HDMI Display
- Wireless Charge Module(both receiver and transmitter)
- reComputer J4012
- XIAO ESP32C3
- WS2813 RGB LED
- reRouter CM4
Written on 2024.5.10th, some memos.
I updated my pages today, wrote down some thoughts, and finished my personal details. It is a fun journey indeed, especially since I have to deal with a lot of stress from my new responsibilities at work. I have a new me.
Written on 2024.4.28th, a little update about my work.
I was thrilled to receive my reComputer J4012 several days ago:
And I have set it up right at the top of my workspace. This will be the prototype of my final project: a local, highly AI-enabled computing device running in a smart personal assistant mode, ready to help me wherever I go (the calendar is just one simple function of it).
But I ran into a big problem with the device: I couldn't run the Docusaurus page on the Jetson. Here is my progress:
- One requirement for running Docusaurus pages is Node.js, with a minimum version of 18, but the Jetson only supports version 10.
- I learnt that I can use the `apt-get` tool to install nodejs-10, but that's it. So I had to find another way on the official Node.js website. I tried to download the binary files for the latest version, but I accidentally downloaded the wrong one: the reComputer is an ARM-architecture Linux computer, and I had downloaded the x86-architecture Linux version.
- After facing some problems with the environment settings (the commands `echo`, `ln`, etc.; this cost me so much time), I finally found the right one.
- It turns out that I only need to go to the downloaded binaries folder in my terminal; executing `sudo cp -R * /usr/local/` then does the work:
  - `cp` is for copy.
  - `-R` is the option for recursively copying directories and their contents.
  - `*` is the wildcard character that represents all files and folders in the current directory.
  - `/usr/local/` is the PATH location and the environment setting I was looking for. Copying to this location means that Node.js is installed system-wide and accessible to all users.
- Finally I can run `npm init docusaurus` to install the initial Docusaurus pages and replace the files with my fablab folder. Then I run `npm run start` and finally have my website, and there goes my calendar (so not easy...):
But I did it! So interesting! The next steps (running Ollama and auto-generating the JSX files) will be easy.
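The architecture mix-up above is easy to repeat, so here is a small sketch that picks the Node.js Linux archive name matching the machine and checks the version requirement. The archive naming (`linux-arm64`, `linux-x64`) follows the official Node.js release files; the version number `18.20.2` and the helper names are only examples.

```python
import platform

# Maps machine names reported by the OS to the suffix used in Node.js release file names.
NODE_ARCH = {"aarch64": "arm64", "arm64": "arm64", "x86_64": "x64", "amd64": "x64"}


def node_tarball_name(version: str, machine: str = "") -> str:
    """Return the Linux binary archive name for a Node.js version and architecture."""
    machine = machine or platform.machine()   # default: the machine running this script
    arch = NODE_ARCH.get(machine.lower())
    if arch is None:
        raise ValueError(f"unsupported architecture: {machine}")
    return f"node-v{version}-linux-{arch}.tar.xz"


def meets_minimum(version: str, minimum_major: int = 18) -> bool:
    """Check a Node.js version string against the required major version."""
    return int(version.lstrip("v").split(".")[0]) >= minimum_major


# The Jetson reports "aarch64", so the ARM64 archive is the right download.
jetson_archive = node_tarball_name("18.20.2", "aarch64")
```

Running `node_tarball_name("18.20.2")` on the Jetson itself would select the ARM64 archive without having to remember which architecture the board uses.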
Install Ollama and run Llama3:
The curl tool is required to pull the latest Ollama; here is how to install it on the Jetson (ARM architecture):
wget https://curl.se/download/curl-7.51.0.tar.bz2
tar xvf curl-7.51.0.tar.bz2
cd curl-7.51.0
sudo ./configure
sudo make
sudo make install
Then I can use the official command `curl -fsSL https://ollama.com/install.sh | sh` to download Ollama. I then use `ollama run llama3` to pull the model onto my device.
Written on 2024.4.27th, a little update about my work.
Continuing with the AI recitation... Still half of the video left to process...
Combining techniques to do something personal:
Retrieval-Augmented Generation (RAG): provide domain-knowledge documents, which are split into chunks (different blocks) whose content is semantically vectorized (turned into vectors of numbers). It requires curation, but it does not require a large number of documents.
Fine-tuning: requires a large number of documents; the dataset is big.
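To illustrate the chunk-and-vectorize idea behind RAG, here is a toy retrieval sketch. Word counts stand in for a real embedding model, which is a deliberate simplification: a real system would use semantic embeddings and pass the retrieved chunk to the LLM as context. The sample document and all function names are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 12) -> list[str]:
    """Split a document into blocks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def vectorize(text: str) -> Counter:
    """Toy vectorizer: word counts stand in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str]) -> str:
    """Return the chunk most similar to the question."""
    q = vectorize(question)
    return max(chunks, key=lambda c: cosine(q, vectorize(c)))


document = (
    "The INMP441 microphone records audio on the XIAO ESP32C3 module. "
    "The reComputer runs the local LLM and the MQTT broker. "
    "The ILI9341 display shows the calendar information."
)
best = retrieve("Which display shows the calendar?", chunk(document, size=9))
```

Because only the best-matching chunk is handed to the model, a handful of well-curated documents is enough, which matches the note above that RAG does not need a large corpus.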
AI tools:
- Whisper Model: speech-to-text, can transcribe audio files in 100 languages
- WhisperWriter is a small speech-to-text app that uses OpenAI's Whisper model to auto-transcribe recordings from a user's microphone.
- Local LLM Apps: there are different ones; I explored them a little and decided to use a local LLM (Ollama) and the Docusaurus framework to build my website (UI).
Written on 2024.4.26th, a little update about my work.
While I was busy developing on my reComputer (which arrived a couple of days ago), sad about the progress and worried about my other assignments... I just found the recitation about AI (LLM) tools XD Oh My God - that is what I wanted! So much! I have the idea, I have the knowledge about LLMs, but at the same time I have the job... - which means I don't have much time to start from the beginning, and this is the reason I couldn't sleep well these days...
I will watch this video and finish it before I go to bed today... And... here are some notes:
LLM-powered CAD:
- Method 1: Using DSL(Domain Specific Language)
- prompt: give a function box(x, y, w, h, d) and ask GPT to generate
- Method 2: OpenJSCAD: there are multiple ones, not sure which one (this is a GitHub repo)
- The idea is using JS to describe the CAD design, and the JS code can be generated by LLM
- Method 3: Python - PyVista
- The idea is using Python to describe the CAD design, and the Python code can be generated by LLM
- Method 4: text-to-cad.zoo.dev
- Directly using text to generate CAD files
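Method 1 above can be sketched as a tiny interpreter: the LLM is asked to answer only with `box(x, y, w, h, d)` calls, and the host program collects them. The meaning of the arguments (position `x`, `y`; width, height, depth) and the sample program text are assumptions, since the notes only give the function signature.

```python
def run_dsl(program: str) -> list[dict]:
    """Evaluate a DSL program whose only available function is box()."""
    shapes: list[dict] = []

    def box(x, y, w, h, d):
        shapes.append({"x": x, "y": y, "w": w, "h": h, "d": d})

    # Only box() is exposed; this narrows the DSL but is not a security sandbox.
    exec(program, {"__builtins__": {}}, {"box": box})
    return shapes


def total_volume(shapes: list[dict]) -> float:
    """Sum the volume of all boxes, a quick sanity check on generated designs."""
    return sum(s["w"] * s["h"] * s["d"] for s in shapes)


# Stand-in for text an LLM might return for "a base plate with a post".
generated = "box(0, 0, 40, 20, 3)\nbox(15, 5, 10, 10, 12)"
shapes = run_dsl(generated)
```

Restricting the model to one primitive keeps its output easy to validate before anything is sent to a CAD tool.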
LLM-powered Laser Cutting:
- Method 1: not sure right now
LLM-powered CNC:
analyze the models using ChatGPT - which I might not be able to use.. analyze the performance with the models
CAD - Strengths and Limitations
Strengths:
- Respects high-level spatial constraints, such as a design element's absolute size or its position relative to another element of the design.
- Iteration Support
- Parametric design: ability to create parameters, bounds and constraints for text-based designs and already existing ones, and to interpolate and change designs
- Modularity and Hierarchy: works better when submodules are formed first.
Limitations:
- Lack of spatial awareness created difficulties with constraint handling
- Problem with Scalability
LLM-powered 3d Model:
- Method 1: Threestudio
- Method 2: Genie, lumalabsai
- Method 3: depthfusion, github
LLM-powered Electronics Design:
- Method 1: chat with any PDF: ask the datasheet about the components, microcontrollers, other modules. Which pin can connect, what code might be, etc.
- Method 2: BoardDesignerGPT: need gpt API
Example by student OpenAir
Written on 2024.4.17th, a little update about my work.
Achieved the function of generating content from the local LLM and converting it into a formal, standard JSX file: here. And I have my own calendar UI....
Written on 2024.4.15th, a little update about my work.
I think I need to make a UI that presents both the calendar and the communication with the LLM, to see whether it is workable. I have searched for multiple open-source calendar programs and they all seem to meet my requirements, with customizable parameters. But they all seem complicated - they have many functions and support various extensions, but it takes too much time to understand and learn them one by one. I don’t have enough time, so I plan to study them later.
- But I still need an automated calendar UI to help me convert my recorded content into what I need to do and display it so that I can see it at any time, and it has to be deployed locally.
- I suddenly realized that for locally deployed web pages, maybe "Docusaurus" has this extensibility, so I started looking. And I did find it - there is a date picker component for React called "DayPicker" that works with Docusaurus.
I then had an idea of how to do it:
- Ensure the DayPicker is functional on my FabLab pages
- Format the LLM's output as JS-style content that includes the important information, and store it as a text file
- I also need a continuous monitoring capability to watch the folder containing the converted files. "chokidar" might be a good choice.
- Then use "Handlebars" and node.js to generate React components with basically the same structure - only the information from my recordings will differ
- These components will be stored under the path "/src/components/calendar", named one by one
- Finally, create a TSX page that imports all the components one by one - then I will have the contents of my work and the timeline
This will deliver the function "Ensure the standard/formal content output by large language models is correctly converted into task tracking, presented as a calendar UI".
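The plan above relies on "chokidar" and "Handlebars" under node.js. Purely as an illustration of the same flow, the Python sketch below swaps in a simple polling step for the folder watcher and string formatting for the templates. The file names, component names, and page layout are hypothetical; only the "/src/components/calendar" path comes from the plan, and `@site` is Docusaurus's alias for the site root.

```python
import tempfile
from pathlib import Path


def new_text_files(folder: Path, seen: set[str]) -> list[Path]:
    """Return text files not processed yet (one polling step of a folder watcher)."""
    fresh = sorted(p for p in folder.glob("*.txt") if p.name not in seen)
    seen.update(p.name for p in fresh)
    return fresh


def build_page(component_names: list[str]) -> str:
    """Generate a TSX page that imports and renders every event component."""
    imports = "\n".join(
        f"import {name} from '@site/src/components/calendar/{name}';"
        for name in component_names
    )
    body = "\n".join(f"      <{name} />" for name in component_names)
    return (
        "import React from 'react';\n" + imports + "\n\n"
        "export default function CalendarPage() {\n"
        "  return (\n    <main>\n" + body + "\n    </main>\n  );\n}\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    watch_dir = Path(tmp)
    (watch_dir / "2024-04-15.txt").write_text("Plan the calendar UI")
    seen: set[str] = set()
    first_scan = new_text_files(watch_dir, seen)
    second_scan = new_text_files(watch_dir, seen)   # nothing new the second time

page_source = build_page(["Event001", "Event002"])
```

An event-driven watcher such as chokidar avoids the polling delay, but the bookkeeping of which files have already been converted stays the same.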
By the way, Google Calendar also sounds good, and might be easier:
- I can first set up Google Workspace on my computer with Python.
- Then also use Python to create the standard JS file that uploads my events/work to the cloud.
- The JS file can indeed be generated automatically
- For the UI itself - I go to Google Calendar and all will be fine.
But it requires cloud services, and the UI is not very good. I like local services. The Docusaurus approach might not look good at first, but I can customize it, which means I can build the UI from my own ideas. I choose the DayPicker. :D
Written on 2024.4.6th, my work is finally on track and it is time to do this lab right.
As a content and web manager for my company, my role involves extensive coordination with internal team members, as well as overseeing an additional project that requires engaging with numerous external stakeholders. Amid these responsibilities, I am also dedicating time to learning and developing within the fablab space. Given the multifaceted nature of my work, effective time and project management are not just beneficial—they are essential.
Recognizing the need for assistance in managing this complex array of duties, I am exploring the potential of integrating a personal AI assistant into my workflow. The recent advancements in large language models (LLMs) have caught my attention, presenting a promising solution that could offer the precision and efficiency that physical assistance may not always be able to provide. I am interested in leveraging the capabilities of an AI to enhance my productivity, especially in the context of the fablab project, where innovation and meticulous coordination are key.
Conceptualization Phase - ✅ on 4.6th
My Final Project Idea - Task Management Calendar
I plan for an AI assistant with a physical calendar interface, powered by an NVIDIA Jetson unit running a large language model, with an external display and a portable, voice-activated module.
Define Objectives
- It is a task management AI assistant with scheduling capability.
- In the future, it might have reminders, remote support, and wireless functions as well.
Current User - Matthew
- Manage different types of products, ensure the description and other content meet the standards.
- Output the priority of the tasks and ensure they meet the timeline, by day/week/month.
Requirements Gathering
- NVIDIA Jetson Orin NX module-powered devices, LLM integrated.
- Displays supporting the devices above, with the suitable size.
- Microphone and voice storage module, with voice recognition.
Design Phase
Hardware Design
- NVIDIA Jetson device - reComputer J4012, with mechanical dimensions: 130mm x 120mm x 58.5mm.
- The display is 7-inch, with an HDMI connector.
- The voice recognition module is DIY, with a Seeed Studio XIAO board, recorder chip, button, SD card and USB connector.
- The assembled and DIY-cut parts that integrate everything.
Software Design
- The LLM supported on the reComputer.
- UI design on the display.
- The code for pressing a button and recording voice, on the XIAO.
- The program that automatically copies the voice files to the Jetson platform.
- The program that converts voice into words (LLM), then converts them into standard commands and stores them as csv files.
- The program that reads the csv files and outputs them on the display.
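The last two programs in this list can be sketched with Python's standard `csv` module: one function stores tasks as CSV text and another reads them back in display order. The column layout (`date`, `priority`, `task`), the sample tasks, and the ordering rule are assumptions for illustration.

```python
import csv
import io

FIELDS = ["date", "priority", "task"]   # hypothetical column layout


def write_tasks(tasks: list[dict]) -> str:
    """Serialize tasks to CSV text (a file on disk would work the same way)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(tasks)
    return buffer.getvalue()


def read_tasks(csv_text: str) -> list[dict]:
    """Read tasks back, ordered by date and then by priority (1 = most urgent)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda r: (r["date"], int(r["priority"])))


csv_text = write_tasks([
    {"date": "2024-04-08", "priority": 2, "task": "Update product descriptions"},
    {"date": "2024-04-07", "priority": 1, "task": "Record week 11 notes"},
    {"date": "2024-04-08", "priority": 1, "task": "Review timeline"},
])
ordered = read_tasks(csv_text)
```

ISO-style dates sort correctly as plain strings, which is why the sketch can order rows without parsing them.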
The Development Phase
Since I was pretty busy before 2024.4.6th, I should catch up from week 11, in three portions:
- The main courses
- The courses to catch up with
- Development of my own final project
The previous
June 5th will be the final, and I still have 8 weeks to catch up on. Here is what I am currently finishing and the assignments that I need to catch up on:
The plan from 2024.4.6th
As a project manager at my company, I should have a plan for doing the work and catching up: