
18. Applications and Implications


Weekly Assignment:

Propose a final project masterpiece that integrates the range of units covered, answering:

  • What will it do?
  • Who's done what beforehand?
  • What will you design?
  • What materials and components will be used?
  • Where will they come from?
  • How much will they cost?
  • What parts and systems will be made?
  • What processes will be used?
  • What questions need to be answered?
  • How will it be evaluated?
  • What tasks have been completed?
  • What tasks remain?
  • What has worked? What hasn't?
  • What questions need to be resolved?
  • What have you learned?

Your project should incorporate 2D and 3D design, additive and subtractive fabrication processes, electronics design and production, embedded microcontroller interfacing and programming, system integration and packaging

Where possible, you should make rather than buy the parts of your project

Projects can be separate or joint, but need to show individual mastery of the skills, and be independently operable


Question Responses

What will it do?

The project is a voice-activated 3-axis pen plotter that listens to spoken commands, transcribes them into text using Whisper, and draws corresponding images by sending pre-generated G-code files to a GRBL-controlled CNC machine. A potential future goal is to dynamically generate G-code from AI-generated images by using Inkscape as a mediator—converting SVG outputs into G-code, allowing for expanded creations beyond the predefined set.

A Raspberry Pi manages audio input, UI display, and serial communication with the plotter. The touchscreen interface allows users to review the transcribed prompt, select or confirm the image, and initiate the drawing process. The machine produces simple line art drawings based on a library of predefined G-code files tied to specific voice commands.
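
As a rough illustration of this flow, here is a minimal sketch assuming the openai-whisper Python package and a hypothetical keyword-to-file library; the keywords and file names are placeholders, not the final command set.

```python
# Sketch: transcribe a recorded command with Whisper and pick a
# pre-generated G-code file to plot. Keywords and paths are placeholders.
import whisper

GCODE_LIBRARY = {
    "cat": "gcode/cat.gcode",
    "house": "gcode/house.gcode",
    "star": "gcode/star.gcode",
}

def pick_drawing(audio_path="command.wav"):
    model = whisper.load_model("base")   # smaller models are more practical on a Pi 4
    text = model.transcribe(audio_path)["text"].lower()
    for keyword, gcode_path in GCODE_LIBRARY.items():
        if keyword in text:
            return text, gcode_path      # transcription + matching G-code file
    return text, None                    # no match: ask the user to repeat
```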

Who's done what beforehand?

My project is inspired by Jack Hollingsworth's Ouija Board, which uses ChatGPT to generate responses and control stepper motors to physically "move" a planchette. I learned about this through the Ouija board group project from last year’s Fab Academy cycle. Both projects demonstrated how artificial intelligence can be used for real-world motion control, using G-code to command stepper motors. I was especially drawn to the voice-activated aspect, which made the interaction feel more natural and autonomous, and I wanted to explore that further in a visual way.

What will you design?

I will design the 3D components such as the pen lift mechanism, the carriage that holds the pen, and the motor mounts and attachments for the linear rail. I will also design the cable management system, the housing for the Raspberry Pi and touchscreen, and any structural elements needed to stabilize the CoreXY frame. Additionally, I will design a custom PCB to manage motor control and possibly handle sensor inputs like limit switches.

What materials and components will be used? How much will they cost?

This is my bill of materials that lists out all the components and costs.

Where will these materials come from? What parts and systems will be made?

Most components—such as stepper motors, CNC shield, and structural hardware—are available in the Fab Lab. However, a few specialized parts like the touchscreen display and the ReSpeaker 2-Mics Pi HAT were purchased separately from online retailers (e.g., Amazon, Seeed Studio). 3D printed parts will be fabricated in-house, and the custom PCB will be designed and milled using lab equipment.

What processes will be used?

This project will use a mix of digital fabrication, electronics assembly, and software development processes, including:

  • 3D Printing for fabricating custom mechanical components like the pen lift mechanism, motor mounts, and structural brackets.

  • CNC Milling to manufacture the custom PCB for motor and signal routing.

  • Soldering for assembling the custom PCB and making reliable electrical connections between components.

  • Laser Cutting (and possibly vinyl cutting) for aesthetic and structural base parts and enclosures.

  • Embedded Programming to configure the GRBL firmware on the Arduino and potentially program the custom PCB.

  • Python Scripting to build the touchscreen interface, handle audio transcription (Whisper), interact with the ChatGPT API, and send G-code to GRBL (a rough sketch of the ChatGPT step follows this list).

  • Voice Recognition + AI Processing for capturing and processing user input with Whisper and ChatGPT to generate drawing commands.

  • G-code Generation and Visualization, possibly using Inkscape as a middle layer to convert AI-generated drawings into usable G-code.
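
For the ChatGPT step mentioned in the Python scripting item above, one possible shape of the request is sketched below, assuming the official openai Python client; the model name, prompt wording, and "SVG-only" output format are my assumptions, not settled choices.

```python
# Sketch: ask the ChatGPT API for a simple line-art SVG that could later be
# converted to G-code with Inkscape. Error handling is omitted.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def request_svg(subject: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # assumed model choice
        messages=[
            {"role": "system",
             "content": "Reply with only a single-path line-art SVG, no prose."},
            {"role": "user", "content": f"Draw a simple {subject}."},
        ],
    )
    return response.choices[0].message.content  # raw SVG markup to hand to Inkscape
```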

What questions need to be answered/resolved?

  • Where does the custom PCB best fit in the system? Should it act as a signal breakout board, or handle logic beyond what the CNC shield provides?

  • How reliable is Whisper’s transcription in real-world (noisy) environments, and will it perform consistently in general use?

  • How robust is ChatGPT’s G-code generation? What can I do so that it doesn’t output invalid toolpaths?

  • Should Inkscape be used as an intermediary step between AI and G-code output? Would it be better to have a secondary option where ChatGPT generates an SVG that’s then processed into G-code using Inkscape?

  • What is the best way to display and preview G-code on the Raspberry Pi touchscreen? How would I visualize it? (A rough preview sketch follows this list.)

  • What fail-safes or feedback systems are needed (limit switches or emergency stop)?
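
For the preview question above, one low-effort approach (assuming matplotlib is installed on the Pi) is to pull the X/Y coordinates out of straight-line moves and plot the path. This is only a sketch: it ignores arcs (G2/G3) and does not distinguish pen-up travel moves from drawing moves.

```python
# Sketch: crude G-code preview by plotting the X/Y targets of G0/G1 moves.
import re
import matplotlib.pyplot as plt

def preview_gcode(path):
    xs, ys = [], []
    x = y = 0.0
    with open(path) as f:
        for raw in f:
            line = raw.strip().upper()
            if not line.startswith(("G0", "G1")):   # very loose match
                continue
            mx = re.search(r"X(-?\d+\.?\d*)", line)
            my = re.search(r"Y(-?\d+\.?\d*)", line)
            if mx:
                x = float(mx.group(1))
            if my:
                y = float(my.group(1))
            xs.append(x)
            ys.append(y)
    plt.plot(xs, ys, linewidth=0.8)
    plt.axis("equal")
    plt.show()
```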

How will it be evaluated?

  • Accuracy of the voice recognition. How well the system transcribes spoken commands into text using Whisper, especially in typical usage conditions.

  • Correct and reliable G-code generation. Whether the AI-generated G-code produces the intended drawings without errors or machine faults, including successful use of Inkscape as an intermediary if implemented.

  • Mechanical performance. The precision, smoothness, and repeatability of the pen plotter’s movements, including reliable pen lifting and placement.

  • User Interface Usability. How intuitive and responsive the touchscreen UI is for reviewing transcriptions, previewing G-code, and controlling the plotter.

  • System Integration and Stability. How well all components (audio input, AI processing, G-code sending, motor control) work together without crashes, communication failures, or unexpected hiccups.

  • Safety and Fail-Safe Functionality. Effective operation of limit switches, emergency stops, and error handling to protect hardware.

  • Overall User Experience. The ease of issuing a voice command and successfully generating a drawing with minimal manual intervention or troubleshooting.

What tasks have been completed?

I have installed the 7-inch Raspberry Pi touchscreen display and confirmed that it works: from the touchscreen I can browse the internet and use the command prompt.

I have set up the ReSpeaker 2-Mics Pi HAT on the Raspberry Pi 4 and confirmed that the microphone can functionally capture audio.
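
A minimal recording test along these lines can grab a short clip for Whisper; this assumes the sounddevice and scipy Python packages, and the sample rate, duration, and channel count are placeholder values rather than measured settings.

```python
# Sketch: record a short clip from the ReSpeaker HAT and save it as a WAV file.
import sounddevice as sd
from scipy.io.wavfile import write

SAMPLE_RATE = 16000   # Whisper resamples input to 16 kHz anyway
DURATION = 5          # seconds

recording = sd.rec(int(DURATION * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=2)  # the HAT has two mics
sd.wait()                                               # block until recording finishes
write("command.wav", SAMPLE_RATE, recording)
```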

I have designed all the mechanical parts I currently believe I need, such as the pen lift mechanism and linear rail attachments for the CoreXY pen plotter.

I have developed a plan to send G-code from the Raspberry Pi to the GRBL controller using pySerial.
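
Roughly, the plan is a simple streaming loop like the sketch below; the serial port name is an assumption (it may be /dev/ttyACM0 on the Pi) and error handling is minimal.

```python
# Sketch: wake GRBL, then send the G-code file line by line and wait for a
# response after each line before sending the next one.
import time
import serial

def stream_gcode(filename, port="/dev/ttyUSB0", baud=115200):
    with serial.Serial(port, baud, timeout=5) as grbl:
        grbl.write(b"\r\n\r\n")         # wake up GRBL
        time.sleep(2)                    # let it finish booting
        grbl.reset_input_buffer()        # discard the startup banner
        with open(filename) as f:
            for raw in f:
                line = raw.split(";")[0].strip()   # drop comments and blank lines
                if not line:
                    continue
                grbl.write((line + "\n").encode())
                response = grbl.readline().decode().strip()
                if response != "ok":
                    print("GRBL said:", response)  # surface errors, don't hang
```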

I have considered potential failure modes, such as misheard commands, AI G-code inaccuracies, and mechanical issues like pen lift failure or shaking. I have received feedback from mentors suggesting alternate approaches, including using Inkscape as a mediator for AI-generated images to G-code conversion and the possibility of pre-coded drawings for reliability.

What tasks remain?

I still need to develop and finalize the Python software pipeline that fully integrates audio capture, transcription (Whisper), AI prompt handling (ChatGPT), and G-code generation. I also have yet to implement and test the pySerial communication script and verify motor control. Before I can get to that, I need to print and assemble the CoreXY pen plotter frame and mount all mechanical components, including the stepper motors, limit switches, and pen lift mechanism. Once that is done, I need to optimize and calibrate the mechanical and control system to improve drawing accuracy, stability, and repeatability. I also need to build the touchscreen user interface to display transcriptions, preview G-code files, and provide real-time plotter status updates. With those parts in place, I will conduct iterative testing of the individual subsystems, prepare documentation, troubleshoot failure modes as needed, and implement safety features.

I may also explore and implement the Inkscape-mediated workflow for converting AI-generated images into G-code to increase drawing reliability.

What has worked? What hasn't?

So far, several key components have worked well: the Raspberry Pi touchscreen display is fully functional, the ReSpeaker 2-Mics Pi HAT reliably captures audio input, and the mechanical designs for the pen lift mechanism and linear rail attachments are complete. Communication plans using pySerial to send G-code to the GRBL controller are in place, and basic system integration has been established.

However, the direct AI-to-G-code generation has presented significant challenges. Neil expressed serious concerns about the reliability and maturity of technology that translates natural language prompts directly into accurate G-code. Because of this, Neil suggested an alternative approach: using AI to generate an image first, then converting that image into G-code via software like Inkscape. This idea was mirrored by Mr. Nelson, who recommended having the AI generate images that could be fed into a G-code generating tool, improving accuracy and reliability. Mr. Dubick also suggested a simpler but robust method — pre-coding a set of images the machine can draw well, then triggering these with voice commands. This reduces the risk of AI misinterpretation and ensures consistent plotting results.

What have I learned?

I have learned that integrating AI technologies like direct natural-language-to-G-code generation is promising but still experimentally fragile for real-world applications that require precision, like CNC plotting. This has pushed me to consider hybrid approaches, and mentor suggestions have opened my eyes to other strategies, such as using AI-generated images processed through established tools like Inkscape, or relying on pre-coded designs, to ensure consistent results.

Technically, I gained hands-on experience setting up and configuring hardware components like the Raspberry Pi touchscreen and ReSpeaker 2-Mics Pi HAT, learning about their integration challenges and solutions. I also learned a lot about Raspberry Pis, having never used one before, though I still have much to figure out. Additionally, I learned the value of modular design and incremental testing: validating each subsystem independently before combining them helps isolate and address potential failure points early.


Last update: May 23, 2025