Output Devices
Tasks
Group assignment:
- Measure the power consumption of an output device.
- Document your work on the group work page and reflect on your individual page what you learned.
Individual assignment:
- Add an output device to a microcontroller board you’ve designed and program it to do something.
Group assignment
My contribution was that I did the group assignment.
Follow the link to find it: group assignment week 09
Objectives for the week:
- Making a breakout board with 16 red LEDs
- With multiplexing
- Control a tiny Pololu motor
- Control an LED display to output speed numbers (tachometer!)
LED multiplexing
My final project, a bicycle wheel display, will involve controlling quite a few LEDs and switching them as quickly as possible. The easy option, electronics- and coding-wise, would be something like NeoPixels, but the NeoPixel refresh rate is too low.
An alternative is row-column multiplexing. When row-column multiplexing LEDs, you can control row x column LEDs with row + column pins. 16 LEDs can therefore be controlled with 8 pins. As you scale up, the number of LEDs grows quadratically while the number of pins grows linearly: 100 LEDs can be controlled with only 20 pins, so I could control 24 RGB LEDs (72 channels) with about sqrt(72) * 2 = 16.97, that is, only 18 pins in a square 9 x 9 matrix.
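The pin arithmetic above can be checked with a few lines of plain Python (meant for a desktop interpreter, not the board). This is only a sketch: `pins_for_matrix` is a hypothetical helper name, and it assumes a near-square matrix with ceil(sqrt(n)) rows, which is not guaranteed to be the optimal layout in every case.

```python
import math


def pins_for_matrix(n_leds):
    """Return (rows, cols, pins) for a near-square row-column matrix."""
    rows = math.isqrt(n_leds)
    if rows * rows < n_leds:
        rows += 1                   # round the square root up
    cols = -(-n_leds // rows)       # ceiling division
    return rows, cols, rows + cols


for n in (16, 72, 100):
    rows, cols, pins = pins_for_matrix(n)
    print(f"{n} channels -> {rows} x {cols} matrix, {pins} pins")
```

Under these assumptions a 9 x 8 layout covers the 72 channels with 17 pins, one fewer than the square 9 x 9 estimate.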
Breakout board design
I decided to implement this as a breakout board. This first iteration will only have 16 LEDs, controlled by four row pins and four column pins.
Initial doubts:
- Do I need transistors to drive this?
- Q1, 3 and 5 should be PNP not NPN Transistors, especially if V+ is greater than the V+ of your Arduino. quote
- Do we not have transistors that are not mosfets?
- If I do need transistors, should I not put them in both the Vin and GND paths?
- What is a shunt exactly?
- How do I use Spice in KiCAD?
- How do you make a double sided board?
Results
Amazingly, this worked out the first time! I played around a bit with 4 LEDs on a breadboard and then did a lot of reading. This led me to settle on a 4-row, 4-column design. In this video, the LEDs look static to the eye, but the shutter speed syncs with the refresh so that in the recording they seem to be moving:
The schematic is the following:
I used N-channel MOSFETs, so they need to be placed downstream of the load. A column pin is set HIGH, so that it outputs current towards the MOSFETs, whose gates are connected to the row pins. When a MOSFET conducts, current flows through the LED, the MOSFET, and then to ground.
This is kind of an intermediate design. The MOSFET allows me to turn on a whole row at a time without overwhelming the max sink current of the MCU, but the rows are not thus protected. This works because of time multiplexing: I light the device a single row at a time, so there will never be more than one LED on per column in any single moment.
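To make the time multiplexing concrete, here is a small desktop-Python sketch that splits a 16-element, row-major pattern into the column levels to apply while each row is active. `scan_frame` and `PRIMES` are names made up for this illustration; the pattern mirrors the `primes` array used in the Arduino sketches on this page.

```python
# Primes among 1..16, stored row-major (4 rows of 4 LEDs).
PRIMES = (False, True, True, False,
          True, False, True, False,
          False, False, True, False,
          True, False, False, False)


def scan_frame(pattern, n_rows=4, n_cols=4):
    """Yield (row, column_levels) for each row of one multiplexed frame."""
    for row in range(n_rows):
        start = row * n_cols
        yield row, tuple(pattern[start:start + n_cols])


# Only one row is active at a time, so each column pin feeds at most one LED,
# while the active row's MOSFET carries the current of every lit LED in it.
max_row_load = max(sum(levels) for _, levels in scan_frame(PRIMES))

for row, levels in scan_frame(PRIMES):
    print(row, levels)
print("most LEDs lit in a single row:", max_row_load)
```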
I used a two-layer board design to avoid needing many 0 ohm resistors for jumping.
Front:
Back:
The fabrication was surprisingly easy! I used two 0.85mm-wide holes for alignment, placed centrally. After milling the front, I put sewing pins through these holes, cut the pin heads off, then flipped the board using the pins as guides. After milling the back copper, holes, and outline, the resulting alignment was quite good! For vias I just used copper wire:
It can even serve as a peineta (an ornamental hair comb) in a pinch!
It connects to my dev board like this:
If you want to play around with it, you can find the KiCAD project here.
Lessons learned about 2-sided boards
- Careful when mirroring the back side! Mirror it with the outline, or it will become misaligned.
- When placing it on the mill, be mindful of the way it will fall when you turn it: otherwise it might fall out of the sacrificial board and collide with the walls of the Roland.
- To be tried in the future: drilling the alignment holes with the 1/64" endmill would save 2 tool changes.
- Update 2024-06-12: tried it, didn't work: the 1/64" endmill isn't long enough so its shank collides with the hole edges.
Fast switching
I've calculated the precision that I'll need to turn LEDs on and off accurately enough in my final project. It turns out that, at 20km/h, a relaxed cycling cruise speed, a given spot near the edge of the wheel takes just 1ms to traverse 5mm. This means that if I want to have pixels around 5mm in size I'll need switching times way below 1ms. I might also need very powerful LEDs in order to keep them on for as short a time as possible.
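The estimate above is easy to reproduce. The following desktop-Python sketch assumes that a point near the rim moves, relative to the bike frame, at roughly the riding speed; `traverse_time_ms` is a hypothetical helper name.

```python
def traverse_time_ms(speed_kmh, pixel_mm):
    """Time in milliseconds for a rim point to travel pixel_mm."""
    speed_mm_per_ms = speed_kmh * 1_000_000 / 3_600_000  # km/h -> mm/ms
    return pixel_mm / speed_mm_per_ms


print(round(traverse_time_ms(20, 5), 2), "ms per 5 mm pixel at 20 km/h")
```

At 20km/h the speed is about 5.6mm/ms, so a 5mm pixel passes in roughly 0.9ms, which is where the "just 1ms" figure comes from.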
First attempt with timer interrupts
Of course, blocking IO is not an option. My first explorations used MicroPython for speed of development. I used timers to turn each row on and off in turn. The result worked great up to around 20fps, but above that it would behave weirdly.
First working attempt in MicroPython. Click to view code
from machine import Pin, PWM, Timer, UART
import time
import micropython

# Reserve memory so that exceptions raised inside timer callbacks can be reported
micropython.alloc_emergency_exception_buf(100)

## Barduino
# Barduino pin 14 is connected to the buzzer
r1 = Pin(11, Pin.OUT)
r2 = Pin(12, Pin.OUT)
r3 = Pin(13, Pin.OUT)
r4 = Pin(15, Pin.OUT)
c1 = Pin(16, Pin.OUT)
c2 = Pin(17, Pin.OUT)
c3 = Pin(18, Pin.OUT)
c4 = Pin(21, Pin.OUT)

rows = (r1, r2, r3, r4)
cols = (c1, c2, c3, c4)
all_pins = rows + cols


def all_off():
    for pin in all_pins:
        pin.off()


def cols_on():
    for col in cols:
        col.on()


def rows_on():
    for row in rows:
        row.on()


def cols_off():
    for col in cols:
        col.off()


def rows_off():
    for row in rows:
        row.off()


def row_pattern(rows, pattern, duration):
    # Show the first row of `pattern`, then schedule the remaining rows
    Timer(1).deinit()
    all_off()
    this_row = rows[0]
    other_rows = rows[1:]
    these_lights = pattern[:4]
    other_lights = pattern[4:]
    this_row.on()
    for on, col in zip(these_lights, cols):
        if on:
            col.on()

    def next_step(t):
        all_off()
        print(other_lights)
        if len(other_rows) > 0:
            row_pattern(other_rows, other_lights, duration)

    Timer(1).init(period=duration, mode=Timer.ONE_SHOT, callback=lambda t: next_step(t))


def show(pattern, duration, refresh_rate):
    millis = 1000 // refresh_rate
    Timer(0).init(freq=refresh_rate, mode=Timer.PERIODIC, callback=lambda t: row_pattern(rows, pattern, millis // 4))  # shortcut; might lead to inaccuracy
    # Stop refreshing after `duration` milliseconds
    Timer(2).init(period=duration, mode=Timer.ONE_SHOT, callback=lambda t: Timer(0).deinit())


def pos_to_row_col(n):
    row = rows[n % 4]
    col = cols[n // 4]
    return row, col


# 1 to 16 (1 is not prime)
primes = (False, True, True, False, True, False, True, False,
          False, False, True, False, True, False, False, False)

show(primes, 2000, refresh_rate=10)
Detour: Researching direct port manipulation
I had read that digitalWrite is very slow and that the fastest way to go is Direct Port Manipulation. This is something that I've been interested in because I come from a high-level programming background and it feels like this kind of thing lies at the essence of MCU programming wizardry.
I read a loooooot of references and finally found one that made it click for me: What is the fastest way to read/write GPIOs on SAMD21 boards?. From there:
For a custom SAMD21 board with consecutively number bits on PORTA, you can do the fastest read with something like:
static inline boolean fastRead(int bitnum) {
  return !! (PORT_IOBUS->Group[0].IN.reg & (1<<bitnum));
}
and write with:
static inline void fastWrite(int bitnum, int val) {
  if (val)
    PORT_IOBUS->Group[0].OUTSET.reg = (1<<bitnum);
  else
    PORT_IOBUS->Group[0].OUTCLR.reg = (1<<bitnum);
}
I used ChatGPT to understand the magic and I think I got it: a port is a group of pins: 8 in ATtinys (that's where the PAxx, PBxx, PCxx numbers come from, with a max of 8) and 32 in SAMD21s. In my particular ATSAMD21E18A, all GPIO pins are in a single port, port A.
Registers are 32-bit numbers (in this processor) which you can write to in order to set properties for the pins: for example, if I wanted to set pins 2, 3, and 4 of an 8-bit port to INPUT, I would write to the corresponding INPUT register (whose actual name I'd have to find in the datasheet): INPUT = 0b00011100 (counting pins from 0), or something like that.
For my processor, the list of GPIO registers can be found on page 371 of the SAM D21 Family Data Sheet (pdf link).
It's not quite writing assembly, but it feels like it's just one step above.
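The register-mask idea described above can be tried out in plain desktop Python. This is only a sketch of the bit arithmetic, with pins counted from 0 so that pin n corresponds to bit n; `pin_mask` is a hypothetical helper, not a name from any datasheet or library.

```python
def pin_mask(pins):
    """Build a register value with one bit set per pin number."""
    mask = 0
    for pin in pins:
        mask |= 1 << pin
    return mask


mask = pin_mask([2, 3, 4])
print(f"{mask:#010b}")  # prints 0b00011100
```

The same `1 << bitnum` expression is what the quoted `fastWrite` uses to address a single pin.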
Backtrack: do I really need timer interrupts and Direct Port Manipulation?
At this point, I was all eager to start writing arcane incantations, but I remembered a piece of advice that I often tell my students in Machine Learning: always do the stupidly obvious thing first, if only to have a baseline to measure against later when you build the complicated "smart" version. In Machine Learning, the gains often are not worth it.
So I took a step back and wrote a first version of the code that uses digitalWrite() to display a pattern in a single row of my LED multiplexer and micros() to measure how long it takes.
First attempt with switching time measure. Click to view code.
const int r0 = 0;
const int r1 = 1;
const int r2 = 2;
const int r3 = 3;
const int c0 = 4;
const int c1 = 5;
const int c2 = 6;
const int c3 = 7;
const int nRows = 4;
const int nCols = 4;
int rows[nRows] = { r0, r1, r2, r3 };
int cols[nCols] = { c0, c1, c2, c3 };
int all_pins[nRows + nCols] = { r0, r1, r2, r3, c0, c1, c2, c3 };

bool pattern[nRows * nCols];
bool primes[nRows * nCols] = { false, true, true, false, true, false, true, false, false, false, true, false, true, false, false, false };

long iteration = 0;
long start, end;
int nCycles = 1000;

void setup() {
  for (int i = 0; i < 8; i++) {
    pinMode(all_pins[i], OUTPUT);
  }
  Serial.begin(115200);
  // See p 378 of the datasheet
  // Leftover from experimenting: this statement does not write to the register
  PORT_IOBUS->Group[0].OUTCLR;  // https://forum.arduino.cc/t/what-is-the-fastest-way-to-read-write-gpios-on-samd21-boards/907133/9
  start = micros();
  Serial.println("Let us play");
}

void loop() {
  rowShow(0, primes);
  // Every nCycles iterations, print the average time per call in microseconds
  if (iteration % nCycles == 0) {
    end = micros();
    long averageTime = (end - start) / nCycles;
    Serial.println(averageTime);
    start = micros();
  }
  iteration += 1;
}

// Light the LEDs of one row according to the row-major pattern
void rowShow(int rowNumber, bool pattern[]) {
  allOff();
  digitalWrite(rows[rowNumber], HIGH);
  for (int i = 0; i < nCols; i++) {
    int position = rowNumber * nCols + i;
    if (pattern[position]) {
      digitalWrite(cols[i], HIGH);
    }
  }
}

void allOff() {
  for (int i = 0; i < 8; i++) {
    digitalWrite(all_pins[i], LOW);
  }
}

void allOn() {
  for (int i = 0; i < 8; i++) {
    digitalWrite(all_pins[i], HIGH);
  }
}
Turns out, it only takes 34us to switch a whole row with this approach! I could split my 1ms cycle time into 4x34=136us for switching and leave the pins on for the remaining 864us. That would give an on time per row of 864/4=216us, so a duty cycle of 21.6% and a smear of 0.216ms * 5mm/ms ~= 1mm.
It's not perfect, and I think it will be noticeable, but it's a starting point. This is enough for the POC, and possibly for the MVP. I'll have to hold myself back and save the wizardry for later.
When I come back to it, I think this Accessing SAM MCU Registers in C guide can also be super useful.
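The per-row time budget estimated earlier in this section can be recomputed for other frame rates or switching times with a short desktop-Python sketch. `row_budget` is a hypothetical helper, and the 5mm/ms rim speed is the rounded figure used in the text.

```python
def row_budget(frame_us, switch_us, n_rows, speed_mm_per_ms):
    """Return (on time per row in us, duty cycle, smear in mm)."""
    on_us = (frame_us - n_rows * switch_us) / n_rows
    duty = on_us / frame_us
    smear_mm = on_us / 1000 * speed_mm_per_ms
    return on_us, duty, smear_mm


on_us, duty, smear_mm = row_budget(frame_us=1000, switch_us=34,
                                   n_rows=4, speed_mm_per_ms=5)
print(f"on time {on_us:.0f} us, duty {duty:.1%}, smear {smear_mm:.2f} mm")
```

With the measured 34us per row this gives 216us of on time, a 21.6% duty cycle and about 1.08mm of smear.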
Second approach: state machine
All code examples are saved as commits in my repo for this experimentation.
The problem now becomes switching from row to row on time. I can use a state machine approach to begin with. I found this State Machine and Timers, Medium level tutorial useful, even if I don't quite do it the way they do.
During performance profiling of my solution I found a super funny phenomenon: printing a single double would cause a noticeable flicker of the LEDs. I narrowed it down to this:
Version without flicker. Click to show code.
currentRow = 0;
busyFraction = double(busyMicros) / double(elapsed);
Serial.print("busyMicros: ");
Serial.print(busyMicros);
Serial.print(" elapsed:");
Serial.println(elapsed);
busyMicros = 0;
start = micros();
Version with flicker. Click to show code.
currentRow = 0;
busyFraction = double(busyMicros) / double(elapsed);
Serial.print("busyFraction: ");
Serial.print(busyFraction);
Serial.print(" busyMicros: ");
Serial.print(busyMicros);
Serial.print(" elapsed:");
Serial.println(elapsed);
busyMicros = 0;
start = micros();
So, weirdly, it was only the printing and not the calculation that took a long time! Since I was counting time already it was easy to see the exact time penalty of that single print: cycle time was increased by 6ms, which is huge!
Anyway, so the version (commit) that worked was:
First version with full frame. Click to show code.
const int r0 = 0;
const int r1 = 1;
const int r2 = 2;
const int r3 = 3;
const int c0 = 4;
const int c1 = 5;
const int c2 = 6;
const int c3 = 7;
const int nRows = 4;
const int nCols = 4;
int rows[nRows] = { r0, r1, r2, r3 };
int cols[nCols] = { c0, c1, c2, c3 };
int all_pins[nRows + nCols] = { r0, r1, r2, r3, c0, c1, c2, c3 };

const unsigned int frameRate = 1000;
unsigned long microsecondsPerFrame = 1000000 / frameRate;

// For performance profiling
long busyMicros = 0;
float busyFraction = 0.0;

bool pattern[nRows * nCols];
bool primes[nRows * nCols] = { false, true, true, false, true, false, true, false, false, false, true, false, true, false, false, false };

long iteration = 0;
long start, end, now;
int nCyclesRefresh = 10000;

// State machine
byte prevRow = 0;
byte currentRow = 0;
bool debug = true;

void setup() {
  for (int i = 0; i < 8; i++) {
    pinMode(all_pins[i], OUTPUT);
  }
  Serial.begin(115200);
  delay(100);
  Serial.println("Let us play");
  start = micros();
}

void loop() {
  now = micros();
  prevRow = currentRow;
  updateState(now);
  // Only touch the pins when the state (the active row) has changed
  if (currentRow != prevRow) {
    rowShow(currentRow, primes);
  }
  iteration += 1;
}

// Map the time elapsed in the current frame to the row that should be lit
void updateState(long now) {
  long elapsed = now - start;
  int segment = elapsed / (microsecondsPerFrame / nRows);
  if (segment >= nRows) {
    // Frame finished: report profiling data and start a new frame
    currentRow = 0;
    busyFraction = double(busyMicros) / double(elapsed);
    // Serial.print("microsecondsPerFrame: ");
    // Serial.println(microsecondsPerFrame);
    Serial.print(" busyMicros: ");
    Serial.print(busyMicros);
    Serial.print(" elapsed:");
    Serial.println(elapsed);
    busyMicros = 0;
    start = micros();
  } else {
    currentRow = segment;
  }
}

void rowShow(int rowNumber, bool pattern[]) {
  long thisStart = micros();
  allOff();
  digitalWrite(rows[rowNumber], HIGH);
  for (int i = 0; i < nCols; i++) {
    int position = rowNumber * nCols + i;
    if (pattern[position]) {
      digitalWrite(cols[i], HIGH);
    }
  }
  busyMicros += micros() - thisStart;
}

void allOff() {
  for (int i = 0; i < 8; i++) {
    digitalWrite(all_pins[i], LOW);
  }
}

void allOn() {
  for (int i = 0; i < 8; i++) {
    digitalWrite(all_pins[i], HIGH);
  }
}
State is described by currentRow; on every iteration of the loop we check the microseconds elapsed and update it accordingly, giving each row an even share of the time. It worked great up to at least 1000fps. Going over that caused uneven illumination of the LEDs, which I assume happens because some updates are skipped. I didn't bother to diagnose it fully because 1000fps is right at my target for my POC.
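The elapsed-time-to-row mapping can be modelled in desktop Python to see how updates could get skipped when the loop polls too slowly. This is an illustration of that hypothesis, not a diagnosis of the real board; `row_for_elapsed` and `rows_seen` are hypothetical names, and the polling is assumed to happen at a perfectly regular interval.

```python
def row_for_elapsed(elapsed_us, frame_us=1000, n_rows=4):
    """Return the row to show, or None once the frame is over."""
    segment = elapsed_us // (frame_us // n_rows)
    if segment >= n_rows:
        return None  # the caller would restart the frame timer here
    return segment


def rows_seen(poll_us, frame_us=1000, n_rows=4):
    """List the rows that actually get shown when polling every poll_us."""
    seen = []
    for t in range(0, frame_us, poll_us):
        row = row_for_elapsed(t, frame_us, n_rows)
        if row is not None and (not seen or seen[-1] != row):
            seen.append(row)
    return seen


print(rows_seen(100))  # fast polling: every row gets its turn
print(rows_seen(600))  # slow polling: some rows are never shown
```

In this model, polling every 100us shows all four rows, while polling every 600us only ever shows rows 0 and 2.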
In a later modification, just out of curiosity, I took the fastWrite function from What is the fastest way to read/write GPIOs on SAMD21 boards? and replaced all instances of digitalWrite in my code with it. It sped things up from ~120us to 20us spent in the switching function. Now I can drive the multiplexer up to 10000fps with no visible artifacts! I'm going to target 5000fps, which gives me an uncertainty of ~1mm at 20km/h near the edge of the wheel.
`fastWrite`. Click to show code.
static inline void fastWrite(int bitnum, int val) {
  if (val)
    PORT_IOBUS->Group[0].OUTSET.reg = (1<<bitnum);
  else
    PORT_IOBUS->Group[0].OUTCLR.reg = (1<<bitnum);
}
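To see why `fastWrite` needs no read-modify-write, here is a toy desktop-Python model of the set and clear registers: writing a 1 to a bit of OUTSET sets that output, writing a 1 to a bit of OUTCLR clears it, and bits written as 0 are left untouched. `PortModel` and `fast_write` are made up for this sketch and are not part of any SAMD21 library.

```python
class PortModel:
    """Toy model of a 32-bit output port with set/clear registers."""

    def __init__(self):
        self.out = 0

    def outset(self, mask):
        self.out |= mask                  # bits written as 1 go HIGH

    def outclr(self, mask):
        self.out &= ~mask & 0xFFFFFFFF    # bits written as 1 go LOW


def fast_write(port, bitnum, val):
    if val:
        port.outset(1 << bitnum)
    else:
        port.outclr(1 << bitnum)


port = PortModel()
fast_write(port, 4, 1)
fast_write(port, 7, 1)
fast_write(port, 4, 0)
print(f"{port.out:#010b}")  # prints 0b10000000
```

Each call touches only the addressed bit, which is why a single register write per pin is enough.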
References
- https://www.jameco.com/Jameco/workshop/learning-center/electronic-fundamentals-working-with-led-dot-matrix-displays.html
- http://amigojapan.github.io/Arduino-LED-Matrix-Display/
- ESP32 S3 pinouts
- SAM D21 Family Data Sheet (pdf link)
MOSFETs
Timers, clocks, and multitasking
- Timers in microPython
- Timer with microsecond resolution: Apparently going under 1ms in the ESP32 is not viable. Lucky I went with SAMD21.
- Weightless threads: Very interesting pattern for pseudo-concurrency in microPython.
- Timers on the ESP32: Very useful for non-blocking activation of pins.
- Arduino micros() function with 0.5us precision - using my Timer2_Counter Library
- Help flashing LEDS for specific amount of time using sensor
- What is the fastest way to read/write GPIOs on SAMD21 boards?
- SAMD21 Arduino Timer Example
Direct Port Manipulation
- The Case for Direct Port Manipulation
- Arduino and port manipulation
- A SAMD21 ARM Issue (versus AVR architecture)
- Accessing SAM MCU Registers in C: official guide from Microchip for bare metal C programming in the SAMD21.
- How to access pins on SAMD21 E18A with Arduino Framework on custom board?: using Platform IO (VS Code) and able to use Arduino libraries.
State Machines
- State Machine and Timers, Medium level tutorial
- State Machines for Event-Driven Systems: Very interesting example of the Finite State Machines pattern to handle events in context. Uses pointer-to-functions.
- Introduction to Hierarchical State Machines: An extension of the previous reference. Basically implements a class hierarchy from scratch.
Platform IO
- Programm SAMD21 directly: explains how to modify a platformio.ini.
- Custom Embedded Boards: official docs.
- SAM platforms available in platformIO.
- Samd21 custom board and Arduino framework