5. Final Project: Image Recognition
Image Recognition System (Raspberry Pi Camera)¶
Overview¶
This is the final version of the Image Recognition System built during the FabAcademy period 👍
I created the image recognition program part in Week 11.
You can see it at 0:00 - 0:07 in the following video.
Design with Fusion360¶
I designed the following parts.
They are made of 2.5 mm MDF.
Electric Circuit¶
Connect a tact switch to pin 10 of the Raspberry Pi (physical/BOARD numbering, as used in the program below).
Then connect the Raspberry Pi camera to the Raspberry Pi with a ribbon cable.
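To check the switch wiring before running the full program, a minimal test sketch like the following can be used (it assumes the same BOARD pin 10 and pull-down configuration as the program later on):

import time
import RPi.GPIO as GPIO

GPIO.setwarnings(False)
GPIO.setmode(GPIO.BOARD)  # physical pin numbering
GPIO.setup(10, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)

# The pin should read 1 while the tact switch is pressed, 0 otherwise
while True:
    print(GPIO.input(10))
    time.sleep(0.5)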
Machine Learning¶
I used Teachable Machine for the machine learning.
Teachable Machine lets you train a machine learning model in the browser and use the exported model from Python or JavaScript.
Select Get Started => Image Project
In this case, select the standard image model (224px x 224px, color images), which can be exported in TensorFlow Lite format.
The following screen appears.
Add the images you want recognized to each class.
This time, I had the model recognize the following things.
I imported the images for each class.
What I did this time was to recognize images for the trash separation and the color coding of PET bottle caps below.
Sorting Trash¶
Color-coding of plastic bottle caps¶
Name the classes as follows and train the model.
Export the model in TensorFlow Lite format.
The trained model can identify the following things:
Trash Separation
( Trash that can be turned into charcoal / Trash that cannot be turned into charcoal / plastics )
Color-coding of plastic bottle caps
( White / Blue / Red )
Programming¶
I referred to this video.
# Referenced by https://www.youtube.com/watch?v=EY3OVoh-014
import time
import tensorflow as tf
import numpy as np
import cv2
import RPi.GPIO as GPIO
import requests as req
from imutils.video.pivideostream import PiVideoStream

GPIO.setwarnings(False)
GPIO.setmode(GPIO.BOARD)
GPIO.setup(10, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)

interpreter = tf.lite.Interpreter(model_path="model_unquant.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
target_height = input_details[0]["shape"][1]
target_width = input_details[0]["shape"][2]

f = open("labels.txt", "r")
lines = f.readlines()
f.close()

classes = {}
for line in lines:
    pair = line.strip().split(maxsplit=1)
    classes[int(pair[0])] = pair[1].strip()


def detect(frame):
    resized = cv2.resize(frame, (target_width, target_height))
    input_data = np.expand_dims(resized, axis=0)
    input_data = (np.float32(input_data) - 127.5) / 127.5
    interpreter.set_tensor(input_details[0]["index"], input_data)
    interpreter.invoke()
    detection = interpreter.get_tensor(output_details[0]["index"])
    return detection


def draw_detection(frame, detection):
    for i, s in enumerate(detection[0]):
        tag = f"{classes[i]}: {s*100:.2f}%"
        cv2.putText(frame, tag, (10, 20 + 20 * i),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
    return frame


def main():
    camera = PiVideoStream(resolution=(512, 400)).start()
    time.sleep(2)
    while True:
        frame = camera.read()
        detection = detect(frame)
        value = classes[detection.tolist()[0].index(
            max(detection.tolist()[0]))]
        drawn = draw_detection(frame, detection)
        cv2.imshow("frame", drawn)
        if GPIO.input(10) == GPIO.HIGH:
            request = 'ESP32_IP_Address' + '/' + value
            response = req.get(request)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    camera.stop()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
Explanation of the Code¶
First, I imported the following library modules.
import time
import tensorflow as tf
import numpy as np
import cv2
import RPi.GPIO as GPIO
import requests as req
from imutils.video.pivideostream import PiVideoStream
Module for handling time.
import time
TensorFlow, a library for machine learning.
import tensorflow as tf
Library for fast numerical calculation
import numpy as np
OpenCV, a library for processing images and videos.
import cv2
Library for controlling the Raspberry Pi's GPIO pins from Python.
import RPi.GPIO as GPIO
Python HTTP communication library.
import requests as req
Library that provides threaded video streaming for the PiCamera module (imutils also supports USB cameras).
from imutils.video.pivideostream import PiVideoStream
Use GPIO.setwarnings(False) to disable warnings.
GPIO.setwarnings(False)
This sets the GPIO pin numbering to the physical (BOARD) scheme.
GPIO.setmode(GPIO.BOARD)
Set pin 10 to be an input pin and set initial value to be pulled low
GPIO.setup(10, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)
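Note that GPIO.BOARD refers to the physical pin numbers on the header; physical pin 10 carries GPIO15, so an equivalent setup in BCM numbering would look like this:

# Equivalent setup using BCM (GPIO) numbering instead of BOARD
GPIO.setmode(GPIO.BCM)
GPIO.setup(15, GPIO.IN, pull_up_down=GPIO.PUD_DOWN)  # GPIO15 = physical pin 10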
Load a TFLite model
interpreter = tf.lite.Interpreter(model_path="model_unquant.tflite")
Memory allocation. This is required immediately after model loading.
interpreter.allocate_tensors()
Get the properties of the input and output layers of the training model.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
Get the input tensor shape (height and width) expected by the model.
target_height = input_details[0]["shape"][1]
target_width = input_details[0]["shape"][2]
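Printing the shape is a quick way to confirm what the model expects; for the 224 x 224 color model chosen at export, the output should look roughly like this (the exact printout is illustrative):

print(input_details[0]["shape"])  # e.g. [  1 224 224   3]: batch, height, width, channels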
open() returns a file object, which is used for reading and writing files.
Here, labels.txt is opened read-only and the file object is assigned to the variable f.
f = open("labels.txt", "r")
readlines() reads the entire contents of the file, line by line, into a list
lines = f.readlines()
When a machine learning model is exported from Teachable Machine, labels.txt is exported at the same time.
The content of labels.txt is as follows:
0 charcol
1 notcharcol
2 plastic
As a rule, when you are finished with a file object, you must call its close() method.
If you leave the file open without calling close(), the system still treats it as in use, which can prevent other programs from accessing the same file.
f.close()
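A more idiomatic alternative is a with block, which closes the file automatically even if an exception occurs:

# Equivalent label loading using a context manager; no explicit close() needed
with open("labels.txt", "r") as f:
    lines = f.readlines()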
Assign an empty dictionary to the variable classes.
classes = {}
Each of the three lines in labels.txt is added to classes as a {key: value} pair:
0 charcol
1 notcharcol
2 plastic
for line in lines:
    pair = line.strip().split(maxsplit=1)
    classes[int(pair[0])] = pair[1].strip()
The result is the following dictionary:
classes = {
    0: "charcol",
    1: "notcharcol",
    2: "plastic",
}
Next, I will explain the inside of the main function.
def main():
    camera = PiVideoStream(resolution=(512, 400)).start()
    time.sleep(2)
    while True:
        frame = camera.read()
        detection = detect(frame)
        value = classes[detection.tolist()[0].index(
            max(detection.tolist()[0]))]
        drawn = draw_detection(frame, detection)
        cv2.imshow("frame", drawn)
        if GPIO.input(10) == GPIO.HIGH:
            request = 'ESP32_IP_Address' + '/' + value
            response = req.get(request)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    camera.stop()
    cv2.destroyAllWindows()
Open the camera stream. PiVideoStream is a threaded video-stream class for the PiCamera, and time.sleep(2) gives the camera sensor time to warm up before reading frames.
camera = PiVideoStream(resolution=(512, 400)).start()
time.sleep(2)
The processing is looped inside while True:
while True:
    frame = camera.read()
    detection = detect(frame)
    value = classes[detection.tolist()[0].index(
        max(detection.tolist()[0]))]
    drawn = draw_detection(frame, detection)
    cv2.imshow("frame", drawn)
    if GPIO.input(10) == GPIO.HIGH:
        request = 'ESP32_IP_Address' + '/' + value
        response = req.get(request)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
camera.stop()
cv2.destroyAllWindows()
read() retrieves a single frame (photo) from the camera stream.
frame = camera.read()
detection = detect(frame)
The contents of detect() are as follows
def detect(frame):
    resized = cv2.resize(frame, (target_width, target_height))
    input_data = np.expand_dims(resized, axis=0)
    input_data = (np.float32(input_data) - 127.5) / 127.5
    interpreter.set_tensor(input_details[0]["index"], input_data)
    interpreter.invoke()
    detection = interpreter.get_tensor(output_details[0]["index"])
    return detection
This resizes the image to target_width (width) by target_height (height):
resized = cv2.resize(frame, (target_width, target_height))
np.expand_dims() adds a new dimension of size 1 at axis 0, because the model expects a batch dimension.
input_data = np.expand_dims(resized, axis=0)
Normalize each RGB pixel value from the 0 ~ 255 range into the range -1 to 1.
input_data = (np.float32(input_data) - 127.5) / 127.5
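A quick check of this arithmetic on the boundary values (using the same np.float32 idiom as the code):

# 0 maps to -1.0, 127.5 to 0.0, and 255 to 1.0
print((np.float32([0.0, 127.5, 255.0]) - 127.5) / 127.5)  # -> [-1.  0.  1.]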
Copy the input data into the interpreter's input tensor at the given index.
interpreter.set_tensor(input_details[0]["index"], input_data)
Run inference to predict the classification results.
interpreter.invoke()
The inference results are read from the output tensor at the index given in output_details.
detection = interpreter.get_tensor(output_details[0]["index"])
Returns the value of detection
return detection
Pick the class with the highest score from classes.
value = classes[detection.tolist()[0].index(
    max(detection.tolist()[0]))]
classes = {
    0: "charcol",
    1: "notcharcol",
    2: "plastic",
}
For example, if charcol is 90%, notcharcol 5%, and plastic 5%, then charcol is selected.
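The same selection can be written more compactly with np.argmax, which returns the index of the largest score directly:

# Equivalent to the index()/max() pair above
value = classes[int(np.argmax(detection[0]))]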
drawn = draw_detection(frame, detection)
The contents of draw_detection() are as follows:
def draw_detection(frame, detection):
    for i, s in enumerate(detection[0]):
        tag = f"{classes[i]}: {s*100:.2f}%"
        cv2.putText(frame, tag, (10, 20 + 20 * i),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
    return frame
Retrieve index numbers and scores one by one
for i, s in enumerate(detection[0]):
Format each score as a percentage out of 100, such as charcol: 90.00%.
tag = f"{classes[i]}: {s*100:.2f}%"
cv2.putText(img, text, org, fontFace, fontScale, color, thickness)
img: OpenCV image
text: the text string to draw
org: the coordinates of the lower-left corner of the text, as (x, y)
fontFace: the font; only a few types can be specified, such as cv2.FONT_HERSHEY_SIMPLEX
fontScale: the font scale factor
color: the text color, as (blue, green, red)
thickness: the line thickness; optional, default value is 1
cv2.putText(frame, tag, (10, 20 + 20 * i),cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
Returns the value of frame
return frame
Display the image in a window.
The first argument is the window name as a string; the second argument is the image to display. Multiple windows can be shown as needed, but each window must have a different name.
cv2.imshow("frame", drawn)
When the button is pressed, an HTTP GET request is sent to the ESP32 (which is listening as a web server) by appending / plus value (charcol, notcharcol, or plastic) to its address.
if GPIO.input(10) == GPIO.HIGH:
    request = 'ESP32_IP_Address' + '/' + value
    response = req.get(request)
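Note that the requests library expects a full URL including the scheme, so the placeholder above stands for something like http:// plus the ESP32's address. A slightly more defensive sketch (the address and timeout here are assumptions, not from the original program):

ESP32_URL = "http://192.168.0.50"  # placeholder; use the ESP32's actual IP address

try:
    # e.g. GET http://192.168.0.50/charcol
    response = req.get(f"{ESP32_URL}/{value}", timeout=2)
    response.raise_for_status()
except req.RequestException as e:
    print(f"ESP32 request failed: {e}")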
Exit the while loop when the q key is pressed.
if cv2.waitKey(1) & 0xFF == ord("q"):
    break
The code inside this if block runs only when the module is executed directly, not when it is imported.
if __name__ == "__main__":
    main()
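A tiny hypothetical module makes this concrete: greet() runs with python demo.py but not with import demo.

# demo.py -- hypothetical module illustrating the __name__ guard
def greet():
    print("hello")

if __name__ == "__main__":
    greet()  # executed only when run directly, not on import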
Packaging¶
The upper section can be bent by kerf bending as shown below.
Insert it into the kerf bending gap in the column supporting the sorter.
Insert it into the kerf bending gap of the table on which the sorter is placed.
The ribbon cable connecting the camera to the Raspberry Pi is threaded through the camera stand.
I attached the button by drilling a hole in the kerf-bent MDF as shown below.
I drilled the following holes for access to the Raspberry Pi connectors.