Robotics
Robots have become part of our daily world, from serving in restaurants to manufacturing parts for planes. Their evolution is fascinating: researchers and engineers are building humanoids that can talk, hydraulically powered rescue robots such as those developed at Boston Dynamics, and even robotaxis! However, robots need specific code loaded onto their control hub, and this code has entered a new era of Machine Learning.
What is Machine Learning?
Machine Learning capabilities in the real world.
Machine learning resembles human learning: an algorithm trains a model on data, and the trained model can then make more accurate predictions about future occurrences. Machine Learning is becoming the new era of code, as Generative AI can create content, designs, and code, and can even influence business decisions.
However, machine learning is not that simple.
Libraries of ML
Pandas:
Helps coders load, analyze, and manipulate data, and simplifies that data into streamlined tables that are easy to read, even for a large training set.
Keras:
Simplifies the development of neural networks and allows rapid prototyping on Central Processing Units (CPUs) and Graphics Processing Units (GPUs). This is important because the speed of an ML model depends heavily on the chips it runs on.
TensorFlow:
A major library for building and training ML models that is compatible with GPUs and Tensor Processing Units (TPUs), Google’s circuits for boosting ML workloads. Frameworks like TensorFlow are used to train the kind of perception models that self-driving cars rely on.
Keras and TensorFlow are often used together: Keras is the high-level interface for defining and training neural networks, while TensorFlow provides the scalability, efficiency, and processing power underneath. For example, when scaling data processing, you can use tf.data and Keras preprocessing layers to create a data pipeline. Application Programming Interfaces (APIs) allow different pieces of software to interact, and both libraries expose them, such as the Keras Sequential API for building and training neural networks.
Scikit-learn:
A library of machine learning tools, built on NumPy and SciPy and often used alongside Matplotlib, that covers predicting values and categories (regression and classification), grouping data points together (clustering), evaluating models, and many other data and model functions.
Matplotlib:
A library for visualizing data as graphs, charts, and other plots.
SciPy:
A library of scientific computing routines, such as optimization, statistics, and curve fitting, that brings regression and other math into Machine Learning workflows.
NumPy:
A library for fast numerical arrays that manages large amounts of data and underlies most Machine Learning libraries.
OpenCV:
A library used for computer vision. For example, a camera system could use it to determine whether a person has littered on the road.
How to make an ML model?
A simplified example of an ML model using Scikit-learn
Data Preparation
Collecting data is crucial because it is what your model will be trained on in order to give you accurate predictions.
Make sure to understand what kind of data you have, whether it is images, text, tabular data, etc.
Clean the data by filling in or removing missing values and removing duplicates.
Finally, split the data into training and testing sets. The model learns patterns and features from the training data, while the testing data is unseen data used to evaluate how well the model performs its intended function.
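The cleaning and splitting steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names clean_rows and split_train_test are hypothetical, missing values are represented by None, and filling with the column mean is just one possible cleaning strategy. In practice a library such as Pandas or Scikit-learn would usually do this work.

```python
import random

def clean_rows(rows):
    """Drop exact duplicates and fill missing (None) values with the column mean."""
    # Remove duplicates while keeping the original order.
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    # Compute the mean of each column from the values that are present.
    means = []
    for c in range(len(unique[0])):
        present = [row[c] for row in unique if row[c] is not None]
        means.append(sum(present) / len(present))
    # Replace each missing value with the mean of its column.
    return [tuple(means[c] if v is None else v for c, v in enumerate(row))
            for row in unique]

def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as the testing set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Made-up data: one duplicate row and one missing value.
raw = [(1.0, 2.0), (1.0, 2.0), (3.0, None), (5.0, 6.0), (7.0, 8.0), (9.0, 10.0)]
cleaned = clean_rows(raw)
train, test = split_train_test(cleaned)
print(len(train), "training rows,", len(test), "testing rows")
```

The fixed seed makes the shuffle repeatable, so the same rows land in the testing set on every run.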
Select a Model
Choose an algorithm for your ML model, such as linear regression, neural networks, or decision trees, depending on the function you want.
Neural networks are used in robotics because they are algorithms loosely modeled on how the human brain works. Tasks such as computer vision and prediction can be done with neural networks.
Linear regression finds the line that best fits the given data and uses it to predict continuous values.
Decision trees make decisions through a series of branching questions, where each level of the tree can weigh factors such as outcomes, risks, resource costs, and probabilities.
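To make the idea of a best-fit line concrete, here is a minimal sketch of simple linear regression using the standard least-squares formulas. The function name fit_line and the sample data are made up for illustration; a library such as Scikit-learn would normally handle this step.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # these points lie exactly on y = 2x + 1
slope, intercept = fit_line(xs, ys)

# Use the fitted line to predict a continuous value for a new input.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```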
Train the Model
Fit the model to the training data, customizing the batch size (how many samples are processed before the model updates) and the number of epochs (how many full passes are made over the training data).
Overfitting and underfitting are problems that can occur during this process: the model resembles the training data either too closely or not closely enough, respectively, so it cannot give accurate predictions on the testing data.
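The roles of batch size and epochs can be seen in a small training loop. The sketch below fits a line with mini-batch gradient descent in plain Python. The function name train, the learning rate, and the data are assumptions chosen for illustration, not settings from any particular library.

```python
def train(xs, ys, epochs=200, batch_size=2, learning_rate=0.05):
    """Fit y = w*x + b with mini-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):  # one epoch = one full pass over the training data
        for start in range(0, len(xs), batch_size):
            batch_x = xs[start:start + batch_size]  # one batch of inputs
            batch_y = ys[start:start + batch_size]
            # Gradient of the mean squared error over this batch.
            errors = [(w * x + b) - y for x, y in zip(batch_x, batch_y)]
            grad_w = 2 * sum(e * x for e, x in zip(errors, batch_x)) / len(batch_x)
            grad_b = 2 * sum(errors) / len(batch_x)
            # The model updates once per batch.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Made-up training data that follows y = 2x + 1.
w, b = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3))
```

With too few epochs the line has not yet settled on the data, which is a simple picture of underfitting.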
Assessing the Model
Use the testing data to evaluate the model’s accuracy.
Metrics: Assess accuracy, precision, recall, F1 score, mean squared error, or other relevant metrics, depending on the model’s purpose.
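As a rough sketch of how these metrics are computed, the functions below work on plain lists of labels where 1 is the positive class and 0 the negative class. The function names and sample labels are made up for illustration. Scikit-learn provides equivalent, more complete implementations.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for models that predict continuous values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

metrics = classification_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1])
mse = mean_squared_error([2.0, 4.0], [1.0, 6.0])
print(metrics, mse)
```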
Modifying the Model
Change algorithms, update the training/testing data as new data comes out, cross-validate to improve the model’s performance, etc.
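Cross-validation, mentioned above, repeatedly holds out a different slice of the data for validation and trains on the rest. A minimal sketch of how k-fold splitting divides the sample indices is shown below. The function name k_fold_indices is hypothetical, and real projects would typically shuffle the data first and use a library routine.

```python
def k_fold_indices(n_samples, k):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds receive one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        training = indices[:start] + indices[stop:]
        yield training, validation
        start = stop

folds = list(k_fold_indices(n_samples=10, k=3))
for training, validation in folds:
    print(len(training), "training samples,", len(validation), "validation samples")
```

Every sample is used for validation exactly once, so averaging the score across folds gives a steadier estimate than a single split.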
Deploying the Model
Store the trained model for future use and integrate it into an environment where it continuously performs its function on new, unseen data.
Environments for building and running models include Jupyter Notebook, Google Colab, and local IDEs.
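One simple way to store a trained model in Python is the standard pickle module. The sketch below assumes the "model" is just a dictionary holding the slope and intercept of a fitted line. That dictionary, the file name, and the predict helper are all made up for illustration, and ML libraries usually provide their own save and load functions.

```python
import os
import pickle
import tempfile

def predict(model, x):
    """Apply a stored line model (slope and intercept) to a new input."""
    return model["slope"] * x + model["intercept"]

# Hypothetical trained model: the parameters of the line y = 2x + 1.
trained_model = {"slope": 2.0, "intercept": 1.0}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "line_model.pkl")
    # Store the trained model on disk.
    with open(path, "wb") as f:
        pickle.dump(trained_model, f)
    # Later, for example in the deployed application, load it back.
    with open(path, "rb") as f:
        loaded_model = pickle.load(f)

# Use the loaded model on new inputs.
predictions = [predict(loaded_model, x) for x in [10.0, 20.0]]
print(predictions)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.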
What are Robots?
Robots are machines that sense, process, and actuate in order to interact with their environment.
Sensing
Sensing is typically done by sensors: cameras to see the surrounding environment, sound sensors that use sound waves to detect nearby obstacles, and other sensors for pressure, temperature, etc.
Processing
Processing is the next step after sensing: the robot converts the input from its sensors into data that is sent to the control hub to be processed.
Actuating
Finally, actuating is the output once processing is complete: the robot interacts with the environment and performs a task, whether that means stopping, driving, etc.
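The sense, process, and actuate cycle can be sketched as a small simulated loop. Everything here is hypothetical: the sensor readings are a made-up list of distances, and the 0.5 meter stop threshold and 0.2 meter step size are arbitrary values chosen for illustration.

```python
def sense(distance_readings, step):
    """Sensing: return the distance (in meters) to the nearest obstacle."""
    return distance_readings[step]

def process(distance_m, stop_threshold_m=0.5):
    """Processing: turn the sensor reading into a command."""
    return "stop" if distance_m < stop_threshold_m else "drive"

def actuate(command, position_m, step_size_m=0.2):
    """Actuating: apply the command and return the new position."""
    return position_m + step_size_m if command == "drive" else position_m

readings = [2.0, 1.2, 0.8, 0.4, 0.4]  # simulated sensor values
position = 0.0
commands = []
for step in range(len(readings)):
    distance = sense(readings, step)
    command = process(distance)
    position = actuate(command, position)
    commands.append(command)

print(commands, round(position, 2))
```

In an ML-driven robot, the process step is where a trained model would replace the fixed threshold.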
Why are Robots Important?
Robots are rapidly advancing in our new world of AI and changing that world before we know it. They are used in factories to automate the building of cars, plane parts, and other manufactured goods. Furthermore, robots have been improving the way cars move, watering plants, and even serving food at restaurants. Their ability to carry out human tasks allows them to be consistently efficient and cheap.
How are Robots Used with ML Models?
Robots have become much more complex with ML models, from mapping out roads to identifying objects while operating. Machine learning code for robots often needs to be optimized for performance, real-time decision-making, and hardware constraints.
Example Use Case:
For example, a robot that relies on cameras has to navigate a world of obstacles in houses, factories, etc.
A possible way to integrate ML into a robotics application is to use Python with TensorFlow and supporting libraries for the object detection model, and ROS (Robot Operating System) as the robotics framework.
Packages Installation
Installing the necessary packages
Loading a pre-trained model
We import NumPy because it can handle large amounts of data (needed when detecting objects so the robot gets accurate predictions) and cv2, which is used for computer vision. TensorFlow is used to analyze the data coming in from the cameras and run the model on that new data.
After loading the pre-trained model (already trained with the training and testing data sets), we define the function that detects objects in the camera input.
Here we integrate ROS for real-time operation by first importing the necessary software. CvBridge converts between ROS image messages and OpenCV images. sensor_msgs.msg provides common message definitions for robot sensors, which facilitate the communication of sensor data between nodes. The “Image” message represents image data from a camera together with its metadata: height, width, encoding type (which lets algorithms interpret the data), and the image data itself. Finally, rospy is the Python client library for ROS.
We then define a node and connect ROS to the camera. In “image_callback”, the CvBridge converts each ROS message into an OpenCV image (cv_image), objects are detected in cv_image, and the processed image is finally published.
Here is the full code:
import rospy
import tensorflow as tf  # needed for tf.saved_model.load below
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

class ObjectDetectionNode:
    def __init__(self):
        self.bridge = CvBridge()
        self.model = tf.saved_model.load('ssd_mobilenet_v2_fpnlite/saved_model')
        rospy.init_node('object_detection_node', anonymous=True)
        # Create the publisher before subscribing so the callback can always use it
        self.detection_pub = rospy.Publisher('/detections', Image, queue_size=10)
        rospy.Subscriber('/camera/rgb/image_raw', Image, self.image_callback)

    def image_callback(self, msg):
        # Convert ROS image message to OpenCV format
        cv_image = self.bridge.imgmsg_to_cv2(msg, "bgr8")
        # Detect objects
        detections = self.detect_objects(cv_image)
        # Process and publish detections
        detection_image = self.visualize_detections(cv_image, detections)
        detection_msg = self.bridge.cv2_to_imgmsg(detection_image, "bgr8")
        self.detection_pub.publish(detection_msg)
In the first definition, detect_objects, the image data is converted into a TensorFlow tensor, a new axis is added to create a batch dimension (TensorFlow expects inputs in batches), and the image tensor is passed into the model, which returns detection results such as bounding boxes, class labels, and confidence scores.
The visualize_detections portion processes the detections, overlays visual elements (like bounding boxes) on the image, and returns the modified image, essentially putting the data into a human-understandable form.
Finally, the last part keeps the node running as incoming data continues to arrive. The ObjectDetectionNode handles the detection and visualization process, while rospy.spin() keeps the ROS node alive so that it continues to process incoming messages. The except clause catches the exception raised when the ROS node is interrupted.
Here is the code:
    # These two methods belong to the ObjectDetectionNode class
    def detect_objects(self, image):
        # Convert the image to a tensor and add a batch dimension
        input_tensor = tf.convert_to_tensor(image)
        input_tensor = input_tensor[tf.newaxis, ...]
        detections = self.model(input_tensor)
        return detections

    def visualize_detections(self, image, detections):
        # Visualization logic here (e.g., draw bounding boxes)
        for detection in detections:
            # Process detections and overlay on image
            pass
        return image

if __name__ == '__main__':
    try:
        node = ObjectDetectionNode()
        rospy.spin()
    except rospy.ROSInterruptException:
        pass
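The visualization logic is left as a placeholder in the code above. One possible way to fill it in is sketched here under explicit assumptions: the model is assumed to return boxes as normalized [ymin, xmin, ymax, xmax] values alongside matching lists of scores and class IDs, which is the convention used by models from the TensorFlow Object Detection API. Plain Python lists stand in for the model output, and the function name boxes_to_pixels and the 0.5 score threshold are hypothetical.

```python
def boxes_to_pixels(boxes, scores, classes, image_height, image_width, min_score=0.5):
    """Keep confident detections and convert normalized boxes to pixel corners."""
    results = []
    for (ymin, xmin, ymax, xmax), score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue  # skip low-confidence detections
        # Scale normalized coordinates (0 to 1) up to pixel coordinates.
        top_left = (round(xmin * image_width), round(ymin * image_height))
        bottom_right = (round(xmax * image_width), round(ymax * image_height))
        results.append({"class_id": class_id, "score": score,
                        "top_left": top_left, "bottom_right": bottom_right})
    return results

# Stand-in values for one confident and one low-confidence detection.
boxes = [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]]
scores = [0.9, 0.3]
classes = [1, 18]
pixel_boxes = boxes_to_pixels(boxes, scores, classes,
                              image_height=480, image_width=640)
print(pixel_boxes)
```

Each pair of corners could then be passed to OpenCV’s cv2.rectangle to draw the box on the image before it is published.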
Conclusion
In conclusion, ML is making robots more capable and more complex: they can make predictions, classify images, and operate autonomously, and this is going to be a huge step for humanity.
Sources:
Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. 2nd ed., O’Reilly Media, 2019.
Kaehler, Adrian, and Gary Bradski. Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library. 2nd ed., O’Reilly Media, 2016.
Quigley, Morgan, et al. Programming Robots with ROS: A Practical Introduction to the Robot Operating System. O’Reilly Media, 2015.
Robotics Casual. “ROS Tutorial: How to use OpenCV in a Robot Pick and Place task for Computer Vision.” Robotics Casual, 2022, https://roboticscasual.com/ros-tutorial-how-to-use-opencv-in-a-robot-pick-and-place-task-for-computer-vision/. Accessed 12 Aug. 2024.
ROS Wiki Contributors. “cv_bridge/Tutorials.” ROS Wiki, ROS, 1 Feb. 2011, http://wiki.ros.org/cv_bridge/Tutorials. Accessed 12 Aug. 2024.