Course Description
Train and run machine learning models in JavaScript! Learn the fundamentals of machine learning and use pre-trained models to add image recognition with only a few lines of code using TensorFlow.js. Train custom models with input from the webcam. Learn how to optimize and improve model accuracy and get creative with object, gesture, and audio recognition.
This course and others like it are available as part of our Frontend Masters video subscription.
Course Details
Published: June 19, 2024
Table of Contents
Introduction
Section Duration: 13 minutes
- Charlie Gerard begins the course by sharing some creative projects created with TensorFlow.js. These projects involve image, audio, and body movement recognition and range from an application that detects if an object is recyclable to a Street Fighter game controlled by physically performing the movements.
- Charlie discusses the differences between Machine Learning and Artificial Intelligence. There are four types of machine learning methods: supervised, unsupervised, semi-supervised, and reinforcement learning.
Pre-Trained Models
Section Duration: 50 minutes
- Charlie explains that pre-trained models are already trained, tested, and ready for use in applications. These models include image recognition, text classification, and sentiment analysis. Factors that influence model selection include the quality and quantity of the training data and whether you have access to the list of classes and labels within the model.
- Charlie uses the coco-ssd pre-trained model in TensorFlow.js to build a simple application that predicts the content of a selected image; a minimal code sketch of this follows this section's lesson list. The prediction results can vary widely based on the chosen image and the data set used to train the model.
- Charlie expands the object detection project to use the webcam as the input source. Buttons are used to start the camera, capture an image, and run the prediction code.
- Charlie uses a model to detect faces from the captured webcam image. When faces are recognized, the model provides an array of data for each face. The data includes the coordinate locations for the face and other key points like the eyes, nose, and mouth.
- Charlie uses the data from the face detection model to draw a rectangle around the detected face. A utility function is used to draw the rectangle on a canvas element; the second sketch after this lesson list illustrates the idea.
- Charlie shares an alternative face detection model that provides more face data. She also demonstrates her face detection projects to highlight other ways TensorFlow.js can be used.
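Below is a minimal sketch of the kind of coco-ssd prediction code this section builds, assuming the TensorFlow.js and coco-ssd libraries are loaded via script tags; the element IDs are illustrative and not the course's exact markup.

```js
// Assumes <script> tags for @tensorflow/tfjs and @tensorflow-models/coco-ssd,
// plus <img id="image"> and <p id="result"> elements on the page (illustrative IDs).
const img = document.getElementById('image');
const resultEl = document.getElementById('result');

async function detectObjects() {
  const model = await cocoSsd.load();          // load the pre-trained model
  const predictions = await model.detect(img); // run detection on the image

  // Each prediction includes a class label, a confidence score, and a bounding box.
  resultEl.textContent = predictions
    .map((p) => `${p.class} (${(p.score * 100).toFixed(1)}%)`)
    .join(', ');
}

detectObjects();
```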
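The face detection and rectangle-drawing lessons follow a similar pattern; here is a hedged sketch using the BlazeFace model (the course may use a different detector or its own drawing utility, and the element IDs are illustrative).

```js
// Assumes <script> tags for @tensorflow/tfjs and @tensorflow-models/blazeface,
// plus <video id="webcam"> and <canvas id="overlay"> elements (illustrative IDs).
const video = document.getElementById('webcam');
const canvas = document.getElementById('overlay');
const ctx = canvas.getContext('2d');

async function detectFaces() {
  const model = await blazeface.load();
  const faces = await model.estimateFaces(video, false); // false: return plain arrays

  ctx.clearRect(0, 0, canvas.width, canvas.height);
  faces.forEach((face) => {
    // topLeft and bottomRight are [x, y] corners of the bounding box.
    const [x, y] = face.topLeft;
    const [x2, y2] = face.bottomRight;
    ctx.strokeStyle = 'red';
    ctx.lineWidth = 2;
    ctx.strokeRect(x, y, x2 - x, y2 - y);
  });
}
```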
Transfer Learning
Section Duration: 1 hour, 2 minutes
- Charlie introduces transfer learning, a technique that adapts a pre-trained model to a new task by training it on new data. For example, an image classification model could be given a new set of image data to recognize a different set of objects. The Teachable Machine website is demonstrated, which allows users to create classes of data and train a model. Once the model is trained, it can be tested in the browser.
- Charlie uses Teachable Machine to train an image recognition model to determine if a user tilts their head to the left or right. The model is downloaded and added to the project-2 files.
- Charlie imports the custom prediction model from Teachable Machine into the application. The paths for the model and metadata files are provided, and the webcam is initialized.
- Charlie implements the loop function, which continuously evaluates the live webcam image and makes predictions using the model data; the first sketch after this lesson list shows this kind of loop. After an initial test, a larger training data set is added to improve the accuracy of the results.
- Charlie spends a few minutes sharing other ideas for training models. Machine Learning can be combined with hardware or other platforms to create more creative use cases.
- Charlie demonstrates how to train an audio model in Teachable Machine. First, a background noise sample is captured. Then, samples for each sound are created. The model creates an image representation of each audio sample and uses that to recognize input audio; the second sketch after this lesson list shows how such a model is typically loaded and used.
- Charlie shares a few applications she created that use Machine Learning and audio models.
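As a rough guide to the Teachable Machine lessons above, here is a minimal sketch of loading an exported image model and running a prediction loop; the file paths are illustrative, and the real project structure may differ.

```js
// Assumes <script> tags for @tensorflow/tfjs and @teachablemachine/image.
const MODEL_URL = './my-model/model.json';       // illustrative path
const METADATA_URL = './my-model/metadata.json'; // illustrative path

let model;
let webcam;

async function init() {
  model = await tmImage.load(MODEL_URL, METADATA_URL);

  webcam = new tmImage.Webcam(200, 200, true); // width, height, flip horizontally
  await webcam.setup();
  await webcam.play();
  document.body.appendChild(webcam.canvas);

  window.requestAnimationFrame(loop);
}

async function loop() {
  webcam.update(); // grab the latest frame
  const predictions = await model.predict(webcam.canvas);
  const best = predictions.reduce((a, b) => (a.probability > b.probability ? a : b));
  console.log(best.className, best.probability.toFixed(2));
  window.requestAnimationFrame(loop);
}

init();
```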
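Audio models exported from Teachable Machine are typically used through the speech-commands recognizer; a hedged sketch follows, with illustrative paths and labels.

```js
// Assumes <script> tags for @tensorflow/tfjs and @tensorflow-models/speech-commands.
const MODEL_URL = './my-audio-model/model.json';       // illustrative path
const METADATA_URL = './my-audio-model/metadata.json'; // illustrative path

async function listen() {
  const recognizer = speechCommands.create('BROWSER_FFT', undefined, MODEL_URL, METADATA_URL);
  await recognizer.ensureModelLoaded();

  const labels = recognizer.wordLabels(); // e.g. ['Background Noise', 'Clap', 'Whistle']

  recognizer.listen(
    (result) => {
      // result.scores holds one probability per label, in the same order as labels.
      const scores = Array.from(result.scores);
      const topIndex = scores.indexOf(Math.max(...scores));
      console.log(labels[topIndex], scores[topIndex].toFixed(2));
    },
    { probabilityThreshold: 0.75, overlapFactor: 0.5 }
  );
}

listen();
```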
Train a Model in the Browser
Section Duration: 1 hour
- Charlie begins a project to train a model in the browser. The TensorFlow libraries are loaded into the application, and some initial logic to show the webcam and controls is added.
- Charlie codes the UI logic for capturing sample data from the webcam. When the record button is pressed, an image is captured from the webcam, processed by TensorFlow, and concatenated with the previously captured examples. Label data is also stored alongside the image data to provide a complete training data set.
- Charlie creates layers, the primary building blocks for constructing a model. Each layer typically performs some computation to transform its input into its output. Layers automatically create and initialize the internal variables and weights they need to function.
- Charlie finishes the code that prepares the model for predictions. An optimizer is coded to prepare the model for training and evaluation, and the fit method is used to train the model for a specific number of iterations through the dataset; the sketch after this lesson list outlines this pattern.
- Charlie tests the application by capturing training data from the webcam in the browser. The model is trained, and then live images from the webcam are used to make predictions.
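The layers/optimizer/fit pattern described above looks roughly like the following sketch; the layer sizes, class count, and hyperparameters are illustrative, not the course's exact values.

```js
import * as tf from '@tensorflow/tfjs';

const NUM_CLASSES = 2;     // illustrative: e.g. two gestures to distinguish
const NUM_FEATURES = 1024; // illustrative: size of each processed webcam example

// Layers transform their input into their output and manage their own weights.
const model = tf.sequential({
  layers: [
    tf.layers.dense({ inputShape: [NUM_FEATURES], units: 100, activation: 'relu' }),
    tf.layers.dense({ units: NUM_CLASSES, activation: 'softmax' }),
  ],
});

// The optimizer and loss prepare the model for training and evaluation.
model.compile({
  optimizer: tf.train.adam(0.0001),
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});

// xs: concatenated webcam examples, ys: one-hot labels collected while recording.
async function train(xs, ys) {
  await model.fit(xs, ys, {
    epochs: 10, // number of passes through the captured dataset
    batchSize: 16,
    shuffle: true,
  });
}
```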
Creating an Image Classification Model
Section Duration: 50 minutes
- Charlie introduces the third project, which is an application that detects what shape is drawn by the mouse on a canvas element in the browser. The project files and initial code for drawing on a canvas element are explained.
- Charlie implements the clear button. When the button is clicked, anything drawn on the canvas is removed and the paragraph element where the prediction text will be displayed is cleared.
- Charlie builds a dataset of images for two different shapes. The dataset is split into images used for training and testing. She also begins coding a Node.js module to load the images and prepare them for processing by TensorFlow.
- Charlie finishes the Node.js module, which will build and test the model to detect the drawn image. Methods are created for getting the training and testing data and returning Tensor objects with the images and labels.
- Charlie codes the layers for the image model. Rather than chaining the layers inside the TensorFlow sequential method, the "add" method adds layers one by one to the model. Some advice for selecting and experimenting with alternative layers is also included in this lesson.
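A hedged sketch of building the image model by adding layers one at a time with `add`; the input shape, filter counts, and units are illustrative rather than the course's exact choices.

```js
const tf = require('@tensorflow/tfjs-node');

const IMAGE_SIZE = 28;  // illustrative: width/height of the training images
const NUM_CLASSES = 2;  // illustrative: number of shapes to recognize

const model = tf.sequential();

// Convolution and pooling layers extract visual features from the drawings.
model.add(tf.layers.conv2d({
  inputShape: [IMAGE_SIZE, IMAGE_SIZE, 1], // grayscale input
  kernelSize: 3,
  filters: 16,
  activation: 'relu',
}));
model.add(tf.layers.maxPooling2d({ poolSize: 2, strides: 2 }));

// Flatten the feature maps and classify into one of the shape labels.
model.add(tf.layers.flatten());
model.add(tf.layers.dense({ units: NUM_CLASSES, activation: 'softmax' }));

module.exports = { model };
```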
Training & Testing the Classification Model
Section Duration: 36 minutes
- Charlie creates the module to execute the loading of the images and the training and testing of the model. Loss and accuracy metrics are logged to the console to highlight how well the model performs based on the inputs.
- Charlie writes the model data to a directory in the project; the sketch after this lesson list outlines this flow. The program is executed a few times to see if the accuracy improves. Additional model directories can be created to save several versions with varying accuracies.
- Charlie demonstrates how the `model.summary()` method can provide additional details and help visualize the model's structure. Layer data is displayed, and the input and output shapes can be compared.
- Charlie implements the frontend code to run the prediction code when the predict button is clicked. The label of the predicted shape is displayed on the screen. Some additional tips for improving accuracy are also shared.
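A hedged sketch of the training, logging, and saving flow in this section; the function and tensor names, epoch count, and save path are illustrative.

```js
const tf = require('@tensorflow/tfjs-node');

// `model` is the layers model built earlier; `trainImages` and `trainLabels`
// come from the data-loading module (names are illustrative).
async function trainAndSave(model, trainImages, trainLabels) {
  model.summary(); // prints each layer with its output shape and parameter count

  model.compile({
    optimizer: tf.train.adam(),
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
  });

  await model.fit(trainImages, trainLabels, {
    epochs: 20, // illustrative
    callbacks: {
      // Log loss and accuracy after every epoch to see how training progresses.
      onEpochEnd: (epoch, logs) => {
        const acc = logs.acc ?? logs.accuracy;
        console.log(`Epoch ${epoch}: loss=${logs.loss.toFixed(4)} accuracy=${acc.toFixed(4)}`);
      },
    },
  });

  // Save a versioned copy so runs with different accuracies can be compared.
  await model.save('file://./model-v1');
}

module.exports = { trainAndSave };
```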
Wrapping Up
Section Duration: 7 minutes
- Charlie wraps up the course, describing how the techniques used in the three projects from the course are related to the experiments she has posted on her website. She also shares a few real-world uses of TensorFlow and some additional learning resources.