Machine Learning in JavaScript with TensorFlow.js

Face Detection Demos and Q&A

Charlie Gerard

Socket

The "Face Detection Demos and Q&A" lesson is part of the full Machine Learning in JavaScript with TensorFlow.js course. Here's what you'd learn in this lesson:

Charlie shares an alternative face detection model that provides more face data. She also demonstrates her face detection projects to highlight other ways TensorFlow.js can be used.

Transcript from the "Face Detection Demos and Q&A" Lesson

[00:00:00]
>> But even better, I think, is the face landmarks detection. Okay, so here, you can see again it's a MediaPipe model and it gives you a face mesh. So you can also use that one and it gives you a lot more points around your face. Every point between the lines here is actually a key point that you can get data from, x and y, in terms of coordinates on the screen.

[00:00:27]
And I believe that with this model, you can do some calculation where you see really the movement of the eye. Also, I'm showing my projects because I know how I built them, so if you have any questions, I can answer them. But there's more people building things like this.

[00:00:42]
But with that kind of model you can build prototypes. I tried to build a hands-free coding tool that uses TensorFlow.js, and I'm just gonna play this, there shouldn't be any sound. But it's TensorFlow.js that's looking at where my eyes are going. There's a bit of logic there.

[00:00:59]
It's basically able to isolate the white part of your eye and where your iris actually is. And then you can do a delta between the current position and the next position to tell if it's right or left. And then you can start thinking about, well, what can you do with that?
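
The delta idea described here could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the input shape, and the 0.05 dead zone are assumptions, and it presumes the x coordinates of the eye corners and the iris have already been read from the model's key points.

```javascript
// Hypothetical helpers; none of these names come from TensorFlow.js.

// Where the iris sits between the two eye corners, from 0 (one corner)
// to 1 (the other corner). Inputs are x coordinates in image pixels.
function irisRatio({ cornerAX, cornerBX, irisX }) {
  const minX = Math.min(cornerAX, cornerBX);
  const maxX = Math.max(cornerAX, cornerBX);
  const width = maxX - minX;
  if (width === 0) return 0.5; // degenerate eye box: treat as centered
  return (irisX - minX) / width;
}

// Compare the previous and current ratio. Small changes inside the
// dead zone (an assumed value) are ignored so jitter does not trigger
// a selection. Directions are in image coordinates, so a mirrored
// webcam feed would flip them.
function directionFromDelta(previousRatio, currentRatio, deadZone = 0.05) {
  const delta = currentRatio - previousRatio;
  if (Math.abs(delta) < deadZone) return "none";
  return delta > 0 ? "right" : "left";
}

const before = irisRatio({ cornerAX: 100, cornerBX: 140, irisX: 120 });
const after = irisRatio({ cornerAX: 100, cornerBX: 140, irisX: 130 });
console.log(directionFromDelta(before, after)); // "right"
```

Normalizing by the eye width rather than using raw pixels means the same thresholds keep working when the face moves closer to or further from the camera.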

[00:01:13]
And in the context of hands-free coding, I was thinking, well, in general, there's only a certain subset of things that we can do with code. You can declare variables, or declare functions, or have a return statement, or imports. And you can then think about, well, if I was creating snippets in VS Code, then I could just select them with, I think on the right is, well, it ended up on, okay.

[00:01:38]
So on the right here, it's a very small prototype, but the first one is var to create a variable, then I can select the right column and isolate only the ones that are on the right. So FUN would be like a function or then CLG is like console log, or return, or things like that.

[00:01:54]
And then navigate, I think I maybe wanted to go up and down my code. But you can start thinking about applications like this where maybe the first time that you hear of a model to detect key points in your face, you might not be thinking about then, I'll do hands-free coding.

[00:02:08]
But the more you experiment with this, the more you realize and think about different applications where you could use it. So again, it's like every time I move my eye, I'm actually selecting between the right and left column of the little selection here. And I think the first one is probably, I want to create a component, yeah.

[00:02:25]
So I selected RCE for React component something, and then it just spits out a basic structure of the component. So if you write code every day, you know it's a bit more complex than that, but it was a prototype to see, well, if you could either add an eye detection functionality to VS Code.

[00:02:43]
Or even as a complement, you could write and also use your eyes, then you end up doing things like this. So obviously it's not like a product that's ready to use, but I think I just wrote, yeah, hi, or something like that. And that's the type of applications that you can work with.

[00:03:05]
Sorry, question?
>> I'm just curious how you're deciding what to show for the options like letters or variables?
>> Sure, so the nice stuff with that is what's on the right is standard JavaScript. You have a layout of two columns. And then here when I'm for example, if I'm looking right, then my logic is to isolate just the four things that are on the right.

[00:03:29]
Because I'm looking at the screen and I'm thinking, okay, I want to write a function, and I can see FUN on the right part. So I'm just looking right and it just isolates, and that is actually not TensorFlow.js, it's normal JavaScript. But the eye detection is TensorFlow.js. So when I'm getting as a prediction the position of my eye, then I can see which part of the screen it's on.

[00:03:53]
And if I'm looking right, then I can isolate that. For example, if I had been looking left, then I would have had the options var, import, react, and text, if I want to write letters. So if I knew I want to write just the word hello, then I would have looked to the left text, and that would have displayed a fake keyboard of letters.
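
The "normal JavaScript" part of that answer might look something like the sketch below. The option labels are the ones mentioned in the talk, but the data structure and the isolateColumn function are hypothetical, written only to illustrate the two-column idea.

```javascript
// Assumed layout: two columns of snippet options, as described in the talk.
const layout = {
  left: ["var", "import", "react", "text"],
  right: ["fun", "clg", "return", "navigate"],
};

// Given a gaze direction (however it was predicted), keep only the
// options in that column. Any other value leaves everything visible.
function isolateColumn(columns, direction) {
  if (direction === "left" || direction === "right") {
    return columns[direction];
  }
  return [...columns.left, ...columns.right];
}

console.log(isolateColumn(layout, "right")); // [ 'fun', 'clg', 'return', 'navigate' ]
```

Keeping this step separate from the model means the prediction code only has to answer "left or right", and the interface decides what that choice means.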

[00:04:16]
And then I can also, depending on where the letters are, I think there's a better example that I can show of something similar. Okay, this one, the gaze control keyboard. So this is the same concept where I have a keyboard of letters and they're separated into columns. And the reason why, so that's actually a reproduction of a project that was built by a part of Google and they were building an app in Android for people who actually have mobility issues.

[00:04:44]
Where to allow them to text faster, they actually do look at the screen and they select letters. And it actually made me also think about the way that the keyboard layout is. It's like, well, I could have a keyboard and select letters one by one by going right, right, right, right, until I reach the letter.

[00:05:00]
But that would take a long time. If you instead separate your keyboard into two sections, you end up doing almost like a binary search where you're just like, the letter that I want is in the right portion, so I look right. And then it isolates and it splits that into two.

[00:05:15]
And it's what's happening here. The letters are just in the order of the alphabet, but if I want to write hello, I'm like, okay, the h is in the left column, and then the e is also in the left column, and then l, but then o is on the right.
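
The binary-search-style selection could be sketched like this. It is an assumption-laden illustration rather than the demo's real code: the function names are made up, and it assumes the remaining letters are always split exactly in half, with the extra letter going to the left when the count is odd.

```javascript
const ALPHABET = "abcdefghijklmnopqrstuvwxyz".split("");

// Split the remaining letters into a left and a right half.
function splitInHalf(letters) {
  const middle = Math.ceil(letters.length / 2);
  return { left: letters.slice(0, middle), right: letters.slice(middle) };
}

// Follow a sequence of "left"/"right" glances until one letter remains.
// Returns null if the glances run out before a single letter is isolated.
function selectLetter(directions, letters = ALPHABET) {
  let remaining = letters;
  for (const direction of directions) {
    if (remaining.length === 1) break;
    remaining = splitInHalf(remaining)[direction];
  }
  return remaining.length === 1 ? remaining[0] : null;
}

// The glances needed to reach a given letter, or null if it is unknown.
function directionsFor(letter, letters = ALPHABET) {
  if (!letters.includes(letter)) return null;
  const path = [];
  let remaining = letters;
  while (remaining.length > 1) {
    const { left, right } = splitInHalf(remaining);
    const direction = left.includes(letter) ? "left" : "right";
    path.push(direction);
    remaining = direction === "left" ? left : right;
  }
  return path;
}

console.log(directionsFor("h")); // [ 'left', 'right', 'left', 'left', 'left' ]
```

With 26 letters, halving like this reaches any letter in at most five glances, compared with up to 25 steps when moving one letter at a time.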

[00:05:25]
And once you're done with your word, then it redisplays. And that is really just a reproduction of the Google project but in the browser, because I was interested to see, as they had built it using just Android, Kotlin probably. I was just thinking, well, can we have the same but as a web interface so that people don't have to download an app, you just open a browser.

[00:05:46]
And then I had a little bit of fun doing it with the Chrome dino game as well where I think I forgot. Yeah, I just look up for a jump. And you could see that it's pretty good. It's quite reactive. I actually was quite impressed. Well, it made me look like I played pretty well.

[00:06:04]
So [LAUGH] I was actually really impressed with how fast the model was. I was thinking, maybe with JavaScript in the browser and using the camera, maybe there'll be some delay. But actually, in terms of doing that, it was pretty fast.
