Speech Synthesis API

Maximiliano Firtman

Independent Consultant

The "Speech Synthesis API" Lesson is part of the full, A Tour of Web Capabilities course featured in this preview video. Here's what you'd learn in this lesson:

Max demonstrates the Speech Synthesis API, which speaks a string of text through the computer speakers using the available voices in the operating system.

Transcript from the "Speech Synthesis API" Lesson

[00:00:00]
>> We also have, well, actually it's the same API. It's called Web Speech, and it has two parts: recognition and synthesis. With synthesis, we can make the web app speak, okay? It's not actually an AI voice or something like that. It's the synthesis that you have on your device.

[00:00:24]
It can be pretty good or pretty bad. It depends on the phone that you have and the voices that you have installed there, but we will try it. But in this case, the web app will speak. We give the web app the text, for example, in this case, that's Spanish.

[00:00:38]
And I can even specify that I want Spanish from Argentina: es is español, Spanish, and AR is the country code for Argentina. Or you can say English with a US accent, with a Canadian accent, or with a British accent. So you can specify that, the rate, the volume, and you say, hey, go and speak. So the API is also pretty simple.
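
As a minimal sketch of the call described here: the text, a language tag such as `es-AR`, and the volume are set on a `SpeechSynthesisUtterance`, which is then handed to `speechSynthesis.speak()`. The helper names `normalizeSettings` and `speak` are hypothetical, and the browser globals are reached through `globalThis` only so that the snippet also loads outside a browser.

```typescript
// Settings named after real SpeechSynthesisUtterance properties.
interface SpeakSettings {
  volume: number; // 0 (silent) to 1 (full)
  rate: number;   // 0.1 to 10, where 1 is the normal speed
  pitch: number;  // 0 to 2, where 1 is the normal pitch
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Hypothetical helper: fills in defaults and keeps each value inside the
// range that the Web Speech API specification allows.
function normalizeSettings(partial: Partial<SpeakSettings> = {}): SpeakSettings {
  return {
    volume: clamp(partial.volume ?? 1, 0, 1),
    rate: clamp(partial.rate ?? 1, 0.1, 10),
    pitch: clamp(partial.pitch ?? 1, 0, 2),
  };
}

// Speaks the text if the API is available; returns false otherwise.
// `lang` is a BCP 47 tag, e.g. "es-AR" for Spanish as spoken in Argentina.
function speak(text: string, lang: string, partial: Partial<SpeakSettings> = {}): boolean {
  const scope = globalThis as any; // keeps the sketch loadable outside a browser
  if (!scope.speechSynthesis || !scope.SpeechSynthesisUtterance) return false;

  const utterance = new scope.SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  Object.assign(utterance, normalizeSettings(partial));
  scope.speechSynthesis.speak(utterance);
  return true;
}
```

In a page, a call could look like `speak("Hola, ¿cómo estás?", "es-AR", { volume: 0.8 })`.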

[00:00:58]
The part that is a little bit more complicated is the selection of voices. By default here, it's going to select the closest voice for what you're asking. For example, some devices that don't have an Argentinian voice will take a Mexican one or Spanish from Spain, okay?

[00:01:16]
But it's Spanish at least, it will take the closest one. But also, we have an API to query about the available voices, I will show you that in a second. And the voice depends on the device, depends on the operating system, the voices available. So you can select a female voice or a male voice, but not by properties, by just picking one from a list, okay?
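
Picking a voice from the list by hand could look roughly like the sketch below. It is an illustration only, not the browser's own matching algorithm: `pickVoice` and `normalizeTag` are hypothetical names, and accepting underscores in language tags is a defensive assumption. The objects returned by the real `speechSynthesis.getVoices()` call carry the `name` and `lang` fields used here.

```typescript
// Only the fields this sketch needs; real SpeechSynthesisVoice objects have them too.
interface VoiceLike {
  name: string;
  lang: string;
}

// Lowercases the tag and accepts "es_AR" as well as "es-AR" (a defensive assumption).
const normalizeTag = (tag: string): string => tag.replace(/_/g, "-").toLowerCase();

// Hypothetical picker: an exact language-region match first, then any voice
// that shares the language, otherwise undefined.
function pickVoice<V extends VoiceLike>(voices: readonly V[], wanted: string): V | undefined {
  const target = normalizeTag(wanted);
  const language = target.split("-")[0];
  return (
    voices.find((voice) => normalizeTag(voice.lang) === target) ??
    voices.find((voice) => normalizeTag(voice.lang).split("-")[0] === language)
  );
}
```

In a browser, the result can be assigned to an utterance with `utterance.voice = pickVoice(speechSynthesis.getVoices(), "es-AR") ?? null`.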

[00:01:40]
Well, I have a demo here that we can see directly in action. In fact, it was here, the Web Speech Synthesis demo, that one, where I can-
>> Call me Ishmael. Some years ago, never mind how long precisely, having-
>> That's-
>> Call me Ishmael, call me-
>> Come on, Samantha, stop.

[00:02:02]
That's Samantha, okay? But here, we have all the other possible voices: Alice, no, that's Italian, Boing.
>> Call me, Ishmael. Some years ago, never mind how long, precisely.
>> Come on, that's Boing. Anyway, you have for example a lot of Spanish, Spanish Argentina, Spanish Mexico.
>> [INAUDIBLE] This is just an API working with that.
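
A voice list like the one in the demo could be built along these lines. This is a sketch under assumptions: `groupByLang` and `loadVoices` are hypothetical names, while `getVoices()` and the `voiceschanged` event are part of the Web Speech API. Some browsers fill the list asynchronously, which is why the sketch waits for that event when the first call returns nothing.

```typescript
// Only the fields this sketch needs; real SpeechSynthesisVoice objects have them too.
interface VoiceLike {
  name: string;
  lang: string;
}

// Groups voice names by their language tag, so that every voice tagged
// "es-AR" ends up under the "es-AR" key.
function groupByLang(voices: readonly VoiceLike[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const voice of voices) {
    const names = groups.get(voice.lang) ?? [];
    names.push(voice.name);
    groups.set(voice.lang, names);
  }
  return groups;
}

// Resolves with the voices the browser reports, or with an empty list when
// the API is missing. If the list starts empty and "voiceschanged" never
// fires, the promise stays pending; a real app may want a timeout here.
function loadVoices(): Promise<VoiceLike[]> {
  const synth = (globalThis as any).speechSynthesis;
  if (!synth) return Promise.resolve([]);

  const ready: VoiceLike[] = synth.getVoices();
  if (ready.length > 0) return Promise.resolve(ready);

  return new Promise((resolve) => {
    synth.addEventListener("voiceschanged", () => resolve(synth.getVoices()), { once: true });
  });
}
```

Calling `loadVoices().then(groupByLang)` gives one entry per language tag, and the entries differ by browser and operating system.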

[00:02:29]
And if I try this on Safari, you will see different voices. The list of voices is different, because it's a different browser. Okay, let me see, yeah.
>> So then on iOS, is it gonna be the same across all of those browser platforms?
>> Yeah, on iOS, yeah, because it's the same browser.

[00:02:50]
>> Okay.
>> Remember that on iOS, every browser is just Safari with a different skin. So yeah, that's correct. We have Flo.
>> Call me Ishmael. So many years ago, never mind how long precisely, having little or no money in-
>> That sounds a little weird anyway. I know, I think that the pitch here is by default.

[00:03:08]
We can change this, okay?
>> Call me Ishmael. Some years ago, never mind how long precisely-
>> So you can play with that, because it can sound better. In our Cooking Master, we already have that. I'm not sure if you saw that. It's that little icon that we have there.

[00:03:26]
>> Thinly slice the red onion and set aside.
>> All right, and also I think I have it with, hey cooking, next. Was it connected? Let me check, hey cooking, next. My phone is also replying, because it says hey, but I didn't say hey, Google.

[00:03:52]
I know why you are trying to speak. Let me see. Hey cooking repeat.
>> Add the sliced onions to the pickling liquid and let sit for at least 30 minutes.
>> So now you're cooking. What was the step again? Hey, cooking and repeat. That's kind of the idea.
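
The "next" and "repeat" behavior can be sketched as a small step reader whose output is passed to the synthesizer. Everything here is hypothetical: `StepReader` and `say` are illustrative names, not the course's Cooking Master code, and turning the spoken "hey cooking" phrase into a command is left out.

```typescript
// Hypothetical reader that keeps track of the current recipe step.
class StepReader {
  private index = -1;
  private readonly steps: readonly string[];

  constructor(steps: readonly string[]) {
    this.steps = steps;
  }

  // Returns the text to speak for a command. "next" advances until the last
  // step and then stays there; "repeat" returns the current step again.
  // The result is undefined when no step has been read yet.
  handle(command: "next" | "repeat"): string | undefined {
    if (command === "next" && this.index < this.steps.length - 1) {
      this.index += 1;
    }
    return this.steps[this.index];
  }
}

// Speaks the text when the Web Speech API is available; otherwise does nothing.
function say(text: string): void {
  const scope = globalThis as any; // keeps the sketch loadable outside a browser
  if (!scope.speechSynthesis || !scope.SpeechSynthesisUtterance) return;
  scope.speechSynthesis.cancel(); // drop anything that is still being spoken
  scope.speechSynthesis.speak(new scope.SpeechSynthesisUtterance(text));
}
```

A command handler would then do something like `const text = reader.handle("repeat"); if (text) say(text);`.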

[00:04:10]
Anyway, so this is also pretty simple. You don't need any permission. It will just use the speaker, it's that easy. If the speaker is muted, you won't hear a thing, okay? You can play a little bit with the audio. Of course, compared with the tools that we have today where you can clone your own voice, yeah, it's not so good.

[00:04:32]
For that, you have to call an API, pay for it, and get an audio file as a result that you play, okay? But this is free, and it's available there, okay? That's speech synthesis, any question?
>> Cut the fish fillets into 2-inch wide strips.
>> And by the way, if you try this on Safari, or on an iPhone or Android, let's try it just to see, you will see a different voice, better or worse, I don't know.

[00:05:04]
>> Thinly slice the red onion and set aside.
>> Looks better, right? The Safari one looks, let's try that one.
>> In a small saucepan, combine the water, vinegar, sugar, and salt. Bring to a simmer, then remove from heat
>> Sounds better. So the voice on Safari sounds better.

[00:05:21]
So if you try on the iPad, it sounds like this one, like this voice.