All sorts of inputs have little microphone buttons within them that you can press to talk instead of type. Honestly, I worry my daughter will never learn to type because of them. But I get it: from a UX perspective, it's convenient. We can put those in our web apps, too. Pamela Fox has an article about all this.
There are two approaches we can use to add speech capabilities to our apps:
- Use the built-in browser APIs: the SpeechRecognition and SpeechSynthesis interfaces of the Web Speech API.
- Use a cloud-based service, like the Azure Speech API.
Which one to use? The great thing about the browser APIs is that they're free and available in most modern browsers and operating systems. The drawback is that they're often not as powerful or flexible as cloud-based services, and the speech output tends to sound much more robotic.
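If you want a feel for the built-in route, here's a minimal sketch of both halves of the Web Speech API (this is my own illustrative example, not code from Pamela's article, and it only runs in a browser with microphone permission):

```javascript
// Speak a phrase out loud with SpeechSynthesis:
const utterance = new SpeechSynthesisUtterance("Hello from the browser!");
utterance.rate = 1; // 1 is normal speed
speechSynthesis.speak(utterance);

// Listen for speech with SpeechRecognition.
// It's still prefixed as webkitSpeechRecognition in Chromium browsers.
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;

if (SpeechRecognition) {
  const recognition = new SpeechRecognition();
  recognition.lang = "en-US";
  recognition.addEventListener("result", (event) => {
    // The transcript of what the user just said
    console.log(event.results[0][0].transcript);
  });
  recognition.start(); // prompts the user for microphone access
}
```

Note the feature check: SpeechSynthesis is widely supported, but SpeechRecognition support is spottier, so you'll want to fall back to a plain text input where it's missing.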
I like that she whipped it up into a Web Component.

