Speech Recognition using Google Speech API


Google has a great Speech Recognition API. This API converts spoken audio (from a microphone) into written text (Python strings), in short: speech to text. You simply speak into a microphone and the Google API transcribes what you said. The API gives excellent results for English.

Google has also created the JavaScript Web Speech API, so you can recognize speech in JavaScript as well. Here is a demo: https://www.google.com/intl/en/chrome/demos/speech.html. To use it on the web you will need Google Chrome version 25 or later.

Installation

Google Speech API v2 is limited to 50 queries per day. Make sure you have a good microphone.
Are you looking for text to speech instead?
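
Because of the daily query limit mentioned above, it can help to count requests on your side before calling the API. The following is only a minimal sketch: the class name `DailyQuota` and its method are hypothetical, the counter lives in memory only, and it assumes the quota resets per calendar day of the local clock, which the Google service may handle differently.

```python
import datetime


class DailyQuota:
    """Count requests per calendar day and refuse once the limit is reached."""

    def __init__(self, limit=50):
        self.limit = limit  # assumed daily cap, taken from the note above
        self.day = None     # the day the current count belongs to
        self.used = 0       # requests counted so far on that day

    def try_consume(self, today=None):
        """Return True and count one request if quota remains, else False."""
        today = today or datetime.date.today()
        if today != self.day:
            # A new day has started: reset the counter.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

You would call `try_consume()` right before each recognition request and skip the request when it returns False. The count is lost when the program exits, so a long-running script benefits most from this.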

This is the installation guide for Ubuntu Linux, but it will probably work on other platforms as well. You will need to install a few packages: PyAudio, PortAudio and SpeechRecognition. PyAudio 0.2.9 is required and you may need to compile it manually.

git clone http://people.csail.mit.edu/hubert/git/pyaudio.git
cd pyaudio
sudo python setup.py install
sudo apt-get install libportaudio-dev
sudo apt-get install python-dev
sudo apt-get install libportaudio0 libportaudio2 libportaudiocpp0 portaudio19-dev
sudo pip3 install SpeechRecognition
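
After installing, you may want to confirm that Python can actually find the modules. This is a small sketch using only the standard library; the helper name `missing_modules` is made up for this example. Note that it checks import names (`pyaudio`, `speech_recognition`), not the pip package names.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names, not pip package names:
    # PyAudio -> pyaudio, SpeechRecognition -> speech_recognition.
    missing = missing_modules(["pyaudio", "speech_recognition"])
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only tells you whether the modules can be located. It does not prove that PortAudio itself works or that a microphone is available.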

Program

This program will record audio from your microphone, send it to the speech API and return a Python string.

The audio is recorded using the speech_recognition module, which is imported at the top of the program. The recorded speech is then sent to the Google speech recognition API, which returns the result.
r.recognize_google(audio) returns a string.

#!/usr/bin/env python3
# Requires PyAudio and SpeechRecognition.
 
import speech_recognition as sr
 
# Record Audio
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
 
# Speech recognition using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
    print("You said: " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
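
The program above gives up on the first sr.RequestError. If your connection is unreliable, a small retry loop may help. The sketch below is generic and does not import the speech module itself; the function name `call_with_retry` and its parameters are hypothetical, and you pass in the recognizer method and the exception type to retry on.

```python
import time


def call_with_retry(func, *args, retries=3, delay=1.0, retry_on=(Exception,), **kwargs):
    """Call func(*args, **kwargs); on an exception listed in retry_on, wait and retry.

    The last exception is re-raised once all attempts have failed.
    Exceptions not listed in retry_on are raised immediately.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise
            time.sleep(delay)  # short pause before the next attempt
```

With the program above, a call could look like `call_with_retry(r.recognize_google, audio, retry_on=(sr.RequestError,))`. sr.UnknownValueError is deliberately not retried, since sending the same unclear audio again is unlikely to help. Keep in mind that every retry may count against the daily query limit.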

You may like: Personal Assistant Jarvis (Speech Recognition and Text to Speech) or Speech Engines

31 thoughts on “Speech Recognition using Google Speech API”

  1. Dipo Anugrah Salam - August 12, 2017

    Excuse me if this question sounds silly, but what do you mean by “Google Speech API v2 is limited to 50 queries per day.” Does that mean this program can only recognize a limited amount of words per day?

    1. Frank - August 15, 2017

      At the time it was capped, meaning you could only make a limited number of requests per day. I’m not sure if that’s still the case.

  2. Frank - July 28, 2017

    This records the microphone locally (attached to the computer). If the client would run a Python program recording the microphone, you could forward the text to a server.

  3. Sincole Brans - July 26, 2017

    Is there any documentation on publishing this as a web service? Or can it be consumed by Hangouts, Skype or something similar?
    Any leads?

  4. Akkas Singh - July 25, 2017

    Not able to install any of the above packages on Windows 10
    Got Python Version 2.7.13 and pip version 9.0.1

    Every time I get an error that says: Could not find a version that satisfies the requirement libportaudio-dev(from version:)
    No matching distribution found for libportaudio-dev

    Help me out

    1. Frank - July 26, 2017

      On Windows you need to compile PortAudio.
      Also try: pip install pyaudio

  5. Hector Aaron Castillo Elizalde - July 19, 2017

    Hello, is there any solution for reducing the delay time? I have tested the code and it does not work in real time; it takes a few seconds to give back the string. Thanks

    1. Frank - July 22, 2017

      There is no real time solution that I know of. Even on Android it takes a moment to listen

  6. Kavya Shree - July 13, 2017

    I have to convert speech to text offline on a SAMSUNG ARTIK board. Please tell me which package I need to install and the steps to follow.

    1. Frank - July 14, 2017

      Many speech APIs only work online. The module SpeechRecognition only works offline with the engine CMU Sphinx. All the other speech engines supported by the module SpeechRecognition need internet connectivity.

  7. Deepak Chawla - May 25, 2017

    Hello Sir, I have been using the Google speech API with the default API key for 15 days, but currently it doesn’t recognize my voice, although my microphone works well: I tested it with Google voice, where it works without any error. I can’t understand what the problem behind it is. Please help me.
    Hope for positive response.

    1. Frank - May 26, 2017

      I’m not sure, does the site https://www.google.com/intl/en/chrome/demos/speech.html work for you?

      1. Chintan Mungra - July 10, 2017

        Sir, I have the same problem, but this site https://www.google.com/intl/en/chrome/demos/speech.html works for me after changing the default to the USB mic in the Chrome settings.
        So can you please tell me whether there is any way to change the default to the USB mic? I am using a Raspberry Pi 3.

        1. Frank - July 11, 2017

          A USB mic is needed on the Raspberry Pi. I don’t have a Raspberry Pi, but it looks like
          you can change it with:

          Microphone(device_index=MICROPHONE_INDEX)

          that’s in the line

          with sr.Microphone(device_index=MICROPHONE_INDEX) as source:

          To list the microphones use this program:

          import speech_recognition as sr
          for index, name in enumerate(sr.Microphone.list_microphone_names()):
              print("Microphone with name \"{1}\" found for `Microphone(device_index={0})`".format(index, name))
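
Once you have the list of names, picking the right index by hand can be replaced with a small search. This is only a sketch: `find_microphone_index` is a hypothetical helper, not part of the SpeechRecognition module, and it assumes the device name contains a recognizable keyword such as "usb".

```python
def find_microphone_index(names, keyword):
    """Return the index of the first device whose name contains keyword, else None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # A device may be reported without a name; skip such entries.
        if name and keyword in name.lower():
            return index
    return None
```

You could pass it the result of sr.Microphone.list_microphone_names() and use the returned index as device_index in sr.Microphone(device_index=...). If it returns None, no matching device was found and you should fall back to the default microphone.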