Speech Recognition using Google Speech API


Google has a great Speech Recognition API. It converts speech recorded from a microphone into written text (Python strings), in short: speech to text. You simply speak into a microphone and the Google API transcribes what you said. The API gives excellent results for English.

Google has also created the JavaScript Web Speech API, so you can recognize speech in JavaScript as well. Here is a demo: https://www.google.com/intl/en/chrome/demos/speech.html. To use it on the web you need Google Chrome version 25 or later.

Installation

Google Speech API v2 is limited to 50 queries per day. Make sure you have a good microphone.
Are you looking for text to speech instead?

This is the installation guide for Ubuntu Linux, but it will probably work on other platforms as well. You will need to install a few packages: PyAudio, PortAudio and SpeechRecognition. PyAudio 0.2.9 is required, and you may need to compile it manually.

git clone http://people.csail.mit.edu/hubert/git/pyaudio.git
cd pyaudio
sudo python setup.py install
sudo apt-get install libportaudio-dev
sudo apt-get install python-dev
sudo apt-get install libportaudio0 libportaudio2 libportaudiocpp0 portaudio19-dev
sudo pip3 install SpeechRecognition
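
Once the packages are installed, a quick sanity check can confirm that Python sees them. The sketch below is not part of the original guide. It uses only the standard library and assumes the import names `speech_recognition` (for SpeechRecognition) and `pyaudio` (for PyAudio); the helper name `find_missing` is made up for illustration.

```python
#!/usr/bin/env python3
# Sanity check: report which of the required modules Python can find.
import importlib.util


def find_missing(module_names):
    """Return the names from module_names that Python cannot locate."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names assumed for the SpeechRecognition and PyAudio packages.
missing = find_missing(["speech_recognition", "pyaudio"])
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules are installed.")
```

If a module is reported as missing, rerun the matching installation step above before moving on to the program.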

Program

This program will record audio from your microphone, send it to the speech API and return a Python string.

The audio is recorded using the speech_recognition module, which is imported at the top of the program. Next, we send the recorded speech to the Google speech recognition API, which then returns the output.
r.recognize_google(audio) returns a string.

#!/usr/bin/env python3
# Requires PyAudio and SpeechRecognition.
 
import speech_recognition as sr
 
# Record Audio
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
 
# Speech recognition using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
    print("You said: " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

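The program above can wait forever if the microphone picks up constant background noise, because `r.listen(source)` blocks until it hears a phrase followed by silence. The following sketch is one possible, more defensive variant, not the tutorial's own code. It calibrates against ambient noise, caps the waiting and recording time, and passes a language code to the recognizer. The helper names `capture_phrase`, `transcribe` and `run_demo`, and the five-second limits, are assumptions chosen for illustration.

```python
#!/usr/bin/env python3
# A more defensive variant of the recording program (illustrative sketch).


def capture_phrase(recognizer, source, wait_seconds=5, phrase_seconds=5):
    """Calibrate for background noise, then record one phrase.

    wait_seconds and phrase_seconds are illustrative defaults.
    """
    # Sample the room for one second to set the energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=1)
    # timeout: how long to wait for speech to start;
    # phrase_time_limit: maximum length of the recording.
    return recognizer.listen(source, timeout=wait_seconds,
                             phrase_time_limit=phrase_seconds)


def transcribe(recognizer, audio, language="en-US"):
    """Send recorded audio to Google and return the recognized text."""
    return recognizer.recognize_google(audio, language=language)


def run_demo(language="en-US"):
    """Record from the default microphone and print the transcription."""
    import speech_recognition as sr  # needs SpeechRecognition and PyAudio

    r = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            print("Say something!")
            audio = capture_phrase(r, source)
        print("You said: " + transcribe(r, audio, language))
    except sr.WaitTimeoutError:
        print("No speech detected before the timeout")
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results; {0}".format(e))
```

Calling `run_demo()` records one phrase and transcribes it as English; `run_demo("ms-MY")` would request Malay instead.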

62 thoughts on “Speech Recognition using Google Speech API”

  1. Frank
    Michael Meanswell - March 17, 2018

    Here’s an edit of your script you can use to test Google or Sphinx recognition in dialogue format. It will speak its response back to you. I also changed the listen interval to 5 seconds, as my built-in microphone has so much noise it will hang indefinitely. Needs ‘python-gtts’ and ‘pygame’ installed via apt or otherwise.

    import speech_recognition as sr
    from gtts import gTTS
    #quiet the endless 'insecurerequest' warning
    import urllib3
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
     
    from pygame import mixer
    mixer.init()
     
    while True:
      # obtain audio from the microphone
      r = sr.Recognizer()
      with sr.Microphone() as source:
        # listen for 1 second to calibrate the ambient noise energy level
        r.adjust_for_ambient_noise(source, duration=1)
        print("Say something!")
        # stop recording after 5 seconds so a noisy microphone cannot hang the loop
        audio = r.listen(source, phrase_time_limit=5)
     
      # recognize speech using Sphinx/Google
      try:
        #response = r.recognize_sphinx(audio)
        response = r.recognize_google(audio)
        print("I think you said '" + response + "'")
        # speak the response back using gTTS and pygame
        tts = gTTS(text="I think you said " + str(response), lang='en')
        tts.save("response.mp3")
        mixer.music.load('response.mp3')
        mixer.music.play()
      except sr.UnknownValueError:
        print("Could not understand audio")
      except sr.RequestError as e:
        print("Recognition request error; {0}".format(e))
    1. Frank
      Michael Meanswell - March 17, 2018

      pygame’s play was incredibly buggy for me, causing this to hang a lot. I ended up using mpg123 via os.system('mpg123 -q filename.mp3') instead.

  2. Frank

    Hello Frank. I’ve tried the tutorial, but it seems to get stuck at “Say something!”. I already changed the audio source with
    with sr.Microphone(device_index=2) as source:
    I tried the demo site https://www.google.com/intl/en/chrome/demos/speech.html and my mic works. I can record voices from the mic and save audio data. But the tutorial code just gets stuck at “Say something!”.
    Any idea on how to fix this? I’m using a Raspberry Pi on the latest Raspbian. Thanks.

    1. Frank
      Frank - March 16, 2018

      Check with nethogs or ‘netstat -antup’ to see if a network connection is made when running the program. You could also try the cloud voice recognition service.

      1. Frank

        Thank you. The network is working. It turns out the default mic sensitivity was cranked up too high; so high it failed to distinguish my voice from the ambience, so it listened continuously forever. I lowered the volume and it’s working perfectly! Thank you

  3. Frank

    Hello Frank. Thanks for your concise tutorial.
    Is there a way to change the language that the Google API works in, for example to ‘Bahasa Malaysia’? I assume the tutorial works in English by default?
    Thanks.

    1. Frank
      Frank - March 14, 2018

      You can specify a language parameter in the recognize_google call, for example r.recognize_google(audio, language="ms-MY").

      1. Frank

        Thanks for the answer. Sorry for asking that redundant question; I just saw your previous reply to another poster asking the same thing. I hope you’ll continue to give these useful tutorials. Thank you.
