Personal Assistant (Jarvis) in Python

I thought it would be cool to create a personal assistant in Python. If you are into movies, you may have heard of Jarvis, an A.I.-based character in the Iron Man films. In this tutorial we will create a robot that listens to spoken questions and answers them out loud.

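The features I want to have are:

- Recognize spoken voice (speech recognition)
- Answer in spoken voice (text to speech)
- Answer simple spoken commands
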
For this tutorial you will need (Ubuntu) Linux, Python and a working microphone.

Recognize spoken voice

Speech recognition can be done using the Python SpeechRecognition module. We make use of the Google Speech API because of its great quality.
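
As a rough illustration of this step, here is a minimal sketch that records one phrase and sends it to the Google recognizer. It assumes the SpeechRecognition and PyAudio packages are installed and that a default microphone is available. The function name `listen_once` and the two time limits are my own choices for the example, not part of the module.

```python
def listen_once(timeout=5, phrase_time_limit=10):
    """Record one phrase from the default microphone and return its transcript.

    Returns an empty string when nothing was heard or understood.
    """
    # Imported here so that a missing package only fails when the function runs.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample the background noise briefly to set the energy threshold.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        try:
            audio = recognizer.listen(
                source, timeout=timeout, phrase_time_limit=phrase_time_limit
            )
        except sr.WaitTimeoutError:
            # No speech started within `timeout` seconds.
            return ""

    try:
        # Uses the default API key, like the complete program in this tutorial.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # Audio was recorded but could not be turned into text.
        return ""
    except sr.RequestError as error:
        print("Speech service request failed: {0}".format(error))
        return ""
```

Calling `print(listen_once())` should print what you said, or an empty line if nothing was recognized. The timeout keeps the call from blocking forever when nobody speaks.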

Answer in spoken voice (Text To Speech)

Various APIs and programs are available for text to speech applications. Espeak and pyttsx work out of the box but sound very robotic. We decided to go with the Google Text To Speech API, gTTS.

sudo pip install gTTS

Using it is as simple as:

from gtts import gTTS
import os
tts = gTTS(text='Hello World', lang='en')
tts.save("hello.mp3")
os.system("mpg321 hello.mp3")
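
If you want to reuse this in a larger script, one possible way to wrap it is sketched below. It assumes gTTS is installed and that mpg321 is on your PATH. The helper names `synthesize`, `play` and `speak` are made up for this example. The sketch writes to a temporary file instead of a fixed name and starts the player with `subprocess.run` rather than a shell string.

```python
import os
import subprocess
import tempfile


def synthesize(text, path, lang="en"):
    """Write `text` as spoken audio to an MP3 file at `path`."""
    # Imported here so the script still loads when gTTS is not installed.
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(path)  # needs an internet connection
    return path


def play(path, player="mpg321"):
    """Play an audio file with an external command-line player."""
    # Passing a list avoids building a shell command string by hand.
    subprocess.run([player, path], check=False)


def speak(text):
    """Synthesize `text` to a temporary file, play it, then clean up."""
    handle, path = tempfile.mkstemp(suffix=".mp3")
    os.close(handle)
    try:
        play(synthesize(text, path))
    finally:
        os.remove(path)
```

With this in place, `speak("Hello World")` should behave like the short example above without leaving an mp3 file behind.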


Complete program

The program below will answer spoken questions.

#!/usr/bin/env python3
# Requires PyAudio and PySpeech.
import speech_recognition as sr
from time import ctime
import time
import os
from gtts import gTTS
def speak(audioString):
    tts = gTTS(text=audioString, lang='en')
    tts.save("audio.mp3")
    os.system("mpg321 audio.mp3")
def recordAudio():
    # Record Audio
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)
    # Speech recognition using Google Speech Recognition
    data = ""
    try:
        # Uses the default API key
        # To use another API key: `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
        data = r.recognize_google(audio)
        print("You said: " + data)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
    return data
def jarvis(data):
    if "how are you" in data:
        speak("I am fine")
    if "what time is it" in data:
        speak(ctime())
    if "where is" in data:
        data = data.split(" ")
        location = data[2]
        speak("Hold on Frank, I will show you where " + location + " is.")
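        # The map URL prefix is not shown in this listing; the value below is an assumed placeholder.
        map_url = "https://www.google.com/maps/place/"
        os.system("chromium-browser " + map_url + location + "/&")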
# initialization
speak("Hi Frank, what can I do for you?")
while 1:
    data = recordAudio()
    jarvis(data)
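
The "where is" command above takes the third word of the sentence as the location, so it only works when the sentence starts with "where is" and the place is a single word. A slightly more forgiving approach could look like the sketch below. The names `extract_location` and `location_for_url` are hypothetical helpers, not part of the program above.

```python
import re
from urllib.parse import quote

# Matches "where is" anywhere in the sentence and captures what follows it.
WHERE_IS = re.compile(r"\bwhere is\s+(.+)", re.IGNORECASE)


def extract_location(sentence):
    """Return the text following 'where is', or None if there is none."""
    match = WHERE_IS.search(sentence)
    if match is None:
        return None
    # Drop trailing punctuation that a recognizer might add.
    location = match.group(1).strip(" ?.!")
    return location or None


def location_for_url(location):
    """Percent-encode a location so it can be placed in a URL path."""
    return quote(location)
```

For example, `extract_location("Jarvis where is New York")` gives `"New York"`, and `location_for_url("New York")` gives `"New%20York"`. Encoding the recognized text before handing it to a browser command also avoids passing spaces and other special characters to the shell.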

38 thoughts on “Personal Assistant (Jarvis) in Python”

  1. SIlva Dexter - January 11, 2018

    Hi Frank, for the google , do we have to install any google cloud package ??

    1. Frank - January 20, 2018

      No, that’s not necessary for this example

  2. Prudhvi Chaitanya - December 16, 2017

    i have problem with Text to Speech.
    I have installed both pyttsx and gTTS .
    but i am not able to get the jarvis voice on my PC

    1. Frank - December 25, 2017

      It may be a playback problem on your computer, do you have the audio file?

  3. Ryan Holland - December 8, 2017

    I have speechrecognition installed but when I try to run the program it says importerror no module named speech_recognition. Help!

    1. Frank - December 8, 2017

      The module should work with Python 2.7 up to 3.6. Did you run: pip install SpeechRecognition ?

      1. Ryan Holland - December 8, 2017

        Yes I used pip install SpeechRecognition and it says installed but when I run the code it says no module found.

      2. Ryan Holland - December 9, 2017

        I got speechrecognition to work but now when i run the program it won’t speak or show where something is like in your video.

  4. Emre aşkan - September 5, 2017

    First of all, thank you Frank! In recordAudio function, audio = r.listen(source) line there is an indentation mistake. It wasn’t working for me at first and I thought that there was problem about jackd and alsa. But the problem is just a little indentation mistake. Just include the audio line into the with function by putting a space.

    1. Frank - September 6, 2017

      You’re right, thanks Emre!

  5. Jason Heaton - August 29, 2017

    I had an issue with this section

    def recordAudio():
        # Record Audio
        r = sr.Recognizer()
        with sr.Microphone() as source:
            print("Say something!")
        audio = r.listen(source)
    I modified it to show this
    def recordAudio():
        # Record Audio
        r = sr.Recognizer()
        with sr.Microphone() as source:
            print("Say something!")
            audio = r.listen(source)

    Had to indent the audio section.. .

  6. Edward Principe - July 27, 2017

    Frank, I love the quality and execution of this program. I intend to build an interface to run some scientific equipment. I am not a programmer …. I generally hack my way through what I need to get the job done. I have written several basic programs to control the microscope.

    This is a Windows 8.1 system. Is that an issue??
    Installed the gTTS and SpeechRecognition. Having trouble getting PyAudio and PySpeech installed …. using python 3.3 and seems to need Visual C++ 10.0. Trying to work around that now. ….

    When I try to run your example code (short version), I get a string of errors, the end of which oddly seems tied to a URL related to ‘’…. if I interpret the error correctly.

      File "C:\Python33\lib\site-packages\requests\", line 504, in send
        raise ConnectionError(e, request=request)
    requests.exceptions.ConnectionError: HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:547)'),))

    I know it is a mess …… any insights are appreciated!

    BTW: It is that generates the error.

    1. Frank - July 28, 2017

      Thanks Edward! Windows 8.1 should not be an issue, at the time I had tested it on Ubuntu.
      The gTTS module underneath uses the website, see inside the gtts source code. This website returns an audio file, which is played with any sound player (mpg321 as example).

      In this case I see a connection error, do you have a firewall? It may also be throttling (too many connections). If you have an offline environment, try ms sapi or espeak. The speech recognition part also needs internet connection though.

  7. kumar rx - May 4, 2017

    Hi mate, I have downloaded gTTS, now what i want to do and where to save the both py files, whether it should get saved in separate file or in same file… And another doubt is you are saving that hello.mp3 what is that ?

    1. Frank - May 5, 2017

      Save as different py files. The file hello.mp3 is the output file saved automatically. You’ll also need to install the program mpg321.

  8. Shubham Bhuyan - May 3, 2017

    In that try-except block, if i don’t say something for a short period of time it says “Google Speech Recognition could not understand audio” and exits my program.(I am using the code to make a voice controlled bot. So after each command I need time to make bot move. Giving delay makes a fixed time for each order,so i don’t want to use it.) Is there any way to control the time before the except block starts working??

    1. Frank - May 5, 2017

      That looks like another type of exception that the try-except block is not catching.
      Try adding these two exception handlers:

      except sr.UnknownValueError:
          speak("I don't understand!")
      except sr.RequestError as e:
          print("Could not request results")
          print("from Google Speech Recognition service; {0}".format(e))

      Let me know how that works out.