python prediction


Natural Language Processing (NLP) with Python has many applications in prediction.
For instance, given a product review, an NLP model can predict whether its sentiment is positive or negative. This guide walks you through building a prediction program based on natural language processing techniques, using the NLTK library.


NLP-Based Name Gender Prediction

Ever wondered if a computer could guess the gender associated with a name? In this tutorial, you’ll train a classifier to predict whether a name sounds more ‘male’ or ‘female’.

The entire prediction process can be outlined as:

Data Preparation
Feature Extraction
Training the Model
Executing Predictions

Data Preparation

The first step is gathering and organizing data. For our purpose, we’ll use the names corpus that ships with the nltk library.

from nltk.corpus import names

# Aggregate male and female names 
names_data = ([(name, 'male') for name in names.words('male.txt')] + 
             [(name, 'female') for name in names.words('female.txt')])

The result is a list of (name, label) tuples. Its first and last entries look like this:

[('Aaron', 'male'), ('Abbey', 'male'), ('Abbie', 'male')]
[('Zorana', 'female'), ('Zorina', 'female'), ('Zorine', 'female')]

You can also supply your own data instead, as long as it has the same shape: a list of (name, label) tuples.
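As a minimal sketch of a hand-made dataset in that shape: the names and labels below are hypothetical examples, not taken from the NLTK corpus, and the fixed seed is only there to make the shuffle reproducible.

```python
import random

# Hypothetical hand-made examples; any (name, label) pairs follow the same shape.
custom_names = [
    ('Frank', 'male'),
    ('Oliver', 'male'),
    ('Anna', 'female'),
    ('Sophie', 'female'),
]

# Shuffle with a fixed seed so the labels are interleaved reproducibly.
rng = random.Random(42)
rng.shuffle(custom_names)

for name, label in custom_names:
    print(name, label)
```

Shuffling is optional here, but it matters once the list is split into separate parts, because the corpus-based list holds all male names first and all female names after them.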

Feature Extraction

Next, we extract meaningful features from the dataset. A useful feature for this task is the last letter of a name, which the following function returns as a feature dictionary:

def gender_features(word):
    return {'last_letter': word[-1]}

Applying it to every name yields the feature sets, each a (features, label) pair:

featuresets = [(gender_features(n), g) for (n, g) in names_data]
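To make the shape of the feature sets concrete, here is a small sketch that runs the extractor on two names. The two-item sample list is a hypothetical stand-in for names_data, so the snippet runs without the corpus.

```python
def gender_features(word):
    # Same extractor as in the tutorial: the only feature is the final character.
    return {'last_letter': word[-1]}

# Hypothetical sample standing in for names_data.
sample = [('Aaron', 'male'), ('Zorina', 'female')]

# Each entry pairs a feature dictionary with its label.
featuresets = [(gender_features(n), g) for (n, g) in sample]
print(featuresets)
# [({'last_letter': 'n'}, 'male'), ({'last_letter': 'a'}, 'female')]
```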

Training and Executing Predictions

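With the features ready, we train a Naive Bayes classifier on them and then use the trained model for predictions. Here the whole feature set serves as training data.

import nltk

# Use every feature set for training
train_set = featuresets
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Sample prediction
print(classifier.classify(gender_features('Frank')))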
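To illustrate what a Naive Bayes classifier does with a single last-letter feature, the following is a simplified pure-Python sketch. It is not NLTK's implementation: the function names, the toy data, the lowercasing of the letter, and the add-one smoothing over an assumed 26-letter alphabet are all choices made for illustration.

```python
from collections import Counter, defaultdict

def train_last_letter_nb(labeled_names):
    """Count labels and per-label last letters from (name, label) tuples."""
    label_counts = Counter()
    letter_counts = defaultdict(Counter)
    for name, label in labeled_names:
        label_counts[label] += 1
        letter_counts[label][name[-1].lower()] += 1
    return label_counts, letter_counts

def classify_last_letter_nb(name, label_counts, letter_counts, alphabet_size=26):
    """Return the label that maximizes P(label) * P(last_letter | label)."""
    letter = name[-1].lower()
    total = sum(label_counts.values())
    best_label, best_score = None, -1.0
    for label, count in label_counts.items():
        prior = count / total
        # Add-one smoothing (an assumption of this sketch) keeps unseen
        # letters from getting zero probability.
        likelihood = (letter_counts[label][letter] + 1) / (count + alphabet_size)
        score = prior * likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, not drawn from the NLTK corpus.
toy_data = [('Anna', 'female'), ('Maria', 'female'), ('Julia', 'female'),
            ('Frank', 'male'), ('Mark', 'male'), ('Erik', 'male')]
label_counts, letter_counts = train_last_letter_nb(toy_data)

print(classify_last_letter_nb('Derek', label_counts, letter_counts))  # male
print(classify_last_letter_nb('Sofia', label_counts, letter_counts))  # female
```

With only one feature, the prediction reduces to asking which label makes the observed last letter most likely, weighted by how common each label is in the training data.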

Putting it all together, here’s a complete example:

import nltk
from nltk.corpus import names

# The names corpus must be available locally; if it is not,
# run nltk.download('names') once beforehand.

def gender_features(word):
    # Use the last letter of the name as the only feature
    return {'last_letter': word[-1]}

# Label every name in the corpus
names_data = ([(name, 'male') for name in names.words('male.txt')] +
              [(name, 'female') for name in names.words('female.txt')])

# Build (features, label) pairs and train on all of them
featuresets = [(gender_features(n), g) for (n, g) in names_data]
train_set = featuresets
classifier = nltk.NaiveBayesClassifier.train(train_set)

print(classifier.classify(gender_features('Frank')))
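The example trains on every name, so it cannot tell you how well the classifier handles names it has not seen. One possible way to check, sketched below, is to hold out part of the data. The helper split_featuresets, its default test_fraction and seed, and the demo data are hypothetical additions, not part of the original tutorial.

```python
import random

def split_featuresets(featuresets, test_fraction=0.2, seed=0):
    """Shuffle a copy of featuresets and split it into (train_set, test_set)."""
    shuffled = list(featuresets)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical stand-in data with the same shape as featuresets.
demo = [({'last_letter': c}, 'female' if c in 'aei' else 'male')
        for c in 'aeiknrstdo']

train_set, test_set = split_featuresets(demo, test_fraction=0.2)
print(len(train_set), len(test_set))  # 8 2
```

With the real feature sets, train_set would go to nltk.NaiveBayesClassifier.train, and nltk.classify.accuracy(classifier, test_set) would report the fraction of held-out names labeled correctly.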

For a more interactive experience, you can read the name from the user at runtime:

# Real-time Prediction
name = input("Enter a Name: ")
print(f'Predicted Gender for {name}:', classifier.classify(gender_features(name)))
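Typed input can be empty or oddly capitalized, and word[-1] raises an IndexError on an empty string. The helper below is a hedged sketch of one way to guard against that; the name features_from_input is hypothetical, and lowercasing assumes the training names end in lowercase letters, as capitalized names normally do.

```python
def features_from_input(raw_name):
    """Return a feature dict for user input, or None if no name was given."""
    name = raw_name.strip()
    if not name:
        # Nothing to classify; indexing an empty string would fail.
        return None
    # Lowercase so that 'FRANK' and 'Frank' produce the same feature.
    return {'last_letter': name[-1].lower()}

print(features_from_input('  FRANK '))  # {'last_letter': 'k'}
print(features_from_input('   '))       # None
```

A caller would check for None before passing the result to classifier.classify.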

Note for Python 2 users: use raw_input instead of input, and replace the f-string with str.format, since f-strings require Python 3.6 or later.



2016-08-25
