As companies become increasingly data-driven, a Machine Learning technique called ‘Sentiment Analysis’ is steadily gaining popularity. It analyses digital text through Natural Language Processing (NLP) to find the polarity (positive, negative, neutral), feelings, and emotions (angry, happy, sad, etc.) expressed in it.
Since Twitter is one of the most comprehensive sources of live, public conversation worldwide, business firms, political groups, etc. are interested in performing ‘Sentiment Analysis’ of tweets to understand the emotions and opinions of their target market or to study a competitor’s market. Although ready-to-use programs exist for the purpose, the best way to achieve highly accurate predictions for particular criteria and domains is to create a customized Twitter Sentiment Analysis Python model or program.
Step-by-step Tutorial: Create Twitter Sentiment Analysis Program Using Python
This tutorial aims to create a Twitter Sentiment Analysis Program using Python. The resultant program should be capable of parsing the tweets fetched from Twitter and understanding the text’s sentiments, like its polarity and subjectivity. To follow along, you need the following:
1. Basic coding/programming knowledge of Python.
2. Tools to be installed on your computer:
- Libraries: Tweepy, TextBlob, WordCloud, pandas, NumPy, matplotlib
(Tweepy is an open-source Python library for the Twitter API that enables Python to communicate with the Twitter platform.)
3. A Twitter Account
4. A Twitter App needs to be created and authenticated by Twitter: This is necessary to get the ‘Consumer key and Access tokens’ that you will need in your programming.
If you don’t already have a Twitter App created for the purpose, here is how to create one.
How to Create a Twitter App?
- Go to the Twitter developer site: dev.twitter.com.
- Sign in with your Twitter account
- Go to ‘My applications’
- Click on ‘Create a new application.’
- Next, fill in the application form.
- Next, click on ‘Create my Access Token.’
- On the next page, choose the ‘Read and Write’ option under the column ‘Application Type.’
You will be provided with your Twitter App OAuth settings, which include your consumer key, consumer secret, access token, and access token secret. Note these details down, as these API credentials will enable you to fetch tweets from Twitter. It is a good idea to save them in a CSV file on your computer; later, you can load the CSV file directly into your program to read the API credentials.
Get Started with Creating Twitter Sentiment Analysis Python Program
1. Import the Libraries: Tweepy, TextBlob, WordCloud, pandas, NumPy, matplotlib
2. Authenticate the Twitter App: Next, you need to authenticate your Twitter app using the Twitter App OAuth settings credentials, also referred to as Twitter API credentials. For this, you need to create an authentication object.
To fill in the Twitter API credentials, you can either load the CSV file or manually copy and paste the credential details.
3. Fetch the Tweets from the Twitter User: To fetch the tweets, you first need to choose a Twitter user whose tweets you want to parse to understand the sentiment expressed in them. Let’s say you want to see whether the tweets of ‘UserXYZ’ are positive, negative, or neutral by performing sentiment analysis on 100 tweets by UserXYZ.
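As a minimal sketch of this step, assuming the credentials were saved in a CSV file with a header row and a single data row, the details can be read with Python's built-in csv module and handed to Tweepy. The column names (consumer_key, consumer_secret, access_token, access_token_secret) and the helper names load_credentials and create_api are hypothetical; adjust them to match your own file.

```python
import csv


def load_credentials(csv_path):
    """Read the first data row of a credentials CSV into a dict.

    Assumed (hypothetical) header row:
    consumer_key,consumer_secret,access_token,access_token_secret
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        row = next(csv.DictReader(handle))
    # Strip stray spaces that often creep in when copying keys by hand.
    return {key: value.strip() for key, value in row.items()}


def create_api(credentials):
    """Build an authenticated Tweepy API object from the credentials dict."""
    # Imported here so the CSV helper above works even without Tweepy installed.
    import tweepy

    auth = tweepy.OAuthHandler(
        credentials["consumer_key"], credentials["consumer_secret"]
    )
    auth.set_access_token(
        credentials["access_token"], credentials["access_token_secret"]
    )
    # wait_on_rate_limit makes Tweepy pause instead of failing on rate limits.
    return tweepy.API(auth, wait_on_rate_limit=True)
```

With a file named, say, twitter_credentials.csv, the api object used in the following steps could then be created with api = create_api(load_credentials("twitter_credentials.csv")). Keeping the keys in a separate file also keeps them out of the program's source code.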
Code for fetching the tweets
posts = api.user_timeline(screen_name="UserXYZ", count=100, lang="en", tweet_mode="extended")
Running the above command fetches the tweets and stores them in the variable posts.
4. Create Data Frame: Now, you need to create a data frame for the tweets you have fetched. Let’s say you name the first column of your data frame ‘Tweets’; it will contain all the tweets spread across 100 rows, since you are analyzing 100 tweets.
df = pd.DataFrame([tweet.full_text for tweet in posts], columns=['Tweets'])
5. Clean the Text: Cleaning the text of the tweets is important for the success of your Twitter sentiment analysis Python program, as it will contain many unwanted elements such as @ mentions, # symbols, retweet markers, and hyperlinks. This is where a Python text-processing library comes into use.
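One possible cleaning function is sketched below, assuming Python's built-in re (regular expression) module is used for the job. The function name clean_text and the exact patterns are illustrative choices, not a fixed recipe; extend them to suit your data.

```python
import re


def clean_text(text):
    """Strip links, retweet markers, @mentions, and '#' symbols from a tweet."""
    text = re.sub(r"https?://\S+", "", text)   # remove hyperlinks
    text = re.sub(r"\bRT\b\s*", "", text)      # remove the 'RT' retweet marker
    text = re.sub(r"@\w+:?", "", text)         # remove @mentions (and a trailing ':')
    text = text.replace("#", "")               # drop '#' but keep the hashtag word
    return re.sub(r"\s+", " ", text).strip()   # collapse leftover whitespace
```

The function can be applied to the whole column with df['Tweets'] = df['Tweets'].apply(clean_text). Hashtag words are kept on purpose here, since they often carry sentiment; only the symbol is dropped.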
Get the Subjectivity and Polarity: Once you have cleaned the text, you need to create two functions using the TextBlob Python library to get the tweets’ subjectivity and polarity. Subjectivity shows how opinionated the text is, and polarity describes how positive or negative it is. It is best to have the script create two more columns in your data frame to hold Subjectivity and Polarity, so that your data frame has three columns (the first for the tweets, the second for the subjectivity, and the third for the polarity).
The code for creating the Subjectivity and Polarity functions is as follows:
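A minimal sketch of the two functions, assuming TextBlob's default sentiment analyzer and its sentiment property; the function names get_subjectivity and get_polarity are illustrative.

```python
def _sentiment(text):
    """Return TextBlob's sentiment result (polarity, subjectivity) for the text."""
    # Imported inside the function so this module still loads if TextBlob
    # is not installed yet.
    from textblob import TextBlob

    return TextBlob(text).sentiment


def get_subjectivity(text):
    """Subjectivity score: 0.0 is fully objective, 1.0 is fully subjective."""
    return _sentiment(text).subjectivity


def get_polarity(text):
    """Polarity score: -1.0 is most negative, 1.0 is most positive."""
    return _sentiment(text).polarity
```

The two new columns can then be filled with df['Subjectivity'] = df['Tweets'].apply(get_subjectivity) and df['Polarity'] = df['Tweets'].apply(get_polarity).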
After you run the code, you will see the subjectivity and polarity scores of each tweet in the respective columns. TextBlob reports polarity on a scale from -1 to 1 and subjectivity on a scale from 0 to 1. So, a tweet with a polarity of -0.4 is slightly negative, and a tweet with a subjectivity of 0.6 is fairly subjective.
6. Next, you can choose to include a word cloud in your Twitter Sentiment Analysis Python program. Word clouds are a popular data visualization technique for sentiment analysis, in which the size of each word indicates its importance.
Example of a WordCloud:
The matplotlib, pandas, and WordCloud libraries that you have already imported now come into action. To plot a word cloud, you first need to create a variable, let’s name it ‘allwords’, to hold all the tweets in the ‘Tweets’ column of the data frame.
Code for creating WordCloud
allwords = ' '.join([twts for twts in df['Tweets']])
wordcloud = WordCloud(width=xxx, height=xxx, random_state=xxx, max_font_size=xxx).generate(allwords)
7. As you have the polarity scores for each tweet, you can start to compute positive, negative, and neutral analysis of the tweets. For this, you need to create a function, let’s call it ‘Analysis’, wherein you can assign the score 0 to neutral, <0 to negative, and >0 to positive.
def Analysis(score):
    if score < 0: return 'Negative'
    elif score == 0: return 'Neutral'
    else: return 'Positive'
Next, to host the results of the sentiment analysis of the tweets, create a new column in your data frame, let’s name it ‘TwtAnalysis’ and then write the following code:
df['TwtAnalysis'] = df['Polarity'].apply(Analysis)
8. The new data frame will have the added column named ‘TwtAnalysis’, which labels each tweet as positive, negative, or neutral based on its polarity score.
9. Once you have the classification of the tweets as positive, negative, and neutral, you can continue building your Twitter Sentiment Analysis Python program to represent the data in different formats such as:
- Get the percentage of positive, negative, or neutral tweets.
- Print the positive, negative, or neutral tweets separately.
- Create a visual sentiment analysis chart of the positive, negative, and neutral tweets, and much more.
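As an illustration of the first item, the share of each class can be computed from the labels in the ‘TwtAnalysis’ column. This is a sketch using only the standard library; the function name sentiment_percentages is hypothetical, and it works with whatever label strings your Analysis function returns.

```python
from collections import Counter


def sentiment_percentages(labels):
    """Return each label's share of the total as a percentage (one decimal)."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # no tweets, so there is nothing to report
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

Because the function accepts any iterable of labels, it can be called directly on the data frame column, for example sentiment_percentages(df['TwtAnalysis']). The resulting dictionary can also feed a matplotlib bar chart for the visual summary.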
The Twitter Sentiment Analysis Python program explained in this article is just one way to create such a program. The developer can customize it in many ways to match the specifications and achieve the best possible accuracy. That is the beauty of programming it in Python, a great language supported by an active community of developers and a large number of libraries.
Python holds immense scope in the space of Machine Learning and Data Science. Those who have been programming for a while know well that Machine Learning will continue to be one of the breakthroughs in the future of programming.