Predicting and analysing the stock market is one of the most difficult tasks in finance. There are several reasons for this, such as market volatility and the many dependent and independent factors that determine the value of a particular stock. These factors make it very difficult for any stock market analyst to predict rises and falls with a high degree of accuracy.
However, with the advent of Machine Learning and its robust algorithms, recent work in market analysis and stock market prediction has started to use such techniques to understand stock market data.
In short, Machine Learning algorithms are widely used by many organisations to analyse and predict stock values. This article goes through a simple implementation of analysing and predicting Microsoft Corporation's stock values using Machine Learning in Python.
Before we get into the program's implementation to predict the stock market values, let us look at the data on which we will be working. Here, we will be analysing the stock value of Microsoft Corporation (MSFT) from the National Association of Securities Dealers Automated Quotations (NASDAQ). The stock value data is provided as a comma-separated values (.csv) file, which can be opened and viewed in Excel or any other spreadsheet program.
MSFT is listed on NASDAQ and its values are updated on every working day of the stock market. Note that the market does not allow trading on Saturdays and Sundays; hence there are gaps between some consecutive dates. For each date, the Opening Value of the stock and its Highest and Lowest values on that day are noted, along with the Closing Value at the end of the day.
The Adjusted Close Value is the closing price adjusted for corporate actions such as dividends and stock splits. Additionally, the total volume of the stock traded on each day is given. With this data, it is up to the Machine Learning engineer or Data Scientist to study the data and implement algorithms that can extract patterns from the Microsoft Corporation stock's historical data.
Long Short-Term Memory
To develop a Machine Learning model to predict the stock prices of Microsoft Corporation, we will be using Long Short-Term Memory (LSTM). By definition, long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in deep learning. LSTMs make small modifications to the information they carry by means of multiplications and additions.
Unlike standard feed-forward neural networks, LSTM has feedback connections. It can process not only single data points (such as images) but also entire sequences of data (such as speech or video). To understand the concept behind LSTM, let us take a simple example of an online customer review of a mobile phone.
Suppose we want to buy a mobile phone. We usually look at the online reviews by verified users. Depending on their opinions, we decide whether the phone is good or bad and then buy it. As we go on reading the reviews, we look for keywords such as “amazing”, “good camera”, “best battery backup”, and many other terms related to a mobile phone.
We tend to ignore the common words in English such as “it”, “gave”, “this”, etc. Thus, when we decide whether to buy the mobile phone or not, we only remember these keywords defined above. Most probably, we forget the other words.
This is the same way in which the Long Short-Term Memory algorithm works. It remembers only the relevant information and uses it to make predictions, ignoring the non-relevant data. In the same way, we want to build an LSTM model that retains the essential information about the stock and discards the noise.
Though the structure of an LSTM may seem intricate at first, it is sufficient to remember that LSTM is an advanced version of Recurrent Neural Networks that retains memory to process sequences of data. It can remove or add information to the cell state, carefully regulated by structures called gates.
The LSTM unit comprises a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell.
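The way the three gates regulate the cell can be sketched in NumPy. This is a minimal illustration of one common LSTM formulation, not the Keras implementation; the function name `lstm_step`, the stacked weight layout, and the random weights are assumptions made only for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (hypothetical helper).

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) biases,
    stacked in the order forget, input, output, candidate.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])          # forget gate: how much old cell state to keep
    i = sigmoid(z[H:2 * H])      # input gate: how much new information to write
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the cell to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values to add to the cell
    c_t = f * c_prev + i * g     # cell update uses only multiplications and additions
    h_t = o * np.tanh(c_t)       # hidden state passed to the next step
    return h_t, c_t

rng = np.random.default_rng(0)
D, H = 4, 3                      # 4 input features, 3 hidden units (illustrative sizes)
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):  # a random sequence of 5 time steps
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

The cell state `c` is what carries information across time steps; the gates are values between 0 and 1 that scale what is kept, written, and read.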
We shall move on to the part where we put the LSTM into use in predicting the stock value using Machine Learning in Python.
Step 1 – Importing the Libraries
As we all know, the first step is to import libraries that are necessary to preprocess the stock data of Microsoft Corporation and the other required libraries for building and visualising the outputs of the LSTM model. For this, we will use the Keras library under the TensorFlow framework. The required modules are imported from the Keras library individually.
#Importing the Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_squared_error, r2_score
from sklearn import linear_model
from keras.models import Sequential, load_model
from keras.layers import LSTM, Dense, Dropout
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam
from keras.utils import plot_model
import keras.backend as K
Step 2 – Getting and Visualising the Data
Using the pandas read_csv function, we load the stock data from a local comma-separated values (.csv) file and store it in a pandas DataFrame. Finally, we view the first few rows of the data.
#Get the Dataset
df = pd.read_csv("MicrosoftStockData.csv", na_values=['null'], index_col='Date', parse_dates=True)
df.head()
Step 3 – Print the DataFrame Shape and Check for Null Values.
In this next step, we first print the shape of the dataset. We then check that there are no null values in the DataFrame. Null values cause problems during training: they propagate through the calculations as NaN and can prevent the model from learning.
#Print Dataframe shape and Check for Null Values
print("Dataframe Shape: ", df.shape)
print("Null Value Present: ", df.isnull().values.any())
>> Dataframe Shape: (7334, 6)
>> Null Value Present: False
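To see what the null check reports when a file does contain missing entries, here is a small sketch on an in-memory CSV. The three rows are hypothetical values invented for illustration; only the column layout follows the dataset described above.

```python
import io
import pandas as pd

# Hypothetical rows; the literal string "null" marks a missing value
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,10.0,11.0,9.5,10.5,10.4,1000
2020-01-03,10.5,11.5,10.0,null,null,1200
2020-01-06,11.0,12.0,10.5,11.5,11.4,900
"""

sample = pd.read_csv(io.StringIO(csv_text), na_values=["null"],
                     index_col="Date", parse_dates=True)

has_nulls = sample.isnull().values.any()   # True if any cell is missing
null_counts = sample.isnull().sum()        # missing cells per column
cleaned = sample.dropna()                  # one simple remedy: drop incomplete rows

print("Shape:", sample.shape)
print("Null Value Present:", has_nulls)
print(null_counts)
```

Dropping rows is only one option; whether to drop or fill missing days depends on how many there are and where they fall.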
Step 4 – Plotting the True Adjusted Close Value
The final output value that is to be predicted using the Machine Learning model is the Adjusted Close Value. This value represents the closing value of the stock on that particular day of stock market trading.
#Plot the True Adj Close Value
df['Adj Close'].plot()
Step 5 – Setting the Target Variable and Selecting the Features
In the next step, we assign the output column to the target variable. In this case, it is the Adjusted Close Value of the Microsoft stock. Additionally, we select the features that act as the independent variables for the target variable (dependent variable). For training, we choose four features: Open, High, Low and Volume.
#Set Target Variable
output_var = pd.DataFrame(df['Adj Close'])
#Selecting the Features
features = ['Open', 'High', 'Low', 'Volume']
Step 6 – Scaling
To keep training numerically stable, we scale the feature values to the range 0 to 1. Features such as Volume are several orders of magnitude larger than the prices, and bringing all of them onto the same scale stops the large values from dominating and usually helps the network converge. This is performed by the MinMaxScaler class of the scikit-learn library.
#Scaling
scaler = MinMaxScaler()
feature_transform = scaler.fit_transform(df[features])
feature_transform = pd.DataFrame(columns=features, data=feature_transform, index=df.index)
After scaling, every feature value lies between 0 and 1, far smaller than the raw values in the original data.
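The transformation itself is simple: each column is shifted by its minimum and divided by its range. The sketch below checks this against MinMaxScaler on a tiny array; the numbers are hypothetical and chosen only to make the result easy to read.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: column 0 is a price, column 1 is a volume
X = np.array([[10.0, 1000.0],
              [20.0, 3000.0],
              [15.0, 2000.0]])

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# The same result by hand: (x - min) / (max - min), computed per column
manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# inverse_transform maps scaled values back to the original units
restored = scaler.inverse_transform(X_scaled)

print(X_scaled)
```

Keeping the fitted scaler around is useful because `inverse_transform` recovers the original units when scaled values need to be reported.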
Step 7 – Splitting to a Training Set and Test Set.
Before feeding the data into the model, we need to split the entire dataset into a training set and a test set. The LSTM model will be trained on the training set, where backpropagation adjusts its weights, and then evaluated on the test set, which it has not seen, to measure its accuracy.
For this, we will be using the TimeSeriesSplit class of the scikit-learn library. We set the number of splits to 10, which produces 10 successive train/test folds, each with a test set of about 1/11 of the data (666 rows here). The loop below keeps only the last fold, so roughly 91% of the data (6,668 rows) is used for training the LSTM model and the most recent 9% is used as the test set. The advantage of this split for time series data is that the training samples always precede the test samples, so the model is never evaluated on data older than what it was trained on.
#Splitting to Training set and Test set
timesplit = TimeSeriesSplit(n_splits=10)
for train_index, test_index in timesplit.split(feature_transform):
    #Each pass overwrites the previous one, so the last (largest) fold is kept
    X_train, X_test = feature_transform[:len(train_index)], feature_transform[len(train_index): (len(train_index)+len(test_index))]
    y_train, y_test = output_var[:len(train_index)].values.ravel(), output_var[len(train_index): (len(train_index)+len(test_index))].values.ravel()
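To check what the split actually produces, the fold sizes can be printed directly. The sketch below uses a placeholder array of zeros with the 7,334 rows reported in Step 3, so it runs without the CSV file; only the row count matters here.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_rows = 7334                    # row count reported by df.shape in Step 3
X = np.zeros((n_rows, 4))        # placeholder data, only the length is used

timesplit = TimeSeriesSplit(n_splits=10)
fold_sizes = []
ordered = True
for train_index, test_index in timesplit.split(X):
    fold_sizes.append((len(train_index), len(test_index)))
    # every training row comes before every test row
    ordered = ordered and train_index.max() < test_index.min()

for k, (n_train, n_test) in enumerate(fold_sizes, start=1):
    print(f"fold {k:2d}: train={n_train}  test={n_test}")
```

The training set grows with each fold while the test set stays the same size, so the final fold is the one with the most training data.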
Step 8 – Processing the Data For LSTM
Once the training and test sets are ready, we can feed the data into the LSTM model once it is built. Before that, we need to convert the training and test set data into a form that the LSTM model will accept. We first convert the training data and test data to NumPy arrays and then reshape them to the format (Number of Samples, 1, Number of Features), as the LSTM requires that the data be fed in 3D form. As the training set from the last fold has 6,668 of the 7,334 samples and the number of features is 4, the training set is reshaped to (6668, 1, 4). Similarly, the test set is also reshaped.
#Process the data for LSTM
trainX = np.array(X_train)
testX = np.array(X_test)
X_train = trainX.reshape(X_train.shape[0], 1, X_train.shape[1])
X_test = testX.reshape(X_test.shape[0], 1, X_test.shape[1])
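The effect of the reshape is easiest to see on a small array. This sketch uses a hypothetical 6 x 4 array in place of the real features and shows that the values are unchanged; only a time-step axis of length 1 is inserted.

```python
import numpy as np

# Hypothetical stand-in for the scaled features: 6 samples, 4 features
X = np.arange(24, dtype=float).reshape(6, 4)

# (samples, features) -> (samples, time steps, features) with one time step
X_3d = X.reshape(X.shape[0], 1, X.shape[1])

print(X.shape, "->", X_3d.shape)
print(X_3d[0])   # first sample: one time step holding its four features
```

With a single time step, each sample carries only one day of features, so any memory of earlier days has to come from longer input windows rather than from this reshape.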
Step 9 – Building the LSTM Model
Finally, we come to the stage where we build the LSTM model. Here, we create a Sequential Keras model with one LSTM layer. The LSTM layer has 32 units and is followed by one Dense layer of 1 neuron.
We use the Adam optimizer and the Mean Squared Error loss function for compiling the model, a common combination for LSTM regression models. Additionally, the model is plotted with plot_model.
#Building the LSTM Model
lstm = Sequential()
lstm.add(LSTM(32, input_shape=(1, trainX.shape[1]), activation='relu', return_sequences=False))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error', optimizer='adam')
plot_model(lstm, show_shapes=True, show_layer_names=True)
Step 10 – Training the Model
Finally, we train the LSTM model designed above on the training data for 100 epochs with a batch size of 8 using the fit function.
history = lstm.fit(X_train, y_train, epochs=100, batch_size=8, verbose=1, shuffle=False)
834/834 [==============================] – 3s 2ms/step – loss: 67.1211
834/834 [==============================] – 1s 2ms/step – loss: 70.4911
834/834 [==============================] – 1s 2ms/step – loss: 48.8155
834/834 [==============================] – 1s 2ms/step – loss: 21.5447
834/834 [==============================] – 1s 2ms/step – loss: 6.1709
834/834 [==============================] – 1s 2ms/step – loss: 1.8726
834/834 [==============================] – 1s 2ms/step – loss: 0.9380
834/834 [==============================] – 2s 2ms/step – loss: 0.6566
834/834 [==============================] – 1s 2ms/step – loss: 0.5369
834/834 [==============================] – 2s 2ms/step – loss: 0.4761
834/834 [==============================] – 1s 2ms/step – loss: 0.4542
834/834 [==============================] – 2s 2ms/step – loss: 0.4553
834/834 [==============================] – 1s 2ms/step – loss: 0.4565
834/834 [==============================] – 1s 2ms/step – loss: 0.4576
834/834 [==============================] – 1s 2ms/step – loss: 0.4588
834/834 [==============================] – 1s 2ms/step – loss: 0.4599
In the training log above, the loss drops steeply over the first epochs, then levels off, and ends at 0.4599 after 100 epochs.
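The 834 at the start of each log line is the number of batches per epoch. It follows from the training set size and the batch size, as the short calculation below shows. The training size of 6,668 rows is an assumption taken from the last fold of a 10-way TimeSeriesSplit on 7,334 rows.

```python
import math

n_train = 6668      # assumed training rows (last fold of the 10-way split)
batch_size = 8      # batch size passed to lstm.fit

# The last batch is smaller than 8, so the division is rounded up
steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)
```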
Step 11 – LSTM Prediction
With our model ready, it is time to use the trained LSTM network on the test set and predict the Adjusted Close Value of the Microsoft stock. This is done by calling the predict function on the lstm model built above.
#LSTM Prediction
y_pred = lstm.predict(X_test)
Step 12 – True vs Predicted Adj Close Value – LSTM
Finally, as we have predicted the test set's values, we can plot a graph comparing the true Adj Close values with the values predicted by the LSTM model.
#True vs Predicted Adj Close Value – LSTM
plt.plot(y_test, label='True Value')
plt.plot(y_pred, label='LSTM Value')
plt.title("Prediction by LSTM")
plt.legend()
plt.show()
The above graph shows that some pattern is detected by the very basic single LSTM network model built above. By fine-tuning several parameters and adding more LSTM layers to the model, we can achieve a more accurate representation of any given company’s stock value.
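A plot shows the overall fit, but a number makes it easier to compare one model against another. The sketch below uses the mean_squared_error and r2_score functions imported in Step 1 on hypothetical arrays; the values are invented for illustration and are not results from the model above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true and predicted prices (not output of the article's model)
y_true = np.array([100.0, 102.0, 101.0, 105.0])
y_hat = np.array([99.0, 103.0, 100.0, 106.0])

rmse = np.sqrt(mean_squared_error(y_true, y_hat))  # error in the same units as the price
r2 = r2_score(y_true, y_hat)                       # 1.0 would be a perfect fit

print(f"RMSE: {rmse:.2f}")
print(f"R2:   {r2:.2f}")
```

For the model in this article, the same two calls would take `y_test` and `y_pred.ravel()` as arguments.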
Can you predict the stock market using machine learning?
Today, we have a number of indicators to help predict market trends, and machine learning can add to them. The stock market is an open system, and it can be viewed as a complex network made up of the relationships between stocks, companies, investors and trade volumes. A data-mining algorithm such as the support vector machine can be used to extract relationships among these variables from historical data. Such models can detect patterns, but they cannot guarantee accurate forecasts, because prices also react to events that past data does not contain.
Which algorithm is best for stock market prediction?
No single algorithm is best for every stock or time horizon. Linear Regression is a simple and common starting point. It is a statistical approach used to determine the relationship between two variables; in this example, the variables are price and time. In stock market prediction, time is the independent variable and price is the dependent variable. If a linear relationship between these two variables holds, it can be used to estimate the value of the stock at a future point, although real price series rarely stay linear for long. Sequence models such as the LSTM used in this article are an alternative when the relationship is non-linear.
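As a minimal sketch of this idea, the example below fits a straight line to price against a time index using NumPy's polyfit. The prices are hypothetical and perfectly linear by construction, which real prices are not; the point is only to show which variable plays which role.

```python
import numpy as np

# Hypothetical closing prices that rise by exactly 2 per day
days = np.arange(10)            # independent variable: time index
prices = 50.0 + 2.0 * days      # dependent variable: price

# Least-squares fit of price = slope * day + intercept
slope, intercept = np.polyfit(days, prices, deg=1)

next_day = 10
forecast = slope * next_day + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```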
Is stock market prediction a classification or regression problem?
Before we answer, we need to define what is being predicted. Suppose we want to predict the future of a stock, where future means the next day, week, month, or year. If the past performance of the stock is the input and the output is a continuous value, such as the next closing price, then it is a regression problem; this is the approach taken in this article. If the output is a discrete label, such as whether the price will rise or fall, then it is a classification problem.
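The same price series can be turned into either kind of target. The sketch below builds both from a short hypothetical list of closing prices; the variable names are illustrative.

```python
import numpy as np

# Hypothetical closing prices for five consecutive trading days
close = np.array([10.0, 11.0, 10.5, 10.5, 12.0])

X = close[:-1]                  # input: today's price

# Regression target: tomorrow's price (a continuous value)
y_regression = close[1:]

# Classification target: 1 if tomorrow's price is higher, else 0
y_classification = (np.diff(close) > 0).astype(int)

print(y_regression)
print(y_classification)
```

A regression model is scored on how far its predicted prices are from the true ones, while a classifier is scored on how often it gets the direction right.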