Credit Card Fraud Detection Using Machine Learning
Updated on Aug 07, 2025 | 7 min read | 11.65K+ views
In the financial industry, identifying fraudulent credit card transactions is a crucial task. In this project, we tackle it with machine learning-powered credit card fraud detection.
Our goal is to distinguish fraudulent from legitimate activity with high accuracy and few false alarms by examining patterns in transaction data. The project demonstrates how data science can help reduce risk and promote financial safety.
For more project ideas like this one, check out our blog post - Top 25+ Essential Data Science Projects GitHub to Explore in 2025.
Before you begin, you should be comfortable with Python, the pandas and NumPy libraries, and basic classification concepts such as train-test splits and evaluation metrics.
Note: This is an intermediate-level project, so expect to spend around 4-5 hours completing it.
Now let's see how to build a credit card fraud detection model using these skills. Without wasting any more time, let's start building the project from scratch!
We use KaggleHub to directly fetch the dataset from Kaggle's repository. Here is the code to do so:
import kagglehub
# Download dataset
path = kagglehub.dataset_download("mlg-ulb/creditcardfraud")
print("Path to dataset files:", path)
Output:
Path to dataset files: /kaggle/input/creditcardfraud
In this step, we will load essential libraries needed for analysis and model training. To do so, use the below-mentioned code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import gridspec
Now, in this step, we will load the .csv file. Once loaded, we will inspect the first few entries.
Use the below-mentioned code to do so:
data = pd.read_csv("/kaggle/input/creditcardfraud/creditcard.csv")
print(data.head())
print(data.describe())
Output:
Time | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | … | V21 | V22 | V23 | V24 | V25 | V26 | V27 | V28 | Amount | Class |
0 | -1.359807 | -0.072781 | 2.536347 | 1.378155 | -0.338321 | 0.462388 | 0.239599 | 0.098698 | 0.363787 | … | -0.018307 | 0.277838 | -0.110474 | 0.066928 | 0.128539 | -0.189115 | 0.133558 | -0.021053 | 149.62 | 0 |
0 | 1.191857 | 0.266151 | 0.166480 | 0.448154 | 0.060018 | -0.082361 | -0.078803 | 0.085102 | -0.255425 | … | -0.225775 | -0.638672 | 0.101288 | -0.339846 | 0.167170 | 0.125895 | -0.008983 | 0.014724 | 2.69 | 0 |
1 | -1.358354 | -1.340163 | 1.773209 | 0.379780 | -0.503198 | 1.800499 | 0.791461 | 0.247676 | -1.514654 | … | 0.247998 | 0.771679 | 0.909412 | -0.689281 | -0.327642 | -0.139097 | -0.055353 | -0.059752 | 378.66 | 0 |
1 | -0.966272 | -0.185226 | 1.792993 | -0.863291 | -0.010309 | 1.247203 | 0.237609 | 0.377436 | -1.387024 | … | -0.108300 | 0.005274 | -0.190321 | -1.175575 | 0.647376 | -0.221929 | 0.062723 | 0.061458 | 123.50 | 0 |
2 | -1.158233 | 0.877737 | 1.548718 | 0.403034 | -0.407193 | 0.095921 | 0.592941 | -0.270533 | 0.817739 | … | -0.009431 | 0.798278 | -0.137458 | 0.141267 | -0.206010 | 0.502292 | 0.219422 | 0.215153 | 69.99 | 0 |
[5 rows x 31 columns]
(pandas truncates the display, so columns V10-V20 are hidden above; data.describe() then prints an 8-row x 31-column statistical summary, omitted here.)
The dataset has 284,807 transactions and 31 columns: Time, Amount, the anonymized features V1-V28 (principal components produced to protect confidentiality), and a Class column indicating fraud (1) or not (0).
In this step, we will understand how many transactions are fraudulent. Use the below-mentioned code:
fraud = data[data['Class'] == 1]
valid = data[data['Class'] == 0]
outlierFraction = len(fraud) / float(len(valid))
print("Outlier Fraction:", outlierFraction)
print(f"Fraud Cases: {len(fraud)}")
print(f"Valid Transactions: {len(valid)}")
Output:
Outlier Fraction: 0.0017304750013189597
Fraud Cases: 492
Valid Transactions: 284315
We are dealing with an imbalanced classification problem: as the output shows, fraudulent transactions make up only about 0.17% of the dataset.
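To make the imbalance visible at a glance, you can plot the class counts. This optional snippet is not part of the original steps; it only uses the libraries already imported above:
sns.countplot(x='Class', data=data)  # one bar per class
plt.yscale('log')  # log scale keeps the tiny fraud bar visible
plt.title("Class Distribution (0 = Valid, 1 = Fraud)")
plt.show()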
In this step, we will explore the transaction amounts to see whether frauds tend to involve higher or lower amounts than legitimate transactions.
Use the below-mentioned code to accomplish the same:
print("Fraudulent Transactions Amount Stats:")
print(fraud['Amount'].describe())
print("Valid Transactions Amount Stats:")
print(valid['Amount'].describe())
Output:
Fraudulent Transactions Amount Stats:
Statistic | Value |
Count | 492 |
Mean | 122.211321 |
Standard Deviation | 256.683288 |
Min | 0.00 |
25% | 1.00 |
50% | 9.25 |
75% | 105.89 |
Max | 2125.87 |
Name: Amount, dtype: float64
Valid Transactions Amount Stats:
Statistic | Value |
Count | 284315 |
Mean | 88.291022 |
Standard Deviation | 250.105092 |
Min | 0.00 |
25% | 5.65 |
50% | 22.00 |
75% | 77.05 |
Max | 25691.16 |
Name: Amount, dtype: float64
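The stats show that fraudulent transactions have a higher mean amount (about 122 vs. 88) but a far lower maximum, so amount alone cannot separate the classes. The gridspec module imported earlier is handy for plotting the two Amount distributions side by side; here is a minimal sketch of that idea:
fig = plt.figure(figsize=(12, 4))
spec = gridspec.GridSpec(1, 2)

ax0 = fig.add_subplot(spec[0])
sns.histplot(fraud['Amount'], bins=50, ax=ax0)  # distribution of fraud amounts
ax0.set_title("Fraudulent Transaction Amounts")

ax1 = fig.add_subplot(spec[1])
sns.histplot(valid['Amount'], bins=50, ax=ax1)  # distribution of valid amounts
ax1.set_title("Valid Transaction Amounts")

plt.tight_layout()
plt.show()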
In this step, we will identify patterns and redundancy in features using a correlation matrix. Use the below-mentioned code to do so:
corrmat = data.corr()
plt.figure(figsize=(12, 9))
sns.heatmap(corrmat, vmax=0.8, square=True)
plt.title("Correlation Matrix")
plt.show()
Output: (heatmap of the 31x31 correlation matrix)
From the heatmap, we get to know that:
- The V1-V28 features are essentially uncorrelated with one another, which is expected since they are principal components.
- Only a handful of features show a visible correlation with Amount (for example, V2 and V5) or with Class, so the fraud signal is spread thinly across the anonymized features.
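To put numbers on what the heatmap shows, you can rank the features by the absolute value of their correlation with Class. This small addition is not in the original walkthrough:
# Rank features by absolute correlation with the Class label
classCorr = corrmat['Class'].drop('Class').abs().sort_values(ascending=False)
print(classCorr.head(10))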
In this step, we will prepare the data by splitting it into features (X) and target (Y). Once done, we will perform the train-test split.
Use the below-mentioned code:
from sklearn.model_selection import train_test_split
X = data.drop(['Class'], axis=1)
Y = data['Class']
xData = X.values
yData = Y.values
xTrain, xTest, yTrain, yTest = train_test_split(
xData, yData, test_size=0.2, random_state=42
)
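One caveat: with only 492 fraud cases, a plain random split can leave the test set with a slightly different fraud ratio than the full dataset. Passing stratify (an optional refinement, not part of the original code) preserves the ratio in both splits:
# Optional: stratify on the labels so train and test keep the same fraud ratio
xTrain, xTest, yTrain, yTest = train_test_split(
    xData, yData, test_size=0.2, random_state=42, stratify=yData
)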
In this step, we will train the Random Forest Classifier model. It is a robust model for tabular data.
Use the below-mentioned code to do so:
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(xTrain, yTrain)
yPred = rfc.predict(xTest)
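The default settings work reasonably well here, but on data this skewed it can help to weight the rare class more heavily. Here is a hedged variant; the hyperparameters below are illustrative assumptions, not the article's settings:
# Optional variant: weight classes inversely to their frequency in the data
rfc = RandomForestClassifier(n_estimators=100, class_weight='balanced',
                             random_state=42, n_jobs=-1)
rfc.fit(xTrain, yTrain)
yPred = rfc.predict(xTest)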
In this step, we will evaluate model performance. We will assess accuracy, precision, recall, F1-score, and MCC for a holistic view.
Use the below-mentioned code to do the same:
from sklearn.metrics import (
accuracy_score, precision_score, recall_score,
f1_score, matthews_corrcoef, confusion_matrix
)
accuracy = accuracy_score(yTest, yPred)
precision = precision_score(yTest, yPred)
recall = recall_score(yTest, yPred)
f1 = f1_score(yTest, yPred)
mcc = matthews_corrcoef(yTest, yPred)
print("Model Evaluation Metrics:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print(f"MCC: {mcc:.4f}")
Output:
Model Evaluation Metrics:
Metric | Value |
Accuracy | 0.9996 |
Precision | 0.9506 |
Recall | 0.7857 |
F1-Score | 0.8603 |
MCC | 0.864 |
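Given the imbalance, AUC-ROC (discussed in the FAQs below) is another worthwhile check. It scores how well the model ranks transactions by fraud probability rather than its hard 0/1 predictions; a minimal optional addition:
from sklearn.metrics import roc_auc_score

# AUC-ROC needs the predicted fraud probability, not the 0/1 label
yProb = rfc.predict_proba(xTest)[:, 1]
print(f"AUC-ROC: {roc_auc_score(yTest, yProb):.4f}")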
Use the code mentioned below to visualize the confusion matrix.
conf_matrix = confusion_matrix(yTest, yPred)
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues",
xticklabels=['Normal', 'Fraud'], yticklabels=['Normal', 'Fraud'])
plt.title("Confusion Matrix")
plt.xlabel("Predicted Class")
plt.ylabel("True Class")
plt.show()
Output: (heatmap of the confusion matrix)
For this project, we created a machine learning model using the Random Forest classifier to detect fraudulent transactions in a highly imbalanced dataset. The model was found to be highly accurate at 99.96%, but the real success lies in precision (95.06%), which shows that false alarms are rare, and recall (78.57%), which shows that most actual frauds are detected.
Consistent with the precision and recall above, the confusion matrix shows approximately:
- 56,860 valid transactions correctly classified (true negatives)
- 4 valid transactions wrongly flagged as fraud (false positives)
- 21 fraud cases missed (false negatives)
- 77 fraud cases correctly detected (true positives)
These results suggest that the model is not only accurate but also maintains a strong balance between detecting true fraud and minimizing disruption to genuine users.
Colab Link:
https://colab.research.google.com/drive/1q_lwxweIUaY-y4z-zyjxTo4Fgm9WJere?usp=sharing
Frequently Asked Questions (FAQs)
Why is credit card fraud detection difficult?
Credit card fraud detection is difficult because the dataset is extremely imbalanced: fraud cases make up only a tiny portion of total transactions. This makes it challenging for models to learn patterns of fraud without being biased toward predicting non-fraud.
What is the difference between precision and recall in fraud detection?
Precision tells us how many of the transactions flagged as fraud were actually fraudulent, while recall shows how many of the actual fraud cases the model detected. In fraud detection, both metrics are paramount: high precision minimizes false alarms, and high recall catches as many real frauds as possible.
What does the confusion matrix tell us?
The confusion matrix aids in visualizing model predictions by showing how many true frauds were correctly identified and how many were missed or incorrectly flagged. It breaks down performance into:
- True positives (frauds correctly detected)
- True negatives (valid transactions correctly classified)
- False positives (valid transactions wrongly flagged as fraud)
- False negatives (frauds the model missed)
Why can high accuracy be misleading here?
High accuracy can be misleading on imbalanced datasets because predicting all transactions as non-fraud would still yield very high accuracy. That's why precision, recall, and F1-score provide a more realistic picture of the model's effectiveness.
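For example, on this dataset a model that blindly labels every transaction as valid would score 284,315 / 284,807 ≈ 99.83% accuracy while catching zero frauds.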
How can you improve performance on imbalanced data?
To improve performance, you can use techniques like:
- Oversampling the minority class (for example, SMOTE)
- Undersampling the majority class
- Weighting the classes during training
- Ensemble learning
A sketch of the SMOTE approach appears after the next paragraph.
Besides the techniques above, fine-tuning the model and choosing the right evaluation metric also help.
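As a sketch of one of these techniques, SMOTE from the imbalanced-learn package (an extra dependency, not used in the walkthrough above) synthesizes new minority-class samples. Resample the training set only, never the test set:
# Requires: pip install imbalanced-learn
from imblearn.over_sampling import SMOTE

smote = SMOTE(random_state=42)
xTrainRes, yTrainRes = smote.fit_resample(xTrain, yTrain)  # oversample frauds
print("Resampled class counts:", np.bincount(yTrainRes))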
Which metrics should you use to evaluate a fraud detection model?
You should use metrics like accuracy, precision, recall, F1-score, and the AUC-ROC curve. These metrics help evaluate how well your model detects fraudulent transactions.
How should you handle missing values in the data?
You can either fill missing values with the mean, median, or mode, or remove rows with missing values, depending on how much data is missing and how important it is.
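This dataset has no missing values, so the snippet below is purely illustrative of both options:
# Fill missing Amount values with the column median...
data['Amount'] = data['Amount'].fillna(data['Amount'].median())
# ...or simply drop any rows that still contain missing values
data = data.dropna()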
How can you improve model accuracy?
Improve model accuracy by using techniques like ensemble learning, feature engineering, and hyperparameter tuning. Also, consider class-imbalance methods like SMOTE.
Can deep learning be used for fraud detection?
Yes, deep learning models like neural networks can capture complex patterns in fraud detection, but they require more data and computational power.
How do you make sure the model generalizes to unseen data?
After training your model, use cross-validation and evaluate it on a separate test set to ensure it performs well on unseen data. Additionally, use metrics like precision and recall to assess its ability to detect fraud.
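A minimal cross-validation sketch, scored on recall since missed frauds are the costly errors (this step is not part of the walkthrough above, and refitting the forest five times will take a while):
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation on the training data, scored by recall
cvRecall = cross_val_score(rfc, xTrain, yTrain, cv=5, scoring='recall')
print("Recall per fold:", cvRecall)
print(f"Mean recall: {cvRecall.mean():.4f}")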
Do you need labeled data for fraud detection?
Labeled data is highly recommended for supervised learning methods, as it trains the model to differentiate between fraud and non-fraud. Unsupervised methods do not require labels, though working without them can limit the accuracy you can achieve.
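As an unsupervised illustration (not part of the original project), scikit-learn's IsolationForest can flag anomalies without ever seeing the Class labels. The contamination value below is an assumption set to the fraud rate computed earlier:
from sklearn.ensemble import IsolationForest

# Unsupervised anomaly detection: the model never sees the Class labels
iso = IsolationForest(contamination=0.0017, random_state=42)
isoPred = iso.fit_predict(X)  # -1 = flagged as anomaly, 1 = normal
print("Transactions flagged as anomalous:", (isoPred == -1).sum())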