Industries in the 21st century depend heavily on data to develop insights and make decisions. Data science is therefore no longer a tangential need but an imperative for businesses. In this article, we examine data science and analytics as a key disruptor in the financial services sector, looking at how data is collected, analysed and integrated across banking touchpoints such as branches, mobile apps, online platforms, call centres and ATMs, and how this ties into a bank’s strategy.
It is not surprising that data is closely tied to the business efficiency of financial institutions, especially banks. Banking is among the business domains that invest most heavily in big data and business analytics (BA) technologies, and the application of data science in banking has gained wide acceptance. The banking sector faces intense competition from non-traditional players such as mobile-only banks and UPI (Unified Payments Interface) applications, and needs effective instruments to expand revenues and contain costs.
Internationally, market research companies have predicted that financial services companies will see their data grow at a CAGR of over 25% between 2018 and 2025. This means that data collection and analysis are not just imperative but integral to the success of banking strategies worldwide. We dive into the workflow of a data scientist and the impact of data analysis on future banking strategies.
Data Collection and Management
Over the last decade, data collection in the banking sector has become far more sophisticated, moving from bank ledgers to online transaction histories and to other touchpoints such as apps, call centres and ATMs. In fact, it is believed that half the world’s adult population now uses digital banking, and this share is expected to keep growing.
This means that every credit card transaction, message and web page opened on a banking website adds to the billions of bytes of data generated by the global population. Along with this, the rapid digitisation of the economy has given banks a fresh pool to tap into in the form of external sources such as mobile operators, geospatial data, credit bureaus and social networks. For a data analyst, the first step is to define and procure information from this pool, then structure it and use it effectively as intelligence for making smart business decisions and staying competitive.
Data management is the most important step in effectively harnessing data that is collected painstakingly from various channels. It covers data ownership, governance, storage, structuring, review and cleaning.
All these processes can be accomplished with the help of cloud-based storage, on-site servers, data warehouses and data lakes; by integrating the core banking system (CBS), customer relationship management (CRM) system, loan origination system (LOS) and other systems; with analytical and data visualisation tools; and through the creation of an organisational structure to optimise data usage. Additionally, centralised ownership of data helps ensure clear accountability within an organisation.
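As a rough illustration of what integrating these systems can look like at the data level, the sketch below merges records from three sources into a single view per customer. The record layouts, field names and customer IDs are hypothetical, and a real integration would also have to handle conflicting values, missing IDs and access controls.

```python
from collections import defaultdict

# Hypothetical extracts from three systems, all keyed by a shared customer ID.
cbs_accounts = [
    {"customer_id": "C001", "savings_balance": 52000.0},
    {"customer_id": "C002", "savings_balance": 1800.0},
]
crm_contacts = [
    {"customer_id": "C001", "preferred_channel": "mobile app"},
    {"customer_id": "C003", "preferred_channel": "branch"},
]
los_loans = [
    {"customer_id": "C002", "loan_outstanding": 250000.0},
]


def build_customer_view(*sources):
    """Merge records from every source into one dictionary per customer."""
    view = defaultdict(dict)
    for source in sources:
        for record in source:
            # Copy every field except the key itself into the combined record.
            fields = {k: v for k, v in record.items() if k != "customer_id"}
            view[record["customer_id"]].update(fields)
    return dict(view)


customer_view = build_customer_view(cbs_accounts, crm_contacts, los_loans)
for customer_id in sorted(customer_view):
    print(customer_id, customer_view[customer_id])
```

Customers that appear in only one system still get an entry, which makes gaps in coverage easy to spot during data review.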
Given that data is the greatest asset for banks, they can reap its benefits only by integrating data with analytics. Raw data alone is of little use unless it is supported by robust analysis. The job of a data analyst includes creating models to predict customer behaviour, understand customer preferences and proactively manage expectations.
Forward-looking analytics provide proactive triggers, such as default, attrition or propensity to buy a specific product, that predict future events in a customer’s journey and, in turn, help the bank take effective action. Over the years, rapid processing needs, innovative mobile technology, data availability and the proliferation of open-source software have produced several use cases of data science in banking.
Some of these include customer targeting and segmentation, customer activation and drop-out prevention, cross-selling models, and credit and risk management. Integrating analytics with data is critical for the following reasons.
1. Risk Modelling
Risk modelling refers to the use of formal econometric techniques to determine the aggregate risk in a financial portfolio. Financial risk modelling is the most important blueprint for a bank when analysing businesses as well as individuals. The techniques used for risk modelling include evaluating market risk, value at risk (VaR), historical simulation and the use of extreme value theory to forecast likely losses for a variety of risks.
The different risks are also grouped under the credit risk, market risk, model risk, liquidity risk and operational risk categories for businesses. Risk modelling should be done keeping in mind regional and international banking standards. With large amounts of data collected, powerful computing software can be used to perform quantitative risk analysis.
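To make the idea of VaR by historical simulation concrete, here is a minimal sketch. It assumes a short, made-up series of daily portfolio returns and uses a simple empirical quantile convention; production risk engines use far longer histories and follow the conventions set by the applicable regulations.

```python
import math


def historical_var(returns, confidence=0.95):
    """Estimate value at risk as an empirical quantile of past losses.

    Returns the loss (as a positive fraction of portfolio value) that was
    not exceeded in roughly `confidence` of the observed periods.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    losses = sorted(-r for r in returns)  # losses as positive numbers, ascending
    # round() guards against floating-point noise before taking the ceiling.
    rank = math.ceil(round(confidence * len(losses), 9))
    return losses[max(rank, 1) - 1]


# Hypothetical daily returns for a portfolio (fractions, not percentages).
daily_returns = [0.012, -0.008, 0.004, -0.021, 0.009,
                 -0.015, 0.002, 0.007, -0.034, 0.011]

var_90 = historical_var(daily_returns, confidence=0.90)
portfolio_value = 1_000_000  # assumed portfolio size
print(f"1-day 90% VaR: {var_90:.2%} of portfolio, "
      f"about {var_90 * portfolio_value:,.0f} in currency terms")
```

Historical simulation makes no assumption about the shape of the return distribution, but it can only reflect losses that actually occurred in the chosen window.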
2. Fraud Detection
Data analysis enables auditors and fraud examiners to analyse an organisation’s business data to determine how well it has implemented internal controls to identify fraudulent transactions or activity. Some of the techniques employed for this purpose include calculation of statistical parameters (averages, standard deviations, unnaturally high or low values), classification of data, arranging data to point to anomalies, digital analysis for unexpected occurrences, matching values, duplicate testing, gap testing, summing and validating entries.
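Three of the techniques listed above (statistical outliers, duplicate testing and gap testing) can be sketched in a few lines. The transaction records, reference numbers and the z-score threshold of 2.0 below are illustrative assumptions, not values taken from any real audit standard.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical transaction log with sequential reference numbers.
transactions = [
    {"ref": 1001, "account": "A1", "amount": 120.0},
    {"ref": 1002, "account": "A2", "amount": 95.0},
    {"ref": 1003, "account": "A1", "amount": 110.0},
    {"ref": 1005, "account": "A3", "amount": 105.0},
    {"ref": 1006, "account": "A2", "amount": 98.0},
    {"ref": 1007, "account": "A4", "amount": 5000.0},
    {"ref": 1008, "account": "A1", "amount": 120.0},
    {"ref": 1009, "account": "A3", "amount": 102.0},
]


def flag_outliers(rows, threshold=2.0):
    """Return refs whose amount lies more than `threshold` std devs from the mean."""
    amounts = [r["amount"] for r in rows]
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []
    return [r["ref"] for r in rows if abs(r["amount"] - mu) / sigma > threshold]


def find_duplicates(rows):
    """Return (account, amount) pairs that occur more than once."""
    counts = Counter((r["account"], r["amount"]) for r in rows)
    return sorted(pair for pair, n in counts.items() if n > 1)


def find_gaps(rows):
    """Return reference numbers missing from the otherwise continuous sequence."""
    refs = sorted(r["ref"] for r in rows)
    present = set(refs)
    return [n for n in range(refs[0], refs[-1] + 1) if n not in present]


print("Outliers:", flag_outliers(transactions))
print("Duplicates:", find_duplicates(transactions))
print("Gaps:", find_gaps(transactions))
```

A flagged item is only a prompt for review; repeated identical payments and missing references often have legitimate explanations.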
3. Real-Time Predictive Analysis
As the banking industry becomes more dynamic, banks are under constant pressure to stay profitable and, at the same time, understand the needs, wants and preferences of their customers. There is very little time between understanding the data and evolving a strategy. To address this time crunch, banks need to adopt real-time predictive analysis so that they can respond as quickly as possible.
The benefit of predictive analysis is that it supports ongoing learning from current data rather than relying solely on historical data. It can detect unfamiliar trends, rather than operating within a predefined structure, and requires building new models rather than reusing traditional ones.
It adopts visual discovery tools that are easier to use and considers external data as well. In effect, predictive analysis is more dynamic and bespoke. A few areas where predictive analysis comes in handy include customer relationship enhancement, collateral and liquidity management, cash flow management, trade and supply chain financing, risk management and support operations.
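The contrast between learning from current data and relying on a fixed historical model can be illustrated with a very small streaming estimator. The sketch below uses an exponentially weighted moving average as a stand-in for a real predictive model; the class name, the smoothing factor and the cash-demand figures are all hypothetical.

```python
class StreamingForecast:
    """Exponentially weighted moving average, updated one observation at a time."""

    def __init__(self, alpha=0.5):
        if not 0 < alpha <= 1:
            raise ValueError("alpha must be in the interval (0, 1]")
        self.alpha = alpha      # weight given to the newest observation
        self.estimate = None    # no forecast until the first observation arrives

    def update(self, observation):
        """Fold a new observation into the forecast and return the new estimate."""
        if self.estimate is None:
            self.estimate = float(observation)
        else:
            self.estimate = (self.alpha * observation
                             + (1 - self.alpha) * self.estimate)
        return self.estimate


# Hypothetical hourly cash withdrawals at one ATM (in thousands).
forecast = StreamingForecast(alpha=0.5)
for withdrawn in [100, 200, 100]:
    print(f"observed {withdrawn}, next-hour estimate {forecast.update(withdrawn):.1f}")
```

A higher `alpha` makes the estimate react faster to new behaviour at the cost of more noise, which is the basic trade-off in any model that keeps learning from live data.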
4. Understanding Consumer Sentiment
In a fiercely competitive banking sector, understanding and managing consumer sentiment is integral to the operation of any bank. Banks are rapidly seeking AI (Artificial Intelligence) solutions in the form of sentiment analysis. This is needed to capture a customer’s reaction to a product, situation or event, as expressed in texts, posts, reviews and other digital content, and to respond to it.
Natural Language Processing (NLP) techniques are used for nuanced analysis of this data, which is usually collected from chatbots, virtual assistants, online translation services and similar channels. Banks can then use this subjective information to personalise communication, prioritise customer issues, improve products and services, and enhance overall customer satisfaction.
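As a toy illustration of how customer messages can be turned into a sentiment signal, the sketch below counts positive and negative words from a tiny hand-made lexicon. The word lists and sample messages are assumptions for demonstration only; real NLP systems rely on trained language models that handle negation, sarcasm and context.

```python
import re

# Tiny illustrative lexicons; a real system would use a trained model instead.
POSITIVE_WORDS = {"great", "helpful", "fast", "easy", "love"}
NEGATIVE_WORDS = {"slow", "failed", "rude", "confusing", "hate"}


def sentiment_score(text):
    """Return (positive word count) minus (negative word count)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


def sentiment_label(text):
    """Map the score to a coarse label used for prioritising responses."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


messages = [
    "The new app is fast and easy to use",
    "My transfer failed and support was rude",
    "I visited the branch on Monday",
]
for message in messages:
    print(f"{sentiment_label(message):>8}: {message}")
```

Even a coarse label like this could be used to route negative messages to a human agent first, which is one way of prioritising customer issues.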
5. Customer Segmentation, Profiling and Market Segmentation
As we accelerate into the digital world, there is a need for banks to adopt a data-based approach to consumer segmentation, profiling and market segmentation in order to customise their strategies for different audiences.
Banks need to upgrade their customer strategies frequently to meet the demands of new and existing customer bases, and segmentation is critical to achieving this. Banks need to look at the entire cycle of creating customer value segments and understanding the customer relationship life cycle to create or upgrade their products accordingly.
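A data-based approach to value segments can start very simply. The sketch below splits customers into four segments by comparing each customer's average balance and monthly transaction count with the median of the group. The customer records, field names and segment labels are hypothetical, and banks typically use richer features and clustering methods in practice.

```python
from statistics import median

# Hypothetical customer summary records.
customers = [
    {"id": "C001", "avg_balance": 52000, "monthly_txns": 42},
    {"id": "C002", "avg_balance": 1800, "monthly_txns": 35},
    {"id": "C003", "avg_balance": 75000, "monthly_txns": 3},
    {"id": "C004", "avg_balance": 900, "monthly_txns": 2},
    {"id": "C005", "avg_balance": 20000, "monthly_txns": 12},
]


def segment_customers(rows):
    """Assign each customer to a segment based on group medians."""
    balance_cut = median([r["avg_balance"] for r in rows])
    activity_cut = median([r["monthly_txns"] for r in rows])
    segments = {}
    for r in rows:
        high_value = r["avg_balance"] >= balance_cut
        active = r["monthly_txns"] >= activity_cut
        if high_value and active:
            segments[r["id"]] = "high value, engaged"
        elif high_value:
            segments[r["id"]] = "high value, dormant"
        elif active:
            segments[r["id"]] = "active, low balance"
        else:
            segments[r["id"]] = "low engagement"
    return segments


for customer_id, segment in segment_customers(customers).items():
    print(customer_id, "->", segment)
```

Each segment suggests a different action, for example re-engagement offers for dormant high-value customers and savings products for active customers with low balances.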
6. Customer Lifetime Value
Customer Lifetime Value, or CLTV, refers to the complete financial value that a customer brings to the bank over their entire relationship with the bank. CLTV is calculated considering loyalty to the bank, the profitability of the customer, the average balances of loans and savings on a per customer basis, the average interest rate margin, average income or revenue per customer generated from non-interest income sources, and the cost of providing customer services and access. Automating the calculation of the CLTV based on these factors produces faster results, and enables banks to change or upgrade their strategies faster.
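One simplified way to combine these factors is to estimate a yearly profit per customer and sum it over the expected relationship, weighting each year by the chance the customer is still with the bank and discounting it to present value. The formula, parameter names and figures below are illustrative assumptions rather than a standard banking definition.

```python
def customer_lifetime_value(avg_balance, interest_margin, non_interest_income,
                            service_cost, retention_rate, discount_rate, years):
    """Sum discounted, retention-weighted yearly profit over `years`.

    avg_balance          average loan plus savings balance for the customer
    interest_margin      average interest rate margin earned on that balance
    non_interest_income  yearly fees and other non-interest revenue
    service_cost         yearly cost of serving the customer
    retention_rate       probability the customer stays for another year (loyalty)
    discount_rate        yearly rate used to discount future profit
    """
    yearly_profit = (avg_balance * interest_margin
                     + non_interest_income - service_cost)
    return sum(
        yearly_profit * retention_rate ** t / (1 + discount_rate) ** t
        for t in range(years)
    )


# Hypothetical customer: all figures are made up for illustration.
cltv = customer_lifetime_value(
    avg_balance=10_000, interest_margin=0.03, non_interest_income=50,
    service_cost=20, retention_rate=0.9, discount_rate=0.08, years=5,
)
print(f"Estimated 5-year CLTV: {cltv:,.2f}")
```

Because the calculation is a plain function of a few inputs, it can be re-run automatically whenever balances, margins or retention estimates change.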
7. Digital Assistants and Chatbots
Digital assistants and chatbots have revolutionised service and business communication for banks. Chatbots are creating a stir in social media. From assisting people with performing simple tasks to giving them a personalised experience, virtual assistants and chatbots have made banking easier.
8. Voice Recognition and Predictive Analysis
Banks have been able to test security features using AI-based voice recognition, which can automatically confirm the identity of a customer when they call into customer service. Given the success of this technology, customers may be able to speak to a virtual assistant to perform simple tasks such as transferring money or reporting stolen or lost ATM and credit cards, and receive an accurate response.
Most urban bank users rely heavily on mobile banking, which has given rise to AI-powered mobile banking apps. Most of these apps offer personal, contextual and predictive services. These are intelligent apps that can track the user’s behaviour, and provide them with personalised tips and insights on savings and expenses.
Aligning with Banking Strategies
The third step after data collection and modelling is to use such intelligence to optimise marketing campaigns, create relevant ads, automate outreach and optimise customer onboarding. Analytics is a key strategic pillar, apart from operations, that can greatly improve functional areas such as risk, compliance, fraud and non-performing asset (NPA) monitoring, and calculation of VaR. Banks can reduce excessive servicing costs and increase profitability if these actions are performed with the fewest possible roadblocks.
Future of Data Science in Banking
To push the boundaries of data utilisation, the future of banking is highly dependent on data encryption, AI and cloud computing. Since sensitive data is key to business, fully encrypted data can help secure ledgers so that they are very difficult to tamper with, and can strengthen financial security. It can also help in eliminating third-party verifications, thereby speeding up processes and saving on transaction fees.
Other promising applications for the future of data science in banking are seen in AI functions such as automated customer service response, real-time monitoring of regulatory requirements and algorithmic or micro-trading. The future of banking also includes cloud computing, which offers banks on-demand hardware and software resources that can help them scale up or down as per their requirements.
Cloud technology can also help banks lower their infrastructure costs, improve flexibility, increase efficiency and serve clients faster; all of this will contribute to a more efficient infrastructure and a better relationship with consumers. However, cloud computing carries significant regulatory implications for banks. Some of the key issues that banks are currently trying to address include ensuring service continuity for customers in the face of any potential mishap on the cloud and the need to transition back to their own databases in such scenarios.
Other important issues being evaluated include the way personal information is stored and used, customer data protection, dependence on third-party providers and security of cloud infrastructure, and potential mixing of financial data with other data on shared servers.
The future of data in banking also holds the promise of teams getting rid of a ‘siloed’ approach and using insights for the entire organisation. With data science, enterprises are likely to be more connected, instead of using separate analytics practices.
For example, marketing and digital (web, social and mobile) analytics, credit risk analytics, operations analytics, fraud analytics and compliance analytics will be able to leverage the same data structures. For instance, a team looking at increasing activation and product penetration by cross-selling across channels could potentially uncover fraudulent behaviour and anomalies.
The future of banking will involve leveraging data collected from various sources to get accurate customer insights and use advanced analytics with the help of digital technologies for increased profits and efficiency.
What is the hiring process of data scientists at Citibank?
A telephonic interview is the first step in the interview process. It is a straightforward Data Science Q&A session. An onsite interview follows the phone interview and includes interviews with team heads, team members, and SVPs. Before the onsite, there may or may not be an online SQL assessment. SQL evaluations are pretty tough. Citigroup's data science team works with Hadoop and Spark. Their questions cover a wide range of topics, including coding, SQL, Systems Design, Hadoop, and Spark, and are based on both fundamental and advanced Data Science concepts. Working hard on your fundamentals will improve your chances of getting hired by one of the world's largest banks.
Is Python used in investment banking?
Python is an excellent programming language for financial applications. Throughout the investment banking and hedge fund industries, banks are adopting Python to tackle quantitative problems in pricing, trade management, and risk management platforms, as well as predictive analysis. Pandas is a Python library that simplifies data manipulation and supports complex statistical analyses. Python is a popular programming language among financial data analysts, traders, cryptocurrency enthusiasts, and developers. Many bank positions require it, making it one of the most in-demand languages for people looking for work in one of these sectors.
Should I learn Python or R for finance?
R is used extensively for credit risk analysis and portfolio management. Python is a popular general-purpose language among investment banks and asset managers. R is often said to retain a modest advantage over Python in pure statistical work, though the margin has narrowed substantially. Python, on the other hand, has a wider range of applications, making it a better all-round pick. If you're just starting out in your profession, knowing Python will give you more options down the road.