This article was originally published in Analytics India Magazine.
What forms the core of businesses today? Huge volumes of data flow in and out every day, and though volume does matter, what really comes into play is the ability to use data and models to make better business decisions. UpGrad has collaborated with Uber on the Data Analytics Program content and a central case study.
Sai Alluri, Analytics Lead at Uber India, talks about the supply positioning models, segmentation and visualization tools applied at Uber, and how Uber stays on top of the game by understanding where the biggest mismatch between supply and demand lies.
Why this Program?
- Get a peek into how Uber analyses historical data, uses it as a benchmark and predicts future action
- Get pointers from the best in the industry: Sai Alluri, part of Uber’s PRO team
- Learn how to leverage analytics to stave off competition
Supply Optimization at Uber
The supply positioning model at Uber refers to anticipating demand patterns and placing driver-partners across demand hubs with the aim of meeting demand, lowering ETAs and increasing overall efficiency. One of the key focus areas is moving from a passive supply-positioning model to an active one that acts through specific recommendations across the network.
How is Supply Positioning Done At Uber?
In the words of Alluri, supply optimization is one of the biggest focuses at Uber, and the challenge is to position supply efficiently wherever demand is (or could be) high. One of the methodologies is search surge, applied in real time, meaning that supply moves to the area of highest demand. For example, a search surge multiple of 2x or 3x indicates how much demand there is in that particular area and how much supply you would need to meet it.
Building Models Based on Historical Data
Uber analyzes historical data for about three or four weeks and identifies pockets within the city that witness extremely high demand. Let’s take Gurgaon as a case in point. “Say, there is a high search-surge multiple in Connaught Place and our driver partner is in Gurgaon which is X kms from CP. It is very difficult for a driver to move from Gurgaon to CP given the traffic conditions and it might take him longer to reach. How do we know in advance where this demand is going to be based on historical data?” asks Alluri.
Key Steps to How the Model was Built
- Look at historical data for the last three or four weeks
- Look at the time, day and specific areas within the city where the highest demand comes in from
- Key metric is the number of requests coming in and how many are getting completed in different pockets of the city
- If a specific pocket has a really low ratio of completed trips to requests, it implies high demand in that hub but not enough supply
- The next step is to proactively tell drivers to move to these areas, not in real time but two to three hours ahead, so that they can position themselves there when the demand arises
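The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request log, its dates, the `completion_ratios` and `undersupplied` helpers and the 0.5 threshold are all hypothetical and are not Uber's actual data or pipeline.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical request log: (pocket, request time, was the trip completed?)
REQUESTS = [
    ("Connaught Place", "2016-05-02 09:15", True),
    ("Connaught Place", "2016-05-02 09:40", False),
    ("Connaught Place", "2016-05-09 09:05", False),
    ("Connaught Place", "2016-05-09 09:50", False),
    ("Gurgaon", "2016-05-02 09:10", True),
    ("Gurgaon", "2016-05-02 09:30", True),
    ("Gurgaon", "2016-05-09 09:20", True),
    ("Gurgaon", "2016-05-09 09:45", False),
]


def completion_ratios(requests):
    """Map (pocket, weekday, hour) -> (requests, completed, completion ratio)."""
    counts = defaultdict(lambda: [0, 0])  # slot -> [requests, completed]
    for pocket, ts, completed in requests:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        slot = (pocket, DAYS[when.weekday()], when.hour)
        counts[slot][0] += 1
        counts[slot][1] += int(completed)
    return {slot: (n, done, done / n) for slot, (n, done) in counts.items()}


def undersupplied(ratios, threshold=0.5):
    """Slots whose completion ratio is below the (assumed) threshold."""
    return sorted(slot for slot, (_, _, ratio) in ratios.items() if ratio < threshold)


ratios = completion_ratios(REQUESTS)
flagged = undersupplied(ratios)
for slot in flagged:
    print(slot, ratios[slot])
```

A slot that is flagged here has many requests but few completed trips, which is the signal the article describes as high demand without enough supply.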
Supply Positioning in a Nutshell
Uber does supply positioning by a) breaking down the city into multiple pockets, b) identifying the pockets that matter based on the demand parameters that show up, and c) figuring out how to position supply in those specific pockets.
“For example, a specific pocket has a low complete-request ratio or has a fewer number of rides completed as compared to other areas, what should be done is ensuring how to get drivers in the demand hub in time,” says Alluri.
The analysis is broken down by hour of the day, by day of the week and by specific pocket.
Meeting the Demand-Supply Gap with Predictive Analytics
So now that you have the information, how do you use it to inform future decisions? In case of Uber, the real challenge is in filling the demand-supply gap. “The idea is to figure out if the highest area of demand is in one specific pocket but the supply is going to come in from a different pocket. Which means we need to send this message to driver-partners early so that they can get to this specific area and are ready to go when the demand hits,” points out Alluri.
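As a rough sketch of the idea in the quote, the snippet below compares expected demand and expected supply per pocket for one time slot and suggests an early move from the pocket with the largest surplus to the pocket with the largest shortfall. The forecast numbers, the "Pocket C" entry, the `recommend_move` helper and the two-hour lead are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical forecast for one slot: pocket -> (expected requests, expected drivers)
FORECAST = {
    "Connaught Place": (120, 70),
    "Gurgaon": (60, 95),
    "Pocket C": (80, 75),
}


def supply_gaps(forecast):
    """Expected requests minus expected drivers per pocket (positive = shortfall)."""
    return {pocket: demand - supply for pocket, (demand, supply) in forecast.items()}


def recommend_move(forecast, slot_start, lead=timedelta(hours=2)):
    """Suggest moving drivers from the biggest surplus to the biggest shortfall."""
    gaps = supply_gaps(forecast)
    target = max(gaps, key=gaps.get)  # pocket with the largest shortfall
    source = min(gaps, key=gaps.get)  # pocket with the largest surplus
    if gaps[target] <= 0 or gaps[source] >= 0:
        return None  # nothing to rebalance in this slot
    return {
        "from": source,
        "to": target,
        "drivers": min(gaps[target], -gaps[source]),
        "notify_at": slot_start - lead,  # message goes out ahead of the demand
    }


move = recommend_move(FORECAST, datetime(2016, 5, 9, 18, 0))
print(move)
```

Sending the message at `notify_at` rather than at the start of the slot reflects the point that driver-partners need time to reach the pocket before the demand hits.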
Analysis is Automated to Drive Results
- Uber sends out automated weekly communications to drivers
- Weekly communications inform about high demand areas, with specific recommendations
- This enables driver-partners to make the best decisions, increase earnings and lower ETAs
Objective of Historical Analysis – Build Forecasting Model
Alluri explains that the idea behind analyzing three to four weeks of data for a specific city, broken down further by hub/pocket within the city and by hour of the day and day of the week, is to capture consistent behaviour across that period for that particular pocket. The motive is to set a benchmark and rule out weekly anomalies. The benchmark is then used to build a forecasting model that predicts where demand will be highest or supply lowest, and that is updated on a weekly or bi-weekly basis as the data changes.
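One simple way to set such a benchmark is to take the median across the weeks for each pocket, day and hour, since a median is not thrown off by a single unusual week. The sketch below assumes this approach; the counts, the `benchmark` and `anomalous_weeks` helpers and the 50% tolerance are hypothetical, and the article does not say which statistic Uber uses.

```python
from statistics import median

# Hypothetical weekly request counts per (pocket, weekday, hour) over four weeks
HISTORY = {
    ("Connaught Place", "Monday", 9): [110, 118, 240, 114],  # week 3 looks unusual
    ("Gurgaon", "Monday", 9): [60, 58, 63, 61],
}


def benchmark(history):
    """Median weekly count per slot, used as the baseline forecast."""
    return {slot: median(counts) for slot, counts in history.items()}


def anomalous_weeks(history, tolerance=0.5):
    """Weeks (1-based) that deviate from the slot benchmark by more than tolerance."""
    base = benchmark(history)
    return {
        slot: [
            week
            for week, count in enumerate(counts, start=1)
            if abs(count - base[slot]) > tolerance * base[slot]
        ]
        for slot, counts in history.items()
    }


baseline = benchmark(HISTORY)
outliers = anomalous_weeks(HISTORY)
print(baseline)
print(outliers)
```

Refreshing the model weekly or bi-weekly then amounts to dropping the oldest week from each list, appending the newest one and recomputing the benchmark.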
A/B Testing & Clustering/Segmentation Analysis
At Uber, the goal is to drive efficiency across all areas of business. A/B testing was used to find the most effective communications to send to driver-partners, both to address their issues and to convert drivers into loyal Uber partners through incentives.
“We want to make the process for a driver-partner signing up on our platform easy and scalable so that they can reach out to us for specific issues, such as using the app. For example, as soon as the driver becomes active on our system, we want to make sure if he has any questions pertaining to how do you go online or how do you essentially go pick up your customer (they are answered). So we monitor every aspect of this journey map at different cycles,” says Alluri.
The communication dispatch was targeted at converting drivers into loyal Uber partners. An A/B test was set up for two specific cohorts of drivers who had joined in the same week. Say there are 100 drivers in cohort A and another 100 in cohort B.
- The idea is to find out how many don’t take a trip in the first 3-4 days
- Reach out with specific communications to cohort A drivers who still haven’t been activated
- Check whether the communication improved efficiency and drove conversions vis-à-vis cohort B, which did not receive any messaging
The goal of the A/B test was to use resources, in this case communications and incentives, effectively:
- Lift conversions, urging drivers to become activated and to move from part time to full time
- Find out which communication (text or more personalized calls) is most effective
- Find out what the content should be and how to build the iterative process
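To judge whether a difference between the two cohorts is more than noise, one common choice is a two-proportion z-test. The sketch below assumes that test and uses made-up activation counts (62 of 100 in cohort A, 48 of 100 in cohort B); the article does not report real results or name the test Uber used.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (lift, z, two-sided p-value) for conversion rates of A versus B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return p_a - p_b, z, 2 * (1 - cdf)


# Hypothetical outcome: activated drivers in cohort A (messaged) and cohort B (control)
lift, z, p_value = two_proportion_z_test(62, 100, 48, 100)
print(f"lift={lift:.2f}, z={z:.2f}, p={p_value:.3f}")
```

With cohorts of only 100 drivers each, even a sizeable lift sits close to the usual 5% significance cut-off, which is one reason to run the test iteratively over several weekly cohorts.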
Clustering Analysis basically means breaking up huge data sets into further subsets to help get better insights into critical decision areas. “What happens with clustering/segmentation analysis is that, it is an iterative process, you keep building into the model and keep finding data sets so that you gain smarter and stronger insights,” notes Alluri.
In this case, segmentation was based on hours and trips. Alluri shares how the model was further optimized to include trips and how it led to increased revenue for drivers. “When we started this model initially it was meant as a question analysis and we used hours, that driver partners were putting in on a weekly basis or a daily basis as a variable. But as the model became smarter we wanted to include trips also to ensure that drivers that are driving at night or just part-time at night are not coming online for just 4-5 hours but are able to get trips, the end result is, they are engaged on our platform,” he says.
The end result was:
- Helping part-time drivers find trips at night (we don’t want a driver coming online at a wrong time)
- Helping them achieve their running target, thereby meeting revenue goals
- Boosting loyalty, converting from part time to full time (achieving day-time trips as well)
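The move from an hours-only view to hours plus trips can be illustrated with a small rule-based segmentation. Note that this is a simplification with fixed thresholds rather than a clustering algorithm, and that the `DriverWeek` record, the 40-hour cut-off and the one-trip-per-hour cut-off are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DriverWeek:
    driver_id: str
    hours_online: float
    trips: int


def segment(week, full_time_hours=40, min_trips_per_hour=1.0):
    """Assign a segment using both hours online and trips completed."""
    if week.hours_online == 0:
        return "inactive"
    trips_per_hour = week.trips / week.hours_online
    if trips_per_hour < min_trips_per_hour:
        return "online, few trips"  # online at the wrong time or place
    if week.hours_online >= full_time_hours:
        return "full-time, engaged"
    return "part-time, engaged"


# Hypothetical weekly activity for four driver-partners
WEEKS = [
    DriverWeek("D1", 45, 70),
    DriverWeek("D2", 20, 30),
    DriverWeek("D3", 25, 8),
    DriverWeek("D4", 0, 0),
]

segments = {week.driver_id: segment(week) for week in WEEKS}
print(segments)
```

Using hours alone, D2 and D3 would land in the same part-time segment; adding trips separates the driver who is online but not getting rides, which is the group the communications are meant to help.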
SQL Still Triumphs in Data Analytics
“Data warehousing is set in a way that we can do analysis on it, so it is easy for city teams and analysts to go into this data, get what you need to figure out what the biggest problems/issues are in those specific areas and how to go about fixing it,” explains Alluri.
Alluri explains why SQL is preferred in analytics:
- There are no manual mistakes
- Work out what information you need, write the query you want and run the logic in that query
- Once you have a file you are ready to share, you can keep adding analysis to it
- Automate it using either R or Python and gather information sets that are more useful
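As an illustration of running the logic in a query and automating it from Python, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `requests` table, its columns and its rows are hypothetical; Uber's actual warehouse schema and SQL dialect are not described in the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (pocket TEXT, requested_at TEXT, completed INTEGER)"
)
# Hypothetical rows: completed is 1 for a finished trip, 0 otherwise
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("Connaught Place", "2016-05-02 09:15", 1),
        ("Connaught Place", "2016-05-02 09:40", 0),
        ("Connaught Place", "2016-05-09 09:05", 0),
        ("Connaught Place", "2016-05-09 09:50", 0),
        ("Gurgaon", "2016-05-02 09:10", 1),
        ("Gurgaon", "2016-05-02 09:30", 1),
        ("Gurgaon", "2016-05-09 09:20", 1),
        ("Gurgaon", "2016-05-09 09:45", 0),
    ],
)

# Completion ratio by pocket, day of week (0 = Sunday in SQLite) and hour
QUERY = """
SELECT pocket,
       strftime('%w', requested_at) AS day_of_week,
       CAST(strftime('%H', requested_at) AS INTEGER) AS hour,
       COUNT(*) AS requests,
       SUM(completed) AS completed,
       ROUND(1.0 * SUM(completed) / COUNT(*), 2) AS completion_ratio
FROM requests
GROUP BY pocket, day_of_week, hour
ORDER BY completion_ratio
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the whole calculation lives in the query, the same script can be scheduled to run every week without anyone re-entering numbers by hand.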
Visualization at Uber
Visual analytics is used at Uber to make data more actionable and understandable. In India, one of the tools used by Uber’s city teams is the heat map, which shows exactly where the biggest mismatch between supply and demand is. The team uses visualization layers for most business-insight applications, and also to follow the sequence of data flowing in.
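The idea behind a supply-demand heat map can be shown with a text-only sketch: each cell is a pocket and an hour, and a darker character means a larger shortfall of drivers. The data, the shading characters and the helper names are assumptions for illustration; a city team would normally draw this with a mapping or plotting tool rather than plain text.

```python
# Hypothetical (requests, available drivers) per pocket and hour of day
DATA = {
    "Connaught Place": {8: (90, 80), 9: (120, 70), 10: (100, 85)},
    "Gurgaon": {8: (50, 60), 9: (60, 95), 10: (55, 70)},
}
SHADES = " .:*#"  # light (no shortfall) to dark (largest shortfall)


def mismatch_grid(data):
    """Unmet demand per pocket and hour; a surplus of drivers counts as zero."""
    return {
        pocket: {hour: max(req - drv, 0) for hour, (req, drv) in hours.items()}
        for pocket, hours in data.items()
    }


def render(grid):
    """Draw one row per pocket, one shaded character per hour."""
    peak = max(v for row in grid.values() for v in row.values()) or 1
    lines = []
    for pocket, row in grid.items():
        cells = "".join(
            SHADES[round(value / peak * (len(SHADES) - 1))]
            for _, value in sorted(row.items())
        )
        lines.append(f"{pocket:<16}|{cells}|")
    return "\n".join(lines)


def biggest_mismatch(grid):
    """The (pocket, hour) cell with the largest unmet demand."""
    cells = [(pocket, hour) for pocket, row in grid.items() for hour in row]
    return max(cells, key=lambda cell: grid[cell[0]][cell[1]])


grid = mismatch_grid(DATA)
print(render(grid))
print(biggest_mismatch(grid))
```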
About Sai Alluri:
Sai Alluri holds a degree in Mechanical Engineering from University of Illinois at Urbana-Champaign. He worked in consulting before joining Uber in San Francisco, California. Sai shifted to India last year to set up a team and focus on operational and analytical challenges in India. He is part of the industry professionals team working closely with UpGrad to create a world-class learning experience.
Haven’t had enough? Want to know more about this case-study or various other real-life examples from many other industry leaders who have partnered with UpGrad? Check out the UpGrad-IIIT Bangalore PG Diploma in Data Analytics Program now!
What is Uber?
Uber is a ride-hailing business that originated in San Francisco under the name UberCab. Uber collects information on its drivers: it studies their speed and acceleration and checks whether they are working for a competitor, in addition to gathering non-identifiable information on their car and location. Uber uses personal data to track which elements of the service are most popular, evaluate user trends, and determine where it should offer its services.
What is Data Visualization?
Data visualization is the process of converting massive data sets and measurements into charts, graphs, and other graphics. The visual representation of data makes it simpler to spot and act on real-time patterns, outliers, and fresh insights in the data. A dashboard provides views on one or more pages or screens to help you keep track of events or activities at a glance, and presents real-time data by extracting complicated data points from large data sets.
What is Data Segmentation?
Segmentation refers to dividing data according to your company's needs in order to refine your analyses within a defined context, using a cross-calculating analysis tool. The goal of segmentation is to gain a deeper understanding of your visitors as well as actionable data to improve your website or mobile app. A segment, in practical terms, allows you to filter your analysis based on specific elements (single or combined). Segmentation can be performed on elements linked to a single visit as well as on elements connected to several visits over the course of the study period. In the latter case, it is referred to as 'visitor segmentation'.