Anything that gains momentum quickly tends to become what everyone is talking about, and the more people talk about something, the more misconceptions and myths pile up. Data science and analytics is one such domain: it is continuously on the rise, and so is the number of myths associated with it.
Today, we’re going to debunk some of the myths and misconceptions surrounding the lives and work of data scientists. But before we move on to that, let’s first look at a typical day in the life of a data scientist.
An organisation has heaps of data which they’ve collected over time from various sources and in various formats. Now, they’ve decided to do something about it. They want to make their data count. Who do they turn to?
Yes, data scientists, whom most people mistake for some kind of supernatural beings. These people are the heart and soul of any organisation’s data analytics team. They hold a vital position and, though it might come as a surprise to you, their regular day is quite like the typical day of any other white-collar employee.
Meetings, meetings, and some more meetings!
Data scientists have to attend meetings, mostly on a daily basis, to gather requirements, discuss the work accomplished, and plan the day’s work. There are also internal meetings that matter for meeting organisational goals and overcoming business problems. All in all, the purpose of these meetings is to get a clearer idea of the problems at hand and make sure everyone in the organisation agrees on the way forward.
Scrounge for data and make it pristine!
Part of their day goes into identifying real-world issues their organisation is facing and finding ways to make their data help solve those problems. Then comes a more challenging part: determining the type and source of data required. An experienced data scientist picks the data from the most relevant sources, the ones that are likely to deliver value. However, this judgement only comes with experience and expertise, so data scientists need to spend quite a lot of time on it.
However, gathering the data is only half the job. The data scientist also needs to make sure that the data is validated and cleaned. If they work with imperfect data, their chances of success drop sharply.
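As a rough illustration of what validating and cleaning can involve, here is a minimal Python sketch. The record layout (`customer_id`, `age`, `spend`), the sample rows, and the validity rules are all hypothetical and chosen only for illustration; real projects usually apply domain-specific rules and often use dedicated libraries.

```python
from typing import Optional

# Hypothetical raw records, as they might arrive from different sources.
RAW_RECORDS = [
    {"customer_id": "C001", "age": "34", "spend": "120.50"},
    {"customer_id": "C002", "age": "", "spend": "80"},        # missing age
    {"customer_id": "C001", "age": "34", "spend": "120.50"},  # duplicate
    {"customer_id": "C003", "age": "-5", "spend": "42.10"},   # impossible age
    {"customer_id": "C004", "age": " 51 ", "spend": "abc"},   # unparseable spend
    {"customer_id": "C005", "age": "28", "spend": " 310.00 "},
]


def clean_record(raw: dict) -> Optional[dict]:
    """Return a typed, validated record, or None if the record is unusable."""
    try:
        age = int(raw["age"].strip())
        spend = float(raw["spend"].strip())
    except ValueError:
        return None  # missing or non-numeric field
    if not 0 <= age <= 120 or spend < 0:
        return None  # value outside the assumed plausible range
    return {"customer_id": raw["customer_id"].strip(), "age": age, "spend": spend}


def clean_dataset(records: list) -> list:
    """Validate every record and drop duplicates by customer_id (first one wins)."""
    seen = set()
    cleaned = []
    for raw in records:
        record = clean_record(raw)
        if record is None or record["customer_id"] in seen:
            continue
        seen.add(record["customer_id"])
        cleaned.append(record)
    return cleaned


if __name__ == "__main__":
    for row in clean_dataset(RAW_RECORDS):
        print(row)
```

In this sketch, bad rows are simply dropped. Whether to drop, repair, or flag such rows is a judgement call that depends on the problem being solved.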
Get to doing magic. We mean analytics.
Once the data is cleaned, the data scientist spends their remaining time identifying trends and patterns in it. This is another challenging aspect of a data scientist’s job, especially since there is no set method for analysing data efficiently. More often than not, it requires a data scientist to design their own tools and algorithms or tweak the existing ones. This demands an open mind and a willingness to experiment.
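To make "identifying trends" a little more concrete, here is one simple possibility sketched in Python: smoothing a series with a moving average and estimating its overall direction with a least-squares slope. The sales figures and function names are made up for illustration, and this is only one of many ways to look for a trend.

```python
from statistics import mean

# Hypothetical monthly sales figures; the numbers are made up for illustration.
MONTHLY_SALES = [100, 104, 99, 110, 115, 112, 121, 125]


def moving_average(values, window):
    """Smooth a series by averaging each run of `window` consecutive points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def trend_slope(values):
    """Least-squares slope of the values against their index (change per step)."""
    if len(values) < 2:
        raise ValueError("need at least two points to estimate a slope")
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


if __name__ == "__main__":
    print("Smoothed:", moving_average(MONTHLY_SALES, window=3))
    print("Average change per month:", trend_slope(MONTHLY_SALES))
```

A positive slope suggests an upward trend and a negative one a downward trend. Deciding whether that trend is meaningful is where the experimentation mentioned above comes in.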
Weave a story.
After analysing the datasets comes the most important part: data visualisation. Data scientists need to present their findings to an audience that is largely non-technical, such as the company’s stakeholders and marketers. This isn’t always a daily task, but it needs to be done frequently to keep things in motion. Much of the data scientist’s workload here involves coming up with a visualisation technique that not only captures the essence of their data but also presents everything in an aesthetically pleasing manner.
The role of a data scientist is extremely dynamic; no two days are the same for them. Their job requires them to be on their toes and always have their thinking hats on. The data they’re working with, the problems they’re aiming to solve, and the insights they’re looking to discover are all constantly changing. That is what makes the role of a data scientist so unique and exciting.
Now, let’s go a step further and debunk more of these sometimes preposterous myths:
Myth #1: You need to be an expert statistician with a PhD in statistics. Or, at the very least, you must have a degree in statistics.
Yes, holding a formal degree in statistics will ensure that you’re familiar with good statistical practice from day one. However, hold your horses there: if you look at the world of data science, you’ll find more people from a managerial or non-mathematics background than math-addicted “rocket scientists”.
Myth #2: You need to be a hardcore programmer to excel at data science. The more hardcore, the better.
Like the previous myth, this one is based on a false assumption about the data scientist’s job. People assume that being a data scientist involves writing lines of code, algorithms, and whatnot! But if you paid attention to the routine we discussed earlier, you’ll have noticed there’s no significant “coding” involved there. Most of the algorithms and methods are available ready-made, with just a little tweaking needed. However, you do need a logical bent of mind to do that.
Myth #3: Data scientists aren’t scientists in any meaningful sense of the word.
Every scientist is by default a data scientist. Pure science has always co-existed with observational data. Without the ability to sift, sort, structure, classify, theorize, and present their data, no scientist can bring coherence to their study. Similarly, a data scientist who hasn’t drilled deep into the heart of their data cannot present their findings effectively. Statistical controls have always been a bedrock of pure science, and now they’re a fundamental responsibility of a data scientist. So, if a data scientist is observing the trends and patterns in the behaviour of an organisation’s customers, and confirming their findings using statistics and real-world experiments, they’re a scientist, plain and simple.
Myth #4: Data scientists need costly and complicated statistical tools to get their work done.
Essentially, the job of a data scientist requires them to look for hidden trends and patterns in a large set of data. For that, they can use user-friendly visualisation tools, self-service search-driven business intelligence tools, interactive data exploration tools, or even simple tools that don’t require much statistical mastery. In fact, many business analysts find profound insights just by modelling the data in a basic spreadsheet application.
Myth #5: Data science is all about feeding data into Hadoop clusters and using MapReduce. Simple!
If people explored before spreading myths, we wouldn’t be here. If you talk to a data scientist, you’ll realise that there’s far more to data science and analytics than Hadoop and MapReduce. These are just two of the many tools available. More often than not, a successful data science project uses an array of tools at various stages. Hence, a data scientist is expected to stay on top of major technological advancements in this domain, so that they can switch to the appropriate tool or technology whenever needed. When it comes to data science, one size does not fit all, and there is no magic Ouija board to make the data science spirits talk to us mortals.
We hope you enjoyed getting your vision broadened! Stick with us; we’ll be back with more such Mythbusters.