Thulasiram received his PG Diploma in Data Analytics from UpGrad and IIIT-Bangalore and now works at UpGrad as a Program Associate for Data Analysis.
Years ago, I read a story in the book The Monk Who Sold His Ferrari, a story I remember to this day.
There was once a teacher renowned for his archery skills. It was said that the teacher’s arrow would never miss the bull’s eye. His skills attracted many students. One day, the teacher asked one of his students to blindfold him. Then, he asked the student to set up a target and give him his bow and arrow. All the students held their breath and watched the teacher. In their hearts, they were sure that the teacher’s arrow would hit the target. As everyone was watching with bated breath, the teacher’s arrow left the bow and missed the target! The students were left flabbergasted. One of the students removed the master’s blindfold and asked him, “How could you possibly miss the target?” The master replied –
“One cannot hit a target which one cannot see!”
A long time ago, I attended a teacher training event. In one of the lectures, we were told that we should inform students in advance about what they will learn in a particular session since that will help them prepare better for what is going to come. Learning occurs best when the students know why they are learning what they are learning and how it can be utilised. Context helps facilitate learning. In the process of trying to solve a difficult problem in creative ways, learning occurs naturally, and what we learn stays with us.
How can we apply these general principles to data science? The first step in mastering data science is to have a measurable goal. You should have a passion and love for data. In today’s Digi-verse, it is easy to get lost or distracted. Balancing work, family, and learning makes it all the more complicated. So, it is crucial to set a learning goal and follow it meticulously.
Secondly, practice is a must to master any skill. In the book Outliers, Malcolm Gladwell claims it takes 10,000 hours to master a skill. There is a bit of controversy around that number, but one thing is clear: without practice, mastery in any skill can’t be achieved. Data science is no exception. The best approach is to work on a problem that you care about, one that inspires you. Learning for the sake of learning will not last long, and theory will be forgotten along the way.
Recently, I read an article about a student who helped solve the problem of detecting eye cancer by analysing publicly available images. She learned everything required to solve the problem on the way and achieved something great. Data science, by definition, is an interdisciplinary subject. It involves linear algebra, programming, statistics, computer infrastructure and many more areas. This long list is intimidating to all of us who would like to pass through the gates of data science.
Hence, it will do you good to initially focus on two or three algorithms and try to apply these to the problems you choose. Instead of learning a little about all the algorithms under the sun, explore a few in depth. Presently, the world's resources are scarce, and the demand for them is far too high. Using data science well can help us find ways to make the best use of these scarce resources.
Moreover, try to experiment when you are working on a problem. There are several ways you could proceed, and you will only know the right one once you try it. Experiments will prepare you for data science tasks such as improving the accuracy of algorithms, reducing execution time, parallelising the execution, and making the model production-ready. In many of the job interviews I have attended, a common question was about the data sets I had handled or the real-world problems I had solved. Working on a problem will assuredly be a big asset in your job hunt as well.
Communicating the results is as important as the analysis itself. Data science is all about storytelling. In another interview, I was asked –
“How will you explain the median to your grandmother?”
I had to improvise to explain the jargon in simple words. In an organization, the management will often not be familiar with the jargon, so simple and effective communication will be key. Try explaining data science concepts to children (or grandparents!), and see if they understand.
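One plain-words version of the median might be: line everyone up from smallest to largest and pick the person standing in the middle. The short Python sketch below follows that description step by step. The function name `middle_value` and the list of ages are hypothetical, made up purely for illustration; Python's built-in `statistics.median` does the same job in practice.

```python
from statistics import median


def middle_value(values):
    """Return the median: the middle item once the values are lined up in order."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)      # line everyone up from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd count: exactly one value sits in the middle
    # even count: two values share the middle, so take their average
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical ages at a family gathering (illustrative data only).
ages = [4, 7, 9, 35, 70]
print(middle_value(ages))         # the middle age once sorted
print(median(ages))               # the standard library agrees
```

Notice that the 70-year-old grandparent pulls the average age up a lot, while the middle value barely moves. That contrast is often the easiest way to explain why the median is useful.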
Finally, everything boils down to visibility and presence. As an aspiring data scientist, it is crucial to improve your digital visibility. Create a GitHub account, and upload all your projects to showcase them. Contribute to blogs on data science. Join data science groups and contribute to the ongoing discussions. Try answering questions on Quora, Reddit, and other popular forums.
These are some of my meditations on walking the long path to data science mastery. You will definitely have other ideas or suggestions. Please share your thoughts, articles, videos, and podcasts, or anything else that you feel will be helpful to your peers while they walk this arduous but exciting, and extremely fulfilling, path with you!