One of the questions we are asked most often is what “a day in the life of a data scientist” looks like. Here we give a light description of it so that you can make an informed decision about whether this career is the right one for you.
At the outset, let us be very clear. It is nigh on impossible to characterize a single day in the life of a data scientist. Because the job is so varied and the profession so complex, a typical day depends on multiple factors. The first factor is the type of data project you are working on, which can change monthly or quarterly. The second is more systemic and depends on the kind of organization you work in.
A hierarchical structure gives a different experience from a team-based one. The third factor is your role within the team. Whether you are a senior, a junior, or the only data scientist on the team will shape your typical workday.
But once you average over all of these, an ordinary day for a data scientist might look something like what follows. There are three main things a data scientist does in a day. Unsurprisingly, the majority of the time goes into coding. The remaining time is split roughly equally between meetings and thinking.
Here, thinking refers to personal reflection; group-think counts as meeting time. Keep in mind that no project can be finished in a single day. So, on most days, your job will involve one of the three: continuing the discussions, the thinking, or the work on an existing project from where you stopped the previous day. Let us discuss each of them in slightly more detail.
As a data scientist, you can expect coding to take about 70% of your time. It can even exceed that. That is no surprise, considering that the primary job of a data scientist is to code. Like any other scientist, a data scientist has various tools and languages at their disposal.
Some of the more familiar ones are Python, SQL, and Bash. For this reason, coding is the most important skill you can learn if you want to become a data scientist. Statistics and business thinking round off the key skills, but they are secondary to coding.
However, coding is a broad term, and it helps to look at some of the typical tasks it covers. A few of them are described briefly below. Data cleaning and formatting is perhaps the most laborious and time-consuming job within coding.
This may sound counter-intuitive, but it holds. The process involves bringing the data into a consistent format that you can work with in the project’s later stages. While this can be explained in one line, achieving it is one of the most arduous parts of the job.
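To make the cleaning step concrete, here is a minimal sketch in Python using only the standard library. The sample data, the column names, and the two date formats are made up for illustration. A real project would more likely use a library such as pandas and would have to deal with far messier input.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export: stray whitespace, inconsistent casing,
# mixed date formats, missing values, and a duplicated customer row.
RAW = """customer, signup_date ,spend
 Alice ,2023-01-05,120.50
BOB,05/02/2023,
alice,2023-01-05,120.50
Carol,2023-03-11,n/a
"""

# Assumed date formats for this sample only.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def parse_amount(text):
    """Return a float, or None for blanks and placeholders such as 'n/a'."""
    try:
        return float(text)
    except ValueError:
        return None


def clean(raw_text):
    """Normalize headers and values, then drop exact duplicate rows."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [name.strip().lower() for name in next(reader)]
    seen, rows = set(), []
    for record in reader:
        if not record:  # skip blank lines
            continue
        row = dict(zip(header, (value.strip() for value in record)))
        row["customer"] = row["customer"].title()
        row["signup_date"] = parse_date(row["signup_date"])
        row["spend"] = parse_amount(row["spend"])
        key = tuple(row.values())
        if key not in seen:  # keep only the first copy of each row
            seen.add(key)
            rows.append(row)
    return rows


cleaned = clean(RAW)
for row in cleaned:
    print(row)
```

Even this tiny example needs separate handling for whitespace, casing, dates, missing values, and duplicates, which hints at why the step eats so much time on real data.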
Once data cleaning and formatting is complete, the next task is typically prototyping. Prototyping means trying the data against various analytics and machine learning methods.
This helps you choose the method that fits best. Many data scientists consider this stage challenging, but they will be the first to point out that it is also one of the most exciting parts of the entire sequence. That is because raw data becomes valuable at this step, much like precious metal extracted from ore.
We mentioned some of the tools before, and there is compatible prototyping software for each of them. You can mix and match here to see what works in a particular environment and what feels most comfortable to you. Remember that this stage is not for drawing final inferences from the data. Instead, this is the point where you check what works and what does not.
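As a rough illustration of what "checking what works" can mean, the sketch below fits two candidate methods on part of a toy dataset and compares their error on held-out points. The data, the two candidate methods, and the choice of mean absolute error as the metric are all assumptions made for this example. In practice you would likely reach for a library such as scikit-learn and a richer set of models.

```python
from statistics import mean

# Hypothetical toy data: (ad_spend, sales) pairs invented for illustration.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8),
        (5, 10.1), (6, 12.2), (7, 13.8), (8, 16.1)]
train, test = data[:6], data[6:]  # simple hold-out split


def fit_mean_baseline(rows):
    """Candidate 1: always predict the average of the training targets."""
    average = mean(y for _, y in rows)
    return lambda x: average


def fit_linear(rows):
    """Candidate 2: ordinary least squares fit of a straight line."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_absolute_error(model, rows):
    """Average absolute gap between predictions and actual values."""
    return mean(abs(model(x) - y) for x, y in rows)


candidates = {"mean baseline": fit_mean_baseline, "linear fit": fit_linear}
scores = {name: mean_absolute_error(fit(train), test)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: held-out error {score:.2f}")
print("best candidate:", best)
```

The point of such a loop is a quick comparison, not a final verdict. Whichever candidate comes out ahead still needs proper validation later.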
The next steps vary with the final aim of the project. For example, the work could be for a meeting with your team or seniors. In such cases, you need to turn your data into visual representations and report the findings. These then go into your presentation.
On the other hand, if it is a report that your colleagues might find useful in the future, then your primary job after prototyping is to automate it and make it accessible to everyone in the company. Finally, and perhaps most excitingly, if you are in charge of machine learning or analysis that will be turned into a service or a product, then your job is to figure out the implementation. At this point, developers will also assist you.
To summarize what we have learnt so far about coding, the first steps are data cleaning and formatting, followed by prototyping. The subsequent steps may include creating data visualizations, automating the project, or implementing your models as a product or a service, to name a few.
Other miscellaneous activities could have been included in this section, but they crop up only from time to time and are not part of the normal process. They include bug fixing, tutorials on new packages and libraries, and maintenance of previously written scripts. There is always something to do when you are a data scientist.
Meetings, Presentations, Talking and Brainstorming with the Group
Since coding takes up about 70% of the time, 30% is left. Half of that, or 15% of the total time, is spent meeting with people. These meetings can take different forms, such as formal meetings, one-on-one sessions, presentations, discussions at the water cooler, or even group chat.
Staying in touch with your team members is vitally important because there is often just one data scientist in the entire team, and your colleagues may not know exactly what you do. You must take them along with you. This need not be tedious, because it also earns you greater cooperation from them. You can get more assistance from them in your big data projects and therefore have a bigger impact.
Hence, it is important to develop a rapport with your colleagues, even if you are naturally introverted. But a word of caution is necessary here. Especially at bigger companies, there is a habit of holding meetings throughout the day. This means sitting and talking with no time left for actual coding. At the end of the day, you will find your work piling up with nobody there to support you. Therefore, remain in contact with your fellow workers, but do not overdo it to the point where it becomes counterproductive.
The way you manage this issue can be crucial to your chances of progression in the organization. First of all, remember that, as a rough guide, you should not spend more than 15% of your working hours in meetings. Keeping this benchmark in mind, first develop a bond with your teammates and your manager. After that, sit down with them and explain what your work entails, so that you need to attend only the meetings that are essential to your work.
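If you want to check yourself against the 15% benchmark, the arithmetic is simple enough to script. The sketch below assumes a 40-hour working week and a hand-kept log of meeting hours. Both are illustrative values, not figures from any real calendar.

```python
# Rough ceiling on meeting time, taken from the 15% guideline in the text.
MEETING_BENCHMARK = 0.15

# Assumption for this example: a 40-hour working week.
WORKING_HOURS = 40

# Hypothetical log of hours spent in meetings, Monday to Friday.
week_meetings = [1.0, 2.5, 0.5, 3.0, 1.5]


def meeting_share(meeting_hours, working_hours):
    """Return the fraction of working time that was spent in meetings."""
    if working_hours <= 0:
        raise ValueError("working_hours must be positive")
    return sum(meeting_hours) / working_hours


share = meeting_share(week_meetings, WORKING_HOURS)
over_budget = share > MEETING_BENCHMARK

print(f"Meetings took {share:.1%} of the week "
      f"(benchmark {MEETING_BENCHMARK:.0%}).")
if over_budget:
    print("Over the benchmark: consider dropping non-essential meetings.")
```

With the sample log above, 8.5 of 40 hours went to meetings, which is 21.25% and therefore over the benchmark.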
This might seem absurd to some, but it is absolutely critical to spend at least 15% of the day thinking. Data science is not child’s play and involves a lot of tough work. Therefore, if you do not think and plan your day, it is nearly impossible to proceed. You need to figure out the best statistical models, you need to correctly interpret the data, you need the words to report the findings, and for all of this, you need time to think alone.
If you find yourself unable to organize your thoughts while thinking, try doodling or sketching. Keep a whiteboard near you, or use plain old paper. As a data scientist, you can also use a digital tool such as Miro, which is an online mind-mapping tool.
Coding is the major part of your work, but it works wonders when you combine it with sketching and thinking. Stepping back to think lets you see the bigger picture, which often gets lost in the minutiae of coding. While it looks like downtime, it is often the most critical time for boosting productivity.
Miscellaneous Activities and Conclusion
Before leaving for the day, make time to answer all your emails. It is only polite to respond on the same day, and you should do so. You are expected to be busy during the day, so set aside time at the end of it. Review the day you have just finished and plan the next one to maintain continuity and efficiency.
To summarize, 70% of a data scientist’s working time goes into coding. The remaining 30% is split equally, 15% each, between meetings and thinking, with the end of the day kept for miscellaneous activities. It is a rewarding career to which many aspire.
If you are curious about learning data science and want to stay at the forefront of fast-paced technological advancements, check out upGrad & IIIT-B’s Executive PG Programme in Data Science and upskill yourself for the future.