In the midst of this pandemic, what is giving us unprecedented flexibility to make faster technological advances is the availability of competent cloud computing systems. With on-demand services for applications, processing and storage, now is the time to make the best use of public cloud providers. What’s more, easy scalability means there are no geographical restrictions either.
Machine Learning systems are well supported by these platforms: many of the underlying ML tools are open source, and with increasingly affordable pricing they are within reach of more businesses than ever. In fact, public cloud providers are increasingly helpful in building Machine Learning models. So the question that arises is: what are the possibilities for using them for deployment as well?
What do we mean by deployment?
Model building is very much like the process of designing any product: it runs from ideation and data preparation to prototyping and testing. Deployment is the actionable end of that process, which means we take the already trained model and make its predictions available to users or other systems in an automated, reproducible and auditable manner.
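The definition above can be sketched in a few lines. This is a deliberately minimal, framework-free illustration: the "trained" weights, the version string and the file name are all hypothetical, standing in for a real trained model artifact and a real model registry.

```python
import json
import os
import tempfile

# Hypothetical linear model "trained" offline: the weights are illustrative.
weights = {"intercept": 0.5, "coef": [1.2, -0.7]}

# Step 1: persist the trained artifact in a versioned, auditable form.
artifact = {"model_version": "1.0.0", "weights": weights}
path = os.path.join(tempfile.gettempdir(), "model_v1.json")
with open(path, "w") as f:
    json.dump(artifact, f)

# Step 2: a serving function loads the artifact and exposes predictions.
# Because it always reads the same versioned file, the result is reproducible
# and every prediction can be traced back to a specific model version.
def predict(features, artifact_path=path):
    with open(artifact_path) as f:
        art = json.load(f)
    w = art["weights"]
    score = w["intercept"] + sum(c * x for c, x in zip(w["coef"], features))
    return {"model_version": art["model_version"], "score": score}

print(predict([1.0, 2.0]))
```

In a real deployment the serving function would sit behind an HTTP endpoint managed by the cloud provider, but the shape is the same: artifact in, versioned prediction out.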
While a lot of cloud providers have created customised, dedicated ML stacks, there are also on-premise options and platforms such as Heroku, which provide a ready, secure environment that lets you deploy faster. There are, however, challenges that cloud deployments face collectively.
What are the challenges?
Deployment is hard!
Contrary to general belief, you are not only deploying code. You are, in essence, also deploying data: data that moves between departments in various formats and changes as the model changes, leaving a ton of moving parts in the system vulnerable to drift.
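One common way to keep those moving parts auditable is to tie each model release to the exact data snapshot it was trained on. A minimal sketch, assuming a hypothetical model name and a toy in-memory dataset:

```python
import hashlib
import json

# Toy training data: stands in for whatever snapshot the model was trained on.
data_rows = [{"x": 1, "y": 0}, {"x": 2, "y": 1}]

# Fingerprint the data deterministically (sort_keys makes the hash stable).
data_hash = hashlib.sha256(
    json.dumps(data_rows, sort_keys=True).encode()
).hexdigest()

# The release record pins the model version to its training-data fingerprint,
# so "which data produced this model?" always has an answer.
release = {"model": "churn-v3", "data_sha256": data_hash}
print(release)
```

If the data changes, the fingerprint changes, which forces a new release record rather than a silent swap underneath a deployed model.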
There is no homogeneity
End-to-end ML applications are often full of components written in different programming languages. The choice of language depends on the use case: Python, R, Scala, or any other language can be used to build different models.
ML deployments aren’t monolithic
Machine learning model deployments are not necessarily self-contained solutions. They are commonly embedded or integrated into various business applications.
Testing and validation pain points
As data changes, models evolve: methods improve and software dependencies change. Every time such a change occurs, model performance needs to be re-validated.
Complexity of release strategies
Depending on the use case, ML models need to be updated more frequently than regular software applications.
Data security issues
With data being a vulnerable resource, the shared, multi-tenant nature of public clouds does raise some eyebrows. Many companies in the banking sector have been apprehensive about using the cloud because of data security concerns.
The top three contenders
AWS, Google Cloud and Microsoft Azure, the top three contenders in the cloud market, can be compared on a few important parameters to make the best choice.
According to Gartner’s Magic Quadrant report, AWS ranks highest in terms of both vision and ability to execute. What sets it apart is its approach of truly democratising AI: it delivers tools and services usable by all developers, even those with no prior ML experience. It is attractive for small businesses too, as pricing is based on usage rather than a blanket fee. Additionally, there is a lot of room for flexibility, customisation and support for third-party integrations.
Google is committed to making AI accessible to all. Google has been open-sourcing its AI/ML tools and engineers have been actively putting out their research for everyone to access. Cybersecurity is a critical area where Google is employing AI/ML to solve business problems.
Chronicle, a subsidiary of Alphabet (Google’s parent company), is set to leverage Google’s AI/ML expertise and near-limitless computing power to develop a world-class security analytics solution, and it integrates easily with other Google services. A huge cost-saving feature that Google Cloud offers is Sustained Use Discounts (SUDs): automatic discounts that Google Cloud Platform applies the longer one uses a resource within a billing period.
As a public cloud, Microsoft Azure ensures that no user has to buy any hardware or software to use it. Azure Machine Learning can be used for any kind of machine learning, from classical ML to deep learning, supervised and unsupervised. Most languages are supported, including Python and R, along with zero-code/low-code options. Its biggest plus point is availability, with guaranteed downtime of less than 4.38 hours a year (roughly 99.95% uptime).
Comparative study: AWS, Google Cloud and Microsoft Azure
Let’s see how well they perform on the following four parameters.
Convenience of use and learning curve
The difference in progress between these companies can be measured by their level of investment and their success or failure in gaining adoption. A steep learning curve slows industry adoption: the easier a platform is to learn, the more convenient the user experience and the faster it spreads.
As you can see below, AWS has largely taken over the market share when measured by adoption across small and large businesses alike. It helps that it was one of the first to enter this market. The usage statistics indicate both how easily these platforms can be used and how quickly they let users reach the deployment stage, and they are proof of consistency.
The more customers consider which cloud to use, the more likely they are to search Google to understand each provider’s offerings. Google search data shows that popularity, in terms of searches for Amazon Web Services, has been consistently high; the more a platform is searched for, the more likely it is to be widely used.
An enterprise can choose to use multiple cloud providers to make product deployment as smooth as possible. To avoid ‘vendor lock-in’, organisations are also spreading their business problems across different cloud providers for maximum flexibility. The recent RightScale 2019 State of the Cloud report shows that 84% of respondents have adopted a multi-cloud strategy.
Major public cloud providers offer services on shared, multi-tenant servers. The capacity required to compute and handle unpredictable changes in load is enormous, and user demand needs to be optimised across different servers. Although serverless models are rising in popularity, a high density of workloads still needs to be processed.
Stack Overflow, a popular community of developers, offers another way to gauge the usage share of the three cloud systems: the percentage of questions each receives in a month.
Lower cost enables even start-ups to adopt cloud services. A start-up has to build all of its processes from scratch, and what public cloud computing does for them is phenomenal: the capital that would otherwise go into infrastructure can be deferred until they find a long-term investor, without compromising the quality of the project. For each of the scenarios below, you can observe the hourly on-demand price and then the hourly price per GB of RAM.
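The per-GB comparison is simple arithmetic: divide the hourly on-demand price by the instance’s RAM. A sketch with made-up numbers, as the prices and provider names below are purely illustrative, not real quotes:

```python
# Hypothetical hourly on-demand prices and RAM sizes (illustrative only).
instances = {
    "provider_a": {"price_per_hour": 0.096, "ram_gb": 8},
    "provider_b": {"price_per_hour": 0.067, "ram_gb": 4},
}

# Normalise by RAM so differently sized instances become comparable.
price_per_gb = {
    name: spec["price_per_hour"] / spec["ram_gb"]
    for name, spec in instances.items()
}

# Cheapest per GB-hour first.
for name, p in sorted(price_per_gb.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${p:.5f} per GB-hour")
```

Note how the instance with the lower sticker price is not automatically the cheaper one per GB of RAM, which is exactly why the normalised figure matters for a budget-constrained start-up.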
How can clouds help?
Major cloud computing systems like AWS, GCP, Heroku, Azure and IBM Cloud provide a safe haven for data aspirants and for companies with limited funding who would like to explore machine learning models and deploy them efficiently. These systems are cheap to operate.
By paying a few dollars an hour on average, you can run your very own machine learning application almost instantly! Public clouds also provide cheap data storage: you can use managed databases or storage systems to feed data into machine-learning-enabled applications.
They all provide software development kits (SDKs) and application programming interfaces (APIs) that let you embed machine-learning functionality directly into applications, and they support most programming languages. The real value of machine-learning technology comes from use within applications, because the predictions made there are more operations- and transaction-focused.
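Embedding a cloud-hosted model into an application usually boils down to an authenticated HTTPS call to a prediction endpoint. A minimal, provider-agnostic sketch using only the standard library; the endpoint URL, model name and payload fields are hypothetical placeholders, and each provider’s real API has its own URL scheme and auth headers:

```python
import json
from urllib import request

# Placeholder endpoint: a real deployment would expose a provider-specific URL.
ENDPOINT = "https://example.com/v1/models/fraud:predict"

def build_request(transaction):
    """Package one record as a JSON POST request to the prediction endpoint."""
    body = json.dumps({"instances": [transaction]}).encode()
    return request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
    )

# Example: score a single (hypothetical) transaction from inside an app.
req = build_request({"amount": 120.5, "country": "DE"})
print(req.get_method(), req.full_url)
```

Sending `req` with `urllib.request.urlopen` (plus whatever auth token the provider requires) would return the prediction as JSON, which is how operational decisions such as flagging a transaction get wired directly into business logic.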
However, it would be a good strategy for companies to consider both on-premise and cloud, as clouds can cost a bit during the experimentation phase. The clouds also offer their own tools built on top of open-source projects like Kubernetes, Docker and TensorFlow. Kubernetes, which originated at Google, is an open-source system for automating the deployment, scaling and management of containerised applications, and it tends to run more smoothly on GCP than on other provider platforms. Above all, it is critical to know which tools one is equipped to use in order to choose the best cloud service for oneself.