MLOps, cheap as chips

In the shadow of modern AI hype, tabular data is still relevant

When I started my career as a data science consultant many years back, the application of deep neural networks was going through a renaissance. Cloud computing was becoming increasingly affordable, reducing the need to maintain costly, high-performance computer hardware.

The current sky-rocketing pace of LLM development has in turn been made possible by the vast amounts of data available on the internet, as well as by rapidly increasing computational capacity. This still comes with a fairly hefty price tag, though: the training of GPT-4 alone reportedly cost more than $100 million, and the annual operating costs of OpenAI likely run to several hundred million dollars.

While social media keeps us updated on the latest advances in generative AI, the everyday work of most data analytics professionals still revolves around good old tabular data. This is what businesses keep producing, and it can in turn be used for the benefit of those same businesses. Perhaps one day AI will surpass human analysts, but even then it is hard to imagine tabular data being abandoned, as it is a logical format for algorithms to mess around with, too.

It is hard to tell what technological advancements lie ahead. What is certain, however, is that our lives and the world around us are increasingly being reduced to data. This opens more and more opportunities for the application of machine learning. To keep things from going wild, regulation is being put in place to ensure that not just anything is acceptable, and that critical systems at least adhere to certain quality standards.

How much for MLOps in production?

The gold standard for generating business value from data via machine learning has, for some time already, been MLOps. In short, MLOps (short for ML operations) is about bringing software development best practices to ML development and making it run (semi-)automatically. The idea is to have models adapt to changes in the operational environment, without the need for constant user intervention, in order to improve quality and potentially meet regulatory requirements.

Skill-wise, MLOps sits at the intersection of Data Engineering, Machine Learning, and DevOps. Typically, the MLOps process includes pipelines for data transformation, model development, and model serving. As a best practice, there should be both model and data versioning (for diagnostics and rollback), model and data validation, as well as monitoring (for robustness and trustworthiness). A minimal sketch of such a pipeline skeleton is given below.
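To make this concrete, here is a minimal sketch of what such a pipeline skeleton might look like in Python. The function names, the quality threshold, and the content-hash versioning scheme are illustrative assumptions, not tied to any particular MLOps framework.

    import datetime
    import hashlib

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    def data_version(df: pd.DataFrame) -> str:
        # Version the data by content hash, so every model run can be
        # traced back to the exact snapshot it was trained on.
        row_hashes = pd.util.hash_pandas_object(df).values
        return hashlib.sha256(row_hashes.tobytes()).hexdigest()[:12]

    def validate_data(df: pd.DataFrame) -> None:
        # Minimal data validation: fail fast instead of training on garbage.
        assert not df.empty, "empty input"
        assert df.isna().mean().max() < 0.1, "too many missing values"

    def run_pipeline(df: pd.DataFrame, target: str) -> dict:
        validate_data(df)
        X_train, X_test, y_train, y_test = train_test_split(
            df.drop(columns=[target]), df[target], random_state=42)
        model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
        # Model validation gate: only promote models above a quality bar
        # (the 0.7 threshold is an arbitrary placeholder).
        assert auc > 0.7, f"model below quality bar (AUC={auc:.3f})"
        return {"data_version": data_version(df),
                "trained_at": datetime.datetime.utcnow().isoformat(),
                "auc": round(auc, 3)}

In a real setup the returned record would be written to a model registry, and the same versioning and validation hooks would run on every scheduled execution.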

But how much does it cost to run MLOps in production? Clearly much less than LLM development; that much goes without saying. The answer obviously depends on many parameters, including the solution architecture, the underlying compute resources, the computation time used per model, the amount of data processed, and the number of models maintained. Roughly, taking these variables into account, the costs can range from tens of thousands to hundreds of thousands of euros, labor costs included. The cost structure is sketched below.
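As a back-of-the-envelope illustration, the cost structure boils down to a handful of multiplications; every number below is a made-up placeholder, not a vendor quote:

    # Back-of-the-envelope monthly compute cost for running ML pipelines.
    def monthly_compute_cost(models: int,
                             runs_per_model: int,    # scoring/retraining runs per month
                             hours_per_run: float,   # compute time per run
                             rate_per_hour: float):  # price of the chosen compute tier
        return models * runs_per_model * hours_per_run * rate_per_hour

    # Ten models, scored once a month, one compute hour each at 3 EUR/hour:
    print(monthly_compute_cost(10, 1, 1.0, 3.0))   # -> 30.0 EUR/month
    # Retraining the same ten models daily would already cost ~900 EUR/month.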

Leaving the labor costs aside, we are not talking about that much money (unless LLMs need to be served in the cloud; in that case monthly costs can exceed 10k, depending on the model(s) and the amount of consumption). Whether this is a lot or not depends on the size of the organization and the extent of its ML applications.

Snowflake can help keep expenses at bay

Using Snowflake to run automated ML pipelines, it is possible to keep costs really low. As an example, at a client using ML to gain customer insight, personalize customer care, and target marketing activities, the year-to-date average cost has been less than seven Snowflake credits per month. This cost includes compute from both production and development, the latter being much higher, as it covers developing, troubleshooting, and maintaining over ten different models.

Seven credits translate to less than 50€ each month, which is approximately one fifth of the full monthly lunch benefit of a single employee. Peanuts! Clearly the added costs of ML development and production use (on top of the data platform costs, which are paid anyway) are no reason not to start utilizing advanced analytics.
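The arithmetic behind that figure is trivial; note that the effective credit price varies by Snowflake edition, cloud, and region, so the price used here is an assumption chosen to be consistent with the figures above:

    # An XS Snowflake warehouse consumes 1 credit per running hour.
    # The price per credit below is an assumption (check your own contract),
    # picked to match the "7 credits = less than 50 EUR" figure above.
    credits_per_month = 7
    eur_per_credit = 6.5
    print(credits_per_month * eur_per_credit)   # -> 45.5 EUR/month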

This is of course a very cheap scenario, with batch scoring run in production only once a month. There is a broad range of cases between this one and, e.g., serving a large language model to many users or using computer vision for production-line quality control. Both require substantial compute capacity, and the latter can also require specific imaging hardware.

Computation isn’t the bottleneck

The estimate for the case described above is based on an XS Snowflake warehouse. Snowflake does not publish the specifications, but let's say there are 8 vCPUs and 16GB of memory. Databricks, on the other hand, offers the user a plethora of different compute options, so a pricing comparison with Snowflake is not that straightforward.

If we take All-Purpose Compute on an instance comparable to the Snowflake XS warehouse (m4.2xlarge), the hourly cost on an Enterprise plan on AWS is $0.4125 per hour. For about 50€ a month, this compute can run for 5 hours every working day. Keeping the cluster on 24/7 costs a bit over 300€ a month, which isn't that much either.
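For reference, the same arithmetic spelled out; the number of working days per month is an assumption:

    # Reproducing the figures above from the quoted hourly rate.
    rate = 0.4125                    # $/hour, Enterprise plan on AWS
    working_days = 21                # assumption: ~21 working days/month
    print(rate * 5 * working_days)   # 5 h per working day -> ~43 $/month
    print(rate * 24 * 30)            # always-on cluster   -> ~297 $/month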

That is, unless the data volumes (as is the case, e.g., with IoT, the game industry, or eCommerce) or the computational requirements (LLMs) are large, the costs of ML development and serving in production are likely to remain relatively low.

The real costs lie in the payroll

While the operational costs might be negligible, the bottleneck can instead be a lack of understanding of potential use cases with sufficient ROI. The benefits need to be measurable in order to cover the costs of hiring or renting the necessary Data Engineer(s), Data Scientist(s), and ML Engineer(s). Implementing ML solutions still requires special expertise, which makes it rather labor intensive.

Hiring a data expert easily costs 100+ k€ per year, whereas getting a consultant for the job will most likely more than double the price. Both are large figures, but one should not be fooled by the apparently much more expensive consultant: a seasoned expert, who is difficult to find in the first place, is likely to deliver better quality in less time, evening out the costs considerably, at least in the short term.

To conclude, delivering real value and exciting use cases with modern MLOps probably does not require a huge technological investment, so such worries should not discourage anyone from trying. If the potential or validated business cases justify hiring or renting the expertise to implement ML/AI solutions, there should be no substantial additional financial hurdles to producing benefits for the business.