Demystifying Serverless Machine Learning

In this episode, we sat down with Carl Osipov of CounterFactual.AI, author of Serverless Machine Learning In Action. Carl shared some real-world use cases for serverless machine learning and identified strategies to get the most from a machine learning investment.

“One of the things that happens at the beginning of a machine learning project — and this is a well-known problem for data scientists and machine learning practitioners — is spending way too much time cleaning up their data sets and focusing on things like data quality instead of actually building out machine learning solutions. I think, as practitioners, machine learning developers and engineers have created a set of techniques over the past few years to help formalize and accelerate this process. But it’s still a concern, especially if you think about scenarios that are common to manufacturing where different data silos have to come together for a machine learning system. This also happens in the scenarios where manufacturers acquire companies and then integrate data and use that data for machine learning systems. What happens is that if companies don’t actually have a rigorous approach for transitioning their machine learning systems code into operations, they find themselves in a situation where data scientists and machine learning engineers actually end up doing a lot of operations involved in putting machine learning systems into production. So what I’m describing here is what I call an ML ops trap. This machine learning operations trap, where these highly compensated practitioners are essentially spending their time working on something that’s not their core competency.” 

Connect with Carl on LinkedIn


Announcer: Hi and welcome to “Data in Depth,” the podcast where we delve into advanced analytics, business intelligence and machine learning and how they’re revolutionizing the manufacturing sector. Each episode, we share new ideas and best practices to help you put your business data to work. From the shop floor to the back office, from optimizing supply chains to customer experience, the factory of the future runs on data.

Andrew Rieser: Welcome and thanks for joining us for season two of Data in Depth, the podcast exploring data and its role in the manufacturing industry. I’m your host, Andrew Rieser. Today we are joined by Carl Osipov, CTO of CounterFactual.AI and the author of “Serverless Machine Learning.” CounterFactual.AI is a consulting company focused on helping customers deploy cutting-edge machine learning solutions by focusing on just two things: turning experimental machine learning code into production machine learning systems, and reducing the costs of machine learning systems in production by minimizing or even eliminating operational overhead of the systems using serverless. Carl’s book, “Serverless Machine Learning,” from Manning Publishers is currently available in ebook format and is scheduled to go into print at the beginning of next year. Welcome Carl.

Carl Osipov: Thank you, Andrew. And thank you for hosting me on this podcast. I’m looking forward to our conversation.

Andrew: Absolutely. So before we dive in, I think it’d be great for the audience to learn a little bit more about you and your background. So if you don’t mind just kinda share what led you to CounterFactual.AI and your journey along the way.

Carl: Absolutely. So I like to think of myself as an applied computer scientist who escaped academia. I did both of my undergraduate and graduate degrees in computer science. I focused on artificial intelligence and then later on machine learning specifically, and now it’s about 20 years ago, I actually wrote my first neural network using the C programming language. And back then writing neural nets was not nearly as cool as it is today. So that taught me quite a bit about artificial intelligence, about using techniques that became precursors to today’s machine learning. And my first job out of college was actually in the manufacturing industry. I joined IBM as a software engineer back in 2001. At that time, it was a great opportunity to help IBM build out a state-of-the-art semiconductor manufacturing plant. So in the information technology industry, we call them fabs for short. So these are the fabs that manufacture computer chips. And at the time when I was there, I helped develop software that integrated the equipment at the fab, and the equipment was used to manufacture chips for top gaming consoles at the time: Nintendo’s, Sony’s, Xboxes for Microsoft. And since then I’ve worked at different parts of IBM in various roles, including software architecture and technical leadership, and that led me eventually to Google. And more recently I decided to shift gears and start working at a much smaller company. So currently, as CTO of CounterFactual.AI, I have an opportunity to work with an exclusive set of customers and help focus on ensuring customer success instead of worrying about the breakneck pace of a scaling company.

Andrew: Yeah, that’s a great summary and a great evolution. So I think a lot of that experience, and then what led you to ultimately your role now, as you described, is kinda the combination of a lot of different events and experience. And so obviously as technology continues to evolve and we see it maturing at this rapid pace, there’s all kinds of things out there with cloud and large infrastructure and platforms like AWS and Google Cloud, as an example, that are really lowering the barrier to entry for leveraging things like machine learning. So one thing that I’d like for you to share is just what’s your perspective on that and how is that allowing this technology to evolve at a much faster pace?

Carl: What I’m seeing right now is that many companies are experimenting with machine learning. So when I talk to my customers, they are in the process of building out data science teams and actually experimenting with those data science teams, with those machine learning teams, to put together experimental machine learning solutions. And I gave that example from 20 years ago where implementation of the neural network involved writing low-level C code, playing with pointers in a computer architecture, writing code at a very low level. So today these machine learning engineers, data scientists have frameworks that allow them to implement machine learning solutions much faster compared to even just five or 10 years ago. So that’s definitely contributing to the pace of development for machine learning. The skills in data science and machine learning are more widely available today than they were five years ago. So it’s easier to bring those skills on board. However, what I’m finding is that those skills are not necessarily used specifically on data science and ML. So one of the things that happens is at the beginning of a machine learning project, and this is the well-known problem of data scientists and machine learning practitioners spending way too much time cleaning up their data sets, focusing on things like data quality instead of actually building out machine learning solutions. So that’s still a concern. I think, as practitioners, the machine learning developers, machine learning engineers have created a set of techniques over the past few years to help formalize this process, to accelerate this process. But it’s still a concern, especially if you think about scenarios that are common to manufacturing, where different data silos have to come together for a machine learning system. This also happens in the scenarios where manufacturers have to acquire companies and then integrate data and use that data for machine learning systems.
Now there’s another concern that’s happening on the other end of a machine learning project. And that’s what’s happening after some of these machine learning experiments, these initial machine learning projects and pilots, have actually proved interesting or successful. And here what happens is that if companies don’t actually have a rigorous approach for transitioning their machine learning systems code into operations, they find themselves in a situation where data scientists, where machine learning engineers actually end up doing a lot of operations work involved in putting machine learning systems into production. So what I’m describing here is what I call an ML ops trap. This machine learning operations trap, where these highly compensated practitioners are essentially spending their time working on something that’s not their core competency. They’re working on activities like setting up web services for their models. They’re working on activities such as ensuring high availability for their applications. So things that have traditionally been part of operations, but since there’s not a mature operations practice around machine learning, I am seeing customers actually asking their data scientists and machine learning practitioners to focus on these operations tasks.

Andrew: Yeah, I think that’s a great point. And I was actually speaking with a data scientist the other day, and she said that was one of the biggest pain points of her job currently is all the operational aspects that she didn’t realize that she was gonna have to do. And that took up about 90 to 95% of her time. Whereas she just wanted to focus on the models and the algorithms and the things that could have the impact on the problems that she was trying to solve.

Carl: And I think this is happening in the industry, and this has been proven out in some peer-reviewed research publications. So there’s a very well-known paper from Google. It’s a research paper from Sculley and others from Google that reviewed a collection of machine learning systems. And they found that in production machine learning systems only about 5% of the code is pure machine learning/data science code. The other 95% is what I would describe as machine learning platform. And that includes the code that has to do with everything from data collection, data analysis, resource management, and web serving infrastructure for the system. So this is certainly a problem for the industry.

Andrew: And that also helps kinda put into context the building blocks, if you will, of what makes up all of this. So one thing that I’ve found is sometimes confusing, or sometimes gets used either in or out of context in the right way, is just the term serverless. So maybe you can help share how you view that term and how that is often misconstrued or how that should be thought of when people are talking about serverless.

Carl: Absolutely. Andrew, so the term serverless has been adopted as this moniker in the information technology industry to describe the code development experience. So when we talk about serverless, we’re describing how a machine learning practitioner or a data scientist or a developer of machine learning code is actually developing the project. How do they contribute to creation of a machine learning system? And I did not come up with the term. The term has gained adoption in the IT industry because it emphasizes a very efficient model for doing software development. So what it is describing is not necessarily systems that work without servers. So unfortunately, that term is absolutely confusing. What it is describing is that the developers don’t have to worry about servers that actually run the code, whether it’s machine learning code or web application code, or code from other types of applications. And when I talk about serverless in context of the “Serverless Machine Learning” book, I talk about this model for machine learning practitioners, where they can create their software, where they can create their machine learning code without actually worrying about these details of setting up the operating system, or managing things like Linux device drivers for graphics processing units, or worrying about compatibility between the device drivers and the latest machine learning framework that they are using. So serverless is definitely not about eliminating physical servers from the picture; it’s about eliminating servers from the day-to-day experience of the machine learning practitioners.

Andrew: Yup. That makes perfect sense. So sticking in theme with your book, one of the things that you’ve shared is the following of this real-world use case as you’re applying these principles and practices of machine learning. Could you elaborate a little bit more on that and kind of talk through how you came about using that as the use case for the book, and then how that’s relatable to others listening, how they can kinda stitch these dots together, if you will?

Carl: Absolutely. So the use case that’s in the book and that’s used as a project throughout the book is focused on estimating costs of deliveries and the cost of transportation from point A to point B in the Washington, DC city boundaries and areas around. So originally when we planned this example, this particular use case, it was motivated by one of the customers from the logistics industry. And obviously I started working on the book late last year, way before the COVID-19 situation, but completely unexpectedly, the use case became more relevant today than it used to be just a few months ago. So the use case is really about predicting the estimated time of arrival, the ETA for transportation, between two points in city boundaries, and also estimating the cost of transportation between these two points. So to be more precise, if you think in terms of the latitude and longitude coordinates of two points in Washington, DC, it’s possible to actually estimate these things, like estimating how long it would take a vehicle to drive between these two points, depending on time of the day. For example, whether it’s a weekday and there’s traffic, or if it’s a weekend and there is no traffic there, or maybe it’s a holiday and everybody is out of the city. So that’s fundamentally the use case. And this use case of predicting the ETA for transportation is something that has become more relevant as companies are seeking to improve their delivery processes. So, within the book I describe the process of using a serverless machine learning approach to help machine learning practitioners, help data scientists actually put together an implementation of this predictor, both for the estimated time of arrival and also for estimating the cost of travel between these two points.
And then using that approach actually put together a production machine learning system, the machine learning system that actually can scale to customers worldwide and can support scalability both up as the demand for the machine learning system increases, but also scalability down. So if the company decides to migrate to an alternative machine learning system, it’s also very important to be able to bring down the costs and the number of requests to a machine learning system as well. So that’s what the use case and the project in the book is about. And by reading the book, the practitioners are not going to become world class researchers in machine learning. That’s not the focus of the book. Instead, the focus of the book is in helping individual practitioners become more valuable contributors to their teams, to their organizations by avoiding this machine learning operations trap that I described earlier. So it’s about helping build code in a way that relies on a machine learning platform. And also ensuring that that machine learning platform is reusable across many different machine learning systems, many different machine learning projects, and it minimizes the costs of an operation of the machine learning applications.
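
To make the inputs Carl describes more concrete, here is a minimal Python sketch of how a trip between two latitude/longitude points might be turned into features such as distance, hour of day, and weekday versus weekend. It is an illustration only, not the code from the book: the function names, the feature names, and the one-parameter fit (minutes per kilometer) are all assumptions chosen to keep the example short.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def trip_features(pickup, dropoff, when):
    """Hypothetical feature set: straight-line distance, hour of day, weekend flag."""
    return {
        "distance_km": haversine_km(*pickup, *dropoff),
        "hour": when.hour,
        "is_weekend": when.weekday() >= 5,  # Monday is 0, so 5 and 6 are Sat/Sun
    }


def fit_minutes_per_km(trips):
    """Least-squares slope through the origin: minutes ~ slope * distance_km.

    `trips` is a list of dicts with "distance_km" and "minutes" keys,
    standing in for a company's historical delivery records.
    """
    num = sum(t["distance_km"] * t["minutes"] for t in trips)
    den = sum(t["distance_km"] ** 2 for t in trips)
    return num / den


# Example usage with made-up coordinates inside the DC area.
features = trip_features((38.90, -77.03), (38.95, -77.00), datetime(2020, 6, 6, 17, 30))
```

A production model would use many more historical trips and a richer model than a single slope, but the shape of the problem is the same: historical trips in, a learned mapping from trip features to duration out.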

Andrew: Thank you for that. I think that put a lot of this into context and I’m sure the listeners are automatically thinking of the common apps that they’ve been using now with delivery and wondering when their DoorDash meal is gonna arrive at their house. So it’s good to think about this in terms of some of those real-world applications. One question I have along those lines is the data sets. So we talked about the quality of data and the different data sets that are available, as well as publicly accessible data and APIs that some of these platforms offer. As you were going through this use case, could you maybe describe a little bit about how data came into play with that and maybe any surprises or assumptions or hypotheses that you originally had that may have changed over this process?

Carl: Certainly, and with respect to data, I think many companies are in a situation where they’re facing the reality of data silos, whether those data silos come in through acquisitions, or maybe those data silos have developed within an organization simply over time and nobody actually bothered to integrate those data silos. But within the book, I actually take a very practical point of view on how data is adopted for machine learning. So the use case that I described, the use case of predicting the ETA for travel between two data points in the boundaries of Washington, DC, can actually be solved using a traditional set of techniques. So there’s a very well-known set of terms. People would describe Software 1.0 and Software 2.0, and I did not come up with this terminology. These two terms have been popularized by Tesla’s AI division. And if you think about Software 1.0, it’s the traditional approach for software development using business rules, using integration with APIs. The approach works well. There are many individuals in the world with the skills to use this traditional Software 1.0 approach. But I think what makes the Software 2.0 approach effective, and Software 2.0 here means using techniques such as machine learning, is that it allows companies to use the data, to use the experiences that a company has collected, to substantially reduce costs. And let me give you an example, based on the scenario that I described with the ETA estimation. A very simple Software 1.0 approach to estimating the time of arrival is to use publicly available APIs. So there are APIs from services like Google Maps, Bing Maps, and others that can take in pick-up and drop-off locations and chart driving directions, taking into account traffic, and also give predictions for the time of arrival. However, using these APIs carries a cost.
Obviously there’s the development cost, but there’s also the cost of using or invoking these APIs for every request. For a company, even in the medium to large range, this can quickly add up to thousands of dollars per month just for the API requests. And then to that you have to add the cost of actually working with a service from Google Cloud or another provider where, if you want to get support for these APIs, you actually have to carry additional support contracts or maybe sign up for a technical account manager. So the traditional approach can be very effective. The Software 1.0 approach of using business rules and integrating with APIs can be easy to implement. But the approach of using machine learning, this Software 2.0 approach, allows companies to actually take the data, historical data that they have about doing something like the logistics of delivery, keeping track of starting locations and drop-off locations for their deliveries, for their packages. And then instead of outsourcing the cost of predicting the time of arrival or outsourcing the cost of figuring out the distance of the route, companies can simply use their historical data and build machine learning models that can later be used in house to do better predictions. So in a nutshell, the approach of using machine learning with existing data allows companies to substantially save on costs. And I’m describing this approach because it helps outline how the company should think about different data sources within their organizations. So even if companies start with a collection of data silos, there’s an opportunity to use the machine learning project as a motivation to cut costs, but also as a motivation to start breaking down some of these data silos.
Potentially bringing in the data on the ETA from one of the silos and then bringing the data from a different geographic location for the customers from another silo, and even potentially taking advantage of the open source data sets that are available on the web. So, as I mentioned, this data set in the book relies on the data about Washington, DC boundaries. So this is one of the publicly accessible data sets that is available from the Washington, DC municipal website. And by bringing these different disparate data sets together, it’s really possible to initiate projects that deliver real cost savings and deliver real value to customers, such as improved ETAs.
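
The cost argument above comes down to simple arithmetic, sketched below in Python. Every number here is a hypothetical placeholder, not a quoted price from Google Maps, Bing Maps, or any other provider, and the function names are made up for this illustration.

```python
def monthly_api_cost(requests_per_day, price_per_1000_requests, days=30):
    """Monthly spend on a pay-per-request API at a given (assumed) price."""
    return requests_per_day * days * price_per_1000_requests / 1000


def breakeven_months(build_cost, monthly_api, monthly_inhouse):
    """Months until a one-time model build cost is repaid by monthly savings.

    Returns None when running the model in house is not cheaper per month.
    """
    monthly_savings = monthly_api - monthly_inhouse
    if monthly_savings <= 0:
        return None
    return build_cost / monthly_savings


# Hypothetical inputs: 20,000 ETA lookups a day at $5 per 1,000 requests.
api_cost = monthly_api_cost(requests_per_day=20_000, price_per_1000_requests=5.0)
print(f"${api_cost:,.2f} per month")  # prints "$3,000.00 per month"
```

With these assumed figures the API bill lands in the "thousands of dollars per month" range Carl mentions, and `breakeven_months` gives a rough sense of how quickly an in-house model trained on historical data could pay for itself.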

Andrew: That’s perfect. I really appreciate you going through that, and especially the way that you take these complex systems and technologies and bring them up to a level where I think you did a great job of explaining Software 1.0 and the differences between that and 2.0, so thank you for that. So, Carl, the last question that we typically like to ask is, what’s on the horizon for machine learning? You’ve obviously been embedded in this space for a few decades now. And where do you see this going? Obviously it’s going to evolve, but if you had a crystal ball, what do you think the focus over the next five to 10 years would be as it relates to the adoption and use of machine learning?

Carl: Oh, let me start with a focus on manufacturing, in particular. I see many opportunities for machine learning in manufacturing at two scales, really. So one scale that I would describe as an embedded scale has more to do with the individual pieces of the equipment on the manufacturing floor, with the individual sensor devices, with even individual workers on the manufacturing floor. So these techniques for machine learning are really about helping create smarter equipment, smarter workers. So this is a valuable approach for machine learning, but I think of it as machine learning at a scale of the individual piece of equipment or individual asset or individual worker. I also see opportunities for manufacturing at a large scale. And if you think about having smarter workers in the manufacturing plant, having a smarter worker doesn’t mean that these workers are actually doing the right job. So I think there’s an opportunity for manufacturers to adopt machine learning and apply machine learning to figure out the right jobs to do for workers, for the equipment and for the devices that they have in their plant. And ultimately I think manufacturing organizations, the businesses, are going to be able to achieve a sustainable and competitive advantage using this application of machine learning at scale, and companies should be able to do this in a way that minimizes their operational costs. So definitely many opportunities for machine learning, for manufacturing. And if we take a little bit of a step back and think about machine learning for the economy, I’m seeing the latest results that are coming out of research labs that show that machine learning can be what’s known as sample efficient. So there are techniques today, called semi-supervised and self-supervised techniques for machine learning, that require significantly less data to train machine learning models.
Also there are technologies that are becoming available that help manufacturers and other businesses do data augmentation, meaning that if a company does not necessarily have the data to train machine learning models today, but is interested in using machine learning models for competitive advantage, there are technologies that help companies generate synthetic data sets that can be used to create machine learning models and then be used to deploy effective machine learning models in production. So if I were to look inside my crystal ball, I think these trends are the ones that are going to be reshaping manufacturing and reshaping the economy with machine learning over the next few years.

Andrew: That’s fantastic. I couldn’t agree more. I think that this is why I’ve always been so passionate about the manufacturing space, because of all the opportunity and use cases that I think these companies can benefit from. So Carl, it’s definitely been a pleasure chatting and learning from you today. Thank you for participating with us.

Carl: Thank you, Andrew.

Andrew: For those of you listening, if you’d like to learn more about Carl, CounterFactual.AI or the book “Serverless Machine Learning” from Manning Publications, we’ll be sure to include the relevant links and details in our show notes. If you enjoyed this episode, please take a moment to rate the episode and subscribe to Data in Depth, available on iTunes, Google, Spotify, Stitcher, and pretty much anywhere else you might listen to your podcasts. Thanks again for joining us today.

Announcer: Data in Depth is produced by Mountain Point, a digital transformation consulting firm focusing on the manufacturing sector. You can find show notes, additional episodes, and more by visiting Thanks for listening and be sure to subscribe wherever you get your podcasts.