The jump from writing a quick script to building production software often surprises people. Sometimes a prototype can be extended into production software. But more typically, a professional software engineering team will throw away the first prototype and start from scratch.
This is because there’s far more to professional software than cleaner code. Tests, documentation, logical architecture and modules, and packaging are all integral to production-grade software. And they need to be “baked in” from the start; they aren’t components that can be bolted on at the end.
Machine learning teams face similar problems – there is a huge gap between a proof-of-concept script running in a Jupyter Notebook and a solution running in a production environment.
Here are the differences between prototyping and production – first in software engineering and then in machine learning.
Production software engineering vs prototyping or scripting
Here are the stand-out differences between scripts and prototypes on the one hand, and production software on the other. First, scripts:
- Created by a single developer;
- Work in a very limited setting;
- Work for a limited time;
- Work on limited data;
- Work on a single machine running a specific environment;
- Updated only by the developer who created them.
Production software, on the other hand:
- Built and understood by a team of developers, testers, and product managers;
- Adaptable and configurable for different use cases;
- Future-proof;
- Deals with edge cases, large data sets, and unexpected inputs;
- Runs on and integrates with various kinds of hardware and software;
- Kept under version control, allowing easy collaboration.
So when a developer says they are “90% done” once the prototype is working, that figure is a poor indication of how much work is left before the project is complete.
The same is true for machine learning: building a proof-of-concept is the equivalent of writing a script.
Production machine learning engineering vs proof-of-concept model building
Machine learning has the same two distinct phases, with an equally large gap between the prototype and the production codebase. Data scientists often build an initial model on test data. Everything is small and well understood, so progress is fast. They get good accuracy and declare the problem solved.
In reality, the initial notebook can be impossible to use: only the data scientist who wrote it understands all of it. It doesn’t account for edge cases, relies on a small and tidy dataset, and has no tests or documentation, so it is hard to build on and doesn’t scale.
To provide real value, the project has to be rebuilt by a machine learning engineering team. In a production solution, the machine learning algorithm itself is usually just a tiny part of the final product.
Besides dealing with harder problems than the initial proof-of-concept, the production solution must not rely on any single individual. Too frequently, software and machine learning products fall out of use simply because the one person who understood how they worked has moved on.
The extra time and money invested in a production solution not only allows it to scale to larger datasets and harder problems, but also ensures its longevity: the knowledge lives at the team and process level instead of with a specific developer.
But getting a machine learning proof-of-concept into production is not directly equivalent to traditional software engineering.
The additional challenges of machine learning engineering
The best practices of software engineering teams centre on code. Machine learning engineers handle large codebases too, but in addition they need to deal with data and models. Specifically, they have to think about:
- Storing and tracking models – keeping track of how a model was trained, what results it achieved, and being able to serve it efficiently are all aspects of machine learning engineering that you don’t usually see in standard software development (see the first sketch after this list).
- Large and fast-changing datasets – while most software integrates with data through databases and other sources, machine learning solutions often need to handle substantially more cleaning and preprocessing, often on “live” datasets that are updated frequently. Machine learning engineers need a system that monitors and executes a series of interconnected processing steps, often as a directed acyclic graph (DAG), to manage these steps efficiently (see the second sketch after this list).
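To make the first point concrete, here is a minimal sketch of model tracking. MLflow is one popular open source option, but the choice of tracker – along with the dataset, model, and parameters used – is our illustration rather than anything prescribed above.

```python
# A minimal sketch of experiment tracking with MLflow (pip install mlflow scikit-learn).
# The dataset, model, and parameters here are illustrative only.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    # Record how the model was trained and what results it achieved...
    mlflow.log_params(params)
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))

    # ...and store the trained model itself, so it can be versioned and served later.
    mlflow.sklearn.log_model(model, "model")
```

Every run is now recorded alongside its parameters, metrics, and artefacts, so anyone on the team can reproduce or compare past models rather than relying on one person’s memory.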
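And for the second point, here is a bare-bones illustration of executing processing steps as a DAG, using only the Python standard library. In practice a workflow orchestrator such as Prefect or Airflow adds monitoring, retries, and scheduling on top of this idea; the step names below are made up.

```python
# A minimal sketch of running interconnected processing steps as a DAG.
# Requires Python 3.9+ for graphlib; the steps are stand-ins for real work.
from graphlib import TopologicalSorter

def fetch_raw_data():
    print("fetching raw data")

def clean_data():
    print("cleaning data")

def build_features():
    print("building features")

def train_model():
    print("training model")

# Map each step to the set of steps that must finish before it can run.
dag = {
    clean_data: {fetch_raw_data},
    build_features: {clean_data},
    train_model: {build_features},
}

# static_order() yields the steps in an order that respects every dependency,
# so each step only runs once its inputs exist.
for step in TopologicalSorter(dag).static_order():
    step()
```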
Building a framework for reliable, repeatable machine learning development
We’ve built Open MLOps, an architecture and set of open source tools to help teams build production-grade machine learning solutions.
We’ve tried most of the existing tools and software aimed at machine learning teams, and Open MLOps is a collection of our favourites, integrated in an opinionated way that works well for most teams.
If you’d like to hear more about Open MLOps or want help with your own machine learning architecture problems, book a free consultation with us today.