Machine learning can be confusing. Everyone uses the term to mean something slightly different, and there’s just so much to keep up with.
Before you choose a vendor or a platform to meet your AI needs, it’s important to understand which “layer” of machine learning is best for you. You can build your own AI from scratch, use an off-the-shelf product, or choose something in between.
In this post, we’ll walk you through the four options and help you identify which is best for you by comparing them to means of transportation.
The different “layers” of machine learning
- Novel research such as developing new speech-to-text algorithms is like building a car.
- Open source frameworks such as TensorFlow are similar to owning a car.
- Managed platforms such as AWS SageMaker are like taking a taxi.
- APIs such as Google Translate are similar to taking a flight.
You can build a machine learning solution at different layers. As you go to more specialised layers, more features work straight out of the box, but you get less flexibility.
Each layer builds on the one before it: APIs are often built on managed platforms, which are built on open-source frameworks, which in turn are built on research.
Different problems are best solved by different layers
Which layer is best for you will depend mainly on the problem you’re trying to solve.
The layers don’t cover an equal share of problems, though: for example, it’s actually surprisingly rare for a problem to be cleanly solved by a managed platform.
Because the companies behind managed machine learning platforms use aggressive marketing tactics, people often overestimate the value of these platforms. In reality, a managed platform often leaves you with the worst aspects of using off-the-shelf APIs while not fully resolving the pain points that you have to accept when you build a bespoke solution with open source frameworks.
Let’s take a look at each layer in depth.
Doing AI research is like building your own car
If you want to use AI, you might be tempted to hire a team of PhDs. But this would be like hiring a team of mechanical engineers and asking them to build a car for you and then take care of it 24/7.
If you’re a racecar driver competing on an international level, this might be reasonable – but otherwise it’s probably not. Similarly, companies like Google have teams of researchers who push AI’s foundations forward by developing new algorithms and architectures.
This work is expensive, and it often takes years or decades to create real value. Most companies today don’t even need to consider taking this approach. Unless you’re competing in the top 1% of performance in your field, it’s unlikely that it makes sense to have a dedicated research team.
You might want to do your own AI research if:
- You’re creating a competitor to Siri and you want it to be better than Siri at speech-to-text interpretation.
- You’re in a head-to-head race with other investment banks and want to give your trading algorithms an edge.
Sample team structure:
- A team of research engineers: PhDs, postdocs, and professors.
- A team of data, machine learning and devops engineers who can translate research results into machine learning solutions.
Using open source frameworks is like owning your own car
As academic research matures, it evolves into open source frameworks like TensorFlow, PyTorch, and Keras. Thousands of developers around the world then contribute to improving these frameworks for free.
Anybody can download these open source frameworks and use them for free. These are the exact same powerful “engines” that expensive managed platforms from Amazon, Google, and Microsoft use, but you keep the flexibility to build your system the way you want to.
But like owning your own car, this comes with some downsides too. If you own a car, you need to learn how to drive or hire a driver. To use a tool like TensorFlow, you need to learn to code or hire a developer.
If you own a car, you also need to worry about things like refueling, replacing tires, and finding a safe garage. If you build your own custom machine learning solutions, you’ll likewise need to set up suitable infrastructure and maintain it.
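To make that concrete, here’s a minimal sketch of what using the “engine” directly can look like with TensorFlow’s Keras API: a tiny model trained on placeholder data. The dataset and architecture here are illustrative assumptions, not a recipe for your specific problem.

```python
# Minimal sketch: training a small neural network with TensorFlow's Keras API.
# The data below is random placeholder data, just to show the workflow.
import numpy as np
import tensorflow as tf

# Fake dataset: 1,000 examples with 20 features each and binary labels.
X = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000,))

# A small feed-forward network.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```

The flexibility (and the burden) is that everything around this snippet – data pipelines, experiment tracking, deployment, monitoring – is yours to design and maintain.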
You might want to develop your machine learning solutions with open-source frameworks if:
- You want to improve a workflow that’s very specific to your business.
- You have the data and domain knowledge on the problem in-house.
- You’re not afraid of taking on complicated software projects.
Sample team structure:
- Internal: Teams of DevOps engineers, machine learning engineers, data scientists, project managers, and subject matter experts.
OR
- External: Agencies like us, who build your custom ML solution using open source frameworks.
Using managed platforms is like taking a taxi
Sometimes you need something quickly, and you’re willing to pay more for it. If you need to make a quick one-off trip, you might take a taxi. But if you take a long-distance taxi every day, it’s likely you’d get better value by buying a car.
Similarly, managed platforms like AWS SageMaker, Dataiku, KNIME, and Alteryx let you get started more quickly by providing pre-built components that handle common steps for you.
But while these platforms can save you time, they’re often far more limited than their marketing teams would like to admit. Plus if you get “locked in” to one of them, it can be expensive. As with a taxi, if you need it all the time, then it’s probably the wrong solution, and you’re likely to hit its limits sooner than you expect.
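To give a feel for what those pre-built components look like, here’s a rough sketch of launching a training job with the SageMaker Python SDK. Treat the specifics (the IAM role, S3 path, instance type, and framework version) as assumptions; they vary between accounts and SDK versions.

```python
# Rough sketch: launching a managed scikit-learn training job with the
# SageMaker Python SDK. The role ARN, S3 path, and versions below are
# placeholders and will differ in your own account.
import sagemaker
from sagemaker.sklearn.estimator import SKLearn

session = sagemaker.Session()

estimator = SKLearn(
    entry_point="train.py",  # your own training script
    role="arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder
    instance_type="ml.m5.large",
    framework_version="1.2-1",  # assumed; check the versions your SDK supports
    sagemaker_session=session,
)

# SageMaker provisions the instance, runs train.py on it,
# and stores the resulting model artifacts in S3.
estimator.fit({"train": "s3://example-bucket/training-data/"})
```

Notice that you still write the training script and manage the data yourself; the platform mainly takes provisioning and orchestration off your hands, at a price.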
You might want to use a managed platform if:
- Machine learning isn’t going to be an important part of your business or a source of competitive advantage.
- You want to apply machine learning to standard problems that are already well understood and have been solved elsewhere (e.g., recommendation systems, churn analysis, demand forecasting).
Sample team structure:
- At least one machine learning engineer, one developer, and one DevOps specialist.
Using APIs is like taking a flight
If you regularly need to get from London to New York, then a scheduled flight will probably fit your needs fine. It’s the least flexible form of transport, but it does one thing and does it well.
Similarly, APIs and web applications like Google Translate are the easiest to plug into your existing infrastructure. They also scale up as much as you need them to and provide a level of quality that’s hard to beat.
But they’re also the least flexible. If you have any custom needs, then APIs probably won’t be flexible enough for you. Plus you’ll pay a premium: with a translation API, for example, you’ll probably pay per character translated.
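For comparison, here’s roughly what integration looks like at this layer, using the Google Cloud Translation client library (v2) as an example. The snippet assumes credentials are already configured for your project; the review text is made up.

```python
# Sketch: translating a customer review with the Google Cloud Translation API
# (v2 client library). Assumes credentials are set up, e.g. via
# GOOGLE_APPLICATION_CREDENTIALS; billing is typically per character.
from google.cloud import translate_v2 as translate

client = translate.Client()

review = "Ce produit est excellent, livraison rapide."  # placeholder review
result = client.translate(review, target_language="en")

print(result["translatedText"])          # e.g. "This product is excellent, fast delivery."
print(result["detectedSourceLanguage"])  # e.g. "fr"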
You might want to use an API if:
- You have an e-commerce platform and you want users to see automatic translations of reviews left in other languages.
- You run a logistics company and you want automatic route optimization for deliveries.
Sample team structure:
- A single developer to integrate the API into your system.
Next steps
Once you’ve chosen which “layer” makes the most sense for you, you can start getting into the details of which frameworks, orchestration tools, or managed platform to use.
If you’re still unsure, book a free call with us to discuss what might be right for you!