MLA 014 Machine Learning Hosting and Serverless Deployment

Jan 17, 2021

Machine learning model deployment on the cloud is typically handled with solutions like AWS SageMaker for end-to-end training and inference as a REST endpoint, AWS Batch for cost-effective on-demand batch jobs using Docker containers, and AWS Lambda for low-usage, serverless inference without GPU support. Storage and infrastructure options such as AWS EFS are essential for managing large model artifacts, while new tools like Cortex offer open source alternatives with features like cost savings and scale-to-zero for resource management.

Resources
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (3rd Edition)
Designing Machine Learning Systems
Machine Learning Engineering for Production Specialization
Data Science on AWS: Implementing End-to-End, Continuous AI and Machine Learning Pipelines
Amazon SageMaker Technical Deep Dive Series
Show Notes

Cloud Providers for Machine Learning Hosting

  • The major cloud service providers for machine learning hosting are Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
  • AWS is widely adopted due to rapid innovation, a large ecosystem, extensive documentation, and ease of integration with other AWS services, despite some features of GCP, such as TPUs, being attractive for specific use cases.

Core Machine Learning Hosting Services

1. AWS SageMaker

  • SageMaker is an end-to-end service for training, monitoring, and deploying machine learning models, including REST endpoint deployment for inference.
  • It features auto-scaling, built-in monitoring, and support for Jupyter notebooks, but it incurs at least a 40% cost premium over direct EC2 usage and is always-on, which can be costly for low-traffic applications.
  • SageMaker also provides training analytics alongside its REST endpoint deployment; a minimal deployment sketch follows this list.
  • Google Cloud offers GCP Cloud ML with similar functionality.
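As a rough illustration of the endpoint workflow, here is a minimal sketch using the SageMaker Python SDK. The S3 artifact path, IAM role, framework version, instance type, and sample payload are placeholder assumptions, not values from the episode.

```python
# Minimal sketch: deploy trained TensorFlow artifacts as a SageMaker REST endpoint.
# The S3 path, role ARN, framework version, and instance types are assumptions.
from sagemaker.tensorflow import TensorFlowModel

role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # hypothetical IAM role

model = TensorFlowModel(
    model_data="s3://my-bucket/models/model.tar.gz",  # artifacts from a training job
    role=role,
    framework_version="2.3",
)

# Spins up an always-on, auto-scalable endpoint (billed for as long as it runs).
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.large",
    # accelerator_type="ml.eia2.medium",  # optionally attach Elastic Inference instead of a full GPU
)

print(predictor.predict({"instances": [[0.1, 0.2, 0.3]]}))

# predictor.delete_endpoint()  # the endpoint keeps billing until you tear it down
```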

2. AWS Batch

  • AWS Batch allows one-off batch jobs, typically for resource-intensive ML training or infrequent inference, using Docker containers.
  • Batch supports spot instances for significant cost savings and automatically shuts down resources when jobs complete, reducing always-on costs.
  • Batch jobs can be triggered via the CLI, the console, or programmatically (e.g., with boto3; see the sketch after this list), though the service does not provide automatic deployment or monitoring functionality like SageMaker.
  • AWS Batch runs Docker-based batch jobs and leverages ECR for container image hosting.
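A minimal sketch of kicking off a containerized training job programmatically with boto3; the job queue, job definition, command, and environment values are hypothetical names used for illustration.

```python
# Minimal sketch: submit a one-off, Docker-based training job to AWS Batch via boto3.
# Job queue, job definition, command, and paths are placeholder names.
import boto3

batch = boto3.client("batch", region_name="us-east-1")

response = batch.submit_job(
    jobName="train-sentiment-model",
    jobQueue="ml-gpu-spot-queue",       # a queue backed by spot instances for cost savings
    jobDefinition="ml-training:3",      # points at a container image hosted in ECR
    containerOverrides={
        "command": ["python", "train.py", "--epochs", "10"],
        "environment": [{"name": "MODEL_DIR", "value": "/mnt/efs/models"}],
    },
)

print("Submitted job:", response["jobId"])
# The instance spins up, runs the container to completion, then shuts down.
```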

3. AWS Lambda

  • AWS Lambda provides serverless deployment for machine learning inference, auto-scaling to meet demand, and incurs costs only during actual usage, but it does not support GPU or Elastic Inference.
  • Lambda functions can mount AWS EFS for storing and loading large model artifacts, which helps manage deployment size and cold-start performance (a handler sketch follows this list).
  • Only models that can perform inference efficiently on CPU within Lambda’s memory and compute limits are suitable for this approach.
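A minimal sketch of a CPU-only Lambda handler that loads a model artifact from an attached EFS mount. The mount path, file name, joblib/scikit-learn model, and API Gateway proxy event shape are assumptions for illustration.

```python
# Minimal sketch: CPU-only inference in AWS Lambda, loading artifacts from an EFS mount.
# The mount path, model file, and event format are assumptions; EFS must be attached
# to the function for this path to exist.
import json
import joblib

MODEL_PATH = "/mnt/models/sentiment.joblib"  # hypothetical EFS mount path

# Load once per container, outside the handler, so warm invocations skip the disk read.
model = joblib.load(MODEL_PATH)

def handler(event, context):
    features = json.loads(event["body"])["features"]
    prediction = model.predict([features])[0]
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": float(prediction)}),
    }
```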

4. Elastic Inference and Persistent Storage

  • AWS Elastic Inference enables the attachment of fractional GPU resources to EC2 or SageMaker for inference workloads, driving down costs by avoiding full GPU allocation.
  • AWS EFS (Elastic File System) is used to provide persistent, shared storage for model artifacts, allowing services like Batch and Lambda to efficiently access large files without repeated downloads.
  • Because the same EFS volume can be mounted across services, a one-time job can pre-download large artifacts (such as pretrained Transformer models) that Batch and Lambda then read directly from disk; a sketch follows this list.
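As an illustration of that pattern, a one-time script run from any machine or container with the EFS volume mounted can pre-download a large pretrained model into EFS, so later Batch or Lambda jobs load it from disk instead of the internet. The model name and the /mnt/efs path are assumptions, not values from the episode.

```python
# Minimal sketch: pre-download a Hugging Face model onto a mounted EFS volume once,
# so later Batch/Lambda containers read it from disk instead of re-downloading gigabytes.
# The model name and the /mnt/efs path are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

CACHE_DIR = "/mnt/efs/hf-cache"
MODEL_NAME = "facebook/bart-large-cnn"  # example summarization model

AutoTokenizer.from_pretrained(MODEL_NAME, cache_dir=CACHE_DIR)
AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME, cache_dir=CACHE_DIR)

# Later, inside the Batch or Lambda job, point at the same cache so nothing is downloaded:
#   AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME, cache_dir=CACHE_DIR)
```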

Model Optimization and Compatibility

  • Model optimizers such as ONNX (Open Neural Network Exchange) and Intel’s OpenVINO can compress and optimize machine learning models for efficient inference, enabling CPU-only deployment with minimal loss of accuracy.
  • ONNX converts models into a format that is interoperable across frameworks and architectures, which supports serverless environments like Lambda; an export-and-inference sketch follows this list.
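A rough sketch of the ONNX round trip described above: exporting a PyTorch model and running it with ONNX Runtime on CPU. The toy model, tensor shapes, and file name are illustrative assumptions.

```python
# Minimal sketch: export a PyTorch model to ONNX and run CPU-only inference with
# ONNX Runtime, the pattern that makes Lambda-style CPU deployment feasible.
import numpy as np
import onnxruntime as ort
import torch
import torch.nn as nn

# Toy model standing in for a real trained network.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

dummy_input = torch.randn(1, 10)
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["features"], output_names=["logits"],
    dynamic_axes={"features": {0: "batch"}},
)

# Inference with the CPU execution provider only.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
logits = session.run(None, {"features": np.random.randn(3, 10).astype(np.float32)})[0]
print(logits.shape)  # (3, 2)
```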

Emerging and Alternative Providers

1. Cortex

  • Cortex is an open source system that orchestrates model training, deployment, and scaling on AWS, including support for spot instances and potential for scale-to-zero, reducing costs during idle periods.
  • Cortex aims to provide SageMaker-like capabilities without the additional premium and with greater flexibility over infrastructure management.

2. Other Providers

  • Paperspace Gradient and FloydHub are additional providers offering ML model training and deployment services that are cost-competitive with AWS.
  • Paperspace is highlighted as significantly less expensive than SageMaker and Batch, though AWS integration and ecosystem breadth may still steer users toward AWS-native solutions.

Batch and Endpoint Model Deployment Scenarios

  • If model usage is rare (e.g., 1–50 times per day), batch approaches such as AWS Batch are cost-effective, running containerized jobs as needed and then shutting down.
  • For customer-facing applications requiring consistently available models, endpoint-based services like SageMaker, GCP Cloud ML, or Cortex are more appropriate.

Orchestration and Advanced Architectures

  • Kubernetes and related tools can be used to orchestrate ML models and complex pipelines at scale, enabling integration of components such as API gateways, serverless functions, and scalable training and inference systems.
  • Tools like Kubeflow leverage Kubernetes for deploying machine learning workloads, but they require greater expertise and management effort.



Transcript
You're listening to Machine Learning Applied, and in this episode we're going to talk about the machine learning hosting solutions out there, with a special emphasis on serverless technology. In the last Machine Learning Applied episode, we talked about various tech stack solutions, especially serverless solutions, for hosting your server. The Amplify stack in particular uses, under the hood, AWS Lambda, and AWS Lambda is a very popular solution for serverless architectures. AWS Lambda lets you write a single code function, whether it's Node.js or Python, and then that function, in conjunction with AWS API Gateway, will be exposed as a REST endpoint that your client can call. Very powerful, and it saves you a lot of time and heartache. Now, we want to achieve this level of functionality when deploying our machine learning models. So in this episode, when I'm talking about machine learning deployment solutions, I want to push towards as much of a serverless solution for deploying your machine learning models as possible. I want to preface by saying my own experience in machine learning hosting is somewhat limited. And in my experience trying to find the best serverless machine learning hosting technology available, I've found that the providers are somewhat limited too, and this makes sense. We've been deploying regular app servers using FastAPI or Node.js for a very long time, and so hosting providers like AWS and GCP have had a lot of time to sit with this problem in order to come up with solutions like AWS Lambda, whereas machine learning is a more recently popular technology, and so they haven't had as much time to really perfect this space. So you'll find that even within a single cloud hosting provider like AWS, there are competing machine learning hosting solutions, and it's not clear when you might want to use one solution over another, and any one given solution doesn't really solve all the problems you need. It's an evolving space, it's a nascent field, and things are only going to get better. All that's to say that this episode is going to be due for a refresh before too long, so expect a follow-up on this episode in the future. Now, let's start with types of hosted machine learning scenarios. The first type is if you want to train your model in the cloud. Training is the most expensive component of the machine learning story; it's the most time- and computation-expensive part of the end-to-end machine learning story. What you'll usually do is train a model, and that includes hyperparameter optimization of the model, so not just the training phase, but cross-validating against validation error, checking the test error, tweaking hyperparameters, et cetera. This can span multiple days for a single model, depending on the training set. And then later you'll end up using that model for inference. Inference is the cheap part. Usually inference can happen in seconds or milliseconds, so it's time-cheap and it's also computationally cheap. So these two phases of the machine learning pipeline are separate and handled differently. The training phase is extremely expensive; the inference phase is cheap. So you'll want to use different technology options for these different phases of the machine learning lifecycle. For the training phase, you don't have to do it in the cloud; you could do it on your desktop.
You could train your model on a custom-built, GPU-heavy desktop, which is what I typically do, and save the result of the trained machine learning model. This is what we call artifacts. The artifacts are the result of a saved machine learning model. If you've got a TensorFlow model, you'll save it in the TensorFlow format. If you've got another model, let's say XGBoost or something else, you might save it via the Python joblib library to a pickle file or something like that. So the exported, saved dump of your machine learning model is what we call an artifact, and then you will take that artifact and deploy it to wherever you're going to be hosting the inference part of your machine learning lifecycle. So that's one option: do the training on your desktop. But another option is that you can cloud-host the training phase of your machine learning model, and there are a number of cloud hosting solutions for the training phase. This is especially valuable if the training phase may take multiple machines. Let's say your desktop just doesn't cut it, it's not enough. Even if you have a very beefy, powerful desktop, it's still not enough; you may need to parallelize the training phase of your machine learning. Then you might want to consider a cloud hosting solution. And of course, as I always mention, there are three popular cloud hosting providers out there: GCP, Google Cloud Platform, which is Google's web hosting offering; AWS, Amazon Web Services, which is Amazon's web hosting solution; and Microsoft Azure, which of course is by Microsoft. So Microsoft, Amazon, and Google. Now, in the past I recommended that you consider Google primarily, and the reason I said this, a long time ago in one of my early Machine Learning Guide episodes, is that Google really is the king of machine learning. They really are. Their primary business model, ads, AdWords and AdSense, is very, very machine learning heavy, and their secondary business model, which is driven by their primary business model, is search. And search, of course, is machine learning heavy, these days primarily using BERT technology. Google puts out white papers hand over fist. Google is one of the absolute cutting-edge participants and contributors to the research space of machine learning, and so necessarily they would have fantastic machine learning backend services and solutions. And they do: GCP is absolutely fantastic tech, and they offer things that other cloud providers don't offer, like TPUs, tensor processing units, as opposed to GPUs and CPUs, which streamline your TensorFlow code immensely and make both your TensorFlow training and inference much faster than what you can get on GPUs and CPUs. So GCP is a fantastic solution, and I highly recommend it. But since I made that recommendation in the past, I never did make the switch myself to Google. I've always been a bit of an AWS guy, and it's just that AWS innovates so fast on their cloud hosting solutions, and they're so tried and true, that I found working with AWS to be easier than working with GCP. Most of the projects that I get involved with end up using AWS, and most of the resources I find out there, tutorials and videos on cloud hosting, end up being on AWS. So over time I found that AWS provided the least inertia for me as a developer.
And so you might find that to be the case for you as well. So I would recommend AWS; consider AWS first before GCP or Microsoft Azure. But if you take this seriously, then of course consider all three options. Weigh the prices of the services that you're going to be using, especially the machine learning hosting services, because GPU hosting is extremely expensive and the price differences between the three providers can be drastic, but there's no clear winner; you can't say one is cheaper than the other across the board. It just depends on the services you use, whether it's this type of GPU or that type of Kubernetes orchestration, et cetera. For the rest of this episode, I'm just going to be focusing on AWS for my examples, but bear in mind that all three cloud providers offer equivalent services for almost everything I'm going to be discussing in this episode. So, for example, we're going to be talking about SageMaker right now. SageMaker is an AWS service, but Google offers an equivalent service called Cloud ML, and I'm sure Microsoft Azure offers an equivalent service as well. AWS SageMaker is a service for training, monitoring, and deploying your machine learning models, and it is probably the most well-known of the cloud-hosted machine learning solutions out there, the reason being that it is the most end-to-end solution out there. It provides the training component, everything in between, including monitoring your model's training and outputting the artifacts, and then exporting those artifacts into a deployed API via REST endpoints. Now, we're talking about the training phase here. So what you might do is use SageMaker to train your model on, and you can do that either through the SageMaker API or console, or SageMaker allows you to spin up a Jupyter notebook where you can write your code and interact with it in a web browser on the deployed SageMaker instance that you're using, and then train your model directly from within the Jupyter notebook. Then you can dump the artifacts of your trained model at the end of your session, close that Jupyter notebook, and deploy those artifacts to a REST endpoint, which we'll talk about in a bit. One of the big benefits that SageMaker offers you in training your model is auto-scaling and parallelization if necessary, if you need lots of compute to train your model. Another benefit that SageMaker offers is monitoring of your model's training process. So something SageMaker adds that the other AWS machine learning solutions don't provide is a lot of insight into the process of your model. This might be accuracy metrics, actual error metrics, like if you're throwing errors, compute resource utilization, and all these things. So it gives you a lot of tooling in the AWS web console to monitor the training process of your machine learning model, on top of whatever tooling the framework you're using provides. So, for example, if you're training a TensorFlow model, well, TensorFlow has this package called TensorBoard, and TensorBoard gives you visual insights into the architecture of your neural network.
And that might include things like the distribution of any one neuron, the distribution of the weights, so you can tell whether things are going awry, whether they're overfitting or underfitting, whether you're getting vanishing or exploding gradients in the training process, et cetera. So that's TensorBoard. Well, that's the type of tooling you might expect SageMaker to provide for the training process and monitoring of your machine learning model. So SageMaker is very robust and powerful, and it's an end-to-end solution. And like I said, what you do then, at the very end of your notebook or however you want to interface with SageMaker, is export your saved model's artifacts, and then you can use those artifacts to deploy REST endpoints for your machine learning model, for inference. So you've done the training process, maybe it takes a half hour or an hour, you've got a trained model, and now you want to use that model on the web. You want your app server, which exists either on AWS Lambda or maybe as a Docker container on ECS Fargate, to interact with your machine learning model in order to provide results to your end user on the client. Well, you don't want to deploy your machine learning model on the Fargate instance, because that instance is going to always be on, and you don't want to attach an expensive GPU to it. You want your app server to handle what it handles best, and that's running a web server. So you leave it alone, and you deploy your SageMaker-saved model artifacts to a REST endpoint through SageMaker, and your app server can now interact with your SageMaker model's REST endpoints in order to retrieve results via inference of the machine learning model. So SageMaker creates a web API for your model, and it does a lot of the heavy lifting for you, so that you can host a model without actually spinning up an EC2 instance, setting up TensorFlow and PyTorch, creating a web API on that instance to expose the model's inference API, and all these things. So SageMaker is end-to-end; it does everything for you. If you want a one-stop-shop hosted machine learning solution, then SageMaker is your guy. Now, why might you not want to use SageMaker? That seems to be the conclusion to this podcast episode; let's stop here and just say SageMaker is the king, and we're done. Well, there are a couple of things that are a bummer about SageMaker. The first is that it's very expensive. It's going to be using a GPU instance, an EC2 instance with a GPU attached. A very common one that people use is called the p2.xlarge. That's one of the lower-end EC2 instances that supports a lot of the modern machine learning frameworks, but it's still quite expensive. If you were to host your own EC2 instance on a p2.xlarge and write your own code to deliver your machine learning model's inference capabilities, writing your own REST endpoints on that EC2 instance, let's say using FastAPI, for example, you would have a couple of benefits. One is you'd have a lot more control. Two is that you can use spot instances if you want, although you'd have to take care that those instances might get shut down if the price of the spot instances goes above your max price bid. So that's one solution: host your own on an EC2 instance, and it would have a flat cost if you're not using spot instances. Well, that cost on SageMaker is 40% more.
So compared to the p2.xlarge instance done manually via EC2, you're paying 40% more to replicate that on SageMaker. Now granted, SageMaker provides all these bells and whistles, these benefits, and saves you a lot of time, so it's very likely worth it. It's almost never a good idea to custom-host your own model on EC2. You're going to want to use one of these solutions available on AWS instead; you want to get as serverless as you can. We're trying to move away from EC2 as much as possible; over time it'll just save you time, heartache, and unnecessary boilerplate coding. So we don't want to use the EC2 model, but going with the SageMaker route, we end up paying a 40% premium. That's a big bummer. Another bummer is that whatever SageMaker deploys is always on. Always on. Now, if your machine learning project, your app, which uses machine learning on the backend by way of SageMaker, for example, is constantly being hit and has a huge user base, and users are constantly running the inference capabilities of your machine learning model, that's fine; you always want it on anyway. And what SageMaker provides for you out of the box is auto-scaling of your machine learning model's inference. Auto-scaling is very, very important; that's another benefit of using SageMaker instead of EC2: auto-scaling, auto-deployment, auto-provisioning of server architecture and instances, and so on. So if you have an app that has machine learning and that machine learning is in constant use, then it is very unlikely that you need to take that server down ever anyway. An always-on server via SageMaker is okay; that's exactly what you want. But if you are a startup and you don't have a lot of users, like me with Gnothi, which doesn't currently have enough users using its machine learning capabilities to warrant the expense of keeping a SageMaker-deployed model always available, then I would be paying excess cost out the nose: (a) I'd be paying a 40% increase on the cost of whatever EC2 instance it's deploying, and (b) I'd be paying for it to always be on, even if it's only being used some of the time. Okay, so that covers SageMaker. Again, SageMaker is an end-to-end machine learning training and deployment solution with all sorts of bells and whistles, including monitoring the training phase of your machine learning model and deployment of a REST API for inference at scale, with auto-provisioning, auto-this, auto-that; everything is just handled for you, but with a premium. That premium is 40% additional cost over the EC2 instance you're going to be using, plus the model always being on. As long as you're okay with that premium, the solution is great for you. And again, GCP has an equivalent solution to SageMaker; it's called Google Cloud ML. And Microsoft Azure, I'm sure, has an equivalent solution to SageMaker; I don't know what it is, I'm very unfamiliar with Azure's offerings, unfortunately. Now, there's another concept called Elastic Inference, provided by AWS, and Elastic Inference allows you to attach as much GPU compute power as you want to whatever EC2 instance you need, without using a full GPU instance. Now, that's a little bit confusing; the idea goes like this. The p2.xlarge EC2 instance uses, I believe, a K80 GPU. So it attaches a K80 GPU with a fixed amount of RAM and compute power.
Well, especially in the inference phase, when you've deployed your model to the cloud at a REST endpoint and you're making predictions, that GPU is going to go wildly underutilized. Even when you have a lot of users and the machine learning model is constantly in use, even then, because inference is so cheap, you're still very likely to be underutilizing your GPU, and therefore you're going to be paying for more than you should be paying for. So what Elastic Inference does is allow you to attach some amount of GPU compute power and RAM to whatever EC2 instance you might be using, without actually specifying which GPU you're going to be using, and that allows you a little bit more flexibility to drive down the costs of your deployed machine learning model. You're less likely to use this API at training time; actually, that's kind of the point, and that's why it's called the Elastic Inference API. You don't use it at training time, because training is probably going to be using your GPU to the max, but at inference time you're very unlikely to be using the full GPU, and so it's better and more cost-effective to use this AWS Elastic Inference system rather than a full-fledged GPU EC2 instance. So rather than using a p2.xlarge EC2 instance, in this example where we're deploying our own model on EC2, which is never suggested, you might use whatever kind of instance you want, including just the standard instances with normal CPU and RAM, and then attach some amount of GPU compute and GPU RAM, some small amount that you think you might actually utilize, and that way you'll save costs. And this Elastic Inference solution is available to SageMaker, actually. So even though the SageMaker deployment of your model carries a 40% cost premium on whatever instance you're using and is always up, you can still shave quite a substantial amount of cost off by not using one of the GPU instances and instead using the Elastic Inference API. So that's one way to get cost savings on SageMaker: it's compatible with the Elastic Inference API. Okay, so that's SageMaker, the end-to-end training and deployment. Now let's talk about training-only and deployment-only offerings. The traditional training-only offering is something called AWS Batch. AWS Batch lets you run a Docker container to completion, and then it dies, it goes down. And what kicks off this batch process? Well, you would either call it through a CLI call, trigger it manually in the AWS console, or use the Python Boto3 library on your app server to trigger a batch job. What Batch is intended to do is run heavy jobs. It's not just for machine learning; AWS Batch is primarily intended for any batch job, any heavy job that is Docker-containerized and intended to run once and then die. So you'd kick off an AWS Batch job, and that batch job might be, for example, training a machine learning model. That's a prime use case of AWS Batch. Now, Batch doesn't offer all the tooling that SageMaker offers for monitoring your training process and all those things, and it certainly doesn't offer deployment of a machine learning model to a REST endpoint. But what Batch does offer is a lot of cost savings. So (a), the job dies eventually; as soon as it's done, it dies, and that means you don't have an EC2 instance always on racking up costs.
And (b), you can specify that the batch job should use a spot instance if available. So what you end up doing is setting up a Batch environment. You say: I want it to run on a p2.xlarge, it's going to deploy this Docker container, which is, for example, a machine learning training job, and if available, spin up a spot instance of that p2.xlarge, but if not available, then use a standard p2.xlarge instance. That way, if there is a p2.xlarge available as a spot instance, you'll save a lot of cost, sometimes up to 90%, and if not, it will run in standard mode, if you need the batch job to run immediately no matter what. Alternatively, you could say: always use a spot instance, don't fall back on a standard instance. You would do something like that if it's not time-sensitive, if it doesn't really matter when that machine learning job runs as long as it runs eventually; then you might use a spot-instance-only batch job. Now, try to think outside the box on how you might use Batch, because of the cost savings Batch provides: (a) it can be a spot instance, (b) it doesn't carry the 40% extra cost, and (c) it dies when it's done. Well, check this: I'm using Batch for Gnothi's machine learning server. In other words, my Batch server is basically my deployment server, which is not Batch's intended use case; Batch is intended to fire and forget some heavy job. What I do is this: on the app server, when a user comes online, if there's not already a Batch server running, then I use the Boto3 library, which is Python's AWS library, to spin up a Batch instance, and that runs my inference system; it puts the inference system online. Now that inference system is running in a Batch job inside of a Docker container, and machine learning requests, requests for inference, prediction requests, are sent to that Batch job by way of a job queue. You might use RabbitMQ or Amazon SQS; I actually don't use any of those. I have my own handcrafted job queue system by way of Postgres, which I mentioned in the last episode. Postgres is so powerful; it offers a job-queuing-like system just due to the nature of some of its capabilities, as well as some pub/sub capabilities. So eventually I'll probably move to a system like RabbitMQ or SQS, but for now this Postgres-only job queuing system is working fantastically. So I send job requests to the Batch server from the app server; the Batch server runs these machine learning inference jobs, like a summarization report, question answering, or theme generation in Gnothi, returns the results, and then it waits for a while, and if it hasn't received a new job in, let's say, five to fifteen minutes, it turns itself off with an exit(0). What that exit(0) does is kill the Python process, and a killed process is what Batch is looking for in order to take the instance offline. Normally, like I said, the way people use Batch as intended is to run a script, maybe a machine learning training script, to completion, and at the end of a Python script, at its completion, an exit(0) is implied. So that's the intended use case, but what you can do with Batch is run the whole thing in a loop so it never exits until you want it to. And now what do you have? You have a machine learning server. Okay.
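What follows is a rough sketch of the idle-shutdown loop described above: poll a job queue, run inference, and call exit(0) after a quiet period so Batch tears the instance down. Here fetch_next_job, run_inference, and save_result are hypothetical helpers standing in for the Postgres-backed queue and the actual models, and the idle timeout is an assumed value.

```python
# Rough sketch of the pattern described above: a Batch container that loops over a
# job queue and exits cleanly after an idle window, so AWS Batch shuts the instance down.
# fetch_next_job(), run_inference(), and save_result() are hypothetical stand-ins for
# the real Postgres-backed queue and model code.
import sys
import time

IDLE_LIMIT_SECONDS = 10 * 60  # shut down after roughly 10 minutes with no work

def main():
    last_job_at = time.time()
    while True:
        job = fetch_next_job()           # e.g., pull the next row from a Postgres job table
        if job is not None:
            result = run_inference(job)  # summarization, question answering, themes, etc.
            save_result(job, result)
            last_job_at = time.time()
        elif time.time() - last_job_at > IDLE_LIMIT_SECONDS:
            sys.exit(0)                  # a clean exit is what Batch needs to take the instance offline
        else:
            time.sleep(5)                # brief pause before polling again

if __name__ == "__main__":
    main()
```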
So it's a little bit of a hack job, and it doesn't offer some of the benefits of SageMaker, but it does offer substantial cost savings while not requiring the server to be online all the time. Let me tell you what it does not provide that SageMaker does: auto-scaling. So when I get enough users to warrant more than one machine learning server, this solution is going to go out the door and I'm going to switch over to SageMaker. This is a temporary solution during the period in which I have very few users using the machine learning services. Okay: SageMaker for end-to-end, Batch for big jobs, and that's a shoo-in for machine learning model training. And Batch uses Docker containers, so you're going to Dockerize your machine learning stuff into a container, and then you will run that container's script to completion, and at that point AWS Batch takes itself offline. Now, as far as I understand, you could probably use the Elastic Inference API with Batch. I'm not sure, actually. The way I understand the Elastic Inference API, it's sort of a one-size-fits-all replacement for GPU instances if you are primarily using that instance's GPU capabilities for inference, not for training. So something you might want to look into, if you want to add more cost savings when using AWS Batch, is whether it's compatible with the Elastic Inference API; I don't know. Now, let's pause here real quick. I'm going to mention another solution offered by AWS called EFS, Elastic File System. EFS lets you attach a file system, a mounted network file system, to any service, whether it's EC2 or AWS Batch or SageMaker or Lambda, and that file system is permanent. So if your AWS Batch container gets taken offline when its script has completed, as it's intended to do, whatever files you saved, including saved model artifacts written to disk, get deleted. When the AWS Batch container gets taken down, its file system is ephemeral; it's temporary, it's intended to be deleted on server takedown, and that's normal. If you don't want that, then you'll use EFS. EFS is a permanent file system that you can attach to all sorts of different services, and they'll share the file system amongst themselves; they'll all be able to contribute to the same file system, and when those services get taken offline, like AWS Batch going offline, that file system will still remain available to you. AWS offers all sorts of different file storage solutions, the most obvious of which is AWS S3. S3 is usually where you'll store files that are intended to be accessible on the web, for example, but sometimes you want a private file system that's only available to your servers. You don't really mount an S3 file system; S3 is more like Dropbox or Google Drive. It's something where you put stuff over there and maybe sometimes pull stuff out of, but it's not intended to be a read-write file system that you're consistently accessing from an operating system. For that, instead, you'll use EFS. So with the solutions that I've discussed so far, SageMaker and Batch, you're almost certainly going to want to combine them with EFS, because you're going to be saving your model artifacts, you're going to be saving various training information and training data; you want a large spreadsheet or a SQL dump to be saved once and consistently available on startup, available as a mounted file system, and so on.
Another reason you would use EFS, something I use it for, is that a lot of the pre-trained models you're going to be using in your project, like the Hugging Face Transformers models, are very large. With Hugging Face Transformers, you would install the package via pip in the Docker container, in the Dockerfile: pip install transformers. But the models that are loaded by that package, the models themselves, which are pre-trained models with PyTorch weights and TensorFlow weights, are downloaded from the internet, and they can be on the order of four gigabytes or more per model. So four gigabytes per model, and in Gnothi I have a summarization model, a question answering model, a sentiment analysis model, and more. So let's say I'm dealing with four gigabytes per model. Well, I don't want to re-download each model every time I provision a Batch instance. Batch, every time it comes online, has to re-provision the instance based on the Dockerfile. You want to keep your Dockerfile as minimal as possible, so it has to download as little as possible, and the only way you can achieve that is to have the big stuff, like the Hugging Face Transformers models, pre-downloaded onto an EFS drive, not downloaded by way of the Dockerfile, and to have that EFS drive mounted to the Batch container. Okay: SageMaker for end-to-end; Batch for big jobs, which run to completion and then die, though it doesn't have to be that way, since you can run some tricks where it's an infinite loop and it exits when you want it to; and EFS to ensure that persistent files are stored between the ups and downs of your servers. Now, here's where things get a little bit tricky, because SageMaker, I think, is an obvious solution for people with deep pockets and a big system that just needs to work and always be available. But what about people like us, and my project Gnothi, which doesn't have deep pockets nor a lot of users consistently using the machine learning capabilities? That Batch system I just described doesn't really work fantastically. It works for now, because I don't need to scale this thing, but eventually I want a solution that maybe saves cost by comparison to SageMaker but ideally provides some of the capabilities of SageMaker, for example an auto-scaling inference REST endpoint, but also allows that endpoint to scale to zero. In other words, when someone's not using the machine learning capabilities, whatever solution I use can ideally turn itself off so it's not incurring costs while no one's using it. Well, I did mention in the last episode the solution AWS Lambda. Lambda lets you write Python functions or Node functions in order to deploy a single function as a REST endpoint, in conjunction with API Gateway, another AWS service. The problem with Lambda for machine learning is that Lambdas do not have GPU access. They don't even have Elastic Inference API compatibility; you can't use the Elastic Inference API on AWS Lambda. In other words, you cannot use a GPU with Lambda. It is impossible, as far as I know. If I'm wrong, please send me a message and correct me, because this to me would be the ideal scenario: if we could get GPU access on Lambda to deploy our own machine learning inference endpoint. The cool thing about Lambda is that as long as a Lambda function is not being called, you are not being charged. That's an immense benefit. To me, that's the real benefit of Lambda.
Unlike EC2, where if you have your own EC2 service, or even if you're using ECS with Fargate and a deployed Docker container, that service is always on and so you're constantly being charged, Lambda is not always on. It spins itself up in real time when it's being requested from a client, so it comes online as needed in order to do its job, and then goes offline. That's wonderful, absolutely wonderful. Now, traditionally the compute cost of any one Lambda endpoint is pretty low anyway, so that's not a huge deal. The real benefit of Lambda is that it auto-scales and auto-provisions all the server architecture; the real cost and time-savings benefit of Lambda is all the stuff it does for you: the orchestration, the scaling, the endpoint creation, and all that. But if you could have Lambda attached to a GPU, such that it would only run a machine learning model's inference when called and then be taken offline, you could save an immense amount of money and get most of the functionality you desire from SageMaker's deployments. But you can't; unfortunately, you can't use a GPU on Lambda. However, some people are still using Lambda nonetheless for their machine learning deployments, and the way they do this is, first off, remember that I said the cost of inference of a machine learning model is substantially less than the cost of training. Well, with this in mind, some people have found that maybe they don't actually need a GPU at inference time. Maybe their model is quick enough, not quite so compute-intensive as to necessitate a GPU at inference time. Not at training time; you always need a GPU at training time, but at inference time, maybe you don't need a GPU. And so they deploy their machine learning models to Lambda, not Batch, not SageMaker; they deploy to Lambda. And again, they have to use EFS attached to Lambda so that the Lambda function has access to the file system, and that file system has the saved model artifacts, let's say a TensorFlow model, or a PyTorch model, or an XGBoost joblib-pickled file, et cetera. So the Lambda function loads up a saved model from EFS disk and uses it simply for inference. And some companies have found that even though, yes, there is reduced performance of the model's compute power on Lambda, it is nonetheless sufficient for their needs. It's enough for them to use the minimal Lambda resources to run inference on a machine learning model. What they find, in some of the things I've read, is: look, a small amount of CPU and a small amount of RAM is not that much, especially when we're running inference on deep learning models, but the whole point of Lambda is that it auto-scales. That's the real big benefit of Lambda. So if you have a single function and you have a hundred users hitting that machine learning model at once, well, those hundred requests are a hundred different Lambda calls, and therefore each function call is loading up its own model into memory, running inference, and sending back the results. So the parallelization capabilities of AWS Lambda compensate for the lack of GPU availability when running inference on these models. So if you're listening to this episode and you want to cut costs, you know SageMaker is too expensive, but you don't want something online all the time, then here is what I would recommend.
Consider whether or not you could use AWS Lambda to host your model for inference. In other words, think about whether your model can be loaded into RAM, the minimal amount of RAM that Lambda provides. I think the max RAM a Lambda function provides is three gigabytes, and the max CPU a Lambda function provides is two virtual CPUs. So think about whether that's enough for one inference call, and if it is, use Lambda: deploy your machine learning model's inference capabilities to Lambda functions. If it's not, then consider some alternatives, like Batch, or some more stuff we're going to discuss here in a bit. One more thing about Lambda. If the compute resources provided by Lambda are insufficient to run inference on the model you want to deploy, there's one more trick you can try, and this is something I plan to try with Gnothi. I want to try moving away from Batch; Batch really is not the right solution here. I'm not using Batch the way it's intended to be used, and that's going to bite me in the butt eventually; Batch is really intended for things like training jobs. So I'm going to try moving from Batch to Lambda, and the way I'm going to do it is the following. There are these tools out there, which I will discuss in a future episode, considered optimization tools, machine learning model optimizers. And I'm not talking about an optimizer like Adam or Adagrad; we're not talking about backprop. We're talking about the whole machine learning model: an exported TensorFlow model artifact or an exported PyTorch model is piped through an optimizer, and out comes a faster model. It's basically like a zip file; it's like a compression algorithm for your machine learning model. It compresses it, literally compresses it. It does some tricks on the model itself, on the weights and the neurons. It can remove neurons as necessary; it can quantize the types of values that can be represented at any one given neuron. Another thing it can do: usually these models have something like float32 variable types as the weights, and it can downscale from float32 to float16, or even further to 8-bit. It applies all these tricks to your exported model without reducing the accuracy too much, and then exports a substantially, substantially minified version of your machine learning model. And now that model can run on CPU. So your previous model, let's say a Hugging Face Transformers model, for example: a lot of those probably won't work very well on CPU, and they probably won't lend themselves well to an AWS Lambda function. But you pipe the model through one of these optimizers, and out comes a minified version of the model which can be used on AWS Lambda, CPU only, a small amount of RAM, and that's okay. You know, we lost, let's say, one or two percent accuracy. Big deal; it's worth the cost savings and the scalability and all that. Of these optimizers, the most popular out there is called ONNX. Like I said, I'm going to do a dedicated episode on optimizers; they're very valuable, very powerful tools. Another popular one is called OpenVINO, and that one's by Intel. If I recall correctly, I don't know that OpenVINO is necessarily intended for general-purpose model optimization; I think it's intended for optimizing your model to be run on Intel CPU chips.
Now, if your AWS Lambda functions are running on Intel CPUs, which I think they are, then win-win: you're getting what you wanted out of it. But just keep that in mind; I don't think OpenVINO is a general-purpose optimizer, I think it's an Intel CPU optimizer. A lot of times OpenVINO is used for things like camera models, computer vision models deployed on small cameras intended to run at the edge, where the camera only has an Intel chip; it doesn't have a GPU or anything else. So OpenVINO tends to have a very specific use case, but ONNX is a general-use-case optimizer, and it has a lot of other benefits as well. You'll hear about ONNX; if this is the first time you've heard about it, I promise this won't be the last. ONNX is not just about model optimizing; it's also about model file format generalization. So when you pipe your TensorFlow model through ONNX, out comes an optimized model, a smaller, faster, leaner model with not too much reduction in accuracy, and another benefit of what just came out of the pipeline is that it can be deployed on all sorts of different architectures. So ONNX is not just about optimization, it's also about compatibility. But we'll talk about that in a future episode; let's move on from ONNX. So: SageMaker is the end-to-end solution, all sorts of bells and whistles, all sorts of tooling; deep pockets, lots of users, go SageMaker. Batch runs a one-off, fire-and-forget big job; great idea for model training. Lambda is a bunch of tiny little servers where every client request spins up that function into CPU and RAM and then spins it down. It's normally not used for machine learning, but a lot of people have been using it for machine learning recently. The main way they do that is either their model runs just fine for inference purposes on that Lambda instance, because inference is cheap, or, if inference is still not cheap enough to run on a Lambda function, you can force it to be by running your machine learning model through ONNX to output a minified version of that model, which can now run cheaply on a Lambda function. And of course, I mentioned EFS, your file system where you store your model artifacts, your saved model dumps. You would want to pipe your model through ONNX first and save it to your EFS file system in advance, and that way your Lambda function calls would load from EFS the minified, optimized models, not the original models. Now, as far as I understand, that about brings us to a conclusion on the AWS machine learning offerings. Like I said, you could do all this stuff on EC2 yourself manually. You could also do all this on ECS, Elastic Container Service, by using a Docker container, and that will relieve some of the deployment overhead of spinning up manual EC2 instances and such, but I just wouldn't recommend it. It's too expensive and it's too much legwork to write the code that is going to serve up your inference engine and all these things. For machine learning model deployment, I definitely would not recommend running a custom EC2 server. But even at the ECS level, I don't know that I would recommend running a containerized machine learning model on ECS. You could do it.
You could probably get some cost savings, but there might be a little bit more to the programming and the management, and making sure that everything works together and scales effectively and so on. So that covers the machine learning offerings by AWS, and like I said, GCP and Azure have their own equivalents to all of this stuff. But there are third-party providers out there competing in this space now. One is called FloydHub; I don't know much about them. Another is called Paperspace. They do stuff similar to SageMaker; it's a similar offering where you can train and monitor your machine learning model and then deploy it as a REST endpoint. And they also have an offering similar to AWS Batch, where you run a one-off instance that dies at the end. These services are very similar to their counterparts; they offer the same types of solutions. For example, their Batch equivalent has cost savings and can use spot instances in a similar way as AWS. Now, why would you use Paperspace if AWS has all this stuff built in, especially since a lot of us are using AWS already? Wouldn't we want to be using the VPC that we might be hosting our other services on, and so on? Well, Paperspace is a lot cheaper than AWS, actually. For some reason, I don't know how they've pulled this off, but they've found a way to make things a little bit cheaper than AWS SageMaker and AWS Batch. I did experiment with Paperspace some, and I found that they were cheaper than AWS on average, but I still prefer AWS, like I said, because I am using all the other AWS services, and I do want that EFS file system, for example, to be available to other AWS services outside of the machine learning servers I'm running. So I still prefer to stay in AWS rather than Paperspace, although do look into Paperspace; I believe you can set it up to run inside your AWS architecture, and there's a way you can run Paperspace services on-prem, on your own servers, like an on-site server farm with physical servers, and so on. I will say: check out Paperspace, but maybe don't get your hopes up too high. I think the AWS offerings are pretty much all you need, but check out Paperspace and see if you like what they've got. Finally, there's one last cloud hosting solution I want to discuss that I'm really excited about but don't know enough about yet, and it's called Cortex, at cortex.dev. Cortex is an open source cloud hosting setup. It uses your AWS stack; it actually orchestrates the provisioning of the various services in your AWS stack needed to run the various machine learning components of a full-fledged architecture. Okay, so let me step back. What you do is pip install cortex, and you write your traditional machine learning code: TensorFlow, PyTorch, Hugging Face Transformers, et cetera. At some point in your code, you'll run inference through your machine learning model, or you will run training on your machine learning model, and you will have it tied into Cortex in one way or another. There will be some flag. You know, I'm kind of blowing smoke here as I try to wrap my head around it, because I haven't used Cortex yet, so bear with me as I try to describe what I understand of Cortex.
But at some point there's sort of a flag that tells you whether you're in a dev environment or a production environment, and at the point at which you would call inference on your machine learning model, in the standard way you would call inference, like model.predict on your TensorFlow model: if this flag is in development mode, it'll do the normal stuff and run it all on your GPU, but if the flag is in production mode, it will kick off to AWS through the Cortex system. And what Cortex will do is auto-provision EC2 instances, spot instances if available, and scale them up and scale them down as necessary. And one thing they're working on, which is the reason I'm so excited about Cortex, is scale down to zero. In other words, what Cortex does for you is basically be an open source SageMaker: it does the end-to-end stuff, training, monitoring, and inference and deployment to REST endpoints. It does all that for you, same as SageMaker, without the added cost; it doesn't incur that 40% extra cost, and in fact it comes at a reduced cost, because you're able to use spot instances if available. So it's cheaper than SageMaker. And last but not least, one thing they're working on, which isn't yet released, is scale down to zero. So if you don't have users using the service, you can specify some cooldown period: after 15 minutes, just take everything offline. So Cortex is a powerful, open source, cheaper alternative to SageMaker, which may soon provide scale-to-zero, a capability that is missing from almost all of the serverless machine learning solutions out there. So keep an eye on cortex.dev, check it out, tell me what you think. Send a comment, send an email; I want to know if anyone has any experience with Cortex. I plan to give it a whirl, and I'll keep you posted on what I find. That's pretty much a wrap for this episode. I do want to mention that you can orchestrate your own machine learning architecture; there are solutions out there, like Kubeflow. So Kubernetes is a declarative orchestration system. In the same way that a Dockerfile is a declarative code file for provisioning a server, setting it up end to end, Kubernetes takes it to the next level, and it's the same kind of thing: a declarative file that provisions your entire tech stack. So you'd use Kubernetes on AWS or on GCP in the same way you'd use Docker, and it sets up your entire stack: some instances, some API gateways, some serverless functions, et cetera, end to end. So the real big guys, the real DevOps champions of this world, when they're setting up their machine learning architecture, are probably going to be using something more akin to Kubernetes. And I don't know much about Kubeflow, but I know that it is a TensorFlow companion to Kubernetes, so it makes deploying machine learning in your architecture more streamlined if you're using Kubernetes. But I don't know much about these orchestration software packages like Kubernetes or Kubeflow, so when I do learn more about these projects, I will do a follow-up episode and fill you in. Until then, thanks for listening. See you next time.