SageMaker streamlines machine learning workflows by enabling integrated model training, tuning, deployment, monitoring, and pipeline automation within the AWS ecosystem, offering scalable compute options and flexible development environments. Cloud-native AWS machine learning services such as Comprehend, Forecast, Personalize, and Polly provide off-the-shelf solutions for NLP, time series forecasting, recommendations, text-to-speech, and more, reducing the need for custom model implementation and deployment.
Machine Learning Applied: SageMaker, part two. We're still exploring SageMaker's features. We left off in the train-and-tune phase of the SageMaker tooling, all part of a data and machine learning pipeline. Let's spend a little more time discussing training, because training is going to be the bread and butter of a machine learning engineer's day-to-day role: training your machine learning model.
In the past, we might write our model in Keras, TensorFlow, or PyTorch and train that model on localhost, maybe in a Docker container, using our system's GPU. Well, a benefit of training in SageMaker is that your model becomes part of the pipeline, part of the stack. It will be receiving data downstream from what you've already built out in your pipeline using Data Wrangler, Feature Store, and all these things.
If you build your model into SageMaker, so that when you train your model on SageMaker it's able to be deployed through SageMaker, then when you're ready to deploy you don't have to take the extra step of transferring a locally trained model to the cloud. It will all be ready for you to just run a deploy script or click a button, whatever the case may be.
Of course, in the training phase, SageMaker offers all the tooling around training your model: things like Model Debugger, which lets you peek into your neural network by way of TensorBoard, or keep an eye on objective metrics, model drift, or bias, all through a graphical user interface in SageMaker Studio, or email or text-message alerts via CloudWatch, and so on.
Another benefit of training your model in SageMaker, as opposed to doing it on localhost, is that for this whole process you'll be spinning up a SageMaker Studio project. Remember that SageMaker Studio is their IDE in the web console, where you write your code in an IPython notebook. You can share that code with your team, with other people on different AWS accounts, and you can collaboratively edit and manage your model, its training, and all these things. Finally, another huge benefit of using SageMaker to train your models, as opposed to localhost, is that you can do distributed, parallelized training of your model across multiple EC2 instances, so you can train your model faster than on localhost, and you can use specialized chips on SageMaker. AWS now has EC2 instances with chips they've handcrafted themselves, their own types of CPUs and GPUs specially designed for machine learning model training, as well as chips for machine learning model inference, for when you actually get to that step: inference, the model deployment.
In fact, I just opened the SageMaker website so I can look at the features again as I do this episode, and right at the top there's a banner: they just released the EC2 DL1 instance, which delivers up to 40% better price-performance for training deep learning models. Remember way back when I talked about GCP's special sauce being their TPUs, tensor processing units?
Well, AWS certainly did not sit on their hands while Google was creating those chips. They did not twiddle their thumbs. They created a whole bunch of chips dedicated to fast, cheap machine learning model training and, separately, fast, cheap machine learning model inference. So, a whole slew of benefits: developing your model in an IPython notebook on SageMaker Studio, then kicking off a training job from that notebook, which may make use of specialized chips, handle distributed training across multiple EC2 instances, and consume from the data pipelining you've already set up.
One final benefit to mention is that you don't have to set up a local environment for your machine learning development. You can keep your SageMaker Studio project on SageMaker, and your code is then ready for you day to day. If your computer dies, you can switch to a new computer and your environment is still up in the cloud, so you don't have to spend time setting up a local environment or transferring to a different one.
If you're on a different computer, in a different workspace, whatever. Now, a little bit more about SageMaker Studio and the Jupyter notebook you'll be spinning up. Your SageMaker Studio project is a project environment that gets created for your account, and you reuse that environment for development of your model day to day, at your workplace or at home.
And that Studio project is not actually going to be using the resources you'll use when you kick off your SageMaker jobs; it's just your development environment. So you can spin this thing up on a t1.micro or similarly free or cheap EC2 instance to host your project while you're writing code in your IPython notebook.
And when you get to the point where you're actually going to submit a training job, using a script in boto3 or the SageMaker tooling, it won't run in your Studio project environment. It will actually spin up separate EC2 instances to run your machine learning model training, and then report those results back to Studio, stuff them into charts and graphs and CloudWatch logs and alerts and all these things.
So the environment of your project itself is either free or cheap, running in SageMaker Studio; it's not actually using a lot of resources on AWS. It's only when you kick off SageMaker jobs like train and infer that it will actually be running on EC2 instances. Now, there are various ways to write and train your model in SageMaker, different approaches like script mode and bring-your-own-model.
Basically, you can use SageMaker's default environments, which come with Amazon Linux 2 and a handful of data science tools like TensorFlow, Keras, PyTorch, and so on. These are pre-installed toolchains, including the CUDA and cuDNN installation at the operating-system level. Or you can bring your own Dockerfile, and your training code will run inside that container environment. Or you can use one of their prefab environments and specify a small amount of extra requirements by way of a pip requirements.txt file, so that it uses a prefab SageMaker environment meant for TensorFlow, for example, but still installs the small number of additional packages you need. So there's a lot of flexibility, from hitting the ground running with the default environments they provide, to taking it to the next level of customization by bringing your own Docker container, and everything in between, an example of which is just providing a requirements.txt file for installing miscellaneous pip packages. A minimal script-mode example is sketched below.
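Roughly, a script-mode training job with the SageMaker Python SDK looks something like this sketch (paths, instance type, and version strings are just placeholders; a requirements.txt placed in the source directory gets pip-installed into the prefab container before training starts):

```python
import sagemaker
from sagemaker.tensorflow import TensorFlow

role = sagemaker.get_execution_role()  # IAM role when running inside Studio

estimator = TensorFlow(
    entry_point="train.py",         # your training script (script mode)
    source_dir="src",               # folder containing train.py and requirements.txt
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",  # GPU training instance
    framework_version="2.4.1",      # illustrative versions; pick ones SageMaker supports
    py_version="py37",
)

# Kicks off training on separate EC2 instances, not inside your Studio environment.
estimator.fit({"train": "s3://my-bucket/train"})
```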
In that gray area, that in-between area, there's also something called SageMaker JumpStart. JumpStart gives you any number of models out of the box to choose from: for example, Hugging Face Transformers models, some of their summarization models, their question-answering models, some convolutional neural networks, ResNet, and various other computer vision models.
You can go to SageMaker JumpStart, click one of these pre-set-up environments, a model that's pre-trained on ImageNet or the COCO image dataset, whatever the case may be, and it will generate a whole bunch of code for you that's optimized to run within the SageMaker environment, based on their hardware and the packages available at the environment level.
It gives you sample scripts for lopping off the head of the model and fine-tuning it on your own data; then you can run your training job and deploy it when you're done. And it's highly recommended that, if you're not going to use an off-the-shelf SageMaker solution like Autopilot, or one of the cloud-native machine learning endpoints I'll talk about in a bit, like Rekognition and Comprehend, then rather than writing your own model from scratch, you start with a JumpStart project on SageMaker. It gives you a head start on your project, whether it's computer vision or natural language processing, because the SageMaker tooling, the environment setup, the operating system, the packages that are installed, CUDA, cuDNN, the versions of things and such, all come dialed in with a JumpStart project, so you don't have to go through the trial and error of finding the right packages and which versions are compatible or incompatible at the operating-system level and so on. So if you're going to write a training job from scratch, but it's a somewhat common machine learning situation like computer vision or natural language processing, use one of their JumpStart projects to get started. Okay. So when it comes to training your model and then deploying your model: you can write it from scratch.
You can bring your own Docker container. You can get a sample project set up for you by way of JumpStart. Or you can bypass this whole process and use Autopilot, which will train and deploy your model for you based on your data. Next: SageMaker Experiments. SageMaker Experiments is for hyperparameter optimization.
You can train your model against different hyperparameters. In the past I've mentioned tools like Optuna or Hyperopt. Well, SageMaker provides a bunch of recommended prefab hyperparameters to try against various models, and you can also specify your own hyperparameters to try, and it will kick off a Bayesian-optimization hyperparameter search job, running multiple model training instances.
You designate how many instances to run in parallel and how many hyperparameter training jobs to try in total. You might want models to run in parallel for parallelization, say five or ten models at once, but you don't want to do your hyperparameter optimization all in parallel all at once, because the way Bayesian optimization works is that it looks at prior runs, sees what worked and what didn't, and uses that to inform the next trials it tries. So if you run five or ten in parallel, it now has ten different random searches in its back pocket; when it's done with those, it looks at all ten and says, okay, based on this, let's try some other angles, up until, let's say, the hundred or two hundred trials you want to try. SageMaker will handle all the tooling for kicking off these experiments, giving you monitoring and charts and graphs about what seems to work and what doesn't, which features are important, and which hyperparameters outperform other hyperparameters, and then it locks these into a repository you can reuse in the future. Let's say you add new data. You don't want to change the data structure, you don't want to add new columns, because that will require a new hyperparameter-optimization setup. But if your data structure is the same, and maybe your model is slightly tweaked and the data is augmented, you have more data or less data or whatever, you can then pick up where you left off through SageMaker Experiments and continue your hyperparameter optimization.
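To make those knobs concrete, here's a minimal sketch with the SageMaker Python SDK's hyperparameter tuner (illustrative only; it assumes `estimator` is an estimator you've already configured, for example the built-in XGBoost algorithm, and that the objective metric name matches what your algorithm actually emits):

```python
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

tuner = HyperparameterTuner(
    estimator=estimator,                     # an already-configured estimator (assumption)
    objective_metric_name="validation:auc",  # metric the algorithm reports
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
    },
    strategy="Bayesian",    # prior trials inform the next ones
    max_jobs=100,           # total trials
    max_parallel_jobs=10,   # trials running at once
)

tuner.fit({"train": "s3://my-bucket/train", "validation": "s3://my-bucket/val"})
```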
Finally, let's get into SageMaker deployments. Eventually you have a trained model and you want to deploy it to the cloud. You can deploy it to a REST endpoint, and it's all just so easy.
You can kick it off by running a script with boto3, or by doing some clicking around in the AWS console, and it will create a REST endpoint for you that hosts your model. Then all you have to do is send a JSON object to that REST endpoint with whatever you want to run inference on, and it will send back a response with the predictions, the inferences, based on the data you sent it.
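Calling such an endpoint from Python might look like this minimal sketch (the endpoint name and payload shape are made up for illustration):

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="my-model-endpoint",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"text": "Journal entry to summarize..."}),
)

predictions = json.loads(response["Body"].read())  # the model's inference result
```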
What's important about the way SageMaker handles deployments is that it's scalable. You specify the type of EC2 instances that you wanna run these models on, and it will scale up and scale down as necessary for handling traffic as traffic increases or decreases. Further, you can use one of these optimized chips that I mentioned previously to reduce costs and increase throughput or increase inference speed.
And in the case of inference, there's a dedicated chip and EC2 instance: the chip is called Inferentia, and the instances are tagged as the Inf1 instance types. You can also, rather than specifying one of those instance types, pick a smaller instance, say if you're not using a lot of CPU or RAM, and use Elastic Inference to attach a GPU-based inference accelerator to it, so you have more fine-grained control over the type of environment you set up.
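For instance, with the SageMaker Python SDK you might deploy onto a modest CPU instance and attach an Elastic Inference accelerator; a rough sketch, assuming `model` is a SageMaker Model you've already built, and with instance and accelerator types chosen just as examples:

```python
# `model` is a sagemaker.model.Model (or a trained estimator's model) built earlier.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.large",        # small, cheap CPU instance
    accelerator_type="ml.eia2.medium",  # Elastic Inference accelerator attached to it
)
```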
You can cut costs by being really stingy with the way you set up your environment, specifying the CPU and RAM and the type of inference chip or GPU that gets attached to that instance. Now, two episodes ago I said that one bummer about deploying your SageMaker model to a REST endpoint is that it's always on; there's no way to scale to zero.
Let's say you don't get any traffic during a day, or you only run a machine learning inference job once every hour or two. Well, you don't have to deploy your model; you don't have to use a SageMaker endpoint deployment. You can use something called Batch Transform. Now, the word "transform" sounds like you're transforming your data.
And indeed, a common use case of batch transform is to load up a whole bunch of data from your pipeline, run a bunch of transformations on it, and kick off a sequence of steps in a pipeline. But you don't have to use it that way. You can kick off a single SageMaker batch transform job for inference, using an Inferentia chip, that exits once it's done running that inference. You'll either return the result of that inference call from batch transform to your calling script, or stuff it away in a database somewhere, or call a Lambda function from there, and then maybe CloudWatch can send it off somewhere else using SQS or SNS, however you want to handle it. But when I was talking about Gnothi: since I have so few users that machine learning jobs are called maybe once an hour or once every half hour, I don't want to deploy a SageMaker model to the cloud on a GPU. I want to just kick off a quick inference job ad hoc, as needed, PRN, and I can do this by way of SageMaker batch transform jobs. It's very similar to using AWS Batch. AWS Batch is a dedicated service for running one-off jobs using a Docker container on whatever EC2 instance you want.
It's very similar to that, but it's all tied into the SageMaker tooling, so you get all of the other features I've mentioned before. Now, what's so cool about all this: say you're writing a Python script. You have a server on AWS Lambda, your application server, and a web client in React sending HTTP requests to it. Your server is running on AWS Lambda, all behind API Gateway, so it's a REST server, and that server is written in Python. You want to kick off a machine learning job? Well, you construct a boto3 SageMaker client in Python, and then you kick off a batch transform job, or a training job, or whatever, right there in Python, and it does all of this MLOps stuff for you in the background. So the way you might write a few lines of boto3 code to kick off a machine learning inference task is by calling the batch transform API, and in the arguments of that call you specify the script you want to run for this inference job.
It will send that script to the inference engine, the ECR Docker image you're going to run this inference job inside, along with the EC2 instance type and all these things. You run that code and it can return a result back into your Python script. So it's as if you're running machine learning code within your Python script, but hidden from you is that it's actually running in the cloud on AWS SageMaker before it returns the result back to your script, almost as if it were a background job, like you had called os.popen or kicked off some background script on localhost and gotten the result back. Well, with SageMaker you kick the work off to the cloud and get your result back in your calling script. Or, if it's a long-running job, if the inference may take a while, then maybe you want the result stuffed into SQS or SNS or a database somewhere, and you can handle that in the inference Python file itself.
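As a hedged sketch of what that might look like from a Lambda handler using boto3 (the job, model, bucket, and instance names below are all made up, and the model and its container are assumed to already exist in SageMaker):

```python
import time
import boto3

sagemaker = boto3.client("sagemaker")

def handler(event, context):
    job_name = f"summarize-{int(time.time())}"
    sagemaker.create_transform_job(
        TransformJobName=job_name,
        ModelName="summarizer-model",  # a model already registered in SageMaker
        TransformInput={
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/inference-input/",
            }},
            "ContentType": "application/json",
        },
        TransformOutput={"S3OutputPath": "s3://my-bucket/inference-output/"},
        TransformResources={"InstanceType": "ml.c5.xlarge", "InstanceCount": 1},
    )
    # The job spins up, runs inference, writes results to S3, and shuts down.
    return {"job": job_name}
```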
Okay. The next feature in the deployment section listed on SageMaker's website is called SageMaker Pipelines.
A lot of that we've already covered so far in pipelining the various steps of your tasks: the data stuff in Data Wrangler and Feature Store, managing the data labeling with Ground Truth, keeping an eye on that data with Clarify, kicking it off to the machine learning training stuff, Autopilot, JumpStart, bring-your-own-Dockerfile, script mode, all in a SageMaker Studio IPython notebook, and then everything gets deployed.
And the dedicated Pipelines feature is sort of for managing the steps of this pipeline. It also has some other functionality, for example CI/CD: continuous integration / continuous delivery (or continuous deployment). This is a common concept in DevOps and web app development. When you're writing your server code or your client code as a web developer, you commit your code and push that commit up as a pull request on GitHub, or to your CodeCommit repository on AWS. That commit will then run through a series of unit tests on a backend environment; if you're doing this on AWS, you might use something like CodeDeploy to run this, or maybe Travis CI or CircleCI or whatever, on an AWS EC2 instance.
If those unit tests pass, it may kick off a deployment into a staging or production environment. And if the tests fail, you'll want it to email the administrator or the owner of the repository, say those tests failed, and not kick off a deployment. Well, built into the SageMaker Pipelines tooling is handling of CI/CD for MLOps: not just for the code deployment, but also for the training of your models and then the deploying of those models, if they pass whatever tests you want them to pass. And in machine learning, a unit test is a different kind of concept than it is in web development. Yes, you can actually unit test your machine learning model code in Python, and you do want to run unit tests against your Python code, but maybe that's a little less interesting than ensuring that your model is not drifting, or that bias is not being introduced, or that your objective metrics meet a certain threshold: you want greater than 80 percent accuracy, and whatnot.
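As a rough illustration of that kind of gate, here's a hedged boto3 sketch that reads a finished training job's final metrics and only promotes the model when it clears a threshold (the job name, metric name, and SNS topic are hypothetical):

```python
import boto3

sm = boto3.client("sagemaker")
job = sm.describe_training_job(TrainingJobName="recommender-train-2021-10-01")  # hypothetical
metrics = {m["MetricName"]: m["Value"] for m in job.get("FinalMetricDataList", [])}

if metrics.get("validation:accuracy", 0.0) >= 0.80:
    pass  # promote: create the model / endpoint config / endpoint, or approve it in the registry
else:
    boto3.client("sns").publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:ml-alerts",  # hypothetical topic
        Message=f"Model below threshold ({metrics}); not deploying.",
    )
```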
And so the CI/CD component of the Pipelines toolchain keeps track of these aspects of a machine learning model post-training, in order to determine whether the model is a candidate for deployment to the cloud. It won't just be using GitOps, Git operations; it won't just be using a notification that a Git commit has been pushed. It will also potentially use notifications that new data has been added to your dataset, to your data lake: a new CSV has been added to S3, or a whole bunch of new rows have been added to your RDS database. That will kick off a CloudWatch notification, and that notification will inform the Pipelines feature of SageMaker, the CI/CD suite, which will then run a training job and check those metrics. Everything's good? Maybe we automate a deployment. Everything's bad? Email an administrator. The next feature is called SageMaker Model Monitor, and we'll skip past it, because most of the tooling here is stuff I've discussed already in the past:
monitoring model bias, drift, data changes over time, and all these things. The next feature listed is the SageMaker Kubernetes integration. Okay, I'm going to talk about Kubernetes in a later episode, but let me do a small bit of distinguishing here. We're talking about using the whole SageMaker stack in this episode and the last episode: using SageMaker for everything.
Using AWS for everything. Well, there's a universal DevOps framework out there called Kubernetes. It was developed by Google. Kubernetes orchestrates your Docker containers into a whole bunch of microservices: that includes your database, your job queue, a bunch of servers, even the client. It's an orchestration service using Dockerfiles for deploying your tech stack to the cloud, but it doesn't assume you're using AWS. In fact, since it was developed by Google, its first-class citizen is kind of GCP, Google Cloud Platform, but all of the cloud providers, Microsoft Azure, Amazon AWS, and GCP, support Kubernetes.
So Kubernetes allows you to orchestrate your tech stack using Dockerfiles in a universal fashion that's compatible across all cloud providers, and in fact can run on your localhost. In that way, it's kind of mutually exclusive with the way I'm suggesting you use SageMaker, which is to sort of trust-fall onto the entire AWS tooling.
So if you're going to use AWS for everything, you'd use SQS for your message queue, SNS for notifications, CloudWatch for logging, Lambda for your server, and S3 and CloudFront for your client. That's using all of the AWS tooling, their services. Alternatively, you could use Kubernetes, and instead of using those backend services you'd use Dockerfiles and orchestrate the deployment of those containers to EC2 instances on AWS. Rather than the AWS-native services, you'd be using services running in Docker containers on AWS EC2 instances, all orchestrated by Kubernetes. AWS has a Kubernetes orchestration service called EKS. There are a lot of ways you can do MLOps. Of course, we're talking about SageMaker here, but there are competing solutions on GCP and Azure.
Well, there are universal machine learning pipelining, training, and deployment solutions out there. One of the most popular is called Kubeflow, K-U-B-E flow, as in the flow of data on Kubernetes, and SageMaker is compatible with it by way of the SageMaker Kubernetes integration. There are some other popular universal MLOps orchestration services out there, one called MLflow, and I'll talk about all of this in future episodes. Then, finally, the last service listed on SageMaker is called SageMaker Neo, N-E-O. Now, you can run your machine learning models in the cloud using the SageMaker deploy feature, a REST endpoint, or boto3 calls. Or you can package up that model on SageMaker and have it optimized and exported to a chipset or hardware environment of your choice, and have it deployed on that hardware.
So let's say you want to run your face recognition model on your phone, on the front-facing camera of some mobile app you're deploying to Google Play or the Apple App Store. You make an app, you have a face recognition model as part of it, and you want that thing to be blazing fast, so it's consistently detecting a face from the front-facing camera. Well, you don't want this machine learning model running in the cloud, where you're making a REST request from the mobile device to the cloud once every 100 or 200 milliseconds; there's too much latency and too much strain on your servers. So SageMaker Neo has tooling that packages your model to be deployed to the mobile app.
You specify the hardware you want to run this on, maybe an iOS device and an Android device, and it will package up your model and optimize it. Remember when I mentioned ONNX, O-N-N-X, a model-optimization framework? Well, SageMaker has built their own model-optimization tooling, and it's called SageMaker Neo. It will export your model so slimmed down that it can run efficiently on a mobile device. They also support hardware like cameras. It's very common to run your image recognition or object detection models on a camera at the edge; edge means on the device. It's very common for cameras to do object detection, bounding boxes, intrusion alerts.
They're looking for intruders. Okay? You have a camera sitting outside your house, and if it sees somebody walking up to your door that it doesn't recognize, and it doesn't look like a UPS person, it may alert you with a mobile notification, and you want that computer vision model to be running on the camera.
SageMaker Neo can optimize your model and export it so that it can run on camera hardware. You specify the chip it's going to be running on in that camera, and then, whether it's your mobile app or your camera, you tie your physical device or your app to Neo and it will sync the model to the device as new model deployments become available.
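Under the hood, a Neo compilation job might be kicked off with boto3 roughly like this hedged sketch (the job name, role, S3 paths, framework, input shape, and target device are all placeholders you'd swap for your own):

```python
import boto3

sm = boto3.client("sagemaker")
sm.create_compilation_job(
    CompilationJobName="vision-neo-1",                        # hypothetical
    RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",   # hypothetical
    InputConfig={
        "S3Uri": "s3://my-bucket/models/model.tar.gz",
        "DataInputConfig": '{"input": [1, 3, 224, 224]}',     # model's input tensor shape
        "Framework": "PYTORCH",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-bucket/compiled/",
        "TargetDevice": "jetson_nano",                         # or another edge/mobile target
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```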
So you train your model using SageMaker Autopilot, a bring-your-own model, a JumpStart model, whatever. The last phase of your pipeline is Neo: after the model has been trained, it runs through Neo to optimize it for some device. You specify that device, and then you specify where it's actually being deployed.
Maybe on the mobile app: inside the code, you tie it up using the AWS SDK to the ARN of this Neo deployment. The mobile app and the Neo service will communicate with each other; Neo will download a packaged-up, optimized model to that device, making sure it runs well on that device's hardware. And any time a new model becomes available, because you retrained your model, maybe you got some CloudWatch notification with Model Monitor and it kicked off a new training job because new data became available, whatever, a new model is trained, Neo notices that, repackages an optimized model, and syncs it down to the mobile device. Very cool tooling. And remember, in the past I said: what if you wanted to skip all this SageMaker stuff and just run your machine learning model on an AWS Lambda function? Lambda is really cheap.
Now, the hardware is limited: I think there's a 10-gigabyte RAM cap, and I don't know what the CPU cap is, but these things are not intended to run your usual machine learning models, especially not heavy stuff like Hugging Face Transformers models. Lambda is meant to run quick snippets of Python or Node.js: great for deploying your server code at scale or running one-off function calls in your AWS tech stack, less great for machine learning, just due to the heaviness of inference jobs. But that doesn't have to be the case. I mentioned previously that you could use ONNX to export an optimized machine learning model and then put that on Lambda, which may then fit within the hardware constraints of the Lambda function. But instead of using ONNX, you can use Neo. So at the end of your trained-model step of the pipeline, you can export your model using Neo, optimized for Lambda's hardware, and now you can run that model on AWS Lambda, and it will probably fit within the hardware constraints of that Lambda function.
So that is the overview of SageMaker. It's a pipelining toolset on AWS that lets you take in your data, transform your data, train your model, monitor your model, deploy your model, and a whole bunch of bells and whistles in between. Now SageMaker is done; we're done with SageMaker. Next, we're going to talk about AWS's cloud-native machine learning offerings, and I'm not going to talk about all of these.
I'm going to leave it to you to go to AWS's website and look at the services available in the machine learning category. If I'm in my AWS console and I click the services dropdown, under the machine learning category there's a big old list of services. Before I list some of these services, let me tell you what a cloud-native service is.
SageMaker is intended for you training and deploying your own models. Now, some of those models may be off the shelf, by way of SageMaker Autopilot; some might not be off the shelf, but SageMaker may have hand-held you through the process, giving you a head start like JumpStart does. But these are intended for you to write your own model, or use somebody else's model, and then deploy it to the cloud.
This is what's called managing a service: you're managing a service, or AWS is managing a service on your behalf. If you're just calling some AWS service, you're not actually hosting any instances, whether ephemeral, like a batch transform job, or permanent, like a REST endpoint.
If you're not actually running any instances in the cloud, and you're simply making a REST call, a quick fire-and-forget call against some AWS service, we call this cloud-native. So, for example, AWS has a cloud-native service called Amazon Polly, P-O-L-L-Y. It's a text-to-speech service.
You send it text as a REST call against the REST endpoint; you put your Amazon key in the headers of the call, the text in the body of a POST request or something, and it returns to you an MP3 file of the text you submitted to that endpoint. So rather than writing your own code to do text-to-speech, you should instead use Amazon Polly. You send a REST request or make a boto3 call, submitting the text you want turned into speech. Let's say you wanted to convert an entire book into an audiobook: you just submit the TXT file up to the service, and out comes an MP3. Maybe it stores it on S3, or maybe it returns a streaming file that you then handle in JavaScript or Python. A quick sketch follows.
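Here's a minimal boto3 sketch of that call (the voice, filename, and text are just examples; for a whole book you would more likely use Polly's asynchronous synthesis task, which writes the audio straight to S3):

```python
import boto3

polly = boto3.client("polly")

resp = polly.synthesize_speech(
    Text="Know thyself.",   # example text
    OutputFormat="mp3",
    VoiceId="Joanna",       # one of Polly's built-in voices
)

with open("speech.mp3", "wb") as f:
    f.write(resp["AudioStream"].read())  # the returned audio stream
```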
That's one example: Polly for text-to-speech. But there are a whole bunch of services here, so I'm going to go top to bottom and list them, and then do a little coverage on some of the ones I'm familiar with: Augmented AI, CodeGuru, DevOps Guru, Comprehend, Forecast, Fraud Detector, Kendra, Lex, Personalize, Polly, Rekognition, Textract, Transcribe, Translate, DeepComposer, DeepLens, DeepRacer, Panorama, Monitron, HealthLake, Lookout for Vision, Lookout for Equipment, Lookout for Metrics. That's a lot of services. These are all services where a pre-trained model is hosted in the cloud and you make REST calls against it, so you don't have to deploy your own model.
Thus you save time and money, because you're not going to be deploying a model to a REST endpoint. And these services improve with time: Amazon is constantly retraining these models, swapping out some old model for the latest and greatest. Some white paper comes out and says we've improved on the Transformers architecture; they experiment with that new architecture; yes, it looks a lot better than our old model; out with the old, in with the new. So you don't have to maintain your model in the cloud, and you don't have to keep tabs on the latest and greatest technology in very common machine learning use cases. They'll do all this for you. Let's talk about some of the services I'm familiar with: Amazon Comprehend.
Comprehend is their NLP tooling. There's a whole bunch of NLP tasks built into the Comprehend service that you can perform on a paragraph or a document. Listed here are some examples: syntax analysis; topic modeling of your documents, so you can label documents and cluster them into various topics; sentiment analysis, determining whether the sentiment of a document or phrase is positive or negative; named entity recognition, pulling out named entities; document classification; all these things. So before you decide to either bring a pre-trained Hugging Face Transformers model to SageMaker and deploy it to the cloud, or train your own Hugging Face Transformers model: go to AWS Comprehend, look at all the features available on that service, and see if, instead of bringing your own model to the cloud, you can just use one of these prefab services. Save yourself some time and money, and keep up to date with the latest and greatest in machine learning, by using AWS cloud-native functions instead of bringing your own model.
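Calling a couple of those capabilities from Python is about as simple as it gets; a minimal sketch (the text is made up):

```python
import boto3

comprehend = boto3.client("comprehend")
text = "I slept terribly again, and work stress is piling up."

sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
entities = comprehend.detect_entities(Text=text, LanguageCode="en")

print(sentiment["Sentiment"])                     # e.g. NEGATIVE
print([e["Text"] for e in entities["Entities"]])  # any detected named entities
```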
The next one is Forecast. Forecast is for time series analysis of historic data. Maybe you want to do a budget forecast or a cost forecast, or you're doing stock market stuff or weather prediction. Before writing your own recurrent neural network, see if instead you can use AWS Forecast as a cloud-native service.
Fraud Detector: AWS offers fraud detection out of the box. Lex: Lex is a chatbot service. You can set up a whole system, with a graphical user interface, for building a conversation flow with a bot. So if you wanted a mobile app that chats with you, or a customer-service triaging chat feature on your website, where before users get kicked off to customer service it walks them through some quick dialogue flow to make sure they've tried all the 101 stuff first: Lex. Lex is a chatbot. Personalize: Amazon Personalize is for personalized recommendations. So if you're trying to recommend products, or articles, or movies, or music, based on what this user has listened to or purchased in the past and what other users have listened to or purchased in the past, it will use machine learning to generate personalized recommendations.
Textract, T-E-X-T-R-A-C-T: OCR of PDFs. Let's say you have a tax document, or a receipt, or a contract, some PDF with a bunch of fields, and those fields are all over the place, so you wouldn't be able to use simple OCR to transcribe this document into just a blob of text. Instead, you want to actually pull out the fields and their values in text format, and Textract will take this PDF or PNG file and determine what the fields are and what the values for those fields are, using optical character recognition, OCR.
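A hedged boto3 sketch of pulling out form fields rather than a raw text blob (the bucket and file names are made up):

```python
import boto3

textract = boto3.client("textract")

resp = textract.analyze_document(
    Document={"S3Object": {"Bucket": "my-bucket", "Name": "receipt.png"}},
    FeatureTypes=["FORMS"],  # ask for key/value pairs, not just raw text lines
)

# KEY_VALUE_SET blocks hold the detected field names and their values.
kv_blocks = [b for b in resp["Blocks"] if b["BlockType"] == "KEY_VALUE_SET"]
```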
It's actually pretty accurate; I've used it in the past, and it's pretty handy. Translate: translate languages, English to Spanish, Spanish to Italian, all that stuff. So you can use Google Translate's API, or you can use AWS Translate. Panorama: Panorama is a whole suite of tooling around cameras on your premises.
If I look at the use cases here, it says: optimize in-store experiences, gather critical supply chain inputs, improve restaurant operations. So it's a suite of tooling for computer vision at the edge, on cameras, for your shop. I'm not going to go into all of these services, so let's just stop there. Now, that was a lot of stuff to cover.
We covered SageMaker and then some of these cloud-native services. So, to bring this all together, let me discuss how I might use this for Gnothi. I'm currently in the process of moving everything in Gnothi over to SageMaker. Currently, Gnothi trains on a single instance running on AWS Batch.
It trains and it runs inference jobs. There are a handful of Hugging Face Transformers models for NLP: summarization; question answering; document clustering for the themes feature; text similarity, used for the book recommendations feature. I'll also be deploying a groups feature in the near future, so you can join mental health groups with people of like mind. It will take your journal entries, determine what groups you're similar to based on things you've said in the past, and suggest you join those groups, using cosine similarity of the document embeddings. Then I'm currently using XGBoost for the fields feature, where you can track certain fields in your day-to-day life: alcohol consumption, sleep quality, work quality, all these things. It will tell you what things are affecting what other things, and which fields, maybe sleep, for example, have the highest impact on your life as a whole. And the recommender system is currently a handcrafted neural network for book recommendations. In this process of moving everything from handcrafted code to SageMaker, here's how I'm going to do it.
The first thing you do is ask yourself: is there already a cloud-native AWS service I can use for this feature, so I don't have to deploy my own machine learning model? That means cheaper and easier, and as the technology and models improve, so too does the backend, and therefore your app. Well, indeed, for most of the NLP stuff there are already cloud-native NLP offerings.
My bread and butter here is probably going to be AWS Comprehend. Comprehend has document topic modeling. One thing I do is take all of your journal entries and cluster them into themes, categories, common recurring patterns. Well, I can use Comprehend to do that for me: I just send up a bunch of documents and it will auto-cluster them using topic modeling. So I will gut that custom section of Python code from my project and defer instead to kicking this off as a Comprehend job. Another thing offered on Comprehend is question answering: you ask a question, it takes in all of those journal entries as documents, and it answers the question for you. Currently, this is a pre-trained Hugging Face Transformers model that I'm running myself in Python on AWS Batch. I'm going to gut that code and instead kick it off as a Comprehend service call. Question answering, summarization... okay, well, I was poking around AWS Comprehend and I didn't see summarization there, so I might have to keep that custom.
So now we go to SageMaker. How am I going to handle this? Well, currently I'm storing everything in an RDS database, so I might make that database my data store or data lake, which then gets ingested into Data Wrangler and Feature Store. These are coming in as text, so there's not a lot of transformation I need to do; we're just going to pipeline it onto the next step, the next phase, and deploy this as a SageMaker model to the cloud. Now, I also don't need to train this model; it's a pre-trained model coming straight from Hugging Face Transformers. So I can skip the training phase and just deploy this thing. I can either deploy it as a REST endpoint, so it's always available and therefore has low latency and high throughput, or, since these summarization jobs are actually somewhat rare, I'm instead going to be kicking off a SageMaker batch transform job, which will run inference using an Inferentia chip, with Hugging Face Transformers deployed as a Docker container to SageMaker. Run that script, boom, we have our summarization, stuff it away into the database, we're done. What about the document embeddings? This one's a little bit more interesting.
When you write a journal entry on Gnothi and click save, one thing it does is embed the entire document into a vector of dimensions 1 x 768. Those are all floating-point values; it's essentially a dot in a sphere. And now that you have that point, that vector, as you embed other journal entries, or other users' journal entries, or book blurbs, you can find books that are similar to the stuff you talk about, so that I can make book recommendations. Any book that is near your dot for that journal entry can get recommended to you, and that similarity is by way of cosine similarity. Or, as I'm creating the groups feature, where you may want to find people of like mind: we take their journal entries, turn those into dots, and average their vectors together so there's a single vector that essentially represents that user; then we average all the users within a group together so there's an average vector representing users of a group. Then we cosine-similarity your averaged vector against that group's to determine which groups have users most similar to you, based on the types of things you talk about.
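For the flavor of it, a minimal sketch with the sentence-transformers library (the model name here is a small 384-dimension example rather than the 768-dimension one described above, and the texts are made up):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small example model

entry = model.encode("Work stress is keeping me up at night.")
blurbs = model.encode([
    "A practical guide to managing stress and sleeping better.",
    "A cookbook of thirty-minute pasta dinners.",
])

scores = util.cos_sim(entry, blurbs)  # cosine similarity of the entry to each book blurb
print(scores)                         # the stress-management blurb should score higher
```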
I'm using a library called UKP Lab's sentence-transformers. How will I go about this? Well, Comprehend does not offer tooling for embedding your documents into vector space. However, something I do want to explore is AWS's managed Elasticsearch service: either I can store these documents there, since Elasticsearch has a whole bunch of tooling for similarity matching of documents to other documents, including, and don't quote me on this, this is something I've heard and am going to research, some tooling for embedding documents into vector space. That would let me scale the storage and the similarity-search functionality of the document embeddings, which is something of an Achilles' heel of Gnothi at present. These vectors, one per journal entry, are very large to store, so deferring to that service may offer me a lot in the way of scalability, but I haven't quite done the research yet. I may be wrong about this service.
So if that doesn't work out, if I can't do this all by way of Elasticsearch, what I will do is this: take all my embeddings and create a SageMaker pipeline. The entry point is going to be the journal entries; the raw text will go into Data Wrangler. One step of the feature-engineering process for Feature Store is going to be to transform those journal entries into an embedding.
That embedding step will use UKP Lab's sentence-transformers. So this will be one step of a SageMaker pipeline: take the raw text and transform it into a feature, and that feature is going to be the embeddings. That job will run as a SageMaker batch transform job. Now, I don't actually have to kick off that job from my app-server script per se.
Instead, I can have CloudWatch monitoring the journal entries getting put into the RDS database. Or, an alternative: when a journal entry gets put into the RDS database, it also gets saved as a text file in S3, and S3 kicks off a notification to ingest it into the data pipeline. That step of the pipeline will do the embedding of the document and save that embedding as a file in S3, whether that's Parquet or CSV. Then the next step of the pipeline, which gets kicked off by some other notification during this process, will take that embedding and compare it, using cosine similarity, to other embeddings. So one step of the pipeline is to embed the journal entry; the next step is to take that embedding and compare it to a bunch of other embeddings to see what books or what groups we want to recommend to this user. Now, each of these steps will effectively be a SageMaker batch transform job. But rather than running everything in a single Python script, each step operates independently and only listens for the changes in the data pipeline it's concerned about.
So the embedding SageMaker job is listening for journal entries that get put into the database, or that get put on S3. When it sees a new one, it kicks off the job that runs UKP Lab's sentence-transformers, embeds the document, and puts it somewhere. Now, let's say ten users come online and all submit journal entries at once: ten SageMaker instances will then come online and embed them separately. So this is scalable microservices. Then, separately, there's a SageMaker batch transform job listening for embeddings that become available, whether those get saved to a database or to S3 as CSV or Parquet files, and, scalably, it does its job. Now, you'll note nothing here is training yet; so far, everything I'm talking about in my toolchain is pre-trained models. How about a training job? Well, like I said, the fields feature of Gnothi will look at your fields and see how they impact each other. Consider it tabular data.
Eventually, I'm actually going to move this feature onto the causal-modeling machine learning models available out there, via a library called DoWhy, which I'll do a separate episode on. But for now I'm just working with it as tabular data, using XGBoost. And how do I determine the feature importances of one field upon another?
How is it that I determine that alcohol has the highest impact on your sleep? Well, I run the whole thing through an XGBoost model: sleep quality is the label, and all the other fields are the features. I train one XGBoost model per field, whether that's sleep quality, productivity, quality of work, and so on, and I pull two things out of that model. The first is the feature importances, which determine how much impact the other fields have on the field under consideration. So for sleep quality, I train an XGBoost model where sleep quality is the label and everything else is a feature, and I rank all the other fields in order of feature importance using XGBoost's feature-importance capability. Then I also use that XGBoost model to predict tomorrow's values for those fields, which is actually kind of a cool feature; you should check it out. So the XGBoost model is tracking my sleep quality, telling me what things impact it, whether that's caffeine intake or alcohol intake, et cetera, and also predicting what my sleep quality is going to be today and tomorrow. I don't know what we think of that; I kind of like it.
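A toy sketch of that per-field setup (column names and values are made up; the real preprocessing, imputation, and model settings will differ):

```python
import pandas as pd
from xgboost import XGBRegressor

# One row per day of tracked fields (toy data).
df = pd.DataFrame({
    "sleep_quality": [3, 4, 2, 5, 3, 4],
    "alcohol":       [2, 0, 3, 0, 1, 0],
    "caffeine":      [1, 1, 2, 0, 2, 1],
})

target = "sleep_quality"  # train one model like this per tracked field
X, y = df.drop(columns=[target]), df[target]
model = XGBRegressor(n_estimators=100).fit(X, y)

# 1) Which fields most impact the field under consideration?
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)

# 2) A naive "tomorrow" prediction from the most recent row of features.
tomorrow = model.predict(X.tail(1))
print(importances, tomorrow)
```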
So what am I going to do with this? This all exists currently in a Python file running XGBoost in a Docker container. Well, I'm going to send this all to Autopilot. I'm going to take your fields, coming from the RDS database, and pipe them right on through to Autopilot. Autopilot is going to give me those feature importances and the predictive model out of the box, and it will do some of the necessary feature transformations that I'm currently doing manually in code with pandas.
One important one of which is imputation: there are a lot of days for which I don't record my sleep quality, and it's important to impute that smartly, so I'm going to defer to Autopilot to handle it on my behalf. Then, finally, the book recommender system. It uses cosine similarity to match your journal entries to relevant books.
So if I talk a lot about dealing with stress, it might recommend me books on stress management, cognitive behavioral therapy, and the like. Well, there's an upvote and downvote feature on Gnothi: you can thumb a book up or thumb it down. Yes, that was close to what I'm talking about, but I'm personally not interested in that book.
I'm using a neural network that pre-trains on the cosine similarity. So the first thing it learns in the training phase is simply the cosine function: it takes all your journal entries and all of the cosine-similar book recommendations, and those similarities become the pre-training data for that neural network. It simply learns the cosine function. Then I fine-tune that model, that neural network, on your own personal preferences, on your thumbs up and thumbs down. This concept I'm describing is called metric learning. There are better ways of handling metric learning; I'm not really using it the way it's supposed to be used, but it's working great for now and I'll deal with swapping out that model later.
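To make the two-phase idea concrete, here's a heavily hedged, minimal Keras sketch of that shape of model (this is not the app's actual code; the architecture, sizes, and random stand-in data are assumptions):

```python
import numpy as np
import tensorflow as tf

EMB = 768  # embedding size from the sentence encoder

# Score a (journal entry, book blurb) pair with a small MLP over the concatenated embeddings.
entry_in = tf.keras.Input(shape=(EMB,))
book_in = tf.keras.Input(shape=(EMB,))
x = tf.keras.layers.Concatenate()([entry_in, book_in])
x = tf.keras.layers.Dense(256, activation="relu")(x)
score = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model([entry_in, book_in], score)
model.compile(optimizer="adam", loss="mse")

# Phase 1: pre-train so the network first just learns the cosine function.
entries = np.random.randn(1024, EMB).astype("float32")
books = np.random.randn(1024, EMB).astype("float32")
cos = np.sum(entries * books, axis=1) / (
    np.linalg.norm(entries, axis=1) * np.linalg.norm(books, axis=1))
model.fit([entries, books], (cos + 1) / 2, epochs=2)  # rescale cosine from [-1, 1] to [0, 1]

# Phase 2: fine-tune on a user's thumbs up / thumbs down (1 / 0) labels, e.g.:
# model.fit([user_entries, rated_books], thumbs, epochs=2)
```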
But as you can see, this is a lot of custom code that I can't really replace with a call to an AWS cloud-native service; there's nothing out there for it, nor really Autopilot or JumpStart here. So instead I'm going to be using the real raw power of the SageMaker pipeline, which is why I saved this one for last.
What am I going to do? Well, in comes a journal entry, a user's journal entry as a text blob; it's in the database or it's a text file on S3. Data Wrangler pulls that data into the data pipeline. One of the steps of the feature-transformation phase for Feature Store is to embed that document into a vector using sentence-transformers, and that embedding is saved in the Feature Store. Therein shows some of the value of Data Wrangler and Feature Store, because that embedding is used elsewhere too: it was used for the cosine-similarity matching of users to groups and users to books. But now I also want to use that embedding, so I'm another consumer of the Feature Store at this phase in SageMaker, to train a neural network that's pre-trained on the cosine similarity function. It simply learns the cosine similarity function and is then fine-tuned on a user's preferences, their thumbs up or thumbs down on book recommendations. And this part, this phase of the pipeline, this SageMaker job, can get kicked off either by a new embedding becoming available in the data pipeline or by a new thumbs-up or thumbs-down action.
It will be listening for these events and will kick off a training job on SageMaker, scalably: I can have multiple of these training jobs running at once for different users. And that training job is going to plug into all the tooling of SageMaker, so it's going to monitor the model performance; it's going to keep an eye on data drift and bias; it's going to keep an eye on the model metrics: is the accuracy of learning that cosine similarity function within the healthy range, is it where I want it to be? And then, once we go to the fine-tuning phase and train it some more on that user's individual preferences, it will continue to monitor objective metrics.
Now, this will be very handy for me. Because if the model is pre-trained on the cosine similarity of journal entries to book blurbs, and it comes up with some metric score that seems to be fine and dandy according to Model Monitor and Model Debugger, yes, it seems to have accurately learned the cosine function, and then the user's preferences, thumbs up and thumbs down, actually take us way far away from the recommendations we were previously giving from that model, in other words, there was a huge drift, the previously trained model is so different from the fine-tuned model because the user did not like any of the books Gnothi was recommending, then that means my model sucks. That means I'm not giving that user the recommendations they would expect, and I should reconsider how I'm pre-training this model. Maybe I should switch to a different model and use metric learning the right way, or maybe that cosine similarity isn't what I thought it was. And so I'll get a notification on CloudWatch.
It may email me, it may text me, and tell me: there's a lot of drift in your model, you may want to look into this. Then I'll have insight, and I won't have to keep an eye on this model myself; SageMaker will keep an eye on it for me. Now, those are a whole bunch of the features of Gnothi and how they lend themselves well to SageMaker and AWS cloud-native services.
One feature I want to add to Gnothi is dream interpretation, automatic dream interpretation. If you have a journal entry whose tag is dream or dreams or dreaming, it will auto-interpret your dream by matching the elements of your dream to their definitions in a dream dictionary. This will be a new feature, and I don't have to worry about the current tech stack of my machine learning deployment, because I can write a one-off microservice SageMaker model that's independent of all the other machine learning in my stack. Currently, Gnothi runs all of its machine learning in one Docker container that runs on Batch, and if I want to change one machine learning model, improve one model, that might affect all the other models. By using SageMaker, I can write a microservice, a single machine learning model, that I can add into Gnothi's feature set without worrying about how it may affect the rest of the stack.
There you go: SageMaker and AWS cloud-native services. There are a lot of them. Before you write your own code, see if you can do it with a cloud-native service. If not, see if you can do it with Autopilot. If not, see if you can get started with SageMaker JumpStart. And if you can't do that, then you can fall back on the dedicated SageMaker tooling, and it is powerful, it is magical. I recommend moving away from localhost development into developing and training your models on SageMaker Studio in an IPython notebook, so that you become acquainted with the SageMaker tooling, and so that when you're ready to deploy your model it's already all baked into the pipeline.
So you don't have to do that mental translation of taking your local model in Docker to the cloud; it's all ready to go for your customers at scale. In coming episodes, I'm going to talk a little more about AWS, and about developing against AWS by way of something called LocalStack, so you can become steeped in the AWS tech stack.
I'm also going to talk about some alternatives to SageMaker, like Kubeflow and MLflow and all these things. I'll see you later.