Accelerate Deep Learning Workloads with Amazon SageMaker
In this chapter, we discussed how to operationalize and optimize your inference workloads. We covered the inference and model hosting options offered by Amazon SageMaker, such as multi-model endpoints, multi-container endpoints, and Serverless Inference. Then, we reviewed how to promote and test model candidates using the Production Variants capability.
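To make the Production Variants idea concrete, the sketch below builds an endpoint configuration that splits traffic between an existing model and a candidate. The model names, instance type, and weights are illustrative assumptions, not values from the chapter; SageMaker routes requests to each variant in proportion to its `InitialVariantWeight`.

```python
# Sketch of a two-variant endpoint configuration for testing a model
# candidate with SageMaker Production Variants. Names, instance type,
# and weights are illustrative assumptions.
production_variants = [
    {
        "VariantName": "champion",
        "ModelName": "model-v1",      # assumed existing SageMaker model
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 9.0,  # ~90% of traffic
    },
    {
        "VariantName": "challenger",
        "ModelName": "model-v2",      # candidate under test
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,  # ~10% of traffic
    },
]

def traffic_split(variants):
    """Return the fraction of traffic each variant receives.

    SageMaker routes requests proportionally to InitialVariantWeight.
    """
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total
            for v in variants}

print(traffic_split(production_variants))
# → {'champion': 0.9, 'challenger': 0.1}

# To deploy, this list would be passed to boto3, e.g.:
#   sagemaker_client.create_endpoint_config(
#       EndpointConfigName="ab-test-config",
#       ProductionVariants=production_variants)
```

Shifting more traffic to the challenger later only requires updating the variant weights, not redeploying the models.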
After that, we provided a high-level overview of advanced model deployment strategies using SageMaker Deployment Guardrails, as well as workload monitoring using the Amazon CloudWatch service and SageMaker’s Model Monitor capability. Finally, we summarized the key selection criteria and algorithms you should use when defining your inference workload configuration.
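Model Monitor's core mechanism is comparing statistics of captured inference data against a baseline computed from training data and raising violations when they diverge. The following is a minimal sketch of that idea only; the data, threshold, and function names are illustrative assumptions, not Model Monitor's actual API.

```python
import statistics

# Minimal sketch of the baseline-vs-live comparison underlying
# SageMaker Model Monitor: compute baseline statistics from training
# data, then flag captured inference data whose mean drifts too far.
# Data and threshold are illustrative assumptions.

def build_baseline(values):
    """Summarize a numeric feature from the training dataset."""
    return {"mean": statistics.mean(values),
            "stdev": statistics.pstdev(values)}

def drift_detected(baseline, live_values, threshold=3.0):
    """Flag drift when the live mean is more than `threshold`
    baseline standard deviations away from the baseline mean."""
    live_mean = statistics.mean(live_values)
    return abs(live_mean - baseline["mean"]) > threshold * baseline["stdev"]

baseline = build_baseline([10, 11, 9, 10, 10, 12, 8])
print(drift_detected(baseline, [10, 11, 9]))   # in-distribution → False
print(drift_detected(baseline, [25, 27, 26]))  # drifted → True
```

In the managed service, the equivalent signal would surface as a Model Monitor violation report and a CloudWatch metric you can alarm on.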