Benchmark and optimize endpoint deployment in Amazon SageMaker JumpStart

When deploying a large language model (LLM), machine learning (ML) practitioners typically care about two measurements for model serving performance: