In Amazon SageMaker HyperPod
With a one-click installation of the Amazon Elastic Kubernetes Service (Amazon EKS) add-on for SageMaker HyperPod observability, you can consolidate health and performance data from NVIDIA DCGM, instance-level Kubernetes node exporters, Elastic Fabric Adapter (EFA), integrated file systems, Kubernetes APIs, Kueue, and SageMaker HyperPod task operators. In this post, we walk you through installing and using the unified dashboards of the out-of-the-box observability feature in SageMaker HyperPod. We cover the one-click installation from the Amazon SageMaker AI console, a tour of the dashboards and the metrics they consolidate, and advanced topics such as setting up custom alerts.
Source: Amazon Web Services (AWS), Machine Learning