Using Amazon Nova

In this post, we demonstrate how to use services such as Amazon Nova, Amazon Rekognition, and Amazon Polly to automate the creation of accessible audio descriptions for video content. This approach can significantly reduce the time and cost required to make video accessible to visually impaired audiences.

Source: Amazon Web Services – Machine Learning
According to the World Health Organization, more than 2.2 billion people worldwide have a vision impairment. To comply with disability legislation, such as the Americans with Disabilities Act (ADA) in the United States, media in visual formats such as television shows or movies must be made accessible to visually impaired people. This often takes the form of an audio description track that narrates the visual elements of the film or show. According to the International Documentary Association, creating audio descriptions costs $25 per minute (or more) when using a third party. Building audio descriptions in-house can require significant effort from media companies, involving content creators, audio description writers, description narrators, audio engineers, delivery vendors, and more.

This leads to a natural question: can you automate this process with the help of generative AI offerings in Amazon Web Services (AWS)?

Announced in December at re:Invent 2024, the Amazon Nova family of foundation models is available through Amazon Bedrock and includes three multimodal foundation models (FMs):

- Amazon Nova Lite (GA) – A low-cost multimodal model that's lightning-fast for processing image, video, and text inputs
- Amazon Nova Pro (GA) – A highly capable multimodal model with a balanced combination of accuracy, speed, and cost for a wide range of tasks
- Amazon Nova Premier (GA) – Our most capable model for complex tasks and a teacher for model distillation

In this post, we demonstrate how you can use services like Amazon Nova, Amazon Rekognition, and Amazon Polly to automate the creation of accessible audio descriptions for video content. This approach can greatly reduce the time and cost required to make video accessible to visually impaired audiences. Note, however, that this post does not provide a complete, deployment-ready solution; we share pseudocode snippets and guidance in sequence.