Conveners
AMD 技術專題:從資料中心到客戶端:以 AMD AI 軟體、ROCm 與 NPU 部署大型語言模型(LLM)|Invited Technical Talk: From Datacenter to Client: Deploying LLMs with AMD AI Software, ROCm, and NPUs (Presented by AMD)
- Simon CHANG (AMD)
- Micky CHENG (AMD)
Description
This session explores how AMD’s AI software stack enables efficient LLM deployment from Datacenter to client across CPUs, GPUs, and NPUs. We will cover model development and optimization using ROCm in the data center, followed by seamless deployment to client platforms with AMD Ryzen™ AI NPUs. Using Lemonade, an open-source, OpenAI-compatible local LLM server, attendees will learn how to deploy LLMs with minimal code changes, leveraging heterogeneous execution across GPU and NPU to achieve low-latency, power-efficient, and private on-device generative AI experiences on PCs and edge devices.