Description
WHAT YOU DO AT AMD CHANGES EVERYTHING
We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.
AMD together we advance_
Responsibilities:
1. Lead a team focused on AI model inference and training solutions for AMD GPUs, including but not limited to optimizations at the framework, model, and operator levels. Analyze and resolve accuracy and performance issues that arise during model training and deployment, identify and eliminate bottlenecks, and make full use of hardware resources to optimize inference and training performance, reducing latency and improving throughput for large-model inference (a minimal latency/throughput measurement sketch follows this list).
2. Collaborate with AMD GPU software and hardware teams to optimize LLM frameworks and computing libraries, enhancing end-to-end AI model training and inference performance.
3. Support AMD GPU customers by helping them achieve optimal performance through effective use of AMD software and hardware.
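As a purely illustrative sketch of the kind of baseline measurement this optimization work starts from (not a tool or model prescribed by the role), the Python snippet below times a forward pass in PyTorch and reports latency and throughput. The model, tensor sizes, and the helper name measure_inference are placeholders; on ROCm builds of PyTorch, the "cuda" device name maps to AMD GPUs.

```python
# Minimal latency/throughput baseline sketch (illustrative only).
# Assumes a ROCm or CUDA build of PyTorch with a GPU available.
import time
import torch

def measure_inference(model: torch.nn.Module, batch: torch.Tensor, iters: int = 50):
    model.eval()
    with torch.no_grad():
        # Warm-up so one-time kernel compilation/caching does not skew timing.
        for _ in range(5):
            model(batch)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(batch)
        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
    latency_ms = elapsed / iters * 1000.0
    throughput = batch.shape[0] * iters / elapsed  # samples per second
    return latency_ms, throughput

if __name__ == "__main__":
    device = "cuda"  # on ROCm PyTorch this targets the AMD GPU
    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
    ).to(device).half()
    batch = torch.randn(32, 4096, device=device, dtype=torch.float16)
    latency_ms, throughput = measure_inference(model, batch)
    print(f"latency: {latency_ms:.2f} ms/iter, throughput: {throughput:.0f} samples/s")
```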
Requirements:
1. Experience in leading teams to accelerate AIGC applications and optimize large model inference/training.
2. Proficiency in Python/C/C++ programming, CUDA kernel development, and Triton, with experience in low-level algorithm performance debugging and acceleration (a minimal Triton kernel sketch appears after this list).
3. Proficient in at least one deep learning framework such as TensorFlow or PyTorch, and familiar with common distributed machine learning frameworks like Megatron, DeepSpeed, or Hugging Face Transformers.
4. Familiar with mainstream LLM inference engines such as FasterTransformer, vLLM, and TRT-LLM, and with common inference optimization methods such as FlashAttention, PagedAttention, Continuous Batching, and Speculative Decoding (a minimal vLLM usage sketch appears after this list).
5. Strong leadership and project management skills, with excellent communication and cross-team collaboration abilities.
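For the Triton experience referenced in requirement 2, the sketch below shows the standard elementwise-add kernel pattern from Triton's public API. It is an illustrative example only, not an AMD-specific kernel; Triton kernels of this form also run on AMD GPUs through the ROCm backend.

```python
# Minimal Triton kernel sketch (illustrative only): a masked elementwise add.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the final partial block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = out.numel()
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out

if __name__ == "__main__":
    x = torch.randn(1 << 20, device="cuda")
    y = torch.randn(1 << 20, device="cuda")
    assert torch.allclose(add(x, y), x + y)
```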
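For the inference engines and optimization methods referenced in requirement 4, the snippet below is a minimal usage sketch of vLLM's public offline-inference API; PagedAttention and continuous batching are applied internally by the engine. The model name and prompts are placeholders, not part of this posting.

```python
# Minimal vLLM offline-inference sketch (illustrative only).
from vllm import LLM, SamplingParams

prompts = [
    "Explain what continuous batching does for LLM serving.",
    "Why does PagedAttention reduce KV-cache memory fragmentation?",
]
sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

llm = LLM(model="facebook/opt-125m")       # placeholder model name
outputs = llm.generate(prompts, sampling)  # prompts are batched by the engine

for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```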
GPU High-Performance Optimization R&D Manager
Work location: Beijing
Benefits offered are described in AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.