Company: AMD

Location: Beijing, Beijing, China

Career Level: Entry Level

Industries: Technology, Software, IT, Electronics

Description

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_

THE ROLE:

We are seeking a technically proficient PMTS Engineer to develop and optimize AI model inference and training solutions for AMD products. You will work hands-on at the framework, model, and operator levels to improve the accuracy and performance of model training and deployment, and to identify and resolve bottlenecks.

KEY RESPONSIBILITIES:

Develop and optimize AI model inference and training solutions for AMD products, covering framework, model, and operator-level optimizations.
Analyze and tune accuracy and performance issues during model training and deployment: identify and resolve bottlenecks, make full use of hardware resources, reduce latency and increase throughput in large-model inference, and significantly improve inference and training performance.
Research and develop advanced model optimization techniques, including but not limited to model quantization, model compression, efficient attention mechanisms, and efficient model architectures.
Collaborate with AMD software and hardware teams to deliver optimal end-to-end model training and inference solutions.
Provide technical support to AMD customers, helping them achieve optimal performance through effective use of AMD software and hardware.

TECHNICAL REQUIREMENTS:

Experience in model compression and inference acceleration techniques, such as quantization, sparsity, efficient attention, and KV cache compression. Candidates with publications at top conferences will be given priority.
Proficiency in Python/C/C++ programming, CUDA kernel development, and Triton, with experience in low-level algorithm performance debugging and acceleration.
Proficiency in at least one deep learning framework such as PyTorch, and familiarity with common distributed machine learning frameworks like Megatron, DeepSpeed, or Hugging Face Transformers.
Familiarity with mainstream LLM inference engines such as FasterTransformer, vLLM, and TRT-LLM, and common inference optimization methods like FlashAttention, PagedAttention, continuous batching, and speculative decoding.

PERSONAL ABILITIES:

Strong problem-solving skills to tackle complex technical challenges and find effective solutions.
Excellent learning ability to quickly acquire new knowledge and skills, keeping pace with technological advancements.
Good communication skills to interact clearly and effectively with team members and customers from diverse backgrounds.

Benefits offered are described in "AMD benefits at a glance".

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

Senior AI Model Inference Architect Job Listing at AMD in Beijing, Beijing (Job ID 63081-en-us)
