About me

Gaosong is a software engineer with 5 years of experience, specializing in optimizing Large Language Model (LLM) performance and developing robust software in Python and C++. This work includes designing and implementing LLM inference API services, using tools such as vLLM and low-bit quantization to increase throughput and reduce memory usage. Additional strengths include GPU architecture, CUDA programming, and PyTorch, along with experience in distributed systems and big data platforms.

For Recruiters and Prospective Colleagues

To learn more about my:

  • Problem-Solving Approach
  • Communication Style
  • Preferred Work Environment
  • Career Goals and Skills

please read this post.