AI/ML learner focusing on Large Language Models (LLMs).
Passionate about the full LLM lifecycle, from pre-training and fine‑tuning to alignment (RLHF) and inference optimization.
- LLM Alignment & Retrieval: RLHF, PPO, DPO, and Retrieval‑Augmented Generation (RAG) systems
- Learning: Agentic Workflows, DeepSpeed & model quantization (AWQ/GPTQ)
- Education: Qilu University of Technology (QLUT)
- Research Interests: Reward modeling, context window extension, chain‑of‑thought (CoT) reasoning
- Currently seeking 2026 Summer Internship opportunities (LLM / Multimodal / Agent / RLHF)
- Email: 867762462f@gmail.com
- Zhihu: 二次函数


