[colin@world ~]$
whoami
[colin@world ~]$
cat interests.md
- My current research focuses on building code LLMs, including:
- - Continual Learning on Code, which can help efficiently handle complicated and evolving codebases, complementary to scaling up model and data size.
- - "World modeling" for Code, which is expected to improve the understanding of code semantics; however, current results show limited transferability benefits (e.g., to agentic settings), motivating future work on the "why" and "how".
[colin@world ~]$
cat ./pubs/*_selected*
-
[arXiv] SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning
Jinjun Peng*, Magnus Saebo*, Tianjun Zhong, Yi-Jie Cheng, Junfeng Yang, Baishakhi Ray, Simin Chen, Yangruibo Ding
[arXiv:2601.21649]
[side notes]
-
[LLM4Code '25] CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation
Jinjun Peng, Leyi Cui, Kele Huang, Junfeng Yang, Baishakhi Ray
[arXiv:2501.08200]
[code]
-
[NeurIPS '24] SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
Yangruibo Ding, Jinjun Peng, Marcus J Min, Gail Kaiser, Junfeng Yang, Baishakhi Ray
[arXiv:2406.01006]
[code]
-
[ESEC/FSE '23 Distinguished Paper] NeuRI: Diversifying DNN Generation via Inductive Rule Inference
Jiawei Liu, Jinjun Peng, Yuyao Wang, Lingming Zhang
[arXiv:2302.02261]
[code 0]
[code 1]
-
[OOPSLA '24] Quarl: A Learning-Based Quantum Circuit Optimizer
Zikun Li, Jinjun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, Zhihao Jia
[arXiv:2307.10120]
[code]
[colin@world ~]$
cat ./exps/*
-
Amazon, Applied Scientist Intern, 2025.05 - 2025.08
- Agentic retrieval for coding tasks
-
Columbia University, Ph.D. student in Computer Science, 2023.09 - ?
- Advised by Professors Baishakhi Ray and Junfeng Yang.
-
Tsinghua University, B.E. in Computer Science, 2019 - 2023
-
Beijing National Day School, 2016 - 2019
- Where dreams begin.
[colin@world ~]$
cat services.md
-
Program Committee/Reviewer
- ICLR ['26]
- DL4C @ NeurIPS ['25], ASE NIER '25
- TSE ['25], TOSEM ['25], TKDD ['25], Journal of Big Data
[colin@world ~]$
cat ack.md
- [Always] My friends, mentors, and collaborators.
- Funding:
- - [2025] Capital One Fellow
- - [2024] Amazon Trusted AI Challenge ($250k)
- Compute:
- - [2024] NSF ACCESS Program
- - [2024] OpenAI Researcher Access Program