Weiliang Zhao

I'm a first-year PhD student at Columbia University, advised by Professor Junfeng Yang and Professor Zhou Yu. My research focuses on the safety and alignment of large language models and AI agents.

I completed my Master’s in Computer Science at Columbia University, advised by Prof. Junfeng Yang and Prof. Chengzhi Mao.

I hold a BSc in Mathematics from the University of Edinburgh, where I was advised by Prof. Burak Buke.

📮Email  /  🔗LinkedIn  /  🎓Google Scholar  /  📃CV

Publications
Diversity Helps Jailbreak Large Language Models
Weiliang Zhao, Daniel Ben-Levi, Junfeng Yang, Chengzhi Mao
NAACL, 2025, Oral
arXiv

A generalized jailbreaking technique that encourages higher levels of diversification and uses adjacent obfuscated prompting to evaluate the vulnerabilities of LLMs.

Learning to Rewrite: Generalized LLM-Generated Text Detection
Wei Hao, Ran Li, Weiliang Zhao, Junfeng Yang, Chengzhi Mao
ACL, 2025
arXiv

We propose a method that improves the detection of LLM-generated text by learning to rewrite more on LLM-generated inputs and less on human-generated inputs.


Feel free to steal this website's source code. Do not scrape the HTML from this page itself, as it includes analytics tags that you do not want on your own website; use the GitHub code instead. Also, consider using Leonid Keselman's Jekyll fork of this page.