Hao Liang is a postdoctoral researcher at the University of Maryland, College Park, working with Kaiqing Zhang. He received his PhD from The Chinese University of Hong Kong (CUHK), Shenzhen, under the supervision of Zhi-Quan (Tom) Luo. His research lies at the intersection of statistical and computational efficiency in decision-making algorithms, with a particular focus on risk awareness, safety, and multi-agent systems.
Email: haoliang1 at link.cuhk.edu.cn
[CV]
[Google Scholar]
[LinkedIn]
News
May 2026: Recognized as an ICML 2026 Gold Reviewer
April 2026: One paper accepted to ICML 2026: "How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?" [link]
January 2026: Two papers accepted to ICLR 2026: "Is Pure Exploitation Sufficient in Exogenous MDPs with Linear Function Approximation?" [link] and "BRIDGE: Bi-level Reinforcement Learning for Dynamic Group Structure in Coalition Formation Games"
November 2025: Served as Local Organizing Chair of the 7th International Conference on Distributed Artificial Intelligence (DAI 2025) held on November 21–24, 2025 in London, UK.
October 2025: Our paper, "Why GRPO Needs Normalization: A Local-Curvature Perspective on Adaptive Gradients" will be presented at the NeurIPS 2025 Workshop on Efficient Reasoning.
👉 TL;DR: We reveal that GRPO’s normalization acts as an adaptive gradient mechanism aligned with local curvature, accelerating and stabilizing LLM reasoning training.
September 2025: Our paper, "Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems" was accepted to NeurIPS 2025 (Spotlight) [link]
👉 TL;DR: We develop GSAC, a causality-aware framework enabling provable scalability and fast cross-domain adaptation in large networked systems.
July 2024: Our paper, "Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds" was accepted to JMLR
👉 TL;DR: We bridge distributional and risk-sensitive RL under entropic risk measures, achieving near-optimal regret with computationally efficient DRL algorithms.
July 2024: Presented "Bridging Distributional and Risk-Sensitive Reinforcement Learning: Balancing Statistical, Computational, and Risk Considerations" at the ICML 2024 FoRLaC Workshop
March 2024: Delivered a talk, "Efficient Risk-aware Decision-making: A Distributional Perspective", at the Vector Institute
March 2024: Delivered a talk, "A Distribution Optimization Framework for Confidence Bounds of Risk Measures", at the INFORMS Optimization Society (IOS) Conference
January 2024: Our paper, "Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures" was accepted to AISTATS 2024
July 2023: Presented "A Distribution Optimization Framework for Confidence Bounds of Risk Measures" at ICML 2023
Selected Papers
Is Pure Exploitation Sufficient in Exogenous MDPs with Linear Function Approximation? [link]
ICLR 2026
Hao Liang, Jiayu Cheng, Sean R. Sinclair, Yali Du

Why GRPO Needs Normalization: A Local-Curvature Perspective on Adaptive Gradients [link]
NeurIPS 2025 Workshop on Efficient Reasoning
Cheng Ge*, Heqi Yin*, Hao Liang†, Jiawei Zhang†
* Equal contribution. † Co-last authors.

Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems [link]
NeurIPS 2025 (Spotlight)
Hao Liang*, Shuqing Shi*, Yudi Zhang, Biwei Huang, Yali Du
* Equal contribution.

Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds [link]
JMLR
Hao Liang, Zhi-Quan Luo

Regret Bounds for Risk-sensitive Reinforcement Learning with Lipschitz Dynamic Risk Measures [link]
AISTATS 2024
Hao Liang, Zhi-Quan Luo

A Distribution Optimization Framework for Confidence Bounds of Risk Measures [link]
ICML 2023
Hao Liang, Zhi-Quan Luo