Reinforcement learning (RL) has the potential to be applied to many real-world applications. In our research, we investigate applications of RL and multi-agent RL. Currently, we focus on two applications: traffic signal control and electronic design automation (EDA). Traffic signals that coordinate traffic movements are key to transportation efficiency. However, conventional traffic signal control relies heavily on pre-defined rules and assumptions about traffic conditions and is far from intelligent. RL, which learns by directly interacting with the environment, has great potential for traffic signal control in building smart cities. EDA involves many combinatorial optimization problems. Many of them are currently solved by heuristics whose results are often far from optimal, and how to achieve better performance has been a long-standing problem. RL, which aims to optimize the long-term return, naturally fits many problems in EDA, so we also pay attention to this domain. In the following, we introduce some of our studies; for details, please refer to the papers.
HiLight
The objective of traffic signal control is to optimize average travel time, which is a delayed reward over a long time horizon in the context of RL. However, existing work simplifies the optimization by using queue length, waiting time, delay, etc., as immediate rewards and presumes these short-term targets are always aligned with the objective. Nevertheless, these targets may deviate from the objective in different road networks with various traffic patterns. Moreover, it remains unsolved how to cooperatively control traffic signals to directly optimize average travel time. To address these challenges, we propose a hierarchical and cooperative reinforcement learning method, HiLight. HiLight enables each agent to learn a high-level policy that optimizes the objective locally by selecting among sub-policies that respectively optimize short-term targets. In addition, the high-level policy considers the objective in the neighborhood with adaptive weighting to encourage agents to cooperate on the objective across the road network. Empirically, we demonstrate that HiLight outperforms state-of-the-art RL methods for traffic signal control on real road networks with real traffic.
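The hierarchical control loop can be pictured with a small sketch. The snippet below is only an illustrative toy, not the HiLight implementation: the class names `SubPolicy` and `HighLevelPolicy`, the tabular Q-learning updates, and the fixed list of short-term targets are assumptions made for illustration. It shows the two-level structure of a high-level policy choosing among target-specific sub-policies, with a high-level reward that mixes the local objective and the neighborhood objective through an adaptive weight.

```python
# Toy sketch (not the HiLight implementation) of one intersection agent:
# a high-level policy selects which short-term sub-policy controls the
# next interval; its reward mixes local and neighborhood objectives.
import numpy as np

rng = np.random.default_rng(0)

N_PHASES = 4                                          # signal phases
TARGETS = ["queue_length", "waiting_time", "delay"]   # short-term targets


class SubPolicy:
    """Toy epsilon-greedy tabular policy optimizing one short-term target."""
    def __init__(self, n_states, n_actions):
        self.q = np.zeros((n_states, n_actions))

    def act(self, state, eps=0.1):
        if rng.random() < eps:
            return int(rng.integers(self.q.shape[1]))
        return int(np.argmax(self.q[state]))

    def update(self, s, a, r, s_next, alpha=0.1, gamma=0.95):
        td_target = r + gamma * self.q[s_next].max()
        self.q[s, a] += alpha * (td_target - self.q[s, a])


class HighLevelPolicy:
    """Chooses which sub-policy controls the next interval; its reward is
    the local travel-time objective mixed with the neighborhood objective."""
    def __init__(self, n_states, n_subpolicies):
        self.q = np.zeros((n_states, n_subpolicies))

    def act(self, state, eps=0.1):
        if rng.random() < eps:
            return int(rng.integers(self.q.shape[1]))
        return int(np.argmax(self.q[state]))

    def update(self, s, option, local_obj, neighbor_obj, weight,
               s_next, alpha=0.1, gamma=0.95):
        # Adaptive weighting: blend the agent's own objective with the
        # objective observed in its neighborhood.
        r = (1.0 - weight) * local_obj + weight * neighbor_obj
        td_target = r + gamma * self.q[s_next].max()
        self.q[s, option] += alpha * (td_target - self.q[s, option])


# Toy usage: one agent with a discretized local state space of size 16.
agent_high = HighLevelPolicy(n_states=16, n_subpolicies=len(TARGETS))
agent_subs = [SubPolicy(n_states=16, n_actions=N_PHASES) for _ in TARGETS]
```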
Net Order Exploration in Detailed Routing
The net orders in detailed routing are crucial to routing closure, especially for modern routers that follow the sequential routing paradigm with a rip-up and reroute scheme. In advanced technology nodes, detailed routing has to deal with complicated design rules and large problem sizes, making its performance more sensitive to the order of the nets to be routed. In the literature, net orders are mostly determined by simple heuristic rules tuned for specific benchmarks. We propose an asynchronous reinforcement learning (RL) framework to automatically search for ordering strategies. By asynchronously querying the router and training the RL agents, we generate high-performance routing sequences that achieve better solution quality.
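As a rough illustration of the asynchronous query-and-train idea, the sketch below (not the framework from the paper) uses Python threads: worker threads sample candidate net orders from a score table and "query" a stand-in evaluator, while a trainer thread consumes the resulting (order, reward) pairs and updates the scores with a baseline-corrected update. The function `evaluate_order`, the score-table policy, and all constants are hypothetical placeholders; in practice the query would invoke the actual detailed router.

```python
# Toy sketch of asynchronous querying and training for net ordering.
import queue
import random
import threading

NETS = list(range(8))          # toy net IDs
results = queue.Queue()        # (order, reward) pairs from workers


def evaluate_order(order):
    # Placeholder for a detailed-router call; this toy reward simply
    # prefers orders close to ascending net ID.
    return -sum(abs(pos - n) for pos, n in enumerate(order))


def worker(scores, lock, n_rollouts=50):
    for _ in range(n_rollouts):
        with lock:
            # Sample an order tilted toward higher-scored nets first
            # (a noisy stand-in for a learned ordering policy).
            order = sorted(NETS, key=lambda n: -(scores[n] + random.random()))
        reward = evaluate_order(order)       # asynchronous "router" query
        results.put((order, reward))


def trainer(scores, lock, n_updates=200, lr=0.05):
    baseline = 0.0
    for step in range(n_updates):
        order, reward = results.get()
        baseline += (reward - baseline) / (step + 1)   # running mean reward
        adv = reward - baseline
        with lock:
            # Nets placed early in above-average orders get higher scores.
            for pos, n in enumerate(order):
                scores[n] += lr * adv * (len(order) - pos)


scores = {n: 0.0 for n in NETS}
lock = threading.Lock()
workers = [threading.Thread(target=worker, args=(scores, lock)) for _ in range(4)]
train_thread = threading.Thread(target=trainer, args=(scores, lock))
for th in workers:
    th.start()
train_thread.start()
for th in workers:
    th.join()
train_thread.join()
print(sorted(NETS, key=lambda n: -scores[n]))   # learned net ordering
```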