Net ordering in detailed routing is crucial to routing closure, especially in modern routers, most of which route nets sequentially with a rip-up-and-reroute scheme. In advanced technology nodes, detailed routing must handle complicated design rules and large problem sizes, which makes its performance more sensitive to the order in which nets are routed. In the literature, net orders are mostly determined by simple heuristic rules tuned for specific benchmarks. In this work, we propose an asynchronous reinforcement learning (RL) framework to automatically search for optimal ordering strategies and a transfer learning (TL) algorithm to improve performance. By asynchronously querying the router, pre-training the RL agents, and fine-tuning with the TL algorithm, we generate high-performance routing sequences that achieve a 26% reduction in DRC violations and a 1.2% reduction in total cost compared with the state-of-the-art detailed router.