Journal of International Commercial Law and Technology
2025, Volume:6, Issue:1 : 836-840 doi: dx.doi.org/10.61336/Jiclt/25-01-81
Research Article
Adaptive Reinforcement Learning for Smart City Traffic Optimization
Professor, Jagan Institute of Management Studies, Rohini, New Delhi
Received: Sept. 30, 2025
Revised: Oct. 16, 2025
Accepted: Oct. 27, 2025
Published: Nov. 12, 2025
Abstract

Urban traffic congestion is a critical issue impacting travel time, fuel consumption, and air quality. Traditional traffic management systems rely on static rules and limited sensor feedback, which fail to adapt to dynamic and unpredictable conditions. This paper proposes an Adaptive Reinforcement Learning (ARL) approach for optimizing traffic signals within smart cities. The ARL model leverages continuous environmental feedback to adjust signal timing based on real-time vehicular flow. Simulations using a synthetic traffic network demonstrate that the proposed model reduces average waiting time by 28%, improves throughput by 21%, and decreases CO₂ emissions by 16% compared to traditional fixed-time control. These results indicate that ARL is a promising direction for sustainable urban mobility.

INTRODUCTION

Traffic congestion remains a major challenge in modern cities. Static and semi-adaptive systems, though efficient under predictable patterns, cannot cope with stochastic variations in vehicle density. Reinforcement Learning (RL) provides a self-learning framework where an agent interacts with its environment, receives feedback, and learns an optimal policy.
This research introduces an Adaptive Reinforcement Learning (ARL) framework that tunes its learning parameters dynamically in response to real-time changes, with the aim of stable and efficient control even under uncertain traffic conditions.

LITERATURE REVIEW

Recent studies have applied RL to traffic management with varying degrees of success. Van der Pol and Oliehoek (2016) demonstrated that Deep Q-Networks (DQN) outperform traditional Q-learning in non-linear traffic environments. Wei et al. (2018) introduced CoLight, a multi-agent RL approach for signal coordination. However, these methods often struggle with scalability and adaptability. Adaptive frameworks, as discussed by Genders and Razavi (2019), attempt to balance learning speed and stability.

This paper builds upon these foundations by incorporating adaptive reward functions and policy update rates that self-adjust according to congestion intensity.

METHODOLOGY

Problem Formulation

Each traffic intersection is modeled as an RL agent. The state (S) includes queue lengths, waiting times, and neighboring intersection statuses. The action (A) represents the green-light duration for each lane direction. The reward (R) penalizes vehicle delays and rewards higher throughput.
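The formulation above can be sketched in code. This is a minimal illustration only: the field names, the linear reward form, and the weights `delay_weight` and `throughput_weight` are assumptions introduced here and are not specified in the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IntersectionState:
    """Observation for one intersection agent (field names are illustrative)."""
    queue_lengths: Tuple[int, ...]    # vehicles queued on each approach
    waiting_times: Tuple[float, ...]  # accumulated wait per approach, in seconds
    neighbor_phases: Tuple[int, ...]  # current signal phase of each neighbor


def reward(total_delay_s: float, vehicles_cleared: int,
           delay_weight: float = 1.0, throughput_weight: float = 1.0) -> float:
    """Penalize delay and reward throughput (assumed linear combination)."""
    return throughput_weight * vehicles_cleared - delay_weight * total_delay_s


# Example step: four approaches, two neighboring intersections.
state = IntersectionState(
    queue_lengths=(4, 0, 7, 2),
    waiting_times=(12.5, 0.0, 30.0, 6.0),
    neighbor_phases=(0, 2),
)
r = reward(total_delay_s=sum(state.waiting_times), vehicles_cleared=9,
           delay_weight=0.1)
```

Making the state immutable and hashable (a frozen dataclass of tuples) lets it serve directly as a key in a tabular Q-function.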

 

Adaptive Reinforcement Learning Model

The ARL model modifies traditional Q-learning using an adaptive learning rate (α) and reward scaling:
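One plausible reading of this modification is the standard Q-learning update with a time-varying learning rate and a scaled reward: Q(s, a) ← Q(s, a) + α_t [k_t r + γ max Q(s', a') − Q(s, a)]. The sketch below implements that reading. The linear dependence of α_t and k_t on a congestion level in [0, 1], the class name `AdaptiveQAgent`, and all default parameter values are assumptions made for illustration, not the paper's stated rule.

```python
import random
from collections import defaultdict


class AdaptiveQAgent:
    """Tabular Q-learning with congestion-dependent learning rate and reward scale."""

    def __init__(self, actions, gamma=0.9, alpha_min=0.05, alpha_max=0.5,
                 epsilon=0.1, seed=0):
        self.actions = list(actions)   # e.g. candidate green durations in seconds
        self.gamma = gamma
        self.alpha_min = alpha_min
        self.alpha_max = alpha_max
        self.epsilon = epsilon
        self.q = defaultdict(float)    # maps (state, action) to a value estimate
        self.rng = random.Random(seed)

    @staticmethod
    def _clip(congestion):
        return min(max(congestion, 0.0), 1.0)

    def learning_rate(self, congestion):
        # Assumed rule: learn faster when congestion is high.
        c = self._clip(congestion)
        return self.alpha_min + (self.alpha_max - self.alpha_min) * c

    def reward_scale(self, congestion):
        # Assumed rule: amplify rewards (and penalties) under heavy congestion.
        return 1.0 + self._clip(congestion)

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, congestion):
        alpha = self.learning_rate(congestion)
        scaled_reward = self.reward_scale(congestion) * reward
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = scaled_reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += alpha * td_error
        return alpha


agent = AdaptiveQAgent(actions=(10, 20, 30))
alpha_used = agent.update("s0", 20, reward=-5.0, next_state="s1", congestion=0.5)
```

With a constant congestion input the agent reduces to conventional Q-learning with a fixed learning rate, which makes the baseline comparison straightforward.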

 

Simulation Setup

  • Tool Used: SUMO (Simulation of Urban MObility)
  • Network: 4×4 intersection grid
  • Vehicle Input: 500–1500 vehicles/hour/lane
  • Comparison: Fixed-time, Conventional Q-Learning, Proposed ARL
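The experimental grid implied by this setup can be enumerated as follows. This sketch does not call SUMO; it only builds the neighbor map of a 4×4 grid (useful for the neighboring-intersection part of the state) and lists controller and demand combinations. The intermediate demand level of 1000 and the identifier strings are assumed sample values, not taken from the paper.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]
GRID_SIZE = 4  # 4x4 intersection grid


def grid_neighbors(size: int = GRID_SIZE) -> Dict[Node, List[Node]]:
    """Map each (row, col) intersection to its adjacent intersections."""
    neighbors: Dict[Node, List[Node]] = {}
    for row in range(size):
        for col in range(size):
            candidates = [(row - 1, col), (row + 1, col),
                          (row, col - 1), (row, col + 1)]
            neighbors[(row, col)] = [
                (r, c) for r, c in candidates
                if 0 <= r < size and 0 <= c < size
            ]
    return neighbors


# Demand in vehicles/hour/lane; 500 and 1500 are the stated bounds,
# 1000 is an assumed midpoint sample.
DEMAND_LEVELS = (500, 1000, 1500)
CONTROLLERS = ("fixed_time", "q_learning", "arl")

neighbors = grid_neighbors()
runs = [(controller, demand)
        for controller in CONTROLLERS
        for demand in DEMAND_LEVELS]
```

Corner intersections have two neighbors, edge intersections three, and interior intersections four, so the neighbor part of the state varies in size across agents.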

 

Results and Analysis

Model             Avg. Waiting Time (s)   Throughput (veh/hr)   CO₂ Emission (g/km)
Fixed-Time        72.4                    820                   140.6
Q-Learning        56.8                    960                   126.3
ARL (Proposed)    52.1                    1160                  118.0

Table 1: Performance comparison of traffic control methods.

 

Analysis:
As shown in Table 1, the proposed ARL method achieves a 28% reduction in average waiting time compared to fixed-time control and an 8% reduction compared to conventional Q-Learning. Figure 4 shows the cumulative reward convergence, demonstrating faster stabilization with ARL due to dynamic adaptation.
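The relative changes can be recomputed directly from the Table 1 values. The short script below does this for both baselines; the dictionary keys and function names are illustrative.

```python
# Values transcribed from Table 1.
TABLE_1 = {
    "Fixed-Time": {"wait_s": 72.4, "throughput": 820, "co2": 140.6},
    "Q-Learning": {"wait_s": 56.8, "throughput": 960, "co2": 126.3},
    "ARL": {"wait_s": 52.1, "throughput": 1160, "co2": 118.0},
}


def percent_change(baseline: float, value: float) -> float:
    """Signed percentage change of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline


def compare(model: str, baseline: str) -> dict:
    """Percentage change of every metric for model relative to baseline."""
    return {
        metric: round(percent_change(TABLE_1[baseline][metric],
                                     TABLE_1[model][metric]), 1)
        for metric in TABLE_1[model]
    }


vs_fixed = compare("ARL", "Fixed-Time")
vs_q = compare("ARL", "Q-Learning")
print(vs_fixed)  # {'wait_s': -28.0, 'throughput': 41.5, 'co2': -16.1}
print(vs_q)      # {'wait_s': -8.3, 'throughput': 20.8, 'co2': -6.6}
```

Computed this way, the waiting-time and CO₂ reductions against fixed-time control match the reported 28% and 16%. The throughput gain against fixed-time control comes out near 41%, while a gain of about 21% corresponds to the Q-Learning baseline.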

 

Figure 1: Average Waiting Time by Model

 

Figure 2: Throughput Comparison

 

Figure 3: CO₂ Emission by Model

 

Figure 4: Cumulative Reward Convergence Curve

 

DISCUSSION

The results confirm that adaptability in learning rate and reward scaling enhances convergence speed and performance stability. Unlike static RL, ARL maintains efficiency during unexpected traffic surges. The scalability to larger networks is promising, though further optimization is needed to reduce computational cost in multi-agent scenarios.

CONCLUSION

This study demonstrates, in simulation, that Adaptive Reinforcement Learning improves traffic flow and reduces congestion relative to fixed-time and conventional Q-learning control. Future work will focus on:

  • Multi-intersection cooperative learning.
  • Integration with real-time IoT sensor data.
  • Deployment on edge-AI platforms for real-time inference.

REFERENCES
  1. Belletti, F., Haziza, D., Gomes, G., & Bayen, A. M. (2018). Expert level control of ramp metering based on multi-task deep reinforcement learning. IEEE Transactions on Intelligent Transportation Systems, 19(4), 1198–1207.
  2. Chu, T., Wang, J., & Codecà, L. (2019). Multi-agent deep reinforcement learning for large-scale traffic signal control. IEEE Transactions on Intelligent Transportation Systems, 21(11), 4516–4525.
  3. Kharb, L. (2019). Implementing IoT and data analytics to overcome vehicles danger. International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN 2278-3075.
  4. Genders, W., & Razavi, S. (2019). Evaluating reinforcement learning state representations for adaptive traffic signal control. Procedia Computer Science, 151, 708–713.
  5. Jain, P., & Kharb, L. (2019). Future of Transport: Connected Vehicles. Future, 6(11).
  6. Khamis, M. A., & Gomaa, W. (2014). Adaptive multi-agent reinforcement learning for traffic signal control. Proceedings of the 2014 International Conference on Intelligent Systems Design and Applications.
  7. Kharb, D. L., & Chahal, D. D. (2025). Squeak: Nurturing Creativity and Innovation with an Open Source Smalltalk Language. Journal of Marketing & Social Research, 2(2), 41–43.
  8. Li, L., Lv, Y., & Wang, F. Y. (2016). Traffic signal timing via deep reinforcement learning. IEEE/CAA Journal of Automatica Sinica, 3(3), 247–254.
  9. Liu, Y., Zhu, Y., & Gao, J. (2022). Dynamic adaptive learning for intelligent traffic management. Transportation Research Part C: Emerging Technologies, 137, 103607.
  10. Mannion, P., Duggan, J., & Howley, E. (2016). An experimental review of reinforcement learning algorithms for adaptive traffic signal control. Autonomous Agents and Multi-Agent Systems, 31(2), 285–306.
  11. Cavaliere, L. P. L., Rawat, S., Sidana, N., Acharjee, P. B., Kharb, L., & Podile, V. (2024). Securing Automated Systems with BT: Opportunities and Challenges. Robotics and Automation in Industry 4.0, 337–348.
  12. Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533.
  13. Oliehoek, F. A., & van der Pol, E. (2016). Deep reinforcement learning for traffic light control. arXiv preprint arXiv:1609.08684.
  14. Prashanth, L. A., & Bhatnagar, S. (2011). Reinforcement learning with average cost for traffic signal control. IEEE Transactions on Intelligent Transportation Systems, 12(2), 412–421.
  15. Kharb, L. (2018). A Perspective View on Commercialization of Cognitive Computing. 2018 8th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, pp. 829–832. doi: 10.1109/CONFLUENCE.2018.8442728
  16. Shi, L., Gao, Y., & Chen, M. (2021). Adaptive deep Q-learning for urban traffic flow optimization. Expert Systems with Applications, 186, 115719.
  17. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.
  18. Kharb, L., & Singh, P. (2021). Role of Machine Learning in Modern Education and Teaching. In S. Verma & P. Tomar (Eds.), Impact of AI Technologies on Teaching, Learning, and Research in Higher Education (pp. 99–123). IGI Global Scientific Publishing. https://doi.org/10.4018/978-1-7998-4763-2.ch006
  19. Tang, Z., et al. (2020). Multi-agent cooperative learning for smart city traffic control. IEEE Access, 8, 157502–157512.
  20. Van der Pol, E., & Oliehoek, F. A. (2016). Coordinated deep reinforcement learners for traffic light control. NeurIPS Workshop on Learning Systems.
  21. Singh, R., Singh, P., & Kharb, L. (2020). Proposing Real-Time Smart Healthcare Model Using IoT. In P. Raj, J. Chatterjee, A. Kumar, & B. Balamurugan (Eds.), Internet of Things Use Cases for the Healthcare Industry. Springer, Cham. https://doi.org/10.1007/978-3-030-37526-3_2
  22. Wei, H., Zheng, G., Yao, H., & Li, Z. (2018). CoLight: Learning network-level cooperation for traffic signal control. Proceedings of the 27th ACM CIKM, 1913–1922.
  23. Xiong, Y., Zhang, Z., & Zhao, J. (2023). Adaptive learning rate optimization in reinforcement learning for dynamic environments. IEEE Access, 11, 4128–4139.
  24. Yang, K., et al. (2020). Federated reinforcement learning for adaptive traffic control. Transportation Research Record, 2674(5), 1–14.
  25. Yin, H., & Li, J. (2019). A survey on reinforcement learning for intelligent transportation systems. IEEE Transactions on Intelligent Transportation Systems, 20(12), 4649–4672.
  26. Zhang, Q., Li, X., & Wang, F. (2022). Hybrid model for adaptive traffic signal control using deep reinforcement learning. Applied Intelligence, 52(8), 8741–8757.
  27. Zheng, G., et al. (2019). Learning phase competition for traffic signal control. Advances in Neural Information Processing Systems, 32, 1–11.