Impact of memory on beamforming optimization for DDPG-assisted RIS-multi-user system


  • Naima Sofi
  • Faouzi Didi
  • Moustafa Sahnoune Chaouche
  • Oussama Abdelilah Sofi
  • Fethi Tarik Bendimerad



RIS, DRL, LSTM, optimization, 6G, phase-shift, beamforming


Reconfigurable intelligent surfaces (RIS) have recently attracted considerable attention as a key enabling technology for future 6G wireless communication systems. This interest stems largely from advances in programmable metamaterial fabrication, which make highly adaptable surfaces possible. These surfaces, often called intelligent reflecting arrays, go beyond the conventional capabilities of massive multiple-input multiple-output (MIMO) systems and enable intelligent radio environments with greater flexibility and efficiency. This study addresses the joint design of the transmit beamforming matrix at the base station and the phase-shift matrix at the RIS. Building on recent progress in deep reinforcement learning (DRL), we use a Long Short-Term Memory (LSTM) based DRL algorithm that learns the joint design through repeated interaction with the environment, guided by predefined rewards over continuous state and action spaces. Although recent research reports that LSTM-based architectures improve the learning capacity of reinforcement learning (RL) algorithms and simplify the search process, our results point the other way: adding memory to the Deep Deterministic Policy Gradient (DDPG) algorithm through an LSTM (DDPG-LSTM) degrades the system's overall performance. Preliminary simulation results demonstrate this adverse effect of memory on DDPG performance when applied to RIS systems. The finding highlights the difficulty of optimizing 6G wireless communication systems and the importance of careful algorithm design and parameter tuning in emerging technologies.
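To make the continuous action space concrete, the following minimal sketch shows one way a flat real-valued actor output could be mapped to a beamforming matrix and a unit-modulus phase-shift matrix, and how a reward could then be computed. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the reward, so the sum rate is assumed here, and the function names `split_action` and `sum_rate`, the dimensions `M`, `K`, `N`, the power budget `p_max`, and the channel layout are all hypothetical.

```python
import numpy as np


def split_action(action, M, K, N, p_max):
    """Map a flat real action of length 2*M*K + 2*N to (G, Phi).

    Assumed layout (hypothetical): real parts of G, imaginary parts of G,
    then real and imaginary parts of the N RIS coefficients.
    """
    action = np.asarray(action, dtype=float)
    n_g = M * K
    g_re, g_im = action[:n_g], action[n_g:2 * n_g]
    t_re = action[2 * n_g:2 * n_g + N]
    t_im = action[2 * n_g + N:]

    # Beamforming matrix, rescaled so its squared Frobenius norm equals p_max.
    G = (g_re + 1j * g_im).reshape(M, K)
    G = G * np.sqrt(p_max) / max(np.linalg.norm(G), 1e-12)

    # Keep only the phase of each RIS coefficient to enforce unit modulus.
    phases = np.angle(t_re + 1j * t_im)
    Phi = np.diag(np.exp(1j * phases))
    return G, Phi


def sum_rate(H1, H2, G, Phi, noise_var):
    """Assumed reward: downlink sum rate in bit/s/Hz.

    H1 is the N x M base-station-to-RIS channel and H2 the N x K
    RIS-to-user channel; no direct link is modelled in this sketch.
    """
    H_eff = H2.T @ Phi @ H1            # K x M cascaded channel
    P = np.abs(H_eff @ G) ** 2         # P[k, j]: power at user k from stream j
    signal = np.diag(P)
    interference = P.sum(axis=1) - signal
    return float(np.sum(np.log2(1.0 + signal / (interference + noise_var))))
```

In a DDPG loop, the actor would emit `action`, the environment would evaluate `sum_rate` on the resulting `(G, Phi)`, and that value would be returned as the reward. A DDPG-LSTM variant would differ in the network architecture, not in this mapping.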





How to Cite

Sofi, N., Didi, F., Chaouche, M. S., Sofi, O. A., & Bendimerad, F. T. (2024). Impact of memory on beamforming optimization for DDPG-assisted RIS-multi-user system. STUDIES IN ENGINEERING AND EXACT SCIENCES, 5(1), 1589–1609.