Journal of Hydroelectric Engineering ›› 2023, Vol. 42 ›› Issue (11): 21-32.doi: 10.11660/slfdxb.20231103
Abstract: Compared with a single reservoir, the operation of cascade reservoirs involves a state space that grows exponentially with the number of reservoirs. This paper applies the Deep Q-Network (DQN) deep reinforcement learning algorithm to overcome the curse of dimensionality that table-based reinforcement learning methods face in optimizing the long-term operation of cascade reservoirs. First, a joint distribution function of the reservoirs' stochastic inflows is derived using a Copula function. Then, following the temporal-difference idea, a main neural network and a target neural network are constructed to approximate the action values of the current state and the next state, respectively, and an ε-greedy strategy is used to balance exploration and exploitation while learning the optimal operation policy. Finally, the main parameters of reservoir operation are optimized step by step to ensure operational efficiency. Compared with the Q-learning algorithm and its variants, the DQN algorithm improves the objective value of the optimal schedule, accelerates convergence, and effectively avoids the curse of dimensionality in cascade reservoir operation.
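The core mechanism the abstract describes — a main network updated with temporal-difference targets computed by a periodically synchronized target network, with an ε-greedy exploration strategy — can be sketched as follows. This is a minimal illustrative toy, not the paper's model: the reservoir states, actions, reward, and a linear Q-approximator (standing in for the deep network) are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setting: discretized storage levels as states,
# release decisions as actions (sizes are illustrative, not from the paper).
N_STATES, N_ACTIONS, DIM = 8, 4, 8

def featurize(s):
    """One-hot state features (a deep network would learn these)."""
    x = np.zeros(DIM)
    x[s] = 1.0
    return x

class LinearQ:
    """Linear Q-value approximator standing in for the deep network."""
    def __init__(self):
        self.W = np.zeros((N_ACTIONS, DIM))
    def q(self, s):
        return self.W @ featurize(s)

main_net = LinearQ()    # approximates Q(s_t, a) of the current state
target_net = LinearQ()  # frozen copy used to value the next state

def eps_greedy(s, eps):
    """Exploration-exploitation strategy: random action with prob. eps."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))       # explore
    return int(np.argmax(main_net.q(s)))          # exploit

GAMMA, ALPHA, SYNC_EVERY = 0.95, 0.1, 100
for step in range(2000):
    s = int(rng.integers(N_STATES))
    a = eps_greedy(s, eps=max(0.05, 1.0 - step / 1000))  # decaying epsilon
    # Stand-in reward: favors matching the release decision to the state.
    r = 1.0 if a == s % N_ACTIONS else 0.0
    s_next = int(rng.integers(N_STATES))
    # Temporal-difference target computed with the target network.
    td_target = r + GAMMA * np.max(target_net.q(s_next))
    td_error = td_target - main_net.q(s)[a]
    main_net.W[a] += ALPHA * td_error * featurize(s)
    if step % SYNC_EVERY == 0:
        target_net.W = main_net.W.copy()  # periodically sync target network
```

Freezing the target network between syncs stabilizes the bootstrapped TD targets, which is the key difference from tabular Q-learning that the abstract's "two networks" construction refers to.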
Key words: stochastic optimal operation of cascade reservoirs, deep reinforcement learning, Deep Q-network algorithm, temporal difference idea, exploration and exploitation strategy
LI Wenwu, ZHOU Jiani, PEI Benlin, ZHANG Yifan. Study on long-term stochastic optimal operation of cascade reservoirs by deep reinforcement learning[J]. Journal of Hydroelectric Engineering, 2023, 42(11): 21-32.
URL: http://www.slfdxb.cn/EN/10.11660/slfdxb.20231103
http://www.slfdxb.cn/EN/Y2023/V42/I11/21