Abstract: In this paper, we formulate the problem of charging electric vehicles (EVs) from a time-dependent energy source as a Markov Decision Process (MDP), with states encoding the presence of cars, their individual charge levels, and the levels of available renewable energy and storage devices. We use MDP-based online algorithms such as Monte-Carlo Tree Search (MCTS) to overcome the scalability issues that arise when charging a large number of EVs, as in real distributed networks with flexible charging options. Using MCTS, we generate policies that balance the energy toll on the electric grid against the final charge level of each vehicle. We compare the performance of an offline MDP solver (the Discrete Value Iteration algorithm), an online MDP solver (MCTS), and a reinforcement-learning solver (Q-learning) in finding optimal policies for flexible EV charging optimization.
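To make the MDP formulation concrete, the following is a minimal toy sketch, not the paper's actual model: a single EV with a discretized battery, a stochastically available renewable source, and a flat Monte-Carlo planner (rollout-based action selection, a simplified stand-in for full MCTS with tree search and UCT). All constants and function names here are hypothetical illustrations.

```python
import random

# Hypothetical toy model: one EV, battery levels 0..MAX_CHARGE,
# renewable energy available with probability P_RENEWABLE each step.
MAX_CHARGE = 5
HORIZON = 8
GRID_COST = 1.0      # penalty per unit of energy drawn from the grid
TARGET_BONUS = 10.0  # reward if the car is fully charged at the horizon
P_RENEWABLE = 0.5

ACTIONS = ("idle", "grid", "renewable")

def step(charge, renewable, action):
    """One MDP transition: returns (next_charge, reward)."""
    reward = 0.0
    if action == "grid" and charge < MAX_CHARGE:
        charge += 1
        reward -= GRID_COST          # grid charging tolls the network
    elif action == "renewable" and renewable and charge < MAX_CHARGE:
        charge += 1                  # renewable energy: no grid toll
    return charge, reward

def rollout(charge, steps):
    """Random-policy rollout used to estimate the value of a state."""
    total = 0.0
    for _ in range(steps):
        renewable = random.random() < P_RENEWABLE
        charge, r = step(charge, renewable, random.choice(ACTIONS))
        total += r
    if charge == MAX_CHARGE:
        total += TARGET_BONUS
    return total

def mc_plan(charge, renewable, steps_left, n_rollouts=200):
    """Pick the action with the best average rollout return (flat Monte-Carlo)."""
    best_action, best_value = None, float("-inf")
    for action in ACTIONS:
        value = 0.0
        for _ in range(n_rollouts):
            next_charge, r = step(charge, renewable, action)
            value += r + rollout(next_charge, steps_left - 1)
        value /= n_rollouts
        if value > best_value:
            best_action, best_value = action, value
    return best_action

# Run one episode with the Monte-Carlo planner.
random.seed(0)
charge = 0
for t in range(HORIZON):
    renewable = random.random() < P_RENEWABLE
    action = mc_plan(charge, renewable, HORIZON - t)
    charge, _ = step(charge, renewable, action)
print("final charge:", charge)
```

The planner trades grid cost against the terminal bonus: it tends to wait for free renewable energy when it can, and pays the grid toll only when needed to reach full charge before the horizon. Full MCTS would replace the flat rollouts with an incrementally built search tree and a UCT selection rule.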