Abstract: Wireless Sensor and Actuator Networks such as ISA SP100.11a and WirelessHART rely on a special device, the network manager, which handles tasks such as device admission control, route definition, and allocation of communication resources. The routing algorithms used in these protocols must build routes that preserve path redundancy and network reliability while balancing energy consumption, resource use, and latency. Some of these routing algorithms expose weights that allow route preferences to be adjusted. Because wireless networks are dynamic, tuning these weights is challenging, and Reinforcement Learning models can help select and adapt the weights to optimize routes according to application requirements and current operating conditions. In this work, a global routing agent based on Q-Learning is proposed to adjust the weights of a state-of-the-art routing algorithm, aiming to balance the overall latency and the lifetime of the network. Each state is modeled as a set of weights, and each action changes the current state. The reward is positive when a state-action pair increases the expected network lifetime and decreases the average network latency. Experiments were conducted using a WirelessHART simulator, and the results show that the approach can balance latency and network lifetime when compared with other state-of-the-art routing algorithms with fixed weights.
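The state, action, and reward formulation summarized in the abstract can be illustrated with a minimal tabular Q-Learning sketch. It is not the implementation evaluated in this work. The integer weight levels, the two-weight state, the ±1 step actions, the negative and zero reward cases, the hyperparameters, and the `toy_evaluate` function (a hypothetical stand-in for a WirelessHART simulator run) are all assumptions made for illustration; only the positive-reward condition comes from the text.

```python
import random
from collections import defaultdict

MAX_LEVEL = 10  # assumption: each routing weight is an integer level in 0..MAX_LEVEL


def make_actions(n_weights):
    """Assumed action set: raise or lower one weight by one level."""
    return [(i, d) for i in range(n_weights) for d in (-1, 1)]


def apply_action(state, action):
    """Return the new weight set (state) after an action, clamped to the valid range."""
    index, delta = action
    weights = list(state)
    weights[index] = min(MAX_LEVEL, max(0, weights[index] + delta))
    return tuple(weights)


def reward(prev, curr):
    """Positive when lifetime increases and latency decreases (as in the text).

    The -1 and 0 cases are assumptions; the abstract only defines the positive case.
    """
    (prev_life, prev_lat), (life, lat) = prev, curr
    if life > prev_life and lat < prev_lat:
        return 1.0
    if life < prev_life and lat > prev_lat:
        return -1.0
    return 0.0


def q_update(q, state, action, r, next_state, actions, alpha, gamma):
    """Standard Q-Learning update on a table keyed by (state, action)."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


def toy_evaluate(state):
    """Hypothetical stand-in for a simulator run: returns (lifetime, latency)."""
    distance = abs(state[0] - 7) + abs(state[1] - 3)
    return 100.0 - distance, 10.0 + distance


def train(evaluate, start, episodes=200, steps=20,
          alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy training loop; ties between equal Q-values are broken randomly."""
    rng = random.Random(seed)
    q = defaultdict(float)
    actions = make_actions(len(start))
    for _ in range(episodes):
        state, metrics = start, evaluate(start)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: (q[(state, a)], rng.random()))
            next_state = apply_action(state, action)
            next_metrics = evaluate(next_state)
            r = reward(metrics, next_metrics)
            q_update(q, state, action, r, next_state, actions, alpha, gamma)
            state, metrics = next_state, next_metrics
    return q


if __name__ == "__main__":
    table = train(toy_evaluate, start=(5, 5))
    print(len(table), "state-action entries learned")
```

In a real deployment the `evaluate` callback would be replaced by measurements of expected network lifetime and average latency obtained after the routing algorithm is re-run with the candidate weights.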