Argumentation-based dialogue systems, which can handle and exchange arguments through dialogue, have been widely researched. To argue their claims rationally, these systems need sufficient supporting information; in realistic situations, however, they often lack it. One way to fill this gap is to acquire the missing information from dialogue partners (information-seeking dialogue). Existing information-seeking dialogue systems are based on handcrafted dialogue strategies that exhaustively examine missing information. However, these strategies are not specialized for collecting the information needed to construct rational arguments. Moreover, the number of candidate inquiries grows with the size of the argument set that the system deals with. In this paper, we formalize the process of information-seeking dialogue as a Markov decision process (MDP) and apply deep reinforcement learning (DRL) to optimize the dialogue strategy automatically. With DRL, our dialogue strategy successfully minimizes the objective function, namely the number of turns the system needs to collect the necessary information in a dialogue. We also propose a second dialogue strategy optimization that takes into account whether the dialogue partner possesses the relevant knowledge, modeling the partner's knowledge with a Bernoulli mixture distribution. We conducted dialogue experiments using two datasets from different domains of argumentative dialogue. The experimental results show that the proposed dialogue strategy optimization outperforms existing heuristic dialogue strategies.
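
As a rough illustration of how a Bernoulli mixture distribution can model a dialogue partner's knowledge, the sketch below updates a posterior over mixture components from the answers observed so far and then estimates the probability that the partner knows a not-yet-queried item. It is a minimal sketch under stated assumptions, not the method used in this paper: the function names `posterior_over_components` and `prob_knows`, the two-component mixture, and all numeric parameters are hypothetical, and the mixture parameters are assumed to be given rather than learned.

```python
import math


def posterior_over_components(weights, probs, observed):
    """Posterior over mixture components given observed answers.

    weights:  mixing weights, one per component (assumed to sum to 1).
    probs:    probs[k][i] is the probability that a partner of type k
              knows item i (assumed strictly between 0 and 1).
    observed: dict mapping item index -> 1 (known) or 0 (not known).
    """
    log_post = []
    for w, p in zip(weights, probs):
        ll = math.log(w)
        for i, x in observed.items():
            ll += math.log(p[i] if x == 1 else 1.0 - p[i])
        log_post.append(ll)
    # Normalize in log space for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(l - m) for l in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def prob_knows(weights, probs, observed, item):
    """Predictive probability that the partner knows `item`."""
    post = posterior_over_components(weights, probs, observed)
    return sum(r * p[item] for r, p in zip(post, probs))


# Hypothetical parameters: two partner types, three knowledge items.
weights = [0.5, 0.5]
probs = [
    [0.9, 0.8, 0.1],  # type 0 tends to know items 0 and 1
    [0.2, 0.1, 0.7],  # type 1 tends to know item 2
]

# The partner has confirmed knowing item 0; estimate item 1.
estimate = prob_knows(weights, probs, {0: 1}, item=1)
```

Under these assumed parameters, observing that the partner knows item 0 shifts the posterior toward the first component, so the estimate for item 1 rises above its prior value. A strategy could use such estimates to prefer inquiries the partner is likely to be able to answer.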