Towards self-learning control of HVAC systems with the consideration of dynamic occupancy patterns: Application of model-free deep reinforcement learning

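Keywords: Occupancy; HVAC; Reinforcement learning; Computer science; Reinforcement; Control (management); Engineering; Artificial intelligence; Control engineering; Architectural engineering; Air conditioning; Structural engineering; Mechanical engineering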

Authors

Mohammad Esrafilian-Najafabadi,Fariborz Haghighat

Source

Journal: Building and Environment [Elsevier BV]
Date: 2022-11-01 Volume/Issue: 226: 109747-109747 Citations: 1

Identifiers

DOI: 10.1016/j.buildenv.2022.109747

Abstract

This study proposes a self-learning control system that aims to learn occupancy profiles, building energy consumption patterns, and the lag time of heating, ventilation, and air-conditioning (HVAC) systems. The control system learns by interacting with the environment, with no need to develop building models or occupancy prediction models. The controller is based on the double deep Q-network (DDQN) algorithm, a model-free reinforcement learning method. The system's performance is evaluated and compared with that of a model predictive control (MPC) system under two scenarios, perfect and actual occupancy predictions, based on occupancy data collected from 20 residential units. The MPC is assisted by a genetic algorithm and supervised learning models for predicting future occupancy patterns, indoor operative temperature, and building energy consumption. The results show that with perfect occupancy prediction, the self-learning controller operates almost as well as the MPC while not requiring any models. When occupancy prediction uncertainty is added to the problem, the proposed method outperforms the MPC in terms of thermal comfort, reducing the average temperature deviation and the deviation period by 0.24 °C and 7.87%, respectively. However, the DDQN agent causes significant thermal comfort violations during the initial training period: the system causes up to a 2.8% longer deviation period and a 0.32 °C higher average temperature deviation than the fully trained system.

• A self-learning occupancy-based predictive control system is developed.
• Double deep Q-network is utilized as a model-free reinforcement learning technique.
• The performance is compared with that of a model predictive control.
• Thermal comfort is improved by 7.87% with no need for occupancy and building models.
• Trial-and-error-based learning process causes almost 2.8% thermal discomfort.
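The abstract names the double deep Q-network (DDQN) algorithm but does not give its update rule. As a minimal sketch of the generic Double DQN bootstrap target (not the authors' implementation), the online network selects the next action and the target network evaluates it. The function name `ddqn_target`, the discount factor, and the example Q-values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def ddqn_target(reward: float,
                next_q_online: Sequence[float],
                next_q_target: Sequence[float],
                gamma: float = 0.99,
                done: bool = False) -> float:
    """Generic Double DQN target for one transition (illustrative only).

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Step 1: the online network picks the greedy next action.
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # Step 2: the target network supplies the value of that action.
    return reward + gamma * next_q_target[best_action]


# Hypothetical numbers: three discrete actions (e.g. setpoint choices).
y = ddqn_target(reward=-1.0,
                next_q_online=[0.2, 0.9, 0.5],
                next_q_target=[0.4, 0.3, 0.8],
                gamma=0.9)
print(y)  # approximately -0.73
```

In this example a standard DQN target would use the maximum target-network value (0.8) and give roughly -0.28; decoupling action selection from evaluation is what Double DQN uses to reduce overestimation of Q-values.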
