Host Intrusion Detection Systems (HIDS) are an important research topic in cyberspace security. With the sharp rise in malicious attacks in recent years, machine learning-based detection has become the most common and efficient approach. However, traditional centralized machine learning must transmit data to a central server for training, which not only requires the server to have substantial computing resources but also raises problems such as sensitive data leakage and communication overhead. As a distributed machine learning paradigm, Federated Learning (FL) enables multi-party collaborative training and aggregates a unified global model without sharing raw data, which alleviates these problems. Notably, existing studies on FL in HIDS are all conducted under the assumption that the data is independent and identically distributed (IID). In practice, however, because hosts operate in different contexts, the data they generate is usually non-independent and identically distributed (Non-IID). We therefore investigate the impact of Non-IID data with different skew levels on FL in HIDS. On this basis, we propose a data augmentation FL algorithm based on the Synthetic Minority Over-sampling Technique (SMOTE) to reduce the impact of Non-IID data. We also develop a data collection module using extended Berkeley Packet Filter (eBPF) technology to collect a dataset for our experiments. Experimental results show that the proposed FL algorithm effectively improves the performance of HIDS under Non-IID data.
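
To make the idea of SMOTE-based augmentation in FL concrete, the following is a minimal illustrative sketch, not the algorithm evaluated in this work. It assumes that each client balances its own label-skewed data with SMOTE-style interpolation before local training, and that the server combines client models with sample-size-weighted averaging (FedAvg). The function names `smote_oversample`, `local_train`, and `fedavg`, the logistic-regression local model, and all hyperparameters are hypothetical choices made only for illustration.

```python
import numpy as np

def smote_oversample(X, y, k=3, rng=None):
    """SMOTE-style balancing: for every non-majority class, create synthetic
    points by interpolating between a sample and one of its k nearest
    same-class neighbours until all classes match the majority count."""
    rng = np.random.default_rng() if rng is None else rng
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c, n in zip(classes, counts):
        need = int(target - n)
        if need == 0 or n < 2:
            continue  # already balanced, or too few points to interpolate
        Xc = X[y == c]
        # Pairwise Euclidean distances within the class (fine for small data).
        d = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
        kk = min(k, int(n) - 1)
        nn = np.argsort(d, axis=1)[:, :kk]
        base = rng.integers(0, int(n), size=need)
        neigh = nn[base, rng.integers(0, kk, size=need)]
        gap = rng.random((need, 1))  # interpolation factor in [0, 1)
        X_parts.append(Xc[base] + gap * (Xc[neigh] - Xc[base]))
        y_parts.append(np.full(need, c, dtype=y.dtype))
    return np.vstack(X_parts), np.concatenate(y_parts)

def local_train(w, X, y, lr=0.1, epochs=20):
    """Assumed local model: binary logistic regression trained by gradient descent."""
    w = w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def fedavg(client_weights, client_sizes):
    """Server step: average client models weighted by local sample counts."""
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_weights), axis=0, weights=sizes)
```

Under these assumptions, one communication round would have each client call `smote_oversample` on its local data, run `local_train` starting from the current global weights, and return the result to the server, which then applies `fedavg`. Only model weights leave the client; the raw and synthetic samples stay local.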