Authors
Francesco Musumeci, Virajit Garbhapu Venkata, Yusuke Hirota, Yoshinari Awaji, Sugang Xu, Masaki Shiraiwa, Biswanath Mukherjee, Massimo Tornatore
Abstract
Optical network failure management (ONFM) is a promising application of machine learning (ML) to optical networking. Typical ML-based ONFM approaches exploit historical monitored data, retrieved in a specific domain (e.g., a link or a network), to train supervised ML models and learn failure characteristics (a signature) that will be helpful when failures later occur in that domain. Unfortunately, in operational networks, data availability often limits the practical deployment of ML-based ONFM solutions, because labeled data that comprehensively cover all possible failure types are scarce. One could purposely inject failures to collect training data, but this is time-consuming and undesirable for operators. A possible solution is transfer learning (TL), i.e., training ML models on a source domain (SD), e.g., a laboratory testbed, and then deploying the trained models on a target domain (TD), e.g., an operator network, possibly fine-tuning the learned models by re-training with a small amount of TD data. When TL re-training is not successful (e.g., due to intrinsic differences between the SD and the TD), another solution is domain adaptation, which consists of combining unlabeled SD and TD data before model training. We investigate domain adaptation and TL for failure detection and failure-cause identification across different lightpaths, leveraging real optical SNR data. We find that, for the considered scenarios, domain adaptation increases failure-detection accuracy by up to 20 percentage points, while for failure-cause identification only the combination of domain adaptation and model re-training provides a significant benefit, reaching an accuracy increase of 4–5 percentage points in the considered cases.
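The difference between deploying an SD-trained model directly on the TD and first aligning the two domains with unlabeled data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method or data of the paper: the samples are synthetic two-feature points rather than real optical SNR measurements, the domain shift is a constant offset, the classifier is a nearest-centroid rule, and the adaptation step is per-domain standardization, a simple form of feature alignment that uses only unlabeled feature statistics. The names `make_domain`, `fit_centroids`, `predict`, and `standardize` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(n, offset):
    """Synthetic two-feature samples (toy data, not real OSNR); label 1 = failure, 0 = normal."""
    y = rng.integers(0, 2, size=n)
    centers = np.where(y[:, None] == 1, 2.0, -2.0)  # class-dependent cluster center
    x = centers + rng.normal(0.0, 0.5, size=(n, 2))
    return x + offset, y  # the offset models the shift between domains

def fit_centroids(x, y):
    """Nearest-centroid model: one mean feature vector per class."""
    return np.stack([x[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, x):
    """Assign each sample to the class of its closest centroid."""
    d = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def standardize(x):
    """Align a domain using only its own unlabeled feature statistics."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Source domain (e.g., a testbed) and shifted target domain (e.g., an operator network).
x_sd, y_sd = make_domain(500, offset=0.0)
x_td, y_td = make_domain(500, offset=5.0)

# Direct transfer: train on SD, deploy unchanged on TD.
model = fit_centroids(x_sd, y_sd)
acc_plain = (predict(model, x_td) == y_td).mean()

# Adaptation (assumed technique): standardize each domain before training and inference.
# TD labels are used only to score the result, never to fit anything.
model_da = fit_centroids(standardize(x_sd), y_sd)
acc_da = (predict(model_da, standardize(x_td)) == y_td).mean()

print(f"direct transfer: {acc_plain:.2f}, with alignment: {acc_da:.2f}")
```

In this toy setting the offset pushes all TD samples toward one SD centroid, so direct transfer degrades to roughly chance level, while alignment restores the class separation. Real lightpath data would not necessarily behave this cleanly, and fine-tuning with a few labeled TD samples, as described in the abstract, is not covered by this sketch.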