Computer science
Robustness (evolution)
Discretization
Mathematical optimization
Gradient descent
Normalization (linguistics)
Algorithm
Artificial intelligence
Mathematics
Mathematical analysis
Biochemistry
Chemistry
Artificial neural network
Gene
Authors
Jiuling Zhang,Zhiming Ding
Identifier
DOI:10.1109/icdm51629.2021.00099
Abstract
Differentiable architecture search (DARTS) is a promising end-to-end NAS method which directly optimizes the architecture parameters through general gradient descent. However, DARTS is brittle to the catastrophic failure incurred by the skip connection in the search space. Recent studies also cast doubt on the basic underlying hypotheses of DARTS, which are argued to be inherently prone to the performance discrepancy between the continuously relaxed supernet in the training phase and the discretized finalnet in the evaluation phase. We find that both the robustness problem and the skepticism can be explained by the information bypass leakage during the training of the supernet. This naturally highlights the vital role of the sparsity of architecture parameters in the training phase, which has not been well developed in the past. We thus propose a novel sparse-regularized approximation and an efficient mixed-sparsity training scheme to robustify DARTS by eliminating the information bypass leakage. We subsequently conduct extensive experiments on multiple search spaces to demonstrate the effectiveness of our method.
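For context, the sketch below illustrates the continuous relaxation the abstract refers to: each edge of the supernet computes a softmax-weighted mixture of candidate operations, and the architecture parameters are trained by ordinary gradient descent. The `MixedOp` class, the candidate operation set, and the simple sparsity penalty are illustrative assumptions only; they show how a sparsity term can push the relaxed mixture toward a single operation, but they do not reproduce the paper's sparse-regularized approximation or its mixed-sparsity training scheme.

```python
# Minimal sketch (not the authors' implementation) of a DARTS-style mixed
# operation with a toy sparsity penalty on the architecture weights.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedOp(nn.Module):
    """Softmax-weighted mixture over a fixed set of candidate operations."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),                                # skip connection
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 convolution
            nn.AvgPool2d(3, stride=1, padding=1),         # average pooling
        ])
        # One architecture parameter (alpha) per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def sparsity_penalty(self) -> torch.Tensor:
        # Hypothetical penalty: equals 1 - max(weights), so it shrinks as the
        # mixture concentrates on a single operation, reducing the gap between
        # the relaxed supernet and a discretized network.
        weights = F.softmax(self.alpha, dim=0)
        return weights.sum() - weights.max()


if __name__ == "__main__":
    op = MixedOp(channels=8)
    x = torch.randn(2, 8, 16, 16)
    target = torch.randn(2, 8, 16, 16)
    optimizer = torch.optim.Adam(op.parameters(), lr=1e-3)
    for _ in range(5):
        optimizer.zero_grad()
        loss = F.mse_loss(op(x), target) + 0.1 * op.sparsity_penalty()
        loss.backward()
        optimizer.step()
    print("architecture weights:", F.softmax(op.alpha, dim=0).detach())
```

Without a sparsity term of some kind, all candidate operations (including the skip connection) keep contributing to every edge during training, which is the information bypass leakage the abstract identifies; the paper's contribution concerns how to enforce such sparsity efficiently during supernet training.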