Concepts
Information bottleneck method, Bottleneck, Generalization, Computer science, Representation (politics), Artificial neural network, Limit (mathematics), Layer (electronics), Information theory, Deep learning, Mutual information, Theoretical computer science, Algorithm, Artificial intelligence, Mathematics, Mathematical analysis, Law, Chemistry, Organic chemistry, Embedded system, Statistics, Politics, Political science
Authors
Naftali Tishby, Noga Zaslavsky
Identifiers
DOI: 10.1109/itw.2015.7133169
Abstract
Deep Neural Networks (DNNs) are analyzed via the theoretical framework of the information bottleneck (IB) principle. We first show that any DNN can be quantified by the mutual information between its layers and the input and output variables. Using this representation we can calculate the optimal information-theoretic limits of the DNN and obtain finite-sample generalization bounds. The advantage of getting closer to the theoretical limit is quantifiable both by the generalization bound and by the network's simplicity. We argue that the optimal architecture, i.e., the number of layers and the features/connections at each layer, is related to the bifurcation points of the information bottleneck tradeoff, namely the relevant compression of the input layer with respect to the output layer. The hierarchical representations in the layered network naturally correspond to the structural phase transitions along the information curve. We believe that this new insight can lead to new optimality bounds and deep learning algorithms.
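For reference, the IB tradeoff invoked in the abstract is the standard variational problem of Tishby, Pereira, and Bialek; the objective written below is that standard formulation, not something derived in this particular paper. Here X is the input, Y the target output, T a compressed internal representation (such as a hidden layer), and beta the multiplier trading compression against prediction:

\min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y)

The abstract's claim that a DNN "can be quantified by the mutual information between its layers and the input and output variables" can be made concrete with a simple plug-in estimator. The sketch below uses equal-width binning, which is an illustrative assumption rather than the paper's prescription; the function and variable names are hypothetical.

```python
import numpy as np

def binned_mutual_information(x, t, n_bins=30):
    """Plug-in estimate of I(X; T) in bits from paired samples, using
    equal-width binning. Illustrative only: the paper does not prescribe
    an estimator, and binning choices strongly affect the result."""
    joint, _, _ = np.histogram2d(x, t, bins=n_bins)
    p_xt = joint / joint.sum()             # empirical joint distribution p(x, t)
    p_x = p_xt.sum(axis=1, keepdims=True)  # marginal p(x)
    p_t = p_xt.sum(axis=0, keepdims=True)  # marginal p(t)
    nz = p_xt > 0                          # skip zero cells to avoid log(0)
    return float(np.sum(p_xt[nz] * np.log2(p_xt[nz] / (p_x * p_t)[nz])))

# Toy example: a scalar input X and a noisy nonlinear "layer activation" T.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
t = np.tanh(2.0 * x) + 0.1 * rng.normal(size=10_000)
print(f"I(X;T) ~ {binned_mutual_information(x, t):.2f} bits")
```

A kernel-density or nearest-neighbor estimator could replace the binning; the point is only that the information a layer retains about the input (and, analogously, about the output) is a computable quantity that can be placed on the information curve the abstract describes.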