Architectural Complexity Measures of Recurrent Neural Networks
In this paper, we systematically analyze the connecting architectures of
recurrent neural networks (RNNs). Our main contribution is twofold: first, we
present a rigorous graph-theoretic framework describing the connecting
architectures of RNNs in general; second, we propose three architecture
complexity measures of RNNs: (a) the recurrent depth, which captures the RNN's
over-time nonlinear complexity, (b) the feedforward depth, which captures the
local input-output nonlinearity (similar to the "depth" in feedforward neural
networks (FNNs)), and (c) the recurrent skip coefficient, which captures how
rapidly the information propagates over time. We rigorously prove each
measure's existence and computability. Our experimental results show that RNNs
might benefit from larger recurrent depth and feedforward depth. We further
demonstrate that increasing the recurrent skip coefficient offers performance
boosts on long-term dependency problems.
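To make the two recurrent measures concrete, the following is a minimal sketch on a toy architecture graph. It assumes (the abstract does not spell this out) that the recurrent depth can be read off as the maximum, over directed cycles of the graph, of the number of nonlinear transformations per unit of time delay, and that the recurrent skip coefficient is the maximum total time delay per edge over those cycles. The graph encoding, function names, and the toy example are illustrative only.

```python
"""Hedged sketch: cycle-ratio reading of two of the proposed measures.

Assumption (not stated in the abstract):
  recurrent depth          ~ max over simple cycles of  #edges / total delay
  recurrent skip coefficient ~ max over simple cycles of total delay / #edges
An edge is (source, target, delay), where delay is the number of time steps
the connection spans.
"""

def simple_cycles(nodes, edges):
    """Yield simple directed cycles as lists of edges (parallel edges allowed)."""
    out = {v: [] for v in nodes}
    for e in edges:
        out[e[0]].append(e)
    order = {v: i for i, v in enumerate(nodes)}  # enumerate each cycle once,
                                                 # from its lowest-ordered node
    def walk(start, current, path, visited):
        for e in out[current]:
            _, dst, _ = e
            if dst == start:
                yield path + [e]
            elif dst not in visited and order[dst] > order[start]:
                yield from walk(start, dst, path + [e], visited | {dst})

    for v in nodes:
        yield from walk(v, v, [], {v})

def recurrent_depth(nodes, edges):
    return max(len(c) / sum(d for _, _, d in c)
               for c in simple_cycles(nodes, edges))

def recurrent_skip_coefficient(nodes, edges):
    return max(sum(d for _, _, d in c) / len(c)
               for c in simple_cycles(nodes, edges))

# Toy example: one hidden state with the usual delay-1 recurrence plus an
# extra skip connection spanning 3 time steps.
nodes = ["h"]
edges = [("h", "h", 1), ("h", "h", 3)]
print(recurrent_depth(nodes, edges))             # 1.0  (delay-1 self-loop)
print(recurrent_skip_coefficient(nodes, edges))  # 3.0  (delay-3 skip connection)
```

Under this reading, adding a longer skip connection raises the skip coefficient without changing the recurrent depth, which matches the abstract's claim that the two measures capture different properties: over-time nonlinear complexity versus how quickly information can propagate across time steps.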
Authors
Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio