Poster
On the Provable Generalization of Recurrent Neural Networks
Lifu Wang · Bo Shen · Bo Hu · Xing Cao
Keywords: [ Deep Learning ] [ Theory ]
Abstract:
The Recurrent Neural Network (RNN) is a fundamental structure in deep learning. Recently, several works have studied the training process of over-parameterized neural networks and shown that over-parameterized networks can learn functions in some notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization and provide the following improvements over recent works:
(1) For an RNN with input sequence $x = (X_1, X_2, \ldots, X_L)$, previous works study learning functions that are summations of $f(\beta_l^{\top} X_l)$ and require the normalization condition $\|X_l\| \le \epsilon$ for some very small $\epsilon$ depending on the complexity of $f$. In this paper, using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without the normalization condition, and we show that some notable concept classes are learnable with the number of iterations and samples scaling almost-polynomially in the input length $L$.
(2) Moreover, we prove a novel result for learning $N$-variable functions of the input sequence of the form $f(\beta^{\top}[X_{l_1}, \ldots, X_{l_N}])$, which do not belong to the "additive" concept class, i.e., the summation of functions $f(X_l)$. We show that when either $N$ or $l_0 = \max(l_1, \ldots, l_N) - \min(l_1, \ldots, l_N)$ is small, $f(\beta^{\top}[X_{l_1}, \ldots, X_{l_N}])$ is learnable with the number of iterations and samples scaling almost-polynomially in the input length $L$.
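To illustrate how the second concept class extends the first, here is a minimal worked example (ours, not taken from the paper), taking $N = 2$, $l_1 = 1$, $l_2 = 2$, $\beta = [\beta_1, \beta_2]$ the concatenation of two vectors, and $f(z) = z^2$:

$$ f\big(\beta^{\top}[X_1, X_2]\big) = \big(\beta_1^{\top} X_1 + \beta_2^{\top} X_2\big)^2 = \big(\beta_1^{\top} X_1\big)^2 + \big(\beta_2^{\top} X_2\big)^2 + 2\,\big(\beta_1^{\top} X_1\big)\big(\beta_2^{\top} X_2\big). $$

The cross term couples positions 1 and 2, so this target cannot be written as a per-position summation $\sum_l f_l(\beta_l^{\top} X_l)$ from the additive class in (1).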