NeurIPS Poster Error Correction Output Codes for Robust Neural Networks against Weight-errors: A Neural Tangent Kernel Point of View

Anlan Yu · Shusen Jing · Ning Lyu · Wujie Wen · Zhiyuan Yan

West Ballroom A-D #6901

Thu 12 Dec 11 a.m. PST — 2 p.m. PST

Abstract: Error correcting output code (ECOC) is a classic method that encodes binary classifiers to tackle the multi-class classification problem in decision trees and neural networks. Among ECOCs, the one-hot code has become the default choice in modern deep neural networks (DNNs) due to its simplicity in decision making. However, it is significantly limited in its ability to achieve high robust accuracy, particularly in the presence of weight errors. While recent studies have experimentally demonstrated that non-one-hot ECOCs with multi-bit error correction ability could be a better solution, there is a notable absence of theoretical foundations that elucidate the relationship between codeword design, weight-error magnitude, and network characteristics, so as to provide robustness guarantees. This work aims to bridge this gap through the lens of the neural tangent kernel (NTK). We have two important theoretical findings: 1) In clean models (without weight errors), using the one-hot code versus a non-one-hot ECOC is akin to changing the decoding metric from $l_2$ distance to Mahalanobis distance. 2) In non-clean models (with weight errors), if the normalized distance exceeds a threshold, then non-clean DNNs can reach the clean model's accuracy as long as the code length approaches infinity. This threshold is determined by the DNN architecture (e.g., number of layers, activation), the weight-error magnitude, and the distance between the output and the nearest codeword. Based on these findings, we further demonstrate how to use them in practice to identify optimal ECOCs for simple tasks (short-code ECOCs) and complex tasks (long-code ECOCs) by balancing code orthogonality (as per finding 1) and code distance (as per finding 2). Extensive experimental results across four datasets and four DNN models validate the superior performance of the constructed codes, guided by our findings, compared to existing ECOCs. To the best of our knowledge, this is the first work that provides theoretical explanations for the effectiveness of ECOCs and offers associated design guidance for optimal ECOCs specifically tailored to DNNs.
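The role of code distance mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code construction or decoder. It assumes bipolar (±1) codewords, uses a Sylvester Hadamard code purely as an example of a non-one-hot ECOC, and decodes by nearest codeword in squared $l_2$ distance. All function names are hypothetical.

```python
from itertools import combinations


def one_hot_code(num_classes):
    """Bipolar one-hot codebook: +1 at the class index, -1 elsewhere."""
    return [[1 if j == i else -1 for j in range(num_classes)]
            for i in range(num_classes)]


def hadamard_code(order):
    """Rows of a 2^order x 2^order Sylvester Hadamard matrix (illustrative ECOC)."""
    h = [[1]]
    for _ in range(order):
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def min_hamming_distance(codebook):
    """Smallest number of differing positions between any two codewords."""
    return min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(codebook, 2))


def decode_l2(output, codebook):
    """Index of the codeword nearest to `output` in squared l2 distance."""
    def sq_dist(codeword):
        return sum((o - c) ** 2 for o, c in zip(output, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


# Eight classes, eight output bits in both cases.
one_hot = one_hot_code(8)
hadamard = hadamard_code(3)

# Simulate a single sign error on the output for class 3.
noisy_hadamard = list(hadamard[3])
noisy_hadamard[0] = -noisy_hadamard[0]

noisy_one_hot = list(one_hot[3])
noisy_one_hot[3] = -noisy_one_hot[3]  # the "hot" bit is flipped

print(min_hamming_distance(one_hot), min_hamming_distance(hadamard))
print(decode_l2(noisy_hadamard, hadamard), decode_l2(noisy_one_hot, one_hot))
```

In this toy setting the larger minimum distance of the Hadamard code lets the decoder recover class 3 after one flipped bit, whereas the corrupted one-hot output is equidistant from every codeword and the decoder falls back to the first index. The Mahalanobis-distance view from finding 1 is not modeled here.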