Workshop
Tue Dec 14 07:00 AM -- 03:35 PM (PST)
2nd Workshop on Self-Supervised Learning: Theory and Practice
Pengtao Xie · Ishan Misra · Pulkit Agrawal · Abdelrahman Mohamed · Shentong Mo · Youwei Liang · Jeannette Bohg · Kristina N Toutanova
Self-supervised learning (SSL) is an unsupervised approach to representation learning that does not rely on human-provided labels. It creates auxiliary tasks on unlabeled input data and learns representations by solving these tasks. SSL has demonstrated great success on images (e.g., MoCo [19], PIRL [9], SimCLR [20]) and text (e.g., BERT [21]), and has shown promising results in other data modalities, including graphs, time series, and audio. On a wide variety of tasks, SSL achieves performance close to that of fully supervised approaches without using human-provided labels.

Existing SSL research mostly focuses on improving empirical performance without a theoretical foundation. While the proposed SSL approaches are empirically effective, it is not clear theoretically why they perform well. For example, why do certain auxiliary tasks in SSL perform better than others? How many unlabeled examples does SSL need to learn a good representation? How is the performance of SSL affected by the neural architecture?

In this workshop, we aim to bridge this gap between theory and practice. We bring together SSL-interested researchers from various domains to discuss the theoretical foundations of empirically well-performing SSL approaches and how theoretical insights can further improve SSL’s empirical performance. Unlike previous SSL-related workshops, which focus on the empirical effectiveness of SSL approaches without considering their theoretical foundations, our workshop focuses on establishing the theoretical foundation of SSL and providing theoretical insights for developing new SSL approaches.