

Long Presentation in Affinity Workshop: LXAI Research @ NeurIPS 2020

Model Misspecification in Multiple Weak Supervision

Salva Rühling Cachay


Abstract:

Data programming has proven to be an attractive alternative to costly hand-labeling of data. In this paradigm, users encode domain knowledge into labeling functions (LFs), heuristics that noisily label a subset of the data and may have complex dependencies. The effects of label model misspecification on the test set performance of a downstream classifier are understudied. This presents a serious knowledge gap to practitioners, in particular since LF dependencies are frequently ignored. In this paper, we focus on modeling errors due to structure over-specification. Based on novel theoretical bounds on the modeling error, we empirically show that this error can be substantial, even when modeling a seemingly sensible structure.
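
For readers new to the paradigm, the following is a minimal sketch of what labeling functions might look like. Every name in it (`ABSTAIN`, the three `lf_*` functions, `apply_lfs`, `majority_vote`) is hypothetical and chosen for illustration only. The majority vote is a simple stand-in for a label model, not the model studied in the paper. Two of the LFs are deliberately correlated to hint at the kind of dependency the abstract refers to.

```python
from collections import Counter

ABSTAIN = -1  # value returned when a labeling function does not fire


def lf_contains_refund(text):
    # Hypothetical heuristic: mentions of "refund" suggest class 1.
    return 1 if "refund" in text.lower() else ABSTAIN


def lf_contains_money_back(text):
    # Hypothetical heuristic, likely correlated with lf_contains_refund.
    return 1 if "money back" in text.lower() else ABSTAIN


def lf_contains_thanks(text):
    # Hypothetical heuristic: "thanks" suggests class 0.
    return 0 if "thanks" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_refund, lf_contains_money_back, lf_contains_thanks]


def apply_lfs(texts):
    """Return one row of noisy votes per text, one column per LF."""
    return [[lf(text) for lf in LABELING_FUNCTIONS] for text in texts]


def majority_vote(votes):
    """Combine votes while treating all LFs as independent (an assumption)."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN  # no LF fired on this example
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tie between classes
    return ranked[0][0]
```

Because the two refund-related LFs tend to fire together, a combiner that treats them as independent effectively counts the same evidence twice. Which dependencies a label model should or should not include is the structure question the paper examines.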
