

Poster in Workshop: Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations

Is EMA Robust? Examining the Robustness of Data Auditing and a Novel Non-calibration Extension

Ayush Alag · Yangsibo Huang · Kai Li


Abstract: Auditing data usage in machine learning models is crucial for regulatory compliance, especially with sensitive data such as medical records. In this study, we scrutinize potential vulnerabilities in an established baseline method, Ensembled Membership Auditing (EMA), which employs membership inference attacks to determine whether a specific model was trained on a particular dataset. We discover a novel False Negative Error Pattern in EMA when it is applied to large datasets under adversarial methods such as dropout, model pruning, and MemGuard. Our analysis across three datasets shows that larger convolutional models pose a greater challenge for EMA, but a novel metric-set analysis improves performance by up to $5\%$. To extend the applicability of our improvements, we introduce EMA-Zero, a GAN-based dataset auditing method that does not require an external calibration dataset. Notably, EMA-Zero performs comparably to EMA when its synthetic calibration data is generated from as few as 100 samples.
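To make the auditing setup concrete, the sketch below outlines a simplified EMA-style dataset audit: per-sample membership metrics computed on the query dataset are compared against those computed on a calibration dataset the model has not seen, and a statistical test decides whether the query data was likely used for training. This is an illustrative sketch rather than the authors' exact procedure; the single confidence metric, the one-sided t-test, and the function names (e.g. `ema_style_audit`) are assumptions made for brevity, whereas EMA ensembles several membership metrics, and EMA-Zero would replace the external calibration set with GAN-generated samples.

```python
import numpy as np
from scipy import stats


def membership_scores(true_class_confidences):
    # Placeholder per-sample membership metric: the model's confidence on the
    # true label. EMA ensembles several such metrics (confidence, entropy,
    # loss, ...); a single one is used here for brevity.
    return np.asarray(true_class_confidences, dtype=float)


def ema_style_audit(query_scores, calibration_scores, alpha=0.05):
    """Illustrative dataset-level audit (not the authors' exact procedure).

    Compares membership-metric scores of the query dataset against those of a
    calibration dataset known to be unseen by the model. If the query scores
    are significantly higher, the audit flags the dataset as likely used in
    training.
    """
    # One-sided Welch t-test: are query scores larger than calibration scores?
    t_stat, p_value = stats.ttest_ind(
        query_scores, calibration_scores, equal_var=False, alternative="greater"
    )
    used_in_training = p_value < alpha
    return used_in_training, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical per-sample confidences: query data the model may have
    # memorized vs. calibration data it has never seen.
    query = membership_scores(rng.normal(0.9, 0.05, size=200).clip(0, 1))
    calibration = membership_scores(rng.normal(0.7, 0.10, size=200).clip(0, 1))
    flagged, p = ema_style_audit(query, calibration)
    print(f"dataset flagged as used in training: {flagged} (p={p:.4f})")
```

In a calibration-free variant along the lines of EMA-Zero, `calibration_scores` would instead be computed on samples drawn from a generative model fit to a small amount of data, removing the need for an external calibration dataset.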
