The need for deceptive news detection on social media platforms has grown significantly in recent years. In 2018, roughly half of consumers expected the news they received on social media to be largely inaccurate, and surveys from 2021 revealed that 57% of adults would like to see steps taken to restrict the spread of false information online. Although several detection approaches have been proposed, most rely on standard performance metrics and test data sets that do not effectively capture underlying biases or model dependencies: they answer how the model performs, but not why or under what circumstances it will perform that way. Recent work has also shown that models display biases towards certain dialects, e.g., "California English". Standard test data used to evaluate model performance is unlikely to be representative across the variations, such as regional or dialectal differences in language, that a model will encounter when deployed in a real-world setting.