
Workshop
NeurIPS 2022 Workshop Virtual Proposal: HCAI@NeurIPS 2022, Human Centered AI
Michael Muller · Plamen P Angelov · Hal Daumé III · Shion Guha · Q.Vera Liao · Nuria Oliver · David Piorkowski

Fri Dec 09 05:00 AM -- 12:00 PM (PST) @ Virtual

Fri 5:00 a.m. - 5:15 a.m.  Welcome (Introduction)
Fri 5:15 a.m. - 5:30 a.m.  Keynote 1: Living with AI. Joonhwan Lee, Seoul National University, South Korea. (Keynote)
Fri 5:30 a.m. - 5:35 a.m.  Discussion
Fri 5:34 a.m. - 5:35 a.m.  Panel 1: Explainable AI (XAI) (Panel)
Fri 5:35 a.m. - 5:40 a.m.  Closing the Creator-Consumer Gap in XAI: A Call for Participatory XAI Design with End-users. Sunnie S. Y. Kim, Elizabeth Anne Watkins, Olga Russakovsky, Ruth Fong, Andres Monroy-Hernandez (Panel Speaker)
Fri 5:40 a.m. - 5:45 a.m.  Rethinking Explainability as a Dialogue: A Practitioner's Perspective. Dylan Z Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh (Panel Speaker)
Fri 5:45 a.m. - 5:50 a.m.  A Thematic Comparison of Human and AI Explanations of Sexism Assessment. Sharon Ferguson, Paula Akemi Aoyagui, Rohan Alexander, Anastasia Kuzminykh (Panel Speaker)
Fri 5:50 a.m. - 5:55 a.m.  Discussion
Fri 5:55 a.m. - 6:10 a.m.  Keynote 2: Human-Centered Co-Creative AI: From Inspirational to Responsible AI. Mary Lou Maher, University of North Carolina Charlotte, US. (Keynote)
Fri 6:10 a.m. - 6:15 a.m.  Discussion
Fri 6:14 a.m. - 6:15 a.m.  Panel 2: Large Models (Panel)
Fri 6:15 a.m. - 6:20 a.m.  User and Technical Perspectives of Controllable Code Generation. Stephanie Houde, Vignesh Radhakrishna, Praneeth Reddy, Juie Darwade, Haoran Hu, Kalpesh Krishna, Mayank Agarwal, Kartik Talamadupula, Justin D. Weisz (Panel Speaker)
Fri 6:20 a.m. - 6:25 a.m.  Towards an Understanding of Human-AI Interaction in Prompt-Based Co-Creative Systems. Atefeh Mahdavi Goloujeh, Anne Sullivan, Brian Magerko (Panel Speaker)
Fri 6:25 a.m. - 6:30 a.m.  Towards End-User Prompt Engineering: Lessons From an LLM-based Chatbot Design Tool. J.D. Zamfirescu-Pereira, Richmond Wong, Bjorn Hartmann, Qian Yang (Panel Speaker)
Fri 6:30 a.m. - 6:35 a.m.  Discussion
Fri 6:35 a.m. - 6:45 a.m.  Short break (Break)
Fri 6:44 a.m. - 6:45 a.m.  Panel 3: Creativity + Collaboration (Panel)
Fri 6:45 a.m. - 6:50 a.m.  The Need for Explainability in AI-Based Creativity Support Tools. Antonios Liapis, Jichen Zhu (Panel Speaker)
Fri 6:50 a.m. - 6:55 a.m.  Quantitatively Assessing Explainability in Collaborative Computational Co-Creativity. Michael Paul Clemens, Rogelio Enrique Cardona-Rivera, Courtney Rogers (Panel Speaker)
Fri 6:55 a.m. - 7:00 a.m.  Embodied Socio-cognitive Reframing of Computational Co-Creativity. Manoj Deshpande, Brian Magerko (Panel Speaker)
Fri 7:00 a.m. - 7:05 a.m.  Discussion
Fri 7:05 a.m. - 7:20 a.m.  Keynote 3: Independent Community Rooted AI Research. Timnit Gebru, DAIR, US. (Keynote)
Fri 7:20 a.m. - 7:25 a.m.  Discussion
Fri 7:24 a.m. - 7:25 a.m.  Panel 4: Values + Participation (Panel)
Fri 7:25 a.m. - 7:30 a.m.  Beyond Safety: Toward a Value-Sensitive Approach to the Design of AI Systems. Alexander J. Fiannaca, Cynthia L. Bennett, Shaun Kane, Meredith Ringel Morris (Panel Speaker)
Fri 7:30 a.m. - 7:35 a.m.  Participation Interfaces for Human-Centered AI. Sean McGregor (Panel Speaker)
Fri 7:35 a.m. - 7:40 a.m.  Expansive Participatory AI: Supporting Dreaming within Inequitable Institutions. Shiran Dudy (Panel Speaker)
Fri 7:40 a.m. - 7:45 a.m.  Discussion
Fri 7:45 a.m. - 8:15 a.m.  Meal break (Break)
Fri 8:15 a.m. - 8:30 a.m.  Keynote 4: Designing AI Systems for Digital Well-Being. Asia Biega, Max Planck Institute for Security and Privacy (MPI-SP), Germany. (Keynote)
Fri 8:30 a.m. - 8:35 a.m.  Discussion
Fri 8:34 a.m. - 8:35 a.m.  Panel 5: Social Good + Human Wellbeing (Panel)
Fri 8:35 a.m. - 8:40 a.m.  Statelessness in Asylum Data – A Human-Centered Perspective on Outliers. Kristin Kaltenhauser, Naja Møller (Panel Speaker)
Fri 8:40 a.m. - 8:45 a.m.  Another Horizon for Human-Centered AI: An Inspiration to Live Well. Julian Posada (Panel Speaker)
Fri 8:45 a.m. - 8:50 a.m.  A Future for AI Governance Systems beyond Predictions. Devansh Saxena, Erina Moon, Shion Guha (Panel Speaker)
Fri 8:50 a.m. - 8:55 a.m.  Discussion
Fri 8:55 a.m. - 9:10 a.m.  Keynote 5: Building human-centric AI systems: thoughts on user agency, transparency and trust. Fernanda Viegas, Google and Harvard University, US. (Keynote)
Fri 9:10 a.m. - 9:15 a.m.  Discussion
Fri 9:14 a.m. - 9:15 a.m.  Panel 6: Users (Panel)
Fri 9:15 a.m. - 9:20 a.m.  (Re)Defining Expertise in Machine Learning Development. Mark Diaz, Angela Smith (Panel Speaker)
Fri 9:20 a.m. - 9:25 a.m.  A Human-Capabilities Orientation for Human-AI Interaction Design. Sean Koon (Panel Speaker)
Fri 9:25 a.m. - 9:30 a.m.  (De)Noise: Moderating the Inconsistency of Human Decisions. Junaid Ali, Nina Grgic-Hlaca, Krishna P. Gummadi, Jennifer Wortman Vaughan (Panel Speaker)
Fri 9:30 a.m. - 9:35 a.m.  Discussion
Fri 9:35 a.m. - 10:35 a.m.  Posters (full list appears below) (Poster)
Fri 10:35 a.m. - 10:45 a.m.  Short break (Break)
Fri 10:44 a.m. - 10:45 a.m.  Panel 7: Critical (Panel)
Fri 10:45 a.m. - 10:50 a.m.  Supporting Qualitative Coding with Machine-in-the-loop. Matthew K Hong, Francine Chen, Yan-Ying Chen, Matt Klenk (Panel Speaker)
Fri 10:50 a.m. - 10:55 a.m.  "Today we talk to the machine" – Unveiling data for providing micro-credit loans using conversational systems. Heloisa Candello, Emilio Vital Brazil, Rogerio De Paula, Cassia Sanctos, Marcelo Grave, Gabriel Soella, Marina Ito, Adinan Brito Filho (Panel Speaker)
Fri 10:55 a.m. - 11:00 a.m.  Towards Multi-faceted Human-centered AI. Sajjadur Rahman, Hannah Kim, Dan Zhang, Estevam Hruschka, Eser Kandogan (Panel Speaker)
Fri 11:00 a.m. - 11:05 a.m.  Discussion
Fri 11:05 a.m. - 11:20 a.m.  Keynote 6: Why HCAI Needs the Humanities. Lauren Klein, Emory University, US. (Keynote)
Fri 11:20 a.m. - 11:25 a.m.  Discussion
Fri 11:24 a.m. - 11:25 a.m.  Panel 8: Data Work (Panel)
Fri 11:25 a.m. - 11:30 a.m.  Labeling instructions matter in biomedical image analysis: an annotator-centric perspective. Tim Rädsch, Annika Reinke, Vivienn Weru, Minu D. Tizabi, Nicholas Schreck, A. Emre Kavur, Bünyamin Pekdemir, Tobias Roß, Annette Kopp-Schneider, Lena Maier-Hein (Panel Speaker)
Fri 11:30 a.m. - 11:35 a.m.  Ground(less) Truth: The Problem with Proxy Outcomes in Human-AI Decision-Making. Luke Guerdan, Amanda Lee Coston, Steven Wu, Ken Holstein (Panel Speaker)
Fri 11:35 a.m. - 11:40 a.m.  Human-centered Proposition for Structuring Data Construction. Cheul Young Park, Inha Cha, Juhyun Oh (Panel Speaker)
Fri 11:40 a.m. - 11:45 a.m.  Discussion
Fri 11:45 a.m. - 12:00 p.m.  Conclusion (Closing)

- Towards a Human-Centered Approach for Automating Data Science (Poster)
Technology for Automating Data Science (AutoDS) consistently undervalues the role of human labor, resulting in tools that, at best, are ignored and, at worst, can actively mislead or even cause harm. Even if full and frictionless automation were possible, human oversight would still be desired and required to review the outputs of AutoDS tooling and integrate them into decision-making processes. We propose a human-centered lens on AutoDS that emphasizes the collaborative relationships between humans and these automated processes and elevates the effects these interactions have on downstream decision-making. Our approach leverages a provenance framework that integrates user-, data-, and model-centric approaches to make AutoDS platforms observable and interrogable by humans.
Ana Crisan · Lars Kotthoff · Marc Streit · Kai Xu

- Is It Really Useful?: An Observation Study of How Designers Use CLIP-based Image Generation For Moodboards (Poster)
Image generation services based on contrastive neural network models such as CLIP (e.g., DALL-E 2, Midjourney, Stable Diffusion) have shown that they can produce a huge range of flawless images consistent with a user-provided image concept expressed in text.
While a lot of people have shared successful cases on the Internet, we still have very limited knowledge about whether such tools are helpful for daily design work. We conducted a preliminary observational study to investigate how designers create moodboards using DALL-E 2. The results indicate that novice users find it hard to identify the best prompts for creating and modifying generated images. The goal of this position paper is to propose potential research areas and ideas, such as how to set guidelines for designing interactive image generation services for a specific purpose.
Seungho Baek · Hyerin Im · Uran Oh · Youn-kyung Lim · Tak Yeon Lee

- Explainable Representations of Human Interaction: Engagement Recognition model with Video Augmentation (Poster)
In this paper, we explore how different video augmentation techniques alter the learned representations of a dyad's joint engagement. We evaluate state-of-the-art action recognition models (TimeSformer, X3D, I3D, and SlowFast) on a parent-child interaction video dataset with a joint engagement recognition task, and demonstrate how performance varies when applying different video augmentation techniques (General Aug, DeepFake, and CutOut). We also introduce a novel metric to objectively measure the quality of learned representations (Grad-CAM) and relate it to social cues (smiling, head angle, and body closeness) through correlation analysis. We hope our method serves as a strong baseline for future human interaction analysis research.
Yubin Kim · Hae Park · Sharifa Alghowinem

- Trust Explanations to Do What They Say (Poster)
How much are we to trust a decision made by an AI algorithm? Trusting an algorithm without cause may lead to abuse, and mistrusting it may similarly lead to disuse. Trust in an AI is only desirable if it is warranted; thus, calibrating trust is critical to ensuring appropriate use.
In the name of calibrating trust appropriately, AI developers should provide contracts specifying the use cases in which an algorithm can and cannot be trusted. Automated explanation of AI outputs is often touted as a method by which trust can be built in the algorithm. However, automated explanations arise from algorithms themselves, so trust in these explanations is likewise only desirable if it is warranted. Developers of algorithms that explain AI outputs (XAI algorithms) should provide similar contracts, specifying the use cases in which an explanation can and cannot be trusted.
Neil Natarajan · Reuben Binns · Jun Zhao · Nigel Shadbolt

- Towards Companion Recommendation Systems (Poster)
Recommendation systems can be seen as one of the first successful paradigms of true human-AI collaboration: the AI identifies what the user might want and provides it to them at the right time, and the user, implicitly or explicitly, gives feedback on whether they value those recommendations. However, making the recommender a true companion of users, amplifying and augmenting their capabilities to be more knowledgeable, healthy, and happy, requires a shift in the way this collaboration happens. In this position paper, we argue for an increased focus on reflecting user values in the design, evaluation, training objectives, and interaction paradigm of state-of-the-art recommenders.
Konstantina Christakopoulou · Yuyan Wang · Ed Chi · Minmin Chen

- Generation Probabilities are Not Enough: Improving Error Highlighting for AI Code Suggestions (Poster)
Large-scale generative models are increasingly being used in tooling applications. As one prominent example, code generation models recommend code completions within an IDE to help programmers author software.
However, since these models are imperfect, their erroneous recommendations can introduce bugs or even security vulnerabilities into a code base if not overridden by a human user. In order to override such errors, users must first detect them. One method of assisting this detection has been to highlight tokens with low generation probabilities. We propose another method: predicting the tokens people are likely to edit in a generation. Through a mixed-methods, pre-registered study with N = 30 participants, we find that the edit-model highlighting strategy results in significantly faster task completion times and significantly more localized edits, and is strongly preferred by participants.
Helena Vasconcelos · Gagan Bansal · Adam Fourney · Q.Vera Liao · Jennifer Wortman Vaughan

- Values Shape Optimizers Shape Values (Poster)
We often construct AI systems that optimize over specified objectives serving as proxies for human values. Consider recommender systems on social media and entertainment streaming platforms, which maximize time-on-application or other user-engagement metrics as a proxy for providing entertaining content to users. The research community has also begun to study how optimizing systems influence human values (e.g., shifting political leanings or predictably inducting users into specific online communities). We are left with an obvious, yet overlooked, framework: consideration of values and optimizers as a highly intertwined and interactive system, one that constantly feeds into and transforms the other. This perspective is crucial for engineering safe and beneficial AI systems, ones which preserve diverse values across individuals and communities.
Joe Kwon

- Exploring Human-AI Collaboration for Fair Algorithmic Hiring (Poster)
Current machine learning applications in the hiring process are prone to bias, especially due to the poor quality and small quantity of available data.
The bias in hiring imposes potential societal and legal risks. Thus, it is important to evaluate the bias of ML applications in the hiring context. To investigate algorithmic bias, we use real-world employment data to train models for predicting job candidates' performance and retention. The results show that ML algorithms make biased decisions toward certain groups of job candidates. This analysis motivates us to pursue an alternative method: AI-assisted hiring decision making. We plan to conduct an experiment with human subjects to evaluate the effectiveness of human-AI collaboration for algorithmic bias mitigation. In our designed study, we will systematically explore the role of human-AI teaming in enhancing the fairness of hiring in practice.
Hyun Joo Shin · Anqi Liu

- Revisiting Value Alignment Through the Lens of Human-Aware AI (Poster)
Value alignment has been widely argued to be one of the central safety problems in AI. While the problem itself arises from the way humans interact with AI systems, most current solutions to value alignment tend to sideline the human or make unrealistic assumptions about possible human interactions. In this position paper, we propose a human-centered formalization of the value alignment problem that generalizes human-AI interaction frameworks originally developed for explainable AI. We show how such a human-aware formulation of the problem provides us with novel ways of addressing and understanding it.
Sarath Sreedharan · Subbarao Kambhampati

- Human-AI Co-Creation of Personas and Characters with Text-Generative Models (Poster)
Natural language generation has been one of the prime focuses of human-AI collaboration in recent years. We are specifically interested in exploring the idea of creativity in human-AI co-creation, especially in the context of persona generation for the iterative human-centered design process.
Collaborating with AIs to generate engaging personas may present opportunities to overcome the shortcomings of personas and how they are currently used in the design process. We aim to study how collaborating with AIs might help designers and researchers create engaging personas and narrative scenarios for their products, and, by extension, the implications of human-AI collaborative creative writing for fields like literature through character generation. The implications of such a study could be generalized beyond user-experience design and persona generation: the ability to create engaging personas is not dissimilar from the ability to generate characters as a whole, and the subsequent potential for natural language generation to assist in creative writing and thinking is implicit. In this paper, we discuss the process and potential merits of iterating with AIs for creative content creation, and expand upon experiments we have conducted and the questions we hope to answer in our future research.
Toshali Goel · Orit Shaer

- Human-Centered Algorithmic Decision-Making in Higher Education (Poster)
Algorithms used for decision-making in higher education promise cost savings to institutions and personalized service for students, but at the same time raise ethical challenges around surveillance, fairness, and the interpretation of data. To address the lack of a systematic understanding of how these algorithms are currently designed, we reviewed algorithms proposed by the research community for higher education. We explored current trends in the use of computational methods, data types, and target outcomes, and analyzed the role of human-centered algorithm design approaches in their development. Our preliminary research suggests that the models are trending towards deep learning and increased use of student personal data and protected attributes, with the target scope expanding towards automated decisions.
Despite the associated decrease in interpretability and explainability, current development predominantly fails to incorporate human-centered lenses.
Kelly McConvey · Anastasia Kuzminykh · Shion Guha

- Social Construction of XAI: Do We Need One Definition to Rule Them All? (Poster)
There is growing frustration amongst researchers and developers in Explainable AI (XAI) around the lack of consensus on what is meant by "explainability". Do we need one definition of explainability to rule them all? In this paper, we argue why a singular definition of XAI is neither feasible nor desirable at this stage of XAI's development. Viewing XAI through the lens of the Social Construction of Technology (SCOT), we explicate how diverse stakeholders (relevant social groups) hold different interpretations (interpretative flexibility) that shape the meaning of XAI. Forcing a standardization (closure) on these pluralistic interpretations too early can stifle innovation and lead to premature conclusions. We share how we can leverage this pluralism to make progress in XAI without having to wait for a definitional consensus.
Upol Ehsan · Mark Riedl

- Science Communications for Explainable AI (XAI) (Poster)
Artificial intelligence has a communications challenge. To create human-centric AI, it is important that XAI be able to adapt to different users. The SciCom field provides a mixed-methods approach that can yield a better understanding of users' framings, so as to improve public engagement with and expectations of AI systems, as well as help AI systems better adapt to their particular user.
Simon Hudson · Matija Franklin

- Beyond Decision Recommendations: Stop Putting Machine Learning First and Design Human-Centered AI for Decision Support (Poster)
Zana Bucinca · Alexandra Chouldechova · Jennifer Wortman Vaughan · Krzysztof Z Gajos

- Indexing AI Risks with Incidents, Issues, and Variants (Poster)
Two years after the public launch of the AI Incident Database (AIID) as a collection of harms or near-harms produced by AI in the world, a backlog of "issues" that do not meet its incident ingestion criteria has accumulated in its review queue. Despite not passing the database's current criteria for incidents, these issues advance human understanding of where AI presents the potential for harm. Similar to databases in aviation and computer security, the AIID proposes to adopt a two-tiered system for indexing AI incidents (i.e., harm or near-harm events) and issues (i.e., risks of harm events). Further, as some machine learning-based systems will sometimes produce large numbers of incidents, the notion of an incident "variant" is introduced. These proposed changes mark the transition of the AIID to a new version in response to lessons learned from editing 1,800+ incident reports and additional reports that fall under the new category of "issue".
Sean McGregor · Kevin Paeth · Khoa Lam

- Towards Better User Requirements: How to Involve Human Participants in XAI Research (Poster)
Human-Centered eXplainable AI (HCXAI) literature identifies the need to address user needs. This paper examines how existing XAI research involves human users in designing and developing XAI systems, and identifies limitations in current practices, especially regarding how researchers identify user requirements. Finally, we propose several suggestions for deriving better user requirements through deeper engagement with user groups.
Thu Nguyen · Jichen Zhu

- Feature-Level Synthesis of Human and ML Insights (Poster)
We argue that synthesizing insights from humans and ML models at the level of features is an important direction to explore for improving human-ML collaboration on decision-making problems. We show through an illustrative example that feature-level synthesis can produce correct predictions in a case where existing methods fail, then lay out directions for future exploration.
Isaac Lage · Sonali Parbhoo · Finale Doshi-Velez

- High-stakes team based public sector decision making and AI oversight (Poster)
Oversight mechanisms, whereby the functioning and behaviour of AI systems are controlled to ensure that they are tuned to public benefit, are a core aspect of human-centered AI. They are especially important in public sector AI applications, where decisions on core public services such as education, benefits, and child welfare have significant impacts. Much current thinking on oversight mechanisms revolves around the idea of human decision makers being present 'in the loop' of decision making, such that they can insert expert judgment at critical moments and thus rein in the functioning of the machine. While welcome, we believe that the theory of human-in-the-loop oversight has yet to fully engage with the fact that decision making, especially in high-stakes contexts, is often made by hierarchical teams rather than by one individual. This raises the question of how such hierarchical structures can effectively engage with an AI system that is either supporting or making decisions. In this position paper, we outline some key elements of hierarchical decision making in contemporary public services and show how they relate to current thinking about AI oversight, thus sketching out future research directions for the field.
Deborah Morgan · Vincent Straub · Youmna Hashem · Jonathan Bright

- The Role of Labor Force Characteristics and Organizational Factors in Human-AI Interaction (Poster)
Algorithmic risk assessment tools are now commonplace in public sector domains such as criminal justice and human services. In this paper we argue that understanding how the deployment of such tools affects decision-making requires considering the organizational factors and worker characteristics that may influence the take-up of algorithmic recommendations. We discuss some existing evaluations of real-world algorithms and show that labor force characteristics play a significant role in these human-in-the-loop systems. We then discuss our findings from a real-world child abuse hotline screening use case, in which we investigate the role that worker experience plays in algorithm-assisted decision-making. We argue that system designers should consider ways of preserving institutional knowledge when introducing algorithms into settings with high employee turnover.
Lingwei Cheng · Alexandra Chouldechova

- The Challenges and Opportunities in Overcoming Algorithm Aversion in Human-AI Collaboration (Poster)
Algorithm aversion occurs when humans are reluctant to use algorithms despite their superior performance. Prior studies have shown that giving users "outcome control", the ability to appeal or modify a model's predictions, can mitigate this aversion. This can be contrasted with "process control", which entails control over the development of the algorithmic tool. The effectiveness of process control is currently under-explored. To compare how various controls over algorithmic systems affect users' willingness to use those systems, we replicate a prior study on outcome control and conduct a novel experiment investigating process control. We find that involving users in the process does not always result in higher reliance on the model.
We find that process control in the form of choosing the training algorithm mitigates algorithm aversion, but changing inputs does not. Giving users both outcome and process control does not result in further mitigation than either outcome or process control alone. Having conducted the studies on both Amazon Mechanical Turk (MTurk) and Prolific, we also reflect on the challenges of replication in crowdsourced studies of human-AI interaction.
Lingwei Cheng · Alexandra Chouldechova

- Pragmatic AI Explanations (Poster)
We use the Rational Speech Act framework to examine AI explanations as a pragmatic inference process. This reveals fatal flaws in how we currently train and deploy AI explainers. To evolve from level-0 explanations to level-1, we present two proposals for data collection and training: learning from L1 feedback, and learning from S1 supervision.
Shi Feng · Chenhao Tan

- Accuracy Is Not All You Need (Poster)
Efforts to improve the performance of human-AI (artificial intelligence) collaborations tend to be narrowly scoped, with better prediction performance often considered the only metric of improvement. As a result, work on improving the collaboration usually focuses on improving the AI's accuracy. Here, we argue that such a focus is myopic, and that practitioners should instead take a more holistic view of measuring the performance of AI models, and of human-AI collaboration more specifically. In particular, we argue that although some use cases merit optimizing for classification accuracy, for others accuracy is less important and improvement on human-centered metrics should be valued instead.
David Piorkowski · Rachel Ostrand · Yara Rizk · Vatche Ishagian · Vinod Muthusamy · Justin D Weisz

- Honesty as the Primary Design Guideline of Machine Learning User Interfaces (Poster)
The outputs of most Machine Learning (ML) systems are often riddled with uncertainties, biased by the training data, sometimes incorrect, and almost always inexplicable. However, in most cases, their user interfaces are oblivious to those shortcomings, creating many undesirable consequences, both practical and ethical. I propose that ML user interfaces should be designed to make those issues clear to the user by exposing uncertainty and bias, instilling distrust, and avoiding imposture. This is captured by the overall concept of Honesty, which I argue should be the most important guide for the design of ML interfaces.
Claudio Pinhanez

- Data Issues Challenging AI Performance in Diagnostic Medicine and Patient Management (Poster)
In this short article, we focus on four dimensions of data that create layers of complexity in using data in the context of AI systems developed for medical applications, collectively referred to as diagnostic medicine. These complexities, or 'data dilemmas', share a core human element, making it clear why a human-centered approach is needed to understand the relationship between medical data and AI systems.
Mohammad Hossein Jarrahi · Mohammad Haeri · Chris Lenhardt

- Understanding the Criticality of Human Adaptation when Designing Human-Centered AI Teammates (Poster)
Research on human-centered AI teammates has often worked to create AI teammates that adapt around humans, but humans have a remarkable and natural ability to adapt around their environment and teammates. This paper capitalizes on human adaptability by showcasing how humans actively adapt around their AI teammates even when those teammates change.
In doing so, the results of a mixed-methods experiment (N = 60) demonstrate that human adaptation is a critical and natural component of human-centered AI teammate design.
Christopher Flathmann · Nathan McNeese

- The Design Space of Pre-Trained Models (Poster)
Card et al.'s classic paper "The Design Space of Input Devices" established the value of design spaces as a tool for HCI analysis and invention. We posit that developing design spaces for emerging pre-trained, general AI models is necessary for supporting their integration into human-centered systems and practices. We explore what it means to develop an AI model design space by proposing two design spaces relating to pre-trained AI models: the first considers how HCI can impact pre-trained models (i.e., interfaces for models), and the second considers how pre-trained models can impact HCI (i.e., models as an HCI prototyping material).
Meredith Morris · Carrie Cai · Jess Holbrook · Chinmay Kulkarni · Michael Terry

- Human-in-the-loop Bias Mitigation in Data Science (Poster)
With the successful adoption of machine learning (ML) in decision making, there have been growing concerns about the transparency and fairness of ML models, leading to significant advances in the field of eXplainable Artificial Intelligence (XAI). Generating explanations using existing XAI techniques and merely reporting model bias, however, are insufficient to locate and mitigate sources of bias. In line with the data-centric AI movement, we posit that to mitigate bias we must address the myriad data errors and biases inherent in the data, and we propose a human-machine framework that strengthens human engagement with data to remedy data errors and biases toward building fair and trustworthy AI systems.
Romila Pradhan · Tianyi Li

- A View From Somewhere: Human-Centric Face Representations (Poster)
Biases in human-centric computer vision models are often attributed to a lack of sufficient data diversity, with many demographics insufficiently represented. However, auditing datasets for diversity can be difficult, due to an absence of ground-truth labels of relevant features. Few datasets contain self-identified demographic information; inferring demographic information risks introducing additional biases; and collecting and storing data on sensitive attributes can carry legal risks. Moreover, categorical demographic labels do not necessarily capture all the relevant dimensions of human diversity that are important for developing fair and robust models. We propose to implicitly learn a set of continuous face-varying dimensions, without ever asking an annotator to explicitly categorize a person. We uncover the dimensions by learning on a novel dataset of 638,180 human judgments of face similarity (FAX). We demonstrate the utility of our learned embedding space for predicting face similarity judgments, collecting continuous face attribute values, and comparative dataset diversity auditing. Moreover, using a novel conditional framework, we show that an annotator's demographics influence the importance they place on different attributes when judging similarity, underscoring the need for diverse annotator groups to avoid biases.
Jerone Andrews · Przemyslaw Joniak · Alice Xiang

- Metric Elicitation: Moving from Theory to Practice (Poster)
Metric Elicitation (ME) is a framework for eliciting classification metrics that better align with implicit user preferences based on the task and context. The existing ME strategy is based on the assumption that users can most easily provide preference feedback over classifier statistics such as confusion matrices.
This work examines ME, by providing a first ever implementation of the ME strategy. Specifically, we create a web-based ME interface and conduct a user study that elicits users' preferred metrics in a binary classification setting. We discuss the study findings and present guidelines for future research in this direction. Link » Safinah Ali · Sohini Upadhyay · Gaurush Hiranandani · Elena Glassman · Sanmi Koyejo 🔗 - Combating Toxicity in Online Games with HCAI (Poster) []   link » Multiplayer gaming yields social benefits, but can cause harm through toxicity—particularly as directed toward women, players of color, and 2SLGBTQ+ players. Detecting toxicity is challenging, but is necessary for intervention. We present three challenges to automated toxicity detection, and share potential solutions so that researchers can develop HCAI models that detect toxic game communication. Link » Regan Mandryk · Julian Frommel 🔗 - The Aleph & Other Metaphors for Image Generation (Poster) []  []   link » In this position paper, we reflect on fictional stories dealing with the infinite and how they connect with the current, fast-evolving field of image generation models. We draw attention to how some of these literary constructs can serve as powerful metaphors for guiding human-centered design and technical thinking in the space of these emerging technologies and the experiences we build around them. We hope our provocations seed conversations about current and yet to be developed interactions with these emerging models in ways that may amplify human agency. Link » Gonzalo Ramos · Rick Barraza · VICTOR DIBIA · Sharon Lo 🔗 - Tensions Between the Proxies of Human Values in AI (Poster) []  []   link » Motivated by mitigating potentially harmful impacts of technologies, the AI community has formulated and accepted mathematical definitions for certain pillars of accountability: e.g. privacy, fairness, and model transparency. 
Yet, we argue this is fundamentally misguided because these definitions are imperfect, siloed constructions of the human values they hope to proxy, while giving the guise that those values are sufficiently embedded in our technologies. Under popularized techniques, tensions arise when practitioners attempt to achieve each pillar of fairness, privacy, and transparency in isolation or simultaneously. In this position paper, we argue that the AI community needs to consider alternative formulations of these pillars based on the context in which technology is situated. By leaning on sociotechnical systems research, we can formulate more compatible, domain-specific definitions of our human values for building more ethical systems. Link » Daniel Nissani · Teresa Datta · John Dickerson · Max Cembalest · Akash Khanna · Haley Massa 🔗

#### Author Information

##### Michael Muller (IBM Research)

Michael Muller works in the AI Interactions group of IBM Research AI, where his work focuses on the human aspects of data science; ethics and values in applications of AI to human issues; and metrics and analytics for enterprise social software applications, with particular application to employee engagement and emergent social phenomena in social software. Recognitions include: ACM Distinguished Scientist; SIGCHI Academy; IBM Master Inventor. Steering committees: EUSSET (European Society for the study of Socially Embedded Technologies); ACM GROUP conference series. Papers co-chair for ECSCW 2019 (European Computer Supported Cooperative Work conference).

##### Plamen P Angelov (Lancaster University)

Prof. Angelov (MEng 1989, PhD 1993, DSc 2015) is a Fellow of the IEEE, the IET, and the HEA. His PhD supervisor, Dr. Dimitar P. Filev, is now a Member of the National Academy of Engineering, USA. Prof. Angelov is Vice President for Conferences of the International Neural Networks Society (INNS). He has 30 years of professional experience in high-level research and holds a Personal Chair in Intelligent Systems at Lancaster University, UK. In 2010 he founded the Intelligent Systems Research group, which he led until 2014, when he founded the Data Science group at the School of Computing and Communications; following a sabbatical in 2017, he established the LIRA (Lancaster Intelligent, Robotic and Autonomous systems) Research Centre (www.lancaster.ac.uk/lira), which includes over 40 academics across different Faculties and Departments of the University. He is a founding member of the Data Science Institute and of the CyberSecurity Academic Centre of Excellence at Lancaster. He has authored or co-authored 300 peer-reviewed publications in leading journals and peer-reviewed conference proceedings, holds 3 granted patents, and has written 3 research monographs (Wiley, 2012; Springer, 2002 and 2018), cited over 8,800 times with an h-index of 48 and an i10-index of 156. His single most cited paper has over 940 citations. He has an active research portfolio in explainable AI, computational intelligence, and machine learning, with internationally recognised results in online and evolving learning and in algorithms for knowledge extraction in the form of human-intelligible rule-based systems. Prof. Angelov leads numerous projects (including several multimillion ones) funded by UK research councils, the EU, industry, and the UK MoD. His research was recognised by 'The Engineer Innovation and Technology 2008 Special Award' and by the 'For Outstanding Services' award (2013) from the IEEE and INNS.
He is also the founding co-Editor-in-Chief of Springer's journal Evolving Systems and an Associate Editor of several leading international scientific journals, including IEEE Transactions on Cybernetics, IEEE Transactions on Fuzzy Systems, Fuzzy Sets and Systems, and Soft Computing. He has given over two dozen keynote/plenary talks at high-profile conferences. Prof. Angelov was General co-Chair of a number of high-profile IEEE conferences, is the founding Chair of the Technical Committee on Evolving Intelligent Systems of the IEEE SMC Society, and previously chaired the Standards Committee of the IEEE Computational Intelligence Society (2010-2012). He has also served on the International Program Committees of over 100 international conferences (primarily IEEE).

##### David Piorkowski (IBM Research)

I am currently a Staff Research Scientist at IBM's Thomas J. Watson Research Center, part of the Human-AI Collaboration team. My current interests apply a human-factors perspective to Artificial Intelligence (AI) trust and transparency. Broadly speaking, this involves understanding how AI developers and their peers work together, identifying where things go wrong, and developing solutions to address those problems. My current work in this space includes evaluating and documenting risks associated with AI models, accelerating the development of intelligent agents for business automation, and developing best practices and tools for AI documentation. My prior work included understanding the information and communication needs of stakeholders throughout the AI development life cycle and developing novel ways to measure and evaluate conversational systems.