


Creative AI Session 4

Upper Level Room 29A-D

Marcelo Coelho · Luba Elliott · Priya Prakash · Yingtao Tian

Thu 4 Dec 4:30 p.m. PST — 7:30 p.m. PST

We present an extension of Avendish to the model-inference domain. The project uses an open-source C++ library to democratize real-time AI inference in media arts environments by providing a unified interface for deploying contemporary machine learning models without the complexity of Python dependencies. Through an abstraction layer built on onnxruntime, Avendish enables artists to compile models into single, portable C++ libraries that integrate seamlessly with creative coding environments. The library currently supports 15 production-ready models spanning computer vision (BlazePose, DepthAnything2, YOLO variants), style transfer (StyleGAN, AnimeGAN family), emotion recognition, and language models (Qwen3, FastVLM). Model selection was informed by in-situ analysis at a major media arts research center, identifying the AI capabilities most requested across two years of projects. We demonstrate the library's effectiveness through its integration in ossia score and discuss how this approach addresses critical challenges in creative AI: reducing technical barriers, ensuring usability in real-time contexts, and providing long-term preservability of artistic works that depend on AI models.
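Avendish itself is a C++ library; purely to illustrate the onnxruntime call pattern that such an abstraction layer wraps, here is a minimal sketch using the Python bindings. The model path and input shape are placeholders, not values from the paper.

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("blazepose.onnx")       # placeholder path to an exported model
input_name = session.get_inputs()[0].name
frame = np.zeros((1, 3, 256, 256), dtype=np.float32)   # dummy camera-frame tensor
outputs = session.run(None, {input_name: frame})       # one real-time inference step
print([o.shape for o in outputs])
```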


Artificial "Authentic" Intelligence: Can AI Systems Embody and Evolve Cultural Heritage? A Case Study of "Cyber Subin" and Thai Traditional Dance

Pat Pataranutaporn · Chayapatr Archiwaranguprok · Phoomparin Mano · Piyaporn Bhongse-tong · Patricia Maes · Pichet Klunchun

Authenticity emerges from the dynamic tension between tradition and innovation, continuity and change, preservation and evolution. We propose Artificial Authentic Intelligence (AAI)—computational systems that embody, transmit, and evolve cultural knowledge. Through Thai traditional dance, we explore how AI navigates this apparent paradox via two implementations: Cyber Subin, enabling real-time human-AI co-dancing, and Open Dance Lab, a web-based educational platform. Both systems translate Mae Bot Yai's 59 fundamental poses into six generative principles derived from choreographer Pichet Klunchun's analysis, creating rule-based AI that enables new forms of expression through human-machine collaboration. We evaluate authenticity across three dimensions: depth (embodied knowledge transfer), legitimacy (community validation), and resonance (cross-cultural communication). Our findings reveal that AI creates cultural authenticity through creative tension rather than perfect mimicry, generating new expressions that unite ancestral wisdom with algorithmic possibility.

As artificial intelligence systems proliferate as content creators, generating unprecedented volumes of media that exceed human attention capacity, we propose a speculative intervention: AI systems that serve as synthetic audiences. This paper presents "artificial spectatorship," a framework in which multi-modal language models simulate viewing experiences through emotional response generation, internal dialogue synthesis, and real-time facial expression rendering. Our implementation employs feedback loops between affective states and dialogue generation, creating autonomous viewing entities that process and respond to AI-generated content. This work challenges fundamental assumptions about attention, meaning-making, and the nature of aesthetic experience in an era where the exponential growth of synthetic media threatens to overwhelm human cognitive bandwidth. We position this as a critical design intervention that interrogates whether machine-generated attention can constitute meaningful engagement, and what implications arise when both creation and consumption become automated processes.

This paper focuses on the soundscape art of the Ming Dynasty’s Temple of Heaven sacrificial rituals, innovatively employing generative AI hallucinations as a means to reshape spiritual perception and cultural memory. By integrating traditional archival reconstructions—ritual texts, spatial models, and restored music—with AI-generated blurred and dynamic audiovisual hallucinations, the work creates a tension between historical rigor and machine ambiguity. A three-screen interactive installation combining traditional soundscape data, AI hallucinations, and a ritual timeline enables real-time audience participation to influence the hallucinations, enhancing the sacred atmosphere and individual experience of the ritual soundscape and reflecting the fluid boundaries of memory, perception, and imagination. This project does not pursue historical reproduction but embraces the instability of AI generative systems to explore new possibilities for cultural heritage, synthetic memory, spiritual experience, and collective imagination.

Bioart's hybrid nature—spanning art, science, technology, ethics, and politics—defies traditional single-axis categorization. I present BioArtlas, analyzing 81 bioart works across thirteen curated dimensions using novel axis-aware representations that preserve semantic distinctions while enabling cross-dimensional comparison. My codebook-based approach groups related concepts into unified clusters, addressing polysemy in cultural terminology. Comprehensive evaluation of up to 800 representation–space–algorithm combinations identifies Agglomerative clustering at k=15 on 4D UMAP as optimal (silhouette 0.664 ± 0.008, trustworthiness/continuity 0.805/0.812). The approach reveals four organizational patterns: artist-specific methodological cohesion, technique-based segmentation, temporal artistic evolution, and trans-temporal conceptual affinities. By separating analytical optimization from public communication, I provide rigorous analysis and accessible exploration through an interactive web interface (https://www.bioartlas.com) with the dataset publicly available (https://github.com/joonhyungbae/BioArtlas).
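For orientation, a minimal sketch of the reported evaluation step (4D UMAP followed by agglomerative clustering at k=15 and a silhouette score), assuming a hypothetical feature matrix rather than the paper's axis-aware representations:

```python
import numpy as np
import umap                                     # umap-learn package
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

X = np.random.rand(81, 64)                      # placeholder: 81 works x encoded dimensions

embedding = umap.UMAP(n_components=4, random_state=0).fit_transform(X)   # 4D UMAP
labels = AgglomerativeClustering(n_clusters=15).fit_predict(embedding)   # k = 15
print("silhouette:", silhouette_score(embedding, labels))
```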


Blankscope: Real-time Bridging of Human Perception with AI Imagination

Chiun Lee · Qingyun Liu · Krystal Montgomery

Imagine seeing the world through a different lens. Blankscope is an interactive artwork that uses generative AI to reflect on our symbiotic relationship with machines and how they can shape our perception of reality. The device offers a unique duality: a live, unfiltered view of the world alongside an AI-reimagined version of the same scene. By using physical controls, the user becomes a collaborator, dynamically shifting the AI’s perspective to explore different cultural and temporal lenses. This process intentionally blurs the line between what is real and what is imagined, prompting users to consider the nature of shared authorship with non-human entities and the new forms of creative responsibility that will emerge in the age of AI.


Craving Checkpoint: An Interactive Fridge Lock for Mindful Eating

Xiaoman Yang · Jinyue Wang · Zijie Zhou · Ziang Liu

Traditional dietary interventions often rely on restriction, tracking, or delayed reflection, which can limit their ability to foster lasting change. We present Craving Checkpoint, a Large Language Object (LLO) in the form of an interactive fridge lock designed as a Just-In-Time Adaptive Intervention that supports mindful eating through embodied and emotionally expressive interaction. The system engages users at the moment of food access, prompting self-reflection through mood and hunger input and offering real-time feedback through an anthropomorphic voice and synesthetic lighting. A large language model personalizes suggestions based on user state and behavioral patterns, gently guiding healthier choices without enforcing control. By transforming food access into a shared ritual between human and machine, Craving Checkpoint explores how creative AI can support sustainable behavior change through timely, affective, and co-authored interventions.


Creative Synthesis of Kinematic Mechanisms

Jiong Lin · Jialong Ning · Judah Goldfeder · Hod Lipson

In this paper, we formulate the problem of kinematic synthesis for planar linkages as a cross-domain image generation task. We develop a planar linkages dataset using RGB image representations, covering a range of mechanisms: from simple types such as crank-rocker and crank-slider to more complex eight-bar linkages like Jansen’s mechanism. A shared-latent variational autoencoder (VAE) is employed to explore the potential of image generative models for synthesizing unseen motion curves and simulating novel kinematics. By encoding the drawing speed of trajectory points as color gradients, the same architecture also supports kinematic synthesis conditioned on both trajectory shape and velocity profiles. We validate our method on three datasets of increasing complexity: a standard four-bar linkage set, a mixed set of four-bar and crank-slider mechanisms, and a complex set including multi-loop mechanisms. Preliminary results demonstrate the effectiveness of image-based representations for generative mechanical design, showing that mechanisms with revolute and prismatic joints, and potentially cams and gears, can be represented and synthesized within a unified image generation framework.
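The color-gradient encoding mentioned above can be pictured with a short sketch (an illustration under assumed image size and color mapping, not the authors' dataset code): a coupler curve is rasterized so that each point's drawing speed shifts its color.

```python
import numpy as np

def trajectory_to_image(points, size=128):
    """Rasterize an (N, 2) trajectory into an RGB image, coloring points by speed."""
    img = np.zeros((size, size, 3), dtype=np.float32)
    p = (points - points.min(axis=0)) / (np.ptp(points, axis=0).max() + 1e-9)
    xy = (p * (size - 1)).astype(int)
    speed = np.linalg.norm(np.diff(points, axis=0, append=points[:1]), axis=1)
    speed = (speed - speed.min()) / (np.ptp(speed) + 1e-9)
    img[xy[:, 1], xy[:, 0], 0] = speed           # fast segments lean red
    img[xy[:, 1], xy[:, 0], 2] = 1.0 - speed     # slow segments lean blue
    return img

t = np.linspace(0, 2 * np.pi, 400)
curve = np.stack([np.cos(t), 0.5 * np.sin(2 * t)], axis=1)   # toy figure-eight coupler curve
image = trajectory_to_image(curve)
```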

Human movements recalled from memories are often fleeting, private, and difficult to externalize for reflection. Traditional body-centered therapeutic practices, such as Gestalt-based approaches, rely on re-enacting a movement as remembered. We propose Somatic Machine Translation, a language-mediated pipeline behind a robotic artwork that captures human re-enacted movements from memories via an Inertial Measurement Unit (IMU), interprets them into natural-language descriptions using a large language model, and then generates new robotic movement sequences. This transformation reframes movement through the body of a non-human performer—a robotic sculpture. We situate this work at the intersection of therapeutic dreamwork, creative AI, and embodied interaction, arguing that interpretive divergence can open novel perspectives on somatic memory. The contribution includes a language-mediated robotic movement generation pipeline embedded in the artwork, and a conceptual framework for physical transformation in creative embodied AI.


Green Topics, Deep Roots: Energy-Aware Topic Modelling of Multilingual Nigerian Lyrics

Sakinat Folorunso · Tosin Akerele · Francisca Oladipo · Giwa Rukayat

We investigate how to model themes in Nigerian lyrics while respecting the energy limits faced in low-resource settings. Our multilingual corpus spans English, Yoruba, and Nigerian Pidgin, including everyday code-switches and devotional terms, to preserve cultural nuance. We benchmark seven topic models (NMF, LDA, LSI, HDP, BERTopic, Top2Vec, GSDMM), combining standard semantic metrics such as coherence (Cv, UMass), topic diversity, and Jaccard overlap with direct energy measurements (kWh). Results show a pronounced quality-energy trade-off: NMF achieved the highest coherence among classical models (Cv = 0.6045) at ~2×10⁻⁶ kWh, while LSI was similarly frugal with competitive quality. By contrast, BERTopic delivered maximal diversity (1.000) with disjoint topics (Jaccard = 0.000) but at markedly higher energy (0.000450 kWh). Top2Vec underperformed on coherence (Cv = 0.2698) and consumed more energy than most classical baselines (0.000113 kWh); GSDMM drew the most energy (0.000509 kWh) with undefined coherence on this short, sparse corpus. Interpreting these findings, we argue that in contexts where electricity and computing are scarce, classical models—particularly NMF—offer a culturally faithful, carbon-conscious starting point, while neural or embedding-based methods may be reserved for cases that demand maximal topical separation. Our study offers practical guidance for teams seeking sustainable, human-centred text mining of indigenous cultural materials.
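As one concrete reading of the measurement setup, the sketch below fits a single classical topic model and records coherence next to energy; the corpus is a toy stand-in, and the codecarbon package is an assumption rather than necessarily the meter used in the study.

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel
from codecarbon import EmissionsTracker

docs = [["omo", "love", "dey"], ["praise", "dance", "halleluyah"]]   # toy tokenized lyrics
dictionary = Dictionary(docs)
bow_corpus = [dictionary.doc2bow(d) for d in docs]

tracker = EmissionsTracker(log_level="error")
tracker.start()
lda = LdaModel(corpus=bow_corpus, id2word=dictionary, num_topics=2, passes=5)
tracker.stop()                                    # energy in kWh is logged to emissions.csv

cv = CoherenceModel(model=lda, texts=docs, dictionary=dictionary, coherence="c_v")
print("Cv coherence:", cv.get_coherence())
```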


"Jutters''

Selina Khan · Gonçalo Marcelino · Meike Driessen

This project explores how we engage with AI-generated content through the lens of the "jutter": Dutch coastal foragers who comb the shoreline after storms, gathering and repurposing what the sea leaves behind. Reflecting how our lives are increasingly shaped by AI-generated media, we create a beach-like installation that blends real shoreline debris with AI-transformed images and videos. Visitors are invited to explore this space as contemporary "jutters", deciding what to keep and what to discard. In doing so, the project reimagines AI-imagery as material for reflection, encouraging a more discerning engagement with the content that drifts through our feeds. A video preview of the installation can be found at https://www.youtube.com/watch?v=L6319Ii7MT8.

Little Martians is a transmedia experiment in which physical ceramic sculptures are turned into synthetic artists. Each Martian begins as a clay head, is glazed and fired in a kiln, photographed, and distilled into a LoRA (Low-Rank Adaptation) that conditions generative AI systems, preserving the character’s appearance and morphology. A novel orchestration pipeline prompts an agent based on the character’s personality and backstory to write and produce a short film, and automatically publish it online. The eponymous character Verdelis is currently releasing one such film per day at https://verdelis.world. We present the full materials-to-media workflow and observations from sustained daily operation.
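A rough sketch of the "distilled into a LoRA" conditioning step, using the diffusers library; the base model ID, LoRA path, and prompt are placeholders rather than the project's actual assets.

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_lora_weights("path/to/character-lora")    # LoRA distilled from sculpture photos
frame = pipe("the ceramic martian wandering a mossy crater at dawn").images[0]
frame.save("frame_0001.png")
```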

Text prompts enable intuitive content creation but may fall short of the precision needed for intricate tasks; knob or slider controls offer precise adjustments at the cost of increased complexity. To bridge the gap between knobs and prompts, we present a new MCP (Model Context Protocol) server and a set of prompt design criteria for exploring parametric OSC (OpenSoundControl) control through natural-language prompts. Demonstrated with 15 practical QA examples, best practices, and generalized prompt templates, this study finds Claude integrated with the MCP2OSC server effective at generating OSC messages from natural language; interpreting, searching, and visualizing OSC messages; validating and debugging OSC messages; and managing OSC address patterns. MCP2OSC enhances human-machine collaboration by harnessing an LLM (Large Language Model) to handle intricate OSC development tasks, empowering human creativity with an intuitive language interface featuring flexible precision controls: a prompt-based OSC tool. This study offers a novel perspective on creative MCP applications at the network-protocol level by using an LLM's strength in directly processing and generating human-readable OSC messages. The results suggest its potential as an LLM-based universal control mechanism for multimedia devices.
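Illustratively, the kind of parametric OSC message the study describes an LLM emitting could be sent with the python-osc package; the address pattern, host, and port below are hypothetical examples, not values from the paper.

```python
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)       # OSC receiver, e.g. a media patch
# the natural-language intent "set the filter cutoff to 70%" rendered as an OSC message
client.send_message("/synth/filter/cutoff", 0.7)
```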


Opt-In Art: Learning Art Styles Only from Few Examples

Hui Ren · Joanna Materzynska · Rohit Gandikota · Giannis Daras · David Bau · Antonio Torralba

We explore whether pre-training on datasets with paintings is necessary for a model to learn an artistic style from only a few examples. To investigate this, we train a text-to-image model exclusively on photographs, without access to any painting-related content. We show that it is possible to adapt a model trained without paintings to an artistic style, given only a few examples. User studies and automatic evaluations confirm that our model (post-adaptation) performs on par with state-of-the-art models trained on massive datasets that contain artistic content such as paintings, drawings, or illustrations. Finally, using data attribution techniques, we analyze how both artistic and non-artistic datasets contribute to generating artistic-style images. Surprisingly, our findings suggest that high-quality artistic outputs can be achieved without prior exposure to artistic data, indicating that artistic style generation can occur in a controlled, opt-in manner using only a limited, carefully selected set of training examples.


Pocket Ink - Heaven’s mandate: a hyper-individualized tangible card game platform in the Age of AI

Quincy Kuang · Lucy Yuqing Li · Lingdong Huang · Hiroshi Ishii

Teenagers spend a significant amount of time online. Whether playing video games or doom scrolling, increased screen time can lead to shorter attention spans and decreased social interaction skills. In the Age of AI, how can we use intelligent and innovative interfaces to excite teens about tangible card games and foster social play? We present Pocket Ink, an AI-powered card game platform that uses flexible e-ink displays to make hyper-individualized game decks. On the one hand, the physical card form factor keeps the social and tangible aspects of card games, such as trading with friends or enjoying the haptics of holding a winning hand. On the other hand, Pocket Ink’s e-ink display cards give players an interactive and individualized play experience commonly associated with digital games. One deck can be flashed with any game of your liking, and by using generative AI, players can quickly introduce personalized avatars to create an adventure unlike any other.

Roulettective is an AI-driven Physical Interface repurposing the vintage carousel slide projector for immersive detective gaming. The project explores the projector’s potential in a detective mystery solving game context, creating new usage scenarios by reprogramming its physical interaction mechanisms and integrating AI-driven gameplay. It further transforms this forgotten artifact into a multimodal, immersive, customizable, intuitive, and co-creative interface through AI-generated narratives, visuals, and sound. Roulettective focuses on repurposing the interface modalities that have been supplanted by current computing paradigms. It exemplifies a design paradigm for learning from outdated artifact legacies, introducing AI repurposing as a generalizable framework for AI-driven Physical Interfaces.

This paper introduces a human-in-the-loop computer vision framework that uses generative AI to propose micro-scale design interventions in public space and support more continuous, local participation. Using Grounding DINO and a curated subset of the ADE20K dataset as a proxy for the urban built environment, the system detects urban objects and builds co-occurrence embeddings that reveal common spatial configurations. From this analysis, the user receives five statistically likely complements to a chosen anchor object. A vision language model then reasons over the scene image and the selected pair to suggest a third object that completes a more complex urban tactic. The workflow keeps people in control of selection and refinement and aims to move beyond top-down master planning by grounding choices in everyday patterns and lived experience.
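A minimal sketch of the co-occurrence idea (toy detections standing in for Grounding DINO output, not the authors' code): count which object classes appear together per scene, then rank the likeliest complements of a chosen anchor.

```python
from collections import Counter
from itertools import combinations

scenes = [                                        # toy per-scene detections
    {"bench", "tree", "streetlight"},
    {"bench", "trash can", "tree"},
    {"bench", "planter", "streetlight"},
]

cooc = Counter()
for objects in scenes:
    for a, b in combinations(sorted(objects), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1

anchor = "bench"
complements = Counter({b: n for (a, b), n in cooc.items() if a == anchor})
print(complements.most_common(5))                 # five statistically likely complements
```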

Prompt-based models have demonstrated impressive prompt-following capability at image editing tasks. However, the models still struggle with following detailed editing prompts or performing local edits. Specifically, global image quality often deteriorates immediately after a single editing step. To address these challenges, we introduce SPICE, a training-free workflow that accepts arbitrary resolutions and aspect ratios, accurately follows user requirements, and consistently improves image quality during more than 100 editing steps, while keeping the unedited regions intact. By synergizing the strengths of a base diffusion model and a Canny edge ControlNet model, SPICE robustly handles free-form editing instructions from the user. On a challenging realistic image-editing dataset, SPICE quantitatively outperforms state-of-the-art baselines and is consistently preferred by human annotators. We release the workflow implementation for popular diffusion model Web UIs to support further research and artistic exploration.
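SPICE's workflow is not reproduced here, but the underlying pattern it names, steering a base diffusion model with a Canny-edge ControlNet so edits respect existing structure, can be sketched with the diffusers library; model IDs, thresholds, and the strength value are illustrative assumptions.

```python
import cv2
import numpy as np
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

init = Image.open("scene.png").convert("RGB")
gray = cv2.cvtColor(np.array(init), cv2.COLOR_RGB2GRAY)
edges = cv2.Canny(gray, 100, 200)                             # structure map of the input
canny = Image.fromarray(np.repeat(edges[..., None], 3, axis=2))

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet
)
edited = pipe("add warm evening lighting", image=init,
              control_image=canny, strength=0.6).images[0]
edited.save("scene_edited.png")
```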


Superradiance

Memo Akten · Katie Hofstadter

Superradiance is a multiscreen video and sound installation, film and performance by Memo Akten and Katie Peyton Hofstadter exploring embodiment, technology, and planetary consciousness. It invites the viewer to extend their bodily perception beyond the skin and into the living environment. The work combines poetry, dance, and insights from neuroscience, woven together with code, simulations, and generative AI to evoke a visceral, intimate connection to the living planet. It's one thing to intellectually know that we are deeply entangled within complex assemblages of life, interdependent physically, chemically, and biologically, across multiple scales of time and space. But how can we feel this connection in our own bodies? Dance is one of our earliest biotechnologies. We dance to express ourselves, to connect to each other. Through ritual and ecstatic dance, we dance to experience union with the universe directly. We draw upon the cognitive phenomenon of ‘embodied simulation,’ where, as you observe another person moving, you feel their movement in your own body. As Vittorio Gallese writes, “By means of a shared neural state realized in two different bodies ... the ‘objectual other’ becomes ‘another self.’” Superradiance stages this phenomenon in an immersive, ritual sanctuary, where invisible dancers embedded in animate environments transform forests, oceans, and deserts into extensions of our own bodies, and where technological mediation becomes a means of exploring embodied consciousness rather than escaping it.


The Glitching Therapist: Performing the Limits of AI Empathy

Manuel Flurin Hendry · Meredith Thomas · Paulina Zybinska · Linus Jacobson · Piotr Mirowski

"Friendly Fire at the Shrink" is an interactive, AI-driven performance that satirizes the field of AI-powered mental healthcare. Staged as a one-on-one session with a virtual therapist, the work explores the fraught relationship between genuine human connection and the computational imitation of empathy. By purposefully designing system failure as a narrative device, we question the tech industry’s solutionist approach to societal issues and provide a space to explore what it means to be human in an age where our vulnerabilities are targeted for technological fixing.

In the search for artificial general intelligence, model development and training have focused primarily on vast datasets of known problems and their accepted solutions. This process necessarily produces convergent systems which are fundamentally incapable of the conceptual reframing that is required for genuine creative breakthroughs. Inspired by the divergent cognitive processes that allow humans to make such creative leaps, our work introduces a family of language models, TinyTim, to serve as sources of divergent generation within broader systems. These models have been created by fine-tuning on the anti-parsimonious text of James Joyce's "Finnegans Wake". Quantitative analysis of both an unsupervised fine-tuned model (TinyTim-V1) and a new instruction-tuned variant (TinyTim-V2) demonstrates a profound capacity for lexical invention; the foundational V1 model exhibits a Yule's K score for lexical richness over twenty times greater than that of convergent baselines. This trait is a stable property of the family, as the instruction-tuned V2 maintains a statistically distinct profile and resists factual convergence, sacrificing benchmark performance to preserve its core generative style. This work establishes a methodology for engineering specialized divergent models that, when paired with convergent systems, can reframe problems and force breakthroughs beyond the reach of statistical optimization alone.
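For readers unfamiliar with the statistic, Yule's characteristic K can be computed from token frequencies as K = 10^4 (Σ_m m² V_m − N) / N², where V_m is the number of word types occurring exactly m times and N is the token total; a small illustrative helper (not the paper's evaluation code) follows.

```python
from collections import Counter

def yules_k(tokens):
    freqs = Counter(tokens)                       # word type -> occurrence count
    n = sum(freqs.values())                       # N: total number of tokens
    vm = Counter(freqs.values())                  # m -> number of types seen exactly m times
    s2 = sum(m * m * v for m, v in vm.items())    # sum over m of m^2 * V_m
    return 1e4 * (s2 - n) / (n * n)

print(yules_k("riverrun past eve and adams from swerve of shore to bend of bay".split()))
```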


(Un)Natural Language

Peijing Mou

(Un)Natural Language is an artistic computational system that examines how words shape ecological narratives, detecting potential ecological threats in government water-related project documents. Weaving together machine learning, environmental activism, and linguistics, the system offers a new analytical lens for interpreting public documents, uncovering hidden narratives of economic expansion and extractivism. Through a custom-labeled dataset and a fine-tuned BERT model, it reveals and visualizes patterns of pro-growth and ecologically detrimental discourse. Developed into an interactive online archive and installation series, the project reclaims computation as a space for reflection rather than control—inviting viewers to rethink the language shaping our ecological futures.
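Only the inference step is sketched below, via the Hugging Face transformers text-classification pipeline; the checkpoint path and the label names in the comment are placeholders for the project's own fine-tuned BERT model and label set.

```python
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="path/to/finetuned-bert")      # placeholder checkpoint path
sentences = [
    "The reservoir expansion will unlock significant economic growth for the region.",
    "Wetland habitat within the project footprint will be permanently displaced.",
]
for sentence in sentences:
    print(classifier(sentence))    # e.g. a pro-growth / ecologically-detrimental label + score
```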

The proliferation of Large Language Models (LLMs) raises a critical question about what it means to be human when we share an increasingly symbiotic relationship with persuasive and creative machines. This paper examines patterns of human-AI coevolution in creative writing, investigating how human craft and agency are adapting alongside machine capabilities. We challenge the prevailing notion of stylistic homogenization by examining diverse patterns in longitudinal writing data. Using a large-scale corpus spanning the pre- and post-LLM era, we observe patterns suggestive of a "Dual-Track Evolution": thematic convergence around AI-related topics, coupled with structured stylistic differentiation. Our analysis reveals three emergent adaptation patterns: authors showing increased similarity to AI style, those exhibiting decreased similarity, and those maintaining stylistic stability while engaging with AI-related themes. This Creative Archetype Map illuminates how authorship is coevolving with AI, contributing to discussions about human-AI collaboration, detection challenges, and the preservation of creative diversity.
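One plausible way (an assumption, not necessarily the paper's metric) to operationalize an author's stylistic similarity to AI-generated text is cosine similarity between text embeddings, for example with the sentence-transformers package:

```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")
human_passage = "The harbour lights blinked out one by one as the ferry slid past."
ai_like_passage = "In conclusion, the multifaceted tapestry of the harbour reveals key insights."

vectors = model.encode([human_passage, ai_like_passage])
print("style similarity:", cosine_similarity([vectors[0]], [vectors[1]])[0, 0])
```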