

Exhibitor Spot Talks

Exhibitor Spot Talks - Session 2

Exhibit Hall A,B
Wed 3 Dec 9:30 a.m. PST — 4:45 p.m. PST

Wed 3 Dec. 9:45 - 9:57 PST

The Importance and Fragility of CoT Monitorability

Bowen Baker

AI systems that verbalize their “thinking” in human language offer a new, yet possibly fragile, opportunity for AI safety. We’ve found that monitoring chains of thought (CoT) can be highly effective for catching misbehavior during frontier reasoning model training and deployment. However, chain-of-thought monitorability may prove fragile in the face of increased scaling and algorithmic advancements. In this talk, we’ll discuss existing work OpenAI has done in this area and where we are looking to go moving forward.

Financial markets provide a rigorous, dynamic testbed for developing and evaluating artificial intelligence. They couple prediction and action under real risk, exposing how easily models overfit and how rarely they generalize when conditions change. Unlike language models trained on static data, markets reward systems that adapt as the “language” of price and behavior evolves. In this talk, we argue that quantitative trading offers a principled framework for studying and building truly adaptive intelligence.

As state-of-the-art AI models continue to scale, access to efficient, cost-effective compute has become a bottleneck for both industry and academic research. In this talk, we present a systems-level analysis of decentralized GPU networks as a viable alternative to conventional centralized cloud infrastructure. We evaluate the performance characteristics of large-scale training and inference workloads running on heterogeneous, globally distributed GPU clusters, focusing on throughput, latency, fault tolerance, and cost efficiency. We further discuss scheduling strategies, orchestration challenges, and reliability considerations unique to decentralized environments, along with empirical benchmarks comparing decentralized and centralized clusters across common ML workloads. Our goal is to highlight how decentralized compute can broaden accessibility for researchers, reduce costs for large-scale experimentation, and offer new design spaces for distributed training in the era of rapidly growing model complexity.

Wed 3 Dec. 10:30 - 10:42 PST

Dolphin: A Large-Scale ASR Model for Eastern Languages

Xiaofeng Xin

Wed 3 Dec. 10:45 - 10:57 PST

AI for Creativity

Atlassian’s mission is to unleash the potential of every team. This mission underscores our commitment to empowering teams and organizations to achieve their best work through collaboration and productivity tools. Artificial Intelligence has become the most transformative force in achieving this goal.

At Atlassian, we have made significant investments in AI to fundamentally enhance team productivity. Our approach spans two dimensions: integrating AI deeply into our existing flagship products, such as Jira, Confluence, Loom, Jira Service Management, and Bitbucket, and developing AI-first products from the ground up.

One of the most ambitious outcomes of this effort is Rovo, Atlassian’s end-to-end enterprise AI solution. Rovo enables organizations to discover knowledge through enterprise search, gain deep insights through AI-powered chat, and take intelligent actions through Rovo agents.

Building Rovo and the AI platform that powers it has been a multi-year journey filled with innovation, experimentation, and learning. In this session, I’ll share key lessons from our experience: what worked well, where we failed fast, and how we continue to evolve our enterprise AI strategy to transform productivity for millions of users around the world.

Academic breakthroughs often falter at enterprise scale. This talk shares hard-won lessons from applying advanced AI at CommBank, from generative interfaces and multimodal fine-tuning to prompt optimisation and audio-to-text. We show how synthetic-data fine-tuning beats proprietary models in specialised domains like complaint handling, and why generation-time controls are vital for compliance.

Wed 3 Dec. 11:30 - 11:42 PST

Exhibitor Talk - Yokogawa Digital Corporation

Wed 3 Dec. 11:45 - 11:57 PST

Scaling Data Quality in the Era of LLMs: A Quality Framework

Olga Megorskaya

As Large Language Models and AI agents advance in capability and autonomy, an often-overlooked challenge emerges: their success hinges on access to high-quality training data that traditional quality assurance processes struggle to deliver at scale. This dependency creates a critical bottleneck—conventional QA methodologies cannot match the complexity and volume demands of modern AI systems, constraining innovation across the field. This session presents a hybrid quality assurance framework that combines intelligent project scoping, automated expert matching, and multi-layered validation through both AI agents and human oversight. We demonstrate how this integrated approach maintains data quality while scaling across diverse annotation projects, addressing key failure modes in traditional workflows. Through a case study using Toloka's self-service platform, we share design principles for building efficient and scalable data pipelines essential for training both LLMs and AI agents.

Wed 3 Dec. 12:00 - 12:12 PST

Exploring Diffusion Transformer Designs via Grafting

Juan Carlos Niebles

In this session, Juan Carlos Niebles presents grafting, a method for exploring new diffusion transformer (DiT) architectures by directly editing pretrained models. This approach enables systematic investigation of operator and structural variations - such as replacing attention with convolution or reconfiguring block depth - without full pretraining. Experiments show that grafted models retain strong generative quality (e.g., FID 2.38–2.64 vs. 2.27 for DiT-XL/2) using under 2% of pretraining compute. The results demonstrate that pretrained DiTs can serve as a foundation for efficient architectural design and analysis.
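As a toy illustration of the grafting idea described above, the sketch below swaps a single operator in an otherwise untouched model while every other block keeps its state. All class and method names here are invented for illustration; they are not from the talk or the underlying paper, and real grafting operates on pretrained network weights rather than placeholder objects.

```python
# Minimal sketch of "grafting": edit one operator in a pretrained model
# in place, leaving the remaining blocks (and their weights) intact.
class Block:
    """Placeholder for a network block; real blocks would hold weights."""
    def __init__(self, name):
        self.name = name

class AttentionBlock(Block):
    pass

class ConvBlock(Block):
    pass

class TinyDiT:
    """Stand-in for a diffusion transformer as a simple stack of blocks."""
    def __init__(self, depth=4):
        self.blocks = [AttentionBlock(f"attn_{i}") for i in range(depth)]

    def graft(self, index, new_block):
        # Replace only the operator at `index`; all other blocks are
        # untouched, which is what makes grafting cheap relative to
        # pretraining a new architecture from scratch.
        self.blocks[index] = new_block

model = TinyDiT()
model.graft(2, ConvBlock("conv_2"))
print([type(b).__name__ for b in model.blocks])
# → ['AttentionBlock', 'AttentionBlock', 'ConvBlock', 'AttentionBlock']
```

In practice the grafted block would then be lightly trained while the inherited blocks anchor generative quality, which is how the abstract's sub-2% pretraining-compute figure becomes plausible.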

Wed 3 Dec. 12:15 - 12:27 PST

Exhibitor Talk - Uber AI Solutions

Accelerating small molecule drug discovery with AI + Physics fundamentally depends on accurately predicting how potential drug candidates bind to target proteins in 3D space. Current structure prediction methods are limited by the severely restricted and biased experimental data available in the Protein Data Bank (PDB) and by their propensity to generate physically invalid poses. This talk introduces PEARL (Placing Every Atom in the Right Location), our generative foundation model that overcomes this low-data scientific regime by leveraging large-scale synthetic training data and an SO(3)-equivariant diffusion module to enforce core geometric principles, improving generalization and sample efficiency. PEARL establishes the new state of the art in protein-ligand structure prediction, demonstrating up to a 14.5% relative improvement on public benchmarks for generating accurate and physically valid poses (RMSD<2Å and PB-valid). The model’s novel approach to inference-time conditioning makes it substantially more useful for drug discovery programs, allowing users to leverage auxiliary structural information in a controlled manner. On an internal benchmark of protein-ligand systems relevant for small-molecule drug discovery programs, PEARL delivers nearly a 4-fold relative improvement over comparable baselines at the stricter, medicinal chemistry-relevant RMSD<1Å threshold. This talk is intended for ML researchers interested in how generative models are being applied to structural biology. Attendees will leave with a better understanding of applications for diffusion models in drug discovery and examples of ongoing problems at the forefront of the field where there is ample room for future research.
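The RMSD thresholds quoted above measure the root-mean-square deviation between predicted and reference atom positions. A minimal sketch of that metric, assuming the two atom lists are already matched in order (real evaluation pipelines must first establish that correspondence and typically superimpose the structures):

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the coordinates' units, e.g. Å)
    between two equal-length, index-matched lists of 3D points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("atom lists must be the same length")
    sq_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(sq_sum / len(coords_a))

# A single atom displaced by a 3-4-5 triangle deviates by exactly 5.0:
print(rmsd([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)]))  # → 5.0
```

A pose counting as "accurate" at RMSD<2Å means the predicted ligand atoms sit, on average in this quadratic sense, within two ångströms of the experimental positions.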

In less than a year, AI agents have evolved from a research curiosity into the foundation of some of the largest software platform updates in decades. These systems promise to automate substantial portions of knowledge work, and their progress has been rapid, with early 2025 reports by METR suggesting that the complexity of solvable tasks doubles roughly every seven months. In this talk, we take a closer empirical look at this claim by examining what it truly takes to benchmark agentic performance on long-running, open-ended knowledge work tasks. We review recent contributions from ServiceNow AI Research and others across domains such as browser use, multimodal understanding, data analytics, and deep research. We also discuss benchmarks that evaluate agentic safety and security, arguing that these dimensions cannot be meaningfully separated from primary task performance. Our analysis leads to a more nuanced picture of the field, highlighting both genuine advances and persistent challenges that frontier agents have yet to overcome.

Goals: (1) motivate the need to benchmark AI agents in realistic enterprise settings; (2) give an overview of recent research in this direction at ServiceNow.

Audience: academic and industry researchers interested in measuring the capabilities and reliability of AI agents.
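The seven-month doubling figure cited in the abstract implies exponential growth in achievable task horizons. The small sketch below is just the arithmetic implied by that claim, not METR's methodology or data:

```python
def horizon_growth(months, doubling_period=7.0):
    """Growth factor in solvable-task complexity after `months`,
    assuming it doubles every `doubling_period` months."""
    return 2 ** (months / doubling_period)

# One doubling period gives exactly a 2x horizon; a full year gives ~3.3x.
print(horizon_growth(7))             # → 2.0
print(round(horizon_growth(12), 2))  # → 3.28
```

Whether frontier agents actually sustain this curve on open-ended knowledge work is precisely what the talk's benchmarking discussion interrogates.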

Wed 3 Dec. 13:15 - 13:27 PST

The Quiet Enabler: AI-Amplified Discovery

Joseph Tracy

Multi-agent systems are quietly transforming scientific inquiry by orchestrating literature analysis, experiment design and data interpretation in parallel. This session outlines the system-level principles needed to successfully integrate AI in real research settings.

What You’ll Learn:
- How multi-agent architectures coordinate scientific reasoning
- Real-world examples of AI completing end-to-end discovery loops
- How to structure your own pipelines for fast, reproducible iteration
- The infrastructure traits that support high-tempo scientific workflows

Wed 3 Dec. 13:30 - 13:42 PST

Towards A Blueprint for Open Science of Foundation Models

Zhengzhong Liu · Jiannan Xiang

At the Institute for Foundation Models (IFM) we build state‑of‑the‑art foundation models and pursue open science, enabling the whole community to push the frontiers together. In this talk we'll showcase our next frontier: moving beyond text to models that understand the world. By building rich, interactive virtual‑world simulations we can embed intelligence in realistic environments, unlocking capabilities far beyond language generation. We will (1) introduce IFM's open‑source ethos, (2) give a quick overview of our SOTA models, and (3) explain how world models will reshape AI research. Join us to discover how we plan to bridge the gap from language models to truly world‑aware intelligence.

Wed 3 Dec. 13:45 - 13:57 PST

T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning

Shixiong Zhang · Genta Winata

Large Language Models (LLMs) are impressive intelligent agents, but they frequently struggle with effective multi-step planning, particularly in multi-turn conversations involving dependencies between tool calls. To address this challenge, we introduce T1, a specialized tool-augmented conversational dataset designed to capture and manage these inter-tool dependencies across diverse domains. T1 enables rigorous evaluation of agents' ability to coordinate tool use, integrate short- and long-term memory, and make dynamic replanning decisions. We will demonstrate results powered by T1-Agent, showcasing its ability to plan and reason in these complex, tool-dependent scenarios, ultimately setting a new standard for building reliable and sophisticated agentic workflows.
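To make the notion of an inter-tool dependency concrete, here is the kind of record such a dataset would need to represent: a turn where the second tool call's argument consumes the first call's result. The tool names and record schema below are invented for illustration and are not T1's actual format:

```python
# One hypothetical conversational turn: call c2 cannot be issued until
# call c1 has returned, because c2's "near" argument references c1's output.
turn = {
    "user": "Book a table near my hotel for tonight.",
    "tool_calls": [
        {"id": "c1", "tool": "get_hotel_location", "args": {}},
        {"id": "c2", "tool": "search_restaurants",
         "args": {"near": {"from_call": "c1", "field": "address"}}},
    ],
}

def dependencies(turn):
    """Return (dependent_call, prerequisite_call) id pairs found in args."""
    deps = []
    for call in turn["tool_calls"]:
        for value in call["args"].values():
            if isinstance(value, dict) and "from_call" in value:
                deps.append((call["id"], value["from_call"]))
    return deps

print(dependencies(turn))  # → [('c2', 'c1')]
```

An agent evaluated on such data must order its calls to respect these edges, and replan when an upstream call fails, which is the planning behavior the abstract describes.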

Wed 3 Dec. 14:00 - 14:12 PST

Gen AI Firsts - Qualcomm

Fatih Porikli

AI is rapidly reshaping the landscape of intelligent systems, and Qualcomm AI Research has been at the forefront of pioneering breakthroughs and proof-of-concepts. Our work spans LLMs, multimodal learning, intelligent agents, embodied AI, model efficiency, model adaptation, and visual content generation, all aimed at pushing the boundaries of what AI can achieve. By leveraging a full-stack research approach, we address system-level and feasibility challenges to enable efficient, high-performing solutions. This talk will highlight how these advances translate into practical demonstrations, including the benefits of on-device AI, and showcase Qualcomm’s leadership in bringing generative AI innovations closer to real-world deployment.

The $200B+ dental industry is undergoing a digital revolution, with advanced technologies such as AI/ML, 3D/CAD, and robotics rapidly transforming how dental products and services are provided. In this work, we introduce a novel approach to highly personalized (i.e., biomimetic) dental restoration design by developing a 3D generative AI model, DANTE (Dandy Automatic Natural prosThetics Engine), to overcome the challenges of bio-organic constraints, micro-level precision, and biomimicry.

Wed 3 Dec. 15:45 - 15:57 PST

Building closed-loop simulation with real-time generative modeling

Patrick Cho

As end-to-end policies approach human-level performance, it becomes increasingly difficult to identify edge and failure cases in the wild and to reproduce and solve them reliably. Simulation that can consistently evaluate a policy's performance against specific edge cases therefore becomes critical for model development. In addition, such a system should be versatile enough to apply to any scenario, support control-in-the-loop for policy fidelity, and have low compute requirements for use in reinforcement learning. Given these requirements, we developed a real-time, closed-loop simulation system based on generative modeling, and we will demonstrate how it allows us to reproduce real-world interventions in a generated world and solve them in the real world.

Wed 3 Dec. 16:00 - 16:12 PST

Charting the Course for Human+AI Collaboration at Upwork

Zhao Chen

Come join the Upwork AI team to learn about how our AI research empowers Uma, Upwork’s Mindful AI, to be a powerful and positive force for Human+AI collaboration on our platform. We will give an overview of Uma’s capabilities and pull the curtain back on how we train Uma to help both freelancers and clients do their work faster and more efficiently. We will also provide a look at our various AI research initiatives, including projects in AI robustness, knowledge retrieval, and agentic workflows, with details on how these initiatives fit into our long-term vision for Human+AI collaboration.

Wed 3 Dec. 16:30 - 16:42 PST

Open Source Models in Trading: Free Like a Puppy, Not Like a Beer

Wachi Bandara

A deep dive into how open-source models are transforming quantitative research and trading.

Thu 4 Dec. 16:45 - 17:00 PST

Accelerating AI for the Real World: Strategies for safe deployment and rapid scaling

Karan Kapoor · Alfiya Timorshina

How can breakthrough AI models be deployed safely, swiftly, and at scale? This session reveals how Firstsource powers rapid AI deployment through large-scale data operations, advanced red teaming, and robust quality frameworks. Join us to discover real-world case studies that highlight various use cases, from orchestrating global data collection and annotation to evaluating frontier LLMs and simulating platform threats, all while ensuring AI safety at scale.