NeurIPS
2025
More effort is needed to protect pedestrian privacy in the era of AI
Test-Time Adaptive Object Detection with Foundation Model
PolypSense3D: A Multi-Source Benchmark Dataset for Depth-Aware Polyp Size Measurement in Endoscopy
StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold
SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly
NegoCollab: A Common Representation Negotiation Approach for Heterogeneous Collaborative Perception
Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
Flow-Based Policy for Online Reinforcement Learning
Reconstructing Heterogeneous Biomolecules via Hierarchical Gaussian Mixtures and Part Discovery
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
Connectome-Based Modelling Reveals Orientation Maps in the Drosophila Optic Lobe
Hawaii: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models
Online Multi-Class Selection with Group Fairness Guarantee
Majority of the Bests: Improving Best-of-N via Bootstrapping
Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs
Orthogonal Contrastive Learning for Multi-Representation fMRI Analysis
Abstract Counterfactuals for Language Model Agents
Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models
Localized Data Shapley: Accelerating Valuation for Nearest Neighbor Algorithms
UniDomain: Pretraining a Unified PDDL Domain from Real-World Demonstrations for Generalizable Robot Task Planning
COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation
No Object Is an Island: Enhancing 3D Semantic Segmentation Generalization with Diffusion Models
When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product
3DID: Direct 3D Inverse Design for Aerodynamics with Physics-Aware Optimization
AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs
Don’t Give Up on Democratizing AI for the Wrong Reasons
PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts
Linguini: A benchmark for language-agnostic linguistic reasoning
SSRB: Direct Natural Language Querying to Massive Heterogeneous Semi-Structured Data
OpenGU: A Comprehensive Benchmark for Graph Unlearning
ChemX: A Collection of Chemistry Datasets for Benchmarking Automated Information Extraction
MUVR: A Multi-Modal Untrimmed Video Retrieval Benchmark with Multi-Level Visual Correspondence
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring
Care-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson’s Disease Gait Assessment
A Learning-Augmented Approach to Online Allocation Problems
DERD-Net: Learning Depth from Event-based Ray Densities
Reduction-based Pseudo-label Generation for Instance-dependent Partial Label Learning
More Than Just Functional: LLM-as-a-Critique for Efficient Code Generation
Memory-Augmented Potential Field Theory: A Framework for Adaptive Control in Non-Convex Domains
Limitations of Normalization in Attention
Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs
Learning to Plan Like the Human Brain via Visuospatial Perception and Semantic-Episodic Synergistic Decision-Making
Teaching Language Models to Evolve with Users: Dynamic Profile Modeling for Personalized Alignment
Dr. RAW: Towards General High-Level Vision from RAW with Efficient Task Conditioning
Enhancing Temporal Understanding in Video-LLMs through Stacked Temporal Attention in Vision Encoders
Predictable Scale (Part II) --- Farseer: A Refined Scaling Law in LLMs
Cognitive Predictive Processing: A Human-inspired Framework for Adaptive Exploration in Open-World Reinforcement Learning
A unified framework for establishing the universal approximation of transformer-type architectures
A machine learning approach that beats Rubik's cubes
PlanarGS: High-Fidelity Indoor 3D Gaussian Splatting Guided by Vision-Language Planar Priors
Information Theoretic Learning for Diffusion Models with Warm Start
Trust Region Reward Optimization and Proximal Inverse Reward Optimization Algorithm
A Dynamic Learning Strategy for Dempster-Shafer Theory with Applications in Classification and Enhancement
RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation
PointMAC: Meta-Learned Adaptation for Robust Test-Time Point Cloud Completion
TreeSplat: Mergeable Tree for Deformable Gaussian Splatting
Adaptive Sigmoid Clipping for Balancing the Direction–Magnitude Mismatch Trade-off in Differentially Private Learning
Resource-Constrained Federated Continual Learning: What Does Matter?
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
MoniTor: Exploiting Large Language Models with Instruction for Online Video Anomaly Detection
THD-BAR: Topology Hierarchical Derived Brain Autoregressive Modeling for EEG Generic Representations
F-Adapter: Frequency-Adaptive Parameter-Efficient Fine-Tuning in Scientific Machine Learning
SubTrack++: Gradient Subspace Tracking for Scalable LLM Training
Unsupervised Federated Graph Learning
FrameShield: Adversarially Robust Video Anomaly Detection
A Closer Look at Graph Transformers: Cross-Aggregation and Beyond
Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization
HypoBootstrap: A Bootstrapping Framework for Inductive Reasoning
Can Class-Priors Help Single-Positive Multi-Label Learning?
GraphKeeper: Graph Domain-Incremental Learning via Knowledge Disentanglement and Preservation
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
Storyboard-guided Alignment for Fine-grained Video Action Recognition
Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs
Strassen Attention, Split VC Dimension and Compositionality in Transformers
Pessimistic Data Integration for Policy Evaluation
Fair Matroid Selection
LLM at Network Edge: A Layer-wise Efficient Federated Fine-tuning Approach
Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks
Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent
Mesh Interpolation Graph Network for Dynamic and Spatially Irregular Global Weather Forecasting
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
FedIGL: Federated Invariant Graph Learning for Non-IID Graphs
DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving
Mysteries of the Deep: Role of Intermediate Representations in Out of Distribution Detection
Doubly Robust Alignment for Large Language Models
Aligning What Matters: Masked Latent Adaptation for Text-to-Audio-Video Generation
More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
BNMusic: Blending Environmental Noises into Personalized Music
MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling
Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression
Integral Imprecise Probability Metrics
CHPO: Constrained Hybrid-action Policy Optimization for Reinforcement Learning
Nearly-Linear Time Private Hypothesis Selection with the Optimal Approximation Factor
Shapley-Based Data Valuation for Weighted $k$-Nearest Neighbors
VPO: Reasoning Preferences Optimization Based on $\mathcal{V}$-Usable Information
GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
AugGen: Synthetic Augmentation using Diffusion Models Can Improve Recognition
Prompt Tuning Transformers for Data Memorization
One-Step is Enough: Sparse Autoencoders for Text-to-Image Diffusion Models
AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models
Diffusion-Guided Graph Data Augmentation
Fast Rate Bounds for Multi-Task and Meta-Learning with Different Sample Sizes
An Evidence-Based Post-Hoc Adjustment Framework for Anomaly Detection Under Data Contamination
Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples
Towards foundational LiDAR world models with efficient latent flow matching
RankMatch: A Novel Approach to Semi-Supervised Label Distribution Learning Leveraging Rank Correlation between Labels
NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding
Toward a Unified Geometry Understanding: Riemannian Diffusion Framework for Graph Generation and Prediction
Constant Bit-size Transformers Are Turing Complete
Navigating the MIL Trade-Off: Flexible Pooling for Whole Slide Image Classification
AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining
Transcending Cost-Quality Tradeoff in Agent Serving via Session-Awareness
Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers
IMPACT: Irregular Multi-Patch Adversarial Composition Based on Two-Phase Optimization
Compressed and Smooth Latent Space for Text Diffusion Modeling
Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
Visual Structures Help Visual Reasoning: Addressing the Binding Problem in LVLMs
Capturing Individual Human Preferences with Reward Features
How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning
Manipulating Feature Visualizations with Gradient Slingshots
Rope to Nope and Back Again: A New Hybrid Attention Strategy
Enhancing Privacy in Multimodal Federated Learning with Information Theory
Tight Bounds on the Distortion of Randomized and Deterministic Distributed Voting
Tighter CMI-Based Generalization Bounds via Stochastic Projection and Quantization
TRUST: Test-Time Refinement using Uncertainty-Guided SSM Traverses
$\texttt{STRCMP}$: Integrating Graph Structural Priors with Language Models for Combinatorial Optimization
InstructHOI: Context-Aware Instruction for Multi-Modal Reasoning in Human-Object Interaction Detection
The Quotient Bayesian Learning Rule
S-Crescendo: A Nested Transformer Weaving Framework for Scalable Nonlinear System in S-Domain Representation
On the VC dimension of deep group convolutional neural networks
The Promise of RL for Autoregressive Image Editing
Light-Weight Diffusion Multiplier and Uncertainty Quantification for Fourier Neural Operators
SEMPO: Lightweight Foundation Models for Time Series Forecasting
Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning
Efficient Knowledge Transfer in Federated Recommendation for Joint Venture Ecosystem
AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant Adversarial Patches
YEAST: Yet Another Sequential Test
PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning
Understanding Prompt Tuning and In-Context Learning via Meta-Learning
Breakthrough Sensor-Limited Single View: Towards Implicit Temporal Dynamics for Time Series Domain Adaptation
Turning the Tables: Enabling Backward Transfer via Causal-Aware LoRA in Continual Learning
Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework
HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning
The Hawthorne Effect in Reasoning Models: Evaluating and Steering Test Awareness
Improving the Straight-Through Estimator with Zeroth-Order Information
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models
Enhancing GUI Agent with Uncertainty-Aware Self-Trained Evaluator
Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA
Global Prompt Refinement with Non-Interfering Attention Masking for One-Shot Federated Learning
Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
GLNCD: Graph-Level Novel Category Discovery
When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners
Soft-consensual Federated Learning for Data Heterogeneity via Multiple Paths
Neural Attention Search
One Sample is Enough to Make Conformal Prediction Robust
CymbaDiff: Structured Spatial Diffusion for Sketch-based 3D Semantic Urban Scene Generation
Semi-supervised Graph Anomaly Detection via Robust Homophily Learning
Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems
BADiff: Bandwidth Adaptive Diffusion Model
Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction
Towards Unified Multimodal Interleaved Generation via Group Relative Policy Optimization
Addressing Mark Imbalance in Integration-free Marked Temporal Point Processes
The Quest for Universal Master Key Filters in DS-CNNs
Pin the Tail on the Model: Blindfolded Repair of User-Flagged Failures in Text-to-Image Services
AutoPartGen: Autoregressive 3D Part Generation and Discovery
Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning
Uncertainty-Aware Multi-Objective Reinforcement Learning-Guided Diffusion Models for 3D De Novo Molecular Design
Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions
Learning Simple Interpolants for Linear Integer Arithmetic
EAReranker: Efficient Embedding Adequacy Assessment for Retrieval Augmented Generation
Identifying Macro Causal Effects in C-DMGs over DMGs
Bridging Time and Linguistics: LLMs as Time Series Analyzer through Symbolization and Segmentation
Statistical inference for Linear Stochastic Approximation with Markovian Noise
Revealing Multimodal Causality with Large Language Models
Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
Leveraging robust optimization for llm alignment under distribution shifts
Relaxing partition admissibility in Cluster-DAGs: a causal calculus with arbitrary variable clustering
SPICED: A Synaptic Homeostasis-Inspired Framework for Unsupervised Continual EEG Decoding
$\epsilon$-Seg: Sparsely Supervised Semantic Segmentation of Microscopy Data
Second-order Optimization under Heavy-Tailed Noise: Hessian Clipping and Sample Complexity Limits
UFO-RL: Uncertainty-Focused Optimization for Efficient Reinforcement Learning Data Selection
Statistical Parity with Exponential Weights
Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning
SPARTAN: A Sparse Transformer World Model Attending to What Matters
3D Gaussian Flats: Hybrid 2D/3D Photometric Scene Reconstruction
Robust Graph Condensation via Classification Complexity Mitigation
Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool
Computational Hardness of Reinforcement Learning with Partial $q^{\pi}$-Realizability
Mixtures of Subspaces for Bandwidth Efficient Context Parallel Training
Rectifying Soft-Label Entangled Bias in Long-Tailed Dataset Distillation
Sample-Efficient Tabular Self-Play for Offline Robust Reinforcement Learning
Spurious-Aware Prototype Refinement for Reliable Out-of-Distribution Detection
Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning
World Models as Reference Trajectories for Rapid Motor Adaptation
Reasoning Is Not a Race: When Stopping Early Beats Going Deeper
Beyond Pairwise Connections: Extracting High-Order Functional Brain Network Structures under Global Constraints
UniGTE: Unified Graph–Text Encoding for Zero-Shot Generalization across Graph Tasks and Domains
scSplit: Bringing Severity Cognizance to Image Decomposition in Fluorescence Microscopy
LoRA-EnVar: Parameter-Efficient Hybrid Ensemble Variational Assimilation for Weather Forecasting
User-Instructed Disparity-aware Defocus Control
SPARKE: Scalable Prompt-Aware Diversity and Novelty Guidance in Diffusion Models via RKE Score
SHAP values via sparse Fourier representation
Token Perturbation Guidance for Diffusion Models
Reward-Aware Proto-Representations in Reinforcement Learning
Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs
Tight Lower Bounds and Improved Convergence in Performative Prediction
Implicit Modeling for Transferability Estimation of Vision Foundation Models
Detecting Data Deviations in Electronic Health Records
SSIMBaD: Sigma Scaling with SSIM-Guided Balanced Diffusion for AnimeFace Colorization
Non-convex entropic mean-field optimization via Best Response flow
ECO: Evolving Core Knowledge for Efficient Transfer
Explainably Safe Reinforcement Learning
NUTS: Eddy-Robust Reconstruction of Surface Ocean Nutrients via Two-Scale Modeling
Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
Residual Stream Analysis of Overfitting And Structural Disruptions
Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues
CausalVerse: Benchmarking Causal Representation Learning with Configurable High-Fidelity Simulations
TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence
Towards Self-Refinement of Vision-Language Models with Triangular Consistency
Dual-Res Tandem Mamba-3D: Bilateral Breast Lesion Detection and Classification on Non-contrast Chest CT
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection
Heterogeneous Graph Transformers for Simultaneous Mobile Multi-Robot Task Allocation and Scheduling under Temporal Constraints
Repurposing AlphaFold3-like Protein Folding Models for Antibody Sequence and Structure Co-design
A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
Incentivizing LLMs to Self-Verify Their Answers
Unlearned but Not Forgotten: Data Extraction after Exact Unlearning in LLM
Discretization-free Multicalibration through Loss Minimization over Tree Ensembles
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Stop the Nonconsensual Use of Nude Images in Research
Truthful Aggregation of LLMs with an Application to Online Advertising
BAM-ICL: Causal Hijacking In-Context Learning with Budgeted Adversarial Manipulation
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
FairDICE: Fairness-Driven Offline Multi-Objective Reinforcement Learning
Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
A Minimalistic Unified Framework for Incremental Learning across Image Restoration Tasks
CLiFT: Compressive Light-Field Tokens for Compute Efficient and Adaptive Neural Rendering
Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function
Consistency of the $k_n$-nearest neighbor rule under adaptive sampling
On Hierarchies of Fairness Notions in Cake Cutting: From Proportionality to Super Envy-Freeness
Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
On Learning Verifiers and Implications to Chain-of-Thought Reasoning
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts
Under the Shadow: Exploiting Opacity Variation for Fine-grained Shadow Detection
Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations
Lost in Transmission: When and Why LLMs Fail to Reason Globally
PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching
Uncertain Knowledge Graph Completion via Semi-Supervised Confidence Distribution Learning
4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming
Towards Generalizable Detector for Generated Image
Diffusion-Based Hierarchical Graph Neural Networks for Simulating Nonlinear Solid Mechanics
Optimal Single-Policy Sample Complexity and Transient Coverage for Average-Reward Offline RL
Learning to Zoom with Anatomical Relations for Medical Structure Detection
Training a Scientific Reasoning Model for Chemistry
Private Hyperparameter Tuning with Ex-Post Guarantee
Training the Untrainable: Introducing Inductive Bias via Representational Alignment
Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives
Error Forcing in Recurrent Neural Networks
A Markov Decision Process for Variable Selection in Branch & Bound
A Simple Linear Patch Revives Layer-Pruned Large Language Models
Neighbor-aware Contrastive Disambiguation for Cross-Modal Hashing with Redundant Annotations
Encouraging metric-aware diversity in contrastive representation space
Proper Hölder-Kullback Dirichlet Diffusion: A Framework for High Dimensional Generative Modeling
DREAM: Drafting with Refined Target Features and Entropy-Adaptive Cross-Attention Fusion for Multimodal Speculative Decoding
LD-RoViS: Training-free Robust Video Steganography for Deterministic Latent Diffusion Model
Neighborhood Self-Dissimilarity Attention for Medical Image Segmentation
A Closed-Form Solution for Fast and Reliable Adaptive Testing
A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules
A duality framework for analyzing random feature and two-layer neural networks
Policy learning “without” overlap: Pessimism and generalized empirical Bernstein’s inequality
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Asymptotic Theory of Geometric and Adaptive $k$-Means Clustering
Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling
Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems
Stochastic-Constrained Stochastic Optimization with Markovian Data
Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets
Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks
Real-Time Hyper-Personalized Generative AI Should Be Regulated to Prevent the Rise of "Digital Heroin"
Position: Machine Learning Conferences Should Establish a "Refutations and Critiques" Track
Position: Biology is the Challenge Physics-Informed ML Needs to Evolve
AI Testing Should Account for Sophisticated Strategic Behaviour
Position: If Innovation in AI systematically Violates Fundamental Rights, Is It Innovation at All?
Position: Towards Bidirectional Human-AI Alignment
Collective Bargaining in the Information Economy Can Address AI-Driven Power Concentration
Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor
SMRS: advocating a unified reporting standard for surrogate models in the artificial intelligence era
Military AI Needs Technically-Informed Regulation to Safeguard AI Research and its Applications
Prohibiting Generative AI in any Form of Weapon Control
The Right to Red-Team: Adversarial AI Literacy as a Civic Imperative in K-12 Education
Position: AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
PARALLELPROMPT: Extracting Parallelism from Large Language Model Queries
PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions
DataSIR: A Benchmark Dataset for Sensitive Information Recognition
Meta-World+: An Improved, Standardized, RL Benchmark
InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts
CodeAssistBench (CAB): Dataset & Benchmarking for Multi-turn Chat-Based Code Assistance
Intend to Move: A Multimodal Dataset for Intention-Aware Human Motion Understanding
DAVE: Diagnostic benchmark for Audio Visual Evaluation
OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models
Sheetpedia: A 300K-Spreadsheet Corpus for Spreadsheet Intelligence and LLM Fine-Tuning
ConnectomeBench: Can LLMs proofread the connectome?
Struct-Bench: A Benchmark for Differentially Private Structured Text Generation
LawShift: Benchmarking Legal Judgment Prediction Under Statute Shifts
PHANTOM: A Benchmark for Hallucination Detection in Financial Long-Context QA
Open-Insect: Benchmarking Open-Set Recognition of Novel Species in Biodiversity Monitoring
SWE-smith: Scaling Data for Software Engineering Agents
Factorio Learning Environment
MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models
Merlin L48 Spectrogram Dataset
MolVision: Molecular Property Prediction with Vision Language Models
mmWalk: Towards Multi-modal Multi-view Walking Assistance
CheMixHub: Datasets and Benchmarks for Chemical Mixture Property Prediction
PSBench: a large-scale benchmark for estimating the accuracy of protein complex structural models
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
TAPAS: Datasets for Learning the Learning with Errors Problem
R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization
GTPBD: A Fine-Grained Global Terraced Parcel and Boundary Dataset
SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
DQVis Dataset: Natural Language to Biomedical Visualization
TreeFinder: A US-Scale Benchmark Dataset for Individual Tree Mortality Monitoring Using High-Resolution Aerial Imagery
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Aeolus: A Multi-structural Flight Delay Dataset
MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data
EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition
Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm
REFED: A Subject Real-time Dynamic Labeled EEG-fNIRS Synchronized Recorded Emotion Dataset
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Benchmarking Large Language Models with Integer Sequence Generation Tasks
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
CPSea: Large-scale cyclic peptide-protein complex dataset for machine learning in cyclic peptide design
RGB-to-Polarization Estimation: A New Task and Benchmark Study
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding
Establishing Best Practices in Building Rigorous Agentic Benchmarks
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
QUT-DV25: A Dataset for Dynamic Analysis of Next-Gen Software Supply Chain Attacks
MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations
PF∆: A Benchmark Dataset for Power Flow under Load, Generation, and Topology Variations
Reasoning Gym: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
Distributional Training Data Attribution: What do Influence Functions Sample?
GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data
VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance
Through the Lens: Benchmarking Deepfake Detectors Against Moiré-Induced Distortions
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
FlySearch: Exploring how vision-language models explore
CLEVER: A Curated Benchmark for Formally Verified Code Generation
STSBench: A Large-Scale Dataset for Modeling Neuronal Activity in the Dorsal Stream of Primate Visual Cortex
SeasonBench-EA: A Multi-Source Benchmark for Seasonal Prediction and Numerical Model Post-Processing in East Asia
MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?
PUO-Bench: A Panel Understanding and Operation Benchmark with A Privacy-Preserving Framework
From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Videos Generation
Towards A Generalist Code Embedding Model Based On Massive Data Synthesis
A Scalable, Causal, and Energy Efficient Framework for Neural Decoding with Spiking Neural Networks
Augmenting Biological Fitness Prediction Benchmarks with Landscapes Features from GraphFLA
egoEMOTION: Egocentric Vision and Physiological Signals for Emotion and Personality Recognition in Real-world Tasks
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes
COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation
VMDT: Decoding the Trustworthiness of Video Foundation Models
CarbonGlobe: A Global-Scale, Multi-Decade Dataset and Benchmark for Carbon Forecasting in Forest Ecosystems
Towards Automated Petrography
EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding
AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
UniHG: A Large-scale Universal Heterogeneous Graph Dataset and Benchmark for Representation Learning and Cross-Domain Transferring
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios
Hyperphantasia: A Benchmark for Evaluating the Mental Visualization Capabilities of Multimodal LLMs
QCircuitBench: A Large-Scale Dataset for Benchmarking Quantum Algorithm Design
Multimodal Bandits: Regret Lower Bounds and Optimal Algorithms
ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models
SWE-bench Goes Live!
Toward a Vision-Language Foundation Model for Medical Data: Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation
Amortized Variational Transdimensional Inference
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360° Firefighting Video
Improving Deep Learning for Accelerated MRI With Data Filtering
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
WritingBench: A Comprehensive Benchmark for Generative Writing
STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving
Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment
Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics
Benchmarking Egocentric Multimodal Goal Inference for Assistive Wearable Agents
MLIP Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials via an Open, Accessible Benchmark Platform
IRRISIGHT: A Large-Scale Multimodal Dataset and Scalable Pipeline to Address Irrigation and Water Management in Agriculture
CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research
C3Po: Cross-View Cross-Modality Correspondence by Pointmap Prediction
STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Long-term Intracortical Neural activity and Kinematics (LINK): An intracortical neural dataset for chronic brain-machine interfaces, neuroscience, and machine learning
ML4CFD Competition: Results and Retrospective Analysis
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
IR-OptSet: An Optimization-Sensitive Dataset for Advancing LLM-Based IR Optimizer
CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs
Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models
CogPhys: Assessing Cognitive Load via Multimodal Remote and Contact-based Physiological Sensing
AutoOpt: A Dataset and a Unified Framework for Automating Optimization Problem Solving
HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks
CaMiT: A Time-Aware Car Model Dataset for Classification and Generation
BoltzNCE: Learning likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
CSI-Bench: A Large-Scale In-the-Wild Dataset for Multi-task WiFi Sensing
Toward Real-world Text Image Forgery Localization: Structured and Interpretable Data Synthesis
CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks
Towards Understanding Camera Motions in Any Video
Massive Sound Embedding Benchmark (MSEB)
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
OpenLex3D: A Tiered Benchmark for Open-Vocabulary 3D Scene Representations
MONITRS: Multimodal Observations of Natural Incidents Through Remote Sensing
Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts
BO4Mob: Bayesian Optimization Benchmarks for High-Dimensional Urban Mobility Problem
In the Eye of MLLM: Benchmarking Egocentric Video Intent Understanding with Gaze-Guided Prompting
NS-Gym: A Comprehensive and Open-Source Simulation Framework for Non-Stationary Markov Decision Processes
EngiBench: A Framework for Data-Driven Engineering Design Research
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
SMMILE: An expert-driven benchmark for multimodal medical in-context learning
Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
EVAAA: A Virtual Environment Platform for Essential Variables in Autonomous and Adaptive Agents
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
MTBBench: A Multimodal Sequential Clinical Decision-Making Benchmark in Oncology
DermaCon-IN: A Multiconcept-Annotated Dermatological Image Dataset of Indian Skin Disorders for Clinical AI Research
TaiwanVQA: Benchmarking and Enhancing Cultural Understanding in Vision-Language Models
BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks
VADB: A Large-Scale Video Aesthetic Database with Professional and Multi-Dimensional Annotations
IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
Words That Unite The World: A Unified Framework for Deciphering Central Bank Communications
EndoBench: A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis
What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes
DGCBench: A Deep Graph Clustering Benchmark
LIFEBENCH: Evaluating Length Instruction Following in Large Language Models
ExAct: A Video-Language Benchmark for Expert Action Analysis
OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization
QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model
Identifiability of Deep Polynomial Neural Networks
AGI-Elo: How Far Are We From Mastering A Task?
Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction
LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought
Measuring Fingerprints of Web-filtered Text Datasets and Fingerprint Propagation Through Training
Semantic-KG: Using Knowledge Graphs to Construct Benchmarks for Measuring Semantic Similarity
All that structure matches does not glitter
BEDLAM2.0: Synthetic humans and cameras in motion
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models
Alchemist: Turning Public Text-to-Image Data into Generative Gold
FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models
COGNAC: Cooperative Graph-based Networked Agent Challenges for Multi-Agent Reinforcement Learning
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
DecoyDB: A Dataset for Graph Contrastive Learning in Protein-Ligand Binding Affinity Prediction
Chain-of-Model Learning for Language Model
ClinBench: A Standardized Multi-Domain Framework for Evaluating Large Language Models in Clinical Information Extraction
RBench-V: A Primary Assessment for Visual Reasoning Models with Multimodal Outputs
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
C-SEO Bench: Does Conversational SEO Work?
Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
A Temporal Difference Method for Stochastic Continuous Dynamics
Torch-Uncertainty: Deep Learning Uncertainty Quantification
3EED: Ground Everything Everywhere in 3D
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
EDBench: Large-Scale Electron Density Data for Molecular Modeling
T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning
Imbalances in Neurosymbolic Learning: Characterization and Mitigating Strategies
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
EconGym: A Scalable AI Testbed with Diverse Economic Tasks
AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy
NoBOOM: Chemical Process Datasets for Industrial Anomaly Detection
Contrastive Learning with Data Misalignment: Feature Purity, Training Dynamics and Theoretical Generalization Guarantees
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
STARC-9: A Large-scale Dataset for Multi-Class Tissue Classification for CRC Histopathology
NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods
ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests
Quantifying Generalisation in Imitation Learning
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
TAPVid-360: Tracking Any Point in 360 from Narrow Field of View Video
ArchPower: Dataset for Architecture-Level Power Modeling of Modern CPU Design
FLiP: Towards Comprehensive and Reliable Evaluation of Federated Prompt Learning
Seeking and Updating with Live Visual Knowledge
MMOT: The First Challenging Benchmark for Drone-based Multispectral Multi-Object Tracking
Dynamic Risk Assessments for Offensive Cybersecurity Agents
CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors
The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements
ReinAD: Towards Real-world Industrial Anomaly Detection with a Comprehensive Contrastive Dataset
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
LexiCon: a Benchmark for Planning under Temporal Constraints in Natural Language
DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios
PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation
Risk-Averse Total-Reward Reinforcement Learning
A Practical Guide for Incorporating Symmetry in Diffusion Policy
MathArena: Evaluating LLMs on Uncontaminated Math Competitions
Bridging Equivariant GNNs and Spherical CNNs for Structured Physical Domains
SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem
PSI: A Benchmark for Human Interpretation and Response in Traffic Interactions
ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs
Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series
UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions
Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind
Uncertainty-Sensitive Privileged Learning
Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation
Spik-NeRF: Spiking Neural Networks for Neural Radiance Fields
BrainMoE: Cognition Joint Embedding via Mixture-of-Expert Towards Robust Brain Foundation Model
Uncover Governing Law of Pathology Propagation Mechanism Through A Mean-Field Game
SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference
Quantum Doubly Stochastic Transformers
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation
Towards Identifiability of Hierarchical Temporal Causal Representation Learning
Mitigating Sexual Content Generation via Embedding Distortion in Text-conditioned Diffusion Models
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Rooms from Motion: Un-posed Indoor 3D Object Detection as Localization and Mapping
RidgeLoRA: Matrix Ridge Enhanced Low-Rank Adaptation of Large Language Models
Distributional LLM-as-a-Judge
Refinement Methods for Distributed Distribution Estimation under $\ell^p$-Losses
A Stable Whitening Optimizer for Efficient Neural Network Training
Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models
Deep Legendre Transform
Privacy Reasoning in Ambiguous Contexts
Dual-Space Semantic Synergy Distillation for Continual Learning of Unlabeled Streams
Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
Sample-Efficient Multi-Round Generative Data Augmentation for Long-Tail Instance Segmentation
DyFlow: Dynamic Workflow Framework for Agentic Reasoning
CF-VLM: CounterFactual Vision-Language Fine-tuning
Learning CAD Modeling Sequences via Projection and Part Awareness
Gaze-VLM: Bridging Gaze and VLMs through Attention Regularization for Egocentric Understanding
SurfelSplat: Learning Efficient and Generalizable Gaussian Surfel Representations for Sparse-View Surface Reconstruction
Towards a Geometric Understanding of Tensor Learning via the t-Product
Fully Autonomous Neuromorphic Navigation and Dynamic Obstacle Avoidance
Audio-Sync Video Generation with Multi-Stream Temporal Control
DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models
MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning
HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving
CVGL: Causal Learning and Geometric Topology
GeoComplete: Geometry-Aware Diffusion for Reference-Driven Image Completion
Non-stationary Bandit Convex Optimization: A Comprehensive Study
Recurrent Attention-based Token Selection for Efficient Streaming Video-LLMs
PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding
From Sequence to Structure: Uncovering Substructure Reasoning in Transformers
GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents
RUAGO: Effective and Practical Retain-Free Unlearning via Adversarial Attack and OOD Generator
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
PLD: A Choice-Theoretic List-Wise Knowledge Distillation
Streaming Audio Generation from Discrete Tokens via Streaming Flow Matching
CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs
Flow Field Reconstruction with Sensor Placement Policy Learning
Provable Gradient Editing of Deep Neural Networks
Learning conformational ensembles of proteins based on backbone geometry
Generalizable, real-time neural decoding with hybrid state-space models
This Time is Different: An Observability Perspective on Time Series Foundation Models
Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning
FlowDAS: A Stochastic Interpolant-based Framework for Data Assimilation
FlexOLMo: Open Language Models for Flexible Data Use
Matching Markets Meet LLMs: Algorithmic Reasoning with Ranked Preferences
Backward Conformal Prediction
Learning Multi-Source and Robust Representations for Continual Learning
Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
Detecting Generated Images by Fitting Natural Image Distributions
ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation
Spectral Estimation with Free Decompression
Asymptotically Stable Quaternion-valued Hopfield-structured Neural Network with Periodic Projection-based Supervised Learning Rules
On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization
CoUn: Empowering Machine Unlearning via Contrastive Learning
FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency
Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning
DOTA: Distributional Test-time Adaptation of Vision-Language Models
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
Understanding outer learning rates in Local SGD
Path-specific effects for pulse-oximetry guided decisions in critical care
Understanding Generalization in Physics Informed Models through Affine Variety Dimensions
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
FP64 is All You Need: Rethinking Failure Modes in Physics-Informed Neural Networks
Improving Formal Reasoning of Transformer with State Stack
Conformal Inference under High-Dimensional Covariate Shifts via Likelihood-Ratio Regularization
CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder
Direct Fisher Score Estimation for Likelihood Maximization
Denoising Trajectory Biases for Zero-Shot AI-Generated Image Detection
Bridging Scales: Spectral Theory Reveals How Local Connectivity Rules Sculpt Global Neural Dynamics in Spatially Extended Networks
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding
A Finite Sample Analysis of Distributional TD Learning with Linear Function Approximation
Inference with correlated priors using sisters cells
Improve Temporal Reasoning in Multimodal Large Language Models via Video Contrastive Decoding
Adversarial Robustness of Nonparametric Regression
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning
How Memory in Optimization Algorithms Implicitly Modifies the Loss
Prediction with expert advice under additive noise
SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
pLSTM: parallelizable Linear Source Transition Mark networks
RCCDA: Adaptive Model Updates in the Presence of Concept Drift under a Constrained Resource Budget
Understanding the Gain from Data Filtering in Multimodal Contrastive Learning
Lorentz Local Canonicalization: How to make any Network Lorentz-Equivariant
Learn2Mix: Training Neural Networks Using Adaptive Data Integration
GRIP: A Graph-Based Reasoning Instruction Producer
When Data Can't Meet: Estimating Correlation Across Privacy Barriers
Cost-aware LLM-based Online Dataset Annotation
ENMA: Tokenwise Autoregression for Continuous Neural PDE Operators
Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms
A Physics-preserved Transfer Learning Method for Differential Equations
Copresheaf Topological Neural Networks: A Generalized Deep Learning Framework
A TRIANGLE Enables Multimodal Alignment Beyond Cosine Similarity
ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning
VLMs can Aggregate Scattered Training Patches
Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees
PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
Exploring the Translation Mechanism of Large Language Models
AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees
Decoding Causal Structure: End-to-End Mediation Pathways Inference
Token-Level Self-Play with Importance-Aware Guidance for Large Language Models
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling
Time-uniform and Asymptotic Confidence Sequence of Quantile under Local Differential Privacy
Structural Causal Bandits under Markov Equivalence
Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration
The Bias-Variance Tradeoff in Data-Driven Optimization: A Local Misspecification Perspective
Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents
When Causal Dynamics Matter: Adapting Causal Strategies through Meta-Aware Interventions
TARFVAE: Efficient One-Step Generative Time Series Forecasting via TARFLOW based VAE
RespoDiff: Dual-Module Bottleneck Transformation for Responsible & Faithful T2I Generation
HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis
Dimension-free Score Matching and Time Bootstrapping for Diffusion Models
Rationalized All-Atom Protein Design with Unified Multi-Modal Bayesian Flow
Correlated Low-Rank Adaptation for ConvNets
Transformers for Mixed-type Event Sequences
Bayesian Ego-graph inference for Networked Multi-Agent Reinforcement Learning
A Multimodal BiMamba Network with Test-Time Adaptation for Emotion Recognition Based on Physiological Signals
AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
Attention! Your Vision Language Model Could Be Maliciously Manipulated
Tree-Sliced Entropy Partial Transport
Dynamics-Aligned Latent Imagination in Contextual World Models for Zero-Shot Generalization
Provably Efficient Online RLHF with One-Pass Reward Modeling
PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation
Robust Explanations of Graph Neural Networks via Graph Curvatures
ReDi: Rectified Discrete Flow
Adaptive Context Length Optimization with Low-Frequency Truncation for Multi-Agent Reinforcement Learning
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers, and Gradient Clipping
Embedding Principle of Homogeneous Neural Network for Classification Problem
Gradient-Guided Epsilon Constraint Method for Online Continual Learning
Diffusion Models and the Manifold Hypothesis: Log-Domain Smoothing is Geometry Adaptive
Connecting Neural Models Latent Geometries with Relative Geodesic Representations
In-Context Learning Strategies Emerge Rationally
Finding separatrices of dynamical flows with Deep Koopman Eigenfunctions
Adjusting Initial Noise to Mitigate Memorization in Text-to-Image Diffusion Models
Dynamic Regret Reduces to Kernelized Static Regret
Seeing the Wind from a Falling Leaf
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
SpecEdge: Scalable Edge-Assisted Serving Framework for Interactive LLMs
Theoretical Insights into In-context Learning with Unlabeled Data
Exploring Tradeoffs through Mode Connectivity for Multi-Task Learning
RAGRouter: Learning to Route Queries to Multiple Retrieval-Augmented Language Models
Towards a Pairwise Ranking Model with Orderliness and Monotonicity for Label Enhancement
Uncovering the Spectral Bias in Diagonal State Space Models
Fast-in-Slow: A Dual-System VLA Model Unifying Fast Manipulation within Slow Reasoning
Accelerating Visual-Policy Learning through Parallel Differentiable Simulation
The Best Instruction-Tuning Data are Those That Fit
Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models
NeuroGenPoisoning: Neuron-Guided Attacks on Retrieval-Augmented Generation of LLM via Genetic Optimization of External Knowledge
Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data
RAPTR: Radar-based 3D Pose Estimation using Transformer
PRIMT: Preference-based Reinforcement Learning with Multimodal Feedback and Trajectory Synthesis from Foundation Models
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
MJ-Video: Benchmarking and Rewarding Video Generation with Fine-Grained Video Preference
Distributionally Robust Performative Optimization
Sampling by averaging: A multiscale approach to score estimation
IDOL: Meeting Diverse Distribution Shifts with Prior Physics for Tropical Cyclone Multi-Task Estimation
Masked Gated Linear Unit
FedRAM: Federated Reweighting and Aggregation for Multi-Task Learning
Distributional Autoencoders Know the Score
Infinite-Width Limit of a Single Attention Layer: Analysis via Tensor Programs
Memory by accident: a theory of learning as a byproduct of network stabilization
Consistency Conditions for Differentiable Surrogate Losses
Diffusion Generative Modeling on Lie Group Representations
Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models
VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models
Multi-Task Vehicle Routing Solver via Mixture of Specialized Experts under State-Decomposable MDP
Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
DLoFT: Gradient-Decoupled Fine-Tuning for Generalizable Long Chain-of-Thought Reasoning
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Self-diffusion for Solving Inverse Problems
Smart Surrogate Losses for Contextual Stochastic Linear Optimization with Robust Constraints
Self-Supervised Learning of Graph Representations for Network Intrusion Detection
Zero-shot protein stability prediction by inverse folding models: a free energy interpretation
Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values
Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
Solving Neural Min-Max Games: The Role of Architecture, Initialization & Dynamics
Preference-driven Knowledge Distillation for Few-shot Node Classification
Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework
Bit-swapping Oriented Twin-memory Multi-view Clustering in Lifelong Incomplete Scenarios
Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration
Efficient Fairness-Performance Pareto Front Computation
Eliciting Reasoning in Language Models with Cognitive Tools
Personalized Visual Content Generation in Conversational Systems
Logic.py: Bridging the Gap between LLMs and Constraint Solvers
Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-based Decoding
Explaining the Law of Supply and Demand via Online Learning
Inference of Whole Brain Electrophysiological Networks Through Multimodal Integration of Simultaneous Scalp and Intracranial EEG
Efficiently Verifiable Proofs of Data Attribution
ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model
RFMPose: Generative Category-level Object Pose Estimation via Riemannian Flow Matching
HoliTom: Holistic Token Merging for Fast Video Large Language Models
OpenCUA: Open Foundations for Computer-Use Agents
FedFACT: A Provable Framework for Controllable Group-Fairness Calibration in Federated Learning
The Underappreciated Power of Vision Models for Graph Structural Understanding
Delving into Large Language Models for Effective Time-Series Anomaly Detection
Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
Covariances for Free: Exploiting Mean Distributions for Training-free Federated Learning
Deep Continuous-Time State-Space Models for Marked Event Sequences
FedWMSAM: Fast and Flat Federated Learning via Weighted Momentum and Sharpness-Aware Minimization
The Complexity of Symmetric Equilibria in Min-Max Optimization and Team Zero-Sum Games
Understanding Parametric and Contextual Knowledge Reconciliation within Large Language Models
Towards Robust Parameter-Efficient Fine-Tuning for Federated Learning
BeliefMapNav: 3D Voxel-Based Belief Map for Zero-Shot Object Navigation
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation
Bandit and Delayed Feedback in Online Structured Prediction
Functional Matching of Logic Subgraphs: Beyond Structural Isomorphism
On the Loss of Context Awareness in General Instruction Fine-tuning
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Faster Algorithms for Structured John Ellipsoid Computation
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms
The Rise of Parameter Specialization for Knowledge Storage in Large Language Models
Smoothed Differentiation Efficiently Mitigates Shattered Gradients in Explanations
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
EA3D: Online Open-World 3D Object Extraction from Streaming Videos
Learning Counterfactual Outcomes Under Rank Preservation
Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data
Curly Flow Matching for Learning Non-gradient Field Dynamics
Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks
Flick: Empowering Federated Learning with Commonsense Knowledge
OpenBox: Annotate Any Bounding Boxes in 3D
MDReID: Modality-Decoupled Learning for Any-to-Any Multi-Modal Object Re-Identification
Discovering Important Experts for Mixture-of-Experts Models Pruning Through a Theoretical Perspective
MR. Video: MapReduce as an Effective Principle for Long Video Understanding
Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
Escaping saddle points without Lipschitz smoothness: the power of nonlinear preconditioning
Differentially Private Gomory-Hu Trees
Graph Your Own Prompt
Unified Reinforcement and Imitation Learning for Vision-Language Models
AlphaFold Database Debiasing for Robust Inverse Folding
Miss-ReID: Delivering Robust Multi-Modality Object Re-Identification Despite Missing Modalities
CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment
PointTruss: K-Truss for Point Cloud Registration
GMM-based VAE model with Normalising Flow for effective stochastic segmentation
A Cautionary Tale on Integrating Studies with Disparate Outcome Measures for Causal Inference
VideoTitans: Scalable Video Prediction with Integrated Short- and Long-term Memory
Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Deep Taxonomic Networks for Unsupervised Hierarchical Prototype Discovery
InstaInpaint: Instant 3D-Scene Inpainting with Masked Large Reconstruction Model
SNAP: Low-Latency Test-Time Adaptation with Sparse Updates
Probabilistic Token Alignment for Large Language Model Fusion
Rebalancing Contrastive Alignment with Bottlenecked Semantic Increments in Text-Video Retrieval
Masked Diffusion Models as Energy Minimization
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
ZeroS: Zero-Sum Linear Attention for Efficient Transformers
Continual Optimization with Symmetry Teleportation for Multi-Task Learning
Faster Generic Identification in Tree-Shaped Structural Causal Models
Bounds on the computational complexity of neurons due to dendritic morphology
HEIR: Learning Graph-Based Motion Hierarchies
Neural Collapse is Globally Optimal in Deep Regularized ResNets and Transformers
Detecting High-Stakes Interactions with Activation Probes
AdaMSS: Adaptive Multi-Subspace Approach for Parameter-Efficient Fine-Tuning
Inference-Time Hyper-Scaling with KV Cache Compression
Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation
VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models
ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
Certifying Concavity and Monotonicity in Games via Sum-of-Squares Hierarchies
Variational Inference with Mixtures of Isotropic Gaussians
Traversal Verification for Speculative Tree Decoding
TF-MAS: Training-free Mamba2 Architecture Search
Learning from Disjoint Views: A Contrastive Prototype Matching Network for Fully Incomplete Multi-View Clustering
Retrv-R1: A Reasoning-Driven MLLM Framework for Universal and Efficient Multimodal Retrieval
Hierarchical Implicit Neural Emulators
Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
Safety Pretraining: Toward the Next Generation of Safe AI
Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs
Unlocking Multimodal Mathematical Reasoning via Process Reward Model
For Better or for Worse, Transformers Seek Patterns for Memorization
GeoAda: Efficiently Finetune Geometric Diffusion Models with Equivariant Adapters
Asymmetric Duos: Sidekicks Improve Uncertainty
PARCO: Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization
Hierarchical Demonstration Order Optimization for Many-shot In-Context Learning
Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
Compute-Optimal Scaling for Value-Based Deep RL
Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm
Agnostic Continuous-Time Online Learning
TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration
FedRACE: A Hierarchical and Statistical Framework for Robust Federated Learning
Unveiling the Uncertainty in Embodied and Operational Carbon of Large AI Models through a Probabilistic Carbon Accounting Model
Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning
When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers
AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections
From Faults to Features: Pretraining to Learn Robust Representations against Sensor Failures
Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models
Secure and Confidential Certificates of Online Fairness
Towards Pre-trained Graph Condensation via Optimal Transport
Object-Centric Concept-Bottlenecks
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
An Adaptive Algorithm for Bilevel Optimization on Riemannian Manifolds
Tightening Regret Lower and Upper Bounds in Restless Rising Bandits
A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1
GSPN-2: Efficient Parallel Sequence Modeling
Precise Asymptotics and Refined Regret of Variance-Aware UCB
On the Optimality of the Median-of-Means Estimator under Adversarial Contamination
What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
Thumb on the Scale: Optimal Loss Weighting in Last Layer Retraining
Robust Distributed Estimation: Extending Gossip Algorithms to Ranking and Trimmed Means
PRESCRIBE: Predicting Single-Cell Responses with Bayesian Estimation
HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models
Competitive Advantage Attacks to Decentralized Federated Learning
Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
Diversity as a Reward: Fine-Tuning LLMs on a Mixture of Domain-Undetermined Data
Formal Models of Active Learning from Contrastive Examples
Posterior Sampling by Combining Diffusion Models with Annealed Langevin Dynamics
Evolving and Regularizing Meta-Environment Learner for Fine-Grained Few-Shot Class-Incremental Learning
Sketched Gaussian Mechanism for Private Federated Learning
Sum Estimation under Personalized Local Differential Privacy
PhysVLM-AVR: Active Visual Reasoning for Multimodal Large Language Models in Physical Environments
Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks
Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
Training Language Models to Reason Efficiently
Discovering Compositional Hallucinations in LVLMs
Neuro-Spectral Architectures for Causal Physics-Informed Networks
Improving Perturbation-based Explanations by Understanding the Role of Uncertainty Calibration
Revisiting Orbital Minimization Method for Neural Operator Decomposition
Federated Continual Learning via Orchestrating Multi-Scale Expertise
NFL-BA: Near-Field Light Bundle Adjustment for SLAM in Dynamic Lighting
Robust Label Proportions Learning
SyncHuman: Synchronizing 2D and 3D Generative Models for Single-view Human Reconstruction
Multimodal Tabular Reasoning with Privileged Structured Information
Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination
Learning to Better Search with Language Models via Guided Reinforced Self-Training
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
In Silico Mapping of Visual Categorical Selectivity Across the Whole Brain
Information-Theoretic Discrete Diffusion
Partial Physics Informed Diffusion Model for Ocean Chlorophyll Concentration Reconstruction
PubSub-VFL: Towards Efficient Two-Party Split Learning in Heterogeneous Environments via Publisher/Subscriber Architecture
Convergent Functions, Divergent Forms
SEAL: Semantic-Aware Hierarchical Learning for Generalized Category Discovery
FlashMo: Geometric Interpolants and Frequency-Aware Sparsity for Scalable Efficient Motion Generation
Personalized Image Editing in Text-to-Image Diffusion Models via Collaborative Direct Preference Optimization
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
Modelling the control of offline processing with reinforcement learning
HyPlaneHead: Rethinking Tri-plane-like Representations in Full-Head Image Synthesis
MetaKoopman: Bayesian Meta-Learning of Koopman Operators for Modeling Structured Dynamics under Distribution Shifts
Dynamic View Synthesis as an Inverse Problem
EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
SignFlow Bipartite Subgraph Network For Large-Scale Graph Link Sign Prediction
SceneForge: Enhancing 3D-text alignment with Structured Scene Compositions
Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors
Gaussian Processes for Shuffled Regression
GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning
Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference
EchoShot: Multi-Shot Portrait Video Generation
Treatment Effect Estimation for Optimal Decision-Making
Near-Optimal Quantum Algorithms for Computing (Coarse) Correlated Equilibria of General-Sum Games
Synergy Between the Strong and the Weak: Spiking Neural Networks are Inherently Self-Distillers
Adversarial Diffusion for Robust Reinforcement Learning
AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks
Adjusted Count Quantification Learning on Graphs
ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
Reinforcement Learning Meets Masked Generative Models: Mask-GRPO for Text-to-Image Generation
VIBE: Annotation-Free Video-to-Text Information Bottleneck Evaluation for TL;DR
Momentum Multi-Marginal Schrödinger Bridge Matching
Lua-LLM: Learning Unstructured-Sparsity Allocation for Large Language Models
ZEBRA: Towards Zero-Shot Cross-Subject Generalization for Universal Brain Visual Decoding
Latent Refinement via Flow Matching for Training-free Linear Inverse Problem Solving
On the Entropy Calibration of Language Models
Measuring AI Ability to Complete Long Software Tasks
In Search of Adam’s Secret Sauce
MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning
Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach
Self-Supervised Direct Preference Optimization for Text-to-Image Diffusion Models
Median Selection with Noisy and Structural Information
Exploring Diffusion Transformer Designs via Grafting
On the Convergence of Stochastic Smoothed Multi-Level Compositional Gradient Descent Ascent
IA-GGAD: Zero-shot Generalist Graph Anomaly Detection via Invariant and Affinity Learning
Decoupled Entropy Minimization
SpecMAS: A Multi-Agent System for Self-Verifying System Generation via Formal Model Checking
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning
Multi-Agent Learning under Uncertainty: Recurrence vs. Concentration
Accurately Predicting Protein Mutational Effects via a Hierarchical Many-Body Attention Network
Searching Latent Program Spaces
Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Learn and Ensemble Bridge Adapters for Multi-domain Task Incremental Learning
Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles
NeurIPT: Foundation Model for Neural Interfaces
Exploiting LLMs for Automatic Hypothesis Assessment via a Logit-Based Calibrated Prior
Any-stepsize Gradient Descent for Separable Data under Fenchel–Young Losses
The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement
Spatiotemporal Consensus with Scene Prior for Unsupervised Domain Adaptive Person Search
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
MVSMamba: Multi-View Stereo with State Space Model
Understanding and Enhancing Message Passing on Heterophilic Graphs via Compatibility Matrix
DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis
ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks
Balancing Positive and Negative Classification Error Rates in Positive-Unlabeled Learning
Robust LLM Alignment via Distributionally Robust Direct Preference Optimization
StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant
FreqExit: Enabling Early-Exit Inference for Visual Autoregressive Models via Frequency-Aware Guidance
Assignments for Congestion-Averse Agents: Seeking Competitive and Envy-Free Solutions
Time-Embedded Algorithm Unrolling for Computational MRI
Transductive Conformal Inference for Full Ranking
SPACE: SPike-Aware Consistency Enhancement for Test-Time Adaptation in Spiking Neural Networks
Nabla-R2D3: Effective and Efficient 3D Diffusion Alignment with 2D Rewards
Guiding LLM Decision-Making with Fairness Reward Models
Convergence Rates of Constrained Expected Improvement
Gaussian Regression-Driven Tensorized Incomplete Multi-View Clustering with Dual Manifold Regularization
Next Semantic Scale Prediction via Hierarchical Diffusion Language Models
Machine Unlearning under Overparameterization
Learning Diffusion Models with Flexible Representation Guidance
Frequency-Aware Token Reduction for Efficient Vision Transformer
When Models Don’t Collapse: On the Consistency of Iterative MLE
$\boldsymbol{\lambda}$-Orthogonality Regularization for Compatible Representation Learning
Differentiation Through Black-Box Quadratic Programming Solvers
Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching
From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging
OCN: Effectively Utilizing Higher-Order Common Neighbors for Better Link Prediction
GSRF: Complex-Valued 3D Gaussian Splatting for Efficient Radio-Frequency Data Synthesis
FedMGP: Personalized Federated Learning with Multi-Group Text-Visual Prompts
Efficient Data Selection at Scale via Influence Distillation
LLM Meets Diffusion: A Hybrid Framework for Crystal Material Generation
The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
One Filters All: A Generalist Filter For State Estimation
ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models
Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference
Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling
Geometric Algorithms for Neural Combinatorial Optimization with Constraints
Controlling Thinking Speed in Reasoning Models
SegMASt3R: Geometry Grounded Segment Matching
Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks
Non-Singularity of the Gradient Descent Map for Neural Networks with Piecewise Analytic Activations
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Characterization and Learning of Causal Graphs from Hard Interventions
Entropic Time Schedulers for Generative Diffusion Models
How to Scale Second-Order Optimization
Robust Satisficing Gaussian Process Bandits Under Adversarial Attacks
Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes
PathVQ: Reforming Computational Pathology Foundation Model for Whole Slide Image Analysis via Vector Quantization
Optimal Mistake Bounds for Transductive Online Learning
A Private Approximation of the 2nd-Moment Matrix of Any Subsamplable Input
Out-of-Distribution Generalized Graph Anomaly Detection with Homophily-aware Environment Mixup
StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs
Predictive Preference Learning from Human Interventions
Dual Alignment Framework for Few-shot Learning with Inter-Set and Intra-Set Shifts
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Sampling from multi-modal distributions with polynomial query complexity in fixed dimension via reverse diffusion
What Expressivity Theory Misses: Message Passing Complexity for GNNs
Continuous Soft Actor-Critic: An Off-Policy Learning Method Robust to Time Discretization
Set-LLM: A Permutation-Invariant LLM
Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
Spectral Perturbation Bounds for Low-Rank Approximation with Applications to Privacy
Conditional Representation Learning for Customized Tasks
Non-Stationary Structural Causal Bandits
Class-aware Domain Knowledge Fusion and Fission for Continual Test-Time Adaptation
Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling
Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization
MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE
Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow
Learning from Delayed Feedback in Games via Extra Prediction
Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
SpectraLDS: Provable Distillation for Linear Dynamical Systems
Can LLMs Reason Over Non-Text Modalities in a Training-Free Manner? A Case Study with In-Context Representation Learning
Beyond the Average: Distributional Causal Inference under Imperfect Compliance
Edit Flows: Variable Length Discrete Flow Matching with Sequence-Level Edit Operations
Better Language Model Inversion by Compactly Representing Next-Token Distributions
Flux4D: Flow-based Unsupervised 4D Reconstruction
Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization
A-Mem: Agentic Memory for LLM Agents
Investigating and Mitigating Catastrophic Forgetting in Medical Knowledge Injection through Internal Knowledge Augmentation Learning
Learning Parameterized Skills from Demonstrations
NeuroH-TGL: Neuro-Heterogeneity Guided Temporal Graph Learning Strategy for Brain Disease Diagnosis
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models
Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning
A Gradient Guided Diffusion Framework for Chance Constrained Programming
FACT: Mitigating Inconsistent Hallucinations in LLMs via Fact-Driven Alternating Code-Text Training
A Counterfactual Semantics for Hybrid Dynamical Systems
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
Fully Spiking Neural Networks for Unified Frame-Event Object Tracking
Contextual Thompson Sampling via Generation of Missing Data
CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning
TabSTAR: A Tabular Foundation Model for Tabular Data with Text Fields
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
A learnability analysis on neuro-symbolic learning
OASIS: One-Shot Federated Graph Learning via Wasserstein Assisted Knowledge Integration
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
CCL: Causal-aware In-context Learning for Out-of-Distribution Generalization
Learning World Models for Interactive Video Generation
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents
UEPI: Universal Energy-Behavior-Preserving Integrators for Energy Conservative/Dissipative Differential Equations
NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems
Provable Scaling Laws for the Test-Time Compute of Large Language Models
A Closer Look at NTK Alignment: Linking Phase Transitions in Deep Image Regression
Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
Optimized Minimal 3D Gaussian Splatting
ElliCE: Efficient and Provably Robust Algorithmic Recourse via the Rashomon Sets
AtlasGS: Atlanta-world Guided Surface Reconstruction with Implicit Structured Gaussians
Fourier Token Merging: Understanding and Capitalizing Frequency Domain for Efficient Image Generation
FoGE: Fock Space inspired encoding for graph prompting
SCoT: Unifying Consistency Models and Rectified Flows via Straight-Consistent Trajectories
Low Precision Streaming PCA
Obliviator Reveals the Cost of Nonlinear Guardedness in Concept Erasure
Improving Generative Behavior Cloning via Self-Guidance and Adaptive Chunking
3D Human Pose Estimation with Muscles
TAI3: Testing Agent Integrity in Interpreting User Intent
Non-exchangeable Conformal Prediction with Optimal Transport: Tackling Distribution Shift with Unlabeled Data
Optimal Online Change Detection via Random Fourier Features
SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration
Towards 3D Objectness Learning in an Open World
Stitch and Tell: A Structured Data Augmentation Method for Spatial Understanding
Rethinking Verification for LLM Code Generation: From Generation to Testing
PlayerOne: Egocentric World Simulator
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations
Adaptive Data Analysis for Growing Data
Knowledge Distillation of Uncertainty using Deep Latent Factor Model
Personalized Safety in LLMs: A Benchmark and A Planning-Based Agent Approach
VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights
AttentionPredictor: Temporal Patterns Matter for KV Cache Compression
HMVLM: Human Motion-Vision-Language Model via MoE LoRA
A Geometrical Analysis of Kernel Ridge Regression and its Applications
Hybrid-Balance GFlowNet for Solving Vehicle Routing Problems
OmniGen-AR: AutoRegressive Any-to-Image Generation
Exponential Dynamic Energy Network for High Capacity Sequence Memory
CRRL: Learning Channel-invariant Neural Representations for High-performance Cross-day Decoding
Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization
AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking
Informed Correctors for Discrete Diffusion Models
Online Prediction with Limited Selectivity
LLM Query Scheduling with Prefix Reuse and Latency Constraints
First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training
Probabilistic Reasoning with LLMs for Privacy Risk Estimation
Prediction-Powered Causal Inferences
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time
Instant4D: 4D Gaussian Splatting in Minutes
Geometry Aware Operator Transformer as an efficient and accurate neural surrogate for PDEs on arbitrary domains
A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias
Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
The Temporal Graph of Bitcoin Transactions
Conformal Prediction for Time-series Forecasting with Change Points
Amortized Active Generation of Pareto Sets
Hybrid Latent Representations for PDE Emulation
Fractional Diffusion Bridge Models
SPMDM: Enhancing Masked Diffusion Models through Simplifying Sampling Path
Permissioned LLMs: Enforcing Access Control in Large Language Models
Adaptable Safe Policy Learning from Multi-task Data with Constraint Prioritized Decision Transformer
JAFAR: Jack up Any Feature at Any Resolution
Confidence-Aware With Prototype Alignment for Partial Multi-label Learning
COS3D: Collaborative Open-Vocabulary 3D Segmentation
Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models
Fostering the Ecosystem of AI for Social Impact Requires Expanding and Strengthening Evaluation Standards
Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models
FairDD: Fair Dataset Distillation
Distributionally Robust Feature Selection
Toward Artificial Palpation: Representation Learning of Touch on Soft Bodies
Tree-Guided Diffusion Planner
Your Pre-trained LLM is Secretly an Unsupervised Confidence Calibrator
PRESTO: Preimage-Informed Instruction Optimization for Prompting Black-Box LLMs
Differentially Private Bilevel Optimization: Efficient Algorithms with Near-Optimal Rates
Predicting the Performance of Black-box Language Models with Follow-up Queries
Bridging the gap to real-world language-grounded visual concept learning
MoRIC: A Modular Region-based Implicit Codec for Image Compression
ZeroSep: Separate Anything in Audio with Zero Training
Stochastic Gradients under Nuisances
Private Continual Counting of Unbounded Streams
4KAgent: Agentic Any Image to 4K Super-Resolution
Towards A Translative Model of Sperm Whale Vocalization
A Single-Loop First-Order Algorithm for Linearly Constrained Bilevel Optimization
Perception Encoder: The best visual embeddings are not at the output of the network
DP²O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution
Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback
Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning
When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions
Distribution-Aligned Decoding for Efficient LLM Task Adaptation
System Prompt Optimization with Meta-Learning
Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism
Estimation of Treatment Effects in Extreme and Unobserved Data
Steering Generative Models with Experimental Data for Protein Fitness Optimization
Analyzing Similarity Metrics for Data Selection for Language Model Pretraining
BevSplat: Resolving Height Ambiguity via Feature-Based Gaussian Primitives for Weakly-Supervised Cross-View Localization
Anytime-valid, Bayes-assisted, Prediction-Powered Inference
On Transferring Transferability: Towards a Theory for Size Generalization
SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
CATransformers: Carbon Aware Transformers Through Joint Model-Hardware Optimization
Quantum speedup of non-linear Monte Carlo problems
Online Feedback Efficient Active Target Discovery in Partially Observable Environments
Quantifying Distributional Invariance in Causal Subgraph for IRM-Free Graph Generalization
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
Active Target Discovery under Uninformative Priors: The Power of Permanent and Transient Memory
Multi-agent Markov Entanglement
Diffusion Guided Adversarial State Perturbations in Reinforcement Learning
How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime?
PolarQuant: Leveraging Polar Transformation for Key Cache Quantization and Decoding Acceleration
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
Off-policy Reinforcement Learning with Model-based Exploration Augmentation
Transstratal Adversarial Attack: Compromising Multi-Layered Defenses in Text-to-Image Models
Solving Inequality Proofs with Large Language Models
StegoZip: Enhancing Linguistic Steganography Payload in Practice with Large Language Models
Unified all-atom molecule generation with neural fields
MoleBridge: Synthetic Space Projecting with Discrete Markov Bridges
Brain-Informed Fine-Tuning for Improved Multilingual Understanding in Language Models
Fast Monte Carlo Tree Diffusion: 100× Speedup via Parallel and Sparse Planning
Constrained Best Arm Identification
Harnessing Feature Resonance under Arbitrary Target Alignment for Out-of-Distribution Node Detection
InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks
BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection
Amortized Sampling with Transferable Normalizing Flows
SMARTraj$^2$: A Stable Multi-City Adaptive Method for Multi-View Spatio-Temporal Trajectory Representation Learning
ReMindRAG: Low-Cost LLM-Guided Knowledge Graph Traversal for Efficient RAG
The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets
Agents Robust to Distribution Shifts Learn Causal World Models Even Under Mediation
Solving Continuous Mean Field Games: Deep Reinforcement Learning for Non-Stationary Dynamics
Multi-Class Support Vector Machine with Differential Privacy
Flow Density Control: Generative Optimization Beyond Entropy-Regularized Fine-Tuning
Efficient Last-Iterate Convergence in Solving Extensive-Form Games
Contimask: Explaining Irregular Time Series via Perturbations in Continuous Time
Instance-Optimality for Private KL Distribution Estimation
Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations
Agnostic Learning under Targeted Poisoning: Optimal Rates and the Role of Randomness
From Likelihood to Fitness: Improving Variant Effect Prediction in Protein and Genome Language Models
Understanding and Rectifying Safety Perception Distortion in VLMs
Understanding Adam Requires Better Rotation Dependent Assumptions
What Makes Math Problems Hard for Reinforcement Learning: A Case Study
Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models
TRoVe: Discovering Error-Inducing Static Feature Biases in Temporal Vision-Language Models
A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values
Preference Learning with Response Time: Robust Losses and Guarantees
MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation
Fixed-Point RNNs: Interpolating from Diagonal to Dense
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Convex Potential Mirror Langevin Algorithm for Efficient Sampling of Energy-Based Models
RESPIN-S1.0: A read speech corpus of 10000+ hours in dialects of nine Indian Languages
Heterogeneous Adversarial Play in Interactive Environments
Flat Channels to Infinity in Neural Loss Landscapes
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
DAMamba: Vision State Space Model with Dynamic Adaptive Scan
HollowFlow: Efficient Sample Likelihood Evaluation using Hollow Message Passing
MagCache: Fast Video Generation with Magnitude-Aware Cache
RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation
How Does Topology Bias Distort Message Passing in Graph Recommender? A Dirichlet Energy Perspective
Multilevel neural simulation-based inference
OmniSegmentor: A Flexible Multi-Modal Learning Framework for Semantic Segmentation
Diffusion Feature Field for Text-based 3D Editing with Gaussian Splatting
Register and [CLS] tokens induce a decoupling of local and global features in large ViTs
Defending Multimodal Backdoored Models by Repulsive Visual Prompt Tuning
Tree of Preferences for Diversified Recommendation
Asymmetric Dual-Lens Video Deblurring
VL-SAM-V2: Open-World Object Detection with General and Specific Query Fusion
Conformal Prediction Beyond the Seen: A Missing Mass Perspective for Uncertainty Quantification in Generative Models
Cross-Modal Representational Knowledge Distillation for Enhanced Spike-informed LFP Modeling
Zero-Shot Blind-Spot Image Denoising via Cross-Scale Non-Local Pixel Refilling
BaRISTA: Brain Scale Informed Spatiotemporal Representation of Human Intracranial Neural Activity
Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Generalized Linear Mode Connectivity for Transformers
Local-Global Associative Frames for Symmetry-Preserving Crystal Structure Modeling
Stabilizing LTI Systems under Partial Observability: Sample Complexity and Fundamental Limits
CoralVQA: A Large-Scale Visual Question Answering Dataset for Coral Reef Image Understanding
QuanDA: Quantile-Based Discriminant Analysis for High-Dimensional Imbalanced Classification
Continuous Domain Generalization
Advancing Interpretability of CLIP Representations with Concept Surrogate Model
Do Language Models Use Their Depth Efficiently?
Multi-Agent Collaboration via Evolving Orchestration
Partial Correlation Network Estimation by Semismooth Newton Methods
Are Pixel-Wise Metrics Reliable for Computerized Tomography Reconstruction?
Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding
LoRASuite: Efficient LoRA Adaptation Across Large Language Model Upgrades
An Ellipsoid Algorithm for Online Convex Optimization
NormFit: A Lightweight Solution for Few-Shot Federated Learning with Non-IID Data
Reverse Diffusion Sequential Monte Carlo Samplers
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
REMI: Reconstructing Episodic Memory During Internally Driven Path Planning
Physics-informed Value Learner for Offline Goal-Conditioned Reinforcement Learning
Towards Robust Uncertainty Calibration for Composed Image Retrieval
Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards
UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset
OSKAR: Omnimodal Self-supervised Knowledge Abstraction and Representation
HQA-VLAttack: Towards High Quality Adversarial Attack on Vision-Language Pre-Trained Models
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning
Topology-Aware Conformal Prediction for Stream Networks
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
Audio Super-Resolution with Latent Bridge Models
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Learning Latent Variable Models via Jarzynski-adjusted Langevin Algorithm
RobIA: Robust Instance-aware Continual Test-time Adaptation for Deep Stereo
Mechanism Design via the Interim Relaxation
Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning
Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling
Parameter-free Algorithms for the Stochastically Extended Adversarial Model
Learning Without Augmenting: Unsupervised Time Series Representation Learning via Frame Projections
Open-World Drone Active Tracking with Goal-Centered Rewards
A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning
Scalable Exploration via Ensemble++
BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning
Consistent Supervised-Unsupervised Alignment for Generalized Category Discovery
Flow Equivariant Recurrent Neural Networks
GraphTOP: Graph Topology-Oriented Prompting for Graph Neural Networks
Algorithm- and Data-Dependent Generalization Bounds for Diffusion Models
On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
Erasing Conceptual Knowledge from Language Models
Can Agent Fix Agent Issues?
Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks
MixAT: Combining Continuous and Discrete Adversarial Training for LLMs
One Subgoal at a Time: Zero-Shot Generalization to Arbitrary Linear Temporal Logic Requirements in Multi-Task Reinforcement Learning
V-CECE: Visual Counterfactual Explanations via Conceptual Edits
Learning from Demonstrations via Capability-Aware Goal Sampling
Hawk: Leveraging Spatial Context for Faster Autoregressive Text-to-Image Generation
GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data
Learning to Instruct for Visual Instruction Tuning
MI-TRQR: Mutual Information-Based Temporal Redundancy Quantification and Reduction for Energy-Efficient Spiking Neural Networks
Attention Mechanism, Max-Affine Partition, and Universal Approximation
Approximate Domain Unlearning for Vision-Language Models
The Good, the Bad and the Ugly: Meta-Analysis of Watermarks, Transferable Attacks and Adversarial Defenses
Markov Persuasion Processes: Learning to Persuade From Scratch
VR-Drive: Viewpoint-Robust End-to-End Driving with Feed-Forward 3D Gaussian Splatting
Embodied Crowd Counting
URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training
PDEfuncta: Spectrally-Aware Neural Representation for PDE Solution Modeling
The Cost of Compression: Tight Quadratic Black-Box Attacks on Sketches for $\ell_2$ Norm Estimation
Latency NMS Attacks: Is It Real Life or Is It Just Fantasy?
Acceleration via silver step-size on Riemannian manifolds with applications to Wasserstein space
Elucidated Rolling Diffusion Models for Probabilistic Forecasting of Complex Dynamics
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Uncoupled and Convergent Learning in Monotone Games under Bandit Feedback
Exact Expressive Power of Transformers with Padding
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Rethinking Losses for Diffusion Bridge Samplers
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
WildCAT3D: Appearance-Aware Multi-View Diffusion in the Wild
Overcoming Challenges of Long-Horizon Prediction in Driving World Models
Distribution-Aware Tensor Decomposition for Compression of Convolutional Neural Networks
IOSTOM: Offline Imitation Learning from Observations via State Transition Occupancy Matching
Training-Free Test-Time Adaptation via Shape and Style Guidance for Vision-Language Models
Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts
Fair Continuous Resource Allocation with Equality of Impact
From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
HubGT: Fast Graph Transformer with Decoupled Hierarchy Labeling
Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment
Brain network science modelling of sparse neural networks enables Transformers and LLMs to perform as fully connected
Scale-invariant attention
Single-Step Operator Learning for Conditioned Time-Series Diffusion Models
Information-Theoretic Reward Decomposition for Generalizable RLHF
SPOT: Scalable Policy Optimization with Trees for Markov Decision Processes
DualFocus: Depth from Focus with Spatio-Focal Dual Variational Constraints
HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs
Multimodal Negative Learning
Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning
Actor-Free Continuous Control via Structurally Maximizable Q-Functions
CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
What are you sinking? A geometric approach on attention sink
On the Convergence of Single-Timescale Actor-Critic
Adaptive Data-Borrowing for Improving Treatment Effect Estimation using External Controls
CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision
EventMG: Efficient Multilevel Mamba-Graph Learning for Spatiotemporal Event Representation
Controllable Human-centric Keyframe Interpolation with Generative Prior
The Curse of Depth in Large Language Models
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
LeVo: High-Quality Song Generation with Multi-Preference Alignment
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation
Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables
QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation
Jacobian-Based Interpretation of Nonlinear Neural Encoding Model
Hierarchical Information Aggregation for Incomplete Multimodal Alzheimer's Disease Diagnosis
Learning Dynamics of RNNs in Closed-Loop Environments
StruDiCO: Structured Denoising Diffusion with Gradient-free Inference-stage Boosting for Memory and Time Efficient Combinatorial Optimization
Image Editing As Programs with Diffusion Models
Rethinking Residual Distribution in Locate-then-Edit Model Editing
Shape it Up! Restoring LLM Safety during Finetuning
Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models
Improving Task-Specific Multimodal Sentiment Analysis with General MLLMs via Prompting
MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework
MIND: Material Interface Generation from UDFs for Non-Manifold Surface Reconstruction
FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation
SAMPO: Scale-wise Autoregression with Motion Prompt for Generative World Models
Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning
VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models
SD-VLM: Spatial Measuring and Understanding with Depth-Encoded Vision-Language Models
Implicit Generative Property Enhancer
Act to See, See to Act: Diffusion-Driven Perception-Action Interplay for Adaptive Policies
3D-GSRD: 3D Molecular Graph Auto-Encoder with Selective Re-mask Decoding
Handling Missing Responses under Cluster Dependence with Applications to Language Model Evaluation
Scaffolding Dexterous Manipulation with Vision-Language Models
Hadamard Test is Sufficient for Efficient Quantum Gradient Estimation with Lie Algebraic Symmetries
The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
SPiDR: A Simple Approach for Zero-Shot Safety in Sim-to-Real Transfer
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
From Information to Generative Exponent: Learning Rate Induces Phase Transitions in SGD
UniMotion: A Unified Motion Framework for Simulation, Prediction and Planning
DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs
Counterfactual Implicit Feedback Modeling
ImageSentinel: Protecting Visual Datasets from Unauthorized Retrieval-Augmented Image Generation
Epistemic Uncertainty Estimation in Regression Ensemble Models with Pairwise Epistemic Estimators
Dynamic Focused Masking for Autoregressive Embodied Occupancy Prediction
WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting
KOALA++: Efficient Kalman-Based Optimization with Gradient-Covariance Products
TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster
Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models
Deep RL Needs Deep Behavior Analysis: Exploring Implicit Planning by Model-Free Agents in Open-Ended Environments
Object Concepts Emerge from Motion
Least squares variational inference
ROOT: Rethinking Offline Optimization as Distributional Translation via Probabilistic Bridge
Alleviating Hallucinations in Large Language Models through Multi-Model Contrastive Decoding and Dynamic Hallucination Detection
Seg-VAR: Image Segmentation with Visual Autoregressive Modeling
LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction
Theoretical Investigation of Adafactor for Non-Convex Smooth Optimization
PyraMotion: Attentional Pyramid-Structured Motion Integration for Co-Speech 3D Gesture Synthesis
Understanding the Evolution of the Neural Tangent Kernel at the Edge of Stability
Gatekeeper: Improving Model Cascades Through Confidence Tuning
Length Generalization via Auxiliary Tasks
Learning Linear Attention in Polynomial Time
Building 3D Representations and Generating Motions From a Single Image via Video-Generation
Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
Estimating Model Performance Under Covariate Shift Without Labels
Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning
How Classifier Features Transfer to Downstream: An Asymptotic Analysis in a Two-Layer Model
Continual Knowledge Adaptation for Reinforcement Learning
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Learning Preferences without Interaction for Cooperative AI: A Hybrid Offline-Online Approach
VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability
ChatbotID: Identifying Chatbots with Granger Causality Test
Beyond Scores: Proximal Diffusion Models
Rising from Ashes: Generalized Federated Learning via Dynamic Parameter Reset
Automated Detection of Visual Attribute Reliance with a Self-Reflective Agent
Rectifying Shortcut Behaviors in Preference-based Reward Learning
Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards
Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers
Improving planning and MBRL with temporally-extended actions
Pool Me Wisely: On the Effect of Pooling in Transformer-Based Models
Joint Design of Protein Surface and Backbone Using a Diffusion Bridge Model
MPMAvatar: Learning 3D Gaussian Avatars with Accurate and Robust Physics-Based Dynamics
Fine-grained List-wise Alignment for Generative Medication Recommendation
ProDAG: Projected Variational Inference for Directed Acyclic Graphs
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Set Smoothness Unlocks Clarke Hyper-stationarity in Bilevel Optimization
MemEIC: A Step Toward Continual and Compositional Knowledge Editing
RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing
Knowledge Distillation Detection for Open-weights Models
Curriculum Abductive Learning
Coloring Learning for Heterophilic Graph Representation
FedLPA: Local Prior Alignment for Heterogeneous Federated Generalized Category Discovery
AdvPrefix: An Objective for Nuanced LLM Jailbreaks
Practical Kernel Selection for Kernel-based Conditional Independence Test
Less is More: Improving LLM Alignment via Preference Data Selection
Alias-Free ViT: Fractional Shift Invariance via Linear Attention
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
More of the Same: Persistent Representational Harms Under Increased Representation
OmniDraft: A cross-vocabulary, online adaptive drafter for on-device speculative decoding
Learning-Augmented Streaming Algorithms for Correlation Clustering
Spectral Conditioning of Attention Improves Transformer Performance
TimePerceiver: An Encoder-Decoder Framework for Generalized Time-Series Forecasting
Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models
On Linear Mode Connectivity of Mixture-of-Experts Architectures
Max Entropy Moment Kalman Filter for Polynomial Systems with Arbitrary Noise
Generalizable Insights for Graph Transformers in Theory and Practice
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Imagine360: Immersive 360 Video Generation from Perspective Anchor
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training
One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling
Holistic Order Prediction in Natural Scenes
GRIT: Teaching MLLMs to Think with Images
FlexAC: Towards Flexible Control of Associative Reasoning in Multimodal Large Language Models
Fréchet Geodesic Boosting
Interpreting vision transformers via residual replacement model
Functional Complexity-adaptive Temporal Tensor Decomposition
SIFusion: A Unified Fusion Framework for Multi-granularity Arctic Sea Ice Forecasting
OmniGaze: Reward-inspired Generalizable Gaze Estimation in the Wild
GeoRemover: Removing Objects and Their Causal Visual Artifacts
Uncertainty Quantification with the Empirical Neural Tangent Kernel
A Latent Multilayer Graphical Model For Complex, Interdependent Systems
GUARD: Constructing Realistic Two-Player Matrix and Security Games for Benchmarking Game-Theoretic Algorithms
Analogy-based Multi-Turn Jailbreak against Large Language Models
Time-o1: Time-Series Forecasting Needs Transformed Label Alignment
MALinZero: Efficient Low-Dimensional Search for Mastering Complex Multi-Agent Planning
Omnidirectional 3D Scene Reconstruction from Single Image
Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2
Informed Initialization for Bayesian Optimization and Active Learning
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Text-to-Code Generation for Modular Building Layouts in Building Information Modeling
Learning Dense Hand Contact Estimation from Imbalanced Data
Riemannian Consistency Model
Curl Descent: Non-Gradient Learning Dynamics with Sign-Diverse Plasticity
Emergent Temporal Correspondences from Video Diffusion Transformers
Low-Rank Graphon Learning for Networks
An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models
Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models
PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language Models
A Few Moments Please: Scalable Graphon Learning via Moment Matching
Infrequent Exploration in Linear Bandits
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise
Multidimensional Bayesian Utility Maximization: Tight Approximations to Welfare
On Optimal Steering to Achieve Exact Fairness
Learning to Insert for Constructive Neural Vehicle Routing Solver
Adjoint Schrödinger Bridge Sampler
MOSDT: Self-Distillation-Based Decision Transformer for Multi-Agent Offline Safe Reinforcement Learning
Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
Geometry-Aware Collaborative Multi-Solutions Optimizer for Model Fine-Tuning with Parameter Efficiency
SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning
GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations
Improved Balanced Classification with Theoretically Grounded Loss Functions
SpikingVTG: A Spiking Detection Transformer for Video Temporal Grounding
Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
$\texttt{BetaConform}$: Efficient MAP Estimation of LLM Ensemble Judgment Performance with Prior Transfer
Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty
The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition
UMA: A Family of Universal Models for Atoms
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
RAPID Hand: Robust, Affordable, Perception-Integrated, Dexterous Manipulation Platform for Embodied Intelligence
Multiscale guidance of protein structure prediction with heterogeneous cryo-EM data
Activation-Informed Merging of Large Language Models
Certifying Deep Network Risks and Individual Predictions with PAC-Bayes Loss via Localized Priors
Convex Approximation of Two-Layer ReLU Networks for Hidden State Differential Privacy
Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMs
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Generalized Contrastive Learning for Universal Multimodal Retrieval
MaintainCoder: Maintainable Code Generation Under Dynamic Requirements
Dependency Parsing is More Parameter-Efficient with Normalization
From Pixels to Views: Learning Angular-Aware and Physics-Consistent Representations for Light Field Microscopy
Design-Based Bandits Under Network Interference: Trade-Off Between Regret and Statistical Inference
Avoiding exp(R) scaling in RLHF through Preference-based Exploration
Generating Creative Chess Puzzles
How Patterns Dictate Learnability in Sequential Data
Preference-Driven Multi-Objective Combinatorial Optimization with Conditional Computation
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
In-context Learning of Linear Dynamical Systems with Transformers: Approximation Bounds and Depth-separation
A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation
TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup
Tackling Biased Evaluators in Dueling Bandits
ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Is Grokking a Computational Glass Relaxation?
PUATE: Efficient ATE Estimation from Treated (Positive) and Unlabeled Units
Testing Stationarity and Change Point Detection in Reinforcement Learning
PASS: Path-selective State Space Model for Event-based Recognition
Learning to price with resource constraints: from full information to machine-learned prices
DKDR: Dynamic Knowledge Distillation for Reliability in Federated Learning
CPO: Condition Preference Optimization for Controllable Image Generation
MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents
Private Statistical Estimation via Truncation
Enhancing Tactile-based Reinforcement Learning for Robotic Control
No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need!
SensorLM: Learning the Language of Wearable Sensors
Transformer brain encoders explain human high-level visual responses
Coupled Data and Measurement Space Dynamics for Enhanced Diffusion Posterior Sampling
Scaling RL to Long Videos
Monitoring Risks in Test-Time Adaptation
Order-Level Attention Similarity Across Language Models: A Latent Commonality
Channel Matters: Estimating Channel Influence for Multivariate Time Series
Split Gibbs Discrete Diffusion Posterior Sampling
PolyPose: Deformable 2D/3D Registration via Polyrigid Transformations
An Optimized Franz-Parisi Criterion and its Equivalence with SQ Lower Bounds
Long-Tailed Recognition via Information-Preservable Two-Stage Learning
Scaling Up Active Testing to Large Language Models
Improving Regret Approximation for Unsupervised Dynamic Environment Generation
When Are Concepts Erased From Diffusion Models?
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
Emergence of Linear Truth Encodings in Language Models
High-Dimensional Calibration from Swap Regret
Neural Combinatorial Optimization for Time Dependent Traveling Salesman Problem
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Gradient-Weight Alignment as a Train-Time Proxy for Generalization in Classification Tasks
OPMapper: Enhancing Open-Vocabulary Semantic Segmentation with Multi-Guidance Information
Real-Time Execution of Action Chunking Flow Policies
Contrastive Representations for Temporal Reasoning
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
Breaking the Gradient Barrier: Unveiling Large Language Models for Strategic Classification
ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking
On Universality Classes of Equivariant Networks
Risk-aware Direct Preference Optimization under Nested Risk Measure
HoliGS: Holistic Gaussian Splatting for Embodied View Synthesis
Spike4DGS: Towards High-Speed Dynamic Scene Rendering with 4D Gaussian Splatting via a Spike Camera Array
Personalized Bayesian Federated Learning with Wasserstein Barycenter Aggregation
Native Segmentation Vision Transformers
On the Universal Near Optimality of Hedge in Combinatorial Settings
Shortcut Features as Top Eigenfunctions of NTK: A Linear Neural Network Case and More
Exploiting Task Relationships in Continual Learning via Transferability-Aware Task Embeddings
Reward Reasoning Models
Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling
On topological descriptors for graph products
Generative Trajectory Stitching through Diffusion Composition
Hierarchical Optimization via LLM-Guided Objective Evolution for Mobility-on-Demand Systems
Non-Adaptive Adversarial Face Generation
Improved Representation Steering for Language Models
When Does Curriculum Learning Help? A Theoretical Perspective
Reframing Gaussian Splatting Densification with Complexity-Density Consistency of Primitives
Mixture-of-Experts Meets In-Context Reinforcement Learning
Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs
Adversary Aware Optimization for Robust Defense
Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits
Towards Generalizable Multi-Policy Optimization with Self-Evolution for Job Scheduling
Blackbox Model Provenance via Palimpsestic Membership Inference
PurpCode: Reasoning for Safer Code Generation
GaRA-SAM: Robustifying Segment Anything Model with Gated-Rank Adaptation
R$^2$ec: Towards Large Recommender Models with Reasoning
Scalable Neural Network Geometric Robustness Validation via Hölder Optimisation
Joint Hierarchical Representation Learning of Samples and Features via Informed Tree-Wasserstein Distance
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning
Scaling Diffusion Transformers Efficiently via $\mu$P
Learned Prefix Caching for Efficient LLM Inference
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Second-Order Convergence in Private Stochastic Non-Convex Optimization
Antidistillation Sampling
Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Reinforcement Learning
Language Modeling by Language Models
TTRL: Test-Time Reinforcement Learning
Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
Greedy Algorithms for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure
Domain-Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations
Small Resamples, Sharp Guarantees: Convergence Rates for Resampled Studentized Quantile Estimators
Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions
CREA: A Collaborative Multi-Agent Framework for Creative Image Editing and Generation
Distilling LLM Prior to Flow Model for Generalizable Agent’s Imagination in Object Goal Navigation
HyPINO: Multi-Physics Neural Operators via HyperPINNs and the Method of Manufactured Solutions
Language Models can Self-Improve at State-Value Estimation for Better Search
Cancer Survival Analysis via Zero-shot Tumor Microenvironment Segmentation on Low-resolution Whole Slide Pathology Images
Diffusion Beats Autoregressive in Data-Constrained Settings
MLEP: Multi-granularity Local Entropy Patterns for Generalized AI-generated Image Detection
Strategic Costs of Perceived Bias in Fair Selection
Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling
Reconstruction and Secrecy under Approximate Distance Queries
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
A Circular Argument: Does RoPE need to be Equivariant for Vision?
NestedFP: High-Performance, Memory-Efficient Dual-Precision Floating Point Support for LLMs
TOMCAT: Test-time Comprehensive Knowledge Accumulation for Compositional Zero-Shot Learning
DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy
Image Token Matters: Mitigating Hallucination in Discrete Tokenizer-based Large Vision-Language Models via Latent Editing
MAP Estimation with Denoisers: Convergence Rates and Guarantees
Composition and Alignment of Diffusion Models using Constrained Learning
Puppeteer: Rig and Animate Your 3D Models
Thoughts Are All Over the Place: On the Underthinking of Long Reasoning Models
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions
Hankel Singular Value Regularization for Highly Compressible State Space Models
Learning (Approximately) Equivariant Networks via Constrained Optimization
Mitigating Semantic Collapse in Partially Relevant Video Retrieval
RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts
Unleashing Foundation Vision Models: Adaptive Transfer for Diverse Data-Limited Scientific Domains
Large Language Models Miss the Multi-agent Mark
Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model
Incentive-Aware Dynamic Resource Allocation under Long-Term Cost Constraints
Learning-Augmented Algorithms for $k$-median via Online Learning
E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models
🎧MOSPA: Human Motion Generation Driven by Spatial Audio
DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
Increasing the Utility of Synthetic Images through Chamfer Guidance
Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
CoFFT: Chain of Foresight-Focus Thought for Visual Language Models
Bridging the Gap Between Cross-Domain Theory and Practical Application: A Case Study on Molecular Dissolution
Domain Adaptive Hashing Retrieval via VLM Assisted Pseudo-Labeling and Dual Space Adaptation
VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
Evaluating LLMs in Open-Source Games
VERA: Variational Inference Framework for Jailbreaking Large Language Models
MIX: A Multi-view Time-Frequency Interactive Explanation Framework for Time Series Classification
CDFlow: Building Invertible Layers with Circulant and Diagonal Matrices
Hamiltonian Neural PDE Solvers through Functional Approximation
Pay Attention to Small Weights
MiNT: Multi-Network Transfer Benchmark for Temporal Graph Learning
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Quartet: Native FP4 Training Can Be Optimal for Large Language Models
Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models
On Evaluating LLM Alignment by Evaluating LLMs as Judges
Can DPO Learn Diverse Human Values? A Theoretical Scaling Law
Among Us: A Sandbox for Measuring and Detecting Agentic Deception
Agnostic Active Learning Is Always Better Than Passive Learning
SAFE: Multitask Failure Detection for Vision-Language-Action Models
TransferTraj: A Vehicle Trajectory Learning Model for Region and Task Transferability
Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training
FFN Fusion: Rethinking Sequential Computation in Large Language Models
AF-UMC: An Alignment-Free Fusion Framework for Unaligned Multi-View Clustering
GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
Semi-infinite Nonconvex Constrained Min-Max Optimization
Structure Matters: Dynamic Policy Gradient
Two Heads are Better than One: Simulating Large Transformers with Small Ones
3DPE-Gaze: Unlocking the Potential of 3D Facial Priors for Generalized Gaze Estimation
FlowRefiner: A Robust Traffic Classification Framework against Label Noise
Collapsing Taylor Mode Automatic Differentiation
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Neural MJD: Neural Non-Stationary Merton Jump Diffusion for Time Series Prediction
Discrete Neural Flow Samplers with Locally Equivariant Transformer
Training Robust Graph Neural Networks by Modeling Noise Dependencies
Track3R: Joint Point Map and Trajectory Prior for Spatiotemporal 3D Understanding
Ambient Proteins - Training Diffusion Models on Noisy Structures
POCO: Scalable Neural Forecasting through Population Conditioning
BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset
Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination
The Structure of Relation Decoding Linear Operators in Large Language Models
Overcoming Long Context Limitations of State Space Models via Context Dependent Sparse Attention
Once Upon an Input: Reasoning via Per-Instance Program Synthesis
Non-rectangular Robust MDPs with Normed Uncertainty Sets
Linear Transformers Implicitly Discover Unified Numerical Algorithms
Automaton Constrained Q-Learning
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
On scalable and efficient training of diffusion samplers
T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks
IPFormer: Visual 3D Panoptic Scene Completion with Context-Adaptive Instance Proposals
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Bandit Guided Submodular Curriculum for Adaptive Subset Selection
VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Metropolis Adjusted Microcanonical Hamiltonian Monte Carlo
Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the Role of Model Complexity
OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects
Strategic Classification with Non-Linear Classifiers
A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings
MIDAS: Misalignment-based Data Augmentation Strategy for Imbalanced Multimodal Learning
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
Caption This, Reason That: VLMs Caught in the Middle
Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain
On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization
MIHC: Multi-View Interpretable Hypergraph Neural Networks with Information Bottleneck for Chip Congestion Prediction
Multi-Expert Distributionally Robust Optimization for Out-of-Distribution Generalization
Meta-D2AG: Causal Graph Learning with Interventional Dynamic Data
Mitigating the Privacy–Utility Trade-off in Decentralized Federated Learning via f-Differential Privacy
Program Synthesis via Test-Time Transduction
macOSWorld: A Multilingual Interactive Benchmark for GUI Agents
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning
Robust Distortion-Free Watermark for Autoregressive Audio Generation Models
Understanding and Improving Fast Adversarial Training against $l_0$ Bounded Perturbations
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Opinion Maximization in Social Networks by Modifying Internal Opinions
Spike-timing-dependent Hebbian learning as noisy gradient descent
Who Reasons in the Large Language Models?
Enhancing Interpretability in Deep Reinforcement Learning through Semantic Clustering
Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention
Robust Ego-Exo Correspondence with Long-Term Memory
GeoDynamics: A Geometric State‑Space Neural Network for Understanding Brain Dynamics on Riemannian Manifolds
Two-Steps Diffusion Policy for Robotic Manipulation via Genetic Denoising
HetSyn: Versatile Timescale Integration in Spiking Neural Networks via Heterogeneous Synapses
Towards General Continuous Memory for Vision-Language Models
Vision-centric Token Compression in Large Language Model
Metric Automata Theory: A Unifying Theory of RNNs
PINNs with Learnable Quadrature
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models
Coreset for Robust Geometric Median: Eliminating Size Dependency on Outliers
Selftok-Zero: Reinforcement Learning for Visual Generation via Discrete and Autoregressive Visual Tokens
Learning-Augmented Facility Location Mechanisms for the Envy Ratio Objective
Dataset Distillation of 3D Point Clouds via Distribution Matching
Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference
Aligning Text-to-Image Diffusion Models to Human Preference by Classification
Small Singular Values Matter: A Random Matrix Analysis of Transformer Models
Causal Climate Emulation with Bayesian Filtering
CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D
PROFIT: A Specialized Optimizer for Deep Fine Tuning
KGGen: Extracting Knowledge Graphs from Plain Text with Language Models
REINFORCE Converges to Optimal Policies with Any Learning Rate
Joint Relational Database Generation via Graph-Conditional Diffusion Models
Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge
Energy Landscape-Aware Vision Transformers: Layerwise Dynamics and Adaptive Task-Specific Training via Hopfield States
ShortListing Model: A Streamlined Simplex Diffusion for Discrete Variable Generation
Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling
Speculate Deep and Accurate: Lossless and Training-Free Acceleration for Offloaded LLMs via Substitute Speculative Decoding
Generating Computational Cognitive models using Large Language Models
DroneAudioset: An Audio Dataset for Drone-based Search and Rescue
The Emergence of Abstract Thought in Large Language Models Beyond Any Language
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Uncertainty Quantification for Deep Regression using Contextualised Normalizing Flows
GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity
PointMapPolicy: Structured Point Cloud Processing for Multi-Modal Imitation Learning
SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning
CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
Optimal Spectral Transitions in High-Dimensional Multi-Index Models
Towards Understanding Transformers in Learning Random Walks
Causal Discovery over Clusters of Variables in Markovian Systems
Remarkable Robustness of LLMs: Stages of Inference?
ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism
Make Information Diffusion Explainable: LLM-based Causal Framework for Diffusion Prediction
Causal Differentiating Concepts: Interpreting LM Behavior via Causal Representation Learning
LBMKGC: Large Model-Driven Balanced Multimodal Knowledge Graph Completion
GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization
Retrosynthesis Planning via Worst-path Policy Optimisation in Tree-structured MDPs
BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
Failure Prediction at Runtime for Generative Robot Policies
Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data
From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization
Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models
Large Language Bayes
Q3R: Quadratic Reweighted Rank Regularizer for Effective Low-Rank Training
Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
TAPIP3D: Tracking Any Point in Persistent 3D Geometry
Cost-Aware Contrastive Routing for LLMs
QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training
Learning to Rank for In-Context Example Retrieval
LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision
Video-R1: Reinforcing Video Reasoning in MLLMs
Predicting Empirical AI Research Outcomes with Language Models
Fast Training of Large Kernel Models with Delayed Projections
NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
Density Ratio-Free Doubly Robust Proxy Causal Learning
Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
Differentiable extensions with rounding guarantees for combinatorial optimization over permutations
Spectral Graph Coarsening Using Inner Product Preservation and the Grassmann Manifold
Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent
Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization
Are Greedy Task Orderings Better Than Random in Continual Linear Regression?
Revisiting Agnostic Boosting
From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications
Improved Robust Estimation for Erdős-Rényi Graphs: The Sparse Regime and Optimal Breakdown Point
Data-Adaptive Exposure Thresholds under Network Interference
Distributed Multi-Agent Bandits Over Erdős-Rényi Random Networks
Sherlock: Self-Correcting Reasoning in Vision-Language Models
RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving
Online Two-Stage Submodular Maximization
Class-wise Balancing Data Replay for Federated Class-Incremental Learning
Balanced Active Inference
Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling
A Semantic Parsing Framework for End-to-End Time Normalization
Wasserstein Transfer Learning
LinPrim: Linear Primitives for Differentiable Volumetric Rendering
Fast Zeroth-Order Convex Optimization with Quantum Gradient Methods
Concentration and excess risk bounds for imbalanced classification with synthetic oversampling
TimeWak: Temporal Chained-Hashing Watermark for Time Series Data
Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs
Marginal-Nonuniform PAC Learnability
Inference-time Alignment in Continuous Space
On the Global Optimality of Policy Gradient Methods in General Utility Reinforcement Learning
Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain
One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution
Transfer Learning for Benign Overfitting in High-Dimensional Linear Regression
Thinkless: LLM Learns When to Think
Bernstein–von Mises for Adaptively Collected Data
Reinventing Multi-Agent Collaboration through Gaussian-Image Synergy in Diffusion Policies
Unified Transferability Metrics for Time Series Foundation Models
Balancing Multimodal Training Through Game-Theoretic Regularization
I2-NeRF: Learning Neural Radiance Fields Under Physically-Grounded Media Interactions
Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments
Non-equilibrium Annealed Adjoint Sampler
DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers
SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting
A data and task-constrained mechanistic model of the mouse outer retina shows robustness to contrast variations
Learning-Augmented Online Bipartite Fractional Matching
Activation-Guided Consensus Merging for Large Language Models
PaZO: Preconditioned Accelerated Zeroth-Order Optimization for Fine-Tuning LLMs
Private Geometric Median in Nearly-Linear Time
Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Private Training Large-scale Models with Efficient DP-SGD
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions
Sharp Gaussian approximations for Decentralized Federated Learning
WMCopier: Forging Invisible Watermarks on Arbitrary Images
Causal Explanation-Guided Learning for Organ Allocation
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks
Vision Transformers Don't Need Trained Registers
Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms
Generalization Bounds for Model-based Algorithm Configuration
BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
TRACE: Contrastive learning for multi-trial time series data in neuroscience
Learning from positive and unlabeled examples - Finite size sample bounds
Rectified Point Flow: Generic Point Cloud Pose Estimation
Perturbation Bounds for Low-Rank Inverse Approximations under Noise
Towards Understanding the Mechanisms of Classifier-Free Guidance
VLMs have Tunnel Vision: Evaluating Nonlocal Visual Reasoning in Leading VLMs
Toward Efficient Inference Attacks: Shadow Model Sharing via Mixture-of-Experts
Incremental Sequence Classification with Temporal Consistency
VaporTok: RL-Driven Adaptive Video Tokenizer with Prior & Task Awareness
Quantifying Cross-Modality Memorization in Vision-Language Models
Graph Diffusion that can Insert and Delete
Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go
FedQS: Optimizing Gradient and Model Aggregation for Semi-Asynchronous Federated Learning
Revisiting Glorot Initialization for Long-Range Linear Recurrences
Tight Bounds for Answering Adaptively Chosen Concentrated Queries
What We Miss Matters: Learning from the Overlooked in Point Cloud Transformers
RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling
Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration
Efficient Randomized Experiments Using Foundation Models
HOI-Dyn: Learning Interaction Dynamics for Human-Object Motion Diffusion
Matchings Under Biased and Correlated Evaluations
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
Dynamic Diameter in High-Dimensions against Adaptive Adversary and Beyond
ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition
Exponential Convergence Guarantees for Iterative Markovian Fitting
Deep Value Benchmark: Measuring Whether Models Generalize Deep values or Shallow Preferences
CoCoA: A Minimum Bayes Risk Framework Bridging Confidence and Consistency for Uncertainty Quantification in LLMs
Surface-Aware Feed-Forward Quadratic Gaussian for Frame Interpolation with Large Motion
SHGR: A Generalized Maximal Correlation Coefficient
WorldMem: Long-term Consistent World Simulation with Memory
Unlocking Dataset Distillation with Diffusion Models
DISCO: Disentangled Communication Steering for Large Language Models
Are Large Reasoning Models Good Translation Evaluators? Analysis and Performance Boost
Restricted Spectral Gap Decomposition for Simulated Tempering Targeting Mixture Distributions
Restoring Pruned Large Language Models via Lost Component Compensation
How Many Tokens Do 3D Point Cloud Transformer Architectures Really Need?
BlockDecoder: Boosting ASR Decoders with Context and Merger Modules
Diffusion Adaptive Text Embedding for Text-to-Image Diffusion Models
Axial Neural Networks for Dimension-Free Foundation Models
Recursive Inference Scaling: A Winning Path to Scalable Inference in Language and Multimodal Systems
Geometry-Aware Edge Pooling for Graph Neural Networks
From Euler to AI: Unifying Formulas for Mathematical Constants
IntrinsiX: High-Quality PBR Generation using Image Priors
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Robust Minimax Boosting with Performance Guarantees
Combinatorial Ski Rental Problem: Robust and Learning-Augmented Algorithms
Diversity-oriented Deep Multi-modal Clustering
Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
Towards Multiscale Graph-based Protein Learning with Geometric Secondary Structural Motifs
TV-Rec: Time-Variant Convolutional Filter for Sequential Recommendation
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets
OmniFC: Rethinking Federated Clustering via Lossless and Secure Distance Reconstruction
Quantifying Task-relevant Similarities in Representations Using Decision Variable Correlations
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
Finite-Time Analysis of Stochastic Nonconvex Nonsmooth Optimization on the Riemannian Manifolds
Alignment of Large Language Models with Constrained Learning
RETRO SYNFLOW: Discrete Flow-Matching for Accurate and Diverse Single-Step Retrosynthesis
Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints
Mind the Gap: Removing the Discretization Gap in Differentiable Logic Gate Networks
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections
ESCA: Contextualizing Embodied Agents via Scene-Graph Generation
Can Large Language Models Master Complex Card Games?
CAT: Content-Adaptive Image Tokenization
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
MERIT: Multilingual Semantic Retrieval with Interleaved Multi-Condition Query
Crucible: Quantifying the Potential of Control Algorithms through LLM Agents
Generalized and Invariant Single-Neuron In-Vivo Activity Representation Learning
Pancakes: Consistent Multi-Protocol Image Segmentation Across Biomedical Domains
Videos are Sample-Efficient Supervisions: Behavior Cloning from Videos via Latent Representations
The third pillar of causal analysis? A measurement perspective on causal representations
A CLT for Polynomial GNNs on Community-Based Graphs
FedRW: Efficient Privacy-Preserving Data Reweighting for Enhancing Federated Learning of Language Models
GoRA: Gradient-driven Adaptive Low Rank Adaptation
TreeGen: A Bayesian Generative Model for Hierarchies
Class conditional conformal prediction for multiple inputs by p-value aggregation
Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance
SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents
Amplifying Prominent Representations in Multimodal Learning via Variational Dirichlet Process
Stability and Sharper Risk Bounds with Convergence Rate $\tilde{O}(1/n^2)$
Online Experimental Design With Estimation-Regret Trade-off Under Network Interference
SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
Near-Optimal Regret-Queue Length Tradeoff in Online Learning for Two-Sided Markets
A Unified Framework for the Transportability of Population-Level Causal Measures
Equivariant Eikonal Neural Networks: Grid-Free, Scalable Travel-Time Prediction on Homogeneous Spaces
Tree Ensemble Explainability through the Hoeffding Functional Decomposition and TreeHFD Algorithm
Weaver: Shrinking the Generation-Verification Gap by Scaling Compute for Verification
LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs
Less is More: Local Intrinsic Dimensions of Contextual Language Models
Rethinking Fair Federated Learning from Parameter and Client View
Beyond Random: Automatic Inner-loop Optimization in Dataset Distillation
Limited Preference Data? Learning Better Reward Model with Latent Space Synthesis
Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator
Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions
Sparse Polyak: an adaptive step size rule for high-dimensional M-estimation
LoRO: Real-Time on-Device Secure Inference for LLMs via TEE-Based Low Rank Obfuscation
Low Rank Gradients and Where to Find Them
Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems
Uniform Wrappers: Bridging Concave to Quadratizable Functions in Online Optimization
Convergence of the Gradient Flow for Shallow ReLU Networks on Weakly Interacting Data
You Only Spectralize Once: Taking a Spectral Detour to Accelerate Graph Neural Network
Human-assisted Robotic Policy Refinement via Action Preference Optimization
Fading to Grow: Growing Preference Ratios via Preference Fading Discrete Diffusion for Recommendation
Where Graph Meets Heterogeneity: Multi-View Collaborative Graph Experts
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models
Kuramoto Orientation Diffusion Models
Free-Lunch Color-Texture Disentanglement for Stylized Image Generation
Sequence Modeling with Spectral Mean Flows
SilentStriker: Toward Stealthy Bit-Flip Attacks on Large Language Models
C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning
Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video
ForceFM: Enhancing Protein-Ligand Predictions through Force-Guided Flow Matching
NaDRO: Leveraging Dual-Reward Strategies for LLMs Training on Noisy Data
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
Scaling Language-centric Omnimodal Representation Learning
DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches
Place Cells as Multi-Scale Position Embeddings: Random Walk Transition Kernels for Path Planning
Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
Synergistic Tensor and Pipeline Parallelism
RayFusion: Ray Fusion Enhanced Collaborative Visual Perception
Accelerated Distance-adaptive Methods for Hölder Smooth and Convex Optimization
ToolRL: Reward is All Tool Learning Needs
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models
Plasticity as the Mirror of Empowerment
Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum
Negative Feedback Really Matters: Signed Dual-Channel Graph Contrastive Learning Framework for Recommendation
Causal-R: A Causal-Reasoning Geometry Problem Solver for Optimized Solution Exploration
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis
DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
Don’t Trade Off Safety: Diffusion Regularization for Constrained Offline RL
Contrastive Self-Supervised Learning As Neural Manifold Packing
SPOT-Trip: Dual-Preference Driven Out-of-Town Trip Recommendation
Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models
Refusal Direction is Universal Across Safety-Aligned Languages
ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive
Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals
SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought
Joint‑Embedding vs Reconstruction: Provable Benefits of Latent Space Prediction for Self‑Supervised Learning
On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
Vision‑Language‑Vision Auto‑Encoder: Scalable Knowledge Distillation from Diffusion Models
Measure-Theoretic Anti-Causal Representation Learning
When Can Model-Free Reinforcement Learning be Enough for Thinking?
Adaptive LoRA Experts Allocation and Selection for Federated Fine-Tuning
ARIA: Training Language Agents with Intention-driven Reward Aggregation
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
Split conformal classification with unsupervised calibration
3D Gaussian Splatting based Scene-independent Relocalization with Unidirectional and Bidirectional Feature Fusion
TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval
UniTok: a Unified Tokenizer for Visual Generation and Understanding
Probably Approximately Precision and Recall Learning
FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data
Stochastic Optimization in Semi-Discrete Optimal Transport: Convergence Analysis and Minimax Rate
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Normalizing Flows are Capable Models for Continuous Control
RrED: Black-box Unsupervised Domain Adaptation via Rectifying-reasoning Errors of Diffusion
OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts
Continuous Simplicial Neural Networks
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
VADTree: Explainable Training-Free Video Anomaly Detection via Hierarchical Granularity-Aware Tree
Think before Recommendation: Autonomous Reasoning-enhanced Recommender
STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds
Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Computational Efficiency under Covariate Shift in Kernel Ridge Regression
Repo2Run: Automated Building Executable Environment for Code Repository at Scale
Hybrid-Collaborative Augmentation and Contrastive Sample Adaptive-Differential Awareness for Robust Attributed Graph Clustering
Universal Causal Inference in a Topos
Training-Free Safe Text Embedding Guidance for Text-to-Image Diffusion Models
Automatic Auxiliary Task Selection and Adaptive Weighting Boost Molecular Property Prediction
Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization
World-aware Planning Narratives Enhance Large Vision-Language Model Planner
ZEUS: Zero-shot Embeddings for Unsupervised Separation of Tabular Data
Towards General Modality Translation with Contrastive and Predictive Latent Diffusion Bridge
BrainFlow: A Holistic Pathway of Dynamic Neural System on Manifold
STAR: Spatial-Temporal Tracklet Matching for Multi-Object Tracking
Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity
Structured Initialization for Vision Transformers
Availability-aware Sensor Fusion via Unified Canonical Space
Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing
Revitalizing SVD for Global Covariance Pooling: Halley’s Method to Overcome Over-Flattening
Distributed mediation analysis with communication efficiency
ASGO: Adaptive Structured Gradient Optimization
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
Robustly Learning Monotone Single-Index Models
SD-KDE: Score-Debiased Kernel Density Estimation
Meta CLIP 2: A Worldwide Scaling Recipe
Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras
Adaptive Cannistraci-Hebb Network Automata Modelling of Complex Networks for Path-based Link Prediction
Tensor-Parallelism with Partially Synchronized Activations
Tracking and Understanding Object Transformations
Cloud4D: Estimating Cloud Properties at a High Spatial and Temporal Resolution
MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation
StateSpaceDiffuser: Bringing Long Context to Diffusion World Models
REOrdering Patches Improves Vision Models
Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries
Neural Emulator Superiority: When Machine Learning for PDEs Surpasses its Training Data
Boosting Resilience of Large Language Models through Causality-Driven Robust Optimization
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
YOLOv12: Attention-Centric Real-Time Object Detectors
CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction
Learning Generalizable Shape Completion with SIM(3) Equivariance
Tapered Off-Policy REINFORCE - Stable and efficient reinforcement learning for large language models
Coupling Generative Modeling and an Autoencoder with the Causal Bridge
MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization
Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
The World Is Bigger: A Computationally-Embedded Perspective on the Big World Hypothesis
Flash Invariant Point Attention
Does Stochastic Gradient really succeed for bandits?
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers
TokenSwap: A Lightweight Method to Disrupt Memorized Sequences in LLMs
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
Robust Hallucination Detection in LLMs via Adaptive Token Selection
Geometric Logit Decoupling for Energy-Based Graph Out-of-distribution Detection
TractoTransformer: Diffusion MRI Streamline Tractography using CNN and Transformer Networks
Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
Bidirectional Representations Augmented Autoregressive Biological Sequence Generation: Application in De Novo Peptide Sequencing
Real-DRL: Teach and Learn in Reality
Statistical Analysis of the Sinkhorn Iterations for Two-Sample Schrödinger Bridge Estimation
Sketch-Augmented Features Improve Learning Long-Range Dependencies in Graph Neural Networks
Non-Clairvoyant Scheduling with Progress Bars
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
On the Robustness of Transformers against Context Hijacking for Linear Classification
Towards Fully FP8 GEMM LLM Training at Scale
Private Online Learning against an Adaptive Adversary: Realizable and Agnostic Settings
FIPER: Factorized Features for Robust Image Super-Resolution and Compression
OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles
Probing Equivariance and Symmetry Breaking in Convolutional Networks
LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space
A Closer Look to Positive-Unlabeled Learning from Fine-grained Perspectives: An Empirical Study
Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs
Cross-modal Associations in Vision and Language Models: Revisiting the Bouba-Kiki Effect
AutoEdit: Automatic Hyperparameter Tuning for Image Editing
Training-free Online Video Step Grounding
The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense
WKV-sharing embraced random shuffle RWKV high-order modeling for pan-sharpening
Towards Accurate Time Series Forecasting via Implicit Decoding
DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
Curvature Tuning: Provable Training-free Model Steering From a Single Parameter
Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization
DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning
Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models
A Sustainable AI Economy Needs Data Deals That Work for Generators
Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering
DePass: Unified Feature Attributing by Simple Decomposed Forward Pass
Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning
GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers
CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning
Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration
S$^2$NN: Sub-bit Spiking Neural Networks
Purifying Approximate Differential Privacy with Randomized Post-processing
Differentiable Generalized Sliced Wasserstein Plans
You Only Communicate Once: One-shot Federated Low-Rank Adaptation of MLLM
PhysX-3D: Physical-Grounded 3D Asset Generation
Nonlinear Laplacians: Tunable principal component analysis under directional prior information
On the Sample Complexity Bounds of Bilevel Reinforcement Learning
Statistical Analysis of an Adversarial Bayesian Weak Supervision Method
Vicinal Label Supervision for Reliable Aleatoric and Epistemic Uncertainty Estimation
Redundancy-Aware Test-Time Graph Out-of-Distribution Detection
Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models
On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling
Functional Virtual Adversarial Training for Semi-Supervised Time Series Classification
Generalizing while preserving monotonicity in comparison-based preference learning models
Taming generative video models for zero-shot optical flow extraction
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
AnimateQR: Bridging Aesthetics and Functionality in Dynamic QR Code Generation
PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-forward Planar Splatting
Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity
Toward Human Deictic Gesture Target Estimation
Less Greedy Equivalence Search
TokMan: Tokenize Manhattan Mask Optimization for Inverse Lithography
Optimal Nuisance Function Tuning for Estimating a Doubly Robust Functional under Proportional Asymptotics
Optimistic Online-to-Batch Conversions for Accelerated Convergence and Universality
Gaussian-Augmented Physics Simulation and System Identification with Complex Colliders
D2SA: Dual-Stage Distribution and Slice Adaptation for Efficient Test-Time Adaptation in MRI Reconstruction
ImageNet-trained CNNs are not biased towards texture: Revisiting feature reliance through controlled suppression
Effective Neural Approximations for Geometric Optimization Problems
High Resolution UDF Meshing via Iterative Networks
Language Models Can Predict Their Own Behavior
Perturb a Model, Not an Image: Towards Robust Privacy Protection via Anti-Personalized Diffusion Models
Dynamical Properties of Tokens in Self-Attention and Effects of Positional Encoding
Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound
Fairness under Competition
Constrained Posterior Sampling: Time Series Generation with Hard Constraints
Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere
VCM: Vision Concept Modeling with Adaptive Vision Token Compression via Instruction Fine-Tuning
Towards Unsupervised Open-Set Graph Domain Adaptation via Dual Reprogramming
TRIM: Scalable 3D Gaussian Diffusion Inference with Temporal and Spatial Trimming
To Distill or Decide? Understanding the Algorithmic Trade-off in Partially Observable RL
KaRF: Weakly-Supervised Kolmogorov-Arnold Networks-based Radiance Fields for Local Color Editing
High-order Equivariant Flow Matching for Density Functional Theory Hamiltonian Prediction
FLOWING: Implicit Neural Flows for Structure-Preserving Morphing
Continuous Subspace Optimization for Continual Learning
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions
Hadamax Encoding: Elevating Performance in Model-Free Atari
Modeling the Economic Impacts of AI Openness Regulation
ReservoirTTA: Prolonged Test-time Adaptation for Evolving and Recurring Domains
Unraveling Metameric Dilemma for Spectral Reconstruction: A High-Fidelity Approach via Semi-Supervised Learning
Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph
Causal LLM Routing: End-to-End Regret Minimization from Observational Data
ZeroPatcher: Training-free Sampler for Video Inpainting and Editing
Towards Provable Emergence of In-Context Reinforcement Learning
Product Distribution Learning with Imperfect Advice
TimeXL: Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop
Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting
When Lower-Order Terms Dominate: Adaptive Expert Algorithms for Heavy-Tailed Losses
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
Learning Shared Representations from Unpaired Data
Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation
AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models
Non-Line-of-Sight 3D Reconstruction with Radar
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
U-REPA: Aligning Diffusion U-Nets to ViTs
Grasp2Grasp: Vision-Based Dexterous Grasp Translation via Schrödinger Bridges
DecompNet: Enhancing Time Series Forecasting Models with Implicit Decomposition
Incentivizing Desirable Effort Profiles in Strategic Classification: The Role of Causality and Uncertainty
Robust and Diverse Multi-Agent Learning via Rational Policy Gradient
UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation
Inference-Time Personalized Alignment with a Few User Preference Queries
SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents
GauSAM: Contour‑Guided 2D Gaussian Fields for Multi‑Scale Medical Image Segmentation with Segment Anything
Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
Towards Realistic Earth-Observation Constellation Scheduling: Benchmark and Methodology
Sign-In to the Lottery: Reparameterizing Sparse Training
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Exploiting Dynamic Sparsity in Einsum
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
VLM in a flash: I/O-Efficient Sparsification of Vision-Language Model via Neuron Chunking
Blindfolded Experts Generalize Better: Insights from Robotic Manipulation and Videogames
Discovering Symbolic Partial Differential Equation by Abductive Learning
Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval
Why Popular MOEAs are Popular: Proven Advantages in Approximating the Pareto Front
Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
Contextual Online Pricing with (Biased) Offline Data
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
Less but More: Linear Adaptive Graph Learning Empowering Spatiotemporal Forecasting
CADMorph: Geometry‑Driven Parametric CAD Editing via a Plan–Generate–Verify Loop
CHiQPM: Calibrated Hierarchical Interpretable Image Classification
Scalable, Explainable and Provably Robust Anomaly Detection with One-Step Flow Matching
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
Efficient Spectral Control of Partially Observed Linear Dynamical Systems
UGoDIT: Unsupervised Group Deep Image Prior Via Transferable Weights
FedGPS: Statistical Rectification Against Data Heterogeneity in Federated Learning
Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting
Position: Bridge the Gaps between Machine Unlearning and AI Regulation
Greedy Sampling Is Provably Efficient For RLHF
SAGE: A Unified Framework for Generalizable Object State Recognition with State-Action Graph Embedding
Martian World Model: Controllable Video Synthesis with Physically Accurate 3D Reconstructions
Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law
Multimodal 3D Genome Pre-training
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play
SparseDiT: Token Sparsification for Efficient Diffusion Transformer
Generalized Top-k Mallows Model for Ranked Choices
Normal-Abnormal Guided Generalist Anomaly Detection
Exploring and Leveraging Class Vectors for Classifier Editing
DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs
Optimizing Distributional Geometry Alignment with Optimal Transport for Generative Dataset Distillation
Reinforced Context Order Recovery for Adaptive Reasoning and Planning
Non-Uniform Multiclass Learning with Bandit Feedback
Improved Scaling Laws in Linear Regression via Data Reuse
Harnessing the Universal Geometry of Embeddings
Adaptive Inference-Time Scaling via Cyclic Diffusion Search
DisMo: Disentangled Motion Representations for Open-World Motion Transfer
RLVR-World: Training World Models with Reinforcement Learning
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
Shaping Sequence Attractor Schema in Recurrent Neural Networks
Reverse-Annealed Sequential Monte Carlo for Efficient Bayesian Optimal Experiment Design
T-REGS: Minimum Spanning Tree Regularization for Self-Supervised Learning
AC-LoRA: (Almost) Training-Free Access Control Aware Multi-Modal LLMs
E-MoFlow: Learning Egomotion and Optical Flow from Event Data via Implicit Regularization
CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning
Safe + Safe = Unsafe? Exploring How Safe Images Can Be Exploited to Jailbreak Large Vision-Language Models
Differentiable Structure Learning and Causal Discovery for General Binary Data
KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment
The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
Integration Matters for Learning PDEs with Backwards SDEs
Generating Multi-Table Time Series EHR from Latent Space with Minimal Preprocessing
Adaptive Re-calibration Learning for Balanced Multimodal Intention Recognition
Unlearning-Aware Minimization
FEEDBACK FRICTION: LLMs Struggle to Fully Incorporate External Feedback
Model–Behavior Alignment under Flexible Evaluation: When the Best-Fitting Model Isn’t the Right One
ConTextTab: A Semantics-Aware Tabular In-Context Learner
Robust Equilibria in Continuous Games: From Strategic to Dynamic Robustness
AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise
Cameras as Relative Positional Encoding
Generalization vs Specialization under Concept Shift
Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems
R1-ShareVL: Incentivizing Reasoning Capabilities of Multimodal Large Language Models via Share-GRPO
Federated Dialogue-Semantic Diffusion for Emotion Recognition under Incomplete Modalities
EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization
DualEqui: A Dual-Space Hierarchical Equivariant Network for Large Biomolecules
Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities
Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward
Not All Data are Good Labels: On the Self-supervised Labeling for Time Series Forecasting
Learning Across the Gap: Hybrid Multi-armed Bandits with Heterogeneous Offline and Online Data
DiCoFlex: Model-Agnostic Diverse Counterfactuals with Flexible Control
Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective
On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity
WaveAR: Wavelet-Aware Continuous Autoregressive Diffusion for Accurate Human Motion Prediction
PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling
Agentic RL Scaling Law: Spontaneous Code Execution for Mathematical Problem Solving
Simple and Effective Specialized Representations for Fair Classifiers
Near-Optimal Sample Complexity for Online Constrained MDPs
Track, Inpaint, Resplat: Subject-driven 3D and 4D Generation with Progressive Texture Infilling
Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability
Enhancing Visual Prompting through Expanded Transformation Space and Overfitting Mitigation
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
Generating Informative Samples for Risk-Averse Fine-Tuning of Downstream Tasks
Symmetry-Preserving Conformer Ensemble Networks for Molecular Representation Learning
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers
Differentially Private Relational Learning with Entity-level Privacy Guarantees
Selective Learning for Deep Time Series Forecasting
CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting
CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs
Stochastically Dominant Peer Prediction
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Registration is a Powerful Rotation-Invariance Learner for 3D Anomaly Detection
Mechanistic Interpretability of RNNs emulating Hidden Markov Models
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization
Object-centric 3D Motion Field for Robot Learning from Human Videos
un$^2$CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP
ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs
Fundamental Limitations in Pointwise Defences of LLM Finetuning APIs
Balanced Conic Rectified Flow
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization
Imitation Beyond Expectation Using Pluralistic Stochastic Dominance
ShiQ: Bringing back Bellman to LLMs
AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees
Zebra-Llama: Towards Extremely Efficient Hybrid Models
Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment
Accelerating Parallel Diffusion Model Serving with Residual Compression
Steering When Necessary: Flexible Steering Large Language Models with Backtracking
Weak-shot Keypoint Estimation via Keyness and Correspondence Transfer
A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics
iFinder: Structured Zero-Shot Vision-Based LLM Grounding for Dash-Cam Video Reasoning
ModHiFi: Identifying High Fidelity predictive components for Model Modification
Optimal Rates in Continual Linear Regression via Increasing Regularization
Bilevel Optimization for Adversarial Learning Problems: Sharpness, Generation, and Beyond
REDOUBT: Duo Safety Validation for Autonomous Vehicle Motion Planning
PAC-Bayes Bounds for Multivariate Linear Regression and Linear Autoencoders
Robust Hyperbolic Learning with Curvature-Aware Optimization
HYPRL: Reinforcement Learning of Control Policies for Hyperproperties
Don't be lazy: CompleteP enables compute-efficient deep transformers
A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities
$\mu$PC: Scaling Predictive Coding to 100+ Layer Networks
LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities
DualMPNN: Harnessing Structural Alignments for High-Recovery Inverse Protein Folding
Towards Comprehensive Scene Understanding: Integrating First and Third-Person Views for LVLMs
Gated Integration of Low-Rank Adaptation for Continual Learning of Large Language Models
Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking
Towards Straggler-Resilient Split Federated Learning: An Unbalanced Update Approach
Conformal Prediction in The Loop: A Feedback-Based Uncertainty Model for Trajectory Optimization
ConceptScope: Characterizing Dataset Bias via Disentangled Visual Concepts
Graphs Help Graphs: Multi-Agent Graph Socialized Learning
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning
Towards a General Attention Framework on Gyrovector Spaces for Matrix Manifolds
ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
On the creation of narrow AI: hierarchy and nonlocality of neural network skills
VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
When Does Closeness in Distribution Imply Representational Similarity? An Identifiability Perspective
Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing
Mitigating Spurious Features in Contrastive Learning with Spectral Regularization
Per-Architecture Training-Free Metric Optimization for Neural Architecture Search
RoMa: A Robust Model Watermarking Scheme for Protecting IP in Diffusion Models
Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization
Self-Supervised Contrastive Learning is Approximately Supervised Contrastive Learning
DoDo-Code: an Efficient Levenshtein Distance Embedding-based Code for 4-ary IDS Channel
Low-Rank Head Avatar Personalization with Registers
Normalization in Attention Dynamics
KL-Regularized RLHF with Multiple Reference Models: Exact Solutions and Sample Complexity
Integrating Drug Substructures and Longitudinal Electronic Health Records for Personalized Drug Recommendation
AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement
The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models
G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems
From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring
Preference-Guided Diffusion for Multi-Objective Offline Optimization
CTSketch: Compositional Tensor Sketching for Scalable Neurosymbolic Learning
Parameter Efficient Fine-tuning via Explained Variance Adaptation
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding
Behavior Injection: Preparing Language Models for Reinforcement Learning
CSGO: Content-Style Composition in Text-to-Image Generation
Reconciling Geospatial Prediction and Retrieval via Sparse Representations
A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random
Understanding Contrastive Learning via Gaussian Mixture Models
Active Measurement: Efficient Estimation at Scale
Bipolar Self-attention for Spiking Transformers
PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
FlyLoRA: Boosting Task Decoupling and Parameter Efficiency via Implicit Rank-Wise Mixture-of-Experts
Improving Video Generation with Human Feedback
Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation
Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance
Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion
Dynamics of Spontaneous Topic Changes in Next Token Prediction with Self-Attention
Unveiling Environmental Sensitivity of Individual Gains in Influence Maximization
C-SafeGen: Certified Safe LLM Generation with Claim-Based Streaming Guardrails
Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
VarFlow: Proper Scoring-Rule Diffusion Distillation via Energy Matching
Sparta Alignment: Collectively Aligning Multiple Language Models through Combat
Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Quantifying Uncertainty in the Presence of Distribution Shifts
Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time
Language Models (Mostly) Know When to Stop Reading
On Local Limits of Sparse Random Graphs: Color Convergence and the Refined Configuration Model
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees
Representational Difference Explanations
G-Net: A Provably Easy Construction of High-Accuracy Random Binary Neural Networks
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
FLUX: Efficient Descriptor-Driven Clustered Federated Learning under Arbitrary Distribution Shifts
LuxDiT: Lighting Estimation with Video Diffusion Transformer
New Parallel and Streaming Algorithms for Directed Densest Subgraph
Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts
Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
Generative Pre-trained Autoregressive Diffusion Transformer
VIPAMIN: Visual Prompt Initialization via Embedding Selection and Subspace Expansion
GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection
Federated Multi-armed Bandits with Efficient Bit-Level Communications
One for All: Universal Topological Primitive Transfer for Graph Structure Learning
Instance-Dependent Regret Bounds for Nonstochastic Linear Partial Monitoring
Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention
Bisecle: Binding and Separation in Continual Learning for Video Language Understanding
Concept-Guided Interpretability via Neural Chunking
When Worse is Better: Navigating the Compression Generation Trade-off In Visual Tokenization
FSI-Edit: Frequency and Stochasticity Injection for Flexible Diffusion-Based Image Editing
VESSA: Video-based objEct-centric Self-Supervised Adaptation for Visual Foundation Models
$\texttt{G1}$: Teaching LLMs to Reason on Graphs with Reinforcement Learning
Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection
A Signed Graph Approach to Understanding and Mitigating Oversmoothing
Neural Rule Lists: Learning Discretizations, Rules, and Order in One Go
Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
PRSformer: Disease Prediction from Million-Scale Individual Genotypes
Flow-GRPO: Training Flow Matching Models via Online RL
Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation
Learning with Statistical Equality Constraints
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
Uncertainty-aware Preference Alignment for Diffusion Policies
Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural Networks
Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
Synthesizing Photorealistic and Dynamic Urban Environments for Multimodal Robot Navigation and Collaboration
Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition
Large Language Models as Model Organisms for Human Associative Learning
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding
Understanding and Enhancing Mask-Based Pretraining towards Universal Representations
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors
Bayesian Concept Bottleneck Models with LLM Priors
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
Prediction-Powered Semi-Supervised Learning with Online Power Tuning
Injecting Frame-Event Complementary Fusion into Diffusion for Optical Flow in Challenging Scenes
PoE-World: Compositional World Modeling with Products of Programmatic Experts
Counterfactual reasoning: an analysis of in-context emergence
Causality-Induced Positional Encoding for Transformer-Based Representation Learning of Non-Sequential Features
AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling
Online Learning of Neural Networks
Rectified CFG++ for Flow Based Models
Maximizing the Value of Predictions in Control: Accuracy Is Not Enough
FastJAM: a Fast Joint Alignment Model for Images
Compositional Monte Carlo Tree Diffusion for Extendable Planning
SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models
Tree-Based Premise Selection for Lean4
Explaining Similarity in Vision-Language Encoders with Weighted Banzhaf Interactions
SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL
FlowMoE: A Scalable Pipeline Scheduling Framework for Distributed Mixture-of-Experts Training
HiFC: High-efficiency Flash-based KV Cache Swapping for Scaling LLM Inference
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
A Single-Swap Local Search Algorithm for k-Means of Lines
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs
Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling
Learning to Condition: A Neural Heuristic for Scalable MPE Inference
ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning
FSNet: Feasibility-Seeking Neural Network for Constrained Optimization with Guarantees
Improved Training Technique for Shortcut Models
Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection
Streaming Attention Approximation via Discrepancy Theory
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
Exploring and Exploiting Model Uncertainty in Bayesian Optimization
Neural B-frame Video Compression with Bi-directional Reference Harmonization
Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
Dense SAE Latents Are Features, Not Bugs
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
Data Fusion for Partial Identification of Causal Effects
On the Robustness of Verbal Confidence of LLMs in Adversarial Attacks
AREAL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
Towards Physics-informed Spatial Intelligence with Human Priors: An Autonomous Driving Pilot Study
A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning
Conditioning Matters: Training Diffusion Policies is Faster Than You Think
Quantifying Statistical Significance of Deep Nearest Neighbor Anomaly Detection via Selective Inference
C$^2$Prompt: Class-aware Client Knowledge Interaction for Federated Continual Learning
Anchor-based Maximum Discrepancy for Relative Similarity Testing
Mask Image Watermarking
KTAE: A Model-Free Algorithm to Key-Tokens Advantage Estimation in Mathematical Reasoning
Contrastive Consolidation of Top-Down Modulations Achieves Sparsely Supervised Continual Learning
Kinaema: a recurrent sequence model for memory and pose in motion
Are Large Language Models Sensitive to the Motives Behind Communication?
CoIDO: Efficient Data Selection for Visual Instruction Tuning via Coupled Importance-Diversity Optimization
Probabilistic Stability Guarantees for Feature Attributions
Lessons Learned: A Multi-Agent Framework for Code LLMs to Learn and Improve
Cooperative Bargaining Games Without Utilities: Mediated Solutions from Direction Oracles
Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition
Tight Asymptotics of Extreme Order Statistics
Finding and Reactivating Post-Trained LLMs' Hidden Safety Mechanisms
Point Cloud Synthesis Using Inner Product Transforms
Individually Fair Diversity Maximization
What Does It Take to Build a Performant Selective Classifier?
Parameter Dynamics of Online Machine Learning and Test-time Adaptation
Machine Unlearning via Task Simplex Arithmetic
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs
Kinetics: Rethinking Test-Time Scaling Law
Differential Privacy on Fully Dynamic Streams
Neural Atlas Graphs for Dynamic Scene Decomposition and Editing
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies
CausalVTG: Towards Robust Video Temporal Grounding via Causal Inference
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models
SQS: Enhancing Sparse Perception Models via Query-based Splatting in Autonomous Driving
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need?
AudSemThinker: Enhancing Audio-Language Models Through Reasoning over Semantics of Sound
DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
A Partition Cover Approach to Tokenization
Online robust locally differentially private learning for nonparametric regression
Thinker: Learning to Think Fast and Slow
DynaPhArM: Adaptive and Physics-Constrained Modeling for Target-Drug Complexes with Drug-Specific Adaptations
Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders
Self Iterative Label Refinement via Robust Unlabeled Learning
LoRATv2: Enabling Low-Cost Temporal Modeling in One-Stream Trackers
Consistent Story Generation: Unlocking the Potential of Zigzag Sampling
Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
CAT: Circular-Convolutional Attention for Sub-Quadratic Transformers
Improving Reward Models with Proximal Policy Exploration for Preference-Based Reinforcement Learning
A Theory for Worst-Case vs. Average-Case Guarantees for LLMs
Let a Neural Network be Your Invariant
Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design
Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent
Accelerated Vertical Federated Adversarial Learning through Decoupling Layer-Wise Dependencies
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
EGGS: Exchangeable 2D/3D Gaussian Splatting for Geometry-Appearance Balanced Novel View Synthesis
EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval
Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
AnaCP: Toward Upper-Bound Continual Learning via Analytic Contrastive Projection
Don’t call it privacy-preserving or human-centric pose estimation if you don’t measure privacy
SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches
CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model
Atom of Thoughts for Markov LLM Test-Time Scaling
Replicable Distribution Testing
Photography Perspective Composition: Towards Aesthetic Perspective Recommendation
Omni-DNA: A Genomic Model Supporting Sequence Understanding, Long-context, and Textual Annotation
Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning
Score-informed Neural Operator for Enhancing Ordering-based Causal Discovery
Provable Watermarking for Data Poisoning Attacks
AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation
MOTION: Multi-Sculpt Evolutionary Coarsening for Federated Continual Graph Learning
Evolutionary Prediction Games
Reward-Instruct: A Reward-Centric Approach to Fast Photo-Realistic Image Generation
On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning
WebDancer: Towards Autonomous Information Seeking Agency
Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs
GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior
Demystifying Spectral Feature Learning for Instrumental Variable Regression
ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents
Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error
Missing Data Imputation by Reducing Mutual Information with Rectified Flows
DynaRend: Learning 3D Dynamics via Masked Future Rendering for Robotic Manipulation
Causal Mixture Models: Characterization and Discovery
A Set of Generalized Components to Achieve Effective Poison-only Clean-label Backdoor Attacks with Collaborative Sample Selection and Triggers
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Learning Gradient Boosted Decision Trees with Algorithmic Recourse
Protein Design with Dynamic Protein Vocabulary
Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization
Embeddings as Probabilistic Equivalence in Logic Programs
BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization
HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation
VIKING: Deep variational inference with stochastic projections
Superposition Yields Robust Neural Scaling
STACI: Spatio-Temporal Aleatoric Conformal Inference
Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers
EVODiff: Entropy-aware Variance Optimized Diffusion Inference
Credal Prediction based on Relative Likelihood
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
WolBanking77: Wolof Banking Speech Intent Classification Dataset
From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction
Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness
Conditional Gradient Methods with Standard LMO for Stochastic Simple Bilevel Optimization
Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise
Learning Reconfigurable Representations for Multimodal Federated Learning with Missing Data
DINO-Foresight: Looking into the Future with DINO
Robust SuperAlignment: Weak-to-Strong Robustness Generalization for Vision-Language Models
On Extending Direct Preference Optimization to Accommodate Ties
Physics-informed machine learning with domain decomposition and global dynamics for three-dimensional intersecting flows
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Eve3D: Elevating Vision Models for Enhanced 3D Surface Reconstruction via Gaussian Splatting
Comparator-Adaptive $\Phi$-Regret: Improved Bounds, Simpler Algorithms, and Applications to Games
Online Learning of Pure States is as Hard as Mixed States
Linearization Explains Fine-Tuning in Large Language Models
Preconditioned Langevin Dynamics with Score-based Generative Models for Infinite-Dimensional Linear Bayesian Inverse Problems
Understanding and Improving Adversarial Robustness of Neural Probabilistic Circuits
Backpropagation-Free Test-Time Adaptation via Probabilistic Gaussian Alignment
Know Thyself by Knowing Others: Learning Neuron Identity from Population Context
REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
Optimization Inspired Few-Shot Adaptation for Large Language Models
A Near-Optimal Algorithm for Decentralized Convex-Concave Finite-Sum Minimax Optimization
HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation
Reasoning Models Better Express Their Confidence
SAM2Flow: Interactive Optical Flow Estimation with Dual Memory for in vivo Microcirculation Analysis
DAAC: Discrepancy-Aware Adaptive Contrastive Learning for Medical Time series
Shortcuts and Identifiability in Concept-based Models from a Neuro-Symbolic Lens
Unifying Re-Identification, Attribute Inference, and Data Reconstruction Risks in Differential Privacy
Improving Target Sound Extraction via Disentangled Codec Representations with Privileged Knowledge Distillation
ESCORT: Efficient Stein-variational and Sliced Consistency-Optimized Temporal Belief Representation for POMDPs
EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data
MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver
PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors
FRAM: Frobenius-Regularized Assignment Matching with Mixed-Precision Computing
ARGenSeg: Image Segmentation with Autoregressive Image Generation Model
Language Ranker: A Lightweight Ranking framework for LLM Decoding
On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
Heavy-Ball Momentum Method in Continuous Time and Discretization Error Analysis
Neural Correlates of Serial Dependence: Synaptic Short-term Plasticity Orchestrates Repulsion and Attraction
Solving Discrete (Semi) Unbalanced Optimal Transport with Equivalent Transformation Mechanism and KKT-Multiplier Regularization
Zero-Shot Context Generalization in Reinforcement Learning from Few Training Contexts
Greed is Good: A Unifying Perspective on Guided Generation
Spotlight Attention: Towards Efficient LLM Generation via Non-linear Hashing-based KV Cache Retrieval
FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning
A Clean Slate for Offline Reinforcement Learning
An Analysis of Concept Bottleneck Models: Measuring, Understanding, and Mitigating the Impact of Noisy Annotations
INC: An Indirect Neural Corrector for Auto-Regressive Hybrid PDE Solvers
The Primacy of Magnitude in Low-Rank Adaptation
Bi-Level Knowledge Transfer for Multi-Task Multi-Agent Reinforcement Learning
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
Adaptive 3D Reconstruction via Diffusion Priors and Forward Curvature-Matching Likelihood Updates
Extrapolation by Association: Length Generalization Transfer In Transformers
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
In-Context Compositional Learning via Sparse Coding Transformer
Individual Fairness In Strategic Classification
Emergent Risk Awareness in Rational Agents under Resource Constraints
FraPPE: Fast and Efficient Preference-Based Pure Exploration
SpecEM: Training-Free LLM Ensembling via Iterative Drafting, Verification, and Online Feedback
Conformal Prediction for Ensembles: Improving Efficiency via Score-Based Aggregation
OmniSVG: A Unified Scalable Vector Graphics Generation Model
T-norm Selection for Object Detection in Autonomous Driving with Logical Constraints
Buffer layers for Test-Time Adaptation
Taming Adversarial Constraints in CMDPs
Non-stationary Equivariant Graph Neural Networks for Physical Dynamics Simulation
Value Improved Actor Critic Algorithms
Quantitative convergence of trained neural networks to Gaussian processes
Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection
Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits
Let's Revise Step-by-Step: A Unified Local Search Framework for Code Generation with LLMs
Fair Representation Learning with Controllable High Confidence Guarantees via Adversarial Inference
Self-Supervised Selective-Guided Diffusion Model for Old-Photo Face Restoration
MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling
Curious Causality-Seeking Agents Learn Meta Causal World
Vision Transformers with Self-Distilled Registers
Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex
On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization
Scalable inference of functional neural connectivity at submillisecond timescales
C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models
Point4Bit: Post Training 4-bit Quantization for Point Cloud 3D Detection
Uncertainty Estimation on Graphs with Structure Informed Stochastic Partial Differential Equations
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
$\Psi$-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models
Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage
Gains: Fine-grained Federated Domain Adaptation in Open Set
Gymnasium: A Standard Interface for Reinforcement Learning Environments
Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions
K-DeCore: Facilitating Knowledge Transfer in Continual Structured Knowledge Reasoning via Knowledge Decoupling
Deep Gaussian from Motion: Exploring 3D Geometric Foundation Models for Gaussian Splatting
Equivariance by Contrast: Identifiable Equivariant Embeddings from Unlabeled Finite Group Actions
miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward
Corrector Sampling in Language Models
Diffusion Transformers as Open-World Spatiotemporal Foundation Models
LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding
Learning Crossmodal Interaction Patterns via Attributed Bipartite Graphs for Single-Cell Omics
Structured Spectral Reasoning for Frequency-Adaptive Multimodal Recommendation
ASDSV: Multimodal Generation Made Efficient with Approximate Speculative Diffusion and Speculative Verification
Soft Task-Aware Routing of Experts for Equivariant Representation Learning
FairNet: Dynamic Fairness Correction without Performance Loss via Contrastive Conditional LoRA
Analyzing the Power of Chain of Thought through Memorization Capabilities
Ultrametric Cluster Hierarchies: I Want ’em All!
NeuSymEA: Neuro-symbolic Entity Alignment via Variational Inference
Interactive Anomaly Detection for Articulated Objects via Motion Anticipation
EuroSpeech: A Multilingual Speech Corpus
Neural Green’s Functions
Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models
Assessing the quality of denoising diffusion models in Wasserstein distance: noisy score and optimal bounds
From Self-Check to Consensus: Bayesian Strategic Decoding in Large Language Models
Dimension-adapted Momentum Outscales SGD
ChromFound: Towards A Universal Foundation Model for Single-Cell Chromatin Accessibility Data
Layer-wise Update Aggregation with Recycling for Communication-Efficient Federated Learning
Geometric Mixture Models for Electrolyte Conductivity Prediction
Preserving Task-Relevant Information Under Linear Concept Removal
Latent Mixture of Symmetries for Sample-Efficient Dynamic Learning
A Bayesian Fast-Slow Framework to Mitigate Interference in Non-Stationary Reinforcement Learning
$\mathcal{X}^2$-DFD: A framework for e$\mathcal{X}$plainable and e$\mathcal{X}$tendable Deepfake Detection
Learning Stochastic Multiscale Models
Motion Matters: Compact Gaussian Streaming for Free-Viewpoint Video Reconstruction
The Structural Complexity of Matrix-Vector Multiplication
Hyperbolic Dataset Distillation
EUGens: Efficient, Unified and General Dense Layers
Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization
Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism
PseuZO: Pseudo-Zeroth-Order Algorithm for Training Deep Neural Networks
Object-Centric Representation Learning for Enhanced 3D Semantic Scene Graph Prediction
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
Imagine Beyond! Distributionally Robust Autoencoding for State Space Coverage in Online Reinforcement Learning
On the Coexistence and Ensembling of Watermarks
Kernel Regression in Structured Non-IID Settings: Theory and Implications for Denoising Score Learning
Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens
Far from the Shallow: Brain-Predictive Reasoning Embedding through Residual Disentanglement
ContextAgent: Context-Aware Proactive LLM Agents with Open-world Sensory Perceptions
Principled Data Augmentation for Learning to Solve Quadratic Programming Problems
CausalDynamics: A large‐scale benchmark for structural discovery of dynamical causal models
Interpreting Arithmetic Reasoning in Large Language Models using Game-Theoretic Interactions
LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits
Boundary-Value PDEs Meet Higher-Order Differential Topology-aware GNNs
The Flood Complex: Large-Scale Persistent Homology on Millions of Points
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
Deep Learning with Plausible Deniability
Multiresolution Analysis and Statistical Thresholding on Dynamic Networks
Representation Consistency for Accurate and Coherent LLM Answer Aggregation
SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders
Enhancing Graph Classification Robustness with Singular Pooling
TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding
AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems
Unveiling the Spatial-temporal Effective Receptive Fields of Spiking Neural Networks
Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
LLM Unlearning via Neural Activation Redirection
DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing
D-VST: Diffusion Transformer for Pathology-Correct Tone-Controllable Cross-Dye Virtual Staining of Whole Slide Images
Time-Evolving Dynamical System for Learning Latent Representations of Mouse Visual Neural Activity
DataRater: Meta-Learned Dataset Curation
On the Integration of Spatial-Temporal Knowledge: A Lightweight Approach to Atmospheric Time Series Forecasting
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation
GST-UNet: A Neural Framework for Spatiotemporal Causal Inference with Time-Varying Confounding
Sketched Adaptive Distributed Deep Learning: A Sharp Convergence Analysis
Efficient Adaptive Experimentation with Noncompliance
LT-Soups: Bridging Head and Tail Classes via Subsampled Model Soups
Diffusion Federated Dataset
Gaussian Process Upper Confidence Bound Achieves Nearly-Optimal Regret in Noise-Free Gaussian Process Bandits
Robo2VLM: Improving Visual Question Answering using Large-Scale Robot Manipulation Data
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
Breaking AR’s Sampling Bottleneck: Provable Acceleration via Diffusion Language Models
$i$MIND: Insightful Multi-subject Invariant Neural Decoding
UFM: A Simple Path towards Unified Dense Correspondence with Flow
Thought Communication in Multiagent Collaboration
$\varepsilon$-Optimally Solving Two-Player Zero-Sum POSGs
Hippocampal-like Sequential Editing for Continual Knowledge Updates in Large Language Models
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
UniTransfer: Video Concept Transfer via Progressive Spatio-Temporal Decomposition
LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions
Dynamic and Chemical Constraints to Enhance the Molecular Masked Graph Autoencoders
Transferring Causal Effects using Proxies
SeePhys: Does Seeing Help Thinking? – Benchmarking Vision-Based Physics Reasoning
FACE: A General Framework for Mapping Collaborative Filtering Embeddings into LLM Tokens
Offline imitation learning in $Q^\pi$-realizable MDPs without expert realizability
Stochastic Principal-Agent Problems: Computing and Learning Optimal History-Dependent Policies
Can Diffusion Models Disentangle? A Theoretical Perspective
Revisiting Consensus Error: A Fine-grained Analysis of Local SGD under Second-order Data Heterogeneity
The Complexity of Finding Local Optima in Contrastive Learning
The Complexity of Correlated Equilibria in Generalized Games
Tradeoffs between Mistakes and ERM Oracle Calls in Online and Transductive Online Learning
Convolution Goes Higher-Order: A Biologically Inspired Mechanism Empowers Image Classification
Stochastic Shortest Path with Sparse Adversarial Costs
MindForge: Empowering Embodied Agents with Theory of Mind for Lifelong Cultural Learning
Data Selection Matters: Towards Robust Instruction Tuning of Large Multimodal Models
Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning
Temporal Smoothness-Aware Rate-Distortion Optimized 4D Gaussian Splatting
Squared families are useful conjugate priors
Neurosymbolic Diffusion Models
Nyström-Accelerated Primal LS-SVMs: Breaking the $O(an^3)$ Complexity Bottleneck for Scalable ODEs Learning
Conformal Risk Training: End-to-End Optimization of Conformal Risk Control
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Graph-based Symbolic Regression with Invariance and Constraint Encoding
Let the LLM Stick to Its Strengths: Learning to Route Economical LLM
Noise Matters: Optimizing Matching Noise for Diffusion Classifiers
Personalized Decision Modeling: Utility Optimization or Textualized-Symbolic Reasoning
Fix False Transparency by Noise Guided Splatting
MetaFind: Scene-Aware 3D Asset Retrieval for Coherent Metaverse Scene Generation
When Additive Noise Meets Unobserved Mediators: Bivariate Denoising Diffusion for Causal Discovery
Towards Graph Foundation Models: Training on Knowledge Graphs Enables Transferability to General Graphs
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
How to Auto-optimize Prompts for Domain Tasks? Adaptive Prompting and Reasoning through Evolutionary Domain Knowledge Adaptation
Learning Repetition-Invariant Representations for Polymer Informatics
TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving
Variational Supervised Contrastive Learning
Estimating Hitting Times Locally at Scale
Leveraging semantic similarity for experimentation with AI-generated treatments
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Wavelet Canonical Coherence for Nonstationary Signals
Two‑Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models
An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation
Higher-Order Learning with Graph Neural Networks via Hypergraph Encodings
Head Pursuit: Probing Attention Specialization in Multimodal Transformers
Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era
LinEAS: End-to-end Learning of Activation Steering with a Distributional Loss
Comparing Uniform Price and Discriminatory Multi-Unit Auctions through Regret Minimization
Mean Flows for One-step Generative Modeling
VisualLens: Personalization through Task-Agnostic Visual History
Group-Level Data Selection for Efficient Pretraining
AliO: Output Alignment Matters in Long-Term Time Series Forecasting
LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders
DMWM: Dual-Mind World Model with Long-Term Imagination
FairImagen: Post-Processing for Bias Mitigation in Text-to-Image Models
Learning Provably Improves the Convergence of Gradient Descent
Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation
Influence Guided Context Selection for Effective Retrieval-Augmented Generation
Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization
GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning
Synthetic-powered predictive inference
Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence
DGH: Dynamic Gaussian Hair
OOD Detection with Relative Angles
FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning
OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates
Feedback-Aware MCTS for Goal-Oriented Information Seeking
InstructRestore: Region-Customized Image Restoration with Human Instructions
Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
Extracting task-relevant preserved dynamics from contrastive aligned neural recordings
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
Covariate-moderated Empirical Bayes Matrix Factorization
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
A Plug-and-Play Query Synthesis Active Learning Framework for Neural PDE Solvers
Less is More: an Attention-free Sequence Prediction Modeling for Offline Embodied Learning
Learning and Planning Multi-Agent Tasks via an MoE-based World Model
On the Relation between Rectified Flows and Optimal Transport
Towards Irreversible Attack: Fooling Scene Text Recognition via Multi-Population Coevolution Search
CG-SSL: Concept-Guided Self-Supervised Learning
Generalization Guarantees for Learning Score-Based Branch-and-Cut Policies in Integer Programming
LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation
Flex-Judge: Text-Only Reasoning Unleashes Zero-Shot Multimodal Evaluators
Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization
Bilevel ZOFO: Efficient LLM Fine-Tuning and Meta-Training
Last-Iterate Convergence of Smooth Regret Matching$^+$ Variants in Learning Nash Equilibria
Learning to Clean: Reinforcement Learning for Noisy Label Correction
MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
MIBP-Cert: Certified Training against Data Perturbations with Mixed-Integer Bilinear Programs
Distributional Adversarial Attacks and Training in Deep Hedging
Calibrating Translation Decoding with Quality Estimation on LLMs
Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation
TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Boundary-to-Region Supervision for Offline Safe Reinforcement Learning
Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization
Efficient Bayesian Experiment Design with Equivariant Networks
Luminance-Aware Statistical Quantization: Unsupervised Hierarchical Learning for Illumination Enhancement
Spectral Graph Neural Networks are Incomplete on Graphs with a Simple Spectrum
InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing
Enhancing Zero-Shot Black-Box Optimization via Pretrained Models with Efficient Population Modeling, Interaction, and Stable Gradient Approximation
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Optimism Without Regularization: Constant Regret in Zero-Sum Games
Interpretable Next-token Prediction via the Generalized Induction Head
Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
A Geometric Analysis of PCA
ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training
Reaction Prediction via Interaction Modeling of Symmetric Difference Shingle Sets
Anomaly Detection by an Ensemble of Random Pairs of Hyperspheres
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
Causal Spatio-Temporal Prediction: An Effective and Efficient Multi-Modal Approach
Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation
Measuring Scientific Capabilities of Language Models with a Systems Biology Dry Lab
Learning to Reason under Off-Policy Guidance
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model
Backdoor Mitigation via Invertible Pruning Masks
Data-Dependent Regret Bounds for Constrained MABs
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
Towards Robust Zero-Shot Reinforcement Learning
Shape-Informed Clustering of Multi-Dimensional Functional Data via Deep Functional Autoencoders
Improving Model-Based Reinforcement Learning by Converging to Flatter Minima
VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Training-free Detection of AI-generated images via Cropping Robustness
NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
Zero-shot Denoising via Neural Compression: Theoretical and algorithmic framework
EditInfinity: Image Editing with Binary-Quantized Generative Models
MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?
DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning
IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation
Removing Concepts from Text-to-Image Models with Only Negative Samples
A Generalized Binary Tree Mechanism for Private Approximation of All-Pair Shortest Distances
NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
EverybodyDance: Bipartite Graph–Based Identity Correspondence for Multi-Character Animation
EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction
OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Statistical Inference for Gradient Boosting Regression
Faster Fixed-Point Methods for Multichain MDPs
Learning Source-Free Domain Adaptation for Visible-Infrared Person Re-Identification
BlurGuard: A Simple Approach for Robustifying Image Protection Against AI-Powered Editing
PANORAMA: A Dataset and Benchmarks Capturing Decision Trails and Rationales in Patent Examination
The Dual Nature of Plasticity Loss in Deep Continual Learning: Dissection and Mitigation
Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion
HOComp: Interaction-Aware Human-Object Composition
It’s Hard to Be Normal: The Impact of Noise on Structure-agnostic Estimation
FAIR Universe HiggsML Uncertainty Dataset and Competition
STRIDER: Navigation via Instruction-Aligned Structural Decision Space Optimization
Document Summarization with Conformal Importance Guarantees
Multivariate Time Series Anomaly Detection with Idempotent Reconstruction
Aligning Transformers with Continuous Feedback via Energy Rank Alignment
Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions
Preference-Based Dynamic Ranking Structure Recognition
A Single-Loop Gradient Algorithm for Pessimistic Bilevel Optimization via Smooth Approximation
SpaceServe: Spatial Multiplexing of Complementary Encoders and Decoders for Multimodal LLMs
Latent Harmony: Synergistic Unified UHD Image Restoration via Latent Space Regularization and Controllable Refinement
CLIPTTA: Robust Contrastive Vision-Language Test-Time Adaptation
Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
Tight Generalization Bounds for Large-Margin Halfspaces
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
Model-Informed Flows for Bayesian Inference
Regression Trees Know Calculus
UnCLe: Towards Scalable Dynamic Causal Discovery in Non-linear Temporal Systems
The Mirage of Performance Gains: Why Contrastive Decoding Fails to Mitigate Object Hallucinations in MLLMs?
Salient Concept-Aware Generative Data Augmentation
From stability of Langevin diffusion to convergence of proximal MCMC for non-log-concave sampling
TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
What Can RL Bring to VLA Generalization? An Empirical Study
Segment Anything Model Meets Semi-supervised Medical Image Segmentation: A Novel Perspective
MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement
Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning
FastDINOv2: Frequency Based Curriculum Learning Improves Robustness and Training Speed
Distance Adaptive Beam Search for Provably Accurate Graph-Based Nearest Neighbor Search
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
Learning long range dependencies through time reversal symmetry breaking
CURE: Co-Evolving Coders and Unit Testers via Reinforcement Learning
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
RoME: Domain-Robust Mixture-of-Experts for MILP Solution Prediction across Domains
ProSpero: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods
Multi-scale Temporal Prediction via Incremental Generation and Multi-agent Collaboration
Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning
RAG4GFM: Bridging Knowledge Gaps in Graph Foundation Models through Graph Retrieval Augmented Generation
Graph Persistence goes Spectral
Pinpointing Attention-Causal Communication in Language Models
Globally Optimal Policy Gradient Algorithms for Reinforcement Learning with PID Control Policies
Why Playing Against Diverse and Challenging Opponents Speeds Up Coevolution: A Theoretical Analysis on Combinatorial Games
Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers
MonoLift: Learning 3D Manipulation Policies from Monocular RGB via Distillation
Straight-Line Diffusion Model for Efficient 3D Molecular Generation
Aggregation Hides Out-of-Distribution Generalization Failures from Spurious Correlations
Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions
RoboScape: Physics-informed Embodied World Model
DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging
MMaDA: Multimodal Large Diffusion Language Models
3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
Visual Diversity and Region-aware Prompt Learning for Zero-shot HOI Detection
Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness
HoT-VI: Reparameterizable Variational Inference for Capturing Instance-Level High-Order Correlations
Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning
Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space
Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective
Uncertainty Estimation by Flexible Evidential Deep Learning
TITAN: A Trajectory-Informed Technique for Adaptive Parameter Freezing in Large-Scale VQE
Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity
LayerNavigator: Finding Promising Intervention Layers for Efficient Activation Steering in Large Language Models
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Generative Perception of Shape and Material from Differential Motion
Sequential Attention-based Sampling for Histopathological Analysis
Generating Full-field Evolution of Physical Dynamics from Irregular Sparse Observations
Collaborative Reasoner: Self-Improving Social Agents with Synthetic Conversations
SAS: Simulated Attention Score
Many LLMs Are More Utilitarian Than One
Beyond Prediction: Managing the Repercussions of Machine Learning Applications
Sparse Image Synthesis via Joint Latent and RoI Flow
COOPERA: Continual Open-Ended Human-Robot Assistance
Interpretable Global Minima of Deep ReLU Neural Networks on Sequentially Separable Data
Risk-Averse Constrained Reinforcement Learning with Optimized Certainty Equivalents
Learning normalized image densities via dual score matching
Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models
LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders
Attention (as Discrete-Time Markov) Chains
VividFace: A Robust and High-Fidelity Video Face Swapping Framework
Non-Convex Tensor Recovery from Tube-Wise Sensing
Exploring the limits of strong membership inference attacks on large language models
AutoData: A Multi-Agent System for Open Web Data Collection
Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
DeepHalo: A Neural Choice Model with Controllable Context Effects
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
Synthetic Series-Symbol Data Generation for Time Series Foundation Models
MODEM: A Morton-Order Degradation Estimation Mechanism for Adverse Weather Image Recovery
RNNs perform task computations by dynamically warping neural representations
MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control
Path-Enhanced Contrastive Learning for Recommendation
No-Regret Online Autobidding Algorithms in First-price Auctions
Solver-Free Decision-Focused Learning for Linear Optimization Problems
From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning
A Pre-training Framework for Relational Data with Information-theoretic Principles
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration
DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization
On Group Sufficiency Under Label Bias
Revisiting Bi-Linear State Transitions in Recurrent Neural Networks
Smooth Regularization for Efficient Video Recognition
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)
Performative Validity of Recourse Explanations
LLM-DAMVC: A Large Language Model Assisted Dynamic Agent for Multi-View Clustering
Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis
Robust Cross-modal Alignment Learning for Cross-Scene Spatial Reasoning and Grounding
Multimodal Causal Reasoning for UAV Object Detection
MLZero: A Multi-Agent System for End-to-end Machine Learning Automation
Train to Defend: First Defense Against Cryptanalytic Neural Network Parameter Extraction Attacks
ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction
NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables
Policy Compatible Skill Incremental Learning via Lazy Learning Interface
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
A Differential and Pointwise Control Approach to Reinforcement Learning
Bridging Arbitrary and Tree Metrics via Differentiable Gromov Hyperbolicity
TabDPT: Scaling Tabular Foundation Models on Real Data
Role Bias in Diffusion Models: Diagnosing and Mitigating through Intermediate Decomposition
Equi-mRNA: Protein Translation Equivariant Encoding for mRNA Language Models
Solving and Learning Partial Differential Equations with Variational Q-Exponential Processes
Exploring Neural Granger Causality with xLSTMs: Unveiling Temporal Dependencies in Complex Data
A Novel General Framework for Sharp Lower Bounds in Succinct Stochastic Bandits
DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs
E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products
Continuous Thought Machines
Differentially Private Quantiles with Smaller Error
Asymptotic theory of SGD with a general learning-rate
Attention with Trained Embeddings Provably Selects Important Tokens
Multi-step Visual Reasoning with Visual Tokens Scaling and Verification
ORIGAMISPACE: Benchmarking Multimodal LLMs in Multi-Step Spatial Reasoning with Mathematical Constraints
Differentiable Hierarchical Visual Tokenization
Do different prompting methods yield a common task representation in language models?
Generalization Bounds for Kolmogorov-Arnold Networks (KANs) and Enhanced KANs with Lower Lipschitz Complexity
Bridging Brains and Concepts: Interpretable Visual Decoding from fMRI with Semantic Bottlenecks
SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts
Provably Efficient Multi-Task Meta Bandit Learning via Shared Representations
AlphaBeta is not as good as you think: a simple random games model for a better analysis of deterministic game-solving algorithms
Activity Pruning for Efficient Spiking Neural Networks
Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization
Coarse-to-Fine 3D Part Assembly via Semantic Super-Parts and Symmetry-Aware Pose Estimation
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
Revisiting Semi-Supervised Learning in the Era of Foundation Models
Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark
SceneDesigner: Controllable Multi-Object Image Generation with 9-DoF Pose Manipulation
Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning
Space Group Equivariant Crystal Diffusion
4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration
Fisher meets Feynman: score-based variational inference with a product of experts
DiffLiG: Diffusion-enhanced Liquid Graph with Attention Propagation for Grid-to-Station Precipitation Correction
Alligat0R: Pre-Training through Covisibility Segmentation for Relative Camera Pose Regression
MAPLE: Multi-scale Attribute-enhanced Prompt Learning for Few-shot Whole Slide Image Classification
Prompt Tuning Decision Transformers with Structured and Scalable Bandits
Novel Exploration via Orthogonality
Mint: A Simple Test-Time Adaptation of Vision-Language Models against Common Corruptions
Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting
EvolvedGRPO: Unlocking Reasoning in LVLMs via Progressive Instruction Evolution
Deeper with Riemannian Geometry: Overcoming Oversmoothing and Oversquashing for Graph Foundation Models
The Future Unmarked: Watermark Removal in AI-Generated Images via Next-Frame Prediction
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
Diffusion Tree Sampling: Scalable inference‑time alignment of diffusion models
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning
PDPO: Parametric Density Path Optimization
Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components
Fair Cooperation in Mixed-Motive Games via Conflict-Aware Gradient Adjustment
Robust Policy Expansion for Offline-to-Online RL under Diverse Data Corruption
DNA-DetectLLM: Unveiling AI-Generated Text via a DNA-Inspired Mutation-Repair Paradigm
Ask a Strong LLM Judge when Your Reward Model is Uncertain
Evaluating the Inductive Abilities of Large Language Models: Why Chain-of-Thought Reasoning Sometimes Hurts More Than Helps
Improved Best-of-Both-Worlds Regret for Bandits with Delayed Feedback
Execution Guided Line-by-Line Code Generation
Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting
Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models
TRiCo: Triadic Game-Theoretic Co-Training for Robust Semi-Supervised Learning
Optimizing Retrieval for RAG via Reinforced Contrastive Learning
Coresets for Clustering Under Stochastic Noise
Enforcing convex constraints in Graph Neural Networks
Learning in Compact Spaces with Approximately Normalized Transformer
Towards a Golden Classifier-Free Guidance Path via Foresight Fixed Point Iterations
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models
IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation
CLAWS: Creativity detection for LLM-generated solutions using Attention Window of Sections
Adversarial generalization of unfolding (model-based) networks
PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding
Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning
Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models
Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory
Interpretable and Parameter Efficient Graph Neural Additive Models with Random Fourier Features
Spectral Analysis of Diffusion Models with Application to Schedule Design
High-order Interactions Modeling for Interpretable Multi-Agent Q-Learning
Learning Human-Object Interaction as Groups
Rethinking Out-of-Distribution Detection and Generalization with Collective Behavior Dynamics
DCA: Graph-Guided Deep Embedding Clustering for Brain Atlases
Hierarchical Semantic-Augmented Navigation: Optimal Transport and Graph-Driven Reasoning for Vision-Language Navigation
Personalized Subgraph Federated Learning with Differentiable Auxiliary Projections
Cost-Sensitive Freeze-thaw Bayesian Optimization for Efficient Hyperparameter Tuning
Vulnerable Data-Aware Adversarial Training
Model Inversion with Layer-Specific Modeling and Alignment for Data-Free Continual Learning
Revising and Falsifying Sparse Autoencoder Feature Explanations
Graph–Smoothed Bayesian Black-Box Shift Estimator and Its Information Geometry
Transformers Learn Faster with Semantic Focus
A Minimalist Example of Edge-of-Stability and Progressive Sharpening
Revisiting Logit Distributions for Reliable Out-of-Distribution Detection
HumanCrafter: Synergizing Generalizable Human Reconstruction and Semantic 3D Segmentation
ScatterAD: Temporal-Topological Scattering Mechanism for Time Series Anomaly Detection
Precise Information Control in Long-Form Text Generation
Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis
Part-Aware Bottom-Up Group Reasoning for Fine-Grained Social Interaction Detection
PhysDiff-VTON: Cross-Domain Physics Modeling and Trajectory Optimization for Virtual Try-On
HoloScene: Simulation‑Ready Interactive 3D Worlds from a Single Video
Logical Expressiveness of Graph Neural Networks with Hierarchical Node Individualization
Visual Instruction Bottleneck Tuning
Theoretical Guarantees for the Retention of Strict Nash Equilibria by Coevolutionary Algorithms
Variational Task Vector Composition
AI Debate Aids Assessment of Controversial Claims
Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search
DUET: Dual-Perspective Pseudo Labeling and Uncertainty-aware Exploration & Exploitation Training for Source-Free Domain Adaptation
Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Sim-LLM: Optimizing LLM Inference at the Edge through Inter-Task KV Reuse
Doodle to Detect: A Goofy but Powerful Approach to Skeleton-based Hand Gesture Recognition
IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
Learning to Flow from Generative Pretext Tasks for Neural Architecture Encoding
GeRaF: Neural Geometry Reconstruction from Radio Frequency Signals
LogicTree: Improving Complex Reasoning of LLMs via Instantiated Multi-step Synthetic Logical Data
Variational Uncertainty Decomposition for In-Context Learning
Composing Linear Layers from Irreducibles
FlowPrune: Accelerating Attention Flow Calculation by Pruning Flow Network
OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games
Faithful Group Shapley Value
A compressive-expressive communication framework for compositional representations
Scaling Speculative Decoding with Lookahead Reasoning
ARM: Adaptive Reasoning Model
Continual Model Merging without Data: Dual Projections for Balancing Stability and Plasticity
Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers
Curriculum Design for Trajectory-Constrained Agent: Compressing Chain-of-Thought Tokens in LLMs
Test-Time Adaptation by Causal Trimming
Size-adaptive Hypothesis Testing for Fairness
A Reinforcement Learning-based Bidding Strategy for Data Consumers in Auction-based Federated Learning
SALS: Sparse Attention in Latent Space for KV Cache Compression
Discovering Opinion Intervals from Conflicts in Signed Graphs
Learning to cluster neuronal function
Omni-Mol: Multitask Molecular Model for Any-to-any Modalities
DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
BeyondMix: Leveraging Structural Priors and Long-Range Dependencies for Domain-Invariant LiDAR Segmentation
Learnable Burst-Encodable Time-of-Flight Imaging for High-Fidelity Long-Distance Depth Sensing
LLMs Encode Harmfulness and Refusal Separately
Scaling can lead to compositional generalization
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
A Dataset for Distilling Knowledge Priors from Literature for Therapeutic Design
Bootstrap Off-policy with World Model
FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic
Private Evolution Converges
FlashMoE: Fast Distributed MoE in a Single Kernel
Model Editing for Vision Transformers
Incentivizing Time-Aware Fairness in Data Sharing
Scalable Policy-Based RL Algorithms for POMDPs
Dynamic Masking and Auxiliary Hash Learning for Enhanced Cross-Modal Retrieval
OpenHype: Hyperbolic Embeddings for Hierarchical Open-Vocabulary Radiance Fields
Clip-and-Verify: Linear Constraint-Driven Domain Clipping for Accelerating Neural Network Verification
Towards Generalizable Retina Vessel Segmentation with Deformable Graph Priors
InfMasking: Unleashing Synergistic Information by Contrastive Multimodal Interactions
Mozart: Modularized and Efficient MoE Training on 3.5D Wafer-Scale Chiplet Architectures
RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks
Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
UltraLED: Learning to See Everything in Ultra-High Dynamic Range Scenes
ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS
RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models
Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
Optimal Adjustment Sets for Nonparametric Estimation of Weighted Controlled Direct Effect
Scalable In-context Ranking with Generative Models
TAMI: Taming Heterogeneity in Temporal Interactions for Temporal Graph Link Prediction
Causality Meets the Table: Debiasing LLMs for Faithful TableQA via Front-Door Intervention
From Synapses to Dynamics: Obtaining Function from Structure in a Connectome Constrained Model of the Head Direction Circuit
Robust Integrated Learning and Pauli Noise Mitigation for Parametrized Quantum Circuits
ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training
Theory-Driven Label-Specific Representation for Incomplete Multi-View Multi-Label Learning
Accelerating Feature Conformal Prediction via Taylor Approximation
Scalable and Cost-Efficient de Novo Template-Based Molecular Generation
Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking
Unlocker: Disentangle the Deadlock of Learning between Label-noisy and Long-tailed Data
RAST: Reasoning Activation in LLMs via Small-model Transfer
HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene
Mamba Goes HoME: Hierarchical Soft Mixture-of-Experts for 3D Medical Image Segmentation
Uncovering a Universal Abstract Algorithm for Modular Addition in Neural Networks
Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations
SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
Transforming Generic Coder LLMs to Effective Binary Code Embedding Models for Similarity Detection
Novel View Synthesis from A Few Glimpses via Test-Time Natural Video Completion
GeneFlow: Translation of Single-cell Gene Expression to Histopathological Images via Rectified Flow
Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach
Private Zeroth-Order Optimization with Public Data
PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning
DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization
TRAP: Targeted Redirecting of Agentic Preferences
DINGO: Constrained Inference for Diffusion LLMs
GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
Demystifying Network Foundation Models
AMBER: Adaptive Mesh Generation by Iterative Mesh Resolution Prediction
MaxSup: Overcoming Representation Collapse in Label Smoothing
MaNGO — Adaptable Graph Network Simulators via Meta-Learning
UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression
Self-Calibrating BCIs: Ranking and Recovery of Mental Targets Without Labels
Network two-sample test for block models
Scaling Laws For Scalable Oversight
Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models
Understanding Data Influence in Reinforcement Finetuning
Slow Transition to Low-Dimensional Chaos in Heavy-Tailed Recurrent Neural Networks
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers
How Data Mixing Shapes In-Context Learning: Asymptotic Equivalence for Transformers with MLPs
One Head to Rule Them All: Amplifying LVLM Safety through a Single Critical Attention Head
NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI
PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement
IndEgo: A Dataset of Industrial Scenarios and Collaborative Work for Egocentric Assistants
Neural Networks Generalize on Low Complexity Data
SmokeViz: A Large-Scale Satellite Dataset for Wildfire Smoke Detection and Segmentation
Learning from Interval Targets
PhySense: Sensor Placement Optimization for Accurate Physics Sensing
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models
Aligning Text to Image in Diffusion Models is Easier Than You Think
Bridging Symmetry and Robustness: On the Role of Equivariance in Enhancing Adversarial Robustness
ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
Dimensionality Mismatch Between Brains and Artificial Neural Networks
Conformal Mixed-Integer Constraint Learning with Feasibility Guarantees
Generalizable Reasoning through Compositional Energy Minimization
Meta-learning how to Share Credit among Macro-Actions
Go With the Flow: Fast Diffusion for Gaussian Mixture Models
A Controllable Examination for Long-Context Language Models
ProfiX: Improving Profile-Guided Optimization in Compilers with Graph Neural Networks
Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training
CryptoMoE: Privacy-Preserving and Scalable Mixture of Experts Inference via Balanced Expert Routing
GRAPE: Optimize Data Mixture for Group Robust Multi-target Adaptive Pretraining
REVE: A Foundation Model for EEG - Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects
Value-Guided Search for Efficient Chain-of-Thought Reasoning
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities
Hybrid Boundary Physics-Informed Neural Networks for Solving Navier-Stokes Equations with Complex Boundary
Nested Learning: The Illusion of Deep Learning Architectures
Fast attention mechanisms: a tale of parallelism
PiKE: Adaptive Data Mixing for Large-Scale Multi-Task Learning Under Low Gradient Conflicts
Fair Deepfake Detectors Can Generalize
Pairwise Calibrated Rewards for Pluralistic Alignment
SAO-Instruct: Free-form Audio Editing using Natural Language Instructions
Multi-Token Prediction Needs Registers
Structured Temporal Causality for Interpretable Multivariate Time Series Anomaly Detection
Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones
The Omni-Expert: A Computationally Efficient Approach to Achieve a Mixture of Experts in a Single Expert Model
Direct Alignment with Heterogeneous Preferences
Accurate and Efficient Low-Rank Model Merging in Core Space
MuSLR: Multimodal Symbolic Logical Reasoning
Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought
VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation
Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
SimpleStrat: Diversifying Language Model Generation with Stratification
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
NAVIX: Scaling MiniGrid Environments with JAX
SynTSBench: Rethinking Temporal Pattern Learning in Deep Learning Models for Time Series
Imagined Autocurricula
From Black-box to Causal-box: Towards Building More Interpretable Models
HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Conformal Prediction under Lévy-Prokhorov Distribution Shifts: Robustness to Local and Global Perturbations
Large language models can learn and generalize steganographic chain-of-thought under process supervision
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks
Bidirectional Motion Transformer for Safety-Critical Traffic Scenario Generation
Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training
Feasibility-Aware Decision-Focused Learning for Predicting Parameters in the Constraints
Protocols for Verifying Smooth Strategies in Bandits and Games
A Unified Approach to Submodular Maximization Under Noise
Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models
KLASS: KL-Guided Fast Inference in Masked Diffusion Models
Accelerating Diffusion LLMs via Adaptive Parallel Decoding
Plug-and-Play Context Feature Reuse for Efficient Masked Generation
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics
Learning to Solve Complex Problems via Dataset Decomposition
Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion
Explicitly Modeling Subcortical Vision with a Neuro-Inspired Front-End Improves CNN Robustness
Radial Attention: $\mathcal O(n \log n)$ Sparse Attention for Long Video Generation
Energy Loss Functions for Physical Systems
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Constructing an Optimal Behavior Basis for the Option Keyboard
Accelerated Evolving Set Processes for Local PageRank Computation
SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Titans: Learning to Memorize at Test Time
Compositional Neural Network Verification via Assume-Guarantee Reasoning
Generating and Checking DNN Verification Proofs
Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation
Solving Inverse Problems with FLAIR
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
Training-Free Constrained Generation With Stable Diffusion Models
Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy
Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
ReCon-GS: Continuum-Preserved Gaussian Streaming for Fast and Compact Reconstruction of Dynamic Scenes
Orochi: Versatile Biomedical Image Processor
Conditional Diffusion Anomaly Modeling on Graphs
SHF: Symmetrical Hierarchical Forest with Pretrained Vision Transformer Encoder for High-Resolution Medical Segmentation
Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra
A$^3$E: Towards Compositional Model Editing
Predictability Enables Parallelization of Nonlinear State Space Models
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models
CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis
ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection
A faster training algorithm for regression trees with linear leaves, and an analysis of its complexity
Knowledge Starts with Practice: Knowledge-Aware Exercise Generative Recommendation with Adaptive Multi-Agent Cooperation
Optimality and NP-Hardness of Transformers in Learning Markovian Dynamical Functions
PIPE: Physics-Informed Position Encoding for Alignment of Satellite Images and Time Series in Typhoon Forecasting
$\text{S}^2$Q-VDiT: Accurate Quantized Video Diffusion Transformer with Salient Data and Sparse Token Distillation
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Generalizing Experience for Language Agents with Hierarchical MetaFlows
Stratify or Die: Rethinking Data Splits in Image Segmentation
Dual-Flow: Transferable Multi-Target, Instance-Agnostic Attacks via $\textit{In-the-wild}$ Cascading Flow Optimization
RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints
Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport
Are Language Models Efficient Reasoners? A Perspective from Logic Programming
All You Need is One: Capsule Prompt Tuning with a Single Vector
Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers
Monoculture or Multiplicity: Which Is It?
A Principled Path to Fitted Distributional Evaluation
Improving Time Series Forecasting via Instance-aware Post-hoc Revision
TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation
AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios
Inpainting the Neural Picture: Inferring Unrecorded Brain Area Dynamics from Multi-Animal Datasets
Better Training Data Attribution via Better Inverse Hessian-Vector Products
Explaining and Mitigating Crosslingual Tokenizer Inequities
Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation
Reverse Engineering Human Preferences with Reinforcement Learning
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning
Reliably detecting model failures in deployment without labels
Incentivizing Truthful Language Models via Peer Elicitation Games
ELDET: Early-Learning Distillation with Noisy Labels for Object Detection
Safe and Stable Control via Lyapunov-Guided Diffusion Models
Angles Don’t Lie: Unlocking Training‑Efficient RL Through the Model’s Own Signals
OrdShap: Feature Position Importance for Sequential Black-Box Models
Scaling Offline RL via Efficient and Expressive Shortcut Models
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
ProteinConformers: Benchmark Dataset for Simulating Protein Conformational Landscape Diversity and Plausibility
OmniBench: Towards The Future of Universal Omni-Language Models
Quadratic Coreset Selection: Certifying and Reconciling Sequence and Token Mining for Efficient Instruction Tuning
3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
RDB2G-Bench: A Comprehensive Benchmark for Automatic Graph Modeling of Relational Databases
Conformal Online Learning of Deep Koopman Linear Embeddings
ShapeX: Shapelet-Driven Post Hoc Explanations for Time Series Classification Models
CleverBirds: A Multiple-Choice Benchmark for Fine-grained Human Knowledge Tracing
Isotropic Noise in Stochastic and Quantum Convex Optimization
Universal Sequence Preconditioning
Emergence and Evolution of Interpretable Concepts in Diffusion Models
Grids Often Outperform Implicit Neural Representation at Compressing Dense Signals
Model Merging in Pre-training of Large Language Models
Turbocharging Gaussian Process Inference with Approximate Sketch-and-Project
Diffusion Classifiers Understand Compositionality, but Conditions Apply
KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
Emerging Risks from Embodied AI Require Urgent Policy Action
Feature-Based Instance Neighbor Discovery: Advanced Stable Test-Time Adaptation in Dynamic World
Multi-order Orchestrated Curriculum Distillation for Model-Heterogeneous Federated Graph Learning
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning
FracFace: Breaking The Visual Clues—Fractal-Based Privacy-Preserving Face Recognition
Knot So Simple: A Minimalistic Environment for Spatial Reasoning
DrivAerStar: An Industrial-Grade CFD Dataset for Vehicle Aerodynamic Optimization
Accelerating data-driven algorithm selection for combinatorial partitioning problems
SciArena: An Open Evaluation Platform for Non-Verifiable Scientific Literature-Grounded Tasks
OSTAR: Optimized Statistical Text-classifier with Adversarial Resistance
FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network
DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
Beyond Value Functions: Single-Loop Bilevel Optimization under Flatness Conditions
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension
Optimal Best Arm Identification under Differential Privacy
Linearly Constrained Diffusion Implicit Models
Zero-shot World Models via Search in Memory
A Near-optimal, Scalable and Parallelizable Framework for Stochastic Bandits Robust to Adversarial Corruptions and Beyond
Valid Inference with Imperfect Synthetic Data
Spectral Compressive Imaging via Chromaticity-Intensity Decomposition
VITRIX-CLIPIN: Enhancing Fine-Grained Visual Understanding in CLIP via Instruction-Editing Data and Long Captions
A Standardized Benchmark for Multilabel Antimicrobial Peptide Classification
Measuring and Guiding Monosemanticity
UMoE: Unifying Attention and FFN with Shared Experts
Model Provenance Testing for Large Language Models
Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape
SRHand: Super-Resolving Hand Images and 3D Shapes via View/Pose-aware Neural Image Representations and Explicit Meshes
Synthesizing Performance Constraints for Evaluating and Improving Code Efficiency
Unifying Proportional Fairness in Centroid and Non-Centroid Clustering
Provable Meta-Learning with Low-Rank Adaptations
Introducing FOReCAst: The Future Outcome Reasoning and Confidence Assessment Benchmark
TS-MOF: Two-Stage Multi-Objective Fine-tuning for Long-Tailed Recognition
ShapeEmbed: a self-supervised learning framework for 2D contour quantification
Know What You Don't Know: Uncertainty Calibration of Process Reward Models
Understanding Softmax Attention Layers: Exact Mean-Field Analysis on a Toy Problem
Deciphering the Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery
A multiscale analysis of mean-field transformers in the moderate interaction regime
LUNA: Efficient and Topology-Agnostic Foundation Model for EEG Signal Analysis
Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding
Towards Multi-Table Learning: A Novel Paradigm for Complementarity Quantification and Integration
Fantastic Bugs and Where to Find Them in AI Benchmarks
FSEO: Few-Shot Evolutionary Optimization via Meta-Learning for Expensive Multi-Objective Optimization
SplashNet: Split‑and‑Share Encoders for Accurate and Efficient Typing with Surface Electromyography
Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time
Impartial Selection with Predictions
Rethinking PCA Through Duality
Nonlinearly Preconditioned Gradient Methods: Momentum and Stochastic Analysis
Energy-based generator matching: A neural sampler for general state space
OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions
Seemingly Redundant Modules Enhance Robust Odor Learning in Fruit Flies
Differentially Private Federated Low Rank Adaptation Beyond Fixed-Matrix
Flexible MOF Generation with Torsion-Aware Flow Matching
Generalization Bounds for Rank-sparse Neural Networks
On Agnostic PAC Learning in the Small Error Regime
RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering
AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking
LooGLE v2: Are LLMs Ready for Real World Long Dependency Challenges?
Model-Based Policy Adaptation for Closed-Loop End-to-end Autonomous Driving
Token Bottleneck: One Token to Remember Dynamics
CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic
DCcluster-Opt: Benchmarking Dynamic Multi-Objective Optimization for Geo-Distributed Data Center Workloads
The Leaderboard Illusion
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants
LC-Opt: Benchmarking Reinforcement Learning and Agentic AI for End-to-End Liquid Cooling Optimization in Data Centers
Look-Ahead Reasoning on Learning Platforms
Comprehensive Assessment and Analysis for NSFW Content Erasure in Text-to-Image Diffusion models
Robust Sampling for Active Statistical Inference
EgoBlind: Towards Egocentric Visual Assistance for the Blind
FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design
FineGRAIN: Evaluating Failure Modes of Text-to-Image Models with Vision Language Model Judges
TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine
Conditional Forecasts and Proper Scoring Rules for Reliable and Accurate Performative Predictions
SaFiRe: Saccade-Fixation Reiteration with Mamba for Referring Image Segmentation
Brain-Like Processing Pathways Form in Models With Heterogeneous Experts
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Network
TabArena: A Living Benchmark for Machine Learning on Tabular Data
CIDD: Collaborative Intelligence for Structure-Based Drug Design Empowered by LLMs
AANet: Virtual Screening under Structural Uncertainty via Alignment and Aggregation
Stability and Oracle Inequalities for Optimal Transport Maps between General Distributions
$\texttt{AVROBUSTBENCH}$: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
EPFL-Smart-Kitchen: An Ego-Exo Multi-Modal Dataset for Challenging Action and Motion Understanding in Video-Language Models
OCTDiff: Bridged Diffusion Model for Portable OCT Super-Resolution and Enhancement
BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency
FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes
PSMBench: A Benchmark and Dataset for Evaluating LLMs Extraction of Protocol State Machines from RFC Specifications
Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health
Geometry of Decision Making in Language Models
Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits
Ridge Boosting is Both Robust and Efficient
Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme
Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms
Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation
Neural Tangent Knowledge Distillation for Optical Convolutional Networks
Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning
SPINT: Spatial Permutation-Invariant Neural Transformer for Consistent Intracortical Motor Decoding
Functional Scaling Laws in Kernel Regression: Loss Dynamics and Learning Rate Schedules
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
Causally Reliable Concept Bottleneck Models
DSAS: A Universal Plug-and-Play Framework for Attention Optimization in Multi-Document Question Answering
Diversifying Parallel Ergodic Search: A Signature Kernel Evolution Strategy
Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond
AION-1: Omnimodal Foundation Model for Astronomical Sciences
Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization
A Difference-of-Convex Functions Approach to Energy-Based Iterative Reasoning
Best-of-N Jailbreaking
HouseLayout3D: A Benchmark and Training-free Baseline for 3D Layout Estimation in the Wild
SutureBot: A Precision Framework & Benchmark For Autonomous End-to-End Suturing
Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis
Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization
ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Rethinking Joint Maximum Mean Discrepancy for Visual Domain Adaptation
We Should Chart an Atlas of All the World's Models
RIGNO: A Graph-based Framework For Robust And Accurate Operator Learning For PDEs On Arbitrary Domains
DeepDiver: Adaptive Web-Search Intensity Scaling via Reinforcement Learning
Beyond Expectations: Quantile-Guided Alignment for Risk-Calibrated Language Models
FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities
What Matters in Data for DPO?
Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
DiffEye: Diffusion-Based Continuous Eye-Tracking Data Generation Conditioned on Natural Images
Variational Polya Tree
PhysDiff: A Physically-Guided Diffusion Model for Multivariate Time Series Anomaly Detection
Situat3DChange: Situated 3D Change Understanding Dataset for Multimodal Large Language Model
Risk Management for Mitigating Benchmark Failure Modes: BenchRisk
Individual Regret in Cooperative Stochastic Multi-Armed Bandits
Absence Bench: Language Models Can’t See What’s Missing
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
What Really is a Member? Discrediting Membership Inference via Poisoning
Codifying Character Logic in Role-Playing
Future-Aware End-to-End Driving: Bidirectional Modeling of Trajectory Planning and Scene Evolution
Flexible Realignment of Language Models
LABridge: Text–Image Latent Alignment Framework via Mean-Conditioned OU Process
BikeBench: A Bicycle Design Benchmark for Generative Models with Objectives and Constraints
ConViS-Bench: Estimating Video Similarity Through Semantic Concepts
Anchored Diffusion Language Model
Angular Steering: Behavior Control via Rotation in Activation Space
Can Multi-Modal LLMs Provide Live Step-by-Step Task Guidance?
Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization
When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs
Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities
Breaking the Frozen Subspace: Importance Sampling for Low-Rank Optimization in LLM Pretraining
A geometric framework for momentum-based optimizers for low-rank training
How many measurements are enough? Bayesian recovery in inverse problems with general distributions
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
ELECTRA: A Cartesian Network for 3D Charge Density Prediction with Floating Orbitals
CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models
URB - Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles
Real-World Adverse Weather Image Restoration via Dual-Level Reinforcement Learning with High-Quality Cold Start
Posterior Contraction for Sparse Neural Networks in Besov Spaces with Intrinsic Dimensionality
SECODEPLT: A Unified Benchmark for Evaluating the Security Risks and Capabilities of Code GenAI
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
Deep Edge Filter: Return of the Human-Crafted Layer in Deep Learning
UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting
PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement
Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
Vision Function Layer in Multimodal LLMs
Reinforced Active Learning for Large-Scale Virtual Screening with Learnable Policy Model
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients
Vocabulary-Guided Gait Recognition
Is Your Diffusion Model Actually Denoising?
Learning in Stackelberg Mean Field Games: A Non-Asymptotic Analysis
Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning
RF-Agent: Automated Reward Function Design via Language Agent Tree Search
Towards Predicting Any Human Trajectory In Context
NeuroRenderedFake: A Challenging Benchmark to Detect Fake Images Generated by Advanced Neural Rendering Methods
MATCH: Multi-faceted Adaptive Topo-Consistency for Semi-Supervised Histopathology Segmentation
Mamba Modulation: On the Length Generalization of Mamba Models
BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models
Streaming Stochastic Submodular Maximization with On-Demand User Requests
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
Image Super-Resolution with Guarantees via Conformalized Generative Models
FLAME: Fast Long-context Adaptive Memory for Event-based Vision
SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes
Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing
The Fragile Truth of Saliency: Improving LLM Input Attribution via Attention Bias Optimization
Constrained Feedback Learning for Non-Stationary Multi-Armed Bandits
Minimizing False-Positive Attributions in Explanations of Non-Linear Models
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench
REOBench: Benchmarking Robustness of Earth Observation Foundation Models
3D Interaction Geometric Pre-training for Molecular Relational Learning
LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions
Can NeRFs "See" without Cameras?
Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models
ControlFusion: A Controllable Image Fusion Network with Language-Vision Degradation Prompts
Tensor Product Attention Is All You Need
Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift
Counterfactual Identifiability via Dynamic Optimal Transport
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
WorldModelBench: Judging Video Generation Models As World Models
On the Edge of Memorization in Diffusion Models
VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction, Characterization and Recognition
KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge
REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing
High-dimensional neuronal activity from low-dimensional latent dynamics: a solvable model
Learning to Route: Per-Sample Adaptive Routing for Multimodal Multitask Prediction
The Indra Representation Hypothesis
Distillation Robustifies Unlearning
Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models
MyoChallenge 2024: A New Benchmark for Physiological Dexterity and Agility in Bionic Humans
Predictive Coding Enhances Meta-RL To Achieve Interpretable Bayes-Optimal Belief Representation Under Partial Observability
Generative Graph Pattern Machine
CellCLIP - Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning
Convergence Theorems for Entropy-Regularized and Distributional Reinforcement Learning
Scaling Physical Reasoning with the PHYSICS Dataset
On the Emergence of Linear Analogies in Word Embeddings
CellVerse: Do Large Language Models Really Understand Cell Biology?
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
From Counterfactuals to Trees: Competitive Analysis of Model Extraction Attacks
OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning
Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework
Architectural and Inferential Inductive Biases for Exchangeable Sequence Modeling
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories
BitMark: Watermarking Bitwise Autoregressive Image Generative Models
Actial: Activate Spatial Reasoning Ability of Multimodal Large Language Models
On the Stability of Graph Convolutional Neural Networks: A Probabilistic Perspective
Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning
3D-Prover: Diversity Driven Theorem Proving With Determinantal Point Processes
Reward-oriented Causal Representation Learning
Distribution Learning Meets Graph Structure Sampling
The Impact of Coreset Selection on Spurious Correlations and Group Robustness
MESS+: Dynamically Learned Inference-Time LLM Routing in Model Zoos with Service Level Guarantees
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning
Adaptive Variance Inflation in Thompson Sampling: Efficiency, Safety, Robustness, and Beyond
Learning Human-Like RL Agents Through Trajectory Optimization With Action Quantization
Concept Incongruence: An Exploration of Time and Death in Role Playing
Efficiently Scaling LLM Reasoning Programs with Certaindex
Looking Into the Water by Unsupervised Learning of the Surface Shape
GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
Differentiable Sparsity via $D$-Gating: Simple and Versatile Structured Penalization
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards
Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models
Private Set Union with Multiple Contributions
Generalized Category Discovery under Domain Shift: A Frequency Domain Perspective
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Learning Interactive World Model for Object-Centric Reinforcement Learning
CORAL: Disentangling Latent Representations in Long-Tailed Diffusion
Online Time Series Forecasting with Theoretical Guarantees
Two Causally Related Needles in a Video Haystack
Orthogonal Survival Learners for Estimating Heterogeneous Treatment Effects from Time-to-Event Data
Conformal Prediction for Causal Effects of Continuous Treatments
Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference
CARE: Decoding-Time Safety Alignment via Rollback and Introspection Intervention
Chain of Execution Supervision Promotes General Reasoning in Large Language Models
Parallel Scaling Law for Language Models
Graph Neural Network Based Action Ranking for Planning
Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff
On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
CCS: Controllable and Constrained Sampling with Diffusion Models via Initial Noise Perturbation
Whitened Score Diffusion: A Structured Prior for Imaging Inverse Problems
Neural Networks for Learnable and Scalable Influence Estimation of Instruction Fine-Tuning Data
Metropolis-Hastings Sampling for 3D Gaussian Reconstruction
Near-Exponential Savings for Population Mean Estimation with Active Learning
Shallow Diffuse: Robust and Invisible Watermarking through Low-Dim Subspaces in Diffusion Models
A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging
Heterogeneous Diffusion Structure Inference for Network Cascade
MetaDefense: Defending Fine-tuning based Jailbreak Attack Before and During Generation
A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective
How Ensembles of Distilled Policies Improve Generalisation in Reinforcement Learning
Blockwise Flow Matching: Improving Flow Matching Models For Efficient High-Quality Generation
Projection-Manifold Regularized Latent Diffusion for Robust General Image Fusion
Decomposing motor units through elimination for real-time intention driven assistive neurotechnology
FlexWorld: Progressively Expanding 3D Scenes for Flexible-View Exploration
Parameterized Synthetic Text Generation with SimpleStories
Sharp Matrix Empirical Bernstein Inequalities
Enhancing Optimizer Stability: Momentum Adaptation of The NGN Step-size
EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT
Instant Video Models: Universal Adapters for Stabilizing Image-Based Networks
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model
Differentiable Cyclic Causal Discovery Under Unmeasured Confounders
SegGraph: Leveraging Graphs of SAM Segments for Few-Shot 3D Part Segmentation
Temporal-Difference Variational Continual Learning
Abstract Rendering: Certified Rendering Under 3D Semantic Uncertainty
PolyGuard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset
Encoder-Decoder Diffusion Language Models for Efficient Training and Inference
FaCT: Faithful Concept Traces for Explaining Neural Network Decisions
The Computational Advantage of Depth in Learning High-Dimensional Hierarchical Targets
NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models
HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation
Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning
FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion
UniFoil: A Universal Dataset of Airfoils in Transitional and Turbulent Regimes for Subsonic and Transonic Flows
ESCA: Enabling Seamless Codec Avatar Execution through Algorithm and Hardware Co-Optimization for Virtual Reality
PANTHER: Generative Pretraining Beyond Language for Sequential User Behavior Modeling
Loquetier: A Virtualized Multi-LoRA Framework for Unified LLM Fine-tuning and Serving
A Provable Approach for End-to-End Safe Reinforcement Learning
Watermarking Autoregressive Image Generation
Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation
A Technical Report on “Erasing the Invisible”: The 2024 NeurIPS Competition on Stress Testing Image Watermarks
scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration
Deno-IF: Unsupervised Noisy Visible and Infrared Image Fusion Method
Scaling Unlocks Broader Generation and Deeper Functional Understanding of Proteins
Retro-R1: LLM-based Agentic Retrosynthesis
Random Search Neural Networks for Efficient and Expressive Graph Learning
DISC: Dynamic Decomposition Improves LLM Inference Scaling
Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning
metaTextGrad: Automatically optimizing language model optimizers
EddyFormer: Accelerated Neural Simulations of Three-Dimensional Turbulence at Scale
RePO: Understanding Preference Learning Through ReLU-Based Optimization
Unleashing Hour-Scale Video Training for Long Video-Language Understanding
MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Test-Time Scaling of Diffusion Models via Noise Trajectory Search
Breaking the Order Barrier: Off-Policy Evaluation for Confounded POMDPs
DEAL: Diffusion Evolution Adversarial Learning for Sim-to-Real Transfer
Learning to Think: Information-Theoretic Reinforcement Fine-Tuning for LLMs
Stackelberg Self-Annotation: A Robust Approach to Data-Efficient LLM Alignment
DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
WaLRUS: Wavelets for Long range Representation Using State Space Methods
ComPO: Preference Alignment via Comparison Oracles
Differentiable Constraint-Based Causal Discovery
State Size Independent Statistical Error Bound for Discrete Diffusion Models
Doubly-Robust Estimation of Counterfactual Policy Mean Embeddings
VLMLight: Safety-Critical Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning Architecture
Securing the Language of Life: Inheritable Watermarks from DNA Language Models to Proteins
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding
Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning
REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving
Dynamic Algorithm for Explainable $k$-medians Clustering under $\ell_p$ Norm
KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows
Efficient Quadratic Corrections for Frank-Wolfe Algorithms
StyleGuard: Preventing Text-to-Image-Model-based Style Mimicry Attacks by Style Perturbations
SimWorld: An Open-ended Simulator for Agents in Physical and Social Worlds
Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning
Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs
LILO: Learning to Reason at the Frontier of Learnability
SING: SDE Inference via Natural Gradients
VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents
Characterizing the Expressivity of Fixed-Precision Transformer Language Models
Lyapunov-Stable Adaptive Control for Multimodal Concept Drift
Open-Vocabulary Part Segmentation via Progressive and Boundary-Aware Strategy
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
Efficient Rectified Flow for Image Fusion
MoBA: Mixture of Block Attention for Long-Context LLMs
Uncertainty-Guided Exploration for Efficient AlphaZero Training
Real-World Reinforcement Learning of Active Perception Behaviors
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
MUSTAFAR: Promoting Unstructured Sparsity for KV Cache Pruning in LLM Inference
On the Role of Hidden States of Modern Hopfield Network in Transformer
Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
Monotone and Separable Set Functions: Characterizations and Neural Models
Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models
TopER: Topological Embeddings in Graph Representation Learning
Generative Data Augmentation via Diffusion Distillation, Adversarial Alignment, and Importance Reweighting
Boosting the Uniqueness of Neural Networks Fingerprints with Informative Triggers
Escaping Collapse: The Strength of Weak Data for Large Language Model Training
DynaAct: Large Language Model Reasoning with Dynamic Action Spaces
Minimax-Optimal Univariate Function Selection in Sparse Additive Models: Rates, Adaptation, and the Estimation-Selection Gap
Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Associations
Self-Guided Hierarchical Exploration for Generalist Foundation Model Web Agents
MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes
Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
Can Dependencies Induced by LLM-Agent Workflows Be Trusted?
Stable Coresets via Posterior Sampling: Aligning Induced and Full Loss Landscapes
CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs
Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models
Privacy amplification by random allocation
An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction
Fast-Slow Thinking GRPO for Large Vision-Language Model Reasoning
Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models
Vanish into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2
Data-Driven Performance Guarantees for Classical and Learned Optimizers
Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
CALM-PDE: Continuous and Adaptive Convolutions for Latent Space Modeling of Time-dependent PDEs
Unsupervised Learning for Optimal Transport plan prediction between unbalanced graphs
Stab-SGD: Noise-Adaptivity in Smooth Optimization with Stability Ratios
Intermediate Domain Alignment and Morphology Analogy for Patent-Product Image Retrieval
Balancing Performance and Costs in Best Arm Identification
Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models
Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks
On Reasoning Strength Planning in Large Reasoning Models
Simultaneous Swap Regret Minimization via KL-Calibration
Towards Effective Federated Graph Foundation Model via Mitigating Knowledge Entanglement
Exploration from a Primal-Dual Lens: Value-Incentivized Actor-Critic Methods for Sample-Efficient Online RL
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
Stable Matching with Ties: Approximation Ratios and Learning
Incomplete Multi-view Clustering via Hierarchical Semantic Alignment and Cooperative Completion
Non-monotone Submodular Optimization: $p$-Matchoid Constraints and Fully Dynamic Setting
Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning
Learning to Generate Human-Human-Object Interactions from Textual Descriptions
4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos
Efficient Multimodal Dataset Distillation via Generative Models
BrainODE: Neural Shape Dynamics for Age- and Disease-aware Brain Trajectories
Optimal and Provable Calibration in High-Dimensional Binary Classification: Angular Calibration and Platt Scaling
SuperCLIP: CLIP with Simple Classification Supervision
T2SMark: Balancing Robustness and Diversity in Noise-as-Watermark for Diffusion Models
Fast Inference for Augmented Large Language Models
From Programs to Poses: Factored Real-World Scene Generation via Learned Program Libraries
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
Faithful Dynamic Imitation Learning from Human Intervention with Dynamic Regret Minimization
On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models
Robust Contextual Pricing
nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
Dyn-O: Building Structured World Models with Object-Centric Representations
What Do Latent Action Models Actually Learn?
GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving
Tackling Feature-Classifier Mismatch in Federated Learning via Prompt-Driven Feature Transformation
Robust and Computation-Aware Gaussian Processes
Enhancing CLIP Robustness via Cross-Modality Alignment
Test Time Scaling for Neural Processes
Compact Memory for Continual Logistic Regression
Reliable Decision‑Making via Calibration‑Oriented Retrieval‑Augmented Generation
Procurement Auctions with Predictions: Improved Frugality for Facility Location
Benign Overfitting in Single-Head Attention
Smooth and Flexible Camera Movement Synthesis via Temporal Masked Generative Modeling
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
On Traceability in $\ell_p$ Stochastic Convex Optimization
MultiHuman-Testbench: Benchmarking Image Generation for Multiple Humans
Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows
TADA: Improved Diffusion Sampling with Training-free Augmented DynAmics
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
UniViT: Unifying Image and Video Understanding in One Vision Encoder
Consensus-Robust Transfer Attacks via Parameter and Representation Perturbations
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Support Vector Generation: Kernelizing Large Language Models for Efficient Zero‑Shot NLP
Online Learning in the Repeated Mediated Newsvendor Problem
Neural Fractional Attention Differential Equations
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
A Multimodal Benchmark for Framing of Oil & Gas Advertising and Potential Greenwashing Detection
Continual Multimodal Contrastive Learning
dKV-Cache: The Cache for Diffusion Language Models
Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Linear Extrapolation
LeapFactual: Reliable Visual Counterfactual Explanation Using Conditional Flow Matching
Dual-Comb Ghost Imaging with Transformer-Based Reconstruction for Optical Fiber Endomicroscopy
ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos
Pareto Optimal Risk-Agnostic Distributional Bandits with Heavy-Tail Rewards
Exploring Structural Degradation in Dense Representations for Self-supervised Learning
Versatile Transferable Unlearnable Example Generator
SDPGO: Efficient Self-Distillation Training Meets Proximal Gradient Optimization
GeoLink: Empowering Remote Sensing Foundation Model with OpenStreetMap Data
ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search
Memory-Efficient Training with In-Place FFT Implementation
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks
Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws
VideoMAR: Autoregressive Video Generation with Continuous Tokens
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations
ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining
HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance
On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization
Robust Reinforcement Learning in Finance: Modeling Market Impact with Elliptic Uncertainty Sets
Online Functional Tensor Decomposition via Continual Learning for Streaming Data Completion
Learning Skill-Attributes for Transferable Assessment in Video
MutualVPR: A Mutual Learning Framework for Resolving Supervision Inconsistencies via Adaptive Clustering
A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search
Omnipresent Yet Overlooked: Heat Kernels in Combinatorial Bayesian Optimization
Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization
Optimize the Unseen - Fast NeRF Cleanup with Free Space Prior
Deep Compositional Phase Diffusion for Long Motion Sequence Generation
LaRes: Evolutionary Reinforcement Learning with LLM-based Adaptive Reward Search
Self-Evolving Pseudo-Rehearsal for Catastrophic Forgetting with Task Similarity in LLMs
Inv-Entropy: A Fully Probabilistic Framework for Uncertainty Quantification in Language Models
Benford’s Curse: Tracing Digit Bias to Numerical Hallucination in LLMs
Sparse Diffusion Autoencoder for Test-time Adapting Prediction of Complex Systems
SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models
Uni-LoRA: One Vector is All You Need
RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards
Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones
Learning Expandable and Adaptable Representations for Continual Learning
Bio-Inspired Image Restoration
On the Value of Cross-Modal Misalignment in Multimodal Representation Learning
How Well Can Differential Privacy Be Audited in One Run?
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning
Emergence and scaling laws in SGD learning of shallow neural networks
Partial Information Decomposition via Normalizing Flows in Latent Gaussian Distributions
PBR-SR: Mesh PBR Texture Super Resolution from 2D Image Priors
MV-CoLight: Efficient Object Compositing with Consistent Lighting and Shadow Generation
VITRIX-UniViTAR: Unified Vision Transformer with Native Resolution
Improved Algorithms for Overlapping and Robust Clustering of Edge-Colored Hypergraphs: An LP-Based Combinatorial Approach
Tri-MARF: A Tri-Modal Multi-Agent Responsive Framework for Comprehensive 3D Object Annotation
Rao-Blackwellised Reparameterisation Gradients
Seeing the Arrow of Time in Large Multimodal Models
Checklists Are Better Than Reward Models For Aligning Language Models
Activated LoRA: Fine-tuned LLMs for Intrinsics
UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification
Subsampled Ensemble Can Improve Generalization Tail Exponentially
Semantic Representation Attack against Aligned Large Language Models
Fairness-aware Bayes Optimal Functional Classification
OVS Meets Continual Learning: Towards Sustainable Open-Vocabulary Segmentation
FNOPE: Simulation-based inference on function spaces with Fourier Neural Operators
Evolutionary Multi-View Classification via Eliminating Individual Fitness Bias
AdaptDel: Adaptable Deletion Rate Randomized Smoothing for Certified Robustness
Graph Alignment via Birkhoff Relaxation
Pixel Reasoner: Incentivizing Pixel Space Reasoning via Curiosity-Driven Reinforcement Learning
Visual Sync: Multi‑Camera Synchronization via Cross‑View Object Motion
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
General-Reasoner: Advancing LLM Reasoning Across All Domains
Self-Training with Dynamic Weighting for Robust Gradual Domain Adaptation
Scaling and context steer LLMs along the same computational path as the human brain
ST$^2$360D: Spatial-to-Temporal Consistency for Training-free 360 Monocular Depth Estimation
Future Link Prediction Without Memory or Aggregation
Tight High-Probability Bounds for Nonconvex Heavy-Tailed Scenario under Weaker Assumptions
Scalable Fingerprinting of Large Language Models
Self-Adapting Language Models
Toward Interpretable Evaluation Measures for Time Series Segmentation
MEIcoder: Decoding Visual Stimuli from Neural Activity by Leveraging Most Exciting Inputs
SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation
Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Self-Improving Embodied Foundation Models
Deployment Efficient Reward-Free Exploration with Linear Function Approximation
Right for the Right Reasons: Avoiding Reasoning Shortcuts via Prototypical Neurosymbolic AI
Learning Efficient Fuse-and-Refine for Feed-Forward 3D Gaussian Splatting
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Dimensional Collapse in VQVAEs: Evidence and Remedies
Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection
Leveraging Conditional Dependence for Efficient World Model Denoising
CamEdit: Continuous Camera Parameter Control for Photorealistic Image Editing
InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy
WearVQA: A Visual Question Answering Benchmark for Wearables in Egocentric Authentic Real-world scenarios
GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling
ROGR: Relightable 3D Objects using Generative Relighting
CLEAR: Command Level Annotated Dataset for Ransomware Detection
MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference
On the Bias of Next-Token Predictors Toward Systematically Inefficient Reasoning: A Shortest-Path Case Study
QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks
Optimal Regret of Bandits under Differential Privacy
Generalizable Domain Adaptation for Sim-and-Real Policy Co-Training
Functional data analysis for multivariate distributions through Wasserstein slicing
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation
SGCD: Stain-Guided CycleDiffusion for Unsupervised Domain Adaptation of Histopathology Image Classification
MisoDICE: Multi-Agent Imitation from Mixed-Quality Demonstrations
Resolution of Simpson's paradox via the common cause principle
Joint Modeling of fMRI and EEG Imaging Using Ordinary Differential Equation-Based Hypergraph Neural Networks
Scaling Image Geo-Localization to Continent Level
Cascaded Language Models for Cost-Effective Human–AI Decision-Making
On the Hardness of Conditional Independence Testing In Practice
AutoHood3D: A Multi‑Modal Benchmark for Automotive Hood Design and Fluid–Structure Interaction
Simulation-Based Inference for Adaptive Experiments
From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
MAESTRO: Adaptive Sparse Attention and Robust Learning for Multimodal Dynamic Time Series
With Limited Data for Multimodal Alignment, Let the STRUCTURE Guide You
RADAR: Benchmarking Language Models on Imperfect Tabular Data
Vicinity-Guided Discriminative Latent Diffusion for Privacy-Preserving Domain Adaptation
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
AutoJudge: Judge Decoding Without Manual Annotation
Auto-Compressing Networks
Adaptively Coordinating with Novel Partners via Learned Latent Strategies
Composing Global Solutions to Reasoning Tasks via Algebraic Objects in Neural Nets
Let Brain Rhythm Shape Machine Intelligence for Connecting Dots on Graphs
Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design via Constrained RL
Value Gradient Guidance for Flow Matching Alignment
Deep Nonlinear Sufficient Dimension Reduction
Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation
Estimation and Inference in Distributional Reinforcement Learning
Unmasking Puppeteers: Leveraging Biometric Leakage to Expose Impersonation in AI-Based Videoconferencing
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces
Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the compact case
HairFree: Compositional 2D Head Prior for Text-Driven 360° Bald Texture Synthesis
Embracing Trustworthy Brain-Agent Collaboration as Paradigm Extension for Intelligent Assistive Technologies
On the Existence and Complexity of Core-Stable Data Exchanges
Position: Benchmarking is Broken - Don't Let AI be Its Own Judge
SingRef6D: Monocular Novel Object Pose Estimation with a Single RGB Reference
HeroFilter: Adaptive Spectral Graph Filter for Varying Heterophilic Relations
The Narrow Gate: Localized Image-Text Communication in Native Multimodal Models
MLLM-ISU: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models based Intrusion Scene Understanding
Fairness-Regularized Online Optimization with Switching Costs
MARS-VFL: A Unified Benchmark for Vertical Federated Learning with Realistic Evaluation
Towards Syn-to-Real IQA: A Novel Perspective on Reshaping Synthetic Data Distributions
Learning “Partner-Aware” Collaborators in Multi-Party Collaboration
EnCompass: Enhancing Agent Programming with Search Over Program Execution Paths
CoreaSpeech: Korean Speech Corpus via JAMO-based Coreset Selection for Efficient and Robust Korean Speech Generation
InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention
Reparameterized LLM Training via Orthogonal Equivalence Transformation
Unbalanced Optimal Total Variation Transport: A Theoretical Approach to Spatial Resource Allocation Problems
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity
Structure-Aware Fusion with Progressive Injection for Multimodal Molecular Representation Learning
ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization
Deep learning for continuous-time stochastic control with jumps
Surprise3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes
A Cramér–von Mises Approach to Incentivizing Truthful Data Sharing
Preference Distillation via Value based Reinforcement Learning
BRACE: A Benchmark for Robust Audio Caption Quality Evaluation
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization
SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
Cross City Traffic Flow Generation via Retrieval Augmented Diffusion Model
CURV: Coherent Uncertainty-Aware Reasoning in Vision-Language Models for X-Ray Report Generation
LoSplit: Loss-Guided Dynamic Split for Training-Time Defense Against Graph Backdoor Attacks
RoFt-Mol: Benchmarking Robust Fine-tuning with Molecular Graph Foundation Models
GTR-Loc: Geospatial Text Regularization Assisted Outdoor LiDAR Localization
Continual Release Moment Estimation with Differential Privacy
SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering
EgoExoBench: A Benchmark for First- and Third-person View Video Understanding in MLLMs
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
RGNMR: A Gauss-Newton method for robust matrix completion with theoretical guarantees
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
RAM-W600: A Multi-Task Wrist Dataset and Benchmark for Rheumatoid Arthritis
Uncertainty-Informed Meta Pseudo Labeling for Surrogate Modeling with Limited Labeled Data
VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning
Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
Results of the Big ANN: NeurIPS’23 competition
MMCSBench: A Fine-Grained Benchmark for Large Vision-Language Models in Camouflage Scenes
UniEdit: A Unified Knowledge Editing Benchmark for Large Language Models
DeltaFormer: Unlock the state space of Transformer
DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding
ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation
Spiking Meets Attention: Efficient Remote Sensing Image Super-Resolution with Attention Spiking Neural Networks
Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis
Bridging Theory and Practice in Link Representation with Graph Neural Networks
Entropy-Calibrated Label Distribution Learning
Multimodal Disease Progression Modeling via Spatiotemporal Disentanglement and Multiscale Alignment
The Rashomon Set Has It All: Analyzing Trustworthiness of Trees under Multiplicity
Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image
ImgEdit: A Unified Image Editing Dataset and Benchmark
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
AHa-Bench: Benchmarking Audio Hallucinations in Large Audio-Language Models
Task-Specific Data Selection for Instruction Tuning via Monosemantic Neuronal Activations
Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
Inductive Domain Transfer In Misspecified Simulation-Based Inference
Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation
Leader360V: A Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment
Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
Fuse2Match: Training-Free Fusion of Flow, Diffusion, and Contrastive Models for Zero-Shot Semantic Matching
Prompt-Guided Alignment with Information Bottleneck Makes Image Compression Also a Restorer
REP: Resource-Efficient Prompting for Rehearsal-Free Continual Learning
Structure-Aware Cooperative Ensemble Evolutionary Optimization on Combinatorial Problems with Multimodal Large Language Models
Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging
Additive Models Explained: A Computational Complexity Approach
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
A Hierarchy of Graphical Models for Counterfactual Inferences
Counterfactual Image Editing with Disentangled Causal Latent Space
Sound Logical Explanations for Mean Aggregation Graph Neural Networks
Confounding Robust Deep Reinforcement Learning: A Causal Approach
Smooth Quadratic Prediction Markets
Decomposing stimulus-specific sensory neural information via diffusion models
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Color Conditional Generation with Sliced Wasserstein Guidance
Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable
Tail-Optimized Caching for LLM Inference
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
The Logical Expressiveness of Temporal GNNs via Two-Dimensional Product Logics
Web-Scale Collection of Video Data for 4D Animal Reconstruction
Siegel Neural Networks
PMLF: A Physics-Guided Multiscale Loss Framework for Structurally Heterogeneous Time Series
Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Contextual Tokenization for Graph Inverted Indices
Multi-Objective One-Shot Pruning for Large Language Models
NOBLE - Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models
SGAR: Structural Generative Augmentation for 3D Human Motion Retrieval
ViSPLA: Visual Iterative Self-Prompting for Language-Guided 3D Affordance Learning
Hyperbolic Fine-Tuning for Large Language Models
AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners
Modeling Dynamic Neural Activity by combining Naturalistic Video Stimuli and Stimulus-independent Latent Factors
Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering
Temporal Chain of Thought: Long-Video Understanding by Thinking in Frames
Horizon Reduction Makes RL Scalable
Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining
Staggered Environment Resets Improve Massively Parallel On-Policy Reinforcement Learning
Bubbleformer: Forecasting Boiling with Transformers
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness
Online Strategic Classification With Noise and Partial Feedback
Sparse Gaussian Processes: Structured Approximations and Power-EP Revisited
Improved Bounds for Swap Multicalibration and Swap Omniprediction
State-Covering Trajectory Stitching for Diffusion Planners
Generative Distribution Embeddings
Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need
Improving Progressive Generation with Decomposable Flow Matching
EvoLM: In Search of Lost Language Model Training Dynamics
GradMetaNet: An Equivariant Architecture for Learning on Gradients
Beyond Last-Click: An Optimal Mechanism for Ad Attribution
Sampled Estimators For Softmax Must Be Biased
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
Venus-MAXWELL: Efficient Learning of Protein-Mutation Stability Landscapes using Protein Language Models
Fairshare Data Pricing via Data Valuation for Large Language Models
Subgraph Federated Learning via Spectral Methods
Sequential Monte Carlo for Policy Optimization in Continuous POMDPs
Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts
Online Segment Any 3D Thing as Instance Tracking
PREAMBLE: Private and Efficient Aggregation via Block Sparse Vectors
Eulerian Neural Network Informed by Chemical Transport for Air Quality Forecasting
Eluder dimension: localise it!
Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control
Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search
DC4GS: Directional Consistency-Driven Adaptive Density Control for 3D Gaussian Splatting
Rethinking Entropy in Test-Time Adaptation: The Missing Piece from Energy Duality
Temporal Logic-Based Multi-Vehicle Backdoor Attacks against Offline RL Agents in End-to-end Autonomous Driving
Non-Markovian Discrete Diffusion with Causal Language Models
Rethinking Tokenized Graph Transformers for Node Classification
Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools
CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates
Noisy Multi-Label Learning through Co-Occurrence-Aware Diffusion
Role-aware Multi-agent Reinforcement Learning for Coordinated Emergency Traffic Control
Correcting misinterpretations of additive models
Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks
Show-o2: Improved Native Unified Multimodal Models
DBLoss: Decomposition-based Loss Function for Time Series Forecasting
BioCG: Constrained Generative Modeling for Biochemical Interaction Prediction
FedEL: Federated Elastic Learning for Heterogeneous Devices
E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis
Dynamic Shadow Unveils Invisible Semantics for Video Outpainting
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Simple and Optimal Sublinear Algorithms for Mean Estimation
Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings
SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications
RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills
Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos
Test3R: Learning to Reconstruct 3D at Test Time
Improved Algorithms for Fair Matroid Submodular Maximization
Defining and Discovering Hyper-meta-paths for Heterogeneous Hypergraphs
Many Minds, One Goal: Time Series Forecasting via Sub-task Specialization and Inter-agent Cooperation
Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Breaking the Discretization Barrier of Continuous Physics Simulation Learning
DPAIL: Training Diffusion Policy for Adversarial Imitation Learning without Policy Optimization
OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain
Finite-Time Bounds for Average-Reward Fitted Q-Iteration
Clustering via Hedonic Games: New Concepts and Algorithms
DevFD: Developmental Face Forgery Detection by Learning Shared and Orthogonal LoRA Subspaces
Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning
vHector and HeisenVec: Scalable Vector Graphics Generation Through Large Language Models
Fourier Analysis Network
Selective Omniprediction and Fair Abstention
MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search
One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs
MAT-Agent: Adaptive Multi-Agent Training Optimization
DictPFL: Efficient and Private Federated Learning on Encrypted Gradients
The Boundaries of Fair AI in Medical Image Prognosis: A Causal Perspective
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Tight analyses of first-order methods with error feedback
GOATex: Geometry & Occlusion-Aware Texturing
Multimodal LiDAR-Camera Novel View Synthesis with Unified Pose-free Neural Fields
Multi-dataset Joint Pre-training of Emotional EEG Enables Generalizable Affective Computing
CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming
PINN Balls: Scaling Second-Order Methods for PINNs with Domain Decomposition and Adaptive Sampling
FuncGenFoil: Airfoil Generation and Editing Model in Function Space
LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents
Guard Me If You Know Me: Protecting Specific Face-Identity from Deepfakes
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
Optimal Rates for Generalization of Gradient Descent for Deep ReLU Classification
Each Complexity Deserves a Pruning Policy
Aha! - Predicting What Matters Next: Online Highlight Detection Without Looking Ahead
RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes
Table2LaTeX-RL: High-Fidelity LaTeX Code Generation from Table Images via Reinforced Multimodal Language Models
Which Algorithms Have Tight Generalization Bounds?
Ultra-high Resolution Watermarking Framework Resistant to Extreme Cropping and Scaling
Enhancing Consistency of Flow-Based Image Editing through Kalman Control
Variance-Reduced Long-Term Rehearsal Learning with Quadratic Programming Reformulation
Robust Transfer Learning with Unreliable Source Data
Multi-View Oriented GPLVM: Expressiveness and Efficiency
PLEIADES: Building Temporal Kernels with Orthogonal Polynomials
DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models
Learning Interestingness in Automated Mathematical Theory Formation
DPA: A one-stop metric to measure bias amplification in classification datasets
See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction
Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?
Directed-Tokens: A Robust Multi-Modality Alignment Approach to Large Language-Vision Models
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Prior Forgetting and In-Context Overfitting
OOD-Barrier: Build a Middle-Barrier for Open-Set Single-Image Test Time Adaptation via Vision Language Models
SEGA: Shaping Semantic Geometry for Robust Hashing under Noisy Supervision
Certifying Stability of Reinforcement Learning Policies using Generalized Lyapunov Functions
From Replication to Redesign: Exploring Pairwise Comparisons for LLM-Based Peer Review
Homogeneous Keys, Heterogeneous Values: Exploiting Local KV Cache Asymmetry for Long-Context LLMs
LoRA vs Full Fine-tuning: An Illusion of Equivalence
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful
Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
KL Penalty Control via Perturbation for Direct Preference Optimization
Localist Topographic Expert Routing: A Barrel Cortex-Inspired Modular Network for Sensorimotor Processing
Stackelberg Learning with Outcome-based Payment
Controlled Visual Hallucination via Thalamus-Driven Decoupling Network for Domain Adaptation of Black-Box Predictors
Counteractive RL: Rethinking Core Principles for Efficient and Scalable Deep Reinforcement Learning
Sampling 3D Molecular Conformers with Diffusion Transformers
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents
Effects of Dropout on Performance in Long-range Graph Learning Tasks
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations
On Fairness of Unified Multimodal Large Language Model for Image Generation
MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching
SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism
Manipulating 3D Molecules in a Fixed-Dimensional E(3)-Equivariant Latent Space
Let LRMs Break Free from Overthinking via Self-Braking Tuning
DeCaFlow: A deconfounding causal generative model
UrbanIng-V2X: A Large-Scale Multi-Vehicle, Multi-Infrastructure Dataset Across Multiple Intersections for Cooperative Perception
Disentangled Representation Learning via Modular Compositional Bias
The Price of Opportunity Fairness in Matroid Allocation Problems
Geometric Algebra-Enhanced Bayesian Flow Network for RNA Inverse Design
Unifying Attention Heads and Task Vectors via Hidden State Geometry in In-Context Learning
Bi-Directional Communication-Efficient Stochastic FL via Remote Source Generation
Extragradient Method for $(L_0, L_1)$-Lipschitz Root-finding Problems
OmniTry: Virtual Try-On Anything without Masks
S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models
MobileODE: An Extra Lightweight Network
Purest Quantum State Identification
UGM2N: An Unsupervised and Generalizable Mesh Movement Network via M-Uniform Loss
Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment
Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation
Principled Model Routing for Unknown Mixtures of Source Domains
Learning to Integrate Diffusion ODEs by Averaging the Derivatives
The $\varphi$ Curve: The Shape of Generalization through the Lens of Norm-based Capacity Control
Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
Native-Resolution Image Synthesis
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
DON’T NEED RETRAINING: A Mixture of DETR and Vision Foundation Models for Cross-Domain Few-Shot Object Detection
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
ReSim: Reliable World Simulation for Autonomous Driving
HiMoLE: Towards OOD-Robust LoRA via Hierarchical Mixture of Experts
Self-Challenging Language Model Agents
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
Spike-RetinexFormer: Rethinking Low-light Image Enhancement with Spiking Neural Networks
TrajMamba: An Efficient and Semantic-rich Vehicle Trajectory Pre-training Model
Representation-Level Counterfactual Calibration for Debiased Zero-Shot Recognition
AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws
Gate to the Vessel: Residual Experts Restore What SAM Overlooks
Learning Robust Spectral Dynamics for Temporal Domain Generalization
CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding
A Driving-Style-Adaptive Framework for Vehicle Trajectory Prediction
CHASM: Unveiling Covert Advertisements on Chinese Social Media
Block Coordinate Descent for Neural Networks Provably Finds Global Minima
ShapeCraft: LLM Agents for Structured, Textured and Interactive 3D Modeling
Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees
FlashMD: long-stride, universal prediction of molecular dynamics
What do you know? Bayesian knowledge inference for navigating agents
Empowering Decision Trees via Shape Function Branching
Block-Biased Mamba for Long-Range Sequence Processing
To Think or Not To Think: A Study of Thinking in Rule-Based Visual Reinforcement Fine-Tuning
Segment then Splat: Unified 3D Open-Vocabulary Segmentation via Gaussian Splatting
Nearly-Linear Time and Massively Parallel Algorithms for $k$-anonymity
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies
Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods
Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation
Rescaled Influence Functions: Accurate Data Attribution in High Dimension
Latent Chain-of-Thought for Visual Reasoning
DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning with Dual-Pathway Global-Local Fusion
Does Representation Guarantee Welfare?
BlurDM: A Blur Diffusion Model for Image Deblurring
Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models
Multi-Agent Reinforcement Learning with Communication-Constrained Priors
Query-Efficient Locally Private Hypothesis Selection via the Scheffe Graph
Model Reconciliation via Cost-Optimal Explanations in Probabilistic Logic Programming
A Beyond-Worst-Case Analysis of Greedy k-means++
Improved Regret and Contextual Linear Extension for Pandora's Box and Prophet Inequality
Restricted Global-Aware Graph Filters Bridging GNNs and Transformer for Node Classification
Exploring the Noise Robustness of Online Conformal Prediction
DAA: Amplifying Unknown Discrepancy for Test-Time Discovery
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding
Self-Verifying Reflection Helps Transformers with CoT Reasoning
Brain-like Variational Inference
Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers?
GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility
Lifelong Test-Time Adaptation via Online Learning in Tracked Low-Dimensional Subspace
Towards Reliable LLM-based Robots Planning via Combined Uncertainty Estimation
Constrained Discrete Diffusion
Train on Pins and Test on Obstacles for Rectilinear Steiner Minimum Tree
H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios
BraVE: Offline Reinforcement Learning for Discrete Combinatorial Action Spaces
Beyond Node-Centric Modeling: Sketching Signed Networks with Simplicial Complexes
KScope: A Framework for Characterizing the Knowledge Status of Language Models
From Contextual Combinatorial Semi-Bandits to Bandit List Classification: Improved Sample Complexity with Sparse Rewards
Vid-SME: Membership Inference Attacks against Large Video Understanding Models
Localizing Knowledge in Diffusion Transformers
DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration
OpenMMEgo: Enhancing Egocentric Understanding for LMMs with Open Weights and Data
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
Forging Time Series with Language: A Large Language Model Approach to Synthetic Data Generation
Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains
3DOT: Texture Transfer for 3DGS Objects from a Single Reference Image
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Where Does It Exist from the Low-Altitude: Spatial Aerial Video Grounding
Fast Computation and Optimization for Opinion-Based Quantities of Friedkin-Johnsen Model
MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
A2Seek: Towards Reasoning-Centric Benchmark for Aerial Anomaly Understanding
DUAL: Learning Diverse Kernels for Aggregated Two-sample and Independence Testing
Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards
Contact Map Transfer with Conditional Diffusion Model for Generalizable Dexterous Grasp Generation
MiniMax-Remover: Taming Bad Noise Helps Video Object Removal
VLM-R³: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
Mixture of Scope Experts at Test: Generalizing Deeper Graph Neural Networks with Shallow Variants
CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting
THUNDER: Tile-level Histopathology image UNDERstanding benchmark
Quantifying and Alleviating Co-Adaptation in Sparse-View 3D Gaussian Splatting
Learning Task-Agnostic Representations through Multi-Teacher Distillation
Efficient Representativeness-Aware Coreset Selection
The Parameterized Complexity of Computing the VC-Dimension
Scalable Signature Kernel Computations via Local Neumann Series Expansions
Why Do Multi-Agent LLM Systems Fail?
Quantum Visual Fields with Neural Amplitude Encoding
Reinforcement learning for one-shot DAG scheduling with comparability identification and dense reward
Inverse Methods for Missing Data Imputation
Meta Guidance: Incorporating Inductive Biases into Deep Time Series Imputers
DEGauss: Defending Against Malicious 3D Editing for Gaussian Splatting
Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling
Locally Optimal Private Sampling: Beyond the Global Minimax
Attribution-Driven Adaptive Token Pruning for Transformers
Bootstrap Your Uncertainty: Adaptive Robust Classification Driven by Optimal-Transport
HyperET: Efficient Training in Hyperbolic Space for Multi-modal Large Language Models
Attention-based clustering
Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task
Co-Regularization Enhances Knowledge Transfer in High Dimensions
Glocal Information Bottleneck for Time Series Imputation
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks
FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA
Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking
Do Neural Networks Need Gradient Descent to Generalize? A Theoretical Study
Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
Stochastic Process Learning via Operator Flow Matching
DOVTrack: Data-Efficient Open-Vocabulary Tracking
Uncertainty-quantified Rollout Policy Adaptation for Unlabelled Cross-domain Video Temporal Grounding
QuARI: Query Adaptive Retrieval Improvement
Prior-Guided Diffusion Planning for Offline Reinforcement Learning
Understanding and Mitigating Numerical Sources of Nondeterminism in LLM Inference
Partition-Then-Adapt: Combating Prediction Bias for Reliable Multi-Modal Test-Time Adaptation
Novel Class Discovery for Point Cloud Segmentation via Joint Learning of Causal Representation and Reasoning
FP4 All the Way: Fully Quantized Training of Large Language Models
Neural-Driven Image Editing
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Evaluating LLM-contaminated Crowdsourcing Data Without Ground Truth
A Black-Box Debiasing Framework for Conditional Sampling
Vinci: Deep Thinking in Text-to-Image Generation using Unified Model with Reinforcement Learning
Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Seeing through Uncertainty: Robust Task-Oriented Optimization in Visual Navigation
Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting
Fortifying Time Series: DTW-Certified Robust Anomaly Detection
AlignedGen: Aligning Style Across Generated Images
SAP: Exact Sorting in Splatting via Screen-Aligned Primitives
QiMeng-MuPa: Mutual-Supervised Learning for Sequential-to-Parallel Code Translation
GSAlign: Geometric and Semantic Alignment Network for Aerial-Ground Person Re-Identification
Enhancing the Outcome Reward-based RL Training of MLLMs with Self-Consistency Sampling
An Adaptive Quantum Circuit of Dempster's Rule of Combination for Uncertain Pattern Classification
Towards Unsupervised Domain Bridging via Image Degradation in Semantic Segmentation
An Information-theoretical Framework for Understanding Out-of-distribution Detection with Pretrained Vision-Language Models
ZigzagPointMamba: Spatial-Semantic Mamba for Point Cloud Understanding
Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
MixPrompt: Efficient Mixed Prompting for Multimodal Semantic Segmentation
Real-Time Scene-Adaptive Tone Mapping for High-Dynamic Range Object Detection
Discovering Data Structures: Nearest Neighbor Search and Beyond
U-CAN: Unsupervised Point Cloud Denoising with Consistency-Aware Noise2Noise Matching
Orient Anything V2: Unifying Orientation and Rotation Understanding
COME: Adding Scene-Centric Forecasting Control to Occupancy World Model
Foundation Cures Personalization: Improving Personalized Models’ Prompt Consistency via Hidden Foundation Knowledge
OPTFM: A Scalable Multi-View Graph Transformer for Hierarchical Pre-Training in Combinatorial Optimization
Factor Decorrelation Enhanced Data Removal from Deep Predictive Models
Preference Learning with Lie Detectors can Induce Honesty or Evasion
Wavy Transformer
Efficient Algorithms for Robust and Partial Semi-Discrete Optimal Transport
Minimax Adaptive Online Nonparametric Regression over Besov spaces
The Unseen Threat: Residual Knowledge in Machine Unlearning under Perturbed Samples
STNet: Spectral Transformation Network for Solving Operator Eigenvalue Problem
BundleFlow: Deep Menus for Combinatorial Auctions by Diffusion-Based Optimization
AneuG-Flow: A Large-Scale Synthetic Dataset of Diverse Intracranial Aneurysm Geometries and Hemodynamics
VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
Scalable Cross-View Sample Alignment for Multi-View Clustering with View Structure Similarity
Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery
Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs
Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation
Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
Disentangled Cross-Modal Representation Learning with Enhanced Mutual Supervision
EAG3R: Event-Augmented 3D Geometry Estimation for Dynamic and Extreme-Lighting Scenes
Interactive and Hybrid Imitation Learning: Provably Beating Behavior Cloning
Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks
Follow-the-Perturbed-Leader Nearly Achieves Best-of-Both-Worlds for the m-Set Semi-Bandit Problems
Pro3D-Editor: A Progressive Framework for Consistent and Precise 3D Editing
AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?
Estimation of Stochastic Optimal Transport Maps
UFT: Unifying Supervised and Reinforcement Fine-Tuning
From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
JADE: Joint Alignment and Deep Embedding for Multi-Slice Spatial Transcriptomics
Modeling Neural Activity with Conditionally Linear Dynamical Systems
Measuring what Matters: Construct Validity in Large Language Model Benchmarks
Dense Metric Depth Estimation via Event-based Differential Focus Volume Prompting
Rotary Masked Autoencoders are Versatile Learners
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
Inferring stochastic dynamics with growth from cross-sectional data
Learning non-equilibrium diffusions with Schrödinger bridges: from exactly solvable to simulation-free
Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
IF-Guide: Influence Function-Guided Detoxification of LLMs
On Evaluating Policies for Robust POMDPs
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning
Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study
Locality in Image Diffusion Models Emerges from Data Statistics
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Optimal Regret Bounds via Low-Rank Structured Variation in Non-Stationary Reinforcement Learning
Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models
Simulating Society Requires Simulating Thought
Fixing It in Post: A Comparative Study of LLM Post-Training Data Quality and Model Performance
Dataset Distillation for Pre-Trained Self-Supervised Vision Models
MoCha: Towards Movie-Grade Talking Character Generation
Generative Model Inversion Through the Lens of the Manifold Hypothesis
InstructFlow: Adaptive Symbolic Constraint-Guided Code Generation for Long-Horizon Planning
Strategic Cost Selection in Participatory Budgeting
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning
ReDit: Reward Dithering for Improved LLM Policy Optimization
Correlation Dimension of Autoregressive Large Language Models
Whole-Body Conditioned Egocentric Video Prediction
Evaluating Program Semantics Reasoning with Type Inference in System $F$
Optimal kernel regression bounds under energy-bounded noise
Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition
Disentangling Superpositions: Interpretable Brain Encoding Model with Sparse Concept Atoms
First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
What Moves the Eyes: Doubling Mechanistic Model Performance Using Deep Networks to Discover and Test Cognitive Hypotheses
MotionBind: Multi-Modal Human Motion Alignment for Retrieval, Recognition, and Generation
Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data
Conformal Information Pursuit for Interactively Guiding Large Language Models
Convergence Rates for Gradient Descent on the Edge of Stability for Overparametrised Least Squares
SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning
KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse
scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling
Weak-to-Strong Generalization under Distribution Shifts
RvLLM: LLM Runtime Verification with Domain Knowledge
Scaling Embedding Layers in Language Models
Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals
Efficient Part-level 3D Object Generation via Dual Volume Packing
Hyper-GoalNet: Goal-Conditioned Manipulation Policy Learning with HyperNetworks
Pseudo-Riemannian Graph Transformer
Transformer Key-Value Memories Are Nearly as Interpretable as Sparse Autoencoders
Scalable Valuation of Human Feedback through Provably Robust Model Alignment
Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables
Contextual Dynamic Pricing with Heterogeneous Buyers
Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation
Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents
Exploration via Feature Perturbation in Contextual Bandits
Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector
Generating Physically Sound Designs from Text and a Set of Physical Constraints
True Impact of Cascade Length in Contextual Cascading Bandits
Thompson Sampling for Multi-Objective Linear Contextual Bandit
Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble
Accident Anticipation via Temporal Occurrence Prediction
FlowMixer: A Depth-Agnostic Neural Architecture for Interpretable Spatiotemporal Forecasting
Explore In-Context Message Passing Operator for Graph Neural Networks in A Mean Field Game
Topology-aware Graph Diffusion Model with Persistent Homology
Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
Online Mixture of Experts: No-Regret Learning for Optimal Collective Decision-Making
Contextual Integrity in LLMs via Reasoning and Reinforcement Learning
Practical Bayes-Optimal Membership Inference Attacks
SPRO: Improving Image Generation via Self-Play
OPHR: Mastering Volatility Trading with Multi-Agent Deep Reinforcement Learning
Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
The Burden of Interactive Alignment with Inconsistent Preferences
Collective Counterfactual Explanations: Balancing Individual Goals and Collective Dynamics
REPA Works Until It Doesn’t: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
Self supervised learning for in vivo localization of microelectrode arrays using raw local field potential
GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation
Transferring Linear Features Across Language Models With Model Stitching
Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis
Precise Diffusion Inversion: Towards Novel Samples and Few-Step Models
Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
Foundations of Top-$k$ Decoding for Language Models
SHAP Meets Tensor Networks: Provably Tractable Explanations with Parallelism
FHGS: Feature-Homogenized Gaussian Splatting
Memory-Integrated Reconfigurable Adapters: A Unified Framework for Settings with Multiple Tasks
Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures
Dropout Regularization Versus l2-Penalization in the Linear Model
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
Mellow: a small audio language model for reasoning
Tracing the Roots: Leveraging Temporal Dynamics in Diffusion Trajectories for Origin Attribution
Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning
Gradient Multi-Normalization for Efficient LLM Training
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges
ADMN: A Layer-Wise Adaptive Multimodal Network for Dynamic Input Noise and Compute Resources
Information Retrieval Induced Safety Degradation in AI Agents
Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
Position: Require Frontier AI Labs To Release Small "Analog" Models
Adv-SSL: Adversarial Self-Supervised Representation Learning with Theoretical Guarantees
Rethinking Approximate Gaussian Inference in Classification
Differentiable Decision Tree via "ReLU+Argmin" Reformulation
HeavyWater and SimplexWater: Distortion-free LLM Watermarks for Low-Entropy Distributions
High-Performance Arithmetic Circuit Optimization via Differentiable Architecture Search
Differentially Private High-dimensional Variable Selection via Integer Programming
The Computational Complexity of Counting Linear Regions in ReLU Neural Networks
Stop DDoS Attacking the Research Community with AI-Generated Survey Papers
Parameter-Free Hypergraph Neural Network for Few-Shot Node Classification
Approximating Shapley Explanations in Reinforcement Learning
Efficient Large Language Model Inference with Neural Block Linearization
CAMO: Convergence-Aware Multi-Fidelity Bayesian Optimization
OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales
Multiplayer Federated Learning: Reaching Equilibrium with Less Communication
CyIN: Cyclic Informative Latent Space for Bridging Complete and Incomplete Multimodal Learning
LLM Meeting Decision Trees on Tabular Data
Bridging Crypto with ML-based Solvers: the SAT Formulation and Benchmarks
GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling
3D Equivariant Visuomotor Policy Learning via Spherical Projection
Sequential Multi-Agent Dynamic Algorithm Configuration
NSNQuant: A Double Normalization Approach for Calibration-Free Low-Bit Vector Quantization of KV Cache
Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization
Multi-Agent Imitation by Learning and Sampling from Factorized Soft Q-Function
Latent Principle Discovery for Language Model Self-Improvement
MIRA: Medical Time Series Foundation Model for Real-World Health Data
Creativity or Brute Force? Using Brainteasers as a Window into the Problem-Solving Abilities of Large Language Models
FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering
CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving
Contribution of task-irrelevant stimuli to drift of neural representations
Online Optimization for Offline Safe Reinforcement Learning
BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems
Mixture of Inputs: Text Generation Beyond Discrete Token Sampling
Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs
Training Language Models to Generate Quality Code with Program Analysis Feedback
Routing Mamba: Scaling State Space Models with Mixture-of-Experts Projection
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation
Reinforcement Learning Teachers of Test Time Scaling
Constrained Sampling for Language Models Should Be Easy: An MCMC Perspective
BayeSQP: Bayesian Optimization through Sequential Quadratic Programming
XIFBench: Evaluating Large Language Models on Multilingual Instruction Following
TimE: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling
FlashBias: Fast Computation of Attention with Bias
Scaling Data-Driven Probabilistic Robustness Analysis for Semantic Segmentation Neural Networks
Bridging Expressivity and Scalability with Adaptive Unitary SSMs
Efficient Allocation of Working Memory Resource for Utility Maximization in Humans and Recurrent Neural Networks
Efficient Utility-Preserving Machine Unlearning with Implicit Gradient Surgery
Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation
Machine Unlearning in 3D Generation: A Perspective-Coherent Acceleration Framework
Learning Robust Vision-Language Models from Natural Latent Spaces
Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness
QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models
ChemPile: A 250 GB Diverse and Curated Dataset for Chemical Foundation Models
Learning to Learn with Contrastive Meta-Objective
Adaptive Preference Arithmetic: A Personalized Agent with Adaptive Preference Arithmetic for Dynamic Preference Modeling
FEAT: Free energy Estimators with Adaptive Transport
How to build a consistency model: Learning flow maps via self-distillation
Multitask Learning with Stochastic Interpolants
Dynamic Test-Time Compute Scaling in Control Policy: Difficulty-Aware Stochastic Interpolant Policy
A High-Dimensional Statistical Method for Optimizing Transfer Quantities in Multi-Source Transfer Learning
Subspace Networks: Scaling Decentralized Training with Communication-Efficient Model Parallelism
Thresholds for sensitive optimality and Blackwell optimality in stochastic games
How Particle System Theory Enhances Hypergraph Message Passing
RSCC: A Large-Scale Remote Sensing Change Caption Dataset for Disaster Events
GnnXemplar: Exemplars to Explanations - Natural Language Rules for Global GNN Interpretability
Investigating Hallucinations of Time Series Foundation Models through Signal Subspace Analysis
Efficient and Near-Optimal Algorithm for Contextual Dueling Bandits with Offline Regression Oracles
Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness
RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Probing Hidden Knowledge Holes in Unlearned LLMs
Reinforcement Learning with Backtracking Feedback
Neural Evolution Strategy for Black-box Pareto Set Learning
Enforcing Hard Linear Constraints in Deep Learning Models with Decision Rules
Adaptive Time Encoding for Irregular Multivariate Time-Series Classification
OligoGym: Curated Datasets and Benchmarks for Oligonucleotide Drug Discovery
Generalizing Verifiable Instruction Following
Meta-Learning Objectives for Preference Optimization
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning
ROVER: Recursive Reasoning Over Videos with Vision-Language Models for Embodied Tasks
Channel Simulation and Distributed Compression with Ensemble Rejection Sampling
List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression
Kernel conditional tests from learning-theoretic bounds
DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding
Adapting to Stochastic and Adversarial Losses in Episodic MDPs with Aggregate Bandit Feedback
Enhancing LLM Planning for Robotics Manipulation through Hierarchical Procedural Knowledge Graphs
Post Hoc Regression Refinement via Pairwise Rankings
Token Embeddings Violate the Manifold Hypothesis
ORBIT - Open Recommendation Benchmark for Reproducible Research with Hidden Tests
MACS: Multi-Agent Reinforcement Learning for Optimization of Crystal Structures
ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding
Oracle-Efficient Combinatorial Semi-Bandits
VideoCAD: A Dataset and Model for Learning Long-Horizon 3D CAD UI Interactions from Video
Finite Sample Analyses for Continuous-time Linear Systems: System Identification and Online Control
Anti-Aliased 2D Gaussian Splatting
Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
Training-Free Safe Denoisers for Safe Use of Diffusion Models
Last Iterate Convergence in Monotone Mean Field Games
Forecasting in Offline Reinforcement Learning for Non-stationary Environments
Score-Based Diffusion Modeling for Nonparametric Empirical Bayes in Heteroscedastic Gaussian Mixtures
Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
Neural Stochastic Flows: Solver-Free Modelling and Inference for SDE Solutions
Recurrent Memory for Online Interdomain Gaussian Processes
On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts
Optical Coherence Tomography Harmonization with Anatomy-Guided Latent Metric Schrödinger Bridges
Impact of Layer Norm on Memorization and Generalization in Transformers
Optimal Estimation of the Best Mean in Multi-Armed Bandits
Graph-Based Attention for Differentiable MaxSAT Solving
Group-in-Group Policy Optimization for LLM Agent Training
EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code
Degradation-Aware Dynamic Schrödinger Bridge for Unpaired Image Restoration
Learning a Cross-Modal Schrödinger Bridge for Visual Domain Generalization
Consistency of Physics-Informed Neural Networks for Second-Order Elliptic Equations
AiDE-Q: Synthetic Labeled Datasets Can Enhance Learning Models for Quantum Property Estimation
RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion
AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation
Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death
Accelerating Optimization via Differentiable Stopping Time
Bringing SAM to new heights: leveraging elevation data for tree crown segmentation from drone imagery
GreenHyperSpectra: A multi-source hyperspectral dataset for global vegetation trait prediction
Taming Hyperparameter Sensitivity in Data Attribution: Practical Selection Without Costly Retraining
EvoBrain: Dynamic Multi-Channel EEG Graph Modeling for Time-Evolving Brain Networks
IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector
When No Paths Lead to Rome: Benchmarking Systematic Neural Relational Reasoning
Neurons as Detectors of Coherent Sets in Sensory Dynamics
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge
Unveiling Concept Attribution in Diffusion Models
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Improved Approximation Algorithms for Chromatic and Pseudometric-Weighted Correlation Clustering
MiCADangelo: Fine-Grained Reconstruction of Constrained CAD Models from 3D Scans
ODG: Occupancy Prediction Using Dual Gaussians
Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations?
LLM Safety Alignment is Divergence Estimation in Disguise
scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
SAFEPATH: Preventing Harmful Reasoning in Chain-of-Thought via Early Alignment
A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
Feedback Guidance of Diffusion Models
Information-Driven Design of Imaging Systems
Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Volume Transmission Implements Context Factorization to Target Online Credit Assignment and Enable Compositional Generalization
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning
Reasoning Planning for Language Models
Fully Dynamic Algorithms for Chamfer Distance
BioOSS: A Bio-Inspired Oscillatory State System with Spatio-Temporal Dynamics
Enhancing 3D Reconstruction for Dynamic Scenes
PartNeXt: A Next-Generation Dataset for Fine-Grained and Hierarchical 3D Part Understanding
Follow the Energy, Find the Path: Riemannian Metrics from Energy-Based Models
Scaling Laws for Optimal Data Mixtures
Probing Neural Combinatorial Optimization Models
DesignX: Human-Competitive Algorithm Designer for Black-Box Optimization
MetaBox-v2: A Unified Benchmark Platform for Meta-Black-Box Optimization
Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
Diffusion-Driven Two-Stage Active Learning for Low-Budget Semantic Segmentation
Conformal Arbitrage: Risk-Controlled Balancing of Competing Objectives in Language Models
Inference-Time Reward Hacking in Large Language Models
Flow based approach for Dynamic Temporal Causal models with non-Gaussian or Heteroscedastic Noises
VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play
Efficient Adaptive Federated Optimization
Object-centric binding in Contrastive Language-Image Pretraining
Continual Gaussian Mixture Distribution Modeling for Class Incremental Semantic Segmentation
Optimal community detection in dense bipartite graphs
Fast MRI for All: Bridging Access Gaps by Training without Raw Data
Towards Prospective Medical Image Reconstruction via Knowledge-Informed Dynamic Optimal Transport
GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing
DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving
FAME: Adaptive Functional Attention with Expert Routing for Function-on-Function Regression
Automatic Synthetic Data and Fine-grained Adaptive Feature Alignment for Composed Person Retrieval
How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation
Graph-Theoretic Insights into Bayesian Personalized Ranking for Recommendation
Constrained Diffusers for Safe Planning and Control
DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference
Disentangling Latent Shifts of In-Context Learning with Weak Supervision
Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models
X-Mahalanobis: Transformer Feature Mixing for Reliable OOD Detection
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
Bilevel Network Learning via Hierarchically Structured Sparsity
Approximation theory for 1-Lipschitz ResNets
Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples
Sequentially Auditing Differential Privacy
Rethinking Scale-Aware Temporal Encoding for Event-based Object Detection
ArchCAD-400K: A Large-Scale CAD drawings Dataset and New Baseline for Panoptic Symbol Spotting
Language‑Bias‑Resilient Visual Question Answering via Adaptive Multi‑Margin Collaborative Debiasing
HyperMixup: Hypergraph-Augmented with Higher-order Information Mixup
Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization
Event-based HDR Structured Light
Cyclic Counterfactuals under Shift–Scale Interventions
InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion
A Bayesian Approach to Contextual Dynamic Pricing using the Proportional Hazards Model with Discrete Price Data
Scaling Epidemic Inference on Contact Networks: Theory and Algorithms
Learning single index models via harmonic decomposition
REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders
Balancing Gradient and Hessian Queries in Non-Convex Optimization
Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning
Glance2Gaze: Efficient Vision-Language Models from Glance Fusion to Gaze Compression
FIGRDock: Fast Interaction-Guided Regression for Flexible Docking
Enhancing Deep Batch Active Learning for Regression with Imperfect Data Guided Selection
Strategic Hypothesis Testing
Neural Collapse in Cumulative Link Models for Ordinal Regression: An Analysis with Unconstrained Feature Model
The Price of Sparsity: Sufficient Conditions for Sparse Recovery using Sparse and Sparsified Measurements
Fin3R: Fine-tuning Feed-forward 3D Reconstruction Models via Monocular Knowledge Distillation
How Many Domains Suffice for Domain Generalization? A Tight Characterization via the Domain Shattering Dimension
Statistics Caching Test-Time Adaptation for Vision-Language Models
Memory-Enhanced Neural Solvers for Routing Problems
Association-Focused Path Aggregation for Graph Fraud Detection
From Human Attention to Diagnosis: Semantic Patch-Level Integration of Vision-Language Models in Medical Imaging
Layer-Wise Modality Decomposition for Interpretable Multimodal Sensor Fusion
SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodal LLMs
Return of ChebNet: Understanding and Improving an Overlooked GNN on Long Range Tasks
Long-tailed Recognition with Model Rebalancing
A Computationally Viable Numerical Gradient-based Technique for Optimal Covering Problems
Empower Words: DualGround for Structured Phrase and Sentence-Level Temporal Grounding
Zero-Shot Performance Prediction for Probabilistic Scaling Laws
MOBO-OSD: Batch Multi-Objective Bayesian Optimization via Orthogonal Search Directions
The emergence of sparse attention: impact of data distribution and benefits of repetition
SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search
Adaptive Latent-Space Constraints in Personalized Federated Learning
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Towards Generalizable 3D Human Pose Estimation via Ensembles on Flat Loss Landscapes
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Multivariate Latent Recalibration for Conditional Normalizing Flows
Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling
RepGuard: Adaptive Feature Decoupling for Robust Backdoor Defense in Large Language Models
Explainable Reinforcement Learning from Human Feedback to Improve Alignment
MoME: Mixture of Matryoshka Experts for Audio-Visual Speech Recognition
AI-Researcher: Autonomous Scientific Innovation
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing
Refining Norms: A Post-hoc Framework for OOD Detection in Graph Neural Networks
Fair Minimum Labeling: Efficient Temporal Network Activations for Reachability and Equity
BIPNN: Learning to Solve Binary Integer Programming via Hypergraph Neural Networks
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
How Benchmark Prediction from Fewer Data Misses the Mark
Virtual Fitting Room: Generating Arbitrarily Long Videos of Virtual Try-On from a Single Image
Martingale Posterior Neural Networks for Fast Sequential Decision Making
Teaching Language Models to Reason with Tools
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing
Timely Clinical Diagnosis through Active Test Selection
Estimating cognitive biases with attention-aware inverse planning
VoxDet: Rethinking 3D Semantic Scene Completion as Dense Object Detection
On Inductive Biases That Enable Generalization in Diffusion Transformers
Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems
Hybrid Autoencoders for Tabular Data: Leveraging Model-Based Augmentation in Low-Label Settings
Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Decentralized Dynamic Cooperation of Personalized Models for Federated Continual Learning
Value-Guided KV Compression for LLMs via Approximated CUR Decomposition
Embodied Cognition Augmented End2End Autonomous Driving
Adaptive Riemannian ADMM for Nonsmooth Optimization: Optimal Complexity without Smoothing
Spectral Analysis of Representational Similarity with Limited Neurons
Edit Less, Achieve More: Dynamic Sparse Neuron Masking for Lifelong Knowledge Editing in LLMs
LoMix: Learnable Weighted Multi-Scale Logits Mixing for Medical Image Segmentation
HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models
Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities
Taxonomy of reduction matrices for Graph Coarsening
Tight Bounds for Maximum Weight Matroid Independent Set and Matching in the Zero Communication Model
FACE: Faithful Automatic Concept Extraction
Irrational Complex Rotations Empower Low-bit Optimizers
TP-MDDN: Task-Preferenced Multi-Demand-Driven Navigation with Autonomous Decision-Making
Uni-RL: Unifying Online and Offline RL via Implicit Value Regularization
Vertical Federated Feature Screening
GenColor: Generative and Expressive Color Enhancement with Pixel-Perfect Texture Preservation
Towards Unsupervised Training of Matching-based Graph Edit Distance Solver via Preference-aware GAN
Low-degree evidence for computational transition of recovery rate in stochastic block model
Memorization in Graph Neural Networks
Planning and Learning in Average Risk-aware MDPs
SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding
ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation
Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control
From Kolmogorov to Cauchy: Shallow XNet Surpasses KANs
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
Learning Spatial-Aware Manipulation Ordering
CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling
MURKA: Multi-Reward Reinforcement Learning with Knowledge Alignment for Optimization Tasks
Transfer Learning on Edge Connecting Probability Estimation Under Graphon Model
Diffusion Models Meet Contextual Bandits
Streaming Federated Learning with Markovian Data
Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
Efficient Verified Unlearning For Distillation
KINDLE: Knowledge-Guided Distillation for Prior-Free Gene Regulatory Network Inference
SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs
Backdoor Cleaning without External Guidance in MLLM Fine-tuning
A Theoretical Framework for Grokking: Interpolation followed by Riemannian Norm Minimisation
PoLAR: Polar-Decomposed Low-Rank Adapter Representation
Large Language Models Think Too Fast To Explore Effectively
Dynamical modeling of nonlinear latent factors in multiscale neural activity with real-time inference
Predicting Functional Brain Connectivity with Context-Aware Deep Neural Networks
MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks
Stable Port-Hamiltonian Neural Networks
Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation
Universal Few-shot Spatial Control for Diffusion Models
Collaborative and Confidential Junction Trees for Hybrid Bayesian Networks
Pre-Trained Policy Discriminators are General Reward Models
Fractional Langevin Dynamics for Combinatorial Optimization via Polynomial-Time Escape
PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer
OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
Fast Non-Log-Concave Sampling under Nonconvex Equality and Inequality Constraints with Landing
Mixed-Sample SGD: an End-to-end Analysis of Supervised Transfer Learning
Quasi-Self-Concordant Optimization with $\ell_{\infty}$ Lewis Weights
Too Late to Recall: Explaining the Two-Hop Problem in Multimodal Knowledge Retrieval
Hephaestus: Mixture Generative Modeling with Energy Guidance for Large-scale QoS Degradation
Cypher-RI: Reinforcement Learning for Integrating Schema Selection into Cypher Generation
SALoM: Structure Aware Temporal Graph Networks with Long-Short Memory Updater
A Tale of Two Symmetries: Exploring the Loss Landscape of Equivariant Models
Chirality in Action: Time-Aware Video Representation Learning by Latent Straightening
Stepsize anything: A unified learning rate schedule for budgeted-iteration training
SpiderSolver: A Geometry-Aware Transformer for Solving PDEs on Complex Geometries
Revisiting Frank-Wolfe for Structured Nonconvex Optimization
Projective Equivariant Networks via Second-order Fundamental Differential Invariants
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
Skill-Driven Neurosymbolic State Abstractions
Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning
Sample-Adaptivity Tradeoff in On-Demand Sampling
Chain-of-Retrieval Augmented Generation
TANDEM: Bi-Level Data Mixture Optimization with Twin Networks
Mitigating Instability in High Residual Adaptive Sampling for PINNs via Langevin Dynamics
ProtoPairNet: Interpretable Regression through Prototypical Pair Reasoning
Online Bilateral Trade With Minimal Feedback: Don’t Waste Seller’s Time
Fairness-aware Anomaly Detection via Fair Projection
Scalable Evaluation and Neural Models for Compositional Generalization
VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold
FlowFeat: Pixel-Dense Embedding of Motion Profiles
Finding Low-Rank Matrix Weights in DNNs via Riemannian Optimization: RAdaGrad and RAdamW
Rare Text Semantics Were Always There in Your Diffusion Transformer
Demystifying Language Model Forgetting with Low-rank Example Associations
Re-coding for Uncertainties: Edge-awareness Semantic Concordance for Resilient Event-RGB Segmentation
FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks
Graph Few-Shot Learning via Adaptive Spectrum Experts and Cross-Set Distribution Calibration
In-Context Learning of Stochastic Differential Equations with Foundation Inference Models
Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs
A Learning-Augmented Dynamic Programming Approach for Orienteering Problem with Time Windows
Think Only When You Need with Large Hybrid-Reasoning Models
Intrinsic Goals for Autonomous Agents: Model-Based Exploration in Virtual Zebrafish Predicts Ethological Behavior and Whole-Brain Dynamics
APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings
Multi-head Temporal Latent Attention
RODS: Robust Optimization Inspired Diffusion Sampling for Detecting and Reducing Hallucination in Generative Models
Theoretical Benefit and Limitation of Diffusion Language Model
Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees
RLZero: Direct Policy Inference from Language Without In-Domain Supervision
A Generalized Label Shift Perspective for Cross-Domain Gaze Estimation
Scaling Law with Learning Rate Annealing
VA-GS: Enhancing the Geometric Representation of Gaussian Splatting via View Alignment
MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs
Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
Revolutionizing Graph Aggregation: From Suppression to Amplification via BoostGCN
BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading
Event-Driven Dynamic Scene Depth Completion
System-Embedded Diffusion Bridge Models
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models
Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models
Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
Fast exact recovery of noisy matrix from few entries: the infinity norm approach
Statistical Inference under Performativity
GPO: Learning from Critical Steps to Improve LLM Reasoning
SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
ZeCO: Zero-Communication Overhead Sequence Parallelism for Linear Attention
Optimize Any Topology: A Foundation Model for Shape- and Resolution-Free Structural Topology Optimization
3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs
DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection
Disentangling misreporting from genuine adaptation in strategic settings: a causal approach
Pre-trained Large Language Models Learn to Predict Hidden Markov Models In-context
Reasoning Models Sometimes Output Illegible Chains of Thought
Robust and Scalable Autonomous Reinforcement Learning in Irreversible Environments
Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics
Learning to Factorize Spatio-Temporal Foundation Models
$\text{G}^2\text{M}$: A Generalized Gaussian Mirror Method to Boost Feature Selection Power
Fast Projection-Free Approach (without Optimization Oracle) for Optimization over Compact Convex Set
MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning
ActiveVOO: Value of Observation Guided Active Knowledge Acquisition for Open-World Embodied Lifted Regression Planning
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
Understanding Differential Transformer Unchains Pretrained Self-Attentions
Enhanced Self-Distillation Framework for Efficient Spiking Neural Network Training
Cross-fluctuation phase transitions reveal sampling dynamics in diffusion models
PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation
Moment- and Power-Spectrum-Based Gaussianity Regularization for Text-to-Image Models
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping
RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data
From Indicators to Insights: Diversity-Optimized for Medical Series-Text Decoding via LLMs
Learning to Focus: Causal Attention Distillation via Gradient‐Guided Token Pruning
Template-Guided 3D Molecular Pose Generation via Flow Matching and Differentiable Optimization
RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
GVPO: Group Variance Policy Optimization for Large Language Model Post-Training
Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints
Can LLMs Outshine Conventional Recommenders? A Comparative Evaluation
TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models
Homogeneous Algorithms Can Reduce Competition in Personalized Pricing
Sekai: A Video Dataset towards World Exploration
MIR-Bench: Can Your LLM Recognize Complicated Patterns via Many-Shot In-Context Reasoning?
FEEL: Quantifying Heterogeneity in Physiological Signals for Generalizable Emotion Recognition
Learning Theory for Kernel Bilevel Optimization
CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation
UMU-Bench: Closing the Modality Gap in Multimodal Unlearning Evaluation
Towards Large-Scale In-Context Reinforcement Learning by Meta-Training in Randomized Worlds
Is Artificial Intelligence Generated Image Detection a Solved Problem?
Efficient Safe Meta-Reinforcement Learning: Provable Near-Optimality and Anytime Safety
ROSE: Remove Objects with Side Effects in Videos
Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?
Reconstruct, Inpaint, Test-Time Finetune: Dynamic Novel-view Synthesis from Monocular Videos
Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
Efficiently Maintaining the Multilingual Capacity of MCLIP in Downstream Cross-Modal Retrieval Tasks
KeeA*: Epistemic Exploratory A* Search via Knowledge Calibration
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
MUniverse: A Simulation and Benchmarking Suite for Motor Unit Decomposition
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
LTD-Bench: Evaluating Large Language Models by Letting Them Draw
Root Cause Analysis of Outliers with Missing Structural Knowledge
$O(\sqrt{T})$ Static Regret and Instance Dependent Constraint Violation for Constrained Online Convex Optimization
MixSignGraph: A Sign Sequence is Worth Mixed Graphs of Nodes
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering
Unlocking SLM Potential for Data Analysis Code Generation via Non-Parametric Knowledge Distillation
GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation
Disentangling Hyperedges through the Lens of Category Theory
SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation
Multi-Objective Reinforcement Learning with Max-Min Criterion: A Game-Theoretic Approach
BurstDeflicker: A Benchmark Dataset for Flicker Removal in Dynamic Scenes
Is This Tracker On? A Benchmark Protocol for Dynamic Tracking
Practical do-Shapley Explanations with Estimand-Agnostic Causal Inference
STree: Speculative Tree Decoding for Hybrid State Space Models
PID-controlled Langevin Dynamics for Faster Sampling on Generative Models
DreamPRM: Domain-reweighted Process Reward Model for Multimodal Reasoning
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
STAR: A Benchmark for Astronomical Star Fields Super-Resolution
UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning
AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems
Beyond Benign Overfitting in Nadaraya-Watson Interpolators
Structure-Aware Spectral Sparsification via Uniform Edge Sampling
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models
Depth-Bounds for Neural Networks via the Braid Arrangement
MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning
DiffBreak: Is Diffusion-Based Purification Robust?
Dense Associative Memory with Epanechnikov Energy
CLIMB: Class-imbalanced Learning Benchmark on Tabular Data
Beyond Token Probes: Hallucination Detection via Activation Tensors with ACT-ViT
MS-Bench: Evaluating LMMs in Ancient Manuscript Study through a Dunhuang Case Study
Towards precision protein-ligand affinity prediction benchmark: A Complete and Modification-Aware DAVIS Dataset
Flexible inference for animal learning rules using neural networks
Panoptic Captioning: An Equivalence Bridge for Image and Text
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents
Unveiling Extraneous Sampling Bias with Data Missing-Not-At-Random
MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention
TransferBench: Benchmarking Ensemble-based Black-box Transfer Attacks
Enhancing Multilingual LLM Pretraining with Model-Based Data Selection
AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation
DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding
Separating the 'what' and 'how' of compositional computation to enable reuse and continual learning
OceanBench: A Benchmark for Data-Driven Global Ocean Forecasting systems
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
DynaGuide: Steering Diffusion Policies with Active Dynamic Guidance
Unified 2D-3D Discrete Priors for Noise-Robust and Calibration-Free Multiview 3D Human Pose Estimation
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
FORLA: Federated Object-centric Representation Learning with Slot Attention
Unbiased Prototype Consistency Learning for Multi-Modal and Multi-Task Object Re-Identification
Improving Monte Carlo Tree Search for Symbolic Regression
Reinforcement Learning with Action Chunking
Exploring Landscapes for Better Minima along Valleys
Oryx: a Scalable Sequence Model for Many-Agent Coordination in Offline MARL
CamSAM2: Segment Anything Accurately in Camouflaged Videos
Spatially-aware Weights Tokenization for NeRF-Language Models
Tabula: A Tabular Self-Supervised Foundation Model for Single-Cell Transcriptomics
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning
Depth-Supervised Fusion Network for Seamless-Free Image Stitching
Risk Bounds For Distributional Regression
Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality
Path Gradients after Flow Matching
Optimal Graph Clustering without Edge Density Signals
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling
An Efficient Orlicz-Sobolev Approach for Transporting Unbalanced Measures on a Graph
Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models
Constrained Optimization From a Control Perspective via Feedback Linearization
InvFusion: Bridging Supervised and Zero-shot Diffusion for Inverse Problems
Unfolding the Black Box of Recurrent Neural Networks for Path Integration
Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems
MoonCast: High-Quality Zero-Shot Podcast Generation
Plug-and-play Feature Causality Decomposition for Multimodal Representation Learning
Faster Video Diffusion with Trainable Sparse Attention
PhySwin: An Efficient and Physically-Informed Foundation Model for Multispectral Earth Observation
Value Diffusion Reinforcement Learning
Partner Modelling Emerges in Recurrent Agents (But Only When It Matters)
MultiNet: Adaptive Multi-Viewed Subgraph Convolutional Networks for Graph Classification
Policy Gradient Methods Converge Globally in Imperfect-Information Extensive-Form Games
Rethinking Nighttime Image Deraining via Learnable Color Space Transformation
PandaPose: 3D Human Pose Lifting from a Single Image via Propagating 2D Pose Prior to 3D Anchor Space
The quest for the GRAph Level autoEncoder (GRALE)
Kernel von Mises Formula of the Influence Function
Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
Conditional Distribution Compression via the Kernel Conditional Mean Embedding
The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training
VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
CoP: Agentic Red-teaming for Large Language Models using Composition of Principles
PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Non-Asymptotic Analysis Of Data Augmentation For Precision Matrix Estimation
Computational Budget Should Be Considered in Data Selection
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers
Progressive Data Dropout: An Embarrassingly Simple Approach to Train Faster
BlockScan: Detecting Anomalies in Blockchain Transactions
msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML
LLM Interpretability with Identifiable Temporal-Instantaneous Representation
DreamLight: Towards Harmonious and Consistent Image Relighting
GLVD: Guided Learned Vertex Descent
Dual-Path Temporal Decoder for End-to-End Multi-Object Tracking
Attention on the Sphere
Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
Transformers are almost optimal metalearners for linear classification
Structural Information-based Hierarchical Diffusion for Offline Reinforcement Learning
Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code
Towards Building Model/Prompt-Transferable Attackers against Large Vision-Language Models
Learning Chern Numbers of Multiband Topological Insulators with Gauge Equivariant Neural Networks
QBasicVSR: Temporal Awareness Adaptation Quantization for Video Super-Resolution
Regularized least squares learning with heavy-tailed noise is minimax optimal
DyMoDreamer: World Modeling with Dynamic Modulation
Impact of Dataset Properties on Membership Inference Vulnerability of Deep Transfer Learning
Reading Recognition in the Wild
Bayes optimal learning of attention-indexed models
Self-Refining Language Model Anonymizers via Adversarial Distillation
Understanding while Exploring: Semantics-driven Active Mapping
Delving into Cascaded Instability: A Lipschitz Continuity View on Image Restoration and Object Detection Synergy
Flow Matching Neural Processes
Computable universal online learning
Brain-Inspired fMRI-to-Text Decoding via Incremental and Wrap-Up Language Modeling
Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies
Hierarchical Shortest-Path Graph Kernel Network
Plenodium: Underwater 3D Scene Reconstruction with Plenoptic Medium Representation
Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Safely Learning Controlled Stochastic Dynamics
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks
Parallelizing MCMC Across the Sequence Length
StarTrail: Concentric Ring Sequence Parallelism for Efficient Near-Infinite-Context Transformer Model Training
Fantastic Features and Where to Find Them: A Probing Method to combine Features from Multiple Foundation Models
Differential Privacy for Euclidean Jordan Algebra with Applications to Private Symmetric Cone Programming
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Beyond Greedy Exits: Improved Early Exit Decisions for Risk Control and Reliability
ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
Robustness in Both Domains: CLIP Needs a Robust Text Encoder
Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values
When and how can inexact generative models still sample from the data manifold?
L2DGCN: Learnable Enhancement and Label Selection Dynamic Graph Convolutional Networks for Mitigating Degree Bias
On Logic-based Self-Explainable Graph Neural Networks
Elastic ViTs from Pretrained Models without Retraining
Mamba Only Glances Once (MOGO): A Lightweight Framework for Efficient Video Action Detection
Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning
KAIROS: Scalable Model-Agnostic Data Valuation
An Investigation of Memorization Risk in Healthcare Foundation Models
Johnson-Lindenstrauss Lemma Beyond Euclidean Geometry
Permutation Equivariant Neural Controlled Differential Equations for Dynamic Graph Representation Learning
GeoClip: Geometry-Aware Clipping for Differentially Private SGD
Bézier Splatting for Fast and Differentiable Vector Graphics Rendering
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model
Influence Functions for Edge Edits in Non-Convex Graph Neural Networks
OMiSO: Adaptive optimization of state-dependent brain stimulation to shape neural population states
Generative Caching for Structurally Similar Prompts and Responses
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
Cycle-Sync: Robust Global Camera Pose Estimation through Enhanced Cycle-Consistent Synchronization
KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills
NEED: Cross-Subject and Cross-Task Generalization for Video and Image Reconstruction from EEG Signals
Large Language Diffusion Models
Distance-informed Neural Processes
Don’t Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation
Rainbow Delay Compensation: A Multi-Agent Reinforcement Learning Framework for Mitigating Observation Delays
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
Planning with Quantized Opponent Models
Hybrid Latent Reasoning via Reinforcement Learning
The Nuclear Route: Sharp Asymptotics of ERM in Overparameterized Quadratic Networks
Imitation Learning with Temporal Logic Constraints
Vector Quantization in the Brain: Grid-like Codes in World Models
Stable Cinemetrics: Structured Taxonomy and Evaluation for Professional Video Generation
Angular Constraint Embedding via SpherePair Loss for Constrained Clustering
On the sample complexity of semi-supervised multi-objective learning
The Gaussian Mixing Mechanism: Renyi Differential Privacy via Gaussian Sketches
Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models
Beyond Oracle: Verifier-Supervision for Instruction Hierarchy in Reasoning and Instruction-Tuned LLMs
Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning
Remasking Discrete Diffusion Models with Inference-Time Scaling
Learning-Augmented Online Bidding in Stochastic Settings
Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings
Efficient semantic uncertainty quantification in language models via diversity-steered sampling
FlareX: A Physics-Informed Dataset for Lens Flare Removal via 2D Synthesis and 3D Rendering
Feature Unlearning: Theoretical Foundations and Practical Applications with Shuffling
Ground-Compose-Reinforce: Grounding Language in Agentic Behaviours using Limited Data
Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models
Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning
LOMIA: Label-Only Membership Inference Attacks against Pre-trained Large Vision-Language Models
Nonparametric Quantile Regression with ReLU-Activated Recurrent Neural Networks
Video World Models with Long-term Spatial Memory
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Scaling Up Parameter Generation: A Recurrent Diffusion Approach
FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling
Diffusion on Demand: Selective Caching and Modulation for Efficient Generation
Structured Reinforcement Learning for Combinatorial Decision-Making
Self-Supervised Discovery of Neural Circuits in Spatially Patterned Neural Responses with Graph Neural Networks
FedFree: Breaking Knowledge-sharing Barriers through Layer-wise Alignment in Heterogeneous Federated Learning
DSRF: A Dynamic and Scalable Reasoning Framework for Solving RPMs
Online Locally Differentially Private Conformal Prediction via Binary Inquiries
Principled Long-Tailed Generative Modeling via Diffusion Models
SONAR: Long-Range Graph Propagation Through Information Waves
MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation
Spatial-Aware Decision-Making with Ring Attractors in Reinforcement Learning Systems
Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions
Improving Decision Trees through the Lens of Parameterized Local Search
TGA: True-to-Geometry Avatar Dynamic Reconstruction
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
CAMILA: Context-Aware Masking for Image Editing with Language Alignment
Promptable 3-D Object Localization with Latent Diffusion Models
Adaptive Discretization for Consistency Models
Neural Mutual Information Estimation with Vector Copulas
Semi-supervised Vertex Hunting, with Applications in Network and Text Analysis
Thompson Sampling in Function Spaces via Neural Operators
Schrödinger Bridge Matching for Tree-Structured Costs and Entropic Wasserstein Barycentres
Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
OctoNet: A Large-Scale Multi-Modal Dataset for Human Activity Understanding Grounded in Motion-Captured 3D Pose Labels
Offline Actor-Critic for Average Reward MDPs
Video Perception Models for 3D Scene Synthesis
VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code
Over-squashing in Spatiotemporal Graph Neural Networks
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training
CADGrasp: Learning Contact and Collision Aware General Dexterous Grasping in Cluttered Scenes
Natural Gradient VI: Guarantees for Non-Conjugate Models
UMAMI: Unifying Masked Autoregressive Models and Deterministic Rendering for View Synthesis
TraffiDent: A Dataset for Understanding the Interplay Between Traffic Dynamics and Incidents
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning
Event-Guided Consistent Video Enhancement with Modality-Adaptive Diffusion Pipeline
FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model
IBGS: Image-Based Gaussian Splatting
TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception
Tensor Decomposition Networks for Accelerating Machine Learning Force Field Computations
SentinelKilnDB: A Large-Scale Dataset and Benchmark for OBB Brick Kiln Detection in South Asia Using Satellite Imagery
FAPEX: Fractional Amplitude-Phase Expressor for Robust Cross-Subject Seizure Prediction
Few-Shot Knowledge Distillation of LLMs With Counterfactual Explanations
Evaluating multiple models using labeled and unlabeled data
GoalLadder: Incremental Goal Discovery with Vision-Language Models
Geometric Imbalance in Semi-Supervised Node Classification
Visual Anagrams Reveal Hidden Differences in Holistic Shape Processing Across Vision Models
Minimal Semantic Sufficiency Meets Unsupervised Domain Generalization
LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning
InFlux: A Benchmark for Self-Calibration of Dynamic Intrinsics of Video Cameras
MODEL SHAPLEY: Find Your Ideal Parameter Player via One Gradient Backpropagation
Efficient and Generalizable Mixed-Precision Quantization via Topological Entropy
Inexact Column Generation for Bayesian Network Structure Learning via Difference-of-Submodular Optimization
Guarantees for Alternating Least Squares in Overparameterized Tensor Decompositions
Iterative Foundation Model Fine-Tuning on Multiple Rewards
Revisiting 1-peer exponential graph for enhancing decentralized learning efficiency
AVerImaTeC: A Dataset for Automatic Verification of Image-Text Claims with Evidence from the Web
Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
PIVNO: Particle Image Velocimetry Neural Operator
Counterfactual Evolution of Multimodal Datasets via Visual Programming
DoseSurv: Predicting Personalized Survival Outcomes under Continuous-Valued Treatments
Diversity-Aware Policy Optimization for Large Language Model Reasoning
A solvable model of learning generative diffusion: theory and insights
One Stone with Two Birds: A Null-Text-Null Frequency-Aware Diffusion Models for Text-Guided Image Inpainting
StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models
GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers
From Softmax to Score: Transformers Can Effectively Implement In-Context Denoising Steps
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors
LiteReality: Graphic-Ready 3D Scene Reconstruction from RGB-D Scans
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation
NEP: Autoregressive Image Editing via Next Editing Token Prediction
Autoencoding Random Forests
Learning to Generalize: An Information Perspective on Neural Processes
Quantization-Free Autoregressive Action Transformer
The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels
HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model
Just One Layer Norm Guarantees Stable Extrapolation
Learning from A Single Markovian Trajectory: Optimality and Variance Reduction
Online Inverse Linear Optimization: Efficient Logarithmic-Regret Algorithm, Robustness to Suboptimality, and Lower Bound
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
Unsupervised Trajectory Optimization for 3D Registration in Serial Section Electron Microscopy using Neural ODEs
Unified Scaling Laws for Compressed Representations
Linear Mixture Distributionally Robust Markov Decision Processes
Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning
Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility
ChatVLA-2: Vision-Language-Action Model with Open-World Reasoning
Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
Grounded Reinforcement Learning for Visual Reasoning
Why Do Some Language Models Fake Alignment While Others Don't?
Adaptive Quantization in Generative Flow Networks for Probabilistic Sequential Prediction
Efficient $k$-Sparse Band–Limited Interpolation with Improved Approximation Ratio
A Principled Approach to Randomized Selection under Uncertainty: Applications to Peer Review and Grant Funding
H-SPLID: HSIC-based Saliency Preserving Latent Information Decomposition
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
Human Texts Are Outliers: Detecting LLM-generated Texts via Out-of-distribution Detection
On the Hardness of Approximating Distributions with Tractable Probabilistic Models
An Efficient Local Search Approach for Polarized Community Discovery in Signed Networks
Rebalancing Return Coverage for Conditional Sequence Modeling in Offline Reinforcement Learning
Distil-E2D: Distilling Image-to-Depth Priors for Event-Based Monocular Depth Estimation
Hierarchical Fine-grained Preference Optimization for Physically Plausible Video Generation
Graph Data Selection for Domain Adaptation: A Model-Free Approach
Progress Reward Model for Reinforcement Learning via Large Language Models
Replicable Online Learning
Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms
ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion
EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling
MultiScale Contextual Bandits for Long Term Objectives
C-NAV: Towards Self-Evolving Continual Object Navigation in Open World
Evaluating Robustness of Monocular Depth Estimation with Procedural Scene Perturbations
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Combining Cost Constrained Runtime Monitors for AI Safety
X-Field: A Physically Informed Representation for 3D X-ray Reconstruction
Practical and Effective Code Watermarking for Large Language Models
Kernel-based Equalized Odds: A Quantification of Accuracy-Fairness Trade-off in Fair Representation Learning
WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
Stochastic Forward-Forward Learning through Representational Dimensionality Compression
Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Distances for Markov chains from sample streams
RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation
DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
Trajectory Graph Learning: Aligning with Long Trajectories in Reinforcement Learning Without Reward Design
CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward
Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
Learning to Steer: Input-dependent Steering for Multimodal LLMs
Memory Injection Attacks on LLM Agents via Query-Only Interaction
Smoothed Agnostic Learning of Halfspaces over the Hypercube
LabelAny3D: Label Any Object 3D in the Wild
Single-Teacher View Augmentation: Boosting Knowledge Distillation via Angular Diversity
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry
What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization
Robust Estimation Under Heterogeneous Corruption Rates
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
SDTagNet: Leveraging Text-Annotated Navigation Maps for Online HD Map Construction
Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling
Efficient Low Rank Attention for Long-Context Inference in Large Language Models
Causal Sufficiency and Necessity Improves Chain-of-Thought Reasoning
Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs
Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers
Self-supervised Blending Structural Context of Visual Molecules for Robust Drug Interaction Prediction
Controlling The Spread of Epidemics on Networks with Differential Privacy
SynCL: A Synergistic Training Strategy with Instance-Aware Contrastive Learning for End-to-End Multi-Camera 3D Tracking
Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
Pose Splatter: A 3D Gaussian Splatting Model for Quantifying Animal Pose and Appearance
OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation
Quantifying Elicitation of Latent Capabilities in Language Models
Scalable and adaptive prediction bands with kernel sum-of-squares
RoomEditor: High-Fidelity Furniture Synthesis with Parameter-Sharing U-Net
Quantum Speedups for Minimax Optimization and Beyond
Rethinking Gradient Step Denoiser: Towards Truly Pseudo-Contractive Operator
RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis
RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting
Global Minimizers of Sigmoid Contrastive Loss
GeoVideo: Introducing Geometric Regularization into Video Generation Model
SPFL: Sequential updates with Parallel aggregation for Enhanced Federated Learning under Category and Domain Shifts
UniTraj: Learning a Universal Trajectory Foundation Model from Billion-Scale Worldwide Traces
Scalable Neural Incentive Design with Parameterized Mean-Field Approximation
zip2zip: Inference-Time Adaptive Tokenization via Online Compression
Uncertainty Quantification for Physics-Informed Neural Networks with Extended Fiducial Inference
From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes
Interpreting Emergent Features in Deep Learning-based Side-channel Analysis
AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented Efficient Long Video Understanding
Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification
OptiTree: Hierarchical Thoughts Generation with Tree Search for LLM Optimization Modeling
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
PoseCrafter: Extreme Pose Estimation with Hybrid Video Synthesis
Enhancing Training Data Attribution with Representational Optimization
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Bits Leaked per Query: Information-Theoretic Bounds for Adversarial Attacks on LLMs
DynaPipe: Dynamic Layer Redistribution for Efficient Serving of LLMs with Pipeline Parallelism
ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
COLA: Towards Efficient Multi-Objective Reinforcement Learning with Conflict Objective Regularization in Latent Space
Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Latent Space Factorization in LoRA
Periodic Skill Discovery
BrainEC-LLM: Brain Effective Connectivity Estimation by Multiscale Mixing LLM
Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations
Latent Retrieval Augmented Generation of Cross-Domain Protein Binders
Can We Infer Confidential Properties of Training Data from LLMs?
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation
Co-PatcheR: Collaborative Software Patching with Component-specific Small Reasoning Models
DMol: A Highly Efficient and Chemical Motif-Preserving Molecule Generation Platform
Transition Matching: Scalable and Flexible Generative Modeling
Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses via Convolutional Fenchel–Young Losses
Depth-Width Tradeoffs for Transformers on Graph Tasks
Exploring the Design Space of Diffusion Bridge Models
From Pretraining to Pathology: How Noise Leads to Catastrophic Inheritance in Medical Models
Pattern-Guided Adaptive Prior for Structure Learning
CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step
Better NTK Conditioning: A Free Lunch from (ReLU) Nonlinear Activation in Wide Neural Networks
Switchable Token-Specific Codebook Quantization For Face Image Compression
Lattice Boltzmann Model for Learning Real-World Pixel Dynamicity
A Fair Federated Learning Method for Handling Client Participation Probability Inconsistencies in Heterogeneous Environments
Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
CORE: Collaborative Optimization with Reinforcement Learning and Evolutionary Algorithm for Floorplanning
InstructSAM: A Training-free Framework for Instruction-Oriented Remote Sensing Object Recognition
Dynamic Semantic-Aware Correlation Modeling for UAV Tracking
HIDISC: A Hyperbolic Framework for Domain Generalization with Generalized Category Discovery
Attention Sinks: A 'Catch, Tag, Release' Mechanism for Embeddings
Infinite Neural Operators: Gaussian processes on functions
LLM Layers Immediately Correct Each Other
DualCnst: Enhancing Zero-Shot Out-of-Distribution Detection via Text-Image Consistency in Vision-Language Models
On Union-Closedness of Language Generation
OmniZoom: A Universal Plug-and-Play Paradigm for Cross-Device Smooth Zoom Interpolation
Aligning Compound AI Systems via System-level DPO
How to Train Your LLM Web Agent: A Statistical Diagnosis
SteerConf: Steering LLMs for Confidence Elicitation
PANGEA: Projection-Based Augmentation with Non-Relevant General Data for Enhanced Domain Adaptation in LLMs
Efficient PAC Learning for Realizable-Statistic Models via Convex Surrogates
IPSI: Enhancing Structural Inference with Automatically Learned Structural Priors
ViewCraft3D: High-fidelity and View-Consistent 3D Vector Graphics Synthesis
Epistemic Uncertainty for Generated Image Detection
Prompt-guided Disentangled Representation for Action Recognition
Multipole Attention for Efficient Long Context Reasoning
On-Policy Optimization with Group Equivalent Preference for Multi-Programming Language Understanding
QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code
Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning
Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies
Value-Guided Decision Transformer: A Unified Reinforcement Learning Framework for Online and Offline Settings
Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler
Tackling Continual Offline RL through Selective Weights Activation on Aligned Spaces
Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Autoregressive Motion Generation with Gaussian Mixture-Guided Latent Sampling
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
Fast constrained sampling in pre-trained diffusion models
Solving Partial Differential Equations via Radon Neural Operator
Self-Verification Provably Prevents Model Collapse in Recursive Synthetic Training
Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought
Replicable Online Pricing
Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings
Controllable 3D Molecular Generation for Structure-Based Drug Design Through Bayesian Flow Networks and Gradient Integration
DAWP: A framework for global observation forecasting via Data Assimilation and Weather Prediction in satellite observation space
Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection
Robustifying Learning-Augmented Caching Efficiently without Compromising 1-Consistency
Unveiling Transformer Perception by Exploring Input Manifolds
Spectral Learning for Infinite-Horizon Average-Reward POMDPs
HiMaCon: Discovering Hierarchical Manipulation Concepts from Unlabeled Multi-Modal Data
Bivariate Matrix-valued Linear Regression (BMLR): Finite-sample performance under Identifiability and Sparsity Assumptions
Bootstrapping Hierarchical Autoregressive Formal Reasoner with Chain-of-Proxy-Autoformalization
Beyond Least Squares: Uniform Approximation and the Hidden Cost of Misspecification
Unveiling the Power of Multiple Gossip Steps: A Stability-Based Generalization Analysis in Decentralized Training
Self-supervised Learning of Echocardiographic Video Representations via Online Cluster Distillation
AI-Generated Video Detection via Perceptual Straightening
Identifying multi-compartment Hodgkin-Huxley models with high-density extracellular voltage recordings
Connecting Jensen–Shannon and Kullback–Leibler Divergences: A New Bound for Representation Learning
Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory
Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
How do Transformers Learn Implicit Reasoning?
GeoCAD: Local Geometry-Controllable CAD Generation with Large Language Models
Approximately Aligned Decoding
OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model
FlexSelect: Flexible Token Selection for Efficient Long Video Understanding
Fast Local Search Algorithms for Clustering with Adaptive Sampling and Bandit Strategies
Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking
Parsimonious Predictions for Strategyproof Scheduling
Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Learning Urban Climate Dynamics via Physics-Guided Urban Surface–Atmosphere Interactions
Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency
Table as a Modality for Large Language Models
Efficient RAW Image Deblurring with Adaptive Frequency Modulation
Reasoning Beyond Points: A Visual Introspective Approach for Few-Shot 3D Segmentation
Characterizing control between interacting subsystems with deep Jacobian estimation
APML: Adaptive Probabilistic Matching Loss for Robust 3D Point Cloud Reconstruction
Gaussian Herding across Pens: An Optimal Transport Perspective on Global Gaussian Reduction for 3DGS
Semantic and Visual Crop-Guided Diffusion Models for Heterogeneous Tissue Synthesis in Histopathology
Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization
Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning
Making Classic GNNs Strong Baselines Across Varying Homophily: A Smoothness–Generalization Perspective
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
Physics-informed Neural Operator for Pansharpening
Anatomically inspired digital twins capture hierarchical object representations in visual cortex
Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models
Towards Visualization-of-Thought Jailbreak Attack against Large Visual Language Models
Interactive Cross-modal Learning for Text-3D Scene Retrieval
Diverse Influence Component Analysis: A Geometric Approach to Nonlinear Mixture Identifiability
MokA: Multimodal Low-Rank Adaptation for MLLMs
Efficient Training of Minimal and Maximal Low-Rank Recurrent Neural Networks
Bohdi: Heterogeneous LLM Fusion with Automatic Data Exploration
From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots
Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
EraseFlow: Learning Concept Erasure Policies via GFlowNet-Driven Alignment
Single-pass Adaptive Image Tokenization for Minimum Program Search
Ambient Diffusion Omni: Training Good Models with Bad Data
RANK++LETR: Learn to Rank and Optimize Candidates for Line Segment Detection
NeuralSurv: Deep Survival Analysis with Bayesian Uncertainty Quantification
Accelerating 3D Molecule Generative Models with Trajectory Diagnosis
Learning to Control Free-Form Soft Swimmers
Reproducing Kernel Banach Space Models for Neural Networks with Application to Rademacher Complexity Analysis
VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set
Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates
Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
Reasoning as an Adaptive Defense for Safety
AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping
CroPe: Cross-Modal Semantic Compensation Adaptation for All Adverse Scene Understanding
TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems
Complete Structure Guided Point Cloud Completion via Cluster- and Instance-Level Contrastive Learning
Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
Strategyproof Reinforcement Learning from Human Feedback
BMW: Bidirectionally Memory bank reWriting for Unsupervised Person Re-Identification
Towards Interpretable and Efficient Attention: Compressing All by Contracting a Few
Implicit-ARAP: Efficient Handle-Guided Neural Field Deformation via Local Patch Meshing
Online Portfolio Selection with ML Predictions
Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling
Normalize Filters! Classical Wisdom for Deep Vision
Conflict-Aware Knowledge Editing in the Wild: Semantic-Augmented Graph Representation for Unstructured Text
Rig3R: Rig-Aware Conditioning and Discovery for 3D Reconstruction
A Generalist Intracortical Motor Decoder
Diffusion-Driven Progressive Target Manipulation for Source-Free Domain Adaptation
3D Visual Illusion Depth Estimation
Mechanism Design for LLM Fine-tuning with Multiple Reward Models
GD$^2$: Robust Graph Learning under Label Noise via Dual-View Prediction Discrepancy
You Can Trust Your Clustering Model: A Parameter-free Self-Boosting Plug-in for Deep Clustering
Boosting Adversarial Transferability with Spatial Adversarial Alignment
TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting
Search and Refine During Think: Facilitating Knowledge Refinement for Improved Retrieval-Augmented Reasoning
In-Context Fully Decentralized Cooperative Multi-Agent Reinforcement Learning
Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models
Spiking Neural Networks Need High-Frequency Information
A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond
EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification
Image Stitching in Adverse Condition: A Bidirectional-Consistency Learning Framework and Benchmark
Fit the Distribution: Cross-Image/Prompt Adversarial Attacks on Multimodal Large Language Models
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Self-Perturbed Anomaly-Aware Graph Dynamics for Multivariate Time-Series Anomaly Detection
Preference Optimization by Estimating the Ratio of the Data Distribution
Curriculum Model Merging: Harmonizing Chemical LLMs for Enhanced Cross-Task Generalization
URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model
ToF-IP: Time-of-Flight Enhanced Sparse Inertial Poser for Real-time Human Motion Capture
WISA: World simulator assistant for physics-aware text-to-video generation
Do LVLMs Truly Understand Video Anomalies? Revealing Hallucination via Co-Occurrence Patterns
Thousand Voices of Trauma: A Large-Scale Synthetic Dataset for Modeling Prolonged Exposure Therapy Conversations
Whose Instructions Count? Resolving Preference Bias in Instruction Fine-Tuning
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching
Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers
Prioritizing Perception-Guided Self-Supervision: A New Paradigm for Causal Modeling in End-to-End Autonomous Driving
How to Learn a Star: Binary Classification with Starshaped Polyhedral Sets
Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries
Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes
Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach
Simple Distillation for One-Step Diffusion Models
Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models
InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models
Thinking vs. Doing: Improving Agent Reasoning by Scaling Test-Time Interaction
ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio–Language Models
FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
RadZero: Similarity-Based Cross-Attention for Explainable Vision-Language Alignment in Chest X-ray with Zero-Shot Multi-Task Capability
SGN: Shifted Window-Based Hierarchical Variable Grouping for Multivariate Time Series Classification
LiveStar: Live Streaming Assistant for Real-World Online Video Understanding
Towards Resilient Safety-driven Unlearning for Diffusion Models against Downstream Fine-tuning
Taught Well Learned Ill: Towards Distillation-conditional Backdoor Attack
SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
Towards the Resistance of Neural Network Fingerprinting to Fine-tuning
FreeInv: Free Lunch for Improving DDIM Inversion
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
An Effective Levelling Paradigm for Unlabeled Scenarios
FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification
Proxy-SPEX: Sample-Efficient Interpretability via Sparse Feature Interactions in LLMs
Policy Optimized Text-to-Image Pipeline Design
A Geometry-Aware Metric for Mode Collapse in Time Series Generative Models
GLID$^2$E: A Gradient-Free Lightweight Fine-tune Approach for Discrete Biological Sequence Design
Robust Egocentric Referring Video Object Segmentation via Dual-Modal Causal Intervention
Memory Mosaics at scale
WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
Orientation Matters: Making 3D Generative Models Orientation-Aligned
High-Order Flow Matching: Unified Framework and Sharp Statistical Rates
FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies
Image as a World: Generating Interactive World from Single Image via Panoramic Video Generation
Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch
Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Protein Inverse Folding From Structure Feedback
Do-PFN: In-Context Learning for Causal Effect Estimation
ALTER: All-in-One Layer Pruning and Temporal Expert Routing for Efficient Diffusion Generation
Model-Guided Dual-Role Alignment for High-Fidelity Open-Domain Video-to-Audio Generation
Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration
Diffusion Transformers for Imputation: Statistical Efficiency and Uncertainty Quantification
CALM: Culturally Self-Aware Language Models
CMoB: Modality Valuation via Causal Effect for Balanced Multimodal Learning
Automatic Visual Instrumental Variable Learning for Confounding-Resistant Domain Generalization
Redefining Experts: Interpretable Decomposition of Language Models for Toxicity Mitigation
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
KSP: Kolmogorov-Smirnov metric-based Post-Hoc Calibration for Survival Analysis
Deep Tree Tensor Networks
ErrorTrace: A Black-Box Traceability Mechanism Based on Model Family Error Space
FlowNet: Modeling Dynamic Spatio-Temporal Systems via Flow Propagation
Fast Data Attribution for Text-to-Image Models
Mitigating Overthinking in Large Reasoning Models via Manifold Steering
Compress Large Language Models via Collaboration Between Learning and Matrix Approximation
Black-Box Membership Inference Attack for LVLMs via Prior Knowledge-Calibrated Memory Probing
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
Multiplication-Free Parallelizable Spiking Neurons with Efficient Spatio-Temporal Dynamics
Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning
Audits Under Resource, Data, and Access Constraints: Scaling Laws For Less Discriminatory Alternatives
Self-Assembling Graph Perceptrons
Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing
Puzzles: Unbounded Video-Depth Augmentation for Scalable End-to-End 3D Reconstruction
New Perspectives on the Polyak Stepsize: Surrogate Functions and Negative Results
Democratizing Clinical Risk Prediction with Cross-Cohort Cross-Modal Knowledge Transfer
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensemble Generation
Gaze Beyond the Frame: Forecasting Egocentric 3D Visual Span
LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
SRA-CL: Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation
Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision
UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection
The Persistence of Neural Collapse Despite Low-Rank Bias
GenIR: Generative Visual Feedback for Mental Image Retrieval
UniteFormer: Unifying Node and Edge Modalities in Transformers for Vehicle Routing Problems
Pruning-Robust Mamba with Asymmetric Multi-Scale Scanning Paths
Identifying interactions across brain areas while accounting for individual-neuron dynamics with a Transformer-based variational autoencoder
Text to Sketch Generation with Multi-Styles
QSCA: Quantization with Self-Compensating Auxiliary for Monocular Depth Estimation
Solving the Asymmetric Traveling Salesman Problem via Trace-Guided Cost Augmentation
Training-Free Efficient Video Generation via Dynamic Token Carving
GAMMA: Gated Multi-hop Message Passing for Homophily-Agnostic Node Representation in GNNs
Vector Database Watermarking
Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
SAINT: Sequence-Aware Integration for Spatial Transcriptomics Multi-View Clustering
Align-DA: Align Score-based Atmospheric Data Assimilation with Multiple Preferences
Polyline Path Masked Attention for Vision Transformer
P-Law: Predicting Quantitative Scaling Law with Entropy Guidance in Large Recommendation Models
Stealthy Yet Effective: Distribution-Preserving Backdoor Attacks on Graph Classification
Tracing the Representation Geometry of Language Models from Pretraining to Post-training
Efficient Kernelized Learning in Polyhedral Games beyond Full Information: From Colonel Blotto to Congestion Games
Revolutionizing Training-Free NAS: Towards Efficient Automatic Proxy Discovery via Large Language Models
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement
Transforming Gaps into Gains: Bridging Model and Data Heterogeneity in Federated Learning via Knowledge Weak-Aware Zones
Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding
AmorLIP: Efficient Language-Image Pretraining via Amortization
Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon
Shapley-Coop: Credit Assignment for Emergent Cooperation in Self-Interested LLM Agents
Focus-Then-Reuse: Fast Adaptation in Visual Perturbation Environments
Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
SE-GUI: Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering
No Loss, No Gain: Gated Refinement and Adaptive Compression for Prompt Optimization
DCI: Dual-Conditional Inversion for Boosting Diffusion-Based Image Editing
Generalizable Hand-Object Modeling from Monocular RGB Images via 3D Gaussians
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data
Environment Inference for Learning Generalizable Dynamical System
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
Beyond Average Value Function in Precision Medicine: Maximum Probability-Driven Reinforcement Learning for Survival Analysis
Fourier Clouds: Fast Bias Correction for Imbalanced Semi-Supervised Learning
Riemannian Proximal Sampler for High-accuracy Sampling on Manifolds
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
Toward Relative Positional Encoding in Spiking Transformers
INST-IT: Boosting Instance Understanding via Explicit Visual Prompt Instruction Tuning
LVLM-Driven Attribute-Aware Modeling for Visible-Infrared Person Re-Identification
No Experts, No Problem: Avoidance Learning from Bad Demonstrations
Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback
Bi-Level Decision-Focused Causal Learning for Large-Scale Marketing Optimization: Bridging Observational and Experimental Data
Learning Neural Exposure Fields for View Synthesis
MS-BART: Unified Modeling of Mass Spectra and Molecules for Structure Elucidation
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction
Random Forest Autoencoders for Guided Representation Learning
Partition to Evolve: Niching-enhanced Evolution with LLMs for Automated Algorithm Discovery
SymMaP: Improving Computational Efficiency in Linear Solvers through Symbolic Preconditioning
Mixture-of-Experts Operator Transformer for Large-Scale PDE Pre-Training
Semi-Supervised Regression with Heteroscedastic Pseudo-Labels
Personalized Federated Conformal Prediction with Localization
Dual Prototype-Enhanced Contrastive Framework for Class-Imbalanced Graph Domain Adaptation
Dynamic Gaussian Splatting from Defocused and Motion-blurred Monocular Videos
MoEMeta: Mixture-of-Experts Meta Learning for Few-Shot Relational Learning
ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model
PairEdit: Learning Semantic Variations for Exemplar-based Image Editing
Towards Reliable and Holistic Visual In-Context Learning Prompt Selection
When majority rules, minority loses: bias amplification of gradient descent
GUIDED: Granular Understanding via Identification, Detection, and Discrimination for Fine-Grained Open-Vocabulary Object Detection
Hybrid Re-matching for Continual Learning with Parameter-Efficient Tuning
Valid Selection among Conformal Sets
Minimum Width for Deep, Narrow MLP: A Diffeomorphism Approach
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Corporate Needs You to Find the Difference: Revisiting Submodular and Supermodular Ratio Optimization Problems
State Space Prompting via Gathering and Spreading Spatio-Temporal Information for Video Understanding
NopeRoomGS: Indoor 3D Gaussian Splatting Optimization without Camera Pose Input
Enhancing Bioactivity Prediction via Spatial Emptiness Representation of Protein-ligand Complex and Union of Multiple Pockets
SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures
A Unified Framework for Fair Graph Generation: Theoretical Guarantees and Empirical Advances
MGUP: A Momentum-Gradient Alignment Update Policy for Stochastic Optimization
EyeBench: Predictive Modeling from Eye Movements in Reading
HYPERION: Fine-Grained Hypersphere Alignment for Robust Federated Graph Learning
FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation
MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting
HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell data
Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models
Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding
FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution
Multi-agent KTO: Enhancing Strategic Interactions of Large Language Model in Language Game
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs
TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer
ObCLIP: Oblivious CLoud-Device Hybrid Image Generation with Privacy Preservation
Is the acquisition worth the cost? Surrogate losses for Consistent Two-stage Classifiers
Learning 3D Persistent Embodied World Models
Unlabeled Data Can Provably Enhance In-Context Learning of Transformers
LLM-PySC2: Starcraft II learning environment for Large Language Models
Rethinking the Role of Verbatim Memorization in LLM Privacy
Learning Differential Pyramid Representation for Tone Mapping
Adaptive Gradient Masking for Balancing ID and MLLM-based Representations in Recommendation
DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy
Robust learning of halfspaces under log-concave marginals
Wasserstein Convergence of Critically Damped Langevin Diffusions
DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models
LOPT: Learning Optimal Pigovian Tax in Sequential Social Dilemmas
Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-Index Models
Is Limited Participant Diversity Impeding EEG-based Machine Learning?
Towards Reliable Identification of Diffusion-based Image Manipulations
CAS-Spec: Cascade Adaptive Self-Speculative Decoding for On-the-Fly Lossless Inference Acceleration of LLMs
Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation
RiboFlow: Conditional De Novo RNA Co-Design via Synergistic Flow Matching
SOMBRL: Scalable and Optimistic Model-Based RL
Data-Free Model Extraction for Black-box Recommender Systems via Graph Convolutions
QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction
Understanding protein function with a multimodal retrieval-augmented foundation model
Learning the Plasticity: Plasticity-Driven Learning Framework in Spiking Neural Networks
IneqSearch: Hybrid Reasoning for Olympiad Inequality Proofs
Enhanced Expert Merging for Mixture-of-Experts in Graph Foundation Models
Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift
DGSolver: Diffusion Generalist Solver with Universal Posterior Sampling for Image Restoration
Jury-and-Judge Chain-of-Thought for Uncovering Toxic Data in 3D Visual Grounding
Reliable Lifelong Multimodal Editing: Conflict-Aware Retrieval Meets Multi-Level Guidance
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Optimal Minimum Width for the Universal Approximation of Continuously Differentiable Functions by Deep Narrow MLPs
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
AdaTS: Learning Adaptive Time Series Representations via Dynamic Soft Contrasts
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Advancing Expert Specialization for Better MoE
PALQO: Physics-informed model for Accelerating Large-scale Quantum Optimization
Wisdom is Knowing What not to Say: Hallucination-Free LLMs Unlearning via Attention Shifting
Better Tokens for Better 3D: Advancing Vision-Language Modeling in 3D Medical Imaging
AegisGuard: RL-Guided Adapter Tuning for TEE-Based Efficient & Secure On-Device Inference
Local-Global Coupling Spiking Graph Transformer for Brain Disorders Diagnosis from Two Perspectives
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
SmartCache: Context-aware Semantic Cache for Efficient Multi-turn LLM Inference
Magical: Medical Lay Language Generation via Semantic Invariance and Layperson-tailored Adaptation
Targeted Maximum Likelihood Learning: An Optimization Perspective
Variational Learning Finds Flatter Solutions at the Edge of Stability
Diff-ICMH: Harmonizing Machine and Human Vision in Image Compression with Generative Prior
Don’t Forget the Enjoin: FocalLoRA for Instruction Hierarchical Alignment in Large Language Models
Guided Diffusion Sampling on Function Spaces with Applications to PDEs
Leaving No OOD Instance Behind: Instance-Level OOD Fine-Tuning for Anomaly Segmentation
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
DiEP: Adaptive Mixture-of-Experts Compression through Differentiable Expert Pruning
SPACE: Noise Contrastive Estimation Stabilizes Self-Play Fine-Tuning for Large Language Models
Bridging Sign and Spoken Languages: Pseudo Gloss Generation for Sign Language Translation
SnapMoGen: Human Motion Generation from Expressive Texts
Align Your Flow: Scaling Continuous-Time Flow Map Distillation
Boosting Knowledge Utilization in Multimodal Large Language Models via Adaptive Logits Fusion and Attention Reallocation
FAST: Foreground-aware Diffusion with Accelerated Sampling Trajectory for Segmentation-oriented Anomaly Synthesis
CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning
GOOD: Training-Free Guided Diffusion Sampling for Out-of-Distribution Detection
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
State Entropy Regularization for Robust Reinforcement Learning
SRSR: Enhancing Semantic Accuracy in Real-World Image Super-Resolution with Spatially Re-Focused Text-Conditioning
Intrinsic Benefits of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration in Reinforcement Learning
On the rankability of visual embeddings
Adaptive and Multi-scale Affinity Alignment for Hierarchical Contrastive Learning
Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling
Problem-Parameter-Free Decentralized Bilevel Optimization
Robust Regression of General ReLUs with Queries
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
SIGMA: Refining Large Language Model Reasoning via Sibling-Guided Monte Carlo Augmentation
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
Safety Depth in Large Language Models: A Markov Chain Perspective
Social World Model-Augmented Mechanism Design Policy Learning
Regional Explanations: Bridging Local and Global Variable Importance
Better Estimation of the Kullback–Leibler Divergence Between Language Models
SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation
QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation
DUO: No Compromise to Accuracy Degradation
Understanding Bias Terms in Neural Representations
Active Seriation: Efficient Ordering Recovery with Statistical Guarantees
Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation
Learning Juntas under Markov Random Fields
VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models
One Prompt Fits All: Universal Graph Adaptation for Pretrained Models
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling
Efficient Pre-Training of LLMs via Topology-Aware Communication Alignment on More Than 9600 GPUs
Searching Efficient Semantic Segmentation Architectures via Dynamic Path Selection
Continuous Concepts Removal in Text-to-image Diffusion Models
DIFFSSR: Stereo Image Super-resolution Using Differential Transformer
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding
Confusion-Driven Self-Supervised Progressively Weighted Ensemble Learning for Non-Exemplar Class Incremental Learning
Prior-Guided Flow Matching for Target-Aware Molecule Design with Learnable Atom Number
VETA-DiT: Variance-Equalized and Temporally Adaptive Quantization for Efficient 4-bit Diffusion Transformers
ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks
Zeroth-Order Optimization Finds Flat Minima
Neural Entropy
Continuous Diffusion Model for Language Modeling
CaliGCL: Calibrated Graph Contrastive Learning via Partitioned Similarity and Consistency Discrimination
Dependency Matters: Enhancing LLM Reasoning with Explicit Knowledge Grounding
Computation and Memory-Efficient Model Compression with Gradient Reweighting
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis
SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency
Learnable Sampler Distillation for Discrete Diffusion Models
Prot2Text-V2: Protein Function Prediction with Multimodal Contrastive Alignment
MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning
Dynamic Siamese Expansion Framework for Improving Robustness in Online Continual Learning
Sparse Optimistic Information Directed Sampling
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Automated Model Discovery via Multi-modal & Multi-step Pipeline
Rethinking Hebbian Principle: Low-Dimensional Structural Projection for Unsupervised Learning
Mitigating Occlusions in Virtual Try-On via A Simple-Yet-Effective Mask-Free Framework
Quantifying Uncertainty in Error Consistency: Towards Reliable Behavioral Comparison of Classifiers
Topology-Aware Learning of Tubular Manifolds via SE(3)-Equivariant Network on Ball B-Spline Curve
Uncertainty-Calibrated Prediction of Randomly-Timed Biomarker Trajectories with Conformal Bands
Knee-Deep in C-RASP: A Transformer Depth Hierarchy
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
Accelerating Model-Free Optimization via Averaging of Cost Samples
LaViDa: A Large Diffusion Model for Vision-Language Understanding
DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
DeltaPhi: Physical States Residual Learning for Neural Operators in Data-Limited PDE Solving
Neural Hamiltonian Diffusions for Modeling Structured Geometric Dynamics
Metritocracy: Representative Metrics for Lite Benchmarks
Listwise Preference Diffusion Optimization for User Behavior Trajectories Prediction
Universally Invariant Learning in Equivariant GNNs
Adversarial Graph Fusion for Incomplete Multi-view Semi-supervised Learning with Tensorial Imputation
ComRank: Ranking Loss for Multi-Label Complementary Label Learning
ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding
DynaNav: Dynamic Feature and Layer Selection for Efficient Visual Navigation
$\Delta \mathrm{Energy}$: Optimizing Energy Change During Vision-Language Alignment Improves both OOD Detection and OOD Generalization
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
What Makes a Reward Model a Good Teacher? An Optimization Perspective
Learning 3D Anisotropic Noise Distributions Improves Molecular Force Fields
DSCS: Fast CPDAG-Based Verification of Collapsible Submodels in High-Dimensional Bayesian Networks
Large Language Models as End-to-end Combinatorial Optimization Solvers
Hypergraph-Enhanced Contrastive Learning for Multi-View Clustering with Hyper-Laplacian Regularization
Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing
On the Sample Complexity of Differentially Private Policy Optimization
Noise Consistency Training: A Native Approach for One-step Generator in Learning Additional Controls
Ascent Fails to Forget
SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning
Generalizing Single-Frame Supervision to Event-Level Understanding for Video Anomaly Detection
Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection
NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses
Breaking the Compression Ceiling: Data-Free Pipeline for Ultra-Efficient Delta Compression
AdvEDM: Fine-grained Adversarial Attack against VLM-based Embodied Agents
Entropy Rectifying Guidance for Diffusion and Flow Models
GeGS-PCR: Fast and Robust Color 3D Point Cloud Registration with Two-Stage Geometric-3DGS Fusion
Elastic Robust Unlearning of Specific Knowledge in Large Language Models
From Pose to Muscle: Multimodal Learning for Piano Hand Muscle Electromyography
End-to-End Low-Light Enhancement for Object Detection with Learned Metadata from RAWs
MS-GS: Multi-Appearance Sparse-View 3D Gaussian Splatting in the Wild
Technical Debt in In-Context Learning: Diminishing Efficiency in Long Context
ShoeFit: A New Dataset and Dual-image-stream DiT Framework for Virtual Footwear Try-On
Beyond Modality Collapse: Representation Blending for Multimodal Dataset Distillation
A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models
Retrieval is Not Enough: Enhancing RAG through Test-Time Critique and Optimization
Relieving the Over-Aggregating Effect in Graph Transformers
Don't Just Chase “Highlighted Tokens” in MLLMs: Revisiting Visual Holistic Context Retention
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding
Statistical Inference for Decentralized Federated Learning
Towards Principled Unsupervised Multi-Agent Reinforcement Learning
Online Statistical Inference in Decision Making with Matrix Context
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability
GMV: A Unified and Efficient Graph Multi-View Learning Framework
Versatile differentially private learning for general loss functions
Constrained Linear Thompson Sampling
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Geometric Learning with Positively Decomposable Kernels
RePIC: Reinforced Post-Training for Personalizing Multi-Modal Language Models
RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control
CSPCL: Category Semantic Prior Contrastive Learning for Deformable DETR-Based Prohibited Item Detectors
On the Stability and Generalization of Meta-Learning: the Impact of Inner-Levels
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
ICLScan: Detecting Backdoors in Black-Box Large Language Models via Targeted In-context Illumination
Foundation Models for Scientific Discovery: From Paradigm Enhancement to Paradigm Transition
Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
NeurIPS should lead scientific consensus on AI policy
World Models Should Prioritize the Unification of Physical and Social Dynamics
Sample-Conditional Coverage in Split-Conformal Prediction
MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM
Semantic-guided Diverse Decoding for Large Language Model
Noise-Robustness Through Noise: A Framework combining Asymmetric LoRA with Poisoning MoE
LLM Generated Persona is a Promise with a Catch
Setting $\varepsilon$ is not the Issue in Differential Privacy
Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
S$^2$M-Former: Spiking Symmetric Mixing Branchformer for Brain Auditory Attention Detection
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning
Prompting as Scientific Inquiry
PanTS: The Pancreatic Tumor Segmentation Dataset
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer
The Adaptive Complexity of Minimizing Relative Fisher Information
HPSERec: A Hierarchical Partitioning and Stepwise Enhancement Framework for Long-tailed Sequential Recommendation
Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference
DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method
One-Step Diffusion-Based Image Compression with Semantic Distillation
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting
A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
EPA: Boosting Event-based Video Frame Interpolation with Perceptually Aligned Learning
Satellites Reveal Mobility: A Commuting Origin-destination Flow Generator for Global Cities
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation
Scalable Feature Learning on Huge Knowledge Graphs for Downstream Machine Learning
How Far Are We from Optimal Reasoning Efficiency?
Asymptotically exact variational flows via involutive MCMC kernels
Simultaneous Statistical Inference for Off-Policy Evaluation in Reinforcement Learning
Causal Discovery and Inference through Next-Token Prediction
On Efficiency-Effectiveness Trade-off of Diffusion-based Recommenders
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization
The Catechol Benchmark: Time-series Solvent Selection Data for Few-shot Machine Learning
MedSG-Bench: A Benchmark for Medical Image Sequences Grounding
Simple and Efficient Heterogeneous Temporal Graph Neural Network
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning
Can LLMs Correct Themselves? A Benchmark of Self-Correction in LLMs
NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation
Pragmatic Heterogeneous Collaborative Perception via Generative Communication Mechanism
GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation
See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model
SALMONN-omni: A Standalone Speech LLM without Codec Injection for Full-duplex Conversation
Linear Attention for Efficient Bidirectional Sequence Modeling
StreamForest: Efficient Online Video Understanding with Persistent Event Memory
FRN: Fractal-Based Recursive Spectral Reconstruction Network
On the SAC-BL Algorithm for Anomaly Detection
Analog Foundation Models
L2RSI: Cross-view LiDAR-based Place Recognition for Large-scale Urban Scenes via Remote Sensing Imagery
Steering Information Utility in Key-Value Memory for Language Model Post-Training
Resounding Acoustic Fields with Reciprocity
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents
LithoSim: A Large, Holistic Lithography Simulation Benchmark for AI-Driven Semiconductor Manufacturing
Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator
TensorRL-QAS: Reinforcement learning with tensor networks for improved quantum architecture search
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Dynamic Bundling with Large Language Models for Zero-Shot Inference on Text-Attributed Graphs
SpEx: A Spectral Approach to Explainable Clustering
Non-Stationary Lipschitz Bandits
The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models
MMPB: It’s Time for Multi-Modal Personalization
STAR-Bets: Sequential TArget-Recalculating Bets for Tighter Confidence Intervals
D$^2$GS: Dense Depth Regularization for LiDAR-free Urban Scene Reconstruction
Flattening Hierarchies with Policy Bootstrapping
Learning to Watermark: A Selective Watermarking Framework for Large Language Models via Multi-Objective Optimization
PC-Net: Weakly Supervised Compositional Moment Retrieval via Proposal-Centric Network
Interaction-Centric Knowledge Infusion and Transfer for Open Vocabulary Scene Graph Generation
Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables
Universal Visuo-Tactile Video Understanding for Embodied Interaction
MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization
Incomplete Multi-view Deep Clustering with Data Imputation and Alignment
Beyond Verifiable Rewards: Scaling Reinforcement Learning in Language Models to Unverifiable Data
VideoLucy: Deep Memory Backtracking for Long Video Understanding
End-to-End Vision Tokenizer Tuning
Gradient Descent as Loss Landscape Navigation: a Normative Framework for Deriving Learning Rules
BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem
Compress & Cache: Vision token compression for efficient generation and retrieval
A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees
Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold
Improving Bilinear RNN with Closed-loop Control
Active Test-time Vision-Language Navigation
MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis
Lifelong Safety Alignment for Language Models
Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering
High Dynamic Range Imaging with Time-Encoding Spike Camera
Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM
UniZyme: A Unified Protein Cleavage Site Predictor Enhanced with Enzyme Active-Site Knowledge
OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-time Emotional Speech Synthesis
Adaptive Fission: Post-training Encoding for Low-latency Spike Neural Networks
PAID: Pairwise Angular-Invariant Decomposition for Continual Test-Time Adaptation
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Sinusoidal Initialization, Time for a New Start
Learning Cocoercive Conservative Denoisers via Helmholtz Decomposition for Poisson Imaging Inverse Problems
UtilGen: Utility-Centric Generative Data Augmentation with Dual-Level Task Adaptation
Lookahead Routing for Large Language Models
Accelerating Block Coordinate Descent for LLM Finetuning via Landscape Expansion
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-training
Generation as Search Operator for Test-Time Scaling of Diffusion-based Combinatorial Optimization
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface
Feature-aware Modulation for Learning from Temporal Tabular Data
MigGPT: Harnessing Large Language Models for Automated Migration of Out-of-Tree Linux Kernel Patches Across Versions
Instance-Level Composed Image Retrieval
PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference
DeblurDiff: Real-World Image Deblurring with Generative Diffusion Models
NavBench: Probing Multimodal Large Language Models for Embodied Navigation
STAIR: Addressing Stage Misalignment through Temporal-Aligned Preference Reinforcement Learning
FastLongSpeech: Enhancing Large Speech-Language Models for Efficient Long-Speech Processing
MARS: A Malignity-Aware Backdoor Defense in Federated Learning
AdaptGrad: Adaptive Sampling to Reduce Noise
Zero-Shot Detection of LLM-Generated Text via Implicit Reward Model
MTRec: Learning to Align with User Preferences via Mental Reward Models
TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles
Enhancing Contrastive Learning with Variable Similarity
Unifying Reconstruction and Density Estimation via Invertible Contraction Mapping in One-Class Classification
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks
Spectral Convolutional Conditional Neural Process
Purity Law for Neural Routing Problem Solvers with Enhanced Generalizability
Reasoning is Periodicity? Improving Large Language Models Through Effective Periodicity Modeling
Multi-Modal Interactive Agent Layer for Few-Shot Universal Cross-Domain Retrieval and Beyond
Price of Parsimony: Complexity of Fourier Sparsity Testing
Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
CrypticBio: A Large Multimodal Dataset for Visually Confusing Species
Preventing Shortcuts in Adapter Training via Providing the Shortcuts
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents
MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents
GlobalTomo: A global dataset for physics-ML seismic wavefield modeling and FWI
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval
SolidGeo: Measuring Multimodal Spatial Math Reasoning in Solid Geometry
SafeVid: Toward Safety Aligned Video Large Multimodal Models
InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts
Listening to the Brain: Multi-Band sEEG Auditory Reconstruction via Dynamic Spatio-Temporal Hypergraphs
Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
SwitchLingua: The First Large-Scale Multilingual and Multi-Ethnic Code-Switching Dataset
GenSpace: Benchmarking Spatially-Aware Image Generation
SonoGym: High Performance Simulation for Challenging Surgical Tasks with Robotic Ultrasound
V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception
Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants
Rethinking Evaluation of Infrared Small Target Detection
OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata
Towards Evaluating Proactive Risk Awareness of Multimodal Language Models
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
AnomalyCoT: A Multi-Scenario Chain-of-Thought Dataset for Multimodal Large Language Models
MedChain: Bridging the Gap Between LLM Agents and Clinical Practice with Interactive Sequence
MedicalNarratives: Connecting Medical Vision and Language with Localized Narratives
Bag of Tricks for Inference-time Computation of LLM Reasoning