

NeurIPS 2026 AI-Assisted Reviewing Experiment

 

Summary

NeurIPS 2026 is conducting a voluntary AI-assisted reviewing experiment to study how reviewers do, and could, interact with large language models during peer review, and how different forms of assistance affect review quality, reviewer behavior, and the review process. Participating reviewers will be randomly assigned, per eligible paper, to one of three conditions: (1) no LLM assistance, (2) open-ended LLM assistance, or (3) structured LLM assistance. The LLM interface will be integrated into OpenReview and will be available only for papers whose authors have opted into the experiment, and only for paper-reviewer assignments that have been assigned condition (2) or (3). After the review process, area chairs, blind to the condition, will be asked to assess the quality and helpfulness of the reviews, and reviewers will be asked to provide feedback about their experience under the conditions they were assigned.

 

The LLM assistant in the experiment is intended to help reviewers think through and understand submissions, and to assist in assimilation, analysis, and any background research necessary to generate high-quality reviews. It is not intended to replace reviewer judgment or produce a review on the reviewer’s behalf.

 

Except for this review experiment, NeurIPS does not sanction any other use of LLMs during the review process. Any such use constitutes a violation of integrity policies and may result in consequences for reviewers and their submitted papers, including desk rejection.

 

Guiding Principles

  • Human judgment is augmented, not replaced. The LLM tool assists the human reviewers; it does not replace any humans in the review process.
  • Informed author consent. A paper is included in the experiment only if its authors opt in.
  • Voluntary reviewer participation. Reviewers only participate in the experiment if they volunteer.
  • Observational inference. As an experiment, we will study the impact of AI assistance, not assume it. The goal is to rigorously measure effects on review quality, reviewer behavior, and downstream discussion.
  • Confidentiality and privacy are upheld to the highest standards. The process is in accordance with NeurIPS review policies and ethics guidelines, and the experiment has been reviewed by IRBs at multiple institutions. LLMs used in the process operate with zero data retention: no data is stored or logged by the LLM providers.

 

Author and Reviewer Participation

Authors and reviewers will be asked to volunteer and opt in to the experiment.

Authors will have the opportunity to choose, at the time of submission, whether to include their paper in the experiment. If they choose not to include their paper in the experiment, they will not be invited to participate as reviewers either.

Reviewers will have the opportunity to volunteer to participate in the experiment via a recruitment form sent to eligible reviewers: those who are either not authors of a submitted paper, or who have opted their paper into the experiment.

Author and reviewer participation in the experiment is entirely voluntary, and will have no bearing on paper decisions or other outcomes of the conference.

 

Experiment Conditions

In this experiment, participating reviewers will be randomly assigned to one of three conditions for each paper assigned to them. Condition 1 is an unassisted peer review. Conditions 2 and 3 offer the opportunity to interact with an LLM through the OpenReview interface to assist in peer review: Condition 2 is open-ended with minimal guidance, while Condition 3 includes more structured guidance for interacting with the LLM. Conditions are assigned independently and randomly to reviewer-paper assignments, so each reviewer will be assigned multiple conditions across their different paper assignments, and reviewers will be notified which of their assignments fall under which condition.

Reviewers participating in the experiment must follow the condition assigned to each paper-reviewer assignment. For assignments in Condition 1, reviewers should not use external LLM tools for that review. For Conditions 2 and 3, reviewers should use only the experiment-provided LLM interface when using LLM assistance on the assigned paper.

The review experience for each condition is detailed below.

Condition 1: Unassisted review

The reviewer completes the review entirely on their own, with no LLM assistance, and does not have access to the custom LLM interface for this reviewer-paper assignment.

Condition 2: Open-ended LLM assistance

The reviewer has access to a custom LLM assistant through a conversation interface in OpenReview. The LLM responds to reviewer input, and no further structural guidance or restrictions are imposed on its output.

Condition 3: Structured LLM assistance

The reviewer has access to a custom LLM assistant through a conversation interface in OpenReview. The LLM offers paper-specific assistance and volunteers to perform tasks it determines may be particularly helpful for reviewing that paper, and its outputs may be structured to make the information easier to assimilate. In addition to accepting offered assistance, the reviewer may ask the LLM to perform any other tasks to assist in the review as they see fit.

 

Area chairs, blind to the condition of the review, will be asked to provide review assessments.

We will analyze the impact of the different conditions on the reviews, as well as the types of interactions reviewers engaged in.
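As a purely illustrative sketch, one simple way to compare the condition-blind assessments across the three conditions is shown below; the data columns (condition, ac_rating) and the use of a one-way ANOVA are assumptions for the example, not a description of the planned analysis.

```python
# Hypothetical sketch: compare area-chair review-quality ratings across conditions.
# The column names ("condition", "ac_rating") and the one-way ANOVA are illustrative
# assumptions; they are not the experiment's registered analysis plan.
import pandas as pd
from scipy.stats import f_oneway


def compare_conditions(ratings: pd.DataFrame) -> float:
    """ratings: one row per review, with columns 'condition' (1/2/3) and 'ac_rating'."""
    groups = [g["ac_rating"].to_numpy() for _, g in ratings.groupby("condition")]
    stat, p_value = f_oneway(*groups)  # tests whether mean ratings differ by condition
    print(f"F = {stat:.3f}, p = {p_value:.3f}")
    return p_value
```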

 

Guardrails

The experiment will include a combination of automated and semi-automated checks that serve as guardrails. All interactions in the experiment will be visible to the experiment chairs and program chairs. Custom automated monitoring will be in place to ensure interactions stay on-topic and relevant to review assistance, and conversations may be flagged for additional human review.
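To illustrate the idea only, a minimal flagging check might look like the sketch below; the marker list, substring heuristic, and function name are assumptions and do not describe the experiment's actual monitoring system.

```python
# Hypothetical sketch: flag conversation turns for additional human review.
# The marker list and substring heuristic are illustrative assumptions only.
OFF_TOPIC_MARKERS = (
    "write the full review for me",
    "reveal the authors",
    "unrelated to this paper",
)


def needs_human_review(conversation_turns: list[str]) -> bool:
    """Return True if any turn matches a simple off-topic/misuse marker."""
    lowered = (turn.lower() for turn in conversation_turns)
    return any(marker in turn for turn in lowered for marker in OFF_TOPIC_MARKERS)
```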

 

Randomization And Blinding

After the paper-reviewer assignment process, conditions (1), (2), and (3) will be assigned to reviewer-paper pairs drawn from papers that opted in and reviewers who volunteered for the experiment. A randomized constrained optimization procedure will be used to make these assignments, with the following constraints and objectives.

Constraints will include:

  • Only papers that opted in to the experiment will be considered
  • Only reviewers who volunteered for the experiment will be considered

The objectives will include:

  • Each paper in the experiment should get one review of each condition (1), (2), and (3)
  • An even distribution of conditions (1), (2), and (3) across paper-reviewer assignments
  • An even distribution of conditions across each volunteering reviewer's assignments

Due to the specific balance of papers opted in, reviewers volunteering, reviewing loads, and reviewer assignments, the objectives may not be met exactly (e.g., there may be a few more reviewer-paper assignments in one condition than in another).
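As an illustration only, the sketch below shows one way a randomized, balance-aware assignment of conditions could be implemented; the function name (assign_conditions), data structures, and greedy balancing heuristic are assumptions for the example, not the experiment's actual procedure.

```python
import random
from collections import defaultdict


# Hypothetical sketch: assign conditions 1-3 to (paper, reviewer) pairs so that each
# opted-in paper ideally receives one review under each condition, while keeping each
# volunteering reviewer's mix of conditions roughly even. Illustrative only.
def assign_conditions(assignments, seed=0):
    """assignments: dict mapping paper_id -> list of volunteering reviewer_ids."""
    rng = random.Random(seed)
    reviewer_counts = defaultdict(lambda: {1: 0, 2: 0, 3: 0})  # per-reviewer balance
    result = {}  # (paper_id, reviewer_id) -> condition

    for paper, reviewers in assignments.items():
        # One review of each condition per paper where possible; extras drawn at random.
        conditions = [1, 2, 3] * (len(reviewers) // 3) + rng.sample([1, 2, 3], len(reviewers) % 3)
        # Give each reviewer the condition they have been assigned least often so far.
        for reviewer in rng.sample(reviewers, len(reviewers)):
            cond = min(conditions, key=lambda c: reviewer_counts[reviewer][c])
            conditions.remove(cond)
            reviewer_counts[reviewer][cond] += 1
            result[(paper, reviewer)] = cond
    return result


# Toy usage: two papers, five volunteering reviewers.
if __name__ == "__main__":
    toy = {"paper_A": ["r1", "r2", "r3"], "paper_B": ["r1", "r4", "r5"]}
    print(assign_conditions(toy))
```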

 

IRB Determination

This research study, under IRB Protocol #STUDY00009083, has been reviewed by The University of Texas at Austin Institutional Review Board, which determined that the protocol meets the criteria for exemption from IRB review under 45 CFR 46.104 (2)(i): Tests, surveys, interviews, or observation (non-identifiable).

 

Contact Information

  • NeurIPS 2026 Experiment Co-Chairs: Joydeep Biswas (joydeepb@cs.utexas.edu), Laurent Charlin (lcharlin@gmail.com)
  • Institutional Review Board (IRB): The University of Texas at Austin Institutional Review Board, Phone: 512-232-1543, Email: irb@austin.utexas.edu (for questions about your rights as a participant)