Workshop on Computer Assisted Programming (CAP)
Augustus Odena · Charles Sutton · Nadia Polikarpova · Josh Tenenbaum · Armando Solar-Lezama · Isil Dillig
There are many tasks that could be automated by writing computer programs, but most people do not know how to program. Building tools for computer-assisted programming could thus improve the lives of many people (and it is also an interesting research problem!). This is the subject of program synthesis: the study of how to automatically write programs from user specifications. There has been substantial recent interest in the ML community in this problem, as evidenced by the increased volume of program synthesis submissions to ICML, ICLR, and NeurIPS.
Despite this recent work, many exciting questions remain open, such as how to combine symbolic reasoning over programs with deep learning, how to represent programs and user specifications, and how to apply program synthesis within computer vision, robotics, and other control problems. There is also work to be done on fusing ML research with research on programming languages (PL) through collaboration between the two communities, and there remains the challenge of establishing benchmarks that allow for easy comparison and measurement of progress. The CAP workshop aims to address these points. It will bring together researchers in programming languages, machine learning, and related areas who are interested in program synthesis and other methods for automatically writing programs from a specification of intended behavior.
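As a concrete illustration of writing programs from user specifications, here is a minimal enumerative synthesizer that searches for a composition of primitives consistent with input-output examples. The three-primitive DSL, the function names, and the examples are invented for this sketch and are not drawn from any system presented at the workshop.

```python
from itertools import product

# A made-up DSL of unary integer functions (illustrative only).
PRIMITIVES = {
    "add1": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def synthesize(examples, max_depth=3):
    """Return the first composition of DSL primitives consistent with
    all (input, output) examples, searching shortest programs first."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def run(x, names=names):
                for n in names:
                    x = PRIMITIVES[n](x)
                return x
            if all(run(i) == o for i, o in examples):
                return list(names)
    return None

# Specification by examples: f(x) = (x + 1) * 2
print(synthesize([(1, 4), (2, 6), (5, 12)]))  # ['add1', 'double']
```

Even this toy version exhibits the core tension the workshop topics address: the search space grows exponentially with program depth, which is why learned guidance and richer specifications matter.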
Schedule
Sat 8:30 a.m. - 8:40 a.m. | Welcome Talk | Augustus Odena
Sat 8:40 a.m. - 9:10 a.m. | Keynote Talk: Sumit Gulwani

Title: New Directions in Programming by Examples

Abstract: Programming by examples (PBE) involves synthesizing programs in an underlying domain-specific language from input-output examples. Our journey in developing usable PBE systems has motivated two kinds of advances: (a) development of algorithms that can synthesize intended programs in real time and from very few examples, and (b) variants of the classical PBE problem, including predictive synthesis and modeless synthesis. We have leveraged logical reasoning techniques, and their integration with machine learning techniques, to develop effective PBE solutions for several domains, including string/datatype transformations, table extraction from semi-structured documents (e.g., custom text files, webpages, PDFs), and repetitive edits in code. These solutions have shipped inside various mass-market products, including Excel, Power BI, Visual Studio, and SQL Server Management Studio. In this talk, I will describe these applications, the technical advances, and the form factors inside different products.

Bio: Sumit Gulwani is a computer scientist connecting ideas, research and practice, and people with varied roles. He invented the popular Flash Fill feature in Excel and has shipped program synthesis innovations across multiple Microsoft products (Office, SQL, Visual Studio, PowerShell, Power Query), having authored 65+ patent applications. He has co-authored 10 award-winning papers (including test-of-time awards from ICSE and POPL) among 130+ research publications across multiple computer science areas, and has delivered 50+ keynotes and invited talks. He has received the Robin Milner Young Researcher Award, the ACM SIGPLAN Outstanding Doctoral Dissertation Award (PhD from UC Berkeley), and the President's Gold Medal from IIT Kanpur.
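As a toy illustration of the PBE setting described above (not the actual DSL or ranking behind Flash Fill, which are far richer), a synthesizer can search a tiny space of split-and-select string programs for one consistent with the user's examples. The delimiter set and function names are invented for this sketch.

```python
# Hypothetical program space: split the input on a delimiter, take one token.
DELIMS = [" ", ",", "-", "."]

def synthesize_pbe(examples):
    """Search for (delimiter, index) such that splitting each input on the
    delimiter and taking the token at that index reproduces every output."""
    for d in DELIMS:
        for i in range(-3, 3):
            try:
                if all(inp.split(d)[i] == out for inp, out in examples):
                    return d, i
            except IndexError:
                continue
    return None

def run(prog, s):
    d, i = prog
    return s.split(d)[i]

prog = synthesize_pbe([("John Smith", "Smith"), ("Jane Doe", "Doe")])
print(prog)                       # (' ', -1)
print(run(prog, "Ada Lovelace"))  # Lovelace
```

The "very few examples" point in the abstract shows up even here: two examples suffice to pin down this program, but a single example would leave several consistent candidates, which is where ranking becomes essential.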
Sat 9:10 a.m. - 9:40 a.m. | Keynote Talk: Roopsha Samanta

Title: Mantis: Semantics-Guided Inductive Program Synthesis

Abstract: The dream of classical program synthesis is to generate programs from complete, formal specifications of their expected behavior. An increasingly favored paradigm of synthesis is inductive program synthesis, where specifications of program behavior are provided in the form of examples. Inductive program synthesis not only helps make program synthesis more tractable, but also has the potential to democratize programming! Unfortunately, inductive synthesis engines encounter challenges like overfitting, ambiguity, and brittleness, similar to other inductive learning engines. PL researchers have typically attacked these problems by applying syntactic biases to the search space in the form of tailored domain-specific languages, grammars, and ranking functions. In this talk, I will show how one can further enhance the generalizability and robustness of such synthesis engines by applying semantic biases to the search space.

Bio: Roopsha Samanta is an Assistant Professor in the Department of Computer Science at Purdue University. She leads the Purdue Formal Methods (PurForM) group and is a member of the Purdue Programming Languages (PurPL) group. Before joining Purdue in 2016, she completed her PhD at UT Austin in 2013, advised by E. Allen Emerson and Vijay K. Garg, and was a postdoctoral researcher at IST Austria from 2014 to 2016 with Thomas A. Henzinger. She is a recipient of the 2019 NSF CAREER award. Her research interests are in program verification, program synthesis, and concurrency. She likes to work at the intersection of formal methods and programming languages to develop frameworks that help programmers write reliable programs. Her current research agenda is centered around two themes: formal reasoning about distributed systems and semantics-guided inductive program synthesis. https://www.cs.purdue.edu/homes/roopsha/
Sat 9:40 a.m. - 10:10 a.m. | Spotlight Session 1 | Augustus Odena · Maxwell Nye · Disha Shrivastava · Mayank Agarwal · Vincent J Hellendoorn · Charles Sutton
Sat 10:10 a.m. - 11:00 a.m. | Poster Session 1
Sat 11:00 a.m. - 11:30 a.m. | Keynote Talk: Swarat Chaudhuri

Title: Neural Attribute Grammars for Semantics-Guided Program Generation

Abstract: I will talk about Neural Attribute Grammars (NAG), a framework for deep statistical generation of source code modulo language-level semantic requirements (such as type safety or initialization of variables before use). Neural models for source code have received significant attention in the recent past. However, these models tend to be trained on syntactic program representations and consequently often generate programs that violate essential semantic invariants. In contrast, the NAG framework exposes the semantics of the target language to the training procedure for the neural model using attribute grammars. During training, the model learns to replicate the relationship between the syntactic rules used to construct a program and the semantic attributes (for example, symbol tables) of the context in which each rule is fired. In the talk, I will give some concrete examples of NAGs and show how to use them in the conditional generation of Java programs. I will demonstrate that these NAGs generate semantically "sensible" programs with significantly higher frequency than traditional neural models of source code. (This talk is based on joint work with Rohan Mukherjee, Chris Jermaine, Tom Reps, Dipak Chaudhari, and Matt Amodio.)

Bio: Swarat Chaudhuri is an Associate Professor of computer science at the University of Texas at Austin. His research studies topics in the intersection of machine learning and programming languages, including program induction, probabilistic programming, neurosymbolic programming, programmatically interpretable/explainable learning, learning-accelerated formal reasoning, and formally certified learning. Swarat received a bachelor's degree from the Indian Institute of Technology, Kharagpur, in 2001, and a doctoral degree from the University of Pennsylvania in 2007. Before joining UT Austin, he held faculty positions at Rice University and the Pennsylvania State University. He is a recipient of the National Science Foundation CAREER award, the ACM SIGPLAN John Reynolds Doctoral Dissertation Award, and the Morris and Dorothy Rubinoff Dissertation Award from the University of Pennsylvania.
Sat 11:30 a.m. - 12:00 p.m. | Keynote Talk: Elena Glassman

Title: Increasing the Power of [Human + Program Synthesis] through Interface Design

Abstract: Program synthesis is a powerful tool for generating programs, but in the hands of users, its potential can be severely limited by unanticipated usability obstacles. In this talk, I will describe several key usability obstacles and new synthesis-powered interaction mechanisms that help users get past these obstacles to their goal: a program that behaves the way they want it to.

Bio: Elena Glassman is an Assistant Professor of Computer Science at the Harvard Paulson School of Engineering & Applied Sciences and the Stanley A. Marks & William H. Marks Professor at the Radcliffe Institute for Advanced Study, specializing in human-computer interaction. At MIT, she earned a PhD and MEng in Electrical Engineering and Computer Science and a BS in Electrical Science and Engineering. Before joining Harvard, she was a postdoctoral scholar in Electrical Engineering and Computer Science at the University of California, Berkeley, where she received the Berkeley Institute for Data Science Moore/Sloan Data Science Fellowship.
Sat 12:00 p.m. - 12:30 p.m. | Spotlight Session 2 | Augustus Odena · Kensen Shi · David Bieber · Ferran Alet · Charles Sutton · Roshni Iyer
Sat 12:30 p.m. - 1:00 p.m. | Keynote Talk: Kevin Ellis

Title: Growing Generalizable, Interpretable Knowledge with Wake-Sleep Program Learning

Abstract: Two challenges in engineering program synthesis systems are: (1) crafting specialized yet expressive domain-specific languages, and (2) designing search algorithms that can tractably explore the space of expressions in this domain-specific language. We take a step toward the joint learning of a domain-specific language and the search algorithm that performs synthesis in that language. We propose an algorithm that starts with a relatively minimal domain-specific language and then enriches that language by compressing out common syntactic patterns into a library of reusable domain-specific code. In tandem, the system trains a neural network to guide search over expressions in the growing language. From a machine learning perspective, this system implements a wake-sleep algorithm similar to the Helmholtz machine. We apply this algorithm to AI and program synthesis problems, with the goal of understanding how domain-specific languages and neural program synthesizers can mutually bootstrap one another.

Bio: Kevin Ellis works across program synthesis and artificial intelligence. He focuses on using machine learning to develop better program synthesis algorithms, and on applications of program synthesis to graphics and natural language. He recently finished his PhD at MIT, co-advised by Josh Tenenbaum and Armando Solar-Lezama, and is working as a research scientist at Common Sense Machines before starting as an assistant professor at Cornell in summer 2021.
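The "compressing out common syntactic patterns" step in the abstract can be caricatured in a few lines: over a corpus of already-synthesized programs, count recurring fragments and promote the most frequent one to a new library routine. The corpus, primitive names, and flat-sequence representation here are all made up for illustration; the actual system abstracts over tree-structured expressions with variables.

```python
from collections import Counter

# Hypothetical corpus of previously synthesized programs (flat op sequences).
corpus = [
    ["add1", "double", "square"],
    ["add1", "double", "add1"],
    ["square", "add1", "double"],
]

# "Sleep" compression step: the most frequent adjacent pair of primitives
# becomes a candidate library routine for future synthesis.
pairs = Counter(
    (p[i], p[i + 1]) for p in corpus for i in range(len(p) - 1)
)
best, count = pairs.most_common(1)[0]
print(best, count)  # ('add1', 'double') 3
```

After promoting `('add1', 'double')` to a single primitive, programs that use it become shorter, which is what makes subsequent search tractable as the library grows.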
Sat 1:00 p.m. - 2:30 p.m. | Poster Session 2
Sat 2:30 p.m. - 3:00 p.m. | Keynote Talk: Satish Chandra | Satish Chandra · Augustus Odena · Charles Sutton

Title: Automatic Program Repair Using Getafix

Abstract: Developers spend a significant amount of their time fixing bugs. Fixes are often repetitive, so it appears that some portion of this work should be automated. Indeed, some recent approaches offer automation, but these typically explore a large space of potential fixes by making varying combinations of mutations, trying them all until one passes the test suite. This is not only computationally expensive, but the suggested fixes may not look natural to a developer. We present Getafix, a tool that offers readable bug fixes without requiring massive computational resources. Getafix learns from your bug-fix history. It extracts past code changes that fixed bugs and learns, in an offline phase, a set of templates from those fixes. As new bug reports appear, Getafix uses these templates to create and rank a set of suggestions in mere seconds, offering fixes that resemble human-made fixes. At Facebook, Getafix has been used to auto-fix bugs reported by static analysis tools like Infer.
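To make the template idea concrete: a fix template maps a buggy code shape to a repaired shape. The sketch below hand-writes a single hypothetical template (guarding a possibly-None dereference) and applies it textually; Getafix itself mines such templates automatically from the bug-fix history and works at the syntax-tree level, not on raw lines.

```python
import re

# One hand-written (hypothetical) fix template: a lone `obj.method()` line
# becomes the same call guarded by a None check, preserving indentation.
TEMPLATE = (
    r"^(\s*)(\w+)\.(\w+)\(\)$",
    r"\1if \2 is not None:\n\1    \2.\3()",
)

def apply_fix(line):
    """Apply the template to one source line; non-matching lines pass through."""
    pattern, replacement = TEMPLATE
    return re.sub(pattern, replacement, line)

print(apply_fix("    handle.close()"))
# prints:
#     if handle is not None:
#         handle.close()
```

Ranking enters when several mined templates match the same buggy site; the tool's speed comes from matching templates rather than mutating and re-running the test suite.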
Sat 3:00 p.m. - 3:30 p.m. | Keynote Talk: Xinyun Chen

Title: Deep Learning for Program Synthesis from Input-Output Examples

Abstract: There has been an emerging interest in applying machine learning-based techniques, especially deep neural networks, for program synthesis. However, because of some unique characteristics of the program domain, directly applying deep learning techniques developed for other applications is generally inappropriate. In this talk, I will present my work on program synthesis from input-output examples, aiming at synthesizing programs with higher complexity and better generalization. I will first discuss our work on execution-guided synthesis, where we develop approaches that leverage the execution results of both partial and full programs. In the second part of my talk, I will discuss our work on neural-symbolic architectures for compositional generalization.
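A minimal sketch of the execution-guided idea mentioned in the abstract: rather than enumerating whole programs syntactically, search over the tuples of values that partial programs produce on the example inputs, pruning any partial program whose execution state has already been reached. The two-primitive DSL and the examples are invented for illustration; the actual work uses learned models to guide this search.

```python
from collections import deque

# A made-up DSL of unary integer functions (illustrative only).
OPS = {"add1": lambda x: x + 1, "double": lambda x: x * 2}

def synthesize_eg(examples, max_len=6):
    """Breadth-first search over execution states: each state is the tuple of
    values a partial program produces on all example inputs."""
    inputs = tuple(i for i, _ in examples)
    target = tuple(o for _, o in examples)
    queue = deque([(inputs, [])])
    seen = {inputs}
    while queue:
        state, prog = queue.popleft()
        if state == target:
            return prog
        if len(prog) == max_len:
            continue
        for name, f in OPS.items():
            nxt = tuple(f(v) for v in state)
            if nxt not in seen:  # prune: this execution state was reached before
                seen.add(nxt)
                queue.append((nxt, prog + [name]))
    return None

print(synthesize_eg([(1, 4), (3, 8)]))  # ['add1', 'double']
```

The pruning is the point: syntactically different partial programs that compute the same values on the examples collapse into one search node, which is information a purely syntactic enumerator cannot exploit.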
Sat 3:30 p.m. - 4:00 p.m. | Virtual Panel | Augustus Odena · Charles Sutton · Roopsha Samanta · Xinyun Chen · Elena Glassman
Sat 4:00 p.m. - 4:10 p.m. | Closing Talk | Augustus Odena · Charles Sutton