

Poster

CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders

Kevin Frans · Lisa Soros · Olaf Witkowski

Hall J (level 1) #928

Keywords: [ creativity ] [ art ] [ language to text ] [ Computer Vision ] [ CLIP ] [ image synthesis ]


Abstract:

CLIPDraw is an algorithm that synthesizes novel drawings from natural language input. It does not require any additional training; rather, a pre-trained CLIP language-image encoder is used as a metric for maximizing similarity between the given description and a generated drawing. Crucially, CLIPDraw operates over vector strokes rather than pixel images, which biases drawings towards simpler human-recognizable shapes. The results compare CLIPDraw with other synthesis-through-optimization methods and highlight various interesting behaviors of CLIPDraw.
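
The synthesis-through-optimization idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and it does not use CLIP: the encoder here is a hypothetical random linear projection, the "text embedding" is a random vector, the drawing is a handful of points softly rasterized as Gaussian blobs rather than true vector strokes, and gradients are estimated by finite differences instead of a differentiable renderer. All names (`render`, `optimize_drawing`, `GRID`) are assumptions made for illustration. What it does show is the loop structure: no model is trained, and only the drawing parameters are updated to increase the similarity score.

```python
import numpy as np

GRID = 8  # side length of the toy raster canvas (assumption for illustration)


def render(params):
    """Softly rasterize 2D points in [0, 1]^2 as Gaussian blobs on a GRID x GRID canvas."""
    axis = (np.arange(GRID) + 0.5) / GRID
    xs, ys = np.meshgrid(axis, axis)
    canvas = np.zeros((GRID, GRID))
    for px, py in params.reshape(-1, 2):
        canvas += np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / 0.02)
    return canvas.ravel()


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def optimize_drawing(text_embedding, encoder, n_points=4, steps=200,
                     lr=0.05, eps=1e-4, seed=0):
    """Hill-climb drawing parameters to raise similarity with a fixed target embedding."""
    rng = np.random.default_rng(seed)
    params = rng.uniform(0.2, 0.8, size=2 * n_points)

    def score(p):
        # The encoder is frozen; only the drawing parameters p change.
        return cosine(encoder @ render(p), text_embedding)

    best = score(params)
    history = [best]
    for _ in range(steps):
        # Central finite-difference estimate of the gradient.
        grad = np.zeros_like(params)
        for i in range(params.size):
            delta = np.zeros_like(params)
            delta[i] = eps
            grad[i] = (score(params + delta) - score(params - delta)) / (2 * eps)
        candidate = params + lr * grad
        candidate_score = score(candidate)
        if candidate_score > best:
            params, best = candidate, candidate_score  # accept the improving step
        else:
            lr *= 0.5  # step was too large; shrink it and retry next iteration
        history.append(best)
    return params, history


rng = np.random.default_rng(0)
encoder = rng.normal(size=(16, GRID * GRID))  # hypothetical stand-in, NOT CLIP
text_embedding = rng.normal(size=16)          # hypothetical stand-in for an encoded prompt
params, history = optimize_drawing(text_embedding, encoder)
print(f"similarity: {history[0]:.3f} -> {history[-1]:.3f}")
```

Because a step is accepted only when it improves the score, the recorded similarity never decreases. A real system would replace the stand-in encoder with a pre-trained language-image model and the blob renderer with a differentiable vector-graphics rasterizer.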
