

Video in Workshop: Data Centric AI

SCIMAT: Science and Mathematics Dataset


Abstract:

In this work, we announce a comprehensive, well-curated, open-source dataset with millions of samples covering pre-college and college-level problems in mathematics and science. We present preliminary results using transformer architectures with character-to-character encoding. The dataset identifies some challenging problems and invites research on better architecture search.
