Human reasoning can distill principles from observed patterns and generalize them to explain and solve novel problems, as exemplified by the success of scientific theories. The patterns in biological data are often complex and high-dimensional, suggesting that machine learning could play a vital role in distilling collective rules from patterns that may be challenging for human reasoning. However, the most powerful artificial intelligence systems are currently limited in interpretability and symbolic reasoning ability. Recently, we developed essence neural networks (ENNs), which train on general supervised learning tasks without requiring gradient optimization, and showed that ENNs are intrinsically interpretable, can generalize out-of-distribution, and perform symbolic learning on sparse data. Here, I discuss our current progress in automatically translating the weights of an ENN into concise, executable computer code for general symbolic tasks, an implementation of data-based automatic programming which we call deep distilling. The distilled code, which can contain loops, nested logical statements, and useful intermediate variables, is equivalent to the ENN but is generally orders of magnitude more compact and human-comprehensible. Because the code is distilled from a general-purpose neural network rather than constructed by searching through libraries of logical functions, deep distilling is flexible in terms of problem domain and size. On a diverse set of problems involving arithmetic, computer vision, and optimization, we show that deep distilling generates concise code that generalizes out-of-distribution to solve problems orders of magnitude larger and more complex than the training data. For problems with a known ground-truth rule set, including cellular automata, which encode a type of sequence-to-function mapping, deep distilling discovers the rule set exactly with scalable guarantees.
For problems that are ambiguous or computationally intractable, the distilled rules are similar to existing human-derived algorithms and perform on par with them or better. Our approach demonstrates that unassisted machine intelligence can build generalizable and intuitive rules explaining patterns in large datasets that would otherwise overwhelm human detection and reasoning.
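To make the cellular automaton case concrete, the following is a minimal sketch of what a "known ground-truth rule set" can look like and how exact recovery could be checked. It assumes one-dimensional, binary, radius-1 (elementary) automata with periodic boundaries, which the abstract does not specify. The function names `rule_table`, `step`, and `recover_rule` are hypothetical, and `recover_rule` is plain tabulation of observed neighborhoods, not the ENN-based deep distilling procedure.

```python
from itertools import product


def rule_table(rule_number):
    """Ground-truth rule set of an elementary cellular automaton.

    Maps each (left, center, right) neighborhood to the next state of the
    center cell, using the standard Wolfram rule-number encoding.
    """
    table = {}
    for left, center, right in product((0, 1), repeat=3):
        index = (left << 2) | (center << 1) | right
        table[(left, center, right)] = (rule_number >> index) & 1
    return table


def step(cells, table):
    """Apply the rule once to a row of cells with periodic boundaries."""
    n = len(cells)
    return [
        table[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
        for i in range(n)
    ]


def recover_rule(pairs):
    """Illustrative baseline only: tabulate neighborhood -> output from
    observed (before, after) rows. This is NOT deep distilling."""
    table = {}
    for before, after in pairs:
        n = len(before)
        for i in range(n):
            key = (before[(i - 1) % n], before[i], before[(i + 1) % n])
            table[key] = after[i]
    return table


if __name__ == "__main__":
    truth = rule_table(110)
    # This cyclic row contains all eight 3-cell neighborhoods.
    row = [0, 0, 0, 1, 0, 1, 1, 1]
    learned = recover_rule([(row, step(row, truth))])
    # "Exact" discovery means every entry of the rule set matches.
    print(learned == truth)
```

Here the ground truth is a finite table of eight entries, so a recovered rule set can be compared against it entry by entry; in this sketch, that comparison is what recovering the rule set "exactly" amounts to.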