Beyond data subsampling: differentiation as an uncertainty source in equation discovery
Abstract
Data-driven discovery of differential equations typically treats numerical differentiation as a fixed preprocessing step. Existing algorithms improve robustness through data and library subsampling but rarely account for variability in the differentiation method itself. We show that this choice introduces a systematic and reproducible source of uncertainty that alters both the structure and the coefficients of the discovered equation. High-resolution schemes amplify noise, while heavily smoothed derivatives suppress meaningful fluctuations, so the results depend on the method. We evaluate six differentiation techniques across multiple PDEs and noise levels using SINDy and EPDE and find consistent shifts in the discovered models. These results establish the selection of a differentiation method as a fundamental modeling decision and as a new axis for improving ensemble-based equation discovery.
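The trade-off between noise amplification and over-smoothing can be illustrated with a minimal sketch. It is not one of the six techniques evaluated in this work: the test signal, the noise level, the window sizes, and the helper names `finite_difference` and `smoothed_difference` are all assumptions chosen for illustration. A plain central difference stands in for a high-resolution scheme, and a moving average followed by a central difference stands in for a smoothed derivative.

```python
import numpy as np

def finite_difference(u, dx):
    # Second-order central differences (one-sided at the boundaries).
    return np.gradient(u, dx)

def smoothed_difference(u, dx, window):
    # Hypothetical stand-in for a smoothed derivative: moving average
    # over an odd-length window, followed by central differences.
    kernel = np.ones(window) / window
    padded = np.pad(u, window // 2, mode="edge")
    smooth = np.convolve(padded, kernel, mode="valid")
    return np.gradient(smooth, dx)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Assumed test problem: u(x) = sin(x) with additive Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 401)
dx = x[1] - x[0]
noisy = np.sin(x) + 0.01 * rng.standard_normal(x.size)
true_dudx = np.cos(x)

# Compare away from the boundaries, where edge padding distorts the average.
interior = slice(110, -110)
estimates = {
    "finite difference": finite_difference(noisy, dx),
    "light smoothing": smoothed_difference(noisy, dx, window=11),
    "heavy smoothing": smoothed_difference(noisy, dx, window=201),
}
errors = {
    name: rmse(d[interior], true_dudx[interior])
    for name, d in estimates.items()
}

for name, err in errors.items():
    print(f"{name:>18s}: RMSE = {err:.3f}")
```

In this setup the unsmoothed derivative is dominated by amplified noise, while the heavily smoothed one is dominated by attenuation of the signal itself, so a regression on either estimate would start from a different target. The same data therefore yield method-dependent derivative estimates before any sparse regression is applied.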