SymFormer: End-to-End Symbolic Regression Using Transformer-Based Architecture


We present SymFormer, a transformer-based approach to symbolic regression. Symbolic regression automatically builds models as analytic, free-form formulas from data and has been used successfully in many nonlinear modeling tasks with impressive results. Traditionally, evolutionary methods such as genetic programming have been used for symbolic regression, but they suffer from high computational complexity. In contrast, SymFormer, after pre-training on a vast dataset of mathematical formulas, can infer new formulas for unseen data in seconds, which is considerably faster than state-of-the-art evolutionary methods. Its unique feature lies in predicting both the formula structure (variables, operators) and the values of all numerical constants in the formula simultaneously, in a single forward pass of the transformer model. The constants predicted by the model serve as a good starting point for further fine-tuning via gradient descent, ultimately enhancing the model's accuracy. SymFormer shines on various benchmarks, demonstrating superior performance together with faster inference.
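To make the constant fine-tuning step concrete, here is a minimal sketch (not the authors' implementation) of refining model-predicted constants by gradient descent on a fixed formula skeleton. The skeleton `c0 * sin(c1 * x)`, the starting constants, and the learning rate are all illustrative assumptions.

```python
import numpy as np

# Hypothetical formula skeleton predicted by the model: f(x) = c0 * sin(c1 * x).
def f(x, c):
    return c[0] * np.sin(c[1] * x)

x = np.linspace(-3, 3, 200)
y = 2.5 * np.sin(1.8 * x)          # illustrative ground-truth data

# Constants as the model might predict them: close, but not exact.
c = np.array([2.0, 1.5])
lr = 0.01
for _ in range(3000):
    err = f(x, c) - y
    # Analytic gradients of the mean-squared error w.r.t. c0 and c1
    g0 = 2 * np.mean(err * np.sin(c[1] * x))
    g1 = 2 * np.mean(err * c[0] * x * np.cos(c[1] * x))
    c -= lr * np.array([g0, g1])

# After refinement, c should be close to the true constants (2.5, 1.8).
```

Because the transformer supplies a good initialization, this local optimization typically converges quickly; starting from random constants, the same loss landscape would be riddled with poor local minima.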

Let’s explore some of the technical aspects:

  • The numerical constants are encoded using a scientific-like notation in which a constant α is represented as a tuple of an exponent ce and a mantissa cm such that α ≈ cm · 10^ce. The mantissa lies in the range [−1, 1], and the exponent is an integer. The integer exponent is encoded as a special symbol starting with ‘C’ followed by the exponent ce, while the mantissa is kept as a real number and optimized using a regression loss. This encoding of constants has been shown to significantly improve SymFormer’s performance.
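The encoding above can be sketched as follows. Note that the exact exponent convention is an assumption on our part; this variant keeps |cm| in [0.1, 1), which is consistent with the stated [−1, 1] range.

```python
import math

# Sketch of the constant encoding: alpha is split into an integer exponent ce
# (emitted as the symbol 'C<ce>') and a real mantissa cm with alpha ≈ cm * 10**ce.
# The "+ 1" convention (an assumption) keeps |cm| strictly below 1.
def encode_constant(alpha: float):
    if alpha == 0.0:
        return "C0", 0.0
    ce = math.floor(math.log10(abs(alpha))) + 1
    cm = alpha / 10**ce
    return f"C{ce}", cm

def decode_constant(symbol: str, cm: float) -> float:
    ce = int(symbol[1:])
    return cm * 10**ce

# Example: -31.4 becomes the token 'C2' plus the mantissa -0.314.
sym, cm = encode_constant(-31.4)
```

The exponent token comes from a small discrete vocabulary and is trained with the usual cross-entropy loss, while the continuous mantissa is trained with a regression loss, which is what allows both to be predicted in the same forward pass.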
  • We train our model using the Adam optimizer for 3 epochs on 8 NVIDIA A100 GPUs. The training of the model takes roughly 33 hours.
  • We compare our approach to two state-of-the-art approaches: the transformer-based Neural Symbolic Regression that Scales (NSRS), which is a pre-trained transformer model, and the RL-based Deep Symbolic Optimization (DSO), trained for each equation from scratch.
    The results in Table 1 show that SymFormer is competitive in terms of model performance on all the benchmarks while outperforming both NSRS and DSO in the time required to find the equation.
  • SymFormer demonstrated its remarkable extrapolation capabilities as it is able to generate models that predict correct values outside the training data range.
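The following toy illustration (ours, not from the paper, with a hypothetical target function) shows why recovering the true formula gives this extrapolation behavior, in contrast to a generic curve fit such as a polynomial trained on the same narrow range.

```python
import numpy as np

# Hypothetical ground truth: f(x) = sin(2x) + x, observed only on [-1, 1].
x_train = np.linspace(-1, 1, 50)
y_train = np.sin(2 * x_train) + x_train

# Generic baseline: a degree-5 polynomial fitted to the training range.
poly = np.polynomial.Polynomial.fit(x_train, y_train, deg=5)

x_far = 4.0                            # well outside the training range
y_true = np.sin(2 * x_far) + x_far
y_symbolic = np.sin(2 * x_far) + x_far  # the recovered formula evaluates exactly
y_poly = poly(x_far)                    # the polynomial diverges far from [-1, 1]
```

A symbolic model that recovers the correct analytic form extrapolates by construction, whereas the polynomial's error grows rapidly outside the interval it was fitted on.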
  • The source code and the pre-trained model checkpoints are publicly available.

To access the paper, click here.

Authors of the Work

Martin Vastl, Jonáš Kulhánek, Jiří Kubalík, Erik Derner, Robert Babuška.
