About

Time: Wednesdays
Location: CoRE 431 (DIMACS seminar room)
Mailing list: ak2530@scarletmail.rutgers.edu

The seminar explores the fast-growing interface between mathematical reasoning, formal methods, and modern AI systems. It is co-organized by DIMACS and the Mathematics Department at Rutgers University, New Brunswick.

Organizers

Emily First, Ian Jauslin, Ayush Khaitan, Alex Kontorovich, Konstantin Mischaikow, Lirong Xia

Seminar Schedule

Fall 2025
Alex Kontorovich (Rutgers)
An Introduction to Lean + AI for Research Mathematicians

Abstract. We'll do a "show and tell" of what it's like to try to formalize some basic mathematics in Lean, with help from AI.
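
To give a flavor of the subject (a minimal illustration, not material from the talk), here is the kind of basic statement one formalizes in Lean 4, assuming Mathlib is available:

```lean
-- A minimal illustration (not from the talk): the sum of two odd
-- integers is even, stated and proved in Lean 4 with Mathlib.
import Mathlib.Tactic

example (a b : ℤ) (ha : Odd a) (hb : Odd b) : Even (a + b) := by
  obtain ⟨m, hm⟩ := ha    -- a = 2 * m + 1
  obtain ⟨n, hn⟩ := hb    -- b = 2 * n + 1
  exact ⟨m + n + 1, by omega⟩
```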

Emily First (Rutgers)
A survey on AI for Proof Assistants

Abstract. In this talk, I’ll provide an overview of advances in AI and machine learning for proof assistants such as Lean, Rocq, and Isabelle/HOL. I’ll discuss both neural and symbolic techniques for various proof assistant tasks, including automated proof synthesis, autoformalization, premise selection, and lemma conjecturing. With respect to neural techniques, I’ll cover the basics of applying language models, reinforcement learning, and graph neural networks to the theorem proving domain.

Chi Jin (Princeton)
Goedel-Prover-V2: The Strongest Open-Source Theorem Prover to Date

Abstract. This talk introduces Goedel-Prover-V2, an open-source model that sets a new state of the art in automated theorem proving.

Tomer Galanti (Texas A&M)
LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search

Abstract. We seek algorithms for program learning that are both sample-efficient and computationally feasible. Classical results show that targets admitting short program descriptions (e.g., short Python code) can be learned from a number of examples that scales with the length of the code via length-first program enumeration, but the search is exponential in description length. Gradient-based training avoids this computational cost, yet can require exponentially many samples on certain short-program families.
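
As a rough picture of the enumeration baseline described above (a toy sketch under assumed details, not code from the paper), consider searching a tiny expression language in order of program length:

```python
# Toy sketch of length-first program enumeration: try all programs in
# order of description length and return the first one consistent with
# the data. The expression language here is an assumption for
# illustration; the search grows exponentially with program length.

import itertools

TOKENS = ["x", "+", "1", "*", "2"]      # hypothetical program alphabet

def length_first_fit(data, max_len=5):
    """Return the shortest expression consistent with all (x, y) pairs."""
    for length in range(1, max_len + 1):
        for toks in itertools.product(TOKENS, repeat=length):
            source = " ".join(toks)
            try:
                if all(eval(source, {"x": x}) == y for x, y in data):
                    return source       # first hit = shortest description
            except Exception:
                continue                # skip syntactically invalid strings
    return None

# Example: two samples suffice to pin down f(x) = x + 1 here.
print(length_first_fit([(1, 2), (5, 6)]))   # -> "x + 1"
```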

To address this gap, we introduce LLM-ERM, a propose-and-verify framework that replaces exhaustive enumeration with an LLM-guided search over candidate programs while retaining ERM-style selection on held-out data. Specifically, we draw k candidates with a pretrained reasoning-augmented LLM, compile and check each on the data, and return the best verified hypothesis, with no feedback, adaptivity, or gradients. Theoretically, we show that coordinate-wise online mini-batch SGD requires many samples to learn certain short programs. Empirically, LLM-ERM solves tasks such as parity variants, pattern matching, and primality testing with as few as 200 samples, while SGD-trained transformers overfit even with 100,000 samples. These results indicate that language-guided program synthesis recovers much of the statistical efficiency of finite-class ERM while remaining computationally tractable, offering a practical route to learning succinct hypotheses beyond the reach of gradient-based training.
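
The propose-and-verify loop can be pictured with a short sketch (hypothetical code; `sample_program` is a stand-in for the LLM call, which the abstract does not specify):

```python
# Hypothetical sketch of the propose-and-verify loop described above.
# `sample_program` stands in for a pretrained reasoning LLM; candidates
# are assumed to define a function f(x). Neither is a real API.

def llm_erm(sample_program, train_set, holdout_set, k=100):
    """Draw k candidate programs, keep those that compile and fit the
    training data, and return the one with the best held-out accuracy."""
    best_prog, best_acc = None, -1.0
    for _ in range(k):
        source = sample_program()          # one LLM-proposed candidate
        try:
            namespace = {}
            exec(source, namespace)        # "compile" the candidate
            f = namespace["f"]             # assumed entry point f(x)
        except Exception:
            continue                       # discard programs that fail to run
        if all(f(x) == y for x, y in train_set):       # verify on the data
            acc = sum(f(x) == y for x, y in holdout_set) / len(holdout_set)
            if acc > best_acc:             # ERM-style selection on held-out data
                best_prog, best_acc = source, acc
    return best_prog, best_acc
```

Note that the loop uses no feedback, adaptivity, or gradients: candidates are drawn once, checked, and the best verified hypothesis is returned.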

Ayush Khaitan (Rutgers)
O-Forge: A Verifiable, LLM-Driven Framework for Proving Inequalities in Research Mathematics

Abstract. We introduce a framework that combines LLMs with computer algebra software to prove sophisticated inequalities in research mathematics. We first ask a frontier LLM to break a problem into its simplest parts, and then ask a computer algebra system to complete the proof of each part. In doing so, we solve a concrete question posed by Terence Tao. This is joint work with Vijay Ganesh (Georgia Tech).
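
The division of labor might be pictured as follows (an illustrative sketch only; O-Forge's actual interface is not described in the abstract). An LLM proposes an algebraic decomposition of an inequality, and a computer algebra system (SymPy here) certifies it symbolically:

```python
# Illustrative sketch, not O-Forge itself: the LLM step proposes a
# certificate, the CAS step verifies it symbolically.

import sympy as sp

x = sp.symbols("x", positive=True)

# Toy claim standing in for a research-level inequality:
#   x + 1/x >= 2  for all x > 0.
claim = x + 1/x - 2

# Suppose the LLM proposes the sum-of-squares certificate
#   x + 1/x - 2 = (sqrt(x) - 1/sqrt(x))^2.
certificate = (sp.sqrt(x) - 1 / sp.sqrt(x)) ** 2

# CAS verification: the identity holds symbolically, and a square is
# nonnegative, which completes the proof of this part.
assert sp.expand(claim - certificate) == 0
print("certificate verified")
```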

Chenyang An (Amazon) TBD
Vijay Ganesh (Georgia Tech) TBD
Kaiyu Yang (FAIR, Meta) TBD
TBA TBD

Schedule subject to change; check back for abstracts and slides.