Introducing Proof Generalization

We present an algorithm that takes as its input a theorem, a proof of the theorem, and some aspect of the theorem that can potentially be generalized. By examining the facts utilized in the proof, this program outputs a correct generalization of the theorem and a corresponding proof.

This Lean tactic builds on previous work done by Olivier Pons in the Rocq theorem prover.

Generalizing the Irrationality of √17

Consider the following theorem and proof.

\sqrt{17} \textrm{ is irrational.}

theorem irrat_sqrt : Irrational 17 := Irrational 17 /- If 17 is irrational, we should not be able to find coprime a and b such that a/b = √17. That is, a^2 = 17 b^2. -/ ¬a b, a.gcd b = 1a * a = 17 * b * b /- It follows that 17 | a^2. Since 17 is prime, 17 | a. -/ a:b:copr:a.gcd b = 1h:a * a = 17 * b * bFalse; have a_div : 17 a := a:b:copr:a.gcd b = 1h:a * a = 17 * b * b17a a:b:copr:a.gcd b = 1h:a * a = 17 * b * b17ahave c : 17 a * a := a:b:copr:a.gcd b = 1h:a * a = 17 * b * b17a * a a:b:copr:a.gcd b = 1h:a * a = 17 * b * b17a * aa:b:copr:a.gcd b = 1h:a * a = 17 * b * b1717 * (b * b); All goals completed! 🐙}; a:b:copr:a.gcd b = 1h:a * a = 17 * b * bc:17a17a17a; a:b:copr:a.gcd b = 1h:a * a = 17 * b * bh✝:17a17aa:b:copr:a.gcd b = 1h:a * a = 17 * b * bh✝:17a17a a:b:copr:a.gcd b = 1h:a * a = 17 * b * bh✝:17a17aa:b:copr:a.gcd b = 1h:a * a = 17 * b * bh✝:17a17a All goals completed! 🐙} /- So we can write a = 17k for some k. -/ a:b:copr:a.gcd b = 1h:a * a = 17 * b * ba_div:17aa_is_pk:k, a = 17 * kFalse /- Plugging means 17 k^2 = b^2. -/ a:b:copr:a.gcd b = 1h:a * a = 17 * b * ba_div:17ak:hk:a = 17 * kFalse; a:b:copr:a.gcd b = 1a_div:17ak:h:17 * k * (17 * k) = 17 * b * bhk:a = 17 * kFalse; a:b:copr:a.gcd b = 1a_div:17ak:h:17 * b * b = 17 * k * (17 * k)hk:a = 17 * kFalse; a:b:copr:a.gcd b = 1a_div:17ak:h:b * b = k * k * 1717 = 0hk:a = 17 * kFalse; a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17False /- It follows that 17 | b^2. Since 17 is prime, 17 | b. -/ have b_div : 17 b := a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 1717b a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 1717bhave c : 17 b * b := a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 1717b * b a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 1717b * ba:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 1717k * k * 17; All goals completed! 🐙}; a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17c:17b17b17b; a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17h✝:17b17ba:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17h✝:17b17b a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17h✝:17b17ba:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17h✝:17b17b All goals completed! 🐙} /- Now we have a contradiction -- a and b were supposed to be coprime, but 17 divides both. -/ a:b:copr:a.gcd b = 1a_div:17ak:hk:a = 17 * kh:b * b = k * k * 17b_div:17bp_dvd_gcd:17a.gcd bFalse; a:b:copr:a.gcd b = 1k:hk:a = 17 * kh:b * b = k * k * 17p_dvd_gcd:17a.gcd bFalse; a:b:copr:a.gcd b = 1k:hk:a = 17 * kh:b * b = k * k * 17p_dvd_gcd:171False; All goals completed! 🐙

If our algorithm is presented with the above theorem and proof, formalized in Lean, and asked to generalize the number 17 by replacing it with an unknown natural number n, it will determine that the only property it used of the number 17 was that it was prime. So its output will be following theorem and its proof:

\textrm{If } p \textrm{ is prime, } \sqrt{p} \textrm{ is irrational.}

theorem irrat_sqrt_generalized : (p : ), Nat.Prime p Irrational p := ∀ (p : ), Nat.Prime pIrrational p /- Generalize `17` in the proof, then add the generalization `irrat_sqrt.Gen` as a hypothesis. -/ irrat_sqrt.Gen:∀ (n : ), Nat.Prime nIrrational n := fun n gen_prime => irrat_def n fun a => Exists.casesOn a fun a h => Exists.casesOn h fun b h => And.casesOn h fun copr h => let_fun a_div := let_fun c := Eq.mpr (id (congrArg (fun _a => n_a) h)) (Eq.mpr (id (congrArg (fun _a => n_a) (mul_assoc n b b))) (Nat.dvd_mul_right n (b * b))); Or.casesOn (motive := fun t => Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c = tna) (Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c) (fun h h_1 => h) (fun h h_1 => h) (Eq.refl (Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c)); let_fun a_is_pk := dvd_iff_exists_eq_mul_right.mp a_div; Exists.casesOn a_is_pk fun k hk => let_fun b_div := let_fun c := Eq.mpr (id (congrArg (fun _a => n_a) (Eq.mp (Eq.trans (congrArg (Or (b * b = k * k * n)) (eq_false (Nat.Prime.ne_zero gen_prime))) (or_false (b * b = k * k * n))) (Eq.mp (congrArg (fun _a => b * b = _an = 0) (Eq.symm (mul_assoc k k n))) (Eq.mp (congrArg (fun _a => _a) (propext mul_eq_mul_left_iff)) (Eq.mp (congrArg (fun _a => n * (b * b) = n * (k * _a)) (mul_comm n k)) (Eq.mp (congrArg (fun _a => n * (b * b) = _a) (mul_assoc n k (n * k))) (Eq.mp (congrArg (fun _a => _a = n * k * (n * k)) (mul_assoc n b b)) (id (Eq.symm (Eq.mp (congrArg (fun _a => _a * _a = n * b * b) hk) h))))))))))) (Nat.dvd_mul_left n (k * k)); Or.casesOn (motive := fun t => Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c = tnb) (Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c) (fun h h_1 => h) (fun h h_1 => h) (Eq.refl (Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c)); let_fun p_dvd_gcd := Nat.dvd_gcd_iff.mpra_div, b_div⟩; Nat.Prime.not_dvd_one gen_prime (Eq.mp (congrArg (fun _a => n_a) copr) p_dvd_gcd)∀ (p : ), Nat.Prime pIrrational p /- Use the generalization to close the goal.-/ All goals completed! 🐙