Introducing Proof Generalization
We present an algorithm that takes as its input a theorem, a proof of the theorem, and some aspect of the theorem that can potentially be generalized. By examining the facts utilized in the proof, this program outputs a correct generalization of the theorem and a corresponding proof.
This Lean tactic builds on previous work done by Olivier Pons in the Rocq theorem prover.
Generalizing the Irrationality of √17
Consider the following theorem and proof.
\sqrt{17} \textrm{ is irrational.}
theorem irrat_sqrt :
Irrational √17 :=
⊢ Irrational √17
/- If 17 is irrational, we should not be able to find coprime a and b such that a/b = √17. That is, a^2 = 17 b^2. -/
⊢ ¬∃ a b, a.gcd b = 1 ∧ a * a = 17 * b * b
/- It follows that 17 | a^2. Since 17 is prime, 17 | a. -/
a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * b⊢ False; have a_div : 17 ∣ a := a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * b⊢ 17 ∣ a a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * b⊢ 17 ∣ ahave c : 17 ∣ a * a := a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * b⊢ 17 ∣ a * a a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * b⊢ 17 ∣ a * aa:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * b⊢ 17 ∣ 17 * (b * b); All goals completed! 🐙}; a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * bc:17 ∣ a ∨ 17 ∣ a⊢ 17 ∣ a; a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * bh✝:17 ∣ a⊢ 17 ∣ aa:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * bh✝:17 ∣ a⊢ 17 ∣ a a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * bh✝:17 ∣ a⊢ 17 ∣ aa:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * bh✝:17 ∣ a⊢ 17 ∣ a All goals completed! 🐙}
/- So we can write a = 17k for some k. -/
a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * ba_div:17 ∣ aa_is_pk:∃ k, a = 17 * k⊢ False
/- Plugging means 17 k^2 = b^2. -/
a:ℕb:ℕcopr:a.gcd b = 1h:a * a = 17 * b * ba_div:17 ∣ ak:ℕhk:a = 17 * k⊢ False; a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕh:17 * k * (17 * k) = 17 * b * bhk:a = 17 * k⊢ False; a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕh:17 * b * b = 17 * k * (17 * k)hk:a = 17 * k⊢ False; a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕh:b * b = k * k * 17 ∨ 17 = 0hk:a = 17 * k⊢ False; a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17⊢ False
/- It follows that 17 | b^2. Since 17 is prime, 17 | b. -/
have b_div : 17 ∣ b := a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17⊢ 17 ∣ b a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17⊢ 17 ∣ bhave c : 17 ∣ b * b := a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17⊢ 17 ∣ b * b a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17⊢ 17 ∣ b * ba:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17⊢ 17 ∣ k * k * 17; All goals completed! 🐙}; a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17c:17 ∣ b ∨ 17 ∣ b⊢ 17 ∣ b; a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17h✝:17 ∣ b⊢ 17 ∣ ba:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17h✝:17 ∣ b⊢ 17 ∣ b a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17h✝:17 ∣ b⊢ 17 ∣ ba:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17h✝:17 ∣ b⊢ 17 ∣ b All goals completed! 🐙}
/- Now we have a contradiction -- a and b were supposed to be coprime, but 17 divides both. -/
a:ℕb:ℕcopr:a.gcd b = 1a_div:17 ∣ ak:ℕhk:a = 17 * kh:b * b = k * k * 17b_div:17 ∣ bp_dvd_gcd:17 ∣ a.gcd b⊢ False; a:ℕb:ℕcopr:a.gcd b = 1k:ℕhk:a = 17 * kh:b * b = k * k * 17p_dvd_gcd:17 ∣ a.gcd b⊢ False; a:ℕb:ℕcopr:a.gcd b = 1k:ℕhk:a = 17 * kh:b * b = k * k * 17p_dvd_gcd:17 ∣ 1⊢ False; All goals completed! 🐙
If our algorithm is presented with the above theorem and proof, formalized in Lean, and asked to generalize the number 17 by replacing it with an unknown natural number n, it will determine that the only property it used of the number 17 was that it was prime. So its output will be following theorem and its proof:
\textrm{If } p \textrm{ is prime, } \sqrt{p} \textrm{ is irrational.}
theorem irrat_sqrt_generalized :
∀ (p : ℕ), Nat.Prime p → Irrational √p :=
⊢ ∀ (p : ℕ), Nat.Prime p → Irrational √↑p
/- Generalize `17` in the proof,
then add the generalization `irrat_sqrt.Gen` as a hypothesis. -/
irrat_sqrt.Gen:∀ (n : ℕ), Nat.Prime n → Irrational √↑n :=
fun n gen_prime =>
irrat_def n fun a =>
Exists.casesOn a fun a h =>
Exists.casesOn h fun b h =>
And.casesOn h fun copr h =>
let_fun a_div :=
let_fun c :=
Eq.mpr (id (congrArg (fun _a => n ∣ _a) h))
(Eq.mpr (id (congrArg (fun _a => n ∣ _a) (mul_assoc n b b))) (Nat.dvd_mul_right n (b * b)));
Or.casesOn (motive := fun t =>
Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c = t → n ∣ a)
(Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c) (fun h h_1 => h)
(fun h h_1 => h) (Eq.refl (Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c));
let_fun a_is_pk := dvd_iff_exists_eq_mul_right.mp a_div;
Exists.casesOn a_is_pk fun k hk =>
let_fun b_div :=
let_fun c :=
Eq.mpr
(id
(congrArg (fun _a => n ∣ _a)
(Eq.mp
(Eq.trans (congrArg (Or (b * b = k * k * n)) (eq_false (Nat.Prime.ne_zero gen_prime)))
(or_false (b * b = k * k * n)))
(Eq.mp (congrArg (fun _a => b * b = _a ∨ n = 0) (Eq.symm (mul_assoc k k n)))
(Eq.mp (congrArg (fun _a => _a) (propext mul_eq_mul_left_iff))
(Eq.mp (congrArg (fun _a => n * (b * b) = n * (k * _a)) (mul_comm n k))
(Eq.mp (congrArg (fun _a => n * (b * b) = _a) (mul_assoc n k (n * k)))
(Eq.mp (congrArg (fun _a => _a = n * k * (n * k)) (mul_assoc n b b))
(id (Eq.symm (Eq.mp (congrArg (fun _a => _a * _a = n * b * b) hk) h)))))))))))
(Nat.dvd_mul_left n (k * k));
Or.casesOn (motive := fun t =>
Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c = t → n ∣ b)
(Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c) (fun h h_1 => h)
(fun h h_1 => h) (Eq.refl (Eq.mp (congrArg (fun _a => _a) (propext (Nat.Prime.dvd_mul gen_prime))) c));
let_fun p_dvd_gcd := Nat.dvd_gcd_iff.mpr ⟨a_div, b_div⟩;
Nat.Prime.not_dvd_one gen_prime (Eq.mp (congrArg (fun _a => n ∣ _a) copr) p_dvd_gcd)⊢ ∀ (p : ℕ), Nat.Prime p → Irrational √↑p
/- Use the generalization to close the goal.-/
All goals completed! 🐙