Deciding Univariate Polynomial Problems Using Untrusted Certificates in Isabelle/HOL

We present a proof procedure for univariate real polynomial problems in Isabelle/HOL. The core mathematics of our procedure is based on univariate cylindrical algebraic decomposition. We follow the approach of untrusted certificates, separating solving from verifying: efficient external tools perform expensive real algebraic computations, producing evidence that is formally checked within Isabelle’s logic. This allows us to exploit highly-tuned computer algebra systems like Mathematica to guide our procedure without impacting the correctness of its results. We present experiments demonstrating the efficacy of this approach, in many cases yielding orders of magnitude improvements over previous methods.


Introduction
Nonlinear polynomial systems are ubiquitous in science and engineering.As realworld applications of formal verification continue to grow and diversify, there is increasing need for formal tools to provide automation for reasoning about nonlinear systems over the reals.To ensure correctness, one can develop and verify such automation within a proof assistant.To be efficient, one can take a certificate-based approach, delegating expensive computations to untrusted external tools whose output is used to guide the verified proof tool.
We have built an Isabelle/HOL [17] tactic that is complete for univariate polynomial problems over the reals 3 .As it proceeds, this tactic calls an external program to obtain a real algebraic certificate decomposing R into finitely many regions.A subsidiary tactic then uses techniques based on the Sturm-Tarski Theorem to verify this decomposition and transform it into a formal proof.For example, the theorem ∀x.(x 2 > 2 ∧ x 10 − 2x 5 + 1 ≥ 0) ∨ x < 2 can be proved simply by applying the tactic univ rcf.It calls an external tool to obtain a proof certificate, which it gives to the univ rcf cert tactic for checking.Once this occurs, the univ rcf cert incantation can be recorded in a proof script, removing the future need to call the external tool: univ rcf cert "[Arep [:-2, 0, 1:] (-2) (-1/3), Rat 1, Arep [:-2, 0, 1:] (7/6) (19/12), Rat 2]" The certificates can easily be produced by any program that does cylindrical algebraic decomposition (CAD).For now, we are using the certificate producing implementation of univariate CAD that we have written within the MetiTarski theorem prover [1].
The mathematical basis of our approach is the Sturm-Tarski theorem (Sect.2), which generalizes Sturm's theorem.We use Sturm-Tarski to build a decision procedure to decide the sign of a univariate polynomial at a real algebraic point (Sect.[4][5].For existential cases, the certificate taken by univ rcf cert is a real algebraic number that satisfies the quantifier free part of the formula, and univ rcf cert computationally checks that by invoking the sign determination procedure.For universal cases, the certificate taken by univ rcf cert is a list of roots of polynomials in the formula, and univ rcf cert verifies the completeness of this list and checks the truth value of the quantifier free part of the formula at sample points from regions partitioned by these roots. One of our motivating goals is the eventual integration of MetiTarski [1] into Isabelle/HOL (cf.Sect.7).A key difficulty in checking MetiTarski's proofs is its use of a decision procedure for polynomial systems.The present work is a first step towards a framework for verifying MetiTarski proofs.

The Sturm-Tarski Theorem
Sturm's theorem provides an effective method to count the number of unique real roots of a polynomial P has in a given real interval.Such root counting forms the computational basis of many core real algebraic operations, including root isolation and sign determination.We shall cover these applications in Sections 4 and 5.The Sturm-Tarski theorem generalises Sturm's theorem to allow queries concerning a second polynomial Q at the roots of P [14, p. 309].The Sturm-Tarski theorem is the mathematical crux of our work.We shall first prove Sturm-Tarski and then use it to obtain Sturm's theorem as a special case.

Notation and Terminology
We write R for the extended reals R ∪ {−∞, ∞}, where the four arithmetic operations are expended to ±∞ in a straightforward way.The set (ring) of polynomials with real coefficients over the variable X is written R[X].

Basic Definitions
The notion of a Tarski query [2] is fundamental to the Sturm-Tarski theorem.Intuitively, the Tarski query of (Q, P, a, b) paints a high-level picture of the behaviour of Q at the roots of P over the interval (a, b).This information is encoded by a natural number expressing the (signed) difference between (i) the number of roots of P at which Q is positive, and (ii) the number of roots of P at which Q is negative, where the roots of P considered all reside within (a, b).
It is not hard to see that a Tarski query can be used to perform sign determination, i.e., to determine the sign of Q at a specific root α of P , provided we know an interval (a, b) containing α and no other roots of P .Moreover, when Q = 1, the Tarski query T aQ(P, Q, a, b) gives us the number of real roots P has in (a, b).Thus, with Tarski queries, we can both compute the sign of Q at any given root of P , and verify whether or not each instance of this question is well-defined: that P has a single real root in (a, b).
The key question is then: How can T aQ(Q, P, a, b) be computed?We seek a method for obtaining the value of the Tarski query as, e.g., an algebraic expression derived from the coefficients of Q and P and the interval endpoints a and b.This leads us to the notion of a signed remainder sequence and the number of sign variations these sequences contain.
The signed remainder sequence between two univariate polynomials over a field of coefficients is a generalised version of the sequence of intermediate remainders computed by Euclid's greatest common divisor algorithm.
where P, Q ∈ R[X] and mod (dvd) is the remainder (quotient) of polynomial division, such that P = (P dvd Q) Q + (P mod Q) and degree(P mod Q) < degree(Q).
It is easy to see that SRemS n (P, Q) is the greatest common divisor of P and Q.Then, the difference in the number of sign variations in sequence [p 0 , p 1 , . . ., p n ] evaluated at a and b, Var([p 0 , p 1 , . . ., p n ]; a, b), is defined as

Theorem Statement
We now have everything we need to state the main theorem.Note that Sturm's theorem is the special case of the Sturm-Tarski theorem when Q = 1.

Theorem 1 (Sturm-Tarski). The Sturm-Tarski theorem is
TaQ(Q, P, a, b) = Var(SRemS(P, P ′ Q); a, b) Thus, the Sturm-Tarski theorem yields our sought-after method for computing Tarski queries.This computation proceeds via counting the number of sign changes in a particular sequence of generalised remainders evaluated at our interval endpoints.Let us now discuss our proof of the theorem of Isabelle/HOL.We shall then turn to how it is used in our tactic for univariate polynomial problems in Sect. 5.

The Proof and Its Formalisation in Isabelle
Our proof of the Sturm-Tarski theorem in Isabelle is based on Basu et al. [2] and on Cohen's formalization in Coq [3].The core idea is based on the functions jump and Cauchy index [8].
The corresponding definition in Isabelle is as follows: definition jump :: "'a :: {linordered idom,topological space} poly ⇒ 'a poly ⇒'a ⇒ int" where "jump q p x ≡ ( if q =0 ∧ odd((order x p) -(order x q) ) then if sign r pos (q*p) x then 1 else -1 else 0 )" In this definition, order x p is the multiplicity of x as a root of p, and sign r pos (q*p) x is mathematically equivalent to lim u→x + q(u)p(u) > 0. This definition reflects the following reasoning: if q = 0 or order x q > order x p (in this case order x p -order x q = 0 as order x p is of type nat, hence ¬odd(order x p -order x q) ), then q/p at x is not a jump, hence jump p q x=0 ; -if order x p -order x q is not zero and even, then q/p is of the same sign during the jump at x, hence jump p q x=0 ; -otherwise we can decide the sign of the jump by whether lim u→x + q(u)p(u) > 0 holds.
The Cauchy index cindex a b q p is the sum of the jumps of q/p over the interval (a, b): definition cindex:: "'a:: {linordered idom,topological space} ⇒ 'a ⇒ 'a poly ⇒ 'a poly ⇒ int" where "cindex a b q p ≡ ( x ∈{x.poly p x=0 ∧ a< x ∧ x< b}.jump q p x)" For example, let Q = x − 4 and P = (x − 3)(x − 1) 2 (x + 1).The graph of Q/P is shown in Fig. 1.We have Fig. 1.Graph of the rational function (x − 4)/((x − 3)(x − 1) 2 (x + 1)) By case analysis, we can prove a connection between the Tarski query and the Cauchy index: lemma cindex taq: fixes p q::"real poly" assumes "p =0" shows "taq {x.poly p x = 0 ∧ a < x ∧ x < b} q = cindex a b (pderiv p * q) p" where taq S q is mathematically equivalent to x∈S sgn(q(x)) and pderiv p is the first derivative of p.Moreover, the Cauchy index has its own properties: lemma cindex mod: fixes p q::"real poly" assumes "p =0" shows "cindex a b q p = cindex a b (q mod p) p" lemma cindex cross: fixes p q::"real poly" assumes "a < b" "poly (p * q) a =0" "poly (p * q) b =0" shows "cindex a b q p + cindex a b p q = cross (p * q) a b" where With lemmas cindex mod and cindex cross, we can derive the following recurrence relation for cindex : lemma cindex rec: fixes p q::"real poly" assumes "a < b" "poly (p * q) a =0" "poly (p * q) b =0" shows "cindex a b q p = cross (p * q) a b + cindex a b (-(p mod q)) q" This result shares the same recurrence structure with the difference of the signed remainder sequence evaluated at the same end points: lemma changes itv smods rec: assumes "a<b" "poly (p*q) a =0" "poly (p*q) b =0" shows "changes itv smods a b p q = cross (p*q) a b + changes itv smods a b q (-(p mod q))" where changes itv smods a b p q is mathematically equivalent to Var(SRemS(p, q); a, b) in Theorem 1. Together with lemma cindex taq, the connection between signed remainder sequences and Tarski queries is established: theorem sturm tarski interval: assumes "a<b" "poly p a =0" "poly p b =0" shows "taq {x.poly p x=0 ∧ a<x ∧ x<b} q = changes itv smods a b p (pderiv p * q)" Note, this is just the bounded case of the Sturm-Tarski theorem.We also have proved the unbounded and half-bounded cases.⊓ ⊔

Encoding Real Algebraic Numbers
The real algebraic numbers (R alg ) are the roots of non-zero polynomials with integer (equivalently, rational) coefficients.They form a countable, computable subfield of the real numbers.Both R, +, −, ×, <, 0, 1 and R alg , +, −, ×, <, 0, 1 are real closed fields (RCFs).As RCF is a complete theory, these two structures are elementarily equivalent.Unlike R, arithmetic over R alg is computable.During the execution of real algebraic decision methods based on CAD, one computes concretely over R alg and appeals implicitly to the elementary equivalence to carry the results over to R as a whole.
A real algebraic number α can be encoded as a polynomial p and an interval I ⊆ R such that α is the only root of p residing in I.In this section, we formalize this encoding by building a new constructor of type rat poly ⇒ rat ⇒ rat ⇒ real for real numbers.This constructor directly embeds the real algebraic numbers into the reals.
As Isabelle uses Cauchy sequences to represent real numbers, our first step is to convert an encoding of a real algebraic number into a Cauchy sequence through bisection.The given interval is cut in half, and the root is located in the half where the sign of the polynomial changes.We then define a new constructor for real numbers: definition Alg:: "rat poly ⇒ rat ⇒ rat ⇒ real" where "Alg p lb ub = (if valid alg p lb ub then Real (to cauchy (p,lb,ub)) else undefined)" Here valid alg p lb ub is true if and only if p(lb)p(ub) < 0 and p has exactly one real root within the interval (lb, ub), whence lb < ub.Real is the abstraction function of type real ; it builds a real number based on its underlying representation as a Cauchy sequence of type nat ⇒ rat.
Finally, we prove that Alg p lb ub is a root of p within the interval (lb, ub).
We can decide the sign of a polynomial at a real algebraic point using the Sturm-Tarski theorem.Suppose an real algebraic number α is encoded as Alg p lb ub.If valid alg p lb ub is true, we can be sure α is the only root of p within the interval (lb, ub).In this case, the sign of the polynomial q at α can be computed as sgn(q(α)) = x∈(lb,ub),p(x)=0 sgn(q(x)) = TaQ(q, p, lb, ub) = Var(SRemS(p, p ′ q); a, b).
In Isabelle, we define a constant sgn at q x as the sign of polynomial q at x : definition sgn at::"real poly ⇒ real ⇒ real" where "sgn at q x = sgn (poly q x)" We can then prove the following code equation for sgn at : lemma sgn at code alg[code]: "sgn at q (Alg p lb ub) = ( if valid alg p lb ub then of int (changes itv smods lb ub (real poly p) (pderiv (real poly p) * q)) else sgn (poly q undefined))" Note that the right hand side of the equation in the lemma sgn at code alg is executable including valid alg p lb ub, which also uses the Sturm-Tarski theorem to check if p contains exactly one real root within (lb, ub).

The Decision Procedure
Our decision procedure works by checking certificates produced by an external untrusted program.The checking process is built around the executable function sgn at.Let us illustrate the procedures through some instructive examples.

The Existential Case
Existential formulas are proved by univ rcf through the construction and verification of explicit real algebraic numbers witnessing them.For example, when univ rcf is applied to the tactic invokes our external tool to construct a witness certifying the truth of the lemma.This certificate is of the form "[Arep [:-2,0,1:] 0 2]" representing √ 2. This certificate is then handed to our companion tactic, resulting in the tactic application univ rcf cert "[Arep [:-2,0,1:] 0 2]".

The Universal Case
The decision procedure also transforms the universal problem lemma "( ∀ x::real.x*x -2>0 ∨ x < 2)" into an equivalent form: After that, our decision procedure expects a list of all roots of the polynomials in the formula.In this case, the roots are − √ 2, √ 2 and 2: These roots will be computationally checked to ensure (i) they are all distinct roots of the polynomials via sgn at, and (ii) to count the total number of real roots of a polynomial, ensuring the list of roots is complete via Sturm's theorem.Once this list of roots is proved to be complete, we can conclude that the roots partition R into disjoint regions (subsets) such that the truth value of the quantifier-free part of the formula is invariant over each region.In our running example, R is decomposed as By choosing a sample point from each region, we can convert the unbounded universal formula into a bounded one, which can be evaluated directly.In this case, the bounded universal formula could be

Linking to an External Solver
Certificates for both existential and universal can be produced by any program performing univariate CAD.For now, we are using the certificate producing implementation of univariate CAD in MetiTarski.When called, univ rcf will first transform the problem into the input format of MetiTarski, and then invoke MetiTarski in a special mode to produce the necessary certificate for univ rcf cert.Once this is done, we only need to use univ rcf cert together with the certificate to repeat the proof, and no longer need to call MetiTarski again.
This division of labour between the external solver and our Isabelle/HOL tactic follows the classical certificate-based approach (e.g.John Harrison's sumsof-squares approach [10]) of separating the search aspects of a decision procedure from the verification of its results.
Real root isolation can be computationally expensive, especially for polynomials of high degree, large bit-width, or whose roots are very close together.It is not terribly uncommon for highly tuned implementations of real root isolation to run out of resources on particularly challenging univariate problems arising in MetiTarski proofs.
Thus, we wish to delegate this expensive part of our procedure to highly tuned external tools as much as possible.These tools can utilise as many clever optimisations, tricks and heuristics that they want.All we need to do to check their results is: (i) count the total number of unique real roots a polynomial has on the real line, and (ii) verify that precisely this number of unique roots was isolated by the external tool.In this way, we completely insulate ourselves from the search process.

Discussion and Applications
One of our driving motivations is the integration of MetiTarski with Isabelle.MetiTarski [1] is a first-order theorem prover for real number inequalities involving transcendental functions such as sin, tan and exp.It can automatically prove formulas like ∀x ∈ (0, 1.25) =⇒ tan(x) 2 ≤ 1.75 × 10 −7 + tan(1) tan(x 2 ) The main idea behind MetiTarski is to approximate transcendental functions by polynomial or rational function bounds, and then solve the formula by a combinination of a resolution theorem proving and an external Real Closed Field (RCF) decision procedure (QEPCAD, Mathematica or Z3).MetiTarski is a version of Joe Hurd's Metis prover [12], modified to include arithmetic simplification and integration with RCF decision procedures, along with many other refinements.
Applications of MetiTarski include verification problems arising in air traffic control [6] and analog circuit designs [4].As some of the applications are safety critical, it is natural to consider to integrate MetiTarski with an existing interactive theorem prover, whose internal logic can be used to ensure the correctness of MetiTarski's proofs.Besides, the automation provided by MetiTarski is generally useful to interactive theorem provers.
MetiTarski has been integrated with the PVS theorem prover [18] as a trusted oracle [5].The authors state that the automation introduced by MetiTarski for closing sequents containing real-valued functions considerably outperforms existing tactics in PVS.However, this tactic should not be used in a certification environment, where external oracles are not allowed.
Our eventual goal is to integrate MetiTarski into the Isabelle/HOL [17] theorem prover.Isabelle can verify purely logical inferences (in fact, it contains an internal copy of the Metis theorem prover), and the third author has just formalized most of the bounds of transcendental functions used by MetiTarski [19].The primary remaining hurdle is the RCF decision procedure, and the work presented here is the first step towards it.
Finally, let us say a bit about how our work might be generalised to multivariate problems.The root isolation and sign determination algorithms underlying our tactic are necessary for general multivariate CAD-based real quantifier elimination, and particularly for the real algebraic number computations, which typically consume most of the processing time.By building a framework for efficient certificate-based root isolation within a verification environment, we have the beginnings of a solid and practical foundation for multivariate CAD.
That said, a multivariate generalisation of our work requires a certificatebased approach to CAD in general.Unfortunately, it is not obvious to us where the clear separation between search and verification should be in the multivariate case.As the bivariate case is considerably simpler than the general multivariate case, we plan to work on that next.The key question we hope to answer is: What is a certificate for a bivariate CAD?

Related Work
Our work is greatly inspired by Cyril Cohen's PhD thesis [3], within which the Sturm-Tarski theorem and real algebraic numbers are formalized in the Coq theorem prover.However, our goals and approaches are very different.
Cohen's work is part of a large project that has formalized the Feit-Thompson theorem (odd order theorem) in Coq [9], and focuses more on theoretical developments than we do.For example, they proved the Sturm-Tarski theorem to construct an RCF quantifier elimination procedure in the spirit of Tarski's original method, which has important theoretical properties but is not practical as a proof procedure.Moreover, he has formalized arithmetic on real algebraic numbers and shown that they form a real closed field via resultants.We have not formalized resultants at all.Our sign determination algorithm uses the Sturm-Tarski theorem, which is significantly more efficient in practice than using resultants.On the other hand, as it was unnecessary for our proof procedure, we have not proved in Isabelle that the real algebraic numbers form a real closed field.In general, compared to his work, ours stresses the practical side over the theoretical.Fundamentally, we want to build decision procedures to solve non-trivial problems in practice.
Decision procedures based on Sturm's theorem have been implemented in Isabelle and PVS before [7,16].Their core idea is to count the number of real roots within a certain (bounded or not) interval.Generally, they can only handle formulas involving only single polynomial, so they are not complete for firstorder formulas.Moreover, they have not engaged external tools to undertake the expensive process of isolating roots as we do.
The very recent work by Narkawicz et al. [15] is the most similar to ours, and was done concurrently.They have formalized the Sturm-Tarski theorem (which they call Tarski's theorem) and turned it into a decision procedure.The main difference between their work and ours is that their decision procedure uses linear algebra and resembles Cohen's quantifier elimination in Coq but only for the univariate case, while our work follows the idea of cylindrical algebraic decomposition by introducing real algebraic numbers and does not involve with linear algebra.Secondly, their procedure does not rely on certificates produced by external programs, while we do.Thirdly, their procedure seems not to be able to handle arbitrary boolean expressions, but we believe it should not be hard to overcome.In the long term, we both want to build a verified RCF decision procedure into theorem provers (PVS or Isabelle), and we both treat our current work as the first step towards it.
Mahboubi [13] has implemented the executable part of a CAD decision procedure in Coq, but as far as we know the correctness proof for her implementation is still ongoing.This is also one of the reasons for us to choose the certificate-based approach rather than directly verifying an implementation.
There are other methods to handle nonlinear polynomial problems in theorem provers, such as sum of squares [10], which is good for multivariate universal problems but is not applicable when the existential quantifier arises, and interval arithmetic [11,20], which is very efficient for some cases but is not complete.These methods and ours should be used in a complementary way.

Conclusion
We have described our work of building a complete certificate-based decision procedure for first-order univariate polynomial problems in Isabelle.This is a first step towards integrating MetiTarski into Isabelle.The next step is to enable our tactics to handle univariate problems involving transcendental functions.After that, we will proceed to the multivariate cases.