Proving — not testing — that a network behaves: robustness, safety and consistency, the precondition/postcondition language for stating them, and what a counterexample really is.
Before we touch neural networks, we need to be clear what it means to prove something about a program — and why writing down "what we want" is itself the hard part.
This is a new strand of the module, gl red" x="98" y="222">θ
A "bad state" is any situation in which the two aircraft would collide. Verification was necessary (not just nice-to-have) because the system is safety-critical: it must be certified before deployment, and certification for such systems requires a mathematical guarantee — proven safety on all inputs — not merely good test performance. The inputs include heading angles and vertical separation; outputs are advisories like turn left/right.
Consistency is especially important for generative models: the model's outputs should not contradict each other. For example, we'd like an LLM to not simultaneously state that Block A is on B, B is on C, and C is on A.
The lecturer closes Part II by promising that the coming parts will show how to mathematically express these three properties — which is exactly Part III.
Robustness — small input change ⇒ small output change; most relevant to classifiers. Safety — never enter a "bad" state; most relevant to NNs controlling cyber-physical systems. Consistency — outputs don't contradict each other; most relevant to generative models (e.g. LLMs).
English is fine for intuition but useless for proving theorems. We now meet a compact precondition/postcondition language — Hoare-triple style — and use it to formalise every property from Part II.
We discussed the properties in English, but English is not useful if we want to prove theorems that the properties hold. So we adopt a formal language. Crucially, we want it to handle specifications that involve interactions between several NNs and multiple inputs and outputs — not just one network on one input.
Pre- and postconditions are written with:
This is exactly a Hoare triple from program verification, \(\{P\}\,C\,\{Q\}\): assume \(P\) of the inputs, run the computation \(C\), and \(Q\) must hold of the outputs. The lecturer notes this is only one way to model specifications — it is based on a presentation in Aws al-Barghouthi's book (on neural-network verification).
In specifications, every NN is treated as a function \(f:\mathbb{R}^n \to \mathbb{R}^m\). Henceforth:
"For all input vectors that make the precondition true, the postcondition will hold after the NN computes on one of those inputs." The precondition filters which inputs we care about; the postcondition is the promise we must keep for every one of them.
Inputs: \(x,y,z,\dots\). Outputs: \(a,b,c,\dots\). Each NN is a function \(f:\mathbb{R}^n\to\mathbb{R}^m\). Reading: "for all input vectors satisfying the precondition, after the NN computes its output, the postcondition holds." The precondition restricts the inputs considered; the postcondition is the guaranteed property of the outputs.
Recall the English version: "if \(x\) is slightly darker/lighter than \(y\), then \(x\) is classified the same as \(y\)." In the language:
where \(f:\mathbb{R}^n\to\mathbb{R}^m\). Two inputs that are close (distance \(\le 0.01\)) must receive the same class. The two clauses \(a\leftarrow f(x)\) and \(b\leftarrow f(y)\) show the same network \(f\) being applied to two inputs — exactly the "multiple inputs" capability we wanted.
dist and classdist(x,y) is a notion of distance. Two examples on the slide:
Euclidean distance: \[\operatorname{dist}(x,y)=\sqrt{\sum_{0\le i\le \dim(x)} (x_i - y_i)^2}\]
Chebyshev distance: \[\operatorname{dist}(x,y)=\max_{0\le i\le \dim(x)} \lvert x_i - y_i\rvert\]
(Note: \(x_i\) is the \(i\)-th component of vector \(x\).)
class(a) is the class predicted from the NN's output vector \(a\). If there are \(n\) classes, the output has \(n\) dimensions (one per class), and \[\operatorname{class}(a)=\arg\max_{0\le i\le n-1} a_i.\] i.e. the index of the largest output component — the "most activated" class.
Euclidean: \(\sqrt{\sum_i (x_i-y_i)^2}\). Chebyshev: \(\max_i |x_i - y_i|\) (the largest single-coordinate gap). \(\operatorname{class}(a)=\arg\max_{0\le i\le n-1} a_i\): the index of the largest component of the output vector — i.e. the predicted class is whichever output neuron is highest.
\(\{\operatorname{dist}(x,y)\le 0.01\}\) — precondition: the two images are within distance 0.01. Then \(a\leftarrow f(x)\) and \(b\leftarrow f(y)\) — apply the same network to both. \(\{\operatorname{class}(a)=\operatorname{class}(b)\}\) — postcondition: they must get the same predicted class. Together: any two sufficiently-close inputs are classified identically.
English: "if two sentences differ in one word, and those words are synonyms, the classification is unchanged." Formally:
Postcondition: same classification.
"Change the specification to allow for a difference in two words."
Introduce two distinct positions \(i\) and \(j\) and require both substitutions to recover \(y\), with both pairs being synonyms. One clean version:
The key changes: a second position \(j\) (with \(i\neq j\)); apply both replacements to \(x\) to obtain \(y\); and add the second synonym constraint \(x_j \equiv y_j\). Everything else (apply \(f\) to both, demand equal class) is unchanged.
The collision-avoidance NN can output several advisories; one of them is "do nothing." A sensible safety property: if the intruder is very far (\(>10\text{km}\)) and our aircraft is much faster, the system should advise "do nothing" — i.e. not panic.
\(\nu_{\text{own}} > 5\,\nu_{\text{int}}\): our speed exceeds five times the intruder's. \(\rho > 10000\): range exceeds 10,000 (≈10 km). The comma between the two conditions is a conjunction (both must hold). Under those circumstances, the recommended action must be do nothing.
The lecturer's takeaway: for sophisticated systems, coming up with the right property to verify is itself hard and can require domain experts. The maths of the triple is easy; knowing that "\(\nu_{\text{own}}>5\nu_{\text{int}}\) and \(\rho>10000\) ⇒ do nothing" is the safe behaviour requires aviation knowledge. (Recall the sorting trap — the same lesson.)
Consistency here is stated as: two networks \(f\) and \(g\) should produce the same output on any given input.
The precondition is literally true — which holds for all possible inputs, so there is no restriction at all. The same input \(x\) is fed to two different networks \(f\) and \(g\). The postcondition demands their classifications agree. This is the "interaction between different NNs" the language was designed to express.
true signify, and what makes this specification structurally different from the robustness ones?true is always satisfied, so the precondition imposes no restriction — the property must hold for every input. Structurally, robustness applied one network \(f\) to two inputs \(x,y\); consistency applies two different networks \(f\) and \(g\) to the same input \(x\). This is the multi-network case the language was built to handle.
It is based on a presentation in Aws al-Barghouthi's book on neural-network verification. The equivalent classical concept is the Hoare triple \(\{P\}\,C\,\{Q\}\): precondition \(P\), command/computation \(C\), postcondition \(Q\).
Once a property is written down, there are exactly two outcomes: we disprove it with a counterexample, or we prove it holds. This week stops at the framing; the proof techniques are for next time.
Given a specification, we have one of two options. Either we show it does not hold (find a counterexample), or we show it holds (formally verify it by proof).
A counterexample is a specific input assignment \(\{x\mapsto \text{val}_1,\; y\mapsto \text{val}_2,\; z\mapsto \text{val}_3,\dots\}\) showing the specification fails: the precondition holds for it, yet the postcondition does not hold for the NN's output on it.
The slide's illustration uses a trivial "network" \(f(x)=x+1\):
The lecturer states this specification does not hold, and gives the counterexample \(\{x \mapsto 1\}\).
By the definition on the very same slide, a counterexample must satisfy the precondition. But \(x=1\) does not satisfy \(x\le 0.1\) — so strictly, \(x\mapsto 1\) is not a valid counterexample to this triple. The cleanest correct counterexample is \(\boxed{x = 0.1}\): it satisfies the precondition (\(0.1 \le 0.1\)), and gives \(a=f(0.1)=1.1\), which violates \(a\le 1.05\). So the spec genuinely fails — just witness it with \(x=0.1\), not \(x=1\).
If your exam asks you to produce a counterexample, always check it satisfies the precondition and violates the postcondition. (The point the slide is making — that the spec is false — is correct; only the chosen witness is off.)
Take \(x = 0.1\). Precondition: \(0.1 \le 0.1\) ✓ (holds). Compute: \(a = f(0.1) = 1.1\). Postcondition: need \(a \le 1.05\), but \(1.1 \le 1.05\) is false ✗. Precondition holds and postcondition fails, so \(x=0.1\) is a valid counterexample and the specification does not hold. (Any \(x\in(0.05,\,0.1]\) works, since then \(1.05 < x+1 \le 1.1\).)
The second possibility is that the specification is true, in which case we can formally verify it by mathematically proving it. The slide's example:
This is presented as the contrast case: instead of disproving with a counterexample, here the property holds and our job is to construct a proof. The lecturer uses it to make the conceptual point — "the other outcome is: it's true, now prove it" — rather than to grind through the algebra. (Whether it literally holds depends on the exact network \(f\) and the allowed input domain; treat it as a schematic example of the "spec-holds" branch.)
The week ends on a cliff-hanger: what remains is discussing the techniques that let us prove or disprove a specification in a machine-checkable way. That is the content of "Formally Verifying Neural Networks II."
(1) Disprove it by exhibiting a counterexample — an input satisfying the precondition but whose output violates the postcondition. (2) Prove it holds by formally verifying it. The proof branch is what produces a machine-checkable certificate of correctness; a counterexample is itself easily checkable too (just run \(f\) on it), but it certifies failure rather than correctness.
(1) Decide what property you want (robustness / safety / consistency) — hard, may need domain experts, and the "obvious" wording is often too weak (the sorting trap). (2) Represent it in the precondition/postcondition language: \(\{P\}\) inputs, NN(s) compute, \(\{Q\}\) outputs, using \(\wedge,\vee,\Rightarrow\), arithmetic and comparisons, with each NN as \(f:\mathbb{R}^n\to\mathbb{R}^m\). (3) Then either find a counterexample (precondition holds, postcondition fails) to disprove it, or prove it holds. (4) The proof/check is machine-checkable, giving certainty over all inputs. Step (3)'s actual techniques are next week.
The lecturer's own summary: this week introduced formal verification and specifications, discussed specifications for NNs, and showed how to represent them mathematically so they can be verified.
That maps cleanly onto our three big questions: we answered "what properties?" (robustness, safety, consistency) and "how to represent them?" (the precondition/postcondition language). The third question — "how to verify?" — is deliberately left open for the techniques lecture.
Formal verification means proving — and machine-checking — a property over every input rather than testing a few; the hard part is writing the property down (even "sort" needed a "same-elements" clause), so for neural networks we target three intrinsic, provable properties — robustness, safety, consistency — and encode each as a precondition/postcondition triple over networks-as-functions \(f:\mathbb{R}^n\to\mathbb{R}^m\); given such a spec we either disprove it with a counterexample or prove it holds.
dist (Euclidean & Chebyshev formulas) and class (\(\arg\max\) over output dims).true and two networks \(f,g\).