Week 22 — Formally Verifying Neural Networks I

The lecturer closes Part II by promising that the coming parts will show how to mathematically express these three properties — which is exactly Part III.

Part III

A Language for Specifying NN Properties

English is fine for intuition but useless for proving theorems. We now meet a compact precondition/postcondition language — Hoare-triple style — and use it to formalise every property from Part II.

We discussed the properties in English, but English is not useful if we want to prove theorems that the properties hold. So we adopt a formal language. Crucially, we want it to handle specifications that involve interactions between several NNs and multiple inputs and outputs — not just one network on one input.

11 · The general form (precondition / postcondition)

Specification { Precondition } NN(s) computes something { Postcondition }

Definition — the specification triple

Precondition — a statement that holds for the inputs.
One or more NNs compute some outputs.
Postcondition — a statement that holds for the outputs.

Pre- and postconditions are written with:

logical connectives: and \(\wedge\), or \(\vee\), implication \(\Rightarrow\), etc.;
arithmetic operators: \(+\), \(-\), etc., and comparisons \(\ge\), \(\lt\), etc.

Have you seen this before?

This is exactly a Hoare triple from program verification, \(\{P\}\,C\,\{Q\}\): assume \(P\) of the inputs, run the computation \(C\), and \(Q\) must hold of the outputs. The lecturer notes this is only one way to model specifications — it is based on a presentation in Aws al-Barghouthi's book (on neural-network verification).

12 · Every NN is a function \(f:\mathbb{R}^n \to \mathbb{R}^m\)

Convention

In specifications, every NN is treated as a function \(f:\mathbb{R}^n \to \mathbb{R}^m\). Henceforth:

input vectors are written \(x, y, z, \dots\)
output vectors are written \(a, b, c, \dots\)

How to read the whole triple

"For all input vectors that make the precondition true, the postcondition will hold after the NN computes on one of those inputs." The precondition filters which inputs we care about; the postcondition is the promise we must keep for every one of them.

QIn this specification language, what symbols denote inputs vs outputs, and how is a full specification read aloud?

Inputs: \(x,y,z,\dots\). Outputs: \(a,b,c,\dots\). Each NN is a function \(f:\mathbb{R}^n\to\mathbb{R}^m\). Reading: "for all input vectors satisfying the precondition, after the NN computes its output, the postcondition holds." The precondition restricts the inputs considered; the postcondition is the guaranteed property of the outputs.

13 · Formalising image robustness

Recall the English version: "if \(x\) is slightly darker/lighter than \(y\), then \(x\) is classified the same as \(y\)." In the language:

Specification { dist(x, y) ≤ 0.01 } a ← f(x) b ← f(y) { class(a) = class(b) }

where \(f:\mathbb{R}^n\to\mathbb{R}^m\). Two inputs that are close (distance \(\le 0.01\)) must receive the same class. The two clauses \(a\leftarrow f(x)\) and \(b\leftarrow f(y)\) show the same network \(f\) being applied to two inputs — exactly the "multiple inputs" capability we wanted.

The two pieces: `dist` and `class`

Distance between vectors

dist(x,y) is a notion of distance. Two examples on the slide:

Euclidean distance: \[\operatorname{dist}(x,y)=\sqrt{\sum_{0\le i\le \dim(x)} (x_i - y_i)^2}\]

Chebyshev distance: \[\operatorname{dist}(x,y)=\max_{0\le i\le \dim(x)} \lvert x_i - y_i\rvert\]

(Note: \(x_i\) is the \(i\)-th component of vector \(x\).)

Classification from the output

class(a) is the class predicted from the NN's output vector \(a\). If there are \(n\) classes, the output has \(n\) dimensions (one per class), and \[\operatorname{class}(a)=\arg\max_{0\le i\le n-1} a_i.\] i.e. the index of the largest output component — the "most activated" class.

QWrite the Euclidean and Chebyshev distances between two vectors, and state what \(\operatorname{class}(a)\) computes.

Euclidean: \(\sqrt{\sum_i (x_i-y_i)^2}\). Chebyshev: \(\max_i |x_i - y_i|\) (the largest single-coordinate gap). \(\operatorname{class}(a)=\arg\max_{0\le i\le n-1} a_i\): the index of the largest component of the output vector — i.e. the predicted class is whichever output neuron is highest.

QWrite the full formal robustness specification for image classification, and explain what each line does.

\(\{\operatorname{dist}(x,y)\le 0.01\}\) — precondition: the two images are within distance 0.01. Then \(a\leftarrow f(x)\) and \(b\leftarrow f(y)\) — apply the same network to both. \(\{\operatorname{class}(a)=\operatorname{class}(b)\}\) — postcondition: they must get the same predicted class. Together: any two sufficiently-close inputs are classified identically.

14 · Formalising NLP robustness (synonyms)

English: "if two sentences differ in one word, and those words are synonyms, the classification is unchanged." Formally:

Specification { 0 ≤ i < dim(x) ∧ x[i ↦ y_i] = y ∧ x_i ≡ y_i } a ← f(x) b ← f(y) { class(a) = class(b) }

Reading the precondition

\(0 \le i \lt \dim(x)\): \(i\) is a valid word position;
\(x[i \mapsto y_i] = y\): if you take sentence \(x\) and replace its word at position \(i\) with \(y\)'s word, you get exactly \(y\) — so \(x\) and \(y\) are identical except at position \(i\);
\(x_i \equiv y_i\): the two differing words are synonyms (\(\equiv\) denotes the synonym relation).

Postcondition: same classification.

Exercise (from the slide)

"Change the specification to allow for a difference in two words."

QSlide exercise: rewrite the NLP synonym specification to permit the two sentences to differ in two word positions (both replaced by synonyms).

Introduce two distinct positions \(i\) and \(j\) and require both substitutions to recover \(y\), with both pairs being synonyms. One clean version:

Specification { 0 ≤ i < dim(x) ∧ 0 ≤ j < dim(x) ∧ i ≠ j ∧ x[i ↦ y_i][j ↦ y_j] = y ∧ x_i ≡ y_i ∧ x_j ≡ y_j } a ← f(x) b ← f(y) { class(a) = class(b) }

The key changes: a second position \(j\) (with \(i\neq j\)); apply both replacements to \(x\) to obtain \(y\); and add the second synonym constraint \(x_j \equiv y_j\). Everything else (apply \(f\) to both, demand equal class) is unchanged.

15 · Formalising safety (the aircraft)

The collision-avoidance NN can output several advisories; one of them is "do nothing." A sensible safety property: if the intruder is very far (\(>10\text{km}\)) and our aircraft is much faster, the system should advise "do nothing" — i.e. not panic.

Specification { ν_own > 5·ν_int , ρ > 10000 } a ← f(ν_own, ν_int, ρ, ...) { class(a) = do_nothing }

Reading it

\(\nu_{\text{own}} > 5\,\nu_{\text{int}}\): our speed exceeds five times the intruder's. \(\rho > 10000\): range exceeds 10,000 (≈10 km). The comma between the two conditions is a conjunction (both must hold). Under those circumstances, the recommended action must be do nothing.

Exam note

The lecturer's takeaway: for sophisticated systems, coming up with the right property to verify is itself hard and can require domain experts. The maths of the triple is easy; knowing that "\(\nu_{\text{own}}>5\nu_{\text{int}}\) and \(\rho>10000\) ⇒ do nothing" is the safe behaviour requires aviation knowledge. (Recall the sorting trap — the same lesson.)

16 · Formalising consistency (two networks)

Consistency here is stated as: two networks \(f\) and \(g\) should produce the same output on any given input.

Specification { true } a ← f(x) b ← g(x) { class(a) = class(b) }

Reading it

The precondition is literally true — which holds for all possible inputs, so there is no restriction at all. The same input \(x\) is fed to two different networks \(f\) and \(g\). The postcondition demands their classifications agree. This is the "interaction between different NNs" the language was designed to express.

QIn the consistency specification, what does the precondition true signify, and what makes this specification structurally different from the robustness ones?

true is always satisfied, so the precondition imposes no restriction — the property must hold for every input. Structurally, robustness applied one network \(f\) to two inputs \(x,y\); consistency applies two different networks \(f\) and \(g\) to the same input \(x\). This is the multi-network case the language was built to handle.

QWhose textbook presentation is this specification language based on, and what is the equivalent concept from classical program verification?

It is based on a presentation in Aws al-Barghouthi's book on neural-network verification. The equivalent classical concept is the Hoare triple \(\{P\}\,C\,\{Q\}\): precondition \(P\), command/computation \(C\), postcondition \(Q\).

Part IV

What Can We Do With a Specification?

Once a property is written down, there are exactly two outcomes: we disprove it with a counterexample, or we prove it holds. This week stops at the framing; the proof techniques are for next time.

Given a specification, we have one of two options. Either we show it does not hold (find a counterexample), or we show it holds (formally verify it by proof).

17 · Option 1 — disproof by counterexample

Definition — Counterexample

A counterexample is a specific input assignment \(\{x\mapsto \text{val}_1,\; y\mapsto \text{val}_2,\; z\mapsto \text{val}_3,\dots\}\) showing the specification fails: the precondition holds for it, yet the postcondition does not hold for the NN's output on it.

The slide's illustration uses a trivial "network" \(f(x)=x+1\):

Specification { x ≤ 0.1 } a ← f(x) # f(x) = x + 1 { a ≤ 1.05 }

The lecturer states this specification does not hold, and gives the counterexample \(\{x \mapsto 1\}\).

Exam trap — read the definition carefully (likely slide typo)

By the definition on the very same slide, a counterexample must satisfy the precondition. But \(x=1\) does not satisfy \(x\le 0.1\) — so strictly, \(x\mapsto 1\) is not a valid counterexample to this triple. The cleanest correct counterexample is \(\boxed{x = 0.1}\): it satisfies the precondition (\(0.1 \le 0.1\)), and gives \(a=f(0.1)=1.1\), which violates \(a\le 1.05\). So the spec genuinely fails — just witness it with \(x=0.1\), not \(x=1\).

If your exam asks you to produce a counterexample, always check it satisfies the precondition and violates the postcondition. (The point the slide is making — that the spec is false — is correct; only the chosen witness is off.)

QFor \(f(x)=x+1\) and the spec \(\{x\le 0.1\}\;a\leftarrow f(x)\;\{a\le 1.05\}\), give a valid counterexample and verify both conditions.

Take \(x = 0.1\). Precondition: \(0.1 \le 0.1\) ✓ (holds). Compute: \(a = f(0.1) = 1.1\). Postcondition: need \(a \le 1.05\), but \(1.1 \le 1.05\) is false ✗. Precondition holds and postcondition fails, so \(x=0.1\) is a valid counterexample and the specification does not hold. (Any \(x\in(0.05,\,0.1]\) works, since then \(1.05 < x+1 \le 1.1\).)

18 · Option 2 — the specification holds, so prove it

The second possibility is that the specification is true, in which case we can formally verify it by mathematically proving it. The slide's example:

Specification { x ≤ 0.1 } a ← f(x) { a ≥ 0.1 }

What this slide is really illustrating

This is presented as the contrast case: instead of disproving with a counterexample, here the property holds and our job is to construct a proof. The lecturer uses it to make the conceptual point — "the other outcome is: it's true, now prove it" — rather than to grind through the algebra. (Whether it literally holds depends on the exact network \(f\) and the allowed input domain; treat it as a schematic example of the "spec-holds" branch.)

The week ends on a cliff-hanger: what remains is discussing the techniques that let us prove or disprove a specification in a machine-checkable way. That is the content of "Formally Verifying Neural Networks II."

QGiven any NN specification, what are the only two things we can do with it, and which one yields a machine-checkable proof?

(1) Disprove it by exhibiting a counterexample — an input satisfying the precondition but whose output violates the postcondition. (2) Prove it holds by formally verifying it. The proof branch is what produces a machine-checkable certificate of correctness; a counterexample is itself easily checkable too (just run \(f\) on it), but it certifies failure rather than correctness.

QSynthesis: walk through the full pipeline this week set up, from an English wish to a machine-checkable result.

(1) Decide what property you want (robustness / safety / consistency) — hard, may need domain experts, and the "obvious" wording is often too weak (the sorting trap). (2) Represent it in the precondition/postcondition language: \(\{P\}\) inputs, NN(s) compute, \(\{Q\}\) outputs, using \(\wedge,\vee,\Rightarrow\), arithmetic and comparisons, with each NN as \(f:\mathbb{R}^n\to\mathbb{R}^m\). (3) Then either find a counterexample (precondition holds, postcondition fails) to disprove it, or prove it holds. (4) The proof/check is machine-checkable, giving certainty over all inputs. Step (3)'s actual techniques are next week.

Part V

Wrap-up & Exam Readiness

The lecturer's own summary: this week introduced formal verification and specifications, discussed specifications for NNs, and showed how to represent them mathematically so they can be verified.

That maps cleanly onto our three big questions: we answered "what properties?" (robustness, safety, consistency) and "how to represent them?" (the precondition/postcondition language). The third question — "how to verify?" — is deliberately left open for the techniques lecture.

The week in one breath

Formal verification means proving — and machine-checking — a property over every input rather than testing a few; the hard part is writing the property down (even "sort" needed a "same-elements" clause), so for neural networks we target three intrinsic, provable properties — robustness, safety, consistency — and encode each as a precondition/postcondition triple over networks-as-functions \(f:\mathbb{R}^n\to\mathbb{R}^m\); given such a spec we either disprove it with a counterexample or prove it holds.

Exam-readiness checklist

Define formal verification and explain why proofs are "usually machine-generated, always machine-checkable" — and why the second clause matters more.
Use the sorting example to argue that specifying intent is harder than proving it; explain why "output is sorted" is too weak and state the "same elements" fix.
State the three questions formal verification of NNs decomposes into, and which two this week answers.
Explain why "agrees with most humans on most pandas" is not a verifiable property.
Define robustness; describe the Goodfellow et al. 2014 panda→gibbon example (FGSM: \(\epsilon\cdot\operatorname{sign}(\nabla_x J(\theta,x,y))\), confidences 57.7% → 99.3%) and the STOP-sign safety motivation.
Define safety and a "bad state"; describe the Katz et al. aircraft collision-avoidance NN (inputs, outputs, why certification needs proof).
Define consistency and give the "A on B, B on C, C on A" contradiction; say why it matters for generative models.
Write the general specification triple \(\{P\}\;\text{NN(s)}\;\{Q\}\) and list the connectives/operators allowed (link it to Hoare triples / al-Barghouthi).
State the NN-as-function convention \(f:\mathbb{R}^n\to\mathbb{R}^m\), the \(x,y,z\)/\(a,b,c\) naming, and read a triple aloud correctly.
Write the image-robustness spec and define dist (Euclidean & Chebyshev formulas) and class (\(\arg\max\) over output dims).
Write the NLP synonym spec, explain \(x[i\mapsto y_i]=y\) and \(x_i\equiv y_i\), and do the two-word extension exercise.
Write the safety spec \(\{\nu_{\text{own}}>5\nu_{\text{int}},\ \rho>10000\}\Rightarrow\) do nothing, and the consistency spec with precondition true and two networks \(f,g\).
Define a counterexample precisely, and for \(f(x)=x+1\), \(\{x\le0.1\}\{a\le1.05\}\) produce a valid witness (\(x=0.1\)) — checking it satisfies the precondition and violates the postcondition.
State the two things you can do with any specification (disprove via counterexample / prove it holds) and note that the verification techniques are the "Part II" still to come.

Formally Verifying Neural Networks I

What Formal Verification Is

10 · Property #3 — Consistency

A Language for Specifying NN Properties

11 · The general form (precondition / postcondition)

12 · Every NN is a function \(f:\mathbb{R}^n \to \mathbb{R}^m\)

13 · Formalising image robustness

The two pieces: `dist` and `class`

14 · Formalising NLP robustness (synonyms)

15 · Formalising safety (the aircraft)

16 · Formalising consistency (two networks)

What Can We Do With a Specification?

17 · Option 1 — disproof by counterexample

18 · Option 2 — the specification holds, so prove it

Wrap-up & Exam Readiness

The week in one breath

Exam-readiness checklist

What Formal Verification Is

10 · Property #3 — Consistency

A Language for Specifying NN Properties

11 · The general form (precondition / postcondition)

12 · Every NN is a function \(f:\mathbb{R}^n \to \mathbb{R}^m\)

13 · Formalising image robustness

The two pieces: dist and class

14 · Formalising NLP robustness (synonyms)

15 · Formalising safety (the aircraft)

16 · Formalising consistency (two networks)

What Can We Do With a Specification?

17 · Option 1 — disproof by counterexample

18 · Option 2 — the specification holds, so prove it

Wrap-up & Exam Readiness

The week in one breath

Exam-readiness checklist

The two pieces: `dist` and `class`