KCL · Machine Learning · Week 23

Formally Verifying
Neural Networks II

Propositional logic, Linear Real Arithmetic, and how to compile a network specification into something a solver can prove.

KCL · ML  ·  ~52 min read  ·  17 self-tests  ·  Lecturer: Mohammad Abdulaziz
Part I

The Big Picture — the Verification Pipeline

Last week we learned how to write down what a network should do. This week we learn how to actually prove it — by turning that promise into a logic formula and handing it to a machine.

Suppose you have a neural network and a desirable property you want it to satisfy — say, "if the input image is within a tiny distance of a stop-sign photo, the network still classifies it as a stop sign." Last week's job was to state such properties precisely. This week's job is to verify them: to obtain either a watertight guarantee that the property always holds, or a concrete counter-example — a specific input where it fails.

The whole strategy rests on one move: translate the specification into a logical formula, then feed that formula to an automated theorem prover (also called a solver). The solver does one of two things — it either formally verifies the specification, or it finds a counter-example.

NN spec S {pre} f {post} compile LRA formula φ(S) solver theorem prover CVC5 / Z3 valid ✓ spec holds counter-ex a bug
The pipeline (slides 2–3): specification → logical formula → solver → proof or counter-example.

The target language of this translation is a logic that mixes propositional logic and linear arithmetic. That choice is deliberate: propositional logic gives us the "and / or / not / if-then" structure to glue conditions together, and linear arithmetic gives us the ability to talk about the numbers a network actually computes. The combined logic is called Linear Real Arithmetic (LRA), and most of this week is about it.

Intuition

Think of the solver as an infinitely patient adversary. You hand it a sentence that means "the network always behaves correctly." It tries with all its might to find one situation that breaks the sentence. If it fails after exhausting every possibility, you have a proof. If it succeeds, it hands you the exact breaking input — far more useful than a vague "it sometimes fails."

Exam trap

"Verify the specification" and "find a counter-example" are two faces of the same question, not two different tasks. The solver searches for a counter-example; not finding one is the proof of correctness. Don't describe them as separate algorithms.

QWhy translate the specification into a formula at all? Why not just test the network on lots of inputs?

Testing can only ever check the finitely many inputs you actually try. A network defined over real-valued inputs has infinitely many possible inputs, so testing can never guarantee the absence of a bug — it can only find bugs that happen to be among your test cases. Translating to a logical formula and using a solver lets us reason about all inputs at once: the solver either proves the property over the entire (infinite) input space, or returns a specific counter-example. That is the difference between "we didn't find a problem" and "there is provably no problem."

·  ·  ·
Part II

Propositional Logic & Satisfiability

Before we can compile anything, we need the language. We start with the simplest logic — true/false statements wired together — and build up the four ideas everything else depends on: value, interpretation, satisfiability, validity.

Propositional logic is the logic of statements that are either true or false, combined with connectives like "and", "or" and "not". A formula is built from a tiny grammar. The lecturer wrote out that grammar by hand on the slide; here it is, with every production.

The grammar of formulae

A formula \(\phi\) is defined recursively by the following grammar (slide 4):

\[ \phi ::= \;\bot \;\mid\; \top \;\mid\; v \;\mid\; \lnot\phi \;\mid\; \phi \lor \phi \;\mid\; \phi \land \phi \]

Reading each production in plain English:

ProductionNameMeans
\(\bot\)"bottom"the constant False
\(\top\)"top"the constant True
\(v\)variablea Boolean variable \(v_1, v_2, v_3,\dots\)
\(\lnot\phi\)negation"not \(\phi\)"
\(\phi \lor \phi\)disjunction"or" — either the left holds or the right holds
\(\phi \land \phi\)conjunction"and" — both hold
Definition · connectives

\(v_1 \lor v_2\) reads "either \(v_1\) holds or \(v_2\) holds"; \(v_1 \land v_2\) reads "both \(v_1\) and \(v_2\) hold." Negation \(\lnot v_1\) flips the truth value. These three — \(\lnot, \lor, \land\) — plus the constants \(\bot, \top\) are the basic connectives, and everything else can be built from them.

A formula is just a tree built from these productions. The lecturer's worked example (slide 4) was \((v_1 \land v_2) \lor (\lnot v_3)\): a top-level "or", whose left branch is an "and" of \(v_1\) and \(v_2\), and whose right branch is "not \(v_3\)".

¬ v₁ v₂ v₃
The parse tree of \((v_1 \land v_2) \lor (\lnot v_3)\) — the lecturer's example from slide 4.

Auxiliary connectives — implication and bi-implication

We don't need new primitives for "if-then." Implication and bi-implication are defined (they are "auxiliary" — sugar built from the basics, slide 5):

\[ \phi_1 \rightarrow \phi_2 \;\equiv\; (\lnot\phi_1 \lor \phi_2) \] \[ \phi_1 \leftrightarrow \phi_2 \;\equiv\; (\phi_1 \rightarrow \phi_2) \land (\phi_2 \rightarrow \phi_1) \]
In plain English

Implication \(\phi_1 \rightarrow \phi_2\): "if \(\phi_1\) holds then \(\phi_2\) holds; otherwise we are not asserting anything." That last clause is the subtle part — when \(\phi_1\) is false, the implication is automatically true regardless of \(\phi_2\). The definition \(\lnot\phi_1 \lor \phi_2\) captures exactly this: it is satisfied either by \(\phi_1\) being false, or by \(\phi_2\) being true.

Bi-implication \(\phi_1 \leftrightarrow \phi_2\): "\(\phi_1\) holds if and only if \(\phi_2\) holds" — implication in both directions.

Exam trap

A false hypothesis makes an implication true, not false. "If \(\phi_1\) then \(\phi_2\)" only fails when \(\phi_1\) is true and \(\phi_2\) is false. This "vacuous truth" is exactly why implications encode preconditions so well later: when the precondition does not hold, the specification makes no demand. Memorise \(\phi_1 \rightarrow \phi_2 \equiv \lnot\phi_1 \lor \phi_2\) — you will use it constantly in Part IV.

QRewrite \(\phi_1 \rightarrow \phi_2\) using only \(\lnot, \land\) (no \(\lor\)). Then explain why \(v \rightarrow v\) is always true.

Starting from \(\lnot\phi_1 \lor \phi_2\) and applying De Morgan's law (\(a \lor b \equiv \lnot(\lnot a \land \lnot b)\)):

\[ \phi_1 \rightarrow \phi_2 \;\equiv\; \lnot\phi_1 \lor \phi_2 \;\equiv\; \lnot(\phi_1 \land \lnot\phi_2). \]

For \(v \rightarrow v\): this is \(\lnot v \lor v\). Whatever truth value \(v\) takes, exactly one of \(\lnot v\) and \(v\) is true, so the disjunction is always true. (This is the same tautology pattern as the law of excluded middle below.)

Variable-free formulae and the value rules

Most interesting formulae have variables. But a formula without variables (slide 6) can always be simplified all the way down to either \(\top\) or \(\bot\). The lecturer's example:

\[ (\top \lor \top) \lor (\lnot\bot) \;\longrightarrow\; \top \lor \top \;\longrightarrow\; \top. \]
Definition · value

The value of a variable-free formula, written \(\operatorname{value}(\phi)\), is the form it has after applying the rewrite rules below as many times as possible (slide 7).

These rewrite rules — written entirely by hand on the slide — are the engine that collapses any variable-free formula to \(\top\) or \(\bot\). The arrow \(\rightsquigarrow\) means "rewrites to":

DisjunctionConjunctionNegation
\(\top \lor \phi \rightsquigarrow \top\) \(\top \land \phi \rightsquigarrow \phi\) \(\lnot\bot \rightsquigarrow \top\)
\(\phi \lor \top \rightsquigarrow \top\) \(\phi \land \top \rightsquigarrow \phi\) \(\lnot\top \rightsquigarrow \bot\)
\(\bot \lor \phi \rightsquigarrow \phi\) \(\bot \land \phi \rightsquigarrow \bot\)
\(\phi \lor \bot \rightsquigarrow \phi\) \(\phi \land \bot \rightsquigarrow \bot\)
In plain English

These are just the truth tables, written as simplification steps. "True OR anything is True." "True AND anything is whatever the anything was." "False OR anything is whatever the anything was." "False AND anything is False." And "not False is True; not True is False." Apply them inside-out until nothing changes; what's left is the value.

QCompute \(\operatorname{value}\big((\bot \land \top) \lor \lnot(\top \land \top)\big)\) step by step.

Work inside-out:

\[ (\bot \land \top) \lor \lnot(\top \land \top) \]

Left: \(\bot \land \top \rightsquigarrow \bot\).   Inside the negation: \(\top \land \top \rightsquigarrow \top\), so \(\lnot(\top \land \top) \rightsquigarrow \lnot\top \rightsquigarrow \bot\). Now:

\[ \bot \lor \bot \rightsquigarrow \bot. \]

So the value is \(\bot\).

Interpretations — plugging in truth values

Definition · interpretation

For a formula \(\phi\), an interpretation \(\mathcal{I}\) is a mapping from the variables occurring in \(\phi\) to \(\{\top, \bot\}\). We write \(\mathcal{I}[\phi]\) for the image of \(\phi\) under \(\mathcal{I}\) — that is, \(\phi\) with each variable replaced by the truth value \(\mathcal{I}\) assigns it (slide 8).

An interpretation turns a formula with variables into a variable-free one, which we can then evaluate with the rules above. The lecturer's running example, with \(\phi \equiv (v_1 \lor v_2) \lor (\lnot v_3)\):

\[ \mathcal{I}_1 = \{\, v_1 \mapsto \top,\; v_2 \mapsto \top,\; v_3 \mapsto \bot \,\}, \qquad \mathcal{I}_1[\phi] = (\top \lor \top) \lor (\lnot\bot). \]

Satisfiability

Definition · satisfiability

For a formula \(\phi\) and an interpretation \(\mathcal{I}\), if \(\operatorname{value}(\mathcal{I}[\phi]) = \top\) then we say \(\mathcal{I}\) satisfies \(\phi\), written \(\mathcal{I} \models \phi\) (slide 9).

Continuing the example — first an interpretation that satisfies, then one that does not:

\[ \operatorname{value}(\mathcal{I}_1[\phi]) = \operatorname{value}\big((\top \lor \top) \lor (\lnot\bot)\big) = \operatorname{value}(\top \lor (\lnot\bot)) = \operatorname{value}(\top \lor \top) = \operatorname{value}(\top) = \top \;\Rightarrow\; \mathcal{I}_1 \models \phi. \]

Now take \(\mathcal{I}_2 = \{ v_1 \mapsto \bot,\; v_2 \mapsto \bot,\; v_3 \mapsto \top \}\):

\[ \mathcal{I}_2[\phi] = (\bot \lor \bot) \lor (\lnot\top), \qquad \operatorname{value}(\mathcal{I}_2[\phi]) = \operatorname{value}(\bot \lor \bot) = \bot \;\Rightarrow\; \mathcal{I}_2 \not\models \phi. \]
In plain English

To check whether an interpretation satisfies a formula: substitute the truth values, then simplify with the value rules. If you land on \(\top\), it satisfies; if you land on \(\bot\), it doesn't. A formula is satisfiable if at least one interpretation satisfies it.

Validity

Definition · validity

A formula \(\phi\) is valid if and only if for all interpretations \(\mathcal{I}\) of \(\phi\), we have \(\mathcal{I} \models \phi\) (slide 10).

Our \(\phi \equiv (v_1 \lor v_2) \lor (\lnot v_3)\) is satisfiable (\(\mathcal{I}_1 \models \phi\)) but not valid, precisely because \(\mathcal{I}_2 \not\models \phi\): one failing interpretation is enough to destroy validity. Compare it with the lecturer's valid example, \(\phi_2 = v_1 \lor \lnot v_1\):

Interpretation\(\mathcal{I}[\phi_2]\)valueverdict
\(\mathcal{I}_1 = \{v_1 \mapsto \bot\}\)\(\bot \lor (\lnot\bot) = \bot \lor \top\)\(\top\)\(\mathcal{I}_1 \models \phi_2\)
\(\mathcal{I}_2 = \{v_1 \mapsto \top\}\)\(\top \lor (\lnot\top) = \top \lor \bot\)\(\top\)\(\mathcal{I}_2 \models \phi_2\)

Every interpretation satisfies \(\phi_2\), so \(\phi_2\) is valid. This is the famous law of excluded middle — a tautology.

Exam trap · don't confuse the three

Satisfiable = some interpretation makes it true. Valid = every interpretation makes it true. Unsatisfiable = no interpretation makes it true. Validity is the strong one. A handy bridge: \(\phi\) is valid iff \(\lnot\phi\) is unsatisfiable — which is exactly why a solver proves validity by failing to satisfy the negation.

QIs \(v_1 \land \lnot v_1\) satisfiable? Valid? Justify with interpretations.

Try both interpretations. \(\{v_1 \mapsto \top\}\): \(\top \land \lnot\top = \top \land \bot \rightsquigarrow \bot\). \(\{v_1 \mapsto \bot\}\): \(\bot \land \lnot\bot = \bot \land \top \rightsquigarrow \bot\). Both give \(\bot\), so no interpretation satisfies it: it is unsatisfiable (a contradiction), hence certainly not valid. It is the exact opposite of the tautology \(v_1 \lor \lnot v_1\).

QShow \((v_1 \rightarrow v_2) \lor (v_2 \rightarrow v_1)\) is valid, using the definition of \(\rightarrow\).

Expand both implications with \(\phi_1 \rightarrow \phi_2 \equiv \lnot\phi_1 \lor \phi_2\):

\[ (\lnot v_1 \lor v_2) \lor (\lnot v_2 \lor v_1). \]

This contains both \(v_1\) and \(\lnot v_1\) as disjuncts. For any interpretation, one of \(v_1, \lnot v_1\) is \(\top\), and \(\top \lor \text{anything} = \top\). So every interpretation satisfies it → valid. (Intuitively: of any two statements, at least one must imply the other.)

·  ·  ·
Part III

Linear Real Arithmetic (LRA)

Propositional logic can say "and / or / not" — but a network computes with numbers. So we bolt linear arithmetic onto the logic, and get a language rich enough to describe what a network does and what we demand of it.

Recall that neural networks only compute with numbers — a network is a function \(f : \mathbb{R}^m \rightarrow \mathbb{R}^n\) (slide 11). So if the logic is to model a specification fully, it must be able to talk about numbers and operations on them. We therefore add real numbers and a few operations to the logic. In particular we need: addition, subtraction, multiplication by constants, and comparisons.

Exam trap · why "linear"

The word is multiplication by constants — never variable × variable. You may write \(2x_1\) or \(3x_2\), but never \(x_1 \cdot x_2\). Allowing variables to multiply each other would make the arithmetic non-linear, and the whole tractability story (and the solvers) would break. This single restriction is what the "Linear" in LRA refers to.

The LRA grammar

LRA formulae extend the propositional grammar with two new arithmetic atoms (slide 12). The full grammar is:

\[ \phi ::= \;\underbrace{\bot \;\mid\; \top \;\mid\; v \;\mid\; \phi \lor \phi \;\mid\; \phi \land \phi \;\mid\; \lnot\phi}_{\text{propositional logic}} \;\mid\; \underbrace{\textstyle\sum_{i=1}^{n} c_i \cdot \mathit{var}_i \le k \;\mid\; \textstyle\sum_{i=1}^{n} c_i \cdot \mathit{var}_i < k}_{\text{linear arithmetic}} \]

where the \(c_i\) (coefficients) and \(k\) (the bound) are real constants in \(\mathbb{R}\), and the \(\mathit{var}_i\) are real variables \(x_1, x_2, \dots\). The Boolean variables \(v\) from before are still there. Putting both halves together is what gives the logic its name:

\[ \underbrace{\text{propositional logic}}_{\bot,\top,v,\lor,\land,\lnot} \;+\; \underbrace{\text{linear arithmetic}}_{\sum c_i \mathit{var}_i \,\{\le,<\}\, k} \;\equiv\; \textbf{LRA}. \]
Example

The lecturer's mixed formula from slide 12:

\[ \underbrace{(2x_1 + 3x_2 - x_3 \le 0)}_{\text{linear arithmetic atom}} \;\land\; \underbrace{(v_1 \lor v_2)}_{\text{propositional}}. \]

One conjunct is a linear inequality over real variables; the other is a pure propositional clause over Boolean variables. LRA lets them live in the same formula.

Representing richer functions

The grammar only gives us \(\le\) and \(<\) — yet we can express a surprising amount with just those plus conjunction (slide 13).

Equality

\[ x_1 = x_2 \;\equiv\; (x_1 - x_2 \le 0) \;\land\; (x_2 - x_1 \le 0). \]

"\(x_1 \le x_2\) and \(x_2 \le x_1\)" forces them equal — pinning a number from both sides.

Absolute value

\[ |x_1| \le c \;\equiv\; (x_1 \le c) \;\land\; (-x_1 \le c). \]

The two conjuncts say \(x_1 \le c\) and \(x_1 \ge -c\), i.e. \(-c \le x_1 \le c\) — exactly "the magnitude of \(x_1\) is at most \(c\)."

Why this matters

Absolute-value bounds are how robustness specs are written: "no input within distance \(c\) of this point changes the answer." Being able to compile \(|x| \le c\) down into the bare grammar is the bridge from a human-readable spec to something the solver can chew on.

QExpress the constraint \(x_1 = 5\) and the constraint "\(x_1\) is strictly between \(2\) and \(7\)" in the bare LRA grammar.

\(x_1 = 5\): using the equality trick with constant \(5\):

\[ (x_1 - 5 \le 0) \land (5 - x_1 \le 0), \quad\text{i.e.}\quad (x_1 \le 5) \land (-x_1 \le -5). \]

\(2 < x_1 < 7\): two strict inequalities conjoined. Rearranged into the grammar's form (\(\sum c_i \mathit{var}_i < k\)):

\[ (-x_1 < -2) \land (x_1 < 7). \]

(\(-x_1 < -2\) is just \(x_1 > 2\) rewritten so the variable side matches the grammar.)

Value, interpretation, satisfiability, validity — for LRA

The four core notions carry over almost verbatim; the only change is that arithmetic atoms must now be evaluated as numbers before the propositional rules finish the job.

Definition · LRA value

The value of a variable-free LRA formula, \(\operatorname{value}(\phi)\), is the form it has after applying the rewrite rules as many times as possible (slide 14) — the same propositional rules as before, plus evaluating each arithmetic comparison to \(\top\) or \(\bot\).

Worked example · slide 14

Evaluate \((2 \cdot 4 + 3 \cdot 5 - 1 \cdot 0 \le 0) \land (\top \lor \bot)\):

Left conjunct — do the arithmetic first: \(2\cdot4 + 3\cdot5 - 1\cdot0 = 8 + 15 - 0 = 23\). Then test \(23 \le 0\), which is false, so this atom evaluates to \(\bot\).

Right conjunct: \(\top \lor \bot \rightsquigarrow \top\).

Combine: \(\bot \land \top \rightsquigarrow \bot\). So the value of the whole formula is \(\bot\).

Definition · LRA interpretation

For a formula \(\phi\), an interpretation \(\mathcal{I}\) is a mapping of the variables in \(\phi\) such that Boolean variables map to \(\{\top, \bot\}\) and real variables map to \(\mathbb{R}\). We again write \(\mathcal{I}[\phi]\) for the image of \(\phi\) under \(\mathcal{I}\) (slides 15–16).

So an LRA interpretation assigns each Boolean variable a truth value and each real variable an actual real number. The lecturer's example, with \(\phi \equiv (2x_1 + 3x_2 - x_3 \le 0) \land (v_1 \lor v_2)\):

\[ \mathcal{I}_1 = \{\, x_1 \mapsto 4,\; x_2 \mapsto 5,\; x_3 \mapsto 0,\; v_1 \mapsto \top,\; v_2 \mapsto \bot \,\} \] \[ \mathcal{I}_1[\phi] = (2\cdot4 + 3\cdot5 - 0 \le 0) \land (\top \lor \bot). \]

Notice this is precisely the variable-free formula we just evaluated — substitution by \(\mathcal{I}_1\) produces it.

Definition · LRA satisfiability & validity

Satisfiability (slide 17): \(\mathcal{I} \models \phi\) iff \(\operatorname{value}(\mathcal{I}[\phi]) = \top\).   Validity (slide 18): \(\phi\) is valid iff \(\mathcal{I} \models \phi\) for all interpretations \(\mathcal{I}\). Word for word the same as the propositional case.

For our example, \(\operatorname{value}(\mathcal{I}_1[\phi]) = \bot\) (computed above), so \(\mathcal{I}_1 \not\models \phi\). And because there exists an interpretation that fails it, \(\phi\) is not valid (slide 18).

QThe lecturer's exercise (slide 17): change \(\mathcal{I}_1\) so that it satisfies \(\phi \equiv (2x_1 + 3x_2 - x_3 \le 0) \land (v_1 \lor v_2)\).

We need both conjuncts true. The propositional part \(v_1 \lor v_2\) is true as soon as one of them is \(\top\) — keep \(v_1 \mapsto \top\). For the arithmetic part we need \(2x_1 + 3x_2 - x_3 \le 0\). The old assignment gave \(23\); we just need the sum \(\le 0\). For instance keep \(x_1, x_2\) and raise \(x_3\):

\[ \mathcal{I}' = \{\, x_1 \mapsto 4,\; x_2 \mapsto 5,\; x_3 \mapsto 23,\; v_1 \mapsto \top,\; v_2 \mapsto \bot \,\}. \]

Then \(8 + 15 - 23 = 0 \le 0\) is true (\(\top\)), and \(\top \lor \bot = \top\), so \(\mathcal{I}'[\phi]\) evaluates to \(\top \land \top = \top\). Hence \(\mathcal{I}' \models \phi\). (Many other answers work — e.g. \(x_1 = x_2 = 0,\ x_3 = 0\).)

Why LRA is the right tool — solvers

The practical payoff (slide 19): there exist solvers for LRA that can judge the validity of a formula. Given a claim encoded as an LRA formula, a solver can either prove it valid or compute an interpretation that does not satisfy it. The lecturer noted in the margin that "a model" and "an interpretation" mean the same thing here.

Example · the slogan

If we write an LRA formula encoding "This system has no bugs," a solver will either show the formula is valid (no bugs), or compute an interpretation that does not satisfy it — and that failing interpretation corresponds to a bug.

Two facts about the complexity (slide 19):

Exam trap

"NP-hard" describes the worst case. It does not mean the solvers are useless — the slide stresses that real NN-verification formulae are usually solved efficiently in practice. Don't write "LRA is intractable so verification is impossible"; write "the problem is NP-hard in general, but practical instances are handled efficiently."

What remains — the rest of the lecture — is to show how a specification on an NN can be translated into an LRA formula.

·  ·  ·
Part IV

Encoding NN Specifications in LRA

Now the payoff. We compile a network and its specification into a single LRA formula, piece by piece: one node, then many nodes, then the wiring, then the precondition and postcondition that wrap the whole thing.

The question we want answered (slide 20) is: does a given specification \(S\) hold? — i.e. is it the case that, for any input satisfying the precondition, the postcondition holds after the network computes? Encoding this question has two parts: encoding the network(s), and encoding the precondition and postcondition.

It helps to picture a specification as a Hoare triple — a precondition, a computation, a postcondition (slide 20):

\[ \{\, \text{precondition} \,\}\quad r \leftarrow f(x) \quad\{\, \text{postcondition} \,\} \]

and the encoding as network \(+\) precondition \(+\) postcondition, glued into one formula.

Encoding a single node

Start with the smallest network: one node \(v\) (slides 21–22). For every node we introduce:

vⁱⁿ¹ vⁱⁿ² v vᵒ (output)
One node: inputs \(v^{\mathit{in},i}\) feed in, output \(v^{o}\) comes out. The activation determines the constraint relating them.

Affine node   \(f_v(x) = x + 1\)

\[ \phi(v) \;\equiv\; \big(\, v^{o} = v^{\mathit{in},1} + 1 \,\big). \]

Addition of two inputs   \(f_v(x_1, x_2) = x_1 + x_2\)

\[ \phi(v) \;\equiv\; \big(\, v^{o} = v^{\mathit{in},1} + v^{\mathit{in},2} \,\big). \]

ReLU   \(f_v(x) = \max(x, 0) = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}\)

ReLU is a piecewise function, so its encoding is a conjunction of guarded cases — one implication per piece:

\[ \phi(v) \;\equiv\; \big(\, v^{\mathit{in},1} \le 0 \;\Rightarrow\; v^{o} = 0 \,\big) \;\land\; \big(\, v^{\mathit{in},1} > 0 \;\Rightarrow\; v^{o} = v^{\mathit{in},1} \,\big). \]
x fᵛ(x) output 0 output = x
ReLU: flat at 0 for \(x \le 0\), then the line \(y = x\) for \(x > 0\). Two pieces → two guarded implications.
In plain English

"If the input is non-positive, the output is pinned to 0; if the input is positive, the output equals the input." Each guarded implication \(\text{guard} \Rightarrow \text{equation}\) is itself ordinary LRA: the implication unfolds to \(\lnot\text{guard} \lor \text{equation}\), the guard is a linear comparison, and the equation is the two-\(\le\) equality trick from Part III. Nothing leaves the grammar.

General piecewise-linear node

ReLU is a special case of a general piecewise-linear function with breakpoints \(c_1 < c_2 < c_3 < \cdots\):

\[ f_v(x) = \begin{cases} 0 & \text{if } x \le c_1 \\ a_2 x + b_2 & \text{if } c_1 < x \le c_2 \\ a_3 x + b_3 & \text{if } c_2 < x \le c_3 \\ \;\vdots & \end{cases} \]

Its encoding follows the same recipe — one guarded implication for each linear piece (slides 21–22):

\[ \phi(f_v) \equiv \big(v^{\mathit{in},1} \le c_1 \Rightarrow v^{o} = 0\big) \land \big(c_1 < v^{\mathit{in},1} \le c_2 \Rightarrow v^{o} = a_2 v^{\mathit{in},1} + b_2\big) \land \cdots \]
QWrite the LRA encoding of a node computing \(f_v(x) = 2x - 3\) (an affine node, single input).

An affine function is a single linear piece, so one equality constraint suffices:

\[ \phi(v) \equiv \big(v^{o} = 2\,v^{\mathit{in},1} - 3\big). \]

If you must reduce the equality to the bare grammar, use the equality trick: \((v^{o} - 2 v^{\mathit{in},1} + 3 \le 0) \land (2 v^{\mathit{in},1} - 3 - v^{o} \le 0)\). Note all coefficients are constants — it stays linear.

QEncode a "leaky ReLU" node: \(f_v(x) = x\) if \(x > 0\), and \(f_v(x) = 0.01x\) if \(x \le 0\).

Two linear pieces, so two guarded implications:

\[ \phi(v) \equiv \big(v^{\mathit{in},1} > 0 \Rightarrow v^{o} = v^{\mathit{in},1}\big) \land \big(v^{\mathit{in},1} \le 0 \Rightarrow v^{o} = 0.01\, v^{\mathit{in},1}\big). \]

It's just ReLU with the "off" branch changed"240" y="180" text-anchor="middle" font-size="10.5" fill="#5a4a3a">both directions → faithful (sound & complete) encoding

Soundness maps real behaviour into the formula; completeness maps the formula back to real behaviour. A faithful encoding does both.

Two closing notes

Real numbers vs floating point (slide 30)

We modelled NNs in LRA, whose variables are all real numbers. Practical systems, however, implement NNs in floating-point arithmetic — that's how computer hardware works. There exist other logics that model floating-point numbers, and they have been used to verify NNs — but their solvers are less effective than LRA's solvers. So LRA over the reals is a pragmatic, slightly idealised choice.

Non-linear activation functions (slide 30)

We only encoded linear (piecewise-linear) activation functions, and LRA encodes those in a sound and complete fashion. Genuinely non-linear activations — the sigmoid, for example — cannot be captured exactly by linear inequalities; they can only be approximately encoded.

The connecting insight · lecturer's margin note

An encoding that is sound but not complete is precisely an approximate encoding. This is exactly the situation for non-linear activations like sigmoid: the approximation can be made to cover all real behaviours (sound — so "no bug" verdicts are trustworthy), but it admits extra phantom behaviours (not complete — so a reported "bug" might be spurious and needs checking). Piecewise-linear activations avoid this because they fit LRA exactly.

QA solver, run on a sound-but-not-complete (approximate) encoding, reports "valid — no counter-example." Can you trust it? What if it reports a counter-example?

"Valid / no counter-example" → trustworthy. Soundness guarantees every real behaviour is represented, so if no represented behaviour breaks the spec, no real one does either. The "no bug" verdict is safe.

"Found a counter-example" → not directly trustworthy. Incompleteness means the satisfying interpretation might be a phantom — corresponding to no real input. You'd have to check whether the reported input actually produces that behaviour in the true network before believing it's a real bug.

(With a sound and complete encoding — i.e. piecewise-linear activations — both verdicts are trustworthy.)

QWhy does the lecture model NNs over \(\mathbb{R}\) when real hardware uses floating point? Give the trade-off.

Because LRA's solvers are far more effective than the solvers for floating-point logics. Modelling over the reals is a slight idealisation — it ignores rounding behaviour — but it buys efficient, scalable verification. Floating-point logics exist and capture the hardware faithfully, but their solvers are weaker, so in practice the real-valued LRA model is the pragmatic default. The trade-off is fidelity to hardware (favouring floating point) versus solver effectiveness (favouring LRA reals).

QCould an encoding be complete but not sound? What would go wrong in verification?

Yes in principle: every satisfying interpretation corresponds to a real pair (complete), yet some real pair has no satisfying interpretation (unsound). The danger is the dangerous direction: a real behaviour is invisible to the formula, so the solver could find "no violating interpretation" and declare the spec verified while a real input quietly breaks it — a false guarantee of safety. This is why, for safety verification, soundness is the property you cannot give up; losing completeness only costs you false alarms, but losing soundness costs you missed bugs.

The week in one breath

To verify a neural network, compile its specification into a single LRA formula — propositional logic for the structure, linear inequalities for the numbers — where each node becomes a (possibly guarded) constraint, each edge an equality, and the whole network sits inside an implication "precondition & network ⇒ postcondition"; then a solver like Z3 or CVC5 either proves it valid (the spec holds for all inputs) or returns an interpretation that breaks it (a concrete bug) — and you can trust those answers exactly insofar as the encoding is sound (no real behaviour missed) and complete (no fake behaviour invented), which holds exactly for piecewise-linear activations and only approximately for non-linear ones like the sigmoid.

Exam-readiness checklist

  1. Write the propositional grammar (\(\bot, \top, v, \lnot, \lor, \land\)) and define \(\rightarrow\) and \(\leftrightarrow\) from the basics.
  2. Apply the value rewrite rules to collapse any variable-free formula to \(\top\) or \(\bot\).
  3. Define interpretation, satisfiability (\(\mathcal{I} \models \phi\)), validity, and distinguish satisfiable vs valid vs unsatisfiable.
  4. Show a formula is valid or not by exhibiting interpretations (e.g. \(v_1 \lor \lnot v_1\) valid; \((v_1 \lor v_2) \lor \lnot v_3\) not).
  5. State the LRA grammar, including the arithmetic atoms \(\sum c_i \mathit{var}_i \le k\) / \(< k\), and say why "multiplication by constants" keeps it linear.
  6. Encode equality and absolute value using only \(\le\) and \(\land\).
  7. Evaluate a mixed LRA formula by computing arithmetic atoms then applying propositional rules.
  8. Recall that finding a non-satisfying interpretation in LRA is NP-hard, yet efficient on practical formulae.
  9. Encode a single node: affine, addition, ReLU (two guarded implications), general piecewise-linear (one per piece).
  10. Build \(\phi_V\) (conjoin node constraints) and \(\phi_E\) (each edge = "destination input = source output").
  11. Assemble a full spec: \(\phi(S) \equiv (\text{precond} \land \phi_V \land \phi_E) \Rightarrow \text{postcond}\), and the multi-network generalisation.
  12. Reproduce the clipped-ReLU example: why \(\{x \ge M\}\,r{=}M\) is valid, and the counter-example \(\{v^{\mathit{in},1}\mapsto M{+}2, v^o \mapsto M\}\) when the postcondition is wrongly \(M{+}1\).
  13. Define soundness (real pair \(\Rightarrow\) satisfying interp) and completeness (satisfying interp \(\Rightarrow\) real pair) and say what each guarantees about solver verdicts.
  14. Explain "sound but not complete = approximate encoding," and why sigmoid forces approximation while piecewise-linear is exact.
  15. State the real-vs-floating-point trade-off and name the solvers CVC5 and Z3.