This post is primarily directed at students in the AI class at University Erlangen to explain the basics, and in particular the formalism we use in the lectures, of abstract logic.
In its (not actually, but for our purposes) most abstract form, a logical system consists of three things:
- A language , i.e. a set of formulae,
- a class of models, which we use to assign a truth value to a formula, and finally
- a logical entailment relation between models and formulae.
We don’t demand anything more specific. The formulae and models may look however we want them to; as long as we have a relation between models and formulae, we have a logical system. There’s nothing that forbids us from defining , and if is divisible by . It’s pointless as a logical system, but it is a logical system.
We interpret to mean the formula is true in the model – whatever we mean by that. If a formula is true in every model, we call a tautology (or valid) and write . In the pointless logical system above, a tautology would be a number that divides every other number, so the only tautology is 1. If there is some model with , we say is satisfiable. In the pointless system, every number (except 0) is satisfiable, since every multiple of it is a model in which that number is “true” – furthermore, since 0 is divisible by every other number, we have for every natural number except 0. On the other hand, there is no number that is divisible by 0, so 0 is (as a “formula”) not satisfiable. Hey, this isn’t actually as pointless as I thought…
Let’s look at propositional logic to see how this formalism works in practice: We start with some set of propositional variables .
is simply the set of all propositional formulae, i.e. the set defined by the following rules:
- Every variable is a formula, i.e. .
- If is a formula, then .
- If are formulae, then so are , , and .
The way we assign truth values to propositional formulae is via assignments: Functions that assign to each propositional variable either 1 (true) or 0 (false). We can then extend an assignment to a function on all formulae inductively over the recursive definition of formulae:
- For every variable , we let .
- For any formula , we let if and only if .
- For any formulae , we let if and only if and .
We let if and only if either or (or both).
We let if and only if or (or both).
And finally, we let if and only if .
Since we can extend every assignment (on ) to all formulae this way, we don’t actually need to distinguish between assignments and their extensions – once we assign 0 or 1 to each variable, the above rules tell us exactly which value to assign to any arbitrary formula. Now, our set of models is simply the set of assignments, and the relation we define by letting if and only if . So if we have an assignment with and , we have , , … Furthermore, we can immediately see that e.g. is a tautology – for every model (i.e. assignment) we have . Why? Because either (and thus by definition of the extension we have ) or and thus by definition of our extension and hence .
It should be noted, that what a model is depends heavily on the specific logic. In propositional logic they are just assignments; however in first-order logic, models are structures with a universe and various functions and relations on (depending on the signature of the language), in modal logic they are sets of worlds with a visibility relation between them… models can get seriously weird, and they get weirder with the expressive strength of the language. This is why the definition of a logical system is so broad – we want to be able to express all logics, no matter how weird, in this framework – even if it’s a nonsensical pointless logic like the one with natural numbers.
By the semantics of a logic we mean its models and the logical entailment relation. On the other hand we have the syntax of a logic, meaning how formulae are constructed. Syntax tells us what a (“grammatically” correct) formula is, semantics tells us what a formula means (by telling us when/where it is true). The point of a proof calculus is to try and reduce the truth of a formula purely to its syntax. Before we rigorously define what a calculus (and what a proof) is, note that proofs always start with assumptions, i.e. statements, i.e. formulae. I prove a statement (i.e. formula) by stating things that I know (or for the sake of the proof assume) to be true, logically derive various statements (i.e. formulae) from these assumptions and (if I succeed) end with a derivation of the formula I set out to prove. All of this is – in a sense – just things-I-do-with-formulae, hence purely syntactical. So let’s reflect this in the following definitions:
Let be a set of formulae. We say is a model of (and write ) if and only if for every . We say entails a formula (and write ) if is true in every model of .
This way, we can talk about the truth of a formula relative to a set of axioms/assumptions . Again, we call a set of formulae satisfiable if it has a model; i.e. if there is a model in which every formula in the set is true.
Let’s look at a couple of examples again: In propositional logic, we have e.g. – after all, every assignment that satisfies both and also has to satisfy . Also, the set is satisfiable, as any assignment with and proves. An example for a non-satisfiable set would be – even though both formulae on their own are satisfiable, no assignment can set both and to 1.
For our pointless divisibility logic, we already know that 0 is not satisfiable. As a result, any set of numbers containing 0 also is not satisfiable. But we also know that every number (except 0) divides 0, which means 0 is a model for every set of natural numbers (not containing 0) – which makes every set that doesn’t contain 0 satisfiable. This is boring, so let’s get rid of 0 as a model by letting . Now only finite(!) sets of numbers that don’t contain 0 are satisfiable. To prove that, we take an arbitrary finite set of numbers that doesn’t contain 0. We now need to find a model for this set, i.e. a number that divides every one of the . But that’s easy: We just multiply them together, so we have , hence is satisfiable. On the other hand, infinite sets of numbers can’t be satisfiable, since infinite sets of natural numbers are unbounded, i.e. contain arbitrarily large numbers. A number divisible by all of them would have to be larger than any natural number (or 0, which we got rid of). So the satisfiable sets are now exactly the finite sets not containing 0.
Of course a statement still belongs squarely in the realm of semantics even if it doesn’t explicitly mention models anymore – whether the statement is true or not still depends on the models, after all. So what we want is a second relation that is definable completely without models, purely on the basis of syntax, but happens to express (at least approximately) the same thing; i.e. ideally we want to be true if and only if . So let’s use proof calculi to define such a relation:
A proof calculus for a language consists of a set of -axioms and a set of inference rules that allow us to derive new formulae from ones already derived. For a set of formulae we say a formula is -provable in , and write , if and only if one of the following cases applies (again a recursive definition):
- (i.e. is an assumption/axiom), or
- (i.e. is a -axiom of the calculus), or
- there are formulae with (for every ) and an inference rule such that (i.e. is derivable from already proven formulae via some inference rule).
We write short for , i.e. if is provable in without any additional axioms. Note, that a calculus is defined directly on the formulae; purely syntactically. Our definition doesn’t mention models anywhere. Now, given some proof calculus , what matters to us (and what we are interested in) is how that calculus relates to the semantics of our logical system – i.e. how similar and are. The most obvious property a calculus should have in that regard is correctness:
A calculus is called correct or sound if and only if whenever (for some set of formulae and some formula ) we have . In other words: if everything that is provable in is also true.
In other words again: If a calculus isn’t sound, we can prove things that aren’t true. In that case it is completely useless.
Let’s again look at propositional logic as an example. Without using a specific existent calculus, it should be clear that the axioms of the calculus should be tautologies (e.g. or …), since they are always provable, no matter my assumptions. Consequently, they need to be always true if we want our calculus to be sound, hence tautologies. Inference rules are usually written in the following style:
…meaning the inference rule maps the formulas above the bar to the formula below the bar. We can read this as “If are all provable, then I can also derive/prove “. Typical inference rules are e.g.:
|Modus Ponens / Implication-Elimination|
The last one here is interesting: Whereas in the other examples, the premises (i.e. the stuff above the bar) are all just formulae, in the implication introduction the premise is itself a demand that some formula is provable from some other one.
The converse to soundness is completeness:
A calculus is called complete if and only if whenever (for some set of formulae and some formula ) we have . In other words: if everything that is true is also provable in .
Whereas soundness is not just (as good as) required, but also often quite easy to ensure (or prove), completeness is a much more difficult property to achieve (or prove). Let’s take our pointless calculus as an example to try to come up with a sound and complete calculus:
A set of axioms/assumptions is just a set of numbers (not necessarily finite, but for this example let’s assume it is). A “formula” (i.e. number) is entailed by if any “model” (i.e. number) that is divisible by all the is also divisible by . As we already mentioned, 1 is the only “tautology” (since it divides every number, i.e. is “true” in every “model”), so we can take that as the only axiom of a calculus: i.e. . And we already have a sound calculus: The only provable “formulae” (i.e. numbers) so far are 1 and the ones we assume, so everything we can prove in this calculus also happens to be true. But it’s certainly not complete, since e.g. – every number that is divisible by 5 and 6 is also divisible by 10 after all. But the only “formulae” that are provable from the set are 5 and 6 themselves and 1. So if we assume 5 and 6, then 10 is true but not provable.
We already know that 1 is the only possible axiom of our calculus – since axioms are always provable (no matter the assumptions), they need to be tautologies (assuming we want to stay correct!), and 1 is the only tautology in our pointless logical system. So the only way we can extend our calculus to something more useful is to add inference rules.
I’ll choose the following: If we know, that is divisible by , then is also divisible by . We can express that as , which tells us: “If every number that is divisible by is also divisible by , and we know/assume that some number is divisible by , then it is also divisible by “. With this inference rule we can e.g. prove that :
We trivially have . With this and the assumption we can use our inference rule to infer that . And from this and the fact that we can infer . Similarly we can show that . So this new inference rule allows us to infer all divisors of our axioms/assumptions. That’s great, and since our inference rule is obviously correct (i.e. it allows for only inferring valid formulae, provided the premises are valid), our calculus with this new inference rule is also still sound. But it’s still not complete: As before, I still cannot infer that – since this new rules only lets us infer numbers that are smaller than the premises and 10 is larger than both 5 and 6.
For completeness we need one more rule: The fact that if a number is divisible by two numbers , then it is also divisible by the least common multiple of those two numbers, i.e.:
…and with this additional rule we can finally prove (since and ). In fact, with this additional rule, the calculus is finally complete (proof left as an exercise 😉 ).