User blog:DontDrinkH20/H-Boogol-Boogol Bit by bit: PART 1: Model Theory 101

This is a series of blog posts I will make to try to bring an understanding of H-boogol-boogol to the general public of this wiki. It's quite a mind-boggling concept, but the kind of person on this website is likely the kind of person who can figure it out if they just read a couple of different detailed descriptions. So just believe in yourself, and feel free to get out a notebook and start taking notes.

Model Theory: What is it?
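If you know enough mathematical folklore, you will know that Gödel's incompleteness theorem shows that there are some things which just cannot be proven by a working system of mathematics; that is, one that doesn't imply 0 = 1 (and, for the theorem to apply, one that is strong enough to do basic arithmetic and whose axioms a computer could list).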

In model theory, a theory \(T\) is that system of mathematics, aided by a language and a formal logic (but those aren't important quite yet). In reality, a theory is just a set of axioms, or "sentences" in mathematical logic. These sentences are formulated in a very specific way. For now, I will describe first-order axioms, which are axioms that use first-order languages.

Note: It doesn't need to actually be able to perform mathematics at all; the empty set is a valid theory, for example.

First-order Finitary Language
A (first-order finitary) language is just a set of constant symbols, relation symbols, and function symbols, where each relation symbol and function symbol comes with a fixed number of inputs (its "arity"). A language is usually denoted by \(\mathcal{L}\) (that's the standard variable name for a language). Given a (first-order finitary) language \(\mathcal{L}\), one can define the set of all \(\mathcal{L}\)-terms:
 * 1) \("c"\) for every constant symbol \(c\) of \(\mathcal{L}\) (a term which is meant to represent the constant symbol)
 * 2) \("v_n"\) for every finite \(n\), this is called a "variable symbol"
 * 3) \("f(t_0,t_1...t_n)"\) for every finite sequence of \(\mathcal{L}\)-terms \(t_0,t_1...t_n\) and every function symbol \(f\) of \(\mathcal{L}\) that takes \(n+1\) inputs

Intuitively, the terms of a language are just the things you can instantly create from functions and constant symbols without quantification. For example, if your function symbol is \(+\) and constant symbols \(0\) and \(1\), \((v_0+1)+(1+(1+0))\) is a valid term.

'''TO CLEAR UP MISCONCEPTIONS: Function symbols do not necessarily have anything to do with specific functions, and constant symbols nothing to do with specific objects. Symbols are defined as literally any object, and every one symbol is "isomorphic" in a sense to every other symbol. The only thing to differentiate them with is equality.'''

Then, one can define the set of all \(\mathcal{L}\)-formulae:
 * 1) \("t_0=t_1"\) for any two \(\mathcal{L}\)-terms \(t_0\) and \(t_1\) (this symbolizes statements of equality)
 * 2) \("r(t_0,t_1...t_n)"\) for any sequence of \(\mathcal{L}\)-terms \(t_0,t_1...t_n\) and any relation symbol \(r\) that takes \(n+1\) inputs (for example if the relation symbol is \(<\) then \("v_0<0"\) is a valid formula)
 * 3) \("\neg\varphi"\) where \(\varphi\) is another \(\mathcal{L}\)-formula (this is just taking some formula and making a new one which is "not this formula")
 * 4) \("\psi\land\varphi"\) where \(\varphi\) and \(\psi\) are other \(\mathcal{L}\)-formulae (this is like going "this is true, and THIS is also true")
 * 5) \("\psi\lor\varphi"\) where \(\varphi\) and \(\psi\) are other \(\mathcal{L}\)-formulae (the "or" counterpart to the above one)
 * 6) If \(v_n\) is a "free variable" in \(\varphi\) (it appears in \(\varphi\) without a quantifier binding it), then \(\forall v_n(\varphi)\) is a valid formula and \(v_n\) is NOT a free variable in \(\forall v_n(\varphi)\) (for every \(v_n\)...)
 * 7) If \(v_n\) is a free variable in \(\varphi\), then \(\exists v_n(\varphi)\) is a valid formula and \(v_n\) is NOT a free variable in \(\exists v_n(\varphi)\) (there is some \(v_n\) such that...)

An \(\mathcal{L}\)-sentence (or axiom) is simply a formula with no free variables; i.e. it has just one truth value, regardless of what you plug in for the variables.
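To make "free variable" and "sentence" concrete, here is a rough Python sketch that computes the free variables of a formula. The tuple encoding and the names `term_vars`, `free_vars` and `is_sentence` are my own assumptions for illustration, not standard notation.

```python
# Hypothetical encoding (nested tuples):
#   terms:    ("const", name), ("var", n), ("func", name, t_0, ...)
#   formulae: ("eq", t_0, t_1), ("rel", name, t_0, ...), ("not", phi),
#             ("and", phi, psi), ("or", phi, psi),
#             ("forall", n, phi), ("exists", n, phi)


def term_vars(t):
    """Set of indices n such that v_n occurs in the term t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    # ("func", name, t_0, ...): collect the variables of every argument
    return set().union(*(term_vars(a) for a in t[2:]))


def free_vars(phi):
    """Set of indices n such that v_n is a free variable of phi."""
    tag = phi[0]
    if tag == "eq":
        return term_vars(phi[1]) | term_vars(phi[2])
    if tag == "rel":
        return set().union(*(term_vars(a) for a in phi[2:]))
    if tag == "not":
        return free_vars(phi[1])
    if tag in ("and", "or"):
        return free_vars(phi[1]) | free_vars(phi[2])
    if tag in ("forall", "exists"):
        # The quantifier binds v_n, so it stops being free.
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown formula tag: {tag!r}")


def is_sentence(phi):
    """A sentence is a formula with no free variables."""
    return not free_vars(phi)


# "v_0 < 0" has the free variable v_0 ...
atom = ("rel", "<", ("var", 0), ("const", "0"))
# ... but "forall v_0 (not (v_0 < 0))" has none, so it is a sentence.
closed = ("forall", 0, ("not", atom))

print(free_vars(atom))      # {0}
print(is_sentence(closed))  # True
```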

That's a bit of a mouthful! Try to digest it. Anyways, each of these formulae is actually completely meaningless without a specific universe to quantify over and specific functions, relations, and constants to describe the symbols. "Valid formula" just means that the formula is not just a random garble of symbols; it can actually be assigned a truth value given those things.

Let's go back though: "These formulae are meaningless without specific universes to quantify over and specific functions, relations, and constants to describe the symbols." Hang on, why don't we just go ahead and do that?

Structures
Model theory is really just the study of "parallel universes" of mathematics, called structures, which obey the rules given out by a theory. This may seem counterintuitive, and you may start to say "what is the real universe of mathematics given these axioms?" Well, it turns out this is more intuitive than you thought. If you have taken a geometry class, you likely have already dealt with some model theory unknowingly.

Perhaps you are in geometry class, and your teacher says "show that, just going off of the fact that one angle of the triangle is 60 degrees, you cannot prove that the triangle is equilateral." In this case, the theory is something like this:


 * 1) The entire universe is a triangle.
 * 2) There is an angle which is 60 degrees.

There are a couple of structures you could construct which fit the bill; namely, the triangles with at least one 60 degree angle. You could go about proving that this theory doesn't prove that the universe is an equilateral triangle by constructing a structure which is a scalene triangle with one 60 degree angle.

Do you see what you just did there? Logically, structures are "universes" which fit certain theories. However, those universes can disagree, and when they do, you find a sentence which is undecidable from that theory. This may seem kind of trivial in the case we have chosen. After all, just saying "the entire universe is a triangle" and "there is a 60 degree angle" is quite useless in the grand scheme of things.
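The triangle argument can be played out in a few lines of Python. This is a toy sketch under my own assumptions: a "structure" is just a triple of whole-number angles in degrees, and the function names are invented for this example.

```python
def satisfies_theory(angles):
    """Check both axioms: the universe is a triangle, and some angle is 60 degrees."""
    a, b, c = angles
    is_triangle = min(angles) > 0 and a + b + c == 180
    return is_triangle and 60 in angles


def is_equilateral(angles):
    """In Euclidean geometry, a triangle is equilateral exactly when every angle is 60."""
    return angles == (60, 60, 60)


equilateral = (60, 60, 60)
scalene = (60, 50, 70)

# Both universes obey the theory ...
print(satisfies_theory(equilateral), satisfies_theory(scalene))  # True True
# ... but they disagree about the sentence "the universe is equilateral".
print(is_equilateral(equilateral), is_equilateral(scalene))      # True False
```

Since one model says yes and the other says no, the two axioms alone can neither prove nor disprove "the triangle is equilateral".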

Mathematically, we should study theories like ZFC (the current standard set of axioms, which covers all of the finitary mathematics and most of the infinitary mathematics we do). Sadly, it turns out that Gödel's incompleteness theorem shows that EVERY system of mathematics which doesn't imply a contradiction (and which is strong enough to do arithmetic, with axioms a computer could list) has an undecidable sentence. For example, ZFC can neither prove nor disprove the continuum hypothesis (CH), because if there is a universe of ZFC at all, then there are at least two universes: one in which CH holds and one in which it fails.

On the other hand, Gödel's completeness theorem (not incompleteness theorem) shows that a system of mathematics is satisfiable (it has a structure in which every axiom of the theory is true) if and only if it is consistent (it doesn't imply a contradiction). However, this only applies to first-order finitary logic for the most part, and maybe some others. For this reason, when speaking of other logics in H-boogol-boogol's definition, I don't actually talk about satisfiability; I talk about consistency.

Formalizing Structures
You do eventually need to formalize what a structure is and how it satisfies an axiom or a formula when given some input variables. Luckily, I'll do that right now. If you feel you don't need this, you probably do. But you can skip it if you want to, because it's quite heavy in formality.

If you can remember it, I said "these formulae are meaningless without specific universes to quantify over and specific functions, relations, and constants to describe the symbols" and then alluded to the fact that these universes just are structures formally. This is actually pretty much exactly what structures are. The difficult part is describing the semantics of first-order finitary logic; that is, how a formula is considered true in a structure given some input variables.

Formally, an \(\mathcal{L}\)-structure \(\mathcal{M}\) is an ordered tuple \((M;c_0,c_1...c_n;r_0,r_1...r_m;f_0,f_1...f_k)\) where \(M\) is a nonempty set (the universe) and:
 * 1) \(c_0...c_n\) are elements of \(M\), one for each constant symbol \(C_0...C_n\)
 * 2) \(r_0...r_m\) are \(A_0...A_m\)-ary relations over \(M\) (respectively), one for each relation symbol \(R_0...R_m\)
 * 3) \(f_0...f_k\) are \(D_0...D_k\)-input functions (respectively), one for each function symbol \(F_0...F_k\), each taking its inputs from \(M\) and giving an output in \(M\)

These are the values we assign to the symbols of a formula once it is introduced to the structure. A formula by itself has no preference of functions or constants or relations or even universe, but when given a structure it makes sense. I really can't stress that enough - the structure determines the truth of the formula.

'''Let \(t\) be an \(\mathcal{L}\)-term whose variable symbols are among \(v_0...v_n\). \(t^{\mathcal{M}}(x_0...x_n)\) for \(x_0...x_n\in M\) is then defined as follows:'''
 * 1) If \(t\) is \("v_k"\), then \(t^{\mathcal{M}}(x_0...x_n)=x_k\)
 * 2) If \(t\) is \("C_m"\) for some \(m\), then \(t^{\mathcal{M}}(x_0...x_n)=c_m\)
 * 3) If \(t\) is \("F_m(t_0,t_1...)"\) for some \(m\), then \(t^{\mathcal{M}}(x_0...x_n)=f_m(t_0^{\mathcal{M}}(x_0...x_n),t_1^{\mathcal{M}}(x_0...x_n)...)\)

OK! Done with that. If you are confused at this point, \(t^{\mathcal{M}}(x_0...x_n)\) basically calculates the value of \(t\) in \(\mathcal{M}\) with variables \(x_0...x_n\). For example, in the structure of natural numbers \(\mathbb{N}\), letting \(t\) be \("1+v_0"\) and \(x_0\) be \(3\), \(t^{\mathbb{N}}(x_0)=1+3=4\).
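The three rules for \(t^{\mathcal{M}}\) translate almost word for word into a recursive function. This is a sketch under my own assumptions: terms are nested tuples, the structure is a Python dict holding just the constants and functions it needs, and the name `eval_term` is made up.

```python
# Hypothetical encoding of terms as nested tuples:
#   ("const", name), ("var", n), ("func", name, t_0, ...)

# The part of the structure N needed here: what the symbols 0, 1 and + mean.
N = {
    "const": {"0": 0, "1": 1},
    "func": {"+": lambda a, b: a + b},
}


def eval_term(t, M, xs):
    """Compute t^M(x_0...x_n), where xs[k] is the value plugged in for v_k."""
    tag = t[0]
    if tag == "var":
        # Rule 1: a variable symbol evaluates to the matching input.
        return xs[t[1]]
    if tag == "const":
        # Rule 2: a constant symbol evaluates to the structure's constant.
        return M["const"][t[1]]
    # Rule 3: evaluate the arguments, then apply the structure's function.
    args = [eval_term(a, M, xs) for a in t[2:]]
    return M["func"][t[1]](*args)


# t is "1 + v_0" and x_0 is 3, as in the example above.
t = ("func", "+", ("const", "1"), ("var", 0))
print(eval_term(t, N, [3]))  # 4
```

Swapping in a different dict for `N` (say, one where + means multiplication) changes the value of the same term, which is exactly the point: the term is just symbols until a structure interprets it.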

'''Let \(\varphi\) be an \(\mathcal{L}\)-formula whose free variables are among \(v_0...v_n\) (without loss of generality). Then, \(\mathcal{M}\models\varphi[x_0...x_n]\) if and only if:'''

 * 1) If \(\varphi\) is \("t_0=t_1"\), then \(t_0^{\mathcal{M}}(x_0...x_n)=t_1^{\mathcal{M}}(x_0...x_n)\)
 * 2) If \(\varphi\) is \("R_m(t_0,t_1...)"\), then \(r_m(t_0^{\mathcal{M}}(x_0...x_n),t_1^{\mathcal{M}}(x_0...x_n)...)\)
 * 3) If \(\varphi\) is \("\neg\psi"\), then \(\mathcal{M}\not\models\psi[x_0...x_n]\)
 * 4) If \(\varphi\) is \("\psi\land\chi"\) then \(\mathcal{M}\models\psi[x_0...x_n]\) and \(\mathcal{M}\models\chi[x_0...x_n]\)
 * 5) Similarly with \(\lor\)
 * 6) If \(\varphi\) is \("\forall v_{n+1}(\psi)"\) then \(\mathcal{M}\models\psi[x_0...x_n,X]\) for every \(X\in M\)
 * 7) If \(\varphi\) is \("\exists v_{n+1}(\psi)"\) then \(\mathcal{M}\models\psi[x_0...x_n,X]\) for some \(X\in M\)

That might have been even more confusing, but don't fret! It is completely intuitive. For example, \(\mathbb{N}\models\forall x\exists y(y > x)\) and \(\mathbb{N}\models\forall x(0\leq x)\) (for every \(x\), there is some \(y\) above \(x\), and every number is at least \(0\)).
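The seven satisfaction rules can also be run by brute force, as long as the universe is finite (a computer can't loop over all of \(\mathbb{N}\)). The sketch below rests on my own assumptions: the tuple encoding, the names `eval_term` and `satisfies`, the use of a dict from variable index to value instead of a list \(x_0...x_n\), and the small structure with universe {0, 1, 2, 3}.

```python
# Hypothetical encoding (nested tuples):
#   terms:    ("const", name), ("var", n), ("func", name, t_0, ...)
#   formulae: ("eq", t_0, t_1), ("rel", name, t_0, ...), ("not", phi),
#             ("and", phi, psi), ("or", phi, psi),
#             ("forall", n, phi), ("exists", n, phi)


def eval_term(t, M, env):
    """Value of the term t in M, where env maps variable index -> element."""
    if t[0] == "var":
        return env[t[1]]
    if t[0] == "const":
        return M["const"][t[1]]
    args = [eval_term(a, M, env) for a in t[2:]]
    return M["func"][t[1]](*args)


def satisfies(M, phi, env):
    """Decide whether M |= phi under env, by recursion on the formula."""
    tag = phi[0]
    if tag == "eq":
        return eval_term(phi[1], M, env) == eval_term(phi[2], M, env)
    if tag == "rel":
        args = tuple(eval_term(a, M, env) for a in phi[2:])
        return args in M["rel"][phi[1]]
    if tag == "not":
        return not satisfies(M, phi[1], env)
    if tag == "and":
        return satisfies(M, phi[1], env) and satisfies(M, phi[2], env)
    if tag == "or":
        return satisfies(M, phi[1], env) or satisfies(M, phi[2], env)
    if tag == "forall":
        # Try every element of the universe as the value of the bound variable.
        return all(satisfies(M, phi[2], {**env, phi[1]: x}) for x in M["universe"])
    if tag == "exists":
        return any(satisfies(M, phi[2], {**env, phi[1]: x}) for x in M["universe"])
    raise ValueError(f"unknown formula tag: {tag!r}")


# A finite stand-in for N: the numbers 0..3 with the usual < and <=.
U = range(4)
FOUR = {
    "universe": U,
    "const": {"0": 0},
    "func": {},
    "rel": {
        "<": {(a, b) for a in U for b in U if a < b},
        "<=": {(a, b) for a in U for b in U if a <= b},
    },
}

# forall v_0 exists v_1 (v_0 < v_1)
unbounded = ("forall", 0, ("exists", 1, ("rel", "<", ("var", 0), ("var", 1))))
# forall v_0 (0 <= v_0)
zero_least = ("forall", 0, ("rel", "<=", ("const", "0"), ("var", 0)))

print(satisfies(FOUR, unbounded, {}))   # False: nothing is above 3
print(satisfies(FOUR, zero_least, {}))  # True
```

Both sentences are true in \(\mathbb{N}\), but the first one fails in this four-element structure, so the same sentence really can get different truth values in different structures.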

If \(T\) is an \(\mathcal{L}\)-theory (a set of \(\mathcal{L}\)-sentences) then \(\mathcal{M}\) is a model of \(T\) if and only if \(\mathcal{M}\models\varphi\) for every sentence \(\varphi\in T\). That is, every axiom of \(T\) holds true within the universe \(\mathcal{M}\).
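As a rough illustration of "model of a theory", here is a simplified Python sketch. To keep it short, each axiom is written directly as a Python check rather than as a formal sentence, and the theory (strict linear orders, with one binary relation symbol <) is my own choice of example; the names `THEORY` and `is_model` are made up.

```python
from itertools import product

# Assumed example theory: the axioms of a strict linear order.
# Each axiom takes a universe U and an interpretation lt of the symbol <.
THEORY = {
    "irreflexive": lambda U, lt: all(not lt(x, x) for x in U),
    "transitive": lambda U, lt: all(
        lt(x, z)
        for x, y, z in product(U, repeat=3)
        if lt(x, y) and lt(y, z)
    ),
    "total": lambda U, lt: all(
        x == y or lt(x, y) or lt(y, x) for x, y in product(U, repeat=2)
    ),
}


def is_model(U, lt, theory):
    """A structure is a model of the theory iff every axiom holds in it."""
    return all(axiom(U, lt) for axiom in theory.values())


U = [0, 1, 2]


def usual(x, y):
    """Interpret < as the usual order on numbers."""
    return x < y


def empty(x, y):
    """Interpret < as the relation that never holds."""
    return False


print(is_model(U, usual, THEORY))  # True
print(is_model(U, empty, THEORY))  # False: the "total" axiom fails
```

The universe is the same in both cases; only the interpretation of < changes, and that alone decides whether the structure is a model.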

For a good example of a theory, see the Peano Axioms, of which \(\mathbb{N}\) is a model.