Syntax vs Semantics in Formal Systems

Quick Answer

In a formal system, syntax is the set of rules for which strings of symbols count as legitimate expressions of the language. Semantics is the set of rules for assigning meaning (typically truth-values, denotations, or structures) to those expressions.

A few illustrations.

The string (p ∧ q) is syntactically well-formed in propositional logic. The string p ∧ ∧ q is not.
Once we fix an interpretation that assigns truth-values to the atomic letters, the well-formed string (p ∧ q) acquires a truth-value: it is true under the assignment iff both $p$ and $q$ are.
Syntax tells you which derivations are legal proofs. Semantics tells you which sentences are true in which models.

This page covers the philosophical and formal-systems version of the syntax-vs-semantics distinction. The linguistic version, sentence structure (syntax) versus sentence meaning (semantics) in natural languages, is a different topic with overlapping vocabulary. For the linguistic treatment, see LinguisticsPath. The philosophy-of-language treatment of meaning and reference lives separately under philosophy-of-language and is not the same distinction.

Side by Side

	Syntax	Semantics
What it studies	Strings, grammars, derivation rules	Truth, models, denotation, satisfaction
Primary objects	Symbols, formulas, proofs	Structures, valuations, interpretations
Mode of evaluation	Mechanical: does this string follow the rules?	Truth-conditional: in this structure, does this formula hold?
Standard relation	$\Gamma \vdash \varphi$ ( $\varphi$ is derivable from $\Gamma$ )	$\Gamma \models \varphi$ ( $\varphi$ is true in every model of $\Gamma$ )
Failure mode	Syntactic ill-formedness, ungrammaticality	Semantic falsehood, lack of model
Example question	"Is `∀x (P(x) → Q(x))` a well-formed formula?"	"Does `∀x (P(x) → Q(x))` hold in the structure $\mathcal{M}$ ?"

The two sides answer different questions. They are connected by the soundness-completeness bridge below, but they are not the same.

Why the Distinction Exists

Before Tarski, " $\varphi$ is true" was typically treated as primitive: truth was a property applied to the sentence directly. This led to the well-known semantic paradoxes (the Liar being the canonical case: "this sentence is false"). Tarski's 1933 The Concept of Truth in Formalized Languages showed that for formalized languages the paradox dissolves once truth is defined relative to a structure (an interpretation specifying what the symbols denote) and stratified across an object language and a metalanguage containing it.

The pivotal insight: truth is not a property of strings considered alone; it is a relation between strings and structures. Syntactic properties live at the level of the strings. Semantic properties live at the level of the relation. Conflating them is what produced the paradox.

This split made formal logic possible as a precise mathematical discipline, and it is the conceptual foundation of model theory, formal semantics, and substantial parts of theoretical computer science.

A Worked Example: Propositional Logic

Take a tiny formal system: propositional logic with atomic letters $p, q, r$ and connectives $\neg, \land, \lor, \to$ .

Syntax

The alphabet is the set $\{p, q, r, \neg, \land, \lor, \to, (, )\}$ .

The formation rules (the grammar):

Any atomic letter is a well-formed formula (wff).
If $\varphi$ is a wff, so is $\neg \varphi$ .
If $\varphi$ and $\psi$ are wffs, so are $(\varphi \land \psi)$ , $(\varphi \lor \psi)$ , $(\varphi \to \psi)$ .
Nothing else is a wff.

By these rules, $((p \land q) \to r)$ is a wff. The string $p \land \to q$ is not.
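The four formation rules can be read as a recursive recognizer. The sketch below is one minimal way to do that in Python; the ASCII stand-ins `~`, `&`, `|`, `>` for $\neg, \land, \lor, \to$ and the function names `parse_wff` and `is_wff` are assumptions made for this illustration, not part of the system defined above.

```python
# Hypothetical ASCII stand-ins: ~ (not), & (and), | (or), > (implies).
ATOMS = {"p", "q", "r"}
BINARY = {"&", "|", ">"}


def parse_wff(s: str, i: int = 0) -> int:
    """Return the index just past one wff starting at s[i], or -1 if none starts there."""
    if i >= len(s):
        return -1
    ch = s[i]
    if ch in ATOMS:                      # rule 1: an atomic letter is a wff
        return i + 1
    if ch == "~":                        # rule 2: negation of a wff
        return parse_wff(s, i + 1)
    if ch == "(":                        # rule 3: ( wff connective wff )
        j = parse_wff(s, i + 1)
        if j == -1 or j >= len(s) or s[j] not in BINARY:
            return -1
        k = parse_wff(s, j + 1)
        if k == -1 or k >= len(s) or s[k] != ")":
            return -1
        return k + 1
    return -1                            # rule 4: nothing else is a wff


def is_wff(s: str) -> bool:
    s = s.replace(" ", "")               # spaces are ignored for readability
    return parse_wff(s) == len(s)        # the whole string must be exactly one wff


print(is_wff("((p & q) > r)"))  # True
print(is_wff("p & > q"))        # False
```

Note that the recognizer never consults a truth-value: it only inspects the shape of the string. Under these strict rules `p & q` without outer parentheses is rejected, which is a choice of this particular grammar rather than a fact about logic.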

The proof rules (one possible system, natural deduction):

∧-introduction: from $\varphi$ and $\psi$ , derive $(\varphi \land \psi)$ .
∧-elimination: from $(\varphi \land \psi)$ , derive $\varphi$ (or $\psi$ ).
→-elimination (modus ponens): from $\varphi$ and $(\varphi \to \psi)$ , derive $\psi$ .
And several more.

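A derivation of $\psi$ from a set $\Gamma$ is a finite sequence of wffs ending in $\psi$ , where each wff is either in $\Gamma$ or follows from earlier wffs by a proof rule. (Full natural deduction also allows temporary assumptions that are later discharged, as in →-introduction; this simplified definition omits that bookkeeping.) We write $\Gamma \vdash \psi$ when such a derivation exists. Note: this is a purely syntactic notion. No mention of truth.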
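To make the "purely syntactic" point concrete, here is a hedged sketch of a derivation checker covering only the three rules listed above. The tuple encoding of formulas, the rule names, and the function `check_derivation` are hypothetical choices for this example; a real natural-deduction checker would also need assumption discharge and the remaining rules.

```python
# Formulas: atoms are strings; compound formulas are tuples such as
# ("and", phi, psi) or ("imp", phi, psi). This encoding is an assumption.

def check_derivation(premises, steps):
    """Check steps of the form (formula, rule, refs); refs are 0-based indices of earlier steps."""
    derived = []
    for formula, rule, refs in steps:
        if any(not 0 <= r < len(derived) for r in refs):
            return False                          # a step may only cite earlier lines
        cited = [derived[r] for r in refs]
        if rule == "premise":
            ok = formula in premises
        elif rule == "and_intro":                 # from phi and psi, derive (phi and psi)
            ok = len(cited) == 2 and formula == ("and", cited[0], cited[1])
        elif rule == "and_elim":                  # from (phi and psi), derive phi or psi
            ok = (len(cited) == 1 and isinstance(cited[0], tuple)
                  and cited[0][0] == "and" and formula in cited[0][1:])
        elif rule == "imp_elim":                  # from phi and (phi imp psi), derive psi
            ok = len(cited) == 2 and cited[1] == ("imp", cited[0], formula)
        else:
            ok = False                            # unknown rule
        if not ok:
            return False
        derived.append(formula)
    return True


# Derive (r and q) from the premises (p and q) and (p imp r).
premises = [("and", "p", "q"), ("imp", "p", "r")]
steps = [
    (("and", "p", "q"), "premise", []),
    ("p", "and_elim", [0]),
    (("imp", "p", "r"), "premise", []),
    ("r", "imp_elim", [1, 2]),
    ("q", "and_elim", [0]),
    (("and", "r", "q"), "and_intro", [3, 4]),
]
print(check_derivation(premises, steps))  # True
```

The checker compares formulas only by their structure. Nothing in it assigns a truth-value to `p`, `q`, or `r`.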

Semantics

A valuation is a function $v$ assigning each atomic letter a truth-value in $\{\text{T}, \text{F}\}$ .

Given a valuation, we extend $v$ to all wffs by the standard truth-functional rules:

$v(\neg \varphi) = \text{T}$ iff $v(\varphi) = \text{F}$ .
$v((\varphi \land \psi)) = \text{T}$ iff $v(\varphi) = v(\psi) = \text{T}$ .
$v((\varphi \lor \psi)) = \text{T}$ iff $v(\varphi) = \text{T}$ or $v(\psi) = \text{T}$ .
$v((\varphi \to \psi)) = \text{T}$ iff $v(\varphi) = \text{F}$ or $v(\psi) = \text{T}$ .
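The four clauses translate directly into a recursive evaluator. The following is a minimal sketch, assuming formulas are encoded as nested tuples and a valuation as a Python dict from atomic letters to booleans; the encoding and the name `evaluate` are illustrative assumptions.

```python
def evaluate(formula, v):
    """Extend a valuation v (dict: atomic letter -> bool) to an arbitrary formula."""
    if isinstance(formula, str):          # atomic letter: look it up in v
        return v[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], v)
    left = evaluate(formula[1], v)
    right = evaluate(formula[2], v)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    if op == "imp":                       # true iff antecedent false or consequent true
        return (not left) or right
    raise ValueError(f"unknown connective: {op!r}")


phi = ("imp", ("and", "p", "q"), "r")     # ((p and q) imp r)
print(evaluate(phi, {"p": True, "q": True, "r": False}))   # False
print(evaluate(phi, {"p": False, "q": True, "r": False}))  # True
```

The same formula receives different truth-values under different valuations, which is exactly what it means for truth to be relative to an interpretation.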

A formula $\varphi$ is semantically valid (a tautology) iff $v(\varphi) = \text{T}$ for every valuation $v$ . We write $\models \varphi$ .

We write $\Gamma \models \varphi$ to mean: every valuation that makes every member of $\Gamma$ true also makes $\varphi$ true. This is semantic consequence.

Note again: this is a purely semantic notion. No mention of proofs.
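Because a formula contains only finitely many atomic letters, semantic consequence in propositional logic can be checked by brute force over all valuations. The sketch below assumes each formula is represented as a Python function from a valuation dict to a boolean; the names `entails` and `is_tautology` are hypothetical.

```python
from itertools import product


def entails(gamma, phi, atoms):
    """Return True iff every valuation making all of gamma true also makes phi true."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(g(v) for g in gamma) and not phi(v):
            return False                  # found a counterexample valuation
    return True


def is_tautology(phi, atoms):
    """A tautology is a semantic consequence of the empty premise set."""
    return entails([], phi, atoms)


p_implies_q = lambda v: (not v["p"]) or v["q"]
p = lambda v: v["p"]
q = lambda v: v["q"]

print(entails([p_implies_q, p], q, ["p", "q"]))                    # True
print(entails([p_implies_q, q], p, ["p", "q"]))                    # False
print(is_tautology(lambda v: v["p"] or not v["p"], ["p", "q"]))    # True
```

The check enumerates $2^n$ valuations for $n$ atomic letters, so it is only practical for small examples, and it constructs no derivation at any point.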

The Bridge

The two notions agree.

Soundness theorem (propositional logic): $\Gamma \vdash \varphi$ implies $\Gamma \models \varphi$ . If you can prove it, it is true in every model of the premises.

Completeness theorem (propositional logic, Post 1921; first-order logic, Gödel 1929): $\Gamma \models \varphi$ implies $\Gamma \vdash \varphi$ . If it is true in every model of the premises, you can prove it.

The two together: $\Gamma \vdash \varphi \iff \Gamma \models \varphi$ . The syntactic and semantic notions of consequence coincide.

This is not trivial. Soundness is straightforward (the proof rules were designed to preserve truth). Completeness is hard: it says the proof rules are enough, in the sense that every semantically valid argument has a syntactic proof. That this holds for first-order logic is one of the central results of modern mathematical logic.

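Beyond first-order logic the bridge can fail: second-order logic with standard semantics, for example, has no proof system that is sound, effective, and complete. Lindström's theorem sharpens the picture by characterizing first-order logic as the strongest logic that has both compactness and the downward Löwenheim-Skolem property. The bridge is robust for first-order logic but does not extend automatically.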

A Worked Example: Arithmetic and Gödel

The cleanest place to see the syntax-semantics gap matter is Gödel's incompleteness theorems.

Consider Peano arithmetic (PA), the standard formal axiomatization of the natural numbers with addition and multiplication.

Syntactic statement: a sentence $\varphi$ is provable in PA, written $\text{PA} \vdash \varphi$ , iff there is a finite derivation of $\varphi$ from the PA axioms using the proof rules of first-order logic.
Semantic statement: a sentence $\varphi$ is true in the standard model of arithmetic iff it holds when interpreted on the natural numbers $\mathbb{N}$ with the standard addition and multiplication.

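For pure first-order logic, completeness gives $\Gamma \vdash \varphi$ iff $\Gamma \models \varphi$ , and this still applies to PA as a first-order theory: $\text{PA} \vdash \varphi$ iff $\varphi$ holds in every model of PA. The gap opens when we ask about truth in one particular model, the standard one.

Gödel's first incompleteness theorem (1931). If PA is consistent, there is a sentence $G$ in the language of PA such that $G$ is true in the standard model of arithmetic but not provable in PA. Symbolically, $\mathbb{N} \models G$ but $\text{PA} \not\vdash G$ .

Syntactic provability in PA and semantic truth on $\mathbb{N}$ come apart, and the same holds for any consistent, effectively axiomatized formal system strong enough to include basic arithmetic. This does not contradict completeness: since $G$ is unprovable, it must fail in some model of PA, necessarily a nonstandard one. The Gödel sentence is true in the standard model; the system cannot prove it.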

This is why the syntax-semantics distinction is not a curiosity. It is the framework that lets us say what Gödel's theorem says.

Where the Distinction Bites

Three live applications.

Soundness vs completeness in proof assistants. A proof assistant like Lean or Coq is designed to be sound: any proof it accepts is a valid derivation in the underlying type theory. Completeness is another matter: whether every true statement of mathematics is derivable depends on the system, and by Gödel's theorem any consistent, effectively axiomatized system strong enough for arithmetic is unavoidably incomplete. The incompleteness is a limit on such systems, not a design choice.

Tarski-style truth-conditional semantics. When linguists or philosophers use "truth-conditional semantics" they are using Tarski's framework: assign a model, define truth in the model recursively, and identify the meaning of a sentence with its truth-conditions across models. The distinction between formal syntactic structure and semantic interpretation is what makes this enterprise rigorous.

Reasoning under inconsistency. A classical-logic system in which one inconsistency is provable can derive anything (the principle of explosion: $\bot \vdash \varphi$ ). Paraconsistent logics are designed so that this syntactic explosion does not happen, useful when reasoning over inconsistent legal codes, databases, or historical sources.
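As a small illustration of explosion on the syntactic side, the following Lean 4 snippet derives an arbitrary proposition from a contradiction. It is a sketch using only `absurd` and `False.elim` from Lean's core library; the hypothesis names `hp`, `hnp`, and `h` are arbitrary.

```lean
-- From p and ¬p together, any proposition q is derivable.
example (p q : Prop) (hp : p) (hnp : ¬p) : q :=
  absurd hp hnp

-- The same principle stated directly from falsum.
example (q : Prop) (h : False) : q :=
  False.elim h
```

Lean's logic accepts these derivations because it includes explosion as a rule; a paraconsistent system would reject the corresponding inference.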

Common Confusions

Confusion 1: well-formed = true. A string can be syntactically well-formed and semantically false. "All transformers are reptiles" is grammatical English and a well-formed first-order sentence; it is also obviously false. Well-formedness is necessary for evaluation; it is not sufficient for truth.

Confusion 2: provable = true. Provability is relative to a formal system. A statement provable in classical logic may not be provable in intuitionistic logic; a statement provable in PA + a strong axiom may not be provable in PA alone. Truth (in the semantic sense) is relative to a model. The two are connected by soundness and completeness for systems where the bridge holds; they remain distinct concepts.

Confusion 3: formal-system syntax = linguistic syntax. The syntax studied in formal logic and the syntax studied in theoretical linguistics share a vocabulary but are different objects. Linguistic syntax studies the structure of natural-language sentences (constituency, dependency, transformations). Formal-system syntax studies the rules for legal strings in a designed language. Both live under "syntax" in the broad sense. This page is on the formal-system version. The linguistic-syntax page lives on LinguisticsPath, and the boundary between the two domains is documented in the path-network ownership registry.

Confusion 4: model = semantics, valuation = syntax. A common slip in introductory texts. The valuation is part of the semantics: in propositional logic it plays the role of the model, specifying which atoms are true. The syntax is the formula's structure independent of any valuation.

Exercises

For each pair, decide whether the two strings are (a) the same well-formed formula, (b) two different well-formed formulas, or (c) at least one is not a well-formed formula at all.

$(p \to q)$ versus $p \to q$ (the same up to outer parentheses).
$((p \to q) \to r)$ versus $(p \to (q \to r))$ .
$\neg p \land q$ versus $\land p \neg q$ .

For each formula, decide whether it is a propositional tautology (true under every valuation), a contradiction (false under every valuation), or contingent (true under some, false under others).

$(p \to (q \to p))$ .
$(p \land \neg p)$ .
$(p \to q) \to (q \to p)$ .

For each pair $(\Gamma, \varphi)$ , decide whether $\Gamma \models \varphi$ .

$\Gamma = \{p \to q,\, p\}$ , $\varphi = q$ .
$\Gamma = \{p \to q,\, \neg p\}$ , $\varphi = \neg q$ .
$\Gamma = \{p \lor q,\, \neg p\}$ , $\varphi = q$ .

Answers

(a) Same up to convention on outer parentheses. The two strings are typically treated as notational variants; whether the outer parentheses are required is a convention of the specific syntax.
(b) Different. $\to$ associates to the right by convention, so the second is the standard reading of $p \to q \to r$ . The first is its left-bracketed variant. They differ on the valuation $p = \text{F}, q = \text{F}, r = \text{F}$ : the first evaluates to $\text{F}$ (its antecedent $p \to q$ is T and $r$ is F), while the second evaluates to $\text{T}$ (its antecedent $p$ is F).
(c) The second string is not well-formed. Connectives in standard infix notation appear between operands; $\land p \neg q$ violates the formation rule.
Tautology. Often called the "weakening" tautology: if $p$ holds, then any $q$ implies it. Verify by truth table.
Contradiction. A statement and its negation cannot both be true under any valuation.
Contingent. Take $p = \text{T}, q = \text{T}$ : the antecedent $p \to q$ is T, the consequent $q \to p$ is T, the whole conditional is T. Take $p = \text{F}, q = \text{T}$ : the antecedent $p \to q$ is T, the consequent $q \to p$ is F, the whole is F. Therefore contingent.
Yes ( $\Gamma \models \varphi$ ). Modus ponens is semantically valid.
No. This is denying the antecedent. From $\neg p$ , neither $q$ nor $\neg q$ follows. Counterexample valuation: $p = \text{F}, q = \text{T}$ makes the premises true and $\neg q$ false.
Yes. Disjunctive syllogism. From $p \lor q$ and $\neg p$ , $q$ must hold.
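The classification answers can be double-checked mechanically. The sketch below is one way to do it, assuming formulas are written as Python functions of a valuation dict; the helper names `valuations`, `imp`, and `classify` are hypothetical and exist only for this check.

```python
from itertools import product

ATOMS = ("p", "q", "r")


def valuations():
    """Yield every assignment of truth-values to the atomic letters."""
    for values in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def imp(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def classify(phi):
    """Classify phi by the set of truth-values it takes across all valuations."""
    results = {phi(v) for v in valuations()}
    if results == {True}:
        return "tautology"
    if results == {False}:
        return "contradiction"
    return "contingent"


print(classify(lambda v: imp(v["p"], imp(v["q"], v["p"]))))               # tautology
print(classify(lambda v: v["p"] and not v["p"]))                          # contradiction
print(classify(lambda v: imp(imp(v["p"], v["q"]), imp(v["q"], v["p"]))))  # contingent

# The two bracketings of the nested conditional, compared valuation by valuation.
left = lambda v: imp(imp(v["p"], v["q"]), v["r"])
right = lambda v: imp(v["p"], imp(v["q"], v["r"]))
differing = [v for v in valuations() if left(v) != right(v)]
print(len(differing) > 0)  # True: the two formulas are not equivalent
```

Inspecting `differing` lists exactly the valuations on which the two bracketings disagree.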

Prerequisites and Next Pages

Prerequisite: What Is Logic?, the broader frame.
Prerequisite: Validity vs Soundness, the argument-level distinction this page extends to the system level.
Next: What Is a Symbolic System?, the foundational notion behind formal-system syntax.
Next: The Chinese Room Argument, Searle's argument that syntactic symbol manipulation alone does not suffice for semantic understanding.

References

Primary texts:

Tarski, Alfred. "The Concept of Truth in Formalized Languages." 1933. Polish original; English in Logic, Semantics, Metamathematics, Oxford, 1956.
Tarski, Alfred. "On the Concept of Logical Consequence." 1936. The model-theoretic definition of $\models$ .
Gödel, Kurt. "Über die Vollständigkeit des Logikkalküls." 1929 dissertation. Completeness of first-order logic.
Gödel, Kurt. "Über formal unentscheidbare Sätze..." 1931. Incompleteness for arithmetic.

Modern reference:

Mendelson, Elliott. Introduction to Mathematical Logic. CRC Press, 6th ed. 2015. Standard graduate-level treatment of syntax, semantics, soundness, and completeness for first-order logic.
Enderton, Herbert. A Mathematical Introduction to Logic. Academic Press, 2nd ed. 2001. Cleaner notation; equivalent depth.
Boolos, George, John Burgess, and Richard Jeffrey. Computability and Logic. Cambridge, 5th ed. 2007. Covers the syntax-semantics-computability triangle including Gödel.

Stanford Encyclopedia of Philosophy entries:

"Tarski's Truth Definitions."
"Classical Logic", covers the syntax-semantics bridge for propositional and first-order logic.
"Model Theory."
"Gödel's Incompleteness Theorems", where the syntactic-semantic gap genuinely cannot be closed.