Knowledge, Justification, and LLMs

The Setup

Large language models produce assertions. Sometimes the assertions are true. Sometimes they are false. Sometimes they come with hedges, sometimes with unqualified confidence. Sometimes the model can trace its claim to a source; sometimes the source is hallucinated.

The pattern raises an epistemic question. When an LLM produces a true assertion, does the model know the truth? Or does it believe the truth, in some weak sense? Or has it merely output a token sequence that happens to match the truth?

The question is not philosophical for its own sake. It bears on:

What standards apply to the system's outputs. A system that knows can be deferred to. A system that merely outputs cannot.
Where accountability lies. If the model's assertion is false, who is responsible: the model, the user, the developer, the deployer?
How the system should be deployed. Should the system be allowed to assert without hedging? Under what conditions?
How the system should be improved. What kind of training, retrieval augmentation, or constraint changes the answer?

This page applies four epistemic frameworks to the question. None of them gives a fully satisfying answer alone. Together, they sharpen the empirical and engineering questions that determine when an LLM's output should be treated as knowledge.

This page assumes four prerequisite pages: What Is Epistemology?, The Gettier Problem, Induction and Hume's Problem, and Bayesian Epistemology. The companion essay Empiricism, Induction, and the Limits of LLM Generalization treats the inductive-license question in detail.

Frame 1: Classical JTB

The classical analysis defines knowledge as justified true belief. To apply this to an LLM, we need to specify what counts as belief, what counts as truth, and what counts as justification.

Truth. The model's assertion is either true or false in the standard correspondence sense: it either matches how the world is or it does not. This part is straightforward, with the caveat that some questions admit degrees (population estimates within a tolerance, statistical claims of various granularities).

Belief. Does the model believe its assertion? This is contested.

A weak operational reading: the model behaves as though it believes the assertion. It produces the same answer to paraphrases of the same question, defends the answer when challenged, and does not contradict itself within the conversation. By this reading, modern LLMs do display belief-like behavior in specific cases.
A strong reading: belief requires an internal mental state with the appropriate causal role in the agent's behavior, akin to human belief. By this reading, whether LLMs have beliefs is an open question that depends in part on how we settle the Chinese Room argument.

For most working epistemic purposes, the operational reading is enough. The model's output is belief-like and can be treated as belief for analysis.
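The operational reading can be made concrete as a consistency score over paraphrases. The following is a minimal sketch, not a standard metric: `paraphrase_consistency`, `toy_model`, and the example questions are all hypothetical names introduced here, and `toy_model` stands in for a real model call.

```python
from collections import Counter
from typing import Callable, Iterable


def paraphrase_consistency(ask: Callable[[str], str],
                           paraphrases: Iterable[str]) -> float:
    """Fraction of paraphrases whose normalized answer equals the modal answer."""
    answers = [ask(q).strip().lower() for q in paraphrases]
    if not answers:
        raise ValueError("need at least one paraphrase")
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


def toy_model(question: str) -> str:
    # Hypothetical stand-in for an LLM call: it only "recognizes" the
    # question when the word "capital" appears.
    return "Ottawa" if "capital" in question.lower() else "unsure"


questions = [
    "What is the capital of Canada?",
    "Canada's capital city is which city?",
    "Name the capital of Canada.",
    "Which city hosts Canada's federal government?",
]
print(paraphrase_consistency(toy_model, questions))  # 0.75
```

A score near 1.0 is evidence of belief-like behavior in the weak sense only; it says nothing about whether the stable answer is true.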

Justification. Here the analysis gets serious. What is the model's justification for an assertion?

The naive answer: the model's training is its evidence; the assertion is justified by the patterns the model learned. But this raises immediate concerns. The training data is mixed: some claims are well-attested, some are rare, some are wrong. The training process is statistical: the model averages over patterns, sometimes correctly, sometimes incorrectly. The model has no introspective access to which patterns are doing the work in any specific case.

By the classical analysis, justification requires that the route to the belief be appropriate, not merely that the belief happen to be true. An LLM's route to a true assertion may be:

A clean retrieval from training data (the assertion appeared verbatim or near-verbatim in many training examples).
A statistical interpolation across similar cases (no specific training example, but the pattern strongly supports the answer).
A confident hallucination (the assertion fits the surface pattern but the actual content is unsupported).

The first two are roughly justified in the classical sense; the third is not. The classical JTB analysis says: an LLM that produces a true assertion via the third route does not know, even though the assertion is true and the model "believes" it.

This is exactly the Gettier structure. The model has a true belief; the belief is supported by evidence in some operationalizable sense; but the route from evidence to belief is not the right kind of connection. JTB-of-LLMs cleanly inherits the post-Gettier worry.

Frame 2: Reliabilism

Goldman's (1979) reliabilism asks: is the process that produced the belief reliable? A belief is justified if and only if it is produced by a reliable belief-forming process.

Applied to LLMs: is the trained model a reliable belief-forming process for the relevant class of questions?

The question is sharp because reliability is a property of question classes, not of individual questions. That immediately raises the standard reliabilist worry, the generality problem: which class is the relevant one?

For LLMs, several natural classes:

Question class	Typical reliability	Justification verdict
Population statistics for major cities	High (well-attested in training)	Justified
Population statistics for small towns	Lower (sparser training signal)	Less justified
Mathematical statements with provable answers	High for famous theorems, low for obscure conjectures	Mixed
Code in major programming languages	High for common patterns, lower for niche	Mixed
Code in minority programming languages	Lower (sparser training data)	Less justified
Citations and reference verification	Often poor (hallucination is common)	Often unjustified
Predictions about the future	Generally low (training-cutoff problem)	Unjustified
Internal facts about the model itself	Often poor (introspection is unreliable)	Unjustified

The reliabilist verdict is per-class. The same model has justified beliefs about Vancouver's 2020 population (a well-attested major-city statistic) and unjustified beliefs about a small town's 2024 population (sparse training, training cutoff).

Implication. Calibration is the empirical question. A model's claim should come with a confidence that tracks its empirical reliability for that class of question. If the model says "the population of Vancouver in 2020 was 631,486" with high confidence, and 99 percent of similar claims are correct, the claim is reliabilist-justified. If the same model says "the population of Smalltown, Wyoming in 2024 was X" with the same confidence, and only 60 percent of similar claims are correct, the claim is not reliabilist-justified, even though the model does not signal the difference.

This is exactly the engineering problem of calibration discipline in modern AI. The reliabilist framing is the philosophical foundation of why the engineering matters.
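The per-class comparison between stated confidence and empirical accuracy can be sketched in a few lines. This is an illustrative sketch only: `per_class_report` is a hypothetical helper, and the records are made-up numbers chosen to mirror the 0.9-confidence, 60-percent-accuracy contrast described above, not measurements.

```python
from collections import defaultdict


def per_class_report(records):
    """records: iterable of (question_class, stated_confidence, was_correct)."""
    sums = defaultdict(lambda: [0.0, 0, 0])  # [confidence sum, correct, total]
    for cls, conf, correct in records:
        entry = sums[cls]
        entry[0] += conf
        entry[1] += int(correct)
        entry[2] += 1
    report = {}
    for cls, (conf_sum, n_correct, n) in sums.items():
        accuracy = n_correct / n
        mean_conf = conf_sum / n
        report[cls] = {
            "accuracy": accuracy,
            "mean_confidence": mean_conf,
            # Positive values mean the model claims more than it delivers.
            "overconfidence": mean_conf - accuracy,
        }
    return report


# Illustrative (made-up) evaluation log: same stated confidence, two classes.
records = (
    [("major_city", 0.9, True)] * 4
    + [("small_town", 0.9, True)] * 3
    + [("small_town", 0.9, False)] * 2
)
for cls, row in per_class_report(records).items():
    print(cls, row)
```

In this toy log the stated confidence is identical across classes, so only the per-class accuracy column reveals that the small-town claims are not reliabilist-justified.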

Frame 3: Bayesian Calibration

The Bayesian framework takes credences seriously. Replace the binary "the model knows" with a graded "what is the rational credence that the model's assertion is true."

The relevant Bayesian objects:

Prior on assertion type. Before considering this specific case, what fraction of questions in this class does the model get right? This is the base rate.
Likelihood from confidence signal. Given the model's stated confidence, how does that update our belief in the assertion's truth?
Posterior. Combining base rate and confidence signal: what is the rational credence that this specific assertion is true?
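The three objects above combine through Bayes' rule. As a minimal sketch (the function name and all numbers are assumptions for illustration), the confidence signal is treated as an observation whose probability differs depending on whether the assertion is true or false:

```python
def posterior(prior, p_signal_given_true, p_signal_given_false):
    """P(assertion true | confidence signal) via Bayes' rule."""
    numerator = prior * p_signal_given_true
    denominator = numerator + (1.0 - prior) * p_signal_given_false
    return numerator / denominator


# Illustrative numbers only.
base_rate = 0.4  # fraction of this question class the model gets right

# Uninformative signal: the model sounds confident equally often whether
# it is right or wrong, so the posterior stays at the base rate.
print(posterior(base_rate, 0.9, 0.9))

# Informative signal: confident three times as often when right.
print(posterior(base_rate, 0.9, 0.3))
```

The first call returns the base rate (up to floating-point rounding) and the second returns roughly 0.67: a confidence signal only moves the posterior to the extent that it is more likely under truth than under falsehood.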

For a well-calibrated model, the model's stated confidence coincides with the posterior an external observer should hold, at least when the observer has no independent evidence about the claim. The user can take the confidence at face value: the system says 0.9 and the user can plug in 0.9.

For a miscalibrated model, the user has to recalibrate: ignore or transform the model's confidence based on prior knowledge of the model's reliability for this class of question. This is what experienced LLM users actually do. They learn that the model is over-confident on certain question types and under-confident on others, and they adjust mentally.
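The mental adjustment described here has a simple mechanical analogue: map each stated confidence to the accuracy historically observed at that confidence level. The sketch below uses histogram binning, one basic recalibration approach; the function name, the bin count, and the history are hypothetical.

```python
def fit_binned_recalibrator(confidences, outcomes, n_bins=5):
    """Return a function mapping stated confidence to observed accuracy in its bin."""
    totals = [0] * n_bins
    hits = [0] * n_bins
    for c, y in zip(confidences, outcomes):
        b = min(int(c * n_bins), n_bins - 1)
        totals[b] += 1
        hits[b] += int(y)

    def recalibrate(c):
        b = min(int(c * n_bins), n_bins - 1)
        if totals[b] == 0:
            return c  # no history for this bin: fall back to the stated value
        return hits[b] / totals[b]

    return recalibrate


# Illustrative (made-up) history for one question class:
# ten claims stated at 0.95, of which seven were correct.
history_conf = [0.95] * 10
history_correct = [True] * 7 + [False] * 3

recalibrate = fit_binned_recalibrator(history_conf, history_correct)
print(recalibrate(0.95))  # 0.7
print(recalibrate(0.10))  # 0.1 (empty bin, stated value returned unchanged)
```

The mapping is only as good as the history behind it, and it should be fitted per question class, since the same stated confidence can mean different things in different classes.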

The Gettier-style worry in Bayesian form. A miscalibrated model can produce confident-and-correct assertions for the wrong reason. The user's posterior, taking the confidence at face value, is too high; the assertion happens to be correct, but for reasons the user's reasoning has not engaged. This is exactly the Gettier structure: justified (by stated confidence) true belief that fails to be knowledge because the confidence is not actually evidence of the truth-maker.

Implication. Calibration is the empirically actionable form of justification for LLM outputs. A well-calibrated LLM transmits epistemic state cleanly; a miscalibrated one launders confidence into false credence. The engineering work of calibration (temperature scaling, isotonic regression, conformal prediction) is the practical version of Bayesian-justification work.
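To make one of these techniques concrete, here is a deliberately simplified sketch of temperature scaling. It assumes a single logit per claim (the binary case, "claim correct" versus "claim incorrect") and fits the temperature by grid search on held-out data; production implementations usually work on multiclass softmax logits and use a gradient-based optimizer. All names and numbers are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of binary labels under scaled logits."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z / temperature)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p) if y else math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits, labels, grid=None):
    """Pick the temperature on a grid that minimizes held-out NLL."""
    grid = grid or [0.25 + 0.05 * i for i in range(96)]  # 0.25 .. 5.0
    return min(grid, key=lambda t: nll(logits, labels, t))


# Illustrative (made-up) validation set: the model emits logit 4.0
# (about 0.98 confidence) on ten claims, but only eight are correct.
val_logits = [4.0] * 10
val_labels = [True] * 8 + [False] * 2

T = fit_temperature(val_logits, val_labels)
print(T, sigmoid(4.0 / T))
```

On this toy data the fitted temperature is greater than 1, which pulls the scaled confidence down toward the observed 80 percent accuracy. Temperature scaling changes how confident the model sounds, not which claims it gets right.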

Frame 4: Williamson's Knowledge-First

Williamson's Knowledge and Its Limits (2000) makes the most radical move. He argues that the project of analyzing knowledge into more basic conditions has failed (post-Gettier) and that knowledge is primitive. The norm of assertion is then: assert only what you know.

Applied to LLMs: the question is not whether the model knows. The question is whether the model is in a position to assert without epistemic violation.

This reframes the engineering problem. If knowledge is primitive, the working question is no longer "is the model justified in asserting?" (which requires an analysis of justification) but "is the model in a position to assert?" (which is closer to an operational standard).

The operational standards become:

Reliability for the question type (per Goldman): the model must be reliable for this class of question.
Calibration (per Bayesian framing): the stated confidence must match the empirical accuracy.
Source-traceability: the model must be able to point to the basis of the claim, not merely produce it.
Appropriate hedging: where any of the above fails, the model must hedge or refuse rather than assert.

These are exactly the engineering specifications of modern retrieval-augmented generation (RAG), tool-using agents, and AI-system assertion norms in production. The knowledge-first frame, applied carefully, is the most prescriptively useful frame for AI engineering.
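The four operational standards can be read as a gating policy. The sketch below is one hypothetical way to encode them; the `Warrant` fields, the `assertion_policy` function, and every threshold are assumptions made for illustration, not values drawn from any production system.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ASSERT = "assert"
    HEDGE = "hedge"
    REFUSE = "refuse"


@dataclass
class Warrant:
    class_reliability: float  # empirical accuracy for this question class
    calibration_gap: float    # |stated confidence - empirical accuracy|
    has_source: bool          # can the claim be traced to a retrievable source?


def assertion_policy(w, min_reliability=0.95, max_gap=0.05, refuse_below=0.5):
    """Assert only when all three conditions hold; otherwise hedge or refuse."""
    # Illustrative thresholds: an unreliable class with no source gets a refusal.
    if w.class_reliability < refuse_below and not w.has_source:
        return Action.REFUSE
    if (w.class_reliability >= min_reliability
            and w.calibration_gap <= max_gap
            and w.has_source):
        return Action.ASSERT
    return Action.HEDGE


print(assertion_policy(Warrant(0.99, 0.01, True)))   # Action.ASSERT
print(assertion_policy(Warrant(0.80, 0.15, True)))   # Action.HEDGE
print(assertion_policy(Warrant(0.40, 0.50, False)))  # Action.REFUSE
```

The design choice worth noting is that assertion requires the conjunction of the conditions, while hedging is the default whenever any one of them fails.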

Connection to the KK principle. Williamson (2000) denies the KK principle, the claim that if S knows p, then S knows that S knows p. Knowledge does not require introspective access to one's own knowledge. Applied to LLMs: a model can know without knowing that it knows. Conversely, the model can be confident without that confidence amounting to knowing that it knows. Confidence is not knowledge of one's own knowledge; calibration is what makes confidence informative.

Hallucination Is Gettier-Like

A central engineering problem of modern LLMs is hallucination: the model produces a confident assertion that is false. Viewed epistemically, plain hallucination is the non-Gettier failure mode: the assertion is false, so the truth condition fails and no question of accidental truth arises.

But there is a subtler concern: Gettier-flavored hallucination. The model produces a true assertion via a path that would, on a slightly different input, have produced a confident falsehood. The path is not connected to the truth-maker; the truth is accidental.

Examples:

The model is asked for a citation. It produces a plausible-looking citation that happens to refer to a real paper. The citation is well-formed; the paper exists. But the model did not retrieve the citation from a database; it generated a plausible-looking string and got lucky. The next citation might be hallucinated.
The model is asked for a code snippet. It produces working code that happens to use a real API. The API exists; the code runs. But the model's pattern-completion process could just as easily have produced an API call to a function that does not exist; the next snippet might be a hallucination.
The model is asked for a historical date. It produces a date that happens to be correct. The model's process could have produced any nearby date with similar surface plausibility.

In each case, the assertion is true, the model's process is justified-by-its-lights, but the connection between justification and truth is accidental. This is exactly the structure of Gettier cases. The model's "justified true output" is not knowledge in the post-Gettier sense.

Engineering implication. Two responses.

First, reduce the gap: make the model's process actually track the truth-maker. This is what retrieval-augmented generation (RAG) does for citations: instead of generating plausible-looking citations, retrieve real ones from a verified database. Tool use does this for facts: instead of pattern-completing the answer, look it up.

Second, signal the gap: when the model is in a Gettier-like situation (low base rate of correctness for this class of question), the model should hedge or refuse rather than assert. This is the assertion-norm engineering.

Both responses are needed. Either alone is insufficient.

A Worked Application: Citation Hallucination

A live example. A user asks an LLM for a citation supporting a claim. The model returns:

"Smith, J. and Jones, K. (2018). 'Improvements in Bayesian Inference for High-Dimensional Models.' Journal of Statistical Computation 47(3), 412-431."

Apply each frame:

JTB. Is the citation true (the paper exists)? Probably not in this exact form; LLMs hallucinate citations frequently. Suppose for the moment that the paper does happen to exist. Is the model justified? The model's basis is statistical pattern completion of citation strings; in a Gettier sense, the basis does not track the existence of any specific paper. The classical JTB analysis declares this not knowledge.

Reliabilism. Is the model's process reliable for citation generation? Empirically: no. Studies have shown LLM-generated citations have high hallucination rates. The reliabilist verdict: the model is not justified in this class. Even when the model gets a citation right, it does not get it right via a reliable process.

Bayesian. What is the rational credence that the citation is correct? Given the high known hallucination rate for this class, the prior should be low; the model's confidence is uninformative because it is uncalibrated for this class. The user should not take the model's stated confidence at face value.

Knowledge-first / assertion norm. Is the model in a position to assert this citation? No: the empirical reliability is too low. The norm of assertion fails. The model should either retrieve from a database (RAG) or refuse to provide a citation.

The four frameworks converge on the engineering answer: do not let LLMs generate citations from training-time pattern completion. Use retrieval. Until then, treat any LLM-generated citation as suspect regardless of confidence.

This is exactly the empirical pattern of best-practice LLM deployment: citations come from RAG, not direct generation. The epistemological frameworks make precise why.
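The verification step can be sketched as a lookup against a trusted bibliographic index. This is a toy illustration: `verify_citation`, `normalize`, and the in-memory `trusted_index` are hypothetical, and a real system would query an actual bibliographic database rather than a dictionary. The index entry is the Lewis et al. paper from this page's reference list; the candidate is the generated citation from the example above.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation so near-identical strings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def verify_citation(citation, index):
    """Return 'verified', 'mismatch', or 'not_found' against a trusted index."""
    record = index.get(normalize(citation["title"]))
    if record is None:
        return "not_found"
    same_year = record["year"] == citation["year"]
    same_venue = normalize(record["venue"]) == normalize(citation["venue"])
    return "verified" if same_year and same_venue else "mismatch"


# Hypothetical trusted index, keyed by normalized title.
trusted_index = {
    normalize("Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks"): {
        "venue": "NeurIPS",
        "year": 2020,
    },
}

generated = {
    "title": "Improvements in Bayesian Inference for High-Dimensional Models",
    "venue": "Journal of Statistical Computation",
    "year": 2018,
}
real = {
    "title": "Retrieval-augmented generation for knowledge-intensive NLP tasks",
    "venue": "NeurIPS",
    "year": 2020,
}

print(verify_citation(generated, trusted_index))  # not_found
print(verify_citation(real, trusted_index))       # verified
```

A "not_found" result only shows that the citation is absent from this particular index; it supports refusing to present the citation, not a claim that the paper does not exist.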

A Second Application: Code Generation

A user asks an LLM to write code that calls a specific API. The model returns code that uses a function that does not exist in the API.

The Gettier structure does not apply here directly: the code is wrong, the assertion is false, this is plain hallucination. But suppose the model returns code that happens to use a real function from the API. Apply the frames:

JTB. True (the function exists). Justified by the model's pattern matching? Sort of: API patterns are common in training data. Belief? Operational. But the model's basis was not knowledge of the specific API; it was statistical pattern matching that happened to land on a real function. Not knowledge in the Gettier-sensitive sense.

Reliabilism. Is code generation in this language and library reliable? Depends on the language and library. For widely-used libraries (NumPy, pandas, React), yes. For niche libraries, less so. The reliabilist verdict is per-library, not uniform.

Bayesian. Calibration: the model's confidence should track per-library accuracy. For code generation, calibration is generally poor; the model expresses similar confidence whether it knows the library well or is hallucinating.

Knowledge-first / assertion norm. The model should hedge or refuse for niche libraries. Best practice: have the user run the code (the assertion-verification step), or use static analysis as a verifier.

The four frameworks again converge: running the code is the right epistemic protocol because it provides the missing link between justification and truth. The unit-test discipline of modern coding agents is the practical implementation of the assertion-norm response.
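A cheap first check, before running any generated code, is to confirm that the names it calls actually exist. The sketch below uses only the Python standard library; `api_exists` is a hypothetical helper, and it is a weaker check than running tests, since a function can exist and still be used incorrectly.

```python
import importlib


def api_exists(dotted_name):
    """True if 'module.attr[.attr...]' resolves to a real object after import.

    Limitation: only the top-level module is imported, so submodules that
    are not loaded by the package's __init__ will be reported as missing.
    """
    module_name, _, attr_path = dotted_name.partition(".")
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    parts = attr_path.split(".") if attr_path else []
    for part in parts:
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(api_exists("json.loads"))            # True
print(api_exists("os.path.join"))          # True
print(api_exists("math.fast_sqrt"))        # False: no such function in math
print(api_exists("no_such_module_xyz.f"))  # False: module cannot be imported
```

Passing this check turns "the function happens to exist" into something the pipeline has actually verified, which is the link between justification and truth that pattern completion alone does not supply.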

Synthesis: The Four Framings

Framework	Verdict on LLM knowledge	Implication for engineering
JTB (classical)	Contested: belief and justification both shaky. Even if applicable, Gettier-style cases proliferate.	Treat LLM "knowledge" as a stipulation, not a fact. Hallucinations are Gettier-like.
Reliabilism	Per-question-class. Reliable for some classes (major-city stats, common code), not others (niche citations, minority languages).	Calibrate per-class. Build retrieval augmentation for unreliable classes.
Bayesian	Treat the model's output as evidence to be combined with priors about the model's reliability.	Calibrate the model's confidence to track empirical accuracy. The norm of evidential update applies to the user, not just the model.
Williamson knowledge-first	The question is whether the model is in a position to assert. The answer depends on reliability, calibration, and traceability.	Engineer assertion norms: hedge or refuse where conditions fail. RAG and tool use are practical implementations.

The four frameworks do not give contradictory answers. They give different framings of the same engineering problem: when is the model's assertion warranted enough to be acted on?

The synthesis: epistemic warrant for an LLM output is the conjunction of per-class reliability, calibration to empirical accuracy, source-traceability, and assertion-norm discipline. Each framework names a different facet of the same engineering target.

What This Means for Stanford-Style AI Research

Two open questions worth flagging.

Question 1: do LLMs have beliefs? The operational reading (behavioral consistency under paraphrase) is enough for working epistemic analysis. The metaphysical question (do LLMs have anything like belief in the strong sense) interacts with the Chinese Room argument and remains contested.

Question 2: is the right model of LLM epistemology Bayesian or knowledge-first? Bayesian framings work well for tasks with clear calibration metrics. Knowledge-first framings work well for tasks with clear assertion-norms. Some tasks (factual question-answering) call for both; some (code generation) call for the assertion-norm framing more directly. The empirical question of which framing buys you more is open.

Both questions are good research directions for a Stanford-style symbolic systems program. The technical and philosophical pieces fit naturally together.

Common Confusions

Confusion 1: "LLMs do not know anything." Strong claim, depends on the framework. Reliabilism: LLMs are reliable for some question classes, so they have some justified beliefs, and where those beliefs are also true, they have some knowledge. JTB: more contested. Knowledge-first: depends on whether the model can assert without violation. The strong claim is too strong without specifying the framework.

Confusion 2: "LLMs know everything." Equally too strong. Reliability is per-class. Calibration is generally poor. Hallucination is endemic. The strong claim ignores the empirical evidence about per-class reliability.

Confusion 3: hallucination is just a bug to be fixed. Hallucination is structurally similar to Gettier-style failure. Reducing it (via RAG, fine-tuning, calibration) is engineering work. Eliminating it would require something close to perfect knowledge of the world, which is not available to anything that learns from finite data. The right framing: hallucination rate is a parameter to be optimized, with calibration as the empirical handle.

Confusion 4: Bayesian framing fully solves it. Bayesian framing gives the cleanest formal apparatus for thinking about confidence and update. It does not select priors (the prior problem); it does not specify reliability classes (the generality problem); it does not by itself answer the assertion-norm question. The Bayesian framing is necessary but not sufficient.

Two Exercises

Exercise 1. Pick a specific LLM behavior you are familiar with (citation generation, code generation, factual question-answering, mathematical reasoning, creative writing, summarization, translation). Apply each of the four frameworks to it:

(a) JTB. Does the model produce true beliefs in this class? Are they justified? Are they Gettier-vulnerable?

(b) Reliabilism. Is the model's process reliable for this class? At what level of granularity?

(c) Bayesian. Is the model well-calibrated for this class? What would the user's prior need to be to use the model's confidence directly?

(d) Knowledge-first. Is the model in a position to assert without epistemic violation? Under what conditions?

Exercise 2. Consider three concrete scenarios in which an LLM produces a true assertion. For each, identify the framework that most naturally explains why the assertion does (or does not) count as knowledge.

(i) The LLM is asked to summarize a document the user provided. It produces an accurate summary.

(ii) The LLM is asked to compute the result of $373 \times 521$. It produces the correct answer (194,333) without showing work.

(iii) The LLM is asked for the year a particular Supreme Court case was decided. It returns the correct year. The case is well-known.

Sketch of answers

Answer 1. Take citation generation as the example.

(a) JTB. True beliefs: sometimes, but unreliably in standard LLM behavior. Justified: the model has a justification process (training-time pattern completion), but the justification does not track the truth-maker (the actual existence of the cited paper). Gettier-vulnerable: yes, classically.

(b) Reliabilism. Process reliability: low. Empirically, LLMs have high hallucination rates for citations (50 percent or more in some studies). Reliabilist verdict: not justified.

(c) Bayesian. Calibration: poor. The model expresses similar confidence regardless of whether the citation is real. The user's prior should be low (perhaps 0.3-0.5) on any LLM-generated citation, and the prior should not be raised much by the model's confidence.

(d) Knowledge-first. Position to assert: low. The model should not directly generate citations. Best practice: retrieve from a real database (RAG) and verify the result before presenting.

The four converge on: do not use LLM-generated citations without verification. The synthesis is consistent across frameworks.

Answer 2.

(i) Summarizing a provided document. The model has direct access to the source. The justification is the document itself; the model's process traces from source to summary. Reliabilism most naturally applies: summarization is generally reliable for in-context content. The output is justified-true-belief in the JTB sense; the Gettier worry is minor because the truth-maker (the document) is the model's input.

(ii) Multiplying two integers. The model performs arithmetic with reasonable accuracy on small numbers. The process is statistical pattern completion, not actual arithmetic. Reliabilism gives a per-class reliability that is high for small numbers, low for large numbers. Bayesian asks about the model's calibration; modern LLMs are typically miscalibrated on arithmetic. Knowledge-first recommends a verification step (call a calculator). The output happens to be correct in this case, but the path is unreliable.

(iii) Supreme Court case year. The case is well-known; the year is well-attested in training data. Reliabilism judges the process reliable for this class. The output is justified-true-belief in the standard sense. Williamson knowledge-first would license the assertion. The Gettier worry is minor because the model's process (retrieval of well-attested fact from training) is the kind of process that does track the truth-maker (the historical date).

The exercise illustrates that LLM knowledge is not uniform. The same model has different epistemic warrant for different classes of question, and the frameworks pick up the differences in different ways.

Where This Lives in Practice

Three concrete uses.

Retrieval-augmented generation (RAG). RAG systems bind LLM output to verified retrieved sources. Epistemically, RAG is the engineering response to the Gettier worry: the model's justification is the retrieved source, which actually tracks the truth-maker. Modern RAG systems (Lewis et al. 2020, the open-source LangChain ecosystem, vector-database-backed enterprise products) are this idea at scale.

Calibration in production AI. Calibration metrics (Expected Calibration Error, reliability diagrams, Brier score) are now standard in evaluating ML systems. The conceptual content is Bayesian. The empirical work is to bring stated confidence into alignment with empirical accuracy. Recalibration techniques (temperature scaling, isotonic regression, conformal prediction) are the practical apparatus.

AI-system assertion norms. Modern LLM products increasingly hedge or refuse when they cannot verify a claim ("I am not sure", "This may not be accurate"). Anthropic's constitutional AI approach (Bai et al. 2022) trains the model against an explicit set of written principles, which is one route to specifying when the model should defer or hedge. OpenAI's response style guides the model to flag uncertainty. The engineering goal is to bring assertion behavior in line with epistemic warrant, exactly as the knowledge-first framework specifies.

The general lesson: the epistemic frameworks are not a separate literature running in parallel to the engineering. Together, the four frameworks specify what we want LLM outputs to satisfy, and the engineering work of modern AI is increasingly framed in their terms.
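Two of the calibration metrics named under "Calibration in production AI" are short enough to write out. The following is a minimal sketch for the binary case (each claim is either correct or not); the equal-width binning and the default bin count are common choices rather than the only ones, and the example data are made up.

```python
def brier_score(confidences, outcomes):
    """Mean squared difference between stated confidence and the 0/1 outcome."""
    return sum((c - y) ** 2 for c, y in zip(confidences, outcomes)) / len(confidences)


def expected_calibration_error(confidences, outcomes, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, outcomes):
        bins[min(int(c * n_bins), n_bins - 1)].append((c, y))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(y for _, y in members) / len(members)
            ece += len(members) / n * abs(mean_conf - accuracy)
    return ece


# Illustrative (made-up) evaluation: ten claims stated at 0.9, six correct.
confs = [0.9] * 10
correct = [1] * 6 + [0] * 4

print(expected_calibration_error(confs, correct))  # about 0.3
print(brier_score(confs, correct))                 # about 0.33
```

An expected calibration error of about 0.3 here says the stated confidence overshoots empirical accuracy by thirty points, which is the gap recalibration techniques are meant to close.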

Prerequisites and Next Pages

Prerequisites: What Is Epistemology?, The Gettier Problem, Induction and Hume's Problem, Bayesian Epistemology.
Related: The Chinese Room Argument on the question of whether symbol manipulation suffices for understanding.
Companion essay: Empiricism, Induction, and the Limits of LLM Generalization on the inductive-license question for AI.

References

Primary epistemology:

Gettier, Edmund L. "Is Justified True Belief Knowledge?" Analysis 23, no. 6 (1963): 121-123.
Goldman, Alvin I. "What Is Justified Belief?" In Justification and Knowledge, ed. George Pappas, Reidel, 1979.
Williamson, Timothy. Knowledge and Its Limits. Oxford, 2000. The knowledge-first framework.

For the Bayesian and statistical-learning side:

Howson, Colin, and Peter Urbach. Scientific Reasoning: The Bayesian Approach. Open Court, 1989. The book-length Bayesian defense.
Vovk, Vladimir, Alex Gammerman, and Glenn Shafer. Algorithmic Learning in a Random World. Springer, 2005. Conformal prediction.

For the AI applications:

Lewis, Patrick, et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." NeurIPS 2020. The founding RAG paper.
Bai, Yuntao, et al. "Constitutional AI: Harmlessness from AI Feedback." Anthropic, 2022.
Kadavath, Saurav, et al. "Language Models (Mostly) Know What They Know." Anthropic, 2022. On LLM calibration.
Marcus, Gary, and Ernest Davis. "GPT-3, Bloviator: OpenAI's Language Generator Has No Idea What It's Talking About." MIT Technology Review, 2020. A skeptical take on LLM knowledge.

Stanford Encyclopedia entries: