Issue 01

Why Eve-Genesis trains reasoning modes, not answers

Most AI training is QA-shaped: input, then the right answer. Eve-Genesis is reasoning-shaped: input, then the conceptual transitions, the abstraction levels, the dialectical movement between meanings. Why that distinction matters — and what it produces.

Bill Faruki, Founder & CEO | May 23, 2026 | 7 min read

Most AI training corpora are QA-shaped. An input, an output, a label that says “this is the right answer.” Train on enough of those pairs and the model learns to produce the right answer to inputs it has not seen before. This is the standard pattern. It is, structurally, what instruction-tuning is.

Eve-Genesis is shaped differently. The training target is not the answer. The training target is the structure of the reasoning that produces the answer — the conceptual transitions, the abstraction levels, the explicit moves between concrete and abstract representations, the dialectical tension and its resolution. The model that results does not behave like a QA system. It behaves like a conceptual interpreter.

This essay is about why we made that decision, what it produces, and why our four Operating Systems each declare a different set of reasoning modes on their public surfaces.

The two shapes of training data

Imagine a riddle. I speak without a mouth and hear without ears. What am I? The QA-shaped training pair has one input and one output: echo. Train the model on a hundred thousand such pairs and you get something that is good at riddles.

The reasoning-shaped training pair has the same input but a different output. The output is not a single word. It is a structured object: the literal answer, plus philosophical interpretations (memory, consequence, historical influence, identity reflected through others), plus the reasoning modes that produced each interpretation (analogical, phenomenological, semiotic). The model learns not “what is the answer to the riddle?” but “what kind of cognitive operations does this riddle invite, and what space of meanings do they open?”
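The difference between the two shapes can be sketched as data structures. This is a minimal illustration, not the actual Eve-Genesis schema: the class names (`QAPair`, `Interpretation`, `ReasoningPair`) are hypothetical, and the pairing of each interpretation with a specific mode is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """QA-shaped example: one input, one labeled answer."""
    prompt: str
    answer: str


@dataclass(frozen=True)
class Interpretation:
    """One reading of the input, with the mode assumed to produce it."""
    meaning: str
    mode: str


@dataclass(frozen=True)
class ReasoningPair:
    """Reasoning-shaped example: the literal answer plus a space of readings."""
    prompt: str
    literal_answer: str
    interpretations: tuple = ()

    def modes(self):
        """Return the distinct reasoning modes this example exercises."""
        return {item.mode for item in self.interpretations}


RIDDLE = "I speak without a mouth and hear without ears. What am I?"

qa_example = QAPair(prompt=RIDDLE, answer="echo")

# Which mode yields which meaning is an illustrative guess, not source data.
reasoning_example = ReasoningPair(
    prompt=RIDDLE,
    literal_answer="echo",
    interpretations=(
        Interpretation("memory", "analogical"),
        Interpretation("consequence", "analogical"),
        Interpretation("historical influence", "semiotic"),
        Interpretation("identity reflected through others", "phenomenological"),
    ),
)

print(sorted(reasoning_example.modes()))
```

Both examples share the same prompt and the same literal answer. Only the reasoning-shaped one exposes which cognitive operations were involved, which is the part the training target cares about.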

The first model is a QA system. The second is closer to a conceptual interpreter — closer in posture to what philosophy trains people to do, and to what trained professionals do when they reason in their domain.


What a reasoning mode actually is

Philosophy has formal categories for the kinds of inference humans use. Logic and epistemology have stress-tested them for centuries. The major ones:

Deductive — reasoning from a general rule to a certain conclusion. All humans are mortal; Socrates is human; therefore Socrates is mortal. Truth-preserving when the premises are true.
Inductive — reasoning from examples to a probable generalization. Every swan I have seen is white; therefore swans are probably white. Probabilistic, not certain.
Abductive — inference to the best explanation. The ground is wet; rain is the most likely cause. Central to medical diagnosis and scientific reasoning.
Analogical — mapping one conceptual structure onto another. Central to teaching, to legal reasoning by precedent, to scientific hypothesis formation.
Dialectical — concepts evolving through tension and resolution. Hegelian thesis-antithesis-synthesis. Central to legal argument and theological hermeneutics.
Hermeneutic — interpreting meaning contextually, in light of a tradition. Central to law, theology, literary criticism.
Phenomenological — analyzing how concepts appear in experience. Central to clinical interviewing, to pedagogy, to anything that depends on understanding the learner's frame.
Semiotic — concepts generating adjacent concepts through symbolic relationships. Peirce's tradition.

These are not synonyms. A clinician constructing a differential diagnosis is doing abductive reasoning — inference to the best explanation from a set of presenting features. A pharmacist auditing an AI dosing recommendation is doing analogical reasoning across patient cases plus deductive reasoning from pharmacological rules. An educator scaffolding a misconception is doing phenomenological reasoning to enter the learner's frame and analogical reasoning to bridge from the misconception to the corrected understanding.

The reasoning a discipline does is not generic, and neither are the reasoning modes it uses.

What Eve-Genesis trains

Eve-Genesis is structured around this insight. Each domain edition — Clinical, Legal, Education, Uṣūl, and the rest — is a synthetic reasoning corpus calibrated to the reasoning modes that domain actually uses. Each training example does not stop at the answer. It carries:

The conceptual transition from input to output, made explicit
The reasoning mode used at each step (abductive, analogical, dialectical, etc.)
The abstraction level at which the move occurs
The ontological tags identifying what kind of entity each concept is
Contradiction handling: when two readings are both linguistically available
Alternative interpretations and the reasoning that produces each
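The fields listed above could be represented roughly as follows. This is a sketch under stated assumptions: the type names, the integer abstraction scale, and the `validate` helper are all hypothetical and are not the published Eve-Genesis record format. The sample record reuses the wet-ground example from the list of modes; the sprinkler alternative is invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DEDUCTIVE = "deductive"
    INDUCTIVE = "inductive"
    ABDUCTIVE = "abductive"
    ANALOGICAL = "analogical"
    DIALECTICAL = "dialectical"
    HERMENEUTIC = "hermeneutic"
    PHENOMENOLOGICAL = "phenomenological"
    SEMIOTIC = "semiotic"


@dataclass(frozen=True)
class Step:
    transition: str         # the conceptual move, made explicit
    mode: Mode              # reasoning mode used at this step
    abstraction_level: int  # assumed scale: 0 = concrete, higher = more abstract
    ontological_tags: dict  # concept -> kind of entity


@dataclass(frozen=True)
class ReasoningRecord:
    prompt: str
    answer: str
    steps: tuple
    contradictions: tuple = ()  # readings that are both linguistically available
    alternatives: tuple = ()    # other interpretations and how they arise


def validate(record):
    """Return a list of problems; an empty list means the record is well formed."""
    problems = []
    if not record.steps:
        problems.append("record stops at the answer: no reasoning steps")
    for index, step in enumerate(record.steps):
        if not step.transition.strip():
            problems.append(f"step {index}: transition is empty")
        if step.abstraction_level < 0:
            problems.append(f"step {index}: abstraction level is negative")
        if not step.ontological_tags:
            problems.append(f"step {index}: no ontological tags")
    return problems


record = ReasoningRecord(
    prompt="The ground is wet. Why?",
    answer="rain",
    steps=(
        Step(
            transition="wet ground -> rain as the most likely cause",
            mode=Mode.ABDUCTIVE,
            abstraction_level=1,
            ontological_tags={"wet ground": "observation", "rain": "candidate cause"},
        ),
    ),
    alternatives=("a sprinkler ran overnight",),  # invented for the example
)

print(validate(record))
```

A check like `validate` is one way to enforce that no example "stops at the answer": a record with an answer but no labeled steps is rejected before it reaches the corpus.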

Train at a scale of 100,000 to 250,000 records on this kind of structured object and the result is no longer ordinary instruction-tuned QA. It is what is sometimes called reasoning-style conditioning or, more precisely, the shaping of epistemic priors: you are shaping the model's prior inclinations about which cognitive moves to make in the presence of which kind of question.

What you are doing is closer to Hegelian progression, Aristotelian categorization, or Peircean semiotics than to ordinary instruction tuning.

The behavioral consequence

A model trained this way behaves differently. Some capabilities are slightly weaker (literal answering, deterministic QA on factoids). Several capabilities are substantially stronger:

Symbolic interpretation
Multi-perspectival inference
Tolerance for ambiguity without collapsing it
Abstract reasoning
Metaphorical generation
Domain-appropriate philosophical dialogue

For regulated industries, this profile is the right trade. Healthcare does not need a faster lookup table. It needs a clinician's differential reasoning made auditable. Law does not need a faster citation lookup. It needs a litigator's analogical reasoning across precedent made traceable. Pedagogy does not need a faster answer key. It needs a teacher's phenomenological grasp of where the learner is standing. The QA-shaped model produces what the QA paradigm rewards: the right answer. The reasoning-shaped model produces what regulated work demands: the right answer plus the reasoning, presented in the discipline's native idiom, with the reasoning chain visible and challengeable at every step.

Why each Operating System declares its own modes

On the public surfaces of the four products you can see the consequence. ArthurAI declares analogical, Socratic, phenomenological — the reasoning modes of pedagogy. ChironAI declares abductive, analogical — the reasoning modes of clinical diagnosis. JustineAI declares analogical, abductive, dialectical — the reasoning modes of litigation: precedent, theory of the case, and the dialectical engagement with the opposing argument. TheoAI declares dialectical, hermeneutic — the reasoning modes of theological scholarship: tension and resolution between texts, interpretation in light of tradition.

Those are not marketing words. They are claims about what the underlying compound reasoning model has actually been calibrated to do. The Eve-Genesis edition behind each product is a different shape of synthetic reasoning corpus, weighted toward the modes the discipline uses, away from the ones it does not.
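One way to make "weighted toward the modes the discipline uses" checkable is to measure how much of a corpus's labeled reasoning falls into a product's declared modes. The mapping below copies the declared modes from the paragraph above (lowercased for matching). The functions `declared_share` and `products_declaring` and the toy step list are hypothetical, and the numbers are not real corpus statistics.

```python
from collections import Counter

# Declared modes as stated on the public surfaces, lowercased for matching.
DECLARED_MODES = {
    "ArthurAI": ("analogical", "socratic", "phenomenological"),
    "ChironAI": ("abductive", "analogical"),
    "JustineAI": ("analogical", "abductive", "dialectical"),
    "TheoAI": ("dialectical", "hermeneutic"),
}


def declared_share(product, step_modes):
    """Fraction of labeled steps whose mode is one the product declares."""
    counts = Counter(mode.lower() for mode in step_modes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    declared = sum(counts[mode] for mode in DECLARED_MODES[product])
    return declared / total


def products_declaring(mode):
    """Names of the products that declare the given mode, sorted."""
    return sorted(
        name for name, modes in DECLARED_MODES.items() if mode.lower() in modes
    )


# Toy data: mode labels for eight reasoning steps in an imaginary clinical corpus.
toy_clinical_steps = ["abductive"] * 5 + ["analogical"] * 2 + ["deductive"]

print(declared_share("ChironAI", toy_clinical_steps))  # 7 of 8 steps
print(products_declaring("analogical"))
```

In this toy corpus, seven of eight steps use modes ChironAI declares, while none use the modes TheoAI declares. A real calibration check would need the actual per-step labels from each edition.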

That is why a single frontier model called directly is structurally weaker for regulated work than a frontier model composed inside a vertical-calibrated reasoner. The frontier model has been trained on everything, which averages its reasoning modes toward the median. The Eve-Fusion compound, with its domain-trained Small Reasoning Model at the center, is shaped specifically toward the reasoning modes that produce trustworthy work in the discipline at hand.

Why this matters

The most important consequence is procurement-decisive but rarely articulated: a reasoning-shaped system is auditable in a way a QA-shaped system is not. The audit trail is not just “here is the answer the model produced.” It is “here are the reasoning steps the model used to produce it, named in the discipline's own categories.” A clinician can inspect the abductive chain. A litigator can inspect the analogical mapping to precedent. A scholar can inspect the dialectical movement between texts. The reasoning is structured for review by the human professional whose name will sign the artifact.
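The contrast between the two audit trails can be sketched in a few lines. This is an assumed, simplified structure: `AuditStep`, `render_audit_trail`, and `challenge` are hypothetical names and do not describe any product's real output format. The single step reuses the wet-ground example of abductive inference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditStep:
    mode: str   # reasoning mode, named in the discipline's own category
    claim: str  # what this step concludes
    basis: str  # what the conclusion rests on


def render_audit_trail(answer, steps):
    """Render the answer followed by numbered, mode-labeled reasoning steps."""
    lines = [f"Answer: {answer}"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. [{step.mode}] {step.claim} (basis: {step.basis})")
    return "\n".join(lines)


def challenge(steps, number, note):
    """Attach a reviewer's objection to one specific step (1-indexed)."""
    if not 1 <= number <= len(steps):
        raise ValueError(f"no step {number} in a trail of {len(steps)} steps")
    step = steps[number - 1]
    return f"Challenge to step {number} [{step.mode}]: {note}"


steps = [
    AuditStep(
        mode="abductive",
        claim="rain is the most likely cause",
        basis="the ground is wet",
    ),
]

print(render_audit_trail("rain", steps))
print(challenge(steps, 1, "was any other water source ruled out?"))
```

A QA-shaped trail would contain only the first line. The numbered, labeled steps are what give a reviewer something specific to accept or dispute.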

That is the substrate behind every claim we make about physician-attested, attorney-attested, educator-attested, and scholar-attested outputs. The audit chain is not glued on at the output layer. It comes from how the underlying reasoner was trained — on the modes the professional uses, in the structure they recognize, with every transition labeled.

That is the reason Eve-Genesis exists. That is why we train reasoning modes, not answers.