
Research cognition

Protean separates candidate production from scientific reasoning. The runtime generates, validates, ranks, and explains candidates under bounded controls; a second layer, cognition, asks what should be studied next.

The shape of cognition

evidence retrieval
-> hypothesis formation
-> computational experiment design
-> candidate-set comparison
-> bounded memory consolidation
-> next-investigation priority

The objective is not more candidates per cycle. It is better research direction across cycles, and a system that can tell the two apart. Cognition produces hypotheses, experiment plans, and memory updates that a reviewer can argue with.

Hypotheses

The hypothesis layer proposes reviewable scientific hypotheses across motif behaviour, cleavage exposure, shielding patterns, novelty pressure, failure correlation, and structural constraints. Each hypothesis carries confidence, supporting evidence, contradictory evidence, candidate groups, status, and lineage. Output is bounded: the layer cannot produce more hypotheses per cycle than the cap allows.
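
As an illustration of the record described above, here is a minimal sketch of a hypothesis object and a per-cycle cap. The field names mirror the list in the text, but the class names, the status values, the [0, 1] confidence range, and the rule of keeping the highest-confidence hypotheses when the cap is exceeded are all assumptions made for this example, not Protean's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # Hypothetical status values; the source only says a status exists.
    PROPOSED = "proposed"
    UNDER_REVIEW = "under_review"
    SUPPORTED = "supported"
    REJECTED = "rejected"


@dataclass
class Hypothesis:
    statement: str
    confidence: float  # assumed to lie in [0, 1]
    supporting_evidence: List[str] = field(default_factory=list)
    contradictory_evidence: List[str] = field(default_factory=list)
    candidate_groups: List[str] = field(default_factory=list)
    status: Status = Status.PROPOSED
    parent_id: Optional[str] = None  # lineage: id of the hypothesis this one derives from

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


def cap_hypotheses(proposed: List[Hypothesis], cap: int) -> List[Hypothesis]:
    """Return at most `cap` hypotheses, highest confidence first (assumed policy)."""
    if cap < 0:
        raise ValueError("cap must be non-negative")
    ranked = sorted(proposed, key=lambda h: h.confidence, reverse=True)
    return ranked[:cap]
```

Keeping contradictory evidence on the same record as supporting evidence means a reviewer sees both sides of a hypothesis in one place.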

Sequence-space exploration

Sequence-space analysis maps the current candidate field into clusters, motif families, neighbourhoods, redundancy groups, novelty gradients, and underexplored regions. The analysis runs on facebook/esm2_t12_35M_UR50D sequence embeddings. Text embeddings (BAAI/bge-m3) support evidence retrieval; they do not replace sequence similarity, by design.
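
One way to picture a novelty gradient is to score each candidate by its distance from its nearest neighbour in embedding space. The sketch below assumes the sequence embeddings have already been computed and are supplied as plain lists of floats. The use of cosine similarity, the function names, and the convention that a lone candidate scores 1.0 are assumptions for illustration; the source does not state which metric the analysis uses.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embedding has zero norm")
    return dot / (norm_a * norm_b)


def novelty_scores(embeddings: Dict[str, List[float]]) -> Dict[str, float]:
    """Map each candidate id to 1 - (max cosine similarity to any other candidate).

    A score near 0 marks a redundant candidate; a score near 1 marks a
    candidate in a sparsely populated region of the field.
    """
    scores = {}
    for cid, vec in embeddings.items():
        sims = [
            cosine_similarity(vec, other)
            for oid, other in embeddings.items()
            if oid != cid
        ]
        # Assumed convention: a candidate with no neighbours is maximally novel.
        scores[cid] = 1.0 - max(sims) if sims else 1.0
    return scores
```

This pairwise loop is quadratic in the number of candidates, which is fine for a sketch but would need an index or batching for a large field.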

Scientific memory

Scientific memory persists what the runtime has observed as replayable artifacts:

motif memory
failure memory
contradiction memory
exploration memory
candidate lineage
retrieval history
experiment memory

This gives the system continuity without giving it permission to mutate its own architecture.
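
A replayable artifact can be pictured as an append-only log whose records are never edited in place. The sketch below is a hypothetical in-memory version: the record fields, the kind labels (shortened from the list above), and the JSON-lines serialisation are assumptions for illustration, not the runtime's storage format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterator, List, Optional

# Shortened, hypothetical labels for the seven memory types listed above.
MEMORY_KINDS = {
    "motif", "failure", "contradiction", "exploration",
    "lineage", "retrieval", "experiment",
}


@dataclass(frozen=True)
class MemoryRecord:
    kind: str      # one of MEMORY_KINDS
    cycle: int     # cycle in which the observation was made
    payload: dict  # JSON-serialisable observation


class MemoryLog:
    """Append-only log: records can be added and replayed, never modified."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, record: MemoryRecord) -> None:
        if record.kind not in MEMORY_KINDS:
            raise ValueError(f"unknown memory kind: {record.kind}")
        # Serialise immediately so later changes to the payload cannot leak in.
        self._lines.append(json.dumps(asdict(record), sort_keys=True))

    def replay(self, kind: Optional[str] = None) -> Iterator[MemoryRecord]:
        """Yield records in insertion order, optionally filtered by kind."""
        for line in self._lines:
            record = MemoryRecord(**json.loads(line))
            if kind is None or record.kind == kind:
                yield record
```

Because the log only stores data, replaying it can reconstruct what was observed without touching how the system is built.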

Contradiction routing and failure memory

There are three signal paths in cognition. Two are live; the third is reserved.

Claim QA → reviewer warnings (live). Generated statements are checked against the local evidence index by tasksource/ModernBERT-base-nli. Unsupported or contradicted statements are flagged at the reviewer surface.
Failure memory → scoring (live). The failure-similarity component of the ranking surface penalises candidates that look like prior failures. This is the canonical read path from memory into prioritisation.
Failure memory → proposal / hypothesis prioritisation (reserved). Today, memory influences prioritisation only through the two live paths above. The back-edge from contradiction memory or failure memory into proposal generation or hypothesis prioritisation is reserved. When that loop closes, it will be a single review-gated change visible inside the cycle executor.
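
The live failure-memory path can be sketched as a penalty subtracted from a candidate's base score. The source does not say how similarity to prior failures is measured, so this example substitutes the standard library's `difflib.SequenceMatcher` ratio on raw sequences as a stand-in. The function names, the linear penalty, and the default weight are likewise assumptions.

```python
import difflib
from typing import Sequence


def failure_similarity(candidate: str, failures: Sequence[str]) -> float:
    """Highest similarity ratio in [0, 1] between the candidate and any prior failure."""
    if not failures:
        return 0.0
    return max(
        difflib.SequenceMatcher(None, candidate, failed).ratio()
        for failed in failures
    )


def penalised_score(
    base_score: float,
    candidate: str,
    failures: Sequence[str],
    weight: float = 0.5,  # hypothetical penalty weight
) -> float:
    """Subtract a penalty proportional to similarity with the closest prior failure."""
    if weight < 0:
        raise ValueError("weight must be non-negative")
    return base_score - weight * failure_similarity(candidate, failures)
```

In this sketch memory only lowers a score. It never generates or reorders proposals, which matches the read-only path described above.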

Experiment planning

The experiment planner produces computational study designs that make the next cycle more informative:

mutation sweeps
motif perturbations
contrastive candidate sets
local exploration branches
ablation and challenge studies

These plans are scientific review artifacts. They are not wet-lab findings.
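
As an example of one plan type, a mutation sweep can be enumerated purely computationally. The sketch below generates every single-residue substitution of a sequence, labelled in the conventional wild-type, 1-based position, mutant form. It assumes candidates are sequences over the 20 standard amino acids; the function name and the optional position filter are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_point_sweep(
    sequence: str,
    positions: Optional[Iterable[int]] = None,
) -> Iterator[Tuple[str, str]]:
    """Yield (label, mutant_sequence) for every single substitution.

    `positions` are 0-based indices to mutate; by default every position is swept.
    Labels use 1-based numbering, e.g. "A1C" replaces A at position 1 with C.
    """
    if positions is None:
        positions = range(len(sequence))
    for i in positions:
        wild_type = sequence[i]
        for residue in AMINO_ACIDS:
            if residue != wild_type:
                mutant = sequence[:i] + residue + sequence[i + 1:]
                yield f"{wild_type}{i + 1}{residue}", mutant
```

A full sweep of a sequence of length n yields 19n variants, so restricting `positions` to a motif of interest keeps a plan small enough to review.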

Data provenance and source trust

The provenance layer scores evidence sources by authority, curation, access quality, freshness, provenance granularity, replication utility, and compliance posture. Source reliability influences review context. It does not convert a computational claim into biological validation.
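
To make the scoring concrete, the seven dimensions could be combined into a single trust value with a weighted mean. This is only one plausible aggregation: the source names the dimensions but not how they are combined. The [0, 1] rating scale, the equal default weights, and the function name are assumptions.

```python
from typing import Dict, Optional

DIMENSIONS = (
    "authority",
    "curation",
    "access_quality",
    "freshness",
    "provenance_granularity",
    "replication_utility",
    "compliance_posture",
)


def source_trust(
    ratings: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted mean of per-dimension ratings, each assumed to lie in [0, 1]."""
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for d in DIMENSIONS:
        if not 0.0 <= ratings[d] <= 1.0:
            raise ValueError(f"rating for {d} must be in [0, 1]")
    if weights is None:
        weights = {d: 1.0 for d in DIMENSIONS}  # assumed default: equal weights
    total = sum(weights[d] for d in DIMENSIONS)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[d] * ratings[d] for d in DIMENSIONS) / total
```

The result is a property of the source, not of any claim drawn from it, so a high value here says nothing about whether a computational claim is biologically valid.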

Runtime modes

The cognition layer exposes six runtime modes that shift scheduling priority for the next cycle: generation, exploration, hypothesis, experiment, memory consolidation, and data expansion. Modes change emphasis. They never change validation or publication truth standards.
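
The idea that a mode shifts emphasis without changing standards can be sketched as a function that only adjusts relative scheduling weights. The enum members follow the six modes named above, but the boost factor, the normalisation, and the function name are assumptions for illustration.

```python
from enum import Enum
from typing import Dict


class RuntimeMode(Enum):
    GENERATION = "generation"
    EXPLORATION = "exploration"
    HYPOTHESIS = "hypothesis"
    EXPERIMENT = "experiment"
    MEMORY_CONSOLIDATION = "memory_consolidation"
    DATA_EXPANSION = "data_expansion"


def scheduling_weights(
    mode: RuntimeMode,
    boost: float = 2.0,  # hypothetical emphasis factor
) -> Dict[RuntimeMode, float]:
    """Return per-activity scheduling weights that sum to 1.

    The activity matching `mode` gets `boost` times the share of the others.
    """
    if boost <= 0:
        raise ValueError("boost must be positive")
    raw = {m: (boost if m is mode else 1.0) for m in RuntimeMode}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}
```

In a design like this, validation and publication checks would simply not take the mode as an input, so no mode could loosen them.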