Research cognition
Protean separates candidate production from scientific reasoning. The runtime generates, validates, ranks, and explains under bounded controls; a second layer asks what should be studied next.
The shape of cognition
evidence retrieval
-> hypothesis formation
-> computational experiment design
-> candidate-set comparison
-> bounded memory consolidation
-> next-investigation priority
The objective is not more candidates per cycle. The objective is better research direction across cycles, and a system that knows the difference. Cognition produces hypotheses, experiment plans, and memory updates that a reviewer can argue with.
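The flow outlined above can be sketched as an ordered list of stages that pass one state object along. This is a minimal illustration, not Protean's executor: the stage identifiers mirror the outline, while `run_cognition_cycle`, the handler signature, and the `trace` key are assumptions made for the example.

```python
from typing import Callable, Dict, List

# Stage names follow the outline above; everything else here is hypothetical.
STAGES: List[str] = [
    "evidence_retrieval",
    "hypothesis_formation",
    "computational_experiment_design",
    "candidate_set_comparison",
    "bounded_memory_consolidation",
    "next_investigation_priority",
]

State = Dict[str, object]
Stage = Callable[[State], State]


def run_cognition_cycle(handlers: Dict[str, Stage], state: State) -> State:
    """Run every stage in order, threading one state dict through all of them."""
    trace: List[str] = []
    for name in STAGES:
        # Each stage receives a shallow copy, so it cannot rebind keys on the caller's dict.
        state = handlers[name](dict(state))
        trace.append(name)
    state["trace"] = trace  # record the order for later review
    return state


def make_placeholder(name: str) -> Stage:
    """Build a stand-in handler that only marks its stage as completed."""
    def handler(state: State) -> State:
        state[name] = "done"
        return state
    return handler


handlers = {name: make_placeholder(name) for name in STAGES}
result = run_cognition_cycle(handlers, {"cycle": 1})
```

Keeping the order in one list means a reviewer can see at a glance which stage produced which part of the state.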
Hypotheses
The hypothesis layer proposes reviewable scientific hypotheses across motif behaviour, cleavage exposure, shielding patterns, novelty pressure, failure correlation, and structural constraints. Each hypothesis carries confidence, supporting evidence, contradictory evidence, candidate groups, status, and lineage. Hypotheses are bounded; the layer cannot produce more than the cap allows per cycle.
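As a rough sketch of the record described above, a hypothesis can be modelled as a small data class plus a function that enforces the per-cycle cap. The field names follow the paragraph; the `statement` field, the default status label, the `[0, 1]` confidence range, and the choice to keep the highest-confidence hypotheses when the cap is exceeded are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    statement: str                                   # hypothetical free-text field
    confidence: float                                # assumed to lie in [0, 1]
    supporting_evidence: List[str] = field(default_factory=list)
    contradictory_evidence: List[str] = field(default_factory=list)
    candidate_groups: List[str] = field(default_factory=list)
    status: str = "proposed"                         # hypothetical status label
    lineage: List[str] = field(default_factory=list)  # ids of parent hypotheses


def admit_hypotheses(proposed: List[Hypothesis], cap: int) -> List[Hypothesis]:
    """Return at most `cap` hypotheses for this cycle, highest confidence first."""
    if cap < 0:
        raise ValueError("cap must be non-negative")
    ranked = sorted(proposed, key=lambda h: h.confidence, reverse=True)
    return ranked[:cap]
```

Carrying contradictory evidence on the same record as supporting evidence is what lets a reviewer argue with a hypothesis instead of only accepting or rejecting it.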
Sequence-space exploration
Sequence-space analysis maps the current candidate field into clusters, motif families, neighbourhoods, redundancy groups, novelty gradients, and underexplored regions. The analysis runs on facebook/esm2_t12_35M_UR50D sequence embeddings. Text embeddings (BAAI/bge-m3) support evidence retrieval; they do not replace sequence similarity, by design.
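To make the idea of redundancy groups concrete, the sketch below groups candidates whose embedding vectors are nearly parallel. It assumes the embeddings have already been computed (in Protean's case by the ESM-2 model named above) and uses toy vectors instead. The greedy grouping rule and the `0.95` threshold are illustrative choices, not the system's actual clustering method.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def redundancy_groups(
    embeddings: Dict[str, Sequence[float]], threshold: float = 0.95
) -> List[List[str]]:
    """Greedy grouping: a candidate joins the first group whose first member it resembles."""
    groups: List[List[str]] = []
    for candidate_id, vector in embeddings.items():
        for group in groups:
            representative = embeddings[group[0]]
            if cosine(vector, representative) >= threshold:
                group.append(candidate_id)
                break
        else:
            # No existing group is close enough, so this candidate starts a new one.
            groups.append([candidate_id])
    return groups


# Toy 2-D vectors standing in for real sequence embeddings.
toy = {"cand_a": [1.0, 0.0], "cand_b": [0.99, 0.05], "cand_c": [0.0, 1.0]}
groups = redundancy_groups(toy)
```

Candidates that end up alone in a group are a first hint at underexplored regions, although a real analysis would look at neighbourhood density rather than a single threshold.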
Scientific memory
Scientific memory persists what the runtime has observed as replayable artifacts:
- motif memory
- failure memory
- contradiction memory
- exploration memory
- candidate lineage
- retrieval history
- experiment memory
This gives the system continuity without giving it permission to mutate its own architecture.
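One way to picture "replayable artifacts" is an append-only log whose entries can be read back but never edited. The sketch below is an assumption about shape only: the class name, the JSON encoding, and the kind identifiers (derived from the list above) are hypothetical.

```python
import json
from typing import Dict, List, Optional

# Kind names derived from the memory types listed above; the identifiers are hypothetical.
MEMORY_KINDS = {
    "motif", "failure", "contradiction", "exploration",
    "candidate_lineage", "retrieval_history", "experiment",
}


class ScientificMemory:
    """Append-only store: entries can be added and replayed, never edited or removed."""

    def __init__(self) -> None:
        self._log: List[str] = []

    def record(self, kind: str, cycle: int, payload: Dict[str, object]) -> None:
        if kind not in MEMORY_KINDS:
            raise ValueError(f"unknown memory kind: {kind}")
        # Serialise immediately so later changes to `payload` cannot alter history.
        entry = {"kind": kind, "cycle": cycle, "payload": payload}
        self._log.append(json.dumps(entry, sort_keys=True))

    def replay(self, kind: Optional[str] = None) -> List[Dict[str, object]]:
        """Return entries in the order they were recorded, optionally filtered by kind."""
        entries = [json.loads(line) for line in self._log]
        return [e for e in entries if kind is None or e["kind"] == kind]
```

Because the store exposes only `record` and `replay`, memory can inform later cycles without offering any path to change how the system itself is built.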
Contradiction routing and failure memory
There are three signal paths in cognition. Two are live; the third is reserved.
- Claim QA → reviewer warnings (live). Generated statements are checked against the local evidence index by tasksource/ModernBERT-base-nli. Unsupported or contradicted statements are flagged at the reviewer surface.
- Failure memory → scoring (live). The failure-similarity component of the ranking surface penalises candidates that look like prior failures. This is the canonical read path from memory into prioritisation.
- Failure memory → proposal / hypothesis prioritisation (reserved). Today, claim-QA flags reach reviewers and failure memory feeds the failure-similarity scoring signal. The back-edge from contradiction memory or failure memory into proposal generation or hypothesis prioritisation is reserved. When that loop closes, it will be a single review-gated change visible inside the cycle executor.
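The live failure-memory path can be illustrated with a score penalty that grows as a candidate resembles a remembered failure. This is a sketch under stated assumptions: `difflib.SequenceMatcher` is a stand-in for whatever similarity measure the ranking surface actually uses, and the linear penalty and the `weight` parameter are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable


def failure_similarity(candidate: str, failures: Iterable[str]) -> float:
    """Highest similarity between the candidate and any remembered failure (0.0 if none)."""
    return max(
        (SequenceMatcher(None, candidate, failed).ratio() for failed in failures),
        default=0.0,
    )


def penalised_score(
    base_score: float, candidate: str, failures: Iterable[str], weight: float = 0.5
) -> float:
    """Subtract a penalty proportional to the closest match in failure memory."""
    return base_score - weight * failure_similarity(candidate, failures)
```

Note that the sketch only reads failure memory at scoring time. Nothing here feeds back into proposal generation, which matches the reserved status of the third path.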
Experiment planning
The experiment planner produces computational study designs that make the next cycle more informative:
- mutation sweeps
- motif perturbations
- contrastive candidate sets
- local exploration branches
- ablation and challenge studies
These plans are scientific review artifacts. They are not wet-lab findings.
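A mutation sweep, the first plan type listed above, can be sketched as enumerating every single-residue substitution at chosen positions. The function name, the `Variant` record, and the restriction to the 20 standard amino acids are assumptions for the example. The output is a list of sequences to evaluate computationally, nothing more.

```python
from typing import List, NamedTuple, Optional, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


class Variant(NamedTuple):
    position: int      # 0-based index of the substituted residue
    original: str
    substitute: str
    sequence: str      # full mutated sequence


def mutation_sweep(
    sequence: str, positions: Optional[Sequence[int]] = None
) -> List[Variant]:
    """Enumerate every single-residue substitution at the given positions."""
    if positions is None:
        positions = range(len(sequence))  # default: sweep the whole sequence
    variants: List[Variant] = []
    for pos in positions:
        if not 0 <= pos < len(sequence):
            raise IndexError(f"position {pos} is outside the sequence")
        original = sequence[pos]
        for residue in AMINO_ACIDS:
            if residue == original:
                continue  # skip the unmutated sequence
            mutated = sequence[:pos] + residue + sequence[pos + 1:]
            variants.append(Variant(pos, original, residue, mutated))
    return variants
```

For a sequence of standard residues, a sweep over one position yields 19 variants, so the size of a plan is easy to bound before any computation is scheduled.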
Data provenance and source trust
The provenance layer scores evidence sources by authority, curation, access quality, freshness, provenance granularity, replication utility, and compliance posture. Source reliability influences review context. It does not convert a computational claim into biological validation.
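The seven dimensions named above could be combined into one trust score with a weighted mean, as in the sketch below. The dimension identifiers come from the paragraph, but the `[0, 1]` scale, the equal default weights, and the weighted-mean formula are assumptions, since the source does not say how the dimensions are aggregated.

```python
from typing import Dict, Optional

DIMENSIONS = (
    "authority",
    "curation",
    "access_quality",
    "freshness",
    "provenance_granularity",
    "replication_utility",
    "compliance_posture",
)


def source_trust(
    scores: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted mean of the per-dimension scores (each assumed to lie in [0, 1])."""
    if weights is None:
        weights = {dim: 1.0 for dim in DIMENSIONS}  # hypothetical default: equal weights
    missing = [dim for dim in DIMENSIONS if dim not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[dim] * weights[dim] for dim in DIMENSIONS) / total_weight
```

The result is context for a reviewer. A high score says the source is well curated and traceable, not that a computational claim drawn from it has been biologically validated.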
Runtime modes
The cognition layer exposes six runtime modes that shift scheduling priority for the next cycle: generation, exploration, hypothesis, experiment, memory consolidation, and data expansion. Modes change emphasis. They never change validation or publication truth standards.
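A small sketch of how a mode might shift emphasis without touching standards: the active mode receives extra scheduling weight, while validation settings live in a separate constant that the function never reads. The enum values follow the six modes above. The `boost` factor, the normalisation, and the `VALIDATION_STANDARDS` placeholder are assumptions for illustration.

```python
from enum import Enum
from types import MappingProxyType
from typing import Dict


class RuntimeMode(Enum):
    GENERATION = "generation"
    EXPLORATION = "exploration"
    HYPOTHESIS = "hypothesis"
    EXPERIMENT = "experiment"
    MEMORY_CONSOLIDATION = "memory consolidation"
    DATA_EXPANSION = "data expansion"


# Hypothetical placeholder: read-only, and never consulted by the scheduler below.
VALIDATION_STANDARDS = MappingProxyType({"strict": True})


def schedule_weights(mode: RuntimeMode, boost: float = 2.0) -> Dict[RuntimeMode, float]:
    """Give the active mode extra weight, then normalise so the weights sum to 1."""
    if boost < 1.0:
        raise ValueError("boost must be at least 1.0")
    raw = {m: (boost if m is mode else 1.0) for m in RuntimeMode}
    total = sum(raw.values())
    return {m: weight / total for m, weight in raw.items()}
```

Every mode keeps a non-zero share of the schedule, which reflects the point that modes change emphasis and do not switch other work off.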
