Research Cognition

Protean Labs separates candidate production from scientific reasoning. The runtime still generates, validates, ranks, explains, and learns within bounded controls, while a second layer, research cognition, asks what should be studied next.

This layer is artifact-only: it produces reviewable artifacts and nothing else. It does not rewrite code, mutate scoring rules, or alter learning logic.

Operating Model

evidence retrieval
-> hypothesis formation
-> computational experiment design
-> candidate-set comparison
-> bounded memory consolidation
-> next-investigation priority

The goal is to move from repeated candidate ranking toward persistent scientific exploration.
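
The stages above can be read as an ordered loop in which each stage emits one artifact. The following is a minimal sketch under that assumption; `CycleState`, `run_cycle`, and the placeholder handlers are hypothetical names used for illustration, not the Protean Labs implementation.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage names are taken from the operating model; everything else is illustrative.
STAGES = [
    "evidence retrieval",
    "hypothesis formation",
    "computational experiment design",
    "candidate-set comparison",
    "bounded memory consolidation",
    "next-investigation priority",
]


@dataclass
class CycleState:
    """Artifacts produced so far, plus the order in which stages ran."""
    artifacts: dict[str, str] = field(default_factory=dict)
    trace: list[str] = field(default_factory=list)


def run_cycle(handlers: dict[str, Callable[[CycleState], str]]) -> CycleState:
    """Run every stage once, in order, storing one artifact per stage."""
    state = CycleState()
    for stage in STAGES:
        state.artifacts[stage] = handlers[stage](state)
        state.trace.append(stage)
    return state


# Placeholder handlers: each returns a label instead of doing real work.
handlers = {
    stage: (lambda state, name=stage: f"artifact for {name}") for stage in STAGES
}
state = run_cycle(handlers)
```

Each handler receives the accumulated state, so a later stage can read the artifacts written by earlier ones without changing them.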

Hypotheses

The hypothesis engine proposes reviewable scientific hypotheses across motif behavior, cleavage exposure, shielding patterns, novelty pressure, failure correlation, and structural constraints.

Each hypothesis carries confidence, supporting evidence, contradictory evidence, candidate groups, status, and lineage.
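
One way to picture such a record is a small data class with one field per item listed above. This is a sketch only: the field types, the `Status` values, the `statement` field, and the `[0, 1]` confidence range are assumptions, not the documented schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical status values; the real set is not specified here.
    PROPOSED = "proposed"
    UNDER_REVIEW = "under_review"
    SUPPORTED = "supported"
    REJECTED = "rejected"


@dataclass
class Hypothesis:
    statement: str
    confidence: float  # assumed to lie in [0, 1]
    supporting_evidence: list[str] = field(default_factory=list)
    contradictory_evidence: list[str] = field(default_factory=list)
    candidate_groups: list[str] = field(default_factory=list)
    status: Status = Status.PROPOSED
    lineage: list[str] = field(default_factory=list)  # ids of parent hypotheses

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")

    def evidence_balance(self) -> int:
        """Count of supporting items minus count of contradictory items."""
        return len(self.supporting_evidence) - len(self.contradictory_evidence)


example = Hypothesis(
    statement="placeholder hypothesis about a motif family",
    confidence=0.6,
    supporting_evidence=["evidence-1", "evidence-2"],
    contradictory_evidence=["evidence-3"],
)
```

Keeping contradictory evidence on the record, rather than discarding it, is what makes the hypothesis reviewable.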

Sequence Space

Sequence-space analysis maps the current candidate field into clusters, motif families, neighborhoods, redundancy groups, novelty gradients, and underexplored regions.

ESM-derived sequence signals remain the peptide/protein similarity layer. Text embeddings support evidence retrieval; they do not replace sequence similarity.
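
As an illustration of redundancy grouping, the sketch below greedily groups candidates whose embedding vectors are nearly parallel. It assumes each candidate already has a sequence embedding; the three-dimensional vectors, the candidate names, and the `0.95` threshold are made up for the example and are not ESM output or a documented setting.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def redundancy_groups(
    embeddings: dict[str, list[float]], threshold: float = 0.95
) -> list[list[str]]:
    """Greedy grouping: a candidate joins the first group whose first member
    it resembles at or above the threshold; otherwise it starts a new group."""
    groups: list[list[str]] = []
    for name, vector in embeddings.items():
        for group in groups:
            if cosine(vector, embeddings[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy vectors standing in for sequence embeddings (illustrative values only).
toy = {
    "pep_a": [1.0, 0.0, 0.0],
    "pep_b": [0.99, 0.05, 0.0],
    "pep_c": [0.0, 1.0, 0.0],
}
groups = redundancy_groups(toy)
```

Here `pep_a` and `pep_b` fall into one group and `pep_c` stands alone. Greedy grouping depends on input order; a production system would more likely use a proper clustering method.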

Scientific Memory

Scientific memory persists what the runtime has learned as replayable artifacts:

  • motif memory
  • failure memory
  • contradiction memory
  • exploration memory
  • candidate lineage
  • retrieval history
  • experiment memory

This gives the system continuity without giving it permission to mutate its own architecture.
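
A replayable artifact can be as simple as an append-only log that is read back in order. The sketch below assumes a JSON Lines file and uses hypothetical names (`append_memory`, `replay`, `MEMORY_KINDS`); the actual storage format is not described here.

```python
import json
import tempfile
from pathlib import Path

# Memory kinds mirror the list above; the identifiers themselves are assumptions.
MEMORY_KINDS = {
    "motif",
    "failure",
    "contradiction",
    "exploration",
    "candidate_lineage",
    "retrieval_history",
    "experiment",
}


def append_memory(path: Path, kind: str, payload: dict) -> None:
    """Append one record; existing records are never rewritten."""
    if kind not in MEMORY_KINDS:
        raise ValueError(f"unknown memory kind: {kind}")
    record = {"kind": kind, "payload": payload}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def replay(path: Path) -> list[dict]:
    """Read every record back in the order it was written."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


log_path = Path(tempfile.mkdtemp()) / "memory.jsonl"
append_memory(log_path, "motif", {"note": "placeholder motif observation"})
append_memory(log_path, "failure", {"candidate": "cand-001", "note": "placeholder"})
records = replay(log_path)
```

Because the log only grows, replaying it reconstructs what was learned without touching code, scoring, or learning rules.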

Experiment Planning

The experiment planner creates computational study designs:

  • mutation sweeps
  • motif perturbations
  • contrastive candidate sets
  • local exploration branches
  • ablation and challenge studies

These plans are scientific review artifacts, not wet-lab findings.
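
To make the first plan type concrete, here is a sketch of a single-site mutation sweep: every chosen position is substituted with each of the other 19 standard amino acids. The function name, the output fields, and the example sequence are illustrative assumptions; the output is a plan to review, not a result.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def mutation_sweep(sequence: str, positions: list[int]) -> list[dict]:
    """Enumerate all single-residue substitutions at the given 0-based positions."""
    plan = []
    for position in positions:
        wild_type = sequence[position]
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the no-op substitution
            variant = sequence[:position] + residue + sequence[position + 1:]
            plan.append(
                {
                    "position": position,
                    "from": wild_type,
                    "to": residue,
                    "variant": variant,
                }
            )
    return plan


# Toy four-residue sequence, swept at positions 1 and 3.
plan = mutation_sweep("ACDK", positions=[1, 3])
```

For two positions this yields 38 variants (19 per position), each recorded with its parent residue so the sweep can be traced back to the original sequence.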

Data Provenance

The provenance layer scores evidence sources by authority, curation, access quality, freshness, provenance granularity, replication utility, and compliance posture.

Source reliability influences review context. It does not convert computational claims into biological validation.
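
One plausible way to combine the seven dimensions is a weighted mean. The sketch below assumes each dimension is scored in `[0, 1]` and weighted equally by default; both the scale and the weighting are assumptions, and the resulting number only informs review context.

```python
from typing import Optional

# Dimension names follow the list above, written as identifiers.
DIMENSIONS = (
    "authority",
    "curation",
    "access_quality",
    "freshness",
    "provenance_granularity",
    "replication_utility",
    "compliance_posture",
)


def reliability(
    scores: dict[str, float], weights: Optional[dict[str, float]] = None
) -> float:
    """Weighted mean of the per-dimension scores (equal weights by default)."""
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    weights = weights or {name: 1.0 for name in DIMENSIONS}
    total_weight = sum(weights[name] for name in DIMENSIONS)
    weighted_sum = sum(weights[name] * scores[name] for name in DIMENSIONS)
    return weighted_sum / total_weight


# Illustrative source: strong authority, middling on everything else.
example_scores = {name: 0.5 for name in DIMENSIONS}
example_scores["authority"] = 1.0
score = reliability(example_scores)
```

With equal weights this example evaluates to 4/7. Requiring every dimension to be present prevents a source from scoring well simply because some dimensions were never assessed.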

Bounded Planning

The research planner prioritizes among six runtime modes:

  • generation
  • exploration
  • hypothesis
  • experiment
  • memory consolidation
  • data expansion

Every planned action records allowed writes and forbidden mutations. The system may recommend investigations, but it cannot self-modify code, scoring, or learning rules.
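
The rule above can be sketched as a planned-action record that carries its own write permissions and refuses anything outside them. The class, the target names such as `hypothesis_artifacts`, and the choice of `PermissionError` are hypothetical; only the six modes and the three forbidden targets come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    GENERATION = "generation"
    EXPLORATION = "exploration"
    HYPOTHESIS = "hypothesis"
    EXPERIMENT = "experiment"
    MEMORY_CONSOLIDATION = "memory consolidation"
    DATA_EXPANSION = "data expansion"


# Targets the system may never modify, per the rule above.
FORBIDDEN_MUTATIONS = frozenset({"code", "scoring", "learning_rules"})


@dataclass(frozen=True)
class PlannedAction:
    mode: Mode
    description: str
    allowed_writes: frozenset
    forbidden_mutations: frozenset = FORBIDDEN_MUTATIONS

    def __post_init__(self) -> None:
        overlap = self.allowed_writes & self.forbidden_mutations
        if overlap:
            raise ValueError(f"allowed writes include forbidden targets: {sorted(overlap)}")

    def check_write(self, target: str) -> None:
        """Raise unless the target is explicitly allowed for this action."""
        if target in self.forbidden_mutations:
            raise PermissionError(f"forbidden mutation: {target}")
        if target not in self.allowed_writes:
            raise PermissionError(f"write not allowed for this action: {target}")


action = PlannedAction(
    mode=Mode.HYPOTHESIS,
    description="propose a follow-up hypothesis for review",
    allowed_writes=frozenset({"hypothesis_artifacts"}),
)
action.check_write("hypothesis_artifacts")  # permitted, returns None
```

Calling `action.check_write("scoring")` raises `PermissionError`. Writes are deny-by-default, so a target missing from both lists is also rejected.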