Entity extraction
Entity extraction turns local scientific records into structured review signals. It is additive, not authoritative: it augments ingestion and evidence organisation without replacing the deterministic validators.
What gets extracted
The extractor is configured for scientific and peptide-adjacent entities. It routes to the urchade/gliner_large-v2 model when the local GLiNER runtime is available; deterministic field extraction and the regex fallback remain active when it is not. It targets:
- peptide names
- sequence-like strings
- assay names
- proteases
- organisms
- route-of-administration terms
- degradation and stability terms
- permeability terms
- toxicity terms
- failure signals
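The routing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual extractor: the label strings, the `SEQUENCE_RE` pattern, the `source` field, and the function names are all assumptions made for the example. Only the model name and the `GLiNER.from_pretrained` / `predict_entities` calls come from the source and the GLiNER package respectively.

```python
import re

# Hypothetical label strings, paraphrased from the list above.
LABELS = [
    "peptide name", "sequence-like string", "assay name", "protease",
    "organism", "route of administration", "degradation or stability term",
    "permeability term", "toxicity term", "failure signal",
]

# Assumed fallback pattern: a run of 5+ one-letter amino-acid codes.
SEQUENCE_RE = re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]{5,}\b")


def load_gliner(model_name="urchade/gliner_large-v2"):
    """Return a GLiNER model, or None when the local runtime is missing."""
    try:
        from gliner import GLiNER
    except ImportError:
        return None
    return GLiNER.from_pretrained(model_name)


def regex_entities(text):
    """Deterministic pass: always runs, with or without GLiNER."""
    return [
        {"text": m.group(), "label": "sequence-like string",
         "start": m.start(), "end": m.end(), "source": "regex"}
        for m in SEQUENCE_RE.finditer(text)
    ]


def extract_entities(text, model=None, threshold=0.5):
    """Combine the regex pass with GLiNER predictions when a model is given."""
    entities = regex_entities(text)
    if model is not None:
        for ent in model.predict_entities(text, LABELS, threshold=threshold):
            entities.append({
                "text": ent["text"], "label": ent["label"],
                "start": ent["start"], "end": ent["end"],
                "source": "gliner",
            })
    return entities
```

Calling `extract_entities(text, model=load_gliner())` uses GLiNER when it is installed and degrades to the regex pass alone when `load_gliner()` returns `None`.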
Where the output lives
data/processed/entities/entities_latest.jsonl
Entity records help the retrieval layer and the paper generator identify relevant assay context, protease language, degradation signals, and failure vocabulary. They appear in candidate explanation context, not in the scoring contract.
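As a rough illustration of how a downstream consumer might read this file, the sketch below groups records by entity type. The record schema is not documented here, so the `label` key is an assumption, as is the helper name `load_entities_by_label`; only the file path comes from the text above.

```python
import json
from collections import defaultdict
from pathlib import Path

ENTITIES_PATH = Path("data/processed/entities/entities_latest.jsonl")


def load_entities_by_label(path=ENTITIES_PATH):
    """Read one JSON object per line and group records by their label.

    Assumes each record carries a "label" key; records without one are
    collected under "unknown" rather than dropped.
    """
    grouped = defaultdict(list)
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the JSONL file
            record = json.loads(line)
            grouped[record.get("label", "unknown")].append(record)
    return dict(grouped)
```

A retrieval step could then look up, say, `load_entities_by_label().get("protease", [])` to gather protease mentions as context, without feeding anything into scoring.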
What entity extraction is not
Entity extraction can miss entities or over-select terms, and the GLiNER route depends on local package support. It is a structuring aid, not a scientific claim engine. The extracted records influence retrieval and context; they do not influence validation, scoring, or learning.
