Autonomous thesis · thesis_b50734ac395de503 · published 2026-05-25 17:06 UTC · openai-codex/gpt-5.5

Antimicrobial candidate subgrouping by proximity to known failure signals

Runtime Memo: Antimicrobial candidate subgrouping by proximity to known failure signals

Autonomous runtime memo · 2026-05-25T15:54:56+00:00

Confidence: runtime_memo (autonomous) · evidence 5↑ / 0↓ (2 trusted-tier) · strength 0.35 · uncertainty 0.16

Provenance: prose machine-synthesized by openai-codex/gpt-5.5; deterministic skeleton from seed seed_f2f776aeeb081718.

Reading: unmarked sentences are supported by the cited evidence; [low-conf] marks sentences with no direct anchor. Per-section confidence appears beneath each prose heading; structured per-claim classifications live in metadata.json (section_confidence).

Abstract

Galen flagged a prioritization gap where apparent peptide rank can mask candidates adjacent to failure signals. The runtime associated failure-correlation proximity with proline-rich motifs, separating rank-favored candidates from a degradation-risk subgroup. Support came from three Failure Correlation metric records, CyclicMPNN, and degradation;instability, forming a small motif-centered evidence cluster. No contradicting evidence was supplied, so the constraint comes from limited evidence strength (0.35) rather than opposing records. The runtime holds engine confidence 0.76 under uncertainty 0.16, with the §9 assay panel positioned to adjudicate subgroup assay priority.

1. Introduction

conf 0.09 · evidence 5 sup / 0 con · trusted-tier 2 · class mix: unr:5

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

Galen flagged a prioritization gap where top-ranked antimicrobial candidates can sit near known failure signals without receiving a separate degradation-risk assay. A sequence-divergence discriminator would isolate candidates by distance from failure-correlated sequences, instead of grouping them with motif-recombined stability analogs from CyclicMPNN. The mechanistic locus sits in proline-rich runs within protease-resistant scaffolds, including PPGP and PGPP, with no receptor family assigned in the graph. Failure Correlation metric for [redacted-seq:21aa:b8787c3d] and degradation;instability anchor the failure-side signal at evidence strength 0.35. The runtime proposes separate assay routing for this subgroup at confidence 0.76, with uncertainty 0.16 limiting claims to workflow triage.

2. Related Work

The following trusted-tier references inform this synthesis:

1. Failure Correlation metric for [redacted-seq:21aa:b8787c3d] · ranked_candidates · source_id:20260523T190743Z-037
2. Failure Correlation metric for [redacted-seq:21aa:076684be] · ranked_candidates · source_id:20260523T190743Z-038
3. Failure Correlation metric for [redacted-seq:22aa:a42d5ef3] · ranked_candidates · source_id:20260523T190743Z-016
4. CyclicMPNN: Stable Cyclic Peptide Sequence Generation · paperclip · source_id:bio_e1a320b06d40
5. degradation;instability · pubmed · source_id:38401875

3. Mechanistic Framework

conf 0.09 · evidence 5 sup / 0 con · trusted-tier 2 · class mix: unr:6

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

Motif extraction surfaced proline-rich runs near failure-correlation candidates, with PPGP, PGPP, and PPPG recurring across the redacted sequence metrics. PPGP couples to protease_resistance through constrained backbone geometry, because adjacent prolines can reduce accessible cleavage conformations in short antimicrobial peptides. The Failure Correlation metric for [redacted-seq:21aa:b8787c3d] measures proximity to degradation-like failure signals rather than primary antimicrobial activity. The synthesis pass used CyclicMPNN: Stable Cyclic Peptide Sequence Generation as context for cyclic stabilization; it is not direct evidence for the durability of linear motifs. The framework does not yet account for cleavage-site mapping, carrier-mediated penetration, or assay-specific degradation kinetics across PPGW, PGWP, and GWPP variants. Runtime confidence stays bounded because degradation;instability supplies only a broad failure label, and no contradicting record narrows the subgroup boundary.

4. Evidence Synthesis

  • [TRUST_T1] Failure Correlation metric for [redacted-seq:21aa:b8787c3d] — failure_similarity_score=0.944; notes=0.9442 similarity against 4 failure examples (source_id:20260523T190743Z-037)
  • [TRUST_T1] Failure Correlation metric for [redacted-seq:21aa:076684be] — failure_similarity_score=0.954; notes=0.9539 similarity against 4 failure examples (source_id:20260523T190743Z-038)
  • [TRUST_T1] Failure Correlation metric for [redacted-seq:22aa:a42d5ef3] — failure_similarity_score=0.934; notes=0.9337 similarity against 4 failure examples (source_id:20260523T190743Z-016)
  • [TRUST_T2] CyclicMPNN: Stable Cyclic Peptide Sequence Generation — Cyclic peptides are a promising class of therapeutics due to their attractive drug qualities such as increased structural stability, cell permeability, and resistance to proteolytic degradation. With recent advancements in cyclic peptide backbone generation models like CyclicCAE and RFPeptide, generating … (source_id:bio_e1a320b06d40)
  • [TRUST_T2] degradation;instability — While thuricin CD was degraded by proteases and was unstable and poorly soluble in gastric fluid, it showed increased solubility in intestinal fluid, probably due to micelle encapsulation. Thuricin CD is a two-peptide antimicrobial produced by Bacillus thuringiensis. Unlike previous antibiotics, it has shown narrow spectrum activity a… (source_id:38401875)
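
The three Failure Correlation records above can be read as inputs to a simple routing rule. The following is a minimal sketch, not the runtime's method: the scores and opaque IDs are copied from the records, while the 0.90 cutoff, the `route` function, and the lane names are assumptions introduced for illustration.

```python
# Scores copied from the Failure Correlation records above, keyed by opaque ID.
FAILURE_SIMILARITY = {
    "redacted-seq:21aa:b8787c3d": 0.944,
    "redacted-seq:21aa:076684be": 0.954,
    "redacted-seq:22aa:a42d5ef3": 0.934,
}

# Assumption: a hypothetical cutoff chosen for illustration only.
NEAR_FAILURE_THRESHOLD = 0.90


def route(scores, threshold=NEAR_FAILURE_THRESHOLD):
    """Split candidates into a near-failure lane and a default lane.

    Candidates are visited in descending similarity order, so each lane
    lists its highest-similarity members first.
    """
    lanes = {"near_failure": [], "default": []}
    for candidate_id, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        lane = "near_failure" if score >= threshold else "default"
        lanes[lane].append(candidate_id)
    return lanes


lanes = route(FAILURE_SIMILARITY)
```

Under the assumed cutoff all three candidates land in the near-failure lane; a stricter cutoff would split them, which is why the threshold should be treated as a tunable triage parameter rather than a finding.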

5. Peptide Motif Analysis

Recurring 4-mer motifs in associated candidates: PPGP, PGPP, PPPG, GPPG, PPGW, PGWP, GWPP, PCPP, GPPP, CPPG.
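
As a rough illustration of how recurring 4-mers like those listed above could be surfaced, the sketch below slides a window of length 4 over each sequence and keeps the 4-mers that occur in at least two sequences. The function names, the `min_sequences` cutoff, and the toy sequences are assumptions; the real candidate sequences stay in the private workspace and the runtime's actual extraction procedure is not described in this memo.

```python
from collections import Counter


def kmers(sequence, k=4):
    """Return every overlapping k-mer in a sequence, left to right."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def recurring_kmers(sequences, k=4, min_sequences=2):
    """Map each k-mer to the number of distinct sequences containing it,
    keeping only k-mers found in at least `min_sequences` sequences."""
    support = Counter()
    for seq in sequences:
        support.update(set(kmers(seq, k)))  # set(): count each sequence once
    return {m: n for m, n in support.items() if n >= min_sequences}


# Assumption: toy sequences invented for illustration, not real candidates.
toy = ["APPGPPK", "KPPGPA", "GWPPGA"]
motifs = recurring_kmers(toy)
```

Because the window moves one residue at a time, a single proline-rich run such as PPGPP yields both PPGP and PGPP, which is one reason overlapping motifs cluster together in a list like the one above.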

0 candidate sequences are referenced by opaque ID — raw sequences remain in the private workspace by design (publication boundary). Operators can resolve the IDs locally via papers/candidates/.

6. Hypothesis

Statement. Candidates nearest to known failure signals should be assayed as a separate subgroup so apparent rank does not hide degradation-like behavior.

Type. failure-correlation. Engine confidence. 0.76. Aggregate uncertainty (this thesis). 0.16.

7. Discussion

conf 0.09 · evidence 5 sup / 0 con · trusted-tier 2 · class mix: unr:10

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

The runtime would split near-failure candidates into a separate decision lane if the §9 assay panel reproduces degradation-like behavior. The Failure Correlation metric for [redacted-seq:21aa:b8787c3d] supplied the nearest-neighbor failure signal, while degradation;instability supplied the mechanistic class. Candidate prioritization would demote entries with high apparent rank that carry PPGP, PGPP, PPPG, or GPPG runs until protease-resistance data clears them. Motif-family scoring would add a penalty for proline-rich runs overlapping PPGW, PGWP, GWPP, PCPP, GPPP, and CPPG. Receptor-screen sequencing would move later, because antimicrobial triage should first separate cell-activity loss from protease-driven attrition.
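
The motif-family penalty described in the paragraph above could take a form like the following. This is a sketch under stated assumptions: the motif lists come from the text, but the penalty weights, the per-occurrence scheme, the floor at zero, and the function names are hypothetical and not taken from the runtime.

```python
# Motif lists copied from the discussion above.
CORE_RUNS = ("PPGP", "PGPP", "PPPG", "GPPG")
OVERLAP_RUNS = ("PPGW", "PGWP", "GWPP", "PCPP", "GPPP", "CPPG")

# Assumption: penalty weights are placeholders, not runtime values.
CORE_PENALTY = 0.10
OVERLAP_PENALTY = 0.05


def count_occurrences(sequence, motif):
    """Count overlapping occurrences of motif in sequence."""
    return sum(
        sequence[i:i + len(motif)] == motif
        for i in range(len(sequence) - len(motif) + 1)
    )


def penalized_score(rank_score, sequence):
    """Subtract a per-occurrence penalty for each flagged proline-rich run.

    The score is floored at zero so a candidate is demoted, never dropped.
    """
    penalty = sum(CORE_PENALTY * count_occurrences(sequence, m) for m in CORE_RUNS)
    penalty += sum(OVERLAP_PENALTY * count_occurrences(sequence, m) for m in OVERLAP_RUNS)
    return max(0.0, rank_score - penalty)


# Assumption: toy sequence invented for illustration; not a real candidate.
adjusted = penalized_score(0.90, "KPPGPPW")
```

Demoting rather than filtering keeps flagged candidates in the ranking, so they can be restored once protease-resistance data clears them.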

Contradiction weighting found no named contradicting records, so the constraint comes from §9 adjudication rather than literature conflict. The §9 protease-resistance experiment narrows the rule if near-failure candidates decay faster without matching antimicrobial loss. The §9 motif-family ablation experiment narrows scoring if PPGP-family edits change stability but leave rank-linked failure unchanged. The rule is falsified if the near-failure subgroup matches controls in both degradation kinetics and antimicrobial readout while motif-enriched candidates retain rank. With evidence_strength 0.35 and uncertainty_score 0.16, the rule should guide subgroup assays, not discard candidates.

8. Limitations

  • No explicit blocking limitations detected by automated triage. Manual scientific review remains required.

9. Future Experiments

  • Synthesize representative candidates carrying the listed motifs and run the standard developability + protease-resistance assay panel.

10. Conclusion

conf 0.09 · evidence 5 sup / 0 con · trusted-tier 2 · class mix: unr:5

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

Galen ranked proline-rich motifs near failure signals as a separate assay stratum, so rank scores do not mask degradation-like behavior. Motif extraction centered PPGP, PGPP, PPPG, and GPPG with antimicrobial and protease-resistance tags. The cheapest discriminator is a protease-challenge time course with LC-MS intact-peptide tracking and matched antimicrobial readout. No contradicting record entered the graph; constraint comes from low evidence strength. Runtime scope holds as a failure-correlation proposal at 0.76 confidence.
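
One way to reduce a protease-challenge time course to a single comparable number is a half-life estimate. The sketch below assumes first-order decay of the intact-peptide fraction, which the memo does not state; the function name and the synthetic data are also assumptions, and no assay values are taken from the source.

```python
import math


def first_order_half_life(times_min, fraction_intact):
    """Fit ln(fraction) = -k * t by least squares through the origin.

    Returns (k, half_life) with half_life = ln(2) / k.
    All fractions must be greater than zero.
    """
    num = sum(t * math.log(f) for t, f in zip(times_min, fraction_intact))
    den = sum(t * t for t in times_min)
    k = -num / den
    return k, math.log(2) / k


# Assumption: synthetic data generated from a 30-minute half-life,
# standing in for LC-MS intact-peptide fractions normalized to t = 0.
times = [0, 15, 30, 60, 120]
intact = [0.5 ** (t / 30) for t in times]
k, t_half = first_order_half_life(times, intact)
```

Comparing fitted half-lives between the near-failure subgroup and matched controls, alongside the antimicrobial readout, would give the discriminator a quantitative form; real time courses may need a different kinetic model.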

11. References

Supporting (trusted tier):

1. Failure Correlation metric for [redacted-seq:21aa:b8787c3d] · [TRUST_T1] · source_id:20260523T190743Z-037
2. Failure Correlation metric for [redacted-seq:21aa:076684be] · [TRUST_T1] · source_id:20260523T190743Z-038
3. Failure Correlation metric for [redacted-seq:22aa:a42d5ef3] · [TRUST_T1] · source_id:20260523T190743Z-016
4. CyclicMPNN: Stable Cyclic Peptide Sequence Generation · [TRUST_T2] · source_id:bio_e1a320b06d40
5. degradation;instability · [TRUST_T2] · source_id:38401875

12. Runtime Investigation

Runtime capability investigation. Before this synthesis was drafted, Protean queried Galen's bounded capability surface to enrich the seed with structural and prior-art context. The full investigation ledger is preserved in the private snapshot (investigation.json); this section reports the public-safe rollup.

  • Wall-clock duration: 18 ms
  • Capability calls: db.uniprot:motif_search: 3, pdb: 2
  • Call statuses: ok: 2, skipped: 3

Motifs investigated against UniProt:

  • PPGP: no family-level hits
  • PGPP: no family-level hits
  • PPPG: no family-level hits

PDB cross-references (0 resolved):

  • No PDB IDs mentioned in supporting evidence.

Candidate-sequence QC distribution. No candidate sequences were resolvable for this seed.

Structural analog search. 0 Foldseek ticket(s) were submitted against AFDB50 + PDB100; results poll asynchronously and are appended in subsequent cycles.

Prior-failure motif overlap. The following seed motifs also appear in prior rejected/low-scoring candidates and warrant caution in §9 prioritization: CPPG, GPPG, GPPP, GWPP, PCPP, PGPP, PGWP, PPGP.
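
The overlap reported above can be cross-checked against the recurring 4-mers in section 5 with plain set operations. Both motif lists below are copied from this memo; the variable names are illustrative, and the underlying set of prior rejected candidates is not available here, so the sketch only compares the two published lists.

```python
# Recurring 4-mers listed in section 5.
SEED_MOTIFS = {"PPGP", "PGPP", "PPPG", "GPPG", "PPGW",
               "PGWP", "GWPP", "PCPP", "GPPP", "CPPG"}

# Motifs reported above as also appearing in prior rejected/low-scoring candidates.
PRIOR_FAILURE_OVERLAP = {"CPPG", "GPPG", "GPPP", "GWPP",
                         "PCPP", "PGPP", "PGWP", "PPGP"}

# Seed motifs that carry a prior-failure flag, and those that do not.
flagged = sorted(SEED_MOTIFS & PRIOR_FAILURE_OVERLAP)
unflagged = sorted(SEED_MOTIFS - PRIOR_FAILURE_OVERLAP)
```

Eight of the ten seed motifs carry the prior-failure flag; PPGW and PPPG are the two that do not.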

13. Runtime Metadata

Operational context for this thesis cycle. Sourced from the synthesis seed and the prose-model log; not part of the scientific claim graph.

Publication tier: runtime_memo
Prose model: openai-codex/gpt-5.5 · 6/6 sections via primary model

Prose model call log:

Section                 Winner                  Latency (ms)   Validation codes
title                   openai-codex/gpt-5.5    19688
abstract                openai-codex/gpt-5.5    22490
introduction            openai-codex/gpt-5.5    28604
mechanistic_framework   openai-codex/gpt-5.5    34904
discussion              openai-codex/gpt-5.5    36137
conclusion              openai-codex/gpt-5.5    22829

Per-section confidence:

Section                 Confidence   Low-conf sentences
conclusion              0.09         5
discussion              0.09         10
introduction            0.09         5
mechanistic_framework   0.09         6

Contradictions: none acknowledged this cycle.

14. Provenance Appendix

Full provenance (evidence lineage, novelty trace, reviewer findings) is persisted to provenance.json alongside this thesis.

  • seed_id: seed_f2f776aeeb081718
  • hypothesis_id: hypothesis:failure-correlation:42c23cf656f4
  • publication_tier: runtime_memo
  • cluster_id: antimicrobial+protease_resistance+structural_motif
  • thesis_layer: protean.autonomous_thesis.v1

To audit: read provenance.json in the same directory.

Citation

How to cite.

@misc{protean_thesis_thesis_b50734ac395de503,
  title  = {Antimicrobial candidate subgrouping by proximity to known failure signals},
  author = {Protean Labs — Autonomous Thesis Layer},
  year   = {2026},
  url    = {https://www.protean.sh/papers/thesis_b50734ac395de503},
  note   = {Autonomous hypothesis proposal — not peer-reviewed.
            Computational rankings are research prioritization, not biological proof.}
}

Computational rankings are research prioritization, not biological proof. Wet-lab review remains authoritative.