Mechanistic thesis · thesis_f114bfe88a0980bb · published 2026-06-03 05:20 UTC · openai-codex/gpt-5.5

Separating constrained from linear peptide candidates by conformational constraint handles

Research Note · autonomous synthesis · 2026-05-26T18:03:08+00:00

Confidence: research_note (autonomous) · evidence 5↑ / 5↓ (2 trusted-tier) · strength 0.35 · uncertainty 0.33

Provenance: prose machine-synthesized by openai-codex/gpt-5.5; deterministic skeleton from seed seed_932ba582fbc9e154.

Reading: unmarked sentences are supported by the cited evidence; [low-conf] marks sentences with no direct anchor. Per-section confidence appears beneath each prose heading; structured per-claim classifications live in metadata.json → section_confidence.

Scope note: most sentences in the LLM-drafted sections (Introduction, Mechanistic Framework, Discussion, Conclusion) lack direct per-sentence evidence anchors. The per-section confidence gutter quantifies this; see §9 Limitations.

Abstract

Evidence clusters in the synthesis pass target a prioritization gap: constrained candidates currently share ranking space with unconstrained linear peptides. We propose a discriminator that uses cyclization, disulfide, and high-proline handles to separate these candidates from unconstrained linear peptides. Support is structural and assay-adjacent, including the record Structural metric for sequence GHQMQHHCDDSQPTDCWP and the study Cyclic Peptide Nanotubes in Deep Eutectic Solvents. The study Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet constrains interpretation because formulation can shift peptide exposure. Runtime confidence is moderate (0.73) and evidence strength is low (0.35); the §10 experiment panel adjudicates whether separate validation is warranted.

1. Introduction

conf 0.09 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: spec:1 | unr:5

Galen-stage evidence defines a prioritization gap: constrained candidates enter the same queue as unconstrained linear peptides despite distinct validation needs. A sequence-divergence discriminator would separate structurally constrained candidates before analog ranking, whereas motif-recombined stability analogs would mainly reshuffle local variants. The mechanistic locus is structural_motif, with no receptor family assigned in the seed, centered on proline-rich runs such as PPGP. These runs fit constraint-handle logic, but contradicting evidence is present without seed-listed titles, limiting mechanistic assignment. Structural metric for sequence GHQMQHHCDDSQPTDCWP and Cyclic Peptide Nanotubes support treating constraint handles as validation triggers. We propose a runtime-confidence 0.73 branch for cyclized, disulfide-bearing, or high-proline candidates.

2. Methods

This synthesis was produced by Protean's autonomous thesis layer on top of the local provenance graph. The procedure for this cycle was:

1. Evidence selection. 5 supporting and 5 contradicting record(s) were drawn from the trusted-tier evidence pool. Of the supporting records, 2 carry tier TRUST_T2 or higher (peer-reviewed literature or replicated runtime measurements) and the remaining 3 are TRUST_T1 (runtime-internal observations); all 5 contradicting records are listed as TRUST_T2 in §12.

2. Seed construction. A hypothesis seed (seed_932ba582fbc9e154) was assembled by clustering the selected evidence on mechanistic + receptor + motif tags (cluster structural_motif), then proposing a discriminator hypothesis that the cited evidence could constrain or falsify.

3. Prose generation. Section bodies (Introduction, Mechanistic Framework, Discussion, Conclusion) were drafted by an LLM provider chain (openai-codex/gpt-5.5 → ollama/deepseek-r1:latest). The chain falls back deterministically when every provider fails; the deterministic skeleton is preserved verbatim in provenance.json for replay. All other sections (Methods, Related Work, Evidence Synthesis, Peptide Motif Analysis, Hypothesis, Limitations, Future Experiments, References, Provenance Appendix) are deterministic.

4. Claim classification. Every sentence in the LLM-drafted prose was passed through Protean's epistemic classifier (pipelines/autonomous_thesis/epistemics.py), which labels sentences as OBSERVED, INFERRED, WEAKLY_SUPPORTED, SPECULATIVE, UNRESOLVED, or CONTRADICTORY based on language markers and reference anchors. The per-section confidence header reports the resulting class mix.

5. Gates before publication. The full draft was scored by an internal reviewer committee + novelty engine. Both gates returned publish for this synthesis; the verdicts are persisted in provenance.json. The published markdown is additionally scrubbed by pipelines/public_thesis_export._scrub_markdown to remove any residual absolute paths, file URIs, private paths, epistemic-label markers, and HTML script tags.
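
The marker-based labelling described in step 4 can be illustrated with a minimal sketch. The label names come from the list above; the cue lists, their precedence order, the anchor pattern, and the function names `classify_sentence` and `class_mix` are assumptions made for illustration and are not taken from pipelines/autonomous_thesis/epistemics.py.

```python
import re
from collections import Counter

# Hypothetical cue lists, checked in order; the first matching label wins.
MARKERS = [
    ("CONTRADICTORY", ("contradict", "conflicts with", "falsif")),
    ("SPECULATIVE", ("we propose", "we speculate", "might", "could")),
    ("WEAKLY_SUPPORTED", ("may ", "suggests", "consistent with")),
    ("INFERRED", ("therefore", "implies", "indicates")),
]

# Assumed form of a reference anchor: a source_id token as used in this note.
ANCHOR = re.compile(r"source_id:\S+")


def classify_sentence(sentence: str) -> str:
    """Label one sentence from language cues and reference anchors."""
    text = sentence.lower()
    for label, cues in MARKERS:
        if any(cue in text for cue in cues):
            return label
    # No hedging cue found: an anchored sentence counts as observed,
    # an unanchored one stays unresolved.
    return "OBSERVED" if ANCHOR.search(sentence) else "UNRESOLVED"


def class_mix(sentences) -> Counter:
    """Tally labels for a section, analogous to the per-section class mix."""
    return Counter(classify_sentence(s) for s in sentences)
```

Under these assumptions, an unanchored declarative sentence falls through to UNRESOLVED, which is one way a section could end up with a class mix dominated by `unr`.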

Publication tier for this cycle: research_note. Tier reflects evidence strength + reviewer verdict + novelty score; it does NOT reflect peer review.
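
The provider chain with deterministic fallback described in step 3 could look roughly like the sketch below. The function name `draft_section`, the provider callables, and the return convention are hypothetical; only the behaviour (try providers in order, fall back to the deterministic skeleton when every provider fails) follows the description above.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]


def draft_section(prompt: str, providers: Sequence[Provider], skeleton: str) -> Tuple[str, str]:
    """Return (provider_name, text) from the first provider that succeeds.

    If every provider raises or returns empty text, return the deterministic
    skeleton unchanged so the result can be replayed.
    """
    for name, call in providers:
        try:
            text = call(prompt)
        except Exception:
            # A failing provider is skipped; the next one in the chain is tried.
            continue
        if text and text.strip():
            return name, text
    return "deterministic", skeleton
```

A caller would pass the providers in chain order, for example a callable for openai-codex/gpt-5.5 followed by one for ollama/deepseek-r1:latest.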

3. Related Work

The following trusted-tier references inform this synthesis:

1. Structural metric for sequence GHQMQHHCDDSQPTDCWP · ranked_candidates · source_id:cycle-20260526T020837Z-02-001
2. Structural metric for sequence DCDQTNWPCGGQQHCDKA · ranked_candidates · source_id:cycle-20260526T020837Z-02-005
3. Structural metric for sequence WPIWQPHQTQCGSGGGC · ranked_candidates · source_id:cycle-20260526T020837Z-02-013
4. Cyclic Peptide Nanotubes in Deep Eutectic Solvents: Insights into Stability, Hydration, and Thermal Effects · crossref · source_id:doi:10.1021/acs.jpcb.5c02104.s001
5. NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay · crossref · source_id:doi:10.1021/acschembio.0c00804.s001

4. Mechanistic Framework

conf 0.08 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: unr:6

Evidence clusters converged on a constrained-peptide split, where cyclization, disulfide potential, and proline-rich runs strain linear-peptide validation assumptions. PPGP couples to structural_motif because adjacent prolines restrict backbone torsion and support compact conformer persistence. Cyclic Peptide Nanotubes in Deep Eutectic Solvents covers cyclization-sensitive stability, hydration, and thermal response, supporting a constraint-specific validation branch. NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay covers membrane penetration as a separable readout from structural constraint. The framework does not yet account for formulation-enabled exposure in Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet, constraining structure-only triage. Improved brain penetration of neurotensin(8-13) via blood-brain barrier shuttle conjugation constrains the split by reporting shuttle-driven barrier penetration.

5. Evidence Synthesis

  • [TRUST_T1] Structural metric for sequence GHQMQHHCDDSQPTDCWP — modifications=suggested: cyclization or N-methylation for top wet-lab picks; cysteine_count=2; proline_fraction=0.111. (source_id:cycle-20260526T020837Z-02-001)
  • [TRUST_T1] Structural metric for sequence DCDQTNWPCGGQQHCDKA — modifications=suggested: cyclization or N-methylation for top wet-lab picks; cysteine_count=3; proline_fraction=0.056. (source_id:cycle-20260526T020837Z-02-005)
  • [TRUST_T1] Structural metric for sequence WPIWQPHQTQCGSGGGC — modifications=suggested: cyclization or N-methylation for top wet-lab picks; cysteine_count=2; proline_fraction=0.118. (source_id:cycle-20260526T020837Z-02-013)
  • [TRUST_T2] Cyclic Peptide Nanotubes in Deep Eutectic Solvents: Insights into Stability, Hydration, and Thermal Effects (component record; no summary text beyond the title). (source_id:doi:10.1021/acs.jpcb.5c02104.s001)
  • [TRUST_T2] NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay (component record; no summary text beyond the title). (source_id:doi:10.1021/acschembio.0c00804.s001)
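
The cysteine_count and proline_fraction values in the TRUST_T1 records above are consistent with simple residue counts. The sketch below assumes proline_fraction is the number of proline residues divided by sequence length, rounded to three decimals; the function name `structural_metrics` is hypothetical and this is not necessarily how the runtime computes the fields.

```python
def structural_metrics(sequence: str) -> dict:
    """Count constraint-relevant residues in a one-letter peptide sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    return {
        "length": len(seq),
        # Two or more cysteines give disulfide potential, not a confirmed bond.
        "cysteine_count": seq.count("C"),
        # Assumed definition: proline residues / total residues.
        "proline_fraction": round(seq.count("P") / len(seq), 3),
    }


for candidate in ("GHQMQHHCDDSQPTDCWP", "DCDQTNWPCGGQQHCDKA", "WPIWQPHQTQCGSGGGC"):
    print(candidate, structural_metrics(candidate))
```

Under that assumed definition, the three listed sequences reproduce the recorded values (0.111, 0.056, and 0.118, with cysteine counts of 2, 3, and 2).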

6. Peptide Motif Analysis

Recurring 4-mer motifs in associated candidates: PPGP, PGPP, PPPG, GPPG, PPGW, PGWP, GWPP, PCPP, GPPP, CPPG.
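
One way recurring 4-mer motifs could be extracted is a sliding-window count across candidate sequences. This is a minimal sketch: the function name `recurring_kmers`, the `min_sequences` threshold, and the two toy sequences are assumptions for illustration and are not candidates from this thesis.

```python
from collections import Counter


def recurring_kmers(sequences, k=4, min_sequences=2):
    """Return k-mers that occur in at least `min_sequences` distinct sequences."""
    seen = Counter()
    for seq in sequences:
        s = seq.upper()
        # Use a set so each k-mer counts once per sequence; long internal
        # repeats in one candidate then cannot dominate the tally.
        seen.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return sorted(motif for motif, n in seen.items() if n >= min_sequences)


# Hypothetical toy input, not drawn from the candidate pool.
toy = ["APPGPA", "GPPGPW"]
print(recurring_kmers(toy))
```

Counting per sequence rather than per occurrence is a design choice here; a per-occurrence count would instead favor motifs that repeat inside a single proline-rich run.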

Candidate sequence visibility: full sequences are displayed directly for published candidate references; any unresolved legacy hash is labeled explicitly with its public provenance limitation.

7. Hypothesis

Statement. Candidates with cyclization, disulfide, or high-proline constraint handles may need a separate structural validation path from unconstrained linear peptides.

Type. structural. Engine confidence. 0.73. Aggregate uncertainty (this thesis). 0.33.
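
As a sketch of how the hypothesis could be turned into a triage rule, the code below routes a candidate to a structural lane when any constraint handle is present. The proline cutoff of 0.25, the `cyclized` flag, and the names `Candidate`, `constraint_handles`, and `validation_lane` are all assumptions; the thesis does not define what counts as high-proline.

```python
from dataclasses import dataclass

PROLINE_FRACTION_CUTOFF = 0.25  # assumed threshold, not specified in the thesis
MIN_CYS_FOR_DISULFIDE = 2       # a disulfide bond requires two cysteines


@dataclass
class Candidate:
    sequence: str
    cyclized: bool = False  # hypothetical annotation supplied upstream


def constraint_handles(candidate: Candidate) -> list:
    """List the constraint handles detected on a candidate."""
    seq = candidate.sequence.upper()
    handles = []
    if candidate.cyclized:
        handles.append("cyclization")
    if seq.count("C") >= MIN_CYS_FOR_DISULFIDE:
        handles.append("disulfide")
    if seq and seq.count("P") / len(seq) >= PROLINE_FRACTION_CUTOFF:
        handles.append("high_proline")
    return handles


def validation_lane(candidate: Candidate) -> str:
    """Route to the structural lane if any handle is present, else linear."""
    return "structural" if constraint_handles(candidate) else "linear"
```

With this cutoff, GHQMQHHCDDSQPTDCWP would be routed to the structural lane through its two cysteines rather than through its proline content.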

8. Discussion

conf 0.08 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: unr:10

Evidence clusters indicate that positive §10 structural validation would move cyclized, disulfide, and high-proline candidates into a distinct prioritization lane. Structural metric records and Cyclic Peptide Nanotubes support treating constraint handles as stability variables, not linear-peptide noise. Motif-family scoring would weight PPGP, PGPP, PPPG, GPPG, and CPPG runs by constraint compatibility before unconstrained potency ranking. NanoClick would then sequence permeability screening after conformation triage, while Quality by Design-Based Formulation constrains oral-prioritization gains.

Contradiction weighting narrows the separate-path claim if §10 conformer-collapse mapping shows no separation between constrained candidates and linear controls. Quality by Design-Based Formulation would falsify oral deprioritization if §10 formulation-stress profiling preserves ranking without a distinct structural lane. Improved brain penetration of neurotensin constrains receptor-screen sequencing if §10 carrier-conjugation permeability testing overrides proline-rich scoring. Gap Analysis constrains motif-family scoring if §10 matrix-metabolism profiling dominates PPGP, PGPP, PPPG, GPPG, and CPPG effects. In-vitro Metabolite Identification for MEDI7219 and Protease-Resistant Azapeptide GLP-1 Analogue constrain protease-cleavage weighting if §10 LC-MS/MS proteolysis mapping explains stability without constraint routing. Given evidence_strength 0.35 and uncertainty_score 0.33, this remains a proposal for triage architecture, not a general peptide-development rule.

9. Limitations

  • Synthesis class. This paper is an autonomous proposal, not a peer-reviewed result. The LLM-drafted sections (Introduction, Mechanistic Framework, Discussion, Conclusion) are constrained by the per-section confidence gates but are not yet adjudicated by human reviewers.
  • Evidence scope. Conclusions are constrained to Protean's runtime provenance graph at the time of this cycle; sources not yet ingested are by construction absent from the synthesis.
  • No wet-lab validation. Computational rankings are research prioritization, not biological proof. Acceptance of any specific claim requires the experiments outlined in §10.
  • Low evidence strength. Aggregate evidence strength is 0.35 (max 1.0). Individual sentence-level confidence is reported per section; the claim graph behind those numbers lives in provenance.json.
  • Unresolved contradictions. 5 contradicting reference(s) are acknowledged and have not been resolved within this cycle. Direct replication of those records is among the highest-value follow-ups.

10. Future Experiments

| Experiment | Hypothesis tested | Primary readout | Falsification criterion |
| --- | --- | --- | --- |
| Motif-resolved protease challenge | Candidates carrying PPGP, PGPP, PPPG, GPPG, PPGW, PGWP retain integrity longer than motif-stripped controls | LC-MS intact-peptide tracking over 0/30/120 min exposure to a standard protease cocktail | Motif-bearing and control candidates show indistinguishable degradation half-lives |
| Contradiction replication | The conflict identified in the contradicting reference(s) reproduces under Protean's standard assay conditions | Same primary readout as the original record; comparison statistic depends on the conflict class | Original contradictory result fails to reproduce; the synthesis claim survives unchallenged |
| Developability triage | Top candidates pass standard developability filters (solubility, aggregation, hERG, hepatotoxicity proxies) | Profile against the in-house developability filter panel | Candidates fail developability filters faster than Protean's baseline rate (>50%) |
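
The falsification criterion for the protease challenge compares degradation half-lives. A minimal sketch of one way to estimate a half-life from the 0/30/120 min time points follows; it assumes first-order decay, so that ln(fraction intact) is linear in time, and fits the slope by ordinary least squares. The function name and the example values are hypothetical, not measured data.

```python
import math


def half_life_minutes(times_min, fraction_intact):
    """Estimate a first-order half-life from intact-peptide fractions.

    Fits ln(fraction) = intercept + slope * t by least squares and returns
    ln(2) / -slope. Fractions must be strictly positive.
    """
    xs = list(times_min)
    ys = [math.log(f) for f in fraction_intact]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope >= 0:
        return math.inf  # no measurable degradation over the time course
    return math.log(2) / -slope


# Synthetic illustration: a peptide that halves every 30 minutes.
print(half_life_minutes([0, 30, 120], [1.0, 0.5, 0.0625]))
```

With only three time points the fit has little statistical power, so the comparison between motif-bearing and control candidates would depend on replicate measurements.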

11. Conclusion

conf 0.09 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: spec:1 | unr:4

Galen: we propose that candidates bearing cyclization, disulfide, or high-proline handles enter structural triage before comparison with linear peptides. PPGP, PGPP, PPPG, and related motifs define the constrained subset. The cheapest §10 discriminator is paired circular dichroism before and after reducing or linearizing the handle. Contradicting records constrain this to triage, because linear sequence behavior can still dominate activity. Runtime scope: structural validation-path recommendation at confidence 0.73.

12. References

Supporting (trusted tier):

1. Structural metric for sequence GHQMQHHCDDSQPTDCWP · [TRUST_T1] · source_id:cycle-20260526T020837Z-02-001
2. Structural metric for sequence DCDQTNWPCGGQQHCDKA · [TRUST_T1] · source_id:cycle-20260526T020837Z-02-005
3. Structural metric for sequence WPIWQPHQTQCGSGGGC · [TRUST_T1] · source_id:cycle-20260526T020837Z-02-013
4. Cyclic Peptide Nanotubes in Deep Eutectic Solvents: Insights into Stability, Hydration, and Thermal Effects · [TRUST_T2] · source_id:doi:10.1021/acs.jpcb.5c02104.s001
5. NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay · [TRUST_T2] · source_id:doi:10.1021/acschembio.0c00804.s001

Contradicting:

1. Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet · [TRUST_T2] · source_id:42076092
2. Improved brain penetration of neurotensin(8-13) via blood-brain barrier shuttle conjugation underlies strong analgesia · [TRUST_T2] · source_id:42176569
3. Gap Analysis of Metabolic Conversions of Off-Flavors and Antinutrients in Plant-Based Substrates · [TRUST_T2] · source_id:PMC13039779
4. In-vitro Metabolite Identification for MEDI7219, an Oral GLP-1 Peptide, using LC-MS/MS with CID and EAD Fragmentation · [TRUST_T2] · source_id:bio_430605347e7d
5. Protease-Resistant Azapeptide GLP-1 Analogue Improves Metabolic Control in Diet-Induced Obesity · [TRUST_T2] · source_id:bio_4e476b486cc3

13. Computational Investigation

Runtime capability investigation. Before this synthesis was drafted, Protean queried Galen's bounded capability surface to enrich the seed with structural and prior-art context. The full investigation ledger is preserved in the private snapshot (investigation.json); this section reports the public-safe rollup.

  • Wall-clock duration: 21 ms
  • Capability calls: db.uniprot:motif_search: 3, pdb: 2
  • Call statuses: ok: 2, skipped: 3

Motifs investigated against UniProt:

  • PPGP: no family-level hits
  • PGPP: no family-level hits
  • PPPG: no family-level hits

PDB cross-references (0 resolved):

  • No PDB IDs mentioned in supporting evidence.

Candidate-sequence QC distribution. No candidate sequences were resolvable for this seed.

Structural analog search. 0 Foldseek ticket(s) were submitted against AFDB50 + PDB100; results poll asynchronously and are appended in subsequent cycles.

Prior-failure motif overlap. The following seed motifs also appear in prior rejected/low-scoring candidates and warrant caution in §9 prioritization: CPPG, GWPP, PCPP, PGWP.
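
The overlap check described above amounts to a set intersection between the seed motifs from §6 and motifs seen in prior rejected or low-scoring candidates. In this sketch the seed list is copied from §6, while the function name `prior_failure_overlap` and the prior-failure input are hypothetical; the actual prior-failure store is not part of the public rollup.

```python
# Seed motifs as listed in §6 Peptide Motif Analysis.
SEED_MOTIFS = ["PPGP", "PGPP", "PPPG", "GPPG", "PPGW",
               "PGWP", "GWPP", "PCPP", "GPPP", "CPPG"]


def prior_failure_overlap(seed_motifs, prior_failure_motifs):
    """Return seed motifs that also occur among prior-failure motifs, sorted."""
    prior = set(prior_failure_motifs)
    return sorted(m for m in seed_motifs if m in prior)


# Hypothetical prior-failure set, for illustration only.
example_prior = {"CPPG", "GWPP", "PCPP", "PGWP", "AAAA"}
print(prior_failure_overlap(SEED_MOTIFS, example_prior))
```

Motifs flagged this way are not excluded automatically; the sketch only produces the caution list.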

14. Provenance Appendix

Full provenance — evidence lineage, novelty trace, reviewer findings, per-section LLM call log, per-claim classifications — is persisted to provenance.json alongside this thesis.

  • seed_id: seed_932ba582fbc9e154
  • hypothesis_id: hypothesis:structural:5a8ccb563ea2
  • publication_tier: research_note
  • cluster_id: structural_motif
  • thesis_layer: protean.autonomous_thesis.v1

To audit: read provenance.json in the same directory.

Confidence breakdown

  • evidence: 0.35
  • certainty: 0.68
  • novelty: 0.81

Derived from evidence / certainty / novelty signals.

Contradictions

5 contradicting evidence records were surfaced during review. The notes are summarized in the thesis body above; contradictions are retained as scientific signal, not discarded.

Citation

How to cite.

@misc{protean_thesis_thesis_f114bfe88a0980bb,
  title  = {Separating constrained from linear peptide candidates by conformational constraint handles},
  author = {Protean Labs — Mechanistic Thesis Layer},
  year   = {2026},
  url    = {https://www.protean.sh/papers/thesis_f114bfe88a0980bb},
  note   = {Mechanistic hypothesis proposal — not peer-reviewed.
            Computational rankings are research prioritization, not biological proof.}
}

Computational rankings are research prioritization, not biological proof. Wet-lab review remains authoritative.