Autonomous thesis · thesis_8beb2fe7ff1f66e5 · published 2026-05-26 17:55 UTC · openai-codex/gpt-5.5

Constrained versus linear peptide candidates by cyclization disulfide and proline handles

Research Note · autonomous synthesis · 2026-05-26T17:51:01+00:00

Confidence: research_note (autonomous) · evidence 5↑ / 5↓ (2 trusted-tier) · strength 0.35 · uncertainty 0.33

Provenance: prose machine-synthesized by openai-codex/gpt-5.5; deterministic skeleton from seed seed_fcde1016bac37565.

Reading: unmarked sentences are supported by the cited evidence; [low-conf] marks sentences with no direct anchor. Per-section confidence appears beneath each prose heading; structured per-claim classifications live in metadata.json (section_confidence).

Abstract

Galen flagged a prioritization gap where constrained peptide candidates and unconstrained linear peptides share ranking lanes despite different validation risks. The proposed discriminator is a structural constraint handle separating cyclization, disulfide, and high-proline candidates from linear sequences before downstream scoring. Supporting shape comes from Structural metric for [redacted-seq:18aa:4158ed12] and Cyclic Peptide Nanotubes, weighting proline-rich runs and constrained stability. NanoClick adds permeability context without resolving whether cell-penetrating carriers behave like intrinsic constraint handles. Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet constrains structure-only triage because formulation can alter exposure. The runtime holds confidence at 0.73 with evidence strength 0.35, and assigns the §8 panel to adjudicate routing criteria.

1. Introduction

conf 0.08 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: unr:6

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

Galen flagged a prioritization gap where constrained candidates enter the same review lane as unconstrained linear peptides. A sequence-divergence discriminator would route structurally dissimilar candidates for validation, while motif-recombined stability analogs would remain in standard ranking. The mechanistic locus sits in proline-rich runs, including PPGP and PGPP, within collagen-like or protease-resistant scaffolds; no receptor family was assigned. The runtime associated this split with Structural metric for [redacted-seq:18aa:4158ed12] and Cyclic Peptide Nanotubes in Deep Eutectic Solvents. Contradiction weighting constrained scope, because the seed exposed five contradicting records without title fields for direct resolution. Under 0.73 runtime confidence, Galen treats separate structural validation as a routing proposal for cyclized, disulfide-linked, or high-proline handles.

2. Methods

This synthesis was produced by Protean's autonomous thesis layer on top of the local provenance graph. The procedure for this cycle was:

1. Evidence selection. 5 supporting and 5 contradicting records were drawn from the trusted-tier evidence pool. Of the 5 supporting records, 2 carry tier TRUST_T2 or higher (peer-reviewed literature or replicated runtime measurements); the remaining 3 are TRUST_T1 (runtime-internal observations). All 5 contradicting records are listed as TRUST_T2 in §12.

2. Seed construction. A hypothesis seed (seed_fcde1016bac37565) was assembled by clustering the selected evidence on mechanistic + receptor + motif tags (cluster structural_motif), then proposing a discriminator hypothesis that the cited evidence could constrain or falsify.

3. Prose generation. Section bodies (Introduction, Mechanistic Framework, Discussion, Conclusion) were drafted by an LLM provider chain (openai-codex/gpt-5.5, then ollama/deepseek-r1:latest). The chain falls back deterministically when every provider fails; the deterministic skeleton is preserved verbatim in provenance.json for replay. All other sections (Methods, Related Work, Evidence Synthesis, Peptide Motif Analysis, Hypothesis, Limitations, Future Experiments, References, Provenance Appendix) are deterministic.

4. Claim classification. Every sentence in the LLM-drafted prose was passed through Protean's epistemic classifier (pipelines/autonomous_thesis/epistemics.py), which labels sentences as OBSERVED, INFERRED, WEAKLY_SUPPORTED, SPECULATIVE, UNRESOLVED, or CONTRADICTORY based on language markers and reference anchors. The per-section confidence header reports the resulting class mix.

5. Gates before publication. The full draft was scored by an internal reviewer committee + novelty engine. Both gates returned publish for this synthesis; the verdicts are persisted in provenance.json. The published markdown is additionally scrubbed by pipelines/public_thesis_export._scrub_markdown to remove any residual absolute paths, file URIs, private paths, epistemic-label markers, and HTML script tags.

Publication tier for this cycle: research_note. Tier reflects evidence strength + reviewer verdict + novelty score; it does NOT reflect peer review.

3. Related Work

The following trusted-tier references inform this synthesis:

1. Structural metric for [redacted-seq:18aa:4158ed12] · ranked_candidates · source_id:cycle-20260526T020837Z-02-001
2. Structural metric for [redacted-seq:18aa:f38d64c0] · ranked_candidates · source_id:cycle-20260526T020837Z-02-005
3. Structural metric for [redacted-seq:17aa:f1f03e5e] · ranked_candidates · source_id:cycle-20260526T020837Z-02-013
4. Cyclic Peptide Nanotubes in Deep Eutectic Solvents: Insights into Stability, Hydration, and Thermal Effects · crossref · source_id:doi:10.1021/acs.jpcb.5c02104.s001
5. NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay · crossref · source_id:doi:10.1021/acschembio.0c00804.s001

4. Mechanistic Framework

conf 0.08 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: unr:6

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

Motif extraction surfaced proline-rich runs as structural_motif handles that can impose backbone rigidity before downstream permeability or proteolysis filters. PPGP couples to structural_motif because adjacent prolines restrict phi-psi sampling and can bias local turns within linear candidates. Cyclic Peptide Nanotubes in Deep Eutectic Solvents covers cyclic constraint behavior through stability, hydration, and thermal-effect measurements. The structural metric records for redacted 18aa and 17aa candidates supply candidate-level constraint signals, but do not resolve atomic conformers. NanoClick covers target-agnostic peptide cell permeability, so permeability readouts can separate carrier behavior from intrinsic proline constraint. The framework does not yet account for formulation-driven uptake, constrained by Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet.

5. Evidence Synthesis

  • [TRUST_T1] Structural metric for [redacted-seq:18aa:4158ed12] — modifications=suggested: cyclization or N-methylation for top wet-lab picks; cysteine_count=2; proline_fraction=0.111. (source_id:cycle-20260526T020837Z-02-001)
  • [TRUST_T1] Structural metric for [redacted-seq:18aa:f38d64c0] — modifications=suggested: cyclization or N-methylation for top wet-lab picks; cysteine_count=3; proline_fraction=0.056. (source_id:cycle-20260526T020837Z-02-005)
  • [TRUST_T1] Structural metric for [redacted-seq:17aa:f1f03e5e] — modifications=suggested: cyclization or N-methylation for top wet-lab picks; cysteine_count=2; proline_fraction=0.118. (source_id:cycle-20260526T020837Z-02-013)
  • [TRUST_T2] Cyclic Peptide Nanotubes in Deep Eutectic Solvents: Insights into Stability, Hydration, and Thermal Effects (source_id:doi:10.1021/acs.jpcb.5c02104.s001)
  • [TRUST_T2] NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay (source_id:doi:10.1021/acschembio.0c00804.s001)
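The cysteine_count and proline_fraction fields in the TRUST_T1 records can be illustrated with a minimal sketch. It assumes proline_fraction is the number of prolines divided by sequence length, which is consistent with the reported values (2/18 ≈ 0.111, 1/18 ≈ 0.056, 2/17 ≈ 0.118) but is not confirmed by the source. The `StructuralMetric` class, the `structural_metric` function, and the example sequence are hypothetical and are not Protean's implementation or one of the redacted candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralMetric:
    """Hypothetical container mirroring the fields shown in the evidence records."""
    length: int
    cysteine_count: int
    proline_fraction: float


def structural_metric(sequence: str) -> StructuralMetric:
    """Count cysteines and compute the proline fraction of a one-letter sequence.

    Assumption: proline_fraction = (number of P residues) / (sequence length),
    rounded to three decimals as in the published records.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return StructuralMetric(
        length=len(seq),
        cysteine_count=seq.count("C"),
        proline_fraction=round(seq.count("P") / len(seq), 3),
    )


# Hypothetical 18-residue sequence with two cysteines and two prolines.
example = structural_metric("ACDEFGHIKLCMNPQRSP")
print(example)
```

Under that assumption, a fraction near 0.11 corresponds to only two prolines in an 18-residue peptide, so the listed candidates are better described by their cysteine counts than by proline content.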

6. Peptide Motif Analysis

Recurring 4-mer motifs in associated candidates: PPGP, PGPP, PPPG, GPPG, PPGW, PGWP, GWPP, PCPP, GPPP, CPPG.
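One way such a recurring 4-mer list could be derived is with a sliding window over each candidate, keeping the 4-mers shared by more than one sequence. The sketch below is an illustration under that assumption; the function name `recurring_kmers`, the `min_sequences` cutoff, and the toy sequences are hypothetical, and the source does not describe the actual extraction procedure.

```python
from collections import Counter
from typing import Iterable


def recurring_kmers(
    sequences: Iterable[str], k: int = 4, min_sequences: int = 2
) -> dict[str, int]:
    """Return k-mers that occur in at least `min_sequences` distinct sequences."""
    seen_in: Counter = Counter()
    for seq in sequences:
        seq = seq.strip().upper()
        # Use a set so each sequence contributes a given k-mer at most once.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        seen_in.update(kmers)
    return {kmer: n for kmer, n in seen_in.items() if n >= min_sequences}


# Toy sequences for illustration only; not Protean candidates.
toy = ["AGPPGPPK", "KPPGPPGA", "AAAAGGGG"]
hits = recurring_kmers(toy)
print(sorted(hits))
```

Because the window advances one residue at a time, a single proline-rich stretch yields several overlapping 4-mers, which may explain why the motifs listed above look like shifted copies of one another.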

0 candidate sequences are referenced by opaque ID — raw sequences remain in the private workspace by design (publication boundary). Operators can resolve the IDs locally via papers/candidates/.

7. Hypothesis

Statement. Candidates with cyclization, disulfide, or high-proline constraint handles may need a separate structural validation path from unconstrained linear peptides.

Type. structural. Engine confidence. 0.73. Aggregate uncertainty (this thesis). 0.33.
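The routing idea in the statement can be sketched as a small rule. This is a sketch under stated assumptions: the lane names, the `Candidate` fields, the `cyclized` flag, and especially the `HIGH_PROLINE_FRACTION` threshold are hypothetical, since the source does not define what counts as "high-proline". Two or more cysteines are treated only as disulfide-capable, not as a confirmed disulfide bond.

```python
from dataclasses import dataclass

# Hypothetical threshold; the source does not define "high-proline".
HIGH_PROLINE_FRACTION = 0.25
# A single disulfide bond requires at least two cysteines.
MIN_CYS_FOR_DISULFIDE = 2


@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    cysteine_count: int
    proline_fraction: float
    cyclized: bool = False  # assumed to be annotated upstream


def constraint_handles(c: Candidate) -> list[str]:
    """List the constraint handles a candidate carries under the assumed rules."""
    handles = []
    if c.cyclized:
        handles.append("cyclization")
    if c.cysteine_count >= MIN_CYS_FOR_DISULFIDE:
        handles.append("disulfide_capable")
    if c.proline_fraction >= HIGH_PROLINE_FRACTION:
        handles.append("high_proline")
    return handles


def route(c: Candidate) -> str:
    """Send candidates with any handle to a separate structural-validation lane."""
    return "structural_validation" if constraint_handles(c) else "standard_ranking"


constrained = Candidate("cand-a", cysteine_count=2, proline_fraction=0.111)
linear = Candidate("cand-b", cysteine_count=0, proline_fraction=0.05)
print(route(constrained), route(linear))
```

With these assumed rules, a candidate with cysteine_count=2 and proline_fraction=0.111 is routed on its cysteines alone, which illustrates how strongly the outcome depends on where the proline threshold is set.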

8. Discussion

conf 0.08 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: unr:11

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

The runtime would split constrained candidates into a separate structural-validation lane if the §8 panel supports the constraint-handle signal. Structural metric for [redacted-seq:18aa:4158ed12] and Cyclic Peptide Nanotubes anchor that lane around conformational persistence, hydration, and thermal response. Candidate prioritization would down-rank unconstrained linear analogs when proline-rich runs cluster with PPGP, PGPP, PPPG, or GPPG. Motif-family scoring would give PPGW, PGWP, GWPP, PCPP, GPPP, and CPPG a constraint-sensitive feature channel rather than a linear score. Receptor-screen sequencing would wait until §8 structural validation separates carrier-like permeability effects from motif-driven rigidity, using NanoClick as the permeability comparator.

Contradiction weighting narrows each consequence against formulation and transport alternatives. Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet constrains prioritization shifts by assigning exposure changes to tablet design; §8 formulation stress adjudicates. Improved brain penetration of neurotensin(8-13) constrains receptor-screen sequencing by assigning distribution to blood-brain barrier shuttle conjugation; §8 carrier-control permeability adjudicates. Gap Analysis of Metabolic Conversions constrains motif-family scoring by introducing substrate conversion artifacts; §8 metabolic-stability profiling adjudicates. In-vitro Metabolite Identification for MEDI7219 and Protease-Resistant Azapeptide GLP-1 Analogue constrain the structural lane through metabolite and protease-resistance alternatives; §8 protease challenge adjudicates. With evidence_strength 0.35 and uncertainty_score 0.33, the runtime limits scope to triage design, not mechanism acceptance.

9. Limitations

  • Synthesis class. This paper is an autonomous proposal, not a peer-reviewed result. The LLM-drafted sections (Introduction, Mechanistic Framework, Discussion, Conclusion) are constrained by the per-section confidence gates but are not yet adjudicated by human reviewers.
  • Evidence scope. Conclusions are constrained to Protean's runtime provenance graph at the time of this cycle; sources not yet ingested are by construction absent from the synthesis.
  • No wet-lab validation. Computational rankings are research prioritization, not biological proof. Acceptance of any specific claim requires the experiments outlined in §10.
  • Low evidence strength. Aggregate evidence strength is 0.35 (max 1.0). Individual sentence-level confidence is reported per section; the claim graph behind those numbers lives in provenance.json.
  • Unresolved contradictions. 5 contradicting reference(s) are acknowledged and have not been resolved within this cycle. Direct replication of those records is among the highest-value follow-ups.

10. Future Experiments

Experiment | Hypothesis tested | Primary readout | Falsification criterion
Motif-resolved protease challenge | Candidates carrying PPGP, PGPP, PPPG, GPPG, PPGW, PGWP retain integrity longer than motif-stripped controls | LC-MS intact-peptide tracking over 0/30/120 min exposure to a standard protease cocktail | Motif-bearing and control candidates show indistinguishable degradation half-lives
Contradiction replication | The conflict identified in the contradicting reference(s) reproduces under Protean's standard assay conditions | Same primary readout as the original record; comparison statistic depends on the conflict class | Original contradictory result fails to reproduce; the synthesis claim survives unchallenged
Developability triage | Top candidates pass standard developability filters (solubility, aggregation, hERG, hepatotoxicity proxies) | Profile against the in-house developability filter panel | Candidates fail developability filters faster than Protean's baseline rate (>50%)
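The falsification criterion for the protease challenge compares degradation half-lives. A minimal sketch of how a half-life could be estimated from intact-peptide fractions at 0/30/120 min follows, assuming first-order decay and a least-squares fit of ln(fraction) against time. The function name and the numeric readings are hypothetical illustrations, not measured data or Protean's analysis method.

```python
import math
from typing import Sequence


def half_life_minutes(times_min: Sequence[float], intact_fraction: Sequence[float]) -> float:
    """Estimate a first-order half-life from intact-peptide fractions.

    Fits ln(fraction) = a - k * t by ordinary least squares and returns ln(2) / k.
    Returns math.inf when no decay is detected (fitted slope >= 0).
    """
    if len(times_min) != len(intact_fraction) or len(times_min) < 2:
        raise ValueError("need at least two paired time points")
    if any(f <= 0 for f in intact_fraction):
        raise ValueError("fractions must be positive to take logarithms")
    ys = [math.log(f) for f in intact_fraction]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    slope = sxy / sxx
    if slope >= 0:
        return math.inf
    return math.log(2) / -slope


# Hypothetical readings: fraction of intact peptide at 0, 30, and 120 minutes.
times = [0, 30, 120]
motif_bearing = half_life_minutes(times, [1.0, 0.5, 0.0625])
motif_stripped = half_life_minutes(times, [1.0, 0.25, 0.00390625])
print(motif_bearing, motif_stripped)
```

With only three time points the fit has a single residual degree of freedom, so replicate runs would be needed before two half-lives could be called distinguishable or indistinguishable.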

11. Conclusion

conf 0.08 · evidence 5 sup / 5 con · trusted-tier 2 · class mix: unr:6

Note: majority of sentences in this section lack direct evidence anchors — see Limitations.

Galen ranked constraint handles as a separate validation trigger. Candidates bearing cyclization, disulfide, or proline-rich runs should route through constraint-aware structural checks before linear-peptide ranking. Motif extraction surfaced PPGP, PGPP, PPPG, and related proline-rich runs as the main handle class. The cheapest §8 discriminator is temperature-ramped circular dichroism with reducing control. Contradicting records constrain over-routing because some handles do not shift assay-relevant behavior, while some linear peptides retain local order. Runtime scope: structural triage proposal, confidence 0.73.

12. References

Supporting (trusted tier):

1. Structural metric for [redacted-seq:18aa:4158ed12] · [TRUST_T1] · source_id:cycle-20260526T020837Z-02-001
2. Structural metric for [redacted-seq:18aa:f38d64c0] · [TRUST_T1] · source_id:cycle-20260526T020837Z-02-005
3. Structural metric for [redacted-seq:17aa:f1f03e5e] · [TRUST_T1] · source_id:cycle-20260526T020837Z-02-013
4. Cyclic Peptide Nanotubes in Deep Eutectic Solvents: Insights into Stability, Hydration, and Thermal Effects · [TRUST_T2] · source_id:doi:10.1021/acs.jpcb.5c02104.s001
5. NanoClick: A High Throughput, Target-Agnostic Peptide Cell Permeability Assay · [TRUST_T2] · source_id:doi:10.1021/acschembio.0c00804.s001

Contradicting:

1. Quality by Design-Based Formulation Development of an Oral Semaglutide Tablet · [TRUST_T2] · source_id:42076092
2. Improved brain penetration of neurotensin(8-13) via blood-brain barrier shuttle conjugation underlies strong analgesia · [TRUST_T2] · source_id:42176569
3. Gap Analysis of Metabolic Conversions of Off-Flavors and Antinutrients in Plant-Based Substrates · [TRUST_T2] · source_id:PMC13039779
4. In-vitro Metabolite Identification for MEDI7219, an Oral GLP-1 Peptide, using LC-MS/MS with CID and EAD Fragmentation · [TRUST_T2] · source_id:bio_430605347e7d
5. Protease-Resistant Azapeptide GLP-1 Analogue Improves Metabolic Control in Diet-Induced Obesity · [TRUST_T2] · source_id:bio_4e476b486cc3

13. Runtime Investigation

Runtime capability investigation. Before this synthesis was drafted, Protean queried Galen's bounded capability surface to enrich the seed with structural and prior-art context. The full investigation ledger is preserved in the private snapshot (investigation.json); this section reports the public-safe rollup.

  • Wall-clock duration: 20 ms
  • Capability calls: db.uniprot:motif_search: 3, pdb: 2
  • Call statuses: ok: 2, skipped: 3

Motifs investigated against UniProt:

  • PPGP: no family-level hits
  • PGPP: no family-level hits
  • PPPG: no family-level hits

PDB cross-references (0 resolved):

  • No PDB IDs mentioned in supporting evidence.

Candidate-sequence QC distribution. No candidate sequences were resolvable for this seed.

Structural analog search. 0 Foldseek ticket(s) were submitted against AFDB50 + PDB100; results poll asynchronously and are appended in subsequent cycles.

Prior-failure motif overlap. The following seed motifs also appear in prior rejected/low-scoring candidates and warrant caution in §9 prioritization: CPPG, GWPP, PCPP, PGWP.

14. Runtime Metadata

Operational context for this thesis cycle. Sourced from the synthesis seed and the prose-model log; not part of the scientific claim graph.

Publication tier: research_note
Prose model: openai-codex/gpt-5.5 · 6/6 sections via primary model

Prose model call log:

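Section | Winner | Latency (ms) | Validation codes
title | openai-codex/gpt-5.5 | 20772 | (none listed)
abstract | openai-codex/gpt-5.5 | 39380 | (none listed)
introduction | openai-codex/gpt-5.5 | 36634 | (none listed)
mechanistic_framework | openai-codex/gpt-5.5 | 39349 | (none listed)
discussion | openai-codex/gpt-5.5 | 44376 | (none listed)
conclusion | openai-codex/gpt-5.5 | 30460 | (none listed)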

Per-section confidence:

Section | Confidence | Low-conf sentences
conclusion | 0.08 | 6
discussion | 0.08 | 11
introduction | 0.08 | 6
mechanistic_framework | 0.08 | 6

Contradictions: 5 acknowledged.

15. Provenance Appendix

Full provenance (evidence lineage, novelty trace, reviewer findings) is persisted to provenance.json alongside this thesis.

  • seed_id: seed_fcde1016bac37565
  • hypothesis_id: hypothesis:structural:5a8ccb563ea2
  • publication_tier: research_note
  • cluster_id: structural_motif
  • thesis_layer: protean.autonomous_thesis.v1

To audit: read provenance.json in the same directory.

Citation

How to cite.

@misc{protean_thesis_thesis_8beb2fe7ff1f66e5,
  title  = {Constrained versus linear peptide candidates by cyclization disulfide and proline handles},
  author = {Protean Labs — Autonomous Thesis Layer},
  year   = {2026},
  url    = {https://www.protean.sh/papers/thesis_8beb2fe7ff1f66e5},
  note   = {Autonomous hypothesis proposal — not peer-reviewed.
            Computational rankings are research prioritization, not biological proof.}
}

Computational rankings are research prioritization, not biological proof. Wet-lab review remains authoritative.