Run record · computational telemetry only

Phase 101 - RunPod SeqQA GPU Eval

completed · eval · HOT_EVAL

phase-101-phase-101-runpod-seqqa-gpu-eval-032a94a

Claim boundary

frontier_0_1_certified = false
scorecard_metric_eligible = false

This is computational observability telemetry only. Nothing here is a Frontier 0.1 certification, a scorecard metric, or biological proof. Computational results are never biological proof — wet-lab and provider validation is a separate, gated process.

Run details

Run id: phase-101-phase-101-runpod-seqqa-gpu-eval-032a94a
Phase: 101
Run mode: HOT_EVAL (live eval against a held set; a harness statistic, not a scorecard)
Backend (model): unavailable
Device / location: unavailable
Commit: 032a94a5f557ca96ebcfd9e4efc1d96cd1a01687
Tag: unavailable
Started: unavailable
Ended: unavailable
Duration (s): unavailable
Records attempted: 25
Records completed: 25
Records scored: 25

What this proves

The fraction of scored eval records that matched exactly under the harness for this run.

What this does NOT prove

Does NOT certify the model, is NOT scorecard-eligible, and is NOT biological proof. Unscored records are excluded, never counted as zeros.
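
The exclusion rule above can be sketched in a few lines. This is a minimal illustration, not the harness's actual code: the function name `exact_match_fraction` and the representation of each record as `True`, `False`, or `None` (unscored) are assumptions made for the example.

```python
from typing import Optional, Sequence


def exact_match_fraction(outcomes: Sequence[Optional[bool]]) -> Optional[float]:
    """Fraction of scored records that matched exactly.

    Each entry is True (matched), False (did not match), or None (unscored).
    Unscored records are dropped from the denominator, never counted as zeros.
    Returns None when nothing was scored, so the caller can report
    "unavailable" instead of a misleading 0.
    """
    # Hypothetical record representation: None marks an unscored record.
    scored = [o for o in outcomes if o is not None]
    if not scored:
        return None
    return sum(scored) / len(scored)
```

Under these assumptions, 25 scored records with no exact matches give `0.0`, which is consistent with the 0.0000 reported for this run, while a run with no scored records would return `None` rather than a zero.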

Metrics

seqqa_exact_match: 0.0000 (source: metric-artifact)
aim_run_metric: unavailable (source: aim)
records_completed: 25.0000 (source: summary)
records_scored: 25.0000 (source: summary)

Blocked-vs-completed counts are shown as evidence. A missing value reads "unavailable", never 0.
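
One way the "unavailable, never 0" display rule might be implemented is shown below. It is a sketch only: `format_metric` is a hypothetical helper, and the four-decimal format is inferred from the values shown in this record.

```python
from typing import Optional


def format_metric(value: Optional[float]) -> str:
    """Render a metric for display.

    A missing value (None) reads "unavailable"; it is never coerced to 0,
    so a true zero and an absent measurement stay distinguishable.
    """
    if value is None:
        return "unavailable"
    # Four decimal places, matching the style of the values in this record.
    return f"{value:.4f}"
```

Keeping the missing case as `None` until the last formatting step means a real score of 0.0000 cannot be confused with a metric that was never recorded.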

Artifacts & checksums

report [external_artifact_store]
sha256: ef210e51026d8871d22befb25b9e1dab486fa7db5785ca912983b4456c0f50ab
size: 5,070 bytes

checksum_card [external_artifact_store]
sha256: 588712f79c7411d51b24e2c2615a3615695d1ecb3167e7733507b7679d6da821
size: unavailable

Storage location and topology are never shown — only the sanitized external_artifact_store label and the content checksum.
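
Anyone holding a copy of an artifact can check it against the published checksum without knowing where it is stored. The sketch below uses only Python's standard `hashlib`; the helper names `sha256_hex` and `verify_artifact` are hypothetical, and the size check is skipped when the record lists the size as unavailable.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(
    data: bytes,
    expected_sha256: str,
    expected_size: Optional[int] = None,
) -> bool:
    """Check artifact bytes against a published checksum and optional size.

    expected_size=None means the size was "unavailable" and is not checked.
    """
    if expected_size is not None and len(data) != expected_size:
        return False
    return sha256_hex(data) == expected_sha256.strip().lower()
```

For the report artifact above, a caller would pass the file's bytes, the listed sha256 value, and `expected_size=5070`; for the checksum card, the size argument would be left as `None`.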