Helena

Clinical Evidence

Validated against real-world clinical practice

Helena's classification methodology is evaluated through pre-registered validation studies and ongoing real-world performance characterization. All evidence is anchored against an established clinical reference laboratory and the broader inter-laboratory benchmark literature. Methodology, study designs, and findings are documented openly. Detailed cohort reports are available to clinical partners, peer reviewers, investors, and regulatory consultants under appropriate confidentiality terms.

40 - Cases evaluated across two cohorts
24 - Clinical domains, combined coverage
0 - Discordant outcomes across all evaluable cases
100% - Cohort 1 acceptance (pre-registered threshold met)

Overview

Helena conducts two complementary types of clinical evidence work: pre-registered validation studies that evaluate platform performance against defined acceptance criteria, and real-world performance studies that characterize platform behavior on real clinical cases under a multi-source ground truth methodology.

Inter-laboratory variant classification studies under the ACMG/AMP 2015 framework consistently document 25 to 35 percent legitimate methodological disagreement between qualified laboratories (Amendola et al. 2016; Harrison et al. 2017). Real-world platform performance characterization for clinical bioinformatics software therefore requires multi-source ground truth construction rather than single-source comparison.

Helena maintains the Real-World Performance Framework (HEL-RWP-FRAMEWORK-v1.0, April 2026) as the master methodology document for ongoing studies. The framework establishes a three-layer performance model, multi-source ground truth construction, methodological characterization of all divergences, transparent conflict of interest disclosure, and an explicitly phased pathway toward future regulatory submission. It supersedes prior internal validation framework documents and reflects a deliberate reframing: from a regulatory-validation paradigm calibrated for organizations with mature quality infrastructure to a real-world performance-characterization paradigm appropriate to the current organizational stage.

Validation Study

Cohort 1: Pathogenic Variant Concordance Study

Pre-registered validation study with binary acceptance criteria. Twenty clinical cases with known pathogenic or likely pathogenic variants were evaluated against an established clinical reference laboratory.

Study specification

Document: HEL-VAL-C1-2026-001 v1.0
Study type: Pre-registered validation study
Period: February - March 2026
Classifier version: v3.6.4 (final)
Reference comparator: Established clinical reference laboratory
Sample size: 20 cases
Clinical domains: 11 (rare disease, oncology, cardiology, neurology, metabolic, ophthalmology, nephrology, neuromuscular, hearing, syndromic, reproductive)
Variant types: 6 (missense, nonsense, frameshift, splice, compound heterozygous, multi-gene carrier)
Zygosity coverage: 3 (heterozygous, homozygous, compound heterozygous)
Acceptance threshold: at least 18/20 (90%) for all primary criteria
Standards applied: ISO 15189:2022, EU IVDR (2017/746), AMP/CAP 2018, ACMG/AMP 2015

Results against pre-registered acceptance criteria

Metric | Result
P1 - Variant detection | 20/20 (100%)
P2 - Gene assignment | 20/20 (100%)
P3 - ACMG concordance (FULL) | 14/20 (70%)
P3 - ACMG concordance (CLINICAL, one-tier P/LP) | 6/20 (30%)
P3 - PARTIAL or DISCORDANT | 0/20 (0%)
P3 - Overall (FULL + CLINICAL) | 20/20 (100%)
S1 - HGVS notation match | 20/20 (100%)
S2 - Consequence match | 20/20 (100%)
S3 - Zygosity match | 20/20 (100%)
All primary criteria met | 20/20 (100%)

Validation decision: GO

All 20 cases (100%) met all three primary concordance criteria (variant detection, gene assignment, ACMG concordance), exceeding the pre-defined GO threshold of 90% (>= 18/20). No PARTIAL or DISCORDANT results were observed. The six CLINICAL (one-tier P/LP) differences are within the expected range of inter-laboratory ACMG classification variability as documented in the peer-reviewed literature (Amendola 2016, Harrison 2017). The six CLINICAL differences fall in two patterns: cases where Helena applied more conservative ClinGen SVI 2023 thresholds (SpliceAI < 0.2 for PP3_splice), and cases where Helena applied ClinVar 2-star+ override or VCEP gene-specific rules to elevate Likely Pathogenic to Pathogenic.
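The splice-threshold pattern above reduces to a simple cutoff check. A minimal sketch, assuming a generic SpliceAI max-delta input; the function and constant names are ours for illustration, not Helena's API:

```python
# PP3_splice per the ClinGen SVI 2023 splicing recommendations (Walker 2023):
# a SpliceAI max delta score below 0.2 does not reach supporting evidence.
PP3_SPLICE_MIN = 0.2

def pp3_splice_fires(spliceai_max_delta: float) -> bool:
    """Apply PP3 (splice) only at or above the calibrated threshold."""
    return spliceai_max_delta >= PP3_SPLICE_MIN

print(pp3_splice_fires(0.15))  # → False: PP3_splice withheld
print(pp3_splice_fires(0.35))  # → True
```

A laboratory still applying pre-calibration heuristics may fire PP3 in the first case, producing exactly the one-tier difference described above.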

The validation process additionally identified and resolved four classifier defects through iterative improvement (v3.6.0 to v3.6.4). The most significant was the systematic autosomal recessive loss-of-function gene curation expansion (approximately 150 genes with Definitive or Strong ClinGen Gene-Disease Validity for biallelic LoF disease mechanism), which prevents PVS1 misclassification of loss-of-function variants in autosomal recessive disease genes whose population constraint metrics are statistically underpowered.

Real-World Performance Study

Cohort 2: Multi-Source Performance Characterization

Real-World Performance Study under HEL-RWP-FRAMEWORK-v1.0. Twenty cases were evaluated under a three-layer performance model with multi-source ground truth construction and methodological characterization of all divergences. This is not a binary pass/fail study; it provides foundational scientific evidence at the current organizational stage.

Study specification

Document: HEL-RWP-C2-2026-001 v1.0
Study type: Real-World Performance Study under HEL-RWP-FRAMEWORK-v1.0
Period: March - April 2026
Classifier version: v3.28.0 (final at cohort closure)
Reference sources: Multi-source ground truth (VCEP curations where available, ClinVar 2-star+ aggregated submissions, peer reference comparator)
Sample size: 20 cases
Clinical domains: 13 (neurology, autoinflammatory, nephrology X-linked, RASopathy, neuro-metabolic, endocrine, neuromuscular, lysosomal storage, skeletal dysplasia, connective tissue, respiratory, craniofacial, ciliopathy, Y-linked reproductive, mitochondrial)
Variant types: 7 (missense, nonsense, frameshift, splice region, polypyrimidine tract, complex indel, large recurrent indel)
Inheritance patterns: 8 (AD reduced penetrance, AD haploinsufficient, AR, AR carrier, AR compound heterozygous, X-linked recessive, Y-linked LoF, dual AD/AR mechanism)
Performance model: Three-layer (analytical, classification, clinical workflow)
Disposition methodology: Methodological characterization (not binary pass/fail), six-category disposition taxonomy

Layer 1 - Analytical performance

Variant detection (P1): 19/20 (95%)

The single non-detection was attributed to an upstream sequencing pipeline failure via a 7-step audit trail; Helena platform integrity was verified via 21:21 input-to-database record correspondence; Helena attribution CLEARED.

Gene assignment (P2): 19/19 (100% where evaluable)
HGVS / consequence / zygosity (S1-S3): All PASS where evaluable

Layer 2 - Classification performance

FULL concordance: 13 components

Identical ACMG class with the peer reference comparator. Anchored variously in ClinVar 2-3 star, ClinGen Hearing Loss VCEP v1.0, KDIGO 2025 ADPKD guideline, or independent multi-criterion convergence.

PARTIAL: 6 components (across 4 cases)

Methodologically characterized: 2 reflect validated design behavior (computational-only LP guard and AR carrier guard operating per design), 3 reflect feature-gap accumulation at the LP/VUS boundary (PP3 tool divergence, PM1 coverage, PP2 implementation), and 1 reflects a manual-criteria-dependent inherent automation limitation

DISCORDANT outcomes: 0

No opposite-direction classifications observed

NOT EVALUABLE: 1

Upstream variant calling pipeline failure (verified non-Helena attribution)

Inter-laboratory benchmark: Within published range

Amendola 2016 (66%), Harrison 2017 (72-76%), Bergquist 2025 (70-75%) inter-laboratory FULL concordance benchmarks


Layer 3 - Clinical workflow performance

Q1 - Tier placement: PASS in diagnostic indication cases

Target P/LP variants placed in Tier 1 of phenotype matching output

Q2 - Phenotype match score: HIGH in cases with appropriate HPO input

At least 50% match score

Q4 - AI report quality: GOOD in the majority of cases

Variant mentioned, interpretation correct, overall quality good

AI Faithfulness: 14/20 (70%) PASS

Seven consecutive PASS at cohort closure (Cases 14-20). Improvement trajectory documented across the cohort.

Change Request validations: 10 deployed CRs validated

Trigger-configuration validation of conservative-guard architecture across diverse real-world clinical scenarios

Six-category methodological-disposition taxonomy

Every observed concordance pattern in Cohort 2 is resolved into one of the categories below, providing comprehensive methodological characterization rather than binary pass/fail disposition. The taxonomy is the principal scientific contribution of Cohort 2 and is intended to inform future cohort design.

Tier 1 ground-truth concordance

13 components

Helena and the peer comparator both reach the same classification, anchored in VCEP curation, ClinVar 2-star or higher, or established clinical practice guidelines. Multiple ACMG criteria converge on the same class.

Disposition: No action. System operates correctly. Multiple validation milestones documented.

Validated design behavior

2 components

Helena classification reflects a documented classifier guard operating per ClinGen-aligned conservative design principles. Disagreement with the peer comparator does not reflect platform error; the guard output is the explicitly intended result of genotype-context-aware classification.

Disposition: No remediation. Update Change Request registry status to validated with cohort cases as evidence anchor.

Feature-gap accumulation at LP/VUS boundary

3 components

Multiple documented feature gaps in Helena automated classification scope collectively prevent reaching the LP combining-rule threshold. Three identified gaps: PP3 computational predictor tool divergence, PM1 critical-domain coverage scope, and PP2 not yet implemented in the automated classifier. Each gap is independently characterized and roadmap-tracked. Distinct from defect: the gaps are documented limitations, not errors.

Disposition: Roadmap-tracked. PM1 domain coverage expansion in flight (HELIX-CR-2026-082, deployed April 2026 with UniProt residue-level evidence integration). PP2 implementation roadmap formalization. PP3 tool-choice divergence: no classifier change recommended (defensible methodological choice per ClinGen SVI).

Manual-criteria-dependent inherent automation limitation

1 component

Classification depends primarily on manual-curation evidence (case-control / case-series literature, family cosegregation, individual ClinVar submitter evaluation) that is not available from VCF input. Distinct from feature gap: not a classifier roadmap item, but an inherent scope limitation of automated VCF-only classification.

Disposition: Long-term automation roadmap items only: literature mining for PS4 case-control automation, external trio data ingestion for PP1 segregation, gene-aware PP2 thresholds for small genes. Not classifier defect remediation.

Input data layer upstream pipeline failure

1 case (NOT EVALUABLE)

Target variants absent from the input VCF due to an upstream sequencing or variant-calling pipeline failure. Helena platform integrity was verified via 21:21 input-to-database record correspondence at the correct genomic target coordinates. The failure is upstream, not Helena's.

Disposition: No Helena remediation required (attribution cleared). Upstream pipeline investigation pending sequencing facility action.

Case Selection Methodology

Both cohorts were selected under performance-blind procedures with pre-registered diversity criteria. No case was added or excluded after Helena platform output was observed. Selection rationale is documented and reproducible; this section summarizes the principles applied.

Cohort 1 Selection

Pre-Registered Validation Study Selection

Source registry: Established clinical reference laboratory patient registry
Selection date: February 2026 (prior to any Helena processing)
Selection bias control: Performance-blind: cases chosen before any Helena platform output was observed
Pre-registration: Diversity criteria documented in writing with date stamp before processing
Inclusion criteria: VCF availability in reference laboratory storage; clinically reported pathogenic or likely pathogenic outcome; complete patient phenotype documentation
Diversity coverage: 11 clinical domains, 6 variant types, 3 zygosity categories
Domain composition: Cardiology, oncology, ophthalmology, nephrology, neurology, hearing, syndromic, metabolic, rare disease, neuromuscular, reproductive
Operator independence: External operator (no role in classifier development), independent of Helena R&D

Cohort 2 Selection

Real-World Performance Study Selection

Source registry: Same established clinical reference laboratory patient registry as Cohort 1
Selection date: March 2026 (prior to any Helena processing)
Selection bias control: Performance-blind: cases chosen before any Helena platform output was observed
Funnel: Approximately 171 available VCF files in registry; approximately 32 eligible candidates after applying inclusion criteria; 20 selected per pre-specified diversity axes
Pre-registration: Five diversity axes documented in writing before selection: clinical domain, mutation type, zygosity, complexity, pathogenic mechanism
Leading principle: Complementarity, not repetition: cases selected to characterize platform behavior in clinical domains and mutation types not covered by Cohort 1, and to increase case complexity
Inclusion criteria: VCF availability in reference laboratory storage; clinically reported pathogenic or likely pathogenic outcome; complete patient phenotype documentation
Diversity coverage: 13 clinical domains, 7 variant types, 8 inheritance patterns
Complexity expansion: Compound heterozygous representation quadrupled relative to Cohort 1 (4 cases versus 1); 2 multi-gene cases (up to 3 variants in 3 distinct genes); novel chromosomal contexts including first Y-chromosome variant evaluation

Five Diversity Axes (Cohort 2)

Selection structure for Cohort 2 was deliberately complementary to Cohort 1, not repetitive. Five orthogonal diversity axes were pre-specified to maximize coverage of clinical scenarios and platform challenge cases.

1. Clinical domain

Seven entirely new clinical domains relative to Cohort 1 (neuromuscular, autoinflammatory, connective tissue, craniofacial, ciliopathy, Y-linked reproductive, pulmonology), plus expansion of existing domains (nephrology, neurology). Together the two cohorts provide combined coverage of 24 clinical domains (17 distinct).

2. Mutation type

Balanced mix across SNVs (8 cases), deletions (7), duplications (1), complex rearrangements (1), and mixed compound heterozygous configurations (4 cases with two variants of different types in one gene). Tests platform behavior across the full spectrum of variant calling outputs.

3. Zygosity

14 heterozygous (autosomal dominant or autosomal recessive carrier), 1 homozygous (autosomal recessive biallelic), 2 hemizygous (X-linked and Y-linked). The Y-chromosome case constitutes the first hemizygous Y-linked evaluation for the platform.

4. Case complexity

14 simple cases (one variant in one gene), 4 compound heterozygous (two variants in one gene), 2 multi-gene cases (two to three variants in distinct genes). This complexity distribution quadruples compound heterozygous representation relative to Cohort 1, reflecting the real clinical scenario in autosomal recessive diseases.

5. Pathogenic mechanism

Coverage spans loss-of-function (frameshift, splice), gain-of-function, enzymatic deficiency, channelopathy, structural defect, and pharmacogenomic significance. Distinct pathogenic mechanisms test classifier behavior across the diverse molecular pathologies encountered in clinical practice.

Clinical Domain Coverage Across Cohorts

The two cohorts together cover 17 distinct clinical domains. Cohort 2 adds eight new domains while extending coverage in four domains explored in Cohort 1.

Clinical domain | Cohort 1 | Cohort 2 | Status
Cardiology | 3 | 0 | Covered in Cohort 1
Oncology | 3 | 0 | Covered in Cohort 1
Neurology and movement disorders | 2 | 2 | Expanded
Nephrology | 2 | 2 | Expanded
Ophthalmology | 1 | 0 | Covered in Cohort 1
Hearing disorders | 1 | 0 | Covered in Cohort 1
Metabolic | 1 | 0 | Covered in Cohort 1
Syndromic / RASopathy | 1 | 1 | Expanded
Reproductive / Carrier screening | 1 | 1 | Expanded
Neuromuscular | 0 | 4 | NEW in Cohort 2
Autoinflammatory | 0 | 1 | NEW in Cohort 2
Connective tissue / Ehlers-Danlos | 0 | 1 | NEW in Cohort 2
Craniofacial | 0 | 1 | NEW in Cohort 2
Ciliopathy | 0 | 2 | NEW in Cohort 2
Endocrinology / Diabetes | 0 | 1 | NEW in Cohort 2
Y-linked / Reproductive | 0 | 1 | NEW in Cohort 2
Pulmonology | 0 | 1 | NEW in Cohort 2

Selection Limitations

Honest disclosure of selection limitations is integral to the performance-blind methodology. The following limitations apply to current Phase 1 selection procedures and inform future cohort design.

Single-source patient data: all 40 cases across both cohorts originate from one peer reference laboratory. Multi-laboratory selection is not available within the current organizational stage. Phase 2 Pivotal Cohorts will introduce multi-laboratory case sourcing as standard.

Positive-case enrichment: 100% of selected cases have a clinically reported pathogenic or likely pathogenic outcome. No negative controls (population samples, healthy individuals, or established benign references) are included. This limits clinical specificity claims and reflects the founder-stage focus on detection and classification correctness for known pathogenic variants.

No pre-specified VCEP-curated subset: at selection time, no explicit attempt was made to enrich either cohort with Variant Curation Expert Panel curated variants for higher-tier ground truth comparison. Multi-source ground truth is constructed retrospectively for each case where evidence is available; this is appropriate to current organizational stage but not optimal for inferential claims.

Sample size: n=20 per cohort yields a Wilson 95% confidence interval of roughly ±19 percentage points at 70% observed concordance (about 48% to 85%). Sufficient for descriptive performance characterization but limited for inferential claims. Future Phase 2 Pivotal Cohorts at minimum n=30-50 will narrow the confidence interval substantially.
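The sample-size caveat can be reproduced with a standard Wilson score interval; this sketch (plain Python, function name ours) evaluates the 14/20 FULL-concordance case:

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

lo, hi = wilson_ci(14, 20)  # 70% observed FULL concordance
print(f"95% Wilson CI: {lo:.1%} - {hi:.1%}")  # → 95% Wilson CI: 48.1% - 85.5%
```

The interval spans the published inter-laboratory benchmarks (66-76%), which is exactly why n=20 supports descriptive but not inferential claims.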

Convenience sampling within constraints: selection from the eligible candidate pool was guided by pre-specified diversity criteria but is not strict random sampling. Bias minimization is achieved through explicit performance-blind selection and pre-registered diversity axes, not through randomization.

Methodology Framework

Helena conducts performance studies under HEL-RWP-FRAMEWORK-v1.0 (April 2026), a master methodology document establishing study design, ground truth construction, performance metrics, quality controls, conflict of interest management, and reporting standards.

Multi-source ground truth

Single-laboratory reference is not scientifically defensible as ground truth. Inter-laboratory concordance studies document 25-35% legitimate methodological disagreement between qualified laboratories. The Helena Real-World Performance Framework rejects single-source ground truth and constructs ground truth by triangulation across all available evidence: ClinGen Variant Curation Expert Panels (highest weight where available), ClinVar 2-star+ aggregated submissions, functional studies with clinical follow-up, population genetics evidence, calibrated computational predictors per ClinGen SVI 2022, published case reports and segregation, and disease-specific consortium databases.

Three-layer performance model

Performance characterization is structured as three independent and complementary layers. Layer 1 (analytical performance) addresses variant detection, annotation correctness, repeatability, and reproducibility. Layer 2 (classification performance) addresses ACMG class concordance with multi-source ground truth and methodological characterization of all divergences. Layer 3 (clinical performance) addresses tier placement, phenotype match score, screening rank, AI report quality, and AI faithfulness.

Methodological characterization, not adjudication

Where Helena and any single reference source diverge, the framework requires methodological characterization rather than formal adjudication. Characterization documents which ACMG criteria each platform applied or withheld, whether Helena reflects ClinGen SVI 2020+ specifications, whether independent multi-source evidence supports one classification over another, and whether the divergence falls within published inter-laboratory disagreement benchmarks. Formal external adjudication panels are reserved for Pivotal Cohorts in future organizational stages.

Pre-registration and transparency

Cohort composition and selection criteria are documented in writing with date stamp before processing begins. Selection cannot be performance-aware. Acceptance criteria, where applicable (pre-registered Validation Studies), are defined before any case is processed. Real-World Performance Studies do not produce binary pass/fail dispositions; outputs report performance values across all metric families with reference to published benchmarks, methodological characterization of all divergences, identified platform defects with remediation status, and limitations explicitly acknowledged.

Understanding Methodological Divergence

Inter-laboratory disagreement of 25 to 35 percent is a documented baseline across the field (Amendola 2016, Harrison 2017). Where Helena and the peer comparator differ, every divergence is methodologically characterized. No DISCORDANT outcomes have been observed across either cohort.

Helena applies stricter ClinGen SVI 2020+ specifications

Where Helena and the peer comparator differ, the difference frequently reflects methodological evolution, not platform defect. Helena implements Pejaver 2022 calibrated PP3/BP4 thresholds, Walker 2023 SpliceAI thresholds aligned to ClinGen SVI 2023, ClinGen SVI PP5 deprecation, BS2 inheritance-aware thresholds, and PVS1 expression-aware guards (gnomAD pext v4.1, GTEx v10). Many peer laboratories continue to apply historical ACMG/AMP 2015 criteria without subsequent ClinGen SVI updates.

Conservative guards operating per design

Helena explicitly implements conservative classification guards to prevent overclassification: a computational-only LP guard requires at least one observed criterion (PS1, PM3, PM4, PP4, or PP5) before reaching Likely Pathogenic when the only Strong evidence is a computational prediction; an autosomal recessive carrier guard prevents Likely Pathogenic for heterozygous loss-of-function variants in AR-only LoF genes without a compound heterozygous partner; dual-mechanism bypasses preserve Likely Pathogenic eligibility for genes with both monoallelic and biallelic pathways. These guards are validated design behavior, not divergence from ground truth.
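A minimal sketch of the two guards as described, assuming pre-computed boolean context flags; criterion names follow ACMG/AMP 2015, while the encoding and the VUS fallback class are illustrative assumptions, not Helena's implementation:

```python
# Illustrative guard sketch; not Helena's actual code.
OBSERVED_CRITERIA = {"PS1", "PM3", "PM4", "PP4", "PP5"}

def computational_only_lp_guard(criteria: set[str],
                                strong_is_computational_only: bool) -> str:
    """Withhold Likely Pathogenic when the only Strong evidence is a
    computational prediction and no observed criterion is present."""
    if strong_is_computational_only and not (criteria & OBSERVED_CRITERIA):
        return "VUS"  # illustrative fallback class
    return "Likely Pathogenic"

def ar_carrier_guard(gene_is_ar_only_lof: bool, zygosity: str,
                     has_compound_het_partner: bool) -> str:
    """Withhold Likely Pathogenic for heterozygous LoF variants in AR-only
    LoF genes without a compound heterozygous partner. Dual-mechanism
    genes (both monoallelic and biallelic pathways) would carry
    gene_is_ar_only_lof=False, preserving LP eligibility."""
    if (gene_is_ar_only_lof and zygosity == "heterozygous"
            and not has_compound_het_partner):
        return "VUS"  # illustrative fallback class
    return "Likely Pathogenic"
```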

Documented feature-coverage gaps with active roadmap

Three feature gaps were identified as recurring sources of LP/VUS boundary disagreement: PP3 tool divergence, PM1 critical-domain coverage scope, and PP2 not yet implemented. Each is roadmap-tracked. The PM1 coverage gap was directly addressed by the recent UniProt residue-level evidence integration (HELIX-CR-2026-082, classifier v3.30.0, April 2026), which expanded PM1 evidence from approximately 50 hardcoded Pfam domains to 113,126 expert-curated UniProt residues - a 2,262-fold increase in evidence density.
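Conceptually, residue-level PM1 evidence is a membership test over curated (gene, protein position) pairs. The sketch below is illustrative: the entries shown are well-known hotspot residues used as examples, and the data layout is our assumption, not the platform's schema:

```python
# Hypothetical PM1 residue lookup: membership of (gene, protein position)
# in an expert-curated set of functionally critical residues.
CRITICAL_RESIDUES: set[tuple[str, int]] = {
    ("BRAF", 600),   # illustrative entries only
    ("KRAS", 12),
    ("KRAS", 13),
}

def pm1_applies(gene: str, protein_pos: int) -> bool:
    """PM1 fires when the variant falls on a curated critical residue."""
    return (gene, protein_pos) in CRITICAL_RESIDUES

print(pm1_applies("KRAS", 12))   # → True
print(pm1_applies("KRAS", 61))   # → False
```

Moving from ~50 hardcoded domains to 113,126 curated residues changes the density of this lookup set, not the logic.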

Inherent VCF-only automation limitations

Classification criteria that require manual-curation evidence (PS4 case-control literature, PP1 family cosegregation, PP5 single-submitter evaluations under deprecation review) cannot be fully automated from VCF input alone. Where the peer comparator applies these criteria via manual curation, Helena conservatively declines to fire them - this is an inherent scope limitation of automated VCF-only classification, transparently disclosed.

Zero clinically significant divergence

Across both cohorts (40 cases, 24 clinical domains), no DISCORDANT outcomes were observed. The CSER Consortium (Amendola 2020) reports 11% of variants exhibit discordance affecting clinical recommendations across nine laboratories. In Helena evaluation: 0%. All differences fall within the P/LP boundary or the LP/VUS boundary, neither of which alters clinical actionability under ClinGen SVI guidance.

Conflict of Interest Disclosure

Helena Bioinformatics maintains a transparent conflict of interest disclosure framework aligned with academic publication standards (CIOMS, ICH GCP, Helsinki Declaration) and consistent with downstream regulatory requirements. Disclosure is mandatory; mitigations are tailored to study type and organizational stage.

Founder and Chief Executive Officer

Financial (sole owner), Employment

Disclosure: Sole owner of Helena Bioinformatics. Performs all internal Helena roles in the current single-founder organizational stage, including technical development, scientific direction, business operations, and clinical validation oversight.

Mitigation: Disclosed in all Real-World Performance Study Reports. The multi-source ground truth methodology mitigates single-author classification framing. Future organizational growth will introduce role separation.

Scientific Advisor

Family relationship

Disclosure: Family relationship to the Helena founder.

Mitigation: Excluded from formal adjudication. Role limited to Scientific Advisor (methodology consultation, framework review, scientific guidance). Family relationship disclosed in any report where input is referenced.

Cohort 2 case operator

Employment

Disclosure: Employed by the peer reference laboratory used in Cohort 2.

Mitigation: Disclosed in the Cohort 2 Real-World Performance Study Report. Operator role limited to technical processing and comparison documentation; not classification authority. The multi-source ground truth methodology mitigates single-source dependency. Future cohorts will introduce independent operator structure with no peer-laboratory affiliations.

External clinical signatory

Professional affiliation

Disclosure: Has signed peer reference laboratory clinical reports as external clinical signatory, including reports referenced in Cohort 2. Not an employee of the peer reference laboratory.

Mitigation: Cannot serve as adjudicator on cases previously signed. May serve as Helena Scientific Advisor or methodology consultant for non-overlapping cohorts. Disclosed in any report where input is referenced.

Limitations and Caveats

Honest acknowledgment of study limitations is integral to the methodological characterization approach. The following limitations apply to current Phase 1 evidence and inform future cohort design.

Sample size of 40 cases across two cohorts is appropriate for foundational performance characterization but limits inferential statistical power. The Wilson 95% confidence interval at 70% concordance with n=20 is roughly ±19 percentage points - sufficient for trend characterization but not for fine-grained inferential claims. Future Phase 2 Pivotal Cohorts at minimum n=30-50 would narrow the confidence interval substantially.

Single peer reference comparator. All peer comparisons in both cohorts are with one established clinical reference laboratory. Inter-laboratory variation inherent in ACMG classification means that disagreement with any single laboratory does not necessarily indicate platform error. The multi-source ground truth methodology mitigates single-source dependency but does not replace multi-laboratory comparison. Phase 2 Pivotal Cohorts will introduce multi-laboratory comparison as standard.

Cohort selection enriched for known pathogenic and likely pathogenic variants. Performance on Variant of Uncertain Significance classification, benign variant filtering, and novel variants without prior literature is not directly assessed in these cohorts. Future Methodology Validation Studies (database-derived against ClinVar 3-star+ benchmarks and GIAB truth sets) will address these performance dimensions at scale.

Genome build and transcript namespace differences. The peer reference comparator uses GRCh37 with RefSeq NM transcripts; Helena uses GRCh38 with Ensembl ENST canonical transcripts. While liftover and cross-namespace mapping are generally reliable for SNVs and small indels, documented edge cases include transcript-isoform divergence with N-terminal exon usage differences and occasional position-offset inconsistencies. These are documented methodological observations, not platform defects.

Methodological characterization without external adjudication. Per the Real-World Performance Framework, methodological characterization of PARTIAL outcomes is performed by Helena R&D rather than by external multi-adjudicator panels. This is appropriate to the current Phase 1 organizational stage but does not provide the same level of independence as multi-adjudicator panel review. Phase 2 Pivotal Cohorts will introduce external adjudication panels with explicit conflict-of-interest controls.

Operator structure for Cohort 2. The Cohort 2 case operator is employed by the peer reference comparator entity. The conflict of interest is disclosed with mitigations applied (operator role limited to technical processing; multi-source ground truth methodology mitigates single-source dependency). Phase 2 Pivotal Cohorts will introduce independent operator structure.

VCEP coverage. Only one of twenty Cohort 2 cases had a Variant Curation Expert Panel curated reference available, reflecting the state-of-the-field limitation that VCEP coverage across rare-disease genes remains sparse (approximately 250 genes with active VCEP curation as of April 2026 per ClinGen).

Results should always be interpreted in the context of the patient clinical presentation, family history, and other available clinical information. Helena is a clinical decision support tool, not a diagnostic device. All classifications require review and confirmation by a qualified clinical geneticist.

Phased Pathway to Regulatory Submission

Helena operates within Phase 1 of an explicitly phased pathway toward eventual regulatory submission. Phase progression is conditional on organizational growth and is not pursued at the expense of current scientific work.

Phase 1: Foundation (Current)
Active

Solo founder organizational stage. Real-World Performance Studies and Methodology Validation Studies. Foundational scientific evidence accumulation.

  • Real-World Performance Studies (Cohorts 1 and 2 completed)
  • Methodology Validation Studies (database-derived against ClinVar 3-star+ benchmarks and GIAB truth sets, in planning)
  • Scientific publications and conference abstracts
  • Foundational evidence carryforward to Phase 2 and Phase 3
Phase 2: Growth
Future

Small team. Series A or strategic partnership. Dedicated Quality and Regulatory functions. First Pivotal Cohort possible. Multi-cohort program execution.

  • Phase 1 activities continue
  • First Pivotal Cohort (n=30-50, formal acceptance criteria)
  • Multi-adjudicator panel introduction
  • Multi-laboratory comparison as standard
  • Independent operator structure
Phase 3: Pre-submission
Future

Established team. Mature Quality Management System. Notified Body engagement. Pivotal Cohorts at full composition. Multi-site validation.

  • Pivotal Cohorts at full composition (60/30/10 positive/negative/edge)
  • Multi-site validation across independent laboratories
  • Pre-submission Notified Body consultations
  • IVDR (EU 2017/746) conformity assessment submission
  • Post-market performance follow-up planning

References

Peer-reviewed literature underlying the methodology framework, ACMG/AMP standards, ClinGen SVI specifications, and inter-laboratory benchmark studies.

Amendola LM, Jarvik GP, Leo MC, et al. Performance of ACMG-AMP variant-interpretation guidelines among nine laboratories in the Clinical Sequencing Exploratory Research Consortium. American Journal of Human Genetics. 2016;98(6):1067-1076. PMID: 27181684

Harrison SM, Dolinsky JS, Knight Johnson AE, et al. Clinical laboratories collaborate to resolve differences in variant interpretations submitted to ClinVar. Genetics in Medicine. 2017;19(10):1096-1104. PMID: 28301460

Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine. 2015;17(5):405-424. PMID: 25741868

Tavtigian SV, Greenblatt MS, Harrison SM, et al. Modeling the ACMG/AMP variant classification guidelines as a Bayesian classification framework. Human Mutation. 2018;39(11):1485-1492. PMID: 30311386

Pejaver V, Byrne AB, Feng BJ, et al. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. American Journal of Human Genetics. 2022;109(12):2163-2177. PMID: 36413997

Walker LC, Hoya M, Wiggins GAR, et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup. American Journal of Human Genetics. 2023;110(7):1046-1067. PMID: 37352859

Abou Tayoun AN, Pesaran T, DiStefano MT, et al. Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criterion. Human Mutation. 2018;39(11):1517-1524. PMID: 30192042

Roy S, Coldren C, Karunamurthy A, et al. Standards and guidelines for validating next-generation sequencing bioinformatics pipelines: a joint recommendation of the Association for Molecular Pathology and the College of American Pathologists. Journal of Molecular Diagnostics. 2018;20(1):4-27. PMID: 29154853

Amendola LM, Muenzen K, Biesecker LG, et al. Variant classification concordance using the ACMG-AMP variant interpretation guidelines across nine genomic implementation research studies. American Journal of Human Genetics. 2020;107(5):932-941. PMID: 33108757

Cummings BB, Karczewski KJ, Kosmicki JA, et al. Transcript expression-aware annotation improves rare variant interpretation. Nature. 2020;581(7809):452-458. PMID: 32461655

Request Detailed Cohort Reports

Detailed cohort reports, per-case performance documentation, and the full Real-World Performance Framework are available to clinical partners, peer reviewers, investors, and regulatory consultants under appropriate confidentiality terms.

Contact Us