
HEL-RWP-FRAMEWORK-v1.0

Helena Real-World Performance Framework

Methodology for Real-World Performance Studies of Helena

Version 1.0 - April 2026

This page presents the Helena Real-World Performance Framework v1.0. It is intended for clinical partners, peer reviewers, investors, strategic partners, and regulatory consultants. The methodology described represents Helena Bioinformatics' current Phase 1 work and is foundational evidence for a future regulatory submission, not a regulatory submission in itself.

Document ID: HEL-RWP-FRAMEWORK-v1.0
Version: 1.0 (initial release)
Status: Active
Document type: Real-World Performance Study Master Methodology (Living Document)
Authoring entity: Helena Bioinformatics
Effective date: April 2026
Applicable to: All real-world performance characterization activities for Helena, including the ongoing Cohort 2 study and prospective cohorts.
Primary audience: Scientific community, peer reviewers, investors, strategic partners, and as foundational evidence for future regulatory submission.
Eventual regulatory path: IVDR (EU 2017/746), UKCA, IVDR-CH; anticipated submission Q3 2027 or later, subject to organizational maturity and partnership development. The current Framework supports preparatory evidence accumulation, not immediate submission.
Methodology standards: ACMG/AMP 2015 (Richards et al.); ClinGen SVI specifications (Pejaver 2022, Walker 2023, Abou Tayoun 2018, Biesecker & Harrison 2020); CIOMS conflict of interest disclosure; CONSORT-style reporting where applicable.

1. Framework Purpose and Positioning

1.1. Purpose

This document establishes the methodology under which Helena Bioinformatics conducts Real-World Performance Studies of the Helena variant classification platform. It defines study design, ground truth construction, performance metrics, quality controls, conflict of interest management, and reporting standards applicable to Helena's current organizational stage.

This Framework supersedes the prior Helena Clinical Validation Framework documents (v1.0 and v1.1, now archived). It reflects a deliberate methodological reframing: from a regulatory-validation paradigm calibrated for organizations with mature quality infrastructure, to a real-world performance characterization paradigm appropriate to an early-stage deep-technology startup pursuing clinical credibility through scientifically rigorous evidence accumulation.

1.2. Why the Reframing

The prior framework treated a single peer reference laboratory as ground truth and structured study activities around regulatory validation conventions. Three observations during early use motivated reframing:

  • Single-laboratory reference is not scientifically defensible as ground truth. Inter-laboratory concordance studies (Amendola 2016, Harrison 2017, Bergquist 2025) document 25 to 35 percent legitimate methodological disagreement between qualified laboratories. Treating any single peer laboratory as definitive truth misrepresents the state of the field.

  • The Helena classifier applies the stricter ClinGen SVI 2020+ specifications, while many peer laboratories still apply the historical 2015 ACMG guidelines. Where Helena and the comparator laboratory differ, the difference frequently reflects methodological evolution rather than a Helena defect. Framing such cases as Helena failures was scientifically misleading.

  • Helena Bioinformatics operates in Phase 1 (current organizational stage) with no immediate Notified Body engagement. Heavy regulatory-validation paperwork serves no current submission. The need is for credible, defensible, peer-publishable evidence supporting Helena's classification claims, not for premature regulatory dossiers.

This Framework therefore positions Helena's near-term work as Real-World Performance Studies producing scientifically rigorous, peer-reviewable evidence. The path to formal regulatory submission is preserved as a downstream goal but is not pursued through current activities.

1.3. Helena Bioinformatics Current Position

Helena Bioinformatics currently operates within Phase 1 of an explicitly phased pathway (Section 9.1). Phase 1 activities reflect the organizational scope appropriate to this stage. The platform processes clinical genomic data through partnership arrangements with established peer reference laboratory partners.

The current stage admits the following considerations, which this Framework explicitly accommodates:

  • Phase 1 organizational scope - activities reliant on external collaboration for clinical and reference functions where appropriate.

  • Partnership-based patient data access - VCF files originating from collaborating peer reference laboratories under appropriate agreements.

  • Phased pathway to Notified Body engagement - IVDR submission anticipated Q3 2027 or later, conditional on funding and organizational growth.

Within Phase 1 scope, scientifically rigorous performance characterization remains achievable. The methodologies in this Framework are designed to deliver defensible evidence under such scope, while preserving an unobstructed path to regulatory-grade work in subsequent organizational stages.

1.4. Intended Audience for Performance Study Outputs

Outputs of studies conducted under this Framework are intended for the following audiences:

  • Scientific community: Peer-reviewed publications. Conference abstracts. Methodology demonstration.

  • Investors and strategic partners: Clinical credibility evidence. Capability demonstration. Due diligence material.

  • Potential collaborators: Platform capability evidence supporting partnership discussions with laboratories and clinical centers.

  • Future Notified Body (downstream): Foundational evidence retained for citation in eventual IVDR submission. Not pivotal evidence under current Framework - designated as foundational methodology and capability demonstration.

  • Internal Helena Bioinformatics R&D: Defect identification, classifier improvement targets, methodology refinement.

2. Study Architecture

2.1. Three-Layer Performance Model

Performance characterization is structured as three independent and complementary layers. Each layer addresses a distinct aspect of platform behavior and produces evidence directly informative to clinical credibility.

Layer 1: Analytical Performance

Does the platform detect, annotate, and process variants with required technical performance? Evidence of correct read-in, correct annotation, repeatability, reproducibility. Output: technical performance metrics, with comparison to standard NGS pipeline benchmarks (CLSI MM09-A2).

Layer 2: Classification Performance

Does the platform classify variants in concordance with established ACMG/AMP standards, ClinGen SVI specifications, and multi-source reference evidence? Evidence of agreement with multi-source ground truth across the available evidence hierarchy. Output: classification concordance metrics with explicit ground truth source labeling. Methodological characterization of disagreements.

Layer 3: Clinical Performance

Does the platform produce clinically meaningful, prioritized, and interpretable results that support clinical decision-making? Evidence of diagnostic prioritization, phenotype matching, and interpretation quality. Output: clinical workflow metrics: tier placement, screening rank, AI report quality, AI faithfulness.

2.2. Multi-Source Ground Truth Construction

This Framework rejects the prior single-source ground truth approach. For each variant evaluated in a Real-World Performance Study, ground truth is constructed by triangulation across all available evidence sources, with explicit hierarchical weighting based on evidence strength.

VCEP-curated classifications

Multi-expert consensus through ClinGen Variant Curation Expert Panels. ClinVar 3-star or 4-star entries.

Weighting: Highest weight. Where available, treated as definitive ground truth.

Functional studies with clinical follow-up

Published experimental evidence demonstrating variant effect, supplemented by reported clinical phenotype.

Weighting: High weight. Particularly for missense variants in genes with established functional assays.

ClinVar aggregated submissions (2-star)

Multiple submitters, criteria provided, no conflicts. Aggregated independent classifications.

Weighting: Moderate weight. Useful for triangulation.

Population genetics evidence

gnomAD allele frequencies, constraint metrics (pLI, oe LoF, LOEUF, missense Z), regional constraint, pext expression.

Weighting: Moderate weight. Strong for ruling out pathogenicity (high frequency or LoF tolerance) and supporting (LoF intolerance for null variants).

Computational predictors (calibrated)

BayesDel noAF, AlphaMissense, REVEL with ClinGen SVI 2022 calibrated thresholds for evidence strength.

Weighting: Supporting weight. Used per ClinGen Pejaver 2022 calibration thresholds.

Published case reports and segregation

Peer-reviewed literature reporting variant in affected individuals, family segregation evidence.

Weighting: Variable weight depending on case quality, segregation strength, and publication rigor.

Disease-specific consortium databases

LOVD, gene-specific locus databases, registry data.

Weighting: Moderate weight. Domain-specific value where available.

Single peer reference laboratory

Single-laboratory clinical classification.

Weighting: Comparator only, not ground truth. Inter-laboratory disagreement of 25 to 35 percent is expected per published benchmarks; disagreement does not establish error.
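The hierarchy above can be sketched as a weight lookup with a highest-available-source rule. This is a minimal illustration only: the tier values, source keys, and function name are hypothetical, not Helena's internal schema, and the peer laboratory is deliberately absent from the table because the Framework treats it as a comparator, never as ground truth.

```python
from enum import IntEnum

class Weight(IntEnum):
    SUPPORTING = 1
    MODERATE = 2
    HIGH = 3
    DEFINITIVE = 4

# Illustrative weighting table mirroring Section 2.2 (keys are assumptions).
SOURCE_WEIGHTS = {
    "vcep": Weight.DEFINITIVE,           # ClinVar 3/4-star VCEP curation
    "functional_clinical": Weight.HIGH,  # functional study + clinical follow-up
    "clinvar_2star": Weight.MODERATE,
    "population": Weight.MODERATE,
    "consortium_db": Weight.MODERATE,
    "computational": Weight.SUPPORTING,  # calibrated predictor evidence
    # "peer_lab" is deliberately absent: comparator only, never ground truth.
}

def ground_truth(available):
    """available: dict mapping source key -> ACMG class reported by that source.
    Returns the class from the highest-weight recognized source, or None."""
    ranked = sorted(
        (src for src in available if src in SOURCE_WEIGHTS),
        key=lambda s: SOURCE_WEIGHTS[s],
        reverse=True,
    )
    return available[ranked[0]] if ranked else None
```

A real triangulation would weigh conflicting moderate-tier sources against each other rather than pick one arbitrarily, as this sketch does; it shows only the hierarchical precedence, with VCEP treated as definitive where present.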

2.3. Concordance Levels

Comparisons between Helena classification and any reference source are recorded at the following concordance levels:

FULL

Identical ACMG classification (P=P, LP=LP, VUS=VUS, LB=LB, B=B).

Direct agreement on classification.

CLINICAL

P/LP boundary, or LB/B boundary. Same clinical management implication.

Clinically equivalent. Per ClinGen SVI guidance, P and LP do not differ in clinical management.

PARTIAL

Adjacent categories that differ in clinical management. P/LP versus VUS, LB versus VUS.

Methodological divergence requiring characterization. Not automatic platform error.

DISCORDANT

Opposite-direction classifications. P/LP versus B/LB.

Significant divergence requiring detailed investigation, regardless of source.
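A pair of ACMG classes maps deterministically onto these four levels. A compact sketch of that mapping (the function name is illustrative):

```python
def concordance(helena: str, reference: str) -> str:
    """Map a (Helena, reference) ACMG class pair to a Section 2.3 level.
    Classes: 'P', 'LP', 'VUS', 'LB', 'B'."""
    PATHOGENIC, BENIGN = {"P", "LP"}, {"B", "LB"}
    if helena == reference:
        return "FULL"          # identical classification
    if {helena, reference} <= PATHOGENIC or {helena, reference} <= BENIGN:
        return "CLINICAL"      # same clinical management implication
    if (helena in PATHOGENIC and reference in BENIGN) or (
        helena in BENIGN and reference in PATHOGENIC
    ):
        return "DISCORDANT"    # opposite-direction classification
    return "PARTIAL"           # adjacent categories involving VUS
```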

2.4. Characterization, Not Adjudication

Where Helena and any single reference source diverge (PARTIAL or DISCORDANT), this Framework requires methodological characterization rather than formal adjudication. Characterization documents:

  • Which ACMG criteria each platform applied and which were withheld.

  • Whether Helena's choice reflects ClinGen SVI 2020+ specifications (deprecation of PP5, PS4 single-case caution, calibrated computational thresholds, etc.).

  • Whether independent multi-source evidence supports one classification over another.

  • Whether the divergence falls within published inter-laboratory disagreement benchmarks (60 to 75 percent FULL concordance per Amendola 2016, Harrison 2017, Bergquist 2025).

Formal adjudication by external expert panel is reserved for Pivotal Cohorts in future organizational stages (see Section 9.2). Real-World Performance Studies in Phase 1 (current Helena stage) employ methodological characterization, supplemented by optional academic peer review of the final report.

3. Performance Metrics

Performance is measured across three metric families corresponding to the three architectural layers. Metrics in this Framework are descriptive and benchmarked against published literature, not gated against pre-set thresholds. Pass/fail thresholds are reserved for future Pivotal Cohorts under expanded Framework versions.

3.1. Family A - Analytical Performance

Detection Sensitivity

Fraction of variants present in input VCF correctly detected and presented in classified output.

Benchmark: Compared to CLSI MM09-A2 benchmarks for diagnostic NGS pipelines (typically at least 99 percent).

Annotation Concordance

Fraction of variants where Helena annotation (gene, consequence, transcript) matches authoritative reference (Ensembl, MANE Select).

Benchmark: Compared to Ensembl VEP standard (typically at least 99 percent).

Repeatability

Identical classification when same VCF processed twice on same classifier version.

Benchmark: Deterministic classifier expectation: 100 percent.

Reproducibility

Identical classification on different infrastructure with same classifier version.

Benchmark: Deterministic classifier expectation: 100 percent.

Upstream Pipeline Failure Rate

Fraction of cases where target variant is absent from input VCF, attributable to upstream sequencing or variant calling pipeline.

Benchmark: Reported separately. Excluded from Helena Detection Sensitivity calculation.
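Per the definitions above, the sensitivity denominator excludes upstream pipeline failures, which are reported as a separate rate over all cases. A sketch of that bookkeeping, with illustrative field names:

```python
def detection_sensitivity(cases):
    """cases: list of dicts with boolean fields 'detected' and
    'upstream_failure' (target absent from input VCF). Field names are
    illustrative. Upstream failures are excluded from the denominator
    per Section 3.1."""
    in_scope = [c for c in cases if not c["upstream_failure"]]
    if not in_scope:
        return None
    return sum(c["detected"] for c in in_scope) / len(in_scope)

def upstream_failure_rate(cases):
    """Reported separately, computed over all cases."""
    return sum(c["upstream_failure"] for c in cases) / len(cases)
```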

3.2. Family B - Classification Performance

FULL Concordance with VCEP Reference

Identical ACMG class with VCEP-curated classification (where available).

Benchmark: Higher expectation given expert consensus reference. Reported per available cases.

FULL Concordance with Peer Reference Laboratory

Identical ACMG class with peer laboratory comparator.

Benchmark: Compared to published inter-laboratory benchmarks: Amendola 2016 (66 percent), Harrison 2017 (72 to 76 percent), Bergquist 2025 (70 to 75 percent). Helena results reported in this published context.

FULL+CLINICAL Combined

Identical or P/LP boundary classification.

Benchmark: Reported as clinical-equivalent agreement.

PARTIAL Rate with Methodological Characterization

Fraction of cases at PARTIAL with documented root cause analysis.

Benchmark: Reported with full root cause table. Target: 100 percent characterization coverage.

DISCORDANT Rate

Fraction of cases at DISCORDANT.

Benchmark: Reported with detailed investigation. Target: low, but reported transparently regardless of value.

ClinGen SVI 2020+ Stricter Application Rate

Fraction of PARTIAL cases attributable to Helena's application of ClinGen SVI specifications more recent than the peer laboratory's reference framework.

Benchmark: Documented as methodological divergence reflecting field evolution, not platform defect.

Multi-Source Agreement Rate

For PARTIAL/DISCORDANT cases: fraction where multi-source evidence (VCEP, ClinVar consensus, functional, population) supports Helena classification.

Benchmark: Reported as evidence triangulation outcome.

3.3. Family C - Clinical Performance

Q1 - Tier Placement

Fraction of target P/LP variants placed in Tier 1 of Phenotype Matching output.

Benchmark: Reported as PASS (Tier 1), ACCEPTABLE (Tier 2), FAIL (Tier 3 or 4).

Q2 - Phenotype Match Score

Fraction of cases where target gene phenotype match score is at least 50 percent.

Benchmark: Reported as HIGH (at least 50 percent), MODERATE (20 to 49 percent), LOW (less than 20 percent).

Q3 - Screening Rank

Ordinal rank of target variant in Screening service output.

Benchmark: Reported as TOP-5, TOP-10, TOP-20, OUTSIDE-20.

Q4 - AI Report Quality

Quality rating of AI-generated clinical report.

Benchmark: Reported as GOOD, ACCEPTABLE, POOR per Q4a (variant mentioned), Q4b (interpretation correct), Q4c (overall quality).

AI Faithfulness Rate

Fraction of AI reports free of factual hallucinations.

Benchmark: Target: 100 percent. Any hallucination triggers separate AI Service investigation Change Request.

Diagnostic Yield (cohort-level)

Fraction of cases with at least one P/LP variant addressing primary clinical indication.

Benchmark: Compared to published WGS yield (Lionel 2018: 41 percent; Marshall 2020: 43 percent).
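The Q2 and Q3 bands above are simple threshold mappings; a sketch (function names are illustrative):

```python
def q2_band(match_pct: float) -> str:
    """Phenotype match score band per Section 3.3 (Q2)."""
    if match_pct >= 50:
        return "HIGH"
    return "MODERATE" if match_pct >= 20 else "LOW"

def q3_band(rank: int) -> str:
    """Screening rank band per Section 3.3 (Q3), rank is 1-based."""
    for limit, band in ((5, "TOP-5"), (10, "TOP-10"), (20, "TOP-20")):
        if rank <= limit:
            return band
    return "OUTSIDE-20"
```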

3.4. Reporting Disposition Logic

Real-World Performance Studies do not produce binary pass/fail dispositions. Instead, study outputs report:

  • Performance values across all metric families with reference to published benchmarks.

  • Methodological characterization of all PARTIAL and DISCORDANT cases.

  • Identified platform defects with linked Change Requests and remediation status.

  • Limitations and caveats explicitly acknowledged.

  • Conclusions appropriate to study scope: descriptive performance characterization, not regulatory clearance.

Future Pivotal Cohorts under expanded Framework versions will introduce binary pass/fail thresholds. Such thresholds are not applicable at the current organizational stage.

4. Cohort Design

4.1. Study Type Designations

Each cohort conducted under this Framework is designated, prior to initiation, as one of three study types. Designation determines applicable study design, sample size, and intended use of evidence.

Real-World Performance Study

Purpose: Characterize Helena behavior on real clinical cases. Document inter-laboratory concordance and methodology divergence. Identify defects.

Typical size: 10 to 30 cases

Intended output: Performance characterization report. Peer-reviewable scientific document. Foundational evidence for downstream regulatory work.

Methodology Validation Study

Purpose: Statistical validation against established benchmark datasets (ClinVar 3-star+, GIAB, VCEP curated sets). Demonstrate classifier accuracy at scale.

Typical size: Hundreds to thousands of cases (database-derived)

Intended output: Methodology validation paper. Statistical performance characterization. Suitable for journal submission (JMG, Genet Med, AJHG, etc.).

Pivotal Cohort Study (future)

Purpose: Formal validation evidence for IVDR/UKCA conformity assessment. Multi-site, multi-adjudicator, full bias controls.

Typical size: 30 to 50+ cases minimum

Intended output: Notified Body submission evidence. Reserved for Phase 2/3 organizational stage.

4.2. Cohort Selection Criteria for Real-World Performance Studies

Cohort selection must be documented and reproducible. Selection criteria are pre-specified in the Cohort Design Document before any case is processed. Selection cannot be performance-aware.

For Real-World Performance Studies (current Helena stage), the following selection guidance applies:

  • Pre-registration: Cohort composition and selection criteria documented in writing with date stamp before processing begins.

  • Diversity coverage: Each cohort spans multiple clinical domains (minimum 5 distinct domains), inheritance patterns (AD, AR, X-linked, Y-linked, mitochondrial where applicable), and variant types (SNV, indel, splice, structural where in scope).

  • Source diversity (where feasible): Where multiple source laboratories or sources can contribute, cases are selected to represent multiple sources rather than concentrating on a single source.

  • Multi-source ground truth feasibility: Selection should preferentially include variants where multi-source evidence is achievable (VCEP-curated cases, ClinVar 3-star+ cases, well-characterized variants in published literature).

  • Demographic balance: Sex ratio between 30:70 and 70:30 where data permits. Age range spanning pediatric and adult cases where data permits.

  • Composition flexibility: Real-World Performance Studies may operate at flexible composition (positive cases, negative controls, edge cases in proportions tailored to study question). Strict 60/30/10 composition is reserved for future Pivotal Cohorts.

4.3. Sample Size Guidance

Sample size for Real-World Performance Studies is selected to support the study's descriptive purpose and feasibility of execution at the current organizational stage:

Real-World Performance Study (initial)

Minimum n: 10

Recommended n: 15-20

Rationale: Sufficient to identify methodology gaps, document inter-laboratory concordance trend, and identify platform defects. Statistical power is descriptive, not inferential.

Real-World Performance Study (expanded)

Minimum n: 20

Recommended n: 25-30

Rationale: Wilson 95% CI at 70% concordance: ±18 pp at n=25. Sufficient for trend characterization and pre-publication preparation.

Methodology Validation Study

Minimum n: Hundreds-thousands

Recommended n: Thousands+

Rationale: Database-derived. Statistical power for inferential claims.

Pivotal Cohort (future)

Minimum n: 30

Recommended n: 50+

Rationale: Wilson 95% CI narrows to ±13 pp at n=50. Required for IVDR Class C pivotal evidence.
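The Wilson intervals cited in the rationales above can be reproduced with the standard score-interval formula; a minimal sketch:

```python
import math

def wilson_ci(p_hat: float, n: int, z: float = 1.96):
    """Wilson score confidence interval for a proportion
    (z = 1.96 for a 95% interval). Returns (lower, upper)."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    margin = z * math.sqrt(
        p_hat * (1 - p_hat) / n + z * z / (4 * n * n)
    ) / denom
    return center - margin, center + margin
```

At 70 percent observed concordance this gives roughly [0.50, 0.84] for n = 25 and [0.56, 0.81] for n = 50, i.e. half-widths of roughly 17 pp and 12 pp around the Wilson center, in line with the approximate ±18 pp and ±13 pp figures cited above.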

5. Methodology

5.1. Detection Assessment (Layer 1)

Detection assessment evaluates whether Helena correctly identifies variants present in input VCF.

  1. Input VCF is processed through the Helena pipeline using the production classifier version under study.

  2. For each target variant, the classified_variants DuckDB database is queried by genomic position (GRCh38), HGVS coding change, protein change, and ClinVar identifier. Detection is confirmed if any query returns a match.

  3. For variants not detected, the root cause is determined by examining: input VCF presence, liftover output, quality filter status, and annotation pipeline logs.

  4. Detection failures are categorized as: (a) Helena pipeline failure - counts against Detection Sensitivity; (b) upstream pipeline failure - reported separately as upstream failure rate; (c) quality filter failure - recorded as quality-flagged detection.
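The any-of-four-identifiers lookup in step 2 can be sketched as below. Column names and the query structure are assumptions, not Helena's actual schema; production queries run against the classified_variants DuckDB table, while stdlib sqlite3 is used here only to keep the sketch self-contained (both accept the same parameterized-`execute` pattern).

```python
import sqlite3

# One query per identifier type; detection is confirmed if ANY matches.
# Column names are illustrative assumptions.
QUERIES = [
    "SELECT 1 FROM classified_variants WHERE chrom = ? AND pos = ? AND ref = ? AND alt = ?",
    "SELECT 1 FROM classified_variants WHERE hgvs_c = ?",
    "SELECT 1 FROM classified_variants WHERE hgvs_p = ?",
    "SELECT 1 FROM classified_variants WHERE clinvar_id = ?",
]

def is_detected(conn, target) -> bool:
    """target: dict with GRCh38 position fields plus HGVS coding/protein
    change and ClinVar identifier (illustrative field names)."""
    params = [
        (target["chrom"], target["pos"], target["ref"], target["alt"]),
        (target["hgvs_c"],),
        (target["hgvs_p"],),
        (target["clinvar_id"],),
    ]
    return any(
        conn.execute(q, p).fetchone() is not None
        for q, p in zip(QUERIES, params)
    )
```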

5.2. Classification Assessment (Layer 2)

Classification assessment evaluates Helena ACMG class output against multi-source ground truth.

  1. For each target variant, all available reference sources are documented (peer laboratory class, ClinVar entry, VCEP curation if any, functional studies in literature, population evidence).

  2. The Helena ACMG class is extracted from the classified_variants database for the target variant.

  3. Concordance is computed independently against each reference source per the Section 2.3 levels.

  4. For PARTIAL and DISCORDANT cases (Helena vs any reference), ACMG criteria differences are documented in detail. The methodology root cause is categorized as: (a) ClinGen SVI 2020+ stricter application by Helena, (b) evidence access difference, (c) classifier defect, (d) reference data limitation.

  5. Cases categorized as classifier defect are escalated as Change Requests with full traceability.

  6. Multi-source agreement is determined: where Helena disagrees with the peer laboratory but agrees with multi-source evidence (VCEP, ClinVar consensus, functional), this is documented as Helena-aligned-with-broader-evidence.

5.3. Clinical Performance Assessment (Layer 3)

Clinical Performance assessment evaluates platform behavior on clinical workflow dimensions. It is conducted by a clinical reviewer (with COI disclosure per Section 8 where applicable).

  • Q1 (Tier Placement): Reviewer records Phenotype Matching tier assignment for target variant. Rated PASS (Tier 1), ACCEPTABLE (Tier 2), FAIL (Tier 3 or 4).

  • Q2 (Phenotype Match Score): Reviewer records percentage match score. Rated HIGH (at least 50 percent), MODERATE (20 to 49 percent), LOW (less than 20 percent).

  • Q3 (Screening Rank): Reviewer records ordinal position of target variant in screening service output. Rated TOP-5, TOP-10, TOP-20, OUTSIDE-20.

  • Q4 (AI Report Quality): Reviewer reads AI-generated clinical report. Rated Q4a (variant mentioned: YES/NO), Q4b (interpretation correct: YES/PARTIAL/NO), Q4c (overall quality: GOOD/ACCEPTABLE/POOR). Any factual hallucination triggers AI Faithfulness flag.

5.4. Methodological Characterization of Divergences

For all PARTIAL and DISCORDANT cases, the following characterization is mandatory and constitutes the substantive scientific output of the study:

  • Side-by-side ACMG criteria table (Helena vs each reference source).

  • Identification of which Helena-applied or Helena-withheld criteria reflect ClinGen SVI 2020+ specifications (PP5 deprecation, PS4 single-case caution, calibrated computational thresholds, BS2 inheritance-aware thresholds, PVS1 pext/expression evidence requirement, etc.).

  • Independent multi-source evidence summary (VCEP, ClinVar consensus, functional, population, segregation).

  • Statement on whether disagreement falls within published inter-laboratory disagreement benchmarks.

  • Disposition: methodological divergence (Helena and reference apply different but defensible methodologies), classifier defect (Helena error to be remediated), reference limitation (Helena correct, reference applies less stringent standard), or evidence-genuinely-uncertain (current evidence does not unambiguously support one classification).

6. Reference Source Standards

6.1. VCEP-Curated Variants

Highest-strength reference source. Sourcing procedure:

  • ClinVar database queried for variants with review_status of reviewed by expert panel (3-star) or practice guideline (4-star).

  • Variants filtered by gene of interest and disease context.

  • Selected variants documented with sourcing date and ClinVar Variation ID.

  • Periodic re-verification: At each cohort initiation, all included VCEP variants re-verified for current 3-star+ status. ClinVar VCEP classifications can change over time.
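The re-verification step above is a filter on the freshly fetched review_status. A sketch, using ClinVar's published review_status strings for the 3-star and 4-star levels (the record structure and function name are illustrative):

```python
# ClinVar review_status values corresponding to 3-star and 4-star entries.
EXPERT_STATUSES = {"reviewed by expert panel", "practice guideline"}

def reverify(variants):
    """variants: list of dicts with 'variation_id' and the CURRENT
    'review_status' re-fetched at cohort initiation (field names are
    illustrative). Returns (retained, demoted): variants still at 3-star+
    status, and variants whose VCEP status has lapsed or changed."""
    retained = [v for v in variants if v["review_status"] in EXPERT_STATUSES]
    demoted = [v for v in variants if v["review_status"] not in EXPERT_STATUSES]
    return retained, demoted
```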

6.2. ClinVar Aggregated Submissions

Useful supplementary reference. Used at 2-star (multiple submitters, criteria provided, no conflicts) where 3-star+ unavailable.

  • Disagreement among ClinVar submitters (conflicting interpretations) documented as evidence landscape feature, not used as definitive ground truth.

  • Star rating preserved in all reports for transparency.

6.3. Peer Reference Laboratory

In Real-World Performance Studies under this Framework, the peer reference laboratory serves as a comparator. The relationship structure is:

  • Peer laboratory classifications are designated as "peer reference comparator" in all study documentation. The term "reference standard" is reserved for VCEP and adjudicated multi-expert sources.

  • Inter-laboratory concordance is reported as an independent metric and benchmarked against published multi-laboratory studies (Amendola 2016, Harrison 2017, Bergquist 2025).

  • Disagreements between Helena and the peer laboratory are characterized methodologically (Section 5.4). Where independent multi-source evidence supports Helena classification, this is documented; where multi-source evidence supports the peer laboratory, this is also documented.

  • Where peer laboratory personnel serve in Helena study roles (e.g., as case operator), conflicts of interest are managed under Section 8 with appropriate disclosure and methodological independence safeguards.

  • Communication with the peer laboratory regarding individual case disagreements remains the responsibility of Helena Bioinformatics. Methodological characterization documents may be shared with the peer laboratory for their own quality program.

6.4. Functional and Population Evidence

Triangulation evidence sources used in characterization but not as standalone ground truth:

  • Functional studies: PMID-cited published functional assay results. Used in PVS1, PS3, BS3 evidence assessment.

  • Population frequency: gnomAD v4.1 allele frequencies, popmax frequencies. Used in PM2, BA1, BS1 evidence assessment.

  • Constraint metrics: pLI, oe LoF, LOEUF, missense Z, regional pext. Used in PVS1 evidence strength modulation.

  • Segregation: published or available family segregation data. Used in PP1 evidence assessment when source data is reliable.

7. Reporting Standards

7.1. Per-Case Performance Report

Every case in every Real-World Performance Study produces a Per-Case Performance Report following the structure below. This structure is consistent with the Per-Case Concordance Reports already produced for Cohort 1 and Cohort 2; section additions reflect this Framework's emphasis on multi-source characterization.

  1. Case Header and Performance Summary: Patient ID (anonymized), gene, variant identifiers, clinical diagnosis, HPO terms, session ID, classifier version, report date. Performance summary statement and concordance outcomes against each reference source.

  2. Multi-Source Comparison Table: Side-by-side comparison of the Helena classification against each available reference (peer laboratory, ClinVar with star rating, VCEP if any, published literature). ACMG criteria, HGVS, consequence, zygosity, and gnomAD AF documented.

  3. Evidence Review: ClinVar entries with review_status detail, in silico predictions with calibrated thresholds, population frequency analysis, gene constraint metrics, sequencing quality metrics, published case reports if any, functional evidence if any.

  4. Layer 1 Assessment (Detection): P1, P2 results. Detection failure root cause if applicable.

  5. Layer 2 Assessment (Classification): P3 result against each reference source. Methodological characterization for any PARTIAL/DISCORDANT outcome. Specific identification of ClinGen SVI 2020+ application differences.

  6. Layer 3 Assessment (Clinical Performance): Q1, Q2, Q3, Q4 results with rationale. AI Faithfulness check explicitly noted.

  7. Diagnostic Relevance Assessment: Comparative clinical utility analysis of Helena versus reference. Documented per intended use.

  8. Platform Issues Identified: Numbered list of any defects, limitations, or improvement opportunities. Each linked to an existing or proposed Change Request.

  9. COI Statement: Disclosure of any conflict of interest applicable to the operator, reviewer, or other contributor to this case. Mitigation applied.

  10. Conclusions and Methodological Disposition: Final disposition (methodological divergence / classifier defect / reference limitation / evidence-genuinely-uncertain). Action items if any. Responsible authority sign-off.

7.2. Cohort-Level Real-World Performance Study Report

At cohort completion, Helena Bioinformatics produces a Real-World Performance Study Report. This is the primary scientific output of the study and is structured to be peer-publishable.

  • Title and Abstract: Cohort identifier, classifier version, key findings.

  • Introduction: Background on Helena, classifier methodology (ACMG/AMP 2015 with ClinGen SVI specifications), study purpose.

  • Methods: Cohort selection criteria, processing pipeline, classifier version, reference sources used, comparison methodology, COI disclosure.

  • Results - Layer 1: Detection performance with attention to upstream pipeline failures.

  • Results - Layer 2: Classification concordance against each reference source. Inter-laboratory concordance with peer laboratory benchmarked against published literature. Multi-source agreement analysis.

  • Results - Layer 3: Clinical workflow metrics.

  • Discussion: Methodological characterization of PARTIAL/DISCORDANT cases. Identified defects and remediation status. Limitations. Comparison to published benchmarks.

  • Conclusions: Study scope and findings stated honestly. Foundational evidence framing for downstream regulatory work.

  • Appendices: All Per-Case Performance Reports, classifier version manifest, reference database snapshots, COI Disclosure Registry.

7.3. Identified Defects and Change Request Tracking

All identified defects, methodology gaps, and improvement opportunities are entered into a centralized Change Request log. Log entries contain: discovery date, source case(s), description, severity, assigned Change Request identifier, status, target resolution, post-remediation re-validation requirement.
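A log entry carrying the fields listed above might be modeled as follows; the class, field names, and severity vocabulary are illustrative assumptions, not Helena's tracking schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ChangeRequest:
    """One Change Request log entry per Section 7.3 (illustrative schema)."""
    cr_id: str                       # assigned Change Request identifier
    discovery_date: date
    source_cases: list               # case identifiers that surfaced the defect
    description: str
    severity: str                    # e.g. "minor" / "major" / "critical"
    status: str = "OPEN"
    target_resolution: str = ""
    revalidation_required: bool = True  # post-remediation re-validation flag

    def close(self, resolution: str) -> None:
        """Mark remediated; re-validation stays flagged until performed."""
        self.target_resolution = resolution
        self.status = "CLOSED"
```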

7.4. Optional Academic Peer Review

Real-World Performance Study Reports may optionally be submitted to one or more independent academic clinical geneticists for pre-publication peer review. This is not formal regulatory adjudication but standard scientific peer review practice. Reviewer feedback is documented and addressed in the final report. Reviewers acknowledged in the report. Reviewer COI disclosed in line with Section 8.

8. Conflict of Interest Management

8.1. Position

Helena Bioinformatics maintains a transparent COI disclosure framework aligned with academic publication standards (CIOMS, ICH GCP, Helsinki Declaration) and consistent with downstream regulatory requirements (ISO 15189:2022 Section 7.2, IVDR conformity assessment expectations). Disclosure is mandatory; mitigations are tailored to study type and organizational stage.

8.2. COI Categories

Family

Personal, marital, or family relationship between an individual and Helena leadership.

Default management: Disclosure mandatory. Excluded from formal adjudication. Permitted in advisory or scientific consultation roles.

Employment

Current or recent (within 24 months) employment with Helena Bioinformatics or with reference laboratories used in study.

Default management: Disclosure mandatory. Specific role restrictions apply depending on study type and cohort context.

Professional Affiliation

Past collaboration, consulting, signatory, or mentorship relationship with a reference laboratory or with Helena Bioinformatics, falling short of formal employment.

Default management: Disclosure mandatory. Case-by-case mitigation. Includes individuals serving as external clinical signatories on reference laboratory reports.

Financial

Equity ownership, options, royalties, advisory fees, or other material financial interest in Helena Bioinformatics or in any organization with a material business relationship to the study outcome.

Default management: Disclosure mandatory. Does not categorically disqualify but is disclosed in all reports.

Business

Material business relationship between an individual's primary affiliation and Helena Bioinformatics.

Default management: Disclosure mandatory. Role restrictions may apply.

8.3. Disclosure Procedure

All individuals serving in the following roles complete a COI Disclosure Form before commencing role activities:

  • Case operator (Real-World Performance Study)

  • Clinical reviewer (Layer 3 evaluation)

  • Academic peer reviewer (pre-publication)

  • Scientific Advisor

  • External methodology consultant

The COI Disclosure Form (HEL-COI-DISCLOSURE-FORM-v1.0) is maintained as a separate template document. Completed forms are retained in the Helena document repository for the duration of the engagement plus 15 years.

8.4. Disclosure Registry - Current State

As of the v1.0 effective date, the following roles have disclosed conflicts of interest. Disclosures and mitigations apply to all studies conducted under this Framework. Individual names are documented in the internal Disclosure Registry and disclosed in any per-case report or publication where the individual's contribution is referenced.

Founder and Chief Executive Officer

Sole technical and scientific lead in current organizational stage.

Category: Financial (sole owner), Employment

Disclosure: Owns 100 percent of Helena Bioinformatics. Performs all internal Helena roles in current organizational stage.

Mitigation: Disclosed in all Real-World Performance Study Reports. The multi-source ground truth methodology mitigates reliance on a single author's classification judgments. Future organizational growth will introduce role separation.

Scientific Advisor

Methodology consultation and scientific guidance.

Category: Family (relationship to founder)

Disclosure: Family relationship to Helena founder.

Mitigation: Excluded from formal adjudication. Role limited to Scientific Advisor (methodology consultation, framework review, scientific guidance). Family relationship disclosed in any report where input is referenced.

Cohort 2 case operator

Real-World Performance Study case processing.

Category: Employment (peer reference laboratory)

Disclosure: Employed by the peer reference comparator laboratory used in Cohort 2.

Mitigation: Disclosed in the Cohort 2 Real-World Performance Study Report. The operator role is limited to technical processing and comparison documentation and carries no classification authority. The multi-source ground truth methodology mitigates single-source dependency. Future cohorts will introduce an independent operator structure.

External clinical signatory

Historical signatory on peer reference laboratory clinical reports referenced in Cohort 2. Potential future Helena Scientific Advisor or methodology consultant.

Category: Professional Affiliation

Disclosure: Has signed peer reference laboratory clinical reports (including those referenced in Cohort 2) as external clinical signatory. Not a peer reference laboratory employee.

Mitigation: Cannot serve as adjudicator on cases previously signed. May serve as Helena Scientific Advisor or methodology consultant for non-overlapping cohorts. Disclosed in any report where input is referenced.

8.5. Mitigation Strategies

For each disclosed COI, one or more mitigations apply:

  • Role exclusion: Individual excluded from specific roles where the COI creates bias that cannot otherwise be mitigated.

  • Multi-source triangulation: Single-source dependency mitigated by multi-source ground truth methodology (Section 2.2).

  • Transparent disclosure: COI disclosed in all relevant documentation.

  • Procedural separation: Individual operates under documented procedural controls limiting influence.

  • Role limitation: Individual restricted to roles where COI does not create methodological dependency on outcome.

  • Periodic review: COI status reviewed annually.

8.6. Annual Review

Disclosure Registry reviewed annually. Off-cycle review triggered by:

  • New individual engagement requiring COI disclosure.

  • Material change in individual affiliations, financial interests, or family circumstances.

  • Material change in Helena Bioinformatics organizational structure.

  • External inquiry regarding COI.

9. Path to Future Regulatory-Grade Validation

9.1. Phased Pathway

This Framework operates within Phase 1 of an explicitly phased pathway toward eventual regulatory submission. Phase progression is conditional on organizational growth and is not pursued at the expense of current activities.

Phase 1 - Foundation (Current)

Organizational stage: Founder-led. No internal Quality, Regulatory, or Clinical functions; activities rely on external collaboration where required.

Permitted activities:

  • Real-World Performance Studies
  • Methodology Validation Studies (database-derived)
  • Scientific publications
  • Foundational evidence accumulation
Phase 2 - Growth

Organizational stage: Small team (3 to 10). Series A or strategic partnership. Dedicated Quality and Regulatory functions.

Permitted activities:

  • Phase 1 activities continue
  • First Pivotal Cohort possible
  • Multi-cohort program execution
  • Multi-adjudicator panel introduced
Phase 3 - Pre-submission

Organizational stage: Established team (10+). Mature QMS. Notified Body engagement.

Permitted activities:

  • Pivotal Cohorts at full composition
  • Multi-site validation
  • Notified Body conformity assessment submission
  • Post-market performance follow-up planning

9.2. Differences in Future Phases

Activities reserved for Phase 2 and Phase 3 (and not pursued in current Phase 1) include:

  • Formal external adjudication panels (multi-adjudicator, paid engagement)

  • Independent operator structure (no peer laboratory affiliations)

  • Pivotal Cohort composition (60/30/10 positive/negative/edge)

  • Multi-site validation across independent laboratories

  • Pre-submission Notified Body consultations

  • Risk Management File at full ISO 14971 detail

  • Quality Management System at full ISO 13485 detail

9.3. Foundational Evidence Carryforward

Real-World Performance Studies conducted under this Framework produce foundational evidence that is carried forward to future Pivotal submissions. Specifically:

  • Performance characterization data is cited as preliminary evidence supporting platform readiness.

  • Identified defects and remediation history demonstrate active quality management posture.

  • Methodology rationale established in published literature provides scientific defense for classifier choices.

  • Multi-source agreement analysis demonstrates that Helena classifications align with broader expert consensus where peer-reference disagreements occur.

The outputs of this Framework are therefore not a stopgap substitute for a premature regulatory submission. They are appropriate-stage scientific evidence with lasting value across the regulatory pathway.

10. Document Control

10.1. Version History

v1.0 - April 2026

Initial release of reframed framework. Replaces archived Helena Clinical Validation Framework v1.0 and v1.1. Reframes scope from regulatory validation to Real-World Performance Studies appropriate to current organizational stage. Establishes multi-source ground truth methodology, methodological characterization (rather than formal adjudication), academic peer review, and explicit phased pathway to future regulatory submission.

Author: Helena Bioinformatics

10.2. Change Control

Living document. Revision triggered by:

  • Methodology refinement during cohort execution.

  • Updates to applicable scientific standards (ACMG, ClinGen SVI, ClinVar review tiers).

  • External peer review feedback on Framework or Study Reports.

  • Material change in organizational stage (Phase 1 to 2 to 3 transition).

  • Material change in COI Disclosure Registry.

  • Internal periodic review (minimum annual).

Minor revisions issued as v1.x. Major revisions (e.g., transition to Pivotal Cohort framework) issued as v2.0+.

10.3. Document Retention

All versions retained indefinitely as part of Helena Bioinformatics records. Minimum 15 years retention applied for downstream regulatory traceability.

10.4. Approval

This Framework v1.0 is approved by Helena Bioinformatics on the date stated in Section 10.1.

Approved by: Helena Bioinformatics
Date of approval: April 2026
Document ID: HEL-RWP-FRAMEWORK-v1.0
Distribution: Internal - Helena Bioinformatics. Provided to academic peer reviewers, scientific collaborators, and prospective regulatory consultants upon engagement.

End of Framework v1.0