Phenotype Matching

Standard HPO | Whole-genome scale | Second-level latency

ACMG classification tells you how pathogenic a variant is. It does not tell you which variant explains your patient. Phenotype Matching answers that second question, turning clinical relevance into a reproducible score and a five-level priority hierarchy aligned with how a geneticist actually reads a case.

Phenotype-first prioritization. A VUS with strong phenotypic relevance ranks above a Pathogenic variant for an unrelated condition. This is the correct clinical behaviour, and it is built into the system, not deferred to manual triage.

Clinical Positioning

After ACMG classification of a whole-genome or exome case, the geneticist is left with hundreds to thousands of variants flagged as Pathogenic, Likely Pathogenic, or VUS. Most are irrelevant to the referral indication.

The Real Diagnostic Question

A Pathogenic BRCA1 variant has no diagnostic value for a child referred for refractory epilepsy. Manually correlating each candidate against the patient's clinical presentation traditionally consumes the better part of a working week per case, and that effort scales poorly across cohorts and trio analyses.

Phenotype Matching automates the correlation: it reads the patient's clinical presentation as HPO terms, compares those terms semantically to the phenotypic spectrum of every gene carrying a candidate variant, and assigns each variant a phenotype match score, a clinical priority score, and a clinical tier. Results are stored alongside the variant data, so every downstream step (screening, AI assistance, reporting) sees a single coherent picture.
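As an illustration of what a semantic comparison over an ontology hierarchy can look like, here is a minimal sketch. It is not Helena's scoring method, which this page does not specify. It assumes a Jaccard overlap of ancestor sets and a best-match average, and it uses a toy ontology with placeholder identifiers rather than real HPO terms.

```python
# Toy ontology: child -> parents. Placeholder IDs, NOT real HPO terms.
PARENTS = {
    "T:root": [],
    "T:neuro": ["T:root"],
    "T:seizure": ["T:neuro"],
    "T:focal_seizure": ["T:seizure"],
    "T:cardiac": ["T:root"],
    "T:arrhythmia": ["T:cardiac"],
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def term_similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed similarity measure)."""
    set_a, set_b = ancestors(a), ancestors(b)
    return len(set_a & set_b) / len(set_a | set_b)


def phenotype_match(patient_terms, gene_terms):
    """Pair each patient term with its closest gene term, average, scale to 0-100."""
    if not patient_terms or not gene_terms:
        return 0.0
    best = [max(term_similarity(p, g) for g in gene_terms) for p in patient_terms]
    return round(100 * sum(best) / len(best), 2)
```

With this toy data, a patient term for a focal seizure scores far higher against a gene annotated with seizures than against one annotated with arrhythmia, because the first pair shares most of its ancestors and the second shares only the root.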

Built-in clinical principle: a P/LP variant without phenotype match is reported as an Incidental Finding, not as Tier 1. A VUS with strong phenotype relevance can outrank a P/LP variant for an unrelated disease. This mirrors how a careful geneticist actually reads a case.

Workflow for the Geneticist

The service is invoked through the Helena interface. The conceptual flow is the same regardless of access path.

1. Phenotype Capture

The geneticist captures the patient's clinical findings as HPO terms. Three entry modes are supported: direct HPO ontology search with name and synonym autocomplete; AI-assisted extraction from free-text referral letters or discharge summaries with automatic exclusion of negated phrases; and direct lookup by HP:ID. Free-text clinical notes can be attached to the case.

2. Run Matching

The service reads the patient's HPO terms together with the variant data already produced by upstream classification. Phenotype relevance is computed for every variant in scope, tiering rules are applied, and results are written back to the case alongside the existing variant annotations.

3. Review

Results are presented gene-first. Each gene card shows the best tier achieved by any variant in that gene, the strongest phenotype match score, the count of variants per tier, and which patient HPO terms the gene matched on. Expanding a gene reveals every variant with full annotation context and a per-term breakdown of how each patient HPO term aligned with the gene's phenotypic spectrum.

4. Report

A branded PDF report can be downloaded for the case. The report enumerates Tier 1, Tier 2, and Incidental Findings in detail, with summary counts for Tier 3 and Tier 4. The report contains factual phenotype matching data only, with no AI-generated interpretation, and is designed to be reviewed and signed off by a clinical geneticist before any clinical action.
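The gene-first view in the Review step amounts to a roll-up of per-variant results into one card per gene. The sketch below shows one way to do that. The field names (`gene`, `tier`, `phenotype_score`, `matched_terms`) are hypothetical, and ordering "IF" between Tier 2 and Tier 3 is an assumption taken from the score ranges in the tier table.

```python
from collections import Counter

# Assumed ordering, best first, following the score ranges of the tiers.
TIER_ORDER = ["Tier 1", "Tier 2", "IF", "Tier 3", "Tier 4"]


def build_gene_cards(variants):
    """Roll per-variant results up into one summary card per gene."""
    cards = {}
    for v in variants:
        card = cards.setdefault(v["gene"], {
            "gene": v["gene"],
            "best_tier": v["tier"],
            "best_score": v["phenotype_score"],
            "tier_counts": Counter(),
            "matched_terms": set(),
        })
        # Keep the best tier reached by any variant in the gene.
        if TIER_ORDER.index(v["tier"]) < TIER_ORDER.index(card["best_tier"]):
            card["best_tier"] = v["tier"]
        card["best_score"] = max(card["best_score"], v["phenotype_score"])
        card["tier_counts"][v["tier"]] += 1
        card["matched_terms"].update(v["matched_terms"])
    # Best tier first, then strongest phenotype match.
    return sorted(
        cards.values(),
        key=lambda c: (TIER_ORDER.index(c["best_tier"]), -c["best_score"]),
    )
```

Each card carries the four things the gene card displays: best tier, strongest match score, per-tier variant counts, and the patient terms the gene matched on.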

The Five-Tier System

Each variant is placed into exactly one tier. Tier ranges do not overlap, so a variant's score alone determines its tier.

Tier | Label | Score Range | Clinical Meaning
Tier 1 | Actionable | 80.00 - 99.99 | Pathogenic or Likely Pathogenic with strong phenotype match. The variant most likely explaining the patient's presentation.
Tier 2 | Potentially Actionable | 60.00 - 79.99 | VUS with strong supporting evidence (high impact, strong phenotype match, rare). Or P/LP variants where inheritance context warrants further investigation before calling them diagnostic.
IF | Incidental Finding | 40.00 - 59.99 | Pathogenic or Likely Pathogenic, but not relevant to the referral phenotype. Clinically important as a secondary finding, but separated from the primary diagnostic search.
Tier 3 | Uncertain | 20.00 - 39.99 | VUS with moderate evidence. Worth tracking; insufficient for clinical action.
Tier 4 | Unlikely | 0.00 - 19.99 | Benign, Likely Benign, common, or with poor phenotype relevance.
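Because the ranges do not overlap, mapping a score to a tier is a simple threshold lookup. The sketch below uses the boundaries from the table; the function and constant names are illustrative and not part of the Helena API.

```python
# Lower bound of each tier, highest first (boundaries taken from the tier table).
TIER_FLOORS = [
    (80.0, "Tier 1", "Actionable"),
    (60.0, "Tier 2", "Potentially Actionable"),
    (40.0, "IF", "Incidental Finding"),
    (20.0, "Tier 3", "Uncertain"),
    (0.0, "Tier 4", "Unlikely"),
]


def tier_for_score(score):
    """Return (tier, label) for a clinical priority score in [0, 100)."""
    if not 0.0 <= score < 100.0:
        raise ValueError(f"score out of range: {score}")
    for floor, tier, label in TIER_FLOORS:
        if score >= floor:
            return tier, label
    # Unreachable: the last floor is 0.0 and the score is validated above.
    raise AssertionError("no tier matched")
```

For example, `tier_for_score(79.99)` returns `("Tier 2", "Potentially Actionable")`, and `tier_for_score(80.0)` returns `("Tier 1", "Actionable")`.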

Sorting Within a Tier

Within each tier, finer-grained sorting is driven primarily by phenotype match strength, with ACMG classification and population frequency contributing secondary signal. The geneticist always sees the strongest candidates first within each clinical category.
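One way to picture this ordering is a sort key that puts phenotype match first and uses the other two signals as tie-breakers. This is a sketch under assumptions: the real service may blend the signals into a weighted score rather than sort lexicographically, and the `Variant` fields and `ACMG_RANK` mapping are hypothetical names.

```python
from dataclasses import dataclass

# Lower rank sorts first (assumed ordering of ACMG classes).
ACMG_RANK = {
    "Pathogenic": 0,
    "Likely Pathogenic": 1,
    "VUS": 2,
    "Likely Benign": 3,
    "Benign": 4,
}


@dataclass
class Variant:
    name: str
    phenotype_score: float  # 0-100, higher is a stronger match
    acmg: str               # one of the ACMG_RANK keys
    gnomad_af: float        # population allele frequency, lower is rarer


def sort_within_tier(variants):
    """Strongest phenotype match first; ACMG class, then rarity, break ties."""
    return sorted(
        variants,
        key=lambda v: (-v.phenotype_score, ACMG_RANK[v.acmg], v.gnomad_af),
    )
```

Two variants with the same phenotype score are ordered by ACMG class, and two with the same score and class are ordered by rarity.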

Conservative Tier 1 and Tier 2

The service is deliberately conservative about the top tiers. A variant cannot reach Tier 1 or Tier 2 on impact alone. Phenotype relevance is required, and benign evidence (ClinVar Benign with curator review, or strong benign ACMG criteria) blocks elevation regardless of phenotype score.
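The gating rule can be read as a two-part check. In the sketch below, the function name, the parameter names, and the caller-supplied `min_phenotype_score` threshold are all assumptions; this page does not state the actual cut-off or what tier a blocked variant falls back to.

```python
def can_reach_top_tiers(
    phenotype_score,
    min_phenotype_score,
    clinvar_benign_reviewed=False,
    strong_benign_acmg=False,
):
    """Return True only if a variant is eligible for Tier 1 or Tier 2.

    Benign evidence is checked first, so it wins over any phenotype score.
    """
    if clinvar_benign_reviewed or strong_benign_acmg:
        return False
    # Impact alone is never enough: phenotype relevance is mandatory.
    return phenotype_score >= min_phenotype_score
```

A variant with a phenotype score of 95 is still blocked when `clinvar_benign_reviewed=True`, which is the "regardless of phenotype score" behaviour described above.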

Why Phenotype-First Matters Clinically

Most variant prioritization tools start from the variant and ask: is this pathogenic? They sort the patient's variants by ACMG class and let the clinician filter from there. The result is a Tier 1 list dominated by genuinely pathogenic variants in genes that have nothing to do with the referral.

The Reversed Question

Phenotype Matching reverses the question. It starts from the patient and asks: given this clinical picture, which variants deserve attention first? The output reflects what an experienced geneticist actually does implicitly when reading a case, now made explicit, reproducible, and consistent across cases and reviewers.

A relevant VUS is not buried

A VUS in a gene whose phenotypic spectrum closely matches the patient's presentation appears in Tier 2, not on page seven of a flat pathogenicity-ordered list. This is where many genuine diagnostic discoveries actually live.

An unrelated pathogenic variant is not noise

A Pathogenic variant for a condition unrelated to the referral indication is preserved as an Incidental Finding: visible, scored, and reportable as a secondary finding when clinically appropriate, but not competing for attention with the variant that actually explains the case.

Inheritance is respected

A heterozygous pathogenic variant in a gene that causes disease only in the homozygous or compound heterozygous state is treated differently from one in a dominant gene. Carrier status is not a diagnosis. Dual-inheritance genes are flagged so the geneticist can assess context.
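A simplified sketch of this logic follows. The mode codes (AD, AR, XLD, XLR) match the Orphanet inheritance context listed under the pipeline inputs; the function name, the zygosity strings, and the returned labels are hypothetical, and real X-linked handling depends on patient sex, which is omitted here.

```python
DOMINANT = {"AD", "XLD"}
RECESSIVE = {"AR", "XLR"}


def inheritance_note(gene_modes, zygosity, has_second_hit=False):
    """Classify a pathogenic variant by inheritance context.

    gene_modes: set of inheritance modes known for the gene.
    zygosity: "het", "hom", or "hemi".
    has_second_hit: True if another candidate variant exists in the same gene
        (a possible compound heterozygote).
    """
    modes = set(gene_modes)
    if modes & DOMINANT and modes & RECESSIVE:
        # Gene causes disease under both models: leave the call to the reviewer.
        return "dual-inheritance"
    if modes & RECESSIVE and zygosity == "het" and not has_second_hit:
        # A single heterozygous hit in a recessive gene is carrier status.
        return "carrier"
    return "consistent"
```

A lone heterozygous variant in an AR-only gene returns `"carrier"`, while the same variant with a second hit in the gene returns `"consistent"` and remains a compound heterozygous candidate.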

Scale and Performance

The service is engineered for whole-genome data. A typical WGS case carries hundreds of thousands of HPO-annotated variants. The matching pipeline returns full results in single-digit seconds for typical cases.

Pre-warmed Parallelism

Worker processes are initialized at service startup with the HPO ontology already loaded and caches primed. Incoming requests do not pay the cold-start cost.

Scale-aware Computation

The pipeline recognizes that variants in the same gene share the same phenotypic context, and exploits that structure to avoid redundant work. The end user sees per-variant results; the engine does dramatically less work to produce them.
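The idea of sharing work across variants in the same gene can be sketched as a per-gene cache. This is an illustration of the principle, not the engine's actual implementation; `score_variants` and the `gene_scorer` callback are hypothetical names.

```python
def score_variants(variants, gene_scorer):
    """Assign each variant its gene's phenotype score, scoring each gene once.

    variants: iterable of (variant_id, gene) pairs.
    gene_scorer: function mapping a gene symbol to a phenotype score.
    """
    gene_cache = {}
    results = {}
    for variant_id, gene in variants:
        if gene not in gene_cache:
            # The expensive ontology comparison runs once per gene.
            gene_cache[gene] = gene_scorer(gene)
        results[variant_id] = gene_cache[gene]
    return results
```

With this structure the number of expensive comparisons scales with the number of distinct genes rather than the number of variants, while the caller still receives one result per variant.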

Optimized Frontend Rendering

Gene-level summaries are exported as a small compressed payload that streams in near-instantly when the case is opened. The heavier per-gene variant detail is fetched on demand only when the geneticist expands a gene. The interface stays responsive even on cases with hundreds of thousands of variants.

Inputs and Outputs

What the service consumes from the upstream pipeline and from the geneticist, and what it produces for review and downstream use.

Inputs from the Pipeline

Completed variant analysis session

ACMG/AMP classification per variant

Gene and HPO annotation

Population frequencies (gnomAD)

ClinVar context and review status

Inheritance context (Orphanet AD/AR/XLD/XLR, ClinGen dosage)

Inputs from the Geneticist

List of HPO terms representing the patient's clinical presentation

Optional free-text clinical notes that travel with the case

Outputs for the Geneticist

Phenotype match score (0-100) and clinical tier for every variant with HPO data

Gene-level summary view ranked by clinical priority

Per-variant breakdown of which patient HPO terms matched, against what, and how strongly

Inheritance context notes (e.g., dual-inheritance genes, recessive carrier flags)

Downloadable branded PDF report with Tier 1, Tier 2, and Incidental Findings detail

Outputs for Downstream Services

Phenotype tier and score data consumed by the Screening Service for tier-aware variant scoring

Phenotype context consumed by the AI Service for clinical interpretation reports

Standards and Boundaries

The service operates against published standards and within explicit clinical boundaries.

HPO

The Human Phenotype Ontology is the international standard for structured clinical phenotype representation in genetics. The service operates entirely in HPO terms with full ontology hierarchy support.

Reference: Köhler et al., Nucleic Acids Research, 2021, PMID: 33264411

ACMG/AMP

Variant classification follows the ACMG/AMP 2015 guidelines with subsequent ClinGen specifications. Classification itself is performed upstream by the Variant Analysis Service. Phenotype Matching consumes that classification; it does not reclassify.

Reference: Richards et al., Genetics in Medicine, 2015, PMID: 25741868

Reporting Boundary

The service produces factual phenotype matching data. It does not generate clinical interpretations, does not make diagnostic calls, and does not replace clinical review. All output is intended for review and sign-off by a qualified clinical geneticist before any clinical action.

Data Residency

The service runs within the Helena platform on EU-based infrastructure compliant with GDPR Article 9 (special category data) and 1+MG technical requirements. Patient phenotype data and variant data remain inside the platform.

What Sets It Apart

Six design choices that make Phenotype Matching distinct from generic variant prioritization tools.

Phenotype-first by design

Not by post-hoc filtering. The tier system encodes the clinical principle that relevance to the referral phenotype matters more than abstract pathogenicity. A relevant VUS can outrank an unrelated Pathogenic variant, and this is the correct clinical behaviour.

Whole-genome scale, second-level latency

A typical WGS case carries hundreds of thousands of HPO-annotated variants. Full results (phenotype scores, tier assignment, and per-term breakdowns for every variant) return in single-digit seconds. Suitable for routine diagnostic workflows, not research-only batch processing.

Transparent, decomposable scoring

Every score decomposes into phenotype, ACMG, and frequency components. Every match decomposes into per-HPO-term contributions. The geneticist always sees why a variant ranked where it did, not just the final number.

Inheritance-aware throughout

Heterozygous findings in recessive genes, dual-inheritance genes, and compound heterozygous candidates are handled distinctly rather than collapsed into a single pathogenic-or-not flag.

Conservative on Tier 1 and Tier 2

Strong benign evidence (ClinVar Benign with curator review, BA1, or strong benign ACMG criteria) blocks tier elevation regardless of phenotype score. Tier 1 is reserved for variants that genuinely warrant clinical action.

Clinically reviewable output

Reports separate Tier 1, Tier 2, and Incidental Findings explicitly. The secondary-findings conversation is never conflated with the primary diagnostic search.

See Phenotype Matching in Practice

Request a demo to see Helena process a real case, from HPO entry to a tiered shortlist of variants ranked by relevance to the patient's presentation.
