Analysis Workflow
Phenotype Matching
ACMG classification tells you how pathogenic a variant is. It does not tell you which variant explains your patient. Phenotype Matching answers that second question, turning clinical relevance into a reproducible score and a five-level priority hierarchy aligned with how a geneticist actually reads a case.
Phenotype-first prioritization. A VUS with strong phenotypic relevance ranks above a Pathogenic variant for an unrelated condition. This is the correct clinical behaviour, and it is built into the system, not deferred to manual triage.
Clinical Positioning
After ACMG classification of a whole-genome or exome case, the geneticist is left with hundreds to thousands of variants flagged as Pathogenic, Likely Pathogenic, or VUS. Most are irrelevant to the referral indication.
The Real Diagnostic Question
A Pathogenic BRCA1 variant has no diagnostic value for a child referred for refractory epilepsy. Manually correlating each candidate against the patient's clinical presentation traditionally consumes the better part of a working week per case, and that effort scales poorly across cohorts and trio analyses.
Phenotype Matching automates the correlation: it reads the patient's clinical presentation as HPO terms, compares those terms semantically to the phenotypic spectrum of every gene carrying a candidate variant, and assigns each variant a phenotype match score, a clinical priority score, and a clinical tier. Results are stored alongside the variant data, so every downstream step (analysis, screening, AI assistance, reporting) sees a single coherent picture.
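As a rough illustration of what a semantic comparison between term sets can look like, the sketch below scores a patient's terms against a gene's terms by ancestor overlap in a toy ontology. The similarity measure the service actually uses is not described here; the Jaccard-over-ancestors measure, the toy term names, and the functions `ancestors`, `term_similarity`, and `match_score` are all assumptions made for illustration.

```python
# Toy is-a hierarchy (child -> parents). Illustrative only, not real HPO content.
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "seizure": ["neuro"],
    "focal_seizure": ["seizure"],
    "cardiac": ["root"],
    "arrhythmia": ["cardiac"],
}

def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def term_similarity(a, b):
    """Jaccard overlap of the two ancestor sets; 1.0 for identical terms."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)

def match_score(patient_terms, gene_terms):
    """Mean, over patient terms, of the best similarity to any gene term (0-100)."""
    if not patient_terms or not gene_terms:
        return 0.0
    best = [max(term_similarity(p, g) for g in gene_terms) for p in patient_terms]
    return round(100 * sum(best) / len(best), 2)
```

In this toy setup a patient term that is a close descendant of a gene term still scores well, while a term from an unrelated branch shares only the root and scores low. That is the property an ontology-aware comparison adds over exact term matching.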
Built-in clinical principle: a P/LP variant without phenotype match is reported as an Incidental Finding, not as Tier 1. A VUS with strong phenotype relevance can outrank a P/LP variant for an unrelated disease. This mirrors how a careful geneticist actually reads a case.
Workflow for the Geneticist
The service is invoked through the Helena interface. The conceptual flow is the same regardless of access path.
Phenotype Capture
The geneticist captures the patient's clinical findings as HPO terms. Three entry modes are supported: direct HPO ontology search with name and synonym autocomplete; AI-assisted extraction from free-text referral letters or discharge summaries with automatic exclusion of negated phrases; and direct lookup by HP:ID. Free-text clinical notes can be attached to the case.
Run Matching
The service reads the patient's HPO terms together with the variant data already produced by upstream classification. Phenotype relevance is computed for every variant in scope, tiering rules are applied, and results are written back to the case alongside the existing variant annotations.
Review
Results are presented gene-first. Each gene card shows the best tier achieved by any variant in that gene, the strongest phenotype match score, the count of variants per tier, and which patient HPO terms the gene matched on. Expanding a gene reveals every variant with full annotation context and a per-term breakdown of how each patient HPO term aligned with the gene's phenotypic spectrum.
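The gene card described above can be sketched as a simple aggregation over per-variant results. The field names (`gene`, `tier`, `match_score`, `matched_terms`), the function `summarize_by_gene`, and the tier ordering used for "best tier" are assumptions for illustration, not the service's actual schema.

```python
from collections import Counter, defaultdict

# Assumed ordering from best to worst tier.
TIER_ORDER = ["Tier 1", "Tier 2", "IF", "Tier 3", "Tier 4"]

def summarize_by_gene(variants):
    """Group per-variant results into one summary card per gene."""
    by_gene = defaultdict(list)
    for v in variants:
        by_gene[v["gene"]].append(v)

    cards = []
    for gene, items in by_gene.items():
        cards.append({
            "gene": gene,
            # Best tier achieved by any variant in the gene.
            "best_tier": min((v["tier"] for v in items), key=TIER_ORDER.index),
            # Strongest phenotype match score in the gene.
            "best_match": max(v["match_score"] for v in items),
            # Number of variants per tier.
            "tier_counts": dict(Counter(v["tier"] for v in items)),
            # Patient HPO terms the gene matched on.
            "matched_terms": sorted({t for v in items for t in v["matched_terms"]}),
        })
    # Best tier first, then strongest phenotype match.
    cards.sort(key=lambda c: (TIER_ORDER.index(c["best_tier"]), -c["best_match"]))
    return cards
```

Per-variant detail stays in the input records, so a card can be expanded into its variants without recomputing anything.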
Report
A branded PDF report can be downloaded for the case. The report enumerates Tier 1, Tier 2, and Incidental Findings in detail, with summary counts for Tier 3 and Tier 4. It contains factual phenotype matching data only, with no AI-generated interpretation, and is designed to be reviewed and signed off by a clinical geneticist before any clinical action.
The Five-Tier System
Each variant is placed into exactly one tier. Tier ranges do not overlap, so a variant's clinical priority score alone determines its tier.
| Tier | Label | Score Range | Clinical Meaning |
|---|---|---|---|
| Tier 1 | Actionable | 80.00 - 99.99 | Pathogenic or Likely Pathogenic with strong phenotype match. The variant most likely explaining the patient's presentation. |
| Tier 2 | Potentially Actionable | 60.00 - 79.99 | VUS with strong supporting evidence (high impact, strong phenotype match, rare), or P/LP variants where inheritance context warrants further investigation before they are called diagnostic. |
| IF | Incidental Finding | 40.00 - 59.99 | Pathogenic or Likely Pathogenic, but not relevant to the referral phenotype. Clinically important as a secondary finding, but separated from the primary diagnostic search. |
| Tier 3 | Uncertain | 20.00 - 39.99 | VUS with moderate evidence. Worth tracking; insufficient for clinical action. |
| Tier 4 | Unlikely | 0.00 - 19.99 | Benign, Likely Benign, common, or with poor phenotype relevance. |
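
Because the ranges do not overlap, the mapping from score to tier reduces to a threshold lookup. The sketch below uses the lower bounds from the table; the function name `tier_for_score`, the reading of the Score Range column as the clinical priority score, and the choice to reject values outside 0.00 to 99.99 are assumptions for illustration.

```python
# (lower bound, label) pairs taken from the tier table, highest first.
TIER_BOUNDS = [
    (80.0, "Tier 1"),
    (60.0, "Tier 2"),
    (40.0, "IF"),
    (20.0, "Tier 3"),
    (0.0, "Tier 4"),
]

def tier_for_score(score):
    """Map a score in the table's 0.00-99.99 range to its tier label."""
    if not 0.0 <= score <= 99.99:
        raise ValueError(f"score out of range: {score}")
    for lower, label in TIER_BOUNDS:
        if score >= lower:
            return label
```

For example, `tier_for_score(79.99)` returns `"Tier 2"` and `tier_for_score(80.0)` returns `"Tier 1"`.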
Sorting Within a Tier
Within each tier, finer-grained sorting is driven primarily by phenotype match strength, with ACMG classification and population frequency contributing secondary signal. The geneticist always sees the strongest candidates first within each clinical category.
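One simple way to express "phenotype match first, ACMG class and frequency second" is a lexicographic sort key. The service may blend these signals differently; the strict tie-break order, the `ACMG_RANK` table, and the field names below are assumptions for illustration.

```python
# Hypothetical ranking of ACMG classes, most to least severe.
ACMG_RANK = {
    "Pathogenic": 0,
    "Likely Pathogenic": 1,
    "VUS": 2,
    "Likely Benign": 3,
    "Benign": 4,
}

def sort_within_tier(variants):
    """Order variants by phenotype match (high first), then ACMG class, then rarity."""
    return sorted(
        variants,
        key=lambda v: (-v["match_score"], ACMG_RANK[v["acmg"]], v["allele_freq"]),
    )
```

With this key, two variants with the same match score are ordered by ACMG class, and rarer variants come first when the class is also the same.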
Conservative Tier 1 and Tier 2
The service is deliberately conservative about the top tiers. A variant cannot reach Tier 1 or Tier 2 on impact alone: phenotype relevance is required, and benign evidence (ClinVar Benign with curator review, or strong benign ACMG criteria) blocks elevation regardless of phenotype score.
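The gating rule can be read as a predicate that must hold before a variant is even considered for Tier 1 or Tier 2. The sketch below is a minimal version under stated assumptions: the `Evidence` fields, the `MIN_PHENOTYPE_MATCH` cut-off, and the exact set of blocking criteria codes are hypothetical, and what happens to a blocked variant afterwards is left out.

```python
from dataclasses import dataclass, field

# Hypothetical cut-off for "phenotype relevance"; the real threshold is not stated here.
MIN_PHENOTYPE_MATCH = 50.0
# Benign ACMG/AMP criteria treated as blocking in this sketch
# (BA1 is stand-alone; BS1-BS4 are the strong benign criteria).
BLOCKING_CRITERIA = {"BA1", "BS1", "BS2", "BS3", "BS4"}

@dataclass
class Evidence:
    match_score: float                  # phenotype match score, 0-100
    clinvar_significance: str = ""      # e.g. "Benign", "Pathogenic"
    clinvar_reviewed: bool = False      # assumed flag for curator-reviewed entries
    acmg_criteria: set = field(default_factory=set)

def may_reach_top_tiers(ev):
    """Return True only if Tier 1 or Tier 2 is permitted for this variant."""
    reviewed_benign = ev.clinvar_significance == "Benign" and ev.clinvar_reviewed
    strong_benign = bool(ev.acmg_criteria & BLOCKING_CRITERIA)
    if reviewed_benign or strong_benign:
        return False  # benign evidence blocks elevation regardless of phenotype score
    return ev.match_score >= MIN_PHENOTYPE_MATCH
```

Checking benign evidence before the phenotype score makes the "regardless of phenotype score" guarantee explicit: no match score, however high, can override the block.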
Why Phenotype-First Matters Clinically
Most variant prioritization tools start from the variant and ask: is this pathogenic? They sort the patient's variants by ACMG class and let the clinician filter from there. The result is a Tier 1 list dominated by genuinely pathogenic variants in genes that have nothing to do with the referral.
The Reversed Question
Phenotype Matching reverses the question. It starts from the patient and asks: given this clinical picture, which variants deserve attention first? The output reflects what an experienced geneticist actually does implicitly when reading a case, now made explicit, reproducible, and consistent across cases and reviewers.
A relevant VUS is not buried
A VUS in a gene whose phenotypic spectrum closely matches the patient's presentation appears in Tier 2, not on page seven of a flat pathogenicity-ordered list. This is where many genuine diagnostic discoveries actually live.
An unrelated pathogenic variant is not noise
A Pathogenic variant for a condition unrelated to the referral indication is preserved as an Incidental Finding. It remains visible, scored, and reportable as a secondary finding when clinically appropriate, but it does not compete for attention with the variant that actually explains the case.
Inheritance is respected
A heterozygous pathogenic variant in a gene that causes disease only in the homozygous or compound heterozygous state is treated differently from one in a dominant gene. Carrier status is not a diagnosis. Dual-inheritance genes are flagged so the geneticist can assess context.
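A minimal sketch of how zygosity and gene inheritance mode might combine into a context note is shown below. It covers only autosomal dominant (AD) and recessive (AR) modes; X-linked modes, phasing, and the exact wording of the notes are outside this sketch, and the function `inheritance_note` and its parameters are hypothetical.

```python
def inheritance_note(zygosity, gene_modes, has_second_hit=False):
    """Return a short inheritance context note for one variant.

    zygosity: "het" or "hom".
    gene_modes: set of inheritance modes for the gene, e.g. {"AD"}, {"AR"}, {"AD", "AR"}.
    has_second_hit: True if another candidate variant sits in the same gene.
    """
    if {"AD", "AR"} <= gene_modes:
        # Gene causes disease under both modes: leave the call to the geneticist.
        return "dual-inheritance gene: assess context"
    if gene_modes == {"AR"} and zygosity == "het":
        if has_second_hit:
            return "possible compound heterozygote: confirm phase"
        return "recessive carrier: not diagnostic on its own"
    return "no inheritance flag"
```

A single heterozygous variant in a recessive-only gene is flagged as carrier status rather than treated as a diagnosis, which matches the principle stated above.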
Scale and Performance
The service is engineered for whole-genome data. A typical WGS case carries hundreds of thousands of HPO-annotated variants. The matching pipeline returns full results in single-digit seconds for typical cases.
Pre-warmed Parallelism
Worker processes are initialized at service startup with the HPO ontology already loaded and caches primed. Incoming requests do not pay the cold-start cost.
Scale-aware Computation
The pipeline recognizes that variants in the same gene share the same phenotypic context, and exploits that structure to avoid redundant work. The end user sees per-variant results; the engine does dramatically less work to produce them.
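The idea of sharing work across variants in the same gene can be sketched as a per-gene cache: the phenotype comparison runs once per gene and its result is copied onto each of that gene's variants. The function `score_variants`, its parameters, and the record layout are assumptions for illustration; the engine's actual strategy is not described here.

```python
def score_variants(variants, patient_terms, gene_terms, score_fn):
    """Compute each gene's phenotype score once and share it across its variants.

    variants: list of dicts with at least a "gene" key.
    gene_terms: dict mapping gene symbol -> list of phenotype terms for that gene.
    score_fn: callable (patient_terms, gene_term_list) -> float.
    """
    gene_scores = {}
    results = []
    for v in variants:
        gene = v["gene"]
        if gene not in gene_scores:
            # One phenotype comparison per gene, however many variants it carries.
            gene_scores[gene] = score_fn(patient_terms, gene_terms[gene])
        results.append({**v, "match_score": gene_scores[gene]})
    return results
```

With this structure the number of phenotype comparisons scales with the number of distinct genes rather than the number of variants, while the caller still receives one result per variant.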
Optimized Frontend Rendering
Gene-level summaries are exported as a small compressed payload that streams in near-instantly when the case is opened. The heavier per-gene variant detail is fetched on demand only when the geneticist expands a gene. The interface stays responsive even on cases with hundreds of thousands of variants.
Inputs and Outputs
What the service consumes from the upstream pipeline and from the geneticist, and what it produces for review and downstream use.
Inputs from the Pipeline
Completed variant analysis session
ACMG/AMP classification per variant
Gene and HPO annotation
Population frequencies (gnomAD)
ClinVar context and review status
Inheritance context (Orphanet AD/AR/XLD/XLR, ClinGen dosage)
Inputs from the Geneticist
List of HPO terms representing the patient's clinical presentation
Optional free-text clinical notes that travel with the case
Outputs for the Geneticist
Phenotype match score (0-100) and clinical tier for every variant with HPO data
Gene-level summary view ranked by clinical priority
Per-variant breakdown of which patient HPO terms matched, against what, and how strongly
Inheritance context notes (e.g., dual-inheritance genes, recessive carrier flags)
Downloadable branded PDF report with Tier 1, Tier 2, and Incidental Findings detail
Outputs for Downstream Services
Phenotype tier and score data consumed by the Screening Service for tier-aware variant scoring
Phenotype context consumed by the AI Service for clinical interpretation reports
Standards and Boundaries
The service operates against published standards and within explicit clinical boundaries.
HPO
The Human Phenotype Ontology is the international standard for structured clinical phenotype representation in genetics. The service operates entirely in HPO terms with full ontology hierarchy support.
Reference: Köhler et al., Nucleic Acids Research, 2021, PMID: 33264411
ACMG/AMP
Variant classification follows the ACMG/AMP 2015 guidelines with subsequent ClinGen specifications. Classification itself is performed upstream by the Variant Analysis Service. Phenotype Matching consumes that classification; it does not reclassify.
Reference: Richards et al., Genetics in Medicine, 2015, PMID: 25741868
Reporting Boundary
The service produces factual phenotype matching data. It does not generate clinical interpretations, does not make diagnostic calls, and does not replace clinical review. All output is intended for review and sign-off by a qualified clinical geneticist before any clinical action.
Data Residency
The service runs within the Helena platform on EU-based infrastructure compliant with GDPR Article 9 (special category data) and 1+MG technical requirements. Patient phenotype data and variant data remain inside the platform.
What Sets It Apart
Six design choices that make Phenotype Matching distinct from generic variant prioritization tools.
Phenotype-first by design
Not by post-hoc filtering. The tier system encodes the clinical principle that relevance to the referral phenotype matters more than abstract pathogenicity. A relevant VUS can outrank an unrelated Pathogenic variant, which is the correct clinical behaviour.
Whole-genome scale, results in seconds
A typical WGS case carries hundreds of thousands of HPO-annotated variants. Full results (phenotype scores, tier assignment, and per-term breakdowns for every variant) return in single-digit seconds. Suitable for routine diagnostic workflows, not research-only batch processing.
Transparent, decomposable scoring
Every score decomposes into phenotype, ACMG, and frequency components. Every match decomposes into per-HPO-term contributions. The geneticist always sees why a variant ranked where it did, not just the final number.
Inheritance-aware throughout
Heterozygous findings in recessive genes, dual-inheritance genes, and compound heterozygous candidates are handled distinctly rather than collapsed into a single pathogenic-or-not flag.
Conservative on Tier 1 and Tier 2
Strong benign evidence (ClinVar Benign with curator review, BA1, or strong benign ACMG criteria) blocks tier elevation regardless of phenotype score. Tier 1 is reserved for variants that genuinely warrant clinical action.
Clinically reviewable output
Reports separate Tier 1, Tier 2, and Incidental Findings explicitly. The secondary-findings conversation is never conflated with the primary diagnostic search.
See Phenotype Matching in Practice
Request a demo to see Helena process a real case, from HPO entry to a tiered shortlist of variants ranked by relevance to the patient's presentation.