Helena

ACMG Methodology

Classification Engine v3.30.0 | Updated April 2026

Complete documentation of how Helena processes, annotates, and classifies genetic variants. Every threshold, database version, and classification rule used in production is documented on this page. This documentation is intended for clinical geneticists, laboratory directors, and accreditation auditors.

Variant classification follows the ACMG/AMP 2015 framework (Richards et al., Genetics in Medicine, 2015) implemented through the Bayesian point-based system (Tavtigian et al., Hum Mutat. 2018;39(11):1485-1492. PMID: 30311386) with BayesDel ClinGen SVI calibrated thresholds (Pejaver et al., Am J Hum Genet. 2022;109(12):2163-2177. PMID: 36413997) and SpliceAI integration aligned to ClinGen SVI 2023 recommendations (Walker et al., Am J Hum Genet. 2023;110(7):1046-1067. PMID: 37352859). Optional ClinGen VCEP gene-specific specification overlay is available for approximately 50-60 genes. Classification is strictly evidence-based. No machine learning model determines variant pathogenicity. Recent versions (v3.24 through v3.28) introduce subtractive sophistication guards that prevent Likely Pathogenic classification when the evidence profile is computational-only or biologically inconsistent with the gene mechanism (AR-only LoF genes, dual-mechanism bypass, gnomAD LoF tolerance, ClinGen negative evidence).

Pipeline Overview

A six-stage processing pipeline transforms a raw VCF file into fully classified variants. Total execution time is under 15 minutes for a whole genome (~4M variants) on dedicated hardware.

1

VCF Parsing

~60s

Standard VCF file parsed into columnar in-memory database. Multi-allelic handling, genome build detection (GRCh38 required).

2

Quality Filtering

~5s

Configurable quality, depth, and genotype quality thresholds. ClinVar-listed pathogenic variants are protected from filtering.

3

VEP Annotation

~3-4 min

Ensembl Variant Effect Predictor for consequence, impact, transcript, and protein annotations. Parallel processing across chromosomes.

4

Reference DB Annotation

~5-10s

Population frequencies, clinical significance, functional predictions, gene constraint, phenotype associations, and dosage sensitivity loaded from 16 reference databases.

5

ACMG Classification

<1s

SQL-based ACMG/AMP 2015 classification using Bayesian point framework (Tavtigian et al. 2018). 19 automated criteria evaluated with calibrated evidence strength, point-based classification thresholds applied, continuous confidence scores assigned. Optional VCEP gene-specific overlay for ~50-60 genes. Homozygous reference genotypes (hom_ref) from multi-allelic sites excluded from classification.

6

Export

~5s

Gene-level summaries exported for streaming. Classified variants persisted to analytical database for downstream services.
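The ClinVar rescue in stage 2 (Quality Filtering) can be sketched as follows. This is a minimal illustration, not the production filter: the threshold values, the field names, and the set of ClinVar labels treated as "pathogenic" are all assumptions, since the page only states that the thresholds are configurable and that ClinVar-listed pathogenic variants are protected.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical defaults: the documentation says these thresholds are
# configurable but does not state their values.
MIN_QUAL = 30.0
MIN_DEPTH = 10
MIN_GQ = 20

# Assumption: Likely_pathogenic assertions are rescued as well.
RESCUE_LABELS = {"Pathogenic", "Pathogenic/Likely_pathogenic", "Likely_pathogenic"}


@dataclass
class Call:
    qual: float
    depth: int
    gq: int
    clinvar_significance: Optional[str] = None  # None when absent from ClinVar


def passes_quality_filter(call: Call) -> bool:
    """Keep a call that meets every threshold, or that ClinVar lists as pathogenic."""
    if call.clinvar_significance in RESCUE_LABELS:
        return True  # protected from filtering regardless of call quality
    return call.qual >= MIN_QUAL and call.depth >= MIN_DEPTH and call.gq >= MIN_GQ
```

A low-quality call with a ClinVar Pathogenic assertion is kept for classification, while the same call without a ClinVar record is dropped.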

Maximum Sensitivity Approach

Helena classifies all variants that pass quality filtering. There is no frequency-based or impact-based pre-filtering at any stage. A common variant (e.g., gnomAD allele frequency 40%) is still classified - it will receive a Benign classification via BA1, but it is not silently discarded before classification. This design ensures that no variant is excluded from clinical review by an automated filter. The geneticist decides clinical relevance based on the complete classification and annotation data.

Reference Databases

All reference data is stored locally on EU-based infrastructure. No variant data is sent to external APIs during processing. Database versions are fixed per deployment and documented here.

gnomAD

v4.1.0 | ~759M variants

Population allele frequencies (global and population-specific), allele counts, homozygote counts

Used by: BA1 (allele frequency > 5%), BS1 (elevated frequency), PM2 (absent in controls), BS2 (homozygote count)

Source: gnomad.broadinstitute.org

ClinVar

2025-01 | ~4.1M variants

Clinical significance assertions, review star levels, disease associations, submitter information

Used by: PS1 (known pathogenic), PP5 (reputable source pathogenic), BP6 (reputable source benign), ClinVar override logic, quality filter rescue

Source: ncbi.nlm.nih.gov/clinvar

dbNSFP

4.9c | ~80.6M variant sites

Functional impact predictions from multiple algorithms, conservation scores

Used by: PP3/BP4. Primary tool: BayesDel_noAF with ClinGen SVI calibrated thresholds (Pejaver et al. 2022). Display predictors (available for clinical review, not used in classification logic): SIFT, AlphaMissense, MetaSVM, DANN, PhyloP, GERP.

Source: sites.google.com/site/jpopgen/dbNSFP

SpliceAI

Ensembl MANE (Release 113) | Precomputed for all coding variants

Splice impact predictions (4 delta scores: acceptor gain, acceptor loss, donor gain, donor loss)

Used by: PP3_splice (max score >= 0.2), BP4 guard (max score < 0.1), BP7 (synonymous splice check)

Source: Illumina / Ensembl

gnomAD Constraint

v4.1.0 | ~18.2K genes

Gene-level constraint metrics indicating tolerance to loss-of-function and missense variation

Used by: PVS1 (LoF intolerance), PP2 (missense constraint), BP1 (LoF tolerance)

Source: gnomad.broadinstitute.org

MANE Select Transcripts

v1.4 (GRCh38) | 19,354 transcripts (19,288 Select + 66 Plus Clinical)

Canonical transcript definitions with CDS start/end coordinates for each protein-coding gene. MANE Select is the single representative transcript per gene agreed upon by NCBI and Ensembl. MANE Plus Clinical covers additional clinically relevant transcripts for 66 genes.

Used by: PVS1 MANE positional guard (v3.22.0, HELIX-CR-2026-055): variants outside MANE Select CDS do not receive PVS1 or PM4. Exon-level pext aggregation (CR-056): pext scores computed per MANE Select CDS exon.

Source: NCBI / EMBL-EBI / MANE Consortium

gnomAD pext (Proportion Expressed)

v4.1 (GTEx v10, GRCh38) | 189,856 exon records for 18,923 genes

Per-exon expression proportion across 49 GTEx v10 tissues. For each base in the coding sequence, pext measures the proportion of the gene's total expression that passes through that position. Scores are aggregated to exon level as mean_pext (arithmetic mean across tissues) and max_tissue_pext (maximum across the 49 tissues). max_tissue_pext is used for classification to avoid false negatives for tissue-specific genes.

Used by: PVS1 expression-aware guard (v3.23.0, HELIX-CR-2026-056): max_tissue_pext >= 0.9 = PVS1 Very Strong, 0.1-0.9 = PVS1 downgraded to Strong, < 0.1 = PVS1 blocked. PM4 binary guard (blocked at < 0.1). Scientific basis: Cummings 2020 PMID:32461655.

Source: gnomAD / Broad Institute
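The exon-level aggregation and the three PVS1 tiers described for this database can be illustrated with a short sketch. It assumes the per-tissue value for an exon is already available as a mapping from tissue name to pext; the function names and tier labels are illustrative, not the production identifiers.

```python
from statistics import mean
from typing import Mapping, Optional, Tuple


def aggregate_exon_pext(tissue_pext: Mapping[str, float]) -> Tuple[float, float]:
    """Return (mean_pext, max_tissue_pext) for one exon from per-tissue values."""
    values = list(tissue_pext.values())
    return mean(values), max(values)


def pvs1_pext_tier(max_tissue_pext: Optional[float]) -> str:
    """Map max_tissue_pext to the documented PVS1 tier."""
    if max_tissue_pext is None:
        return "very_strong"  # NULL pext: conservative fallback
    if max_tissue_pext >= 0.9:
        return "very_strong"
    if max_tissue_pext >= 0.1:
        return "strong"       # downgraded
    return "blocked"          # exon not expressed in any tissue


# Illustrative exon expressed almost exclusively in one tissue.
mean_pext, max_pext = aggregate_exon_pext({"heart": 0.05, "brain": 0.95, "liver": 0.20})
```

In this example the mean across tissues is 0.40, which would downgrade PVS1, while the maximum is 0.95, which keeps it at Very Strong. This is the tissue-specific case that motivates using the maximum.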

HPO (Multi-Source Enriched)

v3.12.0 (6 sources) | ~927K records, 5,688 genes

Gene-to-phenotype associations aggregated from 6 curated sources: HPO Consortium (5,173 genes), Orphanet disease-to-HPO (3,176 genes with frequency data), DECIPHER G2P 7 clinical panels (2,125 genes), Monarch Initiative (4,791 genes), ClinVar-MedGen P/LP chain mapping (5,258 genes), and manual clinical curation. Source priority ordering with confidence-based filtering.

Used by: PP4 (patient phenotype matching), Phenotype Matching Service (semantic similarity scoring), Screening Service (gene clinical breadth proxy)

Source: HPO Consortium + Orphanet + G2P + Monarch + ClinVar-MedGen

ClinGen

Latest release | ~1.6K genes

Dosage sensitivity scores (haploinsufficiency, triplosensitivity)

Used by: PVS1 (haploinsufficiency_score = 3 as constraint gate fallback for X-linked genes, v3.11.5), BS1 (inheritance-aware frequency threshold proxy), BP2 (trans with pathogenic in recessive)

Source: clinicalgenome.org

Orphanet / Orphadata

June 2025 | ~3,200 genes

Gene-disease-inheritance mode annotations for rare diseases (autosomal dominant, autosomal recessive, X-linked dominant, X-linked recessive)

Used by: BS1 (inheritance-aware frequency threshold: Orphanet AD/XLD -> 0.1%, Orphanet AR -> 5%), PVS1 disease association gate (gene must have Orphanet entry for PVS1 to apply)

Source: orphadata.com (INSERM, France)

ClinGen VCEP Specifications

Latest release | ~50-60 genes

Gene-specific ACMG criteria thresholds from ClinGen Variant Curation Expert Panels

Used by: BA1, BS1, PM2 (gene-specific frequency thresholds), PVS1 (applicability gate for gain-of-function genes), PVS1 disease association gate

Source: cspec.genome.network

DECIPHER G2P

2026-02-28 (7 panels + mechanism) | ~43K records, 2,125 genes; 2,372 mechanism records

Gene-disease associations from 7 curated clinical panels: DD (developmental disorders), Eye, Cardiac, Skin, Skeletal, Cancer, Ear. Each entry includes allelic requirement, molecular mechanism (gain of function, loss of function, dominant negative), and confidence level (definitive, strong, moderate). Molecular mechanism data used for PVS1 GoF/DN guard (v3.17.0).

Used by: HPO enrichment pipeline (source 3 of 6, v3.12.0). PVS1 GoF/DN guard: gof_genes_unified view provides 348 pure GoF/DN/GoE monoallelic genes where PVS1 is blocked (v3.18.0, HELIX-REF-003). Sources: G2P (199 genes), GoFCards (179 genes), manual curation (27 genes). Dual-mechanism genes excluded from view to preserve PVS1 eligibility for LoF phenotypes.

Source: DECIPHER / Wellcome Sanger Institute

Monarch Initiative

Latest release | ~151K records, 4,791 genes

Gene-phenotype associations aggregated from HPO Consortium redistribution via Monarch Knowledge Graph. Broad coverage of gene-disease relationships from multiple biomedical ontologies.

Used by: HPO enrichment pipeline (source 4 of 6). Large-scale gene coverage complements primary HPO Consortium data.

Source: Monarch Initiative (monarchinitiative.org)

ClinVar-MedGen HPO Mapping

Derived from ClinVar 2025-01 | ~245K records, 5,258 genes

Gene-phenotype associations derived from ClinVar P/LP variant submissions mapped through MedGen CUI to HPO terms. Chain: ClinVar P/LP variant -> gene -> disease (MedGen CUI) -> HPO terms. Largest single contributor of new genes to the HPO enrichment pipeline.

Used by: HPO enrichment pipeline (source 5 of 6). Covers 5,258 genes including many not present in primary HPO Consortium or Orphanet datasets.

Source: NCBI ClinVar + MedGen

UniProt SwissProt

2026_01 (human reviewed proteome) | 20,431 proteins, 113,126 PM1-eligible features

Expert-curated protein sequence and feature annotation for the reviewed human proteome. Per-residue functional annotations parsed from the SwissProt flat file format: DISULFID (disulfide bonds), ACT_SITE (enzyme active sites), BINDING (substrate/cofactor binding sites), MOD_RES (post-translational modifications), and MOTIF (functional motifs). Includes UniProt-to-Ensembl cross-references for direct ENSP-based lookup at classification time.

Used by: PM1 Tier 1 (primary) -- residue-level critical functional residue evidence (v3.30.0, HELIX-CR-2026-082). 113,126 PM1-eligible features across 14,278 proteins replace previous domain-level Pfam-only logic (2,262x evidence density increase over v3.28). Eligibility determined by qualifier presence (any /evidence= or /note=); naked features excluded by default with optional config override for clinical cases.

Source: UniProt Consortium (EMBL-EBI / SIB / PIR)

InterPro / Pfam

March 2026 (27,481 entries) | ~27.5K Pfam domain entries, ~50 critical for PM1

Comprehensive protein domain classification catalog from Pfam. Full Pfam entry set loaded into reference database with domain accession, type (domain/family/repeat/motif), and functional annotation. Critical functional domains marked for PM1 Tier 2 fallback evidence (retained for VCEP-defined regions and domain-level evidence not yet captured at residue resolution by UniProt Tier 1, v3.30.0).

Used by: PM1 Tier 2 fallback (critical functional domain detection via is_critical flag, v3.15.0). ~50 domains marked as critical across 14 categories: kinase, DNA binding, ion channel, GTPase, protease, signaling, tumor suppressor, calcium binding, structural, receptor, hearing, nuclear hormone receptor, ubiquitin, RNA binding. Migrated to Parquet-first Pattern A architecture in v3.30.0 (HELIX-CR-2026-082 Phase 5).

Source: EMBL-EBI InterPro

ClinGen Gene-Disease Validity

Latest release | ~3,374 gene-disease pairs

Gene-disease validity classifications (Definitive, Strong, Moderate, Limited) with mode of inheritance for autosomal dominant, autosomal recessive, and X-linked disorders. Curated by ClinGen Gene Curation Expert Panels.

Used by: PVS1 constraint gate (AD Definitive/Strong genes bypass pLI/LOEUF, v3.16.6). PVS1 AR LoF gene bypass (AR Definitive/Strong with verified LoF mechanism, v3.16.0). Disease association gate for PVS1, PP3_Strong, LP4/LP5/LP6, and ClinVar override (v3.16.0).

Source: clinicalgenome.org

GoFCards

Release 1.0 | 579 genes, 3,161 curated GoF variants

Curated gain-of-function variant database with experimental evidence (animal model, cell model), pathways, disorder associations, and proprietary evidence score (Pscore). Each variant entry includes PMID, functional description, and experimental validation status.

Used by: PVS1 GoF guard via gof_genes_unified view (v3.18.0, HELIX-REF-003). Gene-level aggregation: genes with 3+ curated GoF variants, average Pscore >= 2.0, and at least one variant with animal or cell model evidence qualify as GoF with confidence "strong". 217 genes passed threshold and are integrated into gene_disease_mechanism table.

Source: National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University
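The gene-level aggregation rule for GoFCards (3+ curated GoF variants, average Pscore >= 2.0, at least one variant with animal or cell model evidence) can be sketched as below. The record layout and field names are assumptions made for illustration; only the three conditions come from the description above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass
class GofVariant:
    gene: str
    pscore: float
    has_animal_model: bool = False
    has_cell_model: bool = False


def qualifies_as_gof(variants: Iterable[GofVariant]) -> bool:
    """Apply the three documented gene-level conditions to one gene's variants."""
    vs = list(variants)
    if len(vs) < 3:                       # needs 3+ curated GoF variants
        return False
    if mean(v.pscore for v in vs) < 2.0:  # average Pscore >= 2.0
        return False
    # At least one variant backed by animal or cell model evidence.
    return any(v.has_animal_model or v.has_cell_model for v in vs)
```

Genes that pass would be recorded with confidence "strong"; everything else about how the result is written to the gene_disease_mechanism table is outside this sketch.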

ClinGen Dosage Sensitivity (Full)

Latest release (2026-03-21) | ~1,626 genes

Full dosage sensitivity curation including haploinsufficiency score (0-3, 30, 40), triplosensitivity score, up to 6 PMIDs per gene, date last evaluated, and disease IDs. Replaces the legacy 3-column clingen table (v3.18.0). HI scores: 0 = no evidence, 1 = little evidence, 2 = emerging evidence, 3 = sufficient evidence, 30 = AR phenotype, 40 = dosage sensitivity unlikely.

Used by: PVS1 constraint gate (HI=3 as fallback for X-linked genes, v3.11.5). Disease association gate (HI IN 2, 3, 30 per HELIX-CR-2026-025, v3.18.1). BS1 inheritance proxy. BP2 (HI=30 for AR phenotype).

Source: clinicalgenome.org

Ensembl VEP

Release 113 | All coding/non-coding consequences

Variant consequence prediction, protein impact, transcript annotation, functional domain mapping

Used by: PVS1 (consequence type), PM1 (Pfam domains), PM4 (in-frame indels), BP3 (non-critical regions), BP7 (synonymous), all impact-based criteria

Source: ensembl.org

ACMG/AMP Classification

Variant classification follows the 2015 ACMG/AMP guidelines with 28 evidence criteria evaluated systematically. 19 criteria are fully automated; 9 require manual curation by the reviewing geneticist.

Classification Priority Order

Classification logic is applied in strict priority order. Higher-priority rules are evaluated first, and the first matching rule determines the final classification:

1

BA1 Stand-alone

Allele frequency > 5% is always classified Benign. BA1 is the only stand-alone criterion in the ACMG framework and cannot be overridden by any other evidence, including ClinVar assertions.

2

Conflicting Evidence

If a variant has pathogenic evidence at moderate strength or above (PVS, PS, or PM criteria triggered) AND strong benign evidence (BS criteria triggered), the variant is classified as VUS regardless of the individual evidence strength. This is a conservative approach that prioritizes clinical safety.

3

ClinVar Override

ClinVar classification is applied only when no conflicting computational evidence exists. Requires minimum review star level (default: 1 star). ClinVar VUS does not override computational classification.

4

Bayesian Point System

Each triggered criterion contributes points based on its evidence strength: Very Strong (+8), Strong (+4), Moderate (+2), Supporting (+1) for pathogenic; Strong (-4), Supporting (-1) for benign. Total points determine classification: >= 10 Pathogenic, 6-9 Likely Pathogenic, 0-5 VUS, -1 to -5 Likely Benign, <= -6 Benign. This system is mathematically equivalent to the original 18 ACMG combining rules while filling gaps for evidence combinations not explicitly covered.

5

Default

Variants that do not meet any of the above criteria are classified as Uncertain Significance (VUS).
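The five priority steps above can be read as a short decision function. The sketch below is an illustration under stated assumptions, not the production SQL: the Evidence record and function names are hypothetical, the conflict rule is interpreted by evidence strength rather than by criterion prefix, and "no conflicting computational evidence" is taken to mean that step 2 did not fire.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

STRENGTH_POINTS = {"very_strong": 8, "strong": 4, "moderate": 2, "supporting": 1}


@dataclass(frozen=True)
class Evidence:
    code: str             # e.g. "PVS1", "PM2", "BS1"
    strength: str         # key of STRENGTH_POINTS
    benign: bool = False  # True for BS/BP criteria


def total_points(evidence: Iterable[Evidence]) -> int:
    """Sum signed points: pathogenic evidence adds, benign evidence subtracts."""
    return sum((-1 if e.benign else 1) * STRENGTH_POINTS[e.strength] for e in evidence)


def class_from_points(points: int) -> str:
    if points >= 10:
        return "Pathogenic"
    if points >= 6:
        return "Likely Pathogenic"
    if points >= 0:
        return "VUS"
    if points >= -5:
        return "Likely Benign"
    return "Benign"


def classify(evidence: Iterable[Evidence], global_af: Optional[float] = None,
             clinvar_class: Optional[str] = None, clinvar_stars: int = 0,
             min_stars: int = 1) -> str:
    ev = list(evidence)
    # 1. BA1 stand-alone: allele frequency > 5% is always Benign.
    if global_af is not None and global_af > 0.05:
        return "Benign"
    # 2. Conflicting evidence: moderate-or-stronger pathogenic plus strong benign.
    pathogenic = any(not e.benign and e.strength != "supporting" for e in ev)
    benign_strong = any(e.benign and e.strength == "strong" for e in ev)
    if pathogenic and benign_strong:
        return "VUS"
    # 3. ClinVar override: needs the minimum star level; ClinVar VUS never overrides.
    if clinvar_class not in (None, "VUS") and clinvar_stars >= min_stars:
        return clinvar_class
    # 4. Bayesian point system. 5. With no evidence the total is 0, which is VUS.
    return class_from_points(total_points(ev))
```

For example, PVS1 at Very Strong (+8) with PM2 at Moderate (+2) totals 10 points and is classified Pathogenic, whereas the same evidence on a variant with 40% allele frequency is stopped at step 1 and classified Benign.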

Classification Output

Each variant receives one of five standard ACMG classifications, a list of all criteria that were triggered with evidence strength levels (e.g., "PVS1,PM2,PP3_Strong"), a Bayesian point total, and a continuous confidence score derived from the distance between the point total and the nearest classification boundary:

Pathogenic | >= 10 pts | confidence: 0.80-0.99

Likely Pathogenic | 6-9 pts | confidence: 0.70-0.90

VUS | 0-5 pts | confidence: 0.30-0.60

Likely Benign | -1 to -5 pts | confidence: 0.70-0.90

Benign | <= -6 pts | confidence: 0.80-0.99

Automated Criteria (19 of 28)

These criteria are evaluated automatically for every quality-passing variant. Exact conditions and thresholds are documented below. Each criterion lists the databases it depends on and known limitations.

Pathogenic Evidence

PVS1

Null variant in gene where loss-of-function is a known disease mechanism

Very Strong / Strong

Conditions

  • Impact = HIGH
  • Consequence: frameshift, stop_gained, splice_acceptor, or splice_donor variant
  • Gene constraint: pLI > 0.9 OR LOEUF < 0.35 OR gene is in the curated autosomal recessive LoF gene list (~150 genes with Definitive/Strong evidence for biallelic LoF disease mechanism) OR ClinGen haploinsufficiency_score = 3 OR gene is in the curated autosomal dominant ClinVar LOF gene list (AD genes with ClinVar P/LP LOF variants at 2+ review stars but uninformative gnomAD constraint, v3.13.0)
  • Last-exon NMD downgrade (v3.9.0): Truncating variants in the last exon are downgraded from PVS1 Very Strong (+8 points) to PVS1_Strong (+4 points). Last-exon variants escape nonsense-mediated mRNA decay (NMD) because there is no downstream exon-exon junction. The resulting truncated protein may retain partial function or exert a dominant-negative effect. Detection uses the VEP exon_number column (format X/Y): last exon when X == Y. Single-exon genes (1/1) correctly receive PVS1_Strong. NULL or missing exon_number defaults to Very Strong (conservative). Reference: Abou Tayoun et al. 2018, Figure 1, Node 3.
  • MANE Select positional guard (v3.22.0, HELIX-CR-2026-055): Variants outside the MANE Select/Plus Clinical CDS region do not receive PVS1 or PM4. VEP may annotate variants against non-canonical transcripts that extend beyond the canonical coding sequence, producing false loss-of-function predictions. 19,354 MANE transcripts loaded with CDS coordinates. NULL MANE data = PVS1 applied (conservative fallback). Reference: Abou Tayoun et al. 2018, PMID:30192042.
  • Expression-aware guard (v3.23.0, HELIX-CR-2026-056): Per-exon pext scores from gnomAD v4.1 (GTEx v10, 49 tissues) modulate PVS1. Three tiers based on max_tissue_pext: >= 0.9 = PVS1 Very Strong (unchanged), 0.1-0.9 = PVS1_Strong[pext] (downgraded to Strong), < 0.1 = PVS1 blocked (exon not expressed in any tissue). NULL pext = Very Strong (conservative fallback). PM4 blocked at pext < 0.1. Scientific basis: Cummings 2020 (PMID:32461655) - 22.8% false pLoF filtered, < 4% true pathogenic lost.

Exclusions

  • NMD-escaping transcripts (consequence contains NMD_escaping_variant). Note: NMD_transcript_variant is NOT excluded - it indicates the transcript undergoes NMD, which is evidence FOR loss-of-function (v3.11.3).
  • Non-canonical splice site variants: splice_donor_5th_base_variant (position +5), splice_donor_region_variant (positions +3 to +8), splice_acceptor_5th_base_variant (position -5). ClinGen PVS1 Decision Tree specifies canonical +/-1,2 only. Extended splice variants retain PP3_splice access (v3.11.4).
  • Gain-of-function / dominant-negative / gain-of-expression genes: multi-source guard (v3.18.0, HELIX-REF-003). Primary: gof_genes_unified view (348 pure GoF/DN/GoE monoallelic genes from G2P + GoFCards + manual curation, confidence definitive/strong/moderate). Fallback: GOF_AD_GENES curated list (29 genes including ANKRD26, C1QTNF5, CALM2, MYH3, PRKAG2, SKI, TUBB3 and 22 others with PMID-documented non-LoF mechanism). Dual-mechanism genes (e.g., SCN5A, LMNA, KCNH2, MYH7) excluded from view and retain PVS1 eligibility. ClinGen PVS1 Decision Tree (Abou Tayoun 2018): "Is LoF a known mechanism of disease?" - if NO, PVS1 not applicable.
  • Gain-of-function gene guard history: v3.16.8 (HELIX-CR-2026-022) introduced GOF_AD_GENES (36 curated genes). v3.17.0 (HELIX-CR-2026-023) added G2P as primary source, reduced GOF_AD_GENES to 14 fallback genes, and fixed GNAS dual-mechanism error.
  • Stop-retained and stop-lost variants
  • HLA gene family (HLA-A, HLA-B, HLA-C, HLA-DRA, HLA-DRB1, HLA-DRB5, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1, HLA-E, HLA-F, HLA-G, HLA-DMA, HLA-DMB, HLA-DOA, HLA-DOB)
  • Homozygous reference genotypes (hom_ref) from multi-allelic VCF sites

Databases: VEP (consequence, impact), gnomAD Constraint (pLI, LOEUF), curated AR LoF gene list (ClinGen Gene-Disease Validity), curated AD ClinVar LOF gene list (v3.13.0), Orphanet (gene-disease association), ClinVar (gene-level P/LP check), ClinGen (haploinsufficiency_score), VCEP (gene coverage), gof_genes_unified view (348 GoF/DN/GoE genes from G2P + GoFCards + manual, v3.20.1), GOF_AD_GENES fallback (29 genes, v3.20.1), gene_disease_mechanism table (disease association gate, v3.18.2), MANE Select transcripts v1.4 (positional guard, v3.22.0), gnomAD pext v4.1 GTEx v10 (expression-aware guard, v3.23.0)

Known limitations

  • Does not evaluate reading frame rescue via downstream in-frame reinitiation
  • Tissue-specific expression is assessed via gnomAD pext (proportion expressed across transcripts, GTEx v10) for PVS1 and PM4 modulation (v3.23.0, HELIX-CR-2026-056). Exons with max tissue pext < 0.1 block PVS1; 0.1-0.9 downgrade PVS1 to Strong. Alternative transcript usage addressed via MANE Select positional guard (v3.22.0, HELIX-CR-2026-055): variants outside MANE Select CDS do not receive PVS1 or PM4.
  • VCEP gene-specific PVS1 applicability gate available for ~50-60 genes (e.g., PVS1 disabled for gain-of-function genes like MYOC). Generic GoF/DN/GoE guard via gof_genes_unified (347 genes from G2P + GoFCards + manual curation) + GOF_AD_GENES fallback (28 genes) covers all other known non-LoF mechanism genes (v3.18.0, HELIX-REF-003). Generic thresholds used for all other genes.
  • Autosomal recessive gene bypass uses a curated list of ~150 genes with Definitive/Strong ClinGen validity or equivalent published evidence for biallelic LoF disease mechanism. VCEP AR genes (e.g., CDH23, GJB2, MYO7A, PAH, SLC26A4, USH2A) are handled via the VCEP pvs1_applicable gate and are not duplicated in the curated list. Categories covered: neurodegeneration, metabolic (amino acid, fatty acid oxidation, glycogen storage, lysosomal, peroxisomal), ciliopathies, sensory (hearing, vision), immune/hematologic, cardiac, neuromuscular, connective tissue, kidney, endocrine, liver, and respiratory.
  • Autosomal dominant ClinVar LOF gene bypass (v3.13.0) uses a curated list of AD genes with ClinVar P/LP LOF variants (2+ review stars) but uninformative gnomAD constraint (pLI < 0.9, LOEUF >= 0.35, HI != 3). Targets small genes where gnomAD constraint is statistically underpowered. New genes added only with clinical trigger and verification. Reference: Abou Tayoun 2018 PMID:30192042, HELIX-CR-2026-010.
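The PVS1 strength modulation (last-exon downgrade, MANE positional guard, pext tiers) can be sketched as one function. This assumes the consequence, constraint, and exclusion checks have already passed. Two points are assumptions rather than documented behavior: an unparseable exon_number is treated like a missing one, and two simultaneous downgrades do not stack below Strong.

```python
from typing import Optional


def is_last_exon(exon_number: Optional[str]) -> Optional[bool]:
    """Parse the VEP exon_number field (format X/Y); None when missing or unparseable."""
    if not exon_number or "/" not in exon_number:
        return None
    x, _, y = exon_number.partition("/")
    try:
        return int(x) == int(y)
    except ValueError:
        return None  # assumption: treated the same as a missing value


def pvs1_strength(exon_number: Optional[str],
                  max_tissue_pext: Optional[float],
                  in_mane_cds: Optional[bool] = None) -> Optional[str]:
    """Return "PVS1", "PVS1_Strong", or None when PVS1 is blocked."""
    # MANE positional guard: only an explicit "outside CDS" blocks; NULL applies PVS1.
    if in_mane_cds is False:
        return None
    # Expression guard: exon not expressed in any tissue.
    if max_tissue_pext is not None and max_tissue_pext < 0.1:
        return None
    last_exon = is_last_exon(exon_number) is True        # NMD escape
    low_pext = max_tissue_pext is not None and max_tissue_pext < 0.9
    return "PVS1_Strong" if (last_exon or low_pext) else "PVS1"
```

A single-exon gene annotated as 1/1 returns PVS1_Strong, and a variant with no exon_number and no pext data returns PVS1 at Very Strong, matching the conservative fallbacks described above.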
PS1

Same amino acid change as an established pathogenic variant

Strong

Conditions

  • ClinVar clinical significance: Pathogenic, Pathogenic/Likely_pathogenic, or Likely_pathogenic
  • ClinVar review stars >= 2

Databases: ClinVar (clinical_significance, review_stars)

Known limitations

  • Matches on exact variant position and allele, not amino acid position (PM5 amino acid-level matching is disabled)
  • ClinVar assertions may lag behind current evidence for recently reclassified variants
PM1

Located in a mutational hot spot or well-established functional domain

Moderate

Conditions

  • Consequence: missense_variant (v3.20.1, HELIX-CR-2026-035). PM1 is defined for missense variants in critical functional domains (Richards 2015 Table 3, ClinGen SVI Walker 2023 Section 3.3). Synonymous, frameshift, splice, and other non-missense variants do not receive PM1.
  • Three-tier PM1 evidence model (v3.30.0, HELIX-CR-2026-082):
  • Tier 1 (PRIMARY) - UniProt residue-level evidence: variant overlaps a curated critical residue from UniProt SwissProt human reviewed proteome. 113,126 PM1-eligible features across 14,278 proteins covering DISULFID (disulfide bonds), ACT_SITE (enzyme active sites), BINDING (substrate/cofactor binding), MOD_RES (post-translational modifications), and MOTIF (functional motifs). Match performed via Ensembl protein (ENSP) cross-reference at the exact protein position parsed from hgvs_protein. Evidence eligibility uses UniProt qualifier presence (any /evidence= or /note=); naked features without qualifiers excluded by default (2,067 features, 1.8%).
  • Tier 2 (FALLBACK) - InterPro Pfam critical domains: variant overlaps a critical functional domain from interpro_pfam_domains reference table (~50 domains marked is_critical=true, sourced from InterPro Pfam catalog, v3.15.0). Retained for VCEP-defined regions and domain-level evidence not yet captured at residue resolution.
  • Tier 3 (RESERVED) - VCEP-specific PM1 region overrides: future expansion for gene-specific PM1 region definitions from published ClinGen Variant Curation Expert Panel specifications.
  • PM1 boolean: triggered if Tier 1 OR Tier 2 matches. Single Moderate strength preserved (no tiered Moderate/Supporting split) to maintain compatibility with existing combining rules.

Databases: VEP (consequence, hgvs_protein), UniProt SwissProt human reviewed (uniprot_features, uniprot_ensembl_mapping reference tables, v3.30.0), InterPro Pfam (interpro_pfam_domains reference table)

Known limitations

  • UniProt naked features (no /evidence= or /note= qualifiers, 2,067 features) excluded from PM1 eligibility by default. Config flag pm1_uniprot_include_naked allows clinical override for specific cases where curator inclusion in SwissProt is treated as curation evidence (e.g., INSR Cys234Tyr disulfide bond Cys223-Cys234).
  • UniProt evidence is residue-level; some VCEP gene-specific PM1 regions (multi-residue critical regions defined by Variant Curation Expert Panels) are not yet integrated and rely on the Pfam fallback tier.
  • Tier 2 Pfam fallback retains the curation scope of v3.15.0: ~50 critical domains across 14 categories (kinase catalytic, DNA-binding, ion channel pores, enzyme active sites, and other VCEP-documented domains). Generic structural domains (Caveolin, coiled-coil, DUF) and repetitive regions remain excluded.
  • PM1 evidence source recorded in audit column pm1_evidence_source for every PM1 hit (format: uniprot:DISULFID:223-234:P06213 or pfam:PF00069), enabling per-variant provenance review and frontend evidence display.
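For audit tooling, the pm1_evidence_source value can be split back into its parts. The parser below is a sketch based only on the two example strings given above (uniprot:DISULFID:223-234:P06213 and pfam:PF00069); the record type is hypothetical, and the handling of a single-position UniProt feature (no hyphen in the range) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pm1Evidence:
    tier: int                    # 1 = UniProt residue-level, 2 = Pfam fallback
    source: str
    feature_type: Optional[str]  # e.g. "DISULFID"; None for Pfam
    start: Optional[int]
    end: Optional[int]
    accession: str               # UniProt or Pfam accession


def parse_pm1_evidence_source(value: str) -> Pm1Evidence:
    parts = value.split(":")
    if parts[0] == "uniprot" and len(parts) == 4:
        _, feature_type, span, accession = parts
        start_s, _, end_s = span.partition("-")
        start = int(start_s)
        end = int(end_s) if end_s else start  # assumption: single-residue features
        return Pm1Evidence(1, "uniprot", feature_type, start, end, accession)
    if parts[0] == "pfam" and len(parts) == 2:
        return Pm1Evidence(2, "pfam", None, None, None, parts[1])
    raise ValueError(f"unrecognized pm1_evidence_source: {value!r}")
```

Parsing the documented UniProt example yields a Tier 1 DISULFID feature spanning residues 223 to 234 of P06213.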
PM2

Absent from controls or at extremely low frequency in population databases

Moderate

Conditions

  • gnomAD global allele frequency < 0.0001 (0.01%) OR variant absent from gnomAD (NULL frequency = never observed in ~800K individuals)
  • NULL gnomAD allele frequency correctly treated as "absent from controls" per Richards et al. 2015 Table 3

Databases: gnomAD v4.1 (global_af)

Known limitations

  • Does not apply population-specific frequency adjustments
  • ClinGen SVI PM2_Supporting downgrade implemented for VCEP genes (v3.20.0, HELIX-CR-2026-034). 46 VCEP genes use PM2 at Supporting strength; 11 genes retain Moderate (RASopathy, ENIGMA BRCA1/2, HBOP). Non-VCEP genes use full Moderate strength.
  • When VCEP gene-specific specifications are enabled, PM2 threshold may differ from the generic 0.01% (e.g., 0% for RASopathy genes where any population frequency argues against pathogenicity)
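The PM2 rule, including NULL handling and the Supporting downgrade for VCEP genes, can be sketched as follows. The function and parameter names are illustrative; the caller is assumed to know whether the gene belongs to the group of VCEP genes that use Supporting strength.

```python
from typing import Optional


def pm2_evidence(global_af: Optional[float],
                 vcep_supporting: bool = False,
                 threshold: float = 0.0001) -> Optional[str]:
    """Return "PM2", "PM2_Supporting", or None when PM2 does not apply.

    threshold is the generic 0.01% cutoff; a VCEP specification may supply
    a different value.
    """
    # A NULL frequency means the variant was never observed in gnomAD,
    # which counts as "absent from controls".
    if global_af is not None and global_af >= threshold:
        return None
    return "PM2_Supporting" if vcep_supporting else "PM2"
```

A variant with no gnomAD record therefore receives PM2 at Moderate strength in a non-VCEP gene and PM2_Supporting in a VCEP gene covered by the downgrade.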
PM3

Detected in trans with a pathogenic variant for recessive disorders

Moderate

Conditions

  • Variant flagged as compound heterozygote candidate (compound_het_candidate = true)
  • ClinVar partner validation guard (v3.25.1, HELIX-CR-2026-058): trans partner must be ClinVar Pathogenic or Likely Pathogenic with review_stars >= 2 and not Conflicting. Aligns with Richards 2015 Table 3: PM3 explicitly requires "detected in trans with a pathogenic variant". ClinVar annotation is external/immutable, avoiding circular dependency where partner classification depends on PM3.

Exclusions

  • Gain-of-function / dominant-negative gene exclusion (v3.20.2, HELIX-CR-2026-036): PM3 not applied in pure GoF/DN genes. PM3 presupposes AR mechanism (Richards 2015 Table 3), biologically irrelevant for dominant disease genes.
  • Dual-mechanism bypass (v3.20.3, HELIX-CR-2026-037): genes with both GoF/DN monoallelic AND biallelic LoF mechanism (gof_genes_exclusive view, 55 dual-mechanism genes excluded from GoF guard) retain PM3 for the AR LoF pathway.

Databases: Pipeline-internal (compound heterozygote detection), ClinVar (partner clinical_significance, review_stars), gof_genes_exclusive view (G2P + GoFCards + manual, 347 GoF/DN/GoE genes minus 55 dual-mechanism), GOF_AD_GENES fallback

Known limitations

  • Compound heterozygote status inferred from genotype data without formal phasing
  • Trio data or long-read phasing would provide definitive trans confirmation
  • ClinVar partner requires 2+ review stars; partners with 1-star P/LP submissions do not satisfy PM3
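The partner validation guard can be expressed as a small predicate. This is a sketch with hypothetical parameter names; the caller is assumed to pass gene_is_pure_gof_dn=False for dual-mechanism genes so that they retain PM3.

```python
from typing import Optional

P_LP_LABELS = {"Pathogenic", "Pathogenic/Likely_pathogenic", "Likely_pathogenic"}


def pm3_applies(compound_het_candidate: bool,
                partner_significance: Optional[str],
                partner_stars: int,
                gene_is_pure_gof_dn: bool = False) -> bool:
    """PM3 requires a compound-het candidate whose trans partner is a 2+ star ClinVar P/LP."""
    if not compound_het_candidate or gene_is_pure_gof_dn:
        return False
    # Conflicting or missing assertions are not in P_LP_LABELS and fail here.
    return partner_significance in P_LP_LABELS and partner_stars >= 2
```

Because the check reads only the partner's ClinVar annotation, the result does not depend on how the pipeline itself classified the partner.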
PM4

Protein length change in a non-repetitive region

Moderate

Conditions

  • Consequence: in-frame insertion or in-frame deletion
  • Located within a Pfam functional domain
  • Not in a repetitive or low-complexity region (domains not containing "tandem", "repeat", "lowcomplexity", or "Seg")

Exclusions

  • HLA gene family (same exclusion list as PVS1)

Databases: VEP (consequence, domains)

Known limitations

  • Repetitive region detection based on VEP domain annotations only
  • Does not evaluate whether the in-frame change disrupts a critical functional residue
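The PM4 conditions can be sketched as below. Several details are assumptions: domain annotations are passed as a list of strings such as "Pfam:PF00069", the marker strings are matched as case-sensitive substrings exactly as listed above, and any overlapping repetitive annotation blocks PM4. The HLA exclusion and the MANE positional guard are omitted for brevity.

```python
from typing import Optional, Sequence

INFRAME_CONSEQUENCES = {"inframe_insertion", "inframe_deletion"}
REPETITIVE_MARKERS = ("tandem", "repeat", "lowcomplexity", "Seg")


def pm4_applies(consequence: str,
                domains: Sequence[str],
                max_tissue_pext: Optional[float] = None) -> bool:
    if consequence not in INFRAME_CONSEQUENCES:
        return False
    # pext binary guard: blocked when the exon is not expressed in any tissue.
    if max_tissue_pext is not None and max_tissue_pext < 0.1:
        return False
    in_pfam_domain = any(d.startswith("Pfam") for d in domains)
    repetitive = any(marker in d for d in domains for marker in REPETITIVE_MARKERS)
    return in_pfam_domain and not repetitive
```

An in-frame deletion annotated with a Pfam domain alone triggers PM4; the same deletion with an additional Seg low-complexity annotation does not.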
PP2

Missense variant in a gene with low rate of benign missense variation

Supporting

Conditions

  • Consequence: missense variant
  • Gene constraint: pLI > 0.5 AND mis_z > 2.0 (missense constraint)
  • Disease association gate (v3.18.5): gene must have established disease association via same 7 sources as PVS1 gate. Prevents PP2 in genes without known Mendelian disease mechanism.
  • MPC regional constraint guard (v3.18.6, HELIX-CR-2026-032): mpc_score >= 1.0 required when MPC data is available. Regions with mpc_score < 1.0 are unconstrained for missense - PP2 blocked. NULL MPC data falls back to gene-level constraint (bypass). Reference: Samocha et al. 2014.

Databases: VEP (consequence), gnomAD Constraint (pLI, mis_z), MPC regional constraint (mpc_score), disease association (7 sources)

Known limitations

  • pLI measures loss-of-function constraint; mis_z measures missense constraint
  • Both metrics required: pLI > 0.5 ensures LoF intolerance, mis_z > 2.0 ensures missense variants are a disease mechanism (Samocha et al. 2014)
  • MPC regional constraint data available for ~18K genes. NULL MPC falls back to gene-level constraint only.
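The PP2 gates combine as a conjunction, with the MPC guard acting only when MPC data exists. In the sketch below the parameter names are illustrative, and treating missing pLI or mis_z as "PP2 not met" is an assumption, since NULL handling for these two metrics is not stated for PP2.

```python
from typing import Optional


def pp2_applies(consequence: str,
                pli: Optional[float],
                mis_z: Optional[float],
                has_disease_association: bool,
                mpc_score: Optional[float] = None) -> bool:
    if consequence != "missense_variant":
        return False
    if pli is None or mis_z is None:      # assumption: no constraint data, no PP2
        return False
    if not (pli > 0.5 and mis_z > 2.0):   # both metrics required
        return False
    if not has_disease_association:       # disease association gate
        return False
    # Regional guard: blocks only when MPC is present and below 1.0;
    # NULL MPC falls back to the gene-level result.
    if mpc_score is not None and mpc_score < 1.0:
        return False
    return True
```

A missense variant in a constrained disease gene is blocked when it lies in a region with mpc_score 0.6, but passes when no MPC score is available for the position.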
PP3

Computational evidence supports a deleterious effect (two independent paths)

Supporting / Moderate / Strong

Conditions

  • Path A (Missense - BayesDel): Requires consequence = missense_variant (v3.12.1, HELIX-CR-2026-009). BayesDel_noAF is calibrated exclusively for missense variants (Pejaver et al. 2022); non-missense variants (stop_gained, frameshift) receive spurious scores from dbNSFP positional overlap and are excluded. Three strength levels: PP3_Strong (>= 0.518, +4 points), PP3_Moderate (0.290-0.517, +2 points), PP3_Supporting (0.130-0.289, +1 point). Scores below 0.130 are indeterminate. Requires missense relevance guard v2.0 (v3.19.0, HELIX-CR-2026-033): 5 gates determine if missense is a disease mechanism. Gate 1: mis_z > 2.0 (direct constraint). Gate 2: mis_z > 1.5 AND pLI < 0.5 (AR genes). Gate 3: mis_z > 0.5 AND pLI > 0.9 (AD LoF-intolerant). Gate 4a: gene in refdb.clinvar_missense_genes (2,136 genes with ClinVar P/LP missense evidence). Gate 4b: pLI > 0.7 AND mis_z > 0.5 AND gene has disease association (gray-zone genes). NULL fallback: pLI > 0.5 without mis_z data. PP3_Strong additionally requires established disease association (v3.8.1 gate, same 7 sources as PVS1 gate).
  • PM1 + PP3 double-counting guard: When PM1 (functional domain) applies alongside PP3_Strong, PP3 is downgraded to PP3_Moderate. Combined PM1 + PP3 capped at Strong equivalent (4 points) per ClinGen SVI.
  • Path B (Splice evidence - 3 strength levels, v3.15.1): SpliceAI max_score evaluated with evidence strength modulation. PP3_splice_Strong (>= 0.8, +4 points), PP3_splice_Moderate (0.5-0.799, +2 points), PP3_splice (0.2-0.499, +1 point Supporting). PP3_splice_Strong requires disease association gate (same as PP3_Strong missense). All levels excluded when PVS1 applies (ClinGen SVI 2023 double-counting guard).
  • Paths A and B are independent. A variant can trigger both if it has a BayesDel score and a splice prediction.
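
The five gates of the missense relevance guard can be read as a short decision function. The following is a minimal sketch, not the production implementation: the function name, argument names, and returned labels are hypothetical, and the evaluation order (gates 1 to 3, then 4a, 4b, then the NULL fallback) is an assumption.

```python
from typing import Optional


def missense_relevance_gate(
    pli: Optional[float],
    mis_z: Optional[float],
    in_clinvar_missense_genes: bool = False,
    has_disease_association: bool = False,
) -> Optional[str]:
    """Return the label of the first gate that passes, or None if all fail."""
    if mis_z is not None:
        if mis_z > 2.0:                                    # Gate 1: direct constraint
            return "gate1"
        if pli is not None and mis_z > 1.5 and pli < 0.5:  # Gate 2: AR genes
            return "gate2"
        if pli is not None and mis_z > 0.5 and pli > 0.9:  # Gate 3: AD LoF-intolerant
            return "gate3"
    if in_clinvar_missense_genes:                          # Gate 4a: ClinVar P/LP missense gene
        return "gate4a"
    if (
        pli is not None
        and mis_z is not None
        and pli > 0.7
        and mis_z > 0.5
        and has_disease_association
    ):                                                     # Gate 4b: gray-zone genes
        return "gate4b"
    if mis_z is None and pli is not None and pli > 0.5:    # NULL fallback: no mis_z data
        return "null_fallback"
    return None
```

A return value of None means missense is not treated as a disease mechanism for the gene, so the BayesDel path would not apply in this sketch.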

Exclusions

  • PP3_splice not applied when PVS1 is triggered at any strength level (Very Strong or Strong/last-exon). Prevents double-counting loss-of-function and splice evidence per ClinGen SVI 2023.
  • PP3_splice missense-tolerated guard (v3.25.7, HELIX-CR-2026-065): PP3_splice (all levels) blocked for pure missense variants without VEP splice consequence when BayesDel noAF data is available. ClinGen SVI (Walker 2023 Section 3.2): PP3 is a single criterion - when BayesDel indicates tolerated and SpliceAI predicts cryptic splice, evidence is conflicting, not additive. BayesDel IS NULL bypass preserved.
  • PP3 + PP3_splice mutual exclusion (v3.25.8, HELIX-CR-2026-066): for dual-consequence variants (missense AND splice_region/donor/acceptor), PP3 BayesDel is blocked when PP3_splice applies. SpliceAI is the more specific tool when VEP confirms splice proximity. Together with v3.25.7, exactly one PP3 path per variant - never both.
  • BayesDel indeterminate range (-0.180 to 0.129): no PP3 or BP4 applied

Databases: dbNSFP 4.9c (bayesdel_noaf_score), SpliceAI (max_score)

Known limitations

  • BayesDel_noAF used to avoid circular reasoning with PM2/BA1/BS1 frequency criteria (Pejaver et al. 2022)
  • PM1 + PP3 cap assumes PM1 = Moderate (2 points). Extensible if future VCEP upgrades PM1 strength.
  • BayesDel does not reach PP3_Very_Strong per Pejaver calibration data
PP4

Patient phenotype is highly specific for a disease with a single genetic etiology

Supporting

Conditions

  • Requires patient HPO terms to be provided for the analysis session
  • Trigger condition A: >= 3 patient HPO terms match the gene HPO profile
  • Trigger condition B: >= 2 patient HPO terms match AND gene has <= 5 total HPO associations (highly specific gene-phenotype relationship)
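
The two trigger conditions reduce to a count of exact HPO term overlaps. A minimal sketch, assuming patient and gene HPO profiles are available as sets of term IDs; the function name is hypothetical and the HPO IDs in the usage example are placeholders chosen only for illustration.

```python
from typing import Set


def pp4_applies(patient_hpo: Set[str], gene_hpo: Set[str]) -> bool:
    """Exact-term overlap check for PP4 (no ontology hierarchy)."""
    if not patient_hpo:
        return False  # PP4 is not evaluated without patient HPO terms
    matches = len(patient_hpo & gene_hpo)
    if matches >= 3:  # Trigger condition A
        return True
    # Trigger condition B: fewer matches suffice for a highly specific gene profile
    return matches >= 2 and len(gene_hpo) <= 5


# Usage with placeholder term IDs
patient = {"HP:0000001", "HP:0000002"}
specific_gene = {"HP:0000001", "HP:0000002", "HP:0000003"}
print(pp4_applies(patient, specific_gene))
```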

Exclusions

  • Not evaluated when no patient HPO terms are provided

Databases: HPO (gene-phenotype associations)

Known limitations

  • HPO matching is exact term overlap, not semantic similarity (ontology hierarchy not used at this stage)
  • Semantic similarity-based matching is performed by the downstream Phenotype Matching Service
PP5

Reputable source reports variant as pathogenic

Supporting

Conditions

  • ClinVar clinical significance: Pathogenic, Pathogenic/Likely_pathogenic, or Likely_pathogenic
  • ClinVar review stars >= 1 AND < 2 (lower confidence than PS1)

Exclusions

  • Does not apply when PS1 already applies (prevents double-counting ClinVar evidence at different strength levels)

Databases: ClinVar (clinical_significance, review_stars)

Known limitations

  • ClinGen SVI has recommended retiring PP5 as a standalone criterion; retained for maximum sensitivity

Benign Evidence

BA1

Allele frequency is above 5% in population databases

Stand-alone

Conditions

  • gnomAD global allele frequency > 0.05 (5%)

Databases: gnomAD v4.1 (global_af)

Known limitations

  • Uses global allele frequency only; population-specific BA1 thresholds not implemented

BA1 is the only stand-alone ACMG criterion. A variant meeting BA1 is classified as Benign regardless of any other evidence, including ClinVar assertions. When VCEP gene-specific specifications are enabled, the BA1 threshold may be lower than 5% for specific genes (e.g., 0.1% for Cardiomyopathy genes).

BS1

Allele frequency is greater than expected for the disorder

Strong

Conditions

  • Inheritance-aware frequency threshold determined by a 5-level cascade (v3.7.0):
  • Priority 1 - VCEP gene-specific: BS1 threshold from published ClinGen VCEP specification (approximately 50-60 genes)
  • Priority 2 - ClinGen HI score = 3: allele frequency >= 0.001 (0.1%) (~400 genes)
  • Priority 3 - Orphanet AD inheritance: allele frequency >= 0.001 (0.1%) for genes with autosomal dominant disease association (~1,694 genes)
  • Priority 3b - Orphanet XLD inheritance: allele frequency >= 0.001 (0.1%) for genes with X-linked dominant disease association (~90 genes). XLR genes intentionally excluded (fall to AR default).
  • Priority 4 - HPO AD fallback: allele frequency >= 0.001 (0.1%) for genes annotated with HP:0000006 (autosomal dominant inheritance) (~200-400 additional genes)
  • Default - AR threshold: allele frequency >= 0.05 (5%) for all remaining genes
  • Upper bound: allele frequency <= BA1 threshold (BA1 takes precedence)
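
The cascade can be sketched as an ordered series of checks that returns the first matching threshold. This is an illustrative sketch under stated assumptions: the `GeneInheritance` record and its field names are hypothetical, and the "VCEP" and "AR-default" source labels are invented here (only the ClinGen-HI, Orphanet-AD, Orphanet-XLD, and HPO-AD annotations appear in this documentation).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

AD_THRESHOLD = 0.001  # 0.1%
AR_DEFAULT = 0.05     # 5%


@dataclass
class GeneInheritance:
    vcep_bs1_threshold: Optional[float] = None  # Priority 1
    clingen_hi_score: Optional[int] = None      # Priority 2
    orphanet_has_ad: bool = False               # Priority 3
    orphanet_has_xld: bool = False              # Priority 3b
    hpo_has_ad: bool = False                    # Priority 4 (HP:0000006)


def bs1_threshold(gene: GeneInheritance) -> Tuple[float, str]:
    """Return (allele frequency threshold, source label) for the first matching level."""
    if gene.vcep_bs1_threshold is not None:
        return gene.vcep_bs1_threshold, "VCEP"
    if gene.clingen_hi_score == 3:
        return AD_THRESHOLD, "ClinGen-HI"
    if gene.orphanet_has_ad:
        return AD_THRESHOLD, "Orphanet-AD"
    if gene.orphanet_has_xld:
        return AD_THRESHOLD, "Orphanet-XLD"
    if gene.hpo_has_ad:
        return AD_THRESHOLD, "HPO-AD"
    return AR_DEFAULT, "AR-default"


def bs1_applies(global_af: Optional[float], gene: GeneInheritance,
                ba1_threshold: float = 0.05) -> bool:
    """BS1 applies at or above the cascade threshold, unless BA1 takes precedence."""
    if global_af is None or global_af > ba1_threshold:
        return False
    threshold, _source = bs1_threshold(gene)
    return global_af >= threshold
```

With the generic 5% BA1 bound, the AR default and the upper bound coincide, so in this sketch the AR default path applies only at an allele frequency of exactly 0.05.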

Exclusions

  • Does not apply when BA1 applies (BA1 takes precedence)
  • XLR (X-linked recessive) genes do not receive the AD threshold - carrier females are unaffected and higher carrier frequencies are expected

Databases: gnomAD v4.1 (global_af), ClinGen (haploinsufficiency_score), Orphanet/Orphadata (orphanet_gene_inheritance: has_ad, has_ar, has_xld), HPO (hpo_ids for HP:0000006), gnomAD Constraint (pLI, oe_lof_upper for constraint-implied AD fallback), AR_LOF_GENES curated list

Known limitations

  • Orphanet covers rare diseases only; some common AD conditions may not be catalogued (covered by HPO AD fallback or AR default)
  • Dual AD+AR inheritance genes (1,027 genes, e.g. KCNJ11) receive the AD threshold (0.1%) as conservative default - AR pathogenic variants are typically well below 0.1% AF
  • Orphanet data updated twice per year (June + December releases). Coverage expands with each release.
  • When VCEP gene-specific specifications are enabled, VCEP thresholds take highest priority and override all other cascade levels
  • Criteria string includes inheritance source annotation: BS1[ClinGen-HI], BS1[Orphanet-AD], BS1[Orphanet-XLD], BS1[HPO-AD]

Data source: Orphadata (INSERM, France). License: CC-BY-4.0. Attribution: Orphadata: Free access data from Orphanet. INSERM 1999. Available on https://www.orphadata.com.

BS2

Observed in a healthy adult individual for a fully penetrant early-onset disorder

Strong

Conditions

  • Inheritance-aware threshold (v3.26.8, HELIX-CR-2026-074): AR-only genes (Orphanet has_ar=TRUE, has_ad=FALSE OR ClinGen HI=30) with documented early-onset-only diseases use threshold of 10 homozygotes. Generic and AD genes use threshold of 15 homozygotes.
  • Onset guard: AR threshold activated only when gene exists in refdb.orphanet_disease_onset (no data falls back to generic 15) AND no associated disease has Adult/Elderly/All ages onset. Aligns with Richards 2015 Table 3: BS2 "with full penetrance expected at an early age".
  • AR path criteria string marker: BS2[AR] when AR threshold path was used.
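
A minimal sketch of how the homozygote threshold could be selected. The function and parameter names are hypothetical. It assumes the AR-only condition reads as "(has_ar and not has_ad) or ClinGen HI = 30", and it treats a missing or empty onset set as "no data". How the homozygote count is compared against the threshold is not specified above, so the sketch returns only the threshold and the criteria string marker.

```python
from typing import Optional, Set, Tuple

# Onset categories that disable the AR threshold path
LATE_ONSETS = {"Adult", "Elderly", "All ages"}


def bs2_threshold(
    has_ar: bool,
    has_ad: bool,
    clingen_hi_score: Optional[int],
    disease_onsets: Optional[Set[str]],
) -> Tuple[int, str]:
    """Return (homozygote threshold, criteria string marker)."""
    ar_only = (has_ar and not has_ad) or clingen_hi_score == 30
    # Onset guard: gene must have onset data and no late-onset disease
    early_onset_only = bool(disease_onsets) and not (disease_onsets & LATE_ONSETS)
    if ar_only and early_onset_only:
        return 10, "BS2[AR]"
    return 15, "BS2"
```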

Exclusions

  • AD BS2 from gnomAD homozygote count not implemented (Harrison 2019 PMID:31159682: gnomAD is general population, not healthy controls; APC VCEP Curia 2023 PMID:37805481 excludes gnomAD for AD BS2).
  • X-linked hemizygous BS2 deferred (ac_hemi column not in current annotation pipeline).

Databases: gnomAD v4.1 (global_hom), Orphanet (orphanet_gene_inheritance, orphanet_disease_onset), ClinGen (haploinsufficiency_score for HI=30 AR proxy)

Known limitations

  • AR threshold of 10 calibrated for early-onset full-penetrance AR diseases. Genes with adult-onset AR diseases (e.g., Wilson disease ATP7B) fall back to generic 15 threshold via onset guard.
  • Compound heterozygote-only AR diseases require homozygote count interpretation in carrier-frequency context; not currently modelled separately.
BP1

Missense variant in a gene for which primarily truncating variants are known to cause disease

Supporting

Conditions

  • Consequence: missense variant
  • Impact: MODERATE
  • Gene constraint: pLI < 0.1 (gene is tolerant to loss-of-function)
  • pLI value must be present (non-NULL)
  • Missense constraint guard (v3.6.8): mis_z must be < 2.0 or absent. Genes with high mis_z have missense as primary disease mechanism.
  • ClinVar pathogenic missense guard (v3.8.2): BP1 does not apply when ClinVar has a session-local Pathogenic/Likely Pathogenic missense variant in the same gene. If ClinVar confirms pathogenic missense in the patient VCF, the premise of BP1 is falsified.
  • Reference-based ClinVar missense guard (v3.20.9, HELIX-CR-2026-044): BP1 additionally blocked when gene is in refdb.clinvar_missense_genes (>= 2 ClinVar P/LP missense at 2+ stars). Reference-level check covers genes with established missense disease mechanism not present in the current patient VCF (extends session-local guard).
  • GoF / DN gene exclusion (v3.20.6, HELIX-CR-2026-039v2): BP1 not applied for missense variants in gain-of-function or dominant-negative genes. BP1 presupposes LoF mechanism (Richards 2015 Table 3); GoF/DN genes have non-LoF truncating mechanism, so the BP1 premise is inapplicable. Same guard sources as PVS1 GoF guard: refdb.gof_genes_exclusive + GOF_AD_GENES fallback.

Exclusions

  • Genes with mis_z >= 2.0 (missense is a known disease mechanism)
  • Genes with session-local ClinVar Pathogenic/Likely Pathogenic missense variants
  • Genes with reference-level ClinVar missense evidence (refdb.clinvar_missense_genes, 2,136 genes)
  • GoF/DN genes (refdb.gof_genes_exclusive view + GOF_AD_GENES fallback)
  • Curated AR LoF genes (~150 genes with biallelic LoF disease mechanism)

Databases: VEP (consequence, impact), gnomAD Constraint (pLI, mis_z), ClinVar (clinical_significance, consequence), refdb.clinvar_missense_genes (reference-level missense mechanism), refdb.gof_genes_exclusive (G2P + GoFCards + manual), GOF_AD_GENES, AR_LOF_GENES

Known limitations

  • Low pLI used as proxy for "primarily truncating variants cause disease"
  • mis_z measures heterozygous missense constraint. AR enzyme genes (e.g. MCCC2) may have low mis_z despite pathogenic homozygous missense - the ClinVar missense guards (v3.8.2 session-local + v3.20.9 reference-based) address this gap.
  • GoF/DN gene coverage limited to G2P 2026-02-28 release + GoFCards Release 1.0 + curated GOF_AD_GENES. Genes with novel non-LoF mechanism not yet curated may receive incorrect BP1.
BP2

Observed in trans with a pathogenic variant for a fully penetrant dominant disorder

Supporting

Conditions

  • Compound heterozygote candidate (compound_het_candidate = true)
  • ClinGen haploinsufficiency score = 30 (ClinGen dosage code for "gene associated with autosomal recessive phenotype")

Databases: Pipeline-internal (compound heterozygote detection), ClinGen (haploinsufficiency_score)

Known limitations

  • Trans observation inferred without formal phasing
  • Haploinsufficiency score of 30 is the ClinGen dosage code for "gene associated with autosomal recessive phenotype"; "dosage sensitivity unlikely" is a separate code (40)
BP3

In-frame insertion or deletion in a repetitive region without a known function

Supporting

Conditions

  • Consequence: in-frame insertion or in-frame deletion
  • Located in a repetitive/low-complexity region (domains contain "tandem", "repeat", "lowcomplexity", or "Seg"), OR not in any Pfam domain, OR no domain annotation available

Databases: VEP (consequence, domains)

Known limitations

  • Complementary to PM4 (PM4 requires Pfam domain; BP3 requires absence of critical domain)
BP4

Computational evidence suggests no impact on gene or gene product

Supporting / Moderate

Conditions

  • Path A (Missense - BayesDel): Requires consequence = missense_variant (v3.12.1, HELIX-CR-2026-009). BayesDel_noAF score evaluated with ClinGen SVI calibrated thresholds (Pejaver et al. 2022). Two strength levels: BP4_Moderate (<= -0.361, -2 points), BP4_Supporting (-0.360 to -0.181, -1 point). SpliceAI max_score must be < 0.1 or absent.
  • Path B (Non-canonical splice - BP4_splice): v3.25.2 (HELIX-CR-2026-060). Supporting benign for non-canonical splice region variants (splice_donor_region, splice_donor_5th_base, splice_acceptor_5th_base, splice_region, splice_polypyrimidine_tract) when SpliceAI max_score <= 0.1. Excludes synonymous (BP7 handles) and missense (BP4 BayesDel handles). PVS1 guard: not applied when PVS1 condition is satisfied (no double-counting splice/LoF evidence). Threshold consistent with BP7 (SpliceAI <= 0.1). References: Jaganathan 2019 PMID:30661751, Walker 2023 PMID:37352859, de la Hoya 2022 PMID:35202600.
  • Criteria string marker: BP4 (BayesDel path) and BP4_splice (non-canonical splice path) reported separately.

Exclusions

  • BayesDel indeterminate range (-0.180 to 0.129): no BP4 missense path applied
  • BP4_splice not applied when PVS1 is triggered (canonical splice +/-1,2 with constraint pass)

Databases: dbNSFP 4.9c (bayesdel_noaf_score), SpliceAI (max_score), VEP (consequence)

Known limitations

  • BayesDel_noAF does not reach BP4_Strong per Pejaver calibration data
  • SpliceAI guard prevents BP4 missense path for variants with any predicted splice impact
  • BP4_splice covers only non-canonical splice region consequences. Deep intronic variants without VEP splice tags require RNA validation per ClinGen SVI (Walker 2023).
BP6

Reputable source reports variant as benign

Supporting

Conditions

  • ClinVar clinical significance: Benign, Benign/Likely_benign, or Likely_benign
  • ClinVar review stars >= 1

Databases: ClinVar (clinical_significance, review_stars)

Known limitations

  • ClinGen SVI has recommended retiring BP6 as a standalone criterion; retained for maximum sensitivity
BP7

Synonymous variant with no predicted impact on splicing

Supporting

Conditions

  • Consequence: synonymous variant
  • Not in a splice region (consequence does not contain "splice_region")
  • SpliceAI max_score <= 0.1 or absent
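
The three BP7 conditions combine into a single boolean check. A minimal sketch, assuming the consequence field is a VEP-style string in which multiple terms are joined (for example with "&"); the function name is hypothetical.

```python
from typing import Optional


def bp7_applies(consequence: str, spliceai_max: Optional[float]) -> bool:
    """Synonymous, outside any splice region, and no predicted splice impact."""
    if "synonymous_variant" not in consequence:
        return False
    if "splice_region" in consequence:
        return False
    # An absent SpliceAI score is accepted, as stated in the conditions
    return spliceai_max is None or spliceai_max <= 0.1
```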

Databases: VEP (consequence), SpliceAI (max_score)

Known limitations

  • Conservation filter intentionally omitted per Walker et al. 2023 Table S13 recommendation ("no improvement in negative predictive value" with conservation filter)

Aligned with ClinGen SVI 2023 (Walker et al.) Figure 4 decision tree for synonymous variant classification.

Computational Predictors (PP3 / BP4)

PP3 and BP4 use BayesDel_noAF as the primary classification tool with ClinGen SVI calibrated thresholds (Pejaver et al. 2022). BayesDel is a Bayesian framework that integrates deleteriousness scores from multiple underlying predictors into a single calibrated score. The noAF variant (without allele frequency) is used to avoid circular reasoning with PM2, BA1, and BS1 frequency criteria.

BayesDel_noAF Score | Evidence Strength | ACMG Code | Bayesian Points
>= 0.518 | Strong pathogenic | PP3_Strong | +4
0.290 - 0.517 | Moderate pathogenic | PP3_Moderate | +2
0.130 - 0.289 | Supporting pathogenic | PP3_Supporting | +1
-0.180 to 0.129 | Indeterminate | None | 0
-0.360 to -0.181 | Supporting benign | BP4_Supporting | -1
<= -0.361 | Moderate benign | BP4_Moderate | -2
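
The threshold table maps directly onto a lookup function. A minimal sketch with a hypothetical function name. It assumes scores falling between the three-decimal band edges (for example 0.5175) belong to the lower-strength band, and it omits the missense-only, relevance, and SpliceAI guards described in the PP3 and BP4 sections.

```python
from typing import Optional, Tuple


def bayesdel_evidence(score: Optional[float]) -> Tuple[Optional[str], int]:
    """Map a BayesDel_noAF score to (ACMG code, Bayesian points)."""
    if score is None:
        return None, 0
    if score >= 0.518:
        return "PP3_Strong", 4
    if score >= 0.290:
        return "PP3_Moderate", 2
    if score >= 0.130:
        return "PP3_Supporting", 1
    if score <= -0.361:
        return "BP4_Moderate", -2
    if score <= -0.181:
        return "BP4_Supporting", -1
    return None, 0  # indeterminate range: neither PP3 nor BP4
```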

PM1 + PP3 Double-Counting Guard

When PM1 (functional domain, Moderate = 2 points) applies alongside PP3_Strong (4 points), the combined evidence would exceed the ClinGen SVI recommended cap of Strong equivalent (4 points). In this case, PP3_Strong is downgraded to PP3_Moderate (2 points), yielding a combined 2 + 2 = 4 points. PP3_Moderate and PP3_Supporting are not affected by this cap.
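
The guard amounts to a single substitution in the list of triggered criteria. A minimal sketch, assuming criteria are held as a list of code strings; the function name is hypothetical.

```python
from typing import List


def apply_pm1_pp3_cap(criteria: List[str]) -> List[str]:
    """Downgrade PP3_Strong to PP3_Moderate when PM1 is also present."""
    if "PM1" in criteria and "PP3_Strong" in criteria:
        # 2 (PM1) + 2 (PP3_Moderate) = 4 points, the Strong-equivalent cap
        return ["PP3_Moderate" if c == "PP3_Strong" else c for c in criteria]
    return list(criteria)


print(apply_pm1_pp3_cap(["PM1", "PM2", "PP3_Strong"]))
```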

Why BayesDel

The ClinGen SVI Working Group calibrated four computational tools (BayesDel, MutPred2, REVEL, VEST4) and demonstrated that a single calibrated tool with evidence strength modulation provides more accurate classification than a fixed-threshold multi-predictor consensus. BayesDel_noAF was selected because it explicitly excludes allele frequency from its model (avoiding circular reasoning with PM2/BA1/BS1), is precomputed in dbNSFP 4.9c, and reaches both PP3_Strong and BP4_Moderate in the Pejaver calibration - providing the widest evidence strength range among available tools.

Display Predictors (Not Used in Classification)

The following predictors are available alongside every variant for the reviewing geneticist to inspect, but they are not used in the PP3/BP4 classification logic. They were used in the weighted consensus approach prior to v3.4 and are retained for clinical reference:

SIFT
AlphaMissense
MetaSVM
DANN
PhyloP
GERP

SpliceAI Integration

Splice impact predictions are integrated following ClinGen Sequence Variant Interpretation (SVI) Working Group recommendations (Walker et al., 2023).

SpliceAI predicts the impact of each variant on mRNA splicing through four delta scores: acceptor gain (DS_AG), acceptor loss (DS_AL), donor gain (DS_DG), and donor loss (DS_DL). The maximum of these four scores is used for classification thresholds.

Scores are sourced from Ensembl precomputed MANE transcript predictions, not computed at runtime. This ensures reproducibility and avoids runtime dependencies on external services.

Thresholds

  • SpliceAI >= 0.8 → PP3_splice_Strong: Strong evidence for spliceogenicity (+4 points). Requires disease association gate (v3.15.1).
  • SpliceAI 0.5 - 0.799 → PP3_splice_Moderate: Moderate evidence for spliceogenicity (+2 points).
  • SpliceAI 0.2 - 0.499 → PP3_splice_Supporting: Supporting evidence for spliceogenicity (+1 point).
  • SpliceAI < 0.1 → BP4 guard: Required for BP4 to apply. Prevents benign classification when splice impact is predicted.
  • SpliceAI <= 0.1 → BP7: Required for synonymous variant BP7 classification. Confirms no splice impact.
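
The pathogenic splice thresholds can be sketched as follows. The function names are hypothetical. The sketch models the maximum over the four delta scores and the PVS1 exclusion only. It does not model the disease association gate for PP3_splice_Strong, because the behaviour when that gate fails is not described here.

```python
from typing import Optional, Tuple


def spliceai_max(ds_ag: float, ds_al: float, ds_dg: float, ds_dl: float) -> float:
    """Maximum of the four SpliceAI delta scores."""
    return max(ds_ag, ds_al, ds_dg, ds_dl)


def pp3_splice(max_score: Optional[float], pvs1_applies: bool) -> Tuple[Optional[str], int]:
    """Map a SpliceAI max score to (criterion code, Bayesian points)."""
    if max_score is None or pvs1_applies:
        return None, 0  # no score, or PVS1 double-counting guard
    if max_score >= 0.8:
        return "PP3_splice_Strong", 4  # additionally gated on disease association
    if max_score >= 0.5:
        return "PP3_splice_Moderate", 2
    if max_score >= 0.2:
        return "PP3_splice_Supporting", 1
    return None, 0
```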

PVS1 Double-Counting Guard

When PVS1 (loss-of-function) is triggered for a variant, PP3_splice is not applied. This prevents double-counting the same biological mechanism (splice disruption leading to loss of function) as both PVS1 and PP3 evidence, per ClinGen SVI recommendation.

Reference: Walker et al., Am J Hum Genet. 2023;110(7):1046-1067. PMID: 37352859

Classification Logic

Classification uses the Bayesian point-based framework (Tavtigian et al. 2018, 2020) which is mathematically equivalent to the original 18 ACMG combining rules while providing proper classifications for evidence combinations not explicitly covered by the 2015 guidelines.

Bayesian Point System (Primary)

Each evidence criterion contributes points based on its strength level. The total determines classification:

Pathogenic Evidence Points

  • Very Strong (PVS): +8
  • Strong (PS): +4
  • Moderate (PM): +2
  • Supporting (PP): +1

Benign Evidence Points

  • Stand-alone (BA1): Override to Benign
  • Strong (BS): -4
  • Supporting (BP): -1

Classification Thresholds

Classification | Point total
Pathogenic | >= 10 pts
Likely Pathogenic | 6 to 9 pts
VUS | 0 to 5 pts
Likely Benign | -1 to -6 pts
Benign | <= -7 pts

High-Confidence Conflict Safety Check

When pathogenic evidence at Strong or Very Strong level conflicts with Strong benign evidence (BS), the variant is flagged for manual review regardless of the point total. This prevents automated resolution of genuinely conflicting high-quality evidence.
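
Point summation, threshold lookup, and the conflict flag can be sketched together. This is an illustration of the point system only, with hypothetical names; it omits the gating guards attached to individual Likely Pathogenic rules. The Likely Benign / Benign boundary follows the B2 rule note in this documentation (-6 points is Likely Benign; Benign requires <= -7 points, per Tavtigian 2020), and the -2 value for Moderate benign corresponds to BP4_Moderate.

```python
from typing import Iterable, Tuple

PATHOGENIC_POINTS = {"very_strong": 8, "strong": 4, "moderate": 2, "supporting": 1}
BENIGN_POINTS = {"strong": -4, "moderate": -2, "supporting": -1}


def classify(total: int) -> str:
    """Map a Bayesian point total to a classification."""
    if total >= 10:
        return "Pathogenic"
    if total >= 6:
        return "Likely Pathogenic"
    if total >= 0:
        return "VUS"
    if total >= -6:
        return "Likely Benign"
    return "Benign"


def evaluate(pathogenic: Iterable[str], benign: Iterable[str],
             ba1: bool = False) -> Tuple[int, str, bool]:
    """Return (point total, classification, manual-review flag)."""
    pathogenic, benign = list(pathogenic), list(benign)
    total = (sum(PATHOGENIC_POINTS[s] for s in pathogenic)
             + sum(BENIGN_POINTS[s] for s in benign))
    if ba1:
        return total, "Benign", False  # stand-alone override
    # High-confidence conflict: Strong/Very Strong pathogenic vs Strong benign
    conflict = (any(s in ("strong", "very_strong") for s in pathogenic)
                and "strong" in benign)
    return total, classify(total), conflict
```

For example, one Moderate pathogenic criterion against one Strong benign criterion, `evaluate(["moderate"], ["strong"])`, sums to -2 and is not flagged, while `evaluate(["very_strong"], ["strong"])` is flagged for manual review whatever the total.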

ACMG 2015 Combining Rules (Reference)

The original 18 ACMG 2015 combining rules are a special case of the Bayesian point system - every rule produces the same classification under both approaches. They are retained here as a reference.

Pathogenic (8 rules)

  • P1: 1 Very Strong (PVS) + >= 1 Strong (PS)
  • P2: 1 Very Strong (PVS) + >= 2 Moderate (PM)
  • P3: 1 Very Strong (PVS) + 1 Moderate (PM) + 1 Supporting (PP)
  • P4: 1 Very Strong (PVS) + >= 2 Supporting (PP)
  • P5: >= 2 Strong (PS)
  • P6: 1 Strong (PS) + >= 3 Moderate (PM)
  • P7: 1 Strong (PS) + 2 Moderate (PM) + >= 2 Supporting (PP)
  • P8: 1 Strong (PS) + 1 Moderate (PM) + >= 4 Supporting (PP)

Likely Pathogenic (7 rules)

  • LP1: 1 Very Strong (PVS) + 1 Moderate (PM) - gated by ar_biallelic_het_lof_guard (v3.25.5): blocked for het LoF in AR-only LoF genes without compound het partner (carrier state)
  • LP1b: 1 Very Strong (PVS) or PVS1_Strong + >= 1 Supporting (PP) - ClinGen SVI PM2 downgrade accommodation (v3.20.0); same het LoF guard as LP1
  • LP2: 1 Strong (PS) + 1-2 Moderate (PM) - gated by computational-only LP guard (v3.25.6 CR-064 / v3.25.9 CR-067): blocked when sole Strong is PP3_Strong or PP3_splice_Strong with PM2 only and no PP4/PP5/PS1/PM3/PM4. v3.27.2 CR-081: bypass for PP3_splice_Strong in AD/XL LoF-intolerant genes with disease association and splice proximity.
  • LP3: 1 Strong (PS) + >= 2 Supporting (PP) - same het LoF guard as LP1
  • LP4: >= 3 Moderate (PM) - requires gene_has_disease_association (v3.16.7 CR-021); blocked by ar_biallelic_missense_guard (v3.25.3 CR-061), computational_only_lp_guard (v3.25.4 CR-062), ar_biallelic_het_lof_guard (v3.25.5 CR-063)
  • LP5: 2 Moderate (PM) + >= 2 Supporting (PP) - same disease association gate and three guards as LP4
  • LP6: 1 Moderate (PM) + >= 4 Supporting (PP) - same disease association gate and three guards as LP4

Benign (2 rules)

  • B1: 1 Stand-alone (BA1) - frequency > 5%
  • B2: >= 2 actual Strong benign (BS1, BS2) - v3.26.6 (BB-007): explicit Strong-only count. BP4_Moderate (Moderate benign per Pejaver 2022) is included in bs_count for LB1 participation but excluded from B2. Tavtigian 2020 (PMID:32720330): BS1(-4) + BP4_Moderate(-2) = -6 pts = LB, not B (requires <= -7 pts).

Likely Benign (3 rules)

  • LB1: 1 Strong benign (BS or BP4_Moderate mapped to bs_count) + 1 Supporting benign (BP)
  • LB1b: 1 actual Strong benign (BS1 or BS2) + BP4_Moderate - v3.26.6 (BB-007) explicit fallthrough: Tavtigian BS1(-4) + BP4_Moderate(-2) = -6 pts = LB. Prevents BS1 + BP4_Moderate (without other BP) from incorrectly falling to VUS.
  • LB2: >= 2 Supporting benign (BP)

Conflicting Evidence Handling

The Bayesian point system naturally handles most conflicting evidence through point summation. For example, PM2 (+2) and BS1 (-4) yield a net of -2 = Likely Benign. Under the previous v3.3 system, this combination would have been flagged as conflicting and defaulted to VUS. The point-based approach is more nuanced and clinically appropriate. However, when pathogenic evidence at Strong or Very Strong level directly conflicts with Strong benign evidence, the variant is flagged for manual review regardless of point total (see High-Confidence Conflict Safety Check above). BA1 remains a stand-alone override handled at a higher priority level.

VCEP Gene-Specific Specifications

ClinGen Variant Curation Expert Panels (VCEPs) adapt generic ACMG/AMP 2015 criteria to specific genes or diseases. Helena implements approved VCEP specifications as an optional overlay on top of the standard classification.

How It Works

The standard classification pipeline runs first, producing a generic ACMG classification for all variants. For variants in genes with available VCEP specifications, gene-specific thresholds are applied via a lightweight overlay that modifies frequency cutoffs (BA1, BS1, PM2) and criterion applicability (PVS1) based on the published VCEP specification. The Bayesian point total is then recalculated with the modified criteria.

What VCEPs Modify

  • BA1: Gene-specific allele frequency threshold for stand-alone benign (e.g., 0.1% for Cardiomyopathy genes instead of generic 5%)
  • BS1: Gene-specific elevated frequency threshold, already mode-of-inheritance-aware
  • PM2: Gene-specific absent-in-controls threshold (e.g., 0% for RASopathy genes)
  • PVS1: Gene-specific applicability gate. Set to FALSE for gain-of-function genes (e.g., MYOC in glaucoma)
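
One way to picture the overlay is as a per-gene dictionary merged over generic defaults. This is a hypothetical sketch, not Helena's data model: the key names and function are invented, and the two entries only restate examples from this page (0.1% BA1 for a Cardiomyopathy gene, PVS1 disabled for MYOC).

```python
from typing import Any, Dict

# Generic defaults (only values stated in this documentation)
GENERIC: Dict[str, Any] = {"ba1_af": 0.05, "pvs1_applicable": True}

# Illustrative per-gene overrides; real specifications come from ClinGen CSpec
VCEP_SPECS: Dict[str, Dict[str, Any]] = {
    "MYH7": {"ba1_af": 0.001},
    "MYOC": {"pvs1_applicable": False},
}


def effective_settings(gene: str, vcep_enabled: bool = True) -> Dict[str, Any]:
    """Generic settings, overlaid with gene-specific values when the overlay is on."""
    settings = dict(GENERIC)
    if vcep_enabled:
        settings.update(VCEP_SPECS.get(gene, {}))
    return settings
```

After the effective settings are resolved, the affected criteria would be re-evaluated and the point total recalculated, as described under How It Works.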

Toggle Behavior

VCEP overlay is enabled by default and can be disabled per case in case settings. When enabled, variants in VCEP-covered genes display an audit trail marker (e.g., "[VCEP:Hearing Loss v1.0]") in the criteria string.

Coverage

Approved VCEP specifications are sourced from the ClinGen Criteria Specification Registry (CSpec). Approximately 50-60 genes have published specifications, including panels for Hearing Loss, Cardiomyopathy (MYH7, MYBPC3), RASopathy (PTPN11, BRAF, SOS1), PTEN, CDH1, TP53, PAH, and BRCA1/BRCA2 (ENIGMA). For all other genes, generic ACMG 2015 thresholds are used.

ClinVar Override Logic

ClinVar clinical significance assertions are used as classification evidence, but only under specific conditions that prevent overriding computational evidence when conflicts exist.

When ClinVar override IS applied

ClinVar has a Pathogenic, Likely Pathogenic, Benign, or Likely Benign assertion with at least 1 review star, AND no conflicting computational evidence exists (no BA1, no conflicting pathogenic+benign criteria at moderate+ strength). For P/LP assertions with only 1 review star (single submitter), the gene must also have an established disease association via at least one of five sources: ClinVar P/LP (gene-level, requires 2+ review stars to prevent circular self-validation), Orphanet, ClinGen haploinsufficiency score, curated AR LoF gene list, or VCEP coverage (v3.10.0 disease association gate, v3.10.1 CTE fix). P/LP assertions with 2+ review stars bypass this gate. Benign/Likely Benign assertions are not gated.

When ClinVar override is NOT applied

The override is not applied when any of the following holds:

  • BA1 applies (frequency > 5% always overrides ClinVar)
  • Conflicting evidence exists (pathogenic and benign criteria both triggered)
  • ClinVar asserts VUS (VUS does not override computational classification)
  • ClinVar review stars are below the minimum threshold
  • ClinVar P/LP with 1 review star in a gene without established disease association (v3.10.0 gate, v3.10.1 circular reference fix: the gene-level ClinVar disease check requires 2+ stars to prevent a 1-star submission from self-validating its own gate)

Default minimum review stars for override: 1 (configurable per deployment)
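
The override conditions can be summarised as a decision function. A minimal sketch with hypothetical names; the return values "ClinVar" and "ClinVar_gated" mirror the criteria string markers, and None is used here for "no override". Detection of conflicting evidence and of gene-level disease association is assumed to happen upstream.

```python
from typing import Optional

P_LP = {"Pathogenic", "Pathogenic/Likely_pathogenic", "Likely_pathogenic"}
B_LB = {"Benign", "Benign/Likely_benign", "Likely_benign"}


def clinvar_override(
    significance: Optional[str],
    review_stars: int,
    ba1_applies: bool,
    has_conflict: bool,
    gene_has_disease_association: bool,
    min_stars: int = 1,
) -> Optional[str]:
    """Return 'ClinVar', 'ClinVar_gated', or None when no override is applied."""
    if ba1_applies or has_conflict:
        return None
    if significance not in P_LP | B_LB:
        return None  # VUS or missing assertion never overrides
    if review_stars < min_stars:
        return None
    # 1-star P/LP assertions must pass the disease association gate
    if significance in P_LP and review_stars < 2 and not gene_has_disease_association:
        return "ClinVar_gated"
    return "ClinVar"
```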

When ClinVar classification is used, the criteria string includes "ClinVar" as the first element (e.g., "ClinVar,PM2,PP3") to make the evidence source explicit. When a ClinVar P/LP override is blocked by the disease association gate (v3.10.0), the criteria string shows "ClinVar_gated" as the first element to indicate that the ClinVar assertion was present but not applied. ClinVar review star level is available alongside every variant for the reviewing geneticist to assess assertion quality.

Manual Review Criteria (9 of 28)

These criteria require information that cannot be determined from a single-sample VCF file - family segregation data, functional study results, confirmed de novo status, or case-level clinical context. They must be evaluated by the reviewing geneticist.

PS2

De novo variant (confirmed paternity and maternity)

Requires trio sequencing data and confirmed parental relationships. Cannot be determined from single-sample VCF analysis.

PS3

Well-established in vitro or in vivo functional studies show a deleterious effect

Requires curation of published functional assay data. Automated literature extraction of functional evidence is not yet implemented.

PS4

Prevalence of the variant in affected individuals is significantly increased compared with controls

Requires case-control study data or odds ratios not available in standard annotation databases.

PM5

Novel missense change at an amino acid residue where a different pathogenic missense has been observed

Disabled (pending implementation)

Currently disabled. Requires normalized HGVSp matching against ClinVar at the amino acid position level. Will be enabled when ClinVar preprocessing provides standardized protein-level coordinates.

PM6

Assumed de novo without confirmation of paternity and maternity

Requires family structure information not available in single-sample analysis.

PP1

Cosegregation with disease in multiple affected family members

Requires multi-generational pedigree data and segregation analysis.

BS3

Well-established in vitro or in vivo functional studies show no deleterious effect

Requires curation of published functional assay data (benign counterpart of PS3).

BS4

Lack of segregation in affected members of a family

Requires family segregation data not available in single-sample analysis.

BP5

Variant found in a case with an alternate molecular basis for disease

Requires clinical case-level information about alternative diagnoses.

Quality Filtering

Three configurable quality presets control the stringency of variant filtering. Quality filtering occurs before annotation and classification.

Preset | Quality (QUAL) | Depth (DP) | Genotype Quality (GQ) | Recommended Use
Strict | >= 30 | >= 20 | >= 30 | High-confidence clinical reporting
Balanced | >= 20 | >= 15 | >= 20 | Standard clinical analysis (default)
Permissive | >= 10 | >= 10 | >= 10 | Maximum sensitivity / research

ClinVar Rescue Mechanism

Variants with documented clinical significance in ClinVar (Pathogenic or Likely Pathogenic) that fail quality thresholds are not discarded. They are flagged as rescued variants and proceed through the classification pipeline. This prevents clinically significant findings from being silently excluded due to sequencing quality in low-coverage regions - a deliberate design decision for clinical safety.
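
The presets and the rescue mechanism can be sketched as a three-way decision. The function name, preset keys, and return labels are hypothetical; the thresholds are those of the preset table, and the ClinVar significance strings follow the spelling used in the PP5 section.

```python
from typing import Optional

# (min QUAL, min DP, min GQ) per preset
PRESETS = {
    "strict": (30, 20, 30),
    "balanced": (20, 15, 20),
    "permissive": (10, 10, 10),
}
RESCUE_SIGNIFICANCE = {"Pathogenic", "Pathogenic/Likely_pathogenic", "Likely_pathogenic"}


def quality_decision(qual: float, dp: int, gq: int,
                     clinvar_significance: Optional[str] = None,
                     preset: str = "balanced") -> str:
    """Return 'pass', 'rescued', or 'filtered' for one variant."""
    min_qual, min_dp, min_gq = PRESETS[preset]
    if qual >= min_qual and dp >= min_dp and gq >= min_gq:
        return "pass"
    if clinvar_significance in RESCUE_SIGNIFICANCE:
        return "rescued"  # fails thresholds but is protected from filtering
    return "filtered"
```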

Limitations and Disclaimers

  • Helena is a clinical decision support tool, not a diagnostic device. All classifications require review and confirmation by a qualified clinical geneticist.
  • 9 of 28 ACMG criteria require information not available from single-sample VCF analysis (segregation, functional studies, de novo confirmation). These criteria must be evaluated manually by the reviewing geneticist.
  • SpliceAI predictions are computational. RNA splicing studies remain the gold standard for confirming splice-altering effects.
  • Population frequency data from gnomAD may underrepresent certain ethnic groups and geographic populations. Allele frequency thresholds should be interpreted in the context of the patient's ancestry.
  • ClinVar assertions vary in quality and currency. Review star levels are displayed alongside all ClinVar-derived evidence to enable informed interpretation.
  • Structural variants (SVs), copy number variants (CNVs), and repeat expansions are not currently classified by this pipeline.
  • Mitochondrial variants are processed through the same pipeline using nuclear ACMG rules as an approximation. Dedicated mitochondrial classification guidelines (e.g., MitoMap criteria) are not yet implemented.
  • VCEP gene-specific specifications are implemented as a threshold overlay for BA1, BS1, PM2, and PVS1 applicability. The overlay does not implement VCEP-specific functional assay interpretation (PS3/BS3) or gene-specific segregation logic (PP1/BS4), which require manual curation. Coverage is limited to approximately 50-60 genes with published ClinGen CSpec specifications; all other genes use generic ACMG 2015 thresholds.
  • PM5 (novel missense at known pathogenic amino acid position) is currently disabled pending standardized protein-level coordinate matching in the ClinVar preprocessing pipeline.
  • Compound heterozygote detection is inferred from genotype data without long-read phasing or trio analysis. Formal phasing should be performed for clinical confirmation.
  • The curated autosomal recessive LoF gene list (~150 genes) for PVS1 bypass covers genes with Definitive/Strong evidence for biallelic LoF disease mechanism. AR genes outside this list that lack constraint (low pLI, high LOEUF) will not trigger PVS1 unless they have VCEP gene-specific specifications.
  • Homozygous reference genotypes (hom_ref) from multi-allelic VCF sites are excluded from classification. These represent alleles not present in the patient.
  • Subtractive sophistication guards (v3.25.3 ar_biallelic_missense_guard, v3.25.4 computational_only_lp_guard, v3.25.5 ar_biallelic_het_lof_guard, v3.25.6 LP2 PP2 bypass fix, v3.25.9 LP2 PP3_splice_Strong extension, v3.26.6 B2 Strong-only count) deliberately downgrade Likely Pathogenic to VUS or Benign to Likely Benign when the evidence profile is computational-only or biologically inconsistent with the gene mechanism. These guards reduce false positives but may also downgrade variants that subsequent functional or segregation evidence would re-elevate. Reviewing geneticists should treat downgraded variants as candidates for additional evidence gathering rather than terminal classifications.
  • gnomAD LoF tolerance signal (v3.24.0, HELIX-CR-2026-057): for HIGH impact variants outside MANE Select CDS in genes with documented homozygous or hemizygous LoF carriers in healthy gnomAD individuals, a BP_regional Supporting benign criterion is applied (v3.25.0, CR-057 Phase 3). AD genes are excluded - het LoF tolerance in AD is a different clinical argument. Compound het candidates are excluded - they may contribute to biallelic LoF.
  • ClinGen negative evidence guard (v3.26.9, HELIX-CR-2026-078): genes with ClinGen Gene-Disease Validity classifications limited to Limited / Disputed / Refuted (no Definitive / Strong / Moderate) are no longer accepted as having disease association via Orphanet entry alone. This blocks ClinVar 1-star override and LP rules in genes with negative ClinGen evaluation, preventing single-submitter ClinVar promotion in genes that ClinGen has examined and found insufficient evidence.
  • Results should always be interpreted in the context of the patient's clinical presentation, family history, and other available clinical information.

Version History

Every methodology change is versioned and documented. The version number corresponds to the classification engine version in production.

v3.30.0 | April 2026 | Current
  • PM1: UniProt residue-level evidence integration (HELIX-CR-2026-082). Three-tier PM1 evidence model replaces domain-level Pfam-only logic. Tier 1 (primary): UniProt SwissProt human reviewed proteome, 113,126 PM1-eligible features across 14,278 proteins (DISULFID, ACT_SITE, BINDING, MOD_RES, MOTIF). Tier 2 (fallback): InterPro Pfam critical domains (~50 domains, retained for VCEP-defined regions). Tier 3 (reserved): VCEP-specific PM1 region overrides (future). PM1 boolean unchanged (single Moderate strength), preserving existing combining rules.
  • PM1 evidence density: 2,262x increase over v3.28 (50 hardcoded Pfam domains -> 113,126 expert-curated UniProt residues). Architecture follows Helena Pattern A (Parquet as source of truth, DuckDB as disposable cache). hgvs_protein parsing handles NULL and empty values, multi-transcript pipe-delimited strings, and version-suffixed ENSP identifiers (suffix stripped), and applies a missense-only consequence guard.
  • Naked feature handling (Strategy D): UniProt features lacking /evidence= and /note= qualifiers (2,067 features, 1.8% of total) excluded from PM1 eligibility by default. Config flag pm1_uniprot_include_naked allows clinical override for specific cases (e.g., INSR Cys234Tyr disulfide bond Cys223-Cys234, naked DISULFID). Default conservative; opt-in for full UniProt coverage when curator inclusion in SwissProt is treated as curation evidence.
  • Audit trail: pm1_evidence_source column populated for every PM1 hit (uniprot:DISULFID:223-234:P06213 or pfam:PF00069). Frontend variant detail panel displays human-readable evidence source with UniProt entry link. Replaces generic "PM1: critical functional domain" annotation.
  • InterPro Pfam loader Pattern B remediation: load_interpro_pfam.py migrated from direct DROP TABLE + INSERT to Parquet-first architecture, matching all other reference datasets. CRITICAL_DOMAINS dict retained as manual override layer.
  • Reference: Richards 2015 PMID:25741868 (PM1 definition), Walker 2023 PMID:37352859 (ClinGen SVI Section 3.3), UniProt Consortium 2023 PMID:36408920.
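The pm1_evidence_source format quoted above can be unpacked for display or auditing. The following is a minimal sketch, not the production parser: the Pm1Evidence type and parse_pm1_evidence_source function are hypothetical names, and the handling of single-residue features (no dash in the position field) is an assumption.

```python
from typing import NamedTuple, Optional


class Pm1Evidence(NamedTuple):
    source: str               # "uniprot" or "pfam"
    feature: str              # UniProt feature type or Pfam accession
    start: Optional[int]      # first residue (UniProt hits only)
    end: Optional[int]        # last residue (UniProt hits only)
    accession: Optional[str]  # UniProt accession (UniProt hits only)


def parse_pm1_evidence_source(value: str) -> Pm1Evidence:
    """Split a pm1_evidence_source string into typed fields."""
    parts = value.split(":")
    if parts[0] == "uniprot" and len(parts) == 4:
        _, feature, span, accession = parts
        start_text, _, end_text = span.partition("-")
        start = int(start_text)
        # Assumption: a single-residue feature is written without a dash.
        end = int(end_text) if end_text else start
        return Pm1Evidence("uniprot", feature, start, end, accession)
    if parts[0] == "pfam" and len(parts) == 2:
        return Pm1Evidence("pfam", parts[1], None, None, None)
    raise ValueError(f"unrecognized pm1_evidence_source: {value!r}")
```

Rejecting unknown prefixes with an error, rather than returning a partial record, keeps malformed audit strings visible.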
v3.28.0 | April 2026
  • Processing notes refactor (HELIX-CR-HEL-VA-2026-001): sentinel-guarded deduplication of classifier-generated processing notes. All 14 ACMG note categories prefixed with [ACMG] sentinel. Symmetric refactor in artifact detection stage with [ARTIFACT] sentinel. Eliminates stale note preservation across reclassification runs (Finding-1: Case 13 IFT140 missing Cat 4 bypass note) and N-fold note duplication on repeated classifier runs (Finding-2: AIRE sessions 11-18x duplication). Zero classification logic changes.
  • Note structure invariant: [ACMG] and [ARTIFACT] note text must not contain "; " internally. MNV notes structurally safe (split/rejoin lossless for internal STRING_AGG).
  • Scope: acmg_classifier.py (18 hunks), artifact_detection_stage.py (3 hunks), rerun_classifier_pipeline.py (1 hunk).
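The sentinel-guarded deduplication described above can be illustrated in a few lines. This is a sketch under stated assumptions: merge_notes is a hypothetical helper, the sentinel is assumed to be followed by a single space, and the production logic (which runs inside the classifier) may differ in detail.

```python
from typing import List, Optional

ACMG_SENTINEL = "[ACMG]"
SEPARATOR = "; "


def merge_notes(existing: Optional[str], new_acmg_notes: List[str]) -> str:
    """Replace all classifier-generated notes, keeping every other note."""
    # Drop stale classifier notes; notes without the sentinel are preserved.
    kept = [
        note for note in (existing or "").split(SEPARATOR)
        if note and not note.startswith(ACMG_SENTINEL)
    ]
    fresh: List[str] = []
    for note in new_acmg_notes:
        if SEPARATOR in note:
            # Invariant from the text: note bodies must not contain "; ".
            raise ValueError(f"note contains separator: {note!r}")
        tagged = note if note.startswith(ACMG_SENTINEL) else f"{ACMG_SENTINEL} {note}"
        if tagged not in fresh:
            fresh.append(tagged)
    return SEPARATOR.join(kept + fresh)
```

Because stale sentinel-prefixed notes are removed before the new ones are appended, running the merge repeatedly with the same input yields the same string, which is the property that prevents N-fold duplication.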
v3.27.2 | April 2026
  • LP2 splice bypass for AD/XL LoF genes (HELIX-CR-2026-081). LP2 computational-only guard (CR-042/064/067) extended with bypass for PP3_splice_Strong (SpliceAI >= 0.8) in autosomal dominant or X-linked dominant LoF-intolerant genes with established disease association. SpliceAI >= 0.8 in a LoF gene is qualitatively different from BayesDel missense PP3_Strong + PM2: high-confidence splice disruption prediction in an established LoF-mechanism gene approaches a functional LoF proxy.
  • Bypass conditions (six mandatory): has_pp3_splice_strong AND NOT has_pp3_strong AND ad_lof_constraint_satisfied AND gene_has_disease_association AND splice_proximity_satisfied AND NOT outside_mane_region. MANE guard prevents bypass from reactivating LP through PP3_splice_Strong backdoor when PVS1 is blocked for positional reasons.
  • ad_lof_constraint_satisfied excludes AR-only paths (carrier state, not haploinsufficiency), GoF/DN genes via gof_genes_exclusive + GOF_AD_GENES (analogous to PVS1 CR-022 v3.16.8), and dual-mechanism biallelic LoF genes that retain biallelic-only mechanism (CR-079 compatible: IFT140 retains AD path).
  • splice_proximity_satisfied excludes pure missense (CR-066 handles dual-consequence), pure synonymous without splice VEP tag, UTR variants, and deep intronic without splice VEP (requires RNA per Walker 2023).
  • Additive change: VUS to LP. Cannot create new B/LB or remove existing P/LP. Clinical trigger: HG2024_252 TCOF1 c.1279-6C>G VUS vs CG Pathogenic.
  • Reference: Jaganathan 2019 PMID:30661751, Walker 2023 PMID:37352859, Abou Tayoun 2018 PMID:30192042, Richards 2015 PMID:25741868.
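The six mandatory bypass conditions form a single conjunction. A minimal sketch follows, using the flag names from the text; the Lp2SpliceFlags container and lp2_splice_bypass function are hypothetical, and how each flag is computed upstream is not shown.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Lp2SpliceFlags:
    has_pp3_splice_strong: bool
    has_pp3_strong: bool
    ad_lof_constraint_satisfied: bool
    gene_has_disease_association: bool
    splice_proximity_satisfied: bool
    outside_mane_region: bool


def lp2_splice_bypass(flags: Lp2SpliceFlags) -> bool:
    """All six conditions must hold for the bypass to apply."""
    return (
        flags.has_pp3_splice_strong
        and not flags.has_pp3_strong
        and flags.ad_lof_constraint_satisfied
        and flags.gene_has_disease_association
        and flags.splice_proximity_satisfied
        and not flags.outside_mane_region
    )


# Illustrative profile in which every condition is met.
eligible = Lp2SpliceFlags(True, False, True, True, True, False)
# Same profile, but positioned outside the MANE region: the bypass is refused.
outside_mane = replace(eligible, outside_mane_region=True)
```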
v3.27.1 | April 2026
  • Processing notes CASE priority fix (HELIX-CR-2026-080). Single first-match-wins CASE structure replaced with independent CASE WHEN concatenation. Variants matching multiple conditions now receive all applicable notes separated by semicolons.
  • Cat 2 pext guard: Added NOT outside_mane_region guard to prevent conflicting notes when variant is both outside MANE and has pext data.
  • Outer CASE guard: Checks if at least one condition matches before entering concatenation. ELSE preserves original processing_notes unchanged.
  • Zero classification impact: no changes to acmg_class, acmg_criteria, or confidence_score. Additive notes only. Trigger: BBS9 p.Glu501Ter (Case 3, HG2022_042) - pext downgrade note was masking ar_biallelic_het_lof_guard note.
v3.27.0 | April 2026
  • Dual-mechanism AD/AR LoF guard bypass (HELIX-CR-2026-079). ar_biallelic_het_lof_guard (CR-063) extended with bypass for genes with documented monoallelic LoF mechanism. Guard was designed for AR-only LoF genes (AMACR, CFTR, GBA1) where het LoF is carrier state. Dual-mechanism genes (biallelic LoF + monoallelic LoF) have an AD pathway where het LoF is pathogenic.
  • Two bypass paths: gene_disease_mechanism monoallelic LoF (definitive/strong/moderate) OR CLINVAR_LOF_AD_GENES Python constant (IFT140, GCM2). ClinGen AD Definitive/Strong path removed in v2 review: ClinGen AD MOI (mode of inheritance) does not distinguish LoF from GoF/DN.
  • Additive change: VUS to LP for het LoF in dual-mechanism AD/AR LoF genes. Cannot create new B/LB. Cannot remove existing P/LP. Affected gene: IFT140 (3rd ADPKD gene per Senum 2022 PMID:34890546).
  • Clinical trigger: IFT140 c.640del p.Val214CysfsTer55 (session 6692b88f). Phenotype: HP:0000107 (Renal cyst). Precedent: CR-037 (gof_genes_exclusive, v3.20.3) - analogous dual-mechanism bypass for PVS1/PM3 GoF guard.
v3.26.9 | April 2026
  • ClinGen negative evidence guard (HELIX-CR-2026-078). gene_has_disease_association Orphanet branch guarded by clingen_negative_genes CTE. When ClinGen has evaluated a gene AND all evaluations are non-actionable (Limited/Disputed/Refuted, no Definitive/Strong/Moderate), Orphanet entry alone is insufficient for gene_has_disease_association.
  • clingen_negative_genes CTE: SELECT DISTINCT gene_symbol from ClinGen GDV where classification IN (Limited/Disputed/Refuted) AND gene NOT IN disease_associated_genes_clingen. ~480 genes. Materialized by DuckDB.
  • Affects ClinVar 1-star override gate, LP4/LP5/LP6 disease gate, PVS1 disease gate, PP3_Strong disease gate, de novo projection P6/P7/P8 and LP4/5/6.
  • Subtractive change: LP -> VUS for 1-star ClinVar override in genes with ClinGen negative-only evaluation. Cannot create new P/LP. 205 affected genes (Orphanet + ClinGen negative-only out of 480 total).
  • Clinical trigger: TNC p.Arg108His LP [ClinVar,BP4] (PD2025_120). ClinGen Limited (Hearing Loss VCEP). BayesDel = -0.322 (benign). pLI = 0.0.
  • Reference: Strande 2017 PMID:28552198, Richards 2015 PMID:25741868.
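The clingen_negative_genes selection is a set difference: genes with at least one non-actionable evaluation, minus genes with any actionable one. The sketch below expresses that in Python rather than SQL; the function names and the (gene, classification) row shape are assumptions, and GENE_A / GENE_B in the usage note are placeholders.

```python
from typing import Iterable, Set, Tuple

ACTIONABLE = frozenset({"Definitive", "Strong", "Moderate"})
NON_ACTIONABLE = frozenset({"Limited", "Disputed", "Refuted"})


def clingen_negative_genes(gdv_rows: Iterable[Tuple[str, str]]) -> Set[str]:
    """Genes whose ClinGen evaluations are all Limited/Disputed/Refuted."""
    rows = list(gdv_rows)
    actionable = {gene for gene, cls in rows if cls in ACTIONABLE}
    negative = {gene for gene, cls in rows if cls in NON_ACTIONABLE}
    return negative - actionable


def orphanet_branch_satisfied(gene: str, orphanet_genes: Set[str],
                              negative_genes: Set[str]) -> bool:
    """An Orphanet entry counts only if ClinGen has not evaluated negatively."""
    return gene in orphanet_genes and gene not in negative_genes
```

For example, with rows ("TNC", "Limited"), ("GENE_A", "Limited"), ("GENE_A", "Definitive"), only TNC lands in the negative set, so an Orphanet entry for TNC alone no longer satisfies the gate while GENE_A is unaffected.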
v3.26.8 | April 2026
  • BS2 inheritance-aware AR threshold (HELIX-CR-2026-074). AR-only genes (Orphanet has_ar=TRUE, has_ad=FALSE, or HI=30) with early-onset-only diseases use bs2_ar_homozygote_threshold (10) instead of generic (15).
  • Onset guard from refdb.orphanet_disease_onset (HELIX-REF-005). EXISTS guard: gene must be in orphanet_disease_onset (no data = generic threshold). NOT EXISTS guard: gene must not have Adult/Elderly/All ages onset. Aligns with Richards 2015 Table 3: BS2 "with full penetrance expected at an early age".
  • AD BS2 from gnomAD not implemented. Harrison 2019 PMID:31159682: gnomAD is general population, not healthy controls. APC VCEP (Curia 2023 PMID:37805481) excludes gnomAD for AD BS2. XL BS2 deferred (ac_hemi column not in annotation pipeline).
  • Criteria string: BS2[AR-hom:N] for AR path, BS2 for generic path. Subtractive change: BS2 adds benign evidence only. Cannot create new P/LP. 2 P -> VUS via conflicting evidence (TGM5 p.Gly113Cys 14 hom, DUOX2 p.Phe966SerfsTer29 14 hom).
  • Clinical trigger: HG2022_073 CBS p.Ala114Val P with 6 hom in gnomAD (CBS excluded by 6 < 10 threshold and Adult onset blocking onset guard).
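The threshold selection above combines an inheritance check with the two onset guards. A minimal sketch, assuming a hypothetical bs2_threshold helper that receives the gene's onset categories as a set; the onset labels other than Adult, Elderly, and All ages are illustrative.

```python
from typing import Set

LATE_OR_UNSPECIFIC_ONSET = frozenset({"Adult", "Elderly", "All ages"})


def bs2_threshold(ar_only: bool, onset_categories: Set[str],
                  ar_threshold: int = 10, generic_threshold: int = 15) -> int:
    """Pick the gnomAD homozygote count required for BS2."""
    has_onset_data = bool(onset_categories)            # EXISTS guard
    early_only = not (onset_categories & LATE_OR_UNSPECIFIC_ONSET)  # NOT EXISTS guard
    if ar_only and has_onset_data and early_only:
        return ar_threshold
    return generic_threshold
```

A gene with any Adult onset entry falls back to the generic threshold, which matches the CBS outcome described above: the onset guard blocks the AR path, and 6 homozygotes would not reach either threshold.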
v3.26.7 | April 2026
  • PS1 minimum review stars correction (HELIX-CR-2026-073). ps1_min_stars: 1 -> 2 in processing.yaml. YAML override from commit 39f97c7e (2025-12-16) allowed PS1 (Strong) at ClinVar 1-star, bypassing v3.10.0 disease association gate. Code default was already 2.
  • PP5 restoration: review_stars = 1 ClinVar P/LP now correctly receives PP5 (Supporting) instead of PS1 (Strong). PP5 range (>= 1 AND < 2) no longer empty set.
  • PM3 partner validation: partner.review_stars >= ps1_min_stars (shared config, identical semantics per Richards 2015 Table 3: both PS1 and PM3 require established pathogenic variant). 0 PM3 variants in production have partner with review_stars = 1 (verified 151 sessions).
  • Subtractive change: PS1 removed from 1-star variants. Cannot create new P/LP. 3 LP -> VUS across 151 sessions (P4HA1, HLA-DRB1, NDUFAF7). All 3 genes without disease association in Orphanet/ClinGen/gene_disease_mechanism.
  • Clinical trigger: HG2022_016 P4HA1 chr10:73043951 LP [PS1,PM2] - deep intronic in gene without disease association.
  • Reference: Richards 2015 PMID:25741868 (Table 3, PS1/PM3 definitions), Walker 2023 PMID:37352859 (ClinGen SVI PS1 2-star minimum).
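The effect of the ps1_min_stars correction on the PP5 range can be checked directly. This sketch assumes the variant already has a ClinVar P/LP assertion and uses a hypothetical clinvar_criterion helper; it ignores the disease association gate and all other conditions.

```python
from typing import Optional


def clinvar_criterion(review_stars: int, ps1_min_stars: int = 2) -> Optional[str]:
    """Map ClinVar review stars to PS1, PP5, or no criterion."""
    if review_stars >= ps1_min_stars:
        return "PS1"
    if 1 <= review_stars < ps1_min_stars:
        return "PP5"
    return None
```

With ps1_min_stars = 1 the PP5 branch can never be reached, because no integer satisfies >= 1 AND < 1; with the corrected value of 2, a 1-star assertion yields PP5.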
v3.26.6 | April 2026
  • B2 combining rule BP4_Moderate exclusion (Bug Bounty BB-007). B2 (>= 2 Strong benign) replaced with explicit (has_bs1 + has_bs2) >= 2. BP4_Moderate (Moderate benign, -2 pts Tavtigian) was incorrectly counted in bs_count for B2 firing, causing B2 (>= 2 Strong) to fire with 1 Strong (BS1) + 1 Moderate (BP4_Moderate). Richards 2015 Table 5 rule (ii): B2 requires >= 2 Strong benign. Tavtigian 2020 PMID:32720330: BS1(-4) + BP4_Moderate(-2) = -6 pts = LB, not B (requires <= -7).
  • bs_count unchanged: BP4_Moderate remains in bs_count for LB1 participation (bs_count >= 1 AND bp_count >= 1). Only B2 check uses explicit Strong-only count.
  • LB1b fallthrough added: 1 actual Strong benign (BS1/BS2) + BP4_Moderate = LB. Prevents BS1 + BP4_Moderate (without other BP) from falling to VUS.
  • Subtractive change for B: B -> LB when BS1 + BP4_Moderate was the sole B2 trigger. Cannot create new P/LP. Cannot remove existing LB.
  • Reference: Richards 2015 PMID:25741868 (Table 5 rule ii), Tavtigian 2020 PMID:32720330, Pejaver 2022 PMID:36413997.
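The point arithmetic behind this fix is small enough to verify by hand. The sketch below uses the point values and the Benign cutoff quoted above; the function names are hypothetical and only the three criteria discussed here are included.

```python
from typing import Iterable

POINTS = {"BS1": -4, "BS2": -4, "BP4_Moderate": -2}


def benign_points(criteria: Iterable[str]) -> int:
    """Sum the point contributions of the listed benign criteria."""
    return sum(POINTS[c] for c in criteria)


def b2_fires(criteria: Iterable[str]) -> bool:
    """B2 after the fix: count Strong benign criteria only (BS1, BS2)."""
    present = set(criteria)
    return ("BS1" in present) + ("BS2" in present) >= 2


def points_class(points: int) -> str:
    """Benign requires <= -7 points; -1 to -6 is Likely Benign."""
    if points <= -7:
        return "B"
    if points <= -1:
        return "LB"
    return "not benign"
```

BS1 plus BP4_Moderate sums to -6, which is Likely Benign, so B2 must not fire for that profile; BS1 plus BS2 sums to -8 and does reach Benign.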
v3.26.5 | April 2026
  • PM2 af_grpmax = 0.0 edge case fix (Bug Bounty BB-006). PM2 af_grpmax path: af_grpmax = 0.0 now treated as absent from controls, consistent with global_af path (v3.21.2, CR-050). gnomAD records AC=0 variants with af_grpmax=0.0 (NOT NULL). At pm2_threshold=0.0 (RASopathy "absent from controls"), 0.0 < 0.0 is false, blocking PM2.
  • Affects only VCEP genes with pm2_frequency_field=af_grpmax AND pm2_threshold=0.0 (currently RASopathy genes). Neutral/additive change: restores PM2 for variants incorrectly blocked. Cannot remove existing P/LP.
  • Reference: HELIX-CR-2026-050 (v3.21.2 global_af fix), HELIX-CR-2026-047 (v3.21.0 af_grpmax path), Richards 2015 PMID:25741868 (PM2 absent from controls).
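The edge case is a strict comparison against a zero threshold. A minimal sketch, assuming a hypothetical pm2_af_grpmax_satisfied helper that covers only the non-NULL af_grpmax path:

```python
def pm2_af_grpmax_satisfied(af_grpmax: float, pm2_threshold: float) -> bool:
    """PM2 check on the af_grpmax path, with 0.0 treated as absent."""
    if af_grpmax == 0.0:
        # AC=0 records carry af_grpmax=0.0; treat as absent from controls.
        return True
    return af_grpmax < pm2_threshold
```

Without the explicit zero branch, a threshold of 0.0 would evaluate 0.0 < 0.0, which is false, and PM2 would be blocked for variants that are in fact absent from controls.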
v3.26.4 | April 2026
  • De novo projection compound het + hemizygous (Bug Bounty BB-003). compound_het_candidate guard removed from de_novo_projection outer CASE. De novo and compound het are compatible: allele 1 de novo + allele 2 inherited from carrier parent = legitimate compound het. PS2 (origin) and PM3 (biallelic configuration) are orthogonal evidence types.
  • Hemizygous genotype support: rv.genotype IN (het, hom_alt) with chromosome guard (X/chrX/Y/chrY). Hemizygous males on X/Y correctly eligible for de novo projection. Autosomal hom_alt excluded.
  • Dead LP rules documentation: LP1/LP1b/LP4/LP5/LP6 annotated as unreachable in de novo context (ps_count+1 >= 1 tautological, P-rules pre-empt). Retained for ACMG completeness.
  • Additive change: new de_novo_candidate=true for compound het VUS and hemizygous X/Y VUS. No change to acmg_class (de novo projection is prospective only).
  • Reference: Richards 2015 PMID:25741868 (PS2, PM3 definitions), Abou Tayoun 2018 PMID:30192042.
v3.26.3 | April 2026
  • P8 combining rule fix (Bug Bounty BB-001). P8 combining rule: ps_count >= 1 AND pm_count >= 4 replaced with ps_count >= 1 AND pm_count >= 1 AND pp_count >= 4. Richards 2015 Table 5 rule (viii): >=1 Strong AND >=1 Moderate AND >=4 Supporting.
  • Previous implementation was dead code (subsumed by P6: ps>=1 AND pm>=3) and encoded a non-existent rule. De novo projection P8: same fix applied to de_novo_projection CTE. Docstring P8 description corrected.
  • Subtractive/neutral change: dead code replaced with correct rule. Profile 1S+1M+4PP rare in current 19-criteria implementation (PS1 triggers ClinVar override before ACMG scoring). No classification changes expected in simulation.
  • Reference: Richards 2015 PMID:25741868 (Table 5 rule viii), Tavtigian 2018 PMID:29300386 (Bayesian validation: 4+2+4=10 pts = P).
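The claim that the previous P8 was dead code can be checked by comparing it with P6 over a grid of counts. The functions below are a sketch using the rule expressions quoted above; the names are hypothetical.

```python
def p6(ps_count: int, pm_count: int) -> bool:
    """P6 as quoted above: >= 1 Strong and >= 3 Moderate."""
    return ps_count >= 1 and pm_count >= 3


def p8_old(ps_count: int, pm_count: int) -> bool:
    """Previous P8: >= 1 Strong and >= 4 Moderate (always implies P6)."""
    return ps_count >= 1 and pm_count >= 4


def p8_new(ps_count: int, pm_count: int, pp_count: int) -> bool:
    """Corrected P8: >= 1 Strong, >= 1 Moderate, >= 4 Supporting."""
    return ps_count >= 1 and pm_count >= 1 and pp_count >= 4


# Every profile that satisfied the old P8 already satisfied P6.
old_is_subsumed = all(
    p6(ps, pm) for ps in range(4) for pm in range(8) if p8_old(ps, pm)
)
```

The corrected rule covers the 1 Strong + 1 Moderate + 4 Supporting profile (4 + 2 + 4 = 10 points), which P6 does not reach.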
v3.26.0 | April 2026
  • De novo projection AF ceiling + post-projection conflict guard (HELIX-CR-2026-071). De novo AF ceiling: de_novo_projection CTE filters by AF. global_af > 0.001 OR af_grpmax > 0.001 blocks de novo projection. PS2 (de novo) presupposes variant not inherited; AF > 0.001 implies ~150+ alternate alleles in the gnomAD v4.1 genomes subset (~76K genomes, ~152K chromosomes). Config parameter de_novo_af_ceiling (default 0.001).
  • Post-projection conflict guard: bs_count > 0 now blocks de novo projection unconditionally. Previous guard checked pre-projection conflict (pvs/ps/pm > 0 AND bs > 0), missing cases where PS2 projection itself creates the conflict.
  • Subtractive change: removes false de novo candidates. Cannot create new P/LP or new de novo candidates.
  • Clinical trigger: PEX2 intron_variant HG2022_079 (AF=0.0039, false de_novo_candidate=true).
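Both guards can be read as a single eligibility predicate. This is a sketch only: de_novo_projection_allowed is a hypothetical name, and treating a NULL frequency as not exceeding the ceiling is an assumption that the text does not state.

```python
from typing import Optional


def de_novo_projection_allowed(global_af: Optional[float],
                               af_grpmax: Optional[float],
                               bs_count: int,
                               de_novo_af_ceiling: float = 0.001) -> bool:
    """Apply the AF ceiling and the unconditional Strong-benign block."""
    if bs_count > 0:
        # Post-projection conflict guard: any Strong benign evidence blocks.
        return False
    for af in (global_af, af_grpmax):
        # Assumption: a missing frequency does not trip the ceiling.
        if af is not None and af > de_novo_af_ceiling:
            return False
    return True
```

For the PEX2 trigger, AF = 0.0039 exceeds the 0.001 ceiling, so the variant is no longer flagged as a de novo candidate.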
v3.25.9 | April 2026
  • LP2 computational-only guard PP3_splice_Strong extension (HELIX-CR-2026-067). LP2 computational-only guard (CR-042/064) extended: has_pp3_strong replaced with (has_pp3_strong OR has_pp3_splice_strong). PP3_splice_Strong + PM2 without observed evidence now blocked, same as PP3_Strong + PM2.
  • Subtractive change: LP -> VUS. Cannot create new P/LP. De novo projection unchanged (PS2 is observed, LP2 legitimate).
  • Clinical trigger: IRF1 p.Arg123Gly LP [PP3_splice_Strong,PM2] (CR-065/066 simulation, computational-only profile).
v3.25.8 | April 2026
  • PP3 + PP3_splice mutual exclusion (HELIX-CR-2026-066). PP3 (BayesDel, all levels: Strong, Moderate, Supporting) blocked for dual-consequence variants where VEP consequence includes splice_region, splice_donor, or splice_acceptor in addition to missense. ClinGen SVI (Walker 2023 Section 3.2): PP3 is single criterion. SpliceAI (PP3_splice) is the more specific tool when VEP confirms splice proximity.
  • Symmetric with CR-065: CR-065 blocks PP3_splice for pure missense (no splice VEP). CR-066 blocks PP3 for missense WITH splice VEP. Together: exactly one PP3 path per variant, never both.
  • Subtractive change: PP3 removed from dual-consequence variants where PP3_splice also applies. Cannot create new P/LP. Clinical trigger: CR-065 simulation IRF1 p.Arg123Gly VUS to LP (PP3_Supporting + PP3_splice_Strong + PM2, double-counting).
v3.25.7 | April 2026
  • PP3_splice missense-tolerated guard (HELIX-CR-2026-065). PP3_splice (all levels: Strong, Moderate, Supporting) blocked for pure missense variants (no VEP splice_region/splice_donor/splice_acceptor consequence) when BayesDel noAF data is available. ClinGen SVI (Walker 2023 Section 3.2): when BayesDel indicates tolerated/indeterminate and SpliceAI predicts cryptic splice, evidence is conflicting, not additive.
  • Subsumes CR-043 (v3.20.8) double-counting guard. New guard covers both BayesDel >= 0.130 (CR-043 case) and BayesDel < 0.130 (CR-065 case) in single condition. BayesDel IS NULL bypass preserved (SpliceAI as sole in silico evidence). Missense + splice VEP consequence bypass preserved (VEP confirms splice proximity).
  • Subtractive change: PP3_splice removed from pure missense with BayesDel. Cannot create new P/LP.
  • Clinical trigger: PD2025_101 DICER1 c.5524A>G p.Ile1842Val LP [PP3_splice_Strong,PM2,PP2] vs ClinVar Expert Panel VUS (3 stars).
v3.25.6 | April 2026
  • LP2 computational-only guard PP2 bypass fix (HELIX-CR-2026-064). LP2 combining rule guard (CR-042) extended: pp_count = 0 condition replaced with (pp_count = 0 OR (pp_count > 0 AND NOT has_pp4 AND NOT has_pp5)). PP2 (gene-level constraint) no longer bypasses the PP3_Strong+PM2 computational-only LP guard. PP4 (HPO match, observed) and PP5 (ClinVar 1-star, observed) still override the guard legitimately.
  • Subtractive change: LP -> VUS. Cannot create new P/LP. De novo projection unchanged: PS2 + PP3_Strong = ps_count >= 2 -> P5.
  • Clinical trigger: PD2025_002 MEN1 p.Arg115His LP [PP3_Strong,PM2,PP2] vs ClinVar VUS (2 stars, 70 P/LP missense in gene).
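The difference between the old and new supporting-evidence condition shows up on the MEN1 profile. The sketch below assumes the guard has the overall shape PP3_Strong AND PM2 AND (supporting condition), and that pp_count counts Supporting criteria such as PP2 but not PP3_Strong itself; both are assumptions, and the function names are hypothetical.

```python
def lp2_guard_before(has_pp3_strong: bool, has_pm2: bool, pp_count: int) -> bool:
    """Original condition: guard blocks only when no Supporting criterion exists."""
    return has_pp3_strong and has_pm2 and pp_count == 0


def lp2_guard_after(has_pp3_strong: bool, has_pm2: bool, pp_count: int,
                    has_pp4: bool, has_pp5: bool) -> bool:
    """Revised condition: only observed PP4 or PP5 can lift the guard."""
    no_supporting = pp_count == 0
    only_unobserved = pp_count > 0 and not has_pp4 and not has_pp5
    return has_pp3_strong and has_pm2 and (no_supporting or only_unobserved)
```

For [PP3_Strong, PM2, PP2], pp_count is 1, so the original guard does not block and LP is reached; the revised guard blocks because neither PP4 nor PP5 is present.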
v3.25.5 | April 2026
  • Het LoF LP guard for AR/biallelic-only LoF genes (HELIX-CR-2026-063). All LP combining rules (LP1-LP6) blocked for het LoF (HIGH impact) variants in AR/biallelic-only LoF genes without compound het partner. Het LoF in AR-only gene = carrier state, not pathogenic event. Richards 2015 Table 3: PVS1 presupposes pathogenic context.
  • Guard logic: ar_biallelic_het_lof_guard boolean. True when genotype=het, impact=HIGH, compound_het_candidate=false/NULL, AND gene in gene_disease_mechanism (LoF, biallelic, definitive/strong/moderate) OR (Orphanet AR-only AND gene_disease_mechanism LoF). P rules NOT blocked: P1 with ClinVar PS1 is legitimate override.
  • De novo projection: blocked by guard. pLI~0 + LOEUF>1 = het LoF tolerated in gnomAD. De novo het LoF not informative for AD mechanism.
  • Interaction: orthogonal to CR-061 (missense), CR-062 (computational). Subtractive change: LP -> VUS. Cannot create new P/LP.
  • Clinical trigger: PD2025_084 AMACR c.857del LP [PVS1_Strong,PM2] vs ClinVar VUS (2 stars). AMACR: pLI=0.000001, LOEUF=1.108, AR-only Orphanet, biallelic LoF definitive G2P.
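The guard logic can be written as a variant-side test combined with a gene-side test. This sketch reads the description as (het AND HIGH AND no compound het partner) AND (biallelic LoF mechanism OR (Orphanet AR-only AND LoF mechanism)); that grouping is an interpretation, and the parameter names are hypothetical summaries of the underlying lookups.

```python
from typing import Optional


def ar_biallelic_het_lof_guard(genotype: str,
                               impact: str,
                               compound_het_candidate: Optional[bool],
                               mechanism_biallelic_lof: bool,
                               orphanet_ar_only: bool,
                               mechanism_lof: bool) -> bool:
    """True when LP rules should be blocked for a het LoF carrier variant."""
    # compound_het_candidate may be NULL (None); NULL counts as "no partner".
    variant_side = (
        genotype == "het" and impact == "HIGH" and not compound_het_candidate
    )
    gene_side = mechanism_biallelic_lof or (orphanet_ar_only and mechanism_lof)
    return variant_side and gene_side
```

For the AMACR trigger (het frameshift, no partner, biallelic LoF definitive, AR-only in Orphanet) the guard evaluates to true and LP rules are blocked.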
v3.25.4 | April 2026
  • Computational-only LP guard for LP4/LP5/LP6 (HELIX-CR-2026-062). LP4/LP5/LP6 combining rules blocked when ALL pathogenic criteria are computational/annotation-based: PM1 (domain), PM2 (frequency), PP2 (gene constraint), PP3 (BayesDel, any level), PP3_splice (SpliceAI, any level). At least one observed/clinical criterion required: PS1 (ClinVar observed), PM3 (compound het), PM4 (inframe indel), PP4 (phenotype match), or PP5 (ClinVar low confidence).
  • De novo projection: LP4/LP5/LP6 NOT gated by this guard. PS2 (de novo) is observed evidence - if confirmed, LP is legitimate even with computational moderates.
  • Subtractive change: cannot create new P/LP. LP -> VUS when entire evidence profile is computational. 0 impact on PVS1-based LP, PP3_splice LP, ClinVar override LP, compound het LP, or PM4 LP.
  • Clinical trigger: PD2025_106 JAG1 p.Gly309Arg LP [PM1,PM2,PP2,PP3_Supporting] vs ClinVar VUS (2 stars).
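Whether an evidence profile is computational-only can be decided from the criteria string alone. The following sketch assumes criteria are passed as a list of codes with optional strength suffixes; base_criterion and is_computational_only are hypothetical helpers, and the check covers the profile only, not which combining rule (LP4/LP5/LP6) fired.

```python
from typing import Iterable

COMPUTATIONAL = frozenset({"PM1", "PM2", "PP2", "PP3", "PP3_splice"})
_STRENGTH_SUFFIXES = ("_Supporting", "_Moderate", "_Strong")


def base_criterion(code: str) -> str:
    """Strip a strength suffix, e.g. PP3_Supporting -> PP3."""
    for suffix in _STRENGTH_SUFFIXES:
        if code.endswith(suffix):
            return code[: -len(suffix)]
    return code


def is_computational_only(criteria: Iterable[str]) -> bool:
    """True when every pathogenic criterion is computational/annotation-based."""
    # Benign codes (BS*, BP*) are ignored; only P-prefixed codes are inspected.
    pathogenic = [base_criterion(c) for c in criteria if c.startswith("P")]
    return bool(pathogenic) and all(c in COMPUTATIONAL for c in pathogenic)
```

The JAG1 profile [PM1, PM2, PP2, PP3_Supporting] is computational-only; adding an observed criterion such as PP4 makes the check false.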
v3.25.3 | April 2026
  • Monoallelic missense LP guard for AR/biallelic LoF genes (HELIX-CR-2026-061). PP2 blocked for het missense in AR/biallelic LoF genes without documented missense disease mechanism. Richards 2015 Table 3: PP2 requires "missense variants are a common mechanism of disease". AR LoF genes without ClinVar P/LP missense (clinvar_missense_genes) do not satisfy this premise.
  • LP4/LP5/LP6 combining rules blocked (ar_biallelic_missense_guard) for het missense variants in AR/biallelic LoF genes without compound het partner and without ClinVar P/LP missense evidence.
  • De novo projection: LP4/LP5/LP6 blocked by same guard. LP2 (PS2 + PM) intentionally NOT blocked - de novo is direct phenotypic observation, not computational prediction.
  • Subtractive change: cannot create new P/LP. ~44 LP -> VUS (het missense in AR/biallelic LoF genes). 0 impact on PVS1-based LP, PP3_splice LP, ClinVar override LP, or compound het candidates.
  • Clinical trigger: PD2025_127 ABCA2 p.Arg1064Cys LP [PM1,PM2,PP3_Moderate,PP2] vs ClinVar VUS (2 stars, multiple submitters, no conflicts).
v3.25.2 | April 2026
  • BP4_splice_benign for non-canonical splice (HELIX-CR-2026-060). Supporting benign for non-canonical splice region variants with SpliceAI <= 0.1. Covers splice_donor_region, splice_donor_5th_base, splice_acceptor_5th_base, splice_region, and splice_polypyrimidine_tract consequences. Excludes synonymous (BP7 handles) and missense (BP4 BayesDel handles).
  • SpliceAI threshold <= 0.1: consistent with BP7 synonymous guard. Jaganathan 2019 PMID:30661751, de la Hoya 2022 PMID:35202600.
  • PVS1 guard: NOT pvs1_cond (pre-split, covers both full and last-exon). Same guard as PP3_splice paths. No double-counting splice/LoF evidence. Criteria string: BP4_splice (Supporting benign).
  • Subtractive change: adds benign evidence only. Cannot create new P/LP. VUS with BS1/BS2 + BP4_splice = LB (LB1 rule). VUS without Strong benign remain VUS (1 BP insufficient for LB2).
  • Clinical trigger: PD2025_087 HNF4A c.426+4C>T (splice_donor_region, SpliceAI=0.0, VUS without criteria). Geneticist flag by D. Maneva.
v3.25.1 | April 2026
  • PM3 ClinVar partner validation guard (HELIX-CR-2026-058). compound_het_candidate = true alone no longer satisfies PM3. Partner must have ClinVar P/LP (clinical_significance LIKE Pathogenic% OR Likely_pathogenic%, NOT LIKE %Conflicting%) with review_stars >= ps1_min_stars (currently 2). Richards 2015 Table 3: detected in trans with a pathogenic variant - partner pathogenicity is explicit.
  • ClinVar-only guard (no acmg_class dependency): avoids circular reference where partner classification depends on PM3 which depends on partner classification. ClinVar annotation is external/immutable.
  • Subtractive change: PM3 can only be removed, not added. 241 LP -> VUS expected (PM3 was decisive). 5 LP unchanged (validated partner). De novo projection inherits PM3 change via pm3_cond.
  • Clinical trigger: HG2022_037 F5 p.Met1874Thr LP [PM2,PM3,PP3_Moderate,BP2] vs ClinVar VUS (2 stars). Partner: Factor V Leiden (p.Arg534Gln, AF=2.14%, ClinVar drug_response 3 stars). PM3 incorrectly fired without partner pathogenicity validation.
  • Impact: 188 sessions, 52,929 PM3 variants, 241 LP -> VUS, 0 new P/LP.
v3.25.0 | April 2026
  • BP_regional Phase 3 - gnomAD LoF population tolerance as Supporting benign (HELIX-CR-2026-057 Phase 3). Supporting benign (bp_count) for HIGH impact variants outside MANE CDS where gnomAD demonstrates regional LoF tolerance in healthy individuals. Builds on v3.24.0 informational signal.
  • AD genes excluded - het LoF tolerance in AD is a different clinical argument. Compound het candidates excluded - they may contribute to biallelic LoF.
  • Conditions: gls.gene_symbol IS NOT NULL, position outside MANE CDS, impact=HIGH, NOT compound_het_candidate, inheritance_mode IS NULL OR IN (AR, AD/AR, XL). Autosomal: max_nhomalt >= 2. X-linked: max_ac_hemi >= 2.
  • Criteria string marker: BP_regional[gnomAD_LoF].
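The listed conditions can be combined into one predicate. This is a sketch under assumptions: bp_regional_applies is a hypothetical name, in_gene_summary stands for gls.gene_symbol IS NOT NULL, the is_x_linked flag stands in for however the pipeline distinguishes the autosomal and X-linked paths, and a NULL carrier count is treated as zero.

```python
from typing import Optional

ALLOWED_INHERITANCE = frozenset({"AR", "AD/AR", "XL"})


def bp_regional_applies(in_gene_summary: bool,
                        outside_mane_cds: bool,
                        impact: str,
                        compound_het_candidate: Optional[bool],
                        inheritance_mode: Optional[str],
                        is_x_linked: bool,
                        max_nhomalt: Optional[int],
                        max_ac_hemi: Optional[int]) -> bool:
    """True when BP_regional[gnomAD_LoF] would be added as Supporting benign."""
    if not (in_gene_summary and outside_mane_cds and impact == "HIGH"):
        return False
    if compound_het_candidate:
        return False  # may contribute to biallelic LoF
    if inheritance_mode is not None and inheritance_mode not in ALLOWED_INHERITANCE:
        return False  # AD genes are excluded
    carriers = max_ac_hemi if is_x_linked else max_nhomalt
    return (carriers or 0) >= 2
```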
v3.24.0 | April 2026
  • gnomAD LoF population tolerance signal (HELIX-CR-2026-057 Phase 1). Boolean gnomad_lof_tolerant: gene has HC LoF variants with hom/hemi > 0 outside MANE CDS in gnomAD genomes. Phase 1 informational only, no ACMG criteria change.
  • Reference data: gnomad_lof_tolerant_gene_summary table loaded. Aggregates max_nhomalt, max_ac_hemi, lof_variant_count per gene with example_variant audit trail.
  • Activated by mt.gene_symbol IS NOT NULL (positional context mandatory) and position outside MANE CDS region.
  • Follow-up (CR-057 Phase 3, v3.25.0): the gnomad_lof_tolerant signal is converted into the BP_regional Supporting benign criterion.
v3.23.0 | April 2026
  • PVS1: Expression-aware guard using gnomAD v4.1 pext data (GTEx v10, GRCh38, 49 tissues). Per-exon max_tissue_pext modulates PVS1 application (HELIX-CR-2026-056, Cummings et al. 2020 PMID:32461655). Three tiers: max_tissue_pext >= 0.9 = PVS1 Very Strong (unchanged), 0.1-0.9 = PVS1_Strong[pext] (downgraded from Very Strong to Strong), < 0.1 = PVS1 blocked (exon not expressed). NULL pext = Very Strong (conservative fallback). PM4 receives binary pext guard (blocked at < 0.1).
  • PVS1: 189,856 exon records for 18,923 MANE Select genes loaded from gnomAD v4.1 base-level pext TSV. Distribution: 56.3% full (>= 0.9), 42.5% downgrade (0.1-0.9), 1.2% block (< 0.1). Uses max_tissue_pext (maximum across 49 GTEx v10 tissues) to avoid false negatives for tissue-specific genes.
  • PVS1: Human-readable annotations in acmg_criteria ("[PVS1 blocked: low exon expression (pext X.XX)]", "[PVS1 downgraded: partial exon expression (pext X.XX)]") and processing_notes (full sentence with gene name, threshold, source).
  • Reference databases: gnomAD pext (v4.1, GTEx v10) and MANE Select transcripts (v1.4) added as reference data sources.
  • Scientific basis: Cummings 2020 demonstrated that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants while removing less than 4% of true pathogenic variants.
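The three pext tiers and the NULL fallback map onto a short function. A minimal sketch, assuming a hypothetical pvs1_pext_tier helper and that the 0.1-0.9 band includes 0.1 and excludes 0.9, as implied by the >= 0.9 and < 0.1 boundaries in the text:

```python
from typing import Optional


def pvs1_pext_tier(max_tissue_pext: Optional[float]) -> str:
    """Return the PVS1 outcome for an exon's maximum tissue pext value."""
    if max_tissue_pext is None:
        return "PVS1"               # conservative fallback: Very Strong
    if max_tissue_pext >= 0.9:
        return "PVS1"               # Very Strong, unchanged
    if max_tissue_pext >= 0.1:
        return "PVS1_Strong[pext]"  # downgraded to Strong
    return "blocked"                # exon not expressed
```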
v3.22.0 | April 2026
  • PVS1: MANE Select positional guard (HELIX-CR-2026-055, Abou Tayoun 2018 PMID:30192042). Variants outside the MANE Select/Plus Clinical CDS region do not receive PVS1 or PM4. Addresses false PVS1 application on variants annotated by VEP against non-canonical transcripts that extend beyond the canonical coding sequence.
  • PVS1: 19,354 MANE transcripts (19,288 Select + 66 Plus Clinical) with CDS coordinates loaded from MANE GFF3 v1.4. LEFT JOIN with GROUP BY prevents row duplication for 66 genes with both Select and Plus Clinical transcripts.
  • PVS1: Inline NULL-safe guards on 6 sites (pvs_count, ps_count/pvs1_last_exon, pm_count/pm4, has_pvs1_full, has_pvs1_last_exon, has_pm4). NULL MANE data = PVS1 applied (conservative fallback for genes not yet in MANE).
  • Criteria string: [PVS1_BLOCKED:outside_MANE_CDS] marker appended for HIGH impact variants outside MANE CDS. Processing notes: PVS1_BLOCKED:outside_MANE_CDS(gene,mane_gene) audit trail.
  • Clinical trigger: HG2022_057 FANCB chrX:14797196 p.Gln864Ter classified LP (PVS1+PM2) against non-canonical transcript ENST00000696353. Position is 46 kb upstream of MANE Select CDS (14,843,567-14,865,510). Reclassified to VUS (PM2 only).
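The positional guard reduces to an interval test with a NULL-safe fallback. The sketch below uses a hypothetical outside_mane_cds helper and the FANCB coordinates from the clinical trigger; it treats the CDS as a single span and does not model individual exons.

```python
from typing import Optional


def outside_mane_cds(position: int,
                     cds_start: Optional[int],
                     cds_end: Optional[int]) -> bool:
    """True when the variant lies outside the MANE CDS span."""
    if cds_start is None or cds_end is None:
        # No MANE data: do not block PVS1 (conservative fallback).
        return False
    return position < cds_start or position > cds_end


# FANCB trigger: variant position versus MANE Select CDS span.
fancb_blocked = outside_mane_cds(14_797_196, 14_843_567, 14_865_510)
fancb_distance = 14_843_567 - 14_797_196  # 46,371 bp, i.e. about 46 kb
```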
v3.21.4 | March 2026
  • CLCN1 removed from AR_LOF_GENES (HELIX-CR-2026-053). Myotonia congenita has both AD (Thomsen) and AR (Becker) forms. Het LoF can cause AD Thomsen disease. CLCN1 receives AD/AR via dual HPO check.
v3.21.3 | March 2026
  • Annotation Phase 7.5: dual HPO inheritance detection (HELIX-CR-2026-051). Genes with both HP:0000006 (AD) and HP:0000007 (AR), confirmed by gene_disease_mechanism biallelic, now receive AD/AR instead of AD. Annotation improved for 312 genes.
  • CYP11A1 added to AR_LOF_GENES. Congenital adrenal insufficiency, AR. OMIM #613743.
v3.21.0 | March 2026
  • PM2 frequency field alignment for VCEP FAF thresholds (HELIX-CR-2026-047). VCEP genes with pm2_frequency_field=af_grpmax now compare PM2 against af_grpmax instead of global_af. K1 bottlenecked NULL guard: af_grpmax NULL + global_af NOT NULL = PM2 not satisfied.
  • PM2 criteria string audit trail: PM2[af_grpmax] and PM2_Supporting[af_grpmax] markers.
v3.20.9 | March 2026
  • BP1: reference-based missense mechanism guard (HELIX-CR-2026-044). BP1 blocked when gene is in refdb.clinvar_missense_genes (>= 2 ClinVar P/LP missense at 2+ stars). Extends session-local ClinVar guard with reference-level check.
v3.20.8 | March 2026
  • PP3_splice missense double-counting guard (HELIX-CR-2026-043). PP3_splice blocked when variant already receives PP3 from BayesDel missense path. ClinGen SVI: PP3 is a single criterion.
v3.20.6 | March 2026
  • BP1: GoF/DN gene guard (HELIX-CR-2026-039v2). BP1 blocked in GoF/DN genes where truncating variants do not cause disease via LoF. PLIN1 added to GOF_AD_GENES (DN mechanism, FPLD4).
v3.20.3 | March 2026
  • PVS1/PM3: Dual-mechanism gene guard (HELIX-CR-2026-037). gof_genes_exclusive view excludes 55 dual-mechanism genes from GoF guard, restoring PVS1/PM3 for AR LoF pathway.
v3.20.2 | March 2026
  • PM3: GoF/DN gene exclusion (HELIX-CR-2026-036). Compound het in pure GoF/DN genes no longer satisfies PM3. PM3 presupposes AR mechanism.
v3.18.2 | March 2026
  • Disease association gate: gene_disease_mechanism table added as 7th source in pvs1_disease_gate and gene_has_disease_association (HELIX-CR-2026-026). 256 genes in gene_disease_mechanism (G2P 225, GoFCards 35, manual 2) were not covered by existing 6 disease gate sources. Confidence filter: definitive/strong/moderate only (predicted excluded per CTO directive). Clinical trigger: SNRPN c.-391+1_-391+2insGA false VUS after v3.18.1 HI fix.
v3.18.1 | March 2026
  • Disease association gate: haploinsufficiency_score IS NOT NULL replaced with IN (2, 3, 30) in pvs1_disease_gate and gene_has_disease_association (HELIX-CR-2026-025). HI=0 (no evidence), HI=1 (little evidence), HI=40 (dosage sensitivity unlikely) no longer satisfy disease gate. 475 genes lost false disease association, 117 with no other disease source. Clinical trigger: SLC22A18 c.-133+2T>C false LP in gene without Mendelian disease.
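The change from IS NOT NULL to an explicit membership test can be compared side by side. A minimal sketch with hypothetical function names, covering only the haploinsufficiency branch of the disease gate:

```python
from typing import Optional

HI_SCORES_WITH_EVIDENCE = frozenset({2, 3, 30})


def hi_gate_before(haploinsufficiency_score: Optional[int]) -> bool:
    """Old behaviour: any recorded score satisfied the gate."""
    return haploinsufficiency_score is not None


def hi_gate_after(haploinsufficiency_score: Optional[int]) -> bool:
    """New behaviour: only scores 2, 3, or 30 satisfy the gate."""
    return haploinsufficiency_score in HI_SCORES_WITH_EVIDENCE
```

Scores 0, 1, and 40 passed the old check simply because they were recorded; under the new check they no longer count as disease association.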
v3.18.0 | March 2026
  • PVS1 GoF/DN guard: gof_genes_g2p view (198 genes) replaced by gof_genes_unified view (347 genes) as primary PVS1 GoF exclusion source (HELIX-REF-003 Phase 2A). gof_genes_unified draws from 3 sources: G2P (199 genes), GoFCards (179 genes), manual curation (27 genes). Dual-mechanism genes correctly excluded. GOF_AD_GENES Python constant retained as fallback.
  • Reference database: gene_disease_mechanism table created in reference_db (2,617 records). Unified schema for molecular mechanism data from G2P, GoFCards, and manual curation. Columns: gene_symbol, disease_id, disease_name, mechanism (LoF/GoF/DN/GoE), confidence, source, allelic_requirement, pmid, curation_date, curator, last_reviewed, notes.
  • Reference database: GoFCards Release 1.0 integrated (579 genes, 3,161 curated GoF variants). Gene-level aggregation threshold: 3+ variants, Pscore >= 2.0, at least 1 variant with animal or cell model evidence = confidence "strong". 217 genes qualified.
  • Reference database: ClinGen Dosage Sensitivity full TSV integrated (1,626 genes, 23 columns). Replaces legacy 3-column clingen table. Backward-compatible clingen table recreated from full data.
  • Reference: GoFCards PMID:39578693, ClinGen clinicalgenome.org, HELIX-REF-003.
v3.17.2 - March 2026
  • PVS1: 12 GoF/DN/GoE AD genes added to GOF_AD_GENES (HELIX-REF-003 Phase 2B). Systemic gap analysis: 28 AD genes depend ONLY on ClinGen AD Definitive path for PVS1, without mechanism check. 13 identified as HIGH risk (established non-LoF mechanism). TRAF7 deferred (insufficient germline functional evidence).
  • Genes added: CALM2 (DN calmodulinopathy), CALM3 (DN calmodulinopathy), DNAJB6 (DN myofibrillar myopathy), GJB6 (DN deafness, ClinGen AR=Refuted), GREM1 (regulatory GoE), HSPB8 (toxic GoF CMT2L), INF2 (DN formin FSGS5), NEFL (DN neurofilament CMT), PRKAG2 (GoF AMPK activation), SKI (DN Shprintzen-Goldberg), TNNC1 (GoF Ca2+ sensitivity HCM), TUBB3 (DN microtubule CFEOM3A).
  • Cross-referenced with LoGoFunc (variant-level GoF/LoF predictions), GoFCards (curated GoF variants), ClinGen Dosage (HI=0 signal), and G2P mechanism data.
v3.17.1 - March 2026
  • PVS1: ANKRD26 and C1QTNF5 added to GOF_AD_GENES (HELIX-CR-2026-024). ANKRD26: Thrombocytopenia 2, gain-of-expression mechanism - 5' UTR mutations disrupt RUNX1/FLI1 silencing. ClinGen Dosage explicitly states "gain-of-function rather than haploinsufficiency". C1QTNF5: Late-onset retinal degeneration, dominant-negative mechanism - mutant destabilizes wildtype in gC1q domain co-expression.
  • Clinical trigger: PD2025_095 ANKRD26 chr10:27044191 classified P [PVS1,PM2,PM3], should be VUS. C1QTNF5 chr11:119339479 received PVS1_Strong for a frameshift despite all known pathogenic variants being missense.
v3.17.0 - March 2026
  • PVS1: G2P molecular mechanism integration (HELIX-CR-2026-023). DECIPHER Gene2Phenotype (G2P) database now provides primary GoF/DN guard for PVS1. gof_genes_g2p view contains 198 pure GoF/DN monoallelic genes where PVS1 is blocked. G2P is queried at classification time via reference_db ATTACH.
  • PVS1: GOF_AD_GENES reduced from 36 to 14 fallback genes (genes not in G2P: 6 neurodegeneration/toxic aggregation, 3 somatic/neomorphic, 5 absent from G2P 2026-02-28). PRSS1 added to fallback (GoF trypsinogen autoactivation, clinical trigger PD2025_090).
  • PVS1: GNAS removed from GOF_AD_GENES (dual-mechanism: McCune-Albright = GoF, pseudohypoparathyroidism 1A = LoF). G2P view correctly excludes GNAS. GNAS LoF variants now receive PVS1.
  • PVS1: 49 dual-mechanism genes (SCN5A, LMNA, KCNH2, KCNQ1, FGFR1, etc.) excluded from GoF view, preserving PVS1 for legitimate LoF phenotypes.
  • PVS1: G2P confidence filter: only definitive, strong, moderate included. Limited confidence excluded to prevent blocking PVS1 on weak GoF evidence.
  • PVS1: Guard order: G2P view (primary) -> GOF_AD_GENES (fallback). Identical pattern to refdb.ar_lof_genes_clingen and refdb.disease_associated_genes_clingen.
  • DECIPHER G2P: Molecular mechanism data (g2p_gene_mechanism table, 2,372 records) added to reference_db alongside existing HPO enrichment data (v3.12.0).
  • Loader script: scripts/load_g2p_mechanism.py with dry-run, force flags, and 9-point validation (total records, mechanism distribution, view count, dual-mechanism exclusion, spot checks).
  • Subtractive change: can only remove PVS1 from GoF/DN genes. Cannot create false P/LP.
  • Clinical trigger: PD2025_090 PRSS1 p.Gly177Ter (stop_gained). GoF gene, PVS1 scientifically incorrect. BA1 prevented a false call in this case (AF 41%), but rare PRSS1 truncating variants without BA1 could incorrectly reach LP [PVS1,PM2].
  • Reference: Abou Tayoun 2018 PMID:30192042, Josephs 2023 PMID:37884986, HELIX-CR-2026-023.
v3.16.8 - March 2026
  • PVS1: GoF AD gene exclusion (HELIX-CR-2026-022). GOF_AD_GENES curated list (36 genes) blocks PVS1 for autosomal dominant genes where the disease mechanism is gain-of-function, dominant-negative, or toxic aggregation. ClinGen PVS1 Decision Tree Node 1: "Is LoF a known mechanism of disease?" - if NO, PVS1 not applicable.
  • PVS1: GOF_AD_GENES categories: RAS/MAPK pathway (12 genes), receptor tyrosine kinases (10), PI3K/AKT/mTOR (2), GTPases (3), metabolic enzymes (2), kinases (2), neurodegeneration (6). Each gene documented with PMID for GoF mechanism.
  • PVS1: ClinGen AD Definitive/Strong genes retain PVS1 access even if in GOF_AD_GENES - VCEP and ClinGen disease validity take precedence.
  • Clinical trigger: PD2025_082 RAC1 p.Val14SerfsTer4 (frameshift in GoF gene). SOD1 discordance in same session.
  • Reference: Abou Tayoun 2018 PMID:30192042, HELIX-CR-2026-022.
v3.16.7 - March 2026
  • LP classification: Disease association gate for LP rules LP4, LP5, LP6 (HELIX-CR-2026-021). LP rules that rely solely on moderate and supporting criteria now require established disease association via at least one of five sources (same as PVS1 and PP3_Strong gates).
  • LP4 (>=3 PM), LP5 (2 PM + >=2 PP), LP6 (1 PM + >=4 PP): these rules can be satisfied by PM1+PM2+PM4 or similar combinations without any gene-disease evidence. Gate prevents LP classification in genes without known Mendelian disease mechanism.
  • LP1, LP2, LP3 not gated: they require PVS or PS criteria, which already have their own disease association gates.
  • Subtractive change: can only downgrade LP to VUS, cannot upgrade.
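As a rough illustration of the gate on LP4, LP5, and LP6, the sketch below applies the rule definitions quoted above to raw PM/PP counts. It assumes no PVS, PS, or benign criteria are active and does not reproduce the production point-based engine; both function names are hypothetical.

```python
from typing import Optional


def moderate_supporting_lp_rule(n_pm: int, n_pp: int) -> Optional[str]:
    """Return the LP rule satisfied by PM/PP counts alone, if any."""
    if n_pm >= 3:
        return "LP4"
    if n_pm == 2 and n_pp >= 2:
        return "LP5"
    if n_pm == 1 and n_pp >= 4:
        return "LP6"
    return None


def gated_classification(n_pm: int, n_pp: int,
                         gene_has_disease_association: bool) -> str:
    """Apply the disease association gate to LP4-LP6 (subtractive only)."""
    rule = moderate_supporting_lp_rule(n_pm, n_pp)
    if rule is not None and gene_has_disease_association:
        return "LP"
    # Either no LP rule matched or the gate blocked it.
    return "VUS"
```

For example, PM1+PM2+PM4 (three PM, zero PP) yields LP only when the gene passes the disease association check; otherwise the variant stays VUS.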
v3.16.6 - March 2026
  • PVS1: ClinGen AD Definitive/Strong genes bypass pLI/LOEUF constraint gate. Genes with ClinGen Gene-Disease Validity rating of Definitive or Strong for autosomal dominant inheritance now qualify for PVS1 regardless of population constraint metrics.
  • PVS1: This addresses genes like TUBB1 (pLI near 0, LOEUF=1.586) with ClinGen AD Definitive that were excluded from PVS1 by the population constraint gate.
v3.16.0 - March 2026
  • ClinGen Gene-Disease Validity integration: ar_lof_genes_clingen and disease_associated_genes_clingen reference tables added to reference_db. Replaces in-memory Python constants for AR LoF gene list and disease association checks.
  • PVS1 AR gene bypass now queries refdb.ar_lof_genes_clingen at classification time instead of Python AR_LOF_GENES constant.
  • Disease association check now queries refdb.disease_associated_genes_clingen for PVS1, PP3_Strong, LP disease gates, and ClinVar override gate.
  • Reference database count: 14 (previously 13).
v3.15.0 - March 2026
  • PM1: Critical functional domains migrated from Python constant (CRITICAL_PFAM_DOMAINS) to reference database table (refdb.interpro_pfam_domains). HELIX-CR-2026-012.
  • PM1: Full InterPro Pfam catalog (27,481 entries) loaded into reference_db. 49 domains marked as critical (is_critical=true) across 14 categories.
  • PM1: Pre-build approach - critical accessions loaded from refdb at classification time, OR LIKE clauses generated in Python. Zero performance regression vs v3.14.0.
  • PM1: Adding a new critical domain is now a database UPDATE, not a code change + rebuild + deploy.
  • InterPro / Pfam added as 13th reference database.
  • Loader script: scripts/load_interpro_pfam.py with InterPro API fetch, JSON cache for offline fallback, and hard-fail validation.
  • Reference: HELIX-CR-2026-012.
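The pre-build approach (load critical accessions, then generate OR LIKE clauses in Python) might look roughly like the sketch below. The function name and the `domains` column name are assumptions for illustration; only the `Pfam:PFxxxxx` annotation format comes from the v3.14.0 entry.

```python
from typing import Iterable


def build_pm1_domain_clause(accessions: Iterable[str],
                            column: str = "domains") -> str:
    """Build an OR-joined LIKE clause matching any critical Pfam accession."""
    parts = []
    for acc in sorted(set(accessions)):
        # Accept only well-formed accessions (PF + 5 digits) so nothing
        # unexpected is interpolated into the SQL text.
        if not (len(acc) == 7 and acc.startswith("PF") and acc[2:].isdigit()):
            raise ValueError(f"not a Pfam accession: {acc!r}")
        parts.append(f"{column} LIKE '%Pfam:{acc}%'")
    if not parts:
        # No critical domains loaded: PM1 can never match.
        return "FALSE"
    return "(" + " OR ".join(parts) + ")"
```

Because the clause is built once per run from the reference table, marking another domain as critical changes the generated SQL without touching code.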
v3.14.0 - March 2026
  • PM1: CRITICAL_PFAM_DOMAINS converted from Pfam short names to InterPro-verified Pfam accession numbers (HELIX-CR-2026-011). VEP annotation records domain data as Pfam:PFxxxxx accession numbers, but PM1 SQL searched for short names (RyR, Pkinase, Ion_trans). Result: 0 legitimate PM1 applications since curated list introduction. Fix: all 49 domains now use verified accession numbers.
  • PM1: Removed Rho (covered by PF00071 Ras superfamily, VEP confirms RAC1 annotated with PF00071) and Peptidase_S1 (covered by PF00089 Trypsin, VEP confirms F10 annotated with PF00089).
  • PM1: Corrected accessions - Dynamin_N: PF01031 (Dynamin_M) corrected to PF00350 (Dynamin_N). PH: PF00168 (C2 domain) corrected to PF00169 (PH domain). FBN_EGF: mapped to PF00008 (EGF-like domain).
  • PM1: All 49 accessions verified against InterPro REST API (ebi.ac.uk/interpro) on 2026-03-16. Domain coverage validated against patient VCF data (RYR1 PF01365, KRAS PF00071, KCNH2 PF00520, CDH23 PF00028, MYO7A PF00063, COL1A1 PF01391, CFTR PF00005, EGFR PF07714).
  • PM1: Post-fix validation on HG2023_206: PM1 count increased from 2 (false positive) to 1,139 (legitimate). RYR1 c.6635T>C correctly receives PM1. CRYGD false positives eliminated. Zero B/LB classification changes.
  • Clinical trigger: HG2023_206 RYR1 c.6635T>C (p.Val2212Ala) in RyR domain (Pfam:PF01365) classified VUS [PM2,PP3_Supporting] without PM1. CG classified LP. PM1 now applied: VUS [PM1,PM2,PP3_Supporting].
  • Reference: HELIX-CR-2026-011, Richards et al. 2015 PMID:25741868, ClinGen SVI PM1 guidance.
v3.13.0 - March 2026
  • PVS1: ClinVar LoF evidence added as fifth constraint gate path (HELIX-CR-2026-010). Small autosomal dominant genes with ClinVar P/LP LoF variants (2+ review stars) but uninformative gnomAD constraint (pLI < 0.9, LOEUF >= 0.35, HI != 3) now qualify for PVS1 via CLINVAR_LOF_AD_GENES curated list.
  • gene_has_disease_association: CLINVAR_LOF_AD_GENES added as sixth source in the disease association check.
  • Clinical trigger: GCM2 p.Arg131Ter (stop_gained) in PD2025_098. GCM2 is a 5-exon AD gene for Familial hypoparathyroidism with ClinVar Pathogenic p.Tyr136Ter (2 stars) but pLI=0.0, LOEUF=1.205, HI=NULL. Now LP [PVS1,PM2].
  • Reference: Abou Tayoun 2018 PMID:30192042, HELIX-CR-2026-010.
v3.12.1 - March 2026
  • PP3/BP4: Missense consequence guard (HELIX-CR-2026-009). PP3 BayesDel path (Strong/Moderate/Supporting) and BP4 BayesDel path (Moderate/Supporting) now require consequence = missense_variant. BayesDel_noAF is calibrated exclusively for missense variants (Pejaver et al. 2022).
  • Clinical trigger: GCM2 p.Arg131Ter (stop_gained) in PD2025_098 received PP3_Strong (BayesDel=0.66) + PM2 = LP. Correct: PM2 only = VUS. 2 non-missense variants with false PP3_Strong eliminated.
  • Subtractive change: can only downgrade (LP to VUS), cannot upgrade. PP3_splice (SpliceAI) unaffected.
  • Reference: Pejaver 2022 PMID:36413997, Walker 2023 PMID:37352859, HELIX-CR-2026-009.
v3.12.0 - March 2026
  • HELIX-REF-001: Multi-Source HPO Enrichment Pipeline. HPO annotation expanded from single-source (HPO Consortium, 5,173 genes, 320K records) to multi-source enriched view (6 sources, 5,688 genes, 927K records, +10% gene coverage).
  • HPO sources: HPO Consortium (primary), Orphanet disease-to-HPO via Orphadata XML (3,176 genes with frequency data), DECIPHER G2P from 7 clinical panels (2,125 genes), Monarch Initiative (4,791 genes), ClinVar-MedGen chain mapping from P/LP variants (5,258 genes), and manual clinical curation for common-disease genes.
  • Enrichment depth: 3,292 existing genes received additional HPO terms. 515 net new genes added. 1,505 genes covered by all 5 automated sources.
  • Source priority ordering: Orphanet > G2P > HPO Consortium > Monarch > ClinVar-MedGen. Low-confidence entries filtered from clinical view.
  • PP4 criterion evaluates against enriched HPO set. No ACMG criteria logic changes.
  • Data integrity validated: 0 NULLs, 0 invalid HPO IDs, 0 duplicate PKs across 926,930 records.
v3.11.4 - March 2026
  • PVS1: Non-canonical splice site variants excluded from PVS1 application. VEP consequence terms splice_donor_5th_base_variant (position +5), splice_donor_region_variant (positions +3 to +6), and splice_acceptor_5th_base_variant (acceptor position -5) now block PVS1. ClinGen PVS1 Decision Tree (Abou Tayoun et al. 2018) specifies canonical +/-1,2 splice sites only. Extended splice region variants retain access to PP3_splice (SpliceAI >= 0.2) at Supporting evidence level.
  • Clinical trigger: PDE6B c.1401+4_1401+48del in PD2025_061 - VUS vs ClinVar Benign (2 stars). VEP annotates as splice_donor_5th_base_variant. PVS1 created false conflicting evidence (PVS1 vs BS1+BS2+BP6 = VUS).
  • Impact: 2 of 18 splice_donor PVS1 variants affected in PD2025_061. 0 P/LP variants affected. Subtractive change (cannot create false P/LP).
  • Log strings now use CLASSIFIER_VERSION constant instead of hardcoded version.
  • Reference: HELIX-CR-2026-008, Abou Tayoun 2018 PMID:30192042.
v3.11.3 - March 2026
  • PVS1: NMD_transcript_variant no longer blocks PVS1. VEP NMD_transcript_variant annotation means the transcript UNDERGOES nonsense-mediated decay, which is additional evidence FOR loss-of-function, not against it. The exclusion now targets NMD_escaping_variant (transcripts that escape NMD - truncated protein expressed, may retain partial function).
  • Previous behavior: frameshift_variant,NMD_transcript_variant -> PVS1 blocked. New behavior: frameshift_variant,NMD_transcript_variant -> PVS1 applies. frameshift_variant,NMD_escaping_variant -> PVS1 blocked (correct).
  • Clinical trigger: HG2022_057 review (KCNN2 frameshift).
  • Reference: VEP documentation (Ensembl), Abou Tayoun et al. 2018 PMID:30192042.
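The corrected NMD handling reduces to a check on the comma-separated consequence string. A minimal sketch, with a hypothetical function name:

```python
def pvs1_blocked_by_nmd(consequence: str) -> bool:
    """True when the consequence string includes NMD_escaping_variant.

    NMD_transcript_variant is deliberately not treated as blocking:
    it indicates the transcript undergoes NMD, which supports LoF.
    """
    terms = {term.strip() for term in consequence.split(",")}
    return "NMD_escaping_variant" in terms
```

With this check, `frameshift_variant,NMD_transcript_variant` is not blocked, while `frameshift_variant,NMD_escaping_variant` is.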
v3.11.0 - March 2026
  • De novo projection (PS2): Prospective computation of whether adding PS2 (confirmed de novo, Strong) would upgrade a VUS to LP or P. Identifies variants where trio confirmation has the highest clinical impact.
  • Excludes: non-VUS, non-heterozygous, BA1 active, compound het candidates (biologically exclusive with de novo), and conflicting evidence variants.
  • Minimum evidence guard (v3.11.2): PM2 alone is insufficient for de novo candidacy - requires at least one additional pathogenic criterion beyond PM2.
  • De novo projection is informational only - it does not change the current ACMG classification. It flags variants where parental testing would resolve VUS status.
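The projection logic can be sketched as below. PS2 is taken as 4 points (Strong, matching the point values quoted in the v3.9.0 entry); the Likely Pathogenic and Pathogenic thresholds of 6 and 10 points are assumed from Tavtigian et al. 2020. The `Variant` fields and the function name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

PS2_POINTS = 4         # Strong evidence
LP_MIN, P_MIN = 6, 10  # assumed thresholds (Tavtigian et al. 2020)


@dataclass
class Variant:
    classification: str
    genotype: str
    points: int
    pathogenic_criteria: Set[str] = field(default_factory=set)
    ba1_active: bool = False
    compound_het_candidate: bool = False
    conflicting_evidence: bool = False


def project_de_novo(v: Variant) -> Optional[str]:
    """Return the class reached if PS2 were added, or None if not a candidate."""
    if v.classification != "VUS" or v.genotype != "het":
        return None
    if v.ba1_active or v.compound_het_candidate or v.conflicting_evidence:
        return None
    # Minimum evidence guard: PM2 alone is insufficient.
    if not (v.pathogenic_criteria - {"PM2"}):
        return None
    projected = v.points + PS2_POINTS
    if projected >= P_MIN:
        return "P"
    if projected >= LP_MIN:
        return "LP"
    return None
```

The function only reports what the classification would become; it never modifies the variant, which mirrors the informational-only design described above.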
v3.10.1 - March 2026
  • disease_genes_clinvar CTE: Added review_stars >= 2 filter. The shared CTE that provides gene-level ClinVar disease association evidence (Source 1 of 5 in the disease association check) previously included all ClinVar P/LP genes without a star minimum. This created a circular reference: a 1-star ClinVar P/LP submission simultaneously triggered the ClinVar override AND provided the disease association evidence that allowed the v3.10.0 gate to pass. SHROOM2 (1-star Likely Pathogenic, zero disease association from Orphanet/ClinGen/AR_LOF_GENES/VCEP) self-validated its own gate.
  • The review_stars >= 2 threshold is consistent with PS1 (ps1_min_stars = 2): a single unreviewed ClinVar submission does not constitute gene-level disease association evidence. Two or more independent submitters with concordant interpretation (2+ stars) provide meaningful gene-disease validation.
  • Impact on PVS1 and PP3_Strong disease gates: minimal. Both gates reference the same CTE, but genes where PVS1 or PP3_Strong activate have redundant disease evidence from Orphanet, ClinGen haploinsufficiency, curated AR LoF gene list (~150 genes), or VCEP coverage. A gene relying solely on 1-star ClinVar for disease evidence is exactly the type of gene that should not receive PVS1 (Very Strong) or PP3_Strong.
  • Clinical trigger: SHROOM2 p.Gly211Ser remained LP after v3.10.0 deployment. Now correctly classified as VUS (ClinVar_gated).
v3.10.0 - March 2026
  • ClinVar override: P/LP override with review_stars = 1 (single submitter, no peer review) now requires the gene to have an established disease association via at least one of five sources: ClinVar P/LP (gene-level), Orphanet entry, ClinGen haploinsufficiency score, curated AR LoF gene list (~150 genes), or VCEP coverage. Without this gate, a single ClinVar submission in a gene without any known Mendelian disease mechanism could promote a variant to Likely Pathogenic.
  • ClinVar override: P/LP assertions with 2+ review stars (multiple concordant submitters) bypass the disease association gate. Benign and Likely Benign ClinVar overrides are not affected by this gate.
  • ClinVar override: Gated variants (1-star P/LP in genes without disease association) fall through to standard ACMG scoring, which typically produces VUS when no ACMG criteria are activated.
  • gene_has_disease_association: New boolean column in the criteria evaluation stage. Reuses the same 5-source disease check as the PVS1 gate (v3.8.0) and PP3_Strong gate (v3.8.1). Zero new data sources, zero performance overhead.
  • clinvar_disease_gated: New audit trail boolean. Criteria string shows "ClinVar_gated" prefix when a ClinVar P/LP override was blocked by the disease association gate, enabling geneticists to identify gated variants.
  • Clinical trigger: SHROOM2 p.Gly211Ser in PD2025_060 - classified as LP from a single ClinVar submission (1 star) in a gene with zero disease association across all five reference sources (Orphanet, ClinGen, HPO, VCEP, AR_LOF_GENES), zero constraint data (pLI/LOEUF/mis_z all NULL), and population frequency of 0.23%.
  • Reference: HELIX-CR-2026-007, Ghosh et al. 2018 PMID:30311383 (limitations of single-submitter ClinVar entries).
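The star-dependent gating described in this release can be summarized in a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, and treating 0-star assertions as never triggering an override is an assumption not stated in the entry.

```python
from typing import Tuple


def clinvar_plp_override(review_stars: int,
                         gene_has_disease_association: bool) -> Tuple[bool, bool]:
    """Return (override_applies, clinvar_disease_gated) for a P/LP assertion."""
    if review_stars >= 2:
        # Multiple concordant submitters bypass the gate.
        return True, False
    if review_stars == 1:
        if gene_has_disease_association:
            return True, False
        # Blocked: falls through to standard ACMG scoring, flagged for audit.
        return False, True
    # Assumption: 0-star assertions do not trigger an override.
    return False, False
```

A 1-star Likely Pathogenic submission in a gene with no disease association, as in the SHROOM2 trigger, returns `(False, True)`.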
v3.9.0 - March 2026
  • PVS1: Last-exon truncating variants downgraded from Very Strong (+8 points) to Strong (+4 points). ClinGen PVS1 Decision Tree (Abou Tayoun et al. 2018) Figure 1, Node 3: truncating variants in the last exon escape nonsense-mediated mRNA decay (NMD) because there is no downstream exon-exon junction complex.
  • PVS1: is_last_exon detection uses VEP exon_number column (format X/Y, last exon when X == Y). NULL or missing format defaults to Very Strong (conservative fallback). Single-exon genes (1/1) correctly receive PVS1_Strong.
  • PVS1: Criteria string shows PVS1 for Very Strong (non-last-exon) and PVS1_Strong for last-exon downgraded variants. Consistent with PP3_Strong, BP4_Moderate naming convention.
  • PVS1: PP3_splice double-counting guard unchanged - uses original PVS1 condition covering both Very Strong and Strong paths.
  • PVS1: AR_LOF_GENES last-exon downgrade applies regardless of inheritance mode. NMD escape is a biophysical mechanism independent of AD/AR inheritance.
  • PVS1: Classification impact - PVS1_Strong + PM2 = 6 points = Likely Pathogenic (most common minimal combination unaffected). PVS1_Strong + PS1 drops from Pathogenic to Likely Pathogenic.
  • Clinical trigger: PD2025_058 review (MYO6 p.Lys972GlufsTer5, exon 27/32 - NOT last exon, unaffected by change).
  • Reference: Abou Tayoun et al. 2018 PMID:30192042.
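The last-exon detection and strength selection described above can be sketched as follows. Function names are hypothetical; the X/Y format, the conservative fallback, and the 8/4 point values come from the entries above.

```python
import re
from typing import Optional, Tuple

_EXON_RE = re.compile(r"^(\d+)/(\d+)$")


def is_last_exon(exon_number: Optional[str]) -> bool:
    """True when the VEP exon field reads X/Y with X == Y."""
    if not exon_number:
        return False
    match = _EXON_RE.match(exon_number.strip())
    if match is None:
        return False
    return int(match.group(1)) == int(match.group(2))


def pvs1_strength(exon_number: Optional[str]) -> Tuple[str, int]:
    """Return (criteria label, points) for a truncating variant."""
    if is_last_exon(exon_number):
        return "PVS1_Strong", 4
    # NULL or malformed values fall back to Very Strong (conservative).
    return "PVS1", 8
```

Applied to the clinical trigger, exon `27/32` stays at Very Strong, whereas `32/32` or a single-exon `1/1` would be downgraded to PVS1_Strong.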
v3.8.2 - March 2026
  • BP1: ClinVar pathogenic missense guard - BP1 no longer applies when ClinVar has a Pathogenic missense variant in the same gene. If ClinVar confirms pathogenic missense, the BP1 premise is falsified. Uses session-local ClinVar data.
  • BP1: Clinical trigger - MCCC2 p.Val339Met in PD2025_099 received BP1 despite being a ClinVar Pathogenic missense variant. AR enzyme genes can have low mis_z yet still harbor pathogenic homozygous missense variants.
  • PP3_Supporting: Criteria string label changed from PP3 to PP3_Supporting for consistency with PP3_Strong and PP3_Moderate. ClinGen SVI (Walker et al. 2023) recommends explicit evidence strength labeling.
v3.8.1 - March 2026
  • PP3_Strong: Disease association gate - PP3_Strong (BayesDel >= 0.518, Strong evidence level) now requires the gene to have an established disease association via at least one of five sources: ClinVar P/LP (gene-level), Orphanet gene-disease entry, ClinGen haploinsufficiency score, curated AR LoF gene list, or VCEP coverage.
  • PP3_Strong: Without this gate, PP3_Strong + PM2 could reach Likely Pathogenic in genes without any known Mendelian disease mechanism. Clinical trigger: PD2025_056 identified POU2F2 p.Val324Met and RYR3 p.Arg285Gln as false LP classifications.
  • PP3_Strong: Scientific basis - Pejaver et al. 2022 calibrated BayesDel on ClinVar variants in disease genes; Walker et al. 2023 (ClinGen SVI) Section 3.2 restricts PP3/BP4 calibrated thresholds to genes with established disease mechanism.
  • PP3_Moderate and PP3_Supporting are NOT gated by disease association (cannot reach LP alone).
  • PP3_splice path unchanged (splice prediction independent of missense disease mechanism).
v3.8.0 - March 2026
  • PVS1: Disease association gate - PVS1 now requires the gene to have an established disease association via at least one of five sources: ClinVar P/LP (gene-level), Orphanet gene-disease entry, ClinGen haploinsufficiency score, curated AR LoF gene list (~150 genes), or VCEP coverage. Population constraint (pLI/LOEUF) alone is no longer sufficient.
  • PVS1: Genes with high pLI but no known Mendelian disease mechanism (e.g., TTC28, ABL1, HTT) are correctly excluded from PVS1. Gene audit across 5 validation cases: 250 genes blocked, 92 genes pass, ACMG SF v3.2 safety check passed (0 of 81 genes affected).
  • PVS1: Reference: Abou Tayoun et al. 2018 (ClinGen PVS1 Decision Tree) - "PVS1 is applicable when loss of function is a known mechanism of disease." HELIX-CR-2026-003.
  • BS1: Constraint-implied AD fallback - when no inheritance mode data exists from any source (Orphanet, ClinGen, HPO) but the gene is LoF-constrained (pLI > 0.9 or LOEUF < 0.35), BS1 uses the AD threshold (0.1%) instead of the AR default (5%). Resolves the architectural inconsistency where PVS1 treated a gene as AD-haploinsufficient but BS1 treated it as AR.
  • BS1: Explicit Orphanet AR and curated AR_LOF_GENES added as cascade priorities 5-6, ensuring explicit AR evidence overrides constraint-implied AD inference.
  • BS1: Cascade expanded from 5 to 8 levels: VCEP > ClinGen HI > Orphanet AD > Orphanet XLD > HPO AD > Orphanet AR > AR_LOF_GENES > Constraint-implied AD > Default AR.
  • BS1: Criteria string now includes BS1[Orphanet-AR] and BS1[constraint-AD] source annotations.
  • Reference databases: Orphanet/Orphadata and VCEP gene-specific specifications now documented as separate reference databases (9 total, previously 7).
  • Clinical validation: TTC28 p.Arg21Ter (PD2025_100) changes from LP to VUS. Discordance analysis: frequency_conflict CRITICAL findings reduced from 1 to 0.
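The BS1 cascade order listed in this release can be sketched as a first-match chain. This illustration uses the AD (0.1%) and AR default (5%) thresholds quoted in this entry; the `GeneEvidence` fields, the function name, and the returned source strings are hypothetical and do not reproduce the production criteria-string annotations.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

AD_THRESHOLD = 0.001  # 0.1%
AR_THRESHOLD = 0.05   # 5%


@dataclass
class GeneEvidence:
    vcep_bs1_threshold: Optional[float] = None
    clingen_hi_score: Optional[int] = None
    orphanet_modes: frozenset = frozenset()  # e.g. {"AD", "XLD", "AR"}
    hpo_ad: bool = False
    in_ar_lof_genes: bool = False
    pli: Optional[float] = None
    loeuf: Optional[float] = None


def bs1_threshold(g: GeneEvidence) -> Tuple[str, float]:
    """Walk the cascade and return (winning source, frequency threshold)."""
    if g.vcep_bs1_threshold is not None:
        return "VCEP", g.vcep_bs1_threshold
    if g.clingen_hi_score == 3:
        return "ClinGen HI", AD_THRESHOLD
    if "AD" in g.orphanet_modes:
        return "Orphanet AD", AD_THRESHOLD
    if "XLD" in g.orphanet_modes:
        return "Orphanet XLD", AD_THRESHOLD
    if g.hpo_ad:
        return "HPO AD", AD_THRESHOLD
    # Explicit AR evidence outranks constraint-implied AD.
    if "AR" in g.orphanet_modes:
        return "Orphanet AR", AR_THRESHOLD
    if g.in_ar_lof_genes:
        return "AR_LOF_GENES", AR_THRESHOLD
    constrained = (g.pli is not None and g.pli > 0.9) or \
                  (g.loeuf is not None and g.loeuf < 0.35)
    if constrained:
        return "Constraint-implied AD", AD_THRESHOLD
    return "Default AR", AR_THRESHOLD
```

Placing the explicit AR checks ahead of the constraint test is what lets a LoF-constrained gene with an Orphanet AR annotation keep the AR threshold.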
v3.7.0 - March 2026
  • BS1: Orphanet inheritance cascade for inheritance-aware frequency thresholds. 5-level priority: VCEP gene-specific -> ClinGen HI score=3 -> Orphanet AD -> Orphanet XLD -> HPO AD fallback -> AR default.
  • BS1: Orphanet AD coverage: 1,694 autosomal dominant genes (4x improvement over ~400 from ClinGen HI alone).
  • BS1: XLD/XLR distinction: X-linked dominant genes (90) receive AD threshold (0.1%); X-linked recessive genes (663) receive AR default (5%). XL-unspecified mapped to XLR as conservative default.
  • BS1: HPO AD fallback (HP:0000006) as Priority 4 catches AD genes not in Orphanet (e.g. KCNQ1, KCNH2 - also covered by ClinGen HI).
  • BS1: Criteria string now includes inheritance source annotation: BS1[ClinGen-HI], BS1[Orphanet-AD], BS1[Orphanet-XLD], BS1[HPO-AD].
  • Reference DB: orphanet_gene_inheritance table added (3,197 genes, LEFT JOIN at classification time, <2ms overhead).
  • Data source: Orphadata CC-BY-4.0 (INSERM, France). Covers ~6,100 rare diseases with gene-disease-inheritance annotations.
  • Validation: 9/10 known AD genes confirmed (KCNJ11, GCK, HNF1A, FGFR3, SCN1A, etc.), 4/4 AR-only genes unchanged, 5/5 dual-inheritance genes correct.
v3.6.8 - March 2026
  • BP1: Added mis_z < 2.0 guard to prevent BP1 in missense-constrained genes.
  • BP1: Genes with pLI < 0.1 AND mis_z >= 2.0 (e.g. KCNJ11, ABCC8, KCNQ1, SCN1A) have missense as primary disease mechanism - BP1 ("primarily truncating variants cause disease") is factually incorrect for these genes.
  • Clinical validation: KCNJ11 p.Leu270Val correctly loses BP1 after this change.
v3.6.7 - March 2026
  • PP3: Missense relevance guard added. PP3 (BayesDel) now requires evidence that missense variants cause disease in the gene before applying computational prediction.
  • PP3 guard has four gates: (1) mis_z > 2.0 for direct missense constraint, (2) mis_z > 1.5 AND pLI < 0.5 for AR genes like BRCA2/MLH1/ATM, (3) mis_z > 0.5 AND pLI > 0.9 for strong AD LoF-intolerant genes, (4) NULL fallback for genes without mis_z but with pLI > 0.5.
  • PP3_splice path unchanged (splice prediction independent of missense mechanism).
  • BP4 unchanged (benign prediction consistent with low missense constraint).
  • Scientific basis: Pejaver et al. 2022 calibrated BayesDel on missense-mechanism genes; applying to LoF-mechanism genes is methodologically incorrect.
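The four gates of the missense relevance guard can be written as a single predicate. A minimal sketch, assuming NULL values are represented as `None` and that a NULL pLI fails any gate that tests pLI; the function name is hypothetical.

```python
from typing import Optional


def pp3_missense_relevant(mis_z: Optional[float],
                          pli: Optional[float]) -> bool:
    """Return True when any of the four PP3 relevance gates passes."""
    if mis_z is None:
        # Gate 4: NULL fallback for genes without mis_z.
        return pli is not None and pli > 0.5
    if mis_z > 2.0:
        return True  # Gate 1: direct missense constraint
    if mis_z > 1.5 and pli is not None and pli < 0.5:
        return True  # Gate 2
    if mis_z > 0.5 and pli is not None and pli > 0.9:
        return True  # Gate 3: strong LoF-intolerant genes
    return False
```

A gene with mis_z = 0.86 and pLI = 0.846 fails all four gates, so the BayesDel PP3 path would not be evaluated for it under this sketch.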
v3.6.6 - March 2026
  • PM1: Curated CRITICAL_PFAM_DOMAINS list (~60 domains) replaces generic Pfam presence check.
  • PM1: Only well-characterized functional domains with documented pathogenic variant clustering trigger PM1.
  • PM1: Excluded domains: generic structural (Caveolin, coiled-coil), DUF, transmembrane helices, repetitive regions.
  • PM1: Included domains: kinase catalytic (PF00069, PF07714), DNA-binding (PF00870, PF00505, PF00096, PF00533), ion channel pores (PF00520, PF00029, PF01365), GTPase (PF00071), enzyme active sites (PF00089), and ~40 additional VCEP-documented domains. Domain matching uses Pfam accession numbers since v3.14.0.
  • Scientific basis: ACMG 2015 requires PM1 for "critical and well-established functional domain without benign variation" - not any Pfam annotation.
v3.6.5 - March 2026
  • PP2: Now requires mis_z > 2.0 in addition to pLI > 0.5.
  • PP2: ACMG 2015 requires "missense variants are a common mechanism of disease" - pLI alone measures LoF constraint, not missense constraint.
  • PP2: mis_z > 2.0 ensures gene is missense-constrained (Samocha et al. 2014, Karczewski et al. 2020).
  • Genes like CAV1 (pLI=0.846, mis_z=0.86) with LoF disease mechanism no longer incorrectly trigger PP2.
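The combined PP2 gene-level requirement is a two-condition check. In this sketch the function name is hypothetical, and treating a missing pLI or mis_z as failing the check is an assumption.

```python
from typing import Optional


def pp2_gene_eligible(pli: Optional[float], mis_z: Optional[float]) -> bool:
    """PP2 requires both LoF constraint (pLI) and missense constraint (mis_z)."""
    if pli is None or mis_z is None:
        return False  # assumption: missing constraint data fails the check
    return pli > 0.5 and mis_z > 2.0
```

Using the CAV1 values quoted above, `pp2_gene_eligible(0.846, 0.86)` is `False` because the missense constraint condition is not met.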
v3.6.4 - March 2026
  • PVS1: Comprehensive AR LoF gene list expansion from ~35 to ~150 genes
  • AR LoF gene list covers all major AR disease categories: neurodegeneration, metabolic (amino acid, fatty acid oxidation, glycogen storage, lysosomal, peroxisomal), ciliopathies, sensory (hearing, vision), immune/hematologic, cardiac, neuromuscular, connective tissue, kidney, endocrine, liver, and respiratory
  • Every gene in the curated list has ClinGen Definitive/Strong validity or equivalent published evidence for biallelic LoF disease mechanism
  • VCEP AR genes (CDH23, GJB2, MYO7A, PAH, SLC26A4, USH2A) handled via VCEP pvs1_applicable gate, not duplicated in curated list
v3.6.2 - March 2026
  • Homozygous reference genotype guard: variants with genotype hom_ref (0/0) excluded from ACMG classification
  • hom_ref variants arise from multi-allelic VCF sites where the sample has 0 ALT reads for a specific allele but was included due to another ALT allele at the same position
  • Without this guard, hom_ref frameshift/stop variants in LoF-intolerant genes received PVS1+PM2 = Likely Pathogenic despite not being present in the patient
  • 20 false Pathogenic/Likely Pathogenic classifications eliminated in validation (Case 19)
v3.6.1 - March 2026
  • PM2: NULL gnomAD allele frequency now correctly satisfies PM2 (absent from controls = never observed in ~800K individuals)
  • PM2: Previous behavior required frequency data to be present (non-NULL), incorrectly excluding truly absent variants
  • PVS1: Curated AR LoF gene list (~35 genes) replaces broad HPO-based bypass (v3.6 used HP:0000007 with 3,127 AR genes - too broad, caused 43 false Likely Pathogenic classifications)
  • PVS1: Curated list includes only genes with Definitive/Strong evidence for biallelic LoF disease mechanism (ClinGen Gene-Disease Validity, literature)
v3.6 - March 2026
  • PVS1: Autosomal recessive genes bypass pLI/LOEUF constraint gate using HPO inheritance annotation (HP:0000007)
  • PVS1: Aligned with ClinGen PVS1 Decision Tree (Abou Tayoun et al. 2018) which does not require population constraint for genes with established LoF disease mechanism in recessive inheritance
  • HPO database now used for PVS1 AR gene identification in addition to PP4 phenotype matching
  • Note: HPO-based bypass replaced by curated gene list in v3.6.1 due to excessive breadth
v3.5 - February 2026
  • ClinGen VCEP gene-specific specification overlay (optional, enabled by default)
  • BA1, BS1, PM2: gene-specific frequency thresholds from published VCEP specifications (~50-60 genes)
  • PVS1: gene-specific applicability gate (disabled for gain-of-function genes)
  • VCEP audit trail: criteria string includes [VCEP:Panel vX.Y] marker when gene-specific thresholds applied
  • VCEP toggle: can be enabled/disabled per case in case settings
  • Source: ClinGen Criteria Specification Registry (CSpec)
v3.4 - February 2026
  • Bayesian point-based classification framework (Tavtigian et al. 2018, 2020) replaces sequential 18-rule evaluation
  • All 18 original ACMG combining rules produce identical results under the point system (backward compatible)
  • Point system fills gaps: evidence combinations not covered by original 18 rules now classified properly
  • PP3/BP4: BayesDel_noAF single-tool replaces weighted 6-predictor consensus (ClinGen SVI calibration, Pejaver et al. 2022)
  • PP3 evidence strength modulation: Strong (>= 0.518), Moderate (0.290-0.517), Supporting (0.130-0.289)
  • BP4 evidence strength modulation: Moderate (<= -0.361), Supporting (-0.360 to -0.181)
  • PM1 + PP3 double-counting guard: combined evidence capped at Strong equivalent (4 points)
  • Continuous confidence scores derived from point distance to classification threshold boundary
  • Conflicting evidence handled by point summation (more nuanced than binary VUS default)
  • High-confidence conflict safety check: Strong/Very Strong pathogenic vs. Strong benign flagged for manual review
  • Legacy predictors (SIFT, AlphaMissense, MetaSVM, DANN, PhyloP, GERP) retained as display data only
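The BayesDel strength bands and the PM1 + PP3 cap listed above can be sketched as follows. The band edges are those quoted in this entry, read as half-open intervals (an assumption for scores with more than three decimals). Point values of 1, 2, and 4 for Supporting, Moderate, and Strong follow Tavtigian et al. 2020. Function names are hypothetical, and guards added in later releases are not modeled.

```python
from typing import Optional

PP3_POINTS = {"PP3_Supporting": 1, "PP3_Moderate": 2, "PP3_Strong": 4}
PM1_POINTS = 2        # Moderate
PM1_PP3_CAP = 4       # Strong equivalent


def bayesdel_evidence(score: float) -> Optional[str]:
    """Map a BayesDel_noAF score to a PP3/BP4 label, or None if indeterminate."""
    if score >= 0.518:
        return "PP3_Strong"
    if score >= 0.290:
        return "PP3_Moderate"
    if score >= 0.130:
        return "PP3_Supporting"
    if score <= -0.361:
        return "BP4_Moderate"
    if score <= -0.181:
        return "BP4_Supporting"
    return None


def pm1_pp3_points(pm1: bool, pp3_label: Optional[str]) -> int:
    """Sum PM1 and PP3 points, capped at the Strong equivalent."""
    total = (PM1_POINTS if pm1 else 0) + PP3_POINTS.get(pp3_label, 0)
    return min(total, PM1_PP3_CAP)
```

With the cap, PM1 plus PP3_Strong contributes 4 points rather than 6, while PM1 plus PP3_Supporting still contributes the full 3.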
v3.3 - February 2026
  • SpliceAI PP3 threshold aligned to ClinGen SVI 2023 recommendation (lowered from 0.5 to 0.2)
  • PP3_splice excluded when PVS1 applies (ClinGen SVI double-counting guard)
  • BP7 upgraded: synonymous + not splice_region + SpliceAI <= 0.1 (Walker et al. Figure 4)
  • BP7 conservation filter intentionally omitted per Walker et al. Table S13
  • Evidence strength modulation (PP3_Moderate for high SpliceAI) deferred pending VCEP specifications
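The SpliceAI-based rules in this release can be sketched as two predicates. Function names are hypothetical, and treating a missing SpliceAI score as failing both checks is an assumption.

```python
from typing import Optional


def bp7_applies(consequence: str, spliceai_max: Optional[float]) -> bool:
    """BP7: synonymous, not in a splice region, and SpliceAI <= 0.1."""
    terms = {term.strip() for term in consequence.split(",")}
    if "synonymous_variant" not in terms:
        return False
    if "splice_region_variant" in terms:
        return False
    return spliceai_max is not None and spliceai_max <= 0.1


def pp3_splice_applies(spliceai_max: Optional[float],
                       pvs1_applies: bool) -> bool:
    """PP3_splice: SpliceAI >= 0.2, suppressed when PVS1 already applies."""
    if pvs1_applies:
        return False  # double-counting guard
    return spliceai_max is not None and spliceai_max >= 0.2
```

Scores strictly between 0.1 and 0.2 satisfy neither predicate, so they contribute no splice evidence in either direction.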
v3.2 - February 2026
  • SpliceAI integration: PP3_splice path for splice evidence
  • BP4 SpliceAI guard: requires max_score < 0.1 for benign consensus
  • Criteria string distinguishes PP3 (missense) from PP3_splice (splice)
v3.1 - January 2026
  • Maximum sensitivity approach: removed all frequency and impact pre-filtering
  • All quality-passing variants proceed through classification
  • Clinicians decide clinical relevance using classification + annotations
v3.0 - January 2026
  • Conflicting evidence priority: pathogenic + benign evidence produces VUS
  • BA1 stand-alone override: allele frequency > 5% always classified Benign
  • ClinVar override restricted to non-conflicting evidence only
  • SQL-based classification engine (100x performance improvement)
  • PM2: required non-NULL frequency data (corrected in v3.6.1)

References

  1. Tavtigian SV, Greenblatt MS, Harrison SM, et al. Modeling the ACMG/AMP variant classification guidelines as a Bayesian classification framework. Human Mutation. 2018;39(11):1485-1492. PMID: 30311386
  2. Tavtigian SV, Harrison SM, Boucher KM, Biesecker LG. Fitting a naturally scaled point system to the ACMG/AMP variant classification guidelines. Human Genetics. 2020;139(8):1057-1067. PMID: 32666219
  3. Pejaver V, Byrne AB, Feng BJ, et al. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. American Journal of Human Genetics. 2022;109(12):2163-2177. PMID: 36413997
  4. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine. 2015;17(5):405-424. PMID: 25741868
  5. Walker LC, Hoya M, Wiggins GAR, et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup. American Journal of Human Genetics. 2023;110(7):1046-1067. PMID: 37352859
  6. Jaganathan K, Kyriazopoulou Panagiotopoulou S, McRae JF, et al. Predicting Splicing from Primary Sequence with Deep Learning. Cell. 2019;176(3):535-548.e24. PMID: 30661751
  7. Karczewski KJ, Francioli LC, Tiao G, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581(7809):434-443. PMID: 32461654
  8. Cheng J, Novati G, Pan J, et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science. 2023;381(6664):eadg7492. PMID: 37733863
  9. McLaren W, Gil L, Hunt SE, et al. The Ensembl Variant Effect Predictor. Genome Biology. 2016;17(1):122. PMID: 27268795
  10. Landrum MJ, Lee JM, Benson M, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Research. 2018;46(D1):D1062-D1067. PMID: 29165669
  11. Abou Tayoun AN, Pesaran T, DiStefano MT, et al. Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criterion. Human Mutation. 2018;39(11):1517-1524. PMID: 30192042
  12. Josephs KS, Roberts AM, Grozeva D, et al. Beyond gene-disease validity: capturing structured data on inheritance, allelic requirement, disease-relevant variant classes, and mechanism for inherited cardiac conditions. Genome Medicine. 2023;15:86. PMID: 37884986
  13. Cummings BB, Karczewski KJ, Kosmicki JA, et al. Transcript expression-aware annotation improves rare variant interpretation. Nature. 2020;581(7809):452-458. PMID: 32461655

Questions About Our Methodology?

We welcome technical questions from clinical geneticists and laboratory directors. Transparency is foundational to clinical trust.
