Our Method: Cross-Species Conserved Variants as mRNA Target Signals

The question that drives our target identification work is straightforward: across the many proteins involved in cellular aging, which ones, if expressed at higher levels, are least likely to cause harm and most likely to extend healthspan? This question is hard to answer experimentally at scale, but it has a computational approximation that we think is underexploited: cross-species conservation analysis in the context of documented longevity differences.

The hypothesis is this — if a gene variant is conserved across multiple species that show unusually long lifespans relative to their body mass, and if that conservation cannot be explained by structural necessity alone, then the variant is a candidate longevity-associated feature. When we find variants in the coding or regulatory regions of genes that appear repeatedly across long-lived species but not in closely related shorter-lived species, those variants are signals worth investigating.

This post describes the computational pipeline we have built to operationalize that hypothesis. It is not a brief overview — we want to show the actual logic at each step, including where the method is strong, where it has real limitations, and what assumptions we are making that could be wrong.

Step 1: Multi-Species Genome Alignment Across 40+ Species With Documented Longevity Data

The foundation of the pipeline is a multi-species genome alignment. We currently work with a curated set of 42 vertebrate species for which we have (a) a well-assembled reference genome, (b) published maximum lifespan data from the AnAge database, and (c) at least moderate-quality transcriptomic data from one or more tissue types. The species span a longevity range from roughly 3 years (small shrews) to over 200 years (Greenland shark, bowhead whale), with a concentration in the 10-80 year range that includes most mammalian species.

For each protein-coding gene in the human reference, we identify orthologous sequences in the 42 species using a combination of BLAST-based reciprocal best-hit identification and synteny-based ortholog validation. Synteny validation is important because sequence similarity alone can misassign paralogs as orthologs in rapidly evolving gene families. For genes with extensive paralogy — PI3K family members, FOXO transcription factors, sirtuins — we apply stricter synteny criteria and validate orthologs against published phylogenetic trees where available.
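To make the reciprocal best-hit idea concrete, here is a minimal sketch. It assumes BLAST output has already been parsed into (query, subject, bit score) tuples; the gene identifiers and scores are made up for illustration, and the synteny validation described above is not shown.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, bit_score), hypothetical parsed BLAST rows


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit], reverse: Iterable[Hit]) -> Dict[str, str]:
    """Keep only pairs where each sequence is the other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


# Toy data: human genes searched against one other species, and the reverse search.
human_vs_other = [
    ("FOXO3", "other_g1", 950.0),
    ("FOXO3", "other_g2", 610.0),
    ("FOXO1", "other_g2", 700.0),
    ("SIRT1", "other_g3", 400.0),
]
other_vs_human = [
    ("other_g1", "FOXO3", 940.0),
    ("other_g2", "FOXO1", 705.0),
    ("other_g2", "FOXO3", 600.0),
    ("other_g3", "SIRT2", 420.0),  # best reverse hit is a paralog, so SIRT1 is dropped
    ("other_g3", "SIRT1", 390.0),
]

rbh = reciprocal_best_hits(human_vs_other, other_vs_human)
print(rbh)
```

In this toy case the SIRT1 pairing fails the reciprocity check because the reverse search prefers a paralog, which is exactly the situation where a second line of evidence such as synteny is useful.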

The alignment step uses MAFFT for protein sequence alignment and MUSCLE for nucleotide-level codon alignment where needed. Poorly aligned regions (typically due to insertions in individual species, low assembly quality, or rapidly evolving segments) are trimmed with trimAl before downstream analysis. The output for each gene is a multiple sequence alignment of up to 42 sequences, one per species with a validated ortholog, with confidence-weighted alignment positions.
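As a rough illustration of what column filtering does, the sketch below drops alignment columns whose gap fraction exceeds a threshold. This is a simplified stand-in written for this post, not the trimming tool's algorithm; the function name, the 0.2 threshold, and the toy sequences are all assumptions.

```python
from typing import Dict, List, Tuple


def drop_gappy_columns(
    alignment: Dict[str, str], max_gap_fraction: float = 0.2
) -> Tuple[Dict[str, str], List[int]]:
    """Remove columns where more than max_gap_fraction of sequences have a gap."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i in range(length)
        if sum(s[i] == "-" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    trimmed = {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
    return trimmed, keep


# Toy protein alignment: column 2 is an insertion present in only one sequence.
toy_alignment = {
    "human": "MKALV",
    "sp_b": "MK-LV",
    "sp_c": "MK-LV",
    "sp_d": "MR-LV",
}
trimmed, kept_columns = drop_gappy_columns(toy_alignment)
print(trimmed, kept_columns)
```

Returning the kept column indices alongside the trimmed sequences makes it possible to map any downstream per-position score back to coordinates in the untrimmed alignment.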

We are not claiming 42 species is the right number or that these are the optimal species. Our selection was constrained by genome availability and longevity data quality. There are known biases: long-lived marine mammals are somewhat overrepresented because their unusual longevity generated significant research interest, and small short-lived mammals are slightly underrepresented because their genomes are less frequently assembled at high quality. We have a species selection correction factor in the pipeline to partially account for this, but it is imperfect.

Step 2: Conservation Scoring With Phylogenetic Distance Weighting

Raw conservation, meaning a count of how many species share a given amino acid at a given position, is a poor measure of biologically meaningful conservation because it does not account for phylogenetic relationships. Two closely related species that share a variant provide little independent evidence, because they most likely inherited it from a recent common ancestor. The same variant appearing in a bat, a naked mole rat, and a bowhead whale, three phylogenetically distant lineages, is substantially more informative.

We apply a phylogenetic distance weighting to conservation scores using a modified version of the approach used in ConSurf and related tools. The basic idea is that conservation across phylogenetically distant lineages gets higher weight than conservation within a clade. We use a time-calibrated phylogeny covering all 42 species to assign weights, with branch lengths calibrated to divergence time estimates from TimeTree.

The output is a per-position conservation score that reflects not just whether a position is conserved, but the phylogenetic spread of that conservation. Positions that are invariant across all 42 species get high conservation scores, but that alone does not distinguish positions relevant to longevity from positions that are invariant simply because the protein cannot fold or function without them (e.g., the catalytic triad of an enzyme). We address this in the positive selection analysis step.
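One simple way to express distance weighting, shown here as a sketch rather than as the ConSurf-style method itself, is to weight each pair of species that shares the consensus residue by their pairwise divergence. The function name, the scoring formula, and the divergence values below are illustrative assumptions, not values taken from TimeTree.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet


def weighted_conservation(column: Dict[str, str], dist: Dict[FrozenSet[str], float]) -> float:
    """Fraction of total pairwise divergence covered by pairs sharing the consensus residue.

    column: species -> residue at one alignment position
    dist:   frozenset({species_a, species_b}) -> divergence (any consistent unit)
    """
    consensus, _ = Counter(column.values()).most_common(1)[0]
    total = 0.0
    shared = 0.0
    for a, b in combinations(sorted(column), 2):
        d = dist[frozenset((a, b))]
        total += d
        if column[a] == consensus and column[b] == consensus:
            shared += d
    return shared / total if total else 0.0


# Made-up divergence values: the two rodents are close, everything else is distant.
toy_dist = {
    frozenset(("rodent_1", "rodent_2")): 10.0,
    frozenset(("rodent_1", "bat")): 90.0,
    frozenset(("rodent_1", "whale")): 90.0,
    frozenset(("rodent_2", "bat")): 90.0,
    frozenset(("rodent_2", "whale")): 90.0,
    frozenset(("bat", "whale")): 80.0,
}

# Both columns have two of four species sharing residue K.
shared_by_close_pair = {"rodent_1": "K", "rodent_2": "K", "bat": "R", "whale": "Q"}
shared_by_distant_pair = {"rodent_1": "R", "rodent_2": "Q", "bat": "K", "whale": "K"}

score_close = weighted_conservation(shared_by_close_pair, toy_dist)
score_distant = weighted_conservation(shared_by_distant_pair, toy_dist)
print(round(score_close, 3), round(score_distant, 3))
```

A raw count would score these two columns identically. Under this weighting, the column where the shared residue spans distant lineages scores higher, which is the behavior the paragraph above describes.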

We also compute a longevity-stratified conservation score: we split the 42 species into a "long-lived" group (top quartile by body-mass-corrected lifespan) and a "short-lived" group (bottom quartile), and compute conservation separately in each group. Positions where the long-lived group shows higher conservation than the short-lived group — that is, positions where the variant is specifically maintained in long-lived species — receive an additional weighting factor. This is the central signal we are trying to isolate.
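The stratification can be sketched as follows. This example assumes that "body-mass-corrected lifespan" means the residual from a log-log regression of maximum lifespan on body mass, which is one common choice but is an assumption here. The species names, masses, and lifespans are invented.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def lifespan_residuals(species: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Residuals from ordinary least squares of log10(lifespan) on log10(body mass)."""
    names = list(species)
    x = [math.log10(species[n][0]) for n in names]
    y = [math.log10(species[n][1]) for n in names]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return {n: yi - (intercept + slope * xi) for n, xi, yi in zip(names, x, y)}


def quartile_groups(residuals: Dict[str, float]) -> Tuple[Set[str], Set[str]]:
    """Return (top quartile, bottom quartile) by residual."""
    ranked = sorted(residuals, key=residuals.get)
    k = max(1, len(ranked) // 4)
    return set(ranked[-k:]), set(ranked[:k])


def group_conservation(column: Dict[str, str], group: Iterable[str]) -> float:
    """Fraction of the group sharing the group's most common residue."""
    residues = [column[s] for s in group if s in column]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(residues)


# Invented (body mass in g, maximum lifespan in years) for eight hypothetical species.
toy_species = {
    "sp_a": (10, 14), "sp_b": (100, 4), "sp_c": (1000, 22), "sp_d": (10000, 25),
    "sp_e": (10, 3), "sp_f": (100, 20), "sp_g": (1000, 16), "sp_h": (10000, 35),
}
long_lived, short_lived = quartile_groups(lifespan_residuals(toy_species))

toy_column = {"sp_a": "K", "sp_f": "K", "sp_b": "R", "sp_e": "Q",
              "sp_c": "K", "sp_d": "R", "sp_g": "K", "sp_h": "K"}
stratified_signal = (group_conservation(toy_column, long_lived)
                     - group_conservation(toy_column, short_lived))
print(sorted(long_lived), sorted(short_lived), stratified_signal)
```

A positive difference means the position is more uniform among the long-lived group than among the short-lived group. With only a handful of species per quartile the estimate is noisy, which is one reason a significance check is needed downstream.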

Step 3: Positive Selection Analysis

Conservation tells you that something has been maintained. Positive selection analysis tells you that something has been actively favored. Positions showing signatures of positive selection in long-lived lineages are candidates where the amino acid change was not just tolerated but may have conferred a fitness advantage — potentially including longevity-related fitness.

We use a codon substitution model framework (based on the PAML/HyPhy methodology) to identify codons with elevated dN/dS ratios in long-lived lineages relative to the background rate across the tree. A dN/dS ratio significantly greater than 1.0 for a codon on a specific branch is evidence of positive selection at that position in that lineage. For our purposes, we are interested in cases where the elevated dN/dS recurs across multiple long-lived lineages rather than being a lineage-specific event in a single species.
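The "multiple long-lived lineages" requirement can be sketched as a post-processing step over per-branch results. The example assumes an external codon-model tool has already produced, for one codon, a dN/dS estimate and the log-likelihoods of a null and an alternative model on each branch. The input layout, the 1-degree-of-freedom chi-square approximation, and all numbers are illustrative assumptions; real tools report their own test statistics.

```python
import math
from typing import Dict, List, Set, Tuple


def lrt_pvalue(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test p-value under a chi-square distribution with 1 df.

    For 1 df, the survival function is erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.erfc(math.sqrt(stat / 2.0))


def convergent_selection(
    branch_results: Dict[str, Tuple[float, float, float]],
    long_lived: Set[str],
    alpha: float = 0.05,
    min_lineages: int = 2,
) -> List[str]:
    """Return long-lived branches with dN/dS > 1 and a significant LRT,
    but only if at least min_lineages such branches exist."""
    hits = [
        branch
        for branch, (omega, lnl_null, lnl_alt) in branch_results.items()
        if branch in long_lived and omega > 1.0 and lrt_pvalue(lnl_null, lnl_alt) < alpha
    ]
    return sorted(hits) if len(hits) >= min_lineages else []


# Hypothetical per-branch results for one codon: (dN/dS, lnL null, lnL alternative).
toy_results = {
    "bat": (2.4, -1000.0, -996.5),
    "whale": (1.8, -1000.0, -997.0),
    "mole_rat": (1.3, -1000.0, -999.5),  # dN/dS > 1 but not significant
    "mouse": (3.0, -1000.0, -995.0),     # significant, but not a long-lived lineage
}
selected = convergent_selection(toy_results, long_lived={"bat", "whale", "mole_rat"})
print(selected)
```

Requiring at least two independent long-lived branches filters out single-lineage events, at the cost of missing real signals that happen to occur in only one species in the set.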

Positive selection analysis is computationally expensive and statistically demanding. We apply it after the conservation filter — only to positions that show longevity-stratified conservation signals — rather than across all coding positions in all genes. Even with this filtering, false positives are a significant concern: certain types of codon composition bias and sequencing artifacts can produce spurious positive selection signals. We apply a series of quality filters including minimum sequence quality thresholds, alignment confidence cutoffs, and permutation-based significance thresholds.
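A permutation-based threshold of the kind mentioned above can be sketched by shuffling the long-lived and short-lived labels and asking how often a shuffled grouping produces a signal at least as strong as the observed one. This naive label shuffle is an assumption made for illustration: it ignores phylogenetic non-independence between species, so it should be read as the general shape of the test rather than a production procedure. All names and data are invented.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def conservation(residues: Sequence[str]) -> float:
    """Fraction of residues equal to the most common residue."""
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(residues)


def signal(column: Dict[str, str], long_lived: Sequence[str], short_lived: Sequence[str]) -> float:
    """Conservation in the long-lived group minus conservation in the short-lived group."""
    return (conservation([column[s] for s in long_lived])
            - conservation([column[s] for s in short_lived]))


def permutation_pvalue(
    column: Dict[str, str],
    long_lived: Sequence[str],
    short_lived: Sequence[str],
    n_perm: int = 1000,
    seed: int = 0,
) -> float:
    """One-sided permutation p-value with the standard +1 correction."""
    rng = random.Random(seed)
    observed = signal(column, long_lived, short_lived)
    pool: List[str] = sorted(long_lived) + sorted(short_lived)
    n_long = len(long_lived)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        if signal(column, pool[:n_long], pool[n_long:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


long_group = ["sp_a", "sp_b", "sp_c", "sp_d"]
short_group = ["sp_e", "sp_f", "sp_g", "sp_h"]

# Strong pattern: long-lived species all share K, short-lived species all differ.
strong_column = {"sp_a": "K", "sp_b": "K", "sp_c": "K", "sp_d": "K",
                 "sp_e": "R", "sp_f": "Q", "sp_g": "E", "sp_h": "D"}
# No pattern: the position is invariant everywhere.
flat_column = {s: "K" for s in long_group + short_group}

p_strong = permutation_pvalue(strong_column, long_group, short_group)
p_flat = permutation_pvalue(flat_column, long_group, short_group)
print(p_strong, p_flat)
```

The invariant column gets a p-value of exactly 1.0 because every shuffled grouping ties the observed signal of zero, which shows why universal conservation alone never passes this filter.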

We are not claiming that every signal we identify as positively selected in long-lived species is a validated longevity variant. Positive selection analysis at the scale we are running it will include false positives, and the prior probability that any given variant is mechanistically relevant to longevity is low. The analysis is a filter to enrich the candidate set for biological investigation, not a definitive identification of longevity genes.

Step 4: Pathway Enrichment and Druggability Filtering

After conservation scoring and positive selection analysis, we have a list of candidate positions across thousands of genes. Not all of these are relevant to our program. Pathway enrichment analysis identifies which biological pathways are over-represented among the top-scoring candidates, which helps prioritize research directions and serves as a sanity check that the pipeline is recovering known longevity-associated biology rather than random noise.
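Over-representation is commonly tested with a one-sided hypergeometric test, sketched below using only the standard library. The choice of test is an assumption about how such an analysis might be run, and the gene counts in the example are placeholders rather than results. In practice the p-values across pathways would also need multiple-testing correction, which is not shown.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_pathway: int,
                           n_candidates: int, n_overlap: int) -> float:
    """P(overlap >= n_overlap) when n_candidates genes are drawn at random
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_candidates)
    total = comb(n_background, n_candidates)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_candidates - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Placeholder numbers: 18,000 background genes, a 150-gene pathway,
# 250 top-scoring candidates, 12 of which fall in the pathway.
p_toy = hypergeom_enrichment_p(18000, 150, 250, 12)
print(p_toy)
```

With these placeholder counts the expected overlap by chance is about 2 genes, so an overlap of 12 yields a very small p-value.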

In our current implementation, the top-scoring candidates from the conservation and positive selection filters are consistently enriched in pathways including insulin/IGF-1 signaling, FOXO transcription factor activity, TOR/mTOR regulation, NAD+ metabolism and sirtuin biology, DNA repair and stress response, and mitochondrial biogenesis. This enrichment pattern is reassuring — these are the pathways that appear most consistently in the longevity biology literature across model organisms. It does not validate our method independently, but it confirms that the pipeline is not generating random candidates.

From the pathway-enriched candidate set, we apply a druggability filter with two components: target class assessment (is this a protein category that is accessible to mRNA-mediated upregulation?) and safety assessment (is there literature evidence that overexpression of this protein causes harm in any organism?).

The mRNA druggability assessment filters for proteins that are secreted, membrane-associated, or nuclear with known transcriptional effects — categories where protein-level changes produce measurable downstream effects on cell biology. Intracellular enzymes without established dose-response relationships are lower priority. Proteins with known dominant-negative isoform effects at high expression are flagged for special attention.

The safety filter scans the literature and cross-references the OMIM and ClinVar databases for gain-of-function variants in the candidate genes. A candidate gene where gain-of-function variants are associated with disease in humans is deprioritized, regardless of how strong the cross-species conservation signal is. This is a conservative filter and will remove some candidates that might be safe within a narrow expression window, but for a preclinical company without validated safety data, the conservative choice is appropriate.
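The two-part filter can be written as a small triage function. The sketch assumes that database and literature lookups have already been reduced to boolean annotations per gene; the field names, category labels, and example genes are hypothetical, and the order of the checks reflects the rule above that a gain-of-function disease association overrides everything else.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    target_class: str                 # e.g. "secreted", "membrane", "nuclear_tf", "intracellular_enzyme"
    gof_disease_association: bool     # any gain-of-function variant linked to human disease
    dominant_negative_isoform: bool   # known dominant-negative effect at high expression


# Classes treated as accessible to mRNA-mediated upregulation in this sketch.
PRIORITY_CLASSES = {"secreted", "membrane", "nuclear_tf"}


def triage(c: Candidate) -> str:
    """Apply the safety check first, then target class, then the isoform flag."""
    if c.gof_disease_association:
        return "deprioritized"
    if c.target_class not in PRIORITY_CLASSES:
        return "lower_priority"
    if c.dominant_negative_isoform:
        return "flagged"
    return "priority"


toy_candidates = [
    Candidate("GENE_A", "secreted", False, False),
    Candidate("GENE_B", "nuclear_tf", True, False),
    Candidate("GENE_C", "intracellular_enzyme", False, False),
    Candidate("GENE_D", "membrane", False, True),
]
triage_result = {c.gene: triage(c) for c in toy_candidates}
print(triage_result)
```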

Step 5: mRNA Designability Assessment

Not all proteins that pass the preceding filters are equally tractable as mRNA therapeutics. The final step in our pipeline assesses mRNA designability — a composite score that reflects how likely a given protein is to produce safe, consistent expression from an mRNA construct.

The designability score incorporates:

Coding sequence complexity: Genes with highly repetitive sequences, very high GC content, or extensive secondary structure in the mRNA coding region are harder to synthesize reliably and may show variable translation efficiency across cells. These are not disqualifying but affect design confidence.
Isoform diversity: Genes with many functionally distinct isoforms require careful selection of which isoform to encode, and mRNA delivery will not recapitulate the endogenous isoform mix. Genes where isoform diversity reflects tissue-specific biology (different isoforms in neurons vs. peripheral tissue) require isoform selection logic that adds complexity.
Protein size: Very large proteins (above approximately 1000 amino acids) produce very long mRNA transcripts that present synthesis, stability, and LNP encapsulation challenges. They are not impossible, but they require more development work than a 300-500 amino acid target.
Cell-type expression pattern: For our CNS-focused program, we score the target's baseline expression level in neurons and glial cells. A target that is not expressed in the target cell type at baseline may not have the cellular machinery required for its function even if the protein is successfully delivered.

The designability assessment does not determine whether a target makes it into the pipeline — it informs the order in which targets are pursued and the expected development complexity. A high-scoring target on conservation and positive selection with low designability complexity goes to the top of the priority list. A high-scoring conservation target with high designability complexity is deprioritized to a second tier, worth developing but not worth front-loading limited resources.
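To illustrate how a composite of this kind might feed the tiering, the sketch below counts simple complexity flags for a target that has already passed the earlier filters and assigns it to tier 1 or tier 2. Only the 1000 amino acid size threshold comes from the text above. The GC, isoform, and expression cutoffs, the equal weighting of the flags, and the example targets are assumptions, and sequence repetitiveness and mRNA secondary structure are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    cds: str              # coding sequence, DNA alphabet
    n_isoforms: int       # functionally distinct isoforms
    baseline_tpm: float   # baseline expression in the target cell type


def gc_fraction(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def complexity(t: Target, gc_high: float = 0.65, max_len_aa: int = 1000,
               max_isoforms: int = 3, min_tpm: float = 1.0) -> int:
    """Count how many design-complexity flags a target raises (0 to 4)."""
    flags = 0
    flags += gc_fraction(t.cds) > gc_high      # very high GC content
    flags += len(t.cds) // 3 > max_len_aa      # very large protein
    flags += t.n_isoforms > max_isoforms       # many isoforms to choose between
    flags += t.baseline_tpm < min_tpm          # little baseline expression in target cells
    return int(flags)


def tier(t: Target, max_complexity: int = 1) -> int:
    """Tier 1 for low design complexity, tier 2 otherwise. Never excludes a target."""
    return 1 if complexity(t) <= max_complexity else 2


compact = Target("TARGET_X", "ATGGCA" * 200, n_isoforms=2, baseline_tpm=12.0)    # 400 aa
unwieldy = Target("TARGET_Y", "ATGGCA" * 1100, n_isoforms=6, baseline_tpm=0.2)   # 2200 aa
tiers = {t.name: (complexity(t), tier(t)) for t in (compact, unwieldy)}
print(tiers)
```

Note that the function only orders targets into tiers and never removes one, which matches the role described above: designability changes the sequence of work, not the membership of the candidate list.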

Why Conservation-Based Targets Are a Safety Argument, Not Just an Efficacy Argument

The most common question we get about this approach is whether conserved variants are actually more likely to be safe when upregulated in humans. The argument we make is probabilistic rather than definitive.

If a gene variant has been maintained across multiple long-lived species over tens of millions of years of evolution, and if that maintenance reflects genuine selection (rather than neutral drift), then there is at least indirect evidence that the variant is compatible with healthy cellular function across diverse organismal contexts. The variant has been "tested" by evolution in multiple independent experiments and has not been eliminated. That does not mean it is safe to upregulate in humans — evolution optimizes for reproductive fitness, not longevity per se, and the effects at supraphysiological expression levels are unknown. But it provides a different prior probability than a novel synthetic variant with no evolutionary track record.

We are not saying that evolutionary conservation guarantees safety. We are saying it provides a lower bound on plausibility. A variant that exists in the same form in Greenland sharks, naked mole rats, and humans is less likely to be acutely toxic when expressed in human cells than a variant with no conservation signal, because it has been expressed in vertebrate cellular contexts for very long periods without apparent elimination. This is a weak argument by itself, but combined with the safety filter and the gain-of-function literature screen, it contributes to a risk prioritization framework that we think is more principled than alternatives.

Current Output and Where We Are Taking It

Our current pipeline has processed approximately 18,000 protein-coding human genes through steps 1-4. After conservation scoring, positive selection filtering, pathway enrichment analysis, and druggability filtering, we have a prioritized candidate list of roughly 200-300 genes. After applying the designability assessment, the top-priority tier for mRNA development contains approximately 20-30 targets.

Of these, we have advanced a small number into in vitro mRNA construct design and preliminary cell-based testing. FOXO3 and Klotho are the targets we have described most publicly because they have the strongest independent experimental validation in the longevity literature, which provides external corroboration of our computational prioritization. Identifying known longevity-associated genes as top scorers in an independent computational pipeline is encouraging, but we are cautious about inferring too much from it — a pipeline that recovers known biology is consistent with being well-calibrated, but it does not prove that the pipeline will successfully identify novel targets that validate in subsequent experimental work.

That experimental validation is the work we are now doing. The computational pipeline is a target identification method, not a drug discovery method. The targets it produces are hypotheses, and like all hypotheses, they need to be tested. What we have built is a principled, mechanistically motivated way to generate a smaller and more likely-valid set of hypotheses than random screening would produce. Whether those hypotheses hold up is the question that the next phases of our work will answer.