Pangolin Genetics: What the Genome Reveals

Published: June 18, 2025

In 2020, a research consortium published a chromosome-level reference genome for the Sunda pangolin (Manis javanica), providing the first detailed molecular map of a species that had become the world's most trafficked wild mammal. The genome brought pangolin biology into the era of comparative genomics, allowing scientists to address questions that field observation alone could never answer: why pangolins grow scales no other mammal produces, how their immune systems function, how far they diverged from other mammals, and what their genetic diversity tells us about extinction risk. The findings have reshaped thinking across conservation genetics, evolutionary biology, and wildlife forensics.

The 2020 Manis javanica Genome

The reference genome for Manis javanica assembled approximately 2.46 gigabases across 23 chromosomes with high contiguity, placing it among the more complete mammalian genome assemblies available at the time of publication. The assembly was produced using a combination of long-read sequencing technologies that resolve repetitive regions which short-read methods had previously fragmented or collapsed.
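Contiguity is commonly summarised with the N50 statistic: the sequence length at which half of all assembled bases sit in sequences of that length or longer. The sketch below computes it for a made-up list of scaffold lengths. The values are purely illustrative and are not taken from the Manis javanica assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until at least half
    # of the assembled bases have been accounted for.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical scaffold lengths in megabases (not real pangolin data).
toy_scaffolds_mb = [150, 120, 90, 60, 40, 20, 10, 5, 5]
print(n50(toy_scaffolds_mb))  # 120
```

A higher N50 for the same total assembly size means that more of the genome sits in long, unbroken sequences.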

Annotation of the genome identified approximately 19,700 protein-coding genes, broadly comparable to other mammalian genomes but with notable expansions and contractions in specific gene families. Researchers also sequenced transcriptomes from multiple tissue types including scale-forming skin, liver, lung, and peripheral blood, enabling identification of genes active specifically during scale development.

Subsequent partial genome assemblies and resequencing efforts have targeted the other seven pangolin species, including Temminck's ground pangolin (Smutsia temminckii) and the white-bellied tree pangolin (Phataginus tricuspis), providing a multi-species comparative framework that has accelerated understanding of trait evolution across the pangolin clade.

Gene Families for Keratin and Scale Development

Pangolin scales are composed primarily of alpha-keratin and beta-keratin proteins, with the beta-keratin component providing the rigidity that distinguishes scales from mammalian hair and nails. The Manis javanica genome contains an expanded repertoire of keratin-associated protein genes (KAPs) and a distinctive cluster of cornification-related genes that are either absent or minimally represented in other mammalian genomes.

Skin transcriptome data from scale-bearing regions showed high expression of specific KAP genes during active scale growth, including several paralogs (gene duplicates) that appear to have arisen through tandem gene duplication events within the pangolin lineage. The cornified envelope proteins encoded by these genes form the structural scaffold of the scale, with sulphur-rich proteins cross-linking the matrix to produce mechanical hardness.

The evolutionary origins of pangolin scales have been debated for over a century. The genome data supports the view that pangolin scales are not homologous to reptile scales or any other amniote scale system, but represent an independent evolutionary solution to body armour derived from modified hair follicle programmes. The genetic architecture underlying scale formation shares regulatory elements with mammalian hair development but has been substantially elaborated and redirected through positive selection.

Immune System Genes and the ACE2 Receptor

The pangolin immune gene repertoire attracted intense scrutiny following the emergence of SARS-CoV-2 in late 2019. Genomic analyses published in 2020 and 2021 examined the structure of the pangolin angiotensin-converting enzyme 2 (ACE2) receptor, the cell-surface protein that coronaviruses use to enter host cells.

The key finding was that the pangolin ACE2 receptor contains the same five critical amino acid residues at the receptor-binding domain interface as human ACE2, a configuration that differs from the ACE2 of most other mammals studied. This structural similarity means that pangolin ACE2 can bind spike proteins from SARS-like coronaviruses with relatively high affinity, consistent with pangolins serving as hosts for coronaviruses that are closely related to the SARS-CoV-2 lineage.

Coronaviruses isolated from Sunda (Malayan) pangolins seized in anti-smuggling operations in southern China between 2017 and 2019 showed receptor-binding domains with high amino acid similarity to SARS-CoV-2. The prevailing scientific view is that these findings identify pangolins as hosts of related viruses but do not establish pangolins as the direct proximate source of SARS-CoV-2; the question of the immediate precursor and transmission pathway remains unresolved as of the mid-2020s. What the genomic data does establish clearly is that pangolin-associated coronaviruses have been subject to selection pressures that favour binding to ACE2 receptors compatible with human cells, a finding with direct implications for pandemic preparedness research.

Beyond ACE2, the pangolin immune gene repertoire shows contraction in several innate immune gene families relative to other mammals, including reductions in the STING pathway genes that mediate interferon responses to cytosolic DNA. Whether this represents adaptation to chronic viral tolerance or vulnerability to novel pathogens is an active area of investigation.

Evolutionary Divergence from Other Mammals

Molecular clock analyses using the Manis javanica genome and multiple calibration points from the mammalian fossil record place the divergence of Pholidota (pangolins) from Carnivora (cats, dogs, bears, and relatives) at approximately 79 to 87 million years ago, in the Late Cretaceous. The pangolin-carnivore split therefore predates the mass extinction event at the Cretaceous-Paleogene boundary, 66 million years ago, by roughly 13 to 21 million years.

Within Pholidota, the split between the African pangolin genera (Smutsia and Phataginus) and the Asian genus (Manis) is estimated at 40 to 50 million years ago. This timing aligns with major geological and climatic reorganisation events associated with the Eocene epoch, when continental configurations and vegetation zones were substantially different from today. The deep divergence between African and Asian clades has important implications for conservation genetics, as the two groups cannot be treated as interchangeable in managed breeding programmes.

Synteny analysis (comparison of gene order across chromosomes) shows that pangolin chromosomes retain significant blocks of ancestral mammalian chromosome organisation, with less chromosomal rearrangement than many other mammalian orders. This genomic conservatism is somewhat unexpected given the highly derived morphological characteristics of pangolins and suggests that large-scale chromosomal restructuring was not a primary driver of pangolin diversification.
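As a rough illustration of what a synteny comparison looks for, the sketch below finds runs of genes that sit next to each other, in the same order, in two species. The gene names and orders are invented, and the function `conserved_blocks` is a hypothetical helper rather than part of any published pipeline. Real synteny tools also handle inversions, gene gaps, and orthology assignment, which this simplified version ignores.

```python
def conserved_blocks(order_a, order_b, min_genes=2):
    """Find runs of genes that are consecutive and same-order in both lists."""
    position_b = {gene: i for i, gene in enumerate(order_b)}
    blocks, current = [], []
    for gene in order_a:
        pos = position_b.get(gene)
        # A gene extends the current block only if it directly follows
        # the previous gene in species B as well.
        extends = (
            bool(current)
            and pos is not None
            and pos == position_b[current[-1]] + 1
        )
        if extends:
            current.append(gene)
            continue
        if len(current) >= min_genes:
            blocks.append(current)
        # Start a new block, unless the gene is absent from species B.
        current = [gene] if pos is not None else []
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Invented gene orders along one chromosome in two species.
species_a = ["g1", "g2", "g3", "g4", "g5", "g6", "g7"]
species_b = ["g1", "g2", "g3", "g6", "g7", "g4", "g5"]
print(conserved_blocks(species_a, species_b))
# [['g1', 'g2', 'g3'], ['g4', 'g5'], ['g6', 'g7']]
```

Fewer, longer blocks indicate less rearrangement between the two genomes, which is the pattern described above for pangolin chromosomes.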

Conservation Genetics: Inbreeding and Genetic Rescue

Population-level genomic studies using reduced-representation sequencing (RADseq) and whole-genome resequencing have characterised genetic diversity in several pangolin species. The results are concerning for long-term viability. Ground pangolin populations in fragmented South African habitat show elevated runs of homozygosity (ROH), a genomic signature of recent inbreeding, in populations where fewer than 50 to 80 individuals occupy isolated reserves or farm clusters.

Inbreeding coefficients estimated from ROH data in some isolated South African ground pangolin groups exceed values associated with inbreeding depression in other mammalian species, including reduced litter viability and increased susceptibility to pathogen challenge. These findings have informed management recommendations for genetic rescue, the deliberate translocation of individuals from genetically distinct populations to restore diversity in inbred groups.
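A common ROH-based inbreeding coefficient, usually written F_ROH, is the fraction of the analysed genome that lies inside runs of homozygosity. The sketch below shows the arithmetic with invented segment coordinates. The 1 Mb minimum segment length is an assumed threshold, and the 2.46 Gb denominator simply reuses the assembly size quoted earlier for illustration; actual studies typically use the autosomal length covered by their markers.

```python
def f_roh(roh_segments, genome_length_bp, min_length_bp=1_000_000):
    """Fraction of the genome covered by ROH segments above a length cut-off.

    roh_segments: iterable of (start_bp, end_bp) pairs, assumed non-overlapping.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    covered = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp  # ignore short segments
    )
    return covered / genome_length_bp


# Invented ROH calls for one individual (not real pangolin data).
toy_segments = [
    (0, 5_000_000),             # 5 Mb run, counted
    (10_000_000, 10_500_000),   # 0.5 Mb run, below the assumed cut-off
    (20_000_000, 138_000_000),  # 118 Mb run, counted
]
print(f_roh(toy_segments, genome_length_bp=2_460_000_000))  # 0.05
```

Long runs in particular point to recent inbreeding, because recombination has not yet had time to break them up.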

Genetic rescue has been successfully applied to other mammalian species, including Florida panthers and Isle Royale wolves. For pangolins, implementation faces practical barriers: low population densities, the difficulty of capturing and safely translocating animals, and uncertainty about social compatibility. The genomic data now available for southern African ground pangolin populations provides the population structure maps needed to identify appropriate source populations for future rescue translocations.

DNA Barcoding for Wildlife Law Enforcement

All eight pangolin species are now listed in CITES Appendix I, but accurate species identification remains essential for prosecuting trafficking offences and for tracing where seized material originated. Visual identification of scales from different species is unreliable, particularly for processed and mixed shipments. Molecular identification using DNA barcoding resolves this problem.

Standard barcoding protocols use the mitochondrial cytochrome c oxidase subunit I (COI) gene or cytochrome b gene to generate a short sequence that can be matched against reference databases. For pangolins, all eight species produce distinct barcode sequences, and the method successfully identifies species from scale fragments, blood traces, dried meat, and museum specimens. Reference libraries compiled from verified voucher specimens from known geographic origins now cover all eight species.
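The matching step can be sketched as a simple identity comparison between a query barcode and each reference sequence. Everything below is illustrative: the 40-base sequences are invented rather than real COI data, the sequences are assumed to be pre-aligned to the same length, and the 0.95 identity threshold is an arbitrary choice. Forensic workflows rely on proper alignment tools and curated reference databases.

```python
def identity(seq_a, seq_b):
    """Proportion of matching positions between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def best_match(query, references, min_identity=0.95):
    """Return (species, identity) for the closest reference.

    Species is None when no reference reaches the threshold.
    """
    scored = sorted(
        ((identity(query, seq), name) for name, seq in references.items()),
        reverse=True,
    )
    top_score, top_name = scored[0]
    if top_score < min_identity:
        return None, top_score
    return top_name, top_score


# Invented reference barcodes (not real COI sequences).
toy_references = {
    "Manis javanica":       "ACGTTGCAAGCTTAGGCTAACCGTTAGCATCGGATCCTAG",
    "Smutsia temminckii":   "ACGTTGTAAGCTCAGGCTGACCGTTGGCATCGAATCCTGG",
    "Phataginus tricuspis": "ACATTGCAGGCTTAGACTAACTGTTAGCGTCGGATTCTAG",
}

# A query that differs from the first reference at a single position.
toy_query = "ACGTTGCAAGCTTAGGCTAACCGTTAGCATCGGATCCTAA"
print(best_match(toy_query, toy_references))
```

Returning no species when the best score falls below the threshold avoids forcing an identification for degraded or contaminated samples.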

South African forensic wildlife laboratories and TRAFFIC's seizure analysis programme have applied barcoding to specimens from numerous enforcement actions. The technique has confirmed the presence of species beyond the geographic range implied by trafficking routes. For example, Sunda pangolin scales have been identified in consignments described as originating from African sources, which is evidence of deliberate mislabelling intended to confuse enforcement responses. Expanded reference databases that include geographic haplotype variation within species are enabling increasingly refined provenance analysis, pointing toward specific source populations or even geographic subregions within a species' range.

Frequently Asked Questions

When was the pangolin genome first sequenced?

A high-quality reference genome for the Sunda pangolin (Manis javanica) was published in 2020. This genome assembly provided the first comprehensive view of pangolin gene content and enabled comparative analyses with other mammalian genomes, revealing the molecular basis of several pangolin-specific traits including scale composition and immune gene repertoire.

Are pangolins genetically related to cats or dogs?

Pangolins belong to the order Pholidota, which is the sister group to Carnivora (the order containing cats, dogs, bears, and related families). They diverged from the carnivore lineage approximately 79 to 87 million years ago. Despite some superficial resemblances to armadillos, pangolins are not closely related to them: armadillos are xenarthrans, a deeply divergent mammalian lineage.

What is DNA barcoding and how does it help identify trafficked pangolins?

DNA barcoding uses short, standardised gene regions (typically mitochondrial COI or cytochrome b) to identify a species from a tissue sample. For law enforcement, this means that a scale fragment, blood spot, or small piece of dried meat found in a seizure can be matched against a reference database to determine which of the eight pangolin species it came from. This is critical because visual identification of processed scales is unreliable, and because knowing the species is the first step in tracing where a shipment originated.

What is inbreeding depression and why does it threaten pangolin populations?

Inbreeding depression is the loss of fitness that arises when closely related individuals mate: their offspring are more likely to inherit two copies of harmful recessive mutations that would otherwise remain masked. For pangolins, habitat fragmentation has divided formerly continuous populations into small remnant groups. Genetic studies have documented elevated inbreeding coefficients in isolated ground pangolin populations in South Africa, at levels associated in other mammalian species with reduced reproductive success and impaired immune function.

What did the pangolin genome reveal about keratin genes?

Analysis of the Manis javanica genome identified a substantial expansion of the keratin-associated protein (KAP) gene family and specific beta-keratin gene clusters not found in other mammalian orders. These expanded gene families underpin the production of the hard, overlapping scales unique to pangolins. The gene duplications appear to have occurred over a relatively short evolutionary window, suggesting rapid adaptive evolution of the scale-producing system.