SEQUENCES OF BALAMUTHIA IN THE DNA DATABASES

(UPDATED November 2023)
NUCLEAR 18S rRNA GENE
As of November 2023, sequences for the nuclear 18S rRNA gene from 17 isolates of B. mandrillaris have been deposited into the DNA databases (or are available from information in publications or from the authors of reports). One isolate, CDC:V039, has been sequenced independently three times, leading to 19 sequences from known isolates in the database. Eleven sequences span almost the complete length of the gene (>1969 bases), representing nine of the isolates.
Most of the isolates of B. mandrillaris are very similar to one another in sequence for the 18S rRNA gene, showing much less genetic diversity than that seen in Acanthamoeba. As mentioned on the previous page, the lack of variation in the 18S rRNA gene prompted the examination of additional genes to obtain clues to the phylogenetic relationships between isolates.
In addition to the material from B. mandrillaris, sequences from two isolates of a second species, B. spinosa sp. nov., are also available, including one full-length sequence. As mentioned on the cover page for Balamuthia, the sequences from B. spinosa sp. nov. differ from those of B. mandrillaris by ~7% on average.
18S rRNA SEQUENCES FROM ENVIRONMENTAL SAMPLES SUGGEST TAXONOMIC DIVERSITY WITHIN BALAMUTHIA
Also among the sequences available in the database are 28 sequences deposited as unidentified uncultured eukaryotes from environmental surveys. Two of these survey sequences contain the complete 18S rRNA gene, and an additional 10 sequences from environmental samples exceed 900 basepairs in length. Nine of the environmental survey sequences appear to represent isolates of B. mandrillaris. In contrast, the remaining 18 environmental survey sequences are divergent from clinical samples, showing between ~93% and ~96% sequence identity with clinical isolates of B. mandrillaris. Further, these 18 sequences are not homogeneous, but rather appear to represent several different taxonomic groups. Whether this indicates the existence of separate taxa closely related to B. mandrillaris is unclear, but they are much closer to B. mandrillaris than they are to Acanthamoeba, Protacanthamoeba or any other known taxon. Care must be taken because the measured divergence could be due to sequencing errors, which sometimes occur in a fraction of environmental uncultured material. Nevertheless, the results suggest that the genus Balamuthia may contain substantial variation.
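For readers who want to see how a figure such as "~93% to ~96% sequence identity" is typically arrived at, the following is a minimal sketch in Python. It assumes the two sequences have already been aligned to the same length by some alignment program, and it counts identity only over columns where neither sequence has a gap; other conventions for handling gaps exist and give slightly different numbers. The function name and the two short fragments are invented for illustration and are not real Balamuthia sequences, nor is this necessarily the method used to obtain the values quoted above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # columns with a base in both sequences
    matches = 0   # columns where those bases are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (one common convention, an assumption here)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (NOT real Balamuthia sequences).
clinical = "ACGTACGTACGTACGTACGT"
survey = "ACGTACGTACCTACG-ACGT"

identity = percent_identity(clinical, survey)
divergence = 100.0 - identity
print(f"identity {identity:.1f}%, divergence {divergence:.1f}%")
```

With real data, short survey fragments cover only part of the gene, so an identity value computed this way describes only the overlapping region and a single sequencing error shifts it more than it would for a full-length sequence.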
CHARACTERISTICS OF BALAMUTHIA 18S rRNA SEQUENCES IN THE DATABASES
The initial 18S rRNA sequence from Balamuthia mandrillaris was reported in 1998. As detailed above, 49 Balamuthia or Balamuthia-like sequences have been reported, including the sequences from B. spinosa sp. nov. and the environmental samples. The temporal pattern of reports is shown in the figure below.
The spikes in 2014 and 2016 represent reports from environmental surveys. Other reports, including those from clinical samples, have appeared at a steady, low and apparently random rate.
The sequences that have been deposited for the 18S rRNA gene range in size from 103 bp to 2005 bp (the latter for the sequence from B. spinosa sp. nov.). The distribution of the 49 sequences that have been included in the database is shown below.
The largest group of sequences consists of small segments less than 300 basepairs in length, many from the environmental studies that made use of conserved universal PCR primers for the 18S rRNA gene. The last class (between 1900 and 2005 bp) represents studies that attempted to obtain either almost-complete or complete sequences of the gene. Complete sequences were obtained from analyses of the WGS genome projects of three isolates: CDC:V039 [ATCC 50209], B. mandrillaris 2046 and B. mandrillaris Itson-01.
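A length distribution of this kind can be tallied with a few lines of Python. The sketch below is only an illustration: the 300 bp bin width is an assumption (the class boundaries used for the figure are not stated here), the function names are invented, and the list of lengths is a made-up example rather than the actual 49 database entries.

```python
from collections import Counter


def size_class(length_bp: int, width: int = 300) -> str:
    """Return a label for the fixed-width size class containing length_bp."""
    if length_bp < 0:
        raise ValueError("length cannot be negative")
    low = (length_bp // width) * width
    return f"{low}-{low + width - 1} bp"


def tally_size_classes(lengths, width: int = 300) -> Counter:
    """Count how many sequence lengths fall into each size class."""
    return Counter(size_class(n, width) for n in lengths)


# Hypothetical sequence lengths in bp, for illustration only.
example_lengths = [103, 250, 287, 912, 1969, 2005]

counts = tally_size_classes(example_lengths)
# Print the classes in order of their lower bound.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(label, counts[label])
```

Changing `width` shows how strongly the apparent shape of the distribution depends on the chosen class boundaries, which is worth keeping in mind when comparing such figures between genes.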
 
MITOCHONDRIAL 16S-LIKE rRNA GENE SEQUENCES
The sequence of the mitochondrial 16S-like rRNA gene from B. mandrillaris has proven to be more variable than its nuclear homologue, although levels of variation are still limited. As a consequence, the mt 16S-like rRNA gene has come to be the target of a greater number of studies within Balamuthia. Currently there are 68 sequences representing 56 isolates that have been deposited in the DNA databases for the mitochondrial 16S-like rRNA gene, plus information for 4 undeposited sequences. The pattern of deposition of mt 16S-like rRNA gene sequences over the past 20+ years is shown in the following figure. The spikes in deposited sequences for 2014 and 2016 are simply the result of an uncoordinated set of depositions and, unlike the case for the 18S rRNA sequences, were not the result of extensive microbiome sampling.
Twelve of the sequences of the mt 16S-like rRNA gene come from whole mitochondrial genome sequences, nine of which were released in August 2015. These 12 sequences represent the entire length of the 16S-like rRNA gene and are 1470 and 1487 bp in length (the differences reflect disagreement about the exact start and end of the rRNA transcript). Other, incomplete sequences in the DNA databases range from 1109 bases in length down to 84 nucleotides. In total, 52 of the sequences exceed 800 nucleotides in length. Only three of the sequences come from uncultured samples. No Balamuthia-like 16S-like rRNA sequences have been reported from purely environmental surveys of taxonomic genomic diversity. The distribution of sequence lengths for the gene in the DNA databases is shown below.
 
MULTIPLE GENE STUDIES
Seven isolates of B. mandrillaris have been studied for both of the rRNA genes.  These isolates are CDC:V039, CDC:V188, CDC:V433, CDC:V451, CDC:V630, Itson01 and B. mandrillaris strain 2046. All seven of these isolates have also had their complete mt-genome determined. 
ENVIRONMENTAL ISOLATES
As mentioned, a series of sequences that cluster with the Balamuthia 18S rRNA sequence has been deposited in the DNA databases from uncultured eukaryotes observed during environmental microbiome studies. However, the proportion of such environmental sequences is much lower than that seen in analyses of sequences from Vermamoeba, Naegleria, Acanthamoeba or even Protacanthamoeba. To us, this strongly suggests that Balamuthia is truly less frequent in the environment than those other groups of free-living amoebae.
 
GENOME INFORMATION ABOUT THE BALAMUTHIA MANDRILLARIS NUCLEAR GENOME:
At the end of 2015, a genome sequence was reported in the databases for the original B. mandrillaris type strain, CDC:V039 [ATCC 50209] (Detering et al. 2015. Genome Announcements 3: e01013-15), deposited as WGS project LFUI. The size of the nuclear genome of B. mandrillaris CDC:V039 was estimated to be 44.27 Mb. The most closely related genome (excluding redundant repeated sequences) was, unsurprisingly, Acanthamoeba sp. Neff. Initially, little detailed analysis of the nuclear genome sequence had been reported other than comparisons of the repeated 18S and 28S rRNA gene segments. Soon after, a second genome sequence was shared, obtained from material isolated from a PAM survivor (B. mandrillaris 2046) (Greninger et al. 2015. Genome Medicine 7: 113). The genomic material was deposited as WGS project LEOU. Comparisons between the two isolates began to be conducted. Recently (2023), the genome sequence for a third isolate, the environmental isolate ITSON-01, was reported (Otero-Ruiz et al. 2023), including an expanded comparison among the three genomes. It has been deposited as WGS project JAVKOQ.
 
THE BALAMUTHIA MITOCHONDRIAL GENOME:
In addition to information concerning the nuclear genome, Greninger et al. (2015) reported the complete sequence of the mitochondrial DNA genome from B. mandrillaris 2046 (axenic culture). Additionally, they reported the complete mt-genome sequences of four additional PAM isolates of B. mandrillaris: CDC:V039, CDC:V188 [BEI NR-46452] (in both a standard and an axenic form), CDC:V451 [ATCC PRA-291], and B. mandrillaris SAM, as well as the complete mt-genome sequence from the frozen non-axenic strain of B. mandrillaris 2046. Greninger et al. (2015) also reported the complete mt-genome sequences from two environmental isolates: B. mandrillaris RP-5 and B. mandrillaris OK1. The mt-genomes ranged in size from 39,996 bp (B. mandrillaris strain V039) to 42,823 bp (B. mandrillaris strain OK1). Subsequently, in 2022, two additional complete mt-genomes were deposited. The first represented an isolate from a clinical PAM patient in Hong Kong (B. mandrillaris strain KM-20), while the second represented the environmental isolate ITSON01 from Mexico.
 
A PERSONAL NOTE ABOUT RESEARCH ON BALAMUTHIA:
Examination and comparison of the WGS genome information from three different isolates of B. mandrillaris is obviously an important focus of research. On a personal note, I think it remarkable that with data from only about 100-150 isolates of B. mandrillaris, we already have three complete WGS projects to study. In contrast, with more than 6000 isolates, Acanthamoeba had only ~35 WGS sequences to compare. Whole genome sequencing, and the bioinformatics approaches that it requires, will bring us tremendous advances in the future. I look forward to these advances and express a profound frustration that my career mostly spanned a time when these tools were not available. I envy the young researchers who are examining the organisms of which I write. Good luck and good research.
