THE ACANTHAMOEBA GENOME SEQUENCES IN THE DNA DATABASES: WE NEED TO APPLY CAUTION IN THEIR USE
(updated January 2020)
As the cost of DNA sequencing has declined, efforts have been made to determine the complete genome sequences of a number of isolates belonging to the genus Acanthamoeba. However, some issues have arisen that suggest that great caution must be taken in interpreting the genome information from different isolates within the genus (see below). At the beginning of 2017, nuclear genome sequences (most in the form of unlinked contig sequences) were accessible for 15 isolates of Acanthamoeba. Complete mitochondrial DNA sequences were available from 16 isolates. Notably, no sequence of either genome is available for the type isolate of the genus as described by Castellani in a series of papers in 1930 (isolate A. castellanii AC30; ATCC 30011 or CCAP 1501/10, and subcultured and deposited as ATCC 30234 and ATCC 50374).
THE GENOME SEQUENCE FOR A. sp. Neff
The first genome released for use by the community was that of the Neff strain of Acanthamoeba (ATCC 30010), released in 2013 as NCBI Reference Sequence: NZ_AHJI00000000.1. The genome sequence was obtained through a whole genome shotgun sequencing project. This genome sequence has been well annotated, and will serve as the template to which other genome sequences can be compared. However, the phylogenetic position of the Neff strain and the modest frequency of Neff-like natural isolates within the universe of Acanthamoeba isolate sequences in the databases suggest that it may not be the best standard to represent the genus, or even sequence type T4.
REFERENCE: Clarke, M. et al. 2013. Genome of Acanthamoeba castellanii highlights extensive lateral gene transfer and early evolution of tyrosine kinase signaling. Genome Biol. 14 (2), R11.
[Note on the species classification of the Neff strain: The Neff strain has traditionally been classified as belonging to the species A. castellanii. However, this is based on morphology, not gene phylogeny. When the sequences of several genes (discussed elsewhere on this site) are compared, it appears clear that this species designation is not appropriate. Although the Neff strain of Acanthamoeba is a member of the T4 sequence type, within T4 it is not closely related to the type strain of A. castellanii (the original type strain described by Castellani is available as ATCC 30011, or sub-cultured as ATCC 30234 and ATCC 50374). The Neff strain represents a fairly small group of well differentiated isolates (sequence subtype T4-neff), as indicated on the page detailing the phylogenetic relationships among T4 isolates. Our recommendation is that the isolate be labeled as A. sp. Neff to indicate that it does not represent an isolate of A. castellanii.]
THE GENOME SEQUENCES OF A. polyphaga Linc Ap-1
During 2016, investigators at Kingston University deposited sequences from the whole genome shotgun sequencing project of Acanthamoeba polyphaga strain Linc Ap-1 (accession # LQHA01000000). This isolate is typed as a member of sequence type T4, subtype T4A. The complete mitochondrial DNA genome has also been deposited (accession # KP054475). The species classification of this isolate is also somewhat in doubt. Page (1967) described A. polyphaga, based on eight isolates. At least two of these isolates (ATCC 30871; CCAP 1501/3a and ATCC 30872; CCAP 1501/3b) exist in the culture centers. The 18S rRNA gene sequence of ATCC 30871 has been determined and is clearly distinct from that of Linc Ap-1 (ATCC 30871 belongs to sequence type T4 subtype T4E, while Linc Ap-1 belongs to T4 subtype T4A), thus raising questions about the use of “polyphaga” to describe the isolate. The classification of ATCC 30872 is more problematic. The only 18S rRNA sequence directly reported to be from ATCC 30872 (acc # AY026244) places this isolate in sequence type T2/6, subtype A. As discussed below under the other genome sequences, one of the genome projects purported to use ATCC 30872. The 18S rRNA sequence from this genome project does not agree at all with sequence AY026244, nor does it match any other 18S rRNA sequence in the database. It does place the isolate within sequence type T4 subtype T4B. If the genome project does in fact truly represent ATCC 30872, this again raises objections to the use of “polyphaga” for Linc Ap-1 (as well as doubts about the general utility of “polyphaga” as a correct taxonomic name for any Acanthamoeba isolate).
THE GENOME SEQUENCES OF OTHER MEMBERS OF ACANTHAMOEBA (UNIVERSITY OF LIVERPOOL; JANUARY 2015)
Recently, the genome sequences of 14 isolates of Acanthamoeba were released to the international DNA databases, the result of whole genome sequencing by next-generation sequencing procedures. Originally these sequences were released under only species names, with no identification of the source isolates. Unfortunately, a series of mislabelings appears to have occurred at some point (currently unknown), which rendered the species and isolate designations of a number of the genome sequences problematic. We have been cooperating with one of the PIs of this project (Andrew Jackson of the University of Liverpool) to clarify the isolate designations that should be applied to these sequences.
Together with Dr. Jackson, we have analyzed the sequences for a set of genes (nuclear 18S rRNA, mitochondrial 16S-like rRNA, mitochondrial cytochrome oxidase subunit 1, and a set of partial sequences from 5 nuclear genes: beta-tubulin, elongation factor-1, glyceraldehyde-3-dehydrogenase, glycogen phosphorylase 1, and RasC). This set of sequences has allowed us to identify, with extremely high or moderately high probability, the source of 13 of the 14 isolates from which genome sequences were obtained.
The following list provides the original species designation of each isolate and the putative source isolate from which the DNA for each genome project was obtained. It then provides the results of our analysis and the best estimate of the correct identification of the source of project material.
We are collaborating with Dr. Jackson to provide a description of the patterns of genome differentiation as seen in light of the available WGS project genome sequences.
We are also attempting to retrieve the sequence of the whole mitochondrial genomes from these genome samples.
[Please note that we are not currently making judgements concerning whether species attributions originally associated with any of the standard ATCC strains are appropriate. Given analysis based either on genome sequences or on sequences of genes such as the 18S rRNA gene, it is likely that some species names will be viewed as synonymous with alternative appropriate designations. Future postings on this site will deal with the question of “species” within Acanthamoeba as revealed by molecular phylogenetics.]
CORRECT ATTRIBUTION OF GENOME SEQUENCES
Six of the Liverpool genome sequences are correctly attributed to isolates. These are:
A. astronyxis (WGS Project: CDFH01): genome source: ATCC 30137; (sequence type T7)
A. culbertsoni (WGS Project: CDFF01): genome source: ATCC 30171 (strain A1); (sequence type T10)
A. lenticulata (WGS Project: CDFG01): genome source: ATCC 30841 (isolate PD2S); (sequence type T5)
A. lugdunensis (WGS Project: CDFB01): genome source: ATCC 50240 (isolate L3a); (sequence type T4, subtype T4A)
A. quina (WGS Project: CDFN01): genome source: ATCC 50241 (isolate Vil3); (sequence type T4, subtype T4A)
A. rhysodes (WGS Project: CDFC01): genome source: ATCC 30973 (isolate Singh); (sequence type T4, subtype T4D)
A seventh genome sequence appears to consist primarily of sequences from the correct source, but analysis of mitochondrial DNA sequence reads suggests that it also contains a minority of short sequence reads that may be contaminants from another source (probably A. culbertsoni, above).
A. mauritaniensis (WGS Project: CDFE01): genome source: ATCC 50253 (isolate 1652); (sequence type T4, subtype T4D)
PROBLEMATIC ATTRIBUTION OF GENOME SEQUENCES TO ISOLATES
For the remaining seven Liverpool genome sequences, problems exist in identifying the correct source. We list the isolates in order of the certainty of their attribution, from most certain to least certain.
- A. castellanii (WGS Project: CDFL01): putative genome source: ATCC 50370 A. castellanii (isolate Ma); (sequence type T4, subtype T4B)
Probably attributed correctly. Evidence from the 18S rRNA gene suggests the possibility of minor contamination, or multiple allelism of the 18S rRNA sequence.
- “A. healyi” (WGS Project: CDFA01): putative genome source: ATCC 30866 A. healyi (isolate OC-3A)
ERRONEOUS IDENTIFICATION: The WGS sequences do not match ATCC 30866 A. healyi.
CORRECT IDENTIFICATION: The sequences are a match to ATCC 30870, A. palestinensis Reich (sequence type T2).
- “A. palestinensis” (WGS Project: CDFD01): putative genome source: ATCC 30870 A. palestinensis Reich
ERRONEOUS IDENTIFICATION: The WGS sequences do not match ATCC 30870 A. palestinensis Reich.
CORRECT IDENTIFICATION: The sequences are a match to ATCC 50254, A. triangularis (isolate SH621); (sequence type T4, subtype T4F).
- “A. pearcei” (WGS Project: CDFJ01): putative genome source: ATCC 50435, A. pearcei.
ERRONEOUS IDENTIFICATION: The WGS sequences do not match ATCC 50435, A. pearcei.
CORRECT IDENTIFICATION: problematic. Comparisons are not absolutely conclusive as to the identity of the source strain. The most likely source is Acanthamoeba sp. ATCC 50496 (strain Galka), but other similar standard strains are possible, though less likely. The WGS sequences for the 18S rRNA show a close but not exact match to previous sequences from ATCC 50496 A. sp. Galka (BCM:1282:324). However, the previous sequence from the 16S-like rRNA from A. sp. Galka ATCC 50496 does not match the WGS results as closely as do sequences from some other isolates. (Sequence type T4, subtype T4A)
- “A. polyphaga” (WGS Project: CDFK01): putative genome source: ATCC 30872 A. polyphaga (CCAP 1501/3b).
ERRONEOUS IDENTIFICATION: As mentioned above, the WGS sequences of the 18S rRNA do not match the sequence deposited in the DNA databases (AY026244) to represent ATCC 30872 A. polyphaga (CCAP 1501/3b). This single sequence is the only prior comparative sequence information reported from ATCC 30872 A. polyphaga (CCAP 1501/3b).
CORRECT IDENTIFICATION for WGS Project: CDFK01: problematic. The WGS sequences for several genes show close matches to previous sequences from several ATCC isolates within the sequence subtype T4B. Analyses of 18S rRNA, 16S-like rRNA and Cox-I all suggest three possible source isolates, equally likely. If the sequence (AY026244) is correct, then the best guess for the source of the DNA used in this WGS project is ATCC 50372, A. polyphaga JAC/S2 (given sequence similarity to previous sequences and some overlap of ATCC number). If AY026244 was inappropriately attributed to ATCC 50372, then WGS project CDFK01 could be the first data for this ATCC isolate. (Sequence type T4, subtype T4B)
- “A. divionensis” (WGS Project: CDFI01): putative genome source: ATCC 50238 A. divionensis
ERRONEOUS IDENTIFICATION: The WGS sequences do not match ATCC 50238 A. divionensis.
CORRECT IDENTIFICATION: The WGS project sequences are a match to ATCC 30137 A. astronyxis. Project sequences appear to be from a duplicate sample of A. astronyxis.
- “A. royreba” (WGS Project: CDEZ01): putative genome source: ATCC 30884 A. royreba (Oak Ridge).
ERRONEOUS IDENTIFICATION: The WGS sequences do not match ATCC 30884 A. royreba Oak Ridge.
CORRECT IDENTIFICATION: very problematic. The sequence reads from the 18S rRNA gene of the WGS project DO NOT MATCH the sequence of any previously described Acanthamoeba isolate (neither a known and described isolate from a culture center nor an isolate reported from nature). The isolate ATCC 30884 A. royreba (Oak Ridge) has previously been identified through multiple sequences as a member of Acanthamoeba sequence type T4, subtype T4D. These do not match the information from the genome project. The sequence from the 18S rRNA gene differs from all previously described Acanthamoeba 18S rRNA sequences by more than 10%. Nevertheless, the WGS sequence has the expansion segments within the 18S rRNA gene sequence characteristic of Acanthamoeba. WGS sequences from other genes show a similarly large divergence from the genes of known Acanthamoeba isolates. This unknown isolate may represent a sample from one of the ATCC standard isolates for which no previous sequence information has been obtained, or it may represent some isolate of unknown origin. Whatever its ultimate identification, it appears to represent a new sequence type that is quite distinct from all previously described forms within Acanthamoeba. Identification of its source should be a high priority. (The WGS sequences would thus represent a new sequence type, designated T21).
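For anyone working with these data programmatically, the attributions listed above can be gathered into a small lookup table keyed on the WGS project prefix. The following Python sketch is only an illustration of that idea: the class name, field names, status labels and the lookup function are our own hypothetical choices, not part of any database API, and the example contig accession is illustrative only. The sequence type given for CDFI01 is inferred from its match to ATCC 30137.

```python
from typing import NamedTuple, Optional


class Attribution(NamedTuple):
    """One Liverpool WGS project, as summarized in the list above."""
    deposited_as: str             # species label under which the project was released
    likely_source: Optional[str]  # best estimate of the source isolate (None if unknown)
    sequence_type: str            # 18S rRNA sequence type / subtype
    status: str                   # hypothetical labels: correct, probable, reassigned, uncertain, unknown
    note: str = ""


LIVERPOOL_WGS = {
    "CDFH01": Attribution("A. astronyxis", "ATCC 30137", "T7", "correct"),
    "CDFF01": Attribution("A. culbertsoni", "ATCC 30171", "T10", "correct"),
    "CDFG01": Attribution("A. lenticulata", "ATCC 30841", "T5", "correct"),
    "CDFB01": Attribution("A. lugdunensis", "ATCC 50240", "T4A", "correct"),
    "CDFN01": Attribution("A. quina", "ATCC 50241", "T4A", "correct"),
    "CDFC01": Attribution("A. rhysodes", "ATCC 30973", "T4D", "correct"),
    "CDFE01": Attribution("A. mauritaniensis", "ATCC 50253", "T4D", "correct",
                          "minority of reads may be contaminants"),
    "CDFL01": Attribution("A. castellanii", "ATCC 50370", "T4B", "probable",
                          "possible minor contamination or 18S allelism"),
    "CDFA01": Attribution("A. healyi", "ATCC 30870", "T2", "reassigned",
                          "matches A. palestinensis Reich"),
    "CDFD01": Attribution("A. palestinensis", "ATCC 50254", "T4F", "reassigned",
                          "matches A. triangularis SH621"),
    "CDFJ01": Attribution("A. pearcei", "ATCC 50496", "T4A", "uncertain",
                          "most likely A. sp. Galka; other strains possible"),
    "CDFK01": Attribution("A. polyphaga", "ATCC 50372", "T4B", "uncertain",
                          "best guess only; three candidate isolates"),
    "CDFI01": Attribution("A. divionensis", "ATCC 30137", "T7", "reassigned",
                          "duplicate sample of A. astronyxis"),
    "CDEZ01": Attribution("A. royreba", None, "T21", "unknown",
                          "source isolate not identified"),
}


def lookup(accession: str) -> Attribution:
    """Return the attribution for a WGS accession.

    Assumes the first six characters of the accession (four letters plus a
    two-digit version) identify the project, as in the prefixes listed above.
    """
    prefix = accession.strip().upper()[:6]
    try:
        return LIVERPOOL_WGS[prefix]
    except KeyError:
        raise KeyError(f"{prefix!r} is not one of the 14 Liverpool WGS projects") from None


if __name__ == "__main__":
    # Illustrative contig-style accession, not a reference to a specific record.
    hit = lookup("CDFA01000001")
    print(hit.deposited_as, "->", hit.likely_source, hit.sequence_type, hit.status)
```

Checking an accession against such a table before using a sequence makes it harder to carry a deposited species label into an analysis when the source isolate is in doubt.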
THE GENOME SEQUENCES OF OTHER MEMBERS OF ACANTHAMOEBA (AUSTRIAN INSTITUTE OF TECHNOLOGY; JANUARY 2017)
Two genome sequences were deposited in the DNA databases as Sequence read archives (SRA).
Acanthamoeba comandoni Strain Pb30/40 (ATCC Pra 287): A group I Acanthamoeba (originally designated A. astronyxis Pb30/40). Sequences from the nuclear 18S rRNA gene and the mitochondrial 16S-like rRNA gene (pb30-40 18s rRNA sequence) suggest that this isolate is not closely related to the type isolate for A. comandoni (ATCC 30135), which has been designated as sequence type T9 on the basis of its nuclear 18S rRNA gene sequence. Comparison of sequences from the SRA indicates that Strain Pb30/40 is roughly equidistant from sequences designated as T17 and T18 (and less than 5% divergent from the sequences of either group). Sequences exist in SRA SRX2460089 and can be accessed through SRA experimental run SRR5141519.
Acanthamoeba lenticulata strain 72/2 (ATCC 50704): A member of sequence type T5, this strain represents a different sub-type of A. lenticulata, compared to A. lenticulata strain PD2S (ATCC 30841), whose genome sequence information was deposited by the University of Liverpool group (above). Strain PD2S is characterized by the presence of an intron in the 18S rRNA gene sequence (Schroeder-Diedrich, Fuerst, and Byers, 1998). The gene in strain 72/2 lacks the intron. Genome sequences for A. lenticulata strain 72/2 exist in SRA SRX2469245 and can be accessed through SRA experimental run SRR5151161.
THE GENOME SEQUENCE OF ACANTHAMOEBA PYRIFORMIS (nov. sp.)
Late in December 2016, a paper appeared that reinterpreted the extent of the Acanthamoebidae (Tice et al., Biology Direct [2016 Dec 28] 11(1):69). This paper included information on a new form of Acanthamoeba. The authors reported that, by sequence analysis, the sporocarpic amoeba “Protostelium” pyriformis is clearly a close relative of the members of the genus Acanthamoeba. This would make “Acanthamoeba” pyriformis the first reported member of the genus which individually forms a walled, dormant propagule elevated by a non-cellular stalk.
The paper has been followed by the deposition of next-generation transcriptome sequences in a sequence read archive (SRA) file. The sequence of the 18S rRNA gene has been deposited (accession # KX840327). An equivalent sequence retrieved from the transcriptome sequence read archive was 2220 nucleotides in length and contained regions of the gene equivalent to the expanded hypervariable regions that characterize Acanthamoeba 18S rRNA genes. Initial comparisons of this sequence with the almost complete 18S rRNA genes of other Acanthamoeba taxa found that none of the other taxa within Acanthamoeba shows sequence similarity to Acanthamoeba pyriformis sp. nov. greater than ~86%. The sequence appears to be more divergent from the Group I acanthamoebae (A. astronyxis, etc.) than from other taxa, suggesting it may have diverged from within Acanthamoeba Groups II or III. No other partial or almost complete sequences have been reported that show close correspondence with this type sequence from A. pyriformis sp. nov. More extensive comparisons will be forthcoming. (The SRA transcriptome sequences together with the 18S rRNA gene sequence indicate that this taxon would represent a new sequence type, designated T22).
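Similarity and divergence figures such as those quoted on this page are pairwise comparisons between aligned sequences. As a minimal sketch of one common way to compute such a figure, the Python function below calculates percent identity over the aligned columns of two sequences, skipping any column that contains a gap. The function names and the gap treatment are our assumptions for illustration; the method used to obtain the figures above may differ, and the strings in the usage example are toy data, not real 18S rRNA sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two sequences from the same alignment.

    Both strings must have equal length, with '-' marking gaps.
    Columns with a gap in either sequence are ignored (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment (equal length)")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def percent_divergence(aligned_a: str, aligned_b: str) -> float:
    """Complement of percent identity, on the same ungapped columns."""
    return 100.0 - percent_identity(aligned_a, aligned_b)


if __name__ == "__main__":
    # Toy example: 10 aligned columns, one mismatch.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```

Because the 18S rRNA gene of Acanthamoeba contains expanded hypervariable regions, the choice of alignment and the treatment of gapped columns can change the resulting percentage noticeably, so thresholds should be compared only between values computed in the same way.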