THE GENOME SEQUENCES OF ORIENTIA ISOLATES
In general, the genomes of O. tsutsugamushi isolates are larger than those of members of the genus Rickettsia. Where complete genomes have been determined, they are about 2 Mb in size. The difference between a Rickettsia genome and an Orientia genome reflects an increase in gene number in Orientia combined with a substantial contribution from repeated sequences (some protein-coding, some not), which are far more numerous in Orientia than in Rickettsia. These repeated sequences greatly complicate the work required to complete a closed circular genome sequence. As a consequence, many of the Orientia genome sequences in the DNA databases should be considered partial, although they may contain all of the unique coding portions of an isolate's genome.
As of July 2017, the genome sequences of eleven members of the genus Orientia had been deposited in the DNA databases (ten isolates of O. tsutsugamushi and one isolate of O. chuto). In late 2017, information concerning 27 additional isolates was released, initially as Sequence Read Archive (SRA) files in GenBank.
The Boryong isolate
The first genome sequence of any isolate of Orientia to be deposited in the DNA databases was that of the Boryong isolate [NCBI Reference Sequence: NC_009488, deposited in 2007; reference: Cho,N.H., et al., The Orientia tsutsugamushi genome reveals massive proliferation of conjugative type IV secretion system and host-cell interaction genes. Proc. Natl. Acad. Sci. U.S.A. 104 (19), 7981-7986 (2007)]. The genome was found to encompass 2,127,051 bp and to contain a large number of repeated sequences.
Cho,N.H., Kim,H.R., Lee,J.H., Kim,S.Y., Kim,J., Cha,S., Kim,S.Y., Darby,A.C., Fuxelius,H.H., Yin,J., Kim,J.H., Kim,J., Lee,S.J., Koh,Y.S., Jang,W.J., Park,K.H., Andersson,S.G., Choi,M.S. and Kim,I.S. 2007. The Orientia tsutsugamushi genome reveals massive proliferation of conjugative type IV secretion system and host-cell interaction genes. Proc. Natl. Acad. Sci. U.S.A. 104 (19), 7981-7986.
Orientia tsutsugamushi Boryong (NC_009488) – 2,127,051 bp
The large number of repeated sequences is significant because these sequences impede the bioinformatic methods that would otherwise quickly establish gene order in subsequent genome assemblies. It is therefore difficult to determine whether gene order is broadly similar among isolates.
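The effect of repeats on assembly can be illustrated with a toy model. The sketch below is not taken from any of the cited studies: the sequences, segment names, and k values are invented for illustration. It builds two hypothetical "genomes" that carry the same unique segments in a different order, separated by three copies of one repeat, and compares their k-mer content. Fragments shorter than the repeat cannot tell the two arrangements apart, while fragments long enough to span a whole repeat copy plus a flanking base on each side can.

```python
from collections import Counter

# Toy sequences, invented for illustration only (not Orientia data).
REPEAT = "TTGACCTGAAGC"           # 12 bp repeat present in three copies
SEG_A, SEG_B = "ACGTACGA", "CATGC"
SEG_C, SEG_D = "TGGAT", "GCATCG"

# Same segments and repeat copies; only the order of B and C differs.
GENOME_1 = SEG_A + REPEAT + SEG_B + REPEAT + SEG_C + REPEAT + SEG_D
GENOME_2 = SEG_A + REPEAT + SEG_C + REPEAT + SEG_B + REPEAT + SEG_D


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def distinguishable(seq_a, seq_b, k):
    """True if the two sequences differ in their k-mer content."""
    return kmer_counts(seq_a, k) != kmer_counts(seq_b, k)


if __name__ == "__main__":
    short_k = len(REPEAT)        # fragment no longer than the repeat
    long_k = len(REPEAT) + 2     # fragment spanning the repeat and both flanks
    print("gene order resolved with short fragments:",
          distinguishable(GENOME_1, GENOME_2, short_k))
    print("gene order resolved with long fragments: ",
          distinguishable(GENOME_1, GENOME_2, long_k))
```

In this toy case the short fragments give identical k-mer content for both arrangements, so the order of the unique segments is undetermined; only fragments that span an entire repeat copy resolve it. Real assemblers are considerably more sophisticated, but the same ambiguity underlies the difficulty of establishing gene order in repeat-rich genomes.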
The Ikeda strain
Following the determination of the sequence of the Boryong isolate, the genome sequence of the related Ikeda strain was determined in 2008.
Nakayama,K., Yamashita,A., Kurokawa,K., Morimoto,T., Ogawa,M., Fukuhara,M., Urakami,H., Ohnishi,M., Uchiyama,I., Ogura,Y., Ooka,T., Oshima,K., Tamura,A., Hattori,M. and Hayashi,T. 2008. The Whole-genome sequencing of the obligate intracellular bacterium Orientia tsutsugamushi revealed massive gene amplification during reductive genome evolution. DNA Res. 15 (4), 185-199
Orientia tsutsugamushi str. Ikeda (NC_010793) – 2,008,987 bp
More recently, genome sequences of a number of other isolates have been deposited in the DNA databases. These include the following “standard” strains:
The Karp strain
Two groups have independently sequenced the genome of the Karp strain. These sequences are:
- Liao,H.M., Chao,C.C., Lei,H., Li,B., Tsai,S., Hung,G.C., Ching,W.M. and Lo,S.C. 2016. Genomic Sequencing of Orientia tsutsugamushi Strain Karp, an Assembly Comparable to the Genome Size of the Strain Ikeda. Genome Announc. 4 (4), e00702-16.
Orientia tsutsugamushi str. Karp (LYMA02000000) – 2,026,724 bp
The second analysis of the Karp genome reported the sequence after regions containing highly repeated sequences had been removed:
- Daugherty,S.C., Su,Q., Abolude,K., Beier-Sexton,M., Carlyon,J.A., Carter,R., Day,N.P., Dumler,S.J., Dyachenko,V., Godinez,A., Kurtti,T.J., Lichay,M., Mullins,K.E., Ott,S., Pappas-Brown,V., Paris,D.H., Patel,P., Richards,A.L., Sadzewicz,L., Sears,K., Seidman,D., Sengamalay,N., Stenos,J., Tallon,L.J., Vincent,G., Fraser,C.M., Munderloh,U. and Dunning-Hotopp,J.C. 2015. Genome Sequencing of Rickettsiales. Unpublished.
Orientia tsutsugamushi str. Karp (LANM01000000) – 1,454,354 bp
Other standard strains
In addition to Karp, Daugherty et al. also reported sequences for two other “standard” strains, Kato and Gilliam:
Orientia tsutsugamushi str. Kato PP (LANN00000000) – 1,478,442 bp
Orientia tsutsugamushi str. Gilliam (LANO00000000) – 1,997,698 bp
The same two groups have reported further genome sequences for isolates of Orientia tsutsugamushi. These include the following:
Liao,H.M., et al. 2017. Genomics Data 12: 84–88.
Orientia tsutsugamushi strain AFSC7 (LYMB00000000) – 1,437,566 bp
Orientia tsutsugamushi strain AFSC4 (LYMT00000000) – 1,295,323 bp
Daugherty,S.C., et al. 2015. Unpublished.
Orientia tsutsugamushi str. TA716 (LAOA01) – 2,221,260 bp
Orientia tsutsugamushi str. UT76 (LANZ01) – 3,033,399 bp
Orientia tsutsugamushi str. UT144 (LAOR01) – 1,689,193 bp
Orientia tsutsugamushi str. TA763 (LANY01) – 2,460,104 bp
Orientia tsutsugamushi str. Sido (LAOM01) – 712,858 bp
Finally, the Daugherty group has also determined at least a partial sequence of the closely related species Candidatus O. chuto:
Orientia chuto str. Fuller (LANP01) – 1,092,196 bp
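Because many of these assemblies are partial, it can help to compare each reported size with the closed Boryong genome. The sketch below uses only the O. tsutsugamushi sizes quoted in this section; the O. chuto entry is left out because it is a different species. The 0.9 to 1.1 band is an arbitrary assumption chosen for illustration, not a published criterion, and size alone cannot show that an assembly is complete or incomplete.

```python
# Sizes in bp as quoted in this section; Boryong is the closed reference genome.
REFERENCE_BP = 2_127_051  # Boryong (NC_009488)

ASSEMBLIES = {
    "Ikeda": 2_008_987,
    "Karp (LYMA02)": 2_026_724,
    "Karp (LANM01)": 1_454_354,
    "Kato PP": 1_478_442,
    "Gilliam": 1_997_698,
    "AFSC7": 1_437_566,
    "AFSC4": 1_295_323,
    "TA716": 2_221_260,
    "UT76": 3_033_399,
    "UT144": 1_689_193,
    "TA763": 2_460_104,
    "Sido": 712_858,
}


def size_ratios(assemblies, reference_bp):
    """Return {name: assembly size divided by the reference size}."""
    return {name: bp / reference_bp for name, bp in assemblies.items()}


def outside_band(ratios, low=0.9, high=1.1):
    """Names whose ratio lies outside [low, high] (illustrative band only)."""
    return sorted(name for name, r in ratios.items() if not low <= r <= high)


if __name__ == "__main__":
    ratios = size_ratios(ASSEMBLIES, REFERENCE_BP)
    # List assemblies from smallest to largest relative to Boryong.
    for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        print(f"{name:15s} {ASSEMBLIES[name]:>10,d} bp  {ratio:.2f} x Boryong")
    print("Outside the illustrative band:", ", ".join(outside_band(ratios)))
```

Assemblies well below the band are candidates for being partial or repeat-depleted, as described for the second Karp assembly. Those well above it may warrant a closer look as well, since draft assemblies of repeat-rich genomes do not necessarily reflect true genome size.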
During 2017, a set of genome sequences from 32 isolates (including 5 isolates for which previous sequences were available) was added to the databases. Currently these sequences are accessible as individual Sequence Read Archive (SRA) entries. They include the following:
Isolates for which previous sequences were reported:
SRR3503732 ORTS0002 Karp replicate2
SRR3503829 ORTS0069 Gilliam
SRR3503839 ORTS0070 TA716
SRR3503840 ORTS0071 TA763
SRR3503893 Orts0093 Kato
SRR3503897 Ot0001 Karp replicate1
Newly sequenced isolates:
SRR3503734 ORTS0005 TM 2259 (Laos: Vientiane Prefecture)
SRR3503738 ORTS0007 TM 2325 (Laos: Vientiane Prefecture)
SRR3503739 ORTS00020 TM 2978 (Laos: Vientiane Prefecture)
SRR3503740 ORTS0049 isolate 772 (Laos: Salavan Prefecture)
SRR3503824 ORTS0055 isolate 1768 (Laos: Luang Nam Tha Prefecture)
SRR3503847 ORTS0072 Domrow
SRR3503849 ORTS0073 AFC-27
SRR3503851 ORTS0074 AFC-30
SRR3503852 ORTS0075 Garton
SRR3503853 ORTS0076 TH-1811
SRR3503856 Orts0077 TH-1812
SRR3503857 Orts0078 TH-1814
SRR3503859 Orts0079 TH-1817
SRR3503882 Orts0080 TH-1826
SRR3503883 Orts0081 isolate 18-032113
SRR3503884 Orts0082 isolate 18-032460
SRR3503885 Orts0083 isolate 18-032604
SRR3503886 Orts0084 isolate 18-030643
SRR3503887 Orts0086 afc3
SRR3503888 Orts0087 afpl-12
SRR3503889 Orts0088 afsc-7
SRR3503890 Orts0089 brown
SRR3503891 Orts0090 bse125
SRR3503892 Orts0092 citrano
SRR3503894 Orts0094 kostival
SRR3503895 Orts0095 mak119
SRR3503896 Orts0096 mak243
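For anyone who wants to work with this listing programmatically, the entries follow a regular pattern: an SRA run accession, a sample identifier, an isolate name, and in some cases a locality in parentheses. The following is a minimal sketch of a parser for lines in that layout. The `SraEntry` type and `parse_entry` function are hypothetical names introduced here, and the field interpretation is an assumption based on the listing itself rather than on database documentation.

```python
import re
from typing import NamedTuple, Optional


class SraEntry(NamedTuple):
    """One line of the listing (field meanings are assumed from the layout)."""
    run: str                  # SRA run accession, e.g. SRR3503734
    sample: str               # sample identifier, e.g. ORTS0005
    isolate: str              # isolate or strain name
    locality: Optional[str]   # text in parentheses, if present


# Run accession, sample ID, free-text isolate name, optional "(locality)".
LINE_RE = re.compile(
    r"^(?P<run>SRR\d+)\s+(?P<sample>\S+)\s+(?P<isolate>.*?)"
    r"(?:\s*\((?P<locality>[^)]*)\))?\s*$"
)


def parse_entry(line):
    """Parse one listing line into an SraEntry, or raise ValueError."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized listing line: {line!r}")
    return SraEntry(match["run"], match["sample"],
                    match["isolate"].strip(), match["locality"])


if __name__ == "__main__":
    examples = [
        "SRR3503732 ORTS0002 Karp replicate2",
        "SRR3503734 ORTS0005 TM 2259 (Laos: Vientiane Prefecture)",
        "SRR3503847 ORTS0072 Domrow",
    ]
    for text in examples:
        print(parse_entry(text))
```

Keeping the locality optional lets the same parser handle both the previously sequenced strains and the newly sequenced isolates in a single pass.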
In March 2017, the sequencing center of the Wellcome Centre for Human Genetics, Oxford, began releasing genome sequences for eight isolates. These sequences were generated with long-read technology, in the hope of mitigating the assembly problems caused by the large proportion of the Orientia genome that consists of repetitive sequences. The release includes five isolates whose genome sequences had been determined previously:
Gilliam – BioSample: SAMEA104570318; SRA: ERS2181602
UT76 – BioSample: SAMEA104570325; SRA: ERS2181609
Karp – BioSample: SAMEA104570320; SRA: ERS2181604
Kato – BioSample: SAMEA104570321; SRA: ERS2181605
TA763 – BioSample: SAMEA104570323; SRA: ERS2181607
New isolates included:
FPW1038 – BioSample: SAMEA104570319; SRA: ERS2181603
TA686 – BioSample: SAMEA104570322; SRA: ERS2181606
UT176 – BioSample: SAMEA104570324; SRA: ERS2181608
Further information on these sequences will be added when full sequences or contig sequences are released.