Home

While working on my PhD in Dr. Juan Alfonzo’s lab, I started two side projects that involved the computer-based analysis of two different types of data generated from the same basic building block of life: RNA. As part of my main, wet lab-based project, and without going into the details of why, my goal was to identify a group of hypermodified ribonucleosides, wybutosine and related derivatives, on tRNA within Trypanosome cellular compartments. We searched these tRNAs using two techniques: RNA sequencing via mass spectrometry, and nucleoside identification and quantification using high performance liquid chromatography (HPLC). These endeavors led to the development of two data analysis computer programs: RoboOligo and the General HPLC-DAD Reader, respectively.

RoboOligo

With the help of the Limbach lab, we sequenced Trypanosome tRNAPhe using tandem mass spectrometry (tRNAPhe is the only tRNA that has been shown to contain wybutosine derivatives). This served as my introduction to mass spectrometry and the sheer amount of data that it produces. As I learned how to interpret the data, I began to play around with an idea for automating the data analysis process. This led to the development of RoboOligo, an interactive program for the manual and automated de novo sequencing of short, modified base-containing RNA oligomers.
Read more about RoboOligo!

The General HPLC-DAD Reader

As a simple way to determine whether the wybutosine derivatives we identified were being downregulated by RNA interference (RNAi) of suspected Trypanosome wybutosine-synthesizing proteins, we digested Trypanosome tRNA to nucleosides and used liquid chromatography-mass spectrometry (LC-MS) to determine whether the quantity of the derivatives decreased after induction of RNAi. We used the mass spectrometry data to determine the retention times of the wybutosine derivatives, and then used the HPLC UV trace (at 254 nm) as a measure of relative quantity. The HPLC systems that we used were equipped with a diode array detector (DAD), which allows for absorbance reads across a user-defined spectrum; for nucleosides, the range between 210 and 300 nm is ideal. The compositional and structural differences between the nucleosides give them different light absorbance properties, making it possible to differentiate some from the others (I say ‘some’ because many of the different nucleosides absorb light too similarly to distinguish one from the other). So, the DAD allowed us to get the ‘spectrum profile’ of each nucleoside, that is, how it absorbs light from 210 to 300 nm, which we used as secondary evidence to support our nucleoside identifications.
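
To make the idea of "relative quantity" from a 254 nm trace concrete, here is a minimal Python sketch. It assumes that quantity is estimated as the peak area above a straight baseline drawn between the two bases of the peak, and that the induced and uninduced samples are compared as a simple ratio. Both choices, and the function names `peak_area` and `relative_quantity`, are illustrative assumptions rather than the procedure we actually followed.

```python
def peak_area(times, absorbance, t_front, t_back):
    """Trapezoidal area of a peak above a straight baseline.

    Assumption: the baseline is a line joining the trace values at the
    first and last samples between t_front and t_back (the peak bases).
    """
    pts = [(t, a) for t, a in zip(times, absorbance) if t_front <= t <= t_back]
    if len(pts) < 2 or pts[0][0] == pts[-1][0]:
        raise ValueError("need at least two distinct time points inside the peak")

    (t0, a0), (t1, a1) = pts[0], pts[-1]

    def baseline(t):
        # Linear interpolation between the two peak bases.
        return a0 + (a1 - a0) * (t - t0) / (t1 - t0)

    area = 0.0
    for (ta, aa), (tb, ab) in zip(pts, pts[1:]):
        # Trapezoid between consecutive samples, heights measured from baseline.
        area += ((aa - baseline(ta)) + (ab - baseline(tb))) / 2 * (tb - ta)
    return area


def relative_quantity(area_induced, area_uninduced):
    """Ratio of peak areas; a value below 1 suggests a decrease after induction."""
    return area_induced / area_uninduced
```

A value such as `relative_quantity(peak_area(...), peak_area(...))` is only meaningful if both runs are otherwise comparable (same injection amount, same column conditions), which this sketch does not check.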

The only issue we had was that our lab didn’t have an HPLC or the very well protected proprietary software for analyzing the UV-Vis data generated by a run. So, while other labs allowed us to use their HPLC, I had to come up with a way to view and handle the results (I didn’t want to sit in front of another lab’s computer for days and days!). This led me to develop a simple Python program that takes a file of raw data (UV-Vis spectrum reads over time) and plots them appropriately. It generates a ‘main figure’, which is simply a plot of a user-defined wavelength over time. From this, the user can click on the front and back bases of a peak, and the program will process all of the spectrum reads that lie between the two selected points to generate a spectrum profile: the reads are background subtracted, averaged, and normalized, and the resulting curve is smoothed using the Savitzky-Golay algorithm.
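
The processing steps described above can be sketched in a few lines of Python. This is not the General HPLC-DAD Reader's code: the function names, the choice of background (the mean of the scans at the two clicked bases), normalization to the maximum, and the fixed 5-point quadratic Savitzky-Golay window are all assumptions made for illustration.

```python
# Standard 5-point quadratic Savitzky-Golay weights (to be divided by 35).
SG5 = (-3, 12, 17, 12, -3)


def smooth(values):
    """Smooth interior points with a 5-point quadratic Savitzky-Golay filter.

    The two points at each end are left unchanged in this sketch.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        out[i] = sum(w * values[i + k - 2] for k, w in enumerate(SG5)) / 35
    return out


def spectrum_profile(times, scans, t_front, t_back):
    """Build a normalized spectrum profile for one peak.

    times : retention time of each scan
    scans : scans[i][j] is the absorbance at wavelength j for times[i]
    t_front, t_back : the two clicked peak bases
    """
    inside = [s for t, s in zip(times, scans) if t_front <= t <= t_back]
    if not inside:
        raise ValueError("no scans between the selected points")

    # Assumed background: mean of the scans at the front and back bases.
    background = [(a + b) / 2 for a, b in zip(inside[0], inside[-1])]

    # Average each wavelength over the peak, then subtract the background.
    n = len(inside)
    averaged = [sum(col) / n - bg for col, bg in zip(zip(*inside), background)]

    # Normalize so the strongest wavelength has a value of 1.
    top = max(averaged)
    if top <= 0:
        raise ValueError("no signal above background")
    return smooth([v / top for v in averaged])
```

Normalizing to the maximum makes profiles from peaks of different sizes directly comparable in shape. For a configurable window and polynomial order, SciPy's `scipy.signal.savgol_filter` could replace the hand-written `smooth`.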
Read more about the General HPLC-DAD Reader and download the open source code!
