Retroviruses integrate a DNA copy of the viral genome into the host genome. After integration, the viral cDNA is called the provirus. The provirus is transcribed by the host machinery to generate progeny viruses.
Following reverse transcription, the viral nucleic acid is a linear double-stranded cDNA. The viral enzyme integrase first performs “terminal cleavage”, removing two nucleotides from the 3′ ends and leaving recessed 3′ hydroxyls. During “strand transfer”, the 3′ ends of the viral cDNA are covalently joined to the host DNA. Integrase requires no ATP to complete the single-step transesterification reaction.
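The geometry of terminal cleavage can be illustrated with a minimal Python sketch. It treats the viral cDNA as a pair of strings written 5′ to 3′ and assumes both 3′ ends are trimmed by the same two nucleotides. The function names and the toy sequence are hypothetical and chosen only for illustration; the sketch does not model the chemistry of the reaction.

```python
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def terminal_cleavage(top_strand: str, n_removed: int = 2) -> Tuple[str, str]:
    """Trim n_removed nucleotides from the 3' end of each strand.

    Both returned strands are written 5'->3', so the 3' end is the
    right-hand end of each string. Trimming leaves each 3' end recessed
    relative to the 5' end of the opposite strand.
    """
    if not 0 <= n_removed <= len(top_strand):
        raise ValueError("n_removed must be between 0 and the strand length")
    bottom_strand = reverse_complement(top_strand)
    keep = len(top_strand) - n_removed
    return top_strand[:keep], bottom_strand[:keep]


# Made-up sequence for illustration only; not a real viral cDNA end.
toy_cdna = "ACTGGAAGGGCTAATTCAGT"
top, bottom = terminal_cleavage(toy_cdna)
print(top)     # top strand, two nucleotides shorter at its 3' end
print(bottom)  # bottom strand, two nucleotides shorter at its 3' end
```

Writing both strands 5′ to 3′ keeps the trimming logic identical for the two ends, at the cost of having to reverse one strand mentally when picturing the duplex.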
Every retrovirus appears to have a distinct preference for genomic elements, such as transcription units, promoters, and CpG islands, as well as a subtle DNA sequence preference at the integration site. In some cases, the preference for genomic elements is determined by a host co-factor that interacts with integrase. For example, PSIP1/LEDGF/p75 directs most HIV-1 integration events to transcription units in euchromatin. However, LEDGF does not affect the sequence preference of HIV-1 integrase.
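One common way to look for a subtle sequence preference is to align the host sequences flanking many integration sites and tally the base frequencies at each position. The Python sketch below shows the idea under stated assumptions: the function name is hypothetical, the sequences are made up for illustration and are not real integration site data, and a real analysis would also compare the frequencies against a matched random control.

```python
from collections import Counter
from typing import Dict, List


def position_frequencies(aligned_sites: List[str]) -> List[Dict[str, float]]:
    """Return the fraction of A, C, G and T at each aligned position."""
    if not aligned_sites:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sequences must have the same length")

    frequencies = []
    # zip(*...) walks the alignment column by column.
    for column in zip(*aligned_sites):
        counts = Counter(column)
        frequencies.append(
            {base: counts[base] / len(aligned_sites) for base in "ACGT"}
        )
    return frequencies


# Made-up sequences, already aligned on the integration site.
toy_sites = ["GTAAC", "GTTAC", "CTAAG", "GTAAC"]
freqs = position_frequencies(toy_sites)
for position, row in enumerate(freqs):
    print(position, row)
```

A position where one base is clearly over-represented relative to the control would suggest a preference at that position; positions close to the background frequencies would not.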
Integration is performed by a complex of integrase bound to the ends of the viral DNA. Some retroviral integrases, including those of HIV-1 and prototype foamy virus (PFV), form a tetramer with the viral DNA. Other retroviruses, such as mouse mammary tumor virus (MMTV), have an octamer of integrase. We are interested in how these integrase complexes search the target chromatin for an integration site.