Cis regulatory motifs and antisense transcriptional control in the apicomplexan Theileria parva

Background
Theileria parva is an intracellular parasite that causes a lymphoproliferative disease in cattle. It does so by inducing cancer-like phenotypes in the host cells it infects, although the molecular and regulatory mechanisms involved remain poorly understood. RNAseq data, and the resulting updated genome annotation now available for this parasite, offer an unprecedented opportunity to characterize the genomic features associated with gene regulation in this species. Our previous analyses revealed a T. parva genome even more gene-dense than previously thought, with many adjacent loci overlapping each other, not only in untranslated regions (UTRs) but even in coding sequences.

Results
Despite this compactness, the size distribution of Theileria intergenic regions is indicative of monocistronic gene transcription. Three previously described motifs are conserved among Theileria species and highly prevalent in promoter regions at or near the transcription start sites. We found novel motifs at many transcription termination sites, as well as upstream of parasite genes thought to be critical for host transformation. Adjacent genes that could be regulated by antisense transcription from an overlapping transcriptional unit are syntenic between T. parva and Plasmodium falciparum at a frequency higher than expected by chance, suggesting the presence of common, and evolutionarily old, regulatory mechanisms in the phylum Apicomplexa.

Conclusions
We propose a model of transcription with conserved sense and antisense transcription from a few taxonomically ubiquitous and several species-specific promoter motifs. Interestingly, the gene networks regulated by conserved promoters are themselves, in most cases, not conserved between species or genera.