Splicing is a complex mechanism that generates mature mRNA from pre-mRNA and is essential for correct gene expression. It involves many factors (nearly 400, if we consider alternative splicing as well) and is tightly controlled. A new paper published in Molecular Cell by Juan Valcárcel, head of the Regulation of Alternative pre-mRNA Splicing group at the CRG, and colleagues has now identified a heterogeneous nuclear ribonucleoprotein (hnRNP) that helps in the recognition of the 3′ splice site.
The essential pre-mRNA splicing factor U2AF guides the early stages of splice site choice by recognizing a polypyrimidine (Py) tract consensus sequence near the 3′ splice site. The 65 kDa subunit of U2AF binds to the Py tract, while its 35 kDa subunit binds the invariant AG dinucleotide at the 3′ end of the intron.
Since Py tracts are relatively poorly conserved in higher eukaryotes, how does U2AF find the right ones?
Using in vitro and in vivo depletion, as well as reconstitution assays with purified components, the authors have identified hnRNP A1 as an RNA binding protein that allows U2AF to discriminate between pyrimidine-rich RNA sequences that are followed by a 3′ splice site AG and those that are not.
Using biochemical assays and NMR, Valcárcel and colleagues have demonstrated that hnRNP A1 forms a ternary complex with the U2AF heterodimer on AG-containing, uridine-rich RNAs, while it displaces U2AF from uridine-rich RNAs that lack the AG.
Tavanez JP, Madl T, Kooshapur H, Sattler M, Valcárcel J
hnRNP A1 Proofreads 3′ Splice Site Recognition by U2AF.
Mol Cell. 2012 Feb 10;45(3):314-29
Despite all having the same DNA content, each cell is different. The phenotypic differences observed between cells depend on differences in their RNA transcript content. This variability in transcript abundance results partly from variability in gene expression, which has been studied for many years and is usually measured using DNA arrays, and partly from variability in alternative splicing. Indeed, changes in splicing ratios, even without changes in overall gene expression, can have important phenotypic effects. However, little is known about the variability of alternative splicing among individuals and populations.
Taking advantage of the popular use of RNA-seq (or “Whole Transcriptome Shotgun Sequencing”), a technique that sequences cDNA in order to get information about a sample’s RNA content, a team of researchers at the CRG have recently published in Genome Research a statistical methodology to measure variability in splicing ratios between different conditions. They have applied this methodology to estimates of transcript abundances obtained from RNA-seq experiments in lymphoblastoid cells from Caucasian and Yoruban (Nigerian) individuals.
Their results show that protein-coding genes exhibit low splicing variability within populations, with many genes exhibiting constant ratios across individuals. Genes involved in the regulation of splicing showed lower expression variability than the average, while transcripts with RNA binding functions, such as long non-coding RNAs, showed higher expression variability. The authors also found that up to 10% of the studied protein-coding genes exhibit population-specific splicing ratios and that variability in splicing is uncommon without variability in transcription.
While they acknowledge the limitations of their work (e.g. RNA-seq is still very new and not completely understood, and the data on which they base their analysis come from the first and only human RNA-seq studies published so far), the authors conclude that “given the low variability in the expression of protein coding genes, phenotypic differences between individuals in human populations are unlikely to be due to the turning on and off of entire sets of genes, nor to dramatic changes in their expression levels, but rather to modulated changes in transcript abundances”.
The researchers, led by Roderic Guigó, present in the same paper a new methodology to determine the relative contributions of gene expression variability and splicing variability to overall transcript variability. They estimated that about 60% of the total variability observed in the abundance of transcript isoforms can be explained by variability in transcription, and that a large fraction of the remaining variability likely results from variability in splicing.
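To make the distinction between expression variability and splicing variability concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the methodology of the paper: the function names, the toy abundance values, the use of the coefficient of variation for gene expression and the use of the mean Hellinger distance to the average splicing ratios are all choices made for this example.

```python
import math


def splicing_ratios(abundances):
    """Relative abundance of each isoform of one gene in one individual."""
    total = sum(abundances)
    if total == 0:
        return [0.0 for _ in abundances]
    return [a / total for a in abundances]


def hellinger(p, q):
    """Hellinger distance between two vectors of ratios (0 = identical)."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)


def splicing_variability(samples):
    """Mean distance of each individual's ratios to the average ratios.

    Assumption: this is one simple way to score variability in splicing;
    the published method may differ.
    """
    ratios = [splicing_ratios(s) for s in samples]
    n = len(ratios)
    centroid = [sum(r[i] for r in ratios) / n for i in range(len(ratios[0]))]
    return sum(hellinger(r, centroid) for r in ratios) / n


def expression_cv(samples):
    """Coefficient of variation of total gene expression across individuals."""
    totals = [sum(s) for s in samples]
    mean = sum(totals) / len(totals)
    if mean == 0:
        return 0.0
    variance = sum((t - mean) ** 2 for t in totals) / len(totals)
    return math.sqrt(variance) / mean


# Hypothetical toy data: one row per individual, one column per isoform.
# gene_a: expression level changes, but the isoform ratios stay at 1:3.
gene_a = [[10, 30], [20, 60], [5, 15]]
# gene_b: total expression is constant, but the isoform ratios change.
gene_b = [[10, 30], [30, 10], [20, 20]]

for name, gene in (("gene_a", gene_a), ("gene_b", gene_b)):
    print(name, "expression CV:", expression_cv(gene),
          "splicing variability:", splicing_variability(gene))
```

In this toy data, `gene_a` varies only in overall expression and `gene_b` varies only in its splicing ratios, which are the two sources of transcript variability that the study sets out to separate.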
Guigó, last author of this paper, has recently received an ERC Advanced Grant, the most prestigious grant awarded to scientific projects in Europe, in the category of Physical Sciences and Engineering. The €2 million awarded over five years will allow his team to study RNA using massively parallel sequencing techniques.
Gonzalez-Porta M, Calvo M, Sammeth M, Guigo R. Estimation of alternative splicing variability in human populations. Genome Res. 2011 Nov 23.