Understanding regulatory networks in the era of massively parallel sequencing: did we lose our genetic switches and feedback loops?
The turn from the 20th to the 21st century was marked by a drastic change in the scale at which biologists study regulatory networks. In the 1990s, a PhD student could spend years analysing the regulation of one particular gene by one or a few transcription factors. Microarray technologies made it possible to monitor the expression of all the genes of an organism in a single experiment (transcriptome arrays), and to run genome-wide location analyses reporting supposedly exhaustive lists of transcription factor binding sites. Next Generation Sequencing amplified this trend, and many labs now combine ChIP-seq and RNA-seq experiments to get a broad view of transcription factor binding locations, histone modifications, and transcriptional responses across a multitude of conditions, cell types, developmental stages, etc. In the first part of the talk, I will present some of the bioinformatics approaches and tools that we developed to analyse regulatory motifs from various types of high-throughput data (e.g. co-expression clusters, ChIP-seq peaks, replication origins).
In light of how the field has evolved, I would also like to address a more general question: what have high-throughput approaches taught us about the fundamental mechanisms of regulation? It has implicitly become standard to expect that a typical high-throughput experiment should return thousands of significant features (differentially expressed genes, TF binding sites, active enhancers). This, however, does not fit with our classical models, where transcription factors would turn specific sets of target genes on or off (“regulatory switches”), thereby forming regulatory networks whose behaviour could be understood in terms of feedback loops. How can we reconcile the undeniable robustness of regulatory networks with the apparent noisiness of binding and transcription profiles?