Approximately 98 percent of the human genome is made up of noncoding DNA, including enhancers, promoters, and other elements that regulate gene activity. The methods for studying these regions tend to be expensive, labor-intensive, and largely low-throughput.
To really understand the functional geography of the noncoding genome, however, researchers need a way to isolate and characterize thousands to millions of regulatory DNA elements within it simultaneously, rapidly, and at high resolution. The need is great, as more than 90 percent of variants identified in genome-wide association studies of traits and disease are located in noncoding DNA.
By merging two powerful sequencing-based assays with a machine learning-based tool, a research team led by Xinchen Wang and associate members Melina Claussnitzer and Manolis Kellis in the Broad's Metabolism and Epigenomics programs, respectively, has engineered a new approach for measuring individual noncoding DNA segments' ability to control gene expression at both massive scale and high resolution. Called HiDRA (for High-resolution Dissection of Regulatory Activity), the approach brings together a number of widely used tools:
1. ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing), a technique developed by Broad associate member Jason Buenrostro that looks across the entire genome for unwound, transcribable regions of DNA.
2. STARR-seq (Self-Transcribing Active Regulatory Region Sequencing), an assay developed by Kellis lab alum Alexander Stark for measuring noncoding DNA segments' expression-promoting activity.
3. SHARPR-RE, a machine learning algorithm based on the SHARPR tool Kellis's lab developed to analyze data from massively parallel reporter assays.
By building on these approaches, HiDRA lets researchers create massive libraries of regulatory DNA and study their influence over gene expression at nucleotide-level resolution.
As they reported in Nature Communications, the team applied HiDRA to a blood cell line, testing about seven million noncoding DNA fragments for the ability to regulate gene expression and ultimately identifying 65,000 that do. These included segments clearly marked as enhancers and promoters as well as segments lacking such marks, suggesting that the genome may harbor additional kinds of expression-controlling elements that we have yet to discover.
In addition, the team used HiDRA to examine how disease-risk DNA variants in regulatory elements affect gene expression compared to variants that do not raise risk. That capability is a boon for researchers seeking to study how minute sequence variations in promoters and enhancers can influence human traits and disease states.
The team's findings suggest that HiDRA is a generalizable method for dissecting the nuances of gene regulation and the roles various functional elements play in human disease.