Scientists have developed a novel technology that allows them to read and interpret the human genome, a breakthrough that may pave the way for new drug targets to treat many genetic diseases.
The computational method, called TargetFinder, can predict where non-coding DNA - the DNA that does not code for proteins - interacts with genes.
This technology helps researchers connect mutations in the so-called genomic "dark matter" with the genes they affect, potentially revealing new therapeutic targets for genetic disorders.
Researchers at the Gladstone Institutes in the US looked at fragments of non-coding DNA called enhancers. Enhancers act like an instruction manual for a gene, dictating when and where a gene is turned on.
Genes can be separated from their enhancers by long stretches of DNA that contain many other genes.
"Most genetic mutations that are associated with disease occur in enhancers, making them an incredibly important area of study," said Katherine Pollard, a senior investigator at the Gladstone Institutes.
"Before now, we struggled to understand how enhancers find the distant genes they act upon," said Pollard.
Scientists originally believed that enhancers mostly affect the gene nearest to them. However, the new study showed that, on a strand of DNA, enhancers can be millions of letters away from the gene they influence, skipping over the genes in between.
When an enhancer is far away from the gene it affects, the two connect by forming a three-dimensional loop, like a bow on the genome.
Using machine learning technology, the researchers analysed hundreds of existing datasets from six different cell types to look for patterns in the genome that identify where a gene and enhancer interact.
They discovered several patterns on the loops that connect enhancers to genes. These patterns correctly predicted whether a gene-enhancer interaction occurred 85 per cent of the time.
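The idea of learning a pattern from known examples and then scoring how often it predicts correctly can be illustrated with a toy sketch in Python. This is not the TargetFinder method: the single "loop signal" feature, its values, the labels and the one-threshold rule are all invented for illustration, and the real study drew on hundreds of datasets.

```python
# Toy illustration only: the feature values and labels below are invented,
# and a one-feature threshold rule stands in for a real machine learning model.

def accuracy(predictions, labels):
    """Fraction of predictions that match the known labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def fit_threshold(values, labels):
    """Pick the cut-off on one feature that gives the best training accuracy."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(values)):
        predictions = [v >= cut for v in values]
        acc = accuracy(predictions, labels)
        if acc > best_acc:  # keep the first (lowest) cut-off on ties
            best_cut, best_acc = cut, acc
    return best_cut, best_acc

# Hypothetical signal measured on the DNA between an enhancer and a gene,
# with a made-up label saying whether the pair is known to interact.
loop_signal = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
interacts = [False, False, False, True, False, True, True, True]

cut, train_acc = fit_threshold(loop_signal, interacts)
print(f"cut-off: {cut}, training accuracy: {train_acc:.1%}")

# Apply the learned rule to a new, unlabelled enhancer-gene pair.
new_pair_signal = 0.5
print("predicted to interact:", new_pair_signal >= cut)
```

On this invented data the rule gets seven of eight pairs right. A real model would combine many features and would be scored on pairs it had not seen during training.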
"It's remarkable that we can predict complex three-dimensional interactions from relatively simple data," said Sean Whalen, a biostatistician at Gladstone.
Performing experiments in the lab to identify all of these gene-enhancer interactions can cost millions of dollars and take years of research.
The new computational approach is a much cheaper and less time-consuming way to identify gene-enhancer connections in the genome.
The technology also provides insight into how DNA loops form and how they might break in disease.
The study was published in the journal Nature Genetics.