New technology reveals DNA secrets behind disease and evolution

19.01.2025


An international team of researchers has made significant progress in understanding how gene expression is regulated in the human genome. In a recent study, they conducted a comprehensive analysis of cis-regulatory elements (CREs)—DNA sequences that control gene transcription. The study provides valuable information about how CREs drive the expression of cell-specific genes and how mutations in these regions can affect health and contribute to disease.

CREs, such as enhancers and promoters, play a crucial role in determining when and where genes are activated or silenced. Although their importance is well known, analyzing their activity on a large scale has been a long-standing challenge.

"The human genome contains a multitude of CREs, and mutations in these regions are thought to play important roles in human disease and evolution," explained Dr. Fumitaka Inoue, one of the study's first authors. "However, it has been very difficult to comprehensively quantify their activity across the genome."

Innovative technology enables large-scale CRE analysis

To address this problem, the team used a technique called the lentivirus-based massively parallel reporter assay (lentiMPRA), which the authors had developed previously. The approach allows thousands of CREs to be analyzed simultaneously by tagging each one with unique DNA barcodes that track its activity.
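To make the barcode idea concrete, here is a minimal Python sketch of how barcode counts might be turned into an activity score. It assumes, as is common for reporter assays of this kind in general, that activity is summarized as the log2 ratio of a CRE's barcode counts in RNA to its barcode counts in DNA. The function name `cre_activity`, the pseudocount, and the toy counts are illustrative assumptions and not the study's actual pipeline, which would also need steps such as normalizing for sequencing depth.

```python
import math
from collections import defaultdict


def cre_activity(rna_counts, dna_counts, barcode_to_cre, pseudocount=1.0):
    """Return a log2(RNA/DNA) score per CRE, pooling counts over its barcodes.

    Hypothetical helper for illustration only; the pseudocount avoids
    division by zero and log of zero for barcodes with no reads.
    """
    rna = defaultdict(float)
    dna = defaultdict(float)
    for barcode, cre in barcode_to_cre.items():
        rna[cre] += rna_counts.get(barcode, 0)
        dna[cre] += dna_counts.get(barcode, 0)
    return {
        cre: math.log2((rna[cre] + pseudocount) / (dna[cre] + pseudocount))
        for cre in dna
    }


# Toy data (invented): two barcodes tag one candidate enhancer, one tags a control.
barcode_to_cre = {"AAAC": "enhancer_1", "GGTT": "enhancer_1", "CCGA": "control_1"}
dna_counts = {"AAAC": 50, "GGTT": 49, "CCGA": 99}   # how often each barcode was integrated
rna_counts = {"AAAC": 200, "GGTT": 199, "CCGA": 99}  # how often each barcode was transcribed

scores = cre_activity(rna_counts, dna_counts, barcode_to_cre)
print(scores)
```

In this toy example the candidate enhancer's barcodes appear about four times more often in RNA than in DNA, giving a score of 2, while the control's score is 0, meaning no activity above baseline.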

Using lentiMPRA, the researchers screened as many as 680,000 CRE candidates in three widely used cell types: hepatocytes (liver cells), lymphocytes (a type of white blood cell), and induced pluripotent stem cells (a type of artificial stem cell derived from a normal cell in the body).

The study revealed several key insights. Across the three cell types, approximately 41.7% of the CREs analyzed were active. Promoters, which initiate gene transcription, depended on sequence orientation but were less cell-type-specific. Enhancers, which increase gene transcription, were active regardless of their orientation and were cell-type-specific. These findings highlight fundamental differences in how these two types of CREs function.

Machine learning improves prediction of gene regulation

The researchers developed several machine learning models to predict CRE regulatory activity from the large-scale experimental data. MPRALegNet, a model trained on the lentiMPRA dataset, proved the most accurate and efficient at predicting regulatory activity for any DNA sequence. Its predictions closely match experimental results, in some cases agreeing with the measurements as well as experimental replicates agree with each other.
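The comparison with experimental replicates can be illustrated with a short Python sketch. It assumes that agreement is measured with the Pearson correlation coefficient, a common choice for this kind of benchmark, and that the model is judged against the average of two replicates. All numbers are invented toy values, and `pearson` is a hand-written helper, not code from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes neither is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented activity scores for five sequences.
replicate_1 = [0.1, 1.9, -0.4, 1.2, 0.6]
replicate_2 = [0.3, 1.7, -0.2, 1.0, 0.9]
predicted = [0.2, 1.6, -0.3, 1.3, 0.7]

# The measurement the model is judged against: the mean of both replicates.
measured = [(a + b) / 2 for a, b in zip(replicate_1, replicate_2)]

replicate_vs_replicate = pearson(replicate_1, replicate_2)
model_vs_measured = pearson(predicted, measured)
print(f"replicate vs replicate: {replicate_vs_replicate:.3f}")
print(f"model vs measurement:   {model_vs_measured:.3f}")
```

Because the experiment itself is noisy, the correlation between replicates acts as a practical ceiling: a model whose correlation with the measurements approaches that value is about as consistent with the experiment as the experiment is with itself.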

The model was also able to identify important transcription factor binding motifs, short DNA sequences that determine CRE activity, providing insight into how specific factors drive cell-type-specific gene expression. For example, the study identified HNF4 motifs as crucial for activity in hepatocytes and GATA motifs as crucial for activity in lymphocytes.
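One generic way a sequence model can point to important motifs is in silico mutagenesis: change each base in turn and see how much the predicted activity moves. The Python sketch below illustrates that idea only; it is not necessarily the method used in the study. The function `toy_predict` is a hypothetical stand-in for a trained model and simply counts occurrences of the GATA core sequence.

```python
def toy_predict(seq):
    """Hypothetical stand-in for a trained model: counts GATA core motifs."""
    return float(seq.count("GATA"))


def in_silico_mutagenesis(seq, predict):
    """For each position, return the mean change in predicted activity
    over the three possible single-base substitutions."""
    baseline = predict(seq)
    effects = []
    for i, ref in enumerate(seq):
        deltas = [
            predict(seq[:i] + alt + seq[i + 1:]) - baseline
            for alt in "ACGT"
            if alt != ref
        ]
        effects.append(sum(deltas) / len(deltas))
    return effects


seq = "CCCGATACCC"  # invented sequence with one GATA motif at positions 3-6
effects = in_silico_mutagenesis(seq, toy_predict)
for base, effect in zip(seq, effects):
    print(base, effect)
```

Positions where a mutation lowers the prediction the most, here the four bases of the GATA motif, are the ones the model relies on. With a real model, runs of such positions can then be compared against known transcription factor motifs.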

By identifying and quantifying enhancer activity precisely, the study opens avenues for studying the molecular mechanisms of human disease. Future research will focus on applying this approach to genetic polymorphisms, the variations in DNA sequence that contribute to individual differences and susceptibility to disease.

"The nearly complete human genome has recently been sequenced, but most of its functional regions remain unknown. Our findings link DNA sequence information to its functional roles. We hope that these results will contribute to a deeper understanding of biological phenomena, including human disease and evolution," said Dr. Inoue.

The study also makes a database of CRE activity publicly available on the ENCODE portal, providing a resource for researchers worldwide. By integrating large-scale experimental data with machine learning, this work lays the foundation for future discoveries in genomics and personalized medicine. Tools such as lentiMPRA and MPRALegNet will also better equip researchers to unravel the complexities of gene regulation and explore the largely uncharted regions of the human genome.


portaltele.com.ua