Last week, humanity got to learn 8 per cent more about Homo sapiens. Scientists completed sequencing the human genome, the goal of a project that began in October 1990. (A genome is the complete set of genetic instructions encoded in an organism. The word is a combination of gene and chromosome.) This opens the door to new research.
What is a nucleotide?
If you’ve seen the movie Gattaca, you know that DNA is built from four nucleotides, abbreviated A, C, G and T (RNA uses U in place of T). These combine in various sequences to make up the chromosomes (long DNA molecules) of any species. As you may know, DNA is packed in a double helix: two intertwined spirals. Each “rung” of the DNA ladder consists of one base pair, that is, two nucleotides joined across the two strands. There are over three billion such base pairs in the human genome.
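The pairing rule behind each rung can be sketched in a few lines of Python. This is an illustrative toy rather than a bioinformatics tool: the function name `complement_strand` and the sample sequence are made up for the example.

```python
# In DNA, A pairs with T and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the partner base for each position, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


strand = "GATTACA"  # hypothetical sample sequence
partner = complement_strand(strand)

# Each rung of the ladder joins one base from each strand.
rungs = list(zip(strand, partner))

print(partner)                    # CTAATGT
print(len(rungs), "base pairs")   # 7 base pairs
```

Scaled up from seven rungs to more than three billion, this is the sequence the project set out to read.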
How many chromosomes are there in human DNA?
Different species have different sets of chromosomes. Homo sapiens has 23 pairs of chromosomes (46 in total). One chromosome of each pair comes from each parent, so a child inherits two copies of nearly every gene, one from each parent. Our closest relatives, other apes like gorillas, chimpanzees and orangutans, have 24 pairs. Even so, we share over 98 per cent of our DNA with the other apes.
What is the Human Genome Project (HGP)?
The HGP launched in 1990, with the intention of decoding the human genome. Over 2,800 scientists worked together to arrive at a collaborative road-map for the HGP with set goals for each five-year period.
The HGP researchers deciphered the human genome in three ways: they found the order, or “sequence,” of DNA; they made location maps showing where genes were placed in chromosomes; and they made “linkage maps” to track inherited traits (colour of hair, eyes, propensity to disease).
When did the HGP first produce significant results?
By 2001, the HGP had deciphered enough to publish a “draft genome”. This was nowhere near complete, but the sequence of perhaps 90 per cent of those three billion base pairs was known. (Ten per cent of three billion is still a rather large number: 300 million.) Much of the missing portion was dismissed as “junk” DNA. By 2003, a more complete genome was published and the project was deemed completed, but we still really wanted to know what the missing part was.
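The size of the gap is simple arithmetic, and a short Python sketch makes it easy to check. It assumes a round figure of exactly three billion base pairs (the real genome is slightly larger), and the helper name `unsequenced_base_pairs` is made up for the example.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # assumed round figure, not an exact count


def unsequenced_base_pairs(total: int, percent_known: int) -> int:
    """Base pairs still unknown when percent_known per cent has been sequenced."""
    return total * (100 - percent_known) // 100


# About 90 per cent known at the time of the 2001 draft.
print(unsequenced_base_pairs(GENOME_BASE_PAIRS, 90))  # 300000000

# About 92 per cent known before the final 8 per cent was filled in.
print(unsequenced_base_pairs(GENOME_BASE_PAIRS, 92))  # 240000000
```

Even a gap of a few per cent leaves hundreds of millions of base pairs unread.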
What is “junk DNA”?
Some DNA gives coded instructions to create proteins, which do various things. In humans this is only a small fraction of the genome, roughly one to two per cent. DNA that doesn’t give coding instructions is generally called junk DNA. It was long considered less important, but there’s a mystery.
Why would our genomes contain non-functional junk sequences? As science advances, we’re figuring out that junk DNA is often not junk. It seems to do several useful things, which we don’t yet fully understand. But in the early 2000s, when the first versions of the genome were released, it wasn’t considered important and we simply didn’t have the tools to decipher it. So there were gaps in the published genome: by early 2022, only around 92 per cent of the human genome was done.
When was the human genome completely sequenced and by whom?
Last week, six different papers were published, completing the sequence of the human genome. There are now no gaps left: that last 8 per cent has been filled in. The sequencing was done by another collaborative effort called the Telomere to Telomere (T2T) consortium. This included researchers at the US National Human Genome Research Institute (NHGRI), which is part of the US National Institutes of Health; the University of California, Santa Cruz; and the University of Washington, Seattle. The entire genome sequence has been published in the journal Science (www.science.org/doi/10.1126/science.abl3533).
The Telomere-to-Telomere CHM13 (T2T-CHM13) genome (that’s the scientific name for the benchmark completed genome) closes the gaps. It adds nearly 200 million base pairs of sequence, corrects thousands of errors, and unlocks the most complex regions of the human genome for future scientific inquiry.
How does this advance science?
The reference CHM13 offers a benchmark against which to compare individuals, or entire populations. Say you know that somebody (or many somebodies) has a genetic propensity to diabetes, or resistance to HIV or Covid-19. You can compare the specific genomes to the reference CHM13 and study the differences. This leads to a better understanding of which genes are responsible for what. It may make it possible to tailor medicines or vaccines to suit specific individuals, and it may lead eventually to being able to genetically engineer vulnerability to some diseases out of the species.
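The idea of comparing a genome against a reference can be illustrated with a deliberately simplified Python sketch. It checks two short, equal-length sequences position by position and reports where they differ. Both sequences and the function name `differences` are invented for the example. Real genomes also differ by insertions and deletions, so actual studies rely on specialised alignment software rather than anything this simple.

```python
def differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, sample_base) wherever the two differ."""
    if len(reference) != len(sample):
        raise ValueError("this toy comparison needs equal-length sequences")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


reference = "ACGTACGTAC"  # hypothetical stretch of a reference genome
sample = "ACGTTCGTAA"     # hypothetical stretch from one individual

for position, ref_base, sample_base in differences(reference, sample):
    print(f"position {position}: reference {ref_base}, sample {sample_base}")
# position 4: reference A, sample T
# position 9: reference C, sample A
```

Differences like these, found in many people who share a trait, are the starting point for working out which genes are involved.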