A group of scientists at Cold Spring Harbor Laboratory (CSHL) has created an algorithm that can detect mutations in genes associated with conditions such as autism or obsessive-compulsive disorder. The algorithm, named Scalpel, works by grouping together all of the sequences from a given genomic region and then creating a new sequence alignment for that area.
There are over three billion letters in the human genome, which come together to create instructions for proteins via three-letter sequences. These sequences form sentences, which spell out instructions for the molecular structures of the body. When a sequence is absent, or an extra sequence is present, it affects the entire sentence, changing the instructions and potentially leading to disorders such as autism or obsessive-compulsive disorder.
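The effect of a missing letter on the three-letter reading can be seen in a minimal Python sketch. The DNA string and the helper name `codons` are made up for illustration and are not taken from the study.

```python
def codons(dna):
    """Split a DNA string into consecutive three-letter groups,
    dropping any incomplete group at the end."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]

# Hypothetical 15-letter sequence, chosen only for illustration.
original = "ATGGCCTTTAAGTGA"

# Delete a single letter (index 4): every later group shifts.
deleted_one = original[:4] + original[5:]

# Delete one whole three-letter group (indices 3-5): the rest stays in frame.
deleted_codon = original[:3] + original[6:]

print(codons(original))       # ['ATG', 'GCC', 'TTT', 'AAG', 'TGA']
print(codons(deleted_one))    # ['ATG', 'GCT', 'TTA', 'AGT']
print(codons(deleted_codon))  # ['ATG', 'TTT', 'AAG', 'TGA']
```

Removing one letter changes every three-letter group after the cut, while removing a whole group leaves the remaining groups intact. This is one reason even a tiny indel can alter the whole "sentence."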
With over three billion letters in the human genome, any given strand of DNA can vary in many ways. DNA insertions and deletions, called indels, can range in length from one letter to thousands, which makes it challenging to pinpoint any particular change.
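As a rough illustration of what an indel looks like in data, the sketch below classifies a variant by comparing the lengths of a reference allele and an alternate allele. The function name `classify_indel` and the example alleles are hypothetical. Real variant files carry more context than this.

```python
def classify_indel(ref, alt):
    """Return (kind, length) for a variant described by its reference
    and alternate alleles. The length is the number of letters gained
    or lost; equal-length alleles are not indels."""
    diff = len(alt) - len(ref)
    if diff > 0:
        return ("insertion", diff)
    if diff < 0:
        return ("deletion", -diff)
    return ("not an indel", 0)

# Hypothetical alleles, for illustration only.
print(classify_indel("A", "ATG"))      # ('insertion', 2)
print(classify_indel("ACGTTT", "A"))   # ('deletion', 5)
print(classify_indel("A", "G"))        # ('not an indel', 0)
```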
Mike Schatz, assistant professor and quantitative biologist, says:
“These indels are like very fine cuts to the genome – places where DNA is inserted or deleted – and Scalpel provides us with a computational lens to zoom in and see precisely where the cuts occur.”
Scalpel works by grouping together all of the sequences of a given genomic region. It then creates a new sequence alignment for that area, like putting together the pieces of a puzzle.
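The grouping step can be pictured with a short Python sketch. This is not Scalpel's code. The read format, the region tuples, and the function name `group_reads_by_region` are assumptions made for illustration, and the realignment step that follows grouping is not shown.

```python
from collections import defaultdict

def group_reads_by_region(reads, regions):
    """Collect the sequences of all reads that overlap each region.

    reads:   list of (chromosome, start, sequence) tuples (assumed format)
    regions: list of (chromosome, start, end) tuples, end exclusive
    """
    grouped = defaultdict(list)
    for chrom, start, seq in reads:
        end = start + len(seq)
        for region in regions:
            r_chrom, r_start, r_end = region
            # Two half-open intervals overlap when each starts before the other ends.
            if chrom == r_chrom and start < r_end and end > r_start:
                grouped[region].append(seq)
    return dict(grouped)

# Hypothetical regions and reads, for illustration only.
regions = [("chr1", 100, 200), ("chr1", 200, 300)]
reads = [
    ("chr1", 95, "ACGTACGTAC"),   # covers 95-105, overlaps the first region
    ("chr1", 195, "TTGACCATGA"),  # covers 195-205, overlaps both regions
    ("chr2", 150, "GGGCCCAAAT"),  # different chromosome, overlaps neither
]

groups = group_reads_by_region(reads, regions)
for region, seqs in groups.items():
    print(region, seqs)
```

Each group of reads would then be pieced together on its own, which is the "puzzle" step the article describes.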
The team, including Assistant Professors Mike Schatz, Gholson Lyon, Ivan Iossifov, and Professor Michael Wigler, published their findings in the journal Nature Methods. They compared Scalpel to two similar tools, HaplotypeCaller and SOAPindel. The three tools were used to analyze DNA from a donor with severe Tourette syndrome and obsessive-compulsive disorder. One thousand indels from the donor’s exome were selected for focused resequencing. Scalpel achieved a 77% true positive rate on resequencing, outperforming the other tools. It also performed exceptionally well with indels longer than five base pairs, which are often a weak point for other programs.
Scalpel was also used to analyze a large set of genetic data from 593 families who donated samples to the Simons Simplex Collection, a project of the Simons Foundation Autism Research Initiative. Each family in the sample had at least one child diagnosed with autism and one unaffected sibling. The researchers discovered a total of 3.3 million indels across the 593 families, most of which appear to be relatively harmless. A few dozen of these mutations appeared to be specifically associated with autism.
“All this adds to our body of knowledge about the spontaneous mutations that cause autism,” says Schatz. “We are collaborating with plant scientists, cancer biologists, and others, looking for indels. This is a powerful tool, and we are looking forward to revealing new pieces of the genome that make a difference, throughout the tree of life.”