Thomas Evans and Laurence Ettwiller of New England Biolabs don’t hesitate to answer that question in their recent paper — it’s right in the title: “DNA damage is a pervasive cause of sequencing errors, directly confounding variant identification”. Indeed, the word “variant” is used 88 times in the paper, including hard-to-decipher phrases like “Variants originating from real in-vivo variants”, because it’s used to mean at least three different things:

  1. a germline variant is a position in an individual’s germline DNA that is different from the reference genome sequence
  2. a somatic variant is a position in a somatic cell’s DNA that is different from that individual’s germline sequence
  3. a sequence read variant is a position in a specific DNA sequencing read that is different from the reference genome sequence, which I’ll call an observed non-reference allele
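To make the distinction concrete, here is a toy sketch of how the three definitions could be told apart at a single position. It is an illustration only, not anything from the paper: the `Site` record and `classify` function are hypothetical names, and it assumes one base per sample (no diploid genotypes, no indels).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """One genomic position, simplified to a single base per source (an assumption)."""
    reference: str  # base in the reference genome
    germline: str   # base in the individual's germline DNA
    somatic: str    # base in one somatic cell's DNA
    read: str       # base reported by one sequencing read


def classify(site: Site) -> list[str]:
    """Return which of the three senses of "variant" apply at this site."""
    labels = []
    if site.germline != site.reference:
        labels.append("germline variant")
    if site.somatic != site.germline:
        labels.append("somatic variant")
    if site.read != site.reference:
        labels.append("observed non-reference allele")
    return labels


# A read that differs from the reference although the underlying DNA does not
# (for example through damage or sequencing error) matches only the third definition.
artifact = Site(reference="G", germline="G", somatic="G", read="T")
print(classify(artifact))

# A true germline variant that the read reports faithfully matches the first and third.
inherited = Site(reference="G", germline="A", somatic="A", read="A")
print(classify(inherited))
```

The point of the sketch is that an observed non-reference allele can occur with or without a real germline or somatic variant underneath it, which is why using one word for all three gets confusing.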

Much of the interest in this paper that I’ve seen on Twitter (and my own spit-take when I saw it as a preprint just after submitting a low-coverage sequencing paper) arises from the potential effects on those first two definitions, which are the foundation of human genetics and cancer genetics, respectively.

14. August 2015 · Two new reviews · Categories: Papers

Summertime was review writing time in the group, with two new papers published recently.

The first was “Strategies for fine-mapping complex traits”. One of the activities that has kept us busy lately has been trying to narrow down GWAS hits to causal variants and genes. Our biggest applied effort in fine-mapping has been, unsurprisingly, in inflammatory bowel disease (IBD), where our longstanding collaboration with the International IBD Genetics Consortium, access to big sample sets, and a generally tractable genetic architecture have made it a fruitful exercise. This review was largely motivated by the idea that our experience might be useful to others working on other diseases.

The second was “Understanding inflammatory bowel disease via immunogenetics”. Sticking with the IBD theme, this is the latest in a number of IBD genetics reviews (things change quickly in this business!), this time as part of a series in the Journal of Autoimmunity aimed at putting the immunogenetics of many different disorders in context in a single issue.

Congrats to Katie and Sarah on these two papers!

20. March 2015 · Genetic study sheds new light on TB pathogenesis · Categories: Papers, Science

One of the world’s most ancient diseases

Tuberculosis, also known as consumption, was first recorded in Greek literature around 460 BCE. Hippocrates identified it as the most widespread and fatal disease of his time. Tuberculosis (TB) is caused by a pathogen called Mycobacterium tuberculosis (M.tb). In Greek, myco refers to a mushroom-like shape, vividly describing these fungal-looking bacteria that float into the human system through the airways.

TB accounted for approximately 25% of total deaths in Europe from the 17th to 19th centuries. Many of the writers and artists of the Victorian era suffered and died from the disease and painted it with a pathological – yet somehow romantic – extreme: febrile, unrelenting and breathless.

Experiment eleven

It was not until 1943 that a young Ph.D. student called Albert Schatz, from Professor Selman Waksman’s lab at Rutgers University in the US, discovered the first effective cure for TB. In Schatz’s eleventh experiment on a common bacterium found in farmyard soil, the first antibiotic agent for treating TB, streptomycin, was discovered. The battle for the ownership of streptomycin became a famous scientific scandal [Experiment Eleven] when Waksman took credit and the Nobel prize for the discovery, downplaying Schatz’s contributions. Thanks to a sustained effort from the government and society, including better nutrition, housing, improved sewage systems and ventilation, the number of TB cases was reduced significantly by the 1980s. The efforts to seek cures for TB have not only brought TB mortality down, but also helped to shape modern medicine and our understanding of infectious illness.

Out this week in Nature is the first big paper from the inflammatory bowel disease Immunochip project. The international project collected data from over 75 thousand individuals, and brought the total number of known IBD loci to a record-breaking 163. You can read more about the paper on the Sanger Institute website.

One interesting thing about the paper was how difficult it was to visualize the results. With one exception, there was no single image that naturally fell out of any of the analyses, and we had to put quite a bit of work into displaying the messages of the paper in the figures. You can judge for yourself how much success we had, but I can say that up until the last few days before submission we still had images that everyone hated but couldn’t think what to replace with. The last one to be replaced was the evocatively named “Smear-o-venn”, which we were all relieved to see the back of.


17. February 2012 · Are loss-of-function variants relevant to complex disease? · Categories: Papers

Three members of the group (James, Luke & Jeff) were involved in a substantial undertaking (led by our colleague Daniel MacArthur) to study loss-of-function (LoF) variants in otherwise “normal” human genomes. These are mutations which are predicted to obliterate the function of a gene: things like gained stop codons, coding frameshift insertions and the like. The paper has just come out in Science, and Daniel has a great write-up over at Genomes Unzipped. We made two contributions to the project (reflecting our principal interests):

  1. Because real LoF variants are (generally) selected against, but sequencing errors that look like LoF aren’t, this class of variation is hugely enriched in all sorts of sequencing and annotation errors. We therefore spent a lot of effort hand-validating LoF calls (using Evoker) to try to separate the wheat from the chaff.
  2. One might expect LoF variants to be more likely than an average SNP to affect disease risk, but that didn’t really seem to be the case. Only one very well known LoF variant (the Crohn’s disease NOD2 frameshift) from the paper showed any appreciable association with disease. Perhaps these variants are just too strongly selected against to rise to frequencies visible to GWAS?