LETTER
doi:10.1038/nature11284
Nature | Vol 488 | 2 August 2012 | 100. ©2012 Macmillan Publishers Limited. All rights reserved.

Dissecting the genomic complexity underlying medulloblastoma

A list of authors and their affiliations appears at the end of the paper

Medulloblastoma is an aggressively growing tumour, arising in the cerebellum or medulla/brain stem. It is the most common malignant brain tumour in children, and shows tremendous biological and clinical heterogeneity¹. Despite recent treatment advances, approximately 40% of children experience tumour recurrence, and 30% will die from their disease. Those who survive often have a significantly reduced quality of life. Four tumour subgroups with distinct clinical, biological and genetic profiles are currently identified²,³. WNT tumours, showing activated wingless pathway signalling, carry a favourable prognosis under current treatment regimens⁴. SHH tumours show hedgehog pathway activation, and have an intermediate prognosis². Group 3 and 4 tumours are molecularly less well characterized, and also present the greatest clinical challenges²,³,⁵. The full repertoire of genetic events driving this distinction, however, remains unclear. Here we describe an integrative deep-sequencing analysis of 125 tumour–normal pairs, conducted as part of the International Cancer Genome Consortium (ICGC) PedBrain Tumor Project. Tetraploidy was identified as a frequent early event in Group 3 and 4 tumours, and a positive correlation between patient age and mutation rate was observed. Several recurrent mutations were identified, both in known medulloblastoma-related genes (CTNNB1, PTCH1, MLL2, SMARCA4) and in genes not previously linked to this tumour (DDX3X, CTDNEP1, KDM6A, TBR1), often in subgroup-specific patterns. RNA sequencing confirmed these alterations, and revealed the expression of what are, to our knowledge, the first medulloblastoma fusion genes identified. Chromatin modifiers were frequently altered across all subgroups. These findings enhance our understanding of the genomic complexity and heterogeneity underlying medulloblastoma, and provide several potential targets for new therapeutics, especially for Group 3 and 4 patients.

As a first phase of the International Cancer Genome Consortium (ICGC) PedBrain Tumor Project (http://www.pedbraintumor.org), we have collected matched tumour and germline samples from 125 medulloblastoma patients aged from 0 to 17 years (Supplementary Table 1). Whole-genome sequencing (WGS, n = 39) and whole-exome sequencing (WES, n = 21) were applied to a 'discovery' set, with a custom-capture approach used to sequence 2,734 genes in an additional 'replication' set (n = 65). All tumour samples were obtained at primary diagnosis, before adjuvant therapy, and the distribution of molecular subgroups was similar across cohorts (Supplementary Fig. 1).

Investigation of genome-wide somatic mutation allele frequencies identified several cases with a clear peak at approximately 25%, rather than the expected approximately 50% allele frequency for early, heterozygous events (Fig. 1a). Analysis of coverage depth and allele frequencies in regions of copy-number change ruled out stromal contamination, but rather indicated a tetraploid baseline in the tumour genome (Fig. 1b). Predicted ploidy status was confirmed by fluorescence in situ hybridization (FISH) using multiple centromeric probes in 17 out of 18 cases analysed (Fig. 1a). The extremely low fraction of mutations at approximately 50% allele frequency indicates that genome duplication occurred very early during tumorigenesis. Some cases probably went through even higher polyploidy states before reaching an approximately 4n baseline (for example ICGC_MB45, displaying 4n chromosomes with 4:0 or 3:1 allele ratios; Supplementary Fig. 2). Across the discovery set, tetraploidy was most commonly observed in Group 3 (7 out of 13, 54%) and Group 4 tumours (8 out of 20, 40%), followed by SHH (4 out of 14, 29%) and WNT tumours (1 out of 7, 14%). Interestingly, the four tetraploid SHH tumours all harboured TP53 mutations and also displayed chromothripsis⁶. Tetraploid Group 3 and 4 tumours showed significantly more large-scale copy number alterations compared with diploid cases (median 10 changes per tumour in tetraploid versus 4 per tumour in diploid cases, P = 0.008, two-tailed Mann–Whitney U-test; Supplementary Fig. 3). Thus, tetraploidy followed by genomic instability may be an early driving event in a large proportion of Group 3 and 4 medulloblastomas, which pose a significant clinical challenge due to their dismal prognosis and lack of targeted treatment options. Novel classes of drugs such as mitotic checkpoint kinase or kinesin inhibitors, which target the maintenance of tetraploidy through successive cell divisions, may therefore represent a rational therapeutic strategy in these cases⁷,⁸. The value of tetraploidy as a prognostic marker also requires further investigation.

The average somatic mutation rate in the WGS cohort was 0.52 per megabase (Mb), with an average of 10.3 non-synonymous coding single-nucleotide variants (SNVs) in the discovery cohort (Supplementary Table 2). This is slightly higher than previously reported for medulloblastoma⁹, possibly due to improved coverage and technical sensitivity, but considerably lower than in deep-sequenced adult tumours, for example¹⁰,¹¹. There were significantly fewer transitions in the somatic alterations compared with germline variation (P = 4.6 × 10⁻⁷, Wilcoxon rank-sum test; Supplementary Fig. 4). All coding somatic SNVs identified in the combined cohort are listed in Supplementary Table 3.

We identified a positive correlation between genome-wide mutation rate and patient age, as previously reported for coding mutations⁹ (r² = 0.35, P = 7.8 × 10⁻⁵, Pearson's product–moment correlation; Fig. 1c). Intriguingly, this association was more pronounced in diploid tumours (r² = 0.52, P = 3 × 10⁻⁵), and virtually absent in tetraploid cases (r² = 0.04, P = 0.5) (Supplementary Fig. 5a, b). A similar trend was observed for non-synonymous mutations across the discovery cohort (Supplementary Fig. 5c). Coverage level did not correlate with mutation rate (Supplementary Fig. 5d). One explanation may be that all medulloblastomas originate during embryogenesis, with some tumours needing to accumulate more genetic 'hits' before becoming symptomatic. Alternatively, tumours arising in older patients may derive from more differentiated cells that require a greater number of alterations to undergo malignant transformation. Investigation of additional tumours from older patients may help to clarify this.

Five SHH tumours harbouring TP53 mutations, including three previously described Li–Fraumeni syndrome (LFS)-associated tumours with germline mutations⁶, one newly identified LFS case (ICGC_MB23), and one somatically mutated tumour (ICGC_MB34), had significantly more mutations than the remaining cases, both genome wide (mean 1.1 per Mb versus 0.43 per Mb, P = 4.5 × 10⁻⁶; two-tailed t-test) and for
[Figure 1 image: panels a–d. a, density versus mutant allele frequency for diploid (for example ICGC_MB37) and tetraploid (for example ICGC_MB15) tumours, with peaks marked at 50% and 25%; b, ICGC_MB15, coverage ratio (tumour versus control) and tumour BAF by chromosome, and tumour BAF versus coverage ratio with allele ratios 3:0, 2:1, 2:2, 3:2 and 4:3 marked; c, somatic SNVs genome-wide versus age at diagnosis (years), R² = 0.35, coloured by WNT, SHH, SHH-p53, Grp3 and Grp4; d, somatic SNVs genome-wide by subgroup (WNT, SHH, SHH-p53, Grp3, Grp4) with pairwise P values.]

Figure 1 | Tetraploidy is a frequent early event in medulloblastoma tumorigenesis, and mutation rates vary with age and subgroup. a, Distributions of genome-wide somatic mutation allele frequencies (the proportion of sequence reads supporting a mutation) for diploid tumours (with a peak at ~50% for heterozygous events, n = 7) and tetraploid cases (with a peak at ~25%, n = 7). Insets show centromeric FISH for chromosomes 1 (red) and 11 (green), confirming the predicted ploidy status. b, Top left, rescaled tumour:germline coverage ratio, indicating copy-number gains (red) or losses (green). Bottom left, B-allele frequency (BAF) in the tumour at SNP positions which are heterozygous in the germ line. Right, genome alteration print (GAP) of segmented copy number and allele frequency profiles. Chromosomes with predicted 3:0/2:1/3:2 allele ratios show a BAF of approximately 0/0.33/0.4 and coverage ratios of approximately 0.75/0.75/1.25. Owing to random sampling, the 2:2 allele ratio is slightly below 0.5. c, Genome-wide somatic mutation rates are positively correlated with patient age (n = 39). Grp, Group. d, Distribution of somatic mutation rates by tumour subgroup (n = 39). P values are according to a Wilcoxon rank-sum test with Bonferroni correction. SHH-p53, SHH-subgroup tumours harbouring a somatic or germline TP53 mutation.

non-synonymous changes (mean 23 versus 8.8, P = 2.6 × 10⁻⁶). Interestingly, the WNT subgroup, which typically shows a good prognosis and few copy-number changes, had the next highest mutation rate (Fig. 1d).

Forty-one somatic, coding, small insertions/deletions (Indels) were identified across the cohort, with an average of 0.4 coding Indels per case in the discovery set (range 0–2; Supplementary Table 4). Some genes, however, were more commonly affected by Indels than SNVs. For example, frameshift Indels in PTCH1 were detected in 6 out of 125 cases, whereas only 2 SNVs were observed. Recurrent Indels were also seen in the chromatin modifiers MLL2, KDM6A (3 cases each) and BCOR (2 cases).

In contrast to another paediatric brain tumour, glioblastoma, in which we recently identified frequently recurrent hotspot mutations¹², the majority of mutated genes in this study were unique to a single case (587 out of 760 non-synonymous SNVs in the 125 cases, 77%), demonstrating the pronounced genetic heterogeneity of medulloblastoma. Twenty-five of these singleton mutations, and 53 SNVs in total, were at positions listed in the COSMIC database of somatic alterations in tumours (available at http://www.sanger.ac.uk/genetics/CGP/cosmic/), suggesting a rare but important contribution of many known cancer genes in medulloblastoma (Supplementary Table 5). Only 8 genes were somatically altered in more than 3% of the whole series: CTNNB1 (15 cases, 12%); DDX3X (10 cases, 8%); PTCH1 (8 cases, 6%), SMARCA4 (6 cases, 5%), MLL2 (6 cases, 5%), TP53 (somatically mutated in 5 cases, 4%), KDM6A (5 cases, 4%) and CTDNEP1 (4 cases, 3%) (Fig. 2). These were also the only genes found to be significantly altered upon analysis of the combined cohort with MutSig, an algorithm testing whether the observed mutations in a gene are not simply a consequence of random background mutation processes. It takes into account gene length and composition, silent to non-silent mutation ratios, and other factors (see https://confluence.broadinstitute.org/display/CGATools/MutSig; Supplementary Table 6). Large-scale copy-number changes known to be associated with medulloblastoma, such as formation of an isodicentric 17q and losses of 10q/9q/X¹³–¹⁵, were more frequently recurrent than SNVs (Supplementary Fig. 6a–e).

Many alterations were enriched in specific medulloblastoma subgroups. For example, all of the WNT tumours (15 out of 15) harboured a mutation in CTNNB1, and 13 out of 15 displayed loss of one copy of chromosome 6 (or acquired uniparental disomy in one case), alterations which have previously been associated with this subgroup⁴,¹³,¹⁵. Mutations in DDX3X were also clearly enriched in WNT tumours (adjusted P = 7.06 × 10⁻⁶, two-tailed Fisher's exact test with a Bonferroni correction), and these mutations were clustered within the helicase domain (Supplementary Fig. 7a). Three were localized at the RNA-binding surface of the protein and three were predicted to disrupt the closed (RNA-binding) conformation (Supplementary Fig. 7b). The remainder were predicted to disrupt indirectly either the positive charge on the RNA-binding surface (n = 2) or the folding of the closed form (n = 2). No truncating mutations were found,

[Figure 2 image: matrix of the 125 tumours (columns, grouped as WNT, SHH, Group 3, Group 4 and ND) against the rows Group, Gender, Histology, Ploidy, −6, −9q, −10q, −17p, +17q, −X, MYC Amp, MYCN Amp, CTNNB1, DDX3X, SMARCA4, MLL2, TP53, PTCH1, KDM6A and CTDNEP1. Key: Histology: classic, large cell/anaplastic, desmoplastic. Ploidy: diploid, tetraploid. Alterations: alteration present; germline mutation; partial loss or monosomy; U, acquired UPD; D, homozygous deletion.]

Figure 2 | Subgroup specificity of common genetic alterations. Summary of clinical data and recurrent alterations in the combined cohort (n = 125). Genes which were found to be significantly mutated by MutSig analysis were included. UPD, uniparental disomy; ND, no material available for conclusive molecular subgroup assignment.
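The allele-ratio values quoted in the Figure 1 legend, and the 25% versus 50% mutant allele frequency peaks discussed in the text, follow from simple arithmetic on a tetraploid (4n) baseline. The Python sketch below illustrates that arithmetic only. It assumes a pure tumour sample, a coverage ratio equal to total copies divided by the baseline ploidy, and a BAF equal to the minor-allele fraction. These are simplifying assumptions for illustration, the function names are hypothetical, and this is not the authors' ploidy-prediction method.

```python
from fractions import Fraction

BASELINE_PLOIDY = 4  # assumed tetraploid baseline, as inferred in the text


def expected_baf(major: int, minor: int) -> Fraction:
    """Minor-allele fraction of a segment carrying major:minor allele copies."""
    return Fraction(minor, major + minor)


def expected_coverage_ratio(major: int, minor: int,
                            baseline: int = BASELINE_PLOIDY) -> Fraction:
    """Tumour:germline coverage ratio, rescaled so the baseline ploidy equals 1."""
    return Fraction(major + minor, baseline)


def expected_mutant_allele_frequency(mutated_copies: int,
                                     total_copies: int) -> Fraction:
    """Fraction of reads expected to carry a somatic mutation (pure tumour)."""
    return Fraction(mutated_copies, total_copies)


if __name__ == "__main__":
    # Allele ratios named in the Figure 1 legend, plus the balanced 2:2 state.
    for major, minor in [(3, 0), (2, 1), (3, 2), (2, 2)]:
        baf = float(expected_baf(major, minor))
        cov = float(expected_coverage_ratio(major, minor))
        print(f"{major}:{minor}  BAF={baf:.2f}  coverage ratio={cov:.2f}")

    # A mutation acquired after genome duplication sits on 1 of 4 copies;
    # one acquired before duplication is carried on 2 of 4 copies.
    print(float(expected_mutant_allele_frequency(1, 4)))
    print(float(expected_mutant_allele_frequency(2, 4)))
```

Under these assumptions, a tumour whose mutations mostly arose after duplication would show a peak near 25%, which is consistent with the interpretation in the text that genome duplication was an early event.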
indicating an alteration rather than simply a loss of function. DDX3X has recently been proposed to have an oncogenic role¹⁰,¹¹, although its exact function in tumorigenesis remains to be determined.

As anticipated from previous studies¹³,¹⁶, SHH tumours frequently showed loss of the whole of chromosome arm 9q, as well as alterations in key hedgehog-pathway signalling molecules (for example, PTCH1, altered in 8 cases; MYCN, amplified in 5 cases; and SMO, mutated in ICGC_MB12).

The most frequently mutated gene in Group 3 tumours was SMARCA4 (3 out of 26 cases). As with DDX3X, these mutations were clustered in the helicase domain (Supplementary Fig. 7a). As noted above, tetraploidy was also a common event in this subgroup and in Group 4 tumours. Recurrent truncating mutations in KDM6A (on chromosome X, which frequently shows copy-number loss in female Group 3 and 4 medulloblastoma patients; also known as UTX), encoding a histone 3 lysine 27 (H3K27) demethylase, were also seen in Group 4 (4 out of 40, 10%), indicating a tumour-suppressive role in this subgroup, as previously described for other cancers¹⁷. CTDNEP1 (a homologue of the Xenopus gene dullard), was also affected by truncating alterations in four tumours. In three of these cases, the mutation was accompanied by loss of the wild-type allele through isodicentric 17q formation. This gene, encoding a nuclear envelope phosphatase, was shown in Xenopus to have roles in BMP signalling and neural development¹⁸. In mammalian cells it is involved in the lipin activation pathway, regulating nuclear membrane biogenesis and production of diacylglycerol¹⁹,²⁰. Given the high frequency of isodicentric 17q in medulloblastoma, genetic targets on this chromosome have long been sought after. CTDNEP1 may be a good candidate for one of the medulloblastoma tumour suppressors on 17p.

Aside from these subgroup-enriched events, a commonly recurring theme across all medulloblastomas is alterations in genes involved in chromatin modification. Some point mutations and DNA copy number alterations in this pathway have previously been implicated in medulloblastoma⁹,²¹. Overall, 45 out of 125 cases (36%) harboured a mutation in a gene categorized under the Gene Ontology term 'Chromatin Modification' (GO:0015168, Supplementary Fig. 6f, g).

We recently described an enrichment of catastrophic DNA rearrangements ('chromothripsis') in TP53-mutated SHH medulloblastomas⁶. Three new TP53-mutant SHH tumours were identified in this

[Figure 3 image: a, ICGC_MB34 chromosome 7, log₂ ratio versus chromosomal coordinates (megabases); b, rearrangement schematic between positions 155.07–157.18 Mb, with genes INSIG1, EN2, CNPY1, RBM33, SHH, UBE3C, DNAJB6, NOM1 and MNX1; c, MB34 RNA-seq reads per million on chr7(+) and chr7(−), showing DNAJB6 (exon 1) joined to SHH (exons 2–3), supported by 72 reads on the fusion and 103 spanning mate pairs; truncated SHH: aa 114–462.]

Figure 3 | Identification of novel fusion genes in medulloblastoma. a, Read-depth plot with log₂ tumour:germline coverage ratio showing alterations on chromosome 7 in ICGC_MB34. Lines indicate connected segments. b, Schematic of the rearrangement. c, Details of the SHH fusion gene structure and support for its expression, derived from RNA sequencing data. aa, amino acids.
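The subgroup-enrichment statistics reported for DDX3X use a two-tailed Fisher's exact test with a Bonferroni correction. The Python sketch below shows how such a test can be computed from a 2 × 2 table with the standard library alone. The counts used (8 of 15 WNT tumours and 2 of 110 other tumours carrying a DDX3X mutation) are our reading of Figure 2 and should be treated as an assumption. The number of tests used for the correction is not stated in the text, so `n_tests=32` is a purely hypothetical placeholder, and the function names are ours.

```python
from math import comb


def hypergeom_pmf(k: int, group_size: int, n_mutated: int, total: int) -> float:
    """Probability that exactly k mutated tumours fall in the group (fixed margins)."""
    return (comb(n_mutated, k) * comb(total - n_mutated, group_size - k)
            / comb(total, group_size))


def fisher_exact_two_sided(group_mut: int, group_wt: int,
                           other_mut: int, other_wt: int) -> float:
    """Two-sided Fisher's exact test: sum of all tables no more likely than observed."""
    group_size = group_mut + group_wt
    n_mutated = group_mut + other_mut
    total = group_size + other_mut + other_wt
    k_min = max(0, group_size + n_mutated - total)
    k_max = min(group_size, n_mutated)
    p_observed = hypergeom_pmf(group_mut, group_size, n_mutated, total)
    probabilities = [hypergeom_pmf(k, group_size, n_mutated, total)
                     for k in range(k_min, k_max + 1)]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in probabilities if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Assumed counts: WNT mutated / WNT wild-type / other mutated / other wild-type.
    p = fisher_exact_two_sided(8, 7, 2, 108)
    print(f"unadjusted P = {p:.3g}")
    # n_tests is hypothetical; the source does not give the number of comparisons.
    print(f"Bonferroni-adjusted P = {bonferroni(p, n_tests=32):.3g}")
```

With these assumed counts the unadjusted value is on the order of 10⁻⁷, and the hypothetical 32-test correction lands at the same order of magnitude as the reported adjusted P. In practice a maintained implementation such as `scipy.stats.fisher_exact` would normally be preferred over a hand-rolled one.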
study: ICGC_MB23 (germline mutation), MBRep_T29 and MBRep_T53 (somatic mutations). Two of these, ICGC_MB23 and MBRep_T53, showed complex genomic rearrangements indicative of the chromothripsis model (Supplementary Fig. 8)²².

Deep sequencing also allowed fine mapping of two amplicons on chromosome 7 in ICGC_MB34 (a SHH tumour with a somatic TP53 mutation, relating to MB2034 in ref. 6). One amplicon included the entire SHH gene, whereas the second disrupted DNAJB6, such that its first exon was juxtaposed to SHH (Fig. 3a, b). RNA sequencing further revealed a novel fusion transcript, not expected from the DNA data, containing the first exon of DNAJB6 and exons 2 and 3 of SHH. The first exon of SHH was skipped, resulting in a predicted amino-terminally truncated SHH protein (Fig. 3c). Expression of SHH was extremely high in this case, although virtually absent in 301 other medulloblastomas (Supplementary Fig. 9a). Predicted DNA and RNA junctions were validated by PCR (Supplementary Fig. 9b).

Several additional in-frame gene fusions were identified by large insert mate-pair sequencing, which gives better resolution for structural variant detection. ICGC_MB18, for example, carried an intrachromosomal translocation resulting in a fusion between LCLAT1 and ERBB4, the latter of which has previously been associated with medulloblastoma oncogenesis²³ (Supplementary Fig. 9c–f). In ICGC_MB6, a complex rearrangement of fragments from chromosomes 1 and 17 produced a fusion between MLLT6 and MRPL45, a mitochondrial ribosomal protein, resulting in strong overexpression of the latter (Supplementary Fig. 10a–c). These findings indicate that gene fusions involving well-established medulloblastoma oncogenes may have a more important role in medulloblastoma than previously recognized, and warrant further investigation.

High-coverage, strand-specific RNA sequencing of 28 cases allowed us to determine the proportion of DNA SNVs that were observable in the transcriptome (Supplementary Tables 3 and 4). Overall, 129 out of 268 (48%) non-synonymous mutations in the DNA were also detectable at the RNA level. A further 38% (101 out of 268) resided in genes expressed at extremely low abundance (reads per kilobase of exon model per million mapped reads (RPKM) < 1). Thus, the fraction of expressed mutations is even smaller than the already low number of DNA alterations, supporting the hypothesis that very few driving hits are needed to generate this paediatric tumour. It may also be the case that some mutations required for tumour initiation are not essential for later tumour cell maintenance.

RNA sequencing further revealed monoallelic expression of a heterozygous mutation in TBR1, producing a p.G275C change, which was also seen in a previous study⁹ (Supplementary Fig. 11a). TBR1 encodes a T-box transcription factor involved in brain development²⁴. This gene, and a second family member, EOMES (or TBR2), clearly showed subgroup-specific differential expression (Fig. 4a). Sequencing of TBR1 exon 2 in a further 85 medulloblastomas revealed one additional case with an identical mutation. All three mutated tumours were in Group 4. Gene expression was also strongly correlated with DNA methylation for both TBR1 and EOMES (Fig. 4b, c and Supplementary Fig. 11b, c), and expression of TBR1 and EOMES is inversely correlated

[Figure 4 image: a, TBR1 and EOMES gene expression by subgroup (WNT, SHH, Grp3, Grp4); b, TBR1 methylation on chromosome 2, by subgroup (WNT, SHH, Group 3, Group 4); c, TBR1 gene expression versus methylation (mean beta-value), r = −0.563; d, TBR1 and EOMES gene expression across samples.]

Figure 4 | Integration of mutation, expression and methylation data shows differential regulation of TBR1 and EOMES in medulloblastoma. a, Microarray data showing clear differences in TBR1 and EOMES expression between medulloblastoma subgroups (n = 301). b, DNA methylation of TBR1 (n = 54), ranging from low (blue) to high (red). Horizontal red bar indicates the region used for correlation analysis in c. c, Expression of TBR1 is tightly correlated with gene methylation (n = 54; Pearson's correlation values, r). SHH tumours show high methylation and virtually no expression, whereas WNT, Group 3 and Group 4 tumours display a more varied pattern. d, Expression levels of TBR1 (diamonds) and EOMES (circles) are inversely related in Group 4 tumours (n = 104).
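The expression threshold used in the text, RPKM < 1, refers to reads per kilobase of exon model per million mapped reads. The Python sketch below shows the commonly used definition of that unit. The function name and the example numbers are illustrative assumptions and are not taken from the paper's data or pipeline.

```python
def rpkm(reads_in_gene: int, exon_model_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads.

    RPKM = 1e9 * reads_in_gene / (exon_model_length_bp * total_mapped_reads)
    """
    if exon_model_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return 1e9 * reads_in_gene / (exon_model_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical gene: 50 reads on a 2,000 bp exon model in a library
    # of 40 million mapped reads.
    value = rpkm(50, 2_000, 40_000_000)
    print(value)
    print("low abundance (RPKM < 1)" if value < 1 else "expressed")
```

Because the measure normalizes for both gene length and library size, a fixed cutoff such as 1 can be applied across genes and samples, which is how the text separates mutations in expressed genes from those in genes with extremely low abundance.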
in Group 4 tumours (Fig. 4d), giving subsets that are either TBR1-methylated and EOMESʰⁱ or EOMES-methylated and TBR1ʰⁱ (Supplementary Fig. 11d, e). These two genes are markers for different stages of neuronal lineage commitment, suggesting possible differences in cell-of-origin or differentiation within Group 4 subpopulations²⁵.

This large, integrative genomics study has provided a detailed insight into new mechanisms contributing to medulloblastoma tumorigenesis and disclosed novel targets for therapeutic approaches, especially for Group 3 and 4 patients. The molecular subgroup-related enrichment of many alterations highlights the importance of considering this distinguishing factor in research, trial design and clinical practice.

METHODS SUMMARY

All patient material was collected after receiving informed consent according to ICGC guidelines and as approved by the institutional review board of contributing centres. Tumour subgrouping was based on gene expression profiling or immunohistochemical analysis as described in ref. 5.

Next generation sequencing was performed using Illumina technologies. Mean DNA sequence coverage was 35-fold for whole-genome cases (range 26–56×), whereas mean on-target coverage in the whole-exome and replication cohorts was 68-fold (74% of targets above 20× for whole exome, 66% for the replication cohort). Exome capture was carried out with Agilent SureSelect (Human All Exon 50 Mb and XT Custom Library) in-solution reagents. Sequence data were aligned to the hg19 human reference genome assembly; duplicate and non-uniquely mapping reads were excluded. Tumour ploidy was predicted from sequencing data by a novel approach integrating copy number aberrations with allele frequencies. A subset of sequence variants were validated using PCR and Sanger sequencing. Verification rates were 95% (128 out of 135) for SNVs and 100% (14 out of 14) for Indels (Supplementary Tables 3 and 4). A complete description of the materials and methods is provided in the Supplementary Information.

Received 3 February; accepted 6 June 2012.
Published online 25 July 2012.

1. Louis, D. N. et al. The 2007 WHO classification of tumours of the central nervous system. Acta Neuropathol. 114, 97–109 (2007).
2. Kool, M. et al. Molecular subgroups of medulloblastoma: an international meta-analysis of transcriptome, genetic aberrations, and clinical data of WNT, SHH, Group 3, and Group 4 medulloblastomas. Acta Neuropathol. 123, 473–484 (2012).
3. Taylor, M. D. et al. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol. 123, 465–472 (2012).
4. Clifford, S. C. et al. Wnt/Wingless pathway activation and chromosome 6 loss characterise a distinct molecular sub-group of medulloblastomas associated with a favourable prognosis. Cell Cycle 5, 2666–2670 (2006).
5. Northcott, P. A. et al. Medulloblastoma comprises four distinct molecular variants. J. Clin. Oncol. 29, 1408–1414 (2011).
6. Rausch, T. et al. Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59–71 (2012).
7. Rello-Varona, S. et al. Preferential killing of tetraploid tumor cells by targeting the mitotic kinesin Eg5. Cell Cycle 8, 1030–1035 (2009).
8. Vitale, I. et al. Inhibition of Chk1 kills tetraploid tumor cells through a p53-dependent pathway. PLoS ONE 2, e1337 (2007).
19. Han, S. et al. Nuclear envelope phosphatase 1-regulatory subunit 1 (formerly TMEM188) is the metazoan Spo7p ortholog and functions in the lipin activation pathway. J. Biol. Chem. 287, 3123–3137 (2012).
20. Kim, Y. et al. A conserved phosphatase cascade that regulates nuclear membrane biogenesis. Proc. Natl Acad. Sci. USA 104, 6596–6601 (2007).
21. Northcott, P. A. et al. Multiple recurrent genetic events converge on control of histone lysine methylation in medulloblastoma. Nature Genet. 41, 465–472 (2009).
22. Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40 (2011).
23. Gilbertson, R. J., Perry, R. H., Kelly, P. J., Pearson, A. D. J. & Lunec, J. Prognostic significance of HER2 and HER4 coexpression in childhood medulloblastoma. Cancer Res. 57, 3272–3280 (1997).
24. Hevner, R. F. et al. Tbr1 regulates differentiation of the preplate and layer 6. Neuron 29, 353–366 (2001).
25. Englund, C. et al. Pax6, Tbr2, and Tbr1 are expressed sequentially by radial glia, intermediate progenitor cells, and postmitotic neurons in developing neocortex. J. Neurosci. 25, 247–251 (2005).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank GATC Biotech AG for sequencing services. For technical support and expertise we thank: B. Haase, D. Pavlinic, B. Baying, M. Wahlers, R. Lück, I. Kutschera, K. Schlangen, M. Metsger, K. Schulz, A. Nürnberger, A. Kovacsovics, M. Linser, J. C. Lindsey, S. Bailey, D. M. Pearson, the EMBL Genomics Core Facility, the EMBL High-performance Computing Core Facility and the DKFZ Genomics and Proteomics Core Facility. This work was principally supported by the PedBrain Tumor Project contributing to the International Cancer Genome Consortium (ICGC PedBrain Tumor Project, http://www.pedbraintumor.org/), funded by German Cancer Aid (109252) and the German Federal Ministry of Education and Research (BMBF, NGFN plus #01GS0883). Additional support came from the German Cancer Research Center–Heidelberg Center for Personalized Oncology (DKFZ-HIPO), the Max Planck Society, the Pediatric Brain Tumor Foundation, the Italian Neuroblastoma Foundation and the Samantha Dickson Brain Tumour Trust. This study included samples provided by the UK Children's Cancer and Leukaemia Group (CCLG) as part of CCLG-approved biological study BS-2007-04.

Author Contributions D.T.W.J., M.Su., A.M.S., H.-J.W., S.B., S.P., H.C., E.P., L.S., A.W., S.H., T.T., B.R., C.C.B., M.Sch., C.v.K., V.B., R.V., S.Wo., S.Wi. and J.F. performed and/or coordinated experimental work. N. Jäger, D.T.W.J., M.K., T.Z., B.H., M.Su., T.J.P., V.Ho., T.R., H.-J.W., J.W., M.A., V.Am., M.Z., Q.W., B.L., V.Ast., C.L., J.E., R.K., P.v.S., J.K., D.Sh., M.J.B., R.B.R. and P.A.N. performed data analysis. Y.-J.C., M.Ry., M.Re., S.C., G.P.T., U.S., V.Ha., N.G., Y.-J.K., C.M., W.R., A.U., C.H.-M., T.M., A.E.K., A.v.D., O.W., E.M., J.R., M.E., M.U.S., M.C.F., M.H., N. Jabado, S.R., A.O.v.B., D.W., S.C.C., M.G.M., V.P.C., W.S., G.R., M.D.T. and A.K. collected data and provided patient materials. D.T.W.J., N. Jäger, D.St., M.K., V.Ho., H.W., R.E., S.M.P. and P.L. prepared the initial manuscript and figures. U.D.W., H.L., B.B., G.R., M.M., S.L.P., M.-L.Y., J.O.K., R.E., A.K., S.M.P. and P.L. provided project leadership. All authors contributed to the final manuscript.

Author Information Short-read sequencing data have been deposited at the European Genome-phenome Archive (EGA, http://www.ebi.ac.uk/ega/) hosted by the EBI, under accession number EGAS00001000215. Reprints and permissions information is available at www.nature.com/reprints. This paper is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence, and is freely available to all readers at www.nature.com/nature. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to R.E. ([email protected]), S.M.P ([email protected]) or P.L. ([email protected]).

David T. W. Jones*, Natalie Jäger*, Marcel Kool, Thomas Zichner, Barbara Hutter, Marc Sultan, Yoon-Jae Cho, Trevor J.
Pugh , Volker Hovestadt , Adrian M. Stu ¨tz , 6 4 7 3 5 9. Parsons, D. W. et al. The genetic landscape of the childhood cancer Tobias Rausch , Hans-Jo ¨rg Warnatz , Marina Ryzhova , Sebastian Bender , Dominik 4 1 3 8 medulloblastoma. Science 331, 435–439 (2011). Sturm , Sabrina Pleier , Huriye Cin , Elke Pfaff , Laura Sieber , Andrea Wittmann , 1 1 1 1 1 1 1 1 1 10. Stransky, N. et al. The mutational landscape of head and neck squamous cell Marc Remke , Hendrik Witt 1,9 , Sonja Hutter , Theophilos Tzaridis , Joachim 3 4 carcinoma. Science 333, 1157–1160 (2011). Weischenfeldt , Benjamin Raeder , Meryem Avci , Vyacheslav Amstislavskiy , Marc 3 4 11. Wang, L. et al. SF3B1 and other novel cancer genes in chronic lymphocytic Zapatka , Ursula D. Weber , Qi Wang ,Ba ¨rbel Lasitschka , Cynthia C. Bartholomae , 7 11 7 2 10 12 12 11 12 11 leukemia. N. Engl. J. Med. 365, 2497–2506 (2011). ManfredSchmidt , Christofvon Kalle , VolkerAst , Chris Lawerenz ,Ju ¨rgenEils , 2 13 14 14 14 12. Schwartzentruber, J. et al. Driver mutations in histone H3.3 and chromatin Rolf Kabbe , Vladimir Benes , Peter van Sluis , Jan Koster , Richard Volckmann , 16 15 17 16 remodelling genes in paediatric glioblastoma. Nature 482, 226–231 (2012). David Shih , Matthew J. Betts , Robert B. Russell , Simona Coco , Gian Paolo 18 20 21 19 17 13. Kool, M. et al. Integrated genomics identifies five medulloblastoma subtypes with Tonini , Ulrich Schu ¨ller , Volkmar Hans , Norbert Graf , Yoo-Jin Kim , Camelia 22 23 22 distinct genetic profiles, pathway signatures and clinicopathological features. Monoranu , Wolfgang Roggendorf , Andreas Unterberg , Christel 9 23 PLoS ONE 3, e3088 (2008). Herold-Mende , Till Milde 9,24 , Andreas E. Kulozik , Andreas von Deimling 25,26 , Olaf 27 29 28 14. Pfister, S. et al. Outcome prediction in pediatric medulloblastoma based on DNA Witt 9,24 , Eberhard Maass , Jochen Ro ¨ssler , Martin Ebinger , Martin U. 
32 33 30 31 copy-number aberrations of chromosomes 6q and 17q and the MYC and MYCN Schuhmann , Michael C. Fru ¨hwald , Martin Hasselblatt , Nada Jabado , Stefan 35 34 34 35 loci. J. Clin. Oncol. 27, 1627–1636 (2009). Rutkowski , Andre ´ O. von Bueren , Dan Williamson , Steven C. Clifford , Martin G. 4 10 37 15. Thompson, M. C. et al. Genomics identifies medulloblastoma subgroups that are McCabe 36,37 , V. Peter Collins , Stephan Wolf , Stefan Wiemann 10,38 , Hans Lehrach , 2 40 40 39 enriched for specific genetic alterations. J. Clin. Oncol. 24, 1924–1931 (2006). Benedikt Brors , Wolfram Scheurlen ,Jo ¨rg Felsberg , Guido Reifenberger , Paul A. 41 15 16. Pietsch, T. et al. Medulloblastomas of the desmoplastic variant carry mutations of Northcott , Michael D. Taylor , Matthew Meyerson 6,42 , Scott L. Pomeroy 6,43 , 25,26 4 3 2,44,45 the humanhomologue of Drosophila patched. Cancer Res. 57, 2085–2088 (1997). Marie-Laure Yaspo , Jan O. Korbel , Andrey Korshunov , Roland Eils *, Stefan 7 17. van Haaften, G. et al. Somatic mutations of the histone H3K27 demethylase gene M. Pfister 1,9 * & Peter Lichter * UTX in human cancer. Nature Genet. 41, 521–523 (2009). 18. Satow,R., Kurisaki, A.,Chan, T.C., Hamazaki,T.S. &Asashima,M.Dullardpromotes 1 Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), Im 2 degradation and dephosphorylation of BMP receptors and is required for neural Neuenheimer Feld 280, Heidelberg 69120, Germany. Division of Theoretical induction. Dev. Cell 11, 763–774 (2006). Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 10 4 | NA TU RE | V O L 488 | 2 A U GU ST 20 12 ©2012 Macmillan Publishers Limited. All rights reserved
Heidelberg 69120, Germany. ³European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, Heidelberg 69117, Germany. ⁴Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, Berlin 14195, Germany. ⁵Division of Child Neurology, Stanford University, 750 Welch Road, Palo Alto, California 94304, USA. ⁶Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA. ⁷Division of Molecular Genetics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, Heidelberg 69120, Germany. ⁸Department of Neuropathology, NN Burdenko Neurosurgical Institute, 4th Tverskaya-Yamskaya 16, Moscow 125047, Russia. ⁹Department of Pediatric Oncology, Hematology & Immunology, Heidelberg University Hospital, Im Neuenheimer Feld 430, Heidelberg 69120, Germany. ¹⁰Genomics and Proteomics Core Facility, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, Heidelberg 69120, Germany. ¹¹Division of Translational Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, Heidelberg 69120, Germany. ¹²Data Management Facility, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, Heidelberg 69120, Germany. ¹³Genomics Core Facility, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, Heidelberg 69117, Germany. ¹⁴Department of Oncogenomics, AMC, University of Amsterdam, Meibergdreef 9, Amsterdam 1105 AZ, Netherlands. ¹⁵The Arthur and Sonia Labatt Brain Tumor Research Centre, Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada. ¹⁶Cell Networks Cluster of Excellence, University of Heidelberg, Heidelberg 69120, Germany. ¹⁷Department of Advanced Diagnostic Technologies, IRCCS Azienda Ospedaliera Universitaria San Martino - IST Istituto Nazionale per la Ricerca sul Cancro, L.go R. Benzi, 10, Genoa 16132, Italy. ¹⁸Center for Neuropathology and Prion Research, University of Munich, Feodor-Lynen-Strasse 23, Munich 81377, Germany. ¹⁹Institute for Neuropathology, Evangelisches Krankenhaus, Remterweg 2, Bielefeld 33617, Germany. ²⁰Department of Paediatric Oncology and Haematology, Saarland University Hospital, Homburg 66421, Germany. ²¹Institute for Pathology, Saarland University Hospital, Kirrberger Strasse, Homburg 66424, Germany. ²²Department of Neuropathology, Institute of Pathology, Würzburg University, Josef-Schneider Strasse 2, Würzburg 97080, Germany. ²³Department of Neurosurgery, Heidelberg University Hospital, Im Neuenheimer Feld 400, Heidelberg 69120, Germany. ²⁴Clinical Cooperation Unit Pediatric Oncology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, Heidelberg 69120, Germany. ²⁵Department of Neuropathology, University of Heidelberg, Im Neuenheimer Feld 220, Heidelberg 69120, Germany. ²⁶Clinical Cooperation Unit Neuropathology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 220-221, Heidelberg 69120, Germany. ²⁷Department of Pediatric Oncology, Hematology & Immunology, Klinikum Stuttgart Olgahospital, Bismarckstrasse 8, Stuttgart 70176, Germany. ²⁸Department of Paediatric Haematology and Oncology, University Hospital Freiburg, Mathildenstrasse 1, Freiburg 79106, Germany. ²⁹Department of Hematology and Oncology, Children's University Hospital, Hoppe-Seyler Strasse 1, Tübingen 72076, Germany. ³⁰Department of Neurosurgery, University Hospital, Hoppe-Seyler Strasse 3, Tübingen 72076, Germany. ³¹Children's Hospital Augsburg, Stenglinstrasse 2, Augsburg 86156, Germany. ³²Institute of Neuropathology, University Hospital Münster, Albert-Schweitzer-Campus 1, Münster 48149, Germany. ³³Departments of Pediatrics and Human Genetics, McGill University and the McGill University Health Center Research Institute, Montreal, Quebec H3Z 2Z3, Canada. ³⁴Department of Paediatric Haematology and Oncology, University Medical Center Hamburg-Eppendorf, Martinistrasse 52, Hamburg 20246, Germany. ³⁵Northern Institute for Cancer Research, Newcastle University, Royal Victoria Infirmary, Newcastle-upon-Tyne, NE1 4LP, UK. ³⁶School of Cancer and Enabling Sciences, University of Manchester, Manchester Academic Health Science Centre, Manchester, M13 9PL, UK. ³⁷Division of Molecular Histopathology, Department of Pathology, University of Cambridge, Cambridge CB2 0QQ, UK. ³⁸Division of Molecular Genome Analysis, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, Heidelberg 69120, Germany. ³⁹Cnopf'sche Kinderklinik, Nürnberg Children's Hospital, St-Johannis-Mühlgasse 19, Nürnberg 90419, Germany. ⁴⁰Department of Neuropathology, Heinrich-Heine-University Düsseldorf, Moorenstrasse 5, Düsseldorf 40225, Germany. ⁴¹Division of Neurosurgery and The Arthur and Sonia Labatt Brain Tumour Research Centre, Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada. ⁴²Dana Farber Cancer Institute, 450 Brookline Avenue, Boston, Massachusetts 02215, USA. ⁴³Children's Hospital Boston, 300 Longwood Avenue, Boston, Massachusetts 02115, USA. ⁴⁴Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg, Heidelberg 69120, Germany. ⁴⁵Bioquant Center, University of Heidelberg, Im Neuenheimer Feld 267, Heidelberg 69120, Germany.

*These authors contributed equally to this work.
LETTER doi:10.1038/nature11329

Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations

Trevor J. Pugh¹,²,³, Shyamal Dilhan Weeraratne³,⁴, Tenley C. Archer³,⁴, Daniel A. Pomeranz Krummel⁵, Daniel Auclair¹, James Bochicchio¹, Mauricio O. Carneiro¹, Scott L. Carter¹, Kristian Cibulskis¹, Rachel L. Erlich¹, Heidi Greulich¹,²,³, Michael S. Lawrence¹, Niall J. Lennon¹, Aaron McKenna¹, James Meldrim¹, Alex H. Ramos¹,²,³, Michael G. Ross¹, Carsten Russ¹, Erica Shefler¹, Andrey Sivachenko¹, Brian Sogoloff¹, Petar Stojanov¹, Pablo Tamayo¹, Jill P. Mesirov¹, Vladimir Amani³,⁴, Natalia Teider³,⁴, Soma Sengupta³,⁴, Jessica Pierre Francois³,⁴, Paul A. Northcott⁶, Michael D. Taylor⁶, Furong Yu⁷, Gerald R. Crabtree⁷,⁸, Amanda G. Kautzman⁷, Stacey B. Gabriel¹, Gad Getz¹, Natalie Jäger⁹, David T. W. Jones⁹, Peter Lichter⁹, Stefan M. Pfister⁹, Thomas M. Roberts²,³, Matthew Meyerson¹,²,³,¹⁰, Scott L. Pomeroy¹,³,⁴ & Yoon-Jae Cho¹,³,⁴,⁷

¹Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA. ²Center for Cancer Genome Discovery, Departments of Medical Oncology and of Biological Chemistry and Molecular Pharmacology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA. ³Harvard Medical School, Boston, Massachusetts 02115, USA. ⁴Department of Neurology, Children's Hospital Boston, Boston, Massachusetts 02115, USA. ⁵Brandeis University, Waltham, Massachusetts 02453, USA. ⁶The Hospital for Sick Children, Toronto, Ontario M5G 1X8, Canada. ⁷Departments of Neurology and Neurosurgery, Stanford University School of Medicine, Stanford, California 94305, USA. ⁸Howard Hughes Medical Institute at Stanford University, Stanford, California 94305, USA. ⁹German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany. ¹⁰Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts 02115, USA.

Medulloblastomas are the most common malignant brain tumours in children¹. Identifying and understanding the genetic events that drive these tumours is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma on the basis of transcriptional and copy number profiles²⁻⁵. Here we use whole-exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas have low mutation rates consistent with other paediatric tumours, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53. Recurrent somatic mutations were newly identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR and LDB1. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant, but not wild-type, β-catenin. Together, our study reveals the alteration of WNT, hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic β-catenin signalling in medulloblastoma.

Medulloblastomas are aggressive tumours of primitive neuroectodermal origin. More than one third of patients diagnosed with medulloblastoma succumb to their disease within 5 years⁶ and surviving patients often have significant long-term adverse effects from current therapies. Identifying the underlying genetic events responsible for medulloblastomas can help guide the development of more effective therapies and refine the selection of currently available chemotherapy and radiotherapy. Recent efforts profiling transcriptional and DNA copy number changes in medulloblastoma have provided insights into the biological processes involved in these tumours and have underscored the molecular heterogeneity of this disease²⁻⁴. Based on these data, four broad subgroups have been established, known according to a consensus nomenclature as SHH, WNT, Group 3 and Group 4 (ref. 5).

The first genome-scale sequencing of protein coding regions in medulloblastoma was reported recently⁷. Altered genes encoding histone modification proteins were identified in 20% of cases, most notably MLL2 and MLL3 (ref. 7). This initial survey was limited by a small discovery sample size (22 patients), lack of subtype-specific analysis, and use of Sanger sequencing technology insensitive to variants present at low allelic fraction. Here we survey coding somatic mutations at deeper coverage in a larger cohort of 92 medulloblastoma/normal pairs and assess these mutations in the context of specific molecular subtypes (Supplementary Table 1).

In total, 1,908 mutations were detected within 1,671 out of 18,863 genes sequenced to a median of 106× coverage (Supplementary Table 2). Confirmation of 20 candidate mutations in selected genes (CTNNB1, DDX3X, SMARCA4, TP53 and CTDNEP1) was performed by amplification of 48 exons using a microfluidic PCR device (Fluidigm) followed by single-molecule real-time sequencing (SMRT, Pacific Biosciences) (Supplementary Information). Sequence data were unavailable for one DDX3X mutation because of poor PCR amplification from the sample. All remaining 19 mutations were confirmed by this orthogonal method (median 73 redundant sub-reads, range 3–287, Supplementary Fig. 1).

A median of 16 somatic mutations (12 non-silent, 4 silent) per tumour was identified, corresponding to a mutation rate of 0.35 non-silent mutations per megabase of callable sequence, less than most adult solid tumours and consistent with results from ref. 7. Six of the twelve most frequently mutated tumours were from the oldest patients (16–31 years at diagnosis), consistent with increased mutation frequency in adult versus childhood medulloblastomas (P = 7.7 × 10⁻⁵, Wilcoxon rank-sum test, Supplementary Fig. 2).

To identify genes mutated at statistically significant frequencies across our cohort, we used the MutSig algorithm⁸, which takes into account gene size, sample-specific mutation rate, non-silent to silent mutation ratios, clustering within genes, and base conservation across species. In our cohort of 92 samples, we identified 12 significantly mutated genes (q < 0.1, Table 1 and Supplementary Table 3). Strikingly, these genes were not mutated in c5 (Group 3) and c4 (Group 4) tumours with extensive somatic copy number alteration (Fig. 1), indicating that these subtypes are driven primarily by structural variation, rather than base mutation. Not unexpectedly, CTNNB1 (β-catenin) and PTCH1 were the two most significantly mutated genes (see Table 1 and Fig. 1). Point mutations of CTNNB1 in combination with loss of chromosome 6 were found in all WNT subgroup tumours and were concurrent with several other recurrently mutated genes, namely CSNK2B, DDX3X, TP53 and SMARCA4. Mutations involving PTCH1 occurred exclusively in SHH subgroup tumours and mutations of genes associated with the hedgehog pathway were also restricted to this subgroup (P < 0.0001, Fisher's exact test). All but one of the tumours with PTCH1 mutations had somatic loss of 9q, resulting in hemizygosity for the mutant allele. The remaining tumour had apparent copy neutral loss-of-heterozygosity of 9q22. Other
somatic mutations of hedgehog pathway members include a splice site mutation in SUFU, an in-frame deletion in WNT6, and missense mutations in GLI2, SMO, PRKACA, WNT2 and WNT2B.

Table 1 | Genes mutated at a statistically significant frequency in 92 medulloblastomas.

| Gene | Description | Mutations | Patients | Unique sites | Silent | Missense | Indel or null | Double null | q |
|---|---|---|---|---|---|---|---|---|---|
| CTNNB1 | β-catenin | 6 | 6 | 4 | 0 | 6 | 0 | 0 | <1.8 × 10⁻¹¹ |
| PTCH1 | Patched homologue 1 (Drosophila) | 7 | 7 | 7 | 0 | 0 | 7 | 0 | 4.0 × 10⁻⁹ |
| MLL2 | Myeloid/lymphoid or mixed-lineage leukaemia 2 | 10 | 8 | 10 | 0 | 2 | 4 | 4 | 4.0 × 10⁻⁹ |
| DDX3X | DEAD box polypeptide 3, X-linked | 7 | 7 | 7 | 0 | 7 | 0 | 0 | 2.3 × 10⁻⁸ |
| GPS2 | G protein pathway suppressor 2 | 3 | 3 | 3 | 0 | 1 | 2 | 0 | 1.2 × 10⁻⁴ |
| TP53 | Tumour protein p53 | 3 | 3 | 3 | 0 | 3 | 0 | 0 | 0.039 |
| KDM6A | UTX, lysine (K)-specific demethylase 6A | 3 | 3 | 3 | 0 | 2 | 1 | 0 | 0.042 |
| BCOR | BCL6 co-repressor | 3 | 3 | 3 | 0 | 0 | 3 | 0 | 0.046 |
| SMARCA4 | ATP-dependent helicase | 4 | 4 | 3 | 0 | 4 | 0 | 0 | 0.046 |
| LDB1 | LIM domain binding 1 | 2 | 2 | 2 | 0 | 1 | 1 | 0 | 0.047 |
| CTDNEP1 | CTD nuclear envelope phosphatase 1 | 2 | 2 | 2 | 0 | 0 | 2 | 0 | 0.047 |
| CSNK2B | Casein kinase 2, β polypeptide | 2 | 2 | 2 | 0 | 2 | 0 | 0 | 0.071 |

Null, nonsense, frameshift or splice-site mutations; double null, null mutations co-occurring in a single tumour; q, q-value, false discovery rate (Benjamini–Hochberg procedure). See Supplementary Table 3 for further statistics and subtype analysis.

Two patients with SHH subgroup tumours had germline variants in PTCH1, one with somatic loss of 9q resulting in hemizygosity for a loss-of-function germline allele (MD-085, c.3030delC, p.Asn1011Thrfs*38), and the other with a substitution previously reported in patients with holoprosencephaly (MD-286, p.T1052M, ref. 9). Two additional cases (MD-097 and MD-335) had loss-of-function variants in SUFU (1 frameshift deletion and 1 nonsense) that began as heterozygotes in the germline and became hemizygous in the tumour, due to somatic loss of chromosome 10 in one case and copy neutral loss-of-heterozygosity in the other.

Figure 1 | Demographic characteristics, molecular subtypes and selected copy number alterations and somatic mutations across 92 medulloblastoma cases. Data tracks describing 92 medulloblastoma cases. Identifier, unique name used to denote each case. Identifiers also link samples to those analysed in ref. 2. Sex, males in blue, females in pink. Age, years of age at diagnosis binned as infants, children or adults. Histology, pathology review of primary tissue specimen. Subtypes, based on copy number profiles derived from sequence or microarray data. Consensus subtypes from refs 2 and 5 as published. Copy number alterations, selected copy number alterations used to assign tumours to subtypes. Blue, losses; red, gains. Somatic mutations, gene names (HUGO symbols) grouped by functional category. MutSig gene names are in bold. Black, missense mutations; orange, nonsense/splice site/indel mutations; purple, silent mutations; green, germline variants.
Figure 1 tracks and legend: Age (0–5 years, 6–15 years, 16+ years, unknown); Histology (classic, large cell/anaplastic, nodular/desmoplastic, unknown); Subtypes from ref. 5 (WNT, SHH, Group 3, Group 4) and ref. 2 (c6, c3, c1, c5, c4, c2). Somatic mutation rows by category: Wnt pathway (DDX3X, CTNNB1, CSNK2B); Hh pathway (PTCH1, SUFU, SMO, GLI2); histone methyltransferases (MLL2, MLL3, KDM6A); chromatin remodelling complexes (BCOR, GPS2, LDB1, NCOR2, SMARCA4); other (CTDNEP1, TP53).

MLL2 was also subject to recurrent inactivating mutations, consistent with findings from ref. 7 and providing further evidence for dysregulated histone modification in medulloblastoma. Indeed, six
of the twelve most significantly mutated genes are involved in histone modification and/or related chromatin remodelling complexes (MLL2, GPS2, KDM6A, BCOR, SMARCA4 and LDB1; see Table 1). As a gene set, histone methyltransferases (HMTs) were enriched for somatic mutation with 21 tumours having apparent, predominantly loss-of-function, HMT mutations (q = 5.8 × 10⁻⁹; Fig. 2 and Supplementary Table 4).

Subtype-specific MutSig analysis identified additional significant mutations of histone-modifying genes, MLL3 and HDAC2, in Group 4 tumours along with KDM6A mutations (q = 0.039 and 0.066, see Supplementary Table 3). Mutations in KDM6A, interestingly, occurred exclusively in tumours with an i17q as the sole autosomal alteration (P = 0.0023, Fisher's exact test) with the one female case with KDM6A mutation also having loss of a chromosome X. Notably, the two 'i17q only' tumours without KDM6A mutations had other histone-modifying enzymes mutated, namely THUMPD3, ZMYM3 and MLL3, perhaps suggesting a distinct biology for tumours with this karyotype.

Mutations in several genes encoding components of the nuclear co-repressor (N-CoR) complex were observed at a statistically significant frequency: BCOR in 3 tumours, GPS2 in 3 tumours, and LDB1 in 2 tumours. BCOR mutations have recently been reported at high frequency in retinoblastoma¹⁰ and in 'copy-neutral' acute myelogenous leukaemia¹¹. BCOR is located on the X-chromosome and two hemizygous frameshift mutants were found in tumours from males (allele fractions 0.90 and 0.92). A third nonsense mutation was also found in a male but at low allelic fraction (0.12), indicating a subclonal event. Two out of three BCOR mutations occurred in SHH subgroup tumours. LDB1 missense and nonsense mutations were found in two additional SHH tumours, both appearing hemizygous due to loss of 10q and complete chromosome 10 loss, respectively (allele fractions 0.81 and 0.78). Both BCOR and LDB1 promote assembly of the repressive N-CoR complex¹² and harbour apparent loss of function mutations. GPS2, which encodes a critical subunit of the N-CoR complex, a repressor of JNK/MAPK signalling through partnership with histone deacetylases¹², was mutated in two Group 3 tumours. The GPS2 mutations cluster within amino acids 53–90, the domain critical for heterodimerization with NCOR2 (also known as SMRT) and interacting with a TBL1 amino-terminal domain tetramer to assemble the N-CoR repression complex¹². Finally, an additional nonsense mutation in NCOR2 was identified in a single SHH subgroup tumour, underscoring the central role of N-CoR dysregulation in medulloblastoma development and particularly within the SHH subgroup.

Several genes encoding subunits of the SWI/SNF-like chromatin-remodelling complex were also mutated in our cohort, including significant recurrent mutations of SMARCA4 (Brg/BAF190), which encodes a DNA helicase with ATPase activity¹³ and has been reported to be mutated in lung, ovarian, and pancreatic cancers as well as medulloblastoma¹⁴,¹⁵. In our cohort, SMARCA4 (Brg/BAF190) mutations clustered in helicase domains and occurred in three Group 3 tumours (significant within the c1 subtype, q = 0.019), and one WNT tumour. In addition, mutations were found in the alternative ATPase subunit SMARCD2 (Brm) (missense at a highly conserved residue) and two other members of the SWI/SNF complex, ARID1B (BAF250b) (2 base pairs (bp) frameshift deletion) and SMARCC2 (BAF170) (splice site). These were all apparent loss-of-function mutations and occurred in SHH tumours. Thus, it seems that disruption of this complex is frequent across medulloblastomas.

Figure 2 | Location of mutations in histone methyltransferases, RNA helicases and N-CoR complex-associated genes. Location of somatic mutations on linear protein domain models of genes from sets frequently mutated in medulloblastoma. All domain annotations are from UniProt and InterPro annotations. Diagrams were constructed using Domain Graph (DOG)²⁸, version 2.0. a, Histone methyltransferase domains: red, SET; green, coiled-coil; blue, zinc-finger; cyan, other. b, N-CoR complex-associated domains: purple, anti-parallel coiled-coil domains required for GPS2–NCOR2 (SMRT) interaction¹²; yellow, other interaction domains as labelled (SANT domains bind DNA, CoRNR domains bind nuclear receptors, ANK repeats mediate a diversity of protein–protein interactions, and LIM-binding domains bind a common protein structural motif). c, RNA helicase domains: cyan, helicase and helicase-associated (InterPro); red, RNA-binding and RNA polymerase sigma factor (InterPro); blue, ATP binding site; green, DEAD or DExH box motif. See Supplementary Table 1 for UniProt protein model identifiers.
Figure 2 panels: a, Histone methyltransferases: MLL2 (5,537 residues), MLL3 (4,911 residues), MLL4 (2,715 residues), KDM6A (1,401 residues). b, N-CoR complex: NCOR2 (2,525 residues), GPS2 (327 residues), BCOR (1,755 residues), LDB1 (375 residues). c, RNA helicases: DDX3X (661 residues), DHX9 (1,270 residues), DHX32 (743 residues), DHX57 (1,386 residues), FANCM (2,048 residues), SKIV2L (1,246 residues), SETX (2,677 residues).

New and hemizygous mutations were found in CTDNEP1 (previously known as DULLARD), a phosphatase with roles in Xenopus neural development through regulation of BMP receptors¹⁶, and as a direct regulator of LIPIN, an integral component of the mTOR complex¹⁷. CTDNEP1 mutations were found in two Group 3 tumours (significant within the subtype, q = 0.0087), a 2-bp frameshift deletion and a substitution disruptive of a splice site. Both tumours have i17q chromosomes, resulting in loss of the wild-type allele at 17p13.

Mutations in DDX3X, an ATP-dependent RNA helicase with functions in transcription, splicing, RNA transport and translation¹⁸, were found in seven tumours, including half of the WNT pathway tumours (P = 0.005, Fisher's exact test) and several SHH subgroup tumours.
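As a rough illustration of how an enrichment P value of the kind quoted for DDX3X in WNT tumours can be obtained, the sketch below runs Fisher's exact test on a 2 × 2 table using only the Python standard library. The counts are assumptions reconstructed from the text rather than the authors' actual contingency table: 6 WNT tumours (CTNNB1 is mutated in 6 patients and in all WNT tumours), 3 of them carrying DDX3X mutations ("half"), 7 DDX3X-mutated tumours overall, and a cohort of 92. The function name `fisher_exact_2x2` is hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (one_sided_p, two_sided_p), where the one-sided value tests for
    enrichment in the top-left cell and the two-sided value sums every table
    that is no more probable than the observed one.
    """
    row1 = a + b          # e.g. number of WNT tumours
    col1 = a + c          # e.g. number of DDX3X-mutated tumours
    n = a + b + c + d     # cohort size
    denom = comb(n, row1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    p_obs = prob(a)
    one_sided = sum(prob(x) for x in range(a, hi + 1))
    two_sided = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return one_sided, two_sided


# Assumed counts (inferred from the text, not taken from the paper's tables):
# rows = WNT / non-WNT tumours, columns = DDX3X mutated / not mutated.
one_sided, two_sided = fisher_exact_2x2(3, 3, 4, 82)
print(f"one-sided P = {one_sided:.4f}, two-sided P = {two_sided:.4f}")
# Both values come out at roughly 0.005 for these assumed counts.
```

With these assumed counts the result is close to the reported P = 0.005, which suggests the reconstruction is at least plausible; a different split of the seven DDX3X-mutated tumours would give a different value.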
DDX3X mutations have recently been reported at low frequency in five other tumour types (Catalogue of Somatic Mutations in Cancer, COSMIC; ref. 15) but the significance of these mutations for DDX3X function remains unclear. To understand the consequence of observed point mutations on the physical structure of DDX3X, we mapped the mutations onto the previously reported crystal structure of DDX3X (ref. 19) and its orthologue DDX4 (also known as VASA, ref. 20) (Fig. 3a; Supplementary Fig. 3 and Supplementary Table 5). The mutations seem to cluster in two structural domains, a helicase ATP-binding domain (residues 211–403) and a helicase carboxy-terminal domain (residues 414–575). The location of these mutations indicates that they may alter DDX3X–RNA interaction (Fig. 3a and Supplementary Table 5).

As half of the β-catenin mutated tumours contained concurrent DDX3X mutations, we investigated whether DDX3X could enhance the ability of β-catenin to transactivate a TCF4-luciferase reporter (TOPflash) and if DDX3X/β-catenin co-expression had a measurable effect on cell viability/proliferation. In combination with wild-type β-catenin, neither wild-type nor mutant DDX3X alone significantly transactivated the TOPflash reporter. However, in combination with mutant β-catenin (S33Y substitution), the majority of DDX3X point mutants in our cohort potentiated reporter activity (P < 0.05, Fig. 3b). This potentiation was also apparent in cell viability assays in both HeLa (data not shown) and D425 medulloblastoma cell lines (P < 0.05, Fig. 3c).

Given the apparent importance of DDX3X mutations in medulloblastoma, we searched the genes listed in the RNA Helicase Database (http://www.rnahelicase.org/) for low frequency mutations in medulloblastoma. We found five tumours with mutations within RNA helicase or RNA binding domains of DHX9, DHX32, DHX57, FANCM and SKIV2L (Fig. 2 and Supplementary Table 6). The missense mutations were at conserved residues and predicted to be deleterious by the software packages SIFT, AlignGVGD and PolyPhen2. In addition, a frameshift insertion in SETX occurs upstream of, and probably disrupts, its RNA helicase domain. Overall, 15% of medulloblastomas seem to have some disruption of RNA helicase activity.

In summary, we report a next-generation sequencing analysis of medulloblastoma, the most common malignant brain tumour in children. Our results reveal mutations in several known pathways such as histone methylation (MLL2 and others), sonic hedgehog (PTCH1, SUFU and others) and Wnt (CTNNB1 and others), and also previously unrecorded mutations in genes including DDX3X, BCOR, LDB1 and GPS2. Our preliminary functional studies implicate DDX3X as a candidate component of pathogenic WNT/β-catenin signalling. In a broader sense, DDX3X mutations have recently been reported in chronic lymphocytic leukaemia (ref. 21) and head and neck cancers (ref. 22), both of which have subsets of tumours with dysregulated WNT signalling. Studies investigating whether mutant DDX3X functions together with β-catenin in these contexts should provide additional insights into this multifaceted molecule and open potential avenues for novel therapies. Finally, the delineation of nuclear receptor co-repressor complex molecules as altered in medulloblastoma provides new insight into the pathogenesis of this deadly childhood disease.

[Figure 3, panels a–c. a, structural model labelled with RNA, residues 166–405, residues 406–582 and Mg-ATP analogue. b, domain map (ATP binding, Q motif, DEAD box, helicase, Gly/Ser-rich) above a bar graph of luciferase/Renilla relative light units. c, bar graph of relative proliferation. Conditions in b and c: DDX3X empty, WT, R276K, D354H, R376C, D506Y, R528H, R534H and P568L, each with WT β-catenin or S33Y β-catenin.]

Figure 3 | Functional consequence of DDX3X point mutations. a, Three-dimensional model of the two recA-like domains of human DDX3X in complex with single-stranded RNA and a Mg-ATP analogue. Displayed are the residues mutated in the amino-terminal recA-like domain (R276K, D354H, R376C) and C-terminal recA-like domain (D506Y, R528H, R534H, P568L). Colouring: light blue, DDX3X residues 166–405; dark blue, DDX3X residues 406–582; cyan, single-stranded RNA; magenta and green, Mg-ATP analogue. Molecular graphics images were produced using the University of California, San Francisco (UCSF) Chimera package (ref. 29; http://www.cgl.ucsf.edu/chimera). b, Mutant DDX3X potentiates mutant β-catenin transactivation of TOPflash promoter. Represented is relative luciferase activity in 293T cells co-transfected with TOPflash reporter, FOPflash control, and either wild-type or mutant DDX3X in combination with wild-type or mutant β-catenin. One-dimensional model of DDX3X displayed above bar graphs to illustrate the position of the mutations. WT, wild type. c, Cell viability assays of medulloblastoma D425 cells stably transduced with either wild-type or mutant DDX3X lentivirus in combination with either wild-type or mutant β-catenin lentivirus. For b and c, error bars depict the standard deviation of the mean from five replicate experiments performed for each condition. Student's t-tests were performed to evaluate significance of differences in TOPflash intensity or cell proliferation value distributions as follows: increases with DDX3X alone versus empty vector, increases with wild-type β-catenin versus DDX3X alone, increases with mutant β-catenin versus DDX3X alone, and increases with mutant β-catenin versus wild-type β-catenin.

METHODS SUMMARY

Informed consent was provided by families of medulloblastoma patients treated at Children's Hospital Boston, The Hospital for Sick Children, Toronto, Canada, and institutions contributing to the Children's Oncology Group/Cooperative Human Tissue Network, under approval and oversight by their respective Internal Review Boards. All tumours were obtained at the initial surgical resection and recurrent tumours were excluded from our analysis. Haematoxylin- and eosin-stained slides of tumour samples were reviewed by a pathologist to confirm the diagnosis of medulloblastoma, determine histological subtype when possible, and assess tumour purity. DNA was isolated from tumour specimens and matched peripheral blood as previously described (ref. 2). Exome sequencing of DNA from 92 tumour/normal pairs was performed using in-solution hybrid-capture of 193,094 exons from 18,863 micro RNA (miRNA)- and protein-coding genes, followed by sequencing of 76 bp paired-end reads using Illumina sequencing-by-synthesis technology (ref. 23). Reads were aligned to human genome build GRCh37 (ref. 24) using a Burrows–Wheeler aligner (BWA; ref. 25). The ~33-megabase target region was sequenced to 106× mean coverage in each sample (range 73–234). Gene expression data and copy number profiles (derived from SNP microarrays or sequence data) were used to assign each tumour to a subgroup using published criteria (ref. 2). Our cohort consisted of 6 WNT (c6), 23 SHH (c3), 33 Group 3 (12 c1, 21 c5), and 30 Group 4 (12 c2, 18 c4) tumours (see Supplementary Table 1 for case annotations). Mutations were detected using muTect, annotated using Oncotator (ref. 26), and manually reviewed using the Integrated Genomics Viewer (IGV; ref. 27). For validation, PCR on Access Array microfluidic chips (Fluidigm) was followed by single-molecule real-time sequencing (Pacific Biosciences) as per manufacturer's instructions. Sub-reads were extracted and assigned to samples using manufacturer's and custom software, and aligned to the hg19 (GRCh37) build of the human reference genome sequence using BWA-SW (ref. 25). Candidate mutations were confirmed by manual review using IGV (ref. 27; Supplementary Fig. 1). See Supplementary Information and http://www.broadinstitute.org/cancer/cga/ for complete descriptions of materials and methods.
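The sequencing figures quoted in the Methods Summary (76 bp reads, a ~33-megabase target, 106× mean coverage, range 73–234) are linked by simple arithmetic, sketched below. This is a back-of-the-envelope illustration only: it assumes every base of every read aligns on target, which real exome data never achieve, so the read counts it prints are lower bounds. The function names are hypothetical and are not part of any pipeline described in the paper.

```python
def mean_coverage(n_reads, read_length_bp, target_bp):
    """Mean depth if all sequenced bases fall on the target region."""
    return n_reads * read_length_bp / target_bp


def reads_for_coverage(coverage, read_length_bp, target_bp):
    """Minimum number of reads needed to reach a given mean depth."""
    return coverage * target_bp / read_length_bp


TARGET_BP = 33_000_000   # ~33-megabase target region (from the text)
READ_LENGTH_BP = 76      # 76 bp reads (from the text)

# Lowest, mean and highest per-sample coverage reported in the text.
for depth in (73, 106, 234):
    reads = reads_for_coverage(depth, READ_LENGTH_BP, TARGET_BP)
    print(f"{depth}x needs at least {reads / 1e6:.1f} million on-target reads")
```

Because the reads are paired-end, the number of read pairs is half the number of reads under the same assumption.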
Received 3 February; accepted 15 June 2012.
Published online 22 July 2012.

1. Central Brain Tumor Registry of the United States. Statistical Report: Primary Brain and Central Nervous System Tumors Diagnosed in the United States in 2004–2007 http://www.cbtrus.org/2011-NPCR-SEER/WEB-0407-Report-3-3-2011.pdf (CBTRUS, 2011).
2. Cho, Y.-J. et al. Integrative genomic analysis of medulloblastoma identifies a molecular subgroup that drives poor clinical outcome. J. Clin. Oncol. 29, 1424–1430 (2011).
3. Kool, M. et al. Integrated genomics identifies five medulloblastoma subtypes with distinct genetic profiles, pathway signatures and clinicopathological features. PLoS ONE 3, e3088 (2008).
4. Remke, M. et al. Adult medulloblastoma comprises three major molecular variants. J. Clin. Oncol. 29, 2717–2723 (2011).
5. Taylor, M. D. et al. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol. 123, 465–472 (2012).
6. Smoll, N. R. Relative survival of childhood and adult medulloblastomas and primitive neuroectodermal tumors (PNETs). Cancer 118, 1313–1322 (2012).
7. Parsons, D. W. et al. The genetic landscape of the childhood cancer medulloblastoma. Science 331, 435–439 (2011).
8. Getz, G. et al. Comment on “The consensus coding sequences of human breast and colorectal cancers.” Science 317, 1500 (2007).
9. Ming, J. E. et al. Mutations in PATCHED-1, the receptor for SONIC HEDGEHOG, are associated with holoprosencephaly. Hum. Genet. 110, 297–301 (2002).
10. Zhang, J. et al. A novel retinoblastoma therapy from genomic and epigenetic analyses. Nature 481, 329–334 (2012).
11. Grossmann, V. et al. Whole-exome sequencing identifies somatic mutations of BCOR in acute myeloid leukemia with normal karyotype. Blood 118, 6153–6163 (2011).
12. Oberoi, J. et al. Structural basis for the assembly of the SMRT/NCoR core transcriptional repression machinery. Nature Struct. Mol. Biol. 18, 177–184 (2011).
13. Baek, S. H. et al. Regulated subset of G1 growth-control genes in response to derepression by the Wnt pathway. Proc. Natl Acad. Sci. USA 100, 3245–3250 (2003).
14. Wilson, B. G. & Roberts, C. W. M. SWI/SNF nucleosome remodellers and cancer. Nature Rev. Cancer 11, 481–492 (2011).
15. Futreal, P. A. et al. A census of human cancer genes. Nature Rev. Cancer 4, 177–183 (2004).
16. Satow, R., Kurisaki, A., Chan, T.-c., Hamazaki, T. S. & Asashima, M. Dullard promotes degradation and dephosphorylation of BMP receptors and is required for neural induction. Dev. Cell 11, 763–774 (2006).
17. Peterson, T. R. et al. mTOR complex 1 regulates lipin 1 localization to control the SREBP pathway. Cell 146, 408–420 (2011).
18. Garbelli, A., Beermann, S., Di Cicco, G., Dietrich, U. & Maga, G. A motif unique to the human dead-box protein DDX3 is important for nucleic acid binding, ATP hydrolysis, RNA/DNA unwinding and HIV-1 replication. PLoS ONE 6, e19810 (2011).
19. Högbom, M. et al. Crystal structure of conserved domains 1 and 2 of the human DEAD-box helicase DDX3X in complex with the mononucleotide AMP. J. Mol. Biol. 372, 150–159 (2007).
20. Sengoku, T., Nureki, O., Nakamura, A., Kobayashi, S. & Yokoyama, S. Structural basis for RNA unwinding by the DEAD-box protein Drosophila Vasa. Cell 125, 287–300 (2006).
21. Wang, L. et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N. Engl. J. Med. 365, 2497–2506 (2011).
22. Stransky, N. et al. The mutational landscape of head and neck squamous cell carcinoma. Science 333, 1157–1160 (2011).
23. Bentley, D. R. et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59 (2008).
24. Genome Reference Consortium. Human Genome Overview http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/ (2012).
25. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
26. Ramos, A. et al. Oncotator http://www.broadinstitute.org/oncotator/ (2012).
27. Robinson, J. T. et al. Integrative genomics viewer. Nature Biotechnol. 29, 24–26 (2011).
28. Ren, J. et al. DOG 1.0: illustrator of protein domain structures. Cell Res. 19, 271–273 (2009).
29. Pettersen, E. F. et al. UCSF Chimera–a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements This work was supported by NIH grants NHGRI U54HG003067 to E. S. Lander (E.S., D.A., S.B.G., G.G., M.M.); R01CA109467 (S.L.P., J.P.M.); R01CA105607 (H.G., T.M.R., M.M., S.L.P.); P30 HD18655 (S.L.P.); R01 CA030002 and CA050661 (T.M.R.); R01 NS046789 (G.R.C.); R01 CA154480 (P.T.); R25NS070682 (S.S.) and R01CA148699 (M.D.T.); St. Baldrick's Foundation Scholar Award and the Beirne Faculty Scholar endowment and Center for Children's Brain Tumors at Stanford University (Y.-J.C.); German Cancer Aid (109252) and the BMBF ICGC-PedBrain project (N.J., D.T.W.J., P.L., S.M.P.); HHMI (G.R.C.); the Pediatric Brain Tumor Foundation (M.D.T.); Canadian Institutes of Health Research Fellowship (T.J.P.); Restracomp funding from the Hospital for Sick Children (P.A.N.); and the Mullarkey Research Fund (S.L.P.). We thank Children's Oncology Group and the Cooperative Human Tissue Network for providing tumour samples, the staff of the Broad Institute Biological Samples, Genome Sequencing and Genetic Analysis Platforms for their assistance in genomic processing of samples and generating the sequencing data used in this analysis, K. Keho and M. Brown at Pacific Biosciences for technical support with sample barcoding methods, and L. Gaffney of Broad Institute Communications for assistance with figure layout and design.

Author Contributions Y.-J.C., M.M. and S.L.P. conceived the project. Y.-J.C., T.J.P., M.M. and S.L.P. wrote the manuscript with input from co-authors. S.D.W., T.C.A., J.P.F., S.S., N.T., Y.-J.C., A.G.K. and F.Y. performed functional characterization studies. D.A.P.K. generated in silico structural modelling of DDX3X mutations. T.J.P. conducted the bioinformatic analysis, supported by S.L.C., P.S., K.C., M.S.L., A.M., A.H.R., A.S., H.G., P.T., J.P.M., N.J. and D.T.W.J.; D.A., E.S., S.B.G., and G.G. facilitated transfer, sequencing and analysis of samples. P.A.N. and M.D.T. provided tissues for analysis. Y.-J.C., J.P.F. and V.A. processed tumour and blood samples for study. G.R.C. generated reagents used in functional characterization studies. P.L., S.M.P. and T.M.R. assisted with interpretation of results. J.B., M.O.C., R.L.E., N.J.L., J.M., M.G.R., C.R. and B.S. performed microfluidic PCR and single-molecule real-time sequencing for validation analysis.

Author Information Sequence data used for this analysis are available in dbGaP under accession phs000504.v1.p1. Reprints and permissions information is available at www.nature.com/reprints. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to Y.-J.C. ([email protected]), M.M. ([email protected]) or S.L.P. ([email protected]).
LETTER doi:10.1038/nature11362

Targeting nuclear RNA for in vivo correction of myotonic dystrophy

Thurman M. Wheeler¹,², Andrew J. Leger³, Sanjay K. Pandey⁴, A. Robert MacLeod⁴, Masayuki Nakamori¹,², Seng H. Cheng³, Bruce M. Wentworth³, C. Frank Bennett⁴ & Charles A. Thornton¹,²

¹Department of Neurology, University of Rochester, 601 Elmwood Avenue, Rochester, New York, 14642, USA. ²Center for Neural Development and Disease, University of Rochester, 601 Elmwood Avenue, Rochester, New York, 14642, USA. ³Genzyme Corporation, 49 New York Avenue, Framingham, Massachusetts 01701, USA. ⁴Isis Pharmaceuticals, 2855 Gazelle Court, Carlsbad, California 92010, USA.

Antisense oligonucleotides (ASOs) hold promise for gene-specific knockdown in diseases that involve RNA or protein gain-of-function effects. In the hereditary degenerative disease myotonic dystrophy type 1 (DM1), transcripts from the mutant allele contain an expanded CUG repeat (refs 1–3) and are retained in the nucleus (refs 4, 5). The mutant RNA exerts a toxic gain-of-function effect (ref. 6), making it an appropriate target for therapeutic ASOs. However, despite improvements in ASO chemistry and design, systemic use of ASOs is limited because uptake in many tissues, including skeletal and cardiac muscle, is not sufficient to silence target messenger RNAs (refs 7, 8). Here we show that nuclear-retained transcripts containing expanded CUG (CUGexp) repeats are unusually sensitive to antisense silencing. In a transgenic mouse model of DM1, systemic administration of ASOs caused a rapid knockdown of CUGexp RNA in skeletal muscle, correcting the physiological, histopathologic and transcriptomic features of the disease. The effect was sustained for up to 1 year after treatment was discontinued. Systemically administered ASOs were also effective for muscle knockdown of Malat1, a long non-coding RNA (lncRNA) that is retained in the nucleus (ref. 9). These results provide a general strategy to correct RNA gain-of-function effects and to modulate the expression of expanded repeats, lncRNAs and other transcripts with prolonged nuclear residence.

Antisense silencing by the RNase H-dependent mechanism entails a three-step process of oligonucleotide hybridization to its cognate RNA, cleavage of the target by RNase H1 and exonuclease degradation of the cleavage fragments. The relative efficiency of this mechanism in the nucleus and cytoplasm is uncertain. Observations that ASOs shuttle from cytoplasm to nucleus (ref. 10) and that targeting intronic sequences (ref. 11) or nuclear RNA (ref. 12) can silence gene expression indicate that antisense is active in the nucleus. However, activity in the cytoplasm is less clear. Although RNase H1 is not restricted to the nucleus (ref. 13), recent studies indicate that the non-nuclear fraction is confined to mitochondria (ref. 14). This suggests that ASO–RNase H cleavage is mainly a nuclear process, whose potency could be maximized by targeting transcripts with long nuclear residence.

To test this idea, we used a transgenic mouse model of DM1. HSALR transgenic mice express CUGexp RNA at high levels in skeletal muscle. Human DM1 is caused by an expanded CTG repeat in the 3′ untranslated region (UTR) of dystrophia myotonica-protein kinase (DMPK) (ref. 3), whereas in HSALR mice the expanded repeat is in the 3′ UTR of a human skeletal actin (hACTA1) transgene (ref. 6). In both conditions the CUGexp transcripts are retained in nuclear foci, along with splicing factors in the muscleblind-like (MBNL) protein family. Muscleblind sequestration leads to misregulated alternative splicing and other changes of the muscle transcriptome (refs 15–17). The RNA toxicity was mitigated in mice by CAG-repeat morpholino oligomers that compete with MBNL proteins for CUGexp binding, without activating RNase H. However, this approach required direct injection into a single muscle, followed by in vivo electroporation, a method to load muscle fibres with oligomers (ref. 18). As an alternative, RNase H-active ASOs could produce widespread correction, provided that uptake of circulating ASOs was sufficient to induce target cleavage.

We identified ASOs showing a strong knockdown of hACTA1 in tissue culture, good tolerability when systemically administered in wild-type mice, and activity against hACTA1-CUGexp transcripts in vivo when electroporated into muscle (Supplementary Figs 1–3). The ASOs had 2′-O-methoxyethyl (MOE) modifications at both ends to maximize biostability, and a central gap of 10 unmodified nucleotides to support RNase H activity (MOE gapmers; Supplementary Table 1). We tested three of the ASOs in HSALR transgenic mice by subcutaneous injection of 25 mg kg⁻¹ twice weekly (Fig. 1a). After 4 weeks of administration (8 injections), ASO 445236 reduced the level of CUGexp RNA in hindlimb muscles by more than 80% (Fig. 1b). Another ASO targeting the 3′ UTR, downstream of the repeat tract, also showed strong CUGexp reduction, whereas an ASO targeting the 5′ UTR, or three oligonucleotides against other targets, had no effect (Fig. 1b, c).

RNase H cleavage of mRNA is usually followed by rapid decay of cleavage fragments. However, stable cleavage fragments are observed occasionally (ref. 19), and the CUGexp tract forms extensive hairpins (ref. 20) and ribonucleoprotein complexes (ref. 21) that could inhibit exonuclease activity. The failure of antisense targeting in the 5′ UTR also raised the possibility that cleavage downstream of the repeat tract was required for efficient silencing. We therefore tested an additional ASO, 190401, targeting the hACTA1 coding region, and found that it also was highly effective (Fig. 1d). Furthermore, northern blot analysis using a CAG-repeat probe showed no evidence for a stable CUGexp cleavage fragment (Fig. 1e), and in situ hybridization showed reduction of nuclear CUGexp foci (Supplementary Fig. 4). These results indicate that expanded CUG repeats are degraded after a cleavage event 5′ or 3′ of the repeat tract.

Reduction of CUGexp RNA would be expected to release sequestered MBNL1 protein and improve its splicing regulatory activity. Consistent with this prediction, alternative splicing of four MBNL1-dependent exons, Serca1 (also known as Atp2a1) exon 22, titin (Ttn) exon 362, Zasp (also known as Ldb3) exon 11, and Clcn1 chloride ion channel exon 7a, was normalized (Fig. 1f, g and Supplementary Figs 5 and 6) (ref. 15). The Clcn1 splicing defect causes loss of channel function, repetitive action potentials and delayed muscle relaxation (myotonia) (ref. 22), a cardinal feature of the disease. Blind analysis showed that myotonic discharges in hindlimb muscles were eliminated by the active ASOs (Fig. 1h), indicating rescue of Clcn1 function.

In addition to splicing defects, expression of CUGexp RNA or ablation of Mbnl1 causes extensive remodelling of the muscle transcriptome (refs 16, 17, 23). We used microarrays to examine transcriptomic effects of ASOs. Principal component analysis showed that gene expression in ASO-treated HSALR mice was shifted towards wild-type mice, indicating an overall trend for transcriptome normalization (Fig. 2a). Among transcripts that were up- or downregulated in HSALR muscle, more
than 85% were normalized or partially corrected by ASOs, without evidence of off-target effects (Fig. 2b, and Supplementary Fig. 7 and Supplementary Table 2). These results confirm that ASOs caused an overall improvement of the muscle transcriptome.

[Figure 1, panels a–h. a, schematic of hACTA1-CUGexp mRNA (5′, coding sequence, (CUG)220, 3′) with the target sites of ASOs 445238, 190403, 190401 and 445236. b–d, hACTA1 mRNA level (normalized to 18S RNA) in Quad, Gastroc and TA for saline and the indicated ASOs. e, northern blot (CUG RNA, mouse actin). f, g, RT–PCR gels for Clcn1 and Serca1 in TA, Quad and Gastroc (WT, Neg, Pos, Saline, 190403, 445238, 445236). h, myotonia grade by treatment in left and right Quad, Gastroc and TA and in lumbar paraspinals.]

Figure 1 | Systemic administration of 2′-O-(2-methoxyethyl) ASOs in the HSALR transgenic mouse model of DM1. a, Location of ASO-targeting sequences relative to the hACTA1 coding region and the expanded CUG repeat in the 3′ UTR. b, Quantitative real-time RT–PCR of hACTA1-CUGexp mRNA in quadriceps (Quad), gastrocnemius (Gastroc) and tibialis anterior (TA) muscle in HSALR mice treated with the indicated ASOs by subcutaneous injection of 25 mg kg⁻¹ twice weekly for 4 weeks. Muscle tissue was obtained 1 week after the final dose (n = 4 per group). The mean levels of transgene mRNA ± s.d. are shown. **P < 0.001, ***P < 0.0001 (one-way analysis of variance (ANOVA)). c, hACTA1-CUGexp transcript levels in quadriceps are not affected by ASOs targeting unrelated transcripts (141923, randomer; 116847, Pten; 399462, Malat1; n = 4 per group; same dose as in b). Error bars are mean ± s.d. d, Knockdown of hACTA1-CUGexp mRNA in muscle by ASO 190401 (n = 4 per group; same dose as in b). Error bars are mean ± s.d. ***P ≤ 0.0005 (t-test). e, Northern analysis of RNA from quadriceps muscle. CUGexp RNA was detected using a (CAG)9 oligonucleotide probe. Mouse actin serves as loading control. f, g, RT–PCR analysis of alternative splicing of Clcn1 (f) and Serca1 (g) transcripts. For Clcn1, only the –ex7a isoform encodes a functional ion channel. –ex7a, exon 7a exclusion; +ex7a, exon 7a inclusion; –ex22, exon 22 exclusion; +ex22, exon 22 inclusion; neg, negative control mice injected with GAC25 morpholino; pos, positive control mice injected with CAG25 morpholino; WT, FVB/N wild-type mice. h, Blind analysis of myotonia using EMG, 1 week after the final dose (n = 4 mice per group). Error bars are mean ± s.d. ***P < 0.0001 for ASO-treated versus saline-treated muscles (two-way ANOVA).

ASO effects were evident within 2 weeks (Supplementary Fig. 8) and were dose-dependent. A threefold dose reduction (8.5 mg kg⁻¹ biweekly for 4 weeks) caused partial myotonia and splicing correction, whereas a tenfold dose reduction (2.5 mg kg⁻¹) caused partial myotonia correction in tibialis anterior but not in quadriceps (Supplementary Fig. 9a–c), the latter muscle having higher basal levels of CUGexp expression (ref. 18). Serum chemistries showed no evidence for renal or liver toxicity (Supplementary Fig. 10).

A uniform finding in previous studies of MOE gapmer ASOs was that systemic administration failed to cause significant target reduction in muscle, despite efficient knockdown in liver (n = 12 different mRNA targets; Supplementary Table 3), raising the possibility that muscle tissue in our model is unusually susceptible to antisense silencing. We examined the functional integrity of the muscle
membrane, a physiological barrier to ASO uptake (ref. 24), and found that muscle penetration of the extracellular dye, Evans Blue, was similar in HSALR and wild-type mice (Supplementary Fig. 11a). Direct analysis of muscle tissue indicated that ASO accumulation was no greater in HSALR mice than in wild-type controls (Supplementary Fig. 11b, c). Likewise, the mRNA level for RNase H1 was similar in HSALR and wild-type muscle (Supplementary Fig. 12). We tested ASOs targeting other muscle-expressed transcripts. ASOs for Pten phosphatase or Srb1 (also known as Scarb1) scavenger receptor showed efficient target knockdown in liver, but no appreciable knockdown in HSALR or wild-type muscle (Fig. 3a). Taken together with previous studies, our results indicate specific sensitivity of hACTA1-CUGexp transcripts rather than a general enhancement of ASO activity in HSALR muscle.

[Figure 2, panels a, b. a, principal component plot (PC1, PC2, PC3) for HSALR saline, HSALR 445236, HSALR 190401 and FVB/N WT. b, fold increase for HSALR versus WT in HSALR saline, 445236, 190401 and FVB/N WT.]

Figure 2 | Effects of ASOs on the transcriptome in quadriceps muscle. a, Principal component analysis of microarray data shows segregation of HSALR (saline) away from wild-type mice in widely separated clusters. ASOs caused HSALR transgenic mice to cluster nearer to wild-type mice (25 mg kg⁻¹ biweekly for 4 weeks; n = 4 mice per group). b, Of the transcripts upregulated in HSALR versus wild-type mice (saline), >85% showed complete or partial return to normal expression after treatment with ASOs (n = 4 mice per group).

[Figure 3, panels a–c. a, Srb1 and Pten mRNA level (normalized to 18S RNA) in liver and Quad of HSALR and FVB/N WT mice, saline versus ASO 353382 or 116847. b, Malat1 mRNA level (normalized to 18S RNA) in liver, Quad, Gastroc and TA, saline versus ASO 399462. c, Malat1 mRNA level (% saline) against dose (mg kg⁻¹ twice per week: saline, 12.5, 25, 50) in liver, Quad, Gastroc, TA, diaphragm and heart.]

Figure 3 | Differential sensitivity of transcripts to ASO knockdown in skeletal muscle. a, In HSALR or FVB/N wild-type mice, ASOs targeting Srb1 (353382) or Pten (116847) were effective for knockdown in liver but not in quadriceps muscle (qRT–PCR, n = 4 per group). Error bars are mean ± s.d. *P = 0.02, ***P < 0.0001 (t-test). b, HSALR and FVB/N wild-type mice were treated with ASO 399462 targeting Malat1, a nuclear-retained lncRNA. Levels of Malat1 transcript in the indicated tissues were determined by qRT–PCR (n = 4 ASO, 3 saline). Error bars are mean ± s.d. *P = 0.035, **P < 0.007, ***P = 0.001 for ASO versus saline (t-test). c, Dose response of Malat1 knockdown in BALB/c wild-type mice. BALB/c wild-type mice were treated with saline or ASO 399462 targeting Malat1 at 12.5, 25 and 50 mg kg⁻¹ twice per week for 3.5 weeks (7 doses in total; n = 4 per group). Tissues were collected for RNA isolation 2 days after the final dose. Malat1 transcript levels were determined by qRT–PCR. Error bars are mean ± s.e.m. *P < 0.01, **P < 0.001, ***P < 0.0001 (two-way ANOVA).

A notable metabolic feature of hACTA1-CUGexp and human DMPK-CUGexp mRNA is that processing and polyadenylation are normal but the transcripts are retained in the nucleus (refs 5, 6). Recent studies have shown that RNase H1, the enzyme responsible for antisense knockdown, is localized to the nucleus and mitochondria (ref. 14), suggesting that antisense cleavage of nuclear-encoded RNA occurs before nuclear export, and raising the possibility that nuclear-retained transcripts may exhibit enhanced sensitivity. To determine whether other nuclear-retained transcripts show a similar effect we examined metastasis associated lung adenocarcinoma transcript 1 (Malat1), an endogenous nuclear lncRNA (ref. 9). We identified MOE gapmer ASOs that produced strong Malat1 knockdown in cells, in an RNase H1-dependent manner (Supplementary Fig. 13). In wild-type and HSALR mice, subcutaneous administration of ASOs for 4 weeks caused a greater than 80% reduction in Malat1 in muscle (Fig. 3b, c), supporting the idea that nuclear-retained transcripts have enhanced sensitivity.

To determine the duration of ASO action in muscle, we examined mice at 15 and 31 weeks after ASO was discontinued, and found that hACTA1-CUGexp knockdown and splicing correction remained strong (not shown). One year after ASO injection was discontinued, target reduction by ASO 190401 had waned, but remained approximately 50% or more for ASO 445236 (Fig. 4a). Even at this late time point the appropriate cleavage products were detected by amplification of complementary DNA 5′ ends (5′ RACE), indicating persistent ASO–RNase H1 activity (Fig. 4b). Consistent with the extent of target reduction, there was partial return of myotonia and splicing defects for
ASO 190401, whereas correction by ASO 445236 remained strong (Fig. 4c, d and Supplementary Fig. 14a–e). Furthermore, the persistent knockdown of CUGexp RNA largely prevented the age-dependent myopathic changes in HSALR muscle, as evidenced by reduced frequency of central nuclei (Fig. 4e) and improved muscle-fibre diameter (mainly a prevention of fibre atrophy) (Supplementary Fig. 15). These findings indicate that ASO activity against hACTA1-CUGexp in muscle is remarkably durable and that long-term reduction of the toxic RNA can protect against structural changes in muscle fibres. Notably, the duration of Malat1 knockdown in muscle was also prolonged (a greater than 50% reduction at 31 weeks after ASO discontinuation) and more persistent than in liver (Supplementary Fig. 16).

Therapeutic application of this strategy to human DM1 will require transfer of the targeting sequence to hDMPK. We developed MOE gapmer ASOs that were active against hDMPK in cells. We examined in vivo activity after 4 weeks of twice weekly subcutaneous injection in transgenic mice that express hDMPK with 800 CUG repeats. The ASO produced significant knockdown of hDMPK-CUGexp transcripts in hindlimb muscle (Fig. 4f and Supplementary Figs 17 and 18), supporting the feasibility of silencing the pathogenic DMPK allele.

[Figure 4 panels: a, hACTA1 mRNA level (normalized to Gtf2b mRNA) in quadriceps and tibialis anterior; b, 5′ RACE fragments for ASO 190401 and ASO 445236, 1 week and 1 year after treatment; c, Serca1 splicing, exon 22 exclusion (%); d, myotonia grade by muscle; e, frequency (%) of fibres by internal nuclei per fibre; f, hDMPK mRNA level (normalized to Gtf2b mRNA) in quadriceps, gastrocnemius and tibialis anterior.]

Figure 4 | Duration of ASO activity and in vivo targeting of human DMPK. a–e, Two-month-old HSALR mice received saline or ASO by subcutaneous injection of 25 mg kg−1 twice weekly for 4 weeks (n = 5 for each ASO, n = 6 for saline), with tissues isolated 1 year after the final dose. qRT–PCR analysis of HSALR transgene mRNA (mean ± s.d.) was normalized to the housekeeping gene Gtf2b mRNA (a). Results were similar when normalized to total RNA input. 5′ RACE was carried out on muscle RNA obtained 1 week or 1 year after discontinuation of ASO 190401 or 445236 treatment (b). PCR products (5′ RACE fragment) migrated at the expected position for ASO–RNase H cleavage products, and were confirmed by DNA sequencing. Quantification of Serca1 splicing (c) and myotonia (d) showed a partial return of splicing defects and myotonia for ASO 190401 but not ASO 445236. Myotonia was graded blind by the examiner (d). After prolonged knockdown of toxic RNA, the number of internal nuclei per muscle fibre was determined by histologic analysis when mice were aged 14 months (e) (n = 4 for ASO 445236; n = 3 for saline; WT, untreated 3-month-old FVB/N wild-type control mice). f, DM328XL mice received subcutaneous injections of saline or ASO 445569 targeting the 3′ UTR of hDMPK. The ASO dose was 50 or 75 mg kg−1 twice weekly for 4 weeks (n = 5, low dose; n = 4, high dose; n = 2, saline). Tissues were isolated 2 days after the final dose. Dose-dependent reduction of hDMPK, normalized to housekeeping gene Gtf2b mRNA. Note that hDMPK mRNA was undetectable in wild-type mice. No RT, no reverse transcriptase. WT, untreated wild-type littermates of DM328XL transgenic mice (n = 2). Error bars are mean ± s.d. *P < 0.05, **P < 0.001, ***P < 0.0001 for ASO-treated versus saline-treated muscle (two-way ANOVA).

Despite physiological barriers to tissue uptake, our results indicate that systemic targeting of CUGexp RNA is feasible because small amounts of ASOs that enter muscle fibres can hybridize their target and productively engage RNase H1. Although the mechanisms for enhanced sensitivity of CUGexp RNA and Malat1 are not fully defined, our data suggest that residence time in the nucleus is an important determinant of transcript sensitivity. Features of the nuclear environment that may enhance antisense activity include nuclear localization of RNase H1 (ref. 14) and auxiliary proteins that promote oligonucleotide hybridization (ref. 25), and, in the case of CUGexp transcripts, spatial concentration of targets in a small volume (ref. 4). A similar approach may be effective for other genetic disorders that have nuclear accumulation of repeat expansion RNA (refs 26,27).

Previous studies have used CAG-repeat ASOs that bind CUGexp RNA without activating RNase H, in an effort to block the protein interactions or modify the metabolism of the toxic RNA (refs 18,28). Although this approach was effective with local delivery, initial attempts at systemic delivery were less successful (T.M.W. and C.A.T., unpublished observations), which fits with the expectation that higher tissue concentrations of ASO are required to occupy CUGexp binding sites than to induce RNase H cleavage. Furthermore, the RNase H mechanism is attractive because it exploits the nuclear retention phenomenon to gain a therapeutic advantage, while posing less risk of off-target effects by avoiding a repetitive sequence. Recently, local delivery of RNase H-active CAG-repeat ASOs induced partial CUGexp knockdown, but was accompanied by muscle damage (ref. 29), again suggesting that direct targeting of the repeat tract may have pitfalls. Our results also suggest that ASOs are useful for in vivo functional characterization and therapeutic modulation of lncRNAs, a large and recently recognized class of regulatory RNAs (ref. 30).

METHODS SUMMARY

Experimental mice. All animal experiments were approved by the Institutional Animal Care and Use Committees at the University of Rochester, Genzyme Corporation and Isis Pharmaceuticals.

Subcutaneous injection of ASOs. MOE gapmer ASOs were dissolved in saline and administered by subcutaneous injection in the interscapular region twice per week at the indicated doses.

Quantitative real-time RT–PCR (polymerase chain reaction with reverse transcription) assay. Total RNA was purified from muscle using RNeasy Lipid Tissue Mini Kits (Qiagen). mRNA levels for ACTA1, Srb1, Pten, Malat1 and RNase H1 were determined on the Applied Biosystems 7500 System using 18S rRNA as normalization control. General transcription factor 2b (Gtf2b) and total RNA (Ribogreen assay) served as normalization controls for human DMPK and mouse Dmpk.

Northern analysis. CUGexp sequences were detected using a 32P end-labelled (CAG)9 DNA oligonucleotide probe.

Electromyography. Electromyography (EMG) was carried out blind under general anaesthesia, as described previously (ref. 18).

RT–PCR analysis of alternative splicing. RT–PCR was carried out using the SuperScript III One-Step RT–PCR System with Platinum Taq DNA Polymerase (Invitrogen) and the same gene-specific primers for cDNA synthesis and PCR amplification. PCR products were separated on agarose gels, stained with SybrGreen I Nucleic Acid Gel Stain (Invitrogen) and scanned with a fluorimager.

Transcriptome analysis. Quadriceps-muscle RNA from wild-type or HSALR transgenic mice treated with vehicle (saline), ASO 445236 or ASO 190401 was processed to cRNA and hybridized on microbeads using MouseRef-8 v2.0 Expression BeadChip Kits (Illumina). Image data were quantified using BeadStudio software (Illumina).

114 | NATURE | VOL 488 | 2 AUGUST 2012. ©2012 Macmillan Publishers Limited. All rights reserved
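The splicing readout plotted in Fig. 4c (Serca1 exon 22 exclusion, %) can be illustrated with a minimal sketch. It assumes that the percentage is the intensity of the exon-exclusion band divided by the summed intensity of the inclusion and exclusion bands; the text does not state the formula, and the function name and inputs here are hypothetical.

```python
def percent_exon_exclusion(inclusion_intensity: float, exclusion_intensity: float) -> float:
    """Return the exon-exclusion band as a percentage of total band signal.

    Assumption (not stated in the paper): both intensities are
    background-corrected gel band intensities for the same lane, and no
    correction is applied for the length difference between the products.
    """
    if inclusion_intensity < 0 or exclusion_intensity < 0:
        raise ValueError("band intensities must be non-negative")
    total = inclusion_intensity + exclusion_intensity
    if total == 0:
        raise ValueError("no signal in either band")
    return 100.0 * exclusion_intensity / total


if __name__ == "__main__":
    # Hypothetical intensities for one lane, in arbitrary scanner units.
    print(percent_exon_exclusion(inclusion_intensity=300.0, exclusion_intensity=100.0))
```

Expressing the result as a fraction of the lane's own total signal makes lanes with different loading comparable, which is presumably why splicing is reported as a percentage rather than as raw band intensity.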
Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.

Received 7 September 2011; accepted 29 June 2012.

1. Harley, H. G. et al. Expansion of an unstable DNA region and phenotypic variation in myotonic dystrophy. Nature 355, 545–546 (1992).
2. Buxton, J. et al. Detection of an unstable fragment of DNA specific to individuals with myotonic dystrophy. Nature 355, 547–548 (1992).
3. Brook, J. D. et al. Molecular basis of myotonic dystrophy: expansion of a trinucleotide (CTG) repeat at the 3′ end of a transcript encoding a protein kinase family member. Cell 68, 799–808 (1992).
4. Taneja, K. L., McCurrach, M., Schalling, M., Housman, D. & Singer, R. H. Foci of trinucleotide repeat transcripts in nuclei of myotonic dystrophy cells and tissues. J. Cell Biol. 128, 995–1002 (1995).
5. Davis, B. M., McCurrach, M. E., Taneja, K. L., Singer, R. H. & Housman, D. E. Expansion of a CUG trinucleotide repeat in the 3′ untranslated region of myotonic dystrophy protein kinase transcripts results in nuclear retention of transcripts. Proc. Natl Acad. Sci. USA 94, 7388–7393 (1997).
6. Mankodi, A. et al. Myotonic dystrophy in transgenic mice expressing an expanded CUG repeat. Science 289, 1769–1773 (2000).
7. Bennett, C. F. & Swayze, E. E. RNA targeting therapeutics: molecular mechanisms of antisense oligonucleotides as a therapeutic platform. Annu. Rev. Pharmacol. Toxicol. 50, 259–293 (2010).
8. Geary, R. S. et al. Pharmacokinetics of a tumor necrosis factor-alpha phosphorothioate 2′-O-(2-methoxyethyl) modified antisense oligonucleotide: comparison across species. Drug Metab. Dispos. 31, 1419–1428 (2003).
9. Wilusz, J. E., Freier, S. M. & Spector, D. L. 3′ end processing of a long nuclear-retained noncoding RNA yields a tRNA-like cytoplasmic RNA. Cell 135, 919–932 (2008).
10. Lorenz, P., Misteli, T., Baker, B. F., Bennett, C. F. & Spector, D. L. Nucleocytoplasmic shuttling: a novel in vivo property of antisense phosphorothioate oligodeoxynucleotides. Nucleic Acids Res. 28, 582–592 (2000).
11. Vickers, T. A. et al. Efficient reduction of target RNAs by small interfering RNA and RNase H-dependent antisense agents. A comparative analysis. J. Biol. Chem. 278, 7108–7118 (2003).
12. Prasanth, K. V. et al. Regulating gene expression through RNA nuclear retention. Cell 123, 249–263 (2005).
13. Wu, H. et al. Determination of the role of the human RNase H1 in the pharmacology of DNA-like antisense drugs. J. Biol. Chem. 279, 17181–17189 (2004).
14. Suzuki, Y. et al. An upstream open reading frame and the context of the two AUG codons affect the abundance of mitochondrial and nuclear RNase H1. Mol. Cell. Biol. 30, 5123–5134 (2010).
15. Lin, X. et al. Failure of MBNL1-dependent post-natal splicing transitions in myotonic dystrophy. Hum. Mol. Genet. 15, 2087–2097 (2006).
16. Osborne, R. J. et al. Transcriptional and post-transcriptional impact of toxic RNA in myotonic dystrophy. Hum. Mol. Genet. 18, 1471–1481 (2009).
17. Kanadia, R. N. et al. A muscleblind knockout model for myotonic dystrophy. Science 302, 1978–1980 (2003).
18. Wheeler, T. M. et al. Reversal of RNA dominance by displacement of protein sequestered on triplet repeat RNA. Science 325, 336–339 (2009).
19. Hasselblatt, P., Hockenjos, B., Thoma, C., Blum, H. E. & Offensperger, W. B. Translation of stable hepadnaviral mRNA cleavage fragments induced by the action of phosphorothioate-modified antisense oligodeoxynucleotides. Nucleic Acids Res. 33, 114–125 (2005).
20. Napierala, M. & Krzyzosiak, W. J. CUG repeats present in myotonin kinase RNA form metastable "slippery" hairpins. J. Biol. Chem. 272, 31079–31085 (1997).
21. Miller, J. W. et al. Recruitment of human muscleblind proteins to (CUG)(n) expansions associated with myotonic dystrophy. EMBO J. 19, 4439–4448 (2000).
22. Mankodi, A. et al. Expanded CUG repeats trigger aberrant splicing of ClC-1 chloride channel pre-mRNA and hyperexcitability of skeletal muscle in myotonic dystrophy. Mol. Cell 10, 35–44 (2002).
23. Du, H. et al. Aberrant alternative splicing and extracellular matrix gene expression in mouse models of myotonic dystrophy. Nature Struct. Mol. Biol. 17, 187–193 (2010).
24. Alter, J. et al. Systemic delivery of morpholino oligonucleotide restores dystrophin expression bodywide and improves dystrophic pathology. Nature Med. 12, 175–177 (2006).
25. Pontius, B. W. & Berg, P. Rapid assembly and disassembly of complementary DNA strands through an equilibrium intermediate state mediated by A1 hnRNP protein. J. Biol. Chem. 267, 13815–13818 (1992).
26. Li, L. B. & Bonini, N. M. Roles of trinucleotide-repeat RNA in neurological disease and degeneration. Trends Neurosci. 33, 292–298 (2010).
27. DeJesus-Hernandez, M. et al. Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 72, 245–256 (2011).
28. Mulders, S. A. et al. Triplet-repeat oligonucleotide-mediated reversal of RNA toxicity in myotonic dystrophy. Proc. Natl Acad. Sci. USA 106, 13915–13920 (2009).
29. Lee, J. E., Bennett, C. F. & Cooper, T. A. RNase H-mediated degradation of toxic RNA in myotonic dystrophy type 1. Proc. Natl Acad. Sci. USA 109, 4221–4226 (2012).
30. Wapinski, O. & Chang, H. Y. Long noncoding RNAs and human disease. Trends Cell Biol. 21, 354–361 (2011).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements The work was carried out at the Wellstone Muscular Dystrophy Cooperative Research Center and Center for RNA Biology at the University of Rochester, with support from the US National Institutes of Health (NIH) (grants AR049077, U54NS48843, AR/NS48143, K08NS064293 and U01NS072323), the Saunders Family Neuromuscular Research Fund, Run America and a fellowship (to M.N.) from the Muscular Dystrophy Association and Uehara Memorial Foundation. The authors thank G. Gourdon for providing DM328XL mice, M. Sabripour for assistance with principal components analysis and L. Richardson and S. Leistman for technical assistance.

Author Contributions T.M.W., A.J.L., S.K.P., A.R.M., M.N., S.H.C., B.M.W., C.F.B. and C.A.T. participated in the planning, design and interpretation of experiments. T.M.W., A.J.L., S.K.P., A.R.M. and M.N. carried out experiments. T.M.W. and C.A.T. wrote the manuscript.

Author Information Reprints and permissions information is available at www.nature.com/reprints. Readers are welcome to comment on the online version of this article at www.nature.com/nature. The authors declare competing financial interests: details accompany the full-text HTML version of the paper at www.nature.com/nature. Correspondence and requests for materials should be addressed to C.A.T. ([email protected]).
METHODS

Antisense oligonucleotides. ASOs were synthesized at Isis Pharmaceuticals, as described previously (ref. 31). All ASOs were MOE gapmer 20mers with phosphorothioate as the intersubunit linkage, 2′-O-(2-methoxyethyl) (MOE) modifications of 5 nucleotides at the 5′ and 3′ end, and a central gap of 10 deoxynucleotides. The sequence of each ASO is listed in Supplementary Table 1. CAG25 and GAC25 morpholinos (ref. 18) were purchased (Gene Tools).

Identification of active ASOs. The criteria for identifying active hACTA1-targeting ASOs were as follows: first, selection of targeting sequences that were not conserved in mice, to avoid knockdown of endogenous skeletal actin; second, >50% hACTA1 knockdown when electroporated in HepG2 cells (Supplementary Fig. 1); and third, absence of hepatotoxic or immunostimulatory effects in wild-type mice, when 50 mg kg−1 was injected subcutaneously twice weekly for 4 weeks (Supplementary Fig. 2a–c). Out of 11 candidate ASOs examined, 5 satisfied these criteria. For the ASO with the highest activity in HepG2 cells, we also verified activity against hACTA1-CUGexp transcripts in vivo, by direct injection and electroporation of tibialis anterior muscle in HSALR mice (Supplementary Fig. 3). Four of the five ASOs were subsequently used for subcutaneous administration in HSALR mice. ASOs targeting Malat1 were identified by demonstration of >50% target knockdown when electroporated in mouse hepatocellular SV40 large T-antigen carcinoma (MHT) cells, and absence of hepatotoxic or immunostimulatory effects in wild-type mice (dosing as above).

Cell transfection and gene analysis. HepG2 cells were electroporated in a 96-well plate format at 165 V with ASOs in complete media containing MEM, non-essential amino acid (NEAA), sodium pyruvate and 10% FBS at room temperature. Cells were incubated overnight and lysed in RLT buffer (Qiagen). Total RNA was prepared using the Qiagen RNeasy kit. Quantitative real-time RT–PCR (qRT–PCR) was performed using the Qiagen QuantiTect Probe RT–PCR kit. Twenty-microlitre qRT–PCR reactions were run in duplicate and normalized against total RNA, calculated using the Ribogreen assay (Invitrogen).

Experimental mice. Institutional Animal Care and Use Committees at the University of Rochester, Genzyme Corporation and Isis Pharmaceuticals approved all animal experiments. HSALR mice in the line 20b were derived and maintained on the FVB/N background strain (ref. 6). The (CTG)250 tract in this line is unstable, and has shortened to (CTG)220. DM328XL mice carry a 45-kb human genomic fragment that includes the mutant DMPK gene with 800 CTG repeats (refs 32,33). The DM328XL mice were hemizygous and display no histologic changes, myotonia or splicing defects in skeletal muscle (refs 34,35). FVB/N, BALB/c, C57Bl/10 and Mdx mice were from Jackson Laboratories.

Muscle injection of ASOs. The tibialis anterior muscle was injected with 0.2, 0.4 or 0.8 nmol ASO in 20 µl saline, and the contralateral tibialis anterior with 20 µl saline alone. The muscles were then electroporated, as described previously (ref. 36). Treatment assignments were randomized and injections were carried out blind.

Subcutaneous injection of ASOs. All ASOs were dissolved in phosphate buffered saline (PBS). Doses of 2.5, 8.5, 12.5, 25 or 50 mg kg−1 were injected subcutaneously, twice per week in the interscapular region for 3.5 to 4 weeks (7 or 8 doses in total). Injection volumes ranged from 140 to 200 µl.

Real-time PCR assay. Total RNA was purified from tibialis anterior, gastrocnemius or quadriceps muscle using the RNeasy Lipid Tissue Mini Kit (Qiagen) according to the manufacturer's instructions. qRT–PCR was used to determine mRNA levels for ACTA1, Srb1, Pten, Malat1 and RNase H1, with 18S rRNA as normalization control, on an Applied Biosystems 7500 Real-Time PCR System. Gtf2b and total RNA (Ribogreen assay) served as normalization controls for human DMPK and mouse Dmpk.

Real-time PCR assay primer probe set sequences.
ACTA1 primer probe set 1 (PPset 1): forward, 5′-GTAGCTACCCGCCCAGAAACT-3′; reverse, 5′-CCAGGCCGGAGCCATT-3′; probe, 5′-ACCACCGCCCTCGTGTGCG-3′.
ACTA1 PPset 2: forward, 5′-GACGAGGCTCAGAGCAAGAGA-3′; reverse, 5′-TGATGATGCCGTGCTCGATA-3′; probe, 5′-CCTGACCCTGAAGTAC-3′.
Srb1: forward, 5′-TGACAACGACACCGTGTCCT-3′; reverse, 5′-ATGCGACTTGTCAGGCTGG-3′; probe, 5′-CGTGGAGAACCGCAGCCTCCATT-3′.
Pten: forward, 5′-ATGACAATCATGTTGCAGCAATTC-3′; reverse, 5′-CGATGCAATAAATATGCACAAATCA-3′; probe, 5′-CTGTAAAGCTGGAAAGGGACGGACTGGT-3′.
Malat1: forward, 5′-TGGGTTAGAGAAGGCGTGTACTG-3′; reverse, 5′-TCAGCGGCAACTGGGAAA-3′; probe, 5′-CGTTGGCACGACACCTTCAGGGACT-3′.
RNase H1: forward, 5′-ACTCAGGATTTGTGGGCAATG-3′; reverse, 5′-CCTCAGACTGCTTCGCTCCTT-3′; probe, 5′-AGAGGCCGACAGACTGGCACGG-3′.
Human DMPK: forward, 5′-AGCCTGAGCCGGGAGATG-3′; reverse, 5′-GCGTAGTTGACTGGCGAAGTT-3′; probe, 5′-AGGCCATCCGCACGGACAACCX-3′.
Mouse Dmpk: forward, 5′-GACATATGCCAAGATTGTGCACTAC-3′; reverse, 5′-CACGAATGAGGTCCTGAGCTT-3′; probe, 5′-AACACTTGTCGCTGCCGCTGGCX-3′.
Ap2M1: sequences previously reported (ref. 37).
18S rRNA: proprietary sequences (Applied Biosystems, catalogue number 4310893-E).
Gtf2b: proprietary sequences (Applied Biosystems, catalogue number 4331182).

Northern analysis. Total RNA (6 µg) was separated on agarose gels containing MOPS and formaldehyde, transferred to nylon membranes and hybridized with (CAG)9 or mouse actin 32P-labelled oligonucleotide probes, as described previously (ref. 18).

Electromyography. EMG was carried out blind under anaesthesia, as described previously (ref. 18). Myotonic discharges were graded on a four-point scale: 0, no myotonia; 1, occasional myotonic discharge in less than 50% of needle insertions; 2, myotonic discharge in greater than 50% of needle insertions; 3, myotonic discharge with nearly every insertion.

RT–PCR analysis of alternative splicing. RT–PCR was carried out using the SuperScript III One-Step RT–PCR System with Platinum Taq DNA Polymerase (Invitrogen) and gene-specific primers for cDNA synthesis and PCR amplification. The primers for Clcn1, Serca1, Titin and Zasp were described previously (refs 15,36). PCR products were separated on agarose gels, stained with SybrGreen I Nucleic Acid Gel Stain (Invitrogen) and imaged using a laser scanner (Fujifilm LAS-3000 Intelligent Dark Box or GE Healthcare Typhoon 9400). Band intensities were quantified using ImageQuant software (GE Healthcare).

Transcriptome analysis by microarray. RNA was isolated from quadriceps muscle of wild-type mice or HSALR transgenic mice treated with vehicle (saline), ASO 445236 or ASO 190401 (n = 4 per group, 25 mg kg−1 ASO twice weekly for 4 weeks). RNA integrity was verified (RIN values >7.5 on Agilent Bioanalyzer). RNA was processed to cRNA and hybridized on microbeads using MouseRef-8 v2.0 Expression BeadChip Kits (Illumina) according to the manufacturer's recommendations. Image data were quantified using BeadStudio software (Illumina). Signal intensities were quantile normalized. Before normalization, we used row-specific offsets to avoid any values of less than two. Data from all probe sets with six or more nucleotides of CUG, UGC or GCU repeats were suppressed to eliminate the possibility that expanded repeats in the hybridization mixture (CAG repeats in cRNA, originating from CUGexp RNA) could cross-hybridize with repeat sequences on probes. To eliminate genes whose expression was not readily quantified on the arrays, we suppressed probes that did not show a detection probability of P < 0.1 for all samples in the group that showed the higher mean expression level. Comparisons between groups were summarized and rank ordered by fold-changes of mean expression level and t-tests. The software package R (ref. 38) was used to perform principal components analysis (PCA) (refs 39,40) on wild-type, ASO-treated, and saline-treated microarray samples. The principal components allowed the capture of the majority of the expression variation in each sample within three dimensions. We plotted the first three principal components of each sample. Array data have been submitted to the Gene Expression Omnibus, accession number GSE38962 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38962).

Fluorescence in situ hybridization. Localization of CUGexp RNA by fluorescence in situ hybridization (FISH) was carried out using a CAG repeat oligoribonucleotide probe labelled with Texas Red at the 5′ end, on muscle cryosections from ASO- or saline-treated mice, as described previously (ref. 15). Images are maximum projections of deconvolved Z-plane stacks (9 images, 0.1- or 0.2-µm steps) captured under identical exposure and illumination conditions using a fluorescence microscope (Carl Zeiss Axioplan 2 or Nikon Eclipse E600), a charge-coupled device (CCD) digital camera (Hamamatsu ORCA R2 or Photometrics Cool Snap HQ) and Metamorph software (Molecular Devices). The Optigrid structured illumination imaging system (Qioptiq) was also used to capture images of DM328XL muscle. Maximum grey-level intensity was quantified using Metamorph. Objectives: ×100 Plan-APOCHROMAT 1.4 NA oil (Zeiss) or ×60 Plan Apo 1.4 NA oil (Nikon).

Muscle-fibre morphometry. To outline muscle fibres and label nuclei, 10-µm transverse cryosections of muscles from ASO- or saline-treated mice were fixed with 4% paraformaldehyde, pH 7.3, washed in PBS and incubated in 20 µg ml−1 FITC-wheat germ agglutinin (WGA; Sigma) and 4,6 diamino-2 phenylindole dihydrochloride (DAPI; 1:20,000) in PBS for 1 h at room temperature. Sections then were washed in PBS, mounted and sealed. Images were captured using an Axioplan 2 fluorescence microscope (Zeiss), an ORCA R2 CCD digital camera (Hamamatsu Photonics), Metamorph software and a ×20 Plan-NEOFLUAR 0.5 NA objective (Zeiss). Using the calipers application in Metamorph, the muscle-fibre diameter, defined as the minimum 'Feret's diameter' (the minimum distance of parallel tangents at opposing borders of the muscle fibre; ref. 41), was determined. Haematoxylin and eosin (H&E)-stained images were captured using an Infinity 2-1 1.4 megapixel colour CCD digital camera (Lumenera), Infinity Analyze 5.0 software (Lumenera) and a ×10 Plan-NEOFLUAR 0.3 NA objective (Zeiss).

5′ rapid amplification of cDNA ends analysis. 5′ rapid amplification of cDNA ends (RACE) was carried out using the FirstChoice RLM-RACE Kit (Invitrogen). In brief, 1 µg of total mRNA was ligated with 5′ RACE adaptor (5′-GCUGAUGGCGAUGAAUGAACACUGCGUUUGCUGGCUUUGAUGAAA-3′), then reverse transcribed with a primer specific for the cleavage fragment (5′-TGAGAAGTCGCGTGCTGGAG-3′ for 190401, or 5′-TTTTTTTTACGCAGC-3′ for 445236). The synthesized cDNA was treated with RNase H, then amplified with 5′ RACE Outer Primer and 5′-TTGCGGTGGACGATGGAAGG-3′ (for 190401 fragment), or 5′-TGTGTAAAACGACGGCCAGTACGCAGCTTAACAGAATGAC-3′ (for 445236 fragment). The PCR products were analysed on agarose gels stained with SYBR Green I (Invitrogen) and scanned with a laser fluorimager (Typhoon, GE Healthcare).

RNase H1 short interfering RNA experiments. MHT cells were cultured in DMEM supplemented with 10% fetal calf serum, streptomycin (0.1 mg ml−1), and penicillin (100 U ml−1). Short interfering RNA (siRNA) treatments were carried out using Opti-MEM containing 5 µg ml−1 Lipofectamine 2000, as previously described (ref. 37). In brief, MHT cells were plated with 7,500 cells per well and were incubated for either 24 or 48 h with 75 nM of siRNA targeting RNase H1 (5′-GCTTGGTGAGACGTGCTTATT-3′ and 5′-TAAGCACGTCTCACCAAGCTG-3′) or Ap2M1 (sequences reported previously; ref. 37) in Opti-MEM and Lipofectamine 2000. Twenty-four hours post transfection, cells were treated with increasing doses of the Malat1-targeting ASO 399479 in DMEM–10% FBS. Twenty-four hours after the addition of oligonucleotides, cells were lysed, RNA was isolated using RNeasy, and qRT–PCR was performed to determine the level of Malat1 mRNA.

Tissue drug-level determination. Approximately 30 to 100 mg liver and muscle tissue were homogenized as described (ref. 42). Capillary gel electrophoresis (CGE) methods were used to measure unlabelled drug concentrations in mouse tissues. The methods for the hACTA1 ASOs were slight modifications of previously published methods (refs 42,43), and consisted of a phenol–chloroform (liquid–liquid) extraction followed by a solid-phase extraction. An internal standard (ASO 355868, a 27mer 2′-O-methoxyethyl-modified phosphorothioate oligonucleotide) was added before extraction. Tissue sample analyses were conducted using a Beckman MDQ capillary electrophoresis instrument (Beckman Coulter). Tissue-sample concentrations were calculated using calibration curves, with a lower limit of quantification (LLoQ) of approximately 1.14 µg g−1.

Biochemical analysis and serum chemistry. Serum separated in serum separator tubes (BD catalogue number 365956) was used to determine aspartate transaminase (AST), alanine transaminase (ALT), blood urea nitrogen (BUN) and creatinine values using Olympus reagents and an Olympus AU400e analyser (Melville).

Evans blue dye uptake assay. Evans blue dye (EBD) was dissolved in PBS at a concentration of 10 mg ml−1 and filter-sterilized. HSALR, FVB/N, Mdx or C57Bl/10 mice were administered an intraperitoneal injection of 10 µl EBD solution per gram of bodyweight. After a period of 24 h, muscle tissues were collected (quadriceps, gastrocnemius, tibialis anterior, diaphragm and heart). The mass of each muscle was recorded before lysing each sample individually in a microfuge tube containing N,N-dimethylformamide and a 5-mm steel bead, which was subjected to 30 Hz shaking in a Qiagen TissueLyser II. Lysed muscle samples were heated at 55 °C and centrifuged, and the absorbance of the supernatant was determined by spectrophotometric measurement at 636 nm. A standard curve of EBD in N,N-dimethylformamide enabled the EBD content in individual muscle samples to be determined.

Statistical analysis. Group data are presented as mean ± s.d., except where mean ± s.e.m. is indicated. Between-group comparison was carried out using a two-tailed Student's t-test or an analysis of variance (ANOVA), as indicated. A P value of <0.05 was considered significant.

31. Cheruvallath, Z. S., Kumar, R. K., Rentel, C., Cole, D. L. & Ravikumar, V. T. Solid phase synthesis of phosphorothioate oligonucleotides utilizing diethyldithiocarbonate disulfide (DDD) as an efficient sulfur transfer reagent. Nucleosides Nucleotides Nucleic Acids 22, 461–468 (2003).
32. Seznec, H. et al. Transgenic mice carrying large human genomic sequences with expanded CTG repeat mimic closely the DM CTG repeat intergenerational and somatic instability. Hum. Mol. Genet. 9, 1185–1194 (2000).
33. Nakamori, M., Gourdon, G. & Thornton, C. A. Stabilization of expanded (CTG)*(CAG) repeats by antisense oligonucleotides. Mol. Ther. 19, 2222–2227 (2011).
34. Seznec, H. et al. Mice transgenic for the human myotonic dystrophy region with expanded CTG repeats display muscular and brain abnormalities. Hum. Mol. Genet. 10, 2717–2726 (2001).
35. Gomes-Pereira, M. et al. CTG trinucleotide repeat "big jumps": large expansions, small mice. PLoS Genet. 3, e52 (2007).
36. Wheeler, T. M., Lueck, J. D., Swanson, M. S., Dirksen, R. T. & Thornton, C. A. Correction of ClC-1 splicing eliminates chloride channelopathy and myotonia in mouse models of myotonic dystrophy. J. Clin. Invest. 117, 3952–3957 (2007).
37. Koller, E. et al. Mechanisms of single-stranded phosphorothioate modified antisense oligonucleotide accumulation in hepatocytes. Nucleic Acids Res. 39, 4795–4807 (2011).
38. Ihaka, R. & Gentleman, R. R: a language for data analysis and graphics. J. Comput. Graph. Stat. 5, 299–314 (1996).
39. Raychaudhuri, S., Stuart, J. M. & Altman, R. B. Principal components analysis to summarize microarray experiments: application to sporulation time series. Pac. Symp. Biocomput. 2000, 455–466 (2000).
40. Ringnér, M. What is principal component analysis? Nature Biotechnol. 26, 303–304 (2008).
41. Briguet, A., Courdier-Fruh, I., Foster, M., Meier, T. & Magyar, J. P. Histological parameters for the quantitative assessment of muscular dystrophy in the mdx-mouse. Neuromuscul. Disord. 14, 675–682 (2004).
42. Leeds, J. M., Graham, M. J., Truong, L. & Cummins, L. L. Quantitation of phosphorothioate oligonucleotides in human plasma. Anal. Biochem. 235, 36–43 (1996).
43. Geary, R. S., Matson, J. & Levin, A. A. A nonradioisotope biomedical assay for intact oligonucleotide and its chain-shortened metabolites used for determination of exposure and elimination half-life of antisense drugs in tissue. Anal. Biochem. 274, 241–248 (1999).
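The muscle-fibre morphometry methods define fibre diameter as the minimum Feret's diameter, that is, the minimum distance between parallel tangents at opposing borders of the fibre. The following is a minimal sketch of that geometric definition, not the Metamorph calipers implementation. It assumes the fibre outline is available as a list of (x, y) points in micrometres; the function names are hypothetical. It relies on the standard result that the minimum width of a convex polygon is attained with one tangent lying along a hull edge.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_feret_diameter(outline):
    """Smallest distance between two parallel tangents enclosing the outline."""
    hull = convex_hull(outline)
    if len(hull) < 3:
        return 0.0  # degenerate outline (a point or a line) has zero width
    best = math.inf
    n = len(hull)
    for i in range(n):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % n]
        edge_len = math.hypot(x2 - x1, y2 - y1)
        # Width for this orientation: farthest hull vertex from the edge line.
        width = max(
            abs((x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)) for px, py in hull
        ) / edge_len
        best = min(best, width)
    return best


if __name__ == "__main__":
    # Hypothetical 40 x 20 micrometre rectangular outline.
    print(min_feret_diameter([(0, 0), (40, 0), (40, 20), (0, 20)]))
```

The minimum rather than the maximum Feret's diameter is the usual choice for fibre size because it is less sensitive to oblique sectioning, which elongates a fibre's cross-section along one axis while leaving its narrowest width largely unchanged.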
LETTER doi:10.1038/nature11243 A map of the cis-regulatory sequences in the mouse genome 1 1 1 1 1 1 1 1 Yin Shen *, Feng Yue *, David F. McCleary ,ZhenYe , Lee Edsall , Samantha Kuan , Ulrich Wagner , Jesse Dixon 1,2,3 , Leonard Lee , 4 Victor V. Lobanenkov & Bing Ren 1,5 The laboratory mouse is the most widely used mammalian model genomeisorganizedintodomainsofcoordinatelyregulatedenhancers 9 organism in biomedical research. The 2.6 3 10 bases of the mouse and promoters. Our results provide a resource for the annotation of genome possess a high degree of conservation with the human functional elements in the mammalian genome and for the study of 1 genome , so a thorough annotation of the mouse genome will be mechanisms regulating tissue-specific gene expression. of significant value to understanding the function of the human We identified the genomic localizations of RNA polymerase II genome. So far, most of the functional sequences in the mouse (polII), the insulator-binding protein CCCTC-binding factor (CTCF) genome have yet to be found, and the cis-regulatory sequences in and three chromatin modification marks, histone H3 lysine 4 particular are still poorly annotated. Comparative genomics has trimethylation (H3K4me3), histone H3 lysine 4 monomethylation 2 been a powerful tool for the discovery of these sequences , but on its (H3K4me1) and H3 lysine 27 acetylation (H3K27ac), in 13 adult own it cannot resolve their temporal and spatial functions. tissues, four embryonic tissues and two primary cell lines (Fig. 1a, b) Recently, ChIP-Seq has been developed to identify cis-regulatory by performing chromatin immunoprecipitation followed by high- 6 elements in the genomes of several organisms including humans, throughput sequencing (ChIP-Seq) (Supplementary Tables 1 and 2). 3–5 Drosophila melanogaster and Caenorhabditis elegans . 
Here we Enrichment of H3K4me3 or polII binding signals is indicative of apply the same experimental approach to a diverse set of 19 tissues an active promoter, whereas the presence of H3K4me1 or H3K27ac and cell types in the mouse to produce a map of nearly 300,000 outside promoter regions can be used as marks for enhancers 7–11 . 12 murine cis-regulatory sequences. The annotated sequences add up CTCF binding is considered a mark for potential insulator elements . to 11% of the mouse genome, and include more than 70% of con- In a subset of tissue and cell types, we also performed ChIP-Seq on the served non-codingsequences.Wedefinetissue-specific enhancers and co-activator protein p300 and used its promoter-distal binding sites to identify potential transcription factors regulating gene expression in train an enhancer prediction tool on the basis of chromatin signa- 13 each tissue or cell type. Finally, we show that much of the mouse tures . We determined the transcriptome in each tissue and cell a Rad23b Klf4 b Enhancers Promoters CTCF-binding sites 10 mESCs Bone marrow H3K4me3 Heart Cerebellum 0.3 Cortex Liver E14.5 brain 5 mESCs E14.5 heart E14.5 limb PolII 0.3 Heart E14.5 liver Liver Heart 3 mESCs Intestine H3K4me1 0.3 Heart Lung Kidney Liver Liver MEF 10 mESCs mESCs H3K27ac 0.2 Heart Olfactory bulb Placenta 5 Liver Spleen Testis mESCs Thymus 0.3 P300 Heart 0 20,000 40,000 60,000 80,000 Liver Number of cis-regulatory elements 5 mESCs c CTCF 0.3 Heart Promoters 18,566/23,523 78.9% Liver Enhancers 598/726 82.3% RNA-Seq 100 5 mESCs CTCF-binding 0 20 40 34,833/36,835 100 94.6% sites Heart 60 80 Percentage of known Liver elements recovered Chr4: 55,339,817 100 kb 55,574,800 Figure 1 | Identification of cis-regulatory elements in the mouse genome. regulatory elements in the 19 tissue and cell types. E14.5, embryonic day 14.5; a, UCSC genome browser views of ChIP-Seq and RNA-Seq data for mESC, MEF, murine embryonic fibroblast. 
c, Percentages of known cis-regulatory heart and liver (chromosome 4). The values on the y axis for ChIP-Seq data are elements recovered in this study. input normalized intensities. kb, kilobases. b, An overview of the predicted 1 Ludwig Institute for Cancer Research, 9500 Gilman Drive, La Jolla, California 92093-0653, USA. Medical Scientist Training Program, University of California, San Diego School of Medicine, 9500Gilman 2 3 Drive, La Jolla, California 92093-0653, USA. Biomedical Sciences Graduate Program, University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093-0653, USA. 4 5 Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, TwinbrookI NIAID Facility, Room1417, 5640 Fishers Lane, Rockville, Maryland 20852, USA. Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, Moores Cancer Center, University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, California 92093-0653, USA. *These authors contributed equally to this work. 11 6 | NA TUR E | V O L 4 8 8 | 2 A U G U S T 2012 ©2012 Macmillan Publishers Limited. All rights reserved
type through RNA-Seq experiments, using a protocol that can detect both the abundance and strand of origin of RNA transcripts¹⁴ (Supplementary Fig. 1). By analysing the genomic occupancy of the above chromatin marks and transcription factors (Supplementary Methods), we identified 295,676 non-redundant cis-regulatory sequences, including 53,834 putative promoters, 234,764 potential enhancers and 111,062 CTCF-binding sites (Fig. 1b). With an estimated span of 1,000 base pairs for each element, the combined length of these putative cis-regulatory sequences is 295.6 million base pairs, or 11% of the mouse genome.

To determine the accuracy and completeness of our cis-regulatory sequence mapping, we first compared the identified promoters with known promoters. We recovered 79% of RefSeq-annotated promoters¹⁵ (Fig. 1c and Supplementary Fig. 2a) and confirmed an additional 62% of University of California, Santa Cruz (UCSC)-annotated promoters (13,205 out of 21,433) that are not annotated in RefSeq. As expected, annotated promoters not recovered by our study are generally expressed in tissues that were not investigated in this work (Supplementary Table 3). In addition to the annotated promoters, we also identified 13,438 novel promoters. When tested with a luciferase reporter, 85% of 65 randomly selected novel promoters showed significant promoter activity in at least one orientation (P < 0.01, Student's t-test) (Supplementary Fig. 3a, b), supporting their function as promoters. Next we compared the predicted enhancers with a list of 726 experimentally validated enhancers¹⁶ and found that 82% of them were correctly identified in this study (Fig. 1c and Supplementary Fig. 2b). We also randomly selected eight predicted murine embryonic fibroblast (MEF) enhancers for validation and found that six of them (75%) gave positive results (Supplementary Fig. 4) (P < 0.01, Student's t-test), supporting the reliability of our enhancer identification method. In addition, we recovered 94.5% of previously reported CTCF-binding sites in mouse embryonic stem cells (mESCs)¹⁷ (Fig. 1c), demonstrating the high sensitivity of our detection method for CTCF binding. Further, we detected 77,236 novel CTCF-binding sites, 87.5% of which contained the canonical CTCF motifs (P < 2.2 × 10⁻¹⁶, binomial distribution). The novel CTCF-binding sites tend to be more tissue-specific than the sites identified previously (Supplementary Fig. 5). The above evidence indicates that we have correctly identified most known cis-regulatory sequences and have uncovered many novel ones.

Functional elements are often under negative selection during evolution, so a high level of sequence conservation is frequently used as evidence of function. However, there are also reports showing that transcription factor binding may be rapidly lost or gained during evolution¹⁸,¹⁹, arguing that the usage of cis-elements may evolve more quickly. We examined the sequence conservation of different classes of the cis-regulatory sequences identified in this study, and found that promoters are characterized by the highest degree of sequence conservation (Fig. 2a). In contrast, CTCF-binding sites and enhancers have a much lower but still significant level of sequence conservation. We next assessed the level of conservation of cis-regulatory element usage between the mouse and human genomes in embryonic stem cells (ESCs)¹⁰ (Fig. 2b). More than 70% of homologous promoters are associated with H3K4me3 in both species, confirming a high degree of conservation in promoter usage (Fig. 2c, d). However, only 25.7% and 24.8% of enhancers and CTCF-binding sites, respectively, found in human ESCs are still associated with H3K4me1 or CTCF binding in mESCs, despite a high degree of sequence conservation (Fig. 2c). These results suggest that the cis-regulatory elements identified in the mouse genome are under different selective pressure during evolution, with promoters being most conserved in both sequence and usage, whereas enhancers and CTCF-binding sites are undergoing a considerable degree of evolution. This result agrees well with the recent findings of large interspecies differences and divergence of transcriptional regulation¹⁸.

[Figure 2 graphic not reproduced. Panel a: percentage of elements conserved against PhastCons score thresholds (0.5, 0.7, 0.9) for exons, promoters, enhancers, CTCF-binding sites and random sequences. Panel b: CTCF, H3K4me3 and H3K4me1 tracks at Sox2 in mESCs (Chr3: 34,539,106–34,563,619) and at SOX2 in hESCs (Chr3: 182,900,544–182,926,534). Panels c and d: conserved occupancy (%) of human elements in mouse homologous sequences and of mouse elements in human homologous sequences, for promoters, enhancers and CTCF-binding sites*. Panel e: annotation of the 1.45 million most conserved non-coding sequences: enhancers 53%, promoters 15%, CTCF-binding sites* 2%, RNA-Seq 1%, un-annotated 29%.]

Figure 2 | Evolutionary conservation of the identified cis-regulatory elements. a, Evolutionary conservation of cis-regulatory elements, in comparison with exons and random genomic sequences. Asterisk, P < 0.001, Fisher's exact test. b, UCSC genome browser views of chromatin state and CTCF-binding sites at Sox2 loci for mESCs and human ESCs (hESCs) on chromosome 3. DNA sequences, chromatin states and CTCF binding are all conserved in this region. c, Number of hESC regulatory elements that are conserved and predicted as regulatory elements in mESCs. d, Number of mESC regulatory elements that are conserved and predicted as regulatory elements in hESCs. e, Functional annotation of the conserved non-coding sequences based on the cis-regulatory elements identified in this study. The asterisk in c, d and e indicates CTCF-binding sites that do not overlap with either promoters or enhancers.

Comparative genomic methods have identified a significant number of mammalian sequences as non-protein coding but undergoing negative selection during evolution, commonly referred to as conserved non-protein-coding sequences (CNSs). These sequences are suspected to have important biological roles, yet their precise function remains to be defined. We compared our map of cis-regulatory elements with a list of CNSs²⁰ and found that 70% of them fall into one of the three classes of predicted cis-elements: 15% as promoters, 53% as enhancers and 2% as CTCF-binding sequences. Additionally, 1% of the CNSs seem to be non-coding RNA sequences as supported by the RNA-Seq data (Fig. 2e and Supplementary Fig. 2c). Most CNSs therefore seem to function in regulating transcription.

We previously showed that enhancers in the human genome are associated with active chromatin marks in a cell-type-specific manner, whereas promoter and insulator elements tend to be ubiquitously occupied in multiple cell lines¹⁰. Here we found that the occupancy of enhancers by H3K4me1 in the mouse genome is still the most tissue-specific (Fig. 3a). In contrast, we observed that whereas H3K4me3 occupies most RefSeq promoters in multiple tissues, a significant number of promoters, especially the novel promoters discovered in this study, show tissue-specific occupancies by H3K4me3 or polII (Fig. 3a) (Supplementary Fig. 3d), with many of them corresponding to alternatively used promoters (Supplementary Table 4 and Supplementary Fig. 6). We also found that most CTCF-binding sites are occupied in multiple tissues (Fig. 3a). The tissue-specific CTCF-binding sites showed significant overlap with enhancers (P < 1.8 × 10⁻¹⁴³, binomial distribution), whereas the ubiquitous CTCF-binding sites overlapped significantly with promoters (P < 9.0 × 10⁻⁴³, binomial distribution) (Supplementary Fig. 5b, c), suggesting that a fraction of the CTCF-binding sites may function through promoters and enhancers, although the exact role of CTCF at these regions remains unclear. These results indicate that a large fraction of cis-regulatory elements are active in a tissue-specific manner and are most probably involved in regulating tissue-specific gene expression.
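The coverage and recovery figures reported on this page can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the counts are those given in the text and in Fig. 1c, whereas the genome size of about 2.7 billion base pairs is an assumption introduced here (the text states only the resulting 11%), and all variable and function names are hypothetical.

```python
# Counts quoted in the text and in Fig. 1c.
N_ELEMENTS = 295_676        # non-redundant cis-regulatory sequences
ELEMENT_SPAN_BP = 1_000     # estimated span of each element, in base pairs

# Assumption (not stated in the text): approximate size of the mouse genome.
ASSUMED_GENOME_BP = 2_700_000_000


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Combined length of all elements and the share of the genome they cover.
combined_bp = N_ELEMENTS * ELEMENT_SPAN_BP
coverage_pct = percent(combined_bp, ASSUMED_GENOME_BP)

# Known elements recovered, as (recovered, total) pairs from Fig. 1c.
recovered = {
    "promoters": (18_566, 23_523),
    "enhancers": (598, 726),
    "CTCF-binding sites": (34_833, 36_835),
}
recovery_pct = {
    name: percent(hit, total) for name, (hit, total) in recovered.items()
}

# Share of conserved non-coding sequences (CNSs) assigned to each class.
cns_classes_pct = {"promoters": 15, "enhancers": 53, "CTCF-binding sites": 2}
cns_regulatory_pct = sum(cns_classes_pct.values())

print(f"combined length: {combined_bp / 1e6:.1f} million bp")
print(f"genome coverage under the assumed genome size: {coverage_pct:.1f}%")
for name, value in recovery_pct.items():
    print(f"{name} recovered: {value:.1f}%")
print(f"CNSs in the three cis-element classes: {cns_regulatory_pct}%")
```

Changing `ASSUMED_GENOME_BP` shows how sensitive the 11% figure is to the genome size used; the recovery percentages depend only on the published counts.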
[Figure 3 graphic not reproduced. Panel a: density against tissue specificity index (from more tissue-specific to ubiquitous) for enhancers, promoters (H3K4me3), promoters (polII) and CTCF-binding sites. Panel b: density against correlation for random pairing, the nearest TSS model and the CTCF block model. Panel c: correlation heatmap (–0.6 to 0.6) for Chr19: 4,200,000–5,000,000. Panel d: normalized Hi-C interaction frequencies over Chr18: 37,000,000–38,200,000 (Pcdh gene clusters), with EPU 1–3 and H3K27ac, H3K4me1, H3K4me3, polII and CTCF ChIP-Seq intensities. Panel e: normalized Hi-C interaction frequencies against distance between predicted promoter and enhancer pairs (kilobases) within EPUs, between EPUs and for the random expectation.]

Figure 3 | Genomic organization of co-regulated promoters and enhancers. a, Tissue specificity of the usages of promoters (H3K4me3 and polII), enhancers (H3K4me1) and CTCF-binding sites. b, Distribution of the Spearman correlation coefficient of H3K4me1 at enhancers and polII at promoters of random permutation, the nearest TSS model, and the CTCF block model. c, Enhancers and promoters form co-regulated clusters of different sizes, as shown by the Spearman correlation coefficient of H3K4me1 at enhancers and polII at promoters on chromosome 19. d, Hi-C interaction heatmap showing that the physical partitioning of the genome is highly correlated with the EPUs that encompass Pcdhα, Pcdhβ and Pcdhγ gene clusters on chromosome 18. Top: normalized Hi-C interaction frequencies in mouse cortex as a two-dimensional heatmap. Bottom: UCSC genome browser views of the same regions, including the identified EPUs and the ChIP-Seq data (H3K27ac, H3K4me1, H3K4me3, polII and CTCF) in cortex. e, The average normalized Hi-C interaction frequencies for enhancer–promoter pairs within EPUs, between EPUs, and expected by random chance.

Enhancers are important in regulating tissue-specific expression patterns during mammalian development. However, finding target genes for enhancers is not straightforward because they are frequently distal from the genes they control. Assigning enhancers to the nearest transcription start sites is the most widely used method. A recently published strategy associates enhancers and promoters located within the same domain defined by the CTCF-binding sites, assuming that insulators can block promoter–enhancer interactions¹⁰. We evaluated these two methods by assessing the Spearman correlation coefficients (SCCs) between H3K4me1 signals at enhancers and the polII intensities at target promoters (Supplementary Methods). As a control, we observed that the SCCs from the randomly paired enhancers and promoters have a bell-shaped distribution with a median of 0 (Fig. 3b). The distributions of the SCCs from enhancer–promoter pairs identified by the nearest transcription start site (TSS) model and CTCF block model are only slightly better than the random control, with medians at 0.11 and 0.08, respectively (Fig. 3b). In addition, 34% and 38% of the enhancer/promoter pairs in the nearest TSS model and the CTCF block model, respectively, are negatively correlated, indicating potentially incorrect promoter assignment. To improve the linking of enhancers to their targets, a logistic regression classifier was recently introduced and shown to perform better than the nearest TSS model²¹. However, this model is still based on the one-to-one relationship between an enhancer and a gene, with a bias towards the nearby genes. It has been reported that a significant fraction of enhancers may not target the nearest promoters²². Therefore, to gain a better understanding of enhancer/promoter organization we assessed the correlation of the chromatin state at enhancers and polII occupancy at promoters for each possible pair of elements along a chromosome. We observed that co-regulated promoters and enhancers tend to form clusters with variable sizes (Fig. 3c). We developed an algorithm to detect these local clusters, defined as enhancer–promoter units (EPUs) (Supplementary Methods). Performing this analysis genome-wide, we defined 8,792 EPUs that contained at least one promoter and one enhancer (Supplementary Table 5), encompassing 1,258 million base pairs, or nearly half of the mouse genome. The median enhancer-to-promoter ratio per EPU was 5.67 (Supplementary Table 6), which is consistent with the idea that multiple enhancers may be used to regulate a gene²³. We confirmed that previously defined enhancer–promoter pairs are frequently located within the same EPU. For example, out of the 2,605 putative enhancer–promoter pairs recently defined in the human genome²¹, most of their mouse homologues are found within the same EPU (83.8% observed versus 43% expected; P < 2.2 × 10⁻¹⁶, Fisher's exact test). In addition, each of the four linked enhancer–promoter pairs reported by a recent study²⁴ was found within the same EPU. Finally, seven locus-control regions for Hbb genes were all identified within the same EPU²⁵.

The discovery of EPUs provides strong evidence that the genome is partitioned into functional domains in which cis-regulatory elements are coordinately regulated, whereas elements located in different domains are relatively insulated from each other. This organization is reminiscent of recently identified topological domains, defined by chromatin interactions, in the mammalian genome²⁶,²⁷. Indeed, comparison of the EPUs with the higher order chromatin organization shows that physical partitioning of the genome is highly correlated with functional partitioning on the basis of the coordinated activities of cis-regulatory sequences (Fig. 3d and Supplementary Fig. 7).

EPUs provide a new approach for associating enhancers with their target genes. Instead of being linked to the nearest genes, an enhancer could be assigned to one or more promoters within an EPU that show significant correlation. To validate the enhancer–promoter relationship predicted by this approach (Supplementary Table 7), we examined long-range looping interactions between the enhancers and promoters, reasoning that true enhancer–promoter target pairs should have higher interaction frequencies than neighbouring non-target sites. We performed chromosome conformation capture (3C) experiments for five enhancer–promoter pairs predicted to be linked in the cortex but not in mouse ES cells, and two enhancer–promoter pairs predicted not to be linked in either tissue or cell type. The five linked pairs showed enrichment of 3C signals, whereas the two non-linked pairs did not, indicating that the EPU analysis can accurately reveal an enhancer–promoter targeting relationship (Supplementary Fig. 8 and Supplementary Table 8). For a systematic evaluation of the enhancer–promoter pairing relationships as defined by this approach, we examined long-range looping interactions in adult mouse cortex genome-wide by using the Hi-C method²⁸. We observed that interactions between predicted enhancer–promoter pairs within the same EPUs occurred significantly more frequently than interactions between enhancer–promoter pairs of the same genomic distance but across different EPUs or by random chance (Fig. 3e; P < 2.2 × 10⁻¹⁶, Wilcoxon test). These results suggest that EPUs may help in assigning enhancers to their target promoters.
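The Spearman correlation coefficient (SCC) used above to score enhancer–promoter pairs can be illustrated with a small, self-contained sketch. This is not the authors' pipeline, which is described in the Supplementary Methods; the signal values, the promoter names, the `min_scc` threshold and the helper functions are all hypothetical and serve only to show how one enhancer could be compared with candidate promoters across tissues.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def correlated_promoters(enhancer_signal, promoter_signals, min_scc=0.5):
    """Return promoters whose SCC with the enhancer is at least min_scc.

    min_scc is a hypothetical cutoff chosen for illustration only.
    """
    scores = {
        name: spearman(enhancer_signal, signal)
        for name, signal in promoter_signals.items()
    }
    return {name: scc for name, scc in scores.items() if scc >= min_scc}


# Hypothetical H3K4me1 signal at one enhancer across six tissues.
enhancer_h3k4me1 = [0.1, 0.5, 2.0, 3.5, 0.2, 1.0]
# Hypothetical polII signal at two candidate promoters in the same tissues.
promoter_polii = {
    "promoter_A": [0.3, 0.9, 2.5, 4.0, 0.4, 1.2],  # rises and falls with the enhancer
    "promoter_B": [4.0, 1.2, 0.4, 0.3, 2.5, 0.9],  # opposite trend
}

linked = correlated_promoters(enhancer_h3k4me1, promoter_polii)
print(linked)
```

In this toy example only `promoter_A` passes the cutoff, whereas `promoter_B` is negatively correlated, which mirrors the kind of pair the text describes as a potentially incorrect assignment.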
[Figure 4 graphic not reproduced. Panel a: normalized H3K4me1 intensity at 93,006 enhancers in E14.5 brain and adult cortex. Panels b and c: Gene Ontology term enrichment (–log P) for E14.5 brain-specific and cortex-specific enhancers. Panel d: normalized intensity heatmap of enhancer clusters across the tissue and cell types. Panel e: enrichment of the Oct4, Hnf1 and REST motifs across tissues. Panel f: motif enrichment (–log P) heatmap. Panel g: PhastCons scores of motifs found in enhancers compared with random 8-mers. Panels h–k: PhastCons score profiles around individual motifs.]

Figure 4 | Motif analysis of tissue-specific enhancers. a, Classification of development stage-specific enhancers based on their chromatin state (H3K4me1) between embryonic (embryonic day 14.5; E14.5) and adult brain. b and c, Gene Ontology analysis for the genes associated with embryonic brain-specific enhancers and adult cortex-specific enhancers. d, Classification of tissue-specific enhancers on the basis of their chromatin state (H3K4me1) among different tissue and cell types. The first 19 tissue-specific clusters were used for further motif analysis. The last cluster contains enhancers enriched in multiple tissues with no clear patterns. e, Enrichment of three transcription factor recognition motifs in the predicted enhancers. REST, RE1-silencing transcription factor. f, Heatmap showing the clustering of 270 transcription factor motifs on the basis of their enrichment in the various groups of enhancers as identified in e. g, Boxplot showing that the de novo motifs found in tissue-specific enhancers are evolutionarily conserved. h–k, Examples of motifs that show high sequence conservation: h, REST motif in cortex-specific enhancers; i, Hnf1 motif in kidney-specific enhancers; j, Oct4 motif in mESC-specific enhancers; k, Atoh1 motif in cerebellum-specific enhancers.

Mammalian development requires a precise temporal gene expression program that is tightly controlled by transcription factors and cis-regulatory elements. The map of cis-regulatory sequences now provides a chance for us to analyse the potential mechanisms involved in temporal regulation of gene expression. First, we identified enhancers specific to embryonic and adult brain on the basis of H3K4me1 intensities (Fig. 4a). We observed that the former class was associated with genes expressed in neuron differentiation and neuron development, whereas the latter was associated with genes important for adult brain functions, for example the transmission of nerve impulses (Fig. 4b, c and Supplementary Fig. 9). We made similar observations for stage-specific enhancers in liver and heart (Supplementary Figs 9 and 10).

We also systematically identified potential transcriptional regulators acting on tissue-specific gene expression programs. We first defined 19 groups of tissue-specific enhancers on the basis of H3K4me1 occupancy
(Fig. 4d). Gene Ontology term analysis confirmed that the enhancers in each group are linked to genes specifically expressed in the corresponding tissue or developmental stage (Supplementary Fig. 11). We also observed that the known motifs of transcription factors that have been reported to function in certain tissues are enriched in the tissue-specific enhancers from the same tissue (Fig. 4e). To identify new transcription factors involved in each group of tissue-specific enhancers, we performed de novo motif analysis and identified 206 motifs with a very stringent cutoff (P < 10⁻²⁰; Supplementary Tables 9 and 10). We found that 91% of them (188 out of 206) showed significant levels of evolutionary conservation among the vertebrate species (Fig. 4g, h–k). We annotated the most likely transcription factor for each motif by comparing it with public transcription factor databases and verified that the matching transcription factor was expressed in the corresponding tissue. A total of 62% of the conserved de novo motifs (117 out of 188) were associated with a known transcription factor, and 75% of them (88 out of 117) have previously been implicated in the regulation of gene expression in specific tissues (Supplementary Tables 9 and 11). We performed a similar motif analysis for promoters, and compared the top motifs enriched in promoter and enhancer sequences in the same tissue (Supplementary Table 12). Only 11 motifs were shared between the two groups of motifs, whereas 93% of transcription factor motifs enriched in the tissue-specific enhancers were unique to enhancers, confirming that enhancers and promoters contain different regulatory sequences, as we reported previously¹⁰.

Here we have described an initial survey and a draft annotation of the cis-regulatory sequences in the mouse genome. The wide range of tissue and cell types examined in this study provides an unprecedented opportunity to detect tissue-specific and development-specific promoters and enhancers, analyses of which have yielded potential clues to transcription regulators of tissue-specific gene expression programs. We show that nearly half of the mouse genome is organized into EPUs containing enhancers and promoters with correlated activities. These EPUs overlap significantly with recently discovered topological domains, defined by chromatin interactions, thus linking physical partitioning of the genome with transcriptional regulation. Such multigene structures²²,²⁹ probably represent a general feature of genome organization in mammals.

METHODS SUMMARY

Mouse tissues were harvested from eight-week-old male C57Bl/6 mice (Charles River). The murine embryonic fibroblasts were isolated from C57Bl/6 embryos at embryonic day 14.5. ChIP-Seq and RNA-Seq experiments were performed as described¹⁴,³⁰, with the use of Illumina GAIIx and HiSeq 2000 instruments (details are provided in Supplementary Information). Hi-C experiments in adult cortex were conducted as described²⁸. A software pipeline to process ChIP-Seq data and predict enhancers is described in Supplementary Methods. Highly correlated biological replicates for ChIP-Seq experiments were pooled for all subsequent data analyses. An algorithm to define the enhancer–promoter unit is given in Supplementary Methods.

Received 12 May 2011; accepted 18 May 2012.
Published online 1 July 2012.

1. Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).
2. Visel, A., Rubin, E. M. & Pennacchio, L. A. Genomic views of distant-acting enhancers. Nature 461, 199–205 (2009).
3. The ENCODE Project Consortium. A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 9, e1001046 (2011).
4. Gerstein, M. B. et al. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 330, 1775–1787 (2010).
5. Roy, S. et al. Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 330, 1787–1797 (2010).
6. Barski, A. et al. High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837 (2007).
7. Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).
8. Kim, T. H. et al. A high-resolution map of active promoters in the human genome. Nature 436, 876–880 (2005).
9. Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).
10. Heintzman, N. D. et al. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112 (2009).
11. Heintzman, N. D. et al. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nature Genet. 39, 311–318 (2007).
12. Kim, T. H. et al. Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell 128, 1231–1245 (2007).
13. Visel, A. et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854–858 (2009).
14. Parkhomchuk, D. et al. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 37, e123 (2009).
15. Pruitt, K. D., Tatusova, T. & Maglott, D. R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 35, D61–D65 (2007).
16. Visel, A., Minovitsky, S., Dubchak, I. & Pennacchio, L. A. VISTA Enhancer Browser—a database of tissue-specific human enhancers. Nucleic Acids Res. 35, D88–D92 (2007).
17. Chen, X. et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117 (2008).
18. Schmidt, D. et al. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036–1040 (2010).
19. Birney, E. et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799–816 (2007).
20. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
21. Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
22. Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012).
23. Ong, C. T. & Corces, V. G. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nature Rev. Genet. 12, 283–293 (2011).
24. Kagey, M. H. et al. Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430–435 (2010).
25. Splinter, E. et al. CTCF mediates long-range chromatin looping and local histone modification in the β-globin locus. Genes Dev. 20, 2349–2354 (2006).
26. Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
27. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
28. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
29. Chepelev, I., Wei, G., Wangsa, D., Tang, Q. & Zhao, K. Characterization of genome-wide enhancer–promoter interactions reveals co-expression of interacting genes and modes of higher order chromatin organization. Cell Res. 22, 490–503 (2012).
30. Hawkins, R. D. et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491 (2010).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank F. Jin, Y. Luu, S. Klugman, A. Y.-J. Kim, Q.-M. Ngo, B. A. Gomez and S. Selvaraj for consultation. The mESC line Bruce4 was a gift from UCSD Transgenic Core. Research funding was provided by the National Human Genome Research Institute (R01HG003991) and the Ludwig Institute for Cancer Research to B.R. Y.S. is supported by a postdoctoral fellowship from the International Rett Syndrome Foundation. J.D. is supported by a pre-doctoral fellowship from the California Institute for Regenerative Medicine.

Author Contributions Y.S., F.Y. and B.R. designed the experiments. Y.S., D.M., Z.Y. and L.L. conducted experiments. F.Y. performed computational analysis. U.W. contributed to RNA-Seq data analysis. J.D. contributed to Hi-C data analysis. S.K. and L.E. performed DNA sequencing and initial data processing. V.L. provided CTCF monoclonal antibodies. Y.S., F.Y. and B.R. prepared the manuscript.

Author Information Data sets are available from the ENCODE website (http://genome.ucsc.edu/ENCODE), the supporting website for this paper (http://chromosome.sdsc.edu/mouse/index.html) and the Gene Expression Omnibus (GSE29184). Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of this article at www.nature.com/nature. Correspondence and requests for materials should be addressed to B.R. ([email protected]).