21 Nature papers in one day: PCAWG goes all out, and data analysis enters its next era

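Cancer is a disease of the genome, caused by cells acquiring somatic mutations in key cancer genes. These mutations alter the pathways that regulate cell growth and interaction with the tissue environment. Until recently, cancer genome research focused mainly on protein-coding genes, yet these genes together make up only 1% of the genome. To address this gap, the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) project carried out whole-genome sequencing and integrative analysis of more than 2,600 primary cancers and their matched normal tissues across 38 tumour types. The release of this series of results is a major aid to a deeper understanding of tumours, and for bioinformatics it signals that the analysis of non-coding sequences will become very important in future data analysis. The journals of the Nature group (Nature, Nature Genetics, Communications Biology, Nature Communications and Nature Biotechnology) jointly published the results, which are summarised below by research area.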
I. Overview
1、Pan-cancer analysis of whole genomes(Nature)Description of the PCAWG resource of >2,600 whole cancer genomes and their matching normal tissues across 38 tumour types, including data, portals, analysis pipelines and downstream integrative analyses.
II. Structural variation
2、Patterns of somatic structural variation in human cancer genomes(Nature)Analysis of patterns and signatures of structural variants across PCAWG, identifying 16 signatures of structural variation, including a new set of replication-based processes generating clusters of several rearrangements.
3、Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer(Nature Genetics)Genomic rearrangements can alter the 3D chromatin organization inside the nucleus; this study describes the prevalence and effects of these mutations on chromatin folding domains and gene expression in human cancers.
4、Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition(Nature Genetics)A computational approach to study retrotransposons (‘jumping genes’) in the human genome finds that they can participate in the origin and development of some human tumours.
5、Comprehensive analysis of chromothripsis in 2,658 human cancers using whole‑genome sequencing(Nature Genetics)Chromothripsis is found to be much more prevalent across cancers than previously thought, with a frequency of >50% in several cancer types.
III. Tumour evolution
6、The evolutionary history of 2,658 cancers(Nature)By reconstructing the life history of cancers from their genomes, the study determines the evolutionary trajectories of cancers, showing that cancers develop over many years to sometimes even decades, and highlighting opportunities for early cancer detection.
7、Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis(Communications Biology)A resource of oncogenic and tumour suppressor long noncoding RNAs reveals evidence for deep evolutionary conservation of their functions since human–mouse divergence.
IV. Mutational signatures
8、The repertoire of mutational signatures in human cancer(Nature)The characterization of 4,645 whole‑genome and 19,184 exome sequences, covering most types of cancer, identifies 81 single-base substitution, doublet-base substitution and small insertion‑and‑deletion mutational signatures, providing a systematic overview of the mutational processes that contribute to cancer development.
9、Comprehensive molecular characterization of mitochondrial genomes in human cancers(Nature Genetics)Analysis of mitochondrial genomes (mtDNA) using whole‑genome sequencing data from 2,658 cancer samples across 38 cancer types identifies hypermutated mtDNA cases, frequent somatic nuclear transfer of mtDNA and high variability of mtDNA copy number in many cancers.
10、Genomic footprints of activated telomere maintenance mechanisms in cancer(Nature Communications)Genomic characteristics are described that enable the identification of patients with alternative lengthening of telomeres from DNA sequences with high specificity, with relevance for the development of new diagnostic and prognostic tests.
11、Divergent mutational processes distinguish hypoxic and normoxic tumours(Nature Communications)Cancers grow in different locations around the body, and these differ in their levels of oxygen; the study investigates how oxygen levels change the ways tumours grow, mutate, evolve and become lethal.
V. Cancer drivers
12、Analyses of non-coding somatic drivers in 2,658 cancer whole genomes(Nature)A new framework for analysing non-coding drivers discovers new candidates and shows that they are less frequent than protein-coding disruptions.
13、Pathway and network analysis of more than 2,500 whole cancer genomes(Nature Communications)Multi-faceted pathway and network analysis of 2,583 whole cancer genomes integrates non-coding and coding mutations across known and new cancer processes.
14、The landscape of viral associations in human cancers(Nature Genetics)Viral landscape across 38 cancer types identifies known and new links to cancer aetiology.
VI. Gene regulation
15、Genomic basis for RNA alterations in cancer(Nature)This study provides a comprehensive catalogue of RNA alterations in cancer, including gene expression, splicing, allelic expression and fusions, and associates them with DNA-level alterations identified from whole‑genome sequencing. Integrated analysis of DNA and RNA changes highlights the heterogeneous mechanisms of cancer gene alterations.
16、High-coverage whole‑genome analysis of 1220 cancers reveals hundreds of genes deregulated by rearrangement-mediated cis-regulatory alterations(Nature Communications)Hundreds of genes, including known cancer-associated genes, are found to have altered expression in conjunction with the nearby presence of a somatic structural variant (SV) breakpoint.
VII. Tools
17、Butler enables rapid cloud-based analysis of thousands of human genomes(Nature Biotechnology)Butler is an open source framework for large-scale analysis of scientific data with cloud computing, which applies continuous system monitoring and automated self-healing to deal with failures, allowing for 43% more efficient data processing than prior approaches.
18、Integrative pathway enrichment analysis of multivariate omics data(Nature Communications)ActivePathways is an integrative method for prioritizing target pathways and genes in complex multi-omics data sets such as coding and non-coding mutation data of cancer genomes.
19、A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns(Nature Communications)Using machine learning, we can accurately discriminate 24 common tumour types based solely on their patterns of somatic mutation, potentially allowing us to determine the identity of tumours of uncertain primary using whole genome sequencing.
20、Combined burden and functional impact tests for cancer driver discovery using DriverPower(Nature Communications)A new highly sensitive algorithm is described for distinguishing cancer driver from passenger mutations in whole‑genome and exome sequencing data.
21、Inferring structural variant cancer cell fraction(Nature Communications)SVclone is a computational method for inferring the cancer cell fraction of structural variant (SV) breakpoints from whole‑genome sequencing data.
22、Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig(Nature Communications)A new method, TrackSig, uses mutational signatures to inform the accurate reconstruction of tumour subclones and their evolutionary trajectories.