RNA-seq (10): KEGG pathway visualization with gage and pathview
Posted: 2023-10-05 02:30
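This part continues directly from the data produced in the previous part, RNA-seq (9): enrichment analysis (functional annotation). If you saved that data to disk, you can simply import it here and convert the gene IDs. We start with a different R package, gage (Generally Applicable Gene-set Enrichment for Pathway Analysis), to run the KEGG enrichment analysis, which also lets us compare the results with the previous part.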

A few notes before we start

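  • Species abbreviations are listed in the KEGG organism list (this post uses mmu, mouse).
  • We use the gage package (Generally Applicable Gene-set Enrichment for Pathway Analysis) for the pathway analysis. See the gage package workflow vignette for RNA-seq pathway analysis for the full gage workflow. Once we have a list of enriched pathways, the pathview package can visualize them. This step uses the up- and down-regulation information.
  • Visualization is done with pathview.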

Install the R packages first

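# biocLite() has been retired; current Bioconductor releases install through BiocManager
install.packages(c("BiocManager", "dplyr"))
BiocManager::install(c("gage", "pathview", "gageData", "clusterProfiler", "org.Mm.eg.db"))
library("pathview")
library("gage")
library("gageData")
library("dplyr")
# clusterProfiler provides bitr(); org.Mm.eg.db is the mouse annotation database used below
library(clusterProfiler)
library(org.Mm.eg.db)
#library(DOSE)
#library(stringr)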

Load the data

data(kegg.sets.mm)
data(sigmet.idx.mm)
# Keep only the signaling and metabolic pathways
kegg.sets.mm = kegg.sets.mm[sigmet.idx.mm]
head(kegg.sets.mm,3)
setwd("F:/rna_seq/data/matrix")
sig.gene<-read.csv(file="DEG_treat_vs_control.csv")
# The first column (X) holds the Ensembl gene IDs
gene<-sig.gene[,1]
gene.df<-bitr(gene, fromType = "ENSEMBL", toType = c("SYMBOL","ENTREZID"), OrgDb = org.Mm.eg.db)
head(sig.gene)

> head(sig.gene)
                   X baseMean log2FoldChange     lfcSE      stat       pvalue         padj
1 ENSMUSG00000003309 548.1926       3.231611 0.2658125 12.157485 5.234568e-34 8.193146e-30
2 ENSMUSG00000046323 404.1894       3.067050 0.2628220 11.669687 1.820923e-31 1.425055e-27
3 ENSMUSG00000001123 341.8542       2.797485 0.2766499 10.112004 4.887441e-24 2.549941e-20
4 ENSMUSG00000023906 951.9460       2.382307 0.2510718  9.488551 2.342684e-21 9.116395e-18
5 ENSMUSG00000018569 485.4839       3.136031 0.3312999  9.465836 2.912214e-21 9.116395e-18
6 ENSMUSG00000000184 601.0842      -2.827750 0.3154171 -8.965112 3.099648e-19 8.085948e-16

Now run the enrichment analysis with gage. The gage() function needs a vector of fold changes named by Entrez gene ID.

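# bitr() can drop or repeat IDs, so align by Ensembl ID rather than by row position
idx = match(gene.df$ENSEMBL, sig.gene$X)
foldchanges = sig.gene$log2FoldChange[idx]
names(foldchanges) = gene.df$ENTREZID
head(foldchanges)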

This prints:

> head(foldchanges)
    11768     73708     16859     54419     53624     12444
 3.231611  3.067050  2.797485  2.382307  3.136031 -2.827750
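A caveat about naming the fold changes: pairing them with Entrez IDs by position only works if the ID conversion returns exactly one row per input gene, in the same order. Converters such as bitr() may drop unmapped IDs or return several rows for a single ID. Below is a minimal base-R sketch of aligning by Ensembl ID instead. The two toy data frames and every ID in them are invented for illustration; only the column names follow the tables in this post.

```r
# Toy DEG table: column X holds Ensembl-style IDs, as in sig.gene (values invented)
deg <- data.frame(
  X = c("ENSMUSG_A", "ENSMUSG_B", "ENSMUSG_C", "ENSMUSG_D"),
  log2FoldChange = c(3.2, 3.1, 2.8, -2.8),
  stringsAsFactors = FALSE
)

# Toy ID map: ENSMUSG_B is unmapped (absent), ENSMUSG_C maps to two Entrez IDs
id_map <- data.frame(
  ENSEMBL  = c("ENSMUSG_A", "ENSMUSG_C", "ENSMUSG_C", "ENSMUSG_D"),
  ENTREZID = c("101", "303", "304", "404"),
  stringsAsFactors = FALSE
)

# Look up each mapped row's fold change by Ensembl ID, not by row position
idx <- match(id_map$ENSEMBL, deg$X)
fc <- deg$log2FoldChange[idx]
names(fc) <- id_map$ENTREZID

print(fc)
```

With positional assignment, the four fold changes would have been labeled with the wrong genes as soon as ENSMUSG_B dropped out of the map.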

Run the pathway analysis and collect the results

keggres = gage(foldchanges, gsets = kegg.sets.mm, same.dir = TRUE)
# Look at both up (greater), down (less), and statistics.
lapply(keggres, head)

This prints:

> lapply(keggres, head)
$greater
                                                     p.geomean  stat.mean     p.val     q.val set.size      exp1
mmu04514 Cell adhesion molecules (CAMs)              0.2680462  0.6286461 0.2680462 0.5360924       12 0.2680462
mmu04510 Focal adhesion                              0.6382502 -0.3594187 0.6382502 0.6382502       10 0.6382502
mmu04144 Endocytosis                                        NA        NaN        NA        NA        8        NA
mmu03008 Ribosome biogenesis in eukaryotes                  NA        NaN        NA        NA        0        NA
mmu04141 Protein processing in endoplasmic reticulum        NA        NaN        NA        NA        0        NA
mmu04740 Olfactory transduction                             NA        NaN        NA        NA        1        NA

$less
                                                     p.geomean  stat.mean     p.val     q.val set.size      exp1
mmu04510 Focal adhesion                              0.3617498 -0.3594187 0.3617498 0.7234996       10 0.3617498
mmu04514 Cell adhesion molecules (CAMs)              0.7319538  0.6286461 0.7319538 0.7319538       12 0.7319538
mmu04144 Endocytosis                                        NA        NaN        NA        NA        8        NA
mmu03008 Ribosome biogenesis in eukaryotes                  NA        NaN        NA        NA        0        NA
mmu04141 Protein processing in endoplasmic reticulum        NA        NaN        NA        NA        0        NA
mmu04740 Olfactory transduction                             NA        NaN        NA        NA        1        NA

$stats
                                                      stat.mean       exp1
mmu04514 Cell adhesion molecules (CAMs)               0.6286461  0.6286461
mmu04510 Focal adhesion                              -0.3594187 -0.3594187
mmu04144 Endocytosis                                        NaN         NA
mmu03008 Ribosome biogenesis in eukaryotes                  NaN         NA
mmu04141 Protein processing in endoplasmic reticulum        NaN         NA
mmu04740 Olfactory transduction                             NaN         NA
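The NA rows in this output are gene sets that were not actually tested. As far as I know, gage() only tests sets whose size falls inside its set.size range (c(10, 500) by default), and since only the significant genes were passed in, most sets contain too few of them. Below is a minimal sketch of dropping the untested rows before ranking. The matrix `greater` is hand-built from four of the rows printed above and merely stands in for keggres$greater.

```r
# Stand-in for keggres$greater, copied from the printed output
greater <- matrix(
  c(0.2680462,  0.6286461, 0.2680462, 0.5360924, 12,
    0.6382502, -0.3594187, 0.6382502, 0.6382502, 10,
    NA,         NaN,       NA,        NA,         8,
    NA,         NaN,       NA,        NA,         1),
  ncol = 5, byrow = TRUE,
  dimnames = list(
    c("mmu04514 Cell adhesion molecules (CAMs)",
      "mmu04510 Focal adhesion",
      "mmu04144 Endocytosis",
      "mmu04740 Olfactory transduction"),
    c("p.geomean", "stat.mean", "p.val", "q.val", "set.size")
  )
)

# Keep only rows with a real p-value, then sort by it
tested <- greater[!is.na(greater[, "p.val"]), , drop = FALSE]
tested <- tested[order(tested[, "p.val"]), , drop = FALSE]

print(rownames(tested))
```

Filtering this way before taking the "top 10" avoids sending untested pathways on to the plotting step.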

Extract the pathways

# as_tibble() replaces the deprecated tbl_df()
keggrespathways = data.frame(id=rownames(keggres$greater), keggres$greater) %>%
  as_tibble() %>%
  filter(row_number()<=10) %>%
  .$id %>%
  as.character()
keggrespathways

The result:

> keggrespathways
 [1] "mmu04514 Cell adhesion molecules (CAMs)"              "mmu04510 Focal adhesion"
 [3] "mmu04144 Endocytosis"                                 "mmu03008 Ribosome biogenesis in eukaryotes"
 [5] "mmu04141 Protein processing in endoplasmic reticulum" "mmu04740 Olfactory transduction"
 [7] "mmu03010 Ribosome"                                    "mmu04622 RIG-I-like receptor signaling pathway"
 [9] "mmu04744 Phototransduction"                           "mmu04062 Chemokine signaling pathway"

# Get the IDs.
keggresids = substr(keggrespathways, start=1, stop=8)
keggresids

> keggresids
 [1] "mmu04514" "mmu04510" "mmu04144" "mmu03008" "mmu04141" "mmu04740" "mmu03010" "mmu04622" "mmu04744" "mmu04062"
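Fixed positions in substr() work here because every mouse ID is eight characters long ("mmu" plus five digits). KEGG organism codes can be three or four letters, so a pattern-based extraction carries over to other species more easily. Below is a minimal sketch, assuming each name starts with the ID followed by a space. The "hypo01234" entry is an invented placeholder with a four-letter prefix, not a real pathway.

```r
paths <- c("mmu04514 Cell adhesion molecules (CAMs)",
           "mmu04510 Focal adhesion",
           "hypo01234 Made-up pathway name")  # invented example

# Keep everything before the first space
ids <- sub(" .*$", "", paths)

print(ids)
```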

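Finally, the pathview() function from the pathview package draws the diagrams. We define a small function below so that it is easy to loop over the top 10 pathways produced above.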

# Define the plotting function first
plot_pathway = function(pid) pathview(gene.data=foldchanges, pathway.id=pid, species="mmu", new.signature=FALSE)
# Plot several pathways at once; the plots are saved to the working directory automatically
tmp = sapply(keggresids, function(pid) pathview(gene.data=foldchanges, pathway.id=pid, species="mmu"))

This prints:

> tmp = sapply(keggresids, function(pid) pathview(gene.data=foldchanges, pathway.id=pid, species="mmu"))
Info: Downloading xml files for mmu04514, 1/1 pathways..
Info: Downloading png files for mmu04514, 1/1 pathways..
'select()' returned 1:1 mapping between keys and columns
Info: Working in directory F:/rna_seq/data/matrix
Info: Writing image file mmu04514.pathview.png
Info: Downloading xml files for mmu04510, 1/1 pathways..
Info: Downloading png files for mmu04510, 1/1 pathways..
'select()' returned 1:1 mapping between keys and columns
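As the log shows, pathview() downloads the XML and PNG files for each pathway, so a single failed download could interrupt a long loop. Below is a minimal sketch of wrapping each call in tryCatch() so the remaining pathways are still processed. `draw_one` is a hypothetical stand-in for the pathview() call so that the sketch runs without network access, and it fails on purpose for one ID.

```r
# Hypothetical stand-in for pathview(gene.data=foldchanges, pathway.id=pid, species="mmu")
draw_one <- function(pid) {
  if (pid == "mmu04144") stop("simulated download failure")
  paste0(pid, ".pathview.png")
}

ids <- c("mmu04514", "mmu04510", "mmu04144")

# Run every ID; record NA instead of aborting when one call fails
results <- vapply(ids, function(pid) {
  tryCatch(
    draw_one(pid),
    error = function(e) {
      message("Skipping ", pid, ": ", conditionMessage(e))
      NA_character_
    }
  )
}, character(1))

print(results)
```

In a real run, the body of `draw_one` would be replaced with the pathview() call, and the NA entries would show which IDs to retry.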

Next, open the working directory to look at the KEGG pathway images. Here are three of them:





[Figure: mmu04144.pathview.png]

[Figure: mmu04510.pathview.png]

[Figure: mmu04514.pathview.png]

That completes the KEGG pathway visualization.

Postscript:

For a more detailed visualization (which can start from counts), see
