📦 microViz
is an R package for analysis and visualization of microbiome sequencing data.
🔨 microViz
functions are intended to be easy to use and flexible.
🔬 microViz
extends and complements popular microbial ecology packages like phyloseq
, vegan
, & microbiome
.
📎 This website is the best place for documentation and examples: https://david-barnett.github.io/microViz/
This ReadMe shows a few example analyses
The Reference page lists all functions and links to help pages and examples
The Changelog describes important changes in new microViz package versions
The Articles pages give tutorials and further examples
Creating ordination plots (e.g. PCA or PCoA)
Modelling and plotting individual taxon associations with taxatrees
More coming soon(ish)! Post on GitHub discussions if you have questions/requests
You can install the latest available microViz package version using the following instructions.
# Installing from github requires the devtools package
install.packages("devtools")
# To install the latest "released" version of this package
devtools::install_github("david-barnett/microViz@0.9.1") # check 0.9.1 is the latest release
# To install the very latest version:
devtools::install_github("david-barnett/microViz")
# If you encounter a bug please try the latest version & let me know if the bug persists!
# If the Bioconductor dependencies don't automatically install you can install
# them yourself like this:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("phyloseq", "microbiome", "ComplexHeatmap"))
💻 Windows users - will need to have RTools installed so that your computer can build this package (follow instructions here: http://jtleek.com/modules/01_DataScientistToolbox/02_10_rtools/)
🍎 macOS users - might need to install xquartz to make the heatmaps work (to do this with homebrew, run the following command in your mac’s Terminal: brew install --cask xquartz
📦 I highly recommend using renv for managing your R package installations across multiple projects.
🐳 Alternatively, for Docker users an image with the main branch installed is available at: https://hub.docker.com/r/barnettdavid/microviz-rocker-verse
📅 microViz is tested to work with R version 4 on Windows, MacOS, and Ubuntu 18 and 20. R version 3.6.* should probably work, but I don’t formally test this.
microViz provides a Shiny app for an easy way to start exploring your microbiome data: all you need is a phyloseq object.
# example data from corncob package
pseq <- corncob::ibd_phylo %>%
tax_fix() %>%
phyloseq_validate()
ord_explore(pseq) # gif generated with microViz version 0.7.4 (plays at 1.75x speed)
library(phyloseq)
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.1.2
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
# get some example data
data("dietswap", package = "microbiome")
# create a couple of numerical variables to use as constraints or conditions
dietswap <- dietswap %>%
ps_mutate(
weight = recode(bmi_group, obese = 3, overweight = 2, lean = 1),
female = if_else(sex == "female", true = 1, false = 0),
african = if_else(nationality == "AFR", true = 1, false = 0)
)
# add a couple of missing values to show how microViz handles missing data
sample_data(dietswap)$african[c(3, 4)] <- NA
You have quite a few samples in your phyloseq object, and would like to visualise their compositions. Perhaps these example data differ participant nationality?
dietswap %>%
comp_barplot(
tax_level = "Genus", n_taxa = 15, other_name = "Other",
taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."),
palette = distinct_palette(n = 15, add = "grey90"),
merge_other = FALSE, bar_outline_colour = "darkgrey"
) +
coord_flip() +
facet_wrap("nationality", nrow = 1, scales = "free") +
labs(x = NULL, y = NULL) +
theme(axis.text.y = element_blank(), axis.ticks.y = element_blank())
#> Registered S3 method overwritten by 'seriation':
#> method from
#> reorder.hclust vegan
htmp <- dietswap %>%
ps_mutate(nationality = as.character(nationality)) %>%
tax_transform("log2", add = 1, chain = TRUE) %>%
comp_heatmap(
taxa = tax_top(dietswap, n = 30), grid_col = NA, name = "Log2p",
taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."),
colors = heat_palette(palette = viridis::turbo(11)),
row_names_side = "left", row_dend_side = "right", sample_side = "bottom",
sample_anno = sampleAnnotation(
Nationality = anno_sample_cat(
var = "nationality", col = c(AAM = "grey35", AFR = "grey85"),
box_col = NA, legend_title = "Nationality", size = grid::unit(4, "mm")
)
)
)
ComplexHeatmap::draw(
object = htmp, annotation_legend_list = attr(htmp, "AnnoLegends"),
merge_legends = TRUE
)
Ordination methods can also help you to visualise if overall microbial ecosystem composition differs markedly between groups, e.g. BMI.
Here is one option as an example:
tax_filter()
tax_agg()
tax_transform()
ord_calc()
ord_plot()
# perform ordination
unconstrained_aitchison_pca <- dietswap %>%
tax_filter(min_prevalence = 0.1, tax_level = "Genus") %>%
tax_agg("Family") %>%
tax_transform("clr") %>%
ord_calc()
#> Proportional min_prevalence given: 0.1 --> min 23/222 samples.
# ord_calc will automatically infer you want a "PCA" here
# specify explicitly with method = "PCA", or you can pick another method
# create plot
pca_plot <- unconstrained_aitchison_pca %>%
ord_plot(
plot_taxa = 1:6, colour = "bmi_group", size = 1.5,
tax_vec_length = 0.325,
tax_lab_style = tax_lab_style(max_angle = 90, aspect_ratio = 0.5),
auto_caption = 8
)
# customise plot
customised_plot <- pca_plot +
stat_ellipse(aes(linetype = bmi_group, colour = bmi_group), size = 0.3) +
scale_colour_brewer(palette = "Set1") +
theme(legend.position = "bottom") +
coord_fixed(ratio = 0.5, clip = "off") # makes rotated labels align correctly
# show plot
customised_plot
You visualised your ordinated data in the plot above. Below you can see how to perform a PERMANOVA to test the significance of BMI’s association with overall microbial composition. This example uses the Family-level aitchison distance to correspond with the plot above.
# calculate distances
aitchison_dists <- dietswap %>%
tax_filter(min_prevalence = 0.1) %>%
tax_transform("identity", rank = "Family") %>%
dist_calc("aitchison")
#> Proportional min_prevalence given: 0.1 --> min 23/222 samples.
# the more permutations you request, the longer it takes
# but also the more stable and precise your p-values become
aitchison_perm <- aitchison_dists %>%
dist_permanova(
seed = 1234, # for set.seed to ensure reproducibility of random process
n_processes = 1, n_perms = 99, # you should use at least 999!
variables = "bmi_group"
)
#> 2022-04-28 12:21:50 - Starting PERMANOVA with 99 perms with 1 processes
#> 2022-04-28 12:21:50 - Finished PERMANOVA
# view the permanova results
perm_get(aitchison_perm) %>% as.data.frame()
#> Df SumOfSqs R2 F Pr(>F)
#> bmi_group 2 104.0678 0.04177157 4.773379 0.01
#> Residual 219 2387.2862 0.95822843 NA NA
#> Total 221 2491.3540 1.00000000 NA NA
# view the info stored about the distance calculation
info_get(aitchison_perm)
#> ps_extra info:
#> tax_agg = Family
#> tax_transform = identity
#> tax_scale = NA
#> distMethod = aitchison
#> ordMethod = NA
#> constraints = NA
#> conditions = NA
You could visualise the effect of the (numeric/logical) variables in your permanova directly using the ord_plot
function with constraints (and conditions).
perm2 <- aitchison_dists %>%
dist_permanova(variables = c("weight", "african", "sex"), seed = 321)
#> Dropping samples with missings: 2
#> 2022-04-28 12:21:50 - Starting PERMANOVA with 999 perms with 1 processes
#> 2022-04-28 12:21:53 - Finished PERMANOVA
We’ll visualise the effect of nationality and bodyweight on sample composition, after first removing the effect of sex.
perm2 %>%
ord_calc(constraints = c("weight", "african"), conditions = "female") %>%
ord_plot(
colour = "nationality", size = 2.5, alpha = 0.35,
auto_caption = 7,
constraint_vec_length = 1,
constraint_vec_style = vec_constraint(size = 1.5, colour = "grey15"),
constraint_lab_style = constraint_lab_style(
max_angle = 90, size = 3, aspect_ratio = 0.35, colour = "black"
)
) +
stat_ellipse(aes(colour = nationality), size = 0.2) +
scale_color_brewer(palette = "Set1") +
coord_fixed(ratio = 0.35, clip = "off") +
theme(legend.position = c(0.9, 0.1), legend.background = element_rect())
#>
#> Centering (mean) and scaling (sd) the constraints and conditions:
#> weight
#> african
#> female
microViz heatmaps are powered by ComplexHeatmap
and annotated with taxa prevalence and/or abundance.
# set up the data with numerical variables and filter to top taxa
psq <- dietswap %>%
ps_mutate(
weight = recode(bmi_group, obese = 3, overweight = 2, lean = 1),
female = if_else(sex == "female", true = 1, false = 0),
african = if_else(nationality == "AFR", true = 1, false = 0)
) %>%
tax_filter(
tax_level = "Genus", min_prevalence = 1 / 10, min_sample_abundance = 1 / 10
) %>%
tax_transform("identity", rank = "Genus")
#> Proportional min_prevalence given: 0.1 --> min 23/222 samples.
# randomly select 30 taxa from the 50 most abundant taxa (just for an example)
set.seed(123)
taxa <- sample(tax_top(psq, n = 50), size = 30)
# actually draw the heatmap
cor_heatmap(
data = psq, taxa = taxa,
taxon_renamer = function(x) stringr::str_remove(x, " [ae]t rel."),
tax_anno = taxAnnotation(
Prev. = anno_tax_prev(undetected = 50),
Log2 = anno_tax_box(undetected = 50, trans = "log2", zero_replace = 1)
)
)
😇 If you find any part of microViz useful to your work, please consider citing the JOSS article:
Barnett et al., (2021). microViz: an R package for microbiome data visualization and statistics. Journal of Open Source Software, 6(63), 3201, https://doi.org/10.21105/joss.03201
Bug reports, questions, suggestions for new features, and other contributions are all welcome. Feel free to create a GitHub Issue or write on the Discussions page. Alternatively you could also contact me (David) on Twitter @_david_barnett_ .
This project is released with a Contributor Code of Conduct and by participating in this project you agree to abide by its terms.
sessionInfo()
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggplot2_3.3.5 dplyr_1.0.8 phyloseq_1.38.0
#> [4] microViz_0.9.0.9009
#>
#> loaded via a namespace (and not attached):
#> [1] ggtext_0.1.1 nlme_3.1-155 matrixStats_0.61.0
#> [4] bitops_1.0-7 RColorBrewer_1.1-3 doParallel_1.0.17
#> [7] GenomeInfoDb_1.30.1 tools_4.1.1 utf8_1.2.2
#> [10] R6_2.5.1 vegan_2.5-7 DBI_1.1.2
#> [13] BiocGenerics_0.40.0 mgcv_1.8-38 colorspace_2.0-3
#> [16] GetoptLong_1.0.5 permute_0.9-7 rhdf5filters_1.6.0
#> [19] ade4_1.7-18 withr_2.5.0 tidyselect_1.1.2
#> [22] gridExtra_2.3 compiler_4.1.1 microbiome_1.16.0
#> [25] cli_3.2.0 Biobase_2.54.0 Cairo_1.5-15
#> [28] TSP_1.2-0 xml2_1.3.3 labeling_0.4.2
#> [31] scales_1.2.0 stringr_1.4.0 digest_0.6.29
#> [34] rmarkdown_2.13 XVector_0.34.0 pkgconfig_2.0.3
#> [37] htmltools_0.5.2 fastmap_1.1.0 highr_0.9
#> [40] GlobalOptions_0.1.2 rlang_1.0.2 rstudioapi_0.13
#> [43] shape_1.4.6 generics_0.1.2 farver_2.1.0
#> [46] jsonlite_1.8.0 RCurl_1.98-1.6 magrittr_2.0.2
#> [49] GenomeInfoDbData_1.2.7 biomformat_1.22.0 Matrix_1.4-0
#> [52] Rcpp_1.0.8.3 munsell_0.5.0 S4Vectors_0.32.3
#> [55] Rhdf5lib_1.16.0 fansi_1.0.2 ape_5.6-2
#> [58] viridis_0.6.2 lifecycle_1.0.1 stringi_1.7.6
#> [61] yaml_2.3.5 MASS_7.3-55 zlibbioc_1.40.0
#> [64] rhdf5_2.38.1 Rtsne_0.15 plyr_1.8.7
#> [67] grid_4.1.1 parallel_4.1.1 crayon_1.5.1
#> [70] lattice_0.20-45 Biostrings_2.62.0 splines_4.1.1
#> [73] gridtext_0.1.4 multtest_2.50.0 circlize_0.4.14
#> [76] magick_2.7.3 ComplexHeatmap_2.10.0 knitr_1.37
#> [79] pillar_1.7.0 igraph_1.2.11 rjson_0.2.21
#> [82] markdown_1.1 reshape2_1.4.4 codetools_0.2-18
#> [85] stats4_4.1.1 glue_1.6.1 corncob_0.2.0
#> [88] evaluate_0.14 data.table_1.14.2 png_0.1-7
#> [91] vctrs_0.3.8 foreach_1.5.2 gtable_0.3.0
#> [94] purrr_0.3.4 tidyr_1.2.0 clue_0.3-60
#> [97] assertthat_0.2.1 xfun_0.30 survival_3.2-13
#> [100] viridisLite_0.4.0 seriation_1.3.5 tibble_3.1.6
#> [103] iterators_1.0.14 registry_0.5-1 IRanges_2.28.0
#> [106] cluster_2.1.2 ellipsis_0.3.2