Microbiome data analysis with microViz

David Barnett

Part 1: Diversity

Intro to phyloseq


phyloseq S4 object: processed mouse gut microbiota data

mice <- readRDS(here::here('data/mice.rds'))


Printed object shows you functions to access the data inside!

mice
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 3229 taxa and 520 samples ]
sample_data() Sample Data:       [ 520 samples by 11 sample variables ]
tax_table()   Taxonomy Table:    [ 3229 taxa by 7 taxonomic ranks ]

Taxonomy table

tax_table(mice) %>% head(15)
Taxonomy Table:     [15 taxa by 7 taxonomic ranks]:
        Kingdom    Phylum           Class                 Order             Family               Genus              Species      
ASV0001 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0002 "Bacteria" "Proteobacteria" "Gammaproteobacteria" "Xanthomonadales" "Xanthomonadaceae"   "Stenotrophomonas" "maltophilia"
ASV0003 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0004 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0005 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0006 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0007 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0008 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0009 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0010 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Rikenellaceae"      "Alistipes"        NA           
ASV0011 "Bacteria" "Bacteroidetes"  "Bacteroidia"         "Bacteroidales"   "Porphyromonadaceae" NA                 NA           
ASV0012 "Bacteria" "Firmicutes"     "Clostridia"          "Clostridiales"   "Lachnospiraceae"    "Eisenbergiella"   NA           
ASV0013 "Bacteria" "Proteobacteria" "Gammaproteobacteria" "Pseudomonadales" "Moraxellaceae"      "Acinetobacter"    NA           
ASV0014 "Bacteria" "Firmicutes"     "Clostridia"          "Clostridiales"   "Lachnospiraceae"    NA                 NA           
ASV0015 "Bacteria" "Firmicutes"     "Bacilli"             "Lactobacillales" "Lactobacillaceae"   "Lactobacillus"    NA           

OTU table

otu_table(mice)[1:8, 1:8]
OTU Table:          [8 taxa and 8 samples]
                     taxa are columns
       ASV0001 ASV0002 ASV0003 ASV0004 ASV0005 ASV0006 ASV0007 ASV0008
D14.A1    3343       0    4205    3470    3607    1210    1159    1852
D14.B5    4332       0    5412    2494    3083    1451    1663    1745
D0.D5     5344       0    3906    1439    2396    1402    1217    2078
D0.E1     2994       0    4005    2188    2882    1267    1821    1788
D0.E2     2315       0    3987     955    1665    1025     899    1342
D0.E3     1972       0    4336    1876    3422    1208     551    1900
D0.E4     2352       0    2561    1960    2060    1350    1095    1200
D0.E5     1386       0    1752    1471    1363     724    1050     871

You can also use the @ symbol.

mice@otu_table[1:8, 1:8] # the same result
OTU Table:          [8 taxa and 8 samples]
                     taxa are columns
       ASV0001 ASV0002 ASV0003 ASV0004 ASV0005 ASV0006 ASV0007 ASV0008
D14.A1    3343       0    4205    3470    3607    1210    1159    1852
D14.B5    4332       0    5412    2494    3083    1451    1663    1745
D0.D5     5344       0    3906    1439    2396    1402    1217    2078
D0.E1     2994       0    4005    2188    2882    1267    1821    1788
D0.E2     2315       0    3987     955    1665    1025     899    1342
D0.E3     1972       0    4336    1876    3422    1208     551    1900
D0.E4     2352       0    2561    1960    2060    1350    1095    1200
D0.E5     1386       0    1752    1471    1363     724    1050     871

Sample data (a.k.a. metadata)


sample_data(mice)[1:15, 1:10]
                sample_id      barcode run plate sample   sex cage treatment_days treatment      virus
D14.A1  1.Thackray.D14.A1 ACGAGACTGATT 368     1      1 FALSE   A1           D.14   Vehicle Uninfected
D14.B5 10.Thackray.D14.B5 ACCGGTATGTAC 368     1     10 FALSE   B2           D.14   Vehicle    WNV2000
D0.D5  100.Thackray.D0.D5 ATCTACCGAAGC 368     2    100 FALSE   D5             D0     Metro Uninfected
D0.E1  101.Thackray.D0.E1 ACTTGGTGTAAG 368     2    101 FALSE   E1             D0     Metro    WNV2000
D0.E2  102.Thackray.D0.E2 TCTTGGAGGTCA 368     2    102 FALSE   E2             D0     Metro    WNV2000
D0.E3  103.Thackray.D0.E3 TCACCTCCTTGT 368     2    103 FALSE   E3             D0     Metro    WNV2000
D0.E4  104.Thackray.D0.E4 GCACACCTGATA 368     2    104 FALSE   E4             D0     Metro    WNV2000
D0.E5  105.Thackray.D0.E5 GCGACAATTACA 368     2    105 FALSE   E5             D0     Metro    WNV2000
D0.F1  106.Thackray.D0.F1 TCATGCTCCATT 368     2    106 FALSE   F1             D0     Metro    WNV2000
D0.F2  107.Thackray.D0.F2 AGCTGTCAAGCT 368     2    107 FALSE   F2             D0     Metro    WNV2000
D0.F3  108.Thackray.D0.F3 GAGAGCAACAGA 368     2    108 FALSE   F3             D0     Metro    WNV2000
D0.F4  109.Thackray.D0.F4 TACTCGGGAACT 368     2    109 FALSE   F4             D0     Metro    WNV2000
D14.C1 11.Thackray.D14.C1 AATTGTGTCGGA 368     1     11 FALSE   A3           D.14   Vehicle Uninfected
D0.F5  110.Thackray.D0.F5 CGTGCTTAGGCT 368     2    110 FALSE   F5             D0     Metro    WNV2000
D0.G1  111.Thackray.D0.G1 TACCGAAGGTAT 368     2    111 FALSE   G1             D0       Amp Uninfected

Ranks

rank_names(mice)
[1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"   "Species"

Taxa

taxa_names(mice) %>% head() 
[1] "ASV0001" "ASV0002" "ASV0003" "ASV0004" "ASV0005" "ASV0006"

Samples

sample_names(mice) %>% head() 
[1] "D14.A1" "D14.B5" "D0.D5"  "D0.E1"  "D0.E2"  "D0.E3" 

Variables

sample_variables(mice)
 [1] "sample_id"       "barcode"         "run"             "plate"           "sample"          "sex"             "cage"            "treatment_days"  "treatment"       "virus"           "survival_status"

Cheat code

View(mice)

Seeing is believing

Get a feel for your data, by making simple plots.

We’re going to use the package microViz

Data subsetting

We can subset the samples using the sample_data info

miceSubset <- mice %>% ps_filter(
  treatment_days == "D13", virus == "WNV2000", treatment == "Vehicle"
)
miceSubset
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 563 taxa and 10 samples ]
sample_data() Sample Data:       [ 10 samples by 11 sample variables ]
tax_table()   Taxonomy Table:    [ 563 taxa by 7 taxonomic ranks ]

Basic bar plots

miceSubset %>% comp_barplot(
  tax_level = "unique", n_taxa = 12, bar_width = 0.7,
  sample_order = "asis", tax_transform_for_plot = "identity"
) + coord_flip()

Proportional bar plots

miceSubset %>% comp_barplot(
  tax_level = "unique", n_taxa = 12, bar_width = 0.7,
  sample_order = "asis"
) + coord_flip()

Proportional bar plot - better names

miceSubset %>% comp_barplot(
  tax_level = 'unique', n_taxa = 12, bar_width = 0.7,
  sample_order = 'asis'
) + coord_flip()

Proportional bar plot - all ASVs

miceSubset %>% comp_barplot(
  tax_level = 'unique', n_taxa = 12, bar_width = 0.7,
  sample_order = 'asis', merge_other = FALSE
) + coord_flip()

Aggregated bar plot - Families

miceSubset %>% comp_barplot(
  tax_level = "Family", n_taxa = 10, bar_width = 0.7, 
  sample_order = 'asis'
) + coord_flip()

Aggregated bar plot - Genera

miceSubset %>% comp_barplot(
    tax_level = "Genus", n_taxa = 12, bar_width = 0.7,
    sample_order = 'asis', merge_other = FALSE
  ) + coord_flip()

Aggregated bar plot - Phyla

miceSubset %>% comp_barplot(
    tax_level = "Phylum", n_taxa = 7, bar_width = 0.7, 
    sample_order = 'asis'
  ) + coord_flip()

Phylum name changes!

  • Actinobacteria is now Actinomycetota
  • Bacteroidetes is now Bacteroidota
  • Proteobacteria is now Pseudomonadota (!)
  • Firmicutes is now Bacillota (!?!)

Alpha diversity

How diverse is the bacterial microbiome of each sample?

Why is diversity interesting?


Biologically

  • Diverse == healthy, resilient, good (?)


Practically

  • Summarize an ecosystem in one number: easy to compare

Quantifying diversity?

Observed Richness

The more the merrier. Just counting. \(N = N\).

Diversity - Shannon index (H)

Richness AND evenness matter. \(H = -\sum_{i=1}^Np_i\ln p_i\)

Diversity - Effective numbers

\(e^H = N\) if all taxa were equally abundant.

Your turn