
Transform taxa in phyloseq object and record transformation
Source:R/tax_transform.R
tax_transform.Rd
Transform taxa features, and optionally aggregate at specified taxonomic rank beforehand.
You can pipe the results of tax_agg
into tax_transform
,
or equivalently set the rank argument in tax_transform
.
Usage
tax_transform(
data,
trans,
rank = NA,
keep_counts = TRUE,
chain = FALSE,
zero_replace = 0,
add = 0,
transformation = NULL,
...
)
Arguments
- data
a phyloseq object or psExtra output from
tax_agg
- trans
any valid taxa transformation (e.g. from
microbiome::transform
)- rank
If data is phyloseq: data are aggregated at this rank before transforming. If NA, runs tax_agg(data, rank = NA). If rank is NA and data is already psExtra, any preceding aggregation is left as is.
- keep_counts
if TRUE, store the pre-transformation count data in psExtra counts slot
- chain
if TRUE, transforming again is possible when data are already transformed i.e. multiple transformations can be chained with multiple tax_transform calls
- zero_replace
Replace any zeros with this value before transforming. Either a numeric, or "halfmin" which replaces zeros with half of the smallest value across the entire dataset. Beware: the choice of zero replacement is not tracked in the psExtra output.
- add
Add this value to the otu_table before transforming. If
add
!= 0,zero_replace
does nothing. Either a numeric, or "halfmin". Beware: this choice is not tracked in the psExtra output.- transformation
deprecated, use
trans
instead!- ...
any extra arguments passed to
microbiome::transform
or pass undetected =a number
when using trans = "binary"
Details
This function often uses microbiome::transform
internally and can perform the
same transformations, including many from vegan::decostand
(where the default MARGIN = 2).
See below for notes about some of the available transformations.
tax_transform
returns a psExtra
containing the transformed phyloseq object and
extra info (used for annotating ord_plot
ordinations):
tax_transform (a string recording the transformation),
tax_agg (a string recording the taxonomic aggregation rank if specified here or earlier in
tax_agg
).
A few commonly used transformations:
"clr", or "rclr", perform the centered log ratio transformation, or the robust clr, using
microbiome::transform
"compositional" converts the data into proportions, from 0 to 1.
"identity" does not transform the data, and records this choice for
ord_plot
"binary" can be used to transform tax abundances into presence/abundance data.
"log2" which performs a log base 2 transformation (don't forget to set zero_replace if there are any zeros in your data)
(r)clr transformation note
If any values are zero, the clr transform routine first adds a small pseudocount of min(relative abundance)/2 to all values. To avoid this, you can replace any zeros in advance by setting zero_replace to a number > 0.
The rclr transform does not replace zeros. Instead, only non-zero features are transformed, using the geometric mean of non-zero features as denominator.
Binary transformation notes
By default, otu_table values of 0 are kept as 0, and all positive values
are converted to 1 (like decostand(method = "pa")
).
You can set a different threshold, by passing e.g. undetected = 10, for
example, in which case all abundances of 10 or below would be converted to 0.
All abundances above 10 would be converted to 1s.
Beware that the choice of detection threshold is not tracked in the psExtra.
Examples
data("dietswap", package = "microbiome")
# aggregate taxa at Phylum level and center log ratio transform the phyla counts
tax_transform(dietswap, trans = "clr", rank = "Phylum")
#> psExtra object - a phyloseq object with extra slots:
#>
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 8 taxa and 222 samples ]
#> sample_data() Sample Data: [ 222 samples by 8 sample variables ]
#> tax_table() Taxonomy Table: [ 8 taxa by 1 taxonomic ranks ]
#>
#> otu_get(counts = TRUE) [ 8 taxa and 222 samples ]
#>
#> psExtra info:
#> tax_agg = "Phylum" tax_trans = "clr"
# this is equivalent to the two-step method (agg then transform)
tax_agg(dietswap, rank = "Phylum") %>% tax_transform("clr")
#> psExtra object - a phyloseq object with extra slots:
#>
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 8 taxa and 222 samples ]
#> sample_data() Sample Data: [ 222 samples by 8 sample variables ]
#> tax_table() Taxonomy Table: [ 8 taxa by 1 taxonomic ranks ]
#>
#> otu_get(counts = TRUE) [ 8 taxa and 222 samples ]
#>
#> psExtra info:
#> tax_agg = "Phylum" tax_trans = "clr"
# does nothing except record tax_agg as "unique" and tax_transform as "identity" in psExtra info
dietswap %>% tax_transform("identity", rank = NA)
#> psExtra object - a phyloseq object with extra slots:
#>
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 130 taxa and 222 samples ]
#> sample_data() Sample Data: [ 222 samples by 8 sample variables ]
#> tax_table() Taxonomy Table: [ 130 taxa by 4 taxonomic ranks ]
#>
#> psExtra info:
#> tax_agg = "unique" tax_trans = "identity"
# binary transformation (convert abundances to presence/absence or detected/undetected)
tax_transform(dietswap, trans = "binary", rank = "Genus")
#> psExtra object - a phyloseq object with extra slots:
#>
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 130 taxa and 222 samples ]
#> sample_data() Sample Data: [ 222 samples by 8 sample variables ]
#> tax_table() Taxonomy Table: [ 130 taxa by 3 taxonomic ranks ]
#>
#> otu_get(counts = TRUE) [ 130 taxa and 222 samples ]
#>
#> psExtra info:
#> tax_agg = "Genus" tax_trans = "binary"
# change detection threshold by setting undetected argument (default is 0)
tax_transform(dietswap, trans = "binary", rank = "Genus", undetected = 50) %>%
otu_get() %>%
.[1:6, 1:4]
#> OTU Table: [4 taxa and 6 samples]
#> taxa are columns
#> Actinomycetaceae Aerococcus Aeromonas Akkermansia
#> Sample-1 0 0 0 0
#> Sample-2 0 0 0 1
#> Sample-3 0 0 0 1
#> Sample-4 0 0 0 1
#> Sample-5 0 0 0 0
#> Sample-6 0 0 0 0
# log2 transformation after replacing all zeros with a pseudocount of 1
tax_transform(dietswap, trans = "log2", rank = "Family", zero_replace = 1)
#> psExtra object - a phyloseq object with extra slots:
#>
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 22 taxa and 222 samples ]
#> sample_data() Sample Data: [ 222 samples by 8 sample variables ]
#> tax_table() Taxonomy Table: [ 22 taxa by 2 taxonomic ranks ]
#>
#> otu_get(counts = TRUE) [ 22 taxa and 222 samples ]
#>
#> psExtra info:
#> tax_agg = "Family" tax_trans = "log2"
# log2 transformation after replacing all zeros with a pseudocount of half
# the minimum non-zero count value in the aggregated dataset
tax_transform(dietswap, trans = "log2", rank = "Family", zero_replace = "halfmin")
#> psExtra object - a phyloseq object with extra slots:
#>
#> phyloseq-class experiment-level object
#> otu_table() OTU Table: [ 22 taxa and 222 samples ]
#> sample_data() Sample Data: [ 222 samples by 8 sample variables ]
#> tax_table() Taxonomy Table: [ 22 taxa by 2 taxonomic ranks ]
#>
#> otu_get(counts = TRUE) [ 22 taxa and 222 samples ]
#>
#> psExtra info:
#> tax_agg = "Family" tax_trans = "log2"