Green Crab Experiment Part 47
Combined metabolomics-lipidomics WCNA
The next step in my endeavor to integrate my datasets is to throw all the metabolites and lipids into a single WCNA, then correlate the module eigengenes with temperature and physiological traits. That way, I’ll have a better understanding of how metabolite-lipid interactions influence whole-organism physiology. All my work was done in this script.
Side note: R Studio on my laptop has a weird font issue. This Github issue helps.
Run WCNA
The WCNA was pretty staight-forward. I smashed together the formatted dataframes from the metabolomics and lipidomcs WCNAs, identified a soft thresholding power of 6, and ran my analysis. Once I had my modules, I correlated them with physiological traits and temperature separately to mimic what I did with the single-omics WCNAs. I only used temperature since day was not a significant factor in the lipid PCA.
Figure 1. Combined WCNA correlated with temperature and physiological traits
Interpretation plan
Honestly I’m not sure how I want to interpret this out. My gut feeling is to focus on significant modules that were correlated with a physiological trait. Whichever method I choose, I’ll need lists of the compounds in each module. Of course, getting this information took me longer than I’d like, but I’m happy with how quickly I was able to identify the issues and find solutions! Here’s what helped:
- using
summarise_ifto perform actions across numerical columns - creating a list of dataframes using a for loop
- naming objects inside a list
- outputting dataframes to csvs from a list
module_alldfs <- list() #Create a list to save dataframes
for (i in 0:19) {
module_alldfs[[i+1]] <- rbind(module_df %>%
dplyr::rename(., Module = colors,
Compound = Metabolite) %>%
mutate(Compound = str_trim(Compound, side = "both")) %>%
filter(., Module == i) %>%
inner_join(., (rawMetabolomicsData %>%
dplyr::rename(Compound = BinBase.name) %>%
mutate(Compound = str_trim(Compound, side = "both"))),
by = "Compound") %>%
dplyr::select(., -c(ret.index:mass.spec, InChI.Key)) %>%
mutate(class = "metabolite"), #Take metabolite information from the module of interest. Trim white space on each side of the compound name. Join with raw metabolomics data and retain columns of interest
module_df %>%
dplyr::rename(., Module = colors,
Compound = Metabolite) %>%
mutate(Compound = str_trim(Compound, side = "both")) %>%
filter(., Module == i) %>%
inner_join(., (rawLipidomicsData %>%
dplyr::rename(Compound = lipid) %>%
mutate(Compound = str_trim(Compound, side = "both"))),
by = c("Compound")) %>%
dplyr::select(., -c(identifier:ESI.mode)) %>%
unique(.) %>%
group_by(Compound) %>%
summarise_if(is.numeric, ~sum(., na.rm = TRUE)) %>%
ungroup(.) %>%
mutate(., PubChem = NA,
KEGG = NA,
Module = i,
class = "lipid") #Take lipid information from the module of interest. Trim white space on each side of the compound name. Join with raw lipid data and retain columns of interest. Concatenate any duplicate rows for lipids with the same name.
)
} #Use a for loop to create a list of module dataframes
module_allInformation <- data.frame("Module" = 0:19,
"Metabolites" = rep(0, times = 20),
"Lipids" = rep(0, times = 20)) #Create empty dataframe to store information
for (i in 0:19) {
module_allInformation$Metabolites[i+1] <- nrow(module_alldfs[[i+1]] %>% filter(., class == "metabolite")) #Count the number of metabolites in specified list
module_allInformation$Lipids[i+1] <- nrow(module_alldfs[[i+1]] %>% filter(., class == "lipid")) #Count the number of lipids in the specified list
}
head(module_allInformation) #Confirm dataframe modification
write.csv(module_allInformation, "combined-WCNA/module-information.csv", quote = FALSE, row.names = FALSE)
for (i in 0:19) {
write_delim(module_alldfs[[i+1]], paste0("combined-WCNA/module", i, "_compounds.tsv"), delim = "\t", col_names = TRUE)
} #Save each module dataframe from list as a tab-delimted file
All lists can be found in this folder, and the counts of metabolites and lipids in each module can be found here.
Interpret output
I decided to move forward with my plan to only look at the modules that had significant correlations with temperature and a physiological trait:
- ME3: Negatively correlated with 5ºC, negatively correlated with righting response, positively correlated with oxygen consumption
- ME6: Negatively correlated with 5ºC, negatively correlated with TTR
- ME13: Negatively correlated with 5ºC, positively correlated with 30ºC, negatively correlated with TTR
- ME11: Negatively correlated with 30ºC, negatively correlated with oxygen consumption
When I looked at the number of metabolites and lipids in each module, almost all of them were single-compound modules! That in itself suggests that metabolites and lipids have unique roles in regulating the physiological endpoints we are interested in. My next step was annotate the compounds in these modules. This was MUCH easier than annotating last week since I already figured out the nuts and bolts of how to do it. I annotated metabolites and lipids separately. I was interested in:
- whether any of the compounds were VIP for temperature
- whether any of the compounds were in a priori pathways or had lipid class information
- whether any of the compounds were in enriched pathways for temperature
module3_allMetabAnnot <- module_alldfs[[4]] %>%
dplyr::select(Compound:KEGG, class) %>%
filter(., class == "metabolite") %>%
mutate(., respirationMetabolites = case_when(Compound %in% respirationMetabolites %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% respirationMetabolites %>% str_trim(., side = "both") == FALSE ~ "N")) %>%
mutate(., calciumSignalingMetabolites = case_when(Compound %in% calciumSignalingMetabolites %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% calciumSignalingMetabolites %>% str_trim(., side = "both") == FALSE ~ "N")) %>%
mutate(., coldProtectionMetabolites = case_when(Compound %in% coldProtectionMetabolites %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% coldProtectionMetabolites %>% str_trim(., side = "both") == FALSE ~ "N")) %>%
left_join(x = .,
y = read.delim("../05-metabolomics-analysis/metaboanalyst/all-module-VIP.txt", sep = "\t") %>%
dplyr::select(metabolite, comparison) %>%
mutate(metabolite = str_trim(metabolite, side = "both")) %>%
pivot_wider(., id_cols = metabolite, names_from = "comparison", names_prefix = "VIP", values_from = "comparison") %>%
mutate(., VIP13v30_day22 = case_when(is.na(VIP13v30_day22) == FALSE ~ "Y",
is.na(VIP13v30_day22) == TRUE ~ "N")) %>%
mutate(., VIP13v5_day22 = case_when(is.na(VIP13v5_day22) == FALSE ~ "Y",
is.na(VIP13v5_day22) == TRUE ~ "N")) %>%
mutate(., VIP5_day3v22 = case_when(is.na(VIP5_day3v22) == FALSE ~ "Y",
is.na(VIP5_day3v22) == TRUE ~ "N")) %>%
dplyr::rename(., Compound = metabolite),
by = "Compound") %>%
mutate(., VIP13v30_day22 = case_when(is.na(VIP13v30_day22) == TRUE ~ "N",
is.na(VIP13v30_day22) == FALSE ~ VIP13v30_day22)) %>%
mutate(., VIP13v5_day22 = case_when(is.na(VIP13v5_day22) == TRUE ~ "N",
is.na(VIP13v5_day22) == FALSE ~ VIP13v5_day22)) %>%
mutate(., VIP5_day3v22 = case_when(is.na(VIP5_day3v22) == TRUE ~ "N",
is.na(VIP5_day3v22) == FALSE ~ VIP5_day3v22)) %>%
mutate(., valineBiosynthesis = case_when(Compound %in% valineBiosynthesis %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% valineBiosynthesis %>% str_trim(., side = "both") == FALSE ~ "N")) %>%
mutate(., arginineBiosynthesis = case_when(Compound %in% arginineBiosynthesis %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% arginineBiosynthesis %>% str_trim(., side = "both") == FALSE ~ "N")) %>%
mutate(., glutathioneMetabolism = case_when(Compound %in% glutathioneMetabolism %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% glutathioneMetabolism %>% str_trim(., side = "both") == FALSE ~ "N")) %>%
mutate(., alanineMetabolism = case_when(Compound %in% alanineMetabolism %>% str_trim(., side = "both") == TRUE ~ "Y",
Compound %in% alanineMetabolism %>% str_trim(., side = "both") == FALSE ~ "N"))
#Take module information and retain columns of interest. Subset for metabolite information. Add annotation information for a priori pathways, VIP for temperature treatments, and enrichment for temperature treatment.
module3_allLipidAnnot <- module_alldfs[[4]] %>%
dplyr::select(Compound:KEGG, class) %>%
filter(., class == "lipid") %>%
left_join(x = .,
y = read.delim("../06-lipidomics-analysis/metaboanalyst/all-module-VIP.txt", sep = "\t") %>%
dplyr::select(lipid, comparison) %>%
mutate(lipid = str_trim(lipid, side = "both")) %>%
dplyr::rename(Compound = lipid) %>%
unique(.) %>%
pivot_wider(., id_cols = Compound, names_from = "comparison", names_prefix = "VIP", values_from = "comparison") %>%
mutate(., VIP13v30 = case_when(is.na(VIP13v30) == FALSE ~ "Y",
is.na(VIP13v30) == TRUE ~ "N")) %>%
mutate(., VIP13v5 = case_when(is.na(VIP13v5) == FALSE ~ "Y",
is.na(VIP13v5) == TRUE ~ "N")) ,
by = "Compound") %>%
mutate(., VIP13v30 = case_when(is.na(VIP13v30) == TRUE ~ "N",
is.na(VIP13v30) == FALSE ~ VIP13v30)) %>%
mutate(., VIP13v5 = case_when(is.na(VIP13v5) == TRUE ~ "N",
is.na(VIP13v5) == FALSE ~ VIP13v5)) %>%
left_join(x = .,
y = significantVIP_13v30_annot %>%
rbind(significantVIP_13v5_annot) %>%
mutate(lipid = str_trim(lipid, side = "both")) %>%
dplyr::select(lipid, class:saturation) %>%
unique(.) %>%
dplyr::rename(lipidClass = class,
Compound = lipid),
by = "Compound") %>%
left_join(x = .,
y = read.csv("../06-lipidomics-analysis/metaboanalyst/LION-enrichment-report-13v30/LION-lipid-associations-13v30.csv", strip.white = TRUE) %>%
slice(-1) %>%
mutate(LION.term = substr(x = LION.term, start = 9, stop = 1000000L)) %>%
rename(lipid = LION.term) %>%
mutate(lipid = str_trim(lipid, side = "both")) %>%
unique(.) %>%
remove_rownames() %>%
column_to_rownames(., var = "lipid") %>%
dplyr::select_if(colnames(.) %in% LIONtoTerm13v30$LION.term) %>%
rownames_to_column(., var = "Compound"),
by = "Compound") %>%
left_join(x = .,
y = read.csv("../06-lipidomics-analysis/metaboanalyst/LION-enrichment-report-13v5/LION-lipid-associations-13v5.csv", strip.white = TRUE) %>%
slice(-1) %>%
mutate(LION.term = substr(x = LION.term, start = 9, stop = 1000000L)) %>%
rename(lipid = LION.term) %>%
mutate(lipid = str_trim(lipid, side = "both")) %>%
unique(.) %>%
remove_rownames() %>%
column_to_rownames(., var = "lipid") %>%
dplyr::select_if(colnames(.) %in% LIONtoTerm13v5$LION.term) %>%
rownames_to_column(., var = "Compound")) %>%
mutate(., LION.0000003 = case_when(LION.0000003 == "x" ~ "both",
LION.0000003 != "x" ~ NA)) %>%
mutate(., LION.0000010 = case_when(LION.0000010 == "x" ~ "13v30",
LION.0000010 != "x" ~ NA)) %>%
mutate(., LION.0000031 = case_when(LION.0000031 == "x" ~ "13v30",
LION.0000031 != "x" ~ NA)) %>%
mutate(., LION.0000095 = case_when(LION.0000095 == "x" ~ "13v30",
LION.0000095 != "x" ~ NA)) %>%
mutate(., LION.0000465 = case_when(LION.0000465 == "x" ~ "13v30",
LION.0000465 != "x" ~ NA)) %>%
mutate(., LION.0012010 = case_when(LION.0012010 == "x" ~ "13v30",
LION.0012010 != "x" ~ NA)) %>%
mutate(., LION.0012080 = case_when(LION.0012080 == "x" ~ "both",
LION.0012080 != "x" ~ NA)) %>%
mutate(., LION.0000011 = case_when(LION.0000011 == "x" ~ "13v5",
LION.0000011 != "x" ~ NA)) %>%
mutate(., LION.0012081 = case_when(LION.0012081 == "x" ~ "13v5",
LION.0012081 != "x" ~ NA))
#Take module information and retain columns of interest. Subset for lipid information. Add annotation information for VIP for temperature treatments, class and saturation state for VIP, and enrichment for temperature treatment.
All of the output can be found in this folder.
Since there were so few compounds in each module, I opted not to run any enrichment with MetaboAnalyst or LION/web. I may pull that information at a later time if I’m having trouble writing the discussion.
Going forward
- ASCA analysis
- Metabolomics and lipidomics DIABLO analysis
- Outline discussion
- Write discussion
- Write abstract