WGBS Analysis Part 19

Separating female and indeterminate samples

My intended R code works, so now I need to finalize my script and create an analogous methylKit script for mox!

Before creating a mox script, I wanted to separate out the indeterminate and female oyster samples. When looking for covariate metadata, I realized that not all samples are from female oysters! When I picking extracted DNA samples to sequence, I tried to get as many female samples as possible, but due to some issues with having enough material for sequencing, I ended up sequencing one indeterminate sample per treatment. My plan was to use this indeterminate sample to see if any DML I identified were more related to maturation status, and how immature oyster gonad methylation could be impacted. I promptly forgot I had this plan until looking at previous lab notebooks, so thank you past Yaamini.

My first instinct was to try subsetting the methylRawListDB created by methRead like a standard dataframe, but that didn’t work. I returned to the methylKit handbook and found a convenience function to subset objects, select. However, this function doesn’t work for methylRawListDB! A quick search lead my back to reorganize, which is the function I used to set up randomization trials. Looking at the examples for the function, I realized I could subset the methylBase object created by unite, instead of subseting the methylRawListDB, filtering data, and combining! I think the comparative analysis portion (PCA and correlation plot) is useful when all samples are included, and that’s created using the unite methylBase object. If I could subset the female or indeterminate samples from this object, then I can identify sex-specific DML in a less computationally-intensive way.

I successfully used the following code to separate female samples from the methylBase object:

methylationInformationFilteredCov5Fem <- methylKit::reorganize(methylationInformationFilteredCov5,
                                                               sample.ids = c("1", "3", "4",
                                                                              "6", "7", "8"),
                                                               treatment = c(rep(1, times = 3),
                                                                             rep(0, times = 3))) #Get female sample information from methylBase object created by unite by specifying sample IDs. Include treatment information as well

By specifying sample.ids, I could pull out the samples I wanted. I confirmed the function worked by looking at my original object (methylationInformationFilteredCov5) and the subsetted object (methylationInformationFilteredCov5Fem):

Screen Shot 2021-04-11 at 1 24 08 PM

Screen Shot 2021-04-11 at 1 24 15 PM

Figures 1-2. methylBase objects before and after subsetting

Sample 2 in the subsetted object is the same as sample 3 in the original object, so I think it worked! I then cleaned up my R Markdown script and created separate sections for female- and indeterminate-DML identification. I used covariate and overdispersion corrections for the female data, but not for the 1 vs. 1 indeterminate sample comparison. I ran separate comparative analyses on the each data subset just to see what it might look like, but I didn’t pay too much attention to the output since I haven’t updated the data I’m working with to the Roslin alignments (I modified the code, but I haven’t rerun those chunks because they’re time-consuming and I’m going to do it on mox anyways). Knowing that I could subset from the methylBase object and assign treatment information in that subsetting process, I also updated my randomization trial code:

for (i in 1:1000) { #For each iteration
  pHRandTreat <- sample(sampleMetadataFem$pHTreatment,
                        size = length(sampleMetadataFem$pHTreatment),
                        replace = FALSE) #Randomize female treatment information
  pHRandDML <- methylKit::reorganize(methylationInformationFilteredCov5Fem,
                                     sample.ids = sampleMetadataFem$sampleID,
                                     treatment = pHRandTreat) %>%
    methylKit::calculateDiffMeth(., covariates = covariateMetadataFem,
                                 overdispersion = "MN",
                                 test = "Chisq") %>%
    methylKit::getMethylDiff(., difference = 25, qvalue = 0.01) %>%
    nrow() -> pHRandTest25Fem[i] #Shuffle treatments from methylBase object created by unite. Calculate diffMeth and obtain DML that are 25% and have a q-value of 0.01. Save the number of DML identified
}

Going forward

  1. Write methods
  2. Obtain relatedness matrix and SNPs with EpiDiverse/snp
  3. Run R script on mox
  4. Write results
  5. Identify genomic location of DML
  6. Determine if RNA should be extracted
  7. Determine if larval DNA/RNA should be extracted
Written on April 11, 2021