This script uses MethylAid to perform sample-level quality control on the data from three studies: (i) GOTO, (ii) CD4+ T-cell functional experiments, and (iii) TwinLife.
Load packages
library(MethylAid)
library(BiocParallel)
Load in the target files for each study
load("../Processing/GOTO_targets-unfiltered.Rdata")
load("../../Study2_CD4T/CD4T_data-targets.Rdata")
load("../../Study3_TwinLife/TwinLife_data-targets.Rdata")
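Note that `load()` restores objects under the names they were saved with, which is why the `targets_*` objects are used below without an explicit assignment. As a minimal sketch (the toy data frame, its paths, and the temporary file are made up for illustration and are not part of this analysis), loading into a separate environment is one way to check what an `.Rdata` file contains before relying on it:

```r
# Toy targets table; the real files hold the study-specific targets
targets_goto <- data.frame(
  Basename = c("idats/sample_A", "idats/sample_B"),  # hypothetical paths
  stringsAsFactors = FALSE
)

# Save to a temporary file so the example does not touch project data
tmp_file <- tempfile(fileext = ".Rdata")
save(targets_goto, file = tmp_file)
rm(targets_goto)

# Load into a dedicated environment instead of the global workspace
check_env <- new.env()
loaded_names <- load(tmp_file, envir = check_env)

# load() returns the names of the restored objects
print(loaded_names)
print(dim(check_env$targets_goto))
```

Checking the returned names first avoids silently overwriting an object of the same name in the workspace.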
Set BPPARAM
BPPARAM <- MulticoreParam(8)
One IDAT file was corrupted (203527980080_R04C01), so we removed it from targets. R.W. later sent an uncorrupted version, which we used to update the file.
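The removal step itself is not shown in this script. A minimal sketch of how such a sample could be dropped is given here, assuming the sample ID is the last path component of the `Basename` column that MethylAid expects in a targets data frame; the toy table and its paths are hypothetical:

```r
# Toy stand-in for a targets data frame (hypothetical paths);
# the real targets_goto has more rows and columns
targets_toy <- data.frame(
  Basename = c("/path/to/idats/203527980080_R03C01",
               "/path/to/idats/203527980080_R04C01",
               "/path/to/idats/203527980080_R05C01"),
  stringsAsFactors = FALSE
)

# ID of the corrupted IDAT file, as reported in the text
corrupted_id <- "203527980080_R04C01"

# Keep every row whose sample ID differs from the corrupted one
keep <- basename(targets_toy$Basename) != corrupted_id
targets_filtered <- targets_toy[keep, , drop = FALSE]

# Report how many samples remain
print(nrow(targets_filtered))
```

Matching on `basename()` rather than the full path keeps the filter independent of where the IDAT files are stored.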
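Summarize IDAT files for MethylAid (GOTO)
sData_goto <- summarize(targets_goto,
                        batchSize=50,     # process samples in batches of 50
                        BPPARAM=BPPARAM,  # parallel backend set above
                        force=TRUE)       # forwarded to minfi's IDAT reader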
Save sData
save(sData_goto, file="../Processing/MethylAid/GOTO_sData-wave1.Rdata")
Summarize IDAT files for MethylAid (CD4+ T-cell)
sData_cd4t <- summarize(targets_cd4t,
                        batchSize=50,
                        force=TRUE)
Save
save(sData_cd4t, file="../../Study2_CD4T/CD4T_data-sData.Rdata")
Summarize IDAT files for MethylAid (TwinLife)
sData_twinlife <- summarize(targets_twinlife,
                            batchSize=50,
                            force=TRUE)
Save
save(sData_twinlife, file="../../Study3_TwinLife/TwinLife_data-sData.Rdata")
After saving, the sData objects were exported locally and inspected using MethylAid.
Outliers were saved in the study directory, for use in the next script.
For GOTO, we identified 26 outliers and resent them in wave 2.
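How the saved outlier list is consumed is left to the next script. As a rough sketch only, an exported list could be applied to a targets table as follows; the file format, the `ID` column name, and all sample IDs are assumptions made for illustration, not the format actually used here:

```r
# Toy targets table (hypothetical sample IDs)
targets_toy <- data.frame(
  Basename = c("idats/S1", "idats/S2", "idats/S3", "idats/S4"),
  stringsAsFactors = FALSE
)

# Toy outlier list written to a temporary file;
# the real export format may differ
outlier_file <- tempfile(fileext = ".csv")
write.csv(data.frame(ID = c("S2", "S4")), outlier_file, row.names = FALSE)

# Read the outlier IDs back in
outliers <- read.csv(outlier_file, stringsAsFactors = FALSE)

# Flag samples whose ID appears in the outlier list
is_outlier <- basename(targets_toy$Basename) %in% outliers$ID

# Keep the passing samples and report how many were dropped
targets_pass <- targets_toy[!is_outlier, , drop = FALSE]
message(sum(is_outlier), " outliers removed, ",
        nrow(targets_pass), " samples kept")
```

Keeping the outlier list in a file rather than hard-coding IDs makes the exclusion step reproducible across scripts.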
sessionInfo()
## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Rocky Linux 8.10 (Green Obsidian)
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.15.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] IlluminaHumanMethylationEPICmanifest_0.3.0
## [2] minfi_1.40.0
## [3] bumphunter_1.36.0
## [4] locfit_1.5-9.8
## [5] iterators_1.0.14
## [6] foreach_1.5.2
## [7] Biostrings_2.62.0
## [8] XVector_0.34.0
## [9] SummarizedExperiment_1.24.0
## [10] Biobase_2.58.0
## [11] MatrixGenerics_1.10.0
## [12] matrixStats_1.0.0
## [13] GenomicRanges_1.46.1
## [14] GenomeInfoDb_1.34.9
## [15] IRanges_2.32.0
## [16] S4Vectors_0.36.2
## [17] BiocGenerics_0.44.0
## [18] BiocParallel_1.32.6
## [19] MethylAid_1.28.0
## [20] forcats_0.5.2
## [21] stringr_1.5.0
## [22] dplyr_1.1.3
## [23] purrr_0.3.4
## [24] readr_2.1.2
## [25] tidyr_1.2.1
## [26] tibble_3.2.1
## [27] ggplot2_3.4.3
## [28] tidyverse_1.3.2
## [29] rmarkdown_2.16
##
## loaded via a namespace (and not attached):
## [1] readxl_1.4.1 backports_1.4.1
## [3] BiocFileCache_2.2.1 plyr_1.8.8
## [5] splines_4.2.2 gridBase_0.4-7
## [7] digest_0.6.31 htmltools_0.5.5
## [9] fansi_1.0.4 magrittr_2.0.3
## [11] memoise_2.0.1 googlesheets4_1.0.1
## [13] tzdb_0.4.0 limma_3.54.2
## [15] annotate_1.72.0 modelr_0.1.9
## [17] askpass_1.1 timechange_0.2.0
## [19] siggenes_1.68.0 prettyunits_1.1.1
## [21] colorspace_2.1-0 blob_1.2.4
## [23] rvest_1.0.3 rappdirs_0.3.3
## [25] haven_2.5.1 xfun_0.39
## [27] hexbin_1.28.3 crayon_1.5.2
## [29] RCurl_1.98-1.12 jsonlite_1.8.5
## [31] genefilter_1.76.0 GEOquery_2.62.2
## [33] survival_3.5-5 glue_1.6.2
## [35] gtable_0.3.3 gargle_1.5.0
## [37] zlibbioc_1.44.0 DelayedArray_0.24.0
## [39] Rhdf5lib_1.20.0 HDF5Array_1.22.1
## [41] scales_1.2.1 DBI_1.1.3
## [43] rngtools_1.5.2 Rcpp_1.0.10
## [45] xtable_1.8-4 progress_1.2.2
## [47] bit_4.0.5 mclust_6.0.0
## [49] preprocessCore_1.60.2 httr_1.4.6
## [51] RColorBrewer_1.1-3 ellipsis_0.3.2
## [53] pkgconfig_2.0.3 reshape_0.8.9
## [55] XML_3.99-0.14 sass_0.4.6
## [57] dbplyr_2.2.1 utf8_1.2.3
## [59] later_1.3.1 tidyselect_1.2.0
## [61] rlang_1.1.1 AnnotationDbi_1.56.2
## [63] munsell_0.5.0 cellranger_1.1.0
## [65] tools_4.2.2 cachem_1.0.8
## [67] cli_3.6.1 generics_0.1.3
## [69] RSQLite_2.2.17 broom_1.0.1
## [71] evaluate_0.21 fastmap_1.1.1
## [73] yaml_2.3.7 knitr_1.43
## [75] bit64_4.0.5 fs_1.6.2
## [77] beanplot_1.3.1 scrime_1.3.5
## [79] KEGGREST_1.34.0 nlme_3.1-162
## [81] doRNG_1.8.6 sparseMatrixStats_1.10.0
## [83] mime_0.12 nor1mix_1.3-0
## [85] xml2_1.3.4 biomaRt_2.50.3
## [87] compiler_4.2.2 rstudioapi_0.14
## [89] filelock_1.0.2 curl_5.0.1
## [91] png_0.1-8 reprex_2.0.2
## [93] bslib_0.5.0 stringi_1.7.12
## [95] GenomicFeatures_1.46.5 lattice_0.21-8
## [97] Matrix_1.5-4.1 multtest_2.50.0
## [99] vctrs_0.6.3 pillar_1.9.0
## [101] lifecycle_1.0.3 rhdf5filters_1.10.1
## [103] jquerylib_0.1.4 data.table_1.14.8
## [105] bitops_1.0-7 httpuv_1.6.11
## [107] rtracklayer_1.54.0 BiocIO_1.8.0
## [109] R6_2.5.1 promises_1.2.0.1
## [111] codetools_0.2-19 MASS_7.3-60
## [113] assertthat_0.2.1 rhdf5_2.42.1
## [115] rjson_0.2.21 openssl_2.0.6
## [117] withr_2.5.0 GenomicAlignments_1.30.0
## [119] Rsamtools_2.10.0 GenomeInfoDbData_1.2.9
## [121] hms_1.1.2 quadprog_1.5-8
## [123] grid_4.2.2 base64_2.0.1
## [125] DelayedMatrixStats_1.16.0 illuminaio_0.40.0
## [127] googledrive_2.0.0 shiny_1.7.2
## [129] lubridate_1.9.2 restfulr_0.0.15
Cleanup
rm(list=ls())