Summarizes the advantages of cloud-based data handling and analysis, and describes the STRIDES initiative, which involves Google Cloud Platform and Amazon Web Services. The STRIDES program offers discounts, consultancy, and professional services to NIH institutes and centers and to NIH-funded researchers.
Introduces the AMP-PR workspaces within the Terra platform. These are organized by access level and contain extensive notes and example scripts written in both Python and R. Recaps many of the tools mentioned in previous talks. Explains how to link Terra to a Google billing account to pay for data usage and how to set up alerts.
Introduces the Terra platform and walks through the website, including registration and using workspaces, with some background on Hail, Cromwell, and Jupyter notebooks. Covers the sub-menus in Terra and the free credit program. Includes a hands-on guide to using a "workspace" with a public data example.
Covers the rationale for locating data and analysis tools together in the cloud environment, as well as Google Cloud Storage, the Google BigQuery database, and Google Compute Engine (including the costs involved). Gives examples of moving data around from the command line (list, copy, and remove commands) and of the integrated analysis tools (SQL, Python, R).
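The list, copy, and remove operations mentioned above can be sketched as follows. This is a minimal illustration that assumes the `gsutil` command-line tool is what is used for Google Cloud Storage (the summary does not name the tool); the helper `gsutil_command` and the bucket `gs://example-bucket` are hypothetical. The sketch only builds the commands and prints them, so nothing is run against a real bucket.

```python
import shlex


def gsutil_command(action, *paths):
    """Build the argument list for a gsutil list, copy, or remove command.

    Hypothetical helper for illustration: it assembles the command but
    does not execute it.
    """
    allowed = {"ls", "cp", "rm"}  # list, copy, remove
    if action not in allowed:
        raise ValueError(f"unsupported action: {action!r}")
    if action == "cp" and len(paths) < 2:
        raise ValueError("cp needs a source and a destination")
    if not paths:
        raise ValueError("at least one path is required")
    return ["gsutil", action, *paths]


# Hypothetical bucket and file names, shown as shell-ready strings.
commands = [
    gsutil_command("ls", "gs://example-bucket/"),
    gsutil_command("cp", "results.csv", "gs://example-bucket/results.csv"),
    gsutil_command("rm", "gs://example-bucket/results.csv"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run a command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where `gsutil` is installed and authenticated.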
Explains the aim of accelerating trials through diagnostic, prognostic, and progression biomarkers. Describes clinical data harmonized with CDISC standards and the handling of existing WGS, RNA-seq, proteomics, and clinical data. Covers the tiers of access (level 1: clinical data; level 2: all data) and their requirements.
Covers GWAS study design, including UK Biobank; quality control (QC) at both the sample and SNP level; population structure; imputation (including tools); and combining datasets or meta-analysis using summary statistics. Also gives a brief overview of a range of secondary analyses, including fine mapping, pleiotropy, and Mendelian randomization (MR).
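As an illustration of meta-analysis from summary statistics, the sketch below combines per-study effect sizes for a single variant using fixed-effect inverse-variance weighting. This is one common approach and is not necessarily the method presented in the talk; the function name and the example numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study effect estimates with inverse-variance weights.

    betas: effect sizes for one variant, one per study
    ses:   the matching standard errors
    Returns (pooled beta, pooled standard error, z-score, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical summary statistics from three studies of the same variant.
beta, se, z, p = fixed_effect_meta([0.12, 0.08, 0.15], [0.04, 0.05, 0.06])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

The fixed-effect model assumes that all studies estimate the same underlying effect; when studies are heterogeneous, a random-effects model may be more appropriate.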
Discusses exome versus whole-genome sequencing and the different types of genetic variation that can occur. Introduces tools such as the Integrative Genomics Viewer (IGV), Burrows-Wheeler Aligner (BWA), STAR aligner, Picard, and GATK. Covers library preparation and sequencing, then data pre-processing (raw sequence output, then alignment to a reference genome to create a BAM file) and variant discovery (for germline and somatic variation, as well as copy number variation).