Training & Development
Training is central to the GP2 effort. Throughout this project we will offer development opportunities on the genetics of Parkinson’s disease and related areas to everyone interested. Materials from different sources will be posted here, along with educational content produced by the GP2 team.
Covers the rationale for keeping data and analysis tools together in the cloud environment, and introduces Google Cloud Storage, Google BigQuery Database, and Google Compute Engine (including the costs involved). Gives examples of moving data around using the command line (list, copy, and remove commands) and of the integrated analysis tools (SQL, Python, R).
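As a rough illustration of the list, copy, and remove commands mentioned above, the sketch below builds the corresponding command lines in Python without running them. It assumes the `gsutil` command-line tool is used for Google Cloud Storage; the bucket name `example-bucket`, the file name `samples.csv`, and the helper `gcs_commands` are hypothetical and not taken from the training material.

```python
import shlex


def gcs_commands(bucket, local_file):
    """Build (but do not execute) list, copy, and remove commands.

    Assumes the gsutil CLI; bucket and file names are placeholders.
    """
    remote = f"gs://{bucket}/{local_file}"
    return {
        "list": ["gsutil", "ls", f"gs://{bucket}/"],      # list bucket contents
        "copy": ["gsutil", "cp", local_file, remote],     # upload a local file
        "remove": ["gsutil", "rm", remote],               # delete the remote copy
    }


if __name__ == "__main__":
    # Print each command line so it can be reviewed before running it in a shell.
    for name, argv in gcs_commands("example-bucket", "samples.csv").items():
        print(f"{name}: {shlex.join(argv)}")
```

Building the commands as argument lists keeps them easy to inspect, and they can be passed to `subprocess.run` once the bucket and file names are real.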
The purpose is to accelerate trials through diagnostic, prognostic, and progression biomarkers. Clinical data are harmonized with CDISC standards, and existing WGS, RNA-seq, proteomics, and clinical data are handled. Covers the tiers of access (level 1: clinical data; level 2: all data) and the associated requirements.
Covers GWAS study design, including UK Biobank. Covers sample and SNP QC, population structure, imputation (including tools), and combining datasets or meta-analysis using summary statistics. Also gives a brief overview of a range of secondary analyses, including fine mapping, pleiotropy, and Mendelian randomization (MR).
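To make the idea of meta-analysis from summary statistics concrete, here is a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis for a single variant. This is one common approach and is not necessarily the method presented in the training material; the function name `ivw_meta` and the example effect sizes are hypothetical.

```python
import math


def ivw_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one variant.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors, all strictly positive
    Returns (pooled_beta, pooled_se, z_score).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and the same length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


if __name__ == "__main__":
    # Two hypothetical studies reporting the same variant.
    beta, se, z = ivw_meta([0.10, 0.20], [0.05, 0.05])
    print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}")
```

Studies with smaller standard errors contribute more to the pooled estimate, which is why only per-study effect sizes and standard errors are needed rather than individual-level data.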
Discusses exome versus whole-genome sequencing, and the different types of genetic variation that can occur. Introduces tools such as the Integrative Genomics Viewer (IGV), Burrows-Wheeler Aligner (BWA), STAR Aligner, Picard, and GATK. Covers library preparation and sequencing, then data pre-processing (from raw sequence output through alignment to a reference genome to create a BAM file) and variant discovery (for both germline and somatic genetic variation, as well as copy number variation).