integrarray_qc_guidelines

Differences

This shows you the differences between two versions of the page.

integrarray_qc_guidelines [2025/12/15 19:32] – Figure inserted 98.60.169.32
integrarray_qc_guidelines [2026/01/23 21:20] (current) – Added Files 206.192.168.19
Line 4:
 ==== 1. Genotype Calling ====
  
-Genotypes were called for many samples at the Center for Inherited Disease Research (CIDR). The genotyping platform used was the Global Screening Plus Custom Array (here called the GSA array but also referenced as product by BPM Amos_Custom_20032937X382854_A1, genome build GRCh38/hg38), with the calling algorithm GenomeStudio version 2011.1, Genotyping Module version 1.9.4, GenTrain Version 1.0. The array consists of a total of 741,786 genotyped SNPs and 0 intensity-only probes. The custom component was configured with about 46,789 additional genotypes (based on 50,000 available beadtypes). The annotated file of SNPs configured for the array is denoted as‘Annotated file’. The requested SNPs and genes are included in ‘VariantListsforArray’. The CIDR quality control (QC) steps follow the R packages GWASTools (Gogarten SM et al. 2012), GENESIS (Conomos MP et al. 2017) and SNPRelate (Zheng X et al. 2012). The methods of QA/QC used by CIDR were described in (Laurie CC et al. 2010).
+Genotypes were called for many samples at the Center for Inherited Disease Research (CIDR). The genotyping platform used was the Global Screening Plus Custom Array (here called the GSA array but also referenced as product by BPM Amos_Custom_20032937X382854_A1, genome build GRCh38/hg38), with the calling algorithm GenomeStudio version 2011.1, Genotyping Module version 1.9.4, GenTrain Version 1.0. The array consists of a total of 741,786 genotyped SNPs and 0 intensity-only probes. The custom component was configured with about 46,789 additional genotypes (based on 50,000 available beadtypes). The annotated file of SNPs configured for the array is denoted as ‘{{ :annotatedfile.xlsx |Annotated file}}’. The requested SNPs and genes are included in ‘{{ :variantlistsforarray.xlsx |VariantListsforArray}}’. The CIDR quality control (QC) steps follow the R packages GWASTools (Gogarten SM et al. 2012), GENESIS (Conomos MP et al. 2017) and SNPRelate (Zheng X et al. 2012). 
The methods of QA/QC used by CIDR were described in (Laurie CC et al. 2010).
  
 ===== 2. Sample QC =====
Line 171:
 Overall PCA for independent markers:
  
 +{{ :31949_ind_pc1_vs_pc2.png?500 |}}
  
 The QC steps prior to imputation were refined based on Chris’ suggestions.
Line 182:
  
  
-No samples were dropped; all samples will be included in the imputation
+No samples were dropped; all samples will be included in the imputation. The marker lists for all steps can be found in the file: {{ :integrarray_qc_guidelines_marker_list.xlsx |}}
-· 735,370 markers in the original release (chr1–chr23)
-· 942 markers with ≥3 discordant calls among 149 duplicate pairs
-· 3,552 markers with p < 1×10⁻⁷ in controls
-· 2,052 markers with p < 1×10⁻¹² in cases
-· 23,112 markers with call rate < 0.95
-· 27,119 total unique markers affected by filters (3), (4), (5), and (6)
-· 708,672 markers remaining after removing the filtered markers
+  * **735,370** markers in the original release (chr1–chr23)
+  * **942** markers with ≥3 discordant calls among 149 duplicate pairs (missing genotypes were excluded from the comparison)
+  * **3,552** markers with HWE p < 1×10⁻⁷ in 14,537 unrelated Caucasian controls
+  * **2,052** markers with HWE p < 1×10⁻¹² in 9,895 unrelated Caucasian cases
+  * **23,112** markers with call rate < 0.95
+  * **27,119** total unique markers flagged by the discordant-call, HWE, and call-rate filters
+  * **708,672** markers remaining after removing the filtered markers
+  * **632,677** markers retained after running the McCarthy Group Tool workflow against the TOPMed R3 reference panel
  
-· 615,894 markers retained after running the McCarthy Group Tool workflow against the TOPMed R3 reference panel 
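
The marker filters listed in this revision can be sketched as below. This is only an illustration under stated assumptions, not the CIDR or GWASTools implementation: the per-marker field names (`n_discordant`, `hwe_p_controls`, `hwe_p_cases`, `call_rate`), the helper functions, and the toy marker IDs are all hypothetical. Only the thresholds come from the list. The sketch shows why the unique total (27,119) is smaller than the sum of the per-filter counts (942 + 3,552 + 2,052 + 23,112 = 29,658): a marker that fails several filters is counted once.

```python
from typing import Dict, List, Set, Tuple

# Thresholds taken from the marker QC list; everything else is hypothetical.
MIN_DISCORDANT = 3       # flag markers with >= 3 discordant duplicate calls
HWE_P_CONTROLS = 1e-7    # flag markers with HWE p below this in controls
HWE_P_CASES = 1e-12      # flag markers with HWE p below this in cases
MIN_CALL_RATE = 0.95     # flag markers with call rate below this


def flag_reasons(stats: Dict[str, float]) -> Set[str]:
    """Return the names of every filter that a single marker fails."""
    reasons: Set[str] = set()
    if stats["n_discordant"] >= MIN_DISCORDANT:
        reasons.add("discordance")
    if stats["hwe_p_controls"] < HWE_P_CONTROLS:
        reasons.add("hwe_controls")
    if stats["hwe_p_cases"] < HWE_P_CASES:
        reasons.add("hwe_cases")
    if stats["call_rate"] < MIN_CALL_RATE:
        reasons.add("call_rate")
    return reasons


def apply_filters(
    markers: Dict[str, Dict[str, float]]
) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Split markers into flagged (with reasons) and kept (input order)."""
    flagged = {}
    for marker_id, stats in markers.items():
        reasons = flag_reasons(stats)
        if reasons:
            flagged[marker_id] = reasons
    kept = [marker_id for marker_id in markers if marker_id not in flagged]
    return flagged, kept


# Toy input with made-up marker IDs and statistics.
toy_markers = {
    "marker_a": {"n_discordant": 0, "hwe_p_controls": 0.4,
                 "hwe_p_cases": 0.2, "call_rate": 0.999},
    "marker_b": {"n_discordant": 4, "hwe_p_controls": 0.4,
                 "hwe_p_cases": 0.2, "call_rate": 0.90},
    "marker_c": {"n_discordant": 0, "hwe_p_controls": 1e-9,
                 "hwe_p_cases": 0.3, "call_rate": 0.99},
}

flagged, kept = apply_filters(toy_markers)
# Per-filter counts can overlap; the unique count is the number of flagged keys.
per_filter_total = sum(len(reasons) for reasons in flagged.values())
unique_total = len(flagged)
```

In the toy input, `marker_b` fails both the discordance and the call-rate filter, so it contributes two per-filter hits but only one unique flagged marker.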