dfb_98991_97996_xca

Differences

This shows you the differences between two versions of the page.

dfb_98991_97996_xca [2026/04/14 03:38] – [2.2 Duplicate calling concordance] 93.95.115.235
dfb_98991_97996_xca [2026/04/15 23:57] (current) – [Integrarray QC Guidelines – November 25, 2025] 93.95.115.235
Line 14: Line 14:
 Outlined below are the QC steps that will ensure adequate alignment with these processes. Outlined below are the QC steps that will ensure adequate alignment with these processes.
    
-  *  Exclude samples with call rate &amp;lt;80%. +  *  Exclude samples with call rate <80%. 
-  *  Then, exclude SNPs with call rate &amp;lt;80%. +  *  Then, exclude SNPs with call rate <80%. 
-  *  Then, exclude samples with call rate &amp;lt;95%. +  *  Then, exclude samples with call rate <95%. 
-  *  Then, exclude SNPs with call rate &amp;lt;95% .+  *  Then, exclude SNPs with call rate <95% .
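The four call-rate steps above can be sketched as two passes of the same sample-then-SNP filter, each pass recomputing call rates on what survived the previous step. This is a minimal illustration, not the pipeline's actual implementation: the function name `call_rate_filter` and the boolean "missing" matrix layout are assumptions made for the example.

```python
def call_rate_filter(missing, thresholds=(0.80, 0.95)):
    """Iteratively filter samples, then SNPs, at each call-rate threshold.

    missing: list of rows (one per sample), each a list of booleans
             (one per SNP), True where the genotype is a no-call.
             This layout is an assumption for illustration.
    Returns (kept_sample_indices, kept_snp_indices) as sets.
    """
    n_samples = len(missing)
    n_snps = len(missing[0]) if missing else 0
    keep_samples = set(range(n_samples))
    keep_snps = set(range(n_snps))

    for t in thresholds:
        # Sample call rate is computed over the SNPs still retained.
        keep_samples = {
            i for i in keep_samples
            if keep_snps
            and sum(not missing[i][j] for j in keep_snps) / len(keep_snps) >= t
        }
        # SNP call rate is computed over the samples still retained.
        keep_snps = {
            j for j in keep_snps
            if keep_samples
            and sum(not missing[i][j] for i in keep_samples) / len(keep_samples) >= t
        }
    return keep_samples, keep_snps
```

Running the lenient 80% pass first removes the worst samples and SNPs before the 95% pass, so a few very poor samples do not drag otherwise good SNPs below the stricter threshold.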
  
 ==== 2.2 Duplicate calling concordance ==== ==== 2.2 Duplicate calling concordance ====
Line 43: Line 43:
  
  
-We will include XO, XXY, and XYY karyotypes when they occur but mark these unusual sex chromosome patterns. These karyotypes were likely called correctly in GenomeStudio; however, their inferred sex based on the X-chromosome inbreeding coefficient (F value) typically falls within the 0.2–0.8 range (ambiguous). Therefore, such samples should be manually retained. Samples with low X heterozygosity (&amp;lt;5%) or rare karyotypes (e.g., XXX) wil be removed as they are difficult to genotype will be excluded. Pipelines need to track all deleted individuals and the reasons for their deletion, as some investigators are specifically interested in sex chromosome anomalies and how these may relate to cancer risk and aging.+We will include XO, XXY, and XYY karyotypes when they occur, but flag these unusual sex chromosome patterns. These karyotypes were likely called correctly in GenomeStudio; however, their inferred sex based on the X-chromosome inbreeding coefficient (F value) typically falls within the ambiguous 0.2–0.8 range, so such samples must be retained manually. Samples with low X heterozygosity (<5%) or rare karyotypes (e.g., XXX) will be excluded, as they are difficult to genotype. Pipelines need to track all deleted individuals and the reasons for their deletion, as some investigators are specifically interested in sex chromosome anomalies and how these may relate to cancer risk and aging.
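As a rough sketch of the two ideas in this paragraph, the snippet below classifies a sample from its X-chromosome F value and builds a plain-text log of exclusions with their reasons. The text only states that 0.2–0.8 is ambiguous; treating F below 0.2 as female and F above 0.8 as male follows the common convention and is an assumption here, as are the function names and the log columns.

```python
import csv
import io


def infer_sex_from_f(f_value, female_max=0.2, male_min=0.8):
    """Classify inferred sex from the X-chromosome inbreeding coefficient.

    Values inside [female_max, male_min] are reported as "ambiguous" and
    would need manual review (e.g. possible XO, XXY, or XYY karyotypes).
    The female/male cutoffs are assumed, not taken from the guidelines.
    """
    if f_value < female_max:
        return "female"
    if f_value > male_min:
        return "male"
    return "ambiguous"


def exclusion_log_text(records):
    """Render (sample_id, qc_step, reason) tuples as CSV text.

    Keeping one row per removed individual preserves the reason for each
    deletion so that sex chromosome anomalies can be revisited later.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample_id", "qc_step", "reason"])
    writer.writerows(records)
    return buf.getvalue()
```

For example, `infer_sex_from_f(0.5)` returns `"ambiguous"`, which in this workflow would route the sample to manual review instead of automatic exclusion.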
  
 ==== 2.4 Ancestry ==== ==== 2.4 Ancestry ====
Line 55: Line 55:
  
  
-Exclude samples with heterozygosity &amp;lt;5% or &amp;gt; 40%, or with heterozygosity deviation if p&amp;lt;10&amp;lt;sup&amp;gt;-6&amp;lt;/sup&amp;gt;, (|Z|&amp;gt;4.892). Inflated heterozygosity suggests contaminated samples, while low heterozygosity suggested poor DNA quality. Perform heterozygosity test within each ancestral groups separately.+Exclude samples with heterozygosity <5% or >40%, or with a heterozygosity deviation of p<10<sup>-6</sup> (|Z|>4.892). Inflated heterozygosity suggests contaminated samples, while low heterozygosity suggests poor DNA quality. Perform the heterozygosity test within each ancestral group separately.
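The heterozygosity rule can be sketched as follows. The |Z| cutoff of 4.892 corresponds to a two-sided normal p-value of 10⁻⁶, which the code recomputes. How Z is derived is not stated in the text, so standardizing each sample against the mean and standard deviation of its own ancestry group is an assumption, as are the function and argument names.

```python
from statistics import NormalDist, mean, stdev

# Two-sided p = 1e-6 under a standard normal gives |Z| of about 4.892.
Z_CUTOFF = NormalDist().inv_cdf(1 - 1e-6 / 2)


def flag_heterozygosity(het_by_sample, group_by_sample,
                        low=0.05, high=0.40, z_cutoff=Z_CUTOFF):
    """Return {sample_id: reason} for samples failing the heterozygosity check.

    het_by_sample:   {sample_id: heterozygosity rate in [0, 1]}
    group_by_sample: {sample_id: ancestry group label}
    Z-scores are computed within each ancestry group (assumed method).
    """
    groups = {}
    for sample_id, group in group_by_sample.items():
        groups.setdefault(group, []).append(sample_id)

    flagged = {}
    for sample_ids in groups.values():
        rates = [het_by_sample[s] for s in sample_ids]
        mu = mean(rates)
        sd = stdev(rates) if len(rates) > 1 else 0.0
        for s in sample_ids:
            h = het_by_sample[s]
            if h < low:
                flagged[s] = "heterozygosity below 5%"
            elif h > high:
                flagged[s] = "heterozygosity above 40%"
            elif sd > 0 and abs(h - mu) / sd > z_cutoff:
                flagged[s] = "heterozygosity outlier within ancestry group"
    return flagged
```

Computing the mean and standard deviation per group matters because expected heterozygosity differs between ancestries; pooling groups would make whole populations look like outliers.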
  
 ==== 2.6 Relatives ==== ==== 2.6 Relatives ====
Line 72: Line 72:
  
   *  Exclude SNPs zeroed by the cluster file with no genotypes.   *  Exclude SNPs zeroed by the cluster file with no genotypes.
-  *  Exclude samples with call rate &amp;lt;80% +  *  Exclude samples with call rate <80% 
-  *  Exclude SNPs with call rate &amp;lt;80% +  *  Exclude SNPs with call rate <80% 
-  *  Exclude samples with call rate &amp;lt;95% +  *  Exclude samples with call rate <95% 
-  *  Exclude SNPs with call rate &amp;lt;95%+  *  Exclude SNPs with call rate <95%
 ==== 3.2 Hardy-Weinberg ==== ==== 3.2 Hardy-Weinberg ====
-Check Hardy-Weinberg: exclude SNP if P &amp;lt; 10&amp;lt;sub&amp;gt;-7&amp;lt;/sub&amp;gt; in controls or P &amp;lt; 10&amp;lt;sub&amp;gt;-12&amp;lt;/sub&amp;gt; in cases.+Check Hardy-Weinberg: exclude SNP if P < 10<sup>-7</sup> in controls or P < 10<sup>-12</sup> in cases.
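To illustrate how the two thresholds combine, here is a small sketch that tests genotype counts against Hardy-Weinberg proportions and applies the control and case cutoffs. The guidelines do not say which test is used; this example uses a 1-degree-of-freedom chi-square test for simplicity, whereas production pipelines often use an exact test. The function names are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (hom A, het, hom B).
    Monomorphic or empty markers return 1.0 (nothing to test).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def fails_hwe(control_counts, case_counts, p_controls=1e-7, p_cases=1e-12):
    """True if the SNP should be excluded under the thresholds above."""
    return (hwe_chi2_pvalue(*control_counts) < p_controls
            or hwe_chi2_pvalue(*case_counts) < p_cases)
```

The stricter cutoff in cases reflects that a true disease association can itself produce some departure from Hardy-Weinberg proportions among affected individuals, so only extreme deviations are treated as genotyping artifacts there.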
  
 ===== 4. SNP QC Exclusions Combined Across Groups ===== ===== 4. SNP QC Exclusions Combined Across Groups =====
Line 96: Line 96:
 ==== 5.1 Rare SNPs with poor call rate ==== ==== 5.1 Rare SNPs with poor call rate ====
  
-Exclude SNPs with call rate below 95% and MAF &amp;lt;0.001 in any group from the imputation input files. However, genotyped calls for these SNPs will remain available for analysis.+Exclude SNPs with call rate below 95% and MAF <0.001 in any group from the imputation input files. However, genotyped calls for these SNPs will remain available for analysis.
  
 ==== 5.2 Non-ideal cluster plots ==== ==== 5.2 Non-ideal cluster plots ====
Line 186: Line 186:
   *  **735,370** markers in the original release (chr1–chr23)   *  **735,370** markers in the original release (chr1–chr23)
   *  **942** markers with ≥3 discordant calls among 149 duplicate pairs (missing genotypes were not counted)   *  **942** markers with ≥3 discordant calls among 149 duplicate pairs (missing genotypes were not counted)
-  *  **3,552** markers with p &amp;lt; 1×10⁻⁷ in 14,537 unrelated caucasian controls +  *  **3,552** markers with p < 1×10⁻⁷ in 14,537 unrelated caucasian controls 
-  *  **2,052** markers with p &amp;lt; 1×10⁻¹² in 9895 unrelated caucasian cases +  *  **2,052** markers with p < 1×10⁻¹² in 9895 unrelated caucasian cases 
-  *  **23,112** markers with call rate &amp;lt; 0.95+  *  **23,112** markers with call rate < 0.95
   *  **27,119** total unique markers affected by the filters (discordant calls, HWE, and call rate)   *  **27,119** total unique markers affected by the filters (discordant calls, HWE, and call rate)
   *  **708,672** markers remaining after removing the filtered markers   *  **708,672** markers remaining after removing the filtered markers
   *  **632,677** markers retained after running the McCarthy Group Tool workflow against the TOPMed R3 reference panel   *  **632,677** markers retained after running the McCarthy Group Tool workflow against the TOPMed R3 reference panel
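Because one marker can fail more than one filter, the "total unique markers" figure is a set union, not a sum of the per-filter counts. The sketch below shows that bookkeeping on placeholder data; the function name, filter labels, and marker IDs are hypothetical and are not taken from the release.

```python
def summarize_exclusions(all_markers, failed_by_filter):
    """Summarize per-filter, unique, and remaining marker counts.

    all_markers:      iterable of marker IDs in the release
    failed_by_filter: {filter_name: iterable of marker IDs failing it}
    Markers failing several filters are counted once in "unique_excluded".
    """
    all_markers = set(all_markers)
    excluded = set().union(*failed_by_filter.values()) & all_markers
    return {
        "per_filter": {
            name: len(set(ids) & all_markers)
            for name, ids in failed_by_filter.items()
        },
        "unique_excluded": len(excluded),
        "remaining": len(all_markers - excluded),
    }
```

With ten placeholder markers and filters failing {rs1, rs2}, {rs2, rs3}, and {rs3, rs4, rs5}, the per-filter counts sum to 7 but only 5 unique markers are excluded, leaving 5.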
dfb_98991_97996_xca.1776137915.txt.gz · Last modified: by 93.95.115.235