Genetics II: Molecular Basis for Genetic Diseases
Mutation and Protein Function
Mutations
Ultimate source of genetic variation
Loss of function
May alter coding, regulatory, or other regions
Can have range of effects if residual function is maintained
Gain of Function
Enhance one or more of the functions of the protein
Increase in amount of function or abundance of the protein
Novel Property mutations
Sickle Cell Anemia:
Hemoglobin chains aggregate.
E6V (amino acid substitution: Glu --> Val at position 6 of β-globin)
Heterochronic or ectopic gene expression (gene expressed at the wrong time or in the wrong place)
Review of Terms
Locus
Position on the chromosome:
Disease Locus & Marker Locus
Marker
A measurable unit on a chromosome
Ex: single nucleotide polymorphism (SNP)
Allele
One of several alternative forms of sequence at a locus
2 alleles per locus in a diploid individual, one on each homologous chromosome
Genotype
Alleles present at that locus
Heterogeneity
Allelic Heterogeneity
The occurrence of more than one allele at a locus
Different mutations in the same gene
Ex: β-Thalassemia:
HBB: chromosome 11p15.4
Over 200 disease-causing mutations identified
Locus Heterogeneity
Association of more than one locus with a clinical phenotype
Ex: thalassemia. Genes at different loci (α-globin on chromosome 16 and β-globin on chromosome 11) can each cause thalassemia
Phenotypic Heterogeneity
Sickle cell disease and β-thalassemia each result from distinct β-globin gene mutations.
Hemoglobin and Hemoglobinopathies
The Hemoglobins:
α, β, and γ: Globin switching
Changes in expression of globin molecules during development
Temporal switches of globin synthesis are accompanied by changes in the principal site of erythropoiesis
Locus Control Region (LCR):
Required for expression of all genes in the β-globin cluster
Areas of "open" DNA give transcription factors access to the regulatory elements that mediate expression
Structural variants
Alter the amino acid sequence of the globin polypeptide --> alter properties of the protein
Ex: Sickle Cell and Methemoglobin
Thalassemias
Diseases that result from decreased abundance of one or more of globin chains
Ex: β-thalassemias
Hereditary persistence of fetal hemoglobin
A group of clinically benign conditions that impair the perinatal switch: γ-globin --> β-globin synthesis
Ex: Hb F
Practical:
The most common forms of α-Thalassemia are the result of gene deletions. Rationalize the high frequency of deletions in mutational carriers.
α-Thalassemia:
Two identical α-globin genes on each copy of chromosome 16
Tandem homologous α-globin genes facilitate misalignment between homologous domains; unequal crossing over between the misaligned copies then deletes a gene
Population Genetics
Forces of Evolution
Random Mutation
New Traits arise via chance mutations in DNA
Genetic Drift
Mutations may change in frequency by chance events
Gene Flow
Mutations spread by migration
Natural Selection (NS)
Mutations increase in frequency if they increase the number of offspring
Calculating allele frequency
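Allele frequency = number of copies of the allele / total number of alleles sampled (2 x number of individuals at a diploid locus)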
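As a minimal sketch of the allele-count calculation, assuming a diploid autosomal locus with two alleles (labeled A and a here purely for illustration) and made-up genotype counts:

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q): frequencies of alleles A and a from genotype counts."""
    # Each diploid individual carries two alleles at the locus.
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    # Homozygotes contribute two copies of their allele, heterozygotes one.
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


# Hypothetical sample: 50 AA, 40 Aa, and 10 aa individuals.
p, q = allele_frequencies(50, 40, 10)
print(p, q)
```

Counting alleles directly from genotypes makes no assumption about Hardy-Weinberg proportions, so it works for any sample.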
Hardy-Weinberg Law
p^2 + 2pq + q^2 = 1
p + q = 1
Assumes no evolution (no mutation, drift, gene flow, or selection) and random mating
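The Hardy-Weinberg proportions can be sketched in a few lines. The function names and numbers are illustrative, and the carrier calculation assumes an autosomal recessive disease whose incidence equals q^2, which holds only under the Hardy-Weinberg assumptions:

```python
import math


def hardy_weinberg(p):
    """Expected genotype frequencies for allele frequency p (with q = 1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def carrier_frequency(incidence):
    """Heterozygote (carrier) frequency 2pq, assuming incidence = q^2."""
    q = math.sqrt(incidence)
    p = 1.0 - q
    return 2 * p * q


genotypes = hardy_weinberg(0.7)          # hypothetical allele frequency
carriers = carrier_frequency(1 / 2500)   # hypothetical disease incidence
print(genotypes, carriers)
```

The three genotype frequencies always sum to 1, which is the p^2 + 2pq + q^2 = 1 identity.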
Sexual Selection vs Assortative Mating
Sexual selection:
increases trait frequency
Assortative mating:
Increases homozygosity of variants, creates correlations across distant loci within complex traits
creates confounding across traits
overestimates heritability of traits
Genetic basis for mendelian disease
Objectives
- Demonstrate genetic mapping by following the co-segregation of alleles within a family, i.e. linkage analysis
- Quantify the extent to which a genetic marker and disease locus are linked, i.e. co-inherited
- Review successful examples of disease loci identified through linkage analysis
Steps for Disease Gene Identification
Linkage analysis
Positional cloning
Use families with the disease to identify regions that co-segregate with the phenotype
Why Linkage:
• In most cases little is known about the genomic location of genes contributing to disease
• Thus, the study design usually consists of systematically surveying the entire genome
• The extent of linkage is a function of the physical distance between the loci on the chromosome
• Based on recombination between loci
Recombination happens in prophase I of meiosis
Candidate gene
Genome wide association study (GWAS)
Characteristics for Mendelian Disease
Recognizable pattern of inheritance
single gene mutation
Allelic heterogeneity
Risk variants have high penetrance and cause strong genetic defects
Low variant frequency and low disease prevalence
Linkage analysis
Goal:
Identify a chromosomal region linked to a disease within families that exceeds the null expectation, which enables localization of the disease gene in the genome
Two point linkage analysis
Test if a genetic marker & disease locus are linked
• Genotypes at disease locus are unknown but phenotype (affection status) is known
LOD score
Log10 of the likelihood ratio
Z(theta) = log10[ L(theta) / L(theta = 0.5) ]
Likelihood:
Probability of the data given the parameters
theta = Pr(recombination)
1 - theta = Pr(no recombination)
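A minimal sketch of the LOD calculation, under the simplifying assumption that every meiosis is phase-known and can be scored directly as recombinant or non-recombinant, so that L(theta) = theta^k x (1 - theta)^(n - k) for k recombinants in n meioses. The counts below are invented for illustration:

```python
import math


def lod_score(theta, recombinants, meioses):
    """Two-point LOD score Z(theta) for phase-known meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    non_recombinants = meioses - recombinants
    # log10 L(theta): each recombinant contributes theta, each non-recombinant 1 - theta.
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    # log10 L(theta = 0.5): the null hypothesis of no linkage.
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical family data: 1 recombinant observed in 10 informative meioses.
z = lod_score(0.1, recombinants=1, meioses=10)

# Scan a grid of theta values for the one that maximizes the LOD score.
grid = [i / 100 for i in range(1, 51)]
best_theta = max(grid, key=lambda t: lod_score(t, 1, 10))
print(z, best_theta)
```

With these counts the maximum falls at theta = k / n. Real pedigrees with unknown phase or reduced penetrance need a fuller likelihood than this sketch uses.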
Problems:
Mendelian diseases are not really “simple”
• Reduced penetrance
• Heterogeneity (allelic, phenotypic)
• Genotyping errors lead to spurious recombinants → loss of power
• The multi-locus map helps to detect these errors by checking for unusual double recombinants
Phase concept
The specific alleles a person carries together on each chromosome, i.e. which alleles were inherited together from each parent
Biochemical and genetic basis for disease
Protein Classes
Housekeeping proteins
Specialty Proteins
Diseases based on mutations in different classes of proteins
Enzymes
PKU and Tay-Sachs
PKU:
Mutation in PAH (phenylalanine hydroxylase) --> impaired degradation of phenylalanine
PAH is expressed in the liver, but the damage is to the CNS
First genetic defect shown to cause intellectual disability
Normal at birth; later microcephaly, hyperactivity, seizures, and learning disability
Phenylalanine --> Tyrosine, with help from BH4 (cofactor) and PAH
Example of allelic heterogeneity:
More than 1557 mutations found worldwide in patients with PKU
Variant and non-variant PKU
Tay-Sachs:
Lysosomal storage disease
Pseudodeficiency alleles: clinically benign alleles that show reduced functional activity in in vitro assays but retain sufficient activity in vivo to prevent disease
Defects in receptor proteins
Familial Hypercholesterolemia
Group of metabolic disorders
Characterized by elevated plasma lipids carried by apolipoprotein B
LDLR mutations are inherited as an autosomal semidominant trait
Both homozygous and heterozygous phenotypes
Gene dosage effect: homozygotes show earlier manifestation and greater severity
Transport defects
Cystic Fibrosis
Principal effects in lungs and exocrine pancreas
Increase in sweat sodium and chloride concentration
Genocopy: similar phenotypes arising from different genotypes at different loci
CFTR is the only gene associated with CF
Structural Proteins
DMD
Duchenne Muscular Dystrophy
Muscle weakness at 3-5 years of age; heart and respiratory muscles are also affected
X-linked recessive, mutation rate 10^-4
Multifactorial Disorders
Qualitative and Quantitative traits
Qualitative: trait that the person either does or does not have
Quantitative: measurable physiological or biochemical quantity that differs among individuals, usually follows a normal distribution
Familial Aggregation and Correlation
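Siblings share 50% of alleles on average
The closer the family member, the more alleles shared
Familial aggregation: greater than expected number of affected relatives compared to the frequency in the general population
Measures: relative risk ratio and family history case-control studies
The larger the relative risk ratio, the greater the familial aggregation (a ratio greater than 1 indicates aggregation)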
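A minimal sketch of the relative risk ratio, taken here as the prevalence of disease among relatives of affected individuals divided by the prevalence in the general population. The function name and all counts are hypothetical:

```python
def relative_risk_ratio(affected_relatives, total_relatives, population_prevalence):
    """Prevalence among relatives of probands divided by population prevalence."""
    if total_relatives <= 0 or population_prevalence <= 0:
        raise ValueError("total_relatives and population_prevalence must be positive")
    prevalence_in_relatives = affected_relatives / total_relatives
    return prevalence_in_relatives / population_prevalence


# Hypothetical data: 30 of 1000 siblings of affected individuals are affected,
# while 1% of the general population is affected.
ratio = relative_risk_ratio(30, 1000, 0.01)
print(ratio)
```

A ratio near 1 would mean relatives are no more likely to be affected than anyone else. Values well above 1 suggest familial aggregation, although shared environment can also produce them.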
Quantitative: Correlation and Heritability
Coefficient of Correlation (r)
Higher heritability = greater contribution of genetic differences to the trait
0 = no genetic contribution
1 = genotype is totally responsible
Distinguishing genetic and environmental contributions
Use family studies:
Use twin studies: monozygotic (MZ) and dizygotic (DZ)
DNA fingerprinting
MZ/DZ graph
Estimate heritability from twins:
H^2 = 2 x (r_MZ - r_DZ)
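The twin formula is simple enough to check by hand. This small sketch uses made-up correlations, not values from the lecture:

```python
def twin_heritability(r_mz, r_dz):
    """Heritability estimate H^2 = 2 * (r_MZ - r_DZ) from twin correlations."""
    return 2.0 * (r_mz - r_dz)


# Hypothetical trait correlations between MZ pairs and between DZ pairs.
h2 = twin_heritability(r_mz=0.8, r_dz=0.5)
print(h2)
```

The factor of 2 reflects that MZ twins share all of their alleles while DZ twins share half on average, so the difference in correlations captures roughly half of the genetic contribution.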
Identifying genetic basis for complex disease
Objectives:
Understand the fundamentals of study design for a genome-wide association study (GWAS).
Become familiar with analytical approaches to extend the capabilities of GWAS, e.g. LD and imputation.
Acknowledge the strengths and limitations of the GWAS approach
Genetics to Phenotype
Mendelian traits:
Ex. Sickle cell Anemia and CF
Complex traits: height, Type 2 diabetes
mix of genetic and environment
Genomic association studies
Good for high heritability
Mendelian disorders
Large study pop (case/control unrelated)
Dense genotyping
GWAS
Examination of genetic variation across a given genome
Many SNPs
Designed to identify genetic associations with observable traits
Qualitative: Obesity, T2D, Asthma, CVD
Quantitative: BMI, Fasting glucose
Agnostic (genome scan) approach
Similar to linkage analysis
across entire genome
Dissimilar to candidate gene approach
**Mapping the human genome and linkage disequilibrium motivated and enabled the GWAS paradigm.**
Associations studies
Current high-throughput genotyping allows us to inexpensively genotype large numbers of genetic markers (or variants) across the genome
Key concept: Exploring associations between the variations in a gene and a trait assumes:
**there exist correlations between the marker you genotyped and the functional polymorphism (due to LD)**
Refer to slide 21 for comparison of linkage analysis and association.
Mainly: Linkage = Mendelian diseases and Association = complex diseases
Linkage Disequilibrium
The non-random association of alleles at two or more loci.
If SNPs are completely independent from one another, they are considered to be in linkage equilibrium.
Alternatively, any detected degree of association (greater than chance) between the allele frequencies (of the SNPs) indicates linkage disequilibrium.
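One common way to put a number on this is the coefficient D = p_AB - p_A x p_B and its normalized form r^2. Neither statistic is named in these notes, so treat the choice as an assumption. A minimal sketch with hypothetical haplotype counts for two biallelic SNPs (alleles A/a and B/b):

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) computed from counts of the four two-SNP haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total              # observed frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total      # frequency of allele B at the second SNP
    # D is zero when the haplotype frequency equals the product of the
    # allele frequencies, i.e. linkage equilibrium.
    d = p_AB - p_A * p_B
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denominator if denominator > 0 else float("nan")
    return d, r2


# Hypothetical counts in which A tends to travel with B.
d, r2 = linkage_disequilibrium(40, 10, 10, 40)
# Hypothetical counts with independent SNPs.
d_eq, r2_eq = linkage_disequilibrium(25, 25, 25, 25)
print(d, r2, d_eq, r2_eq)
```

The sketch assumes phased haplotype counts are available. With unphased genotype data the haplotype frequencies would have to be estimated first.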
Overview of association methods
Allelic Test of association
Simple case-control study with no covariates or population structure
Regression methods
Joint estimation of multiple variables
Modeling outcome with SNP and covariates (e.g. age, sex, population structure)
Accounting for multiple comparisons (false discovery rate)
Displaying association results
Manhattan plot
Regional Association plot
Family Data methods
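The allelic test listed above can be sketched as a chi-square test on a 2x2 table of allele counts (cases vs controls). This is a minimal sketch with invented counts, no covariates, and no correction for population structure. The p-value uses the identity that a 1-degree-of-freedom chi-square tail probability equals erfc(sqrt(chi2 / 2)):

```python
import math


def allelic_chi_square(case_A, case_a, control_A, control_a):
    """Chi-square statistic and p-value (1 df) for a 2x2 allele-count table."""
    a, b, c, d = case_A, case_a, control_A, control_a
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: 50 cases and 50 controls, two alleles each.
chi2, p_value = allelic_chi_square(case_A=60, case_a=40, control_A=40, control_a=60)
print(chi2, p_value)
```

Regression methods generalize this test by allowing covariates such as age, sex, and population structure in the model.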
**P-value**
Bonferroni Threshold:
1.00x10^-6
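A Bonferroni threshold is the overall significance level divided by the number of tests. As a minimal sketch, assuming alpha = 0.05 and 50,000 tested SNPs (both assumptions, chosen because they reproduce the figure above; the notes do not state the number of tests), with hypothetical SNP names:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_snps(p_values, alpha=0.05):
    """Names of SNPs whose p-value falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [snp for snp, p in p_values.items() if p < threshold]


threshold = bonferroni_threshold(0.05, 50_000)
# Three hypothetical SNPs, so the per-test threshold here is 0.05 / 3.
hits = significant_snps({"snp1": 0.001, "snp2": 0.04, "snp3": 0.2})
print(threshold, hits)
```

Because nearby SNPs are correlated through LD, dividing by the raw number of tests tends to be conservative.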