There are several definitions and interpretations of F_{ST} , and I like the paper of Bhatia et al. (2013) because it summarizes them well. One that I like is Hudson's definition (and estimator) of F_{ST} , which for two populations turns out to be the same as those of Weir-Hill and Weir-Cockerham. I found it useful to "translate" it into animal breeding jargon, in terms of the variances of two purebred populations and their F1 and F2 crosses. This has probably been done elsewhere; in fact, what follows is largely based on Bhatia et al.'s explanations and Appendix. I will often write q=1-p. I assume large populations of the same size.
Hudson's definition
Consider populations 1 and 2. Let
F_{ST}=1-\frac{H_w}{H_b}=\frac{H_b - H_w}{H_b}
where H_w=p_1 (1-p_1)+p_2 (1-p_2) is heterozygosity "within" and H_b=p_1 (1-p_2)+p_2 (1-p_1) is heterozygosity "between". What does this mean?
- H_w=p_1 (1-p_1)+p_2 (1-p_2) , the heterozygosity "within", can be thought of as the average heterozygosity across the two populations: H_w= \frac{H_1 + H_2}{2} = \frac{1}{2}\left(2 p_1 (1-p_1)+2 p_2 (1-p_2)\right)=p_1 (1-p_1)+p_2 (1-p_2)
- H_b=p_1 (1-p_2)+p_2 (1-p_1) is the heterozygosity of an F1 population with one gamete coming from population 1 and the other from population 2.
The difference H_b - H_w is the numerator of F_{ST} :
H_b - H_w= p_1 q_2 + p_2 q_1 - p_1 q_1 - p_2 q_2 = (p_1 - p_2)(q_2 - q_1) = (p_1 - p_2)^2
where the last step uses q_2 - q_1 = p_1 - p_2 . We write (p_1 - p_2)^2=N for this numerator; it is also Nei's minimum genetic distance.
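As a quick numerical sanity check of the definitions so far, here is a minimal Python sketch. The function name `hudson_fst` and the allele frequencies 0.2 and 0.8 are my own illustrative choices, not something taken from Bhatia et al.

```python
def hudson_fst(p1, p2):
    """Return (H_w, H_b, F_ST) for one biallelic locus.

    p1, p2 are the allele frequencies in populations 1 and 2.
    F_ST is undefined (division by zero) when H_b = 0, i.e. when both
    populations are fixed for the same allele.
    """
    q1, q2 = 1.0 - p1, 1.0 - p2
    h_w = p1 * q1 + p2 * q2  # heterozygosity "within" (average of the two populations)
    h_b = p1 * q2 + p2 * q1  # heterozygosity "between" (hypothetical F1)
    return h_w, h_b, (h_b - h_w) / h_b


p1, p2 = 0.2, 0.8  # illustrative frequencies, not real data
h_w, h_b, fst = hudson_fst(p1, p2)
numerator = h_b - h_w

# The numerator should equal (p1 - p2)^2 up to floating-point error.
assert abs(numerator - (p1 - p2) ** 2) < 1e-12
print(f"H_w={h_w:.3f}  H_b={h_b:.3f}  N={numerator:.3f}  F_ST={fst:.3f}")
```

With these frequencies H_w is about 0.32, H_b about 0.68, and the numerator about 0.36.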
In an F2 population there is HW equilibrium and the allele frequency is p_{F2}=\frac{p_1 + p_2}{2} . Thus the heterozygosity is H_{F2}=2 \frac{p_1 + p_2}{2}\frac{q_1 + q_2}{2}=\frac{1}{2}(p_1 + p_2)(q_1 + q_2) .
The increase in variance in the F2 population, relative to the average variance across the two populations, is
H_{F2} - H_w=\frac{1}{2}(p_1 + p_2)(q_1 + q_2) - (p_1 q_1 + p_2 q_2) = \frac{1}{2}(p_1 q_2 + p_2 q_1 - p_1 q_1 - p_2 q_2)=\frac{1}{2}(p_1 - p_2)^2
which is half the difference H_b - H_w , thus H_{F2} - H_w=\frac{1}{2}(H_b - H_w) .
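The "halfway" result can be checked over a grid of allele frequencies. This is only a sketch: the helper `heterozygosities` and the grid of frequencies are hypothetical choices made for illustration.

```python
def heterozygosities(p1, p2):
    """Return (H_w, H_b, H_F2) for one biallelic locus."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    h_w = p1 * q1 + p2 * q2            # average within-population heterozygosity
    h_b = p1 * q2 + p2 * q1            # heterozygosity of the F1
    p_f2 = (p1 + p2) / 2.0             # allele frequency in the F2
    h_f2 = 2.0 * p_f2 * (1.0 - p_f2)   # Hardy-Weinberg heterozygosity in the F2
    return h_w, h_b, h_f2


grid = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
max_gap = 0.0
for p1 in grid:
    for p2 in grid:
        h_w, h_b, h_f2 = heterozygosities(p1, p2)
        # H_F2 - H_w should be exactly half of H_b - H_w.
        gap = abs((h_f2 - h_w) - 0.5 * (h_b - h_w))
        max_gap = max(max_gap, gap)

print("largest deviation from the identity:", max_gap)
```

The largest deviation should be at the level of floating-point rounding.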
The segregation variance is the difference between the heterozygosities in the F1 and in the F2. From before, H_{F2} =\frac{H_b}{2}+\frac{H_w}{2} , so
H_{F2} - H_b = \frac{H_b}{2}+\frac{H_w}{2} - H_b = \frac{H_w}{2} - \frac{H_b}{2}=\frac{1}{2}(H_w - H_b)= -\frac{1}{2}(p_1 - p_2)^2 = -\frac{N}{2}
that is, the F2 has \frac{N}{2} less heterozygosity than the F1.
Thus, when we move from the two populations to an F1 we gain, say, \Delta H = H_b - H_w in genetic variance, and when we reproduce the F1 to create an F2 we lose (from the F1) \frac{\Delta H}{2} while still gaining (from the purebreds) \frac{\Delta H}{2} . The (numerator of the) F_{ST} is a measure of this. More exactly, the F_{ST} measures how much of the variance of the (hypothetical) F1 population is due to mixing populations and not to the variance within populations (this is of course Wright's original interpretation).
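The purebred, F1 and F2 story can also be illustrated by simulating genotypes and counting heterozygotes. The sketch below rests on assumptions of my own: one biallelic locus, random mating, 100,000 individuals per group, and the illustrative frequencies 0.2 and 0.8. The helper names are hypothetical.

```python
import random


def het_fraction(individuals):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in individuals) / len(individuals)


def gamete(p, rng):
    """Draw one allele: 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(1)          # fixed seed so the run is repeatable
p1, p2, n = 0.2, 0.8, 100_000   # illustrative values only

pure1 = [(gamete(p1, rng), gamete(p1, rng)) for _ in range(n)]
pure2 = [(gamete(p2, rng), gamete(p2, rng)) for _ in range(n)]
# F1: one gamete from population 1 and one from population 2.
f1 = [(gamete(p1, rng), gamete(p2, rng)) for _ in range(n)]
# F2: each individual receives one random allele from each of two random F1 parents.
f2 = [(rng.choice(rng.choice(f1)), rng.choice(rng.choice(f1))) for _ in range(n)]

h_w = (het_fraction(pure1) + het_fraction(pure2)) / 2
h_b = het_fraction(f1)
h_f2 = het_fraction(f2)
delta_h = h_b - h_w

print(f"H_w={h_w:.3f}  H_b={h_b:.3f}  H_F2={h_f2:.3f}")
print(f"gain purebreds -> F1: {delta_h:.3f}")
print(f"loss F1 -> F2:        {h_b - h_f2:.3f}  (expected about {delta_h / 2:.3f})")
print(f"gain purebreds -> F2: {h_f2 - h_w:.3f}  (expected about {delta_h / 2:.3f})")
```

Up to sampling noise, the F2 heterozygosity should sit midway between the purebred average and the F1.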
Nei's definition
We will call it F_{ST}^{Nei} . It is defined as
F_{ST}^{Nei}=\frac{(p_1 - p_2)^2}{2\bar{p}(1-\bar{p})}
where, because \bar{p}=\frac{p_1 + p_2}{2}=p_{F2} , the denominator is exactly the heterozygosity in our F2 population. Thus F_{ST}^{Nei} and Hudson's F_{ST} are not the same thing, because their denominators refer to different "common" populations: an F1 population for Hudson and Weir-Cockerham, and an F2 population for Nei. Bhatia et al. show that, in expectation and in the limit,
F_{ST}^{Nei} \rightarrow \frac{F_{ST}^1+F_{ST}^2}{2-\frac{F_{ST}^1+F_{ST}^2}{2}}
Since Hudson's F_{ST}=\frac{F_{ST}^1+F_{ST}^2}{2} , dividing numerator and denominator by 2 gives
F_{ST}^{Nei}=\frac{F_{ST}}{1-\frac{F_{ST}}{2}}
so Nei's F_{ST}^{Nei} overestimates (very slightly for small values) Hudson's (or Weir-Cockerham, Weir-Hill) F_{ST} , because the denominator 1-\frac{F_{ST}}{2} is less than 1 whenever F_{ST}>0 . This is to be expected: strictly speaking, they are not the same thing.
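The relation between the two quantities can be checked directly from the allele frequencies. This is a minimal sketch; the function names and the three frequency pairs are hypothetical choices for illustration. It prints both values side by side so they can be compared.

```python
def hudson_fst(p1, p2):
    """Hudson's F_ST: (H_b - H_w) / H_b, with the F1 heterozygosity as denominator."""
    h_w = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    h_b = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    return (h_b - h_w) / h_b


def nei_fst(p1, p2):
    """Nei's F_ST: (p1 - p2)^2 / (2 * p_bar * (1 - p_bar)), the F2 heterozygosity as denominator."""
    p_bar = (p1 + p2) / 2.0
    return (p1 - p2) ** 2 / (2.0 * p_bar * (1.0 - p_bar))


results = []
for p1, p2 in [(0.45, 0.55), (0.3, 0.6), (0.2, 0.8)]:  # illustrative pairs
    f_hudson = hudson_fst(p1, p2)
    f_nei = nei_fst(p1, p2)
    # Check the relation F_Nei = F / (1 - F / 2).
    assert abs(f_nei - f_hudson / (1.0 - f_hudson / 2.0)) < 1e-12
    results.append((p1, p2, f_hudson, f_nei))
    print(f"p1={p1}  p2={p2}  Hudson={f_hudson:.4f}  Nei={f_nei:.4f}")
```

The gap between the two columns is small when the frequencies are close and grows as the populations diverge.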