Thursday, March 4, 2021

Scaling traits and covariances.

Scaling traits for numerical analyses and de-scaling results such as variance component estimates can be a headache. Here is how to proceed in a systematic manner.

Assume that we have one trait on a scale with very large variance (milk yield in liters) and other traits on a scale with very small variance (points). We want to scale the traits so that the algorithms are numerically stable. For a single record we have, say, 7 traits.

We transform each record, y, by pre-multiplying it by a diagonal matrix of scale factors, S. For instance, S can contain 1/10 for milk yield (assume that this is the first trait) and 1 otherwise. Or it can contain the inverse of the phenotypic standard deviation of each trait. The scaled records y^s that we feed to, say, airemlf90 are

$latex {\bf y}^s={\bf S}{\bf y} $
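
As a minimal sketch of this step in Julia, assuming the records are stored one per row in a matrix and that S holds the inverse phenotypic standard deviations, the scaling could look like this. The data in `Y` are made up for illustration only.

```julia
using LinearAlgebra, Statistics

# Hypothetical records: one row per animal, one column per trait
# (milk yield in liters, then two traits scored in points).
Y = [8500.0 6.0 7.0
     9200.0 5.0 8.0
     7800.0 7.0 6.0
     8900.0 6.0 9.0]

# Diagonal scaling matrix with the inverse phenotypic standard deviations.
sd = vec(std(Y, dims=1))
S = Diagonal(1 ./ sd)

# Each record y is a row of Y, so applying y^s = S*y to every record
# is the same as post-multiplying the whole matrix by S'.
Ys = Y * S'

# After scaling, every trait has a standard deviation of (approximately) 1.
println(vec(std(Ys, dims=1)))
```

Keep S at hand: the same matrix is needed later to bring the estimates back to the original scale.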

Then the variances are scaled such that $latex Var({\bf y}^s )={\bf S}Var({\bf y}){\bf S}' $. If the original covariance matrix of y for, say, the genetic component is $latex {\bf G}_0 $, the scaled matrix is $latex {\bf G}^s_0 = {\bf S} {\bf G}_0 {\bf S}' $. For instance:


G0 = [ 10 2 3
       2 4 0
       3 0 5]
S=[1/10 0 0
   0 1 0
   0 0 1]

Then 

G0s=S*G0*S'
3×3 Array{Float64,2}:
 0.1  0.2  0.3
 0.2  4.0  0.0
 0.3  0.0  5.0

The first row and the first column are each divided by 10, so the [1,1] element is divided by 100.

To transform back from, say, REML estimates of $latex {\bf G}_0^s $, we pre- and post-multiply by the inverse of S:

$latex \hat{\bf G}_0={\bf S}^{-1} \hat{\bf G}_0^s ({\bf S}')^{-1} $

For instance:

inv(S)*G0s*inv(S')
3×3 Array{Float64,2}:
 10.0  2.0  3.0
  2.0  4.0  0.0
  3.0  0.0  5.0

Genetic correlations and heritabilities are a particular case: they are invariant to the transformation. Because S is diagonal, the same scale factors multiply the numerator and the denominator, so they cancel.
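
The invariance can be checked numerically. The sketch below reuses G0 and S from the example above; the residual matrix `R0` is hypothetical, and heritability is computed under the simplifying assumption that the phenotypic variance is just the genetic plus the residual variance. The helper names `cov2cor` and `h2` are introduced here only for illustration.

```julia
using LinearAlgebra

G0 = [10.0 2.0 3.0
       2.0 4.0 0.0
       3.0 0.0 5.0]
# Hypothetical residual covariance matrix, only for illustration.
R0 = [20.0 1.0 0.0
       1.0 6.0 1.0
       0.0 1.0 5.0]
S = [1/10 0 0
     0    1 0
     0    0 1]

# Convert a covariance matrix into a correlation matrix.
cov2cor(V) = V ./ sqrt.(diag(V) * diag(V)')

# Heritability per trait, assuming phenotypic = genetic + residual variance.
h2(G, R) = diag(G) ./ (diag(G) .+ diag(R))

# Scaled covariance matrices.
G0s = S * G0 * S'
R0s = S * R0 * S'

# Both comparisons should print true: the scale factors cancel.
println(cov2cor(G0) ≈ cov2cor(G0s))
println(h2(G0, R0) ≈ h2(G0s, R0s))
```

So correlations and heritabilities estimated on the scaled records can be reported directly, while variances and covariances need the back-transformation.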