Sunday, January 3, 2021

Paradoxes of BLUP

In genetic evaluation, it is well understood that an individual with many progeny records has a more accurate prediction than one with few (or no) progeny records. While trying to illustrate this with a small example, we found an apparent paradox caused by confounding with fixed effects.

The pedigree is simple: two full-sibs (7 and 8), where 7 has seven offspring. So we expect animal 7 to be more accurate than animal 8. All animals have a single record.

The model is also simple, $latex y = u + \mu + e $ with $latex h^2 = 0.5 $. We can compute accuracies directly in blupf90 using

OPTION store_accuracy 1 

(store_accuracy 1 because the animal effect is the first effect in the model). Reliabilities are stored in the file acc_bf90. The result is surprising:

1 0.42468738

2 0.42468738

3 0.42468738

4 0.42468738

5 0.39977264

6 0.39977264

7 0.28647215

8 0.34482759

9 0.36994532

...


So animal 7 is the least reliable, in spite of having the most progeny. Why is that? The reason is that animal 7 is highly confounded, through its descendants, with the mean fitted in the model. If we remove the mean from the model and fit $latex y = u + e $ we obtain:

1 0.54197995

2 0.54197995

3 0.54197995

4 0.54197995

5 0.62781955

6 0.62781955

7 0.71052632

8 0.59649123

9 0.54779806

Here we obtain what we expected: animal 7 is the most reliable. However, the model $latex y = u + e $ is wrong: there *should* be an overall mean in the model.
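The comparison above can be checked outside blupf90 with a minimal sketch that builds the mixed model equations by hand. The post does not list the full pedigree, so the one used here is an assumption: founders 1 to 4, animal 5 from 1 and 2, animal 6 from 3 and 4, full-sibs 7 and 8 from 5 and 6, and seven offspring (9 to 15) of animal 7 with unknown mates. The function names are hypothetical, and reliability is taken as one minus the prediction error variance divided by the additive variance, which is appropriate here because no animal is inbred.

```python
import numpy as np

# ASSUMED pedigree (animal, sire, dam); 0 means unknown parent.
# Only "7 and 8 are full-sibs, 7 has seven offspring" comes from the post.
pedigree = [(1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0),
            (5, 1, 2), (6, 3, 4), (7, 5, 6), (8, 5, 6)]
pedigree += [(i, 7, 0) for i in range(9, 16)]


def a_inverse(ped):
    """Inverse relationship matrix by Henderson's rules (no inbreeding)."""
    n = len(ped)
    ainv = np.zeros((n, n))
    for animal, sire, dam in ped:
        parents = [p for p in (sire, dam) if p != 0]
        # Inverse Mendelian sampling variance: 1, 4/3 or 2
        # for 0, 1 or 2 known parents.
        d = 1.0 / (1.0 - 0.25 * len(parents))
        idx = [animal - 1] + [p - 1 for p in parents]
        w = np.array([1.0] + [-0.5] * len(parents))
        ainv[np.ix_(idx, idx)] += d * np.outer(w, w)
    return ainv


def reliabilities(ainv, h2, fit_mean):
    """Reliability of each breeding value, one record per animal (Z = I)."""
    n = ainv.shape[0]
    lam = (1.0 - h2) / h2  # sigma_e^2 / sigma_u^2
    animal_block = np.eye(n) + lam * ainv
    if fit_mean:
        # Mixed model equations with an overall mean as the only fixed effect.
        lhs = np.zeros((n + 1, n + 1))
        lhs[0, 0] = n
        lhs[0, 1:] = 1.0
        lhs[1:, 0] = 1.0
        lhs[1:, 1:] = animal_block
        cuu = np.linalg.inv(lhs)[1:, 1:]
    else:
        cuu = np.linalg.inv(animal_block)
    # PEV_i = cuu_ii * sigma_e^2, so PEV_i / sigma_u^2 = lam * cuu_ii.
    return 1.0 - lam * np.diag(cuu)


ainv = a_inverse(pedigree)
rel_mean = reliabilities(ainv, h2=0.5, fit_mean=True)
rel_no_mean = reliabilities(ainv, h2=0.5, fit_mean=False)

for animal in (7, 8):
    print(animal, round(rel_mean[animal - 1], 8), round(rel_no_mean[animal - 1], 8))
```

Under this assumed pedigree, the sketch gives animal 7 a lower reliability than animal 8 when the mean is fitted, and a higher one when it is not. Fitting the mean can only lower the reliabilities, because the single fixed effect has to be estimated from the same records.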

The moral is that the progeny of an individual should be spread across levels of fixed effects, and cross-classified as much as possible (different sires in different herds), in order to ensure good accuracies. Alternatively, if sires are not well spread, contemporary groups should be fitted as random, although this is against common practice (see Larry Schaeffer's advice about this).



