May 15, 2019

Interpretive Summary: The impact of clustering methods for cross-validation, choice of phenotypes, and genotyping strategies on the accuracy of genomic predictions.

By: Anne Wallace

In this study, published in the April 2019 Journal of Animal Science, researchers assessed how accurately genomic breeding values (GBV) can be predicted in Red Angus steers. To do so, they compared several clustering methods for cross-validation.

Clustering organizes data into structured groups, but predictive accuracy may vary with the clustering method used. Cross-validation, a statistical method that assesses predictive performance by testing a model on data held out from training, is therefore important for determining how accurate predictions are under each clustering method.

Seven clustering methods were evaluated: (1) random clustering; (2) k-means; (3) k-medoids; principal component clustering with the numerator relationship matrix, as (4) PCN (data) and (5) PCN (cov); and principal component clustering with the identical-by-state genomic relationship matrix, as (6) PCG (data) and (7) PCG (cov). A total of 9,763 Red Angus genotypes were evaluated, with Geno-Diver simulation (five replicates) used to assess predicted versus true accuracy. The Bayes C model was used to estimate GBV. Deregressed estimated breeding values (DEBV) were analyzed for birth weight (BWT), yearling weight (YWT), marbling, and rib-eye area; adjusted phenotypes were analyzed for BWT, YWT, rib-eye area, and intramuscular fat percentage.
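
As a rough illustration of how a clustering choice shapes cross-validation, the sketch below assigns animals to folds in two ways: at random, and with a plain k-means on feature vectors. It is not the authors' implementation. The function names, the toy animal IDs, and the two-dimensional coordinates are all hypothetical, and the hand-written k-means only stands in for the clustering software used in the study.

```python
import random

def random_folds(ids, k, seed=0):
    """Random clustering: shuffle the animals, then deal them into k balanced folds."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return {animal: i % k for i, animal in enumerate(shuffled)}

def kmeans_folds(features, k, iters=20, seed=0):
    """k-means clustering: group animals whose feature vectors are close together."""
    rng = random.Random(seed)
    ids = list(features)
    # Start from k randomly chosen animals as the initial cluster centers.
    centers = [list(features[a]) for a in rng.sample(ids, k)]
    assign = {}
    for _ in range(iters):
        # Assignment step: each animal joins its nearest center.
        for a in ids:
            x = features[a]
            assign[a] = min(
                range(k),
                key=lambda c: sum((xi - ci) ** 2 for xi, ci in zip(x, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [features[a] for a in ids if assign[a] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def cross_validation_splits(folds, k):
    """Yield (training_ids, validation_ids), holding out each fold once."""
    for held_out in range(k):
        train = [a for a, f in folds.items() if f != held_out]
        valid = [a for a, f in folds.items() if f == held_out]
        yield train, valid

# Hypothetical toy data: two well-separated families, A and B.
toy = {f"A{i}": (0.1 * i, 0.0) for i in range(4)}
toy.update({f"B{i}": (10.0 + 0.1 * i, 0.0) for i in range(4)})

random_assignment = random_folds(toy, k=2)
kmeans_assignment = kmeans_folds(toy, k=2)

for train, valid in cross_validation_splits(kmeans_assignment, k=2):
    print("train:", sorted(train), "validate:", sorted(valid))
```

In this toy case, k-means places each family in its own fold, so the model is always validated on animals unrelated to the training set. Random folds have equal sizes here but do not take family membership into account, so relatives can end up on both sides of a split.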

The study found that random clustering and DEBV gave the highest estimated GBV accuracy and the least bias of the clustering methods and phenotypes evaluated.

Understanding the accuracy of genomic selection and making full use of genetic tools is important for improving selective breeding, which may help to maximize production. The results of this study suggest that random clustering may be more effective for selective breeding in Red Angus steers. Overall, more studies are needed to understand how useful these findings are in Red Angus steers and other cattle breeds.

To view the article, visit the Journal of Animal Science.