2022 Digital Poster Session | Tools for Polyploids

QTL analysis of the “Tanzania” x “Beauregard” sweetpotato mapping population for resistance to Meloidogyne enterolobii

Simon Fraher¹, Tanner Schwarz², Bonny Oloka³, Ken Pecota¹, Chris Heim¹, Adrienne Gorny², G. Craig Yencho¹. ¹Department of Horticultural Science, North Carolina State University, Raleigh, NC. ²Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC. ³Root Crops Program, National Agricultural Research Organization, Entebbe, Uganda.

Due to its highly heterozygous hexaploid nature, sweetpotato (Ipomoea batatas) lags behind other crops in terms of genomic tools. New tools and strategies, especially the publication of a diploid reference genome for I. trifida in 2018 and open-source software packages like QTLpoly and MAPpoly, have afforded opportunities to associate sequence data with phenotypic data through QTL analysis. QTL analysis is expected to be instrumental in accelerating breeding in this crop, notably for nematode resistance and nutritional factors. We analyzed the progeny of the sweetpotato bi-parental mapping population ‘Tanzania’ x ‘Beauregard’ (TB), representing 250 genotypes (including 4 check lines), for resistance to the emergent plant-parasitic nematode Meloidogyne enterolobii (M.e.). Parent ‘Tanzania’ is highly resistant, and parent ‘Beauregard’ is highly susceptible. Bioassays showed clear bimodal segregation for resistance, suggesting a simplex major allele conferring resistance. Using the R package QTLpoly and the I. trifida reference genome, we discovered a major QTL peak at base pair 7,039,636 (79.21 cM) of linkage group 4 of I. batatas associated with resistance to M.e. This analysis suggests variability in M.e. resistance within the TB population can largely be ascribed to genetic differences among the progeny (h² = 66.9%). The next steps will search for flanking markers associated with these genotypes and attempt to identify markers that can be screened at the seedling stage, reducing the need for costly and laborious bioassays with this quarantined pest.
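
As an illustration of the segregation argument: a single simplex resistance allele in one parent, crossed to a parent lacking it, would be expected to give roughly 1:1 resistant to susceptible progeny (assuming random chromosome pairing and no double reduction). The minimal Python sketch below tests observed counts against that ratio with a chi-square goodness-of-fit test. The function name and the counts are hypothetical and are not taken from the study.

```python
import math

def chi_square_1_to_1(n_resistant, n_susceptible):
    """Chi-square goodness-of-fit test of two observed counts against a 1:1 ratio (1 df)."""
    total = n_resistant + n_susceptible
    if total == 0:
        raise ValueError("at least one observation is required")
    expected = total / 2
    chi2 = ((n_resistant - expected) ** 2 + (n_susceptible - expected) ** 2) / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for illustration only (not the study's data).
chi2, p = chi_square_1_to_1(118, 128)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A large p-value here would mean the counts are compatible with 1:1 segregation; it would not by itself establish the simplex model.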

Octoploid Strawberry Linkage Map from Reduced Representation Sequencing SNP Markers

Jose Guillermo Chacon¹, Marcelo Mollinari¹, Bode A. Olukolu², Zhao-Bang Zeng¹, Gina E. Fernandez³. ¹Department of Horticultural Science, North Carolina State University, Raleigh, NC. ²Department of Entomology and Plant Physiology, University of Tennessee, Knoxville, TN. ³Bioinformatics Research Center, North Carolina State University, Raleigh, NC.

The cultivated strawberry (Fragaria ×ananassa L.) is an allopolyploid (2n = 8x = 56) with a complex genomic composition that has hindered genetic and genomic studies, as the similarity between subgenomes, introgressions from a dominant genome, and other issues complicate accurate mapping and variant calling. The recent publication of a full chromosome-length reference genome and improvements in deep sequencing for polyploids have helped overcome some of those difficulties, including the development of linkage maps. A biparental population was generated by crossing the NCSU selections NCS 10-080 × NCS 10-147. The parents and 280 seedlings were sequenced using the reduced representation sequencing OmeSeq protocol, resulting in 2.47 billion reads. The ngsComposer application was used for quality control, demultiplexing, and filtering, resulting in 1.84 billion reads. The read alignment resulted in a coverage of 92.32% of the four subgenomes in the allo-octoploid strawberry reference genome ‘Camarosa’ 1.0. The map construction was done using the MAPpoly R package. After quality control screening, a total of 6133 markers and 212 offspring individuals were used to build a genetic linkage map comprising 28 linkage groups with a total length of 3154 cM and 4022 SNP markers. The smallest linkage group was 30.2 cM with 49 markers and the largest was 155.69 cM with 168 markers, with an average interval genetic distance of 1.32 cM. The high marker density and the correspondence between the number of assembled linkage groups and the number of expected chromosomes indicate that MAPpoly is a robust tool for this analysis. This analysis provides an excellent framework map for forthcoming studies, including QTL analysis and understanding modes of inheritance in this complex polyploid species.

Genomic prediction for yield and processing traits in the tetraploid potato

Jeewan Pandey¹, Douglas C. Scheuring¹, Jeffrey W. Koym², and Maria Isabel Vales¹. ¹Department of Horticultural Sciences, Texas A&M University, College Station, TX. ²Texas A&M AgriLife Research and Extension Center, Lubbock, TX.

The current breeding process to develop a new potato cultivar takes a long time (10–15 years). One way to speed up the process and make it more efficient is to shorten the recurrent selection breeding cycle. This can be achieved by assigning breeding values to clones in the breeding pipeline and bringing those with the most favorable breeding values as parents for the crossing block. The aim of this study was to implement one aspect of genomic selection by obtaining breeding values of chipping potato clones and recommending parents for the breeding program. Five hundred and forty-nine unique chipping potato clones were evaluated between 2017 and 2020 near Dalhart, TX, and genotyped using the Illumina Infinium Potato SNP array. Genomic-estimated breeding values (GEBVs) for chip color, chip quality, specific gravity, and total yield were obtained using the package StageWise. Potato clones with the most favorable GEBVs were identified and recommended as parents. The mean reliability of the GEBVs obtained was 0.75, 0.43, 0.61, and 0.33 for chip color, chip quality, specific gravity, and total yield, respectively. Breeders will increase the probability of transferring useful traits from parents to their progeny by choosing parental lines with the most favorable GEBVs. In turn, progeny with the best GEBVs can be re-used as parents or advanced to become new varieties. Thanks to the development of software packages suitable for polyploid species, genomic selection in potatoes is becoming more feasible and attractive.

Exploring the Genetic Control of Potato Tuber Dormancy

Ao Jiao¹, Sanjeev Gautam¹, Jeewan Pandey¹, Douglas C. Scheuring², Jeffrey Koym¹, and M. Isabel Vales¹. ¹Department of Horticultural Sciences, Texas A&M University, College Station, TX. ²Texas A&M AgriLife Research and Extension Center, Lubbock, TX.

Potato tuber dormancy is defined as the period after harvest during which tubers do not sprout even under favorable conditions. The length of the dormancy period is measured from vine kill until tubers start sprouting and it is affected by genotype and environmental factors. Premature dormancy break is a major factor causing post-harvest tuber quality reduction. Common methods to prevent sprouting include cold storage and the use of sprout inhibitors. Cold storage causes cold-induced sweetening and results in higher energy costs, whereas sprout inhibitors raise health and environmental concerns. Developing potato varieties with long dormancy could contribute to reducing the use of cold storage and sprout inhibitors. In this study, we evaluated tuber dormancy variation and investigated the genetic background of tuber dormancy. Over 200 clones from the Texas A&M Potato Breeding Program were grown in Dalhart, TX in 2019 and were evaluated for dormancy at room conditions (18 °C, RH 60%, dark). The clones exhibited variation in dormancy ranging from 38 to 155 days, with the Russets having significantly longer dormancy (> 96 days) than other market groups (70 – 80 days). Two Texas A&M varieties, Reveille Russet and Vanguard Russet, were among the clones with the longest dormancy. A genome-wide association study was performed using GWASpoly with Infinium Illumina 22K V3 Potato Array to identify genomic regions associated with tuber dormancy. The main QTL identified was on chromosome 9, explaining 11% of the phenotypic variation. Follow-up evaluations will be conducted at additional locations under room temperature and cold storage.

Development of Genomic Resources for Cultivated Blueberry

Ira A. Herniter and Nicholi Vorsa. Rutgers University, New Brunswick, NJ.

Blueberry (Vaccinium sect. Cyanococcus) is an increasingly important fruit crop native to North America. Collected from the wild for thousands of years, blueberry was only domesticated in the early twentieth century. While blueberry has long been a minor crop, recent interest in its health properties, including high levels of anthocyanins, has led to increased consumption and cultivation around the globe. However, quality genetic resources for blueberry have been lacking, greatly hampering the development of new varieties that can produce well under the constraints of changing climatic and pest regimes. We report the development of several populations being used for trait mapping. Four are diploid interspecific populations resulting from crosses between V. corymbosum, native to a temperate climate, and V. darrowi, native to a subtropical climate. We also report a large germplasm collection consisting of released blueberry cultivars and wild relatives, including species from across North America, Europe, Pacific Russia, and South America. The populations have great diversity in a range of traits, including leaf shape, berry size, fruit chemistry, and flowering time, among others. The populations are an important resource for the development of new varieties and for increasing understanding of blueberry physiology.

Improving zoysiagrass (Zoysia Willd. species) for drought tolerance across the Southern US

Beatriz Tome Gouveia¹, Brian M. Schwartz², Yangi Wu³, Kevin E. Kenworthy⁴, Ambika Chandra⁵, Paul L. Raymer², Marta T. Pudzianowska⁶, Esdras M. Carbajal¹, Manual R. Chavarria Sanchez⁴, Jing Zhang², Bradley M. Batterhsell³, Pamela S. Rowe, Meghyn B. Meeks⁵, Tianyi Wang⁵, Chase N. McKeithen⁴, and Susan R. Milla-Lewis¹. ¹North Carolina State University, Raleigh, NC. ²University of Georgia, Tifton, GA. ³Oklahoma State University, Stillwater, OK. ⁴University of Florida, Gainesville, FL. ⁵Texas A&M University, Dallas, TX. ⁶University of California, Riverside, CA.

Zoysiagrass (Zoysia Willd. species, 2n = 4x = 40), a complex that encompasses 11 species, is primarily used for home lawns, public parks, and athletic fields, and is the second most used warm-season turfgrass on golf courses in the US. Water limitations are currently one of the biggest challenges for the turfgrass industry. Starting in 2010, a Turfgrass Specialty Crop Research Initiative (SCRI) project, funded by USDA-NIFA, has focused on addressing the problems of limited availability and reduced quality of water for irrigating turfgrass areas by breeding warm-season turfgrass species for improved drought and salinity tolerances. The objective of this study was to evaluate the performance of zoysiagrass breeding lines from the breeding programs at University of Georgia – Tifton and Griffin, North Carolina State University, Texas A&M University and University of Florida under drought conditions. Field trials arranged as randomized complete-block designs with three replications were installed at research facilities at Citra, FL, and Stillwater, OK in the summer of 2020. The response variables evaluated were turfgrass quality under normal or non-drought conditions (TQND), and both percent green cover (PGC), evaluated using UAS, and turfgrass quality (NTEP ratings from 1 to 9) (TQD) under drought conditions. The genetic variance was significant for all traits in the single-environment analysis and non-significant for all traits in the multi-environment analysis. The genotype-by-environment interaction variance was significant for all traits. Heritability estimates were above 0.50 for all traits in all locations, except for TQD in Citra. A high positive correlation was observed between TQD and PGC in both locations, whereas these traits showed low correlation with TQND. Several breeding lines performed better than the checks for both TQD and PGC at both locations. Evaluation of these genotypes will continue through 2023. Results of this study will support the selection of drought-tolerant elite zoysiagrass genotypes with increased performance stability for the target regions.

Genome-Wide Association Study on Potato Tuber Defects Under Heat Stress

Sanjeev Gautam and M. Isabel Vales. Department of Horticultural Sciences, Texas A&M University, College Station, TX. Texas A&M AgriLife Research and Extension Center, Lubbock, TX.

Heat stress reduces marketable tuber yield and quality of potatoes. Tuber defects can be external (heat sprouts, chained tubers, knobs) or internal (vascular discoloration, internal heat necrosis). Successful cultivation of potatoes under heat stress requires planting heat-tolerant varieties that can produce high yields of marketable tubers. Heat tolerance is a complex trait and breeding for it poses several bottlenecks. To facilitate future marker-assisted selection for heat tolerance, a genome-wide association study (GWAS) aimed at identifying genomic regions associated with heat tolerance was conducted. Phenotyping for a panel of 217 diverse potato genotypes was conducted near Springlake, Texas (heat stress location) for two years using a randomized complete block design with two replicates. The genotypes differed in their tendency to express external as well as internal tuber defects under heat stress. GWAS was conducted using GWASpoly with the Infinium Illumina 22K V3 Potato Array. SNPs significantly associated with external defect traits were located on chromosomes 3, 4, 6, and 7, while those associated with internal defect traits were located on chromosomes 3 and 10. The identified genomic regions may be important for improving heat tolerance in potatoes. Fine mapping of the identified regions and validation of the markers associated with them would be required to further understand the mechanisms involved in heat tolerance.

Sequenced Genomes of Allotetraploid Poa annua and its Diploid Progenitors, Poa infirma and Poa supina, Provide Insight into the Evolution and Breeding of a Versatile Polyploid

Christopher Benson¹, David Huff¹, Shaun Bushman², Peter Maughan³, Rick Jeelen³, and Matthew Robbins². ¹Pennsylvania State University, State College, PA. ²USDA-ARS, Logan, UT. ³Brigham Young University, Provo, UT.

Poa annua (annual bluegrass, 2n=28, AABB) is an allotetraploid grass species that outperforms its diploid progenitors in both diversity of morphologies and geographic range. On golf course putting greens, Poa annua's ability to produce seed under 3 mm mowing height has contributed to its discordant reputations as both a noxious weed and a valued commodity. Despite an estimated $40 billion U.S. turfgrass industry, there have been limited successful efforts directed at managing Poa annua, either for or against, primarily due to its complex genetic and epigenetic versatility. Here we present the pseudomolecule-level genome assemblies of Poa annua and its diploid progenitors, Poa infirma and Poa supina. BRAKER2 annotations of these species with Iso-Seq RNA evidence yielded 72,034, 37,207, and 35,698 high-confidence proteins, respectively. Both the assemblies and the annotations for all species contained >90% conserved orthologs, corroborating their quality. We demonstrate that the parental diploid genomes accurately represent the A and B subgenomes of Poa annua and characterize genetic exchange between and within the subgenomes of Poa annua. We show that the bifurcating ecological niches of the parents are mirrored by genomic and structural mutations in their diploid genomes. The subgenomes of Poa annua bear strong resemblance to its progenitors, confirming its status as a neo-allotetraploid. We speculate that Poa annua's global proliferation is conferred through the union of two parental genomes with wide genetic distance for hybridization and contrasting ecological ranges. We plan to incorporate genomic and transcriptomic resources to aid in better targeting of Poa annua in turfgrass breeding applications.

Development of a Genotyping by Sequencing Pipeline in Tetraploid Roses (Rosa sp.)

Tessa Hochhaus, Cristiane H. Taniguti, Jeekin Lau, Patricia E. Klein, David H. Byrne, and Oscar Riera-Lizarazu. Department of Horticultural Sciences, Texas A&M University, College Station, TX.

Roses are highly heterozygous and most commonly diploids, triploids, and tetraploids. Genotyping by sequencing (GBS) has been performed in diploid rose populations; however, it has not been done in populations with higher ploidy because of their increased complexity (autopolyploidy). This complexity is due to the greater number of genotypic classes and the difficulty in accurately calling allele dosage. GBS uses restriction enzymes to reduce genome complexity and adapter barcodes to allow the pooling of multiple samples, increasing efficiency and lowering the per-sample cost. In this study, we are optimizing a GBS protocol for tetraploid roses using three populations (Morden Blush x George Vancouver, Stormy Weather x Brite Eyes, and Brite Eyes x My Girl). The optimization will entail varying sequencing read depth and coverage while minimizing missing data, and using in-house workflows to test various combinations of open-source software for quality control, read alignment, SNP identification, and dosage calling. Through the development of this pipeline we hope to facilitate cost-effective genotyping in polyploid roses and the use of genomic-assisted breeding.

Identifying a Rose Germplasm Panel to Attain Optimal SNP Array Genotype Calling of Small Samples of Genotyped Individuals

Jeekin Lau, Cristiane H. Taniguti, David Byrne, and Oscar Riera-Lizarazu. Texas A&M University, College Station, TX.

Since genotyping with the Axiom WagRhSNP68K SNP array can be cost prohibitive, we explored an approach that would permit robust genotyping of samples in one or two 96-well plates. We have observed that genotyping accuracy via SNP arrays increases as the number of individuals used for genotype calling increases. We reasoned that this increased accuracy may be due to greater sample size and allelic diversity. To test this idea, we conducted an experiment in which one bi-parental mapping population of 94 individuals plus two parents was clustered alone (one plate of genotyping) and in combination with sets of related biparental populations and unrelated germplasm with increasing numbers and various levels of genetic diversity. We then compared both marker statistics and the quality of the linkage maps generated from genotype calls of the target mapping population using the various datasets. As the number of individuals used in clustering increased, the number of useful markers increased nominally. However, the resulting linkage maps revealed that the addition of other genotypes in the marker clustering step resulted in shorter total map length and smaller gap sizes as the number of individuals and diversity increased. The decreased map lengths and gap sizes indicate that the inclusion of other genotypes improved genotyping accuracy. The output of this study will be a core set of genotyped rose germplasm that may be used to improve genotype calling of small samples of genotyped materials.

Unusual dalliances within the polyploid Chrysanthemum species complex

Neil O. Anderson¹, Liesl Bower-Jernigan¹, Rajmund Eperjesi¹, Robert Suryani¹, Steven Gullickson², and Albert Radloff². ¹University of Minnesota, Minneapolis, MN. ²MGK, Minneapolis, MN.

Many species in the Chrysanthemum complex and its alliance of satellite genera (e.g. Arctanthemum, Leucanthemum, Tanacetum) have been bred and selected since the 15th century BCE. Crops include a diversity of uses and forms from green pesticides (pyrethrum, C. cinerariifolium, C. coccineum; 2n=2x=16, 2n=3x=24), edible shoots (C. carinatum, C. coronarium, C. segetum; 2n=2x=16), salt tolerance (C. arcticum, subsp. arcticum, polaré; 2n=2x=16, 2n=4x=32), to ornamentals (cut flower, potted plant, garden types, C. xgrandiflorum, C. xhybridum; 2n=6x=54) also used for medicinal purposes, herbal teas, and wine. The most widely cultivated crop, C. xgrandiflorum, is a complex perennial geophyte, an allohexaploid of >10 species (C. zawadskii, etc.), complicated by self incompatibility (3 S loci), pseudo-self compatibility (PSC), aneuploidy, sterility, inbreeding depression, and genetic load. Chrysanthemum populations in the University of Minnesota breeding program were analyzed for genetic structure within/among species and cultivar series to determine alliances within the ploidy complex. Genotypic analyses (GBS; DArTseqLD) were used to identify 389 low density, unique SNPs and determine genetic structure within and among wild (C. arcticum, subsp. arcticum, polaré; C. zawadskii) and cultivated (C. cinerariifolium, C. xgrandiflorum ‘Minn’ series, C. xhybridum ‘Mammoth™’ series) species populations. Principal Coordinates Analysis (PCoA) of all species showed two clusters of C. arcticum/C. cinerariifolium and C. xgrandiflorum/C. xhybridum/C. zawadskii with 74.9% diversity for the principal component (PCoA1) and 8.1% for PCoA2. Surprisingly, the first subgroup had C. cinerariifolium in close dalliance with C.a. subsp. polaré (Nome, AK), followed by C.a. subsp. arcticum (Aleutian Islands), C. arcticum (Anchor Point, Kenai, Ninilchik, Valdez, AK) with C.a. subsp. polaré (Churchill, Manitoba, Canada). Chrysanthemum xgrandiflorum had the greatest level of genetic diversity, although it slightly overlapped with both C. zawadskii and C. xhybridum. The ‘Minn’ and ‘Mammoth™’ series had low levels of genetic differentiation due to C. xhybridum being derived from C. xgrandiflorum and C. weyrichii (2n=6x=54). Future research will focus on phenotypic trait/SNP associations in a genome-wide association study (GWAS) to aid in marker-assisted selection.

Quantitative Trait Loci Associated with Flower Color Transition Phenotype in Tetraploid Roses

Haramrit Gill, Jeekin Lau, Qiuyi Fu, Natalie Anderson, David H. Byrne, and Oscar Riera-Lizarazu. Texas A&M University, College Station, TX.

Flower color is one of the most important breeding traits in ornamental roses. The combination of particular anthocyanins, their co-factors and their concentrations leads to different pigmentation patterns. We observed an interesting characteristic that we call ‘flower color transition’ in two tetraploid rose populations a) ‘Stormy Weather’ (SW) X ‘Brite Eyes’ (BE), b) ‘My Girl’ (MG) X ‘Brite Eyes’ (BE). The roses that exhibit this phenotype have flowers that transition from a light-yellow color to a dark pink/red (accumulation of anthocyanins) as the flower ages leading to bushes peppered with flowers of multiple colors. To our knowledge, the genetic control of this phenotype has not been studied in roses previously. Here, we present studies to better understand the inheritance of this trait and to identify quantitative trait loci (QTL) in two tetraploid bi-parental populations segregating for the ‘flower color transition’ trait. Our analysis suggests the presence of QTL on chromosomes 3 and 4. The location of QTL identified for the flower color transition coincides with the location of some genes involved in flavonoid biosynthetic pathway. Additional studies are underway to validate these results.

Predicting Length to Width Ratio, Roundness, and Compactness in Potatoes

Michael Miller and Laura Shannon. University of Minnesota, Saint Paul, MN.

Potato is an important crop to the global food system; however, adoption of new potato varieties has been slow compared to many other staple food crops. This is in part due to the importance of many traits beyond yield, in particular quality traits, to the marketability of potatoes. Quality traits are often measured using subjective and imprecise visual scales. These scales introduce error due to rater fatigue, rater experience, and differences in scale interpretation between raters. Visual scales also limit differentiation between potato clones that express a trait at similar but not identical levels. We have developed an image analysis platform in the R programming language to provide objective and precise numeric measurements of several potato tuber quality traits. Among these traits are measurements of tuber shape, including length to width ratio, roundness, and compactness. Combining the phenotype data provided by this platform with the genomic selection tools provided by Tools for Polyploids could allow for robust genomic selection models of potato tuber shape quality traits. A collection of 82 chip market class potato clones from the University of Minnesota breeding program was evaluated for shape traits over 3 field seasons. Genotyping was performed using the SolCAP SNP array. Allele dosage calling was performed using the fitPoly R package. The StageWise package was then used to create predictions of genomic estimated breeding values for tuber shape traits.
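
The abstract does not define the three shape traits, so the following Python sketch uses one common set of image-analysis conventions as an assumption: roundness as 4·area / (π·length²) and compactness as √(4·area/π) / length. The function name and the elliptical test outline are hypothetical, and the platform described above may compute these traits differently.

```python
import math

def shape_descriptors(length, width, area):
    """Return (length-to-width ratio, roundness, compactness) for one tuber outline.

    Assumed definitions (common conventions, not necessarily the platform's):
      roundness   = 4 * area / (pi * length**2)
      compactness = sqrt(4 * area / pi) / length
    """
    if length <= 0 or width <= 0 or area <= 0:
        raise ValueError("length, width, and area must be positive")
    lw_ratio = length / width
    roundness = 4 * area / (math.pi * length ** 2)
    compactness = math.sqrt(4 * area / math.pi) / length
    return lw_ratio, roundness, compactness

# Hypothetical outline: an ellipse with semi-axes a and b (area = pi * a * b).
a, b = 2.0, 1.0
lw, rnd, cmp_ = shape_descriptors(2 * a, 2 * b, math.pi * a * b)
print(lw, rnd, cmp_)
```

Under these definitions a perfect circle scores 1 on all three traits, and elongated outlines score above 1 for length to width ratio and below 1 for roundness and compactness.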

Multiploidy support in polyRAD

Lindsay V. Clark, Joyce Njuguna, Alexander E. Lipka, and Erik J. Sacks. Department of Crop Sciences, University of Illinois, Urbana-Champaign, Urbana, IL.

polyRAD is an R package for Bayesian genotype calling from sequence read depth in diploid and polyploid organisms. It can use population structure or mapping population design to inform genotype calls, and can export discrete or continuous genotypes. Although the original version of polyRAD allowed inheritance model to vary across the genome, it still required all individuals to be the same ploidy, limiting its use in staple crops such as banana and yam in which breeding populations typically consist of a mixture of ploidies. polyRAD 2.0 will support multiploidy, allowing simultaneous genotyping of individuals of different ploidies. The “possiblePloidies” slot will still be used to indicate potential inheritance modes for loci. A new slot called “taxaPloidy” contains one integer for each individual to indicate its ploidy, and acts as a multiplier for the values stored in “possiblePloidies”. Examples of how to code this information in various crops will be presented in the digital poster. We will also present Miscanthus sacchariflorus as a use case, in which introgression has occurred among diploid, triploid, and tetraploid populations. The development version of polyRAD 2.0 can be installed from GitHub.
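
The multiplier relationship between the two slots can be sketched in plain Python. This is not polyRAD code: the function and variable names are hypothetical stand-ins, and the scaling rule (each value in “possiblePloidies” multiplied by the individual's ploidy and divided by two, so that a value of 2 means the individual's own ploidy) is an assumption that should be checked against the polyRAD documentation.

```python
def effective_ploidies(possible_ploidies, taxa_ploidy):
    """Map each individual to the inheritance modes implied by its ploidy.

    possible_ploidies: list of inheritance modes, each a list of integers
        written relative to a diploid baseline, e.g. [[2]].
    taxa_ploidy: dict of individual name -> integer ploidy.
    Assumption: every value is scaled by taxa_ploidy / 2.
    """
    result = {}
    for taxon, ploidy in taxa_ploidy.items():
        modes = []
        for mode in possible_ploidies:
            scaled = [p * ploidy / 2 for p in mode]
            if any(s != int(s) for s in scaled):
                raise ValueError(f"non-integer ploidy for {taxon}: {scaled}")
            modes.append([int(s) for s in scaled])
        result[taxon] = modes
    return result

# Hypothetical mixed-ploidy panel with one inheritance mode per locus.
panel = {"diploid_1": 2, "triploid_1": 3, "tetraploid_1": 4}
print(effective_ploidies([[2]], panel))
```

Keeping the per-locus inheritance mode separate from the per-individual ploidy is what lets one locus description serve diploid, triploid, and tetraploid individuals at once.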

Increasing the Prediction Accuracy in Genomic Selection of Complex Traits using WGBLUP

Cesar A. Medina¹, Harpreet Kaur², Ian Ray², and Long-Xi Yu¹. ¹United States Department of Agriculture-Agricultural Research Service, Plant Germplasm, Introduction and Testing Research, Prosser, WA. ²Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, NM.

Genomic selection (GS) is a variant of marker-assisted selection in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population for a specific trait. GS is useful for complex traits controlled by many genes with small effects. However, some complex traits such as biomass yield or abiotic stress tolerance have low prediction accuracy (measured as the Pearson correlation between GEBV and phenotypic values). There is a need to increase the prediction accuracy to employ GS in breeding programs. In this work, we developed and tested an alternative GS model named weighted GBLUP (WGBLUP). We integrated DNA marker significance values from genome-wide association studies (GWAS) into WGBLUP analyses. We performed a case study using phenotypic data on biomass yield under salt stress of alfalfa and 13 phenotypic traits of potato to validate the WGBLUP model. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress and up to 90% in potato tuber length. The use of the WGBLUP model will allow the implementation of GS in different breeding programs, increasing the selection accuracy in complex traits.
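
The prediction-accuracy measure named above, the Pearson correlation between GEBVs and phenotypic values, can be computed as in the following minimal Python sketch. The function name and the example values are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical GEBVs and observed phenotypes for five validation clones.
gebv = [1.2, 0.4, -0.3, 0.9, -1.1]
phenotype = [10.5, 9.1, 8.7, 9.9, 7.8]
print(f"prediction accuracy r = {pearson(gebv, phenotype):.2f}")
```

In practice this correlation is usually computed on individuals held out from model training, for example within each fold of a cross-validation scheme.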

A Semi-Automated SNP-Based Approach for Contaminant Identification in Biparental Polyploid Populations of Tropical Forage Grasses

Felipe Bitencourt Martins¹, Aline Costa Lima Moraes¹, Alexandre Hild Aono¹, Rebecca Caroline Ulbricht Ferreira¹, Lucimara Chiari², Rosangela Maria Simeão², Sanzio Carvalho Lima Barrios², Mateus Figueiredo Santos², Liana Jank², Cacilda Borges do Valle², Bianca Baccili Zanotto Vigna², Anete Pereira de Souza¹. ¹University of Campinas, Sao Paulo, Brazil. ²Embrapa Gado de Corte, Mato Grosso do Sul, Brazil.

Artificial hybridization plays a fundamental role in plant breeding programs since it generates new genotypic combinations that can result in desirable phenotypes. Depending on the species and mode of reproduction, controlled crosses may be challenging, and contaminating individuals can be introduced accidentally. In this context, the identification of such contaminants is important to avoid compromising further selection cycles, as well as genetic and genomic studies. The main objective of this work was to propose an automated multivariate methodology for the detection and classification of putative contaminants, including apomictic clones, self-fertilized individuals, half-siblings and full contaminants, in biparental polyploid progenies of tropical forage grasses. We established a pipeline to identify contaminants in genotyping-by-sequencing (GBS) data encoded as allele dosages of single nucleotide polymorphism (SNP) markers by integrating principal component analysis (PCA), genotypic analysis (GA) measures based on Mendelian segregation and clustering analysis (CA). The combination of these methods allowed the correct identification of all contaminants in all simulated progenies (n=200) with more than 690 markers and the detection of putative contaminants in three real progenies of tropical forage grasses, providing an easy and promising methodology for the identification of contaminants in biparental progenies of tetraploid and hexaploid forages or other species. The proposed pipeline was made available through the polyCID Shiny app, which was developed in R language with a user-friendly interface designed to facilitate its use by plant breeders. Furthermore, it can be easily coupled with traditional genetic approaches, such as linkage map construction and other SNP based techniques, thereby increasing the efficiency of breeding programs.
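
A simplified view of the Mendelian-segregation component can be sketched in Python. This is not the polyCID pipeline, which combines PCA, genotypic analysis, and clustering. It only counts, for one putative offspring, the share of markers whose dosage could not arise from the two parents, assuming random chromosome pairing, no double reduction, and no genotyping error. All names and dosages below are hypothetical.

```python
from itertools import combinations

def gamete_dosages(parent_dosage, ploidy=4):
    """Alternate-allele counts possible in a gamete carrying ploidy // 2 chromosomes."""
    alleles = [1] * parent_dosage + [0] * (ploidy - parent_dosage)
    return {sum(g) for g in combinations(alleles, ploidy // 2)}

def possible_offspring(p1, p2, ploidy=4):
    """Offspring dosages obtainable by combining one gamete from each parent."""
    return {a + b for a in gamete_dosages(p1, ploidy) for b in gamete_dosages(p2, ploidy)}

def inconsistency_rate(offspring, parent1, parent2, ploidy=4):
    """Fraction of non-missing markers with an offspring dosage the cross cannot produce."""
    checked = impossible = 0
    for o, p1, p2 in zip(offspring, parent1, parent2):
        if o is None:  # missing genotype call, skipped
            continue
        checked += 1
        if o not in possible_offspring(p1, p2, ploidy):
            impossible += 1
    return impossible / checked if checked else 0.0

# Hypothetical tetraploid dosages at four markers.
parent1 = [4, 1, 0, 2]
parent2 = [0, 0, 0, 2]
print(inconsistency_rate([2, 1, 0, 3], parent1, parent2))  # consistent with the cross
print(inconsistency_rate([4, 2, 1, 0], parent1, parent2))  # mostly impossible dosages
```

With real GBS data some impossible dosages are expected from genotyping error alone, so a rate like this would need a tolerance threshold rather than a strict zero.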