Understanding Plant Digital Sequence Information and its Implications

By Peter W.B. Phillips, CSIP Researcher and JSGS Distinguished Professor

The plant system is in an interesting state of flux. There are more than seven million accessions of genetic materials drawn from both the natural environment and commercial breeding lines where plants are characterized in some detail. Only ten or fifteen plants are vitally important to the global food system. These plants provide the bulk of the human and animal nutrition that is foundational to our food security. What then does the digital world do to these plants? 

In the first instance, we are discovering that we do not really know a lot about the plants we have. Many of the accessions in seed banks or botanical gardens may have some really interesting properties, but we have often severed them from the knowledge of how they were used wherever they were originally gathered and cultivated. For this reason, people are hoping to reconnect what breeders call the phenotypic characteristics and traits of various plants to the seeds themselves. This is a big data challenge. Part of this involves sequencing the genetic code, but the bigger challenge is connecting that code to information about what these plants produce in fields as they are grown or cultivated. Getting more information is only half the challenge. We also need strategies to guide how we use the knowledge and information about the plants.

There is a range of efforts underway to breed new crops for our food system. The oldest and still relevant system involves individual farmers selecting and forcing crosses between wild relatives and within traditionally cultivated crops. Landrace varieties are still important in many of the most food-insecure regions of the world.

For much of the past century, governments and scientists have worked to develop new approaches and technologies to accelerate plant breeding. Modern improved varietal systems are anchored in national and international research centers and involve academics, commercial breeders and a few larger farmers.

In a few crop spaces, some of the research effort is becoming more industrialized. While much of the effort is still based on individual breeders using their personal judgement and local resources, there is interest in applying more automation and machine learning to help breeders gather and connect the genomic information and the phenomic characteristics of the plants they grow.

Biotech companies such as Bayer and university research centres such as those in Saskatoon are building large-scale operations to apply the latest technologies and data mining tools to accelerate the breeding process. In the past it could take 13 to 20 years to identify a trait you want to improve and to actually get it into a new variety, into farmers' hands and into the field. Shortening these timelines will accelerate our ability to generate higher-yielding, higher-quality and more environmentally resilient crops. The goal is to apply big data techniques alongside automation and machine learning to reconnect the stock of genetic materials in gene banks and botanical gardens to the breeding space, and then make them practically accessible to breeders as they work to improve their crop varieties.

The current global research system is about 25% public and 75% private, with public institutions doing much of the upstream basic work and commercial firms engaging when there is something to apply and market. The cultures and styles of work in the two systems are at times barriers to the full exchange of knowledge and insights. Mechanization and greater digitization should create opportunities for more fluidity between the two solitudes.

Accelerating breeding, whether through digitization or the application of new breeding tools, will put more pressure on regulatory systems. There will likely be more varieties with a greater range of traits coming into the system for assessment, and the quantity, quality, range and precision of the supporting data will be immeasurably greater. These new and different information flows are bound to force policy makers and regulators to ponder how to use these data most effectively and efficiently to deliver products that improve the yields, quality and resilience of our food system.

Download the Making Waves blogpost.

Read Governance of Digital Sequence Information and Impacts for Access and Benefit Sharing.