2024 Survey of Genomic, Genetic, and Breeding (GGB) Database Stakeholders

In partnership with Michael Coe (Washington State University)

In Year 3 of our NSF RCN grant (Award Abstract # 2126334), the AgBioData Consortium, in partnership with Washington State University, surveyed database stakeholders (users) on standardized data curation principles and their implementation in data repositories for agricultural research and breeding programs. The survey was a follow-up to a baseline survey run in Year 1 (see report here) and aimed to assess whether the perception or understanding of FAIR data in our community has changed during the NSF RCN grant.

Summary of Year 3 Survey Data

In total, we received 33 usable responses, which are summarized below. The full survey report is available here.

1. Survey Sample, Participant Characteristics, and Familiarity and Experience with GGB Databases

One third of respondents (33%) reported working at a Land Grant University, and 33% reported working at a university that offers related PhD degrees (Table 1).

As displayed in Table 2, respondents most often reported their primary professional role as research scientist (21%), followed by faculty member (15%), plant breeder/geneticist (12%), bioinformatician/data analyst (9%), postdoc (9%), graduate student (9%), curator (6%), animal breeder/geneticist (3%), food and agriculture industry professional (3%), librarian (3%), and technician (3%).


Most respondents reported a professional focus on plants, including major plant crops (42%), model organisms (30%), horticultural specialty crops (15%), wild organisms not directly used in agriculture (18%), and plants grown for other purposes besides human consumption (12%; Table 3). Animals were a primary focus for fewer respondents, including major livestock animals (18% of respondents), minor livestock animals (3%), model organisms (6%), and wild organisms not directly used in agriculture (3%). Pests, diseases, physiological stressors, and other threats were a focus for 18% of respondents. 
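
As a reading aid, the sketch below converts reported percentages into approximate respondent counts. It assumes that every one of the 33 usable responses answered each item, which this summary does not confirm; the function name `approx_count` is hypothetical and used for illustration only.

```python
N_RESPONSES = 33  # usable responses reported for the 2024 survey


def approx_count(pct, n=N_RESPONSES):
    """Approximate number of respondents behind a reported percentage.

    Assumes all n respondents answered the item (an assumption, not a
    fact stated in the summary).
    """
    return round(n * pct / 100)


# A few percentages quoted in this section.
for pct in (33, 21, 15, 3):
    print(f"{pct}% of {N_RESPONSES} is about {approx_count(pct)} respondent(s)")
```

Under this assumption each respondent accounts for roughly 3 percentage points, so small differences between categories may reflect a single response.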

Table 1. Organizational Affiliations of 2024 Survey Respondents


Table 2. Primary Professional Role of 2024 Survey Respondents

 

Table 3. Professional Focus of 2024 Survey Respondents

 

As displayed in Table 4, 84% of participants had not tried to submit data to a GGB database, while 60% or more had attempted to search for or retrieve data from a GGB database, 40% had attempted to reanalyze data retrieved from a GGB database, and 30% had attempted to integrate data from a GGB database with other datasets. Fewer than 20% reported that they had attempted one of these functions and found it impossible, very difficult, or not easy to do in a satisfactory way.

Table 4. 2024 Participant Experience with Specific GGB Database Functions


 

Table 5 displays respondent reports of familiarity with FAIR data principles and practices. More than 70% “moderately” or “strongly” agreed that they are very familiar with the concept of FAIR data; more than half “moderately” or “strongly” agreed that they are very familiar with FAIR data management and could supervise these practices.

Table 5. Stakeholder Familiarity with Implementation of FAIR Data Practices, 2024

2. Baseline Ratings and Recommendations on FAIR Data and GGB Databases

Survey participants were asked to rate their agreement with statements about their experiences, observations, and opinions regarding the current status of FAIR data practices in GGB databases. They were also asked to rate priorities for future development of FAIR data practices in these projects and to provide related comments and recommendations (Tables 6 and 7). Below we summarize the differences between the 2022 and 2024 survey periods, although some differences may be due to sampling error:

  • The value stakeholders place on FAIR data principles. Almost three-quarters of respondents “strongly” or “moderately” agreed that “If I have a choice among data resources, I am likely to work with those that best implement the FAIR data principles,” with 55% “strongly” agreeing, 17% “moderately” agreeing, and another 17% “slightly” agreeing, for a total of 90% agreement.
  • Educational resources provided to database users. In 2024, 61% of stakeholder respondents “strongly” or “moderately” agreed that “GGB databases related to my work highlight the importance of FAIR data principles for researchers and provide educational resources to help researchers understand and follow FAIR practices.” The corresponding figure from the 2022 survey sample was 51%. In 2024, 42% of respondents “strongly” or “moderately” agreed that “GGB databases related to my work provide useful resources for learning how to use them: tutorials, FAQs, how-to documents and videos, etc.” The corresponding figure from the 2022 survey sample was 46%, indicating that the 2024 sample was slightly less likely to report that the databases related to their work provided helpful learning resources for users.

  • Perceptions of the extent to which the GGB databases provide good guidelines for users on a key FAIR data practice. In 2024, 50% of respondents “strongly” or “moderately” agreed that “GGB databases related to my work provide good guidelines on what metadata to provide when preparing/submitting data.” In 2022, this figure was 37%.

  • User-friendliness of data system tools for contributing and finding/retrieving data. In 2024, 39% of respondents “strongly” or “moderately” agreed that projects related to their work make it easy to contribute FAIR and accurate data, and 53% agreed that they make it easy to find and retrieve data. In 2022, baseline survey participants gave identical ratings for data contribution (39%) and ease of retrieval (53%).
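
To give a sense of how large sampling error can be with 33 responses, the sketch below computes an approximate 95% margin of error for a few of the 2024 percentages quoted above. It is a rough illustration, not part of the survey analysis: it assumes a simple random sample, that all 33 respondents answered each item, and a normal approximation that is coarse at this sample size. The function name `margin_of_error` is hypothetical.

```python
import math

N_2024 = 33  # usable responses reported for the 2024 survey


def margin_of_error(pct, n=N_2024, z=1.96):
    """Approximate 95% margin of error, in percentage points.

    Uses the normal approximation for a proportion; assumes all n
    respondents answered the item (an assumption, not a reported fact).
    """
    p = pct / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


# 2024 percentages quoted in the bullets above.
items = [
    ("FAIR educational resources", 61),
    ("resources for learning to use databases", 42),
    ("metadata guidelines", 50),
]
for label, pct in items:
    print(f"{label}: {pct}% +/- {margin_of_error(pct):.0f} points")
```

Under these assumptions each margin is roughly 17 percentage points, which is wider than the 2022-to-2024 differences of 10, 4, and 13 points reported above, so those differences should be read with caution.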

 

Table 6. 2024 Appraisal of the Status of FAIR Data Implementation in GGB Databases

2024 Priorities for Further Development of FAIR Data Practices in GGB Databases

Survey participants were asked to rate the importance of six potential priorities for improved data curation (Table 7). All six were rated as being “very important” or “highest priority” by more than 65% of respondents. The highest ratings were given to “hyperlinks and interconnectedness among databases” and “timely and up-to-date availability of curated data,” but “visualization of integrated data” and “training materials for FAIR data (for data submission)” were also rated as high priorities for further development.

Table 7. 2024 Priorities for Further Development of FAIR Data Practices in GGB Databases

 

Stakeholder Reports of AgBioData Impact on Participants and Their GGB Databases and Resources

In 2024, a new set of questions asked participants, “In the past 2 years, how have you engaged with the AgBioData Consortium?” Respondents could choose more than one answer. More than four-fifths (82%) of respondents had participated in at least one AgBioData event or activity; 48% reported attending an AgBioData workshop or presentation at a conference, 48% reported participating in an AgBioData Working Group, 48% reported attending an AgBioData community meeting (virtually or in person), 46% reported attending an AgBioData webinar, 36% reported having read an AgBioData paper, and 33% reported having responded to a previous AgBioData survey.