Monthly Webinar Schedule

GENERAL INFORMATION:

AgBioData webinars are generally held on the first Wednesday of every month at 10 AM PT | 11 AM MT | 12 PM CT | 1 PM ET.

Connection details are emailed to AgBioData members about a week before each webinar.

Visit the registration page to join and sign up for the mailing list. Questions? Contact us at agbiodata@gmail.com.

UPCOMING WEBINARS:

2026

September 2 - Emily Clough (National Center for Biotechnology Information)
NCBI's Gene Expression Omnibus (GEO)

| Slides | Recording |
August 5 - McKenzie Mabry (Florida Museum of Natural History, University of Florida & New York Botanical Garden; iDigBio)
Crop Wild Relatives And The Role Of Herbaria In Future Food Crop Security
Although Nikolai Vavilov recognized the potential of crop wild relatives (CWR) in the early 1900s, the advent of genome editing technologies such as CRISPR now enables scientists to more fully leverage CWRs as a source of genetic diversity for cultivated crops. As global agriculture faces escalating pressures from climate change, plant biologists are increasingly focused on sustaining crop productivity under shifting environmental conditions while meeting the demands of a growing population. CWRs represent a critical reservoir of genetic variation that can be harnessed to address these challenges. While efforts to expand germplasm collections of CWRs are ongoing, herbaria remain an underutilized resource. At the same time, the growing availability of digitized herbarium data provides new opportunities to address large-scale research questions. In this study, occurrence records for CWRs of several important crop species are obtained from repositories such as iDigBio and the Global Biodiversity Information Facility. Following data cleaning, ecological niche modeling is used to estimate global habitat suitability under both current and future climate scenarios. This work highlights the increasingly important role of herbaria and digitized biodiversity data in supporting crop improvement efforts, particularly in the context of ongoing climate change.

| Slides | Recording |
June 3 - Alen Zimić (Cornell University)
Using LLMs for knowledge extraction at massive scales

| Slides | Recording |
May 6 - Olivia Haley (USDA-ARS Postdoctoral Research Fellow at the Oak Ridge Institute for Science and Education)
Delivering AI-Ready Genomics with MaizeGDB
The integration of Artificial Intelligence (AI) into computational biology is changing biological research, particularly in agriculture, where large and complex datasets offer opportunities for discovery and crop improvement. Maize (Zea mays L.), a globally critical crop with extensive genomic, genetic, proteomic, and functional resources, stands to benefit from AI integration. The Maize Genetics and Genomics Database (MaizeGDB) is proactively building an AI-ready infrastructure by standardizing datasets, pre-computing complex features, developing novel interactive tools, and providing reproducible workflows. This paper details MaizeGDB's strategic initiatives to create a foundation of AI-ready data in standardized formats and generate precomputed embeddings from cutting-edge DNA and protein language models. We introduce new functionalities, including zero-shot variant effect scoring derived from biological language models (protein and DNA) and genome browser tracks for visualizing nucleotide conservation (conveying potential functional significance). Furthermore, we provide custom dataset assembly resources and reproducible workflows via GitHub. By providing access to and organization of maize data, MaizeGDB enables the maize research and breeding community to leverage AI for the accelerated discovery of gene function, variant interpretation, and the development of improved maize varieties.

| Slides | Recording |
April 1 - Omar Harb (Director of Scientific Outreach & Education; Project Manager, VEuPathDB)
VEuPathDB: A Comprehensive Informatics Resource for Eukaryotic Pathogen, Vector, and Host
VEuPathDB (the Eukaryotic Pathogen, Vector, and Host Informatics Resource) is an integrated bioinformatics platform that empowers researchers with comprehensive data, advanced analytical tools, and emerging artificial intelligence capabilities to accelerate discovery in infectious disease biology. The resource consolidates high-quality genomic, transcriptomic, proteomic, and functional data for eukaryotic pathogens, vectors, and relevant host systems, coupled with curated metadata and standardized ontologies. VEuPathDB’s suite of tools enables complex in-silico analyses, including comparative genomics, gene expression profiling, metabolic pathway exploration, and customizable data mining workflows. Recent enhancements leverage AI-driven methods for summarizing transcriptomic data, facilitating hypothesis generation and the interpretation of large-scale experimental results. Through an intuitive web-based interface and APIs, VEuPathDB supports reproducible science and community-driven annotation efforts. To remain open access and sustain its development, maintenance, and ongoing expansion of data and computational capabilities, VEuPathDB employs a voluntary subscription-based funding model. Under this approach, participating labs, institutions, and companies contribute annual subscription fees that support core infrastructure, curation activities, and user support services. The model balances broad access for the scientific community with stable funding streams, enabling strategic investments in AI integration and responsive tool enhancement. Subscription revenues complement competitive grant funding and are essential for long-term platform growth, ensuring that VEuPathDB continues to meet evolving research needs in pathogen, vector, and host informatics.

| Slides | Recording |

PAST WEBINARS:

2026

March 4 - John McNamara (Professor Emeritus of Animal Sciences at Washington State University)
Integrating phenomics and phenotypes into agricultural databases
The United States agriculture and food systems research and education system remains the envy of the world, and the US Department of Agriculture and the Land-Grant University system lead the public and private partnerships that have improved agricultural productivity and human health phenomenally for over 160 years. The continuation of these improvements relies on equitable access to trustworthy data, particularly in genetics and phenomics, and the ability to leverage such data to address future scientific challenges. In the new article set forth by the Phenotypic Data Standardization and Management working group, we discuss the growing need in agriculture for phenomic databases that follow findable, accessible, interoperable, and reusable (FAIR) data guidelines, as well as the need for public policy supporting a sustainable funding model for these databases.

| Slides | Recording |
February 4 - Pascale Gaudet (Swiss Institute of Bioinformatics)
Extending the Gene Ontology for Biological Network Modeling
The Gene Ontology (GO) is an essential resource in the bioinformatics landscape, supporting a wide range of applications. It serves both as a comprehensive encyclopedia of biological knowledge and as a key tool for interpreting high-throughput biological data. More recently, GO has expanded to support biological network modeling through the development of GO Causal Activity Models (GO-CAMs), which enable network analysis within the highly structured GO data framework. This presentation will introduce the GO-CAM data model, describe its relationship to standard GO annotations, and highlight ontology developments driven by GO-CAM requirements. It will also discuss how GO is leveraging artificial intelligence to enhance multiple aspects of the GO workflow, including ontology development, annotation, and quality control.

| Slides | Recording |

2025

December 3 - Heidi J. Imker (University of Illinois Urbana-Champaign)
Sustainability During Instability: Long-Lived Life Science Databases and Science Funding Outlook in the United States
For many years, life science researchers have enjoyed free, open access to a wide range of online databases that make data sharing and reuse possible. But even long-established resources have always faced sustainability challenges, and recent changes in U.S. science funding are making things even tougher. This presentation covers a study recently posted to bioRxiv which took a multiple mini–case study approach to document potential impacts for nine long-standing databases, including BHL, MorphoBank, OMIM, ORDB, rrnDB, VEuPathDB, WormAtlas, and two that preferred to remain anonymous. The results point to growing barriers to data access, a loss of valuable expertise, and disruptions to ongoing and future research. The presentation will conclude with a discussion of sustainability models given the current funding climate.

| Slides | Recording |
November 5 - Barend Mons (Professor Emeritus at Leiden University; LIFES Institute; GO-FAIR)
Stop data sharing: AgBioData and FAIR visitation

| Slides | Recording |
October 1 - Paige Kulzer (University of North Carolina at Charlotte)
Toward Sustainable Genomic Data Accessibility and Visualization with the Integrated Genome Browser
Integrated Genome Browser (IGB, pronounced “ig-bee”) is a fast, feature-rich, open-source desktop genome browser that thousands of researchers have used to explore and analyze genomic data. To make it as easy as possible for researchers to load their data in IGB, we provide built-in genome assemblies and annotations for model and non-model organisms. We obtained many of these from sources familiar to AgBioData members, including TAIR, Sol Genomics Network, and MaizeGDB. Other database and genome browser developers do similar work to disseminate genomic data to researchers, and some offer robust programmatic access to their data via APIs (application programming interfaces). By accessing these computational resources, IGB can show new assemblies without our team needing to replicate assembly data files to our own servers. In this webinar, we’ll discuss IGB’s latest integrations with genome data providers and AgBioData members. We’re also excited to highlight the ongoing work of doctoral student Karthik Raveendran in our lab, who is developing innovative methods for visualizing single-cell RNA-Seq data in the Integrated Genome Browser, and how this new capability helps biologists understand, evaluate, and analyze these data better. Altogether, these integrations address one of the most important problems in data visualization in bioinformatics: developing sustainable ways to make the vast wealth of genome-centric experimental data available to the community.

| Slides | Recording |
September 4 - Sonia Balyan (Indian Biological Data Centre)
Indian Crop Phenome Database: Advancing Crop Research Through Open Phenomic Data
The Indian Crop Phenome Database (ICPD), developed at the Indian Biological Data Centre (IBDC), is a pioneering national initiative designed to address the challenges of managing large-scale phenotypic and associated datasets — including soil, weather, QTL, and passport data of genotypes — in agriculture. India generates vast volumes of phenotypic data from diverse crop species through field trials, breeding programs, and research projects; however, the absence of standardized formats and dedicated repositories has often left this wealth of information underutilized. ICPD addresses these gaps by fully embracing the FAIR principles — ensuring data are Findable, Accessible, Interoperable, and Re-usable. As the designated data hub for major mission-mode programs on Characterization of Genetic Resources supported by the Department of Biotechnology (DBT), India, ICPD offers a robust framework for digitization, curation, and sharing of crop phenotyping data, fostering seamless knowledge exchange across the scientific community. Each dataset receives a unique IBDC accession, ensuring traceability, proper citation, and long-term preservation. Supporting over 30 crop species, ICPD adopts international ontology standards for traits, tissues, developmental stages, and methodologies, while also allowing the creation of new ontology terms with temporary accessions that undergo expert curation. This generic framework enables the submission of any crop phenome data, providing both flexibility and standardization. By serving as a centralized, standards-driven, and FAIR-compliant repository, ICPD is poised to transform phenomics research in India — accelerating the development of climate-resilient, high-yield, and pest-resistant cultivars, and strengthening the scientific foundation for global food and nutritional security.

| Slides | Recording |
August 6 - Trupti Joshi (Joan C Edwards School of Medicine, Marshall University)
Translational Bioinformatics Resources and AI Solutions for Multiomics Research
Next-generation sequencing and multiomics data (bulk and single-cell), capturing molecular changes from genomics all the way to phenomics, have become an integral part of research in all domains, including biomedical sciences, plant sciences, and others. This rapid revolution in multiomics has created a growing need for translational tools that can handle large amounts of data, are easily expandable, provide interpretable results, and can be readily applied to any species. To address these translational needs, we have developed the Soybean Knowledge Base (SoyKB) and Knowledge Base Commons (KBCommons) web-based frameworks, both fully equipped to handle the entire multiomics landscape for all organisms. Our tools, such as Allele Catalog, GenVarX, AccuTool, and MaDis, are specifically designed to provide the plant community with efficient data-driven solutions for better breeding strategies. Additionally, G2PDeep, our deep learning method, provides a comprehensive web-based resource for phenotype predictions using multiomics data for all organisms.

| Slides | Recording |
June 4 - Sean Wilkinson (Oak Ridge National Laboratory)
The FAIR Principles for Computational Workflows
Recent trends in the computational and data sciences show an increasing recognition and adoption of computational workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, computational workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with computational workflows across disciplines and domains, has systematically addressed the application of both FAIR data and software principles to computational workflows in a recent publication. This presentation will discuss the WCI-FW working group, some of the critical questions we needed to answer, the resulting set of FAIR principles for workflows, and opportunities for collaborating with the AgBioData Consortium to support FAIR workflow implementation. We hope that the FAIR recommendations for workflows proposed in the paper will maximize their value as research assets and facilitate their adoption by the wider community.

| Recording |
May 14-15 (9 AM - 12 PM CT) - 2025 AgBioData community workshop.
Meeting agenda, slides, and recordings are available here.
April 2 - Gregor Bucher (University of Göttingen)
iBeetle-Base - a phenotypic database with community edition
The red flour beetle Tribolium castaneum has become an important insect model organism to study gene function and RNAi-mediated pest control. iBeetle-Base was developed to document morphological phenotypes from the genome-wide RNAi gene-knockdown screen iBeetle and has become an integrated resource of genomic and phenotypic information for Tribolium castaneum research. Since our last release in 2024, iBeetle-Base allows the community to contribute RNAi phenotypes and associate Gene Ontology terms and publications with Tribolium genes (“community addition”). This novel functionality overcomes the lack of curators for databases of small communities. Further, we improved data accessibility via programming interfaces alongside traditional HTML pages, which facilitates automated use of the database by scripts and third-party tools. Importantly, this phenotypic database, with its special adaptations to small research communities, could become the nucleus for other interlinked databases in the future. iBeetle-Base is freely accessible at https://ibeetle-base.uni-goettingen.de/

| Recording |
March 5 - Alenka Hafner (Pennsylvania State University)
AgBioData's Data Reuse WG: our current guidelines and future considerations
The scientific community has long benefited from the opportunities provided by data reuse. Recognizing the need to identify the challenges and bottlenecks to reuse in the agricultural research community and propose solutions for them, the data reuse working group was started within the AgBioData consortium framework. Throughout two years of remote meetings and asynchronous discussion, we identified the limitations of data standards, metadata deficiencies, data interoperability, data ownership, data availability, user skill level, resource availability, and equity issues, with a specific focus on agricultural genomics research. We described these in a white paper, out now in GigaScience (https://doi.org/10.1093/gigascience/giae106), while proposing possible solutions stakeholders could implement to mitigate and overcome these challenges and providing an optimistic perspective on the future of genomics and transcriptomics data reuse.

| Recording |
February 5 - Adam Wright (Ontario Institute for Cancer Research) and David Molik (USDA-ARS)
Guidelines for Gene and Genome Assembly Nomenclature (GAAN)
Clear and informative naming schemes enhance the utility of genome assemblies and gene annotations. We present a comprehensive nomenclature framework which incorporates species, sequencing group, colony/breed/strain, version, and other critical metadata in a well-structured format. This approach aligns with standards from AgBioData discussions and ensures compatibility with major repositories such as those in the INSDC. To facilitate adoption, we developed the Gene and Genome Assembly Nomenclature (GAAN) tool for validating names under these guidelines. Future iterations of GAAN will integrate external databases like the Darwin Tree of Life Identifiers and the Vertebrate Breed Ontology for further validation.

| Slides | Recording |

2024

December 4 - Sarah Dyer (EMBL-EBI)
Plants, pollinators, and pests in Ensembl - integrating genomics data for agriculturally relevant species
Ensembl is an open platform integrating publicly available genomic data to support the exploration of gene annotations, genetic variation, and comparative genomics. Increasing numbers of genomes are available for agriculturally relevant species, with multiple high-quality genomes now being generated for many crops. In addition, large-scale biodiversity projects are increasing the number of insect genomes available for pest and pollinator species. Providing ways to explore this genomic data is key to supporting research and breeding efforts for agricultural species. Ensembl Metazoa and Ensembl Plants now hold more than 380 invertebrate and almost 160 plant genomes, respectively. In addition, our new Ensembl site (https://beta.ensembl.org) already holds more than 1,500 invertebrate and over 100 plant genomes. The new site will ultimately replace the current suite of Ensembl component sites, bringing annotated genomes together from across the tree of life. In this webinar we will look at the species, data types and tools you can find in Ensembl and ways you can access these resources.

| Recording |
November 6 - Virtual round-table on new AgBioData working groups on artificial intelligence (AI) in agricultural genomics and natural language processing (NLP) for biocuration
Artificial intelligence in agricultural genomics and natural language processing for biocuration are becoming more popular in the community, enabling new applications while also raising new data-related issues. We invite the AgBioData member databases and the larger community to provide feedback on these challenges, help us understand their importance for the AgBioData member databases, and define the focus of new AgBioData working groups.

| Recording |
October 2 - Dr. Montana Smith (Pacific Northwest National Lab)
NMDC: Advancing microbiome science through FAIR and standardized metadata and data
The National Microbiome Data Collaborative (NMDC)’s mission is to support a FAIR microbiome data-sharing network through infrastructure, data standards, and community building that addresses pressing challenges in environmental sciences. In this webinar, we will dive into what the NMDC is and how standardized metadata capture enables FAIR data. We will walk through the 4 NMDC products and how they’re lowering barriers for experimental scientists to conduct their research in a way that ensures data re-use.

| Recording |
September 4 - Dr. David Emms (InstaDeep)
AgroNT: A Foundational Large Language Model for Plant Genomics
Foundational large language models can be pre-trained on large unlabelled datasets and subsequently fine-tuned to a wide range of specific tasks. We’ll present AgroNT (Agro Nucleotide Transformer), a foundational DNA large language model pre-trained on reference genomes from 48 plant species with a predominant focus on crops. We have shown that AgroNT can be fine-tuned to obtain state-of-the-art predictions of many genomic elements, including polyadenylation sites, splice sites, open chromatin and enhancer regions. Furthermore, AgroNT can be fine-tuned to e.g. predict tissue-specific gene expression levels or to prioritize functional variants.
Building on our Nucleotide Transformer, the novel SegmentNT model is able to make nucleotide resolution predictions, well suited to tasks such as de novo genome annotation of previously unseen species. Both our AgroNT and SegmentNT models are open-sourced for academic research and non-commercial uses on our GitHub repository https://github.com/instadeepai/nucleotide-transformer and HuggingFace space https://huggingface.co/InstaDeepAI.

| Recording |
August 7 - Seth Murray (Texas A&M University)
Capturing Nature AND Nurture with Temporal Field Phenomics to Breed Better Crops
An organism’s phenome results from genotype (nature), environment and management effects (nurture) and their interactions, as well as measurement error. For over 30 years, DNA sequencing and genomics tools have advanced genotyping to where genomes can now be routinely saturated with measurements. In contrast, most focus in high-throughput phenotyping and phenomics to date has been on automating previously known “traits” as measurable and interpretable phenotypes, akin to focusing on measuring a single DNA marker rather than measuring a saturated genome. Tools such as unoccupied aerial systems (UAS, aka UAVs, drones) collecting temporal phenomic measurements in the field now allow novel methods in plant breeding and new insights into plant biology. Viewing phenomics as a platform for discovery, similar to genomics, opens new methods for capturing phenomena in nature and nurture. To date, our experience with phenomic prediction from UAS in maize breeding for cumulative, complex phenotypes such as grain yield suggests it’s possible to predict organismal performance in untested environments, possibly even better than gold-standard genomic methods. Surprising insights into biology have also been made through these activities, including predicting plant disease and resistance, evaluating genotypic resilience to stress, and identifying early season growth periods for crop improvement that could not previously be selected for. Method development and data analytics in phenomics are large investments, but worth making. Successfully measuring the phenome will impact every aspect of science and society, in biological disciplines from germplasm curators and physiologists to breeders, as well as education, the courtroom, and policy.

| Recording | Slides |
June 5 - Ethy Cannon (USDA-ARS)
Pan-genomic resources at MaizeGDB
Pan-genomes, encompassing the entirety of genetic sequences found in a collection of genome assemblies within a clade, can be more useful than single reference genomes. This is especially true for Zea mays, which has a particularly diverse and complex genome. Presenting full pan-genome data is challenging, especially for a diverse species, but valuable when pan-genomic data can be linked to extensive gene model and gene data, including classical gene information, markers, insertions, expression and proteomic data, and protein structures, as is the case at MaizeGDB. I will present the pan-gene analysis pipeline Pandagma and MaizeGDB’s pan-gene data center, which offers a variety of browsing and visualization options, including sequence alignment visualization, gene trees, and more, enabling exploration of pan-genes in Zea.

| Recording |
April 29-30, May 2 - 2024 AgBioData virtual community workshop (agenda available here)
This three-day meeting will feature presentations from ending and current AgBioData working groups (WGs) about their accomplishments and recommendations, breakout room discussions on data-related issues, and updates on the consortium's future.

| Recording |
April 3 - Zachary Miller (Cornell University)
Introducing The Practical Haplotype Graph Version 2: A Streamlined and Simple Pangenome System
The Practical Haplotype Graph (PHG) is a powerful tool for representing diverse plant pangenomes and imputing new sample genotypes for breeding programs and other purposes. Low-coverage sequencing data from various technologies (DArT, GBS, etc.) is sufficient to identify paths through the graph, which can be stored efficiently within the PHG database or used to call variants and create custom genomes for alignment. PHGv2 refines and streamlines the original PHGv1 platform.

| Recording | Slides |
March 6 - Cyril Pommier (INRAE-URGI)
FAIR Plant Phenomics Data Management Tools and Guidelines from ELIXIR and Emphasis European Infrastructures
Plant phenomics data management has been greatly facilitated over the past ten years at several levels: data standards to organize and describe data, databases for the management of experiments, data repositories to ensure long-term accessibility, data portals to maximise findability, and finally guidelines to ease their usage. We will review the recent advances from joint initiatives involving two European infrastructures: ELIXIR (life science data) and EMPHASIS (plant phenomics). First, we will update the current status of MIAPPE (www.miappe.org), a data standard interoperable with the Breeding API that enables not only the formalisation of phenotyping experiments but also their linking with genotyping. We will also give an overview of its usage in generic data repositories such as Dataverse or Zenodo and their relation to experimental databases such as PHIS. Finding the right documentation to use these tools and standards is not always straightforward. The RDMKit (https://rdmkit.elixir-europe.org/) is a guidelines portal that has been built to help researchers find the subset of information they need. Through dedicated sections, such as the plant domain page (https://rdmkit.elixir-europe.org/), it shows the complementarity between standards and tools and provides the guidance needed for data management. Finally, we will also update the status of FAIDARE (https://urgi.versailles.inrae.fr/faidare/), a global data portal that indexes 30 databases using either BrAPI or a generic minimal format.

| Recording | Slides |
February 7th - This webinar will feature two presentations from:

Paul D. Thomas (University of Southern California and Gene Ontology Consortium)
Accurate annotation of protein sequences at large scale, using evolutionary modeling
Inferring (aka “annotating” or “predicting”) the functions of the vast numbers of known protein sequences has been a longstanding challenge in genomics. Over the last decade, a comprehensive system has been developed for addressing this challenge based on constructing and applying models of function evolution in protein families. The main components of the system, including PANTHER phylogenetic trees, Gene Ontology phylogenetic annotations and TreeGrafter software (now implemented in InterProScan), work together in an integrated software and data suite that is now beginning to be broadly used to annotate the functions of protein-coding genes. I will describe each of these components, as well as how the tool can be easily used to annotate any set of protein-coding genes and how users can give feedback to help improve the annotations.

Alex Ignatchenko (EMBL-EBI)
The Gene Ontology (GO) Annotation (GOA) project at EMBL-EBI aims to provide high-quality GO annotations to proteins in the UniProt Knowledgebase (UniProtKB), RNA molecules from RNAcentral and protein complexes from the Complex Portal. Currently, the GOA database hosts 5 million manually curated GO annotations from over 70 research groups. This set is used as a foundation for 15 automatic GO annotation pipelines. The output data are regenerated every two months and are commonly referred to as Inferred from Electronic Annotation (IEA). The IEA pipelines use a range of statistical, rule-based and machine learning algorithms to enrich existing GO annotation coverage. The generated IEA set of over 1.1 billion GO annotations is subject to over 130 checks, constraints and filters to ensure the quality of predicted GO annotations. The GOA data are publicly available from the GOA FTP site and the GO annotation browser QuickGO. The GOA team is constantly looking for ways to improve the quality of GO annotations and gene product coverage.
TreeGrafter is a method for predicting GO annotations based on PANTHER family/subfamily and InterPro signatures. The project is a collaboration between PANTHER and the InterPro team at EMBL-EBI. The algorithm was published in 2019, and it was incorporated into InterPro in the second half of 2023. The TreeGrafter mappings were processed and added to the GOA database for testing shortly after. This implementation resulted in about 301 million GO annotations after the GOA pipeline checks and filters. More importantly, the final set has over 200 million GO annotations that are not predicted by any other IEA method. The GOA team plans to integrate the TreeGrafter GO annotation pipeline into the GOA database and release it to the public in the first half of 2024.

| Recording |

2023

December 6th - This webinar will feature two presentations from:

Benjamin Cole (Joint Genome Institute) on "Data management considerations for plant single-cell genomics."
While plants have arrived on the single-cell scene relatively late, the number and complexity of plant single-cell datasets have exploded over the past four years. With that massive increase in data has come a pressing need to ensure accurate documentation of the experimental provenance of plant single-cell datasets, not only for reproducibility but also for reusability in meta-analyses. During this presentation, I will discuss the current state of plant single-cell research as well as the most common practices for data storage. I will also argue for the need for better standards in the field, and what that could potentially enable.

Christopher Tuggle (Iowa State University) & Muskan Kapoor (Iowa State University) on "Single-Cell genomics data incorporation into agricultural G2P research by building a FAIR data ecosystem."
We will describe a pilot-scale project to determine if our current metadata standards for livestock and crops can be used to ingest scRNAseq datasets in a manner consistent with HCA DCP standards and if established resources (e.g., Terra) can be used to analyze the ingested data. Currently, the most comprehensive data ingestion portal for high throughput sequencing datasets from plants, fungi, protists, and animals/humans is Annotare (located at EMBL-European Bioinformatics Institute). For agricultural animal datasets, another EMBL-EBI portal, the FAANG portal, has been developed. scRNAseq data/metadata can be submitted to FAANG using a semi-automated process. We have extended this tool for scRNAseq data so that files can be validated using the HCA DCP metadata and data validation service. These files are incorporated using EMBL-EBI’s HCA DCP ingestion service and transferred to Terra for further analysis. We will also describe a Shiny-based web application, implemented in R and called Shiny-PIGGI, for the single cell-level transcriptomic study of pig immune tissues and peripheral blood mononuclear cells, which will be an important resource for improved annotation of porcine immune genes and cell types (https://shinypiggi.ansci.iastate.edu). We intend to further build upon these existing tools to construct a scientist-friendly data resource and analytical ecosystem to facilitate single cell-level genomic analysis through data ingestion, storage, retrieval, re-use, visualization, and comparative annotation across agricultural species.

| Recording |
November 1st - Ben Rosen (USDA-ARS)
The Ruminant T2T Consortium
The first draft of the human genome assembly was released over twenty years ago. However, a gapless telomere-to-telomere (T2T) “complete” assembly remained elusive until last year. The highly repetitive nature of pericentromeric, subtelomeric, and duplicated gene families, such as rRNA arrays, made them impossible to assemble. It was only with advances in long-read sequencing technologies and new bioinformatic tools that these structures were resolved. Recently, we proposed the application of these new resources, tools, and knowledge in support of a “Ruminant T2T Consortium.” Our goal is to generate complete genomes for the ruminant evolutionary lineage. The ruminant Suborder comprises six Families and 66 living genera. These species are found in geographically dispersed areas and have adapted to a wide variety of environments. They have also been subjected to both natural and artificial selection. Our hypothesis is that T2T assemblies of ruminant species with relatedness varying from those capable of interbreeding to higher evolutionary distances (up to the estimated 25 million years ago last common ancestor) will inform our understanding of the underpinnings of ruminant evolution. It will also shed light on the genomic consequences of domestication and enhance our knowledge of the functional roles of heterochromatin and other repeat regions of the genome.

| Recording | Slides |
October 4th - Pascal Neveu (UMR MISTEA, INRAE, France)
PHIS, an ontology-based Information System for Plant Phenotyping
The European EMPHASIS infrastructure aims to enable researchers to use facilities, resources and services for high-throughput phenotyping of plants. Within the infrastructure we are leading actions to help scientists better understand plant performance and translate this knowledge into applications. This presentation will look at some examples of data management and implementation of data standards carried out in this context to add value to the phenotyping data. In particular, we will look at PHIS, an ontology-based information system based on the OpenSILEX framework.

| Recording | Slides |
September 6th - Harry Caufield (Lawrence Berkeley National Laboratory)
Staying grounded: assembling structured biological knowledge with help from large language models
Developing comprehensive knowledge bases and ontologies demands meticulous curation. The emergence of highly flexible, artificial intelligence-driven approaches to natural language processing offers novel ways to expedite this process. Current methods often rely on extensive training data, however, and struggle with complex, nested knowledge structures. In this talk, I will describe a new approach, Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES). This method for information extraction leverages the capability of Large Language Models (LLMs) to perform zero-shot learning (ZSL) with a variety of natural-language prompts. SPIRES operates with predefined data schemas, enabling information extraction that adheres to these structures. It also grounds concepts with well-established ontologies and vocabularies, avoiding the "hallucinations" common to text generated by LLMs.
Within OntoGPT, an LLM-querying framework we have developed, SPIRES supports rapid application to summarization and modeling across plant science and biology. Notably, this approach allows customization for new tasks and topics without a need for new training data. We have found that OntoGPT and SPIRES are capable of extracting structured knowledge from large literature collections and constructing knowledge graphs from the resulting relationships. Through harnessing the language comprehension capabilities of LLMs, SPIRES streamlines knowledge acquisition from across agricultural science and beyond.

| Recording | Slides |
August 2nd - Virtual round table on phenotypic data management issues of the AgBioData member databases
In Year 2 of our current NSF RCN grant, the AgBioData database community indicated sharing and managing phenotypic data as one of the primary data issues to address. We have surveyed our member databases, and many of them agreed that curating phenotypic data is challenging due to their diversity in terms of data types (e.g., images vs. spreadsheets) and data sources (e.g., breeding program, experimental trials, or literature), as well as the lack of standardization. We are inviting the AgBioData member databases and the larger community to discuss these challenges to understand their importance for AgBioData member databases and if they can be addressed entirely or partially in a new AgBioData working group.

| Recording | Slides |
July 12th - Sarah Lippincott (Dryad)
Companion planting: How generalist and specialist repositories can work together to promote agricultural data sharing and reuse
This conversation explores strategies for enhancing open sharing and reuse of agricultural data through collaboration between disciplinary and generalist repositories, specifically Dryad, an open data publishing platform and community. Generalist and specialist repositories bring distinct strengths to data sharing and reuse. Generalist repositories offer a home for a wide range of data types and support serendipitous discovery, while discipline-specific repositories offer granular metadata, specialized tools, and deep understanding of community needs. Agricultural data creators, and future data re-users, can benefit from collaboration between these different solutions, including building connections with complementary datasets stored in multiple repositories; consistency in metadata standards; and federated discovery systems. In this conversation, Dryad’s Head of Community Engagement, Sarah Lippincott, will describe Dryad’s stewardship of agricultural data and engage attendees in an exploration of how Dryad can work with the agricultural research community to improve data sharing, discovery, and reuse.

| Recording | Slides |
June 7th - Peter Selby (Cornell University)
Applications and impacts of the BrAPI project on plant breeding
Modern genomic breeding methods rely heavily on very large amounts of phenotypic and genotypic data, presenting new challenges in effective data management and integration. The datasets are often large and complex, and the data is often stored on multiple systems, sometimes separated by country and organization. As the common analysis methods increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge. This webinar will be an introduction to The Breeding API (BrAPI) Project. The BrAPI Project began in 2014 when a small group of plant breeding and technology experts came together to try to standardize their data. Since then, BrAPI has become internationally accepted as one of the primary data exchange standards in the plant breeding domain. This webinar will give an overview of what BrAPI is, how it works, what it is capable of, and the impact the project has had so far on the community.

| Recording | Slides |
May 1-2 - 2023 AgBioData community workshop (Chicago, IL)

| Recording |
April 5th - Krystal Tsosie (Arizona State University)
From Green Revolution to “Rescue” Indigeneity: Using Digital Data Tools and Machine Learning Approaches to Protect Indigenous Knowledge and Biodiversity
Comprising less than 5% of the world's population, Indigenous people protect 80% of global biodiversity. The next genomic “discoveries” in industry and academia may co-opt Indigenous knowledge or disenfranchise Indigenous peoples, who are often last to benefit and are least protected from intellectual property claims. Ethical and sustainable research necessitates new digital data approaches as grounded in machine learning and Indigenous stewardship models to operationalize CARE data governance principles, direct benefit sharing, and equitable engagement and partnerships with Indigenous communities.

| Slides |
March 1st - Irene Cobo Simón (Institute of Forest Science ICIFOR-INIA, CSIC; Spain)
CartograPlant: Cyberinfrastructure to improve plant health and productivity in the context of a changing climate.
Climate change is threatening plant health and productivity at all spatial scales. To date, it remains largely unknown whether plant breeding and agricultural management practices can keep pace with the rate and direction of environmental change, as well as species’ rate of adaptation to rapid environmental change. In addition, the incidence of invasive pests and pathogens is increasing as a consequence of globalization. This trend is being exacerbated by climate change. Thus, future plant health and productivity will depend on the match between genotypes (and their resulting phenotypes) and new environments. However, these analyses are challenging since they require the integration of diverse data types, usually decentralized and lacking in standardization: genotypic, phenotypic and environmental. Hence, centralized and up-to-date platforms which integrate, visualize and analyze high-throughput biological data are key, especially in the current big data era in plant biology. CartograPlant (https://cartograplant.org/) is a web-based application that integrates, visualizes, and analyzes genotypic, phenotypic, environmental data, and their associated metadata, from georeferenced plants. Environmental data is available through advanced integration of global and regional layers. The genotype and phenotype metrics are collected through direct submission of studies at the time of publication or through the biocuration efforts of the affiliated databases and applications (TreeGenes, BIEN, TreeSnap). Data analysis is enabled by accessing the metadata associated with the public studies and providing appropriate workflows through Galaxy (https://galaxyproject.org/). This metadata collection, using ontologies and standards, allows the integration and analysis of data from different studies, which is key to performing both mega- and meta-analysis.
Mega-analysis and meta-analysis of GWAS (GxP association) and landscape genomics (GxE association) studies can improve the power to detect association signals by increasing sample size and by examining more variants throughout the genome than each dataset alone. Thus, they allow users to answer unprecedented and ambitious adaptive questions, taking advantage of the potential of high-throughput biological data. This talk will describe the recent updates in data sources, functionalities, and analytic workflows offered by CartograPlant.

| Recording | Slides |
February 1st - Monica Munoz-Torres (University of Colorado School of Medicine)
The Monarch Initiative: harmonizing cross-species data for disease diagnostics and discovery.
Addressing complex scientific challenges requires weaving together data from diverse sources, organisms, contexts, formats, and granularities, and building a coherent holistic view of this data landscape to address any given problem is non-trivial – much of the relevant information is scattered and not readily accessible for searching or analysis. The Monarch Initiative is a consortium and a set of resources aiming to overcome these limitations by integrating the fragmented data landscape into the most comprehensive open collection of genotype-phenotype data available. Monarch seeks to bridge the space between basic and applied and clinical research, developing tools that facilitate connecting data across a variety of scientific approaches and disciplines including genomics, proteomics, molecular modeling, diagnosis of disease and syndromes, and the organization of patient record data. The Monarch Knowledge Graph (KG) links together clinical, biomedical, and basic science research data spanning multiple species, and it supports reasoning across a wide range of organisms, body systems, and diseases. We founded the Human Phenotype Ontology (HPO), one of the most widely used biomedical ontologies and the gold standard for describing human phenotypes, and are also creators of the Mondo unified disease ontology, the Unified Phenotype Ontology (uPheno), the cross-species anatomy ontology (Uberon), the Environmental Conditions and Treatments Ontology (ECTO), and most recently, the Vertebrate Breed Ontology (VBO), a single source for data standardization and integration of all breed names. We also created the Simple Standard for Sharing Ontology Mappings (SSSOM) to harmonize the ontologies that are used by the sources, and the only ISO-approved standard for exchanging detailed, case-level phenotype data, Phenopackets. Monarch tools and resources are publicly available and are designed for both informatics users and clinical and basic research use cases.
By making data more interoperable, our widely-used standards for data annotation and exchange help support a wide range of data sharing and reuse by projects and organizations around the world, and reduce the effort they need to devote to data harmonization. During this presentation, we will introduce you to a few of these resources and offer you the information to find and implement the ones that best serve your scientific needs.

| Recording |

2022

December 7th - Sushma Naithani (Oregon State University)
Plant Reactome: Using OMICs data for biocuration of plant genes and pathways
The major challenge in analyzing and connecting genotype to phenotype data at the organismal level is their integration and visualization for knowledge synthesis, which is required for generating OMICs data-driven predictive models for precision breeding of crops as well as assessing the needs of conservation of biodiversity and long-term sustainability. The Plant Reactome (https://plantreactome.gramene.org) is one such platform that allows integration of data from heterogeneous sources (i.e., published literature, transcriptome, proteome and metabolome data, orthology-based projections) for synthesizing in silico modeling of system-level plant pathway networks including metabolic pathways, biological processes associated with plant development and reproduction, and genetic-regulatory mechanisms that mediate plant survival under varied stress conditions. It provides a valuable framework for understanding how a gene, a group of connected genes, or genotypic differences culminate in a phenotype and supports the generation of data-driven hypotheses for understanding the intra- and inter-species differences for basic and translational research and precision breeding. Here, we emphasize our recent efforts in using Omics data for improving gene/gene family functional annotations and biocuration of gene-gene networks.

| Recording |
November 2nd - Jennifer E. Cross (Colorado State University)
Science of Team Science: Using Developmental Evaluation to Advance Transdisciplinary Teams and Evolve Science
For the past 20 years or more, science has been evolving to answer more complex questions, which require more complex teams. While scientists are eager to engage with diverse colleagues, academic institutions are slow to change and present a variety of barriers to advancing science. I will explore how the field “science of team science” has been growing and how team assessment, evaluation, and coaching can help teams become more effective. Case studies of interdisciplinary teams will be shared to illustrate how developmental evaluation and assessment can help accelerate team growth, and overcome institutional and infrastructural challenges and barriers.

| Recording |
October 5th - Chris Mungall (Lawrence Berkeley National Laboratory)
The Gene Ontology: Making functional annotation of plants and animals FAIR
The Gene Ontology is one of the most widely used databases in the biosciences, covering functional annotation of genes and gene products across a wide range of species. The GO is ubiquitously used to analyse a variety of types of high-throughput experimental data. Originally created to unify functional annotation across a handful of model organism databases, the GO has grown to encompass more species, and the structure of the GO has been extended to integrate with other ontologies such as CHEBI and the Plant Ontology. The structure of annotations has also evolved, and the GO now includes more expressive pathway-oriented annotations in the form of GO-CAMs (Causal Activity Models). In this talk I will give a practical guide to the structure of GO, how to find and request terms, how to search and create annotations, and how to use GO tools. I will also talk about how the broader AgBioData community can contribute to the GO consortium to help seed functional annotation efforts in a more diverse range of organisms, and in particular with agriculturally relevant species.

| Recording |
September 7th - Nicholas J. Provart (University of Toronto)
Raising the BAR for Hypothesis Generation in Plant Biology Using Open Big Data
We have developed tools, available as part of the Bio-Analytic Resource at http://bar.utoronto.ca, for exploring large data sets from plants, to allow deeper insights into biological questions. My lab’s three visual analytic tools for transcriptomic data (eFP Browser, ePlant, and eFP-Seq Browser) allow for rapid access to comprehensive gene expression compendia we have curated for identifying tissues, cell-types, or perturbations in which a gene is active or alternatively spliced. Interactions, be they protein-protein or regulatory, create networks. We have developed new tools for exploring such data, either from large collections of experimentally-supported protein-protein or protein-DNA interactions or from predicted interactions, including protein-protein interactions inferred from molecular docking studies. We are currently working on integrating large-scale phenotype data from field trials monitored by drone-based sensors into ePlants we have developed for several agronomically-important species to improve understanding of links between genotype and phenotype.

| Recording |
August 3rd - ThankGod Ebenezer (EMBL-EBI)
The African BioGenome Project (AfricaBP): Genomics in the service of African biological diversity
Food security and biodiversity conservation represent substantial issues worldwide and require local solutions, as highlighted in the UN’s Sustainable Development Goals. I will discuss the progress and process of establishing a pan-African network to address this challenge through genomic science and how this could inform and influence policy across Africa.

| Recording |
June 1st - Camille Rustenholz (University of Strasbourg; France)
COST ACTION INTEGRAPE: Data integration to maximise the power of omics in grapevine improvement and beyond
The European network INTEGRAPE seeks the establishment of an open, international, and representative network, ensuring that omics and phenotyping data generated in the grapevine research community are being produced in a secure and standardized format, following the F.A.I.R. principles of findability, accessibility, interoperability, and reusability. Among the most significant deliverables of INTEGRAPE are:

- the elaboration of Guideline ‘cookbooks’ and Dictionary of unified grape-sample ontologies;

- the release of the PN40024 fourth genome assembly and its annotation;

- the creation of the Gene Reference Catalogue;

- the enlisting of Online repositories and tools for omics data exploration and visualization, which to date are not yet interoperable with one another.

To tackle this last challenge, we applied for a COST Innovative Grant with the GRAPEDIA project (Grapevine Encyclopedia of genes and omics), whose goal is to provide the community with a single open-access database, allowing data exploration and visualization of all grapevine resources, with tools for comparative analysis and customized services. In the GRAPEDIA database, we aim to centralize, interconnect, and showcase these dispersed resources, and to integrate them with those genomic efforts generated by the worldwide community. The target group is the entire scientific community working on the grapevine or using grapevine as their model plant for an “orphan” plant species, and also the private sector working on R&D in vitiviniculture.

| Recording |
May 4th - Karen Yook and Daniela Raciti, microPublication Biology
Bridging the gap between data production and database curation through microPublications
To solve a long-standing problem in data loss and accessibility, we developed a publishing platform, microPublication Biology, to bridge data publishing and database curation. Our journal accepts single experiment articles (microPublications) and embeds curation within the article submission/publishing workflow. microPublication Biology is an online, peer-reviewed, open-access journal published by the Caltech Library and discoverable in PubMed. Starting with articles focused on nematode biology, we continually expand to more organism communities, including Arabidopsis and most recently Dictyostelium, Maize, and Cotton. Our system is set up so that upon publication, atomized data is delivered directly to authoritative databases for each community (e.g., WormBase, FlyBase, PomBase, TAIR), ensuring timely delivery to biological databases for deep data integration. We will give an overview of our journal and its integrated curation workflow and present our latest publishing metrics.

| Recording |
March 15-17, 2022 - AgBioData Community Workshop.
Facilitating crosstalk and network building across Working Groups
Our three-day, all-hands, online workshop will provide a forum for the working groups to pose questions to and gather feedback from the AgBioData community. Each day will have a two-hour session (7-9 AM Pacific Time), with short presentations of selected working groups at the beginning, followed by breakout sessions, where WG and non-WG members can meet and discuss relevant topics, and a brief reporting period at the end. Your participation can help move FAIR data sharing and management forward!

| Recording |
February 2nd - Baron Koylass and Timothee Cezard (EMBL-EBI)
The European Variation Archive: Genetic variation archiving and accessioning
The European Variation Archive (EVA) is a primary open repository for archiving, accessioning, and distributing genetic variation, including single nucleotide variants, short insertions and deletions (indels), and larger structural variants (SVs) in any species. Created in 2014 to provide FAIR access to genetic variation data, it has since grown to be a primary resource for genomic variants hosting >3 billion records and now maintains and provides the permanent variant locus identifiers (rs IDs) for all non-human species.

| Recording | Slides |

2021

December 1st - Meet the new AgBioData Working Groups!
The purpose of this meeting will be to quickly introduce each of the Working Groups and their initial plans. This will be an opportunity to learn what each working group is planning to focus on, followed by a short discussion. AgBioData members who have not signed up for a working group, or who wish to join an additional group, formally or informally, will have an opportunity to contact working group chairs.

| Recording |
November 10th - Silvie Fexova (Plant Expression Atlas)
Expression Atlas and Single Cell Expression Atlas – home of cross-species gene expression data
From submission to data visualisation – Our team at EBI maintains and develops a number of resources aimed at supporting (FAIR) sharing, re-use, integration and visualisation of functional genomics data from a broad range of species including many agricultural species (both plants and animals). In this webinar I will briefly introduce our archival services and tools as well as our two knowledgebases, the Expression Atlas and Single Cell Expression Atlas, that host thousands of publicly available transcriptomics experiments across species and biological conditions – re-analysed and visualised in a user-friendly interface for the scientific community to use and explore.

| Recording | Slides |
October 6th - Allyson Lister (FAIRsharing)
FAIRsharing: promoting the discovery of data standards, policies and databases across all research domains
FAIRsharing is an informative and educational resource on interlinked standards, repositories and policies, three key elements of the FAIR ecosystem. FAIRsharing promotes the existence and value of these standards, repositories and policies, fostering a culture change within the research community into one where the use of these resources for FAIRer data is pervasive and seamless. This is achieved by guiding consumers to discover, select and use these resources with confidence, and helping producers to make their resources more visible, more widely adopted and cited. This presentation will highlight key collaborative, successful activities as well as next steps within FAIRsharing. It will also provide information on how to become a recommended repository in FAIRsharing and how to use FAIRsharing to engage with your stakeholders as well as with journal publishers and their data policies.
| Recording | Slides |
September 1st - AgBioData RCN grant - Lisa Harper & Eva Huala

Help us chart the future of agricultural data!
Do you want easy access to better quality data?
We are THRILLED to announce that AgBioData (https://www.agbiodata.org/) has received a three-year NSF RCN award to expand our community committed to improving the quality of and access to agricultural data. New activities will include organizing workshops, establishing new working groups, and developing a FAIR curriculum for scientists. We are expanding the consortium and welcome new members, especially students, post-docs, big-data scientists, funding agency scientists and members of the scientific publishing community interested in solving common FAIR data issues.

This webinar will cover:
- RCN objectives,
- benefits of joining AgBioData,
- and how YOU can make a difference in the biological data environment for years to come.

| Recording | Slides |
August 4th - Noah Fahlgren & Malia Gehan (Donald Danforth Plant Science Center) on interactions between the phenomics and database communities.
High-throughput phenotyping has emerged as a promising area in plant, animal, and agricultural sciences that brings together researchers from life sciences, engineering, computer science, data science, mathematics, and other research fields to develop technologies for rapidly and accurately measuring phenotypes using robotics, imaging, and other tools. High-throughput phenotyping can be done at different scales, from cellular to ecological, typically using image-based approaches for data collection and analysis. The development of computer vision and machine learning approaches to extract biologically meaningful measurements from images, including physical, physiological, morphological, and qualitative properties of crops and livestock, is a major activity within the field. Phenotype datasets can be used for a variety of purposes, but in conjunction with large genomic datasets, are a powerful tool for linking phenotype to genotype, training genomic prediction models, and other approaches that integrate genetic, phenotypic, and environmental datasets. We will introduce our efforts to develop PlantCV (https://plantcv.danforthcenter.org/), an open-source platform for image-based plant phenotyping, and discuss opportunities for collaboration between the phenomics and database communities.

| Recording | Slides |
June 2nd - Lisa Harper
Dealing with gene models from 50 different reference genomes. A progress report from MaizeGDB.
MaizeGDB now hosts over 50 reference-quality genome assemblies and their associated gene model sets and metadata. We have started to use a "Pan-Gene" concept to group syntelogs. We define a pan-gene as the set of gene models from multiple genomes that appear to represent the same gene. After I show you how we are implementing this at MaizeGDB, let's have a discussion about how other databases are dealing with this gene model explosion.

| Recording | Slides |
May 5th - Monica Poelchau

Recommendations from the AgBioData GFF3 working group

Over a year ago, AgBioData convened a discussion on GFF3 formatting issues, led by Scott Cain. This discussion led us to form the AgBioData GFF3 working group. Our goals are to 1) identify common problems with the GFF3 format; 2) recommend solutions for these problems; and 3) promote community adoption of these recommendations, so that data can be formatted in standard ways across databases. Members of AgBioData, Alliance of Genome Resources, and NCBI have been working on these goals for the past year. We are now ready to receive feedback from the AgBioData community on our recommendations, in order to get traction on the final goal – community adoption of these solutions.

| Recording | Slides |
April 7th - Guest speaker: Peifen Zhang

PhyloGenes (phylogenes.org) presents precomputed phylogenetic trees of plant gene families along with known functions for individual family members. By displaying experimentally validated gene functions associated with individual genes within a tree, PhyloGenes enables functional inference for genes of uncharacterized function, based on their evolutionary relationships to experimentally studied genes, in a visually traceable manner. For the many families containing genes that have evolved to perform different functions, PhyloGenes also facilitates the study of function evolution. The current PhyloGenes release (version 2.2) includes 40 plant genomes covering a broad taxonomic range and including all major crops, along with 10 non-plant model organisms represented in over 8,000 gene families. Over two-thirds of the families have at least one member with a validated known function as GO terms. To increase the predictive power of PhyloGenes, future work will involve community contribution and emphasize incorporating new functional annotations of family members across families and subfamilies, and also adding complementary functional datasets such as gene expression and mutant phenotype.

| Recording | Slides |
March 3rd - Jack Gardiner and Lisa Harper will lead a discussion on Metabolomics: What is it and how might it lend insight into our understanding of complex biological traits.

| Recording | Slides |
February 3rd - Chuck Cook of The Global Biodata Coalition (GBC) is working with funders to encourage more efficient collaboration in funding data resources and to sustain funding for critical data resources. More info at the GBC website, including pdfs of past talks: www.globalbiodata.org. Contact Chuck via email.

| Recording | Slides |
January 13th - Imma Subirats & Kristin Kolshus, AGROVOC Ontologies.

During this webinar, the AGROVOC Team from the Food and Agriculture Organization (FAO) of the United Nations will introduce how AGROVOC is kept up to date with a number of institutions and individual domain experts serving as focal points for specific languages and topics.
| Recording | Slides |

2020

December 2: Dr. Anne Brown (PostDoc USDA-ARS) and Andrew Wilkey (ORISE Fellow) will talk about the Genotype Comparison Visualization Tool (GCViT). | Recording |
October 7: Guest Speaker: WheatIS | Recording |
September 2nd: Guest Speaker: Dr. Sierra Moxon will talk about data models and exchange protocols. | Recording | Slides |
August 5th: Group discussion: "What has AgBioData done for you & your database?" | Notes |
June 3rd: Guest Speaker: Dr. Julie Dunning Hotopp will talk about secondary data usage | Recording | Slides |
May 6th: Group discussion: Pan-genomes | Recording | Slides |
April 8th: Guest speaker: Medha Devare from CGIAR will be talking about the Gardian platform | Recording |
March 4th: Group discussion: GFF format nightmares | Recording | Slides | Notes |
February 5th: Moira Sheehan will talk about the Breeding Insights Platform | Recording | Slides |
January 13th: 10 AM - 12 PM PST - In-Person Meeting at PAGXXVIII

2019

December 4th: Guillaume Bauchet and Christiano Simoes will talk about Breedbase. | Recording | Slides |
November 6th: AgBioData Discussion: The Future of agriculture-related data resources | Contact agbiodata@gmail.com for notes & recording
October 2nd: Peter Selby from BrAPI | Recording | Slides |
September 4th: Sunita Kumari from KBase | Recording | Slides |
August 7th: Kimberly Van Auken will talk about Textpresso | Recording |
June 5th: Ethy Cannon (PeanutBase) will lead a discussion about metadata and the Metadata and Persistence Working Group and Sook Jung (Main Lab databases) will lead a discussion about ontologies and the Ontologies Working Group. | Recording |
May 1st: Tanya Berardini (TAIR) and Lisa Harper (MaizeGDB) will lead a discussion about curation and the Curation Working Group | Recording | Slides |
April 3rd: Meg Staton will lead a discussion about Data Sharing and the Data Sharing using Web Services Working Group | Recording | Slides |
March 6th: Daureen Nesdill (University of Utah) and Carolyn Lawrence-Dill (Iowa State University) discuss the APLU-AAU Accelerating Public Access to Research Workshop. | Slides |
February 6th: Jacqueline Campbell will lead a discussion on AgBioData business topics
January 14th: In-Person Meeting at PAGXXVII (2019) | Notes |

2018

December 5th: James Wilgenbusch will be talking about the GEMs (GxExMxS) platform | Recording
November 3rd: Margaret Woodhouse from MaizeGDB will be leading the discussion about pan-genomes | Recording | Slides
October 3rd: Michael Cherry from Alliance of Genome Resources | Recording
September 5th: Genome Nomenclature short talks from several groups (TAIR, MaizeGDB, GDR, and Gramene) | Recording | Slides
August 1st: Cynthia Parr from Ag Data Commons | Slides
June 6th: Alex Pico from WikiPathways | Recording
May 2nd: Esther Dzale-Yeumo from Research Data Alliance (RDA) | Recording
March 7th: Marcela Tello-Ruiz from Gramene
February 7th: Gary Saunders from EVA at EMBL | Slides