Knowledge Graphs and Predictive Models for Urban Agriculture Data

This team aims to identify variables that affect food and nutrition security from the perspective of urban food systems and that will shape decision-making criteria for and by stakeholders. Given the diverse set of stakeholders and datasets, the data parameters, success metrics, and definitions of success will vary.

Predictive Modeling and Platforms
We will use the assessment data developed in Objective 1 to build predictive models that connect urban agriculture production to food-security-related data. We will build these models by drawing on our team's past experience in predictive modeling and by generating knowledge graphs using KNARM. The KNowledge Acquisition and Representation Methodology (KNARM; McGinty, 2018) was designed with domain ontologies (sets of concepts and categories in a subject area that show their properties and the relations between them) in mind, for applications that assist decision-making. Our most recent research application used this method to address supply chain problems currently faced in Ukraine using KnowWhereGraph (KWG), which is used to handle geospatial data.
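To make the idea of a knowledge graph built on a domain ontology concrete, the following is a minimal sketch in Python. It is not KNARM or KnowWhereGraph code; the `TripleStore` class and every entity and predicate name in it (`CommunityGarden`, `locatedIn`, and so on) are hypothetical and chosen only for illustration. It stores facts as (subject, predicate, object) triples and answers simple pattern queries.

```python
from collections import defaultdict


class TripleStore:
    """A tiny in-memory store of (subject, predicate, object) triples.

    Hypothetical illustration only; not part of KNARM or KnowWhereGraph.
    """

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Record one fact in the graph."""
        triple = (subject, predicate, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        if subject is not None:
            candidates = self._by_subject.get(subject, set())
        else:
            candidates = self._triples
        return sorted(
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
# Ontology-level statement: a concept and its parent category (assumed names).
store.add("CommunityGarden", "subClassOf", "UrbanFarm")
# Instance-level statements about one hypothetical site.
store.add("garden_01", "instanceOf", "CommunityGarden")
store.add("garden_01", "locatedIn", "Olathe")
store.add("garden_01", "produces", "Tomato")

# Which entities are located in Olathe?
gardens_in_olathe = store.match(predicate="locatedIn", obj="Olathe")
```

A production system would more likely rely on an RDF store and a formal ontology language; the sketch only shows how concepts, instances, and relations sit together in one graph.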

We will also integrate the 1DATA Platform and build ontologies and a knowledge-graph backend for a prototype web-based Urban Agriculture Data Hub. Drawing on experience from projects such as the LINCS Data Portal (McGinty, 2018) and the 1DATA Platform (www.1DATA.life), we will provide a responsive web-based application and an API (Application Programming Interface). The API will support modular application building and modular ontology building for possible new applications of the data and parameters obtained through Objective 1, and for the end-user side of the database and staging database that will be built as part of the architecture.

This team has an established record of developing large-scale knowledge graphs with life-sciences data (McGinty et al., 2016; McGinty et al., 2017; McGinty, 2018; McGinty et al., 2019; Xu, 2021; Jaberi-Douraki, 2021) and geospatial data (Gedara, 2021). This system architecture, along with domain-agnostic knowledge graph building methods, will ensure seamless integration with other Global Food Systems data sources on and off campus, as well as integration of applications from the 1DATA team that address knowledge discovery using statistical methods.

Data Retrieval and Research Methods
To integrate the databases (DBs) of interest and allow them to be used as inputs to KNARM's semi-automated ontology building step, the data retrieval step will be implemented in Python and Java to ensure data capture in the staging database. As mentioned above, during the integration of stakeholder interests and data properties related to urban agriculture systems for downstream analysis, the knowledge graph backend will be integrated using existing KNARM methods as well as platforms we built previously.
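As an illustration of what capturing retrieved data in a staging database could look like, here is a minimal Python sketch using the standard-library `sqlite3` and `csv` modules. The table name `staging_harvest`, the column names, the sample rows, and the function `load_into_staging` are all assumptions made for this example; the project's actual sources, schema, and database engine are not specified here.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; the columns and values are illustrative only.
SAMPLE_CSV = """site_id,crop,yield_kg,harvest_date
garden_01,Tomato,12.5,2023-07-14
garden_01,Kale,4.0,2023-07-20
garden_02,Tomato,,2023-07-15
"""


def load_into_staging(conn, csv_text, source_name):
    """Parse CSV text and insert every row into the staging table.

    Missing yields are stored as NULL so later steps can decide how to treat them.
    Returns the number of rows loaded.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS staging_harvest (
               source TEXT NOT NULL,
               site_id TEXT NOT NULL,
               crop TEXT NOT NULL,
               yield_kg REAL,
               harvest_date TEXT
           )"""
    )
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        raw_yield = record["yield_kg"].strip()
        rows.append((
            source_name,
            record["site_id"],
            record["crop"],
            float(raw_yield) if raw_yield else None,
            record["harvest_date"],
        ))
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO staging_harvest VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_into_staging(conn, SAMPLE_CSV, "sample_extract")
missing = conn.execute(
    "SELECT COUNT(*) FROM staging_harvest WHERE yield_kg IS NULL"
).fetchone()[0]
```

Keeping the raw rows and their source label in a staging table, rather than cleaning them on the way in, leaves the original values available when the ontology building step needs to inspect them.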

Once the data is integrated using systematically deepening modeling, modular ontologies, and scalable knowledge graphs, we will implement a machine-learning framework for analyzing our urban agriculture data. We will prepare the data for analysis by cleaning and transforming it and through application-related steps such as feature engineering and data normalization. We will select appropriate machine learning models based on our current and previous expertise in this research. The choice of model will depend on the specific analysis goals and the nature of the data.
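The preparation steps named above (cleaning, feature engineering, and normalization) can be sketched in a few lines of plain Python. This is an illustrative example under stated assumptions, not the project's pipeline: the records, the field names `yield_kg` and `area_m2`, the derived feature `yield_per_m2`, and the choice of z-score normalization are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical records; field names and values are illustrative only.
records = [
    {"site_id": "garden_01", "yield_kg": 12.0, "area_m2": 20.0},
    {"site_id": "garden_02", "yield_kg": None, "area_m2": 15.0},
    {"site_id": "garden_03", "yield_kg": 6.0, "area_m2": 30.0},
    {"site_id": "garden_04", "yield_kg": 9.0, "area_m2": 10.0},
]


def clean(rows):
    """Cleaning: drop rows with a missing yield or a missing/non-positive area."""
    return [
        r for r in rows
        if r["yield_kg"] is not None
        and r["area_m2"] is not None
        and r["area_m2"] > 0
    ]


def add_yield_density(rows):
    """Feature engineering: derive yield per square meter."""
    return [{**r, "yield_per_m2": r["yield_kg"] / r["area_m2"]} for r in rows]


def z_score(rows, column):
    """Normalization: rescale a column to zero mean and unit (population) variance."""
    values = [r[column] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.0.
        return [{**r, f"{column}_z": 0.0} for r in rows]
    return [{**r, f"{column}_z": (r[column] - mu) / sigma} for r in rows]


prepared = z_score(add_yield_density(clean(records)), "yield_per_m2")
```

Each step returns new records instead of modifying its input, which makes it easy to reorder steps or swap one normalization method for another as the analysis goals change.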

We will train and evaluate the models and integrate the analysis component with the rest of the backend framework. This will involve exposing the analysis results through the API and integrating them with the knowledge graph and visualization components. We will apply statistical analysis, rule-based systems, and decision-support analysis to the data and models generated throughout this project. The backend systems will be accessible through a user-friendly, web-based user interface. This component will allow users to interact with and query the different components of the system, including visualization tools. Depending on the nature of the data, we may need to implement authentication and authorization mechanisms to ensure that only authorized users can access and manipulate the data.
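If authorization does turn out to be needed, one common approach is a role-based permission check placed in front of every operation that reads or changes data. The sketch below assumes this approach for illustration only; the roles, the permission names, and the functions `authorize` and `update_record` are hypothetical, and authentication (verifying who the user is) is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical roles and the actions each one may perform.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "researcher": {"read", "query"},
    "admin": {"read", "query", "write"},
}


@dataclass(frozen=True)
class User:
    name: str
    role: str


class PermissionDenied(Exception):
    """Raised when a user attempts an action their role does not allow."""


def authorize(user, action):
    """Raise PermissionDenied unless the user's role grants the action."""
    allowed = ROLE_PERMISSIONS.get(user.role, set())
    if action not in allowed:
        raise PermissionDenied(f"{user.name} ({user.role}) may not {action}")


def update_record(user, data, key, value):
    """Write to the data only after the permission check passes."""
    authorize(user, "write")
    data[key] = value
    return data


data = {}
update_record(User("alice", "admin"), data, "garden_01", "Tomato")

# A viewer attempting the same write is rejected and the data is unchanged.
try:
    update_record(User("bob", "viewer"), data, "garden_02", "Kale")
    denied = False
except PermissionDenied:
    denied = True
```

Unknown roles receive an empty permission set, so the check fails closed rather than granting access by default.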

Project Team

Hande McGinty, assistant professor, computer science
Majid Jaberi-Douraki, professor, mathematics
Ayran Dalal, master's student, computer science