Mindat Intelligent Platform: Using AI to enrich data search and pre-processing functions of an open data platform (Supplement)

Mindat is the world’s leading database of mineral species and their global distributions, and it is widely used by researchers and educators in the geosciences. With NSF support, we have built an open data service for Mindat, along with corresponding R and Python packages. However, using the open data service requires users to translate their needs into code, which can be challenging for many geoscientists.

This project aims to transform Mindat into an intelligent, open data platform by integrating large language models (LLMs) with the existing open data service, enhancing its advanced search capabilities, and providing initial data analysis functionality. The LLM-assisted functions and workflows will make open mineral data resources more accessible and engaging for researchers, educators, and enthusiasts worldwide, with the potential to generate many innovative applications. The project will also demonstrate the utility of AI technologies and help increase the number of NAIRR Pilot users from the geoscience community.

Xiaogang Ma (Principal Investigator, University of Idaho), Jiyin Zhang (Technical Lead, University of Idaho), Jolyon Ralph (Collaborator, Hudson Institute of Mineralogy), Luke Sheneman (Collaborator, University of Idaho)

The work will leverage both the existing Mindat open data service and the NAIRR Pilot facilities and resources to build a Mindat Intelligent Platform. Beyond the FAIR principles, this work will add friendliness as another dimension of best practice for open data.

This work will establish efficient ways to deploy LLMs in domain-specific contexts, as well as methods to improve the trustworthiness and reproducibility of LLM-assisted data science workflows.

We will collaborate with the Mindat database technical team on the data service and website demo. We will also collaborate with the Institute for Interdisciplinary Data Science at the University of Idaho on LLM applications.

The work will benefit many users of the Mindat database, especially those with limited coding skills, in their data-intensive geoscience research.

Integrating AI technologies into the Mindat platform will enhance accessibility and usability for a diverse user base, promote interdisciplinary research, foster educational advancements, and serve as a model for AI adoption in open data initiatives across many geoscience disciplines and beyond. The project's multidisciplinary team will use the NAIRR Pilot resources efficiently, pursue breakthroughs in geoscience research, publish the results in prominent venues, and organize training and outreach activities to broaden the project's impact.

Learn more about how the Mindat Intelligent Platform can meet your needs by contacting our team directly at max@uidaho.edu, or visit our website at https://www.mindat.org/

This work is supported by supplemental funding to National Science Foundation Grant No. 2126315.