Mindat is the world’s leading database of mineral species and their global distribution, and it is widely used by researchers and educators in the geosciences. With NSF support, we have built an open data service for Mindat, together with corresponding R and Python packages. However, using the open data service still requires users to translate their questions into code, which can be challenging for many geoscientists.
This project aims to transform Mindat into an intelligent, open data platform by integrating large language models (LLMs) with the existing open data service, enhancing its advanced search capabilities, and providing initial data analysis functionality. The LLM-assisted functions and workflows will make open mineral data resources more accessible and engaging for researchers, educators, and enthusiasts worldwide, and could enable many innovative applications. The project will also demonstrate the utility of AI technologies and help increase the number of NAIRR Pilot users from the geoscience community.