Technical Report 335, c4e-Preprint Series, Cambridge
twa: The World Avatar Python package for dynamic knowledge graphs and its application in reticular chemistry
Reference: Technical Report 335, c4e-Preprint Series, Cambridge, 2025
- Introduced the twa Python package for democratising dynamic knowledge graphs.
- Developed an object-graph mapper (OGM) to automatically synchronise Python objects with RDF graphs.
- Automated design and geometry assembly of metal-organic polyhedra.
- Integrated LLMs for streamlined MOP synthesis protocol extraction.
- Enabled reusable, ontology-driven workflows for AI-assisted cheminformatics.
Data-driven discovery is crucial in scientific domains, yet the lack of standardised data management hinders reproducibility. In chemical science, this is exacerbated by fragmented data formats. The World Avatar (TWA) addresses these challenges with a dynamic knowledge graph, whose tooling has historically been provided as Java-based toolkits. We present twa, an open-source Python package that lowers the barrier to semantic data management. Its object-graph mapper (OGM) synchronises Python class hierarchies with RDF knowledge graphs, streamlining ontology-driven data integration and automated workflows. We demonstrate twa’s capacity to unify fragmented chemical data and accelerate research through two use cases for metal-organic polyhedra (MOPs): molecular design and AI-assisted extraction of synthesis protocols. Our approach expands the existing OntoMOPs knowledge graph with 799 new MOPs derived from combinatorial assembly models. By abstracting complex SPARQL queries behind a user-friendly interface, twa fosters transparent, reproducible, knowledge-driven discovery. The package is freely available from PyPI (https://pypi.org/project/twa/) and can be installed with pip install twa.
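To make the object-graph mapping idea concrete, the following is a minimal sketch of how Python objects can be flattened into RDF-style triples. It uses only the standard library and is not the twa API: the namespace EX, the classes Entity, ChemicalBuildingUnit and MetalOrganicPolyhedron, the method to_triples and all field values are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields

# Hypothetical namespace, used only for this illustration.
EX = "https://example.org/mops/"


@dataclass
class Entity:
    """Base class: every instance is identified by an IRI."""
    iri: str

    def to_triples(self):
        """Return (subject, predicate, object) triples describing this object."""
        # The Python class name doubles as the RDF type of the instance.
        triples = [(self.iri, "rdf:type", EX + type(self).__name__)]
        for f in fields(self):
            if f.name == "iri":
                continue
            value = getattr(self, f.name)
            if value is None:
                continue
            # Nested entities become links to their IRIs; other values are literals.
            obj = value.iri if isinstance(value, Entity) else value
            triples.append((self.iri, EX + f.name, obj))
        return triples


@dataclass
class ChemicalBuildingUnit(Entity):
    formula: str


@dataclass
class MetalOrganicPolyhedron(Entity):
    label: str
    building_unit: ChemicalBuildingUnit


# Placeholder values, not real chemical data.
cbu = ChemicalBuildingUnit(iri=EX + "cbu1", formula="placeholder-formula")
mop = MetalOrganicPolyhedron(iri=EX + "mop1", label="example MOP", building_unit=cbu)

for triple in mop.to_triples():
    print(triple)
```

In this sketch the class hierarchy plays the role of the ontology and each dataclass field plays the role of a property. A full OGM would additionally write such triples to a triple store and read them back into objects, which is the part that removes the need for hand-written SPARQL.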