KGTK: Knowledge Graph ToolKit

Abstract

KGTK is a comprehensive framework for the creation and exploitation of large knowledge graphs, designed for simplicity, scalability, and interoperability. Its key quality attributes are: 1) native support for reading and writing knowledge graphs in many formats; 2) extensibility, by seamless import from and export to popular data science tools like Pandas and ElasticSearch; 3) modularity, i.e., a pipeline-friendly design for creating workflows with multiple components; 4) speed, i.e., performance comparable to SQL databases; and 5) scale, i.e., handling billions of statements, e.g., all of Wikidata on a laptop. KGTK represents graphs in tables and leverages popular libraries developed for data science applications, enabling a wide audience of developers to easily construct knowledge graph pipelines for their applications. KGTK has dozens of commands, covering a wide range of imports from and exports to popular formats, highly scalable querying and storage functionality, a rich and diverse suite of transformation functions, and modern analytics powered by machine learning and graph algorithms. KGTK provides several services, namely a text search interface, a user-friendly browser, a similarity interface, and a SPARQL endpoint. All KGTK services can be readily customized to arbitrary graphs, thus closing the loop and enabling users to use KGTK seamlessly with their own data in a variety of formats.
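
The idea of representing a graph in a table can be illustrated with a minimal sketch. It is not KGTK code and does not call the KGTK API: the column names (node1, label, node2), the sample rows, and the helper functions load_edges and objects_of are assumptions made for illustration only. Each row holds one statement, so a simple lookup becomes a filter over rows.

```python
import csv
import io

# Hypothetical tab-separated edge table: one row per statement.
# The column names and the sample data are illustrative assumptions.
EDGES_TSV = (
    "node1\tlabel\tnode2\n"
    "alice\tknows\tbob\n"
    "alice\tworks_at\tacme\n"
    "bob\tworks_at\tacme\n"
)


def load_edges(text):
    """Parse a tab-separated edge table into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def objects_of(edges, subject, predicate):
    """Return every node2 value whose row matches the given node1 and label."""
    return [
        row["node2"]
        for row in edges
        if row["node1"] == subject and row["label"] == predicate
    ]


edges = load_edges(EDGES_TSV)
employees = [row["node1"] for row in edges
             if row["label"] == "works_at" and row["node2"] == "acme"]
```

Because the graph is just rows and columns, the same table could in principle be handed to general-purpose tabular tools, which is the kind of interoperability the abstract describes.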


Document Details

Document Type: Technical Report
Publication Date: Jun 26, 2023
Accession Number: AD1204361

Entities

People

Filip Ilievski

Organizations

University of Southern California
