KGTK: Knowledge Graph ToolKit

Abstract

KGTK is a comprehensive framework for the creation and exploitation of large knowledge graphs, designed for simplicity, scalability, and interoperability. Its key quality attributes are: 1) native support for reading and writing knowledge graphs in many formats; 2) extensibility, through seamless import and export to popular data science tools like Pandas and ElasticSearch; 3) modularity, i.e., a pipeline-friendly design for creating workflows with multiple components; 4) speed, i.e., performance comparable to SQL databases; and 5) scale, i.e., the ability to handle billions of statements, e.g., all of Wikidata on a laptop. KGTK represents graphs in tables and leverages popular libraries developed for data science applications, enabling a wide audience of developers to easily construct knowledge graph pipelines for their applications. KGTK has dozens of commands, covering a wide range of imports and exports to popular formats, highly scalable querying and storage functionality, a rich and diverse suite of transformation functions, and modern analytics powered by machine learning and graph algorithms. KGTK provides several services, namely a text search interface, a user-friendly browser, a similarity interface, and a SPARQL endpoint. All KGTK services can be readily customized to arbitrary graphs, thus closing the loop and enabling users to apply KGTK seamlessly to their own data in a variety of formats.
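The idea of representing a graph as a table and chaining small steps over it can be illustrated with a minimal sketch. This is not KGTK code and does not use the KGTK API: it relies only on the Python standard library, and the column names (`node1`, `label`, `node2`), the sample edges, and the helper functions `read_edges`, `filter_edges`, and `write_edges` are assumptions made for illustration.

```python
import csv
import io

# Hypothetical three-edge graph stored as a tab-separated edge table.
# The column names and the edges themselves are illustrative assumptions.
EDGE_TSV = (
    "node1\tlabel\tnode2\n"
    "alice\tworks_at\tlab\n"
    "alice\tknows\tbob\n"
    "bob\tworks_at\tlab\n"
)

COLUMNS = ["node1", "label", "node2"]


def read_edges(text):
    """Parse a tab-separated edge table into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_edges(edges, label):
    """Keep only the edges whose label column matches `label`."""
    return [edge for edge in edges if edge["label"] == label]


def write_edges(edges):
    """Serialize edges back to tab-separated text so steps can be chained."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(edges)
    return out.getvalue()


# A two-step "pipeline": read the table, filter it, write a table again.
edges = read_edges(EDGE_TSV)
employment = filter_edges(edges, "works_at")
print(write_edges(employment))
```

Because every step consumes and produces the same tabular shape, the output of one step can feed the next, which is the kind of pipeline-friendly design the abstract describes.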

Document Details

Document Type
Technical Report
Publication Date
Jun 26, 2023
Accession Number
AD1204361

Entities

People

  • Filip Ilievski

Organizations

  • University of Southern California

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Artificial Intelligence
  • Artificial Intelligence Software
  • California
  • Computer Programming
  • Computer Science
  • Data Science
  • Databases
  • Graphical User Interface
  • Information Processing
  • Information Science
  • Information Systems
  • Machine Learning
  • Natural Language Processing
  • Ontologies
  • Operating Systems
  • Statistics
  • United States
  • User Friendly
  • Web Browsers

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development
  • Graph Algorithms and Convex Optimization

Technology Areas

  • AI & ML