Improving Anonymized Search Relevance with Natural Language Processing and Machine Learning

Abstract

Users often sacrifice personal data for more relevant search results, which poses a problem for communities that want both search anonymity and relevant results. To balance these priorities, this research examines the impact of using Siamese networks to extend word embeddings into document embeddings and detect similarities between documents. The predicted similarity can be used to re-rank, locally, search results returned by various sources. Because the re-ranking happens locally, the technique limits the amount of information a search engine collects from a user. A prototype is produced by applying the methodology in a real-world search environment. The prototype also provides an additional function: finding new documents related to a provided sample document. The prototype is evaluated using real-world search examples. Results indicate that the Siamese network can produce document embeddings superior to those of current encoders such as the Universal Sentence Encoder. Results also show promising performance of the prototype in improving search relevancy while limiting user data transmission.
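The local re-ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation. It assumes a tiny hand-written table of word vectors (`WORD_VECTORS`, hypothetical values) and uses simple mean pooling as a stand-in for the trained Siamese encoder. The names `embed`, `cosine`, and `rerank` are illustrative only. The point is that the similarity scoring runs on the user's machine, so the reference text never has to be sent to the search engine.

```python
import math

# Hypothetical toy word vectors; a real system would load pretrained embeddings.
WORD_VECTORS = {
    "neural": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.0],
    "embedding": [0.7, 0.3, 0.1],
    "recipe": [0.0, 0.1, 0.9],
    "pasta": [0.1, 0.0, 0.8],
    "search": [0.4, 0.8, 0.1],
}
DIM = 3


def embed(text):
    """Mean-pool word vectors into a document vector.

    Assumption: this pooling step stands in for the Siamese encoder,
    which would instead be trained to produce the document embedding.
    """
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rerank(reference, results):
    """Sort results locally by similarity to a private reference text."""
    reference_vector = embed(reference)
    return sorted(
        results,
        key=lambda doc: cosine(reference_vector, embed(doc)),
        reverse=True,
    )


if __name__ == "__main__":
    results = ["pasta recipe", "neural network embedding", "search recipe"]
    for doc in rerank("neural embedding search", results):
        print(doc)
```

In this sketch, the results list plays the role of documents returned by an external source for a generic query, and the reference text stays local. Swapping `embed` for a learned encoder would leave `rerank` unchanged.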

Document Details

Document Type: Technical Report
Publication Date: Mar 24, 2022
Accession Number: AD1166917

Entities

People

  • Niko A Petrocelli

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies
  • Ground and Sea Platforms

DTIC Thesaurus Topics

  • Air Force
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Bayesian Networks
  • Computational Science
  • Computer Languages
  • Dimensionality Reduction
  • Engineering
  • Information Processing
  • Information Science
  • Language
  • Machine Learning
  • Natural Language Processing
  • Natural Languages
  • Network Science
  • Neural Networks
  • Ontologies
  • Supervised Machine Learning
  • Unsupervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Information Retrieval
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks