Deep Recurrent Q-Network Approach for Multi Objective Markov Decision Process in Partially Observable Environment

Abstract

Predicting items relevant to a user's interests in a recommendation system (RS) is an example of a partially observable Markov decision process (POMDP), because users' interests fluctuate over time and the item satisfaction rating matrix is typically sparse. The problem also requires multi-objective optimization (MOO) over three objectives: precision, novelty, and diversity. Existing MOO solutions are based on evolutionary algorithms, which must be combined with rating prediction techniques such as collaborative filtering to fill in the sparse matrix before producing recommendations. However, collaborative filtering has limitations when handling cold-start or new users. Most RS focus merely on the accuracy of predictions for highly rated or trending items, while other metrics such as novelty and diversity, which are equally essential for generating higher-quality recommendations, have mostly been ignored. The main challenge in considering multiple evaluation metrics is the conflict between the objectives: improving novelty or diversity tends to hurt accuracy, and vice versa. Results show that the deep reinforcement learning (DRL) approaches, which are the first available DRL approaches for MOO in movie recommendation, perform better on the multiple objectives than the benchmark. The recurrent layer in the DRL agent is also able to remodel the POMDP as a complete MDP environment, which allows prediction of the sparse rating matrix.
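
The conflict between the three objectives named in the abstract can be made concrete with a small sketch. The abstract does not define the metrics or say how they are combined, so everything below is an assumption for illustration: precision as the fraction of recommended items the user liked, novelty as mean self-information of item popularity, diversity as mean pairwise Jaccard distance between genre sets, and a weighted sum as the scalarized reward. The function names, the toy catalogue, and the weights are hypothetical and are not taken from the report.

```python
import math
from itertools import combinations


def precision(recommended, liked):
    """Fraction of recommended items that the user actually liked."""
    if not recommended:
        return 0.0
    return sum(1 for item in recommended if item in liked) / len(recommended)


def novelty(recommended, popularity):
    """Mean self-information -log2(p), where p is the (assumed) fraction
    of users who have already interacted with the item."""
    if not recommended:
        return 0.0
    return sum(-math.log2(popularity[i]) for i in recommended) / len(recommended)


def diversity(recommended, genres):
    """Mean pairwise Jaccard distance between the items' genre sets."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0

    def jaccard_distance(a, b):
        union = genres[a] | genres[b]
        if not union:
            return 0.0
        return 1.0 - len(genres[a] & genres[b]) / len(union)

    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def scalarized_reward(recommended, liked, popularity, genres,
                      weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three objectives (an assumed scalarization).
    The metrics are on different scales, so the weights are illustrative."""
    w_p, w_n, w_d = weights
    return (w_p * precision(recommended, liked)
            + w_n * novelty(recommended, popularity)
            + w_d * diversity(recommended, genres))


# Hypothetical toy catalogue: A and B are popular action films the user
# liked; C is a rarely watched drama the user has not rated.
popularity = {"A": 0.5, "B": 0.5, "C": 0.125}
genres = {"A": {"action"}, "B": {"action"}, "C": {"drama"}}
liked = {"A", "B"}

popular_list = ["A", "B"]
mixed_list = ["A", "C"]

for name, recs in (("popular", popular_list), ("mixed", mixed_list)):
    print(name,
          precision(recs, liked),
          novelty(recs, popularity),
          diversity(recs, genres),
          scalarized_reward(recs, liked, popularity, genres))
```

On this toy data the popular list scores higher on precision, while the mixed list scores higher on novelty and diversity, so which list earns the larger reward depends entirely on the chosen weights. That dependence is the trade-off a multi-objective agent has to resolve.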

Document Details

Document Type
Technical Report
Publication Date
Aug 23, 2021
Accession Number
AD1153753

Entities

People

  • Nurfadhlina Sharef

Organizations

  • Universiti Putra Malaysia

Tags

Communities of Interest

  • Materials and Manufacturing Processes
  • Space

DTIC Thesaurus Topics

  • Accuracy
  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Artificial Intelligence Computing
  • Artificial Intelligence Software
  • Computer Science
  • Department Of Defense
  • Environment
  • Evolutionary Algorithms
  • Filtration
  • Information Operations
  • Information Systems
  • Laboratory Procedures
  • Learning
  • Malaysia
  • Multiobjective Optimization
  • Neural Networks
  • Optimization
  • Precision
  • Ratings
  • Recurrent Neural Networks
  • Reinforcement Learning
  • Scientific Research
  • Sparse Matrix

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Instructional Design and Training Evaluation.
  • Neural Network Machine Learning.