Sparse Matrix Factorization: Applications to Latent Semantic Indexing
Abstract
This article describes the use of Latent Semantic Indexing (LSI) and some of its variants for the TREC Legal batch task. Both folding-in and Essential Dimensions of LSI (EDLSI) appeared promising for recall-focused retrieval on a collection of this size. We also developed a new LSI technique that replaces the Singular Value Decomposition (SVD) with a different matrix factorization, the sparse column-row approximation (SCRA). We conclude that all three LSI techniques perform similarly. Although our 2009 results improved significantly on our 2008 results, most of the benefit appears to have come from a better method for selecting the parameter K, the rank at which precision and recall are best balanced.
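For readers unfamiliar with the techniques named in the abstract, the following is a minimal sketch of standard SVD-based LSI with folding-in, written with NumPy. It is not the authors' implementation: the function names (`lsi_fit`, `fold_in`, `cosine_scores`), the toy term-by-document matrix, and the choice of two dimensions are all assumptions made for illustration, and neither EDLSI nor SCRA is shown.

```python
import numpy as np

def lsi_fit(A, k):
    """Rank-k truncated SVD of a term-by-document matrix A (terms x docs)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Term basis, singular values, and document coordinates (one row per doc).
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(vec, U_k, s_k):
    """Project a term-space vector (query or new document) into the k-dim space."""
    return (vec @ U_k) / s_k

def cosine_scores(q_hat, docs_hat):
    """Cosine similarity between a projected query and each projected document."""
    norms = np.linalg.norm(docs_hat, axis=1) * np.linalg.norm(q_hat)
    return (docs_hat @ q_hat) / np.where(norms == 0, 1.0, norms)

# Toy 5-term x 4-document count matrix (illustrative data only):
# documents 0 and 1 share terms 0-2, documents 2 and 3 share terms 3-4.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

U_k, s_k, D_k = lsi_fit(A, k=2)

# A query using terms 0 and 1 is folded in and scored against all documents.
query = np.array([1, 1, 0, 0, 0], dtype=float)
scores = cosine_scores(fold_in(query, U_k, s_k), D_k)
ranking = np.argsort(-scores)  # document indices, best match first
```

Folding-in reuses the existing factors rather than recomputing the decomposition, which is why it is attractive on large collections; the trade-off is that folded-in vectors do not update the term basis.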
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2009
- Accession Number: ADA517770
Entities
People
- April Kontostathis
- Erin Moulding
- Raymond J Spiteri
Organizations
- University of Saskatchewan