Topic c.iii.3: Decoupling Perspectives from Mistakes for High-Precision Crowdsourcing (W911NF-17-S-0002)
Abstract
Human intelligence, when combined with automation, is essential for understanding and interpreting complex and unstructured content, including images, videos, and text. Crowdsourcing algorithms, i.e., algorithms that seamlessly blend human intelligence with automation, are therefore crucial for enhanced informational awareness and effective decision making using such content. As a result, crowdsourcing algorithms have a wide variety of defense, commercial, medical, scientific, and civilian applications. Despite the tremendous importance of crowdsourcing, present-day crowdsourcing algorithms are rather primitive. These algorithms use simplistic models that treat human judgments "as-is", without attempting to model the underlying viewpoints or perspectives that dictated each specific judgment. Consequently, the resulting "consensus" judgments end up rife with errors, confusion, and internal inconsistencies, since they mix multiple perspectives, and the expertise estimates associated with the human annotators are often erroneous, since these simplistic models cannot distinguish between a person making a mistake and a person simply answering according to a different perspective. In this activity, we develop foundational principles for perspective-aware crowdsourcing: we develop techniques to model and reason about perspectives to obtain higher quality data from crowdsourcing. We study three dimensions: (a) problem: we consider four fundamental and ubiquitous crowdsourcing problems, namely classification, sorting, categorization, and clustering; (b) model: we consider increasingly sophisticated models to represent perspectives, starting from a vanilla perspective-based model and moving to models that take into account perspectives along with expertise; and (c) workflow: we consider various types of crowdsourcing workflows, from one-shot, to iterative, to workflows that get input tailored to a specific perspective.
In developing solutions, we will employ key new ideas on modeling consistency between annotators, distinguishing between errors and varied perspectives, and relating perspectives across large data collections. We will test our techniques on multiple real-world datasets drawn from text and vision processing. Overall, our techniques will lead to substantially higher accuracy results from crowdsourcing, while also requiring much less human input. Our exploration will also enhance our foundational understanding of crowdsourcing; specifically, how to best capture multiple viewpoints in hybrid man-machine systems, revealing inherent capabilities and limitations of such systems. The PI will bring to bear upon this crucial, timely project his significant expertise in crowdsourcing algorithm development (multiple "early career" awards, 40+ papers in the last four years, a published book on crowdsourcing, 3 dissertation awards from the database and data mining communities and Stanford University, and 5 best paper awards).
Document Details
- Document Type: DoD Grant Award
- Publication Date: Feb 14, 2019
- Source ID: W911NF1810335
Entities
People
- Aditya Parameswaran
Organizations
- Army Contracting Command
- United States Army
- University of Illinois Urbana–Champaign