Research on Efficient Safety Guidance and Fairness Metric for Text-to-Image Generative Models

Abstract

Rapid advances in multimodal generative AI models, particularly text-to-image (T2I) diffusion models (DMs), have enabled the creation of high-quality, realistic images from textual inputs and have led to a surge in applications built on such models. However, the large-scale training datasets used for these models are often unfiltered and contain unsafe and harmful content. This introduces significant risks, as the models may generate unexpected, harmful, or disturbing images. To mitigate these risks, various defense mechanisms have been proposed, such as detection-based approaches. Another notable method is Safe Latent Diffusion (SLD), which employs classifier-free guidance during inference to adjust the estimated noise, steering the generated image away from unsafe concepts. SLD offers multiple safety guidance configurations depending on the desired level of safety. Despite its effectiveness, we have identified several limitations in SLD. First, the safeguard level does not apply uniformly across different prompts. We will focus on two main research topics regarding the safety and fairness of diffusion-based T2I generative AI models: (a) developing inference-time safety guidance methods and (b) developing a novel, flexible metric to measure the fairness of T2I models.

Document Details

Document Type: DoD Grant Award
Publication Date: Feb 06, 2025
Source ID: FA23862514015

Entities

People

Taesup Moon

Organizations

Air Force Office of Scientific Research
Seoul National University
United States Air Force
