SAFREE — Training-Free Semantic Filtering
Overview
SAFREE (Semantic Approach to Free-up Representations) is a training-free technique — it modifies no model weights. All filtering happens at inference time through three sequential stages applied during the diffusion process:
- Text Projection (Stage 1): Scales the concept's text embedding by alpha to reduce its influence on cross-attention.
- Self-Validation Filter (Stage 2, SVF): At denoising timesteps above upperbound_timestep, intercepts the latent activations and suppresses features associated with the target concept. Controlled by enable_svf.
- Latent Re-Attention (Stage 3, LRA): Applies FreeU-style frequency filtering to the UNet's skip connections and backbone features using the freeu_* parameters. Controlled by enable_lra.
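The gating described in the three stages can be sketched as a small function. This is only an illustration under stated assumptions: `StageFlags` and `active_stages` are hypothetical names, "above" is read as strictly greater than, Stage 1 is listed at every timestep because it acts on the text embedding, and the `re_attn_timestep_range` fallback is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StageFlags:
    """Hypothetical container for the stage switches described above."""
    enable_svf: bool = True
    upperbound_timestep: int = 10
    enable_lra: bool = True


def active_stages(timestep: int, flags: StageFlags) -> List[str]:
    """Return the stages that would run at a given denoising timestep."""
    # Stage 1 acts on the text embedding, so this sketch always lists it.
    stages = ["text_projection"]
    # Stage 2 runs only at timesteps strictly above upperbound_timestep
    # (assumption: "above" excludes the bound itself).
    if flags.enable_svf and timestep > flags.upperbound_timestep:
        stages.append("svf")
    # Stage 3 is gated only by its enable flag in this sketch.
    if flags.enable_lra:
        stages.append("lra")
    return stages


at_t50 = active_stages(50, StageFlags())
at_t10 = active_stages(10, StageFlags())
```

With the default flags, `at_t50` contains all three stages, while `at_t10` omits `"svf"` because the timestep is not above the bound.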
Because SAFREE is training-free, it is fast to initialise. However, it only filters at runtime and cannot guarantee concept removal across all possible prompts.
Base model: CompVis/stable-diffusion-v1-4
Supported concepts: nudity, artists-VanGogh, artists-KellyMcKernan (named, SVF-calibrated); any other concept via custom_unsafe_concepts
Compatible metrics
| Metric | Compatible | Notes |
|---|---|---|
| ASR I2P | Yes | |
| ERR | Yes | erase_concept="nudity" required (ERR is nudity-specific) |
| FID | Yes | General image quality |
| CLIP Score | Yes | General text-image alignment |
| UA_IRA | Yes | Requires custom prompt CSVs |
| TIFA | Yes | General faithfulness |
| ASR Custom | Yes | Concept-agnostic via CLIP |
| MMA-Diffusion | Yes | Nudity-specific by default |
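The notes in the table can be turned into a quick pre-flight check on a metric list. The following is a minimal sketch, not part of any documented API: `check_metrics`, `NUDITY_ONLY_METRICS`, and `REQUIRED_KEYS` are hypothetical names, the metric names and config keys are taken from the example configs in this document, and only the ERR and UA_IRA notes are encoded.

```python
from typing import Dict, List

# Metrics that the table marks as requiring erase_concept="nudity".
NUDITY_ONLY_METRICS = {"err"}
# Metrics that need user-supplied prompt CSVs (key names as used in the examples).
REQUIRED_KEYS = {"ua_ira": ("target_prompts_path", "retain_prompts_path")}


def check_metrics(erase_concept: str, metrics: List[Dict]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for metric in metrics:
        name = metric["name"]
        config = metric.get("config", {})
        if name in NUDITY_ONLY_METRICS and erase_concept != "nudity":
            problems.append(f'{name}: requires erase_concept="nudity"')
        for key in REQUIRED_KEYS.get(name, ()):
            if key not in config:
                problems.append(f"{name}: missing {key}")
    return problems


problems = check_metrics(
    "artists-VanGogh",
    [{"name": "err", "config": {}}, {"name": "fid", "config": {"limit": 1000}}],
)
```

Here `problems` flags the ERR entry, since ERR is nudity-specific, and leaves FID alone.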
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
| erase_concept | str | "nudity" | Named concept category. Must be one of the SVF-calibrated concepts (nudity, artists-VanGogh, artists-KellyMcKernan) unless custom_unsafe_concepts is also set. |
| alpha | float | 0.01 | Stage 1: scaling factor applied to the concept text embedding. Values near 0 suppress the concept text signal strongly. |
| enable_svf | bool | True | Enable Stage 2 Self-Validation Filter. |
| upperbound_timestep | int | 10 | Stage 2: only apply SVF at timesteps above this value. Lower values = SVF active for more of the denoising trajectory. |
| enable_lra | bool | True | Enable Stage 3 Latent Re-Attention. |
| lra_filter_type | str | "high" | Stage 3: frequency filter type. "high" suppresses high-frequency components, "low" suppresses low-frequency, "all" applies to all. |
| freeu_b1 | float | 1.0 | FreeU backbone scaling factor for block 1. |
| freeu_b2 | float | 1.0 | FreeU backbone scaling factor for block 2. |
| freeu_s1 | float | 0.9 | FreeU skip-connection scaling factor for block 1. |
| freeu_s2 | float | 0.2 | FreeU skip-connection scaling factor for block 2. |
| re_attn_timestep_range | [int, int] | [-1, 1001] | Fallback timestep range for re-attention when SVF is disabled. |
| use_fp16 | bool | True | Run in half precision. |
| device | str | "cuda" | Device to run on. |
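For building configs programmatically, the table can be mirrored as a dataclass. This is a sketch only: `SafreeConfig` is a hypothetical name rather than a class the framework is known to expose, and the two validation checks are assumptions derived from the table's descriptions.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SafreeConfig:
    """Hypothetical mirror of the fields and defaults in the table above."""
    erase_concept: str = "nudity"
    alpha: float = 0.01
    enable_svf: bool = True
    upperbound_timestep: int = 10
    enable_lra: bool = True
    lra_filter_type: str = "high"
    freeu_b1: float = 1.0
    freeu_b2: float = 1.0
    freeu_s1: float = 0.9
    freeu_s2: float = 0.2
    re_attn_timestep_range: List[int] = field(default_factory=lambda: [-1, 1001])
    use_fp16: bool = True
    device: str = "cuda"

    def __post_init__(self) -> None:
        # The table documents exactly three filter types.
        if self.lra_filter_type not in {"high", "low", "all"}:
            raise ValueError(f"unknown lra_filter_type: {self.lra_filter_type!r}")
        # The range is documented as a pair of integers.
        if len(self.re_attn_timestep_range) != 2:
            raise ValueError("re_attn_timestep_range must hold two integers")


# Override only what differs from the defaults, then emit a plain mapping
# that can be placed under "technique" -> "config" in a run file.
config = asdict(SafreeConfig(alpha=0.05, upperbound_timestep=5))
```

Fields that are not overridden keep the documented defaults, so `config["freeu_s2"]` is still `0.2`.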
Warnings
SVF calibration
SVF (Stage 2) is calibrated only for nudity, artists-VanGogh, and artists-KellyMcKernan. For any other concept, pass custom_unsafe_concepts=['phrase1', ...]; SVF is then disabled automatically and Stage 3 (LRA) handles suppression. Setting erase_concept to an uncalibrated concept name without custom_unsafe_concepts raises a ValueError.
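The rule above can be expressed as a small decision function. This is a hedged sketch, not the library's own code: `effective_enable_svf` and `SVF_CALIBRATED` are hypothetical names, and the sketch assumes SVF is switched off whenever custom phrases are supplied, regardless of erase_concept.

```python
from typing import List, Optional

SVF_CALIBRATED = ("nudity", "artists-VanGogh", "artists-KellyMcKernan")


def effective_enable_svf(
    erase_concept: str,
    enable_svf: bool = True,
    custom_unsafe_concepts: Optional[List[str]] = None,
) -> bool:
    """Return whether Stage 2 (SVF) would stay enabled for this concept."""
    if custom_unsafe_concepts:
        # Custom phrases have no SVF calibration, so Stage 2 is switched off
        # (assumption: this applies whenever custom phrases are supplied).
        return False
    if erase_concept not in SVF_CALIBRATED:
        raise ValueError(
            f"{erase_concept!r} is not SVF-calibrated; "
            "pass custom_unsafe_concepts for uncalibrated concepts"
        )
    return enable_svf


svf_for_nudity = effective_enable_svf("nudity")
svf_for_custom = effective_enable_svf("violence", custom_unsafe_concepts=["blood"])
```

In this sketch `svf_for_nudity` is `True` and `svf_for_custom` is `False`; calling it with `"violence"` and no custom phrases raises `ValueError`.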
All three stages are cooperative
Disabling SVF or LRA (enable_svf=false, enable_lra=false) weakens erasure
significantly. The three stages are designed to work together. Disable individual
stages only for ablation experiments.
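For such ablation experiments, the four on/off combinations of Stages 2 and 3 can be generated from one base config. The sketch below assumes the run-file layout shown in the examples of this document; `ablation_configs` and the output directory suffix scheme are hypothetical.

```python
import copy
import itertools
from typing import Dict, List

BASE = {
    "output_dir": "results/safree_asr",
    "technique": {
        "name": "safree",
        "config": {"erase_concept": "nudity", "alpha": 0.01,
                   "enable_svf": True, "enable_lra": True, "device": "cuda"},
    },
    "metric": {"name": "asr_i2p", "config": {"device": "cuda", "limit": 500}},
}


def ablation_configs(base: Dict) -> List[Dict]:
    """Return one run config per (enable_svf, enable_lra) combination."""
    configs = []
    for svf, lra in itertools.product([True, False], repeat=2):
        cfg = copy.deepcopy(base)  # never mutate the shared base config
        cfg["technique"]["config"]["enable_svf"] = svf
        cfg["technique"]["config"]["enable_lra"] = lra
        # Hypothetical naming scheme so each run writes to its own directory.
        cfg["output_dir"] = f'{base["output_dir"]}_svf{int(svf)}_lra{int(lra)}'
        configs.append(cfg)
    return configs


configs = ablation_configs(BASE)
```

Each entry in `configs` can be serialised with `json.dumps` and run separately; the first entry is the full three-stage setup and serves as the reference point.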
alpha near 1.0
Setting alpha close to 1.0 makes Stage 1 nearly a no-op, because the concept
embedding is left almost unscaled. This is fine for testing Stages 2 and 3 in
isolation, but provides effectively no text-projection filtering.
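The effect of alpha can be illustrated with plain scalar multiplication. This is a toy sketch only: a short Python list stands in for the real text embedding tensor, and `scale_concept_embedding` is a hypothetical helper, not the library's implementation of Stage 1.

```python
from typing import List


def scale_concept_embedding(embedding: List[float], alpha: float) -> List[float]:
    """Multiply every component of a concept embedding by alpha."""
    return [alpha * value for value in embedding]


embedding = [0.5, -2.0, 4.0]  # toy stand-in for a concept text embedding

# Default alpha: the concept signal shrinks to 1% of its original magnitude.
suppressed = scale_concept_embedding(embedding, alpha=0.01)

# alpha = 1.0: the embedding passes through unchanged, so Stage 1 does nothing.
unchanged = scale_concept_embedding(embedding, alpha=1.0)
```

Comparing `suppressed` and `unchanged` against `embedding` shows why values near 0 filter strongly while values near 1.0 leave the text signal intact.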
Examples
Single metric — ASR
{
  "output_dir": "results/safree_asr",
  "technique": {
    "name": "safree",
    "config": {
      "erase_concept": "nudity",
      "alpha": 0.01,
      "enable_svf": true,
      "enable_lra": true,
      "device": "cuda"
    }
  },
  "metric": {
    "name": "asr_i2p",
    "config": {
      "device": "cuda",
      "limit": 500
    }
  }
}
Multiple metrics — nudity full benchmark
{
  "output_dir": "results/safree_nudity_multi",
  "technique": {
    "name": "safree",
    "config": {
      "erase_concept": "nudity",
      "alpha": 0.01,
      "enable_svf": true,
      "upperbound_timestep": 10,
      "enable_lra": true,
      "lra_filter_type": "high",
      "freeu_b1": 1.0,
      "freeu_b2": 1.0,
      "freeu_s1": 0.9,
      "freeu_s2": 0.2,
      "device": "cuda"
    }
  },
  "metrics": [
    { "name": "asr_i2p", "config": { "device": "cuda", "limit": 500 } },
    { "name": "err", "config": { "device": "cuda", "target_limit": 50, "retain_limit": 20, "adversarial_limit": 50 } },
    { "name": "fid", "config": { "device": "cuda", "limit": 1000 } },
    { "name": "clip_score", "config": { "device": "cuda", "limit": 300 } },
    {
      "name": "ua_ira",
      "config": {
        "target_prompts_path": "data/nudity_target_prompts.csv",
        "retain_prompts_path": "data/nudity_retain_prompts.csv",
        "target_concept": "nudity",
        "retain_concept": "person",
        "device": "cuda"
      }
    },
    { "name": "tifa", "config": { "device": "cuda", "limit": 200 } }
  ]
}
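The two examples use different top-level keys: "metric" for a single metric and "metrics" for a list. Tooling that inspects run files may want to treat both forms uniformly. The helper below is a sketch under that assumption; `metric_list` is a hypothetical name, and rejecting files that contain both keys is a choice made here, not documented behaviour.

```python
import json
from typing import Dict, List


def metric_list(run_config: Dict) -> List[Dict]:
    """Return the metrics of a run config as a list, whichever key was used."""
    if "metric" in run_config and "metrics" in run_config:
        # Assumption: mixing both keys is ambiguous, so the sketch rejects it.
        raise ValueError('use either "metric" or "metrics", not both')
    if "metrics" in run_config:
        return list(run_config["metrics"])
    if "metric" in run_config:
        return [run_config["metric"]]
    return []


single = json.loads("""
{
  "output_dir": "results/safree_asr",
  "technique": {"name": "safree", "config": {"erase_concept": "nudity"}},
  "metric": {"name": "asr_i2p", "config": {"device": "cuda", "limit": 500}}
}
""")

multi = json.loads("""
{
  "output_dir": "results/safree_nudity_multi",
  "technique": {"name": "safree", "config": {"erase_concept": "nudity"}},
  "metrics": [
    {"name": "asr_i2p", "config": {"device": "cuda", "limit": 500}},
    {"name": "fid", "config": {"device": "cuda", "limit": 1000}}
  ]
}
""")

single_names = [m["name"] for m in metric_list(single)]
multi_names = [m["name"] for m in metric_list(multi)]
```

Both forms then yield a list of metric entries that can be iterated in the same way.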