MACE — Mass Concept Erasure
Overview
MACE (Mass Concept Erasure, CVPR 2024) erases concepts through a closed-form weight update rather than gradient-based fine-tuning. It analytically computes a modification to the key and value projection matrices in the UNet's cross-attention layers, remapping the concept's token representations to a neutral (empty) representation.
Because it is closed-form, MACE is deterministic and significantly faster than fine-tuning
approaches like ESD or AdvUnlearn — a single erasure typically completes in seconds rather
than minutes. The lambda_cfr parameter controls the conservatism of the update: higher
values preserve the original weights more aggressively, at the cost of weaker erasure.
MACE also supports erasing multiple synonyms of a concept simultaneously by passing a list
to erase_concept, which is useful when a concept has many surface forms (e.g.
["nude", "naked", "nudity", "nsfw"]).
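To make the idea of a closed-form update concrete, here is a minimal NumPy sketch of a ridge-style remapping of a single projection matrix. It is an illustration under stated assumptions, not MACE's actual implementation: the objective, the function name `closed_form_remap`, and the random stand-in embeddings are all hypothetical, and the real CFR step may use a different loss and operate on every cross-attention layer.

```python
import numpy as np


def closed_form_remap(W, concept_embs, neutral_emb, lam):
    """Ridge-style closed-form update of one projection matrix (illustrative).

    Minimises  sum_i ||W_new @ e_i - W @ neutral||^2 + lam * ||W_new - W||_F^2,
    whose solution is
        W_new = (sum_i v e_i^T + lam * W) @ inv(sum_i e_i e_i^T + lam * I)
    with v = W @ neutral. This objective is an assumption made for the sketch.
    """
    E = np.atleast_2d(concept_embs)            # (n, d_in), one row per synonym
    v = W @ neutral_emb                        # (d_out,), target output for every synonym
    V = np.tile(v, (E.shape[0], 1))            # (n, d_out)
    A = V.T @ E + lam * W                      # (d_out, d_in)
    B = E.T @ E + lam * np.eye(W.shape[1])     # (d_in, d_in), symmetric positive definite
    # Solve W_new @ B = A without forming an explicit inverse (B is symmetric).
    return np.linalg.solve(B, A.T).T


rng = np.random.default_rng(0)
W = rng.normal(size=(6, 8))                    # stand-in for a K or V projection
synonyms = rng.normal(size=(3, 8))             # stand-in embeddings for three surface forms
neutral = rng.normal(size=8)                   # stand-in for the empty-prompt embedding

W_aggressive = closed_form_remap(W, synonyms, neutral, lam=1e-6)
W_conservative = closed_form_remap(W, synonyms, neutral, lam=1e6)
```

In this toy setting a very small regularisation weight sends every synonym embedding to the neutral output, while a very large one leaves the matrix essentially unchanged, which mirrors the trade-off described for lambda_cfr.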
Base model: CompVis/stable-diffusion-v1-4
Supported concepts: Any — both single strings and lists of synonyms are accepted.
Compatible metrics
| Metric | Compatible concepts | Notes |
|---|---|---|
| ASR I2P | Any I2P concept | NudeNet for nudity; CLIP for all others |
| ERR | nudity only | Requires erase_concept to be "nudity" or a list whose first element is "nudity" |
| FID | Any | General image quality |
| CLIP Score | Any | General text-image alignment |
| UA_IRA | Any | Requires custom prompt CSVs |
| TIFA | Any | General faithfulness |
| ASR Custom | Any | Concept-agnostic via CLIP |
| MMA-Diffusion | Any | Requires explicit target prompts for non-nudity |
Configuration reference
| Field | Type | Default | Description |
|---|---|---|---|
| erase_concept | str \| list[str] | "nudity" | Concept(s) to erase. A single string or a list of synonyms. All listed terms are mapped to the neutral representation. |
| erase_from | str \| list[str] \| None | None | Scope restriction. If set, only erases the concept when it appears in the context of this broader concept. Defaults to None (fully erase to neutral with no scope). |
| lambda_cfr | float | 0.1 | CFR regularisation strength. Higher = more conservative update, weaker erasure. Lower = more aggressive erasure, higher risk of side effects on related concepts. |
| load_path | str \| None | None | Path to a .pt file containing a pre-modified UNet state dict (saved by a previous MACE run). If set, the CFR computation is skipped entirely and these weights are loaded directly. |
| save_path | str \| None | None | Path to save the CFR-modified UNet state dict as a .pt file after computation. Only used when CFR runs (i.e. load_path is not set). |
| num_inference_steps | int | 50 | DDIM steps for image generation during evaluation. |
| guidance_scale | float | 7.5 | Classifier-free guidance scale for generation. |
| use_fp16 | bool | True | Run in half precision. |
| device | str | "cuda" | Device to run on. |
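When configs are built programmatically, it can help to mirror the fields above in a typed structure. The sketch below is one possible way to do that; `MaceConfig` and `to_technique_block` are hypothetical names introduced for illustration and are not part of the framework. Only the field names, types, and defaults come from the configuration reference.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional, Union


@dataclass
class MaceConfig:
    """Hypothetical mirror of the configuration reference; not the framework's own class."""

    erase_concept: Union[str, List[str]] = "nudity"
    erase_from: Union[str, List[str], None] = None
    lambda_cfr: float = 0.1
    load_path: Optional[str] = None
    save_path: Optional[str] = None
    num_inference_steps: int = 50
    guidance_scale: float = 7.5
    use_fp16: bool = True
    device: str = "cuda"

    def to_technique_block(self):
        """Return a "technique" entry, omitting fields that are left at None."""
        config = {key: value for key, value in asdict(self).items() if value is not None}
        return {"name": "mace", "config": config}


block = MaceConfig(
    erase_concept=["nudity", "nude"],
    save_path="checkpoints/mace_nudity.pt",
).to_technique_block()
```

The resulting dictionary has the same shape as the "technique" entry used in the examples further down and can be serialised with `json.dumps`.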
Warnings
load_path and save_path are distinct
load_path and save_path serve different purposes and should not be set to the same
file. Set save_path on a first run to persist the modified weights. On subsequent runs,
set load_path to skip the CFR computation entirely. If neither is set, CFR re-runs on
every invocation.
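One way to follow this rule without editing configs by hand is to pick the field based on whether the checkpoint file already exists. The helper below is a hypothetical convenience, not something the framework provides; it only ever sets one of the two fields, so they never point at the same file in a single run.

```python
import os
import tempfile


def checkpoint_fields(path):
    """Return {"load_path": path} if the file exists, else {"save_path": path}."""
    if os.path.isfile(path):
        return {"load_path": path}
    return {"save_path": path}


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "mace_nudity.pt")
    first_run = checkpoint_fields(ckpt)   # file missing, so CFR would run and save
    open(ckpt, "wb").close()              # stand-in for the file a first run would write
    later_run = checkpoint_fields(ckpt)   # file present, so CFR would be skipped
```

The returned dictionary can be merged into the technique's "config" block before the run config is written out.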
Checkpoint format
Both load_path and save_path refer to a single .pt file containing the full UNet
state dict (torch.save(unet.state_dict(), path)), not a HuggingFace model directory.
The saved weights include all UNet parameters — only the cross-attention K/V matrices
differ from the base model, but the full state dict is stored for a clean load.
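The round trip described here can be sketched with plain PyTorch. The module below is a tiny hypothetical stand-in for the UNet (the real model has many more parameters and different key names), and `changed_keys` is an illustrative helper for checking which tensors differ from the base weights after loading.

```python
import os
import tempfile

import torch
from torch import nn


class TinyStandIn(nn.Module):
    """Hypothetical stand-in for the UNet; real key names will differ."""

    def __init__(self):
        super().__init__()
        self.to_k = nn.Linear(4, 4, bias=False)
        self.to_v = nn.Linear(4, 4, bias=False)
        self.other = nn.Linear(4, 4, bias=False)


def changed_keys(base_state, edited_state):
    """Return the names of tensors that differ between two state dicts."""
    return sorted(k for k in base_state if not torch.equal(base_state[k], edited_state[k]))


def roundtrip_demo():
    base = TinyStandIn()
    base_state = {k: v.clone() for k, v in base.state_dict().items()}

    # Simulate an edit that touches only the K and V projections.
    edited = TinyStandIn()
    edited.load_state_dict(base_state)
    with torch.no_grad():
        edited.to_k.weight.add_(1.0)
        edited.to_v.weight.add_(1.0)

    # Save the full state dict to a single .pt file, then load it back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mace_demo.pt")
        torch.save(edited.state_dict(), path)
        loaded_state = torch.load(path, map_location="cpu")

    restored = TinyStandIn()
    restored.load_state_dict(loaded_state)
    return changed_keys(base_state, restored.state_dict())


differing = roundtrip_demo()
```

In this toy case only the two edited projections show up as different, even though the file stores every parameter.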
lambda_cfr tuning
The default lambda_cfr=0.1 is a reasonable starting point. If ASR remains high after
erasure, lower it (e.g. 0.01). If FID or CLIP Score degrades noticeably, raise it
(e.g. 0.5). The right value depends on the concept.
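Since the right value depends on the concept, a small sweep is often the quickest way to find it. The sketch below generates one run config per lambda_cfr value from a base config shaped like the examples in this page; the function name `sweep_configs` and the output and checkpoint naming scheme are assumptions made for illustration.

```python
import copy
import json

BASE_CONFIG = {
    "output_dir": "results/mace_asr",
    "technique": {
        "name": "mace",
        "config": {"erase_concept": "nudity", "lambda_cfr": 0.1, "device": "cuda"},
    },
    "metric": {"name": "asr_i2p", "config": {"device": "cuda", "limit": 500}},
}


def sweep_configs(base, lambdas):
    """Return one deep-copied config per lambda value, each with its own paths."""
    configs = []
    for lam in lambdas:
        cfg = copy.deepcopy(base)
        tag = str(lam).replace(".", "p")  # e.g. 0.01 -> "0p01" (hypothetical naming)
        cfg["output_dir"] = f"{base['output_dir']}_lambda_{tag}"
        cfg["technique"]["config"]["lambda_cfr"] = lam
        cfg["technique"]["config"]["save_path"] = f"checkpoints/mace_nudity_lambda_{tag}.pt"
        configs.append(cfg)
    return configs


configs = sweep_configs(BASE_CONFIG, [0.01, 0.1, 0.5])
first_as_json = json.dumps(configs[0], indent=2)
```

Each config gets its own save_path, so weights computed with different lambda_cfr values do not overwrite one another.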
Synonym lists and ERR compatibility
When erase_concept is a list, the validation layer extracts the first element to
determine concept compatibility with ERR (nudity-specific). Ensure the first element
is "nudity" if using ERR with a synonym list. ASR I2P has no such restriction.
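The first-element rule can be expressed as a short check. This is a sketch of the behaviour described above, not the framework's validation code; `primary_concept` and `err_compatible` are hypothetical names.

```python
def primary_concept(erase_concept):
    """Return the string itself, or the first element when given a list."""
    if isinstance(erase_concept, str):
        return erase_concept
    if not erase_concept:
        raise ValueError("erase_concept list must not be empty")
    return erase_concept[0]


def err_compatible(erase_concept):
    """ERR is nudity-specific, so the primary concept must be "nudity"."""
    return primary_concept(erase_concept) == "nudity"


single_ok = err_compatible("nudity")
list_ok = err_compatible(["nudity", "nude", "naked", "nsfw"])
list_bad = err_compatible(["nude", "naked", "nudity", "nsfw"])
```

Under this rule the list ["nude", "naked", "nudity", "nsfw"] would be rejected for ERR because "nudity" is not first, so reorder the list before combining it with that metric.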
Examples
Single metric — ASR (nudity)
{
  "output_dir": "results/mace_asr",
  "technique": {
    "name": "mace",
    "config": {
      "erase_concept": "nudity",
      "lambda_cfr": 0.1,
      "save_path": "checkpoints/mace_nudity.pt",
      "device": "cuda"
    }
  },
  "metric": {
    "name": "asr_i2p",
    "config": {
      "device": "cuda",
      "limit": 500
    }
  }
}
Single metric — with synonym list
{
  "output_dir": "results/mace_nudity_synonyms",
  "technique": {
    "name": "mace",
    "config": {
      "erase_concept": ["nudity", "nude", "naked", "nsfw"],
      "lambda_cfr": 0.1,
      "save_path": "checkpoints/mace_nudity_synonyms.pt",
      "device": "cuda"
    }
  },
  "metric": {
    "name": "asr_i2p",
    "config": {
      "device": "cuda",
      "limit": 500
    }
  }
}
Multiple metrics — nudity full benchmark
{
  "output_dir": "results/mace_nudity_multi",
  "technique": {
    "name": "mace",
    "config": {
      "erase_concept": "nudity",
      "lambda_cfr": 0.1,
      "save_path": "checkpoints/mace_nudity.pt",
      "device": "cuda",
      "num_inference_steps": 50,
      "guidance_scale": 7.5
    }
  },
  "metrics": [
    { "name": "asr_i2p", "config": { "device": "cuda", "limit": 500 } },
    { "name": "err", "config": { "device": "cuda", "target_limit": 50, "retain_limit": 20, "adversarial_limit": 50 } },
    { "name": "fid", "config": { "device": "cuda", "limit": 1000 } },
    { "name": "clip_score", "config": { "device": "cuda", "limit": 300 } },
    {
      "name": "ua_ira",
      "config": {
        "target_prompts_path": "data/nudity_target_prompts.csv",
        "retain_prompts_path": "data/nudity_retain_prompts.csv",
        "target_concept": "nudity",
        "retain_concept": "person",
        "device": "cuda"
      }
    },
    { "name": "tifa", "config": { "device": "cuda", "limit": 200 } }
  ]
}
Reusing modified weights across runs
Set save_path on the first run to persist the CFR-modified weights, then use load_path
on all subsequent runs to skip the CFR computation. This is especially useful when benchmarking
multiple metrics against the same modified model. See
Caching adversarial prompts and technique weights
for the full workflow.