MACE — Mass Concept Erasure

Overview

MACE (Mass Concept Erasure, CVPR 2024) erases concepts through a closed-form weight update rather than gradient-based fine-tuning. It analytically computes a modification to the key and value projection matrices in the UNet's cross-attention layers, remapping the concept's token representations to a neutral (empty) representation.

Because it is closed-form, MACE is deterministic and significantly faster than fine-tuning approaches like ESD or AdvUnlearn — a single erasure typically completes in seconds rather than minutes. The lambda_cfr parameter controls the conservatism of the update: higher values preserve the original weights more aggressively, at the cost of weaker erasure.
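To make the role of lambda_cfr concrete, the NumPy sketch below solves a generic ridge-regularised least-squares problem for a single projection matrix: it maps the concept embeddings onto the outputs of the neutral embedding while penalising deviation from the original weights. The function names, shapes, and the objective itself are illustrative assumptions, not the exact objective or code MACE uses.

```python
import numpy as np

def closed_form_update(W, concept_embs, neutral_embs, lam):
    """Hypothetical ridge-style closed-form edit of one projection matrix.

    Minimises  sum_i ||W_new @ c_i - W @ n_i||^2 + lam * ||W_new - W||_F^2
    W: (d_out, d_in); concept_embs, neutral_embs: (n, d_in).
    """
    C = concept_embs.T                          # (d_in, n) inputs to remap
    V = W @ neutral_embs.T                      # (d_out, n) desired outputs
    lhs = V @ C.T + lam * W                     # (d_out, d_in)
    gram = C @ C.T + lam * np.eye(W.shape[1])   # (d_in, d_in), symmetric
    # Solve W_new @ gram = lhs without forming an explicit inverse.
    return np.linalg.solve(gram, lhs.T).T

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))          # stand-in for a K or V projection
concept = rng.normal(size=(1, 6))    # stand-in concept token embedding
neutral = rng.normal(size=(1, 6))    # stand-in neutral (empty) embedding

def erasure_gap(lam):
    """Distance between the edited concept output and the neutral output."""
    W_new = closed_form_update(W, concept, neutral, lam)
    return float(np.linalg.norm(W_new @ concept.T - W @ neutral.T))

def weight_drift(lam):
    """How far the edited weights moved away from the originals."""
    W_new = closed_form_update(W, concept, neutral, lam)
    return float(np.linalg.norm(W_new - W))
```

In this toy setting, a smaller lam drives the concept's output closer to the neutral output but moves the weights further from the originals, which mirrors the trade-off described above.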

MACE also supports erasing multiple synonyms of a concept simultaneously by passing a list to erase_concept, which is useful when a concept has many surface forms (e.g. ["nude", "naked", "nudity", "nsfw"]).

Base model: CompVis/stable-diffusion-v1-4

Supported concepts: Any — both single strings and lists of synonyms are accepted.


Compatible metrics

| Metric | Compatible | Notes |
| --- | --- | --- |
| ASR I2P | Any I2P concept | NudeNet for nudity; CLIP for all others |
| ERR | nudity only | Requires erase_concept to be or contain "nudity" |
| FID | Any | General image quality |
| CLIP Score | Any | General text-image alignment |
| UA_IRA | Any | Requires custom prompt CSVs |
| TIFA | Any | General faithfulness |
| ASR Custom | Any | Concept-agnostic via CLIP |
| MMA-Diffusion | Any | Requires explicit target prompts for non-nudity |

Configuration reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| erase_concept | str \| list[str] | "nudity" | Concept(s) to erase. A single string or a list of synonyms. All listed terms are mapped to the neutral representation. |
| erase_from | str \| list[str] \| None | None | Scope restriction. If set, only erases the concept when it appears in the context of this broader concept. Defaults to None (fully erase to neutral with no scope). |
| lambda_cfr | float | 0.1 | CFR regularisation strength. Higher = more conservative update, weaker erasure. Lower = more aggressive erasure, higher risk of side effects on related concepts. |
| load_path | str \| None | None | Path to a .pt file containing a pre-modified UNet state dict (saved by a previous MACE run). If set, the CFR computation is skipped entirely and these weights are loaded directly. |
| save_path | str \| None | None | Path to save the CFR-modified UNet state dict as a .pt file after computation. Only used when CFR runs (i.e. load_path is not set). |
| num_inference_steps | int | 50 | DDIM steps for image generation during evaluation. |
| guidance_scale | float | 7.5 | Classifier-free guidance scale for generation. |
| use_fp16 | bool | True | Run in half precision. |
| device | str | "cuda" | Device to run on. |

Warnings

load_path and save_path are distinct

load_path and save_path serve different purposes and should not be set to the same file. Set save_path on a first run to persist the modified weights. On subsequent runs, set load_path to skip the CFR computation entirely. If neither is set, CFR re-runs on every invocation.
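The rules above can be summarised as a small decision function. This is a sketch only: plan_weights and its return labels are hypothetical names chosen for illustration, and raising an error when both paths match is an assumption about how one might enforce the warning, not documented behaviour.

```python
from typing import Optional

def plan_weights(load_path: Optional[str], save_path: Optional[str]) -> str:
    """Hypothetical helper: decide what a run does with the UNet weights."""
    if load_path is not None and load_path == save_path:
        # Assumed guard: the two fields should never name the same file.
        raise ValueError("load_path and save_path must not point to the same file")
    if load_path is not None:
        # Pre-modified weights exist: CFR is skipped and save_path is unused.
        return "load"
    if save_path is not None:
        # First run: compute CFR, then persist the modified state dict.
        return "compute_and_save"
    # Neither set: CFR re-runs on every invocation and nothing is stored.
    return "compute_only"
```

For example, plan_weights(None, "checkpoints/mace_nudity.pt") describes a first run, and plan_weights("checkpoints/mace_nudity.pt", None) describes every run after it.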

Checkpoint format

Both load_path and save_path refer to a single .pt file containing the full UNet state dict (torch.save(unet.state_dict(), path)), not a HuggingFace model directory. The saved weights include all UNet parameters — only the cross-attention K/V matrices differ from the base model, but the full state dict is stored for a clean load.

lambda_cfr tuning

The default lambda_cfr=0.1 is a reasonable starting point. If ASR remains high after erasure, lower it (e.g. 0.01). If FID or CLIP Score degrades noticeably, raise it (e.g. 0.5). The right value depends on the concept.
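One way to explore this trade-off is to generate one config per candidate value and compare ASR against FID and CLIP Score across the runs. The sketch below builds such configs using the same schema as the examples on this page; the lambda_sweep helper and the output_dir and save_path naming scheme are assumptions made for illustration.

```python
import json

def lambda_sweep(lambdas, concept="nudity"):
    """Hypothetical helper: build one benchmark config per lambda_cfr value."""
    configs = []
    for lam in lambdas:
        tag = f"{concept}_lam{lam:g}"  # e.g. "nudity_lam0.01" (assumed naming)
        configs.append({
            "output_dir": f"results/mace_{tag}",
            "technique": {
                "name": "mace",
                "config": {
                    "erase_concept": concept,
                    "lambda_cfr": lam,
                    # A separate checkpoint per value, so runs never overwrite each other.
                    "save_path": f"checkpoints/mace_{tag}.pt",
                    "device": "cuda",
                },
            },
            "metrics": [
                {"name": "asr_i2p", "config": {"device": "cuda", "limit": 500}},
                {"name": "fid", "config": {"device": "cuda", "limit": 1000}},
                {"name": "clip_score", "config": {"device": "cuda", "limit": 300}},
            ],
        })
    return configs

sweep = lambda_sweep([0.01, 0.1, 0.5])
print(json.dumps(sweep[0], indent=2))
```

Each entry can be written to its own JSON file and run independently.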

Synonym lists and ERR compatibility

When erase_concept is a list, the validation layer extracts the first element to determine concept compatibility with ERR (nudity-specific). Ensure the first element is "nudity" if using ERR with a synonym list. ASR I2P has no such restriction.
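The first-element rule can be checked before launching a run. The following sketch assumes the check is a plain string comparison on the first list element; primary_concept and err_compatible are hypothetical names, and the real validation layer may normalise or compare values differently.

```python
from typing import List, Union

def primary_concept(erase_concept: Union[str, List[str]]) -> str:
    """Hypothetical helper: the concept used for compatibility checks."""
    if isinstance(erase_concept, str):
        return erase_concept
    if not erase_concept:
        raise ValueError("erase_concept list must not be empty")
    # For a synonym list, only the first element is considered.
    return erase_concept[0]

def err_compatible(erase_concept: Union[str, List[str]]) -> bool:
    """True if ERR (nudity-specific) would accept this erase_concept."""
    return primary_concept(erase_concept) == "nudity"
```

Under this assumption, ["nudity", "nude", "naked"] passes while ["nude", "nudity", "naked"] does not, even though both lists contain "nudity".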


Examples

Single metric — ASR (nudity)

{
  "output_dir": "results/mace_asr",
  "technique": {
    "name": "mace",
    "config": {
      "erase_concept": "nudity",
      "lambda_cfr": 0.1,
      "save_path": "checkpoints/mace_nudity.pt",
      "device": "cuda"
    }
  },
  "metric": {
    "name": "asr_i2p",
    "config": {
      "device": "cuda",
      "limit": 500
    }
  }
}

Single metric — with synonym list

{
  "output_dir": "results/mace_nudity_synonyms",
  "technique": {
    "name": "mace",
    "config": {
      "erase_concept": ["nudity", "nude", "naked", "nsfw"],
      "lambda_cfr": 0.1,
      "save_path": "checkpoints/mace_nudity_synonyms.pt",
      "device": "cuda"
    }
  },
  "metric": {
    "name": "asr_i2p",
    "config": {
      "device": "cuda",
      "limit": 500
    }
  }
}

Multiple metrics — nudity full benchmark

{
  "output_dir": "results/mace_nudity_multi",
  "technique": {
    "name": "mace",
    "config": {
      "erase_concept": "nudity",
      "lambda_cfr": 0.1,
      "save_path": "checkpoints/mace_nudity.pt",
      "device": "cuda",
      "num_inference_steps": 50,
      "guidance_scale": 7.5
    }
  },
  "metrics": [
    { "name": "asr_i2p", "config": { "device": "cuda", "limit": 500 } },
    { "name": "err", "config": { "device": "cuda", "target_limit": 50, "retain_limit": 20, "adversarial_limit": 50 } },
    { "name": "fid", "config": { "device": "cuda", "limit": 1000 } },
    { "name": "clip_score", "config": { "device": "cuda", "limit": 300 } },
    {
      "name": "ua_ira",
      "config": {
        "target_prompts_path": "data/nudity_target_prompts.csv",
        "retain_prompts_path": "data/nudity_retain_prompts.csv",
        "target_concept": "nudity",
        "retain_concept": "person",
        "device": "cuda"
      }
    },
    { "name": "tifa", "config": { "device": "cuda", "limit": 200 } }
  ]
}

Reusing modified weights across runs

Set save_path on the first run to persist the CFR-modified weights, then set load_path on all subsequent runs to skip the CFR computation. This is especially useful when benchmarking multiple metrics against the same erased model. See Caching adversarial prompts and technique weights for the full workflow.
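As a rough illustration of that two-step pattern, the sketch below derives a follow-up config from a first-run config by turning its save_path into a load_path and swapping in a different metric. The reuse_config helper is hypothetical and only manipulates the JSON structure shown in the examples above.

```python
import copy

def reuse_config(first_run: dict, output_dir: str, metric: dict) -> dict:
    """Hypothetical helper: build a follow-up run that loads saved weights."""
    cfg = copy.deepcopy(first_run)  # leave the first-run config untouched
    tech = cfg["technique"]["config"]
    if "save_path" not in tech:
        raise ValueError("first run did not set save_path; nothing to reuse")
    # The file written by the first run becomes the input of the next one.
    tech["load_path"] = tech.pop("save_path")
    cfg["output_dir"] = output_dir
    cfg.pop("metrics", None)
    cfg["metric"] = metric
    return cfg

first_run = {
    "output_dir": "results/mace_asr",
    "technique": {
        "name": "mace",
        "config": {
            "erase_concept": "nudity",
            "lambda_cfr": 0.1,
            "save_path": "checkpoints/mace_nudity.pt",
            "device": "cuda",
        },
    },
    "metric": {"name": "asr_i2p", "config": {"device": "cuda", "limit": 500}},
}

fid_run = reuse_config(
    first_run,
    output_dir="results/mace_fid",
    metric={"name": "fid", "config": {"device": "cuda", "limit": 1000}},
)
```

The follow-up config keeps erase_concept and lambda_cfr for reference, but with load_path set the CFR computation is skipped.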