Running on a GPU Cluster
This page covers running eval-learn on SLURM-managed clusters. The commands use SLURM — translate to PBS/LSF as needed.
Single job
The minimal submission script:
#!/bin/bash
#SBATCH --job-name=eval-learn
#SBATCH --gres=gpu:1
#SBATCH --mem=32G
#SBATCH --time=04:00:00
#SBATCH --output=logs/%j.log
source activate your_env
export HF_TOKEN=your_token_here
eval-learn run --config examples/nudity/esd.json
output_dir in the config controls where results are written. Use an absolute path
or set it relative to your project root and cd there before submitting.
Array job — one technique per task
To benchmark all techniques against the same concept in parallel, submit an array job where each task runs a different config:
#!/bin/bash
#SBATCH --job-name=eval-learn-nudity
#SBATCH --array=0-12
#SBATCH --gres=gpu:1
#SBATCH --mem=32G
#SBATCH --time=06:00:00
#SBATCH --output=logs/%A_%a.log
source activate your_env
export HF_TOKEN=your_token_here
CONFIGS=(
examples/nudity/esd.json
examples/nudity/mace.json
examples/nudity/uce.json
examples/nudity/ssd.json
examples/nudity/ca.json
examples/nudity/cogfd.json
examples/nudity/trasce.json
examples/nudity/advunlearn.json
examples/nudity/saeuron.json
examples/nudity/safree.json
examples/nudity/sld.json
examples/nudity/concept_steerers.json
examples/nudity/free_run.json
)
eval-learn run --config "${CONFIGS[$SLURM_ARRAY_TASK_ID]}"
Each task writes to its own output_dir as defined in the config, so there are no
write conflicts.
Caching adversarial prompts across jobs
Adversarial prompt generation (P4D, MMA Diffusion, Ring-A-Bell) is expensive. If you are running multiple techniques against the same prompt set, generate once and share the cache across array tasks rather than re-running on every job.
Step 1 — generate prompts in a dedicated job:
#!/bin/bash
#SBATCH --job-name=gen-prompts
#SBATCH --gres=gpu:1
#SBATCH --mem=32G
#SBATCH --time=02:00:00
#SBATCH --output=logs/gen_prompts_%j.log
source activate your_env
export HF_TOKEN=your_token_here
eval-learn run --config gen_prompts.json
Where gen_prompts.json is a minimal config that runs just the adversarial metric
against a cheap technique (e.g. free_run) to produce the cache:
{
"output_dir": "results/prompt_cache_run",
"technique": {
"name": "free_run",
"config": { "model_id": "CompVis/stable-diffusion-v1-4", "device": "cuda" }
},
"metrics": [
{
"name": "asr_p4d",
"config": {
"concept_name": "nudity",
"target_prompts_path": "examples/data/nudity_target_prompts.csv",
"erase_id": "std",
"generated_prompts_output": "cache/p4d_nudity.csv",
"device": "cuda"
}
}
]
}
Step 2 — add a dependency so technique jobs start after prompts are ready:
PROMPT_JOB=$(sbatch --parsable gen_prompts.sh)
sbatch --dependency=afterok:$PROMPT_JOB --array=0-12 benchmark.sh
Step 3 — point each technique config at the cache by setting
precomputed_prompts_path instead of target_prompts_path:
{
"name": "asr_p4d",
"config": {
"concept_name": "nudity",
"precomputed_prompts_path": "cache/p4d_nudity.csv",
"device": "cuda"
}
}
See Caching adversarial prompts and technique weights for the full pattern including MMA Diffusion, Ring-A-Bell, and reusing trained weights.
Reusing trained weights across jobs
Training-based techniques (ESD, MACE, SSD, CoGFD, AdvUnlearn, CA) re-run training
on every invocation. If you want to run multiple metric configurations against the
same trained model without retraining, save the weights on the first run and load
them on subsequent runs via save_path / load_path in the technique config.
This is particularly useful when iterating on metric parameters — you train once, then resubmit only the evaluation portion.
HF Hub
Push results from a completed job directly to a HuggingFace dataset repo for centralised storage:
Or push an existing results directory separately after the job completes:
Tips
Shared filesystem caching — set cache directories to a shared path if your home directory is node-local:
# HuggingFace models (techniques, CLIP, BLIP-2, etc.)
export HF_HOME=/path/to/shared/storage/.cache/huggingface
# PyTorch hub models (torchvision FID inception weights, etc.)
export TORCH_HOME=/path/to/shared/storage/.cache/torch
With a shared home directory these are available to all nodes automatically and no override is needed.
GPU memory — most techniques require at least 16GB VRAM. Training-based techniques (ESD, CA) require more during the training phase due to the frozen reference UNet copy — request 40GB if available.
Time limits — adversarial prompt generation (especially P4D at num_iter=3000)
and long training runs can exceed short-queue limits. Use the caching workflow above
to split generation from evaluation into separate jobs.