On many AMD GPUs, you must add either --precision full --no-half or --upcast-sampling to the launch arguments to avoid NaN errors or crashes. If --upcast-sampling alone fixes the problem on your card, prefer it: the card keeps running in fp16, which should be about 2x faster than full precision.

  • Some cards, such as the Radeon RX 6000 Series and the RX 500 Series, already run fp16 without problems and need neither option (noted here).
  • If your card cannot run SD with the latest pytorch+rocm core package, you can try installing an earlier version by following the more manual installation guide below.
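
As a minimal sketch of how the flags above might be applied, the snippet below sets them through a COMMANDLINE_ARGS variable. It assumes the web UI is started by a launcher that reads that variable from a webui-user.sh file; the file name and the variable are assumptions not stated in this section, so adjust them to match your setup. The faster option is tried first, with the full-precision fallback left commented out.

```shell
# Sketch for webui-user.sh (assumed file name; adapt to your launcher).

# First attempt: keep fp16 and only add --upcast-sampling.
export COMMANDLINE_ARGS="--upcast-sampling"

# Fallback if NaN errors or crashes persist: run in full precision.
# This is slower, so enable it only when the option above does not help.
# export COMMANDLINE_ARGS="--precision full --no-half"
```

Only one of the two lines should be active at a time. If neither is needed for your card, leave COMMANDLINE_ARGS without these flags.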