XFormers is a library from Facebook Research that makes the attention function more efficient. Attention is used in many modern machine learning models, including Stable Diffusion. First we look at why XFormers is so effective, then we install it locally to get a 1.5x speed boost.
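As a rough illustration of where the savings can come from, here is a toy sketch in plain Python. It is not XFormers' actual GPU code, and the function names are made up for this example. One idea behind memory-efficient attention is to avoid holding every attention score at once: the keys are processed in chunks while a running softmax is kept, and the result matches the naive version.

```python
import math

def naive_attention(q, keys, values):
    """Attention for one query vector; all scores are held in memory at once."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def chunked_attention(q, keys, values, chunk=2):
    """Same result, but only `chunk` scores exist at any one time."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), chunk):
        ks = keys[start:start + chunk]
        vs = values[start:start + chunk]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in ks]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated so far to the new maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, vs):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]
```

The chunk size here is arbitrary; real implementations pick it to fit fast GPU memory.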
Discord: https://discord.gg/s8rVscu2pM
Profiling Outcomes: https://docs.google.com/spreadsheets/d/1uknU0S0czTj2NqgI0qPEWSENhoh4oTiaseBrh6AMzBI/edit?usp=sharing
00:00 – Summary
00:45 – XFormer Explanation
09:32 – Initial Benchmark
11:16 – Install Webui
14:17 – Install Python
16:42 – Install CUDA
18:26 – Verify Webui Installation
22:22 – Install Visual Studio
23:48 – Install XFormers
32:10 – Results
——– Links ——–
Article I stole (B I G thanks to Matthieu Toulemont): https://www.photoroom.com/tech/stable-diffusion-100-percent-faster-with-memory-efficient-attention/
Github Guide: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Xformers
Reddit Guide: https://www.reddit.com/r/StableDiffusion/comments/xz26lq/automatic1111_xformers_cross_attention_with_on/
Rentry guide: https://rentry.org/sdg_faq#xformers-increase-your-its-more-cards-supported
Xformers Github: https://github.com/facebookresearch/xformers
Webui repo: https://github.com/AUTOMATIC1111/stable-diffusion-webui
Download Visual Studio Installer: https://visualstudio.microsoft.com/downloads/
Download CUDA 11.3: https://developer.nvidia.com/cuda-11.3.0-download-archive
Download Python 3.10: https://www.python.org/downloads/release/python-3100/
Git for Windows: https://git-scm.com/downloads
——– Music ——–
This Video: https://www.youtube.com/watch?v=5mgYnwhM4eg&ab_channel=RelaxMusicMeditation
——– Thumbnail ——–
Images from (many thanks): https://twitter.com/StuffyAi/status/1579542236597215232
#stablediffusion #aiart #xformers #tutorials #techtutorials
I have an RTX 30xx. I just had to add "--xformers" in the bat and everything was installed automatically. You just need to edit "webui-user.bat" and set: COMMANDLINE_ARGS= --xformers. This trick only works with the 30xx series. Older GPUs require a different xformers installation and probably need all the steps you explained in this video.
Very Good info!
I read that the performance gain was marginal with a 10-series card (a 1080 Ti in my case), though on Windows 7 I cannot use Python 3.10 or a modern version of CUDA.
Desperately want to try this on my Steam Deck. It runs sooooooo much better with XFormers on my 3080 Ti, around 17 it/s now.
soooo I have 3.8 gb of vram
Another great video! 💙
I hope we'll be able to TRAIN our own models locally via webUI as well
Thanks to RAM usage dropping with every update… I'm currently stuck with a true dinosaur, a GTX 980 4GB, and using the good ol' WebUI (haven't tried Xformers yet), which somehow runs smoothly (just not very fast).
At the moment I'm training via Google Colab, but I just love processing locally when possible, even if it's not the fastest, as long as it works.
People have been getting this error:
C:\Users\tomas\AppData\Local\Temp\tmpxft_00001e78_00000000-7_fmha_block_dgrad_fp16_kernel_loop.sm80.cudafe1.cpp : fatal error C1083: Cannot open compiler generated file: '': Invalid argument
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\nvcc.exe' failed with exit code 4294967295
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip
This happens, super frustratingly, when the path to your webui install is too long. I solved it by moving everything directly into "C:/" and then renaming the directory to just "webui" to make it even shorter (I mention this in the tute, but it's a long tute, I get it) 🙃
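If you want to check an install location before building, a small sketch like this can help. It assumes the relevant limit is the classic Windows 260-character MAX_PATH (the comment above only says "too long"), and the nested file path in the usage lines is a hypothetical placeholder, not a real file from the build.

```python
import ntpath

# Assumption: the classic Win32 MAX_PATH of 260, which includes the
# terminating null, so 259 visible characters is the practical ceiling.
WINDOWS_MAX_PATH = 260

def path_length(root, relative_path):
    """Length of root joined with relative_path using Windows separators."""
    return len(ntpath.join(root, relative_path))

def is_too_long(root, relative_path, limit=WINDOWS_MAX_PATH - 1):
    """True if the joined path would exceed the assumed limit."""
    return path_length(root, relative_path) > limit

# Hypothetical deep file inside the build tree, for illustration only.
deep_file = r"repositories\xformers\build\some\deeply\nested\kernel_file.obj"
print(path_length(r"C:\webui", deep_file))
print(is_too_long(r"C:\webui", deep_file))
```

The shorter the root folder, the more room the build has for its own long file names, which is why moving to "C:/webui" helps.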
attention is all you need
CUDA 11.8 here and xformers working fine.
Is this only for Nvidia cards? Stable Diffusion doesn't work too well with AMD.
It's annoying to hear "works on 4GB" without any mention that it means Nvidia.
You might want to say at the beginning of the video that it works for Nvidia, with AMD coming sometime later.
You only need to edit webui-user.bat (and run that bat afterwards) with the following addition: set COMMANDLINE_ARGS= --xformers
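If you would rather script that edit than do it by hand, here is a minimal sketch. The helper name add_flag is made up for this example and the sample file contents are illustrative, so check them against your own webui-user.bat. It appends the flag to the existing COMMANDLINE_ARGS line and does nothing if the flag is already there.

```python
def add_flag(bat_text, flag="--xformers"):
    """Return bat_text with `flag` added to the `set COMMANDLINE_ARGS=` line."""
    out = []
    found = False
    for line in bat_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("set commandline_args="):
            found = True
            name, _, args = stripped.partition("=")
            if flag not in args.split():
                args = (args.strip() + " " + flag).strip()
            line = f"{name}={args}"
        out.append(line)
    if not found:
        # Appending at the end would land after the launch call, so refuse.
        raise ValueError("no 'set COMMANDLINE_ARGS=' line found")
    return "\n".join(out) + "\n"

# Illustrative contents; a real webui-user.bat may contain more lines.
sample = "@echo off\nset PYTHON=\nset COMMANDLINE_ARGS=\ncall webui.bat\n"
print(add_flag(sample))
```

To apply it, read the file, pass the text through add_flag, and write the result back (keep a backup copy first).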
The title of the video tricked me – I thought SD version 1.5 was available for local install, but it's "1.5x faster"
Worked fine for me with CUDA 11.7 and VS 2022, no build errors.
This is great, thanks a lot 😊
Did git pull delete your checkpoint? 🙄
Also I would advocate strongly to use a venv or conda environment whenever working on projects like this, it will make your life a lot easier in the long run! 😊
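For anyone who hasn't used one, creating a venv takes a few lines with Python's standard library. This is a minimal sketch; the folder name "my-env" and the helper name create_env are just examples.

```python
import venv
from pathlib import Path

def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its config file."""
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir) / "pyvenv.cfg"

# Example usage (creates a folder called "my-env" in the current directory):
# cfg = create_env("my-env")
# print(cfg.read_text())
```

Packages installed inside that environment stay separate from your system Python, so a broken build is easy to throw away and redo.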
⚠ UPDATE:
If you use a Pascal, Turing or Ampere card, you shouldn't need to build manually anymore. Uninstall your existing xformers and launch the repo with --xformers. A compatible wheel will be installed.
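The uninstall step can be sketched as below. The helper names are made up for this example, and it assumes you run it with the same Python interpreter the webui uses (for example the one inside its venv folder), otherwise pip will look in the wrong place.

```python
import importlib.util
import sys

def is_installed(module_name):
    """True if the module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None

def uninstall_command(package, python=sys.executable):
    """Build the pip command that removes `package` without prompting."""
    return [python, "-m", "pip", "uninstall", "-y", package]

# Example usage:
# import subprocess
# if is_installed("xformers"):
#     subprocess.run(uninstall_command("xformers"), check=True)
```

After that, launching with --xformers in COMMANDLINE_ARGS should let the webui fetch its own wheel, as described above.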
20 images / 79 seconds? Really dude?
Nice video,
is anyone able to get xformers to run on Apple M1/M2?
This is really helpful, I'm going to test it later and hopefully it means I can add my laptop to my intranet and get that to generate images, whilst I work on my main PC. My laptop has a very basic old Nvidia GPU, but this might make it possible to become an image generator for me.