## Abstract
The cause of the matter is as follows: I don't have an avatar for my xlog yet, so I thought of using Stable Diffusion to generate one. However, since I'm not very familiar with writing Stable Diffusion's Prompt and Negative Prompt, I wanted to use GPT-4 to generate these prompts for me. I assumed GPT-4 probably doesn't have much knowledge in this area, so I would have to construct a prompt that teaches GPT-4 how to write Stable Diffusion prompts.
After testing, though, I found that GPT-4 does know a little about Stable Diffusion prompts, just not much. The prompt GPT-4 gave me was:

```
Generate a 512x512 avatar with elements of technology, AI, and Touhou Project. Make sure it has clear lines and distinct color contrast.
```

The prompt I actually ended up using was:

```
masterpiece, 1 girl, cute face, white hair, red eyes, Generate a 512x512 avatar with elements of technology, AI, and Touhou Project. Make sure it has clear lines and distinct color contrast.
```
I also discovered a Stable Diffusion model-sharing platform, https://civitai.com/, which has many great-looking models. It was overwhelming; my brain couldn't process all the information. Platforms like Civitai really do gather a lot of relevant information, and there is plenty of erotic content too, which makes me very excited!! I really want to try it locally.
## Install and try stable-diffusion-webui and NovelAI's model
An error occurred while installing stable-diffusion-webui:
```
Python 3.11.3 (main, Apr 14 2023, 23:30:17) [Clang 15.0.7 ]
Commit hash: 22bcc7be428c94e9408f589966c2040187245d81
Installing torch and torchvision
Looking in indexes: https://mirrors.ustc.edu.cn/pypi/web/simple, https://download.pytorch.org/whl/cu117
Collecting torch==1.13.1+cu117
  Using cached https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp311-cp311-linux_x86_64.whl (1801.8 MB)
ERROR: Could not find a version that satisfies the requirement torchvision==0.14.1+cu117 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.15.0, 0.15.0+cu117, 0.15.1, 0.15.1+cu117)
ERROR: No matching distribution found for torchvision==0.14.1+cu117
```
After searching, I found the key information at https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/8093#discussioncomment-5111761: the pinned torchvision==0.14.1+cu117 simply has no wheel for this Python (note that no 0.14.x appears among the available versions in the error above), so the torch install command needs to be unpinned. After applying the following patch, installation got past this step:
```diff
:100644 100644 bfa53cb7 00000000 M	webui-user.sh

diff --git a/webui-user.sh b/webui-user.sh
index bfa53cb7..1f628616 100644
--- a/webui-user.sh
+++ b/webui-user.sh
@@ -12,6 +12,8 @@
 # Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
 #export COMMANDLINE_ARGS=""
+export COMMANDLINE_ARGS="--ckpt nai.ckpt"
+
 # python3 executable
 #python_cmd="python3"

@@ -26,6 +28,7 @@
 # install command for torch
 #export TORCH_COMMAND="pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113"
+export TORCH_COMMAND="pip install torch torchvision"

 # Requirements file to use for stable-diffusion-webui
 #export REQS_FILE="requirements_versions.txt"
```
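As a quick sanity check once the install gets past this step (my own habit, not part of the linked fix), you can confirm which torch/torchvision pair the unpinned `TORCH_COMMAND` actually installed, and that CUDA is visible:

```bash
# Run inside the stable-diffusion-webui directory, using the webui's own venv.
./venv/bin/python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
```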
Then I encountered another error:
```
stderr: Running command git clone --filter=blob:none --quiet https://github.com/TencentARC/GFPGAN.git /tmp/pip-req-build-z10jlql1
Running command git rev-parse -q --verify 'sha^8d2447a2d918f8eba5a4a01463fd48e45126a379'
Running command git fetch -q https://github.com/TencentARC/GFPGAN.git 8d2447a2d918f8eba5a4a01463fd48e45126a379
Running command git checkout -q 8d2447a2d918f8eba5a4a01463fd48e45126a379
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "/tmp/pip-install-or16ihw5/numba_9ca0f34767be492594e54fc49a1246fa/setup.py", line 51, in <module>
        _guard_py_ver()
      File "/tmp/pip-install-or16ihw5/numba_9ca0f34767be492594e54fc49a1246fa/setup.py", line 48, in _guard_py_ver
        raise RuntimeError(msg.format(cur_py, min_py, max_py))
    RuntimeError: Cannot install on Python version 3.11.3; only versions >=3.7,<3.11 are supported.
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.
```
It clearly says that only Python >=3.7,<3.11 is supported (the complaint comes from numba's setup.py). I have to say, I've already installed quite a few Python versions:
```
λ pyenv versions
  system
  2.7.18
  3.8.16
  3.9.16
  3.10.11
* 3.11.3 (set by /home/aya/.pyenv/version)
```
I switched the Python version to 3.10 with pyenv, deleted the `venv` folder under the stable-diffusion-webui directory, and re-ran `./webui.sh`. Finally, it passed!
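For reference, a minimal sketch of that sequence, assuming pyenv already manages the 3.10.11 shown above:

```bash
cd stable-diffusion-webui
pyenv local 3.10.11   # pin Python 3.10 for this directory only
rm -rf venv           # drop the venv that was created under Python 3.11
./webui.sh            # recreates the venv and reinstalls everything
```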
Using an RTX 3070 with 8 GB, it seems to take a few seconds to generate an image. Sadly, the VRAM is on the brink.
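If the VRAM pressure gets worse, one knob to try (untested by me, but it is literally the example given in `webui-user.sh`'s own comment) is the low-VRAM flags, combined with the `--ckpt` argument from the patch above:

```bash
# In webui-user.sh — the same example the file's comment suggests:
# --medvram trades speed for lower VRAM use, and --opt-split-attention
# computes attention in smaller chunks.
export COMMANDLINE_ARGS="--ckpt nai.ckpt --medvram --opt-split-attention"
```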
I found that even with a copied prompt I couldn't generate images of the same quality. The reason can be learned from "The definitive Stable Diffusion experience ™ → Testing accuracy" and https://imgur.com/a/DCYJCSX:
> In the Settings tab, change `Ignore last layers of CLIP model` to 2 and apply
This step stumped me: I couldn't find this setting after searching for a long time. It's likely that the new version renamed it, or that it was simply moved. Through search engines I found this post, https://www.reddit.com/r/StableDiffusion/comments/zekch3/stop_at_last_layers_of_clip_model_not_showing_up/: in the current version the setting is called `Clip skip` and lives under the Stable Diffusion section of the Settings tab.
After changing it to 2, I was able to generate an almost identical Hello Asuka (python: 3.10.11 • torch: 2.0.0+cu117 • xformers: N/A • gradio: 3.23.0 • commit: 22bcc7be • checkpoint: 89d59c3dde).
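If you'd rather check or set this outside the UI, as far as I can tell (this is my assumption from poking at the files, not something the guide says) the value is persisted in the webui's `config.json`:

```
// excerpt from config.json in the stable-diffusion-webui root;
// the key name is from my recollection and may differ across versions
"CLIP_stop_at_last_layers": 2
```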
## What are `.ckpt`, `.pt`, and `.safetensors`? What is their relationship?
https://rentry.org/voldy#-ti-hypernetworks-and-loras- has explanations and usage methods. In short: `.ckpt` and `.pt` are ordinary pickled PyTorch checkpoints, which can execute arbitrary code when loaded, while `.safetensors` is a newer serialization format that stores the same weights but cannot run code on load; the same model, embedding, or LoRA can be distributed in either format.

> LoRAs (Low Rank Adaptation): Advanced networks that can be easily trained. Comes in `.safetensor` file format.
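As an aside, in current stable-diffusion-webui versions a LoRA file dropped into `models/Lora` is activated from the prompt itself with the `<lora:...>` syntax (the file name and weight below are placeholders, not a recommendation):

```
masterpiece, 1girl, <lora:someLoraFile:0.8>
```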
## LoRA Trial
At first I saw that a very nice colored image used the model AbyssOrangeMix2 - Hardcore, but I found there's an AbyssOrangeMix3 at huggingface/AbyssOrangeMix3, so of course I had to try 3!
Abyss -> "Abyss" is a good name!

From the model card:

> The main model, "AOM3 (AbyssOrangeMix3)", is a purely upgraded model that improves on the problems of the previous version, "AOM2". "AOM3" can generate illustrations with very realistic textures and can generate a wide variety of content. There are also three variant models based on the AOM3 that have been adjusted to a unique illustration style. These models will help you to express your ideas more clearly.
Okay, I realized I made a mistake. I thought all `.safetensors` files were LoRAs, but that's actually not the case. AOM3 is a full model (a checkpoint): it goes directly into `models/Stable-diffusion/`, and then you use it by selecting it as the checkpoint.
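Roughly what that looks like on disk; the file name here is a stand-in for whatever you actually downloaded from the Hugging Face repo:

```bash
# Full models (checkpoints) go here, not into models/Lora.
mv ~/Downloads/AOM3.safetensors stable-diffusion-webui/models/Stable-diffusion/
# Then select it from the "Stable Diffusion checkpoint" dropdown in the UI
# (use the refresh button next to the dropdown if it doesn't show up).
```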
On https://civitai.com/, each resource has a Details/Type field that indicates what kind of resource it is, and how to use each type is documented in How-to-use-models.
However, I found that although the same prompt could generate similar images, they were vastly different from the provided samples. (Could it be because I swapped in the AOM3 model on my own initiative?) Sure enough: regenerating with AbyssOrangeMix2_hard produced output almost identical to the provided samples.
END of generating the avatar. I almost forgot my original purpose. I feel that GPT-4 doesn't really understand Stable Diffusion prompts, possibly due to a lack of training data. If I have time, I'll take a look at TaskMatrix (did VisualGPT change its name?) to see how it works.
The avatar (I didn't spend much effort on the prompt):

```
masterpiece, Touhou Project, 1girl, sex, Reisen
Negative prompt: (worst quality:1.25), (low quality:1.25), (lowres:1.1), (monochrome:1.1), (greyscale), multiple views, comic, sketch, (blurry:1.05),
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2320689785, Size: 512x512, Model hash: dce28483f1, Model: OrangeMixs_Models_AbyssOrangeMix2_Pruned_AbyssOrangeMix2_nsfw_pruned_fp16_with_VAE
```
## Serve with Cloudflare Argo Tunnel
When I wanted to access my stable-diffusion-webui through a domain name, I thought of using Cloudflare Argo Tunnel, but I ran into problems there too. After I had successfully deployed the tunnel and could fetch the HTML and other resources through the domain name, the page still wouldn't load. In the browser's developer tools I could see that all resources loaded normally, except for some console log messages reminding me that things weren't working properly.
To be supplemented (in essence: the `<link>` elements for preloaded resources did not set the `as` attribute).
But actually, SD-webui has an extension, sd-webui-tunnels, and you can also just run `cloudflared tunnel --url http://127.0.0.1:7860` to open a quick tunnel and get a temporary domain name. Both methods could reach stable-diffusion-webui normally, so why didn't my manually set tunnel work?
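For context, the manually set tunnel was a named one, configured roughly like this (the tunnel ID and hostname below are placeholders, not my real values):

```yaml
# ~/.cloudflared/config.yml — sketch of a named-tunnel config
tunnel: <TUNNEL-UUID>
credentials-file: /home/aya/.cloudflared/<TUNNEL-UUID>.json
ingress:
  - hostname: sd.example.com        # placeholder hostname
    service: http://127.0.0.1:7860  # the webui's default address
  - service: http_status:404        # catch-all rule cloudflared requires
```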
From that, you can already tell the issue was probably with my domain or my Cloudflare settings. Damn, I actually wanted to write out the whole thought process here, because I'd like to leave behind the problem-solving ideas and not just the result, but it doesn't seem to work out.
Comparing the resources served through my domain with those served through the temporary domain from `cloudflared tunnel --url http://127.0.0.1:7860`, I found some differences.
With my domain:

- The HTML has some resources injected by Cloudflare, including `https://static.cloudflareinsights.com/beacon.min.js` and Rocket Loader™.
- At the same time, all my JS and CSS `<script>` tags are commented out, and a duplicate resource is inserted below each one, with the `<script>` changed to `<link preload as=?>`.
- Looking further at the original HTML under Developer Tools/Network, I found that the `as` attribute is missing.
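To illustrate what that rewrite looked like (a reconstruction from memory matching the observations above, not a byte-for-byte copy of my page; the file name is a placeholder):

```html
<!-- what gradio originally serves -->
<script src="/assets/index-abc123.js"></script>

<!-- what arrives through my domain after the rewrite -->
<!-- <script src="/assets/index-abc123.js"></script> -->
<link rel="preload" href="/assets/index-abc123.js">  <!-- note: no as="script" -->
```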
It can be concluded that this was most likely Rocket Loader™'s fault. I went to turn off Rocket Loader™ under `<domain>/speed/optimization/`, and found the rewriting was still there. Then, under `/rules/configuration-rules`, I found some page rules I had set earlier, including one that enabled Rocket Loader™; after turning that off, everything returned to normal.
## Add upscaler
Additional upscalers can be added by dropping their model files into `models/ESRGAN` (a sketch follows the links):

- https://www.reddit.com/r/StableDiffusion/comments/zih4bz/is_there_a_way_to_add_an_extra_upscaler_to/
- https://upscale.wiki/wiki/Model_Database#ESRGAN_.28.22old_Architecture.22.29_Models
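A minimal sketch, assuming an ESRGAN-architecture `.pth` file downloaded from the model database above (the file name is a placeholder):

```bash
# Extra ESRGAN upscalers live under models/ESRGAN; the webui lists them
# in its upscaler dropdowns after a restart.
mv ~/Downloads/4x_SomeUpscaler.pth stable-diffusion-webui/models/ESRGAN/
./webui.sh   # restart so the new upscaler is picked up
```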