## Abstract
The cause of the matter is as follows: I don't have an avatar for my xlog yet, so I thought of using Stable Diffusion to generate one. However, since I'm not very familiar with writing Stable Diffusion's Prompt and Negative Prompt, I wanted to use GPT-4 to generate these prompts for me. I assumed GPT-4 probably doesn't have much knowledge in this area, so I would have to construct a prompt that teaches GPT-4 how to write Stable Diffusion prompts.
After testing, though, I found that GPT-4 does know a little about Stable Diffusion prompts, just not much. The prompt GPT-4 gave me was:

```
Generate a 512x512 avatar with elements of technology, AI, and Touhou Project. Make sure it has clear lines and distinct color contrast.
```

The prompt I actually ended up using was:

```
masterpiece, 1 girl, cute face, white hair, red eyes, Generate a 512x512 avatar with elements of technology, AI, and Touhou Project. Make sure it has clear lines and distinct color contrast.
```
I also discovered a Stable Diffusion model-sharing platform, https://civitai.com/, which has many great-looking models. It was overwhelming; my brain couldn't process all the information. Platforms like Civitai really do gather a lot of relevant information, and there is plenty of erotic content too, which makes me very excited!! I really want to try it locally.
## Install and try stable-diffusion-webui and NovelAI's model
An error occurred while installing stable-diffusion-webui:
```
Python 3.11.3 (main, Apr 14 2023, 23:30:17) [Clang 15.0.7 ]
Commit hash: 22bcc7be428c94e9408f589966c2040187245d81
Installing torch and torchvision
Looking in indexes: https://mirrors.ustc.edu.cn/pypi/web/simple, https://download.pytorch.org/whl/cu117
Collecting torch==1.13.1+cu117
  Using cached https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp311-cp311-linux_x86_64.whl (1801.8 MB)
ERROR: Could not find a version that satisfies the requirement torchvision==0.14.1+cu117 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.15.0, 0.15.0+cu117, 0.15.1, 0.15.1+cu117)
ERROR: No matching distribution found for torchvision==0.14.1+cu117
```
After searching, I found the key information at https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/8093#discussioncomment-5111761: the pinned torchvision==0.14.1+cu117 simply has no wheel for this Python (note that no 0.14.x appears among the available versions in the error above), so the torch install command needs to be unpinned. After applying the following patch, installation got past this step:
```diff
:100644 100644 bfa53cb7 00000000 M	webui-user.sh

diff --git a/webui-user.sh b/webui-user.sh
index bfa53cb7..1f628616 100644
--- a/webui-user.sh
+++ b/webui-user.sh
@@ -12,6 +12,8 @@
 # Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
 #export COMMANDLINE_ARGS=""
+export COMMANDLINE_ARGS="--ckpt nai.ckpt"
+
 # python3 executable
 #python_cmd="python3"

@@ -26,6 +28,7 @@
 # install command for torch
 #export TORCH_COMMAND="pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113"
+export TORCH_COMMAND="pip install torch torchvision"

 # Requirements file to use for stable-diffusion-webui
 #export REQS_FILE="requirements_versions.txt"
```
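As a quick sanity check once the install gets past this step (my own habit, not part of the linked fix), you can confirm which torch/torchvision pair the unpinned `TORCH_COMMAND` actually installed, and that CUDA is visible:

```bash
# Run inside the stable-diffusion-webui directory, using the webui's own venv.
./venv/bin/python -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
```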
Then I encountered another error:
```
stderr: Running command git clone --filter=blob:none --quiet https://github.com/TencentARC/GFPGAN.git /tmp/pip-req-build-z10jlql1
Running command git rev-parse -q --verify 'sha^8d2447a2d918f8eba5a4a01463fd48e45126a379'
Running command git fetch -q https://github.com/TencentARC/GFPGAN.git 8d2447a2d918f8eba5a4a01463fd48e45126a379
Running command git checkout -q 8d2447a2d918f8eba5a4a01463fd48e45126a379
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "/tmp/pip-install-or16ihw5/numba_9ca0f34767be492594e54fc49a1246fa/setup.py", line 51, in <module>
        _guard_py_ver()
      File "/tmp/pip-install-or16ihw5/numba_9ca0f34767be492594e54fc49a1246fa/setup.py", line 48, in _guard_py_ver
        raise RuntimeError(msg.format(cur_py, min_py, max_py))
    RuntimeError: Cannot install on Python version 3.11.3; only versions >=3.7,<3.11 are supported.
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.
```
It clearly says that only Python >=3.7,<3.11 is supported (the complaint comes from numba's setup.py). I have to say, I've already installed quite a few Python versions:
```
λ pyenv versions
  system
  2.7.18
  3.8.16
  3.9.16
  3.10.11
* 3.11.3 (set by /home/aya/.pyenv/version)
```
I switched the Python version to 3.10 with pyenv, deleted the `venv` folder under the stable-diffusion-webui directory, and re-ran `./webui.sh`. Finally, it passed!
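For reference, a minimal sketch of that sequence, assuming pyenv already manages the 3.10.11 shown above:

```bash
cd stable-diffusion-webui
pyenv local 3.10.11   # pin Python 3.10 for this directory only
rm -rf venv           # drop the venv that was created under Python 3.11
./webui.sh            # recreates the venv and reinstalls everything
```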
Using an RTX 3070 with 8 GB, it seems to take a few seconds to generate an image. Sadly, the VRAM is on the brink.
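If the VRAM pressure gets worse, one knob to try (untested by me, but it is literally the example given in `webui-user.sh`'s own comment) is the low-VRAM flags, combined with the `--ckpt` argument from the patch above:

```bash
# In webui-user.sh — the same example the file's comment suggests:
# --medvram trades speed for lower VRAM use, and --opt-split-attention
# computes attention in smaller chunks.
export COMMANDLINE_ARGS="--ckpt nai.ckpt --medvram --opt-split-attention"
```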
I found that even with a copied prompt I couldn't generate images of the same quality. The reason can be learned from "The definitive Stable Diffusion experience ™ → Testing accuracy" and https://imgur.com/a/DCYJCSX:
> In the Settings tab, change `Ignore last layers of CLIP model` to 2 and apply
This step stumped me: I couldn't find this setting after searching for a long time. It's likely that the new version renamed it, or that it was simply moved. Through search engines I found this post, https://www.reddit.com/r/StableDiffusion/comments/zekch3/stop_at_last_layers_of_clip_model_not_showing_up/: in the current version the setting is called `Clip skip` and lives under the Stable Diffusion section of the Settings tab.
After changing it to 2, I was able to generate an almost identical Hello Asuka (python: 3.10.11 • torch: 2.0.0+cu117 • xformers: N/A • gradio: 3.23.0 • commit: 22bcc7be • checkpoint: 89d59c3dde).
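If you'd rather check or set this outside the UI, as far as I can tell (this is my assumption from poking at the files, not something the guide says) the value is persisted in the webui's `config.json`:

```
// excerpt from config.json in the stable-diffusion-webui root;
// the key name is from my recollection and may differ across versions
"CLIP_stop_at_last_layers": 2
```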
## What are `.ckpt`, `.pt`, and `.safetensors`? What is their relationship?
https://rentry.org/voldy#-ti-hypernetworks-and-loras- has explanations and usage methods. In short: `.ckpt` and `.pt` are ordinary pickled PyTorch checkpoints, which can execute arbitrary code when loaded, while `.safetensors` is a newer serialization format that stores the same weights but cannot run code on load; the same model, embedding, or LoRA can be distributed in either format.

> LoRAs (Low Rank Adaptation): Advanced networks that can be easily trained. Comes in `.safetensor` file format.
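As an aside, in current stable-diffusion-webui versions a LoRA file dropped into `models/Lora` is activated from the prompt itself with the `<lora:...>` syntax (the file name and weight below are placeholders, not a recommendation):

```
masterpiece, 1girl, <lora:someLoraFile:0.8>
```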
## LoRA Trial
At first I saw that a very nice colored image used the model AbyssOrangeMix2 - Hardcore, but I found there's an AbyssOrangeMix3 at huggingface/AbyssOrangeMix3, so of course I had to try 3!
Abyss -> "Abyss" is a good name!

From the model card:

> The main model, "AOM3 (AbyssOrangeMix3)", is a purely upgraded model that improves on the problems of the previous version, "AOM2". "AOM3" can generate illustrations with very realistic textures and can generate a wide variety of content. There are also three variant models based on the AOM3 that have been adjusted to a unique illustration style. These models will help you to express your ideas more clearly.
Okay, I realized I made a mistake. I thought all `.safetensors` files were LoRAs, but that's actually not the case. AOM3 is a full model (a checkpoint): it goes directly into `models/Stable-diffusion/`, and then you use it by selecting it as the checkpoint.
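Roughly what that looks like on disk; the file name here is a stand-in for whatever you actually downloaded from the Hugging Face repo:

```bash
# Full models (checkpoints) go here, not into models/Lora.
mv ~/Downloads/AOM3.safetensors stable-diffusion-webui/models/Stable-diffusion/
# Then select it from the "Stable Diffusion checkpoint" dropdown in the UI
# (use the refresh button next to the dropdown if it doesn't show up).
```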
On https://civitai.com/, each resource has a Details/Type field that indicates what kind of resource it is, and how to use each type is documented in How-to-use-models.
However, I found that although the same prompt could generate similar images, they were vastly different from the provided samples. (Could it be because I swapped in the AOM3 model on my own initiative?) Sure enough: regenerating with AbyssOrangeMix2_hard produced output almost identical to the provided samples.
END of generating the avatar. I almost forgot my original purpose. I feel that GPT-4 doesn't really understand Stable Diffusion prompts, possibly due to a lack of training data. If I have time, I'll take a look at TaskMatrix (did VisualGPT change its name?) to see how it works.
The avatar (I didn't spend much effort on the prompt):

```
masterpiece, Touhou Project, 1girl, sex, Reisen
Negative prompt: (worst quality:1.25), (low quality:1.25), (lowres:1.1), (monochrome:1.1), (greyscale), multiple views, comic, sketch, (blurry:1.05),
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2320689785, Size: 512x512, Model hash: dce28483f1, Model: OrangeMixs_Models_AbyssOrangeMix2_Pruned_AbyssOrangeMix2_nsfw_pruned_fp16_with_VAE
```
## Serve with Cloudflare Argo Tunnel
When I wanted to access my stable-diffusion-webui through a domain name, I thought of using Cloudflare Argo Tunnel, but I ran into problems there too. After I had successfully deployed the tunnel and could fetch the HTML and other resources through the domain name, the page still wouldn't load. In the browser's developer tools I could see that all resources loaded normally, except for some console log messages reminding me that things weren't working properly.
To be supplemented (in essence: the `<link>` elements for preloaded resources did not set the `as` attribute).
But actually, SD-webui has an extension, sd-webui-tunnels, and you can also just run `cloudflared tunnel --url http://127.0.0.1:7860` to open a quick tunnel and get a temporary domain name. Both methods could reach stable-diffusion-webui normally, so why didn't my manually set tunnel work?
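For context, the manually set tunnel was a named one, configured roughly like this (the tunnel ID and hostname below are placeholders, not my real values):

```yaml
# ~/.cloudflared/config.yml — sketch of a named-tunnel config
tunnel: <TUNNEL-UUID>
credentials-file: /home/aya/.cloudflared/<TUNNEL-UUID>.json
ingress:
  - hostname: sd.example.com        # placeholder hostname
    service: http://127.0.0.1:7860  # the webui's default address
  - service: http_status:404        # catch-all rule cloudflared requires
```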
From that, you can already tell the issue was probably with my domain or my Cloudflare settings. Damn, I actually wanted to write out the whole thought process here, because I'd like to leave behind the problem-solving ideas and not just the result, but it doesn't seem to work out.
Comparing the resources served through my domain with those served through the temporary domain from `cloudflared tunnel --url http://127.0.0.1:7860`, I found some differences.
With my domain:

- The HTML has some resources injected by Cloudflare, including `https://static.cloudflareinsights.com/beacon.min.js` and Rocket Loader™.
- At the same time, all my JS and CSS `<script>` tags are commented out, and a duplicate resource is inserted below each one, with the `<script>` changed to `<link preload as=?>`.
- Looking further at the original HTML under Developer Tools/Network, I found that the `as` attribute is missing.
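To illustrate what that rewrite looked like (a reconstruction from memory matching the observations above, not a byte-for-byte copy of my page; the file name is a placeholder):

```html
<!-- what gradio originally serves -->
<script src="/assets/index-abc123.js"></script>

<!-- what arrives through my domain after the rewrite -->
<!-- <script src="/assets/index-abc123.js"></script> -->
<link rel="preload" href="/assets/index-abc123.js">  <!-- note: no as="script" -->
```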
It can be concluded that this was most likely Rocket Loader™'s fault. I went to turn off Rocket Loader™ under `<domain>/speed/optimization/`, and found the rewriting was still there. Then, under `/rules/configuration-rules`, I found some page rules I had set earlier, including one that enabled Rocket Loader™; after turning that off, everything returned to normal.
## Add upscaler
Additional upscalers can be added by dropping their model files into `models/ESRGAN` (a sketch follows the links):

- https://www.reddit.com/r/StableDiffusion/comments/zih4bz/is_there_a_way_to_add_an_extra_upscaler_to/
- https://upscale.wiki/wiki/Model_Database#ESRGAN_.28.22old_Architecture.22.29_Models
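A minimal sketch, assuming an ESRGAN-architecture `.pth` file downloaded from the model database above (the file name is a placeholder):

```bash
# Extra ESRGAN upscalers live under models/ESRGAN; the webui lists them
# in its upscaler dropdowns after a restart.
mv ~/Downloads/4x_SomeUpscaler.pth stable-diffusion-webui/models/ESRGAN/
./webui.sh   # restart so the new upscaler is picked up
```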