Draw my avatar using AI

Abstract#

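It started like this: my xlog did not have an avatar yet, so I thought of generating one with Stable Diffusion. Since I am not good at writing Stable Diffusion Prompts and Negative Prompts, I wanted to use GPT4 to generate them. ~~I figured GPT4 probably had no knowledge of this, so I would have to construct a Prompt that teaches GPT4 how to write Stable Diffusion Prompts.~~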

After testing, though, I found that GPT4 does know a little about Stable Diffusion Prompts, but not much.

The Prompt GPT4 gave me =
Generate a 512x512 avatar with elements of technology, AI, and Touhou Project. Make sure it has clear lines and distinct color contrast.

The Prompt I actually used =
masterpiece, 1 girl, cute face, white hair, red eyes, Generate a 512x512 avatar with elements of technology, AI, and Touhou Project. Make sure it has clear lines and distinct color contrast.

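I found a Stable Diffusion platform, https://civitai.com/, with lots of great-looking models. I got stuck here: there was so much information that my brain could not process it all. A platform like civitai really does gather a huge amount of relevant information, and there is so much NSFW content too. I am so tempted!! I really want to try it locally.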

Installing and trying stable-diffusion-webui AND NovelAI's model#

Installing stable-diffusion-webui failed with an error:

Python 3.11.3 (main, Apr 14 2023, 23:30:17) [Clang 15.0.7 ]
Commit hash: 22bcc7be428c94e9408f589966c2040187245d81
Installing torch and torchvision
Looking in indexes: https://mirrors.ustc.edu.cn/pypi/web/simple, https://download.pytorch.org/whl/cu117
Collecting torch==1.13.1+cu117
  Using cached https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp311-cp311-linux_x86_64.whl (1801.8 MB)
ERROR: Could not find a version that satisfies the requirement torchvision==0.14.1+cu117 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.15.0, 0.15.0+cu117, 0.15.1, 0.15.1+cu117)
ERROR: No matching distribution found for torchvision==0.14.1+cu117

After some searching I found the key information: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/8093#discussioncomment-5111761

After applying the following patch, it PASSed:

:100644 100644 bfa53cb7 00000000 M	webui-user.sh

diff --git a/webui-user.sh b/webui-user.sh
index bfa53cb7..1f628616 100644
--- a/webui-user.sh
+++ b/webui-user.sh
@@ -12,6 +12,8 @@
 # Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
 #export COMMANDLINE_ARGS=""
 
+export COMMANDLINE_ARGS="--ckpt nai.ckpt"
+
 # python3 executable
 #python_cmd="python3"
 
@@ -26,6 +28,7 @@
 
 # install command for torch
 #export TORCH_COMMAND="pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113"
+export TORCH_COMMAND="pip install torch torchvision"
 
 # Requirements file to use for stable-diffusion-webui
 #export REQS_FILE="requirements_versions.txt"

Then I hit another error:

stderr:   Running command git clone --filter=blob:none --quiet https://github.com/TencentARC/GFPGAN.git /tmp/pip-req-build-z10jlql1
  Running command git rev-parse -q --verify 'sha^8d2447a2d918f8eba5a4a01463fd48e45126a379'
  Running command git fetch -q https://github.com/TencentARC/GFPGAN.git 8d2447a2d918f8eba5a4a01463fd48e45126a379
  Running command git checkout -q 8d2447a2d918f8eba5a4a01463fd48e45126a379
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-or16ihw5/numba_9ca0f34767be492594e54fc49a1246fa/setup.py", line 51, in <module>
          _guard_py_ver()
        File "/tmp/pip-install-or16ihw5/numba_9ca0f34767be492594e54fc49a1246fa/setup.py", line 48, in _guard_py_ver
          raise RuntimeError(msg.format(cur_py, min_py, max_py))
      RuntimeError: Cannot install on Python version 3.11.3; only versions >=3.7,<3.11 are supported.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

Clearly it is telling me that only Python (>=3.7,<3.11) is supported. I have to say, I already have this many Python versions installed.

λ pyenv versions
  system
  2.7.18
  3.8.16
  3.9.16
  3.10.11
* 3.11.3 (set by /home/aya/.pyenv/version)
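
The constraint from the error message can be checked against the pyenv list above. This is a minimal sketch, not part of the webui installer: the helper names `parse_version` and `supported` are made up for illustration, and the version list is copied by hand from the `pyenv versions` output (without the `system` entry).

```python
def parse_version(text):
    """Turn '3.10.11' into (3, 10, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def supported(version, minimum=(3, 7), maximum=(3, 11)):
    """Apply the '>=3.7,<3.11' rule from the numba error to major.minor."""
    return minimum <= version[:2] < maximum


# Copied from the pyenv output above.
installed = ["2.7.18", "3.8.16", "3.9.16", "3.10.11", "3.11.3"]

candidates = [v for v in installed if supported(parse_version(v))]
best = max(candidates, key=parse_version)

print(candidates)
print(best)
```

Only the 3.8, 3.9 and 3.10 installs pass the check, and the newest of them is the natural one to switch to.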

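I used pyenv to switch the Python version to 3.10, deleted the venv folder under the stable-diffusion-webui directory, and ran ./webui.sh again. Finally, it PASSed!

~~With an RTX3070 8G, it takes roughly ten-odd seconds per image.~~

Sad: the VRAM is hanging by a thread.

I found that even a copied Prompt would not produce images of the same quality. From The definitive Stable Diffusion experience ™#Testing accuracy and https://imgur.com/a/DCYJCSX I learned the following: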

- In the Settings tab, change Ignore last layers of CLIP model to 2 and apply

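I had not done this step, but after looking for ages I could not find the setting. I figured that its name had most likely changed in newer versions, or that it had been removed altogether. A search engine led me to this post: https://www.reddit.com/r/StableDiffusion/comments/zekch3/stop_at_last_layers_of_clip_model_not_showing_up/. In the current version the setting is called Clip skip and sits under the Stable Diffusion section of the Settings tab.

After changing it to 2, I could generate an almost identical Hello Asuka (python: 3.10.11 • torch: 2.0.0+cu117 • xformers: N/A • gradio: 3.23.0 • commit: 22bcc7be • checkpoint: 89d59c3dde)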
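
The same setting can probably be changed without clicking through the UI. The sketch below only builds the HTTP request and does not send it. It rests on assumptions that are not in the original post: that the webui was started with the `--api` flag, that the options endpoint is `/sdapi/v1/options`, and that the option key behind Clip skip is `CLIP_stop_at_last_layers`. Check the `/docs` page of your own instance before relying on it.

```python
import json
import urllib.request

# Assumption: the webui runs locally on its default address with --api enabled.
BASE_URL = "http://127.0.0.1:7860"


def build_clip_skip_request(clip_skip, base_url=BASE_URL):
    """Build (but do not send) a POST request that sets the Clip skip option."""
    # Assumption: internal option key for the "Clip skip" setting.
    payload = {"CLIP_stop_at_last_layers": clip_skip}
    return urllib.request.Request(
        url=f"{base_url}/sdapi/v1/options",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_clip_skip_request(2)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))

# To apply it against a running instance, uncomment:
# urllib.request.urlopen(request)
```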

What are (`.ckpt`, `.pt`) and `.safetensors`, and how are they related?#

https://rentry.org/voldy#-ti-hypernetworks-and-loras- has an explanation and usage instructions.

LoRAs (Low Rank Adaptation): Advanced networks that can be easily trained. Comes in .safetensor file format.
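
As a rough illustration of the difference: `.ckpt` and `.pt` files are pickled PyTorch checkpoints, while a `.safetensors` file is an 8-byte little-endian header length, a JSON header describing each tensor, and then the raw tensor bytes. The sketch below uses only the standard library. It hand-builds a tiny demo file in that layout (the tensor name `demo.weight` is made up) and reads the header back without touching the tensor data. It does not replace the real `safetensors` library.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def write_demo_file(path):
    """Hand-build a tiny file in the same layout: one float32 tensor of shape [2]."""
    data = struct.pack("<2f", 1.0, 2.0)
    header = {
        "demo.weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(data)


demo_path = Path(tempfile.mkdtemp()) / "demo.safetensors"
write_demo_file(demo_path)
demo_header = read_safetensors_header(demo_path)
print(demo_header)
```

Pointing `read_safetensors_header` at a downloaded file lists its tensor names and shapes, which can help tell a full checkpoint from a much smaller LoRA.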

Trying LoRA#

The nice NSFW images I first saw used the model AbyssOrangeMix2 - Hardcore, but I noticed that they have an AbyssOrangeMix3 at huggingface/AbyssOrangeMix3, so of course I tried 3!

~~Abyss -> "深渊" (the Chinese word for abyss). Good name!~~

The main model, "AOM3 (AbyssOrangeMix3)", is a purely upgraded model that improves on the problems of the previous version, "AOM2". "AOM3" can generate illustrations with very realistic textures and can generate a wide variety of content. There are also three variant models based on the AOM3 that have been adjusted to a unique illustration style. These models will help you to express your ideas more clearly.

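~~OK, I realized I got it wrong. I thought every .safetensor file was a LoRA, but that is not the case. AOM3 is a Model: just put it in models/Stable-diffusion/ and select it as the checkpoint.~~

Every resource on https://civitai.com/ has a Detail/Type field that states the resource type; for how to use each type, see How-to-use-models.

But then I found that although the same Prompt generates similar images, they are very different from the sample images.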

00003-2320689785

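(Could it be because I took it upon myself to use the AOM3 model?) Sure enough: this is the output with AbyssOrangeMix2_hard, and it is almost identical to the provided samples.

END: generating the avatar. ~~I almost forgot my original goal.~~ I feel that GPT4 does not really understand Prompts, perhaps because they are missing from its training data. When I have time, I could look at how ~~VisualGPT was renamed?~~ TaskMatrix does it.

The avatar (I did not put much effort into the Prompt, though):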

00008-2320689785

masterpiece, Touhou Project, 1girl, sex, Reisen

Negative prompt: (worst quality:1.25), (low quality:1.25), (lowres:1.1), (monochrome:1.1), (greyscale), multiple views, comic, sketch, (blurry:1.05),
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2320689785, Size: 512x512, Model hash: dce28483f1, Model: OrangeMixs_Models_AbyssOrangeMix2_Pruned_AbyssOrangeMix2_nsfw_pruned_fp16_with_VAE
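
The parameter line and the weighted tags above follow a regular enough pattern to pull apart with a few lines of code. This is only a sketch: `parse_params` and `explicit_weights` are illustrative helpers, not the webui's own parser, and the comma splitting assumes that no value contains ", " (true for this line, not guaranteed in general). The Model entry is left out of the sample for brevity.

```python
import re

params_line = (
    "Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2320689785, "
    "Size: 512x512, Model hash: dce28483f1"
)
negative_prompt = (
    "(worst quality:1.25), (low quality:1.25), (lowres:1.1), (monochrome:1.1), "
    "(greyscale), multiple views, comic, sketch, (blurry:1.05),"
)


def parse_params(line):
    """Split 'Key: value, Key: value' into a dict of strings."""
    params = {}
    for field in line.split(", "):
        key, _, value = field.partition(": ")
        params[key] = value
    return params


def explicit_weights(prompt):
    """Collect tags written as '(tag:weight)'; tags without a number are skipped."""
    pattern = re.compile(r"\(([^():]+):([0-9.]+)\)")
    return {m.group(1).strip(): float(m.group(2)) for m in pattern.finditer(prompt)}


params = parse_params(params_line)
weights = explicit_weights(negative_prompt)
print(params)
print(weights)
```

Reusing the same seed, sampler, steps and size, together with the same model and Clip skip, is what makes a copied image reproducible.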

Serve with Cloudflare argo tunnel#

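When I wanted to access my stable-diffusion-webui through a domain name, I thought of using Cloudflare argo tunnel, but I ran into some problems there too.

I deployed Cloudflare argo tunnel successfully and could fetch the HTML and other resources through my domain, yet the page would not load. The browser's developer tools showed every resource loading normally; only some console LOG messages hinted that it was not working properly.

To be added (the gist: a <link> preload resource had no as attribute set)

However, SD-webui has an extension, sd-webui-tunnels, and the command cloudflared tunnel --url http://127.0.0.1:7860 can also open a Tunnel and hand out a temporary domain. Both of these reach stable-diffusion-webui just fine, so why does the tunnel I set up by hand not work?

At this point it is fairly clear that something is off with my domain or my Cloudflare settings. ~~Damn, I actually wanted to write down the whole thought process, because I want to leave behind the problem-solving approach and not just the result, but it does not seem to be working out.~~

Comparing the resources served through my domain with those served through the temporary domain from cloudflared tunnel --url http://127.0.0.1:7860 reveals some differences.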

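My domain:

1. The HTML contains extra resources injected by CF, including https://static.cloudflareinsights.com/beacon.min.js and Rocket Loader™.
2. All of my JS and CSS script tags are commented out, and an identical copy of each resource is inserted below them, except that <script> becomes <link preload as=?>.

Looking further at the raw HTML in Developer Tools/Network, item 2 is not there.

So item 1 is most likely to blame. I turned off Rocket Loader™ at <domain>/speed/optimization/, but it was still there. Under /rules/configuration-rules I found some page rules I had set up myself, one of which enabled Rocket Loader™. I turned that off, and everything went back to normal.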
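
The comparison between the two domains can be sketched as a small check over the returned HTML. The beacon URL is the one quoted above. The `rocket-loader` substring is an assumption about how the injected Rocket Loader script is named, and both sample HTML strings are made up for illustration; a real check would fetch the page through each domain.

```python
# Substrings that hint at Cloudflare-injected resources.
MARKERS = {
    "cloudflareinsights beacon": "static.cloudflareinsights.com/beacon.min.js",
    # Assumption: the injected Rocket Loader script has "rocket-loader" in its URL.
    "Rocket Loader": "rocket-loader",
}


def injected_markers(html):
    """Return the names of all markers found in the HTML (case-insensitive)."""
    lowered = html.lower()
    return [name for name, needle in MARKERS.items() if needle in lowered]


# Hypothetical responses: one via the temporary tunnel domain, one via my own domain.
direct_html = '<html><head><script src="/script.js"></script></head></html>'
proxied_html = (
    '<html><head><script src="/cdn-cgi/scripts/rocket-loader.min.js"></script>'
    '<script src="https://static.cloudflareinsights.com/beacon.min.js"></script>'
    "</head></html>"
)

print(injected_markers(direct_html))
print(injected_markers(proxied_html))
```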

Add upscaler#

Put additional upscalers into models/ESRGAN.

HoshinoAya