Stable Diffusion
Listed below are the default versions and parameters. We are constantly looking for ways to improve, so these are subject to change and are given for informational purposes only. If your use case requires something different, we'd love to hear from you!
Base Model
Stable Diffusion v1.5. Terms of the CreativeML Open RAIL-M license apply.
Custom Models
When you create a profile, you're triggering a training job which results in a custom model.
If you'd like to create nested/hierarchical models, where the base model is either a model you trained previously (i.e. an existing profile) or a model other than SD v1.5, please let us know.
Variational Autoencoder
sd-vae-ft-mse, specifically 840000-ema-pruned
Training Parameters
We don't keep secrets around here.
Steps: typically 1,100 to 1,600, depending on the number of photos provided.
Learning rate & scheduler: 1e-6, polynomial
Class size: 100-300 depending on the class.
Class images: for most classes we use generated class images. We are working on acquiring realistic, high-res images. Right now the "Man" class is the only one set up to use real-world HD photos instead of the generated set (choosing it over Person might give a *slightly* better result).
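To make the "1e-6, polynomial" setting above more concrete, here is a minimal sketch of how a polynomial learning-rate decay can be computed. The end value (0.0), the power (1.0, which makes the decay linear), and the function name polynomial_lr are assumptions chosen for illustration; the exact schedule settings used in our training jobs are not listed here.

```python
def polynomial_lr(step: int, total_steps: int, base_lr: float = 1e-6,
                  end_lr: float = 0.0, power: float = 1.0) -> float:
    """Decay the learning rate from base_lr to end_lr over total_steps.

    end_lr and power are illustrative assumptions, not documented settings.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return (base_lr - end_lr) * (1.0 - progress) ** power + end_lr


if __name__ == "__main__":
    total = 1200  # a value inside the typical step range quoted above
    for s in (0, 300, 600, 900, 1200):
        print(f"step {s:>4}: lr = {polynomial_lr(s, total):.2e}")
```

With a power above 1.0 the rate would drop faster early in training; with power 1.0 it falls in a straight line.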
Infrastructure Reliability & Security
The web application (www.soreal.ai) is completely separated from our AI infrastructure and from customer assets (input photos, models, generated images).
AI infrastructure & assets are hosted on Amazon Web Services using private VPCs and private, encrypted S3 buckets. Block- and/or file-based storage used for compute is encrypted as well.
There is no internet access to any of our compute (whether EC2 or serverless).
There is no public access to any S3 asset buckets.
We will never store your images or photos publicly. When you navigate to your dashboard, we automatically create time-limited, signed URLs that authorize you to view the images shown there.
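For readers curious about what a time-limited, signed URL is, the following is a simplified sketch of the general idea using an HMAC signature and an expiry timestamp. It is not our implementation and it is not the AWS signing scheme; SECRET_KEY, sign_url, and verify_url are hypothetical names used only for this illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical demo key; a real system would keep its key in a secret store.
SECRET_KEY = b"demo-secret-not-a-real-key"


def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return path with an expiry timestamp and an HMAC signature appended."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    payload = f"{path}?expires={expires}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'signature': signature})}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    current = time.time() if now is None else now
    if current > expires:
        return False  # the link has expired
    payload = f"{parsed.path}?expires={expires}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(signature, expected)
```

Because the path and expiry are both covered by the signature, changing either one invalidates the link, and the link stops working once the expiry time has passed.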
Job Scheduling
Our system is built to scale horizontally based on demand, with zero touch.
We've typically been able to finish training in under 8 minutes and generate images in about a minute. Keep in mind that inference parameters will substantially influence time to generate.
If it's a particularly busy time, you may have to wait a few minutes for additional resources to be available. The estimated time will be calculated and shown in real time.
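As a rough illustration of how a wait estimate could be derived, the sketch below assumes a fixed pool of workers, jobs of equal length, and first-come-first-served ordering. These are simplifying assumptions, the function name estimate_wait_minutes is hypothetical, and the sketch ignores the automatic scaling described above, so it is not the calculation our system performs.

```python
def estimate_wait_minutes(jobs_ahead: int, workers: int,
                          minutes_per_job: float = 8.0) -> float:
    """Upper-bound wait before a new job starts, under toy assumptions.

    jobs_ahead counts jobs that are running or queued before yours.
    minutes_per_job defaults to the typical training time quoted above.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    if jobs_ahead < 0:
        raise ValueError("jobs_ahead cannot be negative")
    # Jobs ahead are served in rounds of `workers` jobs at a time; a new
    # job starts once every full round ahead of it has finished.
    full_rounds = jobs_ahead // workers
    return full_rounds * minutes_per_job


if __name__ == "__main__":
    for ahead in (0, 1, 2, 5):
        print(ahead, "jobs ahead ->", estimate_wait_minutes(ahead, workers=2), "min")
```

The result is pessimistic because it treats every running job as if it had only just started.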
"I've been waiting for X and it's stuck / still hasn't started / hasn't finished": if you find yourself waiting unexpectedly long, send us a message so we can look into it.