Testing AMD Amuse - local image generation with Stable Diffusion AI models
Amuse is a desktop application made by AMD and Tensorstack.ai that allows you to generate images locally using various Stable Diffusion AI models. It can be downloaded for free from amuse-ai.com.
Amuse application
The application is a showcase of running AI on AMD hardware, but it can also be used for everyday image generation needs. It requires around 32GB of RAM and a video card with enough VRAM. The highest-quality and most advanced Flux.1 model requires 24GB of VRAM, which on AMD cards means a Radeon 7900 XTX.
Amuse downloads the required models on its own and generates images from your prompt and settings. It works rather well, although I had to restart it a few times after changing some settings, when it started generating random images that ignored the source image. On my Ryzen 9 5900X and Radeon 6950 XT system, lower-quality models generated images almost instantly, while higher-quality images took 15 seconds or a bit more per image.
Image generation options
The app has three main ways to generate images. The first is a simple prompt: you describe what you want and the model tries to generate it. The second is a prompt combined with a simple drawing, and the third generates an altered version of a source image, changing its style, colors, and more.
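Amuse exposes all of this through its GUI and doesn't document its internals, but the plain prompt-based mode corresponds to the standard text-to-image workflow. As a point of reference, here is a minimal sketch using the Hugging Face diffusers library; the checkpoint name, prompt, and device are illustrative assumptions, not anything Amuse itself exposes.

```python
# Minimal text-to-image sketch with Hugging Face diffusers.
# This is NOT Amuse's own code - just the equivalent scripted workflow.
import torch
from diffusers import StableDiffusionPipeline

# Checkpoint name is a commonly used public SD model; Amuse downloads
# its own converted models instead.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # with PyTorch ROCm builds, AMD GPUs also use the "cuda" device string

prompt = "a murloc standing on a beach, fantasy art"
image = pipe(prompt, num_inference_steps=25, guidance_scale=7.5).images[0]
image.save("murloc.png")
```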
To convert an image, you have to select the aspect ratio and then use the prompt to describe the style you want the image to be generated in:
The drawing mode is a bit finicky to get working, but if you need something that follows a fixed layout, it may be worth trying:
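Amuse doesn't say which technique its drawing mode uses, but conceptually it resembles sketch-conditioned generation. Purely as an analogy, a scripted version with diffusers and a ControlNet scribble model might look like the sketch below; the model names and the input file are placeholders, not what Amuse runs.

```python
# Hypothetical sketch+prompt workflow using ControlNet scribble conditioning.
# Only an analogy for Amuse's drawing mode, which is not documented.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # SD 1.5 checkpoint the scribble ControlNet was trained against
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# "my_scribble.png" is a placeholder for a rough black-on-white layout drawing.
scribble = load_image("my_scribble.png")
image = pipe(
    "a cozy wooden cabin in a snowy forest",
    image=scribble,
    num_inference_steps=25,
).images[0]
image.save("cabin.png")
```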
Model limitations
Most of the models are quite simple compared to those running on high-end AI servers. The images are small, and the models often have problems with text and realism (body parts, for example). At least AMD Amuse offers an option to upscale the images.
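Amuse's upscaler is built into the app and not documented; as a rough comparison, a scripted upscale step with diffusers' publicly available x4 upscaler pipeline could look like the sketch below. The checkpoint name and file paths are assumptions for illustration only.

```python
# Hypothetical 4x upscale of a small generated image using diffusers.
# Amuse's own upscaler may work differently; this is just an analogous step.
import torch
from diffusers import StableDiffusionUpscalePipeline
from diffusers.utils import load_image

upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

low_res = load_image("murloc.png")  # placeholder for a small (e.g. 512px) generated image
upscaled = upscaler(prompt="a murloc standing on a beach", image=low_res).images[0]
upscaled.save("murloc_4x.png")
```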
The lowest-quality setting is quicker to generate with, but the resulting image isn't very refined:
These don't look like actual Murlocs, but they are still somewhat close to the right creature type.
Game IP specific prompts
I used some World of Warcraft-specific prompts, and the models can generate somewhat accurate (or at least interesting) images. This means the models were trained on some of the game's artwork and know Sylvanas or Thrall to some extent.
Cyberpunk 2077 is recognized, while Final Fantasy XIV is not (same as Steve from GamersNexus):
More direct prompts tend to work better, although combining multiple objects can be a problem.
Image style alteration
The app offers multiple blend options for source images. You select the image and then use the prompt to describe the style to blend it toward.
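This maps onto the classic image-to-image workflow, where a strength value controls how far the result drifts from the source. Amuse doesn't document what it runs internally, so the following is only a minimal sketch with diffusers, assuming a local source image and a common public checkpoint.

```python
# Hypothetical image-to-image restyling sketch with diffusers.
# The "strength" parameter plays a role similar to Amuse's blend options.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

source = load_image("photo.jpg")  # placeholder for the source image
# Lower strength keeps more of the original; higher strength restyles more aggressively.
image = pipe(
    "oil painting in the style of impressionism",
    image=source,
    strength=0.5,
    guidance_scale=7.5,
).images[0]
image.save("photo_restyled.png")
```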