5 Open-source Local AI Tools for Image Generation I Found Interesting

I discovered these interesting open-source AI tools for image generation that run locally. Explore their features, capabilities, and how they empower creativity without relying on cloud services.

Ever since I realized that AI was shaping the future, I’ve been fascinated by its endless possibilities.

I’m someone who enjoys testing large language models (LLMs) on my devices, and the open-source approach to data has always been my preference.

Why? Because open-source projects give us control, privacy, and customization, all of which are essential in today's data-driven world.

When I decided to explore AI image generation, it felt like a natural extension of this mindset. Why rely on proprietary models when open-source alternatives offer powerful features and flexibility?

Now, I’ll admit - I don’t have the ideal hardware to run these models locally at blazing speeds, but where there’s a will, there’s a way! Sure, CPU inference is painfully slow, but it gets the job done eventually (and hey, patience builds character, right?).

During my research, I stumbled upon several fascinating projects. Some are fully ripe and ready to use, while others are still budding and need more time to mature.

This article is a combined list of some of the best open-source AI image generators that you can run locally. If I’ve missed any gems, feel free to let me know in the comments!

1. Stable Diffusion 1.5 (paired with Stable Diffusion WebUI)

Stable Diffusion WebUI | Source: AUTOMATIC1111

Stable Diffusion v1.5 is a powerful latent text-to-image diffusion model designed to generate photo-realistic images from textual prompts.

Developed as an evolution of earlier versions, it was fine-tuned on a large-scale dataset, "LAION-Aesthetics v2 5+", to enhance its capabilities.

This model is particularly well-suited for artistic, creative, and research purposes, offering impressive results with relatively modest computational requirements.

Key Features

  • Unlock high-quality text-to-image generation with its latent diffusion process, achieving impressive results with reduced computational overhead.
  • Fine-tuned on a large-scale dataset to improve its ability to generate visually appealing images.
  • Supports multiple platforms and tools, including the Diffusers library for seamless integration into Python workflows, as well as ComfyUI, AUTOMATIC1111, SD.Next, and InvokeAI for local usage.
  • Enjoy efficient weight options like EMA-only weights for inference or EMA + non-EMA weights for fine-tuning tasks.
  • Leverage its fixed, pretrained text encoder (CLIP ViT-L/14), an approach suggested in Google's Imagen paper, to robustly understand text prompts.
  • Generate artwork, design prototypes, and educational visuals with its creative applications, ideal for artistic and research purposes.
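
If you launch the AUTOMATIC1111 WebUI with its `--api` flag, you can also script generations instead of clicking through the browser. Below is a minimal sketch using only the Python standard library. It assumes the WebUI is listening at http://127.0.0.1:7860 (its usual default), and the helper names (`build_txt2img_payload`, `save_images`, `txt2img`) are my own, purely for illustration.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Assumption: the WebUI was started with --api and listens on its default port.
WEBUI_URL = "http://127.0.0.1:7860"


def build_txt2img_payload(prompt, negative_prompt="", steps=20,
                          width=512, height=512, cfg_scale=7.0, seed=-1):
    """Build the JSON body for the WebUI's /sdapi/v1/txt2img endpoint."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,    # 512x512 is the native resolution of SD 1.5
        "height": height,
        "cfg_scale": cfg_scale,
        "seed": seed,      # -1 asks the WebUI to pick a random seed
    }


def save_images(response, out_dir="outputs"):
    """Decode the base64-encoded images in a txt2img response and save them."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, encoded in enumerate(response.get("images", [])):
        path = out / f"image_{i}.png"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths


def txt2img(payload, base_url=WEBUI_URL, timeout=600):
    """POST the payload to a running WebUI and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Generous timeout: CPU-only inference can take several minutes per image.
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

With the WebUI running, `save_images(txt2img(build_txt2img_payload("a lighthouse at dusk, oil painting")))` should drop the generated PNG files into an `outputs` folder.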

2. Invoke AI

Source: InvokeAI

InvokeAI is a robust, open-source image generation project that builds upon Stable Diffusion, offering users a highly customizable experience for creating unique visuals.

Whether you're looking to generate artwork, photorealistic images, or something more abstract, InvokeAI provides a powerful toolkit with an easy-to-use interface.

Its flexibility is perfect for those who want more control over the creative process, especially for those working with specific intellectual property or requiring tailored workflows.

Key Features

  • Create highly detailed prompts with options for both positive and negative guidance to guide the generation process.
  • Generate images based on textual descriptions, with numerous customization options for finer control.
  • Use an existing image as a reference to help guide the AI in maintaining specific colors, structures, or themes.
  • Access a unified canvas that enables users to modify images by regenerating certain elements, editing content or colors (inpainting), and extending the image (outpainting).
  • Experiment with different models, each trained to generate specific styles or outputs, providing flexibility to match your creative needs.
  • Utilize advanced customization options like Low-Rank Adaptations (LoRAs) and Textual Inversion Embeddings to focus on specific characters, styles, or concepts.
  • Customize the number of de-noising steps and choose from different schedulers to optimize the generation process for quality and speed.
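
To make the positive/negative prompt idea above a bit more concrete: Stable Diffusion-based tools generally apply a technique called classifier-free guidance at each denoising step, and a negative prompt usually takes the place of the empty "unconditional" prompt in that calculation. The toy sketch below shows the arithmetic, with plain Python lists standing in for the model's noise predictions. It is a generic illustration, not InvokeAI's actual code, and the function name `apply_guidance` is hypothetical.

```python
def apply_guidance(noise_negative, noise_positive, guidance_scale=7.5):
    """Combine two noise predictions using classifier-free guidance.

    noise_negative: prediction conditioned on the negative (or empty) prompt
    noise_positive: prediction conditioned on the positive prompt
    guidance_scale: how strongly to push toward the positive prompt
    """
    if len(noise_negative) != len(noise_positive):
        raise ValueError("both predictions must have the same length")
    # Start from the negative prediction and move along the direction that
    # points from "negative" to "positive", scaled by guidance_scale.
    return [
        neg + guidance_scale * (pos - neg)
        for neg, pos in zip(noise_negative, noise_positive)
    ]


# Toy numbers standing in for real model outputs.
negative = [0.0, 1.0, 2.0]
positive = [1.0, 1.0, 0.0]
guided = apply_guidance(negative, positive, guidance_scale=2.0)
print(guided)  # [2.0, 1.0, -2.0]
```

A scale of 1 simply returns the positive prediction. Larger values exaggerate the difference between the two, which pulls the image toward the positive prompt and away from whatever the negative prompt describes.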

3. OpenJourney

OpenJourney website homepage

OpenJourney is a powerful, open-source text-to-image AI art generator that allows users to create stunning visuals from text prompts.

Launched in November 2022 by PromptHero, it quickly gained popularity as a free alternative to Midjourney.

Built on Stable Diffusion, OpenJourney was trained using thousands of Midjourney images from its v4 update, as well as other AI models like DALL-E 2.

OpenJourney excels at generating photorealistic and artistic images, and its open-source nature ensures it remains accessible to a wide audience.

Key Features

  • Create stunning visuals from text prompts with its powerful text-to-image generation capabilities.
  • Enjoy photorealistic and artistic images, perfect for artists, designers, and anyone looking to generate high-quality content.
  • Access a library of curated prompt ideas to inspire your creativity and get started with generating art.
  • Customize the style and content of your generated images by crafting specific prompts that fit your vision.
  • Benefit from OpenJourney's Stable Diffusion-based architecture and additional training on Midjourney images for enhanced capabilities.
  • Take advantage of its wide accessibility, available for free download on Hugging Face as part of a broader ecosystem of open-source AI models.
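
Since the model is published on Hugging Face, one way to try it is through the Diffusers library. The sketch below assumes the `prompthero/openjourney` model ID and the `mdjrny-v4 style` trigger phrase from the model card, plus installed `torch`, `diffusers`, and `transformers` packages. The helper names `with_style` and `generate` are hypothetical.

```python
STYLE_TOKEN = "mdjrny-v4 style"      # trigger phrase suggested on the model card
MODEL_ID = "prompthero/openjourney"  # assumed Hugging Face model ID


def with_style(prompt):
    """Append the style trigger phrase unless the prompt already contains it."""
    if STYLE_TOKEN in prompt:
        return prompt
    return f"{prompt}, {STYLE_TOKEN}"


def generate(prompt, out_path="openjourney.png", steps=25):
    """Generate one image on the CPU and save it to out_path.

    The heavy imports live inside the function so that this module still
    loads on machines where torch and diffusers are not installed.
    """
    import torch
    from diffusers import StableDiffusionPipeline

    # float32 because half precision is poorly supported on CPUs.
    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
    pipe = pipe.to("cpu")
    image = pipe(with_style(prompt), num_inference_steps=steps).images[0]
    image.save(out_path)
    return out_path


print(with_style("a retro series of colorful cars"))
# a retro series of colorful cars, mdjrny-v4 style
```

Calling `generate("a retro series of colorful cars")` downloads the weights on first use, so expect a wait before the first image appears, and a longer one on CPU-only machines.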

4. LocalAI (all-rounder)

An example of a Telegram bot created using LocalAI | Source: LocalAI

LocalAI is an open-source, free alternative to OpenAI that enables local AI inferencing on consumer-grade hardware.

It acts as a drop-in replacement REST API that follows OpenAI's API specifications, allowing you to run large language models (LLMs) and generate images, audio, and more, all without the need for a GPU.

LocalAI API WebUI | Source: LocalAI-frontend

Created and maintained by Ettore Di Giacinto, LocalAI provides a flexible and cost-effective solution for running AI models on-premise.

Key Features

  • It offers compatibility with OpenAI API specifications, making integration straightforward for developers.
  • The platform operates on consumer-grade hardware, eliminating the need for a GPU.
  • Supports a wide range of models and platforms, including Llama, Hugging Face, and Ollama, for diverse applications.
  • Enables advanced text generation using backends like llama.cpp and transformers.
  • Allows users to generate images from text prompts for creative projects.
  • Includes audio features such as text-to-audio, plus audio-to-text via whisper.cpp.
  • Facilitates embedding generation for vector database tasks like semantic search.
  • Offers peer-to-peer inferencing for distributed AI processing across multiple devices.
  • Integrates voice activity detection using Silero-VAD for improved audio task accuracy.
  • Provides an easy-to-use WebUI for managing models without technical expertise.
  • Features a model gallery for browsing and downloading models directly from platforms like Hugging Face.
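
Because LocalAI mirrors the OpenAI API, image generation comes down to a plain HTTP call. The following standard-library sketch assumes a LocalAI instance at http://localhost:8080 with an image model already installed. The helper names and the default size are illustrative, and the response is assumed to follow the OpenAI image format.

```python
import json
import urllib.request

# Assumption: LocalAI is running locally on port 8080.
LOCALAI_URL = "http://localhost:8080"


def build_image_request(prompt, size="512x512", model=None):
    """Build an OpenAI-style body for the /v1/images/generations endpoint."""
    body = {"prompt": prompt, "size": size}
    if model is not None:
        body["model"] = model  # name of an image model installed in LocalAI
    return body


def extract_image_urls(response):
    """Collect the image URLs from an OpenAI-style image response."""
    return [item["url"] for item in response.get("data", []) if "url" in item]


def generate_image(prompt, base_url=LOCALAI_URL, timeout=600, **options):
    """Send the request to LocalAI and return the URLs of the generated images."""
    request = urllib.request.Request(
        f"{base_url}/v1/images/generations",
        data=json.dumps(build_image_request(prompt, **options)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Generous timeout: CPU-only generation can take minutes.
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_image_urls(json.load(reply))
```

The same request shape is what existing OpenAI client code sends, which is why pointing such a client at the LocalAI base URL is often the only change needed.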

5. Fooocus (Editor's choice)

Source: Fooocus

Fooocus caught my attention as one of the most user-friendly and innovative open-source image generators out there.

I was especially drawn to its ability to run on modest hardware (like my poor laptop) and to handle diverse styles, thanks to its compatibility with various models.

It’s like having a Swiss Army knife for image generation!

Key Features

  • Fooocus ships its own inpainting algorithm and models, designed to deliver better results when editing and completing images.
  • With the ability to use multiple prompts simultaneously, Fooocus enriches creative possibilities and output diversity, opening up new avenues of artistic expression.
  • Fooocus supports a vast array of SDXL models, accommodating styles from artistic to photorealistic, giving users endless options for experimentation.
  • Users can specify aspect ratios for tailor-made image generation, ensuring that every output meets their unique requirements.
  • Advanced style controls, including contrast, sharpness, and color adjustments, empower users to fine-tune generated images with precision.
  • Fooocus utilizes A1111's reweighting algorithm, enhancing the influence of specific elements within prompts for more targeted results.
  • The platform incorporates InsightFace technology for precise face swapping, ideal for creating personalized avatars or modifications.
  • Optimized for performance across a wide range of hardware configurations, Fooocus ensures accessibility and speed, regardless of the user's setup.

Conclusion

And there you have it! From Stable Diffusion to Fooocus, these are some of the open-source projects you can host or deploy locally to create stunning images right on your hardware.

While I won't dive into the murky waters of how these models get trained (support your favorite creators, and remember, stealing is bad!), I can tell you this: each project offers unique capabilities and tons of creative potential.

I like exploring local AI tools. For example, take a look at this list of open-source AI tools for documents:

5 Local AI Tools to Interact With PDF and Documents
Interact with your documents but in private with these local AI tools.

Now, before I get lost in a sea of stunning visuals and my laptop's fan decides to take off, I have a tiny request for you.

What do you think? Have any hidden gems that I missed? Do you agree with my not-so-secret affection for LocalAI and Fooocus?

Dive into the comments section and let me know your thoughts. Who knows? Your suggestion might just be the next project I test out (if my CPU allows it, of course)!

Until next time, keep generating and keep dreaming!

About the author
Abhishek Kumar

I'm definitely not a nerd, perhaps a geek who likes to tinker around with whatever tech I get my hands on. Figuring things out on my own gives me joy. BTW, I don't use Arch.
