Curated by THEOUTPOST
On Wed, 23 Oct, 12:10 AM UTC
9 Sources
[1]
Stable Diffusion 3.5 Can Generate Pictures of Women Lying Down on Grass
Stability AI has clapped back at its critics by announcing its latest AI image model, Stable Diffusion 3.5, with a picture of a woman lying on grass. The image is a tongue-in-cheek acknowledgment of the chaos that engulfed the company's last product, Stable Diffusion 3, which was widely mocked for its eldritch body-horror abominations. The beleaguered company made reference to the botched model in its 3.5 announcement. "In June, we released Stable Diffusion 3 Medium, the first open release from the Stable Diffusion 3 series. This release didn't fully meet our standards or our communities' expectations," Stability AI says.

People took to Reddit to ask why Stable Diffusion 3 was so bad at generating girls lying on grass, accompanied by a series of bizarre and hilarious pictures. But Stability AI has apparently put the women-lying-on-grass drama behind it by releasing Stable Diffusion 3.5, which performs better and can, in fact, generate a woman lying down on grass. "Today we are introducing Stable Diffusion 3.5," Stability AI announces. "This open release includes multiple model variants, including Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo. Additionally, Stable Diffusion 3.5 Medium will be released on October 29."

Stability AI's approach to AI image generators differs from its competitors' because its models can be run locally on consumer hardware, and users can fine-tune them to the style they want. "These models are highly customizable for their size, run on consumer hardware, and are free for both commercial and non-commercial use under the permissive Stability AI Community License," Stability AI says in a blog post. "Stable Diffusion 3.5 reflects our commitment to empower builders and creators with tools that are widely accessible, cutting-edge, and free for most use cases. We encourage the distribution and monetization of work across the entire pipeline - whether it's fine-tuning, LoRA, optimizations, applications, or artwork."
[2]
Stability AI Releases Stable Diffusion 3.5 Text-to-Image AI Model
Snippet: Stable Diffusion 3.5 allows users to easily fine-tune the model for specific creative projects or build custom workflows.

Stability AI has introduced Stable Diffusion 3.5, which includes multiple model variants. These models are capable of running on consumer-grade hardware and are available for both commercial and non-commercial use under the flexible Stability AI Community License. Developers can now customise and integrate the models without worrying about restrictive licensing, making them ideal for a wide range of applications. Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo are available for download on Hugging Face, and the inference code can be accessed on GitHub.

Stable Diffusion 3.5 offers a range of models catering to different users, including researchers, startups, and enterprises. The Stable Diffusion 3.5 Large model, with 8 billion parameters, delivers superior image quality and prompt adherence, making it apt for professional use at a 1-megapixel resolution. The Large Turbo version is a faster alternative, generating high-quality images in just four steps. The company says the model is also optimised for efficient performance on standard consumer hardware, particularly in the Medium and Large Turbo versions. It also generates inclusive and diverse images, accurately representing various skin tones and features without needing extensive prompts. The models are trained on a subset of the LAION-5B dataset -- created by the DeepFloyd team -- which was further filtered for adult content using the dataset's NSFW filter.

The model is available at no cost for non-commercial purposes, including academic research. Startups, small to medium businesses, and creators can use the model commercially for free, provided their annual revenue is under $1 million. Users maintain full ownership of the generated content, with no restrictive licensing.
Meanwhile, Google announced it is pausing its Gemini artificial intelligence image generation feature after acknowledging "inaccuracies" in historical pictures. Gemini-generated pictures recently went viral on social media, drawing widespread ridicule and anger. Some users criticized Google, claiming the company is overly concerned with being socially aware, even at the expense of truth and accuracy. "We recently made the decision to pause Gemini's image generation of people while we work on improving the accuracy of its responses," said Google. Social media users had complained that the AI tool generates images of historical figures -- like the U.S. Founding Fathers -- as people of colour, calling this inaccurate.
[3]
Stable Diffusion 3.5 follows your prompts more closely and generates more diverse people
It also tries to right the wrongs of the last version, which excelled at (accidental) body horror. Stable Diffusion, an open-source alternative to AI image generators like Midjourney and DALL-E, has been updated to version 3.5. The new model tries to right some of the wrongs (which may be an understatement) of the widely panned Stable Diffusion 3 Medium. Stability AI says the 3.5 model adheres to prompts better than other image generators and competes with much larger models in output quality. In addition, it's tuned for a greater diversity of styles, skin tones and features without needing to be prompted to do so explicitly.

The new model comes in three flavors. Stable Diffusion 3.5 Large is the most powerful of the trio, with the highest quality of the bunch, while leading the industry in prompt adherence. Stability AI says the model is suitable for professional uses at 1 MP resolution. Meanwhile, Stable Diffusion 3.5 Large Turbo is a "distilled" version of the larger model, focusing more on efficiency than maximum quality. Stability AI says the Turbo variant still produces "high-quality images with exceptional prompt adherence" in four steps. Finally, Stable Diffusion 3.5 Medium (2.5 billion parameters) is designed to run on consumer hardware, balancing quality with simplicity. With its greater ease of customization, the model can generate images between 0.25 and 2 megapixels in resolution. However, unlike the first two models, which are available now, Stable Diffusion 3.5 Medium doesn't arrive until October 29.

The new trio follows the botched Stable Diffusion 3 Medium from June. The company admitted that the release "didn't fully meet our standards or our communities' expectations," as it produced some laughably grotesque body horror in response to prompts that asked for no such thing. Stability AI's repeated mentions of exceptional prompt adherence in today's announcement are likely no coincidence.
Although Stability AI only briefly mentioned it in its announcement blog post, the 3.5 series has new filters to better reflect human diversity. The company describes the new models' human outputs as "representative of the world, not just one type of person, with different skin tones and features, without the need for extensive prompting." Let's hope it's sophisticated enough to account for subtleties and historical sensitivities, unlike Google's debacle from earlier this year. Unprompted to do so, Gemini produced collections of egregiously inaccurate historical "photos," like ethnically diverse Nazis and US Founding Fathers. The backlash was so intense that Google didn't restore its ability to generate images of people until six months later.
[4]
Stable Diffusion 3.5 Is Out with Better Performance & the Right Number of Limbs
Stability, the AI startup behind popular image generator Stable Diffusion, has announced a new generation of models designed to be more customizable and versatile while outputting more "diverse" and representative content.

The new Stable Diffusion 3.5 is being made available in three variations: Large, Large Turbo, and Medium. Stable Diffusion 3.5 Large clocks in at 8 billion parameters and is capable of generating images up to a megapixel in resolution. Stable Diffusion 3.5 Large Turbo has the same capabilities as Large, though it generates images more quickly at the cost of some quality. Stable Diffusion 3.5 Medium, on the other hand, is designed to run on edge devices, like laptops, tablets and smartphones. It can generate images in a variety of sizes, from a quarter of a megapixel up to 2 MP. The Large and Large Turbo versions are available today, while Medium will arrive at the end of the month, on October 29.

SD 3.5 promises significantly improved performance over its predecessor, Stable Diffusion 3.0, which made its debut in June. SD3's release was met with outrage and disappointment from its user base due to the model's inability to accurately render limbs and fingers -- especially for subjects depicted lying on grass -- as well as rampant artifacts and its difficulty following the given prompt. The company concedes that SD3 did not "fully meet our standards or our communities' expectations" but insists that "Stable Diffusion 3.5 reflects our commitment to empower builders and creators with tools that are widely accessible, cutting-edge, and free for most use cases."

The new 3.5 model has also been trained to generate more diverse images of people, creating "images representative of the world, not just one type of person, with different skin tones and features," and without the need for extensive prompts.
"During training, each image is captioned with multiple versions of prompts, with shorter prompts prioritized," Hanno Basse, CTO of Stability, told TechCrunch. "This ensures a broader and more diverse distribution of image concepts for any given text description. Like most generative AI companies, we train on a wide variety of data, including filtered publicly available datasets and synthetic data."

Whether this leads to the same sorts of issues Google faced in February, when Gemini's image generator started spitting out "historically inaccurate" images (read: Black Nazi soldiers), remains to be seen. "Gemini's AI image generation does generate a wide range of people," reads a Google statement posted to Twitter in February. "That's generally a good thing because people around the world use it. But it's missing the mark here."

If you'd like to try Stable Diffusion 3.5 for yourself, both Large and Large Turbo are currently available for download from Hugging Face. You can find the inference code on GitHub.

Source: Stability AI, TechCrunch
[5]
StabilityAI releases Stable Diffusion 3.5 -- a step up in realism
StabilityAI is today releasing its new Stable Diffusion 3.5 family of AI image models. It comes with improved realism, prompt adherence and text rendering compared to SD3. Like the first incarnation of SD3, Stable Diffusion 3.5 comes in three sizes -- Large (8B), Large Turbo (8B), and Medium (2.6B). These are all customizable and tuned to run on consumer hardware, while also being available to use under the Stability AI Community License. In short, this is a big step towards any user being able to create more realistic AI images.

In a press release, StabilityAI admitted its Stable Diffusion 3 Medium model that launched in June "didn't fully meet our standards or our communities' expectations". The company added that "After listening to the valuable community feedback, instead of a quick fix, we took the time to further develop a version that advances our mission to transform visual media." Our AI Editor Ryan Morrison has been testing SD3.5 and says it is a significant upgrade, matching and potentially exceeding the capabilities of the recently released Flux 1.1 Pro.

Stability AI says the included models are focused on customizability, efficient performance, and diverse outputs. "Stable Diffusion 3.5, our most powerful models yet, reflects our commitment to empowering creators with tools that are both widely accessible and cutting-edge," a spokesperson explained. That means the models can be fine-tuned, run on consumer hardware "out of the box", and generate images that feel more distinct from one another. There's also a focus on new stylistic choices, including photography, painting, and more. The model takes in hashtag prompts to specify a style such as boho or fashion, and can also use highlighting within the prompt to steer the model in a specific direction. "Additionally, our analysis shows that Stable Diffusion 3.5 Large leads the market in prompt adherence and rivals much larger models in image quality," a press release explains.
"Stable Diffusion 3.5 Turbo offers some of the fastest inference times for its size, while remaining highly competitive in both image quality and prompt adherence, even when compared to non-distilled models of similar size." "Stable Diffusion 3.5 Medium outperforms other medium-sized models, offering a balance of prompt adherence and image quality, making it a top choice for efficient, high-quality performance." The model is free for non-commercial use, including scientific research, and free for small to medium-sized businesses with up to $1 million in revenue. Anything beyond that requires an enterprise license.
[6]
Stability AI releases next-gen open-source Stable Diffusion 3.5 text-to-image AI model family - SiliconANGLE
Generative artificial intelligence startup Stability AI Ltd. today announced the release of Stable Diffusion 3.5, which includes three next-generation open-source text-to-image AI model variants. "In June, we released Stable Diffusion 3 Medium, the first open release from this series. This release didn't fully meet our standards or our communities' expectations," the company said in the announcement. "After listening to the valuable community feedback, instead of a quick fix, we took the time to further develop a version that advances our mission to transform visual media."

The three models in this release are Stable Diffusion 3.5 Large, 3.5 Large Turbo and 3.5 Medium. Each variant has been developed to meet the needs of scientific researchers, hobbyists and enterprise customers, with increased levels of customization and accessibility for local and cloud deployments.

Large is an 8 billion-parameter model designed for prompt adherence and high-quality image production. It's based on the standard Stable Diffusion family. The company said it's ideal for professional users looking for 1-megapixel resolution graphics, and suitable for producing vivid images and digital assets for marketing campaigns and other similar enterprise use cases.

Large Turbo is a streamlined version of 3.5 Large that produces high-quality images while retaining exceptional prompt adherence in only four steps, which makes it much faster than the original version. It's designed to produce images quickly without losing quality, making it good for rapid-generation workflows. Stability AI said that Turbo offers some of the fastest image-generation times for its size in the industry and remains competitive on image quality and prompt adherence, even compared with non-distilled models of its size.
The new Stable Diffusion 3.5 Medium weighs in at 2.6 billion parameters, and Stability AI said it built an improved architecture and training method to provide a balance between quality and customization. The model is capable of efficiently producing images between 0.25- and 2-megapixel resolutions and is optimized to run on standard consumer hardware without heavy demands.

In developing the models, the company said, it used Query-Key Normalization in the transformer blocks to help prioritize customizability and simplify fine-tuning. This makes it easier for developers to customize the model, and it also means the model adheres better to specific natural language prompts; at the same time, prompts lacking specific wording are more likely to produce a broader range of image outputs. "To support this level of downstream flexibility, we had to make some trade-offs," the company said. "Greater variation in outputs from the same prompt with different seeds may occur, which is intentional as it helps preserve a broader knowledge base and diverse styles in the base models. However, as a result, prompts lacking specificity might lead to increased uncertainty in the output, and the aesthetic level may vary."

Stability AI said Stable Diffusion 3.5 Medium will be available on Oct. 29. All the models are open-source and available under the company's community license: free for noncommercial use and for commercial use up to $1 million in annual revenue, after which companies must inquire about an enterprise license. The model weights will be available shortly on Hugging Face for self-hosting. They can also be accessed through the Stability AI application programming interface as well as Replicate, Fireworks and ComfyUI. In the next few days, the ControlNets for the models will also be released to provide advanced control for the new models.
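The Query-Key Normalization the company describes amounts to L2-normalizing the query and key vectors before taking their dot product, which bounds attention logits no matter how large the learned weights grow. Below is a minimal single-head sketch of the idea in plain Python; the function names are ours for illustration, not Stability's implementation.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (the "normalization" in QK-norm).
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def softmax(scores):
    # Numerically stable softmax over a list of logits.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def qk_norm_attention(queries, keys, values):
    """Single-head attention with L2-normalized queries and keys.

    Normalizing Q and K keeps every logit inside [-1, 1], so attention
    scores cannot explode during training or fine-tuning -- the
    stability benefit Stability AI cites for QK Normalization.
    """
    out = []
    for q in queries:
        qn = l2_normalize(q)
        scores = [sum(a * b for a, b in zip(qn, l2_normalize(k)))
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out
```

Because both sides are normalized, rescaling a query by any positive constant leaves the attention weights unchanged, which is why fine-tuning (where weight magnitudes drift) stays better behaved.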
[7]
Stability claims its newest Stable Diffusion models generate more 'diverse' images | TechCrunch
Following a string of controversies stemming from technical hiccups and licensing changes, AI startup Stability AI has announced its latest family of image generation models. The new Stable Diffusion 3.5 series is more customizable and versatile than Stability's previous-generation tech, the company claims -- as well as more performant. There are three models in total. While Stable Diffusion 3.5 Large and 3.5 Large Turbo are available today, 3.5 Medium won't be released until October 29.

Stability says that the Stable Diffusion 3.5 models should generate more "diverse" outputs -- that is to say, images depicting people with different skin tones and features -- without the need for "extensive" prompting. "During training, each image is captioned with multiple versions of prompts, with shorter prompts prioritized," Hanno Basse, Stability's chief technology officer, told TechCrunch in an interview. "This ensures a broader and more diverse distribution of image concepts for any given text description. Like most generative AI companies, we train on a wide variety of data, including filtered publicly available datasets and synthetic data."

Some companies have kludgily built these sorts of "diversifying" features into image generators in the past, prompting outcries on social media. An older version of Google's Gemini chatbot, for example, would show an anachronistic group of figures for historical prompts such as "a Roman legion" or "U.S. senators." Google was forced to pause image generation of people for nearly six months while it developed a fix. With any luck, Stability's approach will be more thoughtful than others. We can't give impressions, unfortunately, as Stability didn't provide early access. Stability's previous flagship image generator, Stable Diffusion 3 Medium, was roundly criticized for its peculiar artifacts and poor adherence to prompts.
The company warns that Stable Diffusion 3.5 models might suffer from similar prompting errors; it blames engineering and architectural trade-offs. But Stability also asserts the models are more robust than their predecessors in generating images across a range of different styles, including 3D art. "Greater variation in outputs from the same prompt with different seeds may occur, which is intentional as it helps preserve a broader knowledge-base and diverse styles in the base models," Stability wrote in a blog post shared with TechCrunch. "However, as a result, prompts lacking specificity might lead to increased uncertainty in the output, and the aesthetic level may vary."

One thing that hasn't changed with the new models is Stability's licenses. As with previous Stability models, models in the Stable Diffusion 3.5 series are free to use for "non-commercial" purposes, including research. Businesses with less than $1 million in annual revenue can also commercialize them at no cost. Organizations with more than $1 million in revenue, however, have to contract with Stability for an enterprise license.

Stability caused a stir this summer over its restrictive fine-tuning terms, which gave (or at least appeared to give) the company the right to extract fees for models trained on images from its image generators. In response to the blowback, the company adjusted its terms to allow for more liberal commercial use. Stability reaffirmed today that users own the media they generate with Stability models. "We encourage creators to distribute and monetize their work across the entire pipeline," Ana Guillèn, VP of marketing and communications at Stability, said in an emailed statement, "as long as they provide a copy of our community license to the users of those creations and prominently display 'Powered by Stability AI' on related websites, user interfaces, blog posts, About pages, or product documentation."
Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo can be self-hosted or used via Stability's API and third-party platforms including Hugging Face, Fireworks, Replicate, and ComfyUI. Stability says that it plans to release the ControlNets for the models, which allow for fine-tuning, in the next few days.

Stability's models, like most AI models, are trained on public web data -- some of which may be copyrighted or under a restrictive license. Stability and many other AI vendors argue that the fair-use doctrine shields them from copyright claims, but that hasn't stopped data owners from filing a growing number of class-action lawsuits. Stability leaves it to customers to defend themselves against copyright claims and, unlike some other vendors, has no payout carve-out in the event that it's found liable. Stability does allow data owners to request that their data be removed from its training datasets, however. As of March 2023, artists had removed 80 million images from Stable Diffusion's training data, according to the company.

Asked about safety measures around misinformation in light of the upcoming U.S. general elections, Stability said that it "has taken -- and continues to take -- reasonable steps to prevent the misuse of Stable Diffusion by bad actors." The startup declined to give specific technical details about those steps, however. As of March, Stability only prohibited explicitly "misleading" content created using its generative AI tools -- not content that could influence elections, hurt election integrity, or that features politicians and public figures.
[8]
Stable Diffusion 3.5 debuts as Stability AI aims to improve open models for generating images
Stability AI is out today with a major update to its text-to-image generative AI technology with the debut of Stable Diffusion 3.5. A key goal for the new update is to raise the bar and improve upon Stability AI's last major update, which the company admitted didn't live up to its own standards. Stable Diffusion 3 was first previewed back in February, and the first open model version became generally available in June with the debut of Stable Diffusion 3 Medium.

While Stability AI was an early pioneer in the text-to-image generative AI space, it has increasingly faced stiff competition from numerous rivals including Black Forest Labs' Flux Pro, OpenAI's DALL-E, Ideogram and Midjourney. With Stable Diffusion 3.5, Stability AI is looking to reclaim its leadership position. The new models are highly customizable and can generate a wide range of different styles.

The new update introduces multiple model variants, each designed to cater to different user needs. Stable Diffusion 3.5 Large is an 8 billion parameter model that offers the highest quality and prompt adherence in the series. Stable Diffusion 3.5 Large Turbo is a distilled version of the large model, providing faster image generation. Rounding out the new models is Stable Diffusion 3.5 Medium, which has 2.6 billion parameters and is optimized for edge computing deployments.

All three of the new Stable Diffusion 3.5 models are available under the Stability AI Community License, an open license that enables free non-commercial usage and free commercial usage for entities with annual revenue under $1 million. Stability AI has an enterprise license for larger deployments. The models are available via Stability AI's API as well as Hugging Face. The original release of Stable Diffusion 3 Medium in June was less than ideal.
The lessons learned from that experience have helped to inform and improve the new Stable Diffusion 3.5 updates. "We identified that several model and dataset choices that we made for the Stable Diffusion Large 8B model were not optimal for the smaller-sized Medium model," Hanno Basse, CTO of Stability AI, told VentureBeat. "We did thorough analysis of these bottlenecks and innovated further on our architecture and training protocols on the Medium model to provide a better balance between the model size and the output quality."

How Stability AI is improving text-to-image generative AI with Stable Diffusion 3.5

As part of building out Stable Diffusion 3.5, Stability AI took advantage of a number of novel techniques to improve quality and performance. A notable addition to Stable Diffusion 3.5 is the integration of Query-Key Normalization into the transformer blocks. This technique facilitates easier fine-tuning and further development of the models by end users, and makes the model more stable for training and fine-tuning. "While we have experimented with QK-normalization in the past, this is our first model release with this normalization," Basse explained. "It made sense to use it for this new model as we prioritized customization."

Stability AI has also enhanced its Multimodal Diffusion Transformer (MMDiT-X) architecture, specifically for the Medium model. Stability AI first highlighted the MMDiT architecture approach in April, when the Stable Diffusion 3 API became available. MMDiT is noteworthy because it blends diffusion model techniques with transformer model techniques. With the updates in Stable Diffusion 3.5, MMDiT-X now helps improve image quality as well as enhancing multi-resolution generation capabilities.

Prompt adherence makes Stable Diffusion 3.5 even more powerful

Stability AI reports that Stable Diffusion 3.5 Large demonstrates superior prompt adherence compared to other models in the market.
The promise of better prompt adherence is all about the model's ability to accurately interpret and render user prompts. "This is achieved with a combination of different things - better dataset curation, captioning and additional innovation in training protocols," Basse said.

Customization will get even better with ControlNets

Looking forward, Stability AI is planning to release a ControlNets capability for Stable Diffusion 3.5. The promise of ControlNets is more control for various professional use cases. StabilityAI first introduced ControlNet technology as part of its SDXL 1.0 release in July 2023. "ControlNets give spatial control over different professional applications where users, for example, may want to upscale an image while maintaining the overall colors or create an image that follows a specific depth pattern," Basse said.
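Mechanically, a ControlNet is a trainable side-network that reads a conditioning image (a pose, depth map, or scribble) and emits residuals that are added back into the base model's intermediate features. A toy sketch of that final injection step, with all names and shapes ours for illustration only:

```python
def apply_control(block_outputs, control_residuals, strength=1.0):
    """Add ControlNet residuals to the backbone's block outputs.

    block_outputs:     per-block feature vectors from the base model
    control_residuals: matching residuals produced by the ControlNet
                       from the conditioning image (e.g. a depth map)
    strength:          user-set conditioning scale; 0.0 disables control
    """
    return [
        [h + strength * c for h, c in zip(feats, residuals)]
        for feats, residuals in zip(block_outputs, control_residuals)
    ]
```

At strength 0.0 the base model's behavior is recovered exactly, which is why ControlNet conditioning can be dialed in gradually rather than switched on or off.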
[9]
Stable Diffusion 3.5: Stability AI Redeems Itself With New Models and Expanded Features - Decrypt
Stability AI may be starting its very own redemption arc. After the disappointment that was SD3 Medium, the company has come back swinging with the release of two new models that had been promised back in July: Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo. "In June, we released Stable Diffusion 3 Medium, the first open release from the Stable Diffusion 3 series. This release didn't fully meet our standards or our communities' expectations," Stability said in an official blog post. "After listening to the valuable community feedback, instead of a quick fix, we took the time to further develop a version that advances our mission to transform visual media."

We generated a few images to try it out before rushing to write this breaking news -- and the results were pretty, pretty good, especially for a base model. The SD 3.5 family is designed to run on consumer-grade systems -- even low-end ones by some standards -- making advanced image generation more accessible than ever. And yes, Stability has heard the complaints about the previous version, so this one promises to be a lot better -- so much so that its featured image is a woman lying on grass, a wry reference to the horrorshow that occurred when the previous model was presented with the same challenge.

Another important aspect of this release is the new licensing model. Stable Diffusion 3.5 comes under a more permissive license, allowing both commercial and non-commercial use. Small businesses and individuals who make less than $1,000,000 in revenue from the tool can use and build on these models for free. Those with larger revenue must contact Stability to negotiate fees. By comparison, Black Forest Labs offers its lower-end Flux Schnell for free, its medium model Flux Dev free for non-commercial use, and its SOTA model Flux Pro as a closed-source model. (For reference, Flux is generally considered the best open-source image generator currently available -- at least in the current post-SDXL era.)
Stability AI is releasing three versions of Stable Diffusion 3.5, all of which cater to different needs. The models are much more flexible, allowing users to fine-tune them for specific creative needs. And if you're worried about whether your consumer-grade GPU can handle this, Stability AI has your back. Our own tests show the Large Turbo spitting out images in about 40 seconds on a modest RTX 2060 with 6GB of VRAM. The non-quantized, full-fat version needs over 3 minutes on the same lower-end hardware, but that's the price of quality.

Stability AI is playing catch-up against Flux, which is the go-to model for customizability. To improve the user experience, Stability reimagined how SD 3.5 behaves. "In developing the models, we prioritized customizability to offer a flexible base to build upon. To achieve this, we integrated Query-Key Normalization into the transformer blocks, stabilizing the model training process and simplifying further fine-tuning and development," Stability said. In other words, you can tweak and refine these models much more easily than before, whether you're an artist wanting to create custom styles or a developer looking to build an AI-powered application. Stability even shared a LoRA training guide to help kick things off a lot faster. LoRA (low-rank adaptation) is a technique to fine-tune models to specialize in a specific concept -- be it a style or a subject -- without having to retrain the whole large base model.

Of course, with flexibility come some trade-offs. The model is now so creative that Stability warns that "prompts lacking specificity might lead to increased uncertainty in the output, and the aesthetic level may vary." If you're still on the fence about Stable Diffusion 3.5 and its "uncertainty" drives you off, here's a bit of futureproofing for you: it supports "negative prompts," meaning your prompt can include instructions not to do things.
This negative prompt support is a massive boon for those who want to refine text and image generation without jumping through hoops, and a nice addition for those who want a bit more control over their generations. The model also seems pretty good at handling the good old SDXL style of prompting. In fact, in some ways, SD3.5's prompting style is closer to Midjourney's than Flux's, allowing users to get creative without needing a degree in linguistics. Beyond customization, Stable Diffusion 3.5 moves forward in other areas.

And yes, it's worth mentioning -- it's uncensored. SD3.5 Large can produce certain types of content, including nudity, without too much difficulty, though it's not perfect. For better or worse, the model isn't purposely restricted, so it offers users full creative freedom (though fine-tuning and some specific prompts may be required for best results). Such restriction was heavily criticized when SD3 launched and was pointed out as one of the main reasons it failed so hard at anatomy comprehension. We could confirm the model's ability to generate NSFW imagery; it is not on the same level as the best Flux finetunes, but it is comparable to the original Flux models.

Fair warning, though: as powerful as SD3.5 is, NSFW furry artists shouldn't expect a Pony Diffusion model anytime soon -- or at all. The creator of the most popular and powerful NSFW model confirmed they are not interested in developing an SD3.5 finetune. Instead, they chose to build their models using AuraFlow as a base; once they are done, they may consider Flux.

For the tinkerers out there, ComfyUI now supports Stable Diffusion 3.5, allowing local inference with its signature node-based workflows. There are plenty of workflow examples ready to go, and if you're struggling with lower RAM but want to try the full SD3.5 experience, Comfy rolled out an experimental fp8-scaled model that lowers memory usage.
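The LoRA technique mentioned above is compact enough to sketch: the base weight matrix W stays frozen, and only a low-rank update B·A (with rank r far smaller than the matrix dimensions) is trained and added on top. A minimal plain-Python illustration; the names and shapes are ours, not Stability's training guide:

```python
def matvec(M, x):
    # Multiply matrix M (list of rows) by vector x.
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    """Apply a frozen weight matrix W plus a low-rank LoRA update.

    W: d_out x d_in (frozen base weights)
    A: r x d_in, B: d_out x r (the small trainable adapter, rank r)
    Only A and B are trained, so the adapter stores r*(d_in + d_out)
    parameters instead of d_out*d_in -- which is why a LoRA for a
    style or subject is tiny compared to the base model.
    """
    r = len(A)
    base = matvec(W, x)   # frozen path, unchanged by fine-tuning
    down = matvec(A, x)   # project input down to rank r
    up = matvec(B, down)  # project back up to d_out
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, up)]
```

Setting the adapter to zero recovers the base model exactly, so LoRAs can be swapped in and out (or blended) without touching the frozen weights.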
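Negative prompts are commonly implemented through classifier-free guidance: the sampler runs the denoiser twice per step, once conditioned on the prompt and once on the negative prompt, then extrapolates away from the latter. A sketch of that combination step (illustrative of the general technique, not SD3.5's actual sampler code):

```python
def cfg_step(eps_positive, eps_negative, guidance_scale):
    """Combine two denoiser predictions into one guided prediction.

    eps_positive: noise prediction conditioned on the prompt
    eps_negative: prediction conditioned on the negative prompt
                  (or the empty prompt when none is given)
    The guided estimate pushes each sampling step toward the prompt
    and away from whatever the negative prompt describes.
    """
    return [n + guidance_scale * (p - n)
            for p, n in zip(eps_positive, eps_negative)]
```

When the two predictions agree, the guidance term vanishes and the step is unchanged; the larger the guidance scale, the harder the sampler steers away from the negative prompt.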
On October 29, we'll get our hands on Stable Diffusion 3.5 Medium, and not long after, Stability has promised to release ControlNets for SD 3.5. ControlNets bring advanced control features tailored for professional use cases, and they could well take SD 3.5's power to the next level. If you want to know more about them, you can read our brief guide for SD 1.5. In short, ControlNets let users do things like choose their subject's pose, play around with depth maps, reimagine a scene from a scribble, and more.

So, is Stable Diffusion 3.5 a Flux killer? Not quite, but it's definitely starting to look like a contender. Some users will still nitpick, especially after the drama of the SD3 Medium flop. But with better anatomy handling, a clearer license, and significant improvements in prompt adherence and output quality, it's hard to argue this isn't a big step forward. Stability AI is learning from past mistakes and moving toward a future where advanced AI tools are more accessible to all.
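Conceptually, a ControlNet runs a trainable copy of part of the base network on the conditioning signal (a pose, a depth map, a scribble) and adds its features back through a zero-initialized projection, so training starts from the untouched base model. A toy NumPy sketch of that idea, with illustrative shapes and single-matrix "blocks" standing in for real network layers:

```python
# Toy sketch of the ControlNet pattern: frozen base block, trainable
# copy for the control branch, zero-initialized output projection.
import numpy as np

rng = np.random.default_rng(2)
d = 16
W_base = rng.normal(size=(d, d))  # frozen base block
W_ctrl = W_base.copy()            # trainable copy fed the condition
W_zero = np.zeros((d, d))         # zero-init projection ("zero conv")

def block(x, cond):
    base = W_base @ x
    control = W_zero @ (W_ctrl @ cond)  # contributes nothing at init
    return base + control

x = rng.normal(size=d)     # features from the noisy latent
cond = rng.normal(size=d)  # encoded pose / depth / scribble condition
print(np.allclose(block(x, cond), W_base @ x))  # True before training
```

Because the control branch starts out silent, a ControlNet can be trained on top of a finished model without degrading it, which is why these add-ons tend to arrive after the base release, as promised here for SD 3.5.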
Stability AI releases Stable Diffusion 3.5, addressing previous issues and offering enhanced image quality, prompt adherence, and diversity in AI-generated images.
Stability AI has unveiled Stable Diffusion 3.5, a significant upgrade to its open-source AI image generation model. This release aims to address the shortcomings of its predecessor, Stable Diffusion 3, which faced criticism for producing unrealistic and distorted images [1][2].
The new model boasts several enhancements, including better prompt adherence, improved anatomy and image quality, and easier fine-tuning.
Stable Diffusion 3.5 is available in three variants: Large, Large Turbo, and Medium (arriving October 29).
Stable Diffusion 3.5 is positioned to compete with other major AI image generators, such as Flux and Midjourney.
Stability AI acknowledges the issues with Stable Diffusion 3, particularly its tendency to produce distorted images.
The release of Stable Diffusion 3.5 marks a significant step in AI image generation.
As AI image generation continues to evolve, Stable Diffusion 3.5 represents a notable advancement in the field, promising more realistic, diverse, and user-friendly AI-generated imagery.