
DALL-E 2. Artificial intelligence creating art, graphics, photos

How does it feel to be one of the first people in the world with access to beta tests of the most advanced artificial intelligence capable of creating art? Absolutely wonderful :). If you had asked me three years ago when we would be able to talk relatively freely with A.I., or when it would start creating its own works, I would have said maybe around 2040. Asked the same question today, I wouldn’t have to guess – the answer is obvious: A.I. is already at this stage (though still in beta). I wrote this article at the beginning of the closed beta tests of DALL-E 2, but have updated it multiple times since. Last update: December 2022.

DALL-E 2 is the most advanced “artistic” artificial intelligence. It generated all the graphics and “photos” in this article for me (following my guidelines, of course). DALL-E 2 from OpenAI should not be confused with DALL-E Mini, now known as Craiyon. Mini is a simple A.I. with nothing in common with the original.

How DALL-E 2 performs in different applications, what limitations it has, and how I managed to skip ahead of several million people in the beta queue – I will elaborate on all of this shortly. But first, let me clarify that these are not photo montages or manipulated images and graphics created by someone else earlier. Everything was done from scratch for the purpose of this article.

Just as one hundred people asked to create the same thing would deliver one hundred different results, asking artificial intelligence for the same thing one hundred times will return something different each time (although most likely still consistent with the description).

Newsletter. Don’t miss the additional materials!

I’m preparing similar publications about the best A.I. currently available, both strictly for photos and for graphics. There will even be an article about the best A.I. for upscaling graphic files, with numerous examples of how DALL-E images can look when upscaled to 4K.

There is also a publication coming up about photorealism in Midjourney. Additionally, I will show live how I use A.I. and prepare a list of the most useful A.I.-based tools. Leave your email and I’ll let you know about it.


The previous examples resemble photographs, but maybe I want something more in the style of digital art:

It can be almost anything that comes to my mind, including various painting styles. Further on, I show how the backstage of photo shoots would look as paintings in the style of Picasso, Rembrandt, etc. To demonstrate the possibilities, I include not just photorealistic images but a variety of other styles as well.

Presenting the capabilities of DALL-E 2 has intensified discussions on how A.I. will affect our creativity.

Annihilation of graphic designers and photographers?

In various creative professions, there is concern about whether A.I. will take over their jobs within the next few years (while not long ago it was thought that these professions would survive the longest). Will an illustrator still be needed if a book author can give the same command to A.I.? What about other graphic designers, such as concept artists? Will some specialized photographic jobs disappear, and so on? I have been asked about this constantly in recent days.

When a friend asked me to generate a few “photos” and saw the results, he wrote: “They will make billions of dollars from this. They’re putting photographers out of business.” If we’re talking about those who specialize in stock photos, there’s no doubt that demand for their services will drop drastically. Needing a portrait of a Dalmatian on a black background or a “cool-looking cat” in glasses and a leather jacket, I can start digging through stock websites – or simply tell artificial intelligence what I’m looking for and have it done within seconds.

Moreover, stock sites will probably operate differently in the future. Among other things, we will be able to generate photos with A.I. directly on them, indicate what we want changed, and so on.

Lookbooks are simple catalog photos, usually taken on a uniform background and made in large quantities. In China, it is normal for a model to change outfits several times within a minute and to keep that pace all day. Such photographs will certainly be done virtually someday, but just a month ago I was convinced this would be achieved with photorealistic avatars and 3D technology. Now I see that A.I. can easily be trained to do it hundreds of times faster in two dimensions. Because let’s face it – this type of photography is not far removed from assembly-line work and begs for automation. In most areas of photography, however, the point is to commemorate something from reality, so I think the development of A.I. will not interfere with it much; if anything, it will give photographers greater possibilities.

Besides, other A.I. can already help a lot with photos. Topaz Photo AI can recover faces from photos that lack sharpness, increase resolution, and dramatically reduce noise in very noisy photos. In the photo below, autofocus missed the target, but I ran the file through Topaz and this is the result:

This and many other examples of mine can be seen in the article: Topaz Photo AI – Artificial Intelligence for Saving Photos: Enlarging, Sharpening and Denoising.

How to significantly improve the quality of graphics from DALL-E and other AIs?

In this article, I am posting all the graphics from DALL-E 2 without modification, which means the resolution is a prehistoric 1024×1024 px. Photographers know, however, that one of the functions of the aforementioned Topaz Photo AI is upscaling photos and graphics, so it can be used to improve the resolution of what DALL-E generates. And if intelligent upscaling is the only function you need, Topaz has something much cheaper: the Topaz Gigapixel AI app. For anyone seriously considering using graphics generated in Midjourney, Stable Diffusion, DALL-E 2, etc., Gigapixel or Photo AI will be a huge upgrade in quality:

Of course, the results will not always be amazing, but they are the best you can get at the moment – Topaz currently offers the best upscaling available.

AI helps photographers

A very quick but interesting aside about A.I. in photography. I achieved this effect with a single mouse click, thanks to the artificial intelligence Retouch4me:

This is the first AI capable of achieving results like a professional retoucher – without the cheesy, blurred skin known from all the other skin-“improving” programs. I also wrote an article about it, with many before-and-after examples: Retouch4me: Automatic photo editing, powered by artificial intelligence. So I believe that at the current stage, AI provides immense help to photographers.

However, let’s leave the photos behind and return to generating graphics…

AI vs. work

Artificial intelligence may not necessarily take jobs away from professional designers (although in some cases it may), but it can simplify and accelerate the process tremendously. In concept art, creating a first version with A.I. takes at least 100 times less time than doing it traditionally. I could test multiple ideas and ultimately create something better, or get there faster. The need for creative ideas will never disappear, and no one pays top dollar for concepts just because they look good; they want them to be innovative, to solve various problems, to push graphic engines to their limits – among other things. A.I. will undoubtedly play a significant role in the future of this industry, as in other areas; however, that is too distant a future to consider seriously right now.

Creating variations of what I want in DALL-E currently takes me 5-10 seconds, and every time I receive exactly 4 proposals (their number has been reduced from 6 in the current version of DALL-E). The possibilities are greater still, because you can create variations… of variations, and edit them, but more on that later. Glowing mushrooms in a natural environment with smoke? A few seconds later:

The progress made in artificial intelligence over the last 3 years is unimaginable. Without following the A.I. industry closely, though, it is easy not to realize this. After all, most bots on websites are still as dumb as a rock, just like the voice assistants on our phones. But there are also creations such as GPT, which I believe is the most advanced A.I. – it makes Siri and Alexa seem like they come from before the invention of the wheel.

I would have to converse with GPT for a really long time to realize it’s not human. DALL-E is based on GPT; however, the version from a year ago didn’t make a big impression. It was only recently that results from version 2 were presented, and the internet exploded – even media completely unrelated to A.I. wrote about it extensively. At the same time, many people began showing graphics created by DALL-E Mini, but as I mentioned at the beginning, that is an entirely different, small project that has nothing to do with the original, which is evident in its outcomes – it simply rides on the popularity of the original’s name. In this article, I am focusing specifically on DALL-E 2.

This is still an early stage of development – the graphics have various artifacts, some elements are not coherent, etc. It is possible to improve selected fragments, although I have not done so in the graphics chosen for this article. But imagine what the final version will look like if the very beginning of beta testing yields such results.

Artists are testing DALL-E 2

I was supposed to reveal how, despite the millions of people applying for the beta tests, I found myself in them right from the start. It was due to my involvement in photography, 3D graphics and concept art. The order of applications didn’t matter – what mattered were artistic abilities. I also had to attend a meeting where everything related to DALL-E was explained and various questions could be asked. For example, why can’t naked characters be generated in the beta? Because there is no certainty yet that DALL-E can always distinguish between adults and children. It supposedly can, but if it ever made a mistake, there would be a huge problem.

At first, I didn’t fully understand why OpenAI was so interested in starting with artists. Now, after spending dozens of hours with DALL-E and on the beta testers’ Discord server, everything is clear. The first reason is obvious – creativity, i.e., coming up with images that A.I. may initially struggle with. But we were also chosen because we know the differences between focal lengths, exposure times, lighting setups, rendering engines, graphic styles and the styles of various artists; we can define frame composition, etc. We have a specific final effect in mind and can describe it very accurately. We take all this into account when issuing commands to DALL-E, check which parameters work best, and get better results over time.

Among the testers there are also people who do not know much about photography – who do not understand how aperture or exposure time affects the look of a photo, how subjects deform depending on their distance from the lens, and so on. It is certainly harder for them to achieve predictable results, but they ask about the basics on Discord, and in truth that is all they need.

On the other hand, I’ve seen many sneaker concepts made by DALL-E, but since I know almost nothing about shoes, I wouldn’t be able to explain to A.I. what the ones I just imagined look like. So having as much knowledge as possible about everything you want to do comes in handy.

OpenAI, the creator of GPT and DALL-E, does not itself know the best way to address its A.I. Since it is artificial intelligence, there are no specific programmed commands – it simply responds to a prompt as it deems appropriate for the situation. The more precise the command, the closer (usually) the result will be to expectations. A character may be of a different age or gender and presented in a variety of styles; a building may be shown at different times of day with different lighting and colors; therefore it is important to specify everything accurately. For example, entering “Photo of slim girl, 20yo, close-up, high detail, studio, smoke, sharp, pink violet light, 85mm sigma art lens” will yield less surprising results than a shorter description.
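
As an aside: since these beta tests began, OpenAI has also released a public Images API (November 2022) that accepts the same kind of prompts. A minimal sketch in Python – assuming the pre-1.0 openai package and an API key in the OPENAI_API_KEY environment variable – might look like this:

```python
# Minimal sketch: generating images from a text prompt via OpenAI's public
# Images API (openai package < 1.0). The prompt mirrors the example above.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Image.create(
    prompt="Photo of slim girl, 20yo, close-up, high detail, studio, "
           "smoke, sharp, pink violet light, 85mm sigma art lens",
    n=4,                # the current beta likewise returns four proposals
    size="1024x1024",   # DALL-E 2's maximum output resolution
)

for i, item in enumerate(response["data"]):
    print(i, item["url"])  # each URL points to one generated proposal
```

The principle is the same as in the web interface: the more parameters the prompt pins down (lens, lighting, framing), the less the results drift between calls.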

But I can go much further… “Photo of robot with 20yo girl inside, LEDs visor helmet, profile pose, bust shot, high detail, studio, black background, smoke, sharp, cyberpunk, 85mm Sigma Art lens”:

Or something simpler and different: “cyberpunk church, high detail, smoke, sharpness, neon lights, neon cross”.

Below, I wanted “photo of dark temple, golden treasure, high detail, smoke, sharp, fog”.

And something more mundane: “House on fire at night, high detail, smoke, sharp, fog, darkness”.

A.I. in the service of photoshoots

On the list of photo shoots I had planned, there are portraits of cats wearing necklaces and crowns. But I’m not sure whether I prefer a classic cat or a Sphynx. So I asked DALL-E for photos of both – maybe it will help me decide:

I didn’t specify the lens, so the Sphynx on the black background came out caricatured – the same close framing was rendered as if shot with a wide-angle lens.

I am an influencer

Of course, I could also ask for regular photos of kittens to boost my social media:

And since we’re already talking about gaining fame on social media platforms… Hey DALL-E, “generate a picture of a fancy breakfast for Instagram.”

Another time it took the word “Instagram” to heart: instead of neutral colors, the generated graphics immediately had filters applied.

Or maybe a photo of cheesecake? Being an influencer has never been so easy…

Backstage of a photoshoot, painted in the style of well-known artists

DALL-E 2 recognizes various painting styles (and more). Let’s take a look at what the backstage of photo shoots would look like as pastiches of the greatest painters:

Rembrandt:

Rubens:

Picasso:

Pollock:

Vincent van Gogh:

Monet:

Salvador Dalí:

Warhol:

There is a lot of discussion about the legality of such graphics, as they can also be created in the styles of living artists. For now, the law cannot keep up with this issue, and Stable Diffusion, for example, has withdrawn the option of basing a style on a specific artist.

Editing your own graphics

Instead of generating images from scratch, you can upload something of your own and tell DALL-E what to do with it. Unfortunately, I have no drawing skills, but I am good with 3D, so I loaded into DALL-E a robot render that I designed last year:

I uploaded the above file and, without entering any commands, simply requested other variations of it. Here are some of the suggestions I received:

Unfortunately, the level of detail is much lower than in the original and in most “images” generated from scratch. There are many artifacts, because DALL-E does not yet do well at combining mechanical parts in a way that makes sense. However, the second option maintains shape consistency with the original and suggests how I could attach the hands to the body differently. Based on the proposed head, I could also create something interesting, and the first proposal’s head inspired me to start a completely new project. I think DALL-E can be useful as a tool for concept graphics, which many people in this industry confirm.
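
For reference, the same “variations” operation is exposed in the public Images API mentioned earlier. A sketch under the same assumptions (the file name is just an illustration):

```python
# Sketch: requesting variations of an uploaded image, as done above in the
# web UI. The API requires a square PNG file under 4 MB.
import openai

with open("robot_render.png", "rb") as f:  # hypothetical file name
    response = openai.Image.create_variation(
        image=f,
        n=4,
        size="1024x1024",
    )

print(response["data"][0]["url"])  # first of the proposed variations
```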

Let’s see how it handles a close-up of the hand. The original is in the top left corner; the rest are variants proposed by A.I.

I was expecting a modified hand, but received completely different elements. But what if I told DALL-E to create a robot from scratch? I wanted a white humanoid robot with some red parts, rich in detail:

Much better. Considering that you can generate hundreds of variations, there is a good chance that many will be cool.

Editing your own photographs

Unfortunately, in the beta version it is not possible to upload photos with realistic faces. Since I photograph fashion models, I don’t have much opportunity to test DALL-E on my own photos. I even tried photos where the face was partially covered by a hand, or by a hat covering everything above the mouth, but I still received a message about the inability to upload photos with faces. In a way this is good – it means that despite this restriction, DALL-E understands what is in the photo. So I dug up some old motorcycle pictures from several years ago (from my HDR Hole phase), selected one and checked what variations would be generated.

Original:

Variations:

These were not very satisfying results, so I decided to specify the changes: I asked for slick tires and removal of the license plate. The result was rather poor.

Maybe I need more practice with editing photos in this way (i.e. issuing AI commands), but for now, it’s not working for me.
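
For completeness: in the public Images API, this kind of targeted change goes through the edit endpoint, where a mask marks the area to repaint. A sketch under those assumptions (the file names and the prompt are illustrative, not the exact ones I used):

```python
# Sketch: editing an uploaded photo via the Images API. The transparent
# region of mask.png tells DALL-E which pixels it may repaint; the prompt
# describes the desired final image, not just the change.
import openai

with open("motorcycle.png", "rb") as img, open("mask.png", "rb") as mask:
    response = openai.Image.create_edit(
        image=img,
        mask=mask,
        prompt="motorcycle with slick racing tires and no license plate",
        n=4,
        size="1024x1024",
    )

print(response["data"][0]["url"])
```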

Alternative, artistic A.I.

A similar A.I. to DALL-E is Imagen, created by Google. It looks very promising, but it cannot be tested yet. Another artistic artificial intelligence is Midjourney, which is taking the internet by storm because it is significantly cheaper than DALL-E and access to its closed beta was easier.

Since I wrote the first version of this article, changes in Midjourney have been absolutely enormous. Especially in character generation.

Midjourney AI v4

It is still not at the level of DALL-E – the characters are more “plastic” and there is much less “creativity” (commands give very repetitive results) – but it is developing excellently.

Midjourney AI v4

The next AI of this type is the very slow Disco Diffusion, which was popular only for a very short time due to the much faster Stable Diffusion, which is open source and can easily be run on your own computer (although without a very fast graphics card with a lot of VRAM, fast results are out of reach).
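
To illustrate how low the barrier is, this is roughly all it takes to run Stable Diffusion locally with Hugging Face’s diffusers package – a sketch assuming the widely used v1.5 checkpoint and an NVIDIA GPU, not something I benchmarked for this article:

```python
# Sketch: running Stable Diffusion locally with the `diffusers` package.
# Requires a CUDA GPU; float16 weights roughly halve the VRAM needed.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("cyberpunk church, high detail, smoke, neon lights").images[0]
image.save("cyberpunk_church.png")
```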

Summary

In beta testing, DALL-E 2 is still incomplete. The results have various errors and artifacts, some commands give poor results, and sometimes the output is far from what was expected. Very often, though, the results are great, and because several options are always proposed, the chances that at least one of them will be good are very high. However, since the number of proposals was reduced from six to four, I have to retry more often to get something really good.

The beta version has many limitations – I already mentioned the lack of nudity and of uploading photos with faces. Characters that are part of a larger frame have blurred faces as well. It is also not possible to generate images of famous people – not only because of the risk of violating their image rights, but also because we know how that would end up. Political content is likewise prohibited, and only G-rated topics can be generated – meaning no violence or weapons.

But so far these are all beta-test restrictions. We don’t know when the final version will come out.

Here you can sign up for beta tests, and on Twitter in this thread, I regularly post new things from DALL-E, so I invite you to follow my profile.


Updates

UPDATE [21.07.2022 DALL-E 2 Monetization]: The beta no longer gives 50 credits per day, only 15 credits per month plus 50 at the start. Additional credits can be purchased: 115 credits cost $15, and each generation costs one credit – about $0.13 per generation.

UPDATE [02.12.2022]: The beta has been open to everyone for a while now.

UPDATE [14.12.2022]: I added information on how to upscale graphics using Gigapixel AI.

UPDATE [31.12.2022]: Updated information about Midjourney v4 capabilities.
