(CNN Business) It only took Matt Laming, a 19-year-old from the United Kingdom, about a month to hit a million followers on Twitter. And all it required was sharing a steady stream of the most outlandish computer-generated images that he and a bunch of internet strangers could think up.
In recent weeks, the digital marketing apprentice, better known online as @weirddalle, has shared images depicting things like people vacuuming in the forest, the Demogorgon from Netflix’s “Stranger Things” holding a basketball, and a Beanie Baby that looks a lot like Danny DeVito.
These and other pictures, which range from ridiculous to disturbing, were created with a freely available artificial intelligence system called Craiyon. To use it, you just type what you’d like it to envision — “A rainbow lion eating a slice of pizza” — and it will spit out pictures in response.
“I think that’s the main draw of it: You can make anything a reality,” Laming said in an interview with CNN Business.
The brainchild of Boris Dayma, an Austin-based machine-learning engineer, Craiyon is popularizing a growing trend in AI. Computers are getting better and better at ingesting words and producing increasingly realistic-looking images in response. Lately, people are typing in about 5 million prompts per day, Dayma said.
Similar but much more powerful AI systems exist, such as OpenAI’s DALL-E (Craiyon was initially named DALL-E Mini as an homage) and DALL-E 2, as well as Google’s Imagen. But unlike Craiyon, which anyone can try, most of these are not available to the public: DALL-E 2 is open to users via invitation only, while Imagen has not been opened up to users outside Google.
“I think it’s important to be able to have an alternative where everybody has the same access to this type of technology,” Dayma said.
In the process, however, Craiyon is effectively acting as a trial run for what could happen — good or bad — in the future if anyone can access such AI systems and solicit any kind of image from them with just a few words. And as with many nascent technologies, it is a work in progress; in the near term, if left unchecked, it may produce outcomes that reinforce stereotypes and biases.
The Notorious BFG
Dayma and some other coders built the AI system last July during a hackathon hosted by Google and Hugging Face, a company that builds and hosts AI models. Initially, Dayma said, he built it as a technical challenge; he thought DALL-E was cool and he wanted to make it himself. He posted the text-to-image generator — then called DALL-E Mini — on Hugging Face, where anyone could try it out (it’s still available there under that name). But it didn’t get much attention beyond the AI community until the past couple of months, perhaps due to the limited quality of the images it could produce.
In the past, for example, it could envision only simple things like a landscape, Dayma said. But little by little, he’s done things such as fixing bugs and improving code, enabling it to get better at coming up with more complicated images, such as the Eiffel Tower landing on the moon.
“When the model started drawing that, I was very happy,” he said. “But then people came up with things even more creative, and somehow the model reached a moment where it was able to do something that looked like what they asked for, and I think that was a turning point.”
The images Craiyon generates are not nearly as realistic-looking as what DALL-E 2 or Imagen can come up with, but they’re fascinating nonetheless: People tend to blur into objects, and images look fuzzy and at least slightly askew.
For now, Craiyon is mostly being used for fun by people like Laming — perhaps in part because its results are not nearly as crisp or photorealistic as the images you can get from DALL-E 2 or Imagen, but also because people are still trying to figure out what to do with it. (The Craiyon website currently runs ads to recoup costs for the servers that power the AI system, and Dayma said he’s trying to figure out how to make money from it while also allowing people to play with it for free.)
Many of the images Laming posts to Twitter come from a Reddit forum he created for people to post the prompts and resulting images they got when they ran them through the system. This is the same approach he takes for another Twitter account he runs, @spotifyweird, which tweets strange Spotify playlists.
Laming’s most popular tweet so far was a post on June 14 with the prompt “Fisher Price guillotine,” which was initially posted to his subreddit by a Reddit user. Popular posts may take an item from news or pop culture and mash it up with something completely random or shocking or gross — such as Minions-themed urinals — or simply come up with a funny play on words (think “The Notorious BFG” or “Ice Cube in an ice cube”).
As users get more familiar with the kinds of results Craiyon can produce, the prompts get increasingly specific in terms of the types of imagery they want to see, such as calling for a medical illustration of a burrito or courtroom sketches showing what it might look like if a capybara sued Elon Musk. Sometimes they’re just really weird, such as a depiction of archaeologists discovering a plastic chair.
To come up with a good prompt, Laming suggested, just “think of the most outlandish situation to put someone or something in.” In effect, the prompts that lead to these pictures are themselves arguably a new form of creativity.
Biases on display
Mar Hicks, an associate professor at the Illinois Institute of Technology who studies the history of technology, said this AI system reminds them of early chatbots such as Eliza, a computer program built by MIT professor Joseph Weizenbaum in the 1960s and meant to mimic a therapist. Such programs could convince people they were communicating with another human, even though the computer didn’t truly understand what it was being told (Eliza gave scripted responses).
“I think it’s appealing the same way that a game of chance is appealing, or a party game,” Hicks said. “Where there’s some level of uncertainty about what’s going to happen.”
But Hicks is concerned about the AI system’s ability to respond to any written prompt with images, rather than occasionally giving an error message indicating it doesn’t know how to parse the phrases a person typed. “That means you will be getting garbage out sometimes,” they said, and the onus is on the users to figure out why. This was the case with some prompts I fed Craiyon, making it occasionally disappointing and frustrating to use, but Dayma pointed out that it’s not easy to predict what it can or can’t draw, and sometimes the results are surprising, or at least surprisingly weird.
Dayma said he’s heard from people using Craiyon to come up with a logo for a new business and as imagery in videos. (OpenAI and Google have suggested that their systems might eventually be used for things like image editing and generating stock images.)
While there may be creative possibilities for these AI systems, they share a key problem that pervades the AI industry at large: bias. They’re all trained on data that includes wide swaths of the internet, which means the images they create can also lay bare a host of biases including gender, racial, and social stereotypes.
Such biases are evident even in Craiyon’s fuzzy-looking images. And because anyone can type anything they want into it, it can be a disturbing window into how stereotypes can seep into AI. I recently gave Craiyon the prompt “a lawyer,” for instance, and the results were all blurry images of what appeared to be men in black judge’s robes. The prompt “a teacher,” meanwhile, yielded only figures that appeared to be women, each in a button-down shirt.
Dayma is aware of this. A “Frequently asked questions” section on Craiyon’s website mentions that the model’s reliance on internet data may result in “images that contain harmful stereotypes,” and that those behind Craiyon are working to document and analyze its biases. Dayma noted that many AI systems are biased, whether or not users are aware of it, and said he likes that everyone can observe Craiyon’s biases directly in the images it makes.
He also said that he tried to prevent the model behind Craiyon from learning certain concepts to start with. However, it only took me a few minutes to come up with some explicit prompts that yielded images that are, to put it bluntly, not safe for work.
Asked whether he thinks its general availability could be a bad thing, given its obvious biases, he pointed out that the images it comes up with, while better-looking than in the past, are clearly not realistic.
“If I draw the Eiffel Tower on the moon, I hope nobody believes the Eiffel Tower is really on the moon,” he said.