This is part 2 of the introductory posts that explain the context and personal reasons
for starting "Diffusion Pilot", focusing on theory.
Catch up in part 1, or plunge deeper and learn about the tools and techniques in the level 2 and 3 posts.
The very first sparks of the short film that led me to start "Diffusion Pilot" date back more than 3 years, to when I received the pleasant news of being accepted into EKA and its animation program, yet I am still far from finishing it. What the hell have I been doing for so long? Starting a blog is only one of those things.
Following my research interests detailed in part 1, I got stuck elbow deep in Deep Learning for a good while, which I present in part 3. Here though, I'd like to address the context of my studies, and the overall irony of manual artistic labor in the middle of the generative AI takeover.
The never ending process of a solo animated film project
In my animation MA study program, in addition to other courses, one develops, directs and produces an animated short film, usually doing all the practical labor alone, with some opportunities to collaborate with music and sound creatives. Some of my peers possess the immense discipline and focus to power through and graduate at the end of this two-year program. However, many get easily lost in other creative opportunities or distractions that appear throughout those years. More importantly, like I mentioned in my first post, the creative freedom that the students are allowed instills a temptation to strive for greatness and large scope. I would even say that the endless labor associated with animation production is sometimes almost fetishized in the community. We nurture our independent films like offspring, meditating and exploring the same mindspace for years until the film is finished and we finally move on. This can go many ways, likely being exhausting and putting the author's passion and patience to the test, hopefully being rewarding in the end, and possibly depressing somewhere along the way.
Personally, with no financial support, I have decided to get the most from it, allowing myself to kindle
the most outlandish, neurotic and convoluted process for making an animated film. That is why it has been such an uphill push for the most part, but
willingly so. It has become the kind of film I couldn't pitch or form a team
around, it's...
...the kind of creative process that is only possible to do in a solitary psychosis.
This process of making an animated film by yourself often ends up intertwining with the psychology of the author. It's a recurring theme in the field, partially shared with arts and hustle culture, but significantly more severe than in some other fields. My friend and study colleague, John F. Quirk, did a meta-analysis of this process for his thesis through several interviews, and structured it into a sort of self-help guide for animators titled "Sustaining Enthusiasm in the Animated Medium".
So then, remembering the keyword from
part 1 of the intro, why don't we all drop our pens and puppets, and just make AI generate our
films? Harvest both ends of the Joy curve and shove AI into the pit of
grind, in the act that some call "democratization of creativity".
[insert typical comment about being/not-being nice to our future AI
overlords...]
AI, or as I would put it, "Deep-learning enabled creativity", like every
new technology, brings waves of new styles, genres, and techniques that
make end goals easier, but tradition always lives on. This dynamic is a whole
different thesis topic of its own, being a hot topic right now, actively
dissected through artworks, discussions and writing.
But to put it simply, it's not so simple. The tradition of animation is immensely interlinked with copious patience - the act of animating, directing, and the process itself.
Everybody is an artist, everybody is an animator!
"What a time to be alive!" Károly Zsolnai-Fehér from "Two Minute Papers" shouts every week, with each machine learning paper getting closer and closer to direct idea-to-result AI magic. Then commenters make the typical bets on when we'll have a mind-reading Netflix generating our infinite entertainment on demand.
For static imagery, the idea-to-result (text-to-image) magic is already very much in effect and felt in the industry, but for video, the hill is yet to be conquered. At the time of writing, the readily available state-of-the-art tool is Runway's "Gen-2", generating impressive, yet wobbly and typically "AI-awkward" results. *(Since then, Sora by OpenAI has shaken the scene up, perhaps sooner than most expected to see anything of this fidelity)
A typical animator, knowingly or subconsciously, is likely not going through so much trouble solely to get the final video file out there, and may not be the one searching for ways to generate endless films with just a few keystrokes in a text prompt. The process, labor, and contact with the material are celebrated and cherished just as much as, if not more than, the end result. I would say this applies both on the individual level and in terms of recognition in the independent film industry.
From my personal observations so far at the time of writing this, the majority of generative AI users are not the usual creatives, who have been honing their craft and skills to manifest ideas into concrete creations. Rather, it is a new wave of people who haven't before had access to making visual art in such a large array of styles, so easily. Their values and sense of fulfillment may be vastly different from those of the typical hardcore artist/crafts-person, which sometimes creates a sense of divide and angst, not to mention the emerging culture, the newly forming skills such as "prompt engineering", and their "heroes".
This emerging "cheat" of making art is a monstrous topic with lots of philosophical baggage, from which I am moving along. I bring it up to highlight the two contrasting ways of looking into animation through/with AI right now, and to offer my take on it from one of those sides, from the perspective of a rather hardcore animator.
As a result of my process-oriented animator mentality and deep diving into this tech, I'm making rather profound discoveries. I realize there are vast, novel techniques and forms of expression for visual artists to explore using current generative AI tools, even if you're a person sworn to honing your meticulous control and manual process. Like I mention in part 1, it can be viewed as a medium of its own. By "it" I mean, more specifically, the entire possibility space of a generative image model such as Stable Diffusion, also referred to as the "latent space". A "neural-network clay" that is not in all cases a "replacement" of animators, but something unique to be sculpted on a deep level, which I'd say is often overlooked in the heated Twitter debates and radicalized battles of opinions.
This realization took me a while though, because I was neglecting the hype of the text-to-image paradigm. It felt to me like an overly simplified and crude way to "carve" into this new medium and explore that possibility space. It didn't feel hands-on, not even in the most complex forms of "prompt engineering", and it lacked agency and depth in its process. However, there was a key moment in my search where it all drastically shifted and clicked for me, mostly thanks to "ControlNets". In part 3, I review that search and how ControlNets fit into it, while development posts will delve deeper into the technicalities of animating with them.
Notable resources regarding Art & AI
My thesis and Diffusion Pilot are very much focused on today, experimenting with the
newborn technology as it is rapidly growing towards something with potentially
profound implications for our society and art culture.
For those interested in the latter, here are some selected examples that juggle the topics I am mostly walking past in
my research - the plethora of ethical, cultural, economic, psychological and
other factors tied to the current-day AI rush, in the face of the Arts.
"AI & Creativity"
Philippe Pasquier (Metacreation Lab)
and Martin Pichlmair
A great crash course on what's happening right now with AI and Arts, before it inevitably gets outdated in the reckless AI race. The first half is packed with intriguing examples of what they call "Creative AI" across many domains, not just static images. This is coming from the "Metacreation Lab" where they intersect Science, Art and Applications by building, testing, creating art with, and eventually deploying tools and toys as applications.
The second part is like a Yin to the first part's
Yang, focusing instead on the challenges and issues arising with the
emergence of Creative AI. Really great stuff.
My favorite quote from
Martin's talk (rephrased):
Using generative AI is like tapping into some weird global unconscious, that is boiled down into a database of statistics and probabilities between artifacts of human expression.
Craft 2.0 : Collaboration with Artificial Intelligence
Haeun Kim (2023)
Spotted among the graduating projects in EKA this year, this MA thesis examines the different types of human-AI collaboration, covering vast conceptual ground, while also actually executing on each of the types through practical work with ceramics. I have big respect for the author, who navigates the topic very elegantly and thoroughly, without sensationalism or panic about AI. She covers the most prominent types of AI currently accessible with actual care about how each one of them works, searching for moments of synergy between her creative process and the AI's input.
The thesis also has very good choices for case studies, and even compares itself against them in this 2D diagram. I love the two dimensions along which Haeun chooses to place the projects, and I bet one could add more dimensionality. It shows the possible variety of AI-related artistic research. When I am done with my thesis, maybe I will try to see where on this graph I end up.
Haeun has her thesis in an accessible and attractive booklet format that you can also see digitally on her website.
Video essay "The AI Art Apocalypse"
Tim spends over 2 hours on his video essay about AI and Art not because of an inability to be concise, but because there is THAT MUCH to consider and talk about, especially looking into the future.
Networks to dig into:
Haeun mentions "Ars Electronica" in her thesis, along with its platform
"European ARTificial Intelligence Lab", which offers residencies and
nurtures some of the most cutting-edge AI artistic research.
CAS is like a one-stop shop for the Computer x Art world. I especially
recommend watching any of the
archived talks and presentations on YouTube
from their events, to get a sense of the wildest cocktails of science,
technology, art, and of course AI.
And AIArtists.org I literally
discovered at the time of writing this, while trying to recall other things.
Looks very decent!
With all of this out of the way, it's time to finally get technical! In part 3, I'll close off the Intro series and take you through the process of my search and research that led me to settle on using Stable Diffusion.