The Clockwork Penguin

Daniel Binns is a media theorist and filmmaker tinkering with the weird edges of technology, storytelling, and screen culture. He is the author of Material Media-Making in the Digital Age and currently writes about posthuman poetics, glitchy machines, and speculative media worlds.

Category: Technology

  • Grotesque fascination

    A few weeks back, some colleagues and I were invited to share some new thoughts and ideas on the theme of ‘ecomedia’, as a lovely and unconventional way to launch Simon R. Troon’s newest monograph, Cinematic Encounters with Disaster: Realisms for the Anthropocene. Here’s what I presented; a few scattered scribblings on environmental imaginaries as mediated through AI.


    Grotesque Fascination:

    Reflections from my weekender in the uncanny valley

    In February 2024 OpenAI announced their video generation tool Sora. In the technical paper that accompanied this announcement, they referred to Sora as a ‘world simulator’. Not just Sora, but also DALL-E, Runway, and Midjourney: all of these AI tools further blur and problematise the lines between the real and the virtual. Image and video generation tools re-purpose, re-contextualise, and re-gurgitate how humans perceive their environments and those around them. These tools offer a carnival mirror’s reflection on what we privilege, prioritise, and what we prejudice against in our collective imaginations. In particular today, I want to talk a little bit about how generative AI tools might offer up new ways to relate to nature, and how they might also call into question the ways that we’ve visualised our environment to date.

    AI media generators work from datasets that comprise billions of images, as well as text captions, and sometimes video samples; the model maps all of this information using advanced mathematics in a high-dimensional space, sometimes called the latent space. A random image of noise is then generated and fed through the model (in many diffusion systems, a neural network known as a U-Net), along with a text prompt from the user. The model uses the text to gradually de-noise the image in a way that the model believes is appropriate to the given prompt.
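
    To make the de-noising idea concrete, here is a deliberately tiny sketch in Python. It is not a diffusion model: the three-number 'image', the PROMPT_TO_TARGET lookup, and the toy_denoise function are all hypothetical stand-ins I've invented for illustration. A real system would use a trained network to estimate what to remove at each step; here a simple lookup plays that role.

```python
import random

# Hypothetical stand-in for what a trained model has learned to
# associate with a prompt. Real models learn this from data.
PROMPT_TO_TARGET = {
    "sunset": [0.9, 0.5, 0.1],
    "forest": [0.1, 0.7, 0.2],
}


def toy_denoise(prompt, steps=50, seed=0):
    """Start from random noise and nudge it toward the prompt's target."""
    rng = random.Random(seed)
    target = PROMPT_TO_TARGET[prompt]
    # Pure noise: one random value per 'pixel'.
    image = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(steps):
        # Close a growing fraction of the remaining gap each step;
        # on the final step the fraction is 1, so the gap closes fully.
        rate = 1.0 / (steps - step)
        image = [x + rate * (t - x) for x, t in zip(image, target)]
    return image


result = toy_denoise("forest")
```

    The point is only the shape of the process: noise in, many small corrections guided by the prompt, an image out.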

    In these datasets, there are images of people, of animals, of built and natural environments, of objects and everyday items. These models can generate scenes of the natural world very convincingly. These generations remind me of the open virtual worlds in video games like Skyrim or Horizon: Zero Dawn, where there is a real, visceral sense of connection to these worlds as you move through them. In a similar way, when you’re playing with tools like Leonardo or Midjourney, there can often be visceral, embodied reactions to the images or media that they generate: Shane Denson has written about this in terms of “sublime awe” and “abject cringe”. Like video games, too, AI media generators allow us to observe worlds that we may never see in person. Indeed, some of the landscapes we generate may be completely alien or biologically impossible, at least on this planet, opening up our eyes to different ecological possibilities or environmental arrangements. Visualising or imagining how ecosystems might develop is one way of potentially increasing awareness of those that are remote, unexplored or endangered; we may also be able to imagine how the real natural world might be impacted by our actions in the distant future. These alien visions might also, I suppose, prepare us for encountering different ecosystems and modes of life and biology on other worlds.

    It’s worth considering, though, how this re-visualisation, virtualisation, and re-constitution of environments, be they realistic or not, might change, evolve or hinder our collective mental image, or our capacity to imagine what constitutes ‘Nature’. This experience of generating ecosystems and environments may increase appreciation for our own very real, very tangible natural world and the impacts that we’re having on it, but like all imagined or technically-mediated processes there is always a risk of disconnecting people from that same very real, very tangible world around them. They may well prefer the illusion; they may prefer some kind of perfection, some kind of banal veneer that they can have no real engagement with or impact on. And it’s easy to ignore the staggering environmental impacts of the technology companies pushing these tools when you’re engrossed in an ecosystem of apps and not of animals.

    In previous work, I proposed the concept of virtual environmental attunement, a kind of hyper-awareness of nature that might be enabled or accelerated by virtual worlds or digital experiences. I’m now tempted to revisit that theory in terms of asking how AI tools problematise that possibility. Can we use these tools to materialise or make perceptible something that is intangible, virtual, immaterial? What do we gain or lose when we conceive or imagine, rather than encounter and experience?

    Machine vision puts into sharp relief the limitations of humanity’s perception of the world. But for me there remains a certain romance and beauty and intrigue — a grotesque fascination, if you like — to living in the uncanny valley at the moment, and it’s somewhere that I do want to stay a little bit longer. This is despite the omnipresent feeling of ickiness and uncertainty when playing with these tools, while the licensing of the datasets that they’re trained on remains unclear. For now, though, I’m trying to figure out how connecting with the machine-mind might give some shape or sensation to a broader feeling of dis-connection.

    How my own ideas and my capacity to imagine might be extended or supplemented by these tools, changing the way I relate to myself and the world around me.

  • Conjuring to a brief

    Generated by me with Leonardo.Ai.

    This semester I’m running a Media studio called ‘Augmenting Creativity’. The basic goal is to develop best practices for working with generative AI tools not just in creative workflows, but as part of university assignments, academic research, and in everyday routines. My motivation or philosophy for this studio is that so much attention is being focused on the outputs of tools like Midjourney and Leonardo.Ai (as well as outputs from textbots like ChatGPT); what I guess I’m interested in is exploring more precisely where in workflows, jobs, and daily life these tools might actually be helpful.

    In class last week we held a Leonardo.Ai hackathon, inspired by one of the workshops that was run at the Re/Framing AI event I convened a month or so ago. Leonardo.Ai generously donated some credits for students to play around with the platform. Students were given a brief around what they should try to generate:

    • an AI Self-Portrait (using text only; no image guidance!)
    • three images to envision the studio as a whole (one conceptual, a poster, and a social media tile)
    • three square icons to represent one task in their daily workflow (home, work, or study-related)

    For the Hackathon proper, students were only able to adjust the text prompt and the Preset Style; all other controls had to remain unchanged, including the Model (Phoenix), Generation Mode (Fast), Prompt Enhance (off), and all others.

    Students were curious and excited, but also faced some challenges straight away with the underlying mechanics of image generators; they had to play around with word choice in prompts to get close to desired results. The biases and constraints of the Phoenix model quickly became apparent as the students tested its limitations. For some students this was more cosmetic, such as requesting that Leonardo.Ai generate a face with no jewelry or facial hair. This produced mixed results, in that sometimes explicitly negative prompts seemed to encourage the model to produce what wasn’t wanted. Other students encountered difficulties around race or gender presentation: the model struggles a lot with nuances in race, e.g. mixed-race or specific racial subsets, and also often depicts sexualised presentations of female-presenting people (male-presenting too, but much less frequently).

    This session last week proved a solid test of Leonardo.Ai’s utility and capacity in generating assets and content (we sent some general feedback to Leonardo.Ai on platform usability and potential for improvement), but was also useful for figuring out how and where the students might use the tool in their forthcoming creative projects.

    This week we’ve spent a little time on the status of AI imagery as art, some of the ethical considerations around generative AI, and where some of the supposed impacts of these tools may most keenly be felt. In class this morning, the students were challenged to deliver lightning talks on recent AI news, developing their presentation and media analysis skills. From here, we move a little more deeply into where creativity lies in the AI process, and how human/machine collaboration might produce innovative content. The best bit, as always, will be seeing where the students go with these ideas and concepts.

  • Unknown Song By…

    A USB flash drive on a wooden surface.

    A week or two ago I went to help my Mum downsize before she moves house. As with any move, there was a lot of accumulated ‘stuff’ to go through; of course, this isn’t just manual labour of sorting and moving and removing, but also all the associated historical, emotional, material, psychological labour to go along with it. Plenty of old heirlooms and photos and treasures, but also a ton of junk.

    While the trip out there was partly to help out, it was also to claim anything I wanted, lest it accidentally end up passed off or chucked away. I ended up ‘inheriting’ a few bits and bobs, not least of which was an old PC, which may necessitate a follow-up to my tinkering earlier this year.

    Among the treasures I claimed was an innocuous-looking black and red USB stick. On opening up the drive, I was presented with a bunch of folders, clearly some kind of music collection.

    While some (‘Come Back Again’ and ‘Time Life Presents…’) were obviously albums, others were filled with hundreds of files. Some sort of library/catalogue, perhaps. Most intriguing, though, not to mention intimidating, was that many of these files had no discernible name or metadata. Like zero. Blank. You’ve got a number for a title, duration, mono/stereo, and a sample rate. Most are MP3s; there are a handful of WAVs.

    Cross-checking dates and listening to a few of the mystery files, Mum and I figured out that this USB belonged to a late family friend. This friend worked for much of his life in radio; this USB was the ‘core’ of his library, presumably that he would take from station to station as he moved about the country.

    Like most media, music happens primarily online now, on platforms. For folx of my generation and older, it doesn’t seem that long ago that music was all physical, on cassettes, vinyl, CDs. But then, seemingly all of a sudden, music happened on the computer. We ripped all our CDs to burn our own, or to put them on an MP3 player or iPod, or to build up our libraries. We downloaded songs off LimeWire or KaZaA, then later torrented albums or even entire discographies.

    With physical media, the packaging is the metadata. Titles, track listings, personnel/crew, descriptions and durations adorn jewel cases, DVD covers, liner notes, and so on. Being thrust online as we were, we relied partly on the goodwill and labour of others — be they record labels or generous enthusiasts — to have entered metadata for CDs. On the not infrequent occasion where we encountered a CD without this info, we had to enter it ourselves.

    Wake up and smell the pixels. (source)

    This process ensured that you could look at the little screen on your MP3 player or iPod and see what the song was. If you were particularly fussy about such things (definitely not me) you would download album art to include, too; if you couldn’t find the album art, it’d be a picture of the artist, or of something else that represented the music to you.

    This labour set up a relationship between the music listener and their library; between the user and the file. The ways that software like iTunes or Winamp or Media Player would catalogue or sort your files (or not), and how your music would be presented in the interface; these things changed your relationship to your music.

    Despite the incredible privilege and access that apps like Spotify, Apple Music, Tidal, and the like offer, we have these things at the expense of this user-file-library relationship. I’m not placing a judgement on this, necessarily, just noting how things have changed. Users and listeners will always find meaningful ways to engage with their media: the proliferation of hyper-specific playlists for each different mood or time of day or activity is an example of this. But what do we lose when we no longer control the metadata?

    On that USB I found, there are over 3500 music files. From a quick glance, I’d say about 75% have some kind of metadata attached, even if it’s just the artist and song title in the filename. Many of the rest, we know for certain, were directly digitised from vinyl, compact cassette, or spooled tape (for a reel-to-reel player). There is no automatic database search for these files. Dipping in and out, it will likely take me months to listen to the songs, note down enough lyrics for a search, then try to pin down which artist/version/album/recording I’m hearing. Many of these probably won’t exist on apps like Spotify, or even in dingy corners of YouTube.
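
    As a back-of-envelope figure, a quarter of 3500-plus files is somewhere near 900 mystery tracks. A first pass at sorting the pile could be scripted. The sketch below is only an assumption about how one might start: it looks at filenames alone (not embedded tags, which would need a third-party library), and the function names and the 'digits only means unlabelled' rule are my own hypothetical choices.

```python
import re
from pathlib import Path

# The two formats mentioned above; extend as needed.
AUDIO_SUFFIXES = {".mp3", ".wav"}


def looks_unlabelled(filename):
    """True if the name (minus extension) holds only digits and separators."""
    stem = Path(filename).stem
    return re.fullmatch(r"[\d\s_\-]*", stem) is not None


def triage(root):
    """Split audio files under `root` into (labelled, unlabelled) lists."""
    labelled, unlabelled = [], []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES:
            if looks_unlabelled(path.name):
                unlabelled.append(path)
            else:
                labelled.append(path)
    return labelled, unlabelled
```

    Calling triage on the drive's mount point would give two lists: one to browse by name, and one to work through by ear.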

    A detective mystery, for sure, but also a journey through music and media history: and one I’m very much looking forward to.