Touching a Third Sound: Trans-Sensing in a World of Deepfakes

Person behind translucent screen
Jules Gimbrone. Photo: Malanda Jean-Claude for Walker Art Center

Deepfakes, a digital cultural phenomenon that emerged in 2018, are digitally manipulated videos that depict people saying or doing things that they never actually said or did. These synthesized videos are created using algorithms that learn the digital information comprising a face or avatar and then manipulate these systems, fusing them with original content. Over the past two years, the casual proliferation of photorealistic replacement apps like Pinscreen, FaceApp, or MugLife has placed deepfakes at the center of an ethical maelstrom. Companies working on the cutting edge of artificial intelligence—and putting that AI in the hands of consumers—are grappling with their role in creating technologies that can be used for dubious ends.

Scenario 1: A video is released depicting a politician saying something violently homophobic, sexist, or racist. The politician’s defense: “That’s a deepfake video.”

Scenario 2: A jealous man creates “revenge porn” by pasting his ex-girlfriend’s head on the body of a pornographic actress. He leaks the video to his ex-girlfriend’s workplace.

Scenario 3: A hacker manipulates a video of a president, then puppets them to declare war on another country.


We live in a cauldron of cheap visual proliferation. So much of what we value—social connections, commerce, representations of self, politics, art, and secret desires—is increasingly conveyed through shorthand on a digital screen. As reality TV stars become politicians, selfies become hashtag identities, complex sociopolitical critiques are reduced to tweets, and our activist itch is scratched by liking a divisive Facebook post, we are, unwittingly, forced to constantly reevaluate and attempt to parse the “real” from the “fake.” Overloaded by media, our need to make sense of complex information by containing it in quick binary-based discernment—i.e., real/fake, good/bad, man/woman—leaves us disempowered and reduced to slotting.

So I wonder: how can we cultivate new methods of sensing the world that aren’t reliant on a categorical flattening of complexity into the quantitative binary of real/fake?

As a composer and visual artist, I work in a third space between the visual and the sonic to produce synesthetic, hybrid content. What I call Trans-Sensing Modalities are methods that trans people—specifically those who identify as transgender, but also people whose subjectivity is unmoored from the dominant culture—intuitively cultivate to navigate the world. An emphasis on sensory presence and integration, such as trans, is a reprioritization of the nuanced body, the flexible body, the imagined body, and the listening body, one that is able to perceive information beyond a quick binary-based flattening.

Jules Gimbrone, Dysmorphias Draw a Line, 2018, installation view of In Practice: Another Echo, SculptureCenter, 2018. Photo: Kyle Knodell
Listen: Jules Gimbrone, audio excerpt from the installation Dysmorphias Draw a Line (2018), 08:12


Developing Trans-Sensing Modalities:

I am in the corner store buying seltzer. A woman walks in with her extremely adorable baby in a stroller. I quickly calculate my beard length, hair pulled up or down, wearing hat/not, leather jacket on/off, to deduce whether it will be appropriate or creepy for me to make eyes, funny faces, or engage in any playful way with the baby.

I walk into Guitar Center. A young white man greets me with a “Hey, bro, please check your bag at security.” I quickly pull up my shoulders, slow my gait, and walk to the speaker section where overeducated and underemployed sad music boys try to mansplain me the differences between DANTE digital protocol and analogue networking. I relay technical specificities in order to flag being worthy of non-condescending help.

I am walking down the street at night in Brooklyn. I notice three teenage boys walking toward me laughing and talking loudly. After I pass, one boy turns to his friends and yells, “Hey, was that a boy or a girl?” They yell to me down the street: “What the fuck, though?” I walk faster.

I pack up my computer and walk out to the parking lot of the hotel I am staying in while on a brief residency. The groundskeeper, a middle-aged white guy with a baseball cap on and a pale pink T-shirt tucked into his work jeans, upon exiting his work shed, calls out to me, “How’s it going, champ?” I hesitate, adjust my voice to as low as possible without sounding theatrical and say, “It’s great! It’s a beautiful day!” As I walk to my car, I think about what he must think is going on here with me. I—a “champ,” a young man—am staying at this nice hotel by myself and getting into my 2004 VW Golf and driving away from the hotel. On further reflection, I doubt he thinks about it at all.

I am interested in revealing the ways in which a trans subjectivity listens, discerns, and responds to the rigidity and assumptions of social systems and how this navigation requires a broadening of the perception of subjectivity beyond the solidified singular “real” self. Nuanced, integrated, perceptual sensing strategies are catalyzed by a need to decipher, decode, resonate, read, pass, and tune social systems and structures. While these hyper-sensing modalities may be forged from the threat of real or imagined harm, they can be claimed as powerful tools for retuning our bodies away from quick optical differentiations.

Jules Gimbrone, Perpetual Motion, digital print, 2019

In 1976, psychologists Michael Posner, Mary Jo Nissen, and Raymond Klein asserted that sight is the most dominant of all the senses humans possess. There is ongoing debate, however, about why this is true, with some scientists positing that it is because of the limited ability of the visual system to alert us to threat. Because sight is poor at discernment, they conclude, our attention needs to be disproportionately favored toward the visual. This logic implies that there is room to actively tune to our senses in new, dynamic ways that could potentially disrupt our dominant (visual) way of perceiving the world.




You are driving across the country and find yourself approaching dusk in the southwest corner of Kansas. You drive two miles off the main highway down a winding dirt road, shrubs and tall grass slapping at your car. You are alone, and it is the beginning of summer. You pull into a loosely tended campground, which is marked by a series of wooden posts along a cul-de-sac at the end of the dirt road, and into a space. You get out to stretch. The air is sweet and singing the thousand small songs of cicadas, crickets, frogs. There is a pond adjacent the dirt oval. No other cars are there except for a big, black Toyota 4Runner and two green tents on the other side of the campground. You notice gun cases and hunting gear in the bed of the truck but do not see anyone. You look at the sky and determine you have, at most, one hour until nightfall. You contemplate the likelihood of the owners of the truck returning and seeing you, alone, at your site. Ripping through your car, quickly, you find your folded tent. You time yourself—five minutes, 10 minutes—putting together the interlocking metal posts. The nylon surface stretches over the metal poles displaying its octagonal arc. The tent is up. You are called to by the hum of the insects and the lulling chill of the pond. You walk towards it, take off your shoes and socks, and put your feet in. Your toes slip on a slimy rock. You see the blue-green dragonflies attached and flying, one on top of the other. You smell algae and dirt and the dank sweetness of your unwashed driving body. As the sun starts setting it sends crimson rays flaring in different directions. It is still hot, and you feel brave. You hide inside the tall grass and take off your clothes quickly and pile them close to the shoreline. Slipping into the pond, you are shocked by the cold but then relax, enveloped. You hold your chest, you rub your arms and legs, and let the mud ooze into your toes. Untying your hair, you put your head into the water. 
You turn and float on your back, rotating your head so your ears are just above and then below the surface. You play this surface. Up and down. First the chrp, szzz, bzzz in the air, then the muted low hum of your amplified underwater pulse pushing your ear drums. Beneath, you hear your heart pumping thumps and the whoosh-whoosh of your feet kicking the water. Above, you take a breath in then turn your body down into the water. You float. At once a hard sound punctures the water, a clackboom. You bring your head into the air and hear the distant crack of a shotgun. You crawl to the surface and your clothes: a binder, boxers, jeans, and blue T-shirt. You cover your body and run back to your tent. After a night of no sleep you leave before sunrise.

Jules Gimbrone. Photo: Malanda Jean-Claude for Walker Art Center

As the visibility of the trans-rights movement has increased, so too has the level of violence towards trans people—specifically towards trans women of color. According to a recent survey, one in four transgender people has been assaulted because they are trans. In addition, self-perpetuated violence is as much a threat to survival as external violence. A 2018 study focusing on suicide rates in transgender teens found that over 50 percent of the trans boys interviewed, 29.9 percent of the transgender girls, and 41.8 percent of non-binary teens had attempted suicide.

All trans people at some point come into contact with a negotiation regarding what is “real” in terms of “self” and how they appear in the world. It may speak something like “this is not what they think it is” or “I am not what they think I am” or “there isn’t language for me” or “I don’t see myself reflected” or “I am in the wrong body” or “this system is rigged” or “fuck this: gender isn’t real.”


“Hearing Lips and Seeing Voices,” a psychological paper published by Harry McGurk and John MacDonald, outlines the multi-modal language acquisition phenomenon called the McGurk effect. When presented with contradictory visual and auditory linguistic content (i.e., video of lips speaking a chosen word paired with the overdubbed audio of a different but similar word) there is a propensity to start “hearing” the incorrect audio. This display, or “visual capture,” in which the visible information interferes with the comprehension of the “real” audio, points to our reliance on the visual to ascertain meaning.

This illusion causes an integrated perceptual space that blurs the lines between solely visual or auditory orientation and cognition. McGurk and MacDonald relate this phenomenon to the perception of a “third sound”—one that creates a perceptual conduit between discrete interpretations of sound/source, visual/sonic, real/fake, self/other, and subject/object. This synesthetic, nonbinary type of perceiving is what I think is most akin to Trans-Sensing Modalities—a way of being that is dependent on a system of interactions rather than a singular, “real” source.

Jules Gimbrone, Pilar Not Saying Fool, Fuck, Cock, Black, White, Man, Dyke, Hand, Cunt, video, 2019

In the mid-2000s, YouTube was the place to go to see and hear transgender people who were undergoing hormone-replacement therapy. I would watch video diaries of FTMs who were taking testosterone injections and measuring their bodies’ changes from week to week. There would be a quantification of muscle mass, broadening of shoulders, narrowing of hips, pinching of subcutaneous fat, adding up of chin hairs, measuring of mustache growth, male-pattern hair shape, proliferation of acne, jaw widening, and the most interesting part (to me), the voice deepening and twisting timbres.

In these videos there seemed to always be a negotiation with a heteronormative, visually oriented, dominant idea of real masculinity—or perhaps a haunting akin to a real-man ghost. Some explicitly stated their masculinity goals—pointing toward a specific amount of beard growth and fullness or a certain numeric V-shaped ratio of hips to shoulders—while a smaller group stated their ambivalence with the “ghost” or their frustrations in being able to access these incremental goals. Other groups rejected the external standards and resisted any outside pressure that would deny their authentic, nonconforming selves.

Maybe there was something hollow in these video diaries that didn’t actually lead toward a satisfying connection with each other or with ourselves. I would see people becoming more and more complicated in their expressions of dissatisfaction with the standards, these ghosts, pressed upon our bodies. Ultimately, many left the platform craving a different type of conversation.


Liner Notes for “A Slip”:

Over the last ten years, I have carried my fifty-pound, 1980s Dokorder reel-to-reel recording machine with me from Northampton, MA, to Brooklyn, NY, to Pittsburgh, PA, to Lama, NM, to Los Angeles, CA. This hefty, archaic object allowed me time away from digital media to confront my own voice and instruments, sometimes in the company of close friends. There, our breaths amplified in our headphones, delayed by the built-in echo, mirrored by haunting traces of previous recordings etched on the tape. Over the last six months, as my body was changing from weekly testosterone injections, I was confronted most acutely by my evolving voice. Sometimes I woke up to a gravelly, hoarse, crackly wisp of a voice; other days I felt my normal timbre and tone combined with the simultaneous unease and excitement in the transformations. The body is always an unwieldy instrument. The self is continuously made and unmade, yet the shifts that we can suddenly perceive seem to always hold more weight.

Listen: Jules Gimbrone, “A Slip” (2013), Audio, 03:53, Fault Lines, Pack Projects


In a new study, researchers found that voice-only communication created more empathetic responses compared to vision-only and multisensory communication. They conclude that if you want to have a deeper conversation with more emotional nuance, you can rely more on voice-only aural communication. Pick up the phone and talk only. Close your eyes and have a conversation.

When we focus more on the tone, breath, pitch, hesitations, and tempo of what is being said, rather than on the person’s face or visual behavior, we can listen and understand more complex emotional content. Listening alone can garner more dynamic information than either watching alone or watching and listening together.

Listen: Jules Gimbrone, “Language Until it Doesn’t” (2017), Audio, 08:19, presented as part of The World Is Sound at the Rubin Museum of Art (June 16, 2017–January 8, 2018)


Trans-Sensing Modalities offer new ways of being within the cauldron of digital visual proliferation. Trans people perceive a third sound, or create ways of attending to and caring for ourselves within systems that are founded on “real” empirical, binary-based, quick, optical categorizations. By attending to this digital moment, and this crisis of real/fake, in new, hybrid, synesthetic ways, we open up a third way for meaning to appear. Perhaps as we move deeper into the web of visual dominance, our only way forward will be to cultivate new, integrated ways of sensing our world.
