Transdiegetic Sound Design in Film: Definition & Examples

Published: February 26, 2026

Some film scenes make you reclassify what you are hearing. A sound starts “in the world,” then begins to act like the film score. Or the score gains a clear source inside the story. This is transdiegetic sound design in film.

TL;DR

  • Definition: Transdiegetic sound design is when sound crosses the boundary between diegetic (story-world) and non-diegetic (narrational) status.
  • Why it matters: The crossing can expose control, authorship, and how the film builds space and attention.
  • How to spot it: Listen for a source reveal, a cut into a nested layer (show-within-a-film), or a mix shift that makes the sound feel less tied to a room and more like score.

Introduction

Transdiegetic sound design is when a film’s sound crosses the border between the story world and film narration.

You hear this when a sound starts as something the characters can hear, then it starts acting like the score. You also hear it when score-like music gains a clear source inside the story.

Film theory uses diegesis in more than one historical sense, and those senses still coexist in scholarship. That overlap is one reason the vocabulary can get confusing when you start classifying music. (Castelvecchi 2020).

This article keeps transdiegetic focused on the diegetic boundary in film sound. It also names a second boundary, the address boundary, where a character speaks straight to you. That second boundary helps when you translate the game-sound version of the term to film with care. (Jørgensen 2008; Brown 2012).

How this article is organized

This article moves from terms to method to scene work. The goal is that you can reuse the same checklist on your own scenes.

  • Where diegesis comes from, and how film theory uses it.
  • Where “transdiegetic” comes from as a label in film sound writing.
  • Jørgensen’s internal/external split in game sound, plus a careful film translation.
  • What I’m adding beyond Taylor, Stilwell, and Winters, so the novelty is explicit.
  • Examples at a glance, plus the criteria used to pick them.
  • Case studies with concrete listening cues.
  • Common mislabels and limits / edge cases.

Where “diegesis” comes from

In film studies, diegesis often means the story’s time-and-space world. That meaning is not identical to the older Greek “narration mode” usage. That mismatch is a real source of confusion in sound debates. (Castelvecchi 2020).

Diegesis and mimesis in ancient theory

In Plato’s discussion of storytelling, diegesis can mean telling a story through narration, while mimesis points to enacted imitation. Later theory reshapes the relationship between the terms, which helps explain why “diegesis” can mean different things across fields. (Taylor 2007, 1–2).

How film studies adopted “diegesis” as “story world”

Taylor describes how film studies adopts the French diégèse in a filmological “story world” sense and records a dispute about its introduction and reception. (Taylor 2007, 3).

If you want a clean claim you can cite and reuse, Winters uses Souriau’s filmological sense and writes: “diegesis indicates the existence of a unique filmic universe.” (Winters 2010, 5).

Diegetic and non-diegetic sound in film

Once diegesis means “story world,” sound analysis often uses a basic pair: diegetic sound belongs to the story world, and non-diegetic sound belongs to film narration. That pair helps you describe many scenes, but music often sits in a grey zone. (Winters 2010).

Read more about diegetic and non-diegetic sound and music in film.

Why the border is tricky for music

Music can be anchored to a visible source, then it can spread across cuts and spaces in a way that exceeds that source. Stilwell calls attention to this in-between zone with the fantastical gap. (Stilwell 2007, 184–202).

Winters adds a related critique. He argues that film music theory can overstate “non-diegetic” as purely outside the story world, since underscoring often helps build narrative space. (Winters 2010).

Spelling choice in this article

This article uses non-diegetic with a hyphen in the body text. Some source titles use nondiegetic without a hyphen, and those titles keep their original spelling in the reference list.

Where the term “transdiegetic” comes from

Taylor introduces transdiegetic as a label for sound that crosses the border between diegetic and non-diegetic status and stays hard to pin down. Taylor defines it like this: “sound’s propensities to cross the border of the diegetic to the non-diegetic and remaining unspecific.” (Taylor 2007, 3).

Taylor also links the expression to early-1990s usage and a Zurich film-studies context. (Taylor 2007, 3).

Jørgensen’s split in game sound, and why it matters for film

Kristine Jørgensen uses transdiegetic to describe a communication frame in games that crosses between the game world and the player. Jørgensen names the frame like this: “I call this emerging frame of communication transdiegetic.” (Jørgensen 2008).

Her value for film analysis is not that films work like games. Her value is that she splits transdiegetic sound into two directions, which helps you stay specific when you analyze boundary crossing. (Jørgensen 2008).

Jørgensen’s two categories (key clauses with ellipses)

  • Internal transdiegetic: Jørgensen defines this case through the avatar-to-player bridge. The defining clauses are: “diegetic source within the gameworld (the avatar) … entity external to the gameworld (the player).” (Jørgensen 2008).

  • External transdiegetic: Jørgensen defines this case through an interface-to-world bridge. The defining clauses are: “source that does not exist within the gameworld (the map) … directly relevant for what is going on internally in the gameworld.” (Jørgensen 2008).

I explain transdiegesis in more detail in this article on GameDaft.

A careful note on “ludomusicology” as a later umbrella label

Later scholarship often places game-music research inside ludomusicology. Fritsch and Summers describe it as “a word originating in the work of Guillaume Laroche, Nicholas Tam and Roger Moseley.” (Fritsch and Summers 2021, 3).

In a note to the same passage, they add that Moseley used the term in 2008 for a paper titled “Rock Band and the Birth of Ludomusicology.” (Fritsch and Summers 2021, 3).

How this article adapts Jørgensen’s split for film theory

Film has no avatar that you control, so you cannot transfer the player–avatar model directly. Film does have two boundaries that matter for analysis, and sound can support both.

The diegetic boundary in film

This is the boundary between diegetic and non-diegetic sound. This is where Taylor’s transdiegetic label fits most cleanly. (Taylor 2007, 3).

The address boundary in film

This is the boundary between the story and you as the viewer. In film studies, direct address and breaking the fourth wall name moments when a character speaks straight to you. (Brown 2012).

Direct address is not automatically transdiegetic, since it does not always shift a sound from diegetic to non-diegetic. It still matters here because it is a film-native way to model “inside speaks to outside,” which is one of the key functions Jørgensen tracks in games. (Jørgensen 2008; Brown 2012).

A film-ready split you can apply

  • Internal transdiegetic (film): a sound starts with a clear story-world anchor, then it starts acting like narration across edits or spaces. (Stilwell 2007, 184–202).
  • External transdiegetic (film): a cue starts as score-like narration, then the film gives it a story-world anchor inside a nested layer, such as a show-within-a-film or a staged performance space. (Taylor 2007, 3).
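
If you keep scene notes in a script or spreadsheet, the split above can be sketched as a small function. This is an illustrative sketch, not a tool from Taylor, Stilwell, or Jørgensen. The function name and the status strings are hypothetical, and the sketch simplifies the external case: it only checks the direction of the change and does not verify that the story-world anchor sits inside a nested layer.

```python
from typing import Optional

# Hypothetical status labels used only for this sketch.
DIEGETIC = "diegetic"
NON_DIEGETIC = "non-diegetic"


def crossing_direction(start: str, end: str) -> Optional[str]:
    """Label the direction of a status change, or return None if nothing crosses."""
    for status in (start, end):
        if status not in (DIEGETIC, NON_DIEGETIC):
            raise ValueError(f"unknown status: {status!r}")
    if start == end:
        return None  # no reclassification, so "transdiegetic" adds nothing
    if start == DIEGETIC:
        return "internal transdiegetic"  # story-world anchor -> narration
    return "external transdiegetic"  # score-like narration -> story-world anchor


print(crossing_direction(DIEGETIC, NON_DIEGETIC))      # internal transdiegetic
print(crossing_direction(NON_DIEGETIC, NON_DIEGETIC))  # None
```

The `None` result matters as much as the two labels: it is the case where a cue never changes status and the term should not be used.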

What I’m adding beyond Taylor, Stilwell, and Winters

Taylor, Stilwell, and Winters help you see why the diegetic border is unstable in practice. This article adds three reusable tools for film analysis.

  • A directional split for film (internal vs external) adapted from Jørgensen, with film-native boundaries in place of the avatar loop. (Jørgensen 2008).
  • A layered-listener mapping tool for films with embedded spectatorship, where one cue can sit in different statuses depending on which layer you track. (Taylor 2007, 3).
  • A listening checklist that forces you to name mix cues that trigger reclassification, not just the category label. (Chion 2019).

Examples at a glance

This table is a quick way to spot the pattern before you do full micro-analysis.

Criteria for inclusion

I chose the examples below for three reasons. First, each example has a clear reclassification trigger you can point to. Second, the set covers different mechanisms, such as nested media layers, direct address, and source music that takes on score-like control. Third, each example has a scene you can revisit without needing a full film-wide argument.

Film example | Sound element | Crossing type | What to listen for
The Truman Show (1998, Paramount Pictures) | Broadcast-style scoring inside a show-within-a-film | External transdiegetic | Does the cue stay score-like across a cut into the production layer, or does it gain a story-world anchor?
Ferris Bueller’s Day Off (1986, Paramount Pictures) | Direct address speech | Address boundary support | How does the mix keep the voice “private” and stable so the address feels intentional?
Baby Driver (2017, Sony Pictures Releasing) | Source music tied to editing and action timing | Internal transdiegetic | When does a “song in the scene” start acting like the scene’s organizing grid?
Apocalypse Now (1979, United Artists) | Radio song that swells beyond its on-screen source | Internal transdiegetic | Taylor’s example turns on a swell that expands beyond the radio’s local space. (Taylor 2007, 3).
Henry V (1989, Samuel Goldwyn Company) | Diegetic singing that becomes “phantom” choral/orchestral support | Internal transdiegetic | Listen for the moment the chant stops feeling local and starts feeling like score. (Taylor 2007, 3).
Atonement (2007, Focus Features) | Typewriter rhythm that blends with score | Internal transdiegetic | When does the typewriter stop reading as prop sound and start reading as percussion? (Watts 2018).

Method: a checklist for transdiegetic analysis

This method keeps you grounded in what you can point to on screen and in the mix.

  1. Name the sound event. Identify the cue, sound, or music that matters.
  2. State its starting status. Decide if it begins as diegetic or non-diegetic.
  3. Find the crossing trigger. Look for a source reveal, a cut to a nested layer, or a shift in how “located” the sound feels. (Taylor 2007, 3).
  4. Track access. Ask who can hear it inside the story world, and who cannot.
  5. Name the mix cues. Listen for changes in level, reverb/space, stereo width, and masking.
  6. State the function. Explain what the crossing does for time, space, control, or theme.

Chion’s larger point supports this approach: sound can change how you read the image and how you organize the scene in your head. (Chion 2019).
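
If you log your analyses digitally, one way to keep the six steps honest is a record that reports which steps are still empty. The sketch below is an assumption about how you might store notes, not part of any cited method. The class name `CueAnalysis` and its field names are hypothetical. The example entry uses only what this article states about “Truman Sleeps” and leaves the other steps blank on purpose.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CueAnalysis:
    """One record per cue, with one field per checklist step (hypothetical names)."""
    sound_event: str = ""
    starting_status: str = ""
    crossing_trigger: str = ""
    access: str = ""
    mix_cues: List[str] = field(default_factory=list)
    function: str = ""

    def missing_steps(self) -> List[str]:
        """Return the checklist steps that have not been filled in yet."""
        steps = [
            ("1. sound event", self.sound_event),
            ("2. starting status", self.starting_status),
            ("3. crossing trigger", self.crossing_trigger),
            ("4. access", self.access),
            ("5. mix cues", self.mix_cues),
            ("6. function", self.function),
        ]
        return [name for name, value in steps if not value]


draft = CueAnalysis(
    sound_event='"Truman Sleeps" cue, The Truman Show',
    starting_status="non-diegetic",
    crossing_trigger="cut to a nested layer (the production layer)",
    function="reveals control",
)
print(draft.missing_steps())  # ['4. access', '5. mix cues']
```

A draft that still lists step 5 as missing is the usual sign that you have a category label but no mix evidence yet.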

Case study 1: The Truman Show (1998, Paramount Pictures), “Truman Sleeps”

This sequence is a clean test case because the film contains a show inside the film. That structure creates stacked listening positions, so sound status becomes part of the story’s control system.

Listen for the cut away from Truman’s bedroom into the show’s production layer. Ask whether the music stays “placeless” like score or starts feeling like show-made scoring. That shift is the point.

The layered listening positions

  • Truman’s world: what Truman can hear in his bedroom.
  • The production layer: the control-room world that runs the show.
  • The in-film TV watchers: the people who watch Truman inside the story.
  • You: watching the film.

What the scene makes you reclassify

At first, you can classify the cue as non-diegetic because there is no music source in Truman’s room. Then the film’s nested “show” layer invites another reading, since broadcast scoring is part of how the show is made inside the story. That is the external transdiegetic situation: one cue can sit in different statuses depending on which layer you track. (Taylor 2007, 3).
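
The layered mapping can be written down as one status per listening position. The sketch below is a hypothetical note-taking format, not an established tool. The two filled-in statuses follow the reading in the paragraph above, and the other two layers are left unassigned because this article does not fix them.

```python
from typing import Dict, Optional, Set

# One status per listening position; None means "not yet classified".
truman_sleeps: Dict[str, Optional[str]] = {
    "Truman's world": "non-diegetic",  # no music source in Truman's room
    "production layer": "diegetic",    # one reading: scoring made by the show
    "in-film TV watchers": None,       # left open in this sketch
    "you": None,                       # left open in this sketch
}


def distinct_statuses(mapping: Dict[str, Optional[str]]) -> Set[str]:
    """Collect the statuses that have actually been assigned."""
    return {status for status in mapping.values() if status is not None}


def is_layer_dependent(mapping: Dict[str, Optional[str]]) -> bool:
    """True when the same cue holds different statuses on different layers."""
    return len(distinct_statuses(mapping)) > 1


print(is_layer_dependent(truman_sleeps))  # True
```

If the function returns `False` for a cue, the film may still have nested layers, but the cue's status does not depend on them.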

Granular sonic cues to listen for

These tests keep the analysis specific. They also keep you from guessing.

  • Reverb and room-feel: does the cue feel like it lives in a space, or does it stay dry and “placeless” across cuts?
  • Dynamics: does the cue swell as the film widens the scene’s scope, which can push it toward narrational control?
  • Stereo width: does the cue stay narrow and local, or does it widen like score that wraps the scene?
  • Masking and priority: does the cue sit above production noises and dialogue in a way that signals narrational priority?

Taylor’s definition helps you justify why these tests matter, since his term targets sound that crosses the border while “remaining unspecific.” (Taylor 2007, 3).
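
If you have a short audio excerpt you are allowed to analyze, you can cross-check two of the listening tests above with rough numbers. This is a minimal sketch under stated assumptions, not a method from the cited sources. It assumes you already have left and right channel samples as plain lists of floats between -1.0 and 1.0. The width measure (side energy against mid plus side energy) is only one possible proxy, and the function names are hypothetical. It does not measure reverb or masking.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude; 0.0 for an empty excerpt."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples: Sequence[float]) -> float:
    """Level in dB relative to full scale; -inf for silence."""
    value = rms(samples)
    return 20 * math.log10(value) if value > 0 else float("-inf")


def stereo_width(left: Sequence[float], right: Sequence[float]) -> float:
    """Rough width proxy: 0.0 means identical channels, 1.0 means fully out of phase."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    mid_rms, side_rms = rms(mid), rms(side)
    total = mid_rms + side_rms
    return side_rms / total if total > 0 else 0.0


# Made-up sample values, only to show the two extremes of the width proxy.
print(stereo_width([0.5, -0.5], [0.5, -0.5]))  # 0.0
print(stereo_width([0.5, -0.5], [-0.5, 0.5]))  # 1.0
```

In practice you would compare the same cue before and after the crossing trigger. A rise in level or width supports what you hear, but it does not replace the listening judgment.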

Why the crossing matters for theme

Weir states the film’s two-level authorship directly in the soundtrack notes. Weir writes: “Sometimes the music is Christof’s choice, sometimes it’s mine!” (Weir 1998).

That line gives you a concrete theme claim. The film is about a manufactured reality, and the music is part of the manufacture. The soundtrack does not only set mood. It can reveal control.

Case study 2: Ferris Bueller’s Day Off (1986, Paramount Pictures), direct address in the opening

This film is not built around a show-within-a-film. It still matters here because it gives you a film-native version of “inside speaks to outside.” Brown treats direct address as a central category for this viewer-targeted speech. (Brown 2012).

Listen for how the mix makes Ferris’s voice feel like a private aside. Pay attention to vocal closeness, room tone, and whether anything competes with the address.

What direct address changes in the soundtrack

The sound question is simple: how does the soundtrack make the direct channel feel stable? One common method is vocal proximity. The voice feels closer than normal dialogue, so the address reads as deliberate communication, not overheard speech. (Brown 2012).

How this helps the film translation of Jørgensen’s “internal” type

Jørgensen’s internal type describes a world-anchored channel that communicates outward. In games, the channel often runs through the avatar. In film, the closest analogue is direct address speech, since it creates a channel from a character inside the fiction to you outside it. (Jørgensen 2008; Brown 2012).

Case study 3: Baby Driver (2017, Sony Pictures Releasing), source music that becomes narration

This example shows an internal transdiegetic pattern in a standard film setup. You do not need a show-within-a-film. You need music that starts as a clear source, then it starts acting like the scene’s timing system. (Stilwell 2007, 184–202).

Listen for sync points. Track when cuts, footsteps, door hits, and car movement lock to the beat so strongly that the song starts acting like narration.

The internal transdiegetic move

When a cue begins as source music, you can still ask whether it stays a local object in the world or becomes a narrational grid that organizes the scene. Stilwell’s fantastical gap helps you describe that slide without forcing an all-or-nothing label. (Stilwell 2007, 184–202).

Common mislabels: what is not transdiegetic

These are common places where people use “transdiegetic” too loosely.

Mislabel 1: a normal montage needle-drop

A song that plays over a montage is often non-diegetic from start to finish. The song can bridge time and place, but there is no reclassification trigger. There is no boundary crossing to defend, so “transdiegetic” adds nothing.

Mislabel 2: mickey-mousing as boundary crossing

Mickey-mousing is tight synchronization between music and on-screen action. That can happen without any diegetic shift. Winters even lists mickey-mousing as a case where music can be perceived as operating “on the same level as the rest of the narrative,” which is a different issue from crossing the diegetic border. (Winters 2010, 8).

Limits and edge cases

Some scenes make the diegetic border hard to classify even when there is no deliberate transdiegetic crossing. These are common traps.

  • Dream sequences and hallucinations: sounds can be “real” inside a character’s subjective world while remaining outside the film’s shared world. You need to name which world you mean before you classify.
  • Musical films: characters can shift between speaking and singing conventions fast. The “do they hear it” question can become convention-based instead of plot-based. (Castelvecchi 2020).
  • Voiceover that feels spatial: voiceover can feel like narration, memory, confession, or address. If the film treats it as inhabiting space, you need to explain how the mix signals that shift.
  • Documentary and nonfiction: Taylor notes that diegetic labels can be more problematic outside fiction, so you should justify your use carefully when “story world” is not clearly fictional. (Taylor 2007, 3).

Summing Up

Transdiegetic sound design gives you a precise way to analyze moments where a cue crosses the boundary between diegetic and non-diegetic status, and where that crossing is part of the scene’s meaning. Taylor provides both the origin story and a clear definitional core. (Taylor 2007, 3).

Jørgensen’s internal/external split helps because it names direction, not only ambiguity. Film has no avatar, so the film translation relies on film-native boundary events such as layered spectatorship and direct address. (Jørgensen 2008; Brown 2012).

References

  • Brown, Tom. 2012. Breaking the Fourth Wall: Direct Address in the Cinema. Edinburgh: Edinburgh University Press.
  • Castelvecchi, Stefano. 2020. “On ‘Diegesis’ and ‘Diegetic’: Words and Concepts.” Journal of the American Musicological Society 73 (1): 149–171. https://doi.org/10.1525/jams.2020.73.1.149.
  • Chion, Michel. 2019. Audio-Vision: Sound on Screen. 2nd ed. New York: Columbia University Press.
  • Fritsch, Melanie, and Tim Summers. 2021. “Introduction.” In The Cambridge Companion to Video Game Music, edited by Melanie Fritsch and Tim Summers. Cambridge: Cambridge University Press.
  • Jørgensen, Kristine. 2008. “Audio and Gameplay: An Analysis of PvP Battlegrounds in World of Warcraft.” Game Studies 8 (2). https://gamestudies.org/0802/articles/jorgensen.
  • Stilwell, Robynn J. 2007. “The Fantastical Gap between Diegetic and Nondiegetic.” In Beyond the Soundtrack: Representing Music in Cinema, edited by Daniel Goldmark, Lawrence Kramer, and Richard Leppert, 184–202. Berkeley: University of California Press.
  • Taylor, Henry M. 2007. “Discourses on Diegesis: The Success Story of a Misnomer.” Offscreen 11 (8–9). https://offscreen.com/pdf/taylor_diegesis.pdf.
  • Watts, Catrin. 2018. “Blurred Lines: The Use of Diegetic and Nondiegetic Sound in Atonement (2007).” Music and the Moving Image 11 (2): 23–36. https://doi.org/10.5406/musimoviimag.11.2.0023.
  • Weir, Peter. 1998. “Notes.” The Truman Show soundtrack materials (Milan Records), reproduced at Philip Glass’s official site. https://philipglass.com/recordings/truman_show/.
  • Winters, Ben. 2010. “The Nondiegetic Fallacy: Film, Music, and Narrative Space.” Music & Letters 91 (2): 224–244. https://doi.org/10.1093/ml/gcq019.
  • Apocalypse Now. 1979. Directed by Francis Ford Coppola. United Artists. Film.
  • Atonement. 2007. Directed by Joe Wright. Focus Features. Film.
  • Baby Driver. 2017. Directed by Edgar Wright. Sony Pictures Releasing. Film.
  • Ferris Bueller’s Day Off. 1986. Directed by John Hughes. Paramount Pictures. Film.
  • Henry V. 1989. Directed by Kenneth Branagh. Samuel Goldwyn Company. Film.
  • The Truman Show. 1998. Directed by Peter Weir. Paramount Pictures. Film.

By Jan Sørup

Jan Sørup is an indie filmmaker, videographer, and photographer from Denmark. He owns FilmDaft.com and the Danish company Apertura, which produces video content for big companies in Denmark and Scandinavia. Jan has a background in music, has drawn webcomics, and is a former lecturer at the University of Copenhagen.