adriana  sá
trans-disciplinary   music

Research summary


AG#2: exploring disparities between human perception
and digital analysis in audio-visual composition


AG#2 is the second version of a piece of 3D software developed for the audio input of a zither with aged strings, a personal tuning system, and specific playing techniques.

AG#2 implements the Arpeggio-Detuning software, extending its creative strategies to the audio-visual domain. The difference between the pitch detected from the zither and the closest tone or half tone is applied to the processing of digital sound and image, and to the audio-visual mapping.

Like AG#1, AG#2 adapts software intended for the creation of video games. Unlike AG#1, however, AG#2 explores my three creative principles from scratch, at a low level in the digital architecture. As a result, the sound, the image and the audio-visual relationship are far more organic. There is also greater control over the sound spatialisation features provided by the 3D engine.

The 3D engine was written by John Klima (2013), using an iOS/Android video game development system called Marmalade and an audio library called Maximilian. I made the specifications, the audio, the 3D world and the parameterisations.

Below are videos and a description of how the instrument explores my creative principles for the sound, the image and the audio-visual relationship. There are also some performance images.

Related articles:

Theoretical work:
Creative principles
Parametric model
Parametric representation
Practical work:
Practice summary





Screen capture: AG#2 with 3D world #1





The audio-visual instrument with AG#2




Screen capture: AG#2 with 3D world #2


Technical diagram > how the audio-visual instrument works:

Creative strategies for sonic complexity:

The audio-visual instrument has two interfaces for digital audio: the analysis from the zither input and a few switches assigned to audio sample banks. Whilst the switches provide control in defining musical sections, digital analysis creates unpredictability.

The pitch detected from the zither is mapped to the closest tone or half tone. That tone or half tone is not itself played back. It is further mapped to a pre-recorded sound, which plays back twice. A pitch-down value is applied during the second playback. This value equals the difference between the detected pitch and the closest tone or half tone. As the zither playing engages with these digital mappings and constraints, the music shifts in between tonal centres.
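
The mapping described above can be sketched as follows. This is a minimal illustration, not the AG#2 code: it assumes a twelve-tone equal-tempered grid referenced to A4 = 440 Hz, one pre-recorded sound per pitch class, and a pitch-down value equal to the size of the deviation in semitones. The function names are hypothetical.

```python
import math

A4_HZ = 440.0  # assumed reference; the actual reference used in AG#2 is not stated


def nearest_half_tone(freq_hz):
    """Return (semitone index relative to A4, deviation in semitones).

    The deviation is the detected pitch minus the closest tone/half tone,
    so it always lies within half a semitone of zero.
    """
    semitones = 12.0 * math.log2(freq_hz / A4_HZ)
    nearest = round(semitones)
    return nearest, semitones - nearest


def playback_plan(freq_hz, bank):
    """Pick a pre-recorded sound and the pitch-down for its second playback."""
    nearest, deviation = nearest_half_tone(freq_hz)
    sample = bank[nearest % 12]          # one sound per pitch class (assumption)
    pitch_down = abs(deviation)          # size of the detuning, in semitones
    rate = 2.0 ** (-pitch_down / 12.0)   # playback-rate factor, second playback
    return sample, pitch_down, rate
```

For instance, a detected pitch of 450 Hz lies about 0.39 of a semitone above A, so under these assumptions the second playback of the sound assigned to A would run at roughly 0.978 of its original rate. The further the zither string drifts from the grid, the larger the pitch-down.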

AG#2 includes three audio sample banks, each containing twelve pre-recorded sounds. I developed specific zither playing techniques in combination with each sample bank (table on the right).


Adriana Sá solo

  Demos & Collaborative works

Creative strategies for visual continuity:

The projected image is a shifting 3D camera view over a painting-like, underwater landscape, which morphs with sound. Visual changes happen at a level of detail; the projected image creates a stage scene that reacts to the sound, without distracting attention from the music.

My taxonomy of continuities and discontinuities shows that to keep the music in the foreground one must dispense with disruptive visual changes, i.e. radical discontinuities, which automatically attract attention. One should apply Gestaltist principles to visual dynamics: the image must enable perceptual simplification in order to provide a sense of overall continuity. There can be progressive continuities and ambivalent discontinuities. With progressive continuities, successive events display a similar interval of motion (Gestalt of good continuation). Ambivalent discontinuities are simultaneously continuous and discontinuous. At low resolution, the foreseeable logic is shifted without disruption. At high resolution, discontinuities become more intense.

3D world structure in AG#2:

1 Background:
static image rendered full screen, the backdrop.
2 Sky dome:
an enclosing, rotational image with multiple layers, which rotates consistently, independently of the input sound. This motion creates progressive continuity, conveying gestalts of good continuation and invariance [Wertheimer 1938]. Sky dome transparencies allow portions of the background image to become visible, creating ambivalent discontinuities.
3 Environment terrain:
a large-scale 64x64 vertex mesh with multiple image layers, which is not affected by sound and which can be thought of as a distant "landscape". Transparencies allow portions of the sky dome and the background to become visible beneath and behind the environment terrain.
4 Foreground terrain:
a 64x64 vertex mesh with multiple image layers, which can be thought of as the ground the 3D camera is standing upon. Each vertex can be displaced in real-time. Audio input produces undulations modulated by sine waves. Audio input also displaces individual vertices, creating elevations. These visual dynamics prompt the gestalts of invariance and similarity [Wertheimer 1938]. Transparencies allow portions of the environment terrain, the sky dome and the background to become visible, creating ambivalent discontinuities.
5 Water surface:
a small-scale 64x64 vertex mesh with multiple image layers and transparency. Each vertex can be displaced in real-time. The water moves consistently, creating progressive continuity. Given the camera stands upon the foreground terrain, the water surface layers upon the sky dome and the background. This layering creates additional ambivalent discontinuities. Water transparency is mapped to sound. Transparency changes convey gestalts of invariance [Wertheimer 1938].
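
The vertex displacement described for the foreground terrain can be illustrated with a small height-field sketch. It is not taken from the AG#2 engine: the travelling sine wave, the Gaussian bump shape and all parameter names are assumptions chosen only to show undulations and elevations combining on a 64x64 mesh.

```python
import math

GRID = 64  # vertices per side, as in the 64x64 meshes described above


def terrain_heights(t, wave_freq, wave_amp, elevations):
    """Return a GRID x GRID list of vertex heights at time t (seconds).

    wave_freq and wave_amp stand in for audio-driven undulation parameters;
    elevations is a list of (row, col, height, radius) bumps, one per detection.
    All of these names and the bump shape are illustrative assumptions.
    """
    heights = []
    for r in range(GRID):
        row = []
        for c in range(GRID):
            # Undulation: a sine wave travelling across the mesh.
            h = wave_amp * math.sin(2.0 * math.pi * (wave_freq * t + c / GRID))
            # Elevations: smooth bumps centred on the displaced vertices.
            for er, ec, eh, radius in elevations:
                d2 = (r - er) ** 2 + (c - ec) ** 2
                h += eh * math.exp(-d2 / (2.0 * radius ** 2))
            row.append(h)
        heights.append(row)
    return heights
```

Because the undulation changes every vertex by a small, related amount, the surface keeps its overall shape while it moves, which is one way to picture the invariance and similarity mentioned above.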


Creative strategies for a fungible audio-visual relationship:

The audio-visual mapping is fungible: cause and effect relationships range from transparent to opaque, producing a sense of causation, and simultaneously confounding the cause and effect relationships. The table below shows that certain visual changes are synchronised with sound, other changes occur with variable delay upon detection, and there are also visual changes that do not depend on sound.

The 3D camera direction shifts and the terrain elevations are synchronised with audio detection: a digital sound is emitted, a terrain elevation emerges, and the camera view shifts toward a new target direction. The water transparency parameter mapped to that same detection is synchronised with the second playback of the corresponding sample, the one that is pitched down according to the difference between the detected pitch and the closest tone or half tone. The foreground terrain undulates according to variable frequencies, which change with variable delay upon detection. The sky dome and the water surface move consistently, at different paces, and independently of audio detection.
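
The three kinds of timing described above can be pictured as a small event schedule. This is a sketch under stated assumptions rather than the instrument's logic: it supposes that the second playback starts when the first one ends, and that the variable delay falls between 0.5 and 3 seconds. The function name, the labels and the delay range are hypothetical.

```python
import random


def visual_events(detect_time, sample_duration, rng=random):
    """List (time, visual change) pairs triggered by one audio detection.

    Motion of the sky dome and the water surface is omitted on purpose:
    it runs continuously and does not depend on detection.
    """
    second_playback = detect_time + sample_duration  # assumes back-to-back playback
    undulation_delay = rng.uniform(0.5, 3.0)         # assumed delay range, seconds
    return sorted([
        (detect_time, "camera turns toward new target"),
        (detect_time, "terrain elevation emerges"),
        (second_playback, "water transparency changes"),
        (detect_time + undulation_delay, "terrain undulation frequency changes"),
    ])
```

Two changes share the detection time and read as caused by the sound, one is tied to the pitched-down second playback, and one arrives after an unpredictable delay, which blurs the link between cause and effect.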



AG#2 implements 3D sound spatialisation features designed from scratch, considering the principles of sonic complexity and audio-visual fungibility. The elevations of the foreground terrain (see above) are sound emitters, which means that the digital sound output is routed dynamically to the speakers; the routing depends on the 3D camera position relative to the position of the elevation in the digital 3D world. I use an inverted, double stereo system (see below), rather than a multi-channel sound diffusion system. The speed at which the sounds move through the system corresponds to the speed of the 3D camera. This produces a sense of causation. Yet a visible sound emitter on the left of the screen, for example, is simultaneously emitted through the front-left speaker and the rear-right speaker. That confounds the cause and effect relationships.
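
One way to picture the inverted, double stereo routing is the sketch below. It is an illustration only: the equal-power pan law, the coordinate convention and the function name are assumptions, and the actual routing in the 3D engine may differ.

```python
import math


def speaker_gains(camera_xz, camera_heading, emitter_xz):
    """Return gains for (front_left, front_right, rear_left, rear_right).

    camera_heading is in radians, 0 pointing along +z, positive turning
    toward +x. Positions are (x, z) pairs in the 3D world.
    """
    dx = emitter_xz[0] - camera_xz[0]
    dz = emitter_xz[1] - camera_xz[1]
    # Angle of the emitter relative to the view direction: negative = left.
    azimuth = math.atan2(dx, dz) - camera_heading
    pan = max(-1.0, min(1.0, math.sin(azimuth)))  # -1 = hard left, +1 = hard right
    theta = (pan + 1.0) * math.pi / 4.0           # equal-power pan law (assumed)
    left, right = math.cos(theta), math.sin(theta)
    # Front pair follows the image; rear pair carries the same signal, swapped.
    return left, right, right, left
```

With the camera at the origin looking along +z, an emitter at (-1, 0) sits hard left and yields full gain on the front-left and rear-right speakers only. Because the gains depend on the camera position and heading, the sound travels through the speakers at the pace of the camera motion.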



AG#2 has been developed along with a performance series called Included Middle. I performed with this version of the audio-visual instrument in Portugal, the U.K. and the Netherlands, solo and in collaboration with other musicians.

Solo performances were in London, at Amersham Arms and at St. James Church/Goldsmiths College; in Viseu, at Invisible Places/Jardins Efémeros; and in Amsterdam, at STEIM. A collaborative performance was at Teatro Maria Matos, in Lisbon, with John Klima and Tó Trips. Two more were at Carpe Diem, also in Lisbon: a duo with Helena Espvall, and a trio with Nuno Torres and Nuno Morão.