Idea: Driving Through Virtual Soundscapes

This is an idea for a new way to navigate interactively through large audio sets, such as collections of thousands of music tracks, and to automatically or interactively learn and evolve interesting trajectories through such spaces.

For each track, choose a representative location (by default, the middle of the track) and make a 2.5-second soundbite. This soundbite serves as an audio “icon” for the track; it should be representative of the track’s character.
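As a minimal sketch of the icon-extraction step, the function below computes the start and end times of a clip centered on the track’s midpoint, clamped to the track’s length (the 2.5-second default and the function name are illustrative, not part of any real API):

```python
def icon_window(duration_s: float, clip_s: float = 2.5) -> tuple[float, float]:
    """Return (start, end) times in seconds for an audio-icon clip
    centered at the midpoint of a track of the given duration."""
    mid = duration_s / 2.0
    start = max(0.0, mid - clip_s / 2.0)
    end = min(duration_s, start + clip_s)
    return (start, end)
```

Any audio toolkit that can slice by time range could then cut the actual soundbite from these boundaries.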

Next, tile a virtual 2D space with audio icons. There are many possible tiling algorithms. One would place tracks with similar bpms and keys closer to one another; others might tile by artist, genre, recording date, album, popularity ratings, or a combination of several of these measures.
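One very simple tiling heuristic, sketched below under the assumption that each track is a dict with hypothetical `key` and `bpm` fields: sort tracks by (key, bpm) and lay them into a near-square grid row by row, so that horizontal neighbors tend to be musically similar. A real layout would use a proper similarity measure, but this shows the shape of the idea:

```python
import math

def tile_tracks(tracks):
    """Lay tracks into a near-square 2D grid, ordered so that adjacent
    cells tend to share key and bpm. Returns {(row, col): track}."""
    ordered = sorted(tracks, key=lambda t: (t["key"], t["bpm"]))
    cols = math.ceil(math.sqrt(len(ordered)))
    return {(i // cols, i % cols): t for i, t in enumerate(ordered)}
```

The more ambitious tilings mentioned above (by mood, by several measures at once) would replace the sort key with a multidimensional embedding or an optimization step.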

Next, create a virtual vehicle that can move through this audio space. This vehicle is the “audio cursor”: whatever it sits on in the audio space is played. The listener moves the audio cursor using a navigation input device (such as a mouse, trackpad, or the iPod trackwheel). In particular, it can be turned left or right in the virtual 2D landscape, and its velocity could be changed with something functioning like a gas pedal. It might also be able to go forward or in reverse. An ideal interface would be a car-cockpit input device like those now sold for use with PC racing games (steering wheel, pedals, shifter, etc.).

Next, let the user “drive” through the audio landscape.

Randomly choose a starting point in the space and start playing it. As the listener changes direction (via the trackwheel), the system crossfades between the current clip and the next adjacent clip in space. Until the user has fully turned onto an adjacent clip, both clips play, with volume and fade weight adjusted in proportion to the direction of travel. The effect is that the clip you are turning away from moves to the correct surround-sound location as its volume fades out, while the clip you are turning toward moves to its correct surround-sound location as its volume fades in.
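The “in proportion to the direction of travel” rule can be sketched as a simple linear crossfade driven by how far the heading has rotated from the direction of the current clip toward the direction of the adjacent one (all angles in degrees; the function and its arguments are illustrative):

```python
def crossfade_gains(heading_deg: float, from_dir_deg: float, to_dir_deg: float):
    """Gains for (clip being left, clip being approached), proportional to
    how far the heading has turned from one direction to the other."""
    span = (to_dir_deg - from_dir_deg) % 360       # angular distance to cover
    progress = ((heading_deg - from_dir_deg) % 360) / span
    progress = min(1.0, max(0.0, progress))        # clamp to the turn
    return (1.0 - progress, progress)
```

A production mixer would likely use an equal-power curve instead of a linear one, but the proportionality to heading is the same.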

For simplicity, around every location in audio space there are 8 directions, corresponding to the 8 clips considered adjacent to it, and the universe wraps around at its edges. The same clip can also appear in many places in the space at once (so that it can be linked logically from other clips); in other words, there is no restriction that a clip exist in only one location (although this could be enforced if desired).
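The 8-neighbor, wrap-around adjacency described above is exactly the neighborhood of a cell on a toroidal grid, which is a few lines to compute:

```python
def neighbors(row, col, rows, cols):
    """The 8 cells adjacent to (row, col) on a grid that wraps at
    its edges (a torus), so every location always has 8 neighbors."""
    return [((row + dr) % rows, (col + dc) % cols)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]
```

Allowing the same clip at multiple locations then just means the grid maps several (row, col) cells to one clip identifier.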

Next, run genetic algorithms on the landscape to organize the clips in the best ways (thematically, by genre, according to mood, etc.). Imagine that this virtual audio landscape actually has geography and topography, corresponding to genres (location) and ratings or bpm (elevation).
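Whatever search procedure is used (a genetic algorithm as suggested here, or the simple sorting Elliott proposes in the comments), it needs a fitness function for a candidate layout. A plausible one, sketched below with an assumed user-supplied `similarity(a, b)` function, is the total similarity between each cell and its right and down neighbors on the wrap-around grid:

```python
def layout_fitness(grid, rows, cols, similarity):
    """Score a layout: sum of similarity between each cell and its
    right/down neighbors (wrapping). Higher = more coherent landscape."""
    score = 0.0
    for r in range(rows):
        for c in range(cols):
            score += similarity(grid[(r, c)], grid[(r, (c + 1) % cols)])
            score += similarity(grid[(r, c)], grid[((r + 1) % rows, c)])
    return score
```

A GA would mutate layouts by swapping cells and select for higher scores; a sort-based approach optimizes a related objective directly.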

Also imagine that users could create roadways through this space. Roadways evolve over time as a path is traveled more frequently; these roadways correspond to “playlists.” As listeners travel through the space they leave imprints that decay slowly over time. The more frequently users traverse a location, the “deeper” or “stronger” the imprint at that location.

As users navigate, provide them with feedback about the intensity of the imprint at their current location, as well as at each adjacent location around them. Ideally users should be able to sense both the cumulative imprint intensity at any location and its rate of change, and perhaps a momentum measure that combines the two. This feedback should influence the listener’s direction as they traverse the space: based on imprints, users should tend to navigate from the current track to the more highly imprinted adjacent tracks, thus reinforcing linkages along the most popular directions at each junction.

Alternatively, links could be established between each track and its adjacent tracks in order to separate the measure of track popularity from that of direction popularity. To accomplish this, let there be a directional link in each direction between each pair of adjacent tracks, and simply adjust the weight of each link according to how frequently it is traversed. It is also possible to adjust link weights automatically as a function of node (track) states, so that the links come to favor the more popular tracks. These methods of evolving paths through space are similar to ant-derived “scent-trail” collaborative rating schemes.
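The scent-trail mechanics above (directed link weights, reinforcement on traversal, slow decay, and steering toward the strongest imprint) can be sketched in a few lines; the class name and the reinforcement/decay constants are illustrative choices, not part of any existing system:

```python
class TrailMap:
    """Ant-style scent trails: directed link weights between adjacent
    tracks, reinforced on traversal and decayed each time step."""

    def __init__(self, reinforce=1.0, decay=0.95):
        self.weights = {}          # (from_track, to_track) -> imprint weight
        self.reinforce = reinforce
        self.decay = decay

    def traverse(self, a, b):
        # deepen the imprint on the link just traveled
        self.weights[(a, b)] = self.weights.get((a, b), 0.0) + self.reinforce

    def tick(self):
        # imprints decay slowly over time
        for link in self.weights:
            self.weights[link] *= self.decay

    def best_direction(self, a, candidates):
        # favor the most strongly imprinted adjacent track
        return max(candidates, key=lambda b: self.weights.get((a, b), 0.0))
```

Weighting by node state as well (track popularity, separate from direction popularity) would add a per-track score into `best_direction`’s key.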

Automatically save the trajectory that a user takes in each driving/listening session as a playlist, and allow users to name their playlists. At every junction between two tracks, search all playlists to see if any contain a sequence that traverses those nodes. Show the names of any matching playlists graphically as alternative paths from the current node. In other words, at any node the user can see which playlists intersect it and then choose to “follow the path” of any playlist of interest from that point.
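The junction lookup described here is a sequence-containment search; assuming playlists are stored as a name-to-track-sequence mapping, a straightforward version looks like this:

```python
def playlists_through(edge, playlists):
    """Names of playlists whose track sequence traverses the given
    (current, next) pair of adjacent nodes, in order."""
    a, b = edge
    return [name for name, seq in playlists.items()
            if any(seq[i] == a and seq[i + 1] == b
                   for i in range(len(seq) - 1))]
```

At scale this would be indexed (e.g. a map from each track pair to the playlists containing it) rather than scanned, but the semantics are the same.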

To further enhance the listening experience, consider combining all of the above with 3D visualization or immersive VR simulation, in which the user is literally navigating a shared persistent 2D or 3D audio space along with perhaps millions of other users. You could then travel with your friends, go to a place where lots of others have gathered (a live performance or new release, perhaps), or even follow someone (a “DJ”) on a tour or trajectory. You could also choose your direction in the landscape using both auditory and visual cues. This is especially important for providing “distance vision” in the audioscape, which is not practical through audio alone due to cacophony: while you can only hear the track you are playing and the track (if any) you are leaving or turning from, you can see visual representations of certain qualities of more distant tracks around you, perhaps elevation, color, texture, weather, light, and other features of a simulated landscape. You could then intuit that in a given direction the listening experience changes in a qualitative way, such as getting more groovy or more peaceful, or heading toward a certain mood or genre.


8 Responses to Idea: Driving Through Virtual Soundscapes

  1. teknos says:

    A Tour Through Sound

    Idea: Interactive Audio Spatial Navigation Absolutely brilliant idea from Nova. Read it, and come back. Okay- so here’s some additions. You could also add a way for “drivers” to add meta-content to improve connections – or to suggest connections that

  3. Matt says:

    Whilst it’s normally a good idea to base an interface on a skill that the user already knows, Driving has some limitations. Apart from the mechanics of the interface (what hardware? controls?) the essence of driving is A to B. Linear. Ignoring the surroundings. Making a turn just moves from one linear track to another. Music is an inherently pan-dimensional space.
    Something like a 3d-Defender or simulated hot air ballooning might be more fun, and apposite to exploration of the music space. Or ditch the R and make it DIVING!

  4. Nova Spivack says:

Well, you could use a joystick to navigate, and that would enable you to have an audio-vehicle that could drive, fly and dive, if you really want! Oh heck, I’ll even throw in a prototype quantum teleportation drive we are working on so you can hyperwarp through wormholes in space-time to distant locations instantly. I wasn’t going to add one of these because they’re still experimental, but my boys tell me they can have one ready for alpha testing by next week. Use at your own risk.

  5. unmediated says:

    Hating on Multimedia

    Maciej Ceglowski has posted an “audioblogging manifesto” (transcript here) that is worth a listen. His basic point, that dictation-style audioblog posts and talking-head-style videoblog posts are boring, a waste of time, and antithetical to the nature …

  7. anon says:

    Would love to know your thoughts on DHTs and semantic free references? See the link below-
    http://nms.lcs.mit.edu/projects/sfr/

  8. Elliott Back says:

“Next, run genetic algorithms on the landscape to organize the clips in the best ways” seems a little strange. Given some set of clips S with various properties and rankings p, so that for each x in S there is a set of p, wouldn’t simple sorting suffice, such as the kind that occurs in everyday database transactions? Genetic algorithms are slow and perform terribly as the number of items in S increases, and the number of possible permutations of parameters to optimize grows non-linearly.