Abstract |
---|
We present the analysis of crowdsourced studies into how a population of Amazon Mechanical Turk Workers describe three commonly used audio effects: equalization, reverberation, and dynamic range compression. We find three categories of words used to describe audio: ones that are generally used across effects, ones that tend towards a single effect, and ones that are exclusive to a single effect. We present select examples from these categories. We visualize and present an analysis of the shared descriptor space between audio effects. Data on the strength of association between words and effects is made available online for a set of 4297 words drawn from 1233 unique users for three effects (equalization, reverberation, compression). This dataset is an important step towards implementing an end-to-end language-based audio production system, in which a user describes a creative goal, as they would to a professional audio engineer, and the system picks which audio effect to apply, as well as the setting of the audio effect. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2964284.2967207 | ACM Multimedia |
Keywords | Field | DocType |
Interfaces,audio engineering,effects processing,signal processing,reverberation,equalization,compression,vocabulary,crowdsourcing | Computer vision,Population,Speech coding,Computer science,Audio mining,Folksonomy,Artificial intelligence,Audio signal processing,Multimedia,Dynamic range compression,Vocabulary,Professional audio | Conference |
Citations | PageRank | References |
1 | 0.36 | 6 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Taylor Zheng | 1 | 1 | 0.36 |
Prem Seetharaman | 2 | 28 | 7.14 |
Bryan Pardo | 3 | 830 | 63.92 |