2. Physics of Sound

Sound

  1. Sound is a disturbance of an elastic medium.
  2. Sound is a form of energy which propagates in three dimensions as a disturbance of the molecules of an elastic medium (for example, a gas [air], a liquid [water], or a solid [your bedroom wall]). Sound does not travel through a vacuum. Sound is not a particle itself; it exists as a wave which oscillates above and below an equilibrium.

Sound Waves

  1. Sound waves are sinusoidal, longitudinal variations of pressure occurring around an equilibrium. 
  2. Individual waves are called sound waves, and in their most basic, purest state these are sine waves. All sounds are ultimately made up of sine waves, even those we perceive as being more complex (whether audibly, or using visual measurement tools). Complex sound waves are made of two or more sine waves, and sound as it exists in the environment involves an effectively infinite number of interacting sine waves. A sine wave in the pure mathematical sense does not exist in nature as a sole phenomenon.
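For readers comfortable with a little code, the following rough Python sketch (assuming the numpy library is available; the frequencies and amplitudes are arbitrary choices) illustrates the idea that a complex wave is simply a sum of sine waves.

    import numpy as np

    sample_rate = 44100                        # samples per second
    t = np.arange(sample_rate) / sample_rate   # one second of time values

    # Three arbitrary sine wave components (frequency in Hz, amplitude 0-1)
    components = [(110, 1.0), (220, 0.5), (330, 0.25)]

    # A complex wave is simply the sum of its sine wave components
    complex_wave = sum(amp * np.sin(2 * np.pi * freq * t) for freq, amp in components)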

The energy of sound waves dissipates as they travel, so they can only be heard over a finite distance. The speed of sound depends on the medium: it moves faster through media whose molecules are more tightly coupled, so sound travels faster through water than through air, and faster in the warm, dense air at low altitude than in the cold, thin air at high altitude. Sound waves, like light waves, also diffract; because sound has much longer wavelengths than light, we are able to hear sounds around the corners of solid objects whose source we might not be able to see.

Frequency

 

  1. The frequency of a sound is the rate at which the elastic medium oscillates above and below the equilibrium, which in turn determines the wavelength. Frequency is measured in the time domain in Hertz (Hz).
  2. Frequency is the number of times per second a sound wave repeats its cycle, measured in Hertz (Hz). Frequency is closely related to pitch; you might hear reference to the pitch of a string, for example. The difference is that frequency is a mathematical measurement of a physical phenomenon, whereas pitch is a description of our perception of the sound. For example, two people might describe experiencing a different pitch from the same objectively measured frequency. This might be due to any number of factors, including the physical environment in which the sound occurs, the equipment used to generate the sound, and the listener’s attunement to the harmonic content within the sound.

Low and high frequency sound waves behave differently in physical environments. For example, low frequency sound waves are able to travel long distances, and diffract around corners and through seemingly solid objects. High frequency sound waves do not travel as far, and are converted into heat energy more quickly than low frequency waves when they come into contact with solid objects. 

Frequency and amplitude are independent of one another: a sound might have a high frequency and a low amplitude, or a low frequency and a high amplitude, or any combination of the two.
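The relationships between frequency, period (the duration of one cycle) and wavelength can be illustrated with a small worked example. This sketch assumes a speed of sound of roughly 343 m/s in air at room temperature; the example frequencies are arbitrary.

    # Assumed speed of sound in air at room temperature (~343 m/s)
    speed_of_sound = 343.0

    for frequency in (50, 440, 10000):           # Hz
        period = 1.0 / frequency                 # seconds per cycle
        wavelength = speed_of_sound / frequency  # metres per cycle
        print(f"{frequency} Hz: period {period * 1000:.2f} ms, wavelength {wavelength:.2f} m")

Notice how the low frequency produces a wavelength several metres long, which is part of why low frequencies diffract around obstacles so readily.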

Amplitude

  1. The amplitude of a sound is determined by the range by which the disturbance causes the elastic medium to rise above and below the equilibrium. 
  2. Amplitude is a measurement of the range by which the elastic medium through which a sound travels fluctuates above and below the equilibrium. Amplitude might also be called volume or loudness, however these terms have other specific meanings and should not be confused with amplitude. Like frequency, amplitude is an objective mathematical measurement, and is what a peak meter displays. Root Mean Square (RMS) is an averaged level integrated over time, volume is the perception of each individual listener, and Loudness (measured in LUFS) is an algorithmically calculated measurement of perceived loudness over time, which applies an EQ curve adapted to human hearing.
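The difference between peak and RMS measurements can be shown in a few lines of Python (assuming numpy; the test signal is an arbitrary 440 Hz sine at half of full scale). LUFS is not shown here because it requires the full loudness-measurement algorithm with its hearing-weighted EQ curve.

    import numpy as np

    sample_rate = 44100
    t = np.arange(sample_rate) / sample_rate
    signal = 0.5 * np.sin(2 * np.pi * 440 * t)   # a 440 Hz sine at half of full scale

    peak = np.max(np.abs(signal))                # what a peak meter reports
    rms = np.sqrt(np.mean(signal ** 2))          # root mean square (average level)

    # Levels are usually expressed in decibels relative to full scale (dBFS)
    print(f"peak: {20 * np.log10(peak):.1f} dBFS, RMS: {20 * np.log10(rms):.1f} dBFS")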

Phase

  1. Phase is a periodic measurement of the amplitude of a single point of the sound wave in relation to the equilibrium of the elastic medium.
  2. Phase is a periodic measurement of the amplitude of a single point of the sound wave in relation to the equilibrium of the elastic medium. A sound wave will always be ‘in’ phase with itself, however two or more sound waves might be ‘out’ of phase with one another, resulting in phase cancellation. If two sine waves are perfectly in phase, their amplitudes are summed together, and our ears perceive the combined sound as louder. When two sine waves of exactly the same frequency are not in phase, the phase difference reduces the summed amplitude of the two waves; this can result in anything from total silence through to a barely perceptible difference to our ears. Since sound waves are typically complex, phase cancellation may also occur periodically across two or more interrelating waveforms, resulting in spectral changes (changes in perceived frequency content over time). 

Phase cancellation in audio isn’t inherently ‘bad’; it can be a creative tool depending on your creative philosophy, and it’s also a largely unavoidable natural phenomenon which allows our ears to process and perceive the spatial location of sounds. Learning to perceive phase cancellation in audio, however, provides you with the option of engaging with this phenomenon with creative intent.
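Phase summing and cancellation can be demonstrated numerically. The sketch below (Python with numpy; 440 Hz is an arbitrary choice) sums two identical sine waves at different phase offsets: perfectly in phase the amplitudes add, and at 180 degrees they cancel almost completely.

    import numpy as np

    sample_rate = 44100
    t = np.arange(sample_rate) / sample_rate
    freq = 440  # Hz

    wave_a = np.sin(2 * np.pi * freq * t)

    for phase_deg in (0, 90, 180):
        wave_b = np.sin(2 * np.pi * freq * t + np.radians(phase_deg))
        summed = wave_a + wave_b
        print(f"{phase_deg:>3} degrees out of phase -> peak of sum: {np.max(np.abs(summed)):.3f}")

    # 0 degrees   -> peak of ~2.0 (perfectly in phase: amplitudes sum)
    # 180 degrees -> peak of ~0.0 (total cancellation)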

Fundamental Frequency

  1. The lowest frequency sine wave component of a complex sound.
  2. The Fundamental Frequency is the lowest frequency sine wave component of a complex sound, from which all of the other harmonics are derived. All complex sounds have a Fundamental Frequency which can be objectively measured, and the Fundamental Frequency is also the true pitch of the sound. However, due both to measurement equipment and to the ear’s attunement to harmonic content, you may experience other bands of the sound’s content as more present than the fundamental.

Harmonics

  1. Harmonics are frequencies derived from the Fundamental Frequency in complex sounds.
  2. Harmonics (or harmonic content) are frequencies derived from the Fundamental Frequency of a complex sound. The harmonics might follow a simple mathematical sequence, such as in a sawtooth wave, or the 2nd order harmonic distortion which occurs in some analogue electrical equipment, or be far more complicated, like the Bessel functions associated with the vibrations of physical cymbals and gongs. Harmonics might also be stochastic (random), or many times more complex than we could practically calculate. 
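As a small worked example of a harmonic sequence, the sketch below lists the first few harmonics of an arbitrary 110 Hz fundamental, using the idealised sawtooth relationship in which the nth harmonic appears at a relative amplitude of 1/n. Real instrument spectra are rarely this tidy.

    fundamental = 110.0   # Hz (an arbitrary example fundamental)

    # In an idealised sawtooth wave, every harmonic is present at a relative amplitude of 1/n
    for n in range(1, 6):
        print(f"harmonic {n}: {fundamental * n:.0f} Hz, relative amplitude {1 / n:.2f}")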

 

3. Generating sound

Non-Instrument Sources

Because sound is a form of energy transference, it is generated by any interaction between two materials in an environment which contains an elastic medium. Whilst in music we often think of a purpose-built instrument or the human voice as our target sound source, we must also consider environmental noise, embodied noise, machine noise, and noise as a by-product of our direct interaction with our environment.

The fan in a computer, the analogue circuitry in a synthesiser, the hiss of a valve amp or a DAC, reflections in a room caused by human movement, traffic noise, the wind: all of these are sources of sound which we might not be concentrating on fully because they are often all around us. Any of these sounds might be captured on an audio recording, though whether they are classed as ‘unwanted’ sound pollution, or as integral parts of our sense of reality, is a subjective philosophical difference. Because that judgement is never fixed, it also depends on the agency and technique applied to the scenario at hand.

One definition of ‘musical instrument’ might be ‘a tool that was purposefully designed to make music’. This is reductive however, and non-instruments can be repurposed or recontextualised as instruments simply by interacting with them with creative agency. For example, an audio recording of footsteps, or of cars passing on a road, could become rhythmic elements in a composition. An item never intended as a percussion instrument, such as a box of cereal, could be used as a shaker. Our individual definition of music is deeply culturally and socially ingrained, and the distinction between tools we think of as designed for music making and those we do not is part of that perception. If we collectively limit our expectations about what an instrument (or music itself) is or isn’t, we limit our scope for creativity.

Acoustic Instrumentation

The Human Voice

The human voice is the oldest form of acoustic instrumentation. The vocal cords are folds of muscle and other tissue inside the larynx, located in the throat, which vibrate as air from the lungs passes between them. This produces sound, the pitch, duration, dynamics and timbre of which can be controlled by the singer.

Wind Instruments

A wind instrument is a physical object which relies on air (typically from a human mouth) being passed through a structure and a system of air vents to produce a sound. As these air vents are opened or closed in different combinations, the effective length of the air column within the structure changes, which changes the pitch of the sound waves as they exit the structure. These structures might be entirely passive, for example a recorder, or more complex, such as a trumpet with mechanical valves, or a trombone with a sliding mechanism. 

Wind instruments might also contain a reed, a thin strip of material located in the mouthpiece of the instrument which vibrates as the performer blows air past it, producing sound. Reed instruments are classed as woodwind instruments, though this does not imply that they are all made solely from wood (and the woodwind family also includes reedless instruments such as the flute). Woodwind instruments include the saxophone, oboe and clarinet amongst others.

Since wind instruments have such a long history, technological innovations have contributed to changing designs. For example, a flute might be entirely passive and carved from bone or wood, but might also be made from metal, contain a reed and have a series of mechanical valves. These instruments have the same name and lineage, but are not constructed or played in the same way. 

String Instruments 

String instruments utilise an elastic string stretched between two fixed positions (there are exceptions, such as the pedal steel guitar, in which these positions can be moved by the performer). The string is typically plucked or struck (as on a harp or guitar), or bowed using an additional object called a bow (as on a violin). A bow also uses an elastic material stretched between two fixed positions, and generates vibrations in the string of the instrument via friction.

Similarly to the technological innovations found in wind instruments, technology has also had a profound impact on string instruments. For example, harpsichords pluck their strings via a mechanism when a key is pressed, and pianos (which were invented later) strike the string with a hammer. String instruments have also been adapted to incorporate transducer pickups to amplify their sound.

Percussion

Percussion instruments are perhaps the most varied category, since they don’t share any particular defining characteristic (in the way that string instruments do). Percussion instruments include drums with membranes, gongs, idiophones and metallophones (such as marimbas and glockenspiels), and non-pitched instruments such as metal cymbals and beaded shakers. The range of design, construction, purpose and technique applied to percussion instrumentation is such that each category deserves its own explanation.

Drums are typically cylindrical ‘shells’ with an elastic membrane stretched over one or both ends. When the membrane is struck, its motion pushes air into the cylinder, which resonates and amplifies the sound.

Gongs and cymbals are typically circular disks of metal which are struck. The vibrations which exist in these instruments are incredibly complex, and produce an extremely dense set of overtones. 

Idiophones and metallophones are instruments which typically feature rectangular bars of different sizes. The size of each bar determines the pitch produced when it is struck. In this context, idiophones such as the xylophone and marimba have bars made from wood, and metallophones such as the glockenspiel have bars made from metal. Some are built atop a frame with a cylindrical resonator tube suspended under each bar. These lengths of tubing are individually sized to accentuate the volume and overtones of each bar.

Hand Percussion

Hand percussion is any percussion instrument which is entirely hand held, and includes shakers (which can be as simple as rice inside a metal can), and tambourines (which typically have a number of small metal disks attached to a handheld frame).

Synthesis

[It might be worth pre-empting the synthesis section with a short introduction, since there is so much heterogeneity between architectures, systems and technology. String instruments all work on the same physical principles, for example, and are generally just different shapes, whereas each area of synthesis is distinct yet also presents significant crossover. It makes more sense to me to present a description of analogue, digital, sampling and software synthesis before discussing function generators etc. HOWEVER, it cannot be discounted that a user might only navigate to a single section, and if that section is not sufficiently detailed then this is an issue.

The below is an example from my course material, but it still requires editing to be suitable, for example removing brand names. The tone is conversational and there is emphasis added for the typical classroom setting.]

Additive Synthesis - Uses multiple sine waves stacked on top of each other to create complex harmonic content. This principle is used in the Hammond organ, and was later used in the 1980s to create incredibly complicated synths like the Kawai K5 (1987).

Subtractive Synthesis - Uses Voltage Controlled Oscillators (VCO) or Digitally Controlled Oscillators (DCO) to create waveforms which are then processed using filters, Low Frequency Oscillators (LFO), Envelopes and other processes such as ring modulators. See Minimoog (1970), Roland Juno 60 (1982)

FM Synthesis - The frequency of one sine wave modulates the frequency of another, creating a complex new wave. This system uses several ‘operators’ (which can be thought of as traditional oscillators), some of which are heard as sound (and called carriers) and others which are used solely to modulate the carriers (called modulators). Carriers and modulators can be set up in complex ‘algorithms’ and this technology can create some remarkable results. Later FM synths introduced a range of different waveforms in addition to the simple sine wave, but the first FM synths to hit the market in the early 1980s changed the sound of pop music for the entire decade! See Yamaha DX7 (1983)
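The core of FM synthesis can be reduced to one line of maths: the modulator’s output is added to the carrier’s phase. The sketch below (Python with numpy; the operator frequencies and modulation index are arbitrary) is a minimal two-operator example - commercial FM synths chain many operators into their algorithms.

    import numpy as np

    sample_rate = 44100
    t = np.arange(sample_rate) / sample_rate

    carrier_freq = 440.0     # the operator we hear (the carrier)
    modulator_freq = 220.0   # the operator used only to modulate
    mod_index = 3.0          # modulation depth: higher values add more sidebands

    modulator = np.sin(2 * np.pi * modulator_freq * t)
    fm_wave = np.sin(2 * np.pi * carrier_freq * t + mod_index * modulator)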

Phase Distortion - A technology employed by Casio’s CZ range of synthesizers in the 1980s. It sounds similar to FM synthesis but is subtly different and capable of reproducing the sound of analogue filter sweeps. See Casio CZ1000 (1984)

Linear Arithmetic (S&S) Synthesis - Roland released the D50 in 1987, which used very short samples of real instruments backed up by more traditional digital oscillators. At the time this sounded far more realistic than existing technology and was another huge step forward, influencing the sound of pop music of the period. See Roland D50 (1987)

Physical Modelling - This technology uses Digital Signal Processors (DSPs) and was first used in synths like the Korg Prophecy. It is the same technology used in modern plug-in ‘softsynths’ that model real life instruments. Physical modelling can create realistic instruments that sampling could never recreate; for example, the Hammond organ emulation on a Nord keyboard allows the user to access all of the roughly 250 million drawbar settings available on the original organ! This meant that the Korg Prophecy could create realistic trumpet and string sounds AND sound like a Minimoog, all in the same package. See Korg Prophecy (1995), Mutable Instruments Rings (modern Eurorack)

Wavetable/Vector Synthesis

Wavetable synthesis allows the user to morph between complex waveforms or samples using LFO or ADSR settings, and in some cases a joystick attached to the synth. The result is often described as having a slow, ‘evolving’ character, but it can also create surprising rhythmical textures. Though originally difficult to program using the limited user interfaces of the early synths, it has had a modern resurgence and is implemented in much more manageable form in Ableton’s Wavetable, for example. See Korg Wavestation (1990)

Sampling

Sampling is essentially recording a sound to be played back later, either by a keyboard or sequencer. Samplers are solely digital devices, although some sampling keyboards also had analogue filters to warm up the sound. Sample quality is described by sample rate and bit depth: a CD (compact disk) uses 16-bit samples, whereas early samplers might be 8-bit or 12-bit. One of the first samplers was the Fairlight CMI, which would have cost about £200,000 in today’s money! This was followed by slightly more affordable models such as the EMU Emulator. This meant that musicians could, for the first time, play the real sound of a violin or a choir from their keyboard, and the concept was incredibly popular. Some synthesisers use samples which are processed through more conventional techniques such as filters and ADSR; some of these couldn’t actually sample audio, they just played it back.

CD quality sampling only became a reality for musicians in the late 1980s with the introduction of the Akai S1000. The modern DAW (Logic/Ableton etc) is often described as a sampler, as it uses exactly the same concept of digitally storing sound. See Fairlight CMI (1979), EMU Emulator (1981), Akai S950 (1988), Korg M1 (1988)
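The practical meaning of bit depth can be shown with a quick calculation: each extra bit doubles the number of amplitude steps available to a sample and adds roughly 6 dB of dynamic range. The sketch below is plain Python.

    import math

    # Bit depth sets how many discrete amplitude steps each sample can take,
    # and (roughly) the available dynamic range: about 6 dB per bit.
    for bits in (8, 12, 16, 24):
        levels = 2 ** bits
        dynamic_range_db = 20 * math.log10(levels)
        print(f"{bits}-bit: {levels} levels, ~{dynamic_range_db:.0f} dB dynamic range")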

Drum Machines

Drum machines are units which aim to reproduce the percussive sound of drums. They come in two main types, but generally follow the same technology found in other forms of synthesis.

Analogue - The sounds are created from oscillators with ADSR envelopes to shape the sound. Usually, these envelopes are pre-set but sometimes there are controls to fine tune the sounds. See Roland TR-808

Sample Based - These units hold pre-recorded samples of drums on ROM chips. Sometimes these machines have controls to change the pitch of the samples although this is quite rare. See Oberheim DMX

Other - Some drum machines employ physical modelling synthesis, and some combine analogue and sample-based technology. The Roland TR-909 used analogue drums and sampled cymbals.

Analogue Synthesis

Analogue synthesis produces sound by generating rapid fluctuations of voltage in an electrical circuit. The nature of the sound is determined by the same functions and parameters discussed in the physics of sound section. An analogue synthesiser is built from a series of distinct circuits which perform different duties; as such, there are many different analogue synthesis systems, utilising various combinations of these circuits. 

A synthesiser may or may not contain the following circuits, and these circuits may or may not be hardwired. In some cases the routing is soldered in by the manufacturer and cannot be changed; in others it can be changed using set parameters such as switches; and in the case of modular and semi-modular synthesisers, circuits can be freely connected from the front panel using patch cables. Digital and software synthesisers may also be either hardwired or provide flexible routing options.

Analogue synthesis is sometimes called subtractive synthesis, but this is misleading, since subtractive synthesis is a method of programming a synthesiser (which may or may not be analogue itself). Analogue systems might be equally capable of frequency modulation or additive synthesis, for example.

Oscillators

The speed of the voltage fluctuations determines the pitch produced by the circuit. These voltage fluctuations are commonly known as oscillations, and the circuit which produces them is called an oscillator. A single analogue oscillator is only capable of producing simple waveforms, such as triangle, square and sawtooth, each with a characteristic harmonic content determined by the number and relative level of its harmonics. Dual oscillators set up to modulate one another using ring, frequency or amplitude modulation are called complex oscillators, and these are capable of producing more complex waveforms.

Oscillators are either entirely voltage controlled (VCO), or have their tuning stabilised by an additional digital circuit (DCO). In this context a DCO is still an analogue oscillator. A synthesiser might also include digital oscillators within an otherwise fully analogue signal path, or have VCOs but other digital circuits; these synthesisers are known as hybrid synthesisers.

Filters

The timbre of the sound in an analogue synthesiser can be controlled by a filter, another distinct circuit, which removes (and in some cases boosts) harmonic content. Common filter types are low-pass (removes harmonic content above a set frequency), high-pass (removes content below a set frequency), and band-pass (removes content either side of a set frequency). The set frequency is known as the cutoff. If a filter includes a resonance control, then the harmonics around the cutoff can also be boosted at the same time as the filter cuts. If a filter is capable of self-oscillation, the resonant peak will produce an audible pitch.
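As a loose digital analogy for what a low-pass filter does, the sketch below (Python with numpy; a hypothetical helper, not any particular synthesiser’s circuit) implements a simple one-pole low-pass filter with no resonance. Analogue filter circuits are considerably more complex, but the behaviour - attenuating content above the cutoff - is the same in principle.

    import numpy as np

    def one_pole_lowpass(signal, cutoff_hz, sample_rate=44100):
        # Coefficient derived from the cutoff frequency: a higher cutoff means less smoothing
        coeff = np.exp(-2 * np.pi * cutoff_hz / sample_rate)
        output = np.zeros(len(signal))
        previous = 0.0
        for i, sample in enumerate(signal):
            # Each output sample blends the input with the previous output,
            # which progressively attenuates content above the cutoff
            previous = (1 - coeff) * sample + coeff * previous
            output[i] = previous
        return output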

Function Generators

[‘Function generator’ is the umbrella term for EGs, LFOs etc.]

Envelope Generators

Various controls can be automated in the time domain using an envelope generator, for example the amplitude of the signal or the cutoff frequency of the filter. Envelope generators are triggered using a control voltage which determines an on or off state. When the voltage is zero the circuit is in the off position; if the voltage is increased beyond a set threshold (for example, 5V or 10V depending on the system), the circuit is in the on position. A trigger is a short burst of voltage, and a gate is a continuous voltage applied to the circuit, for example for the duration that a key on a connected keyboard is held.

Envelope generators may perform a single duty, or have several controls dedicated to different portions of the envelope. For example, an AR envelope provides control over the attack and release, an ADSR envelope provides control over the attack, decay, sustain and release functions.

Attack - Controls the time it takes for the sound to reach full amplitude after the envelope is triggered.

Decay - Controls the time it takes for the sound to reach the Sustain phase after the envelope is triggered.

Sustain - Controls the volume at which the note continues after the Decay phase until the gate signal ends. Sustain requires a gate signal at input, since a gate is a continuous voltage over time.

Release - Controls how quickly the amplitude returns to zero after the gate signal ends.
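The four ADSR stages can be sketched as a simple piece of Python (assuming numpy; the function and its parameter values are illustrative, using straight-line segments, whereas real envelope circuits usually produce exponential curves).

    import numpy as np

    def adsr(attack, decay, sustain_level, release, gate_time, sample_rate=44100):
        # Times are in seconds, levels between 0 and 1; all segments are linear
        a = np.linspace(0.0, 1.0, int(attack * sample_rate))             # rise to full level
        d = np.linspace(1.0, sustain_level, int(decay * sample_rate))    # fall to the sustain level
        s_time = max(gate_time - attack - decay, 0.0)                    # held while the gate is high
        s = np.full(int(s_time * sample_rate), sustain_level)
        r = np.linspace(sustain_level, 0.0, int(release * sample_rate))  # fall to zero after the gate ends
        return np.concatenate([a, d, s, r])

    # A pluck-like envelope: fast attack, quick decay, moderate sustain, longer release
    envelope = adsr(attack=0.01, decay=0.1, sustain_level=0.7, release=0.3, gate_time=1.0)

Multiplying this envelope, sample by sample, against an oscillator’s output is exactly the job a VCA performs in an analogue system.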

Low Frequency Oscillators

A low frequency oscillator (LFO) works in exactly the same way as a VCO or DCO, and shares the same basic waveforms and cross-modulation potential. Typically LFOs are constructed to produce much slower cycle rates than an audio oscillator, far below the range of human hearing, and these gradual changes in voltage can be used to control other functions over time, such as the cutoff frequency of a filter, or the pitch of an oscillator. However, since LFOs and VCOs/DCOs are essentially identical in concept, some VCOs can be slowed to LFO rates and vice versa. If an LFO is capable of producing a cycle rate fast enough to be audible, it is called an audio rate LFO.
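A classic example of LFO modulation is vibrato: a slow sine wave nudging an oscillator’s pitch up and down. The sketch below (Python with numpy; the 5 Hz rate and ±5 Hz depth are arbitrary) shows the idea.

    import numpy as np

    sample_rate = 44100
    t = np.arange(sample_rate) / sample_rate

    lfo = np.sin(2 * np.pi * 5 * t)                # a 5 Hz LFO, far below audio rate
    pitch = 440 + 5 * lfo                          # oscillator pitch swings +/- 5 Hz (vibrato)
    phase = 2 * np.pi * np.cumsum(pitch) / sample_rate
    vibrato_wave = np.sin(phase)                   # the audible, pitch-modulated oscillator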

Voltage Controlled Amplifier

A voltage controlled amplifier (VCA) is a circuit whose gain is controlled by an incoming voltage. For example, an envelope generator can change the amplitude of the outputted signal over time, and an LFO would create a cyclical increase and decrease in output amplitude based on its frequency and shape. A VCA can be routed directly to the output of the synthesiser, or be used to further modulate internal parameters.

Mixer

The mixer section of a synthesiser controls the amplitude of inputted signals, and sums (combines) these prior to being outputted. A mixer section might control the amplitude of each oscillator for example before these are sent to a filter (or other circuit), or the mixer section might route these summed signals directly to the line or headphone output of the synthesiser. 

Digital Synthesis 

Digital synthesis is produced via digital signal processing (DSP), using a set of mathematical algorithms whose output is represented numerically. Digital synthesis requires a digital to analogue converter (DAC) to convert these numbers into alternating current. Digital synthesisers utilise many of the same function generators as analogue synthesis, however the technologies they use, and the resulting architecture and interfacing, are heterogeneous, and thus require the same level of understanding individually as analogue synthesis does in its own right. For example, phase modulation, frequency modulation, additive, linear arithmetic, wavetable and physical modelling are all distinct methods of synthesis, each requiring its own specific circuitry and/or software to function. 

WIP*** - Note: Explain various types of digital synthesis 

Samplers

A sampler is a device which samples audio, meaning that it takes an inputted sound source and digitises it into a PCM ‘sample’. The sampler device then typically has features which enable the user to trim, truncate, reverse, repitch and adjust the amplitude of the PCM sample. Sampler devices are then capable of replaying samples via MIDI keyboard, sequencer or trigger pads, depending on the design. For example, a sampler could record (via a microphone and cable) a single note on a violin; that note is then edited in the sampler and mapped to the keys of a MIDI keyboard, and the sample is played back at the appropriate pitch across the keyboard. Pitch is generated by speeding up or slowing down the sample, relative to the desired interval.
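The speed-change approach to repitching follows the equal-tempered relationship that each semitone corresponds to a factor of 2^(1/12) in playback rate. The sketch below (plain Python; the 440 Hz source note is an assumed example) shows the ratios a simple sampler of this kind would apply.

    # Repitching a sample by changing its playback speed:
    # each semitone upwards multiplies the playback rate by 2^(1/12).
    original_pitch_hz = 440.0   # the pitch of the recorded note (an assumed example)

    for semitones in (-12, -5, 0, 7, 12):
        speed_ratio = 2 ** (semitones / 12)
        print(f"{semitones:+d} semitones -> playback speed x{speed_ratio:.3f}, "
              f"sounds at {original_pitch_hz * speed_ratio:.1f} Hz")

Note that this method also changes the duration (and perceived timbre) of the sample: playing it an octave up makes it twice as fast and half as long.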

Software samplers may not contain the ability to sample audio, but still work in the same way as hardware samplers in other respects. Since a plugin sampler already exists in the digital realm, in most cases digital audio can simply be loaded into the plugin from a source folder. 

Software Sound Generation

Software sound generation describes any digital data which is generated and has the capacity to be converted into sound waves by a speaker. This might include purpose-built digital DSP software synthesisers, or a Max MSP program which produces generative tones. It might also include practices such as databending, where digital photographs are hacked, often corrupted, and converted into audio files for playback. It could also include any hardware device which contains DSP technology, including modelling amplifiers and Eurorack modules for example.


 

4. Transmitting Sound

Electrical Cabling

Electrical cabling conducts electrons, typically along a core of copper wire covered in a non-conductive rubber or plastic coating. Cables contain one or more cores through which alternating current is conducted, in addition to an earth wire, which prevents hum and lowers the risk of electric shock. Using the correct cable is imperative to safeguard against fire, electric shock and damage to electrical equipment. Cabling also usually contains some type of shielding, which serves to reduce hum caused by environmental interference and electromagnetic fields.

Mains Power

Although mains cabling is not used to transmit sound, it is vital to understand its purpose in our role. In the UK, mains power is 230 volts with a tolerance between +10% and -6%, and the current available is powerful enough to cause injury and death. The United Kingdom uses 3 pin plugs for all mains power, except in some cases where plug outlets also include 110V or USB outlets. Many devices do not require the full 230 volts, and power adaptors are used, for example a laptop charger, or a 12V power supply for guitar effects pedals. It is also possible to use a ‘travel adaptor’ to convert a non-domestic 2 pin power supply to a 3 pin plug. However, devices which require an IEC ‘kettle’ lead should not be used with such adaptors, and should be connected directly to the mains power socket using a single uninterrupted IEC lead, unless it is a 110V device (from America or Japan, for example) used in conjunction with a step-down transformer. Any devices used on campus requiring 230V mains power need to be PAT tested in house. PAT testing is a series of safety inspections carried out by an authorised party.

Audio Cable

Audio cables are designed to conduct electrons, typically in one direction. Their design varies, and the number of cores, and type of connectors used typically define their use. Audio cables are either balanced or unbalanced. Balanced cables are suitable for conducting signals at line level, and unbalanced cables are suitable for conducting signals at instrument level, for example guitar leads. Using the correct cable is critical, as each type of cable is constructed and rated to conduct a maximum load. For example when connecting instrument amplifier head and cabinets, purpose built speaker cables must be used - Using a standard guitar cable to connect a head and cabinet presents significant fire risk in addition to risking damage to the equipment.

XLR

XLR cables (sometimes expanded as ‘External Line Return’, though the name originally derives from Cannon’s X-series latching connector) are used to connect microphones to audio inputs on an interface or mixing desk, and to connect various other equipment such as outboard processors.

Instrument Cables

Instrument cables typically have a quarter inch jack connector, one conductor wire and an earth, and are designed to carry signals at instrument level (around -30 dB). Instrument cables are used to connect instruments to other equipment such as amplifiers, DI boxes, and interfaces. If guitar pedals are being used, for example, then short patch cables are typically used to connect the pedals. 

Balanced Cables (TRS)

Balanced (TRS) cables typically have a quarter inch jack connector, two conductor wires and an earth, and are designed to carry signals at line level (between -10 dBV and +4 dBu). Balanced cables are often used to connect audio equipment such as outboard processors and mixing desks; they should not be used with guitars, but can be used with some keyboards.

Speaker Cable

Speaker cables for guitar cabinets use a quarter inch jack connector but differ from instrument cables in that they contain two heavier-gauge conductor wires rather than a single shielded core. When connecting instrument amplifier heads and cabinets, purpose built speaker cables must be used - using a standard guitar cable to connect a head and cabinet presents a significant fire risk in addition to risking damage to the equipment.

Speaker cables for studio monitors typically have two conductors in addition to an earth, have XLR connectors and conduct at line level.

Speaker cable for home hifi systems is different again, and is typically a single conductor, where the bare ends are clipped into the back of speakers. Some studios include consumer grade hifi speakers for reference playback.

Bantam

Bantam cables are a type of miniaturised balanced cable used in some patch bay systems to save space. They should not be used for any other purpose.

Combined

Other types of cable capable of carrying audio typically have the role of connecting analogue devices to digital ones. These include USB, Firewire and Thunderbolt cables, all of which are capable of conducting audio at the same time as other digital data, and, unlike other types of audio cable, can transmit audio in two directions at the same time. This is achieved through the increased number of cores, in addition to an earth.

Radio Waves

Radio waves are a type of non-ionising radiation (meaning they do not carry enough energy to ionise atoms), which exists at the low-energy end of the electromagnetic spectrum. Radio waves are capable of travelling through solid surfaces, and great distances through the atmosphere and space. The farthest human-made object that we can communicate with is the Voyager 1 space probe, launched in 1977, which is currently approximately 24.3 billion km from Earth. The oldest radio signal we have received is of unknown origin and is reported to be 8.8 billion years old. We also use radio in cars to tune into radio stations which play news, music and other entertainment, and short range radio like walkie talkies, which can be purchased from many toy shops. 

Radio waves exist in nature, but artificially generated radio waves were first demonstrated in a laboratory setting by Heinrich Hertz in 1887, using theory developed by James Clerk Maxwell. The first practical radio assemblies were developed by Guglielmo Marconi between 1894 and 1895. Radio uses a transmitter to convert audio or other data signals to radio waves, which are received by antenna and converted back into the original medium.

Wifi

As one of the most common communication mediums we use today, wifi is regularly used to transmit musical data, including by streaming services and any computing device set up to receive wifi signals. In fact, wifi utilises radio waves to transmit digital binary code.

Bluetooth

Similarly to wifi, bluetooth also utilises radio waves to transmit digital data.


 

5. Storing Sound

Brain - Memory

The brain is the oldest form of storage: our instinctive verbal reactions to certain emotional or environmental situations are hardwired into our brains, pre-date the origin of language, and can be found across nature. Without memory, our oral history of music, which human beings have passed down from generation to generation for thousands of years, would not exist. The oldest musical instrument to be discovered is a flute crafted by a Neanderthal, dating to the Palaeolithic period around 60,000 years ago, whereas the oldest known written scores date back to around 1400 BCE (roughly 3,400 years ago) and were found in modern day Syria and Iraq.

Despite the standardisation of written scores over several thousand years and the advent of analogue and digital storage devices, oral tradition and memory continue to serve as the most common forms of storing sound. Performing music from memory, getting the last thing you listened to stuck in your head, engaging in musical activity using technique, describing to someone how you want a recording to sound, and language itself are all reliant on the human brain’s ability to store data relating to sound.

Written Scores - Notated scores, visual scores, Piano Roll.

Written scores consist of either a physical or digital representation of sound, and are typically a series of instructions for other actors (human or non-human) to reproduce that sound independently.

The most common types of written score use standardised notation, and these notation systems are as many and diverse as verbal languages. Western standard notation is only one of a great many distinct written musical languages, and certainly not the only one that exists. Written scores may or may not share characteristics such as the order or direction in which they are intended to be read, and the technique and notes available for the intended instrumentation.

Visual Scores use shapes, symbols and pictograms to represent sound. Visual scores do not necessarily contain words but might include additional annotation. Since visual scores do not rely on written language or standardised coding, they may translate more successfully between cultures, technique and instrumentation.

Piano Roll is a type of visual score that became widely used following the advent of computer sequencers with graphical monitoring capabilities. Piano Roll is commonly linked to MIDI, and represents the notes of a piano, with side scrolling blocks of data which represent pitch, note length and dynamics.

Analogue Storage

Analogue storage mediums store sound either as an electromagnetic signature, or as a physical phenomenon.

Wax cylinders, and shellac or vinyl disks, are amongst the oldest forms of storing sound and rely on physical cuts being made into a soft material, which is then hardened. Once this process is complete the data cannot be removed, though it can be changed, if the medium is damaged for example. The material is spun around a central axis and a needle is placed into the grooves left by these cuts. The needle is one half of a transducer, and converts its motion as it moves across the material into alternating current. In terms of technological hierarchy, the wax cylinder was invented first, in 1877, and is the source of the term ‘record’ to describe a physical medium which replays sound. Shellac and later vinyl disks became standardised by the early 20th century, and vinyl disks are still produced and used today. 

Vinyl disks are played on a turntable, which spins at either 33⅓ revolutions per minute (rpm) or 45 rpm. They are limited by a maximum storage capacity: 33⅓ rpm 12 inch vinyl disks hold the most data of any vinyl format, with a maximum of around 25 minutes per side. Vinyl disks also come in 10 inch and 7 inch varieties, which hold less data than a 12 inch disk. Vinyl disks have two sides, A and B, and if data is recorded to both sides, the record must be physically flipped on the turntable to access the content on the reverse side.

Wire and magnetic tape were designed to replace and improve on the qualities of disk, with the first wire recorder made in 1898 and commonly used until the 1950s. The first magnetic tape was made in 1928 and became standardised from the 1950s; tape is still in use today as a recording and playback medium for both audio and video. Wire and tape operate in predominantly the same manner, but have different physical shapes. The recording and playback devices use transducers: data is recorded on the wire/tape using a ‘record head’ which encodes a magnetic signature on the surface of the material, and when this electromagnetic signature is passed over a ‘play head’ the information is converted back into alternating current. Both wire and tape are limited in storage by the length of the medium; in theory an infinite length of tape could hold infinite data, but standard consumer cassettes hold up to around 90 minutes of audio. Similar to vinyl disks, cassette tapes also have two ‘sides’, A and B, meaning that during playback, at the end of side A the tape needs to be ejected from the playback device, physically flipped and reinserted to access the content on side B.

Digital storage

Digital storage, like RAM, is measured in Gigabytes (GB), though the number of Gigabytes associated with digital storage is a lot higher. A computer might have 128GB or 256GB of storage on its hard drive, and the higher the number, the more storage is available. In most computers there will only be one hard drive, though in others there may be space for more (especially in desktop machines). All of the operating system and additional apps will be stored on the hard drive, so this can fill up quickly, though most music producers will strip all non-essential apps from their computer to improve performance (do be careful here not to remove anything which is essential to the operation of the machine). Having very limited space on the drive may result in performance issues on the machine. Thankfully, extra space can be found using USB pen drives, external hard drives or cloud storage, and these can all be used to catalogue old work and keep your main hard drive running smoothly.

There are two types of hard drive used in computers: solid state drives (SSD) and hard disk drives (HDD). HDDs have almost been phased out as legacy technology; they are slower and more prone to damage as they have physical spinning magnetic disks (platters) inside. Solid state drives are faster but more expensive, so you will pay more for higher storage amounts. It’s recommended where possible that your essential systems run off an SSD, and your sample library, old project files and holiday photos live on a much cheaper USB stick or HDD.

Optical storage

Optical storage devices are digital storage types which are written and read by a laser. These include Compact Disks - CD-ROM disks (CDs), Digital Versatile Disks / Digital Video Disks (DVDs), and Blu-Ray disks. Optical storage is significant in the development of audio because it substantially increased the available dynamic range compared to analogue storage types such as vinyl records and magnetic tape. Whereas records and tape have an upper limit of around 70 dB, compact disks are capable of around 90 dB. This led to audio being mastered at much higher loudness levels than was possible with analogue storage devices, which in turn led to the ‘loudness wars’. 

Optical storage devices are also able to contain computer programs, video and video games, and this had a profound impact on the video game industry in particular. Cartridge-based consoles such as the Sega Megadrive and Nintendo 64 used ROM cartridges containing MIDI-like sequence data, played back in real time by internal sequencer and synthesis hardware (the Megadrive’s Yamaha YM2612 FM synthesis chip, for example), and were limited by polyphony and the number of available tracks. Fifth generation consoles such as the Sony Playstation 1 used CD-ROM disks, and could play stored digital audio in the same manner as a vinyl record or magnetic tape. Electronic dance music, orchestral scores and radio-style compilation soundtracks therefore became possible at 16-bit and 44.1kHz (see for example Wipeout, Metal Gear Solid, Grand Theft Auto 1 & 2). This bolstered an entire sub-section of the music industry dedicated to soundtracking video games; today the video games industry is worth more than three times the record industry, and more than four times the film industry.

There are drawbacks however, relating to digital audio quality (sample rate and bit depth) and storage capacity. Audio CD-ROM disks are limited to 16-bit audio at a 44.1kHz sample rate, and can store around 700MB of data. DVDs are capable of 24-bit audio at 48kHz, and around 4.7GB of data, and Blu-Rays can store around 25GB of data. 
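The capacity limit follows directly from the data rate of uncompressed CD audio, as the rough calculation below shows (plain Python; the figure ignores the error-correction overhead of real audio CDs, so actual disk playing times differ slightly).

    # Uncompressed CD audio data rate, and how it relates to disk capacity
    sample_rate = 44100      # samples per second
    bit_depth = 16           # bits per sample
    channels = 2             # stereo

    bytes_per_second = sample_rate * (bit_depth // 8) * channels
    minutes_on_700mb = (700 * 1024 * 1024) / bytes_per_second / 60
    print(f"{bytes_per_second} bytes per second -> roughly {minutes_on_700mb:.0f} minutes on a 700MB CD")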

There have been other, less successful optical formats, such as MiniDisc, but streaming services have had the largest impact on the continued use of optical disks to distribute music. Streaming no longer requires a physical disk, nor a playback device with a disk tray and laser. Video games and some other mediums continue to be distributed on disk for now, though the rise of video games consoles lacking a disk drive is a signal that the industry is continuing to abandon this format in favour of downloads and streaming. This could be seen as a way of minimising the market in second hand games to further profits, or as a way of limiting the environmental impact of resource-intensive legacy physical mediums.


 

6. Converting Sound

Audio Interface

An audio interface is a device which converts alternating current to a binary digital signal. The device itself is usually either built into a mixing desk, a small standalone box, or a 19 inch rack unit, all of which exist to provide input and output connectivity with a computer. On the input side there are sockets to connect audio equipment which generates or captures sound (microphones, instruments etc). On the output side there is usually a USB, Firewire or Thunderbolt cable which sends the digital signal into the computer. The computer then sends a binary digital signal back to the interface, which has further output connectivity to send alternating current to headphones or speakers (which then convert that alternating current back into sound). 

Most standalone interfaces include essentials such as preamps, line/instrument switching and phantom power. Interfaces can have one channel, or they could have 128 channels or more. The number of channels typically matches the number of channels on the mixing desk when installed into a permanent recording setup, though a mixing desk is not required for all interfaces. Each channel is capable of recording one source at a time. So, if you wanted to record an acoustic guitar and a singer at the same time with one microphone for each you would require two channels minimum. If you wanted to record an acoustic guitar, and then overdub a vocal over the top, you could just use one input channel. It’s recommended to choose an interface which matches your connectivity requirements and budget.

USB Mixing Desk

A mixing desk is a device which takes multiple audio signals and sums them together. Mixers also usually offer on-board EQ and send/return channels in addition to preamps, line/instrument switching and phantom power. Some mixers have a built-in USB interface, which generally allows the master bus to send two channels of audio to a computer - this can be a combined stereo output, or two mono outputs. Mixers with this capability can be useful and expressive, and allow for several audio devices to be connected to the mixer at any given time. Mixers can also be used as an expansion for interfaces: for example, a mixer’s stereo outs might be plugged into two channels on an 8 channel interface, giving mixing capability across those two channels whilst leaving 6 channels free. This could be a decision for later, if or when you feel the need to expand your setup. There’s no real difference in price between a small two channel interface and a small mixer with USB connectivity.

Direct USB Inserts

Some microphones and guitar DI’s take an analogue device and convert it to USB to plug straight into a computer. This can be a convenient way of skipping the use of an interface or mixer altogether, though potentially at the cost of audio quality if noise exists within the system.

Other Medium Conversion Devices

WIP***

 

Our Nervous System

All organic lifeforms react to sound. With the exception of sea sponges, all creatures have a nervous system which varies in complexity. Humans possess a complex nervous system, which includes not only the brain, but a complex network of nerve cells (neurons) which permeate the body, and pass sensory information back and forth. A human being might therefore respond to sensory input in a number of ways, including sight, hearing, touch, taste and smell. 

In the context of sound, we understand that sound is a disturbance in an elastic medium, therefore vibrations in a solid surface might be felt when we touch that surface. Sound waves may also be felt in the body, since the body itself is an elastic medium. 

We perhaps most commonly associate sound with hearing, and our ears are designed to funnel sound waves towards the eardrum, which reacts much like the diaphragm on a microphone; the cochlea in the inner ear then transfers this movement into electrical signals which are interpreted by our nervous system.

If you are one of the 1 in 1000 people who experience synesthesia you might see, smell or taste sound.

Deafness or hearing damage may mean that a person is unable to hear sound of certain frequencies or amplitudes. Hearing damage can occur if a person is exposed to a dangerously loud noise, or to prolonged noise. All people lose sensitivity to sound as they age, with high frequencies reduced most quickly.

Transducers

Transducers are devices which convert energy from one form into another, and have many applications and uses. In the context of audio technology, they are typically used to convert the vibrations caused by sound waves into alternating current, or to convert alternating current into vibrations which we perceive as sound. For example, a microphone is designed to convert disturbances in an elastic medium into alternating current. A loudspeaker is designed to convert alternating current into movement which causes a disturbance in an elastic medium. In fact, the principle behind dynamic microphones and loudspeakers is so similar that it’s possible to use a speaker as a microphone simply by wiring it to an input rather than an output [warning, do not try this at home]. 

Microphones

Microphones are devices which convert disturbances in an elastic medium into alternating current. They have several designs but follow a similar principle. All microphones produce higher amplitude alternating current when excited by higher energy sound waves and higher SPL (sound pressure level).

Condenser Microphones

Condenser microphones typically work by employing a very light diaphragm which forms one plate of a capacitor, held at an electric charge (usually supplied by phantom power), and which moves when excited by sound waves. The movement of the diaphragm varies the capacitance and is converted into alternating current, which is fed to the output of the microphone. The relatively low mass of the diaphragm makes the design very sensitive, and means that high frequencies (whose shorter wave cycles possess less energy) are more easily transferred to an electrical signal. A condenser does not pick up less low frequency content than other microphones; it simply picks up more high frequency content than other designs.

Ribbon Microphones

Ribbon microphones use a thin rectangular strip of metal (typically aluminium) which visually resembles a ribbon, suspended in a magnetic field. The ribbon moves when excited by sound waves, and this movement is converted into alternating current by induction. High SPL and physical shock can be highly damaging to ribbon microphones because the ribbon is fragile. The ribbon is less sensitive to high frequency content than other designs because high frequency content possesses less energy and moves the ribbon less readily; relatively speaking, therefore, ribbon microphones are more sensitive to low frequency content. This does not mean that they pick up more low frequency content than other designs; they simply pick up less high frequency content.

Dynamic Microphones

Dynamic microphones utilise a diaphragm connected to a coil suspended in a magnetic field; the diaphragm physically moves along a single axis, backwards and forwards. When sound waves excite the diaphragm, this movement generates alternating current in the coil. This design principle mirrors that of a loudspeaker, but in reverse. Because the moving diaphragm and coil assembly has relatively high mass, dynamic microphones are less sensitive than other designs, particularly to high frequency content, and can withstand high SPL. This means that they are often used for applications in which high SPL is likely to occur. Because the diaphragm of the microphone only moves along one axis, dynamic microphones operate most efficiently when positioned directly facing the sound source, where the energy and SPL of the sound waves is at its highest. Sound waves which meet the diaphragm at an angle apply less pressure to the capsule, and cause less movement and therefore lower amplitude alternating current.

Piezo Microphones

Piezo microphones have no moving parts and produce alternating current via the piezoelectric effect: discrete crystal or ceramic elements generate a voltage in response to mechanical stress or pressure from vibrations. Piezo microphones are often used in applications where the use of other types of microphone would be impossible, such as underwater, or inside the human body (for medical use and surgery). Piezo microphones are often called ‘contact microphones’.

Pickups

Instrument pickups use magnetic induction to convert the physical vibration of strings into alternating current. This design requires a magnet wrapped in a coil of wire (the number of turns is often in the thousands). Instrument pickups have differing frequency responses and output levels depending on the number of wire turns and the type of magnet used.

Piezo pickups (featuring the same design as piezo microphones) are also used on guitars, in this context they are fitted into the saddle of the guitar, and convert vibrations in the strings directly into alternating current.


 

Speaker Systems

Speakers are included in most of the following equipment. Speakers work by connecting alternating current to a transducer, which converts the signal into physical motion, pushing the cone of the speaker backwards and forwards; this causes a disturbance in an elastic medium (for example air) which we then perceive as sound. In this sense, a sound recording is a linear process, with a microphone transducer at one end and a speaker transducer at the other.

Speaker systems can be arranged in various ways. A mono system is a single speaker; a stereo system has two speakers and audio can be panned left or right. A surround system is used for spatial audio, and designations such as 5.1 or 7.1 refer to the number of full-range speakers plus a subwoofer (the ‘.1’) in the system. A crossover system includes different speakers tuned to discrete regions of the frequency spectrum, for example a sound system in a club or festival which includes subwoofers, designed to reproduce powerful low frequencies.

It’s standard practice to check the quality of mixes on various systems during mixing and prior to release; this is called referencing. For example, earbuds are a common consumer listening device, and therefore may be what much of your audience will experience the audio on, so checking that audio translates well onto earbuds is good practice. 

Monitoring Systems

‘Studio’ Monitors are speakers which are designed specifically for music production. There are two broad types.

Reference monitors are designed to have a flat frequency response and sound as neutral as possible, to provide the most accurate possible representation of a mix. The intention in the design is to help you create a well balanced mix, and to highlight any unwanted details in the audio. If a mix is balanced on reference monitors, there’s more chance it will translate onto consumer playback devices such as home stereos, radios, and car speakers.

Other studio monitors include boosts in certain frequency ranges, which for example might make the low frequency content sound louder. This can trick the listener into thinking the low frequency content is subjectively ‘better’, or it could cause you to remove low frequency content because it appears to be too loud, which then weakens the mix on other speakers. DJs, for example, might prefer these types of speakers because they might more accurately reflect the tonal quality of a club sound system.

Monitoring Systems - Headphones

Headphones are a playback device designed to fit either inside the user’s ear canal, or be positioned on or over the ears supported by a headband. Both designs incorporate speakers which operate in the same way as any other speaker, by pushing air.

Headphones come in various types, and their sound varies as much as that of any two guitars or singers. Typically wired headphones are used for music production, since Bluetooth headphones and earbuds may cause latency issues. 

Studio headphones are always ‘over the ear’ and are either open-back, semi-open, or closed-back - the drivers are generally designed in the same manner, but the design of the earcup enclosure differs (to great effect). 

Open-back headphones have earcups which are open at the rear and give the most natural sound at the expense of low frequencies; these sound most like listening to speakers in a room, and are closest to reference monitors. Open-back headphones leak sound because of the open enclosure, so someone in the same room will be able to hear your headphones, and if you’re recording vocals you might experience some bleed between the headphones and the vocal mic. If this happens, turn your headphones down, and consider turning down any upwards compression you might have on the armed track. This is often just one of the idiosyncrasies of home recording. 

Closed-back headphones are designed to completely cover and seal around the ears, so sound shouldn’t leak (if it is leaking, your headphones are turned up far too loud - turn them down). Closed-back headphones normally give the most prominent bass response, although, just as with studio monitors, this can be ‘fake’ and actually damage your mixes. Closed-back headphones are generally designed for recording music, not mixing it. 

Semi-open headphones are a blend of the two, so you could see them as a convenient compromise, or perhaps not.

WARNING: Headphones have different impedances, measured in ohms. Since headphones are speakers, this works in exactly the same way as guitar cabinets and other speaker systems. Some headphones require a headphone amp to perform at their best. If in doubt, ask before purchasing.

Public Address (PA) Systems

A Public Address (PA) system describes an array of speakers and amplifiers in any combination which is designed to amplify audio to a crowd of people. This could include a speaker at a conference, a band practising in a rehearsal room whose vocal mics are amplified, a nightclub where a DJ has turntables connected to a sound system, or a festival with multiple large arrays tuned to amplify the musicians on the stage at a volume suitable for 100,000 people. 

Whereas guitar amplifiers for example often deliberately change the sound of the inputted instrument, PA systems are typically designed to present the most accurate possible representation of the original sound source.

Instrument Amplifiers

Instrument amplifiers are designed to convert the alternating current outputted by a transducer pickup fitted to an instrument back into audible sound. Instrument amplifiers have a common purpose, but their architecture and integrated circuitry may differ. Where necessary, instrument amplifiers utilise pre-amplifier circuits to increase the low-level signal of an instrument pickup; this circuit may also clip the signal, causing distortion and harmonics. The signal may then be shaped using on-board EQ before being amplified again using a power amplifier. Amplifiers are designated as ‘heads’ or ‘combos’ depending on whether they have an integrated speaker system. The power amplifier in a combo drives the integrated speaker system directly, whereas a head requires connection to an external speaker cabinet. Amplifiers and external speaker cabinets may have any number of speakers, though the maximum for combos is typically 4, and for cabinets typically 8. 

Instrument amplifiers are usually designed for a particular instrument, for example, electric guitar, or bass guitar. Some instrument amplifiers also contain effects, such as tremolo, spring reverb, or chorus.

The earliest instrument amplifiers utilised valve technology, and this is still in use today - one of the few remaining modern applications of valves. Instrument amplifiers may also be solid state, using transistors and MOSFETs in place of valves. Amplifiers may also be hybrid, meaning a combination of valve and transistor technology; for example, a hybrid amplifier might have a valve-based pre-amp and a solid state power amp, or vice versa.

As digital technology progressed, digital ‘modelling’ amplifiers became available. These amplifiers use digital signal processing (DSP) to approximate the sound of analogue and valve circuitry, and some of these amplifiers may also include hybrid designs such as valve preamplifiers. DSP amplifiers may also contain complex digital effects units.

Whilst not technically instrument amplifiers, digital plug-in devices are another example of modern DSP technology, and approximate the sound of instrument amplifiers as a standalone application or for use inside a DAW. These software applications are used when an instrument is directly connected to a computer using a DI signal, and the output signal from the amplifier plug-in can be monitored using the required monitoring path, for example headphones (allowing silent practice and recording). Software amplifier emulations also use impulse response technology. In this context an impulse response is generated using a combination of speaker system, microphone type, and microphone position. The impulse response is then used to approximate the frequency response of the speaker/microphone combination during use with the amplifier emulation. In practice, this means that thousands of speaker/microphone/position combinations can be recalled and used.

Consumer Listening Devices

Consumer listening devices include any device readily available to the public, designed to reproduce audio for playback. These could include mobile phones, wired or Bluetooth earbuds or ‘over the head’ headphones, radios, car speakers, HiFi systems and portable mini-speakers. 

Cochlear Implants

Cochlear implants are devices which restore sensitivity to sound in people who experience deafness. Implants are attached to the bone behind the ear, and bypass damaged portions of the ear to directly stimulate the auditory nerve.

Physical Modes of Amplification

Physical modes of amplification describe any physical device which amplifies a sound source without using electronic or digital technology. For example, the horn on a gramophone, or a non-electric megaphone.

7. Routing Sound

Routing sound describes the process of taking a signal and duplicating or diverting it to another destination. Sound itself, as it exists as sound waves, cannot be duplicated (a reverberant reflection is not a true copy), but it can be diverted when it comes into contact with surfaces, which is why we can hear sound around corners, even when we cannot see the sound source itself.

Routing sound in practice is typically done when the signal exists as an electrical signal, or digital data. This might include using a patch bay to route one microphone to a particular channel on a mixing desk, or it might be using an auxiliary channel on the same mixing desk to duplicate the signal and send it to a reverb unit. In the digital domain, many DAWs mirror the routing options available in hardware mixing desks.

Also consider that MIDI data signals can be similarly routed. MIDI data itself is not sound, but neither is alternating current. 


 

8. Processing Sound

Physical modes of processing sound

(Note: This section includes details of physical electromechanical ‘effects’ units such as plate and spring reverb).

There are a number of ways in which sound can be manipulated within a physical space. Firstly, remember that inside a circuit or machine sound exists only as electricity or as digitally stored data, and therefore any manipulation of that signal or data is not a direct manipulation of sound itself. For example, changing the EQ setting on a guitar amplifier changes the electrical signal within the amplifier, which then causes the speaker to react differently when reproducing that signal - this is cause and effect. Also remember that due to the physical properties of sound, changes in the vibrations emitted by the speaker will cause the elastic medium to react differently; therefore, for example, changing the volume on the amplifier will result in variations in reverberation within a room. This is also true of how instruments are played, for example blowing the reed on a saxophone harder or softer changes the content of the sound waves produced.

Also remember that we perceive sound as a result of sound waves stimulating our nervous system. Changes in a physical space will cause sound waves to behave differently, and we perceive these differences as changes in sound, for example volume, tone and reverberation.

With that in mind, we can consider how sound is changed not by the player or the equipment, but by the environment it exists within.

In terms of processing sound in a physical space, room design and building materials are a common intentional factor when constructing acoustic spaces. For example, because sound waves reflect off hard surfaces and are absorbed by soft surfaces, hard floors and walls (made from stone, brick or concrete, for example) will allow high amplitude, short cycle (high frequency) sound waves to reverberate for longer. We perceive this type of space as having more high frequency content. A room with soft walls and a carpet will cause sound waves to be converted more quickly into heat energy, and therefore fewer of the high amplitude, short cycle sound waves reach our ears, and there is less reverberation.

The shape of a room also has a profound effect on the frequency content and duration of sound waves. Religious buildings are often domed or arched with high ceilings to increase the volume of the spoken or sung voice. Small rooms sound different to rooms of the same floor plan but with a higher ceiling. Long narrow rooms or circular rooms sound different again. Many studios include ‘bass traps’, curtains, or acoustic panelling, all of which are designed to reduce the reverberation of specific frequency content.

Finally, acoustics are not permanently fixed, and the acoustics of any space can be changed simply by making changes to that space. For example, adding people to a room, or a sofa, or bookcases, or a loaf of bread, will in some way change that space. A common example of this is in live sound: sound checks often take place when the venue is empty, but when the venue fills up, the combined mass of the audience changes the acoustics within the room dramatically. 

In recording studios we can make use of our understanding of physics to process and control the amplitude and frequency content of sound prior to it being picked up by microphones. We could do this by placing amplifiers with the speaker facing against a wall, putting panelling between string players for separation, moving a trumpet player from an area with a hard floor to an area with a soft floor, asking a singer to sing in a small vocal booth, or putting padding inside a kick drum. None of these changes directly affects the way the instrument or sound source produces sound waves, though it might affect the way the instrument is played - if, for example, the two string players could no longer see each other, or the singer was uncomfortable in such a small room.

Some of the earliest effects were simply modes of processing sound within a physical space. For example, early ‘echo’ effects were achieved by surrounding guitar amplifiers with metal trash cans filled with varying amounts of water; the sound from the amplifier resonated in the water, and the volume of water in each trash can changed the length and tone of the echo.

Echo chambers were also popular. Essentially any physical space could be a candidate for an echo chamber, since the technique is simply to play a sound through a speaker within a space and capture it with a microphone. This could be a purpose-built room, a large empty warehouse or music venue, a tiled bathroom, or the inside of a jam jar (if your speaker and microphone were small enough).

Electromechanical devices: These devices process sound waves in a physical sense, because sound is always a disturbance in an elastic medium, whether that’s air, water, or a steel plate or spring.

Spring reverb was designed in the 1930s and uses a network of transducers to excite (transmit vibrations into) a metal spring, and convert the resulting vibrations back into alternating current. The length of the reverb effect could be changed by damping the spring. Spring reverb devices are relatively light and portable, and were fitted into early Hammond organs, and later guitar amps - you will still often find spring reverbs in amplifiers today. 

Plate reverbs were invented in the 1950s, and use a network of transducers to excite a metal plate and convert the resulting vibrations back into alternating current. These plates could be as large as 1.5 metres by 2.5 metres, and their size and weight often made them both incredibly expensive and a permanent fixture of a studio. Plate reverb was designed to emulate the characteristics of a sound being emitted in a large acoustic space, and later designs could process stereo audio signals.

Dynamic Processing

Dynamic processing describes any process which alters the amplitude of a signal. Typically ‘gain’ refers to a change applied to an inputted signal, and ‘volume’ to a change applied to an outputted signal.

A voltage controlled amplifier (VCA) is a circuit which changes the amplitude of a signal depending on a control value. For example, a fader on a mixing desk is a physical controller, linked to a VCA which is capable of changing the amplitude of a signal, often between 0dB and some maximum value. Any VCA which exists in the signal path routed to the input of a sound recording device is referred to as an element of the ‘gain stage’, while any VCA connected to a loudspeaker or other audio output device is referred to as being part of the ‘output stage’.

Similar concepts exist in digital devices and DAWs to control digitised audio. Digital audio has no inherent upper limit for level, but clipping will occur if the signal breaches 0dB at the output.

Passive volume controls, unlike VCAs, are only capable of turning a signal down. For example, the volume control on a guitar can only attenuate the signal coming from the pickups; it cannot add gain.
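
To make the relationship between decibel values and signal amplitude concrete, here is a minimal illustrative sketch in Python (using numpy; the sine wave, gain values and the convention of 0dB as a peak amplitude of 1.0 are assumptions for the example, not taken from any particular device):

```python
import numpy as np

SAMPLE_RATE = 44100  # samples per second

def db_to_linear(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

# A hypothetical test signal: one second of a 440 Hz sine wave peaking at -6 dBFS.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
signal = db_to_linear(-6.0) * np.sin(2 * np.pi * 440 * t)

# A VCA-style gain stage simply multiplies the signal by the control value.
boosted = signal * db_to_linear(9.0)  # apply +9 dB of gain

# Anything beyond 0 dBFS (an absolute value of 1.0 here) would clip on output.
print("Peak before gain:", round(np.max(np.abs(signal)), 2))   # ~0.5
print("Peak after gain:", round(np.max(np.abs(boosted)), 2))   # ~1.41, i.e. over 0 dBFS
```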

Tremolo

Tremolo is an audio effect which varies the amplitude of a signal using an LFO, it is sometimes found on guitar amps, or in digitised form as a plugin effect. Hardware tremolo devices might also use an optocoupler to vary the amplitude of the signal.
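
As a rough illustration of how an LFO can vary amplitude over time, here is a minimal Python/numpy sketch of a tremolo; the 5Hz rate and the depth value are arbitrary assumptions rather than settings from any particular unit:

```python
import numpy as np

SAMPLE_RATE = 44100

def tremolo(signal: np.ndarray, rate_hz: float = 5.0, depth: float = 0.6) -> np.ndarray:
    """Multiply the signal by a low frequency oscillator to vary its amplitude."""
    t = np.arange(len(signal)) / SAMPLE_RATE
    # The LFO swings between (1 - depth) and 1, so the level rises and falls rate_hz times per second.
    lfo = 1.0 - depth * 0.5 * (1.0 + np.sin(2 * np.pi * rate_hz * t))
    return signal * lfo

# Example: apply tremolo to one second of a 220 Hz sine wave.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
wet = tremolo(np.sin(2 * np.pi * 220 * t), rate_hz=5.0, depth=0.6)
```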

Compression and Limiting

A compressor or limiter is an electrical circuit which attenuates the amplitude of an inputted signal if it breaches a threshold. Most compressors feature a user-defined variable threshold, but not all. The compressor’s amplitude reduction circuit will not act on the signal unless the signal breaches the threshold.

If the signal breaches the threshold, the ratio of the compressor circuit determines the number of decibels by which the outputted signal is attenuated. For example, a setting of 2:1 will attenuate the outputted signal by 1dB for every 2dB the inputted signal is over the threshold, so a signal which exceeds the threshold by 8dB will be attenuated by 4dB. The higher the ratio, the more attenuation is applied to signals which breach the threshold. Any compressor set to a ratio of 10:1 or above is regarded as a limiter, since the attenuation is extreme enough that almost no signal will breach the threshold. A limiter with an inf:1 (infinity to one) setting is called a ‘brickwall’ limiter. Note that a setting of 1:1 will result in zero attenuation being applied to signals which breach the threshold.
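
The ratio arithmetic above can be expressed as a simple calculation. The sketch below (Python, with hypothetical function and parameter names; the threshold and input levels are arbitrary examples) models only the static ratio behaviour described here, not attack or release:

```python
def gain_reduction_db(input_level_db: float, threshold_db: float, ratio: float) -> float:
    """Return the attenuation in dB applied by a compressor with the given ratio."""
    overshoot = input_level_db - threshold_db
    if overshoot <= 0:
        return 0.0  # the signal is below the threshold, so no attenuation is applied
    # With a ratio of N:1, every N dB over the threshold becomes 1 dB at the output.
    return overshoot - (overshoot / ratio)

# The example from the text: an input 8 dB over a -10 dB threshold.
print(gain_reduction_db(-2.0, -10.0, 2.0))    # 2:1 ratio -> 4.0 dB of attenuation
print(gain_reduction_db(-2.0, -10.0, 10.0))   # 10:1 ratio -> 7.2 dB (limiter territory)
print(gain_reduction_db(-2.0, -10.0, 1.0))    # 1:1 ratio -> 0.0 dB, no attenuation
```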

The attack and release settings (if available on the unit) control the attenuation over time. The attack setting determines how quickly any attenuation is applied to the outputted signal, after the point in time at which an inputted signal breaches the threshold. The release setting determines the length of time it takes for the attenuation circuit to return to a ratio of 1:1 after any inputted signal has dropped below the set threshold. These settings allow the user to time the effect of the compressor to the tempo of the song for example.

Some compressors feature a VCA in the gain stage of the circuit, and most compressors feature a VCA on the output stage of the circuit. Hardware compressors feature a number of different designs; having a valve or FET included in the gain stage can result in distortion and the generation of harmonic content, which can be seen as desirable. In fact, valves naturally compress audio signals subtly, which was originally seen as an unintentional and unwanted byproduct of their design.

Opto compressors work by utilising an optocoupler circuit which emits infrared light in proportion to the amplitude of the inputted signal. A photosensitive device then varies how much current it conducts in response to that light, controlling the attenuation applied to the signal.

Digital, and plugin compressors generally follow the same design principles, but work on digitised audio signals.

Clipping

Clipping is a technique for removing peaks from digital audio. This is used for two reasons. Firstly, no compressor is fast enough to attenuate all transients in an audio signal. Secondly, due to the way digital audio is mathematically reconstructed, the amplitude of the signal may actually continue to rise and fall between samples, regardless of the bit depth and sample rate used. Clippers ‘hard’ clip the amplitude at a user-defined level, flattening transient peaks in a mathematically exact way. This causes audible distortion, but when used sparingly it can be used similarly to a compressor, either to avoid unwanted clipping elsewhere, or to raise the overall LUFS level of an audio signal during mastering.
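
Hard clipping itself is mathematically very simple. The sketch below (Python/numpy, with an assumed ceiling expressed as a linear amplitude) flattens any sample that exceeds the ceiling:

```python
import numpy as np

def hard_clip(signal: np.ndarray, ceiling: float = 0.9) -> np.ndarray:
    """Flatten every sample whose absolute value exceeds the ceiling."""
    return np.clip(signal, -ceiling, ceiling)

# A transient peak at 1.3 is flattened to 0.9; quieter samples pass through unchanged.
samples = np.array([0.2, 0.7, 1.3, -1.1, 0.4])
print(hard_clip(samples))  # [ 0.2  0.7  0.9 -0.9  0.4]
```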

Spectral Processing and Filtering - EQ

Spectral processing describes any electrical circuit or digital algorithm which is designed to increase or decrease discrete frequencies, or frequency ranges, within an audio signal, whilst leaving the remainder of the signal largely unaffected. For example, a VCA increases or decreases the entire signal, whereas an equaliser might be set to reduce the amplitude of the signal in the range of 400Hz - 900Hz by 4dB (called a bell curve), or increase the amplitude of all frequencies above 8kHz by 1dB (called a shelf). Filtering describes the process of cutting all frequencies above, below, or either side of a set frequency.
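
To make the ‘bell’ idea concrete, here is an illustrative sketch of a peaking (bell) EQ implemented as a biquad filter in Python, using the widely published ‘Audio EQ Cookbook’ coefficient formulas; the centre frequency, gain and Q values are arbitrary examples, not a claim about any particular equaliser:

```python
import numpy as np

SAMPLE_RATE = 44100

def peaking_eq(signal: np.ndarray, centre_hz: float, gain_db: float, q: float = 1.0) -> np.ndarray:
    """Boost or cut a bell-shaped band around centre_hz (a biquad 'peaking EQ')."""
    a_gain = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * centre_hz / SAMPLE_RATE
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a_gain, -2 * np.cos(w0), 1 - alpha * a_gain])
    a = np.array([1 + alpha / a_gain, -2 * np.cos(w0), 1 - alpha / a_gain])
    b, a = b / a[0], a / a[0]  # normalise so the first feedback coefficient is 1

    # Apply the filter sample by sample (direct form I difference equation).
    x = np.concatenate([np.zeros(2), signal])
    y = np.zeros_like(x)
    for n in range(2, len(x)):
        y[n] = (b[0] * x[n] + b[1] * x[n - 1] + b[2] * x[n - 2]
                - a[1] * y[n - 1] - a[2] * y[n - 2])
    return y[2:]

# Example: cut 4 dB around 650 Hz, roughly the 400Hz - 900Hz bell described above.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
processed = peaking_eq(np.sin(2 * np.pi * 650 * t), centre_hz=650.0, gain_db=-4.0)
```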

Some equalisers are constructed using valve or other technology which might introduce additional harmonic content to the outputted signal. 

Spatial Processing

Spatial processing describes any process which simulates the effect of sound waves moving in a physical space. This is typically directional. Stereo pan controls route the audio signal to either a left or right speaker, which simulates the effect of that sound source being positioned in that region of space. For example, a guitar can appear to arrive from the right of the listener, and a violin from the left. This is a simulation only: the signals can be panned in either direction when mixing, and the mixing engineer is responsible for these decisions. You can prove this to yourself on headphones by simply turning around 180 degrees; the guitar will still sound like it arrives in your right ear, even though that ear is now facing what was originally your left-hand side.
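
A pan control can be modelled as a pair of gains applied to the left and right outputs. The sketch below uses an equal-power pan law, which is one common approach (other pan laws exist, and this is not a claim about any particular mixing desk or DAW):

```python
import numpy as np

def pan(signal: np.ndarray, position: float):
    """Split a mono signal into (left, right) using an equal-power pan law.

    position ranges from -1.0 (hard left) through 0.0 (centre) to 1.0 (hard right).
    """
    angle = (position + 1.0) * np.pi / 4.0  # map -1..1 onto 0..pi/2
    return signal * np.cos(angle), signal * np.sin(angle)

# Example: a signal panned hard right sends (almost) nothing to the left speaker.
mono = np.ones(4)
left, right = pan(mono, 1.0)
print(left.round(3), right.round(3))  # [0. 0. 0. 0.] [1. 1. 1. 1.]
```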

Surround sound systems have more speakers, and therefore the panning might cause the sound to appear from behind the listener, depending on where the speakers are positioned. 

Time Based Processing - Delay, Reverb, Flanger

Time based processing describes any process which modulates the audio signal in the time domain, or generates repeats of the audio signal at intervals.

Flanger

A flanger makes a copy of the audio signal, and an LFO modulates the delay of the copy in the time domain. When the two signals are mixed, this causes a phenomenon known as comb filtering, as the signals cancel each other out at regularly spaced frequencies.
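
The comb filtering described above arises from mixing a signal with a slightly delayed copy of itself. The sketch below (Python/numpy) uses a single static delay for clarity; a real flanger would sweep the delay time with an LFO:

```python
import numpy as np

SAMPLE_RATE = 44100

def flange_static(signal: np.ndarray, delay_ms: float = 1.0, mix: float = 0.5) -> np.ndarray:
    """Mix the signal with a short delayed copy of itself, producing comb filtering."""
    delay_samples = max(1, int(SAMPLE_RATE * delay_ms / 1000.0))
    delayed = np.zeros_like(signal)
    delayed[delay_samples:] = signal[:-delay_samples]
    # Frequencies for which the delay equals an odd number of half-cycles arrive out of
    # phase and cancel, giving the characteristic 'comb' shape in the frequency response.
    return signal + mix * delayed
```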

Echo / Delay

An echo / delay device (the names are interchangeable) generates repeats of the audio signal at intervals. Early echo devices used magnetic tape to achieve this; later solid state devices used chips, and later still, DSP. An echo / delay typically has variable controls for the interval between the repeats (in the time domain), and the feedback (the number of repeats generated). 
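
A basic digital delay can be sketched as repeats of the signal added back at a fixed interval, each quieter than the last. The following Python/numpy sketch is illustrative only; the time, feedback and mix values are arbitrary assumptions:

```python
import numpy as np

SAMPLE_RATE = 44100

def delay(signal: np.ndarray, time_ms: float = 350.0, feedback: float = 0.4,
          mix: float = 0.5, repeats: int = 5) -> np.ndarray:
    """Generate decaying repeats of the signal at a fixed interval."""
    interval = int(SAMPLE_RATE * time_ms / 1000.0)
    out = np.zeros(len(signal) + interval * repeats)
    out[:len(signal)] += signal                    # the dry signal
    echo = signal * mix
    for n in range(1, repeats + 1):
        start = n * interval
        out[start:start + len(signal)] += echo     # each repeat is added later in time...
        echo = echo * feedback                     # ...and quieter, controlled by the feedback
    return out
```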

Reverb

Electromechanical reverb devices (spring and plate) are described above; this section discusses the generation of digital reverb effects.

Reverb might be considered a spatial effect, since it is designed to ‘place’ the audio signal in a simulated physical space. However, a digital reverb works by utilising a complex network of delay lines. This means that digital reverb is capable of simulating the two components of reverberation which we observe in the physical realm. Firstly, there is a section of the DSP which produces short echoes, called ‘early reflections’, which simulates the effect of the sound waves which reflect most directly off hard surfaces in a physical space. The second section of the algorithm simulates the effect of these reflections as they continue to reverberate around a physical space, losing energy each time they reflect off a surface. This effect is called ‘diffusion’. Reverbs typically have a ‘length’ setting which determines the length of the reverb in seconds, specifying how long it will take for the diffused echoes to dissipate into silence.
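
As a very rough illustration of ‘a network of delay lines’, the sketch below (Python/numpy) adds a handful of early-reflection taps followed by decaying repeats from several parallel delay lines standing in for diffusion; the delay times and gains are arbitrary assumptions, not any commercial algorithm:

```python
import numpy as np

SAMPLE_RATE = 44100

def simple_reverb(signal: np.ndarray, length_s: float = 2.0) -> np.ndarray:
    """A very rough reverb: a few early-reflection taps plus decaying delay-line tails."""
    tail = int(SAMPLE_RATE * length_s)
    out = np.zeros(len(signal) + tail)
    out[:len(signal)] += signal

    # 'Early reflections': a handful of discrete, quiet echoes shortly after the dry sound.
    for delay_ms, gain in [(11, 0.5), (23, 0.4), (37, 0.3), (53, 0.25)]:
        d = int(SAMPLE_RATE * delay_ms / 1000.0)
        out[d:d + len(signal)] += gain * signal

    # 'Diffusion': parallel delay lines whose repeats decay over the chosen reverb length.
    for delay_ms in (29.7, 37.1, 41.1, 43.7):
        d = int(SAMPLE_RATE * delay_ms / 1000.0)
        # Choose a per-repeat gain so the repeats have decayed by ~60 dB after length_s.
        g = 10 ** (-3.0 * (delay_ms / 1000.0) / length_s)
        for start in range(d, tail, d):
            out[start:start + len(signal)] += 0.15 * (g ** (start / d)) * signal
    return out
```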

Modulation Based Processing

Vibrato

Vibrato is an audio effect which utilises an LFO to modulate the pitch of the entire audio signal.

Chorus

Chorus is an audio effect which makes a copy of the audio signal, modulates the pitch of the copy, and mixes it back with the original.

Phaser

A Phaser is an audio effect which makes a copy of the audio signal, and applies notch filters to the copy, which is then modulated in the time domain by an LFO.

Other Effects

WIP***


 

9. Representations of Sound

Sound can be represented via a number of non-audible means; often this is done visually, using metering or sound storage methods, but not always. We can also touch and feel sound, and if you are one of the roughly 1 in 1000 people who experience synesthesia you might smell it. In most cases, a decision on whether the audio presents as intended should be made by listening rather than looking, but metering can be useful in ensuring that our gain staging is correct, for example.

Metering

Metering is a broad term which describes the various systems and circuits designed to allow us to monitor various aspects of sound. For example:

- A mixing desk might include an LED VU meter, which shows us the incoming voltage on a channel in real time, and a needle VU meter showing us the gain reduction in dB applied by the bus compressor

- A Synthesiser might have an LED which flashes in time with the speed of an LFO

- A bass amplifier might have an LED which lights up when clipping is occurring, or shows different colours depending on the incoming gain.

- A channel in a DAW might have a digital VU meter, which shows when a signal has clipped, and by how much.

Some metering is more accurate than others. For example, the aforementioned VU meters which utilise a needle have a physical moving part and are therefore subject to inaccuracies, whereas digital metering in a DAW is likely to be faster, and may even show us the exact moment that the signal clipped, since it documents amplitude over time, whereas the analogue VU only shows us the amplitude in a single moment.

Spectral Analysis

Spectral analysis typically splits the audio signal into discrete bands using a mathematical algorithm called a Fast Fourier Transform (FFT). The algorithm splits the incoming signal into frequency bands and displays the amplitude of the signal across the frequency range, meaning that, for example, the viewer can see the amplitude of the frequency content at 5kHz. 
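
A minimal sketch of this idea in Python/numpy: take a block of samples, run an FFT, and read off the amplitude in each frequency bin (a real analyser would add windowing, averaging and dB scaling; the test signal here is an assumption for the example):

```python
import numpy as np

SAMPLE_RATE = 44100

# A hypothetical test signal: 1 kHz and 5 kHz sine waves of different amplitudes.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
signal = 1.0 * np.sin(2 * np.pi * 1000 * t) + 0.25 * np.sin(2 * np.pi * 5000 * t)

spectrum = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)   # amplitude in each frequency bin
freqs = np.fft.rfftfreq(len(signal), d=1.0 / SAMPLE_RATE)    # the frequency of each bin

# The viewer can now read the amplitude of the content at, for example, 5 kHz.
index_5k = np.argmin(np.abs(freqs - 5000))
print(f"Amplitude near 5 kHz: {spectrum[index_5k]:.2f}")     # ~0.25
```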

‘Audio Waveforms’

Audio waveforms are a digital representation of alternating current over time in recorded PCM digital audio. An audio waveform displays the amplitude, phase and length of the audio, and we might spot transients in the signal. It is also possible to discern the complexity of the audio signal: a solo recording of a sine wave will be displayed visually as a sine wave (and if we measured the period of the sine wave we could determine its pitch). However, more complex sounds might be displayed as rapid and seemingly erratic changes in amplitude, and audio waveforms do not display the type of instrument producing the sound.

Digital Tuners

Digital tuners allow us to monitor the incoming pitch of a note, measured in Hz, and are typically used to tune instruments. Guitar tuners are one example, though these same devices can also be used to tune an oscillator on a synthesiser, or tune the pitch of a sample in a DAW. Digital tuners are either monophonic or polyphonic, meaning that they can either detect only one note at a time, or multiple.
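
One simple monophonic approach, sketched below in Python/numpy, is autocorrelation: find the lag at which the signal best lines up with itself, which corresponds to one period of the waveform. This is an illustrative sketch, not how any particular tuner is implemented:

```python
import numpy as np

SAMPLE_RATE = 44100

def detect_pitch_hz(signal: np.ndarray) -> float:
    """Estimate the pitch of a monophonic signal using autocorrelation."""
    signal = signal - np.mean(signal)
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    # Skip very short lags (very high pitches) and find the strongest remaining peak;
    # its lag corresponds to one period of the waveform.
    min_lag = SAMPLE_RATE // 2000
    peak_lag = np.argmax(corr[min_lag:]) + min_lag
    return SAMPLE_RATE / peak_lag

# Example: a 440 Hz sine wave should read back as approximately 440 Hz.
t = np.arange(SAMPLE_RATE // 10) / SAMPLE_RATE
print(round(detect_pitch_hz(np.sin(2 * np.pi * 440 * t)), 1))  # ~441.0
```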

Physical Sound Phenomena

Sometimes we can see the effect that sound has on the physical world, or movement in the equipment that produces it, such as speaker cones moving, visible vibrations in the environment, or materials moving due to the pressure of the sound waves emitted by a speaker. 

This can also be seen if sand is poured onto a plate positioned on top of a speaker facing vertically - when audio is played through the speaker, the vibrations in the plate cause the sand to migrate into patterns.

Visual Scores

WIP*** - (Link to Section in Sound Storage)

Responsive Music Visualisation Software

Responsive music visualisation software is designed to generate images (typically moving images) which are procedurally generated through some interpretation of an incoming audio signal. For example, Windows Media Player or a Max MSP program set up for live audio visual work. 

Lingual Descriptors - Universal Creative Language

WIP - Link here to document explaining the theory, this is a 1200 word document so hasn’t been included here. It is also written from a first person perspective.

Embodiment

WIP*** - Link this to section 6.1 on our nervous system. Also include:

When receiving sensory stimulus from sound waves, the human body might experience certain embodied reflexes or behaviours. These might include tapping a foot in time with the music, dancing, or even feeling nauseous when coming into contact with strong bass frequencies.


 

10. Computing Sound

Laptop and Desktop Computing

A laptop computer is a small portable device which can run on battery power or using a charger connected to mains power. A desktop computer requires mains power and is therefore stationary, typically installed in one location and not moved. There are no particular differences in terms of performance: a laptop might be more powerful than a desktop or vice versa; it all depends on the spec. Laptops are convenient if you intend to make music in more than one location, or perform live music using a computer. Laptops tend to have smaller screens and keyboards, however, and the battery doesn’t last long when running high-strain programs such as DAWs, so don’t forget the charger. Also consider that keeping a laptop permanently plugged in to the charger will reduce the lifespan of the battery, which in the end will result in the battery failing (and the laptop always needing to be plugged in). Laptops also tend to have less connectivity, such as fewer USB ports, though again this isn’t always the case.

Desktop computers are usually used in more permanent studios, and it is quite common to have multiple monitors connected. Some laptops have an HDMI port (or HDMI through USB), which allows the use of an external monitor and might be a workaround for the smaller screen of the laptop. Desktop computers also tend (though not always) to be easier to modify and upgrade over time than laptops, which can be a bonus as the demands of your creative process grow. They also tend to have better connectivity, for example more USB ports, which is very useful for music technology, as overloading a single USB hub using a dongle can result in performance issues.

In terms of connectivity, at present music technology devices still tend to use USB A connectors, though you can use a small converter if your laptop only has USB C ports. USB types are constantly changing however as microtechnology evolves.

Operating Software (OS)

There are four major operating systems in common use: Microsoft Windows (computers running Windows are also commonly referred to as PCs), Apple macOS, Linux, and Google ChromeOS (used by Chromebooks). Any of the first three operating systems mentioned will run the music-specific software detailed below. Chromebook laptops use web apps only and will not run the most popular music software programs.

Windows is currently on version 11. Apple is currently on macOS 14 Sonoma. Linux Mint is currently on version 21.2 "Victoria". Please refer to the manufacturers for version histories, as these change as they are updated; they can be a little hard to keep track of, and once you pick an OS you will need to keep track of updates. 

The newest software may not work on a 7-year-old OS which is now unsupported. Some OS updates are compulsory, especially with Apple products. However, some updates are not; therefore a 7-year-old version of Ableton might still work perfectly well on a 7-year-old version of Windows, for example. Bear in mind, however, that most software companies only host the most recent version on their websites, so obtaining a 7-year-old legacy copy of a program may prove difficult if you were to try today.

Processor

The processor is where the ‘computations’ occur in a computer. The most common manufacturers currently are Intel and AMD for Windows, and Apple for Apple Macs, though there are several other manufacturers. There are also a significant number of models of processor, and the differences can be confusing. Check the minimum system requirements provided by the software manufacturer to ensure that your chosen computer ships with an appropriate processor for your intended use. In most cases, processors cannot be changed or upgraded in a machine. Some computers are sold as older models or old stock, which naturally have older processors installed - It’s worth double checking, even if the computer itself seems like a good deal. Typically, the newest and most powerful processors are the most expensive, but that doesn’t mean that an older model won’t perform perfectly well for your needs. Check to avoid disappointment. 

RAM

RAM (Random Access Memory) is the computer’s working memory, and affects how much the computer can handle at once. RAM chips are installed into a computer and are measured in gigabytes (GB). 8GB of RAM will perform better than 4GB, and 16GB better still. The minimum system requirements for the most popular DAWs all specify a minimum of 8GB of RAM. There may also be a difference between ‘minimum’ and ‘recommended’ RAM, with ‘recommended’ always being the higher amount. Also consider that although the minimum might be 8GB, more complex sessions with multiple tracks and plugins all running simultaneously might overload the computer, causing glitches and ‘drop outs’ in the audio. This can be resolved by ‘bouncing tracks out’ or disabling non-essential functions.

Most off-the-shelf computers currently ship with 4GB of RAM or more; some computers have spare slots to install additional RAM chips, but not all - check this before you buy. It can be more affordable to purchase a computer with spare RAM slots and upgrade yourself, vs. an off-the-shelf model which ships with higher RAM. WARNING: Installing additional RAM (or making any modifications to your computer) is a delicate procedure, and though it can be done at home, it is highly recommended that this is performed by someone with experience, as if done incorrectly it can irreparably damage the computer and/or result in wiped data.  

GPU

The GPU (Graphics Processing Unit) determines the power of the computer to handle graphics and visuals. Music production is generally unconcerned with the GPU; the GPU which comes installed in most off-the-shelf computers will suffice. Advanced GPU cards are normally associated with advanced video editing software or gaming. Again, however, please always check the minimum system requirements of your chosen software to ensure your GPU is sufficient before making a purchase. If the GPU isn’t mentioned, it could be that there is no minimum requirement; if you are still unsure, contact the software manufacturer.

Digital Audio Workstation

A Digital Audio Workstation (DAW) is a computer program specifically designed to edit, arrange, sequence, and mix digitised audio and MIDI. A DAW is capable of interfacing with hardware audio devices, such as mixing desks and speakers, when connected to an audio interface. DAWs typically exist within a computer system such as a desktop or laptop computer, but scaled-down features can also be found in some modern hardware sequencing devices.

Some DAWs look similar to a traditional mixing desk, with faders and VU meters per channel, and a master fader. However, there are many DAWs which do not follow this layout, such as trackers, which scroll vertically and use numerical data to sequence audio. There are many DAWs available, and a DAW is typically chosen which best suits the workflow of the user, and the task.

Many DAWs also include native and third party plugins. Native plugins are developed by the developers of the DAW program and will only work inside that DAW. Third party plugins are developed by independent developers and typically can be used in multiple DAWs. Plugins are either instrument, effect, or MIDI.

Instrument plugins produce digitised audio within the DAW program, for example a synthesiser or sampled piano instrument.

Effect plugins are any plugin which processes digitised audio within the DAW program, for example a compressor or reverb.

MIDI plugins are any plugin which alters or modulates MIDI sequencing within the DAW.

Some DAWs also include the capacity to code within the program. For example, Ableton Live utilises Max for Live, a program created in conjunction with Max MSP.


 

11. Interacting with Sound

Professional Roles - Producer, mixing engineer, composer, instrumentalist,  

 

WIP*** - Ask for input from the careers team?

 

Modes of Listening - Critical, Active, Passive

 

WIP***

12. Comments Box / User Feedback (Beta)

Please leave name, date and detailed feedback here.

University Of Hull Logo