A week with the Focusrite Scarlett 6i6

Pictured: Scarlett features a stylish brushed red aluminum finish.

I have a bit of a history reviewing audio hardware, specifically audio I/O. Over time, the audio interface has moved away from PCIe to USB, which has been the de facto standard for nearly 15 years, ever since USB 2.0 became widespread. I've owned a few external boxes over the course of a decade: briefly, M-Audio's precursor to today's Fast Track (which I returned), the Yamaha GO46 FireWire, and the Native Instruments Audio Kontrol. I recorded two albums using the latter two. I consider myself a bit of an audio geek, but without the audiophile trappings.

Recently I hit a breaking point: the NI Audio Kontrol was not able to accept 1/4 inch unbalanced cables. Mystified, I decided it was time to retire the Audio Kontrol and check out the offerings in 2016. Unsurprisingly, audio interfaces offer far more bang for the buck than they did even five years ago. At $180 I was able to score the Focusrite Scarlett 6i6, which offers more high-quality inputs and outputs than any of my previous devices at a lower price point. Even more impressive, for $240 the 18i8 offers a whopping 18 potential inputs and 8 output buses.

The weak point of every USB capture device, in my experience, has been and probably always will be drivers (and USB itself). As an OS X (excuse me, macOS) user, my experience with CoreAudio has been mostly positive. For most USB devices that are ASIO/CoreAudio compliant, drivers are barely needed for basic I/O. However, if the interface has custom buttons, internal routing, or other features, then drivers are required. In the case of my Audio Kontrol, the drivers were mostly a negative, causing glitchy behavior, and the same went for my week with the M-Audio Fast Track. After dealing with years of prosumerish solutions, I decided to ante up to Focusrite, renowned for their preamps, skipping budget players like PreSonus and M-Audio.

Fair warning, this is as much an overview of digital audio as a review. Now onto the review.

Focusrite Scarlett 6i6

Pictured: The 6i6 makes for a good speaker rest

The Scarlett 6i6 is 6 in and 6 out, but that doesn't quite accurately sum up the ports. A breakdown includes the following:

Inputs

2 front-facing microphone XLR / 1/4 inch line inputs with hardware knobs for gain control and level monitoring (supports 48 V)
2 1/4 inch line inputs
1 stereo S/PDIF input
Midi in

Outputs

2 1/4 inch headphone outputs with hardware volume knobs
2 1/4 inch line (monitor) outputs with volume knob
2 additional line outputs
1 stereo S/PDIF output
Midi out

If you notice, this doesn't add up to the 6 outputs in the device name but instead to a total of 6 inputs and 10 outputs. The reason is that the headphones and monitors are all on the same audio bus, bringing it back down to 3 output buses: one for the monitors (speakers/amp + two headphones), an additional set of 1/4 inch outputs, and the S/PDIF output. Each of the headphone jacks and the monitors has an independent volume control, but any audio routed to the monitor bus will be sent to all three of those outputs. Also notable: the Scarlett only accepts 4 analog channels in. Most users probably won't use the S/PDIF I/O (more on that later). The full tech specs can be found here.

Setting up

Focusrite surprisingly ships the Scarletts with a host of wall adapters for your country of choice, but as I'm firmly rooted in North America, I had to swap to the North American standard prongs. Other than that, the Scarlett is pretty straightforward: USB cable to the computer, AC adapter to the wall, audio inputs into the device. For me, this meant plugging my Numark NS7s into the back ports, plus a single mic.

Pictured: The mess of cabling...

Performance: latency

With digital audio, there's always (as of writing this) buffering, which introduces latency. No matter the device, there will be latency depending on the buffer size. The math for calculating minimum latency is quite simple: buffer size (in samples) / sample rate (in kHz) = latency in milliseconds.

Example:

512 samples/44.1 kHz = 11.6 ms

384 samples/44.1 kHz = 8.7 ms

512 samples/96 kHz = 5.3 ms

384 samples/96 kHz = 4 ms
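
For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the formula above. The function name is my own invention for illustration, and it only covers the one-way buffer delay, nothing the driver or USB adds on top.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: float) -> float:
    """One-way buffer latency in milliseconds: samples / rate (Hz) * 1000."""
    return buffer_samples / sample_rate_hz * 1000.0


# Same buffer sizes and sample rates as the examples above.
for samples, rate_hz in [(512, 44_100), (384, 44_100), (512, 96_000), (384, 96_000)]:
    latency = buffer_latency_ms(samples, rate_hz)
    print(f"{samples} samples / {rate_hz / 1000:g} kHz = {latency:.1f} ms")
```

Dividing by the rate in Hz and multiplying by 1000 is the same thing as dividing by the rate in kHz; it just avoids having to remember which unit the function expects.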

However, this is only the absolute minimum for ONE direction, and lowering the buffer puts more stress on the CPU to be sure that the buffer is never fully depleted. This becomes tougher to accomplish as the CPU is tasked with processing more information, such as more effects and more tracks. Total travel times for buffering would look like the following:

Example:

(in) 512 samples/44.1 kHz = 11.6 ms + (out) 512 samples/44.1 kHz = 11.6 ms = 23.2 ms minimum roundtrip travel time

(in) 384 samples/44.1 kHz = 8.7 ms + (out) 384 samples/44.1 kHz = 8.7 ms = 17.4 ms minimum roundtrip travel time

(in) 512 samples/96 kHz = 5.3 ms + (out) 512 samples/96 kHz = 5.3 ms = 10.6 ms minimum roundtrip travel time

(in) 384 samples/96 kHz = 4 ms + (out) 384 samples/96 kHz = 4 ms = 8 ms minimum roundtrip travel time

The math above represents the absolute minimum travel time for external audio to go from an input to an audio output. As stated, this is the absolute minimum: the audio also travels through USB and is subject to the USB clock timer, which fires at 1 ms intervals, so there's an additional latency buffer that has nothing to do with audio samples but rather with the continuous data flow imposed by USB. Lesser devices simply use a buffer of roughly 6 ms for each direction (I/O), which adds more travel time, whereas higher-end devices finely tune the USB timing to minimize the delay. Someone using a low-end USB device with 384-sample buffering at 44.1 kHz can expect roughly a 29 ms delay. Higher-end boxes such as the Scarlett have fine-tuned drivers to shave off crucial milliseconds from the USB buffering, and also include onboard DSP to allow onboard mixing, which lowers the travel time. If this all sounds a bit confusing, it isn't as bad as it sounds.
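
To show where the roughly 29 ms figure comes from, here is a rough Python sketch that adds a fixed per-direction USB overhead to the buffer math. The function name and the idea of modeling USB as a flat per-direction constant are my own simplifications; the 6 ms value is just the ballpark quoted above for lesser devices, not a measured number for any particular interface.

```python
def round_trip_ms(buffer_samples: int,
                  sample_rate_hz: float,
                  usb_overhead_ms_per_direction: float = 0.0) -> float:
    """Estimated round-trip latency: (buffer delay + USB overhead) for input and output."""
    one_way_buffer_ms = buffer_samples / sample_rate_hz * 1000.0
    return 2 * (one_way_buffer_ms + usb_overhead_ms_per_direction)


# Buffer-only minimum vs. a low-end device with ~6 ms of USB buffering each way
# (the 6 ms figure is an assumption taken from the paragraph above).
buffer_only = round_trip_ms(384, 44_100)
low_end = round_trip_ms(384, 44_100, usb_overhead_ms_per_direction=6.0)
print(f"buffer only: {buffer_only:.1f} ms, with USB overhead: {low_end:.1f} ms")
```

Real drivers don't behave this neatly, so treat the result as a sanity check on the arithmetic rather than a prediction of what a loopback test would measure.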

Example:

I would like to route my Mic Input directly into my output so I can monitor my inputs without having to route my audio to my computer, to the DAW then back out USB, all of which introduces a time delay, hence latency. Doing this skips the travel time through the ASIO buffer and USB Clock. The benefit is that I effectively have zero-latency for my input monitoring and my downside is that I cannot make use of any effects in realtime from my DAW.

Higher-end audio interfaces include DSP effects that can be controlled via the software mixer, so basic compression, EQ, reverb, and delay can be applied to live monitoring, and/or they use other connections (FireWire has slightly better clock timing, and Thunderbolt provides even lower latency due to the PCIe bus clock).

All in all, the big step of buying the Scarlett line over a prosumer audio interface boils down to slightly better drivers and internal mixing.

Performance: The bits of it all

24 bit is an unrealistic thing; it's nearly a meaningless stat when it comes to audio gear. There are, however, measures that more appropriately reflect dynamic range, and to fully understand them, we have to talk analog and math.

While I may get flak for saying this, despite issues like latency, digital has had a massive leg up over its analog predecessors, not simply from an archiving/storage perspective but also in quality. The much-loved vinyl format manages a signal-to-noise ratio of roughly 80 dB, meaning the signal power is roughly 100 million (10^8) times the noise power, which isn't bad. Digital, however, doesn't have an analog noise floor, and sound levels are expressed in terms of bit depth, which is the number of steps available to the digital-to-analog converter (DAC).

To use an analogy I developed that works reasonably well when writing for an audio publication: bit depth is akin to bit depth in digital imagery, but instead of reflecting how many colors an image can have, it reflects how many steps in volume there are. The sample rate is the resolution at which the sound is captured. What becomes interesting is that there's even a formula that explicitly tells you the maximum dynamic range in decibels for any given bit depth. Using the signal-to-quantization-noise ratio formula, 20*log10(2^BITDEPTH-1), we can calculate the signal-to-noise ratio. 16-bit audio has a theoretical range of 96.33 dB, which is considerably better than vinyl and on par with the best studio reel-to-reel systems. Also, it's important to understand that these values represent a theoretical maximum, as analog-to-digital converters (ADC) and digital-to-analog converters (DAC) rarely achieve their maximums. 24-bit audio has a theoretical range of 144.49 dB, far beyond even the best hardware currently on the market. Below I made a simple calculator to play with.
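
A minimal stand-in for that calculator, sketched in Python. The function name is made up for illustration; the formula is the 20*log10(2^BITDEPTH-1) expression from the paragraph above, and the result is the theoretical ceiling, not what any real converter delivers.

```python
import math


def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Theoretical maximum dynamic range in dB: 20 * log10(2^bits - 1)."""
    return 20 * math.log10(2 ** bit_depth - 1)


# Try a few common bit depths.
for bits in (8, 16, 24):
    print(f"{bits}-bit: {theoretical_dynamic_range_db(bits):.2f} dB")
```

A handy rule of thumb falls out of this: each extra bit adds about 6 dB of range.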

The Focusrite features 109 dB of dynamic range on its inputs and outputs, which is a little more than 18 bits of depth. For the computer savvy, 18-bit = 2¹⁸, which is effectively 262144 sound pressure levels vs. 16-bit's 65536, or 4 times more detail. Focusrite isn't being deceitful in listing 24 bit, but rather dealing with the limitations of audio production. Also, for a reference point, the theoretical maximum for volume reproduction of 24 bit would be from silence to a NASA rocket launch (140 dB); arena rock concerts are known to be in excess of 120 dB. It's not realistic to use the entire dynamic range of 24 bit, and your neighbors would not approve if you could.
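
To see where "a little more than 18 bits" comes from, the same signal-to-quantization-noise formula can be run backwards. This is a rough Python sketch with a function name of my own choosing; it simply inverts 20*log10(2^bits - 1) and assumes the quoted 109 dB figure can be compared directly against that theoretical formula.

```python
import math


def equivalent_bit_depth(dynamic_range_db: float) -> float:
    """Invert 20 * log10(2^bits - 1) to get the bit depth matching a dynamic range."""
    return math.log2(10 ** (dynamic_range_db / 20) + 1)


bits = equivalent_bit_depth(109.0)
print(f"109 dB is roughly {bits:.1f} bits")

# Number of discrete levels at 18 bits vs. 16 bits.
levels_18 = 2 ** 18
levels_16 = 2 ** 16
print(f"{levels_18} levels vs. {levels_16} levels, {levels_18 // levels_16}x as many")
```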

Resolution

If I haven't talked about sampling rates yet, there's a reason: by most accounts, bit depth matters more than sampling rate after a certain point. 44.1 kHz can reproduce 0 Hz to roughly 22 kHz. Capturing at 96 kHz may reduce sound quality through aliasing noise if your target format is 44.1 kHz. The best way to imagine this is a photo. If you scale it proportionally by half, the image will remain clear, whereas scaling to, say, 45.9% of the image size would cause some of the image clarity to be sacrificed. The reason this isn't that big of a problem in applications like Photoshop is resampling (scaling) algorithms. The same principle applies to audio, as the waveform must be recomputed and resampled, creating what is known as aliasing. Bit depth downconversion uses dithering, which is a lot more predictable as it's a numeric reduction in values, where a range is compressed. Depending on your target format (48 kHz for movies, usually 44.1 kHz for music), capturing at 2x the sampling rate of the target format is preferred.

The Scarlett can capture at 88.2 kHz, but the advantages of a higher sampling rate are less obvious, since DACs have become quite good over the years at filling in the gaps, so to speak. What high resolution can do is capture sounds above human hearing and more accurately articulate the effects of phasing. It's not night and day, and honestly, I'm mostly hard pressed to tell the difference, as are a lot of people. However, audio processing does better with denser data, and the real advantage exists almost entirely in the DAW.
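
The 45.9% in the photo analogy isn't arbitrary: it's the ratio of 44.1 kHz to 96 kHz. Here is a small Python sketch, with helper names of my own invention, that works out the highest reproducible frequency for a sample rate and the scaling ratio between a capture rate and a target rate. It only shows the arithmetic; how much, if anything, is audibly lost at an awkward ratio depends on the resampling algorithm.

```python
from fractions import Fraction


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def resample_ratio(capture_rate_hz: int, target_rate_hz: int) -> Fraction:
    """Target rate over capture rate, reduced to lowest terms."""
    return Fraction(target_rate_hz, capture_rate_hz)


print(f"44.1 kHz tops out at {nyquist_hz(44_100):g} Hz")

# Capture rate -> target rate pairs discussed above.
for capture, target in [(88_200, 44_100), (96_000, 48_000), (96_000, 44_100)]:
    ratio = resample_ratio(capture, target)
    print(f"{capture} Hz -> {target} Hz: ratio {ratio} ({float(ratio):.1%})")
```

The 2x captures reduce to a clean 1/2, while 96 kHz to 44.1 kHz comes out as an awkward fraction, which is the audio version of scaling a photo to 45.9%.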

Since I touched on analog vs. digital, I figure I'll put in a quip on the long-standing debate. Most of the love for analog has less to do with superior quality than with characteristics left behind by the various mediums' limitations. It should also be pointed out that analog effects, like harmonic distortion from tube amplification and over-saturation from tape, can be and are captured by digital when recording from analog sources. For audiophiles, much of the desire is to recreate how music "used to sound," hence the love of vintage hardware. There's nothing inherently wrong with this, except that it often shapes the audio debate in non-quantifiable terms and often leads to absurd claims about analog vs. digital. Adding to the mess of the debate has been the shift in recording techniques, mixing, and mastering over time, which also drastically alters the sound of a recording (see the loudness war).

Lastly, digital, for recording and listening intents and purposes, exists in tandem with analog. In any digital audio path, the signal must be converted into analog electrical modulations to be fed into a transducer (speaker), or start as analog from a transducer (microphone) and be converted into digital, so the devices that perform these conversions are very important. In short, as it relates to this review, the Focusrite Scarlett offers professional quality at an absurdly low price point, and it's a wonderful time to be a hobbyist as digital solutions are cheap and extremely high quality. Focusrite isn't the only player making low-cost, high-quality computer audio interfaces, but it has one of the more attractive packages.

The Real world

At the price point, the Focusrite is well specced. The 2nd generation, due out this month, gives a modest bump: mostly more headroom on the analog ports, 192 kHz capture/playback, and analog protection circuitry for unexpected power surges. All welcome features, but not game-changing. The 1st gen can be had for $180, a nice $70 price reduction, making it a lot of bang for the buck.

Scarlett Mixing Software

Pictured: Scarlett Mixing Software

The Scarlett drivers are straightforward. Although the device can be used without them, you'll miss out on the analog mixing. After installing the drivers, I rebooted and launched the mixer, which immediately updated the firmware of the 6i6, which took mere seconds. There was no word on what the firmware update did, but googling revealed that it improved sample rate switching for OS X (macOS) users and enabled standalone mode, so the device can continue routing audio even if the computer isn't turned on. Very cool.

The software mixer is straightforward, with handy routing presets and input gain control, which is useful for the inputs that do not have hardware controls. Any configuration can be saved as a snapshot and instantly reloaded, likely more useful for the Scarletts featuring more inputs and outputs but still welcome. Several of the mixer elements also control hardware switches on the device, like input gain or line level vs. instrument for the front-facing ports. It's a gift and a curse: the hardware is small and compact, but it means setting the device's settings is entirely driver and support dependent, whereas with previous devices I've owned, line vs. instrument gain control was hardware facing. Even with bad drivers, the Audio Kontrol functioned as a simple USB input/output device regardless. With any luck though, support will be long-lived.

My setup

Everyone's home studio will look a little different, so to give readers a chance to compare and contrast, my current setup is as follows:

Computer: Mac Pro 2008 with oodles of upgrades

Monitoring: Vanatoo Transparent Ones with a MartinLogan Dynamo 300 subwoofer, Beyerdynamic DT-990 headphones

Inputs: Numark NS7 Performance Controller (motorized turntable controller with audio output), various microphones

Midi: Native Instruments Maschine, Korg padKontrol, Korg microKey 37, Korg nanoPad, Korg nanoKontrol (all USB)

DAW: Cubase, Logic, Maschine

My mini studio is very hip hop centric, mostly focusing on beat composition. It's not space intensive and uses only a modest amount of hardware, and I don't have any real plans of expanding much beyond it. The only real upgrade would probably be replacing my Shure SM7B with something a little more forgiving for a wider range of vocals.

Out the gate, I was already happy to simply be able to listen to my headphones or speakers without having to change my audio settings in OS X, even if that's only an option-click away. Swapping between the two was as simple as turning the volume up. This may not seem like a big deal, but for all the love Vanatoo gets, their speakers annoyingly do not have a front-facing volume knob. Also, the headphone amp, while some audiophiles scoff at it, is without a doubt better than my Mac's internal headphone jack that I was reduced to using. At least the Mac Pro headphone jacks aren't pummeled with white noise like the MacBooks'. So if nothing else, I got more accessible volume knobs and better sound via headphones. I was previously debating a headphone amp for my power-hungry DT-990s, but they sound better than before and as good as I recall them sounding when I used a mid-range Denon receiver as my main headphone amp.

The big hiccup came when trying to get audio to work in Cubase. Part of it was user error, as I could not for the life of me get audio to output in Cubase, and only Cubase. In a moment of inspiration, I realized that my ports might not be labeled correctly in the VST panel and noticed that it had carried over a configuration with two outputs disabled. Cubase then started showing volume meters for sound but still refused to output audio. At this point, I resorted to a classic audio hack for OS X: creating an aggregate device of one in Audio MIDI Setup. For whatever reason, it worked. Annoying? Yes. All other applications functioned normally without this, meaning the issue lies somewhere between Cubase's VST engine and the drivers for the Scarlett.

Recording was as easy as ever. There isn't much to say: identifying buses was a breeze and recording worked great, noiseless and sounding as rich as it should have for the instrument inputs on the back. The mic preamps are notably a little sweeter than my Audio Kontrol's, simply for the fact that they'll accept unbalanced cables. The quality, when tested with a Sennheiser e935 without any other preamp, was clean and defined, and only required roughly half gain. Compared to the Audio Kontrol, which wasn't terrible, it seemed just a hair "richer," to use a vague, imprecise term. Audio quality is certainly up to professional standards, at least in my book.

The next plus was that, for the first time, I was able to use live monitoring. With my Audio Kontrol, it never worked, if it was even supposed to. I've always done monitoring via software, which has meant delay. Not ideal, but it worked. A quick trip to the mixer control panel and the Scarlett worked as expected. I could play my NS7s regardless of whether I had a track set to monitor in my DAW. It's a real benefit over the lower-end devices I was using, as I could live monitor my turntables without waiting for the sound to route into my computer and back out.

After a week in Cubase, there have been no noticeable glitches, even though Cubase on OS X... macOS... is more prone to wonkiness than many other Mac audio apps. I'm quite happy with the device.

What about Midi I/O?

The Focusrite shows up in my audio pane, but I'm living in a post-Midi-cable world. The only Midi devices I own are controllers: the Korg microKey and nano series simply do not feature Midi ports, leaving my Korg padKontrol and my Maschine as the only two devices that have Midi I/O. For roughly a decade or more, I've been in a Midi-via-USB world and thus do not own any cables. I can't comment on the Midi performance other than that the software recognizes the ports.

A slightly different take - S/PDIF

The 6i6 is almost ideal, but the S/PDIF coax ports are almost useless for most people. For anyone asking what S/PDIF (Sony/Philips Digital Interface Format) is, it's the common format developed for transmitting PCM audio or compressed formats such as AC3 (Dolby Digital) or DTS via 75 Ohm coax (RCA) cables or Toslink (optical). Toslink over time became the much-preferred format, likely for the "cool" factor, and because optical cables require no shielding, as RF noise does not affect light. Thus the cables are lightweight and small. S/PDIF can be found on virtually all home theater receivers, many standalone CD players, almost all DVD players, many computer audio cards and motherboards, all Blu-ray players and, in the professional world, DAT systems and some mixers.

S/PDIF is so ubiquitous that my Mac Pro has I/O via S/PDIF optical, and most Macs (MacBook Pros, iMacs) can output S/PDIF optical with a specialized mini-Toslink cable. Digital coax is a fading format, limited to DAT and some CD/DVD players. Outside of DAT, most formats that use S/PDIF can be transferred directly from optical media (CDs/DVDs) in an optical drive bay, and thus it's mostly used as a way to transmit out from a computer to a receiver or speaker system. I'm not sure about user stats, but coax S/PDIF really strikes me as not very useful. I'd much prefer another set of instrument inputs instead of S/PDIF. A 6i4 (6 analog inputs) would be more useful, and I'd guess most studio musicians would be in the same boat. At the very least, Toslink would be much preferred, as there's a much greater chance someone has a speaker system or receiver that uses it.

The other negative is that I still don't know why Cubase has a problem with the Scarlett. I've used 3 other boxes over the years and never required any workarounds. It works, but it strikes me as a precarious position, as I'm not sure if I'm a DAW update or an OS update away from it not working with Cubase. As of writing this, it does work under OS X 10.11.5. As a Mac Cubase user, I'm in the minority, and Logic X works fine with it.

Pros

Value!
Build quality
Easy to use device mixer software
6 outputs linked to the "Monitor" audio bus alone, meaning two separate headphone amps and external speakers with all independent volume controls
Low latency for USB

Cons

Mild driver issues with Cubase, works fine with the workaround.
Coax S/PDIF really could be swapped for more useful ports. It's best to think of this as a 4 input device

11/3/2017 Update: I ran into some issues with Soundflower. Originally I thought my unit had died, and I had been without it for a few months, but I finally got around to testing it on my MacBook Pro. It worked without a hitch, so I booted another copy of OS X on my Mac Pro. It worked there too. I finally narrowed it down to Soundflower by Rogue Amoeba. If the uninstaller doesn't work, I wrote a guide on how to fix it.

01/16/2018 Update: Minor editing for clarifications. Also, after a year and a half, the Focusrite is hard not to recommend. Even in day-to-day use, it functions as the volume control for my speakers and headphones on my desktop when not being used for audio work.