WAV vs. FLAC vs. MP3: Audio File Formats Explained
Learn about audio files, without being an audiophile.
FLAC, WAV, AIFF, DSD… There’s no denying that picking the right type of audio to listen to can be crazy confusing. And if you even dip so much as a toe into the world of digital music, you’ll be confronted with discussions about sample rates and bit depth and whether lossless FLAC files can ever match MQA for fidelity. If all of that made your head spin, we’re here to clear it up with our in-depth, completely straightforward guide to file formats. By the time you’re done, no strange acronym will confuse you again.
Sample Rate And Bit Depth Explained
To understand how audio formats work, you need to understand two concepts: Sample Rate and Bit Depth. These are two measures that tell us how accurate a piece of digitally-recorded sound is, and we can understand them by imagining an art critic, looking at a painting. Yes, an art critic. Work with us here. Take a look at this wonderful piece of art:
Let’s say this particular art critic really wants to get to grips with the painting above and really understand it, but can only look at it a certain number of times before she has to move on (she’s busy, OK?). Obviously, the more often the critic looks at the painting - the more she samples it - the more she’ll understand it. This art critic can look at the painting 44,100 times per second. Obviously, if her sample rate were higher - like 96,000 times a second - she’d understand more about the painting. (We’ll talk about why we picked those numbers below - for now, just roll with it).
Now let’s talk about the painting itself. Let’s say our super-speedy critic is looking at a small portion of the picture, which happens to be painted yellow. That’s a piece of information she now has. She also knows that there are many different shades of yellow, and will use that information to critique the painting. So, when she’s writing up her review, she’ll be able to describe that yellow section as butter yellow, or lemon yellow, or gold. Not just plain yellow.
Let’s also say she’s reasonably smart and has the ability to describe sixteen different shades of yellow. Each shade of yellow she can describe is a single bit of information - in this case, she has sixteen bits of information at her disposal. Couple that with her 44,100 glances at the painting, and she’ll have a very in-depth understanding of what she’s looking at.
(The analogy is a little crude, because computers work slightly differently, and sixteen bits of information actually refers to 2 to the 16th power, which is 65,536 possible volume levels and… look, trust us, you don’t need to know this, OK? Let’s just go with sixteen bits of information. Sixteen shades of yellow. Breathe, everybody. Breathe.)
Now think of this in music terms. A group of musicians in a studio is the painting, and the software recording them is the art critic. The software ‘looks’ at the incoming sound 44,100 times every second and is able to use up to 16 bits of information each time it does, in order to describe what it’s hearing. Audio recording on a regular CD or Spotify stream is 16-bit/44.1 kHz. Sample rates are always written in hertz (Hz) or kilohertz (kHz, which is 1,000 Hz). You can record even higher-quality audio at 24-bit/192 kHz, which means that the software is sampling the music 192,000 times every second, and is able to draw on 24 bits of information to accurately describe it.
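If you'd like to see the numbers behind those labels, here is a minimal Python sketch. The function names are ours, invented for illustration, and it assumes plain uncompressed stereo audio. It shows how many distinct volume levels a given bit depth allows, and how much data per second a given sample rate and bit depth add up to.

```python
def pcm_levels(bit_depth: int) -> int:
    """Number of distinct volume levels a single sample can take."""
    return 2 ** bit_depth


def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Data rate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


print(pcm_levels(16))                  # 65536
print(pcm_levels(24))                  # 16777216
print(pcm_bitrate_kbps(44_100, 16))    # 1411.2  (CD quality, stereo)
print(pcm_bitrate_kbps(192_000, 24))   # 9216.0  (24-bit/192 kHz, stereo)
```

In other words, moving from 16-bit/44.1 kHz to 24-bit/192 kHz means roughly six and a half times as much data every second.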
If none of the above is clear, then all you need to understand is this: the higher the bit depth and sample rate, the higher quality an audio file is.
FLAC vs. WAV vs. MP3
Let’s get MP3s out of the way first, because they suck. If you have even a passing interest in audio fidelity and decent sound, you’re going to want to avoid these. Essentially, an MP3 (MPEG-1 Audio Layer III) is a file that sacrifices audio resolution for minimal size by cutting out all the bits that we as humans aren’t supposed to be able to hear. On the plus side, it can be read by just about any device on earth.
The downside? We might not technically be able to ‘hear’ the bits that are cut out, but in our opinion, this compression of the file – and we’re going to be using the terms compressed and uncompressed a lot – renders it thin, tinny, and lifeless. Few people who care about sound seriously use MP3s these days: in 2017, the Fraunhofer Institute, which helped develop the format, ended its MP3 licensing program as the patents expired, a move widely reported as the format being declared dead. Regardless, you’ll still come across it every now and then, especially if you have an old iTunes Library.
FLAC is where things get really interesting. The Free Lossless Audio Codec pulls off a remarkable trick by compressing the file down to roughly 60% of its original size without losing any audio data at all: decode it, and you get back exactly the samples you started with. Not only is it free and open source, but it can carry full CD-quality audio, which works out to a bitrate of 1,411 kbps before compression, far more than the 320 kbps or less of a typical lossy stream. This is the format of choice for the streaming service Tidal, and in terms of mass-market streaming audio, it’s considered the gold standard. This is the format you should use if you care deeply about your sound, but don’t want to commit to physical formats like CDs or vinyl.
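To put rough file sizes on all of this, here is a small back-of-the-envelope sketch in Python. The four-minute track length is a made-up example, the 60% FLAC ratio is just the rough figure quoted above (real results vary with the music), and 320 kbps is assumed as a typical high-quality MP3 setting. The 1,411 kbps figure is simply the data rate of uncompressed 16-bit/44.1 kHz stereo audio.

```python
def size_mb(bitrate_kbps: float, seconds: float) -> float:
    """File size in megabytes (10^6 bytes) at a constant bitrate."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000


TRACK_SECONDS = 4 * 60               # hypothetical four-minute song
CD_KBPS = 44_100 * 16 * 2 / 1000     # 1411.2 kbps, uncompressed CD-quality stereo
FLAC_RATIO = 0.60                    # rough figure from the article, not a guarantee

wav_mb = size_mb(CD_KBPS, TRACK_SECONDS)
flac_mb = wav_mb * FLAC_RATIO
mp3_mb = size_mb(320, TRACK_SECONDS)

print(f"Uncompressed: {wav_mb:.1f} MB")   # Uncompressed: 42.3 MB
print(f"FLAC (est.):  {flac_mb:.1f} MB")  # FLAC (est.):  25.4 MB
print(f"MP3 320 kbps: {mp3_mb:.1f} MB")   # MP3 320 kbps: 9.6 MB
```

The point of the comparison: the MP3 gets small by throwing information away, while the FLAC gets smaller and keeps everything.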
WAVs are equally common, and useful for anybody wanting decent audio. Essentially, WAVs (Waveform Audio File Format) are higher-resolution audio files that almost always contain uncompressed audio. Technically speaking, a WAV file is simply a container for a piece of audio encoded with something known as Pulse Code Modulation (PCM). This is a way of taking analog audio and converting it into digital so that it has a sample rate and bit depth, as described above. Honestly? You don’t need to stress too much about the difference. For all intents and purposes, you can treat WAV and PCM as interchangeable terms, and both refer to a reasonably high-quality audio file. To make things even simpler: FLAC is commonly used on streaming services, whereas you’ll normally find WAVs and MP3s on a hard drive in your house.
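For the curious, the relationship between WAV and PCM can be sketched with nothing but Python's standard library. This is an illustration under our own assumptions (a 440 Hz test tone, one second long, mono, kept in memory rather than saved to disk), not a recipe for real recording software. It generates raw PCM samples, wraps them in a WAV container using the built-in wave module, and reads them back.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # samples per second
BIT_DEPTH = 16         # bits per sample
FREQ_HZ = 440.0        # hypothetical test tone

# One second of a sine wave, stored as signed 16-bit PCM samples.
peak = 2 ** (BIT_DEPTH - 1) - 1
samples = [
    round(peak * 0.5 * math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]
pcm_bytes = struct.pack(f"<{len(samples)}h", *samples)

# Wrap the raw PCM in a WAV container (in memory, so no file is left behind).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_out:
    wav_out.setnchannels(1)
    wav_out.setsampwidth(BIT_DEPTH // 8)
    wav_out.setframerate(SAMPLE_RATE)
    wav_out.writeframes(pcm_bytes)

# Read it back: the header carries the sample rate and bit depth,
# and the audio payload is the same PCM data we put in.
buffer.seek(0)
with wave.open(buffer, "rb") as wav_in:
    rate = wav_in.getframerate()
    depth = wav_in.getsampwidth() * 8
    payload = wav_in.readframes(wav_in.getnframes())

print(rate, depth, payload == pcm_bytes)   # 44100 16 True
```

The WAV file adds a small header describing the audio; everything after that is the PCM data itself, untouched.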
OGG vs. ALAC vs. AIFF
Let's start with the fun one: OGG, or Ogg Vorbis. No, we’re not making that up; that’s actually its name. The Vorbis part comes from a character in the Terry Pratchett novel, Small Gods, because every so often, computer engineers display a little bit of personality. Think of OGG files like supercharged MP3s. They’re compressed audio, meaning they get decent file sizes that are friendly to stream over Wi-Fi, but they also manage to avoid much of the audio damage caused by this process. Spotify uses them, and depending on how much you pay, you can get them in various bitrates, from 96 kbps on the free tier all the way to 320 kbps on the premium. As a general rule, premium-tier Spotify audio quality is considered a perfectly acceptable, if not mind-blowing, way of listening to music.
Then there’s ALAC (Apple Lossless Audio Codec). It’s not quite as good as FLAC in terms of efficiency, but because both formats are lossless, they decode to exactly the same audio, so you won’t hear a difference between them. It’s the format of choice for Apple Music streamers.
Last but not least, we have AIFF (Audio Interchange File Format). This format is functionally similar to WAV, in that it uses Pulse Code Modulation to encode a piece of audio and present it in a digital format. Back in the day, this was Apple’s answer to Microsoft’s WAV, and it would only work on Mac computers. Now, they’re more or less interchangeable. Put simply, if you have a file that is a WAV or AIFF, you’re dealing with a decent piece of audio. For the most part, these uncompressed formats are used for actual files played through services like iTunes music libraries. You won’t really see them on streaming services, which tend to use special types of lossless, compressed audio mentioned above.
MQA And Hi-Res Audio Explained
One of the things that streaming can’t do, and that actual files on a hard drive can, is deliver true high resolution audio. It’s a bit of a nebulous term, encompassing many things, but it essentially means audio of the very highest quality. We are a tad dubious about manufacturers trumpeting their products as handling it, mostly because it’s so ill-defined, but it’s out there. And there’s been a development recently that might bring it to streaming services.
It’s called MQA (Master Quality Authenticated). Essentially, it delivers tiny files with absolutely enormous sound quality, using some sophisticated digital jiggery-pokery to package them into a FLAC or WAV container, and deliver them down your piddly Wi-Fi signal.
On the one hand, this is obviously great news for audiophiles and their files... On the other, it hasn’t quite taken the industry by storm. MQA is still readily available, but it has yet to make a real dent into the dominant file formats of our time. Although for the record, we would love to see it do so. You can actually listen to MQA audio on Tidal right now, thanks to their Tidal Masters lineup on desktop. You also need compatible hardware to actually run it – like DSD, it requires some fairly specialized internal components to get it working. Fortunately, these things are becoming more readily available. Products like the Mytek Brooklyn Bridge will quite happily handle MQA sound.
DSD files are a bit different. They upend the rules of regular music. But as you’re about to find out, they’re something you truly need to experience. You’ll forgive the fact that this section is slightly long, but it takes a bit of explanation. You can totally skip it if all you want to do is stream off Tidal.
A lot of articles on DSD spend a good deal of time going into the history – one we read recently even starts with the invention of the phonograph, which is eyebrow-raising, to say the least. Here’s what you need to know about how DSD was made. A while back, Sony and Philips wanted to start experimenting with higher-quality audio formats, and DSD was what they came up with. Short and sweet. Obviously there’s a bit more to it than that, and you can dive into this article which explains it, but that’s more information than we need to get this party started.
Now, as we said, your regular CD or Spotify stream has a bit depth and sample rate of 16-bit/44.1 kHz. DSD Audio, however, has a bit depth and sample rate of 1-bit/2.8224 MHz (megahertz). In other words, a piece of DSD Audio is sampled 2,822,400 times every second, and each time, it only produces one bit of information.
Imagine a ruler with 44,100 lines on it. In other words, you can measure something in 44,100 increments. If the bit depth is sixteen, you’ll then be able to gather sixteen bits of information from the segment you’ve just measured. But if you have a ruler with 2,822,400 lines on it, then obviously you’ll be able to take much finer measurements. When you’re taking measurements that fine and that accurate, you simply don’t need sixteen bits of information. You only need one.
That’s because the segment you’ve measured won’t be all that different from the ones to the left and right of it. Having sixteen bits of information won’t be any more beneficial than one bit, in this case. When the sample rate is that high, there’s no benefit to having a higher bit depth. You can simply record that information as a 1 or a 0 - or, in sound terms, whether the amplitude of a sound wave is increasing or decreasing. By the time those 2,822,400 1s and 0s are put together, you have an insanely-detailed picture of whatever it is you’re measuring. It’s like if you suddenly zoomed out of a close-up of a collage, where each tiny segment only differed slightly from the ones around it, to reveal a gorgeous painting. (Obviously, there is a lot more to it than this, but this is the simplest description we can think of).
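If you want to see the one-bit idea in action, here is a toy Python sketch. To be clear about the assumptions: this is a textbook first-order delta-sigma modulator written for illustration, the function names are ours, and real DSD encoders are far more sophisticated. It turns a smooth, heavily oversampled wave into nothing but 1s and 0s, then recovers an approximation of the wave by averaging blocks of those bits.

```python
import math


def one_bit_encode(signal):
    """Toy first-order delta-sigma modulator: floats in [-1, 1] -> list of 0/1."""
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in signal:
        integrator += x - feedback          # accumulate the running error
        bit = 1 if integrator >= 0 else 0   # one bit: push up or push down
        feedback = 1.0 if bit else -1.0     # the level the chosen bit stands for
        bits.append(bit)
    return bits


def one_bit_decode(bits, window):
    """Crude low-pass filter: average each block of `window` bits back to [-1, 1]."""
    return [
        sum(2 * b - 1 for b in bits[i:i + window]) / window
        for i in range(0, len(bits) - window + 1, window)
    ]


OVERSAMPLE = 64    # one-bit samples per recovered point
N_POINTS = 32      # recovered points across one cycle of the wave
total = OVERSAMPLE * N_POINTS
signal = [0.5 * math.sin(2 * math.pi * n / total) for n in range(total)]

bits = one_bit_encode(signal)
decoded = one_bit_decode(bits, OVERSAMPLE)

# Compare against the original wave averaged over the same blocks.
reference = [
    sum(signal[i:i + OVERSAMPLE]) / OVERSAMPLE
    for i in range(0, total, OVERSAMPLE)
]
worst_error = max(abs(d - r) for d, r in zip(decoded, reference))
print(sorted(set(bits)), len(decoded), worst_error < 0.1)   # [0, 1] 32 True
```

No single bit says much on its own. It's the density of 1s over time that traces the shape of the wave, which is why the sample rate has to be so high.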
2.8224 MHz is far from the upper limit, by the way. That base rate, 64 times the 44.1 kHz of a CD, is known as DSD64. You’ll frequently see terms like DSD128 and DSD256, which refer to DSD audio with even higher sample rates: 5.6448 MHz and 11.2896 MHz respectively (or 12.288 MHz for DSD256 built on a 48 kHz base). Rates beyond that do exist, but recordings at the top end are so rare that they’re practically non-existent.
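By the usual naming convention, the number after "DSD" is a multiple of a base sample rate, normally the CD's 44.1 kHz. Here is a quick Python sketch of that arithmetic. The function name is ours, and the 48 kHz variant is included only to show where the 12.288 MHz figure comes from.

```python
CD_RATE_HZ = 44_100


def dsd_rate_mhz(multiple: int, base_hz: int = CD_RATE_HZ) -> float:
    """Sample rate in MHz for a DSD variant named by its multiple of the base rate."""
    return multiple * base_hz / 1_000_000


print(dsd_rate_mhz(64))            # 2.8224
print(dsd_rate_mhz(128))           # 5.6448
print(dsd_rate_mhz(256))           # 11.2896
print(dsd_rate_mhz(256, 48_000))   # 12.288
```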
In order to understand exactly how DSD works, you have to be familiar with and conversant in not just concepts like bit depth and sample rate, but quantization, jitter, non-linearity, amplitude, noise-shaping algorithms and more. That is, to be honest, much more detail than most of us will ever need. If you want to find out more, there’s plenty of information elsewhere that goes into excruciating depth (here’s a vaguely-understandable explanation). Basically, what you need to understand is this: DSD audio sounds really damn good. It feels like you’re in the room with the musician, standing right in front of him, while he shows you what he can do. You’re hearing each note in extraordinary, pinpoint detail. It feels as though you’ve stopped listening to a piece of recorded audio on headphones or a pair of speakers, and have transported directly into the recording studio. You’re mainlining pure music, snorting the goddamn motherlode, injecting three minutes of the best audio you’ve ever heard right into your carotid.
We are not even a little bit joking. DSD is incredibly geeky, but holy hell, it’s worth your time. It’s the single highest-quality source of audio we’ve ever heard, and we’re not kidding when we say it’s left us breathless. Forget the maths and clunky examples. If you take one thing away from this guide, it’s this: a song playing back as DSD audio will be among the finest things you have ever heard, and you owe it to yourself to try it out.
To properly experience DSD audio, you’ll need not only a DAC and amp capable of handling DSD, but also a specialized player (we were serious when we said this was geeky). Beyond the cost of gathering this equipment, there are a few other downsides. The most obvious is that your favorite artist might not be available in this format. Fortunately, if you’re the kind of person who enjoys Norah Jones, Diana Krall, Carlos Santana, or even Steely Dan, there will be a DSD version of your preferred album available online, at stores like Acoustic Sounds. But if you like Kings of Leon, Kanye West, Taylor Swift, or most chart songs released in the past decade, you may be out of luck. Right now, DSD recordings are overwhelmingly dominated by what one might charitably call legacy acts, and if that’s not what you listen to, you’ve got more than 99 problems.
DSD has proven to be pricey in more ways than one. If you do find the album you’re looking for, be prepared to pay significantly more than you would if you bought it from iTunes – DSD albums often nudge the $25 mark. They take up far more space on the hard drive, too, clocking in at around one to two gigabytes in size. And by the way, don’t expect to be playing these albums from iTunes or on your phone. As we mentioned above, you need a specialized audio player to handle the file format. We use the free Pine Player on our office Mac, but the most popular program is Audirvana Plus, which costs an eye-watering $74. DSD music streaming? Excuse us while we snort uncontrollably. That's just not going to happen - at least not for a very long time.
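The one-to-two-gigabyte figure is easy to sanity-check. This rough Python sketch assumes a hypothetical 45-minute stereo album stored as uncompressed DSD64 (one bit per sample at 2.8224 MHz) and compares it with the same album at CD quality. The function names are ours, and actual download sizes will vary with the file format used.

```python
def dsd_album_gb(minutes: float, multiple: int = 64, channels: int = 2) -> float:
    """Approximate size in gigabytes (10^9 bytes) of uncompressed DSD audio."""
    bits_per_second = multiple * 44_100 * channels   # one bit per sample
    return bits_per_second * minutes * 60 / 8 / 1e9


def cd_album_gb(minutes: float, channels: int = 2) -> float:
    """Approximate size in gigabytes of uncompressed 16-bit/44.1 kHz audio."""
    bits_per_second = 44_100 * 16 * channels
    return bits_per_second * minutes * 60 / 8 / 1e9


ALBUM_MINUTES = 45   # hypothetical album length

print(f"DSD64: {dsd_album_gb(ALBUM_MINUTES):.2f} GB")   # DSD64: 1.91 GB
print(f"CD:    {cd_album_gb(ALBUM_MINUTES):.2f} GB")    # CD:    0.48 GB
```

That's four times the data of a CD before you even get to the higher DSD rates, which goes some way to explaining why nobody is rushing to stream it.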
So, you’ve bought your audio and downloaded Pine Player. Did you think you were done? Not quite. Listening to DSD Audio through your computer’s crappy digital-to-analog converter (DAC) and terrible amp circuitry is like getting a bottle of aged single malt whiskey, and drinking it out of a cereal bowl. Be nice to your music and be kind to your ears. Invest in a DAC that can specifically handle DSD Audio, and you’ll ensure that your music sounds sweet and accurate.
To bypass all this wizardry, you can technically cheat, and up-sample a piece of PCM audio to DSD format. Amp/DACs like the Sony TA-ZH1ES (full review here) actually have circuitry that remasters audio into DSD, up to 11.2 MHz - which is a huge sample rate. However, though it will sound quite good, nothing beats the full experience of hearing properly recorded DSD music.