How Audiobook Creators Actually Process Their Recordings
From raw voice files to polished audiobooks—the real workflow behind what narrators and indie creators do to make audiobooks sound professional.

If you've ever tried recording yourself reading something aloud and then listened back, you know the vibe: it sounds nothing like the smooth, effortless audiobooks you buy on Audible. Maybe there's background hum. Maybe you can hear every breath. Maybe your voice volume jumps all over the place.
That's because audiobook production isn't just about reading well—it's about a ton of invisible audio work that happens after the recording. And if you're thinking about narrating your own book (or starting a side gig as a narrator), understanding the actual workflow matters.
Here's what really happens between "record" and "publish."
Step 1: Recording in a Format That Won't Bite You Later
Most audiobook narrators record in WAV at 44.1kHz or 48kHz sample rate, 24-bit depth. That's not because they're audiophile snobs—it's because WAV is lossless and gives you room to edit without degrading quality.
You don't want to record straight to MP3. MP3 compression is lossy, and every time you edit and re-export, you lose a bit more fidelity. It's like photocopying a photocopy—fine once, ugly after five rounds.
But eventually, you will convert to MP3 or M4B (the audiobook-friendly format). That comes at the end. For now, keep everything in WAV so you can chop, tweak, and re-export cleanly.
And if you need to split chapters or prep different takes for comparison, having a tool that can convert between formats quickly without re-encoding hell is a lifesaver.
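If you want to confirm a session really was saved with the settings above, a few lines of Python can read the WAV header for you. This is a minimal sketch using the standard library's wave module; the function names are made up for illustration, and it assumes plain PCM WAV files.

```python
import wave

ACCEPTED_RATES = {44100, 48000}  # the two sample rates named in the text


def describe_wav(path):
    """Return (sample_rate_hz, bit_depth, channels, seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8
        channels = wav.getnchannels()
        seconds = wav.getnframes() / rate
    return rate, bits, channels, seconds


def recording_warnings(path):
    """List anything about the file that differs from the recommended settings."""
    rate, bits, _channels, _seconds = describe_wav(path)
    warnings = []
    if rate not in ACCEPTED_RATES:
        warnings.append(f"sample rate is {rate} Hz, expected 44100 or 48000")
    if bits < 24:
        warnings.append(f"bit depth is {bits}-bit, the text recommends 24-bit")
    return warnings
```

An empty list from recording_warnings means the file matches the recording settings recommended here; it says nothing about how the audio actually sounds.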
Step 2: Cleaning Up the Raw Audio
This is where the magic (or tedium) happens. Raw narration is full of stuff you don't want:
- Breaths between sentences (some are fine, but loud gasps aren't)
- Mouth clicks and lip smacks
- Background hum from AC, computer fans, or traffic
- Retakes where you stumbled on a word
- Long pauses where you lost your place
Professional narrators spend 2-4 hours editing per finished hour of audio. Some of that is trimming out the bad takes. Some is using tools like iZotope RX to surgically remove clicks and pops. Some is just listening closely and cutting breaths that sound too wet or distracting.
Here's a pro trick: record 5 seconds of "room tone" (the ambient sound of your recording space with nobody talking) at the start of every session. Then use that as a noise profile in your audio editor. Tools like Audacity or Adobe Audition can "learn" what your background noise sounds like and reduce it across the whole file. It's not perfect, and heavy settings can leave watery artifacts, but used lightly it's shockingly effective.
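That same room tone also tells you how noisy your space is. The sketch below measures its RMS level in dB relative to full scale. It is not what Audacity's noise reduction does internally, just a quick measurement. It assumes the audio is already decoded to floats between -1.0 and 1.0, and the function names are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples (-1.0 to 1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def noise_floor_from_room_tone(samples, sample_rate, seconds=5.0):
    """Treat the first `seconds` of the session as room tone and measure it."""
    room_tone = samples[: int(sample_rate * seconds)]
    return rms_dbfs(room_tone)
```

The lower (more negative) the number, the quieter the room. If it comes out high, fixing the room or the mic placement usually beats leaning harder on noise reduction.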
Step 3: Normalizing Levels So Your Voice Doesn't Whisper-Shout
Even if you think you're reading at a consistent volume, your waveform will show otherwise. Quiet passages dip. Dialogue gets louder. If you don't fix this, listeners will constantly adjust their volume—annoying.
Audiobook platforms like ACX (Amazon's audiobook distributor) have strict loudness requirements:
- Peaks no higher than -3 dB
- RMS (average loudness) between -23 dB and -18 dB
- Noise floor no higher than -60 dB RMS
- No clipping, plus a short stretch of room tone at the start and end of each file
If you submit files outside those specs, they get rejected. So narrators use normalization and compression (the audio kind, not file compression) to keep everything in the sweet spot.
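Many DAWs (digital audio workstations) have a "normalize to RMS" function. You set the target (say, -20 dB RMS), hit apply, and it adjusts the gain of the entire recording to match. One catch: raising the average level raises the peaks by the same amount, so check them afterward and add a limiter if they now poke above -3 dB.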
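The arithmetic behind RMS normalization is simple enough to check by hand. Here is a small sketch, assuming you already have the measured RMS and peak in dB from your editor. The function name is hypothetical, and the -3 dB peak ceiling is taken from ACX's published limit.

```python
def normalization_plan(rms_db, peak_db, target_rms_db=-20.0, peak_ceiling_db=-3.0):
    """Work out the gain needed to reach the target RMS and the peak that results."""
    gain_db = target_rms_db - rms_db       # one gain value for the whole file
    new_peak_db = peak_db + gain_db        # peaks move by the same amount
    return {
        "gain_db": gain_db,
        "new_peak_db": new_peak_db,
        "needs_limiter": new_peak_db > peak_ceiling_db,
    }
```

For a recording measuring -26 dB RMS with peaks at -6 dB, the plan is +6 dB of gain, which lands the peaks at 0 dB. That is over the ceiling, so you would need a limiter or some compression before normalizing.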
Step 4: Splitting Into Chapters
Nobody wants one giant 8-hour audio file. Audiobooks are split by chapter, sometimes by section. That means you're exporting 15-30 separate files per book.
If you recorded chapter-by-chapter, great—you already have separate files. But if you recorded in long sessions (which is more efficient), you'll need to split the master file at the right timestamps.
Some narrators use markers in their DAW to label chapter breaks while recording. Then they export each segment as its own file. Others just eyeball the waveform and cut manually.
Either way, you end up with a folder full of files like Chapter_01.wav, Chapter_02.wav, etc. Then you batch-convert them all to MP3 or M4B for distribution. If you need to batch-process audio files without clicking 30 times, automation is your friend.
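If you noted your chapter start times while recording, the split itself can be scripted. This is a rough sketch using Python's built-in wave module. The function name, output folder, and file naming are assumptions for illustration, and it expects an uncompressed PCM WAV master.

```python
import os
import wave

CHUNK_FRAMES = 65536  # copy in pieces so long chapters don't fill memory


def split_wav(master_path, chapter_starts, out_dir):
    """Cut a master WAV at the given start times (in seconds) into chapter files."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with wave.open(master_path, "rb") as master:
        params = master.getparams()
        rate = master.getframerate()
        # Convert marker times to frame offsets, then add the end of the file.
        bounds = [int(t * rate) for t in chapter_starts] + [master.getnframes()]
        for number, (start, end) in enumerate(zip(bounds, bounds[1:]), start=1):
            out_path = os.path.join(out_dir, f"Chapter_{number:02d}.wav")
            master.setpos(start)
            remaining = end - start
            with wave.open(out_path, "wb") as chapter:
                chapter.setparams(params)
                while remaining > 0:
                    step = min(CHUNK_FRAMES, remaining)
                    chapter.writeframesraw(master.readframes(step))
                    remaining -= step
            written.append(out_path)
    return written
```

Calling split_wav("master.wav", [0.0, 1825.4, 3710.0], "chapters") would write three files. The cuts are sample-accurate but blind, so make sure each marker sits in a pause and not in the middle of a word.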
Step 5: Adding Metadata (So Listeners Know What They're Hearing)
MP3 and M4B files can store metadata—title, author, narrator, chapter titles, cover art. Platforms like Audible pull this info to display in the app. If you skip this step, your audiobook shows up as "Unknown Artist" with no chapter navigation. Not a great look.
Tools like Mp3tag (Windows and macOS) or Kid3 (Windows, macOS, and Linux) let you batch-edit ID3 tags across all your chapter files at once. You set the album name (book title), artist (author), and individual track titles (chapter names).
For M4B files (which are basically AAC audio wrapped in an MP4 container), you can also embed chapter markers so listeners can skip between sections like they're using a DVD menu. It's a nice touch that makes your audiobook feel more polished.
Step 6: Final Export and Quality Check
Before you upload anywhere, you listen to the whole thing. Not in your DAW—export it to MP3 or M4B and listen on the same device your audience will use. Phone speakers, earbuds, car stereo.
Why? Because playback on different devices can reveal problems you didn't hear in your studio monitors. Maybe there's a low hum that only shows up on phone speakers. Maybe the bass is muddy in a car. You catch it now or you catch it in 1-star reviews.
Some narrators also run their files through ACX Check (a free plugin for Audacity) to verify they meet platform specs before uploading. It reports peak level, RMS level, and noise floor. If all three pass, you're good to go.
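For a feel of what a check like that involves, here is a rough stand-in in Python. It is not the ACX Check plugin's actual algorithm. It assumes float samples between -1.0 and 1.0, estimates the noise floor as the quietest half-second window, and uses ACX's published limits (-3 dB peak, -23 to -18 dB RMS, -60 dB noise floor). All names are hypothetical.

```python
import math


def _db(amplitude):
    """Convert a linear amplitude to dB relative to full scale."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def _rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def measure(samples, sample_rate, window_s=0.5):
    """Return (peak_db, rms_db, noise_floor_db) for the given samples."""
    if not samples:
        raise ValueError("no samples to measure")
    window = max(1, int(sample_rate * window_s))
    # Noise floor estimate: RMS of the quietest non-overlapping window.
    levels = [
        _rms(samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]
    rms = _rms(samples)
    floor = min(levels) if levels else rms
    return _db(max(abs(s) for s in samples)), _db(rms), _db(floor)


def check_levels(samples, sample_rate):
    """Return a list of human-readable problems; empty means all three pass."""
    peak_db, rms_db, floor_db = measure(samples, sample_rate)
    problems = []
    if peak_db > -3.0:
        problems.append(f"peak {peak_db:.1f} dB is above -3 dB")
    if not -23.0 <= rms_db <= -18.0:
        problems.append(f"RMS {rms_db:.1f} dB is outside -23 to -18 dB")
    if floor_db > -60.0:
        problems.append(f"noise floor {floor_db:.1f} dB is above -60 dB")
    return problems
```

Treat a pass here as a hint only. The real plugin and the platform's own checks are what count.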
What Format Do You Actually Publish In?
Depends where you're uploading:
- ACX (Amazon/Audible): MP3, 192 kbps or higher CBR, 44.1 kHz, all files mono or all files stereo
- Findaway Voices: MP3 or M4B, similar specs
- Self-hosting or Patreon: M4B is nice because it bundles chapters + cover art in one file
M4B is technically superior for audiobooks because it supports chapter markers and bookmarking (so listeners can pause and resume exactly where they left off). But not all platforms accept it, so MP3 is the safe fallback.
If you recorded in WAV (which you should have), converting to MP3 or M4B is a one-time lossy step. Do it once, at the end, with high quality settings (320 kbps for MP3, 128-192 kbps for AAC/M4B). That way you preserve as much as possible from your lossless source.
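That one-time conversion is a good candidate for a script. The sketch below only builds ffmpeg command lines, one per chapter WAV, so you can inspect them before running anything. It assumes ffmpeg is installed with the libmp3lame encoder. The function names and folder layout are made up for illustration.

```python
from pathlib import Path


def mp3_command(wav_path, out_dir, bitrate_kbps=192, sample_rate=44100):
    """Build an ffmpeg command that encodes one WAV to constant-bitrate MP3."""
    wav_path = Path(wav_path)
    out_path = Path(out_dir) / (wav_path.stem + ".mp3")
    return [
        "ffmpeg", "-i", str(wav_path),
        "-codec:a", "libmp3lame",
        "-b:a", f"{bitrate_kbps}k",   # a fixed bitrate gives CBR with libmp3lame
        "-ar", str(sample_rate),      # resample if the source was 48 kHz
        str(out_path),
    ]


def batch_commands(wav_dir, out_dir, **settings):
    """One command per chapter file, in filename order."""
    return [
        mp3_command(wav, out_dir, **settings)
        for wav in sorted(Path(wav_dir).glob("*.wav"))
    ]
```

To run them, loop over the list and pass each command to subprocess.run(cmd, check=True). Pass bitrate_kbps=320 if you want the higher setting mentioned above.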
Tools the Pros Actually Use
You don't need a $2,000 software suite to make a good audiobook. Here's what working narrators rely on:
- Audacity (free) — handles recording, editing, noise reduction, normalization
- Reaper ($60) — more powerful DAW if you want advanced features
- iZotope RX ($$$) — the gold standard for cleaning up mouth noise and background sounds
- ACX Check plugin (free) — verifies your files meet Audible's specs
- Mp3tag or Kid3 (free) — batch metadata editing
And honestly? For format conversion, chapter splitting, or quick batch jobs, having a fast audio converter that doesn't make you install anything is clutch. You're already spending hours on the narration and editing—don't add more software headaches.
How Long Does This All Take?
Let's say you're narrating a 60,000-word book. That's roughly 6-7 hours of finished audio. Here's the real time breakdown:
- Recording: 10-12 hours (you'll do retakes, breaks, etc.)
- Editing: 12-28 hours (2-4x the finished length)
- Proofing: 6-7 hours (one full listen-through, pausing to mark issues)
- Exporting and metadata: 2-3 hours
Total: 30-50 hours for a ~7-hour audiobook. That's why audiobook narrators charge $200-400 per finished hour for indie projects. It's not just reading—it's a production pipeline.
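You can rough out the same math for your own book. The sketch below reuses the ratios from the breakdown above. The words-per-hour figure is an assumption (about 155 words per minute), and the recording ratio is inferred from the 10-12 hours quoted for 6-7 finished hours, so treat the output as a ballpark.

```python
WORDS_PER_FINISHED_HOUR = 9300       # assumption: roughly 155 words per minute
RECORDING_RATIO = 1.7                # inferred: 10-12 h recording for 6-7 h finished
EDITING_RATIO = (2.0, 4.0)           # 2-4x the finished length
PROOFING_RATIO = 1.0                 # about one finished-length pass
EXPORT_HOURS = (2.0, 3.0)            # exporting and metadata
RATE_PER_FINISHED_HOUR = (200, 400)  # USD, the indie range quoted above


def production_estimate(word_count):
    """Estimate finished length, total working hours, and narrator fee."""
    finished = word_count / WORDS_PER_FINISHED_HOUR
    low = finished * (RECORDING_RATIO + EDITING_RATIO[0] + PROOFING_RATIO) + EXPORT_HOURS[0]
    high = finished * (RECORDING_RATIO + EDITING_RATIO[1] + PROOFING_RATIO) + EXPORT_HOURS[1]
    return {
        "finished_hours": finished,
        "total_hours": (low, high),
        "narrator_fee_usd": tuple(finished * rate for rate in RATE_PER_FINISHED_HOUR),
    }
```

For 60,000 words this lands at about 6.5 finished hours and a total somewhere in the low 30s to mid 40s, which lines up with the 30-50 hour range.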
What About AI Narration?
Look, AI voices are getting scary good. Google's WaveNet, ElevenLabs, and others can produce narration that sounds almost human. Some indie authors are already using them to cut costs.
But here's the thing: AI still struggles with pacing, emotion, and character voices. It can read a technical manual just fine. It falls apart with dialogue-heavy fiction or anything that needs nuance.
Plus, listeners notice. Audiobook communities are vocal about preferring human narrators. So while AI can handle the grunt work (maybe generating chapter splits or suggesting cuts), you're not replacing human narrators anytime soon.
Final Thoughts: It's a Craft, Not Just a Recording
Making an audiobook sound professional takes way more than a decent mic and a quiet room. It's editing out breaths, balancing levels, exporting in the right format, adding metadata, and doing a final quality pass.
If you're doing this yourself—whether you're an author narrating your own book or a freelance narrator building a portfolio—respect the process. It's tedious, but it's what separates amateur recordings from stuff people actually want to listen to for hours.
And if you need help managing the technical side—like converting files, splitting chapters, or batch-processing metadata—there are tools that make it way less painful. Because the last thing you want after 40 hours of narration work is to fight with audio software for another three.