How to Transcribe Phone Calls with AI (2026)

Transcribing a Phone Call Is a Legal Question First, a Technical One Second

Most guides about transcribing phone calls jump straight to “step 1: open this app.” That order is wrong. Phone call recording is a legal-consent question first and a technical workflow second, and the rule that applies depends on where you are sitting and where the other person is sitting. In the United States, federal law (18 U.S.C. § 2511) sets one-party consent as the default, and 38 states and the District of Columbia follow it. The other 12 states (California, Florida, Illinois, Maryland, Massachusetts, Montana, Nevada, New Hampshire, Pennsylvania, Washington, plus Connecticut and Delaware in specific contexts) require all-party consent. Cross a state line during a recorded call and the stricter state’s rule typically applies. Cross a national border and a different statute applies entirely.

Once consent is handled, the technical workflow has gotten dramatically simpler in the last 18 months. iOS 18.1, shipped on October 28, 2024, added native call recording to every supported iPhone — the first time Apple has allowed it without a third-party app since the platform launched in 2007. Google’s Pixel Recorder app has had call recording since 2019. Most VoIP platforms have always offered it. The hard part is no longer capturing the audio; it is turning the resulting low-bitrate, often narrowband recording into a transcript that is actually useful. That is what this guide focuses on, with Atter AI handling the speech-to-text layer at 98.7% accuracy across 90+ languages.

The Audio Quality Floor: 8 kHz Versus 16 kHz

Phone audio has historically been sampled at 8 kHz and carried at 64 kbps using the G.711 codec, a standard built into PSTN switches in the 1970s and still in service on most landline and traditional cellular carriers. Modern HD Voice (VoLTE in the US since around 2014, and now the default on every major US carrier) raises the sample rate to 16 kHz using AMR-WB or Opus. The difference is audible: an 8 kHz sample rate cannot represent anything above 4 kHz (half the sample rate), which removes most of the brightness in a human voice and is the reason traditional phone calls sound “muffled” compared to a Zoom meeting at the same volume.
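The two numbers in that paragraph follow from simple arithmetic. The sketch below is illustrative only (the function names are ours): the highest frequency a digital signal can carry is half its sample rate, and G.711's 64 kbps comes from 8,000 samples per second at 8 bits each.

```python
def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2


def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bitrate of a mono, fixed-width sample stream such as G.711."""
    return sample_rate_hz * bits_per_sample


print(nyquist_limit_hz(8_000))    # narrowband ceiling: 4000.0 Hz
print(nyquist_limit_hz(16_000))   # wideband ceiling: 8000.0 Hz
print(pcm_bitrate_bps(8_000, 8))  # G.711: 64000 bits per second
```

The bitrate formula applies to G.711 only. AMR-WB and Opus are compressed codecs, so their bitrates are not a simple product of sample rate and sample width.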

This matters for transcription because speech-recognition systems are typically trained on wideband (16 kHz+) audio. A model that sees only 8 kHz at inference time loses several percentage points of accuracy compared to its wideband performance, particularly on names, technical jargon, and any speaker with an accent. Atter AI runs separate narrowband and wideband acoustic models and routes the audio automatically based on its sample rate — uploading an old 8 kHz call still yields strong results because the model was tuned for that signal, but you will get noticeably better output on a modern VoLTE or VoIP call.

When you record on an iPhone or Pixel, the saved file is typically already 16 kHz because the operating system captures the wideband downlink and uplink mix before any PSTN-side downsampling. When you pull a recording from a VoIP platform’s archive (RingCentral, Dialpad, Vonage, 8x8, Zoom Phone), check the export settings — most default to 16 kHz .mp3 or .wav, but some legacy tenants are still on 8 kHz.
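If you are unsure which kind of file you have, the sample rate is stored in the file header. Here is a minimal sketch using Python's standard `wave` module, which reads uncompressed PCM .wav files only (an .mp3 or .m4a export would need a different tool). The 16 kHz threshold and the function names are our own assumptions for illustration, not a description of how Atter AI routes audio internally.

```python
import wave


def classify_wav(path: str) -> str:
    """Label a PCM .wav file as narrowband or wideband by its sample rate."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
    # Assumed cut-off: 16 kHz and above counts as wideband.
    return "wideband" if rate >= 16_000 else "narrowband"


def write_silent_wav(path: str, sample_rate_hz: int, seconds: int = 1) -> None:
    """Write a silent 16-bit mono file, handy for trying classify_wav."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * sample_rate_hz * seconds)
```

Running `classify_wav` over a folder of VoIP exports is a quick way to spot a legacy tenant that is still saving 8 kHz audio.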

Method 1: iPhone Native Call Recording (iOS 18.1+)

The native iPhone capability shipped with iOS 18.1 on October 28, 2024 and is enabled on every iPhone running 18.1 or later, including iPhone XS and newer hardware. The mechanics:

During an active phone call, tap the record button in the upper-left corner of the in-call screen. Apple intentionally announces the recording to both parties; this is the consent UX, not a bug.
The announcement is verbal: “This call will be recorded.” In US states that require all-party consent, this announcement satisfies the notification requirement, but the called party still needs to remain on the line voluntarily, which is treated as implicit consent in case law.
When the call ends, the recording is saved to the Notes app (not Voice Memos) as an attachment with an auto-generated transcript and AI summary.
The audio file itself can be exported by long-pressing the attachment in Notes → Share → save to Files, AirDrop to a Mac, or send to any app.

For better transcription quality than Apple’s built-in pass, export the audio to Atter AI. Apple’s on-device transcription is English-centric and uses a smaller model than cloud transcription services; if the call involves any non-English content, technical terminology, or speakers with accents, the accuracy gap is significant. We covered the broader iPhone audio workflow in our iPhone Voice Memos guide.

Method 2: Pixel Phone (Built-In Since 2019)

Google’s Recorder app has supported call recording on Pixel phones since Pixel 4 launched in 2019, making it the first major-platform native call recording capability — five years before iPhone caught up. The mechanics:

During an active call, the Recorder shortcut appears in the Quick Settings overlay or directly in the call UI.
An audio announcement plays to the other party: “Hi, this call is being recorded.”
Saved recordings appear in the Recorder app with a live on-device transcript that you can search.
Tap any recording → Share → choose an app or save to Drive.

Pixel’s on-device transcription is English-only and uses Google’s Soli-era on-device speech model, which is good enough for memory search but not for production transcripts. For multilingual calls, customer interviews, or any recording you intend to share as a document, export the .m4a file and run it through Atter AI.

Other Android manufacturers have shipped call recording at various points: Samsung’s Phone app added it in One UI 5 in select markets but disabled it in the US for legal reasons, Xiaomi has it region-locked, and OnePlus removed it after OxygenOS 12. Outside the Pixel line, third-party apps are still the norm on Android.

Method 3: VoIP Platform Recording Exports

If the call happened on RingCentral, Dialpad, 8x8, Vonage, Zoom Phone, Microsoft Teams Phone, Google Voice (paid Workspace tier), or any modern business VoIP, the platform almost certainly recorded the call automatically based on the tenant’s policy. The recordings live in the platform’s call history and can be exported as .mp3 or .wav.

Standard export workflow (varies slightly per platform):

Open the platform’s admin portal or your personal call history view.
Filter by date, extension, or participant.
Select the call → Download recording (or Export for bulk operations).
Open Atter AI → Upload → drop the downloaded file.

For high-volume call centers and sales teams, several VoIP platforms expose webhook or API endpoints that push completed call recordings to a destination URL. Pointing those webhooks at an Atter AI workspace’s inbound endpoint is the cleanest way to keep every call transcribed without manual export. A typical Dialpad enterprise tenant generates 500–2,000 recordings per agent per month; doing this manually does not scale.
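To make the webhook pattern concrete, here is a minimal receiver sketch using only Python's standard library. Everything about the payload is an assumption: the `call_id` and `recording_url` field names are hypothetical, and each VoIP platform documents its own event schema. The sketch only validates the event and queues a job; downloading the file and handing it to a transcription service is left out.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from queue import Queue

# Jobs waiting to be downloaded and transcribed by a separate worker.
pending_jobs: Queue = Queue()


def parse_event(body: bytes):
    """Return a job dict, or None if the payload is unusable.

    The field names are hypothetical; adapt them to your platform's schema.
    """
    try:
        event = json.loads(body)
    except ValueError:
        return None
    if not isinstance(event, dict) or not event.get("recording_url"):
        return None
    return {"call_id": event.get("call_id"),
            "recording_url": event["recording_url"]}


class RecordingWebhook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        job = parse_event(self.rfile.read(length))
        if job is None:
            self.send_response(400)  # reject malformed events
        else:
            pending_jobs.put(job)
            self.send_response(202)  # accepted for later processing
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver locally; call this yourself to start it."""
    HTTPServer(("127.0.0.1", port), RecordingWebhook).serve_forever()
```

Answering with 202 and doing the slow work in a separate worker keeps the platform's delivery attempt short. A production receiver would also verify the sender, for example with whatever signature scheme the platform provides.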

Method 4: Third-Party Recording Apps

When neither side of the call is using a native-recording-capable phone, dedicated apps fill the gap. The major players in 2026:

TapeACall (iOS, Android) — 5M+ downloads, $9.99/month or $59.99/year. Three-way call mechanic: routes the call through a recording bridge that captures both legs. Saved files are .mp3 at 16 kHz.
Rev Call Recorder (iOS) — free recording, charges $0.25/minute for transcription. Same three-way-call mechanism as TapeACall.
Cube ACR (Android) — works on a subset of Android devices via VoIP integration; native cellular call recording is mostly broken on Android 11+ due to Google’s accessibility-API restrictions.
Otter (iOS, Android) — Otter does not record native cellular calls; it records via VoIP integrations (Zoom, Meet, Teams) and via on-device mic capture when the call is on speakerphone.

The “put the call on speakerphone and record with a Voice Memo on a second device” workaround still works in 2026 and produces surprisingly usable audio for one-off needs. The far-side speaker audio loses about 6 dB of level compared to direct line capture, but Atter AI’s diarization still separates the two voices because their acoustic signatures (close-mic’d local speaker vs. speaker-played remote speaker) are quite different.
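For a sense of scale, decibels convert to linear ratios with a standard formula. The short sketch below (helper names are ours) shows that a 6 dB drop means the remote voice arrives at roughly half the amplitude, or about a quarter of the power, of a direct line capture.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert a level change in dB to a linear power ratio."""
    return 10 ** (db / 10)


print(round(db_to_amplitude_ratio(-6), 3))  # about half the amplitude
print(round(db_to_power_ratio(-6), 3))      # about a quarter of the power
```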

Method 5: Conference Call Bridges and Old Recordings

For dial-in conference bridges (Free Conference Call, GoToMeeting Audio, Zoom Phone audio, traditional teleconferencing services), recordings are typically delivered as a single mono .mp3 or .wav with all participants on one track. Diarization is a bigger challenge than transcription here: a call with 6 participants on a single bridge channel produces 6 voices that Atter AI must separate from the audio signal alone, since no metadata indicates who spoke when.

Atter AI’s diarization handles up to 10 distinct speakers on a mono channel reliably, with accuracy degrading past that. For 12+ participant bridges (board calls, large town halls), the more useful output is the verbatim transcript with Speaker 1…Speaker N placeholders that you batch-rename based on the meeting roster after the fact.
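The batch rename is easy to script once you have mapped speaker numbers to names. This is a minimal sketch that assumes the transcript is plain text with labels written exactly as “Speaker N”; the function name and roster format are ours. It matches whole numbers, so renaming Speaker 1 does not also mangle Speaker 10, which a naive find-and-replace would do.

```python
import re


def rename_speakers(transcript: str, roster: dict[int, str]) -> str:
    """Replace 'Speaker N' placeholders with names from the roster."""

    def swap(match: re.Match) -> str:
        number = int(match.group(1))
        # Leave the placeholder untouched if the roster has no entry.
        return roster.get(number, match.group(0))

    return re.sub(r"\bSpeaker (\d+)\b", swap, transcript)


text = "Speaker 1: Shall we start?\nSpeaker 10: Yes.\nSpeaker 2: Go ahead."
print(rename_speakers(text, {1: "Ana", 10: "Bo"}))
```

Speakers missing from the roster keep their placeholder, so a partially identified call still produces a usable transcript.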

Old archives of phone call recordings — typical for call center compliance archives that have been running for years — often arrive as .au, .gsm, or 8-bit .wav files. Atter AI accepts all three, transcoding them to a transcription-friendly intermediate before running speech recognition. The accuracy floor on 8 kHz .gsm (used by older mobile-bridge call centers) is meaningfully lower than wideband, but still in the 92–95% range for clean recordings.
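If you would rather normalize an old archive yourself before uploading, one common route is ffmpeg. The sketch below only builds and runs the command. It assumes ffmpeg is installed and that your build can decode the source format, and the helper names are ours. It converts to 16-bit mono PCM .wav and leaves the sample rate alone, since upsampling an 8 kHz file does not restore the frequencies that were never captured.

```python
import subprocess
from pathlib import Path


def build_transcode_cmd(src: Path, dst: Path) -> list[str]:
    """ffmpeg arguments for a 16-bit mono PCM .wav at the source sample rate."""
    return [
        "ffmpeg",
        "-n",                 # never overwrite an existing output file
        "-i", str(src),       # input: .au, .gsm, 8-bit .wav, ...
        "-ac", "1",           # downmix to one channel
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]


def transcode(src: Path, dst: Path) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_transcode_cmd(src, dst), check=True)
```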

Two-Party Consent: The State-By-State Reality

The legal landscape is the part most guides skip. In 2026, the federal default under 18 U.S.C. § 2511 is one-party consent: you can record a call you are a participant in, as long as you yourself consent. But state law can be stricter, and 12 states require all parties to consent before a call can be lawfully recorded:

California (Cal. Penal Code § 632) — all parties, confidential communications
Florida (Fla. Stat. § 934.03) — all parties
Illinois (720 ILCS 5/14-2) — all parties, after the 2014 Illinois Supreme Court ruling and 2014 legislative amendment
Maryland (Md. Cts. & Jud. Proc. § 10-402) — all parties
Massachusetts (Mass. Gen. Laws ch. 272 § 99) — all parties, criminal penalty
Montana (Mont. Code § 45-8-213) — all parties
Nevada (Nev. Rev. Stat. § 200.620) — all parties, despite a 2024 case that briefly muddied this
New Hampshire (N.H. Rev. Stat. § 570-A:2) — all parties
Pennsylvania (18 Pa. C.S. § 5704) — all parties
Washington (Wash. Rev. Code § 9.73.030) — all parties
Connecticut — all parties for civil purposes, one party for criminal
Delaware — historically two-party, currently one-party after 2024 statutory clarification

For interstate calls, courts generally apply the stricter state’s law when either participant is in a two-party-consent state. The verbal announcement that iPhone and Pixel play automatically is designed to satisfy this notification requirement: the called party hearing the announcement and continuing the conversation is treated as implicit consent under the case law of most two-party states.
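The “stricter state wins” rule of thumb can be written down as a small lookup. This sketch encodes only the list in this article, treats Connecticut and Delaware as context-dependent rather than guessing, and uses names of our own choosing. It is a simplification of the general pattern described above and does not reproduce any statute.

```python
# States this article lists as requiring all-party consent.
ALL_PARTY = {"CA", "FL", "IL", "MD", "MA", "MT", "NV", "NH", "PA", "WA"}
# States this article describes as depending on context.
CONTEXT_DEPENDENT = {"CT", "DE"}


def consent_rule(*participant_states: str) -> str:
    """Apply the strictest rule among the states of everyone on the call."""
    codes = {state.upper() for state in participant_states}
    if codes & ALL_PARTY:
        return "all-party"
    if codes & CONTEXT_DEPENDENT:
        return "check-context"
    return "one-party"


print(consent_rule("TX", "CA"))  # all-party
print(consent_rule("TX", "NY"))  # one-party
print(consent_rule("NY", "CT"))  # check-context
```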

Outside the US, the EU’s GDPR requires a lawful basis and typically explicit consent for any call recording involving an EU resident. The UK’s PECR follows the same logic. Canada requires one-party consent under PIPEDA but with notification obligations for commercial recordings. Japan, Australia, and most of East Asia follow one-party consent. China’s Cybersecurity Law treats any recorded conversation as personal information requiring consent. None of this is legal advice — confirm with counsel for your jurisdiction before recording at scale.

Phone Call Native Transcription vs Atter AI

Capability	iPhone Built-In (iOS 18.1)	Pixel Recorder	Atter AI
Native call recording	Yes (iOS 18.1+)	Yes (Pixel 4+)	N/A (transcription layer)
Transcription languages	English-centric	English-only	90+ languages
Accuracy on clean call audio	~92–94%	~92–94%	98.7%
Speaker diarization	Two speakers, basic	Two speakers, basic	Up to 10 speakers
Cross-call search	None	Per-recording only	Full-text across archive
Export formats	TXT only	TXT only	PDF, DOCX, TXT, SRT, VTT, JSON
Length limit	No fixed limit	No fixed limit	No limit
Cost	Included with iPhone	Included with Pixel	$129.99 lifetime / $49.99/yr / $6.99/wk + 3-day free trial

For comparison with other audio sources, see how the same workflow handles audio files online and the slightly different signal characteristics on Zoom calls.

Phone Call Transcription FAQ

Is it legal for me to record and transcribe my own phone calls?

It depends on your jurisdiction. Under US federal law (18 U.S.C. § 2511), one-party consent is sufficient — you, as a participant, can record. But 12 US states require all parties to consent, and interstate calls typically follow the stricter state’s law. Most of the EU and UK require explicit consent. The verbal announcement iPhone (iOS 18.1+) and Pixel play automatically is designed to satisfy notification requirements where they exist. Always confirm with local counsel for high-stakes use.

How accurate is Atter AI on traditional 8 kHz phone audio?

Atter AI’s narrowband-tuned acoustic model achieves 92–95% accuracy on clean 8 kHz audio, depending on speaker accent and topic. On modern 16 kHz wideband audio (VoLTE, VoIP, recorded on iPhone or Pixel), accuracy reaches 98.7% — the same as on Zoom or in-person meetings.
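To see what those percentages mean on the page, convert them into expected errors. The sketch below assumes “accuracy” means the share of words transcribed correctly, which is a simplification (this article does not define the metric), and the helper name is ours.

```python
def expected_errors(word_count: int, accuracy: float) -> int:
    """Rough count of wrong words, assuming accuracy is per-word."""
    return round(word_count * (1 - accuracy))


for label, accuracy in [("wideband, 98.7%", 0.987),
                        ("narrowband, 95%", 0.95),
                        ("narrowband, 92%", 0.92)]:
    print(label, "->", expected_errors(1_000, accuracy), "errors per 1,000 words")
```

Under that assumption, a 1,000-word transcript has about 13 errors at 98.7% and between 50 and 80 at 92–95%.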

Can I transcribe a recording from a conference bridge with 8 participants?

Yes. Atter AI’s diarization handles up to 10 distinct speakers on a mono channel. For larger calls, the diarization degrades and you may want to rely on the verbatim transcript with placeholder speaker labels that you rename based on the meeting roster.

Does Atter AI work with TapeACall, Rev Call Recorder, and similar recorders?

Yes. All major call recorder apps export to standard formats (.mp3, .m4a, .wav). Upload directly to Atter AI — no manual conversion is needed. Atter AI accepts all common audio formats and re-encodes internally as needed.

Will Apple’s built-in transcription work for non-English calls?

Apple’s on-device transcription on iOS 18.1+ is English-centric with limited support for a handful of major languages. For genuinely multilingual calls — Mandarin, Cantonese, Japanese, Korean, Spanish with non-US accents, or any code-switching — export the audio file to Atter AI, which supports 90+ languages with full diarization.

Can I transcribe a phone call I recorded years ago in 8-bit .wav format?

Yes. Atter AI accepts .au, .gsm, 8-bit .wav, and other legacy formats common in older call center archives. The system transcodes to a transcription-friendly intermediate before running speech recognition. Accuracy is lower than on wideband recordings but still in the 92–95% range on clean audio.

Is recording a call via speakerphone with a Voice Memo legal in two-party-consent states?

The recording mechanism does not change the legal requirement — if the state requires all-party consent, you must obtain it before starting the recording, regardless of whether you use a built-in feature, a third-party app, or a second device’s Voice Memo. The verbal-announcement consent UX that iPhone and Pixel play is doing work that a Voice Memo capture does not do automatically.

How do I bulk-transcribe a year of call center recordings?

Use Atter AI’s bulk upload via folder or API. Most call platforms (RingCentral, Dialpad, 8x8) expose either bulk export or webhook delivery, both of which work with Atter AI’s workspace ingestion. A typical enterprise call center processing 1,000+ hours of recordings per month benefits from the API integration over manual upload.
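Before a bulk upload, it helps to inventory the archive and split it into manageable batches. The sketch below walks an export folder and groups audio files by batch. The extension list comes from the formats mentioned in this guide, while the function names and batch size are our own assumptions. The upload step itself is deliberately left out, because it depends on the ingestion method you choose.

```python
from pathlib import Path
from typing import Iterator

# Formats mentioned in this guide; extend as needed.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".au", ".gsm"}


def find_recordings(root: Path) -> list[Path]:
    """Recursively collect audio files under root, in a stable order."""
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )


def batches(items: list[Path], size: int) -> Iterator[list[Path]]:
    """Yield consecutive chunks of at most `size` files."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

A typical use would be `for batch in batches(find_recordings(Path("exports")), 50): ...`, with the loop body handing each batch to your uploader.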

