iPhone Voice Memos to Text: What iOS 18 Does (and Doesn't)

iOS 18 added native Voice Memos transcription, but it covers only 13 languages and skips speaker labels. Here’s how to push past those limits.

Voice Memos is the most-used audio recorder on the planet — preinstalled on every one of the roughly 2.2 billion active iPhones in 2026 — and for most of its history it produced an .m4a file that did absolutely nothing else. The September 2024 release of iOS 18 finally added an in-app Transcript view, but it shipped with hard ceilings most people only discover after the recording matters: a 13-language whitelist, on-device-only processing tied to specific iPhone hardware, and no export of the transcript text itself. Two years later in iOS 26, those ceilings are slightly higher but still real.

This guide covers what Apple’s built-in transcription actually does in 2026, where Voice Memos saves files on each of the four platforms it runs on (iPhone, iPad, Mac, Apple Watch), and how to get a full editable transcript with 98.7% accuracy in 90+ languages using AI transcription pipelines that work on every Voice Memo file regardless of iOS version.

What iPhone Voice Memos Actually Produces

Every recording is an .m4a file using AAC compression, mono, and one of two quality presets configured in Settings → Voice Memos → Audio Quality:

Setting | Sample rate | Bitrate | 60-minute file size
Compressed (default) | 32 kHz | ~32 kbps | ~14 MB
Lossless | 48 kHz | ~256 kbps | ~110 MB

The default Compressed setting is the trap most users walk into. At 32 kbps, the audio is fine for human listening but loses the high-frequency speech cues automatic speech recognition leans on. Switching to Lossless before an important interview costs about 100 MB per hour and raises transcription accuracy by 3–5 percentage points on every engine, including Apple’s own.
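The file sizes in the table follow directly from the bitrate. A minimal sketch of that arithmetic, assuming a constant bitrate and decimal megabytes (real files vary a little because the encoder's bitrate is approximate and the container adds overhead):

```python
def recording_size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Approximate size in MB (10^6 bytes) of a constant-bitrate recording."""
    bits = bitrate_kbps * 1000 * minutes * 60  # kilobits/s -> total bits
    return bits / 8 / 1_000_000                # bits -> bytes -> megabytes


# Nominal bitrates taken from the two Audio Quality presets above.
compressed = recording_size_mb(32, 60)
lossless = recording_size_mb(256, 60)

print(f"Compressed: {compressed:.1f} MB/hour")
print(f"Lossless:   {lossless:.1f} MB/hour")
print(f"Extra cost: {lossless - compressed:.1f} MB/hour")
```

The nominal figures come out at 14.4 MB and 115.2 MB per hour, which is where the roughly 100 MB per hour cost of switching to Lossless comes from.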

File naming follows three rules in order: (1) if Location Services is enabled for Voice Memos, the new recording is named after the GPS-resolved place (San Francisco, Home, Office); (2) otherwise it uses the previous recording’s name with an incremented suffix; (3) otherwise it falls back to New Recording. The result is that long-term Voice Memos libraries are full of files named New Recording 47 that nobody can identify without playing back.
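The three naming rules can be sketched as a small function. This is an illustration of the ordering described above, not Apple's implementation: the function name and parameters are hypothetical, and starting the suffix at 2 for a name with no number is an assumption.

```python
import re
from typing import Optional


def next_recording_name(place: Optional[str], previous_name: Optional[str]) -> str:
    """Pick a default name using the three rules, in order."""
    # Rule 1: a resolved place name wins when Location Services is enabled.
    if place:
        return place
    # Rule 2: reuse the previous recording's name with an incremented suffix.
    if previous_name:
        match = re.fullmatch(r"(.*\S)\s+(\d+)", previous_name)
        if match:
            base, number = match.group(1), int(match.group(2))
            return f"{base} {number + 1}"
        return f"{previous_name} 2"  # assumption: first suffix is 2
    # Rule 3: nothing to go on.
    return "New Recording"


print(next_recording_name(None, "New Recording 46"))
```

With no location and a previous file called New Recording 46, the sketch yields New Recording 47, which is how libraries fill up with near-identical names.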

What iOS 18+ Native Transcription Can and Cannot Do

The Transcript view that appeared in iOS 18 expanded slightly in iOS 26 but still has clear limits.

Languages. The on-device model supports roughly 13 languages in iOS 26: English (US, UK, Australia, Canada, India, Singapore, South Africa), Spanish (US, Mexico, Spain), Mandarin Chinese (Mainland, Taiwan), Cantonese, French (France, Canada), German (Germany, Switzerland, Austria), Italian, Japanese, Korean, Portuguese (Brazil, Portugal), Arabic (Saudi Arabia), Russian, and Turkish. Recordings in any other language (Vietnamese, Thai, Hindi, Hebrew, Polish, Dutch, the Nordic languages, every African language, every other Southeast Asian language) produce no transcript at all. The Transcript tab simply does not appear.

Hardware. On-device transcription requires Apple Neural Engine hardware from the A15 generation or newer: iPhone 13 family or newer, iPad mini 6 / iPad Air 5 / iPad Pro 2021 or newer, and Apple Silicon Macs (M1 onward). Older devices show the recording but never the Transcript view, even when the language is supported.

Export. The transcript text can be selected and copied paragraph by paragraph, but there is no built-in Export Transcript action. You cannot save it as .txt, .docx, .srt, or .vtt. You cannot share the transcript separately from the audio. The only way to extract the full text on iPhone is to long-press, Select All, copy, and paste — and you must do this per scrolled section.

Accuracy. Apple’s on-device model is faster than any cloud service (transcription happens in roughly real time as you record) but trails the best cloud engines by a meaningful margin. On clean studio audio in US English, the on-device model lands around 88–92% word accuracy; on iPhone-mic audio in a noisy café, accuracy drops into the high 70s. Atter AI sits at 98.7% on clean audio in any of its 90+ supported languages — the gap matters most for searchable archives and legal-grade transcripts.
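To make the percentages concrete, it can help to convert word accuracy into an expected number of wrong words. A rough sketch, assuming accuracy is simply the share of words transcribed correctly (the function name is illustrative):

```python
def expected_errors(word_count: int, accuracy_pct: float) -> int:
    """Expected number of misrecognized words at a given word accuracy."""
    return round(word_count * (100 - accuracy_pct) / 100)


# Accuracy figures quoted in the paragraph above, per 1,000 spoken words.
for label, accuracy in [("on-device, low end", 88), ("on-device, high end", 92), ("cloud", 98.7)]:
    print(f"{label}: about {expected_errors(1000, accuracy)} errors per 1,000 words")
```

Per 1,000 words, that is about 80 to 120 corrections at 88–92% accuracy against about 13 at 98.7%.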

Punctuation and speaker labels. The on-device transcript inserts basic punctuation but does no speaker diarization at all. No line is attributed to a speaker, so a two-person interview reads as one continuous monologue.

Method 1: Get a Native Transcript on iPhone (iOS 18+)

On a supported device with a supported language:

  1. Open Voice Memos and tap a recording.
  2. Tap the ≡ icon (three lines) in the upper-right of the recording card to open the Transcript view. If the icon is missing, the language or hardware is unsupported.
  3. The transcript appears as scrollable text synchronized to playback. Tap any word to jump to that timestamp.
  4. To copy, long-press → Select All → Copy. Paste into Notes, Mail, or any text app.

The Transcript view is also where the iOS 18+ Summarize with Apple Intelligence feature lives, when enabled. Summaries are short (3–6 bullets), generated entirely on-device, and supported in a subset of the transcription languages: US English, Mandarin, and a few others as of iOS 26.

Method 2: Get a Full Transcript with Atter AI

For everything Apple’s built-in transcript cannot do — unsupported languages, older hardware, file export, speaker labels, summaries longer than six bullets — the workflow is the same regardless of iPhone model:

  1. In Voice Memos, tap the recording → tap the More (...) button → Share → choose Atter AI if the app is installed, or Save to Files to upload manually.
  2. If uploading from Atter AI’s iPhone app, tap Import → Voice Memos and the app reads the recording directly from the Voice Memos library, no intermediate file needed.
  3. Transcription typically completes in 60–90 seconds for a 30-minute recording. Output supports PDF, DOCX, TXT, SRT, VTT, and JSON formats.
  4. The free 3-day trial covers this exact workflow; pricing details are in the comparison table below, and there’s no per-minute or per-file cap regardless of plan.
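For readers unsure what the subtitle formats in step 3 contain, here is a minimal sketch of how timestamped segments map onto SRT. The (start, end, text) tuple shape is a hypothetical input for illustration only; it is not Atter AI's export schema or API.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp SRT uses."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start_seconds, end_seconds, text) tuples as numbered SRT cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(cues)  # a blank line separates consecutive cues


# Hypothetical two-line transcript.
print(to_srt([(0.0, 2.5, "Thanks for joining."), (2.5, 6.04, "Happy to be here.")]))
```

VTT is structurally similar; its most visible differences are a WEBVTT header line and a period instead of a comma before the milliseconds.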

For long interviews where you need both transcript and summary, Atter AI’s summary length is configurable (one paragraph through full meeting-minutes format) rather than fixed at Apple Intelligence’s six bullets. The same pipeline that handles uploaded audio is the one covered in our audio-to-text guide and our podcast transcription guide; the engine is the same, and only the input source differs.

Method 3: Pull the .m4a Off the Device

When you’d rather not install another app on the phone, get the raw file to a computer first:

  • AirDrop to a nearby Mac. Voice Memos → recording → Share → AirDrop. The .m4a lands in ~/Downloads. Fastest path; works offline.
  • iCloud sync. Enable Settings → [Your Name] → iCloud → Voice Memos. Recordings appear in the Voice Memos app on every signed-in Mac and iPad. From the Mac app, drag the recording out of the sidebar to a Finder window to extract the .m4a.
  • Files app. On the iPhone, Voice Memos → Share → Save to Files → pick On My iPhone or any iCloud folder. The recording is now visible to other apps and to a Mac via iCloud Drive.
  • Email or Messages. The 25 MB Mail attachment limit covers Compressed-quality recordings up to roughly 100 minutes; Lossless caps out around 12 minutes. iMessage tolerates files up to 100 MB.

Once the .m4a is on a computer, drag it into Atter AI’s web uploader at the dashboard, or use the macOS app. Either path produces the same cloud-grade transcript.
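To see which recordings have landed on the Mac and which are small enough for the Mail and iMessage limits mentioned above, a short script along these lines may help. It is a sketch: the function name is illustrative, the limits are the 25 MB and 100 MB figures from this section treated as decimal megabytes, and the folder is assumed to be ~/Downloads.

```python
from pathlib import Path

MAIL_LIMIT_BYTES = 25 * 1_000_000       # Mail attachment limit quoted above
IMESSAGE_LIMIT_BYTES = 100 * 1_000_000  # iMessage limit quoted above


def list_voice_memos(folder: Path):
    """Return (file name, size in bytes, usable channels) per .m4a, newest first."""
    if not folder.is_dir():
        return []
    files = sorted(folder.glob("*.m4a"), key=lambda p: p.stat().st_mtime, reverse=True)
    report = []
    for path in files:
        size = path.stat().st_size
        channels = []
        if size <= MAIL_LIMIT_BYTES:
            channels.append("Mail")
        if size <= IMESSAGE_LIMIT_BYTES:
            channels.append("iMessage")
        report.append((path.name, size, channels))
    return report


if __name__ == "__main__":
    for name, size, channels in list_voice_memos(Path.home() / "Downloads"):
        fits = ", ".join(channels) or "neither (use AirDrop, iCloud, or Files)"
        print(f"{name}: {size / 1_000_000:.1f} MB, fits {fits}")
```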

Method 4: Apple Watch Recording

The Voice Memos complication on Apple Watch records straight from the Watch microphone with the screen off — useful for hallway conversations or quick reminders without pulling out the phone. The Watch records at 16 kHz mono (lower than iPhone’s 32 or 48 kHz), and recordings sync to the paired iPhone within 1–2 minutes of opening Voice Memos on the phone with both devices on Wi-Fi or via Bluetooth handoff.

The 16 kHz Watch recording is sufficient for speech but noticeably reduces transcription accuracy versus iPhone-mic audio. For high-stakes recordings, prefer the iPhone or a wired/Bluetooth mic into the iPhone. The 100-minute Apple Watch battery ceiling during continuous Voice Memo recording is another reason to default to the phone for anything over an hour.

Voice Memos Transcription Gotchas

iCloud sync can lag. Recordings created on an iPhone in airplane mode do not sync until the phone reconnects. If you AirDrop or share before the recording syncs to your Mac, you will get the file but the Mac’s local Transcript view may show “Generating transcript…” indefinitely because the on-device model on the Mac is processing a different copy than the iPhone.

Mid-recording phone calls truncate. If a call comes in mid-recording, Voice Memos pauses the recording and resumes after the call ends — but the recording is split into two files only on iOS 26+. On iOS 18 and earlier, the pause is silent and the resulting file omits the call duration with no marker.

Background noise removal is destructive. The Enhance Recording toggle in the recording editor uses an on-device model to remove background noise. The processed file overwrites the original unless you tap Duplicate first. For transcription purposes, the enhanced version is usually better; for archival or legal purposes, keep both.

The 2 GB ceiling. A single Voice Memo cannot exceed 2 GB. At Lossless quality, that is roughly 18 hours; at Compressed quality, roughly 138 hours. Recordings hitting the ceiling stop silently and the file is closed at whatever timestamp triggered the limit.
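The hour figures can be checked by dividing the ceiling by the hourly data rate. A small sketch, assuming decimal gigabytes and the nominal preset bitrates (32 kbps Compressed, 256 kbps Lossless):

```python
def hours_until_ceiling(ceiling_gb: float, bitrate_kbps: float) -> float:
    """Hours of continuous recording before a size ceiling is reached."""
    bytes_per_hour = bitrate_kbps * 1000 / 8 * 3600
    return ceiling_gb * 1_000_000_000 / bytes_per_hour


print(f"Lossless:   {hours_until_ceiling(2, 256):.1f} hours")
print(f"Compressed: {hours_until_ceiling(2, 32):.1f} hours")
```

At the nominal bitrates this gives a little over 17 hours for Lossless and just under 139 hours for Compressed, in line with the rounded figures above.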

Apple Intelligence summarization respects the language whitelist. If your recording is in a language Apple Intelligence does not support (Vietnamese, Hindi, Thai, etc.), the Summarize button is hidden, even on a device that runs Apple Intelligence in other languages. Atter AI summarization runs on 90+ languages with no whitelist.

Apple Native vs Atter AI

Capability | iOS Voice Memos Native | Atter AI
Accuracy on clean iPhone audio | ~88–92% | 98.7%
Languages supported | 13 (iOS 26) | 90+
Hardware required | A15+ Neural Engine | Any device with a browser
Speaker labels / diarization | None | Full, with rename
Export formats | None (copy-paste only) | PDF, DOCX, TXT, SRT, VTT, JSON
Summarization | 3–6 bullets, fixed | Configurable length, structured minutes
Searching across recordings | One recording at a time | Full-text indexed library
Cost | Free, requires recent iPhone | $6.99/wk · $49.99/yr · $129.99 lifetime · 3-day free trial

For meeting recordings, where Voice Memos is sometimes the only fallback because the host forgot to record in Zoom or Teams, pair this guide with our guide on how to transcribe meetings with AI for diarization and summary best practices that apply equally to a Voice Memos file.

iPhone Voice Memos Transcription FAQ

Why does my Voice Memos recording not show a Transcript tab?

Three possible reasons. (1) Your iPhone is older than iPhone 13 — the on-device speech model requires the A15 Neural Engine or newer. (2) The recording language is outside the 13-language whitelist Apple supports. (3) You are running iOS 17 or earlier, before the Transcript view shipped. Any of the three hides the icon entirely.
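The three checks can be written down as a quick diagnostic. This is a sketch built only from the requirements stated in this guide; the function, its parameters, and the use of a plain iPhone generation number are simplifications for illustration, not an Apple API.

```python
# The 13 languages this guide lists for iOS 26.
SUPPORTED_LANGUAGES = {
    "English", "Spanish", "Mandarin Chinese", "Cantonese", "French", "German",
    "Italian", "Japanese", "Korean", "Portuguese", "Arabic", "Russian", "Turkish",
}


def missing_transcript_reasons(iphone_generation: int, ios_major: int, language: str) -> list[str]:
    """List every reason the Transcript icon would be hidden; empty means none found."""
    reasons = []
    if iphone_generation < 13:
        reasons.append("hardware older than iPhone 13 (A15 Neural Engine)")
    if language not in SUPPORTED_LANGUAGES:
        reasons.append(f"{language} is outside the supported language list")
    if ios_major < 18:
        reasons.append("iOS 17 or earlier, before the Transcript view shipped")
    return reasons


# Hypothetical case: an iPhone 12 on iOS 18 with a Vietnamese recording.
print(missing_transcript_reasons(12, 18, "Vietnamese"))
```

More than one reason can apply at once, so fixing only one (for example, updating iOS on an iPhone 12) may still leave the icon hidden.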

Can I export the Voice Memos transcript as a text file?

Not natively. Apple offers no Export Transcript action in iOS 26. You can select-all and copy to paste into Notes or Mail, but the only way to get a .txt, .docx, .srt, or .vtt file is to run the audio through a transcription service like Atter AI.

Does iCloud sync the Voice Memos transcript or just the audio?

Just the audio. The transcript is regenerated on demand on each device the first time you open the Transcript view there. On older Macs or iPads that do not support the on-device model, the transcript never appears even though the audio syncs normally.

What languages does Voice Memos transcribe in 2026?

Roughly 13 in iOS 26: English (multiple regions), Spanish (US, Mexico, Spain), Mandarin (China, Taiwan), Cantonese, French (France, Canada), German, Italian, Japanese, Korean, Portuguese (Brazil, Portugal), Arabic (Saudi Arabia), Russian, and Turkish. Atter AI covers 90+ including Vietnamese, Thai, Hindi, Hebrew, Polish, Dutch, Swedish, Norwegian, Finnish, and most African and Southeast Asian languages.

Is the iPhone Voice Memos transcription accurate enough for journalism or legal use?

For headline-style notes, yes: on clean audio, Apple’s on-device model lands around 88–92%. For verbatim transcripts, court reporting, or any context where every word must be right, no. The 7–11 point gap to 98.7%-accurate cloud transcription compounds quickly across a one-hour interview: assuming a conversational pace of roughly 150 words per minute, that is about 9,000 words, so the gap works out to somewhere around 600 to 1,000 additional misheard words you have to find and fix.

Does Atter AI need internet access to transcribe a Voice Memo?

Yes. The Atter AI engine runs in the cloud, which is what lets it hold a higher accuracy ceiling across 90+ languages without relying on the iPhone’s hardware. Files are encrypted in transit, transcribed, and deleted from temporary storage after processing.

How do I record straight into a transcribable format without using Voice Memos?

The Atter AI iPhone app records and transcribes simultaneously, producing a transcript while you record. The original .m4a is preserved as a sibling file to the transcript. This avoids the export step entirely and works in all 90+ supported languages.

Can the Atter AI app read from my existing Voice Memos library?

Yes. The first time you grant Voice Memos access in iOS Settings, the Atter AI app lists every recording in your library sorted by date. Selecting one imports the underlying .m4a directly without you having to use the Share sheet or save to Files first.