
Voice cloning & deepfake calls

Three seconds of audio is enough to clone your CEO's voice. Here's how to stay composed when a "familiar" voice calls.

Voice as proof - that era is over

Between 2024 and 2026, AI voice cloning went from research toy to mass-market tool. Services that used to need weeks of training data can now copy a voice from three seconds of publicly available audio - a LinkedIn video, a podcast clip, a YouTube talk. The clone is good enough to pass as the real person on a phone call.

01

Familiar voice ≠ verified person

Voice alone has never been proof of identity, and it is weaker evidence now than ever. Verification happens through an independent channel, not through how someone sounds.

02

Call back on a known number

On any unusual request: hang up and call back yourself on a number you already had saved, not the number that called you.

03

Set up a code word

Agree on a code word with family and key colleagues. On critical calls: 'Say the word.'
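
If a team wants to check a code word in a tool rather than from memory (for example at a help desk), the check could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the normalization rule, and the choice of PBKDF2 with 200,000 iterations. The only point it makes is that the word is stored as a salted hash and compared in constant time, never kept in plain text.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, tune for your environment


def _normalize(word: str) -> bytes:
    # Ignore case and extra whitespace so "Blue  Heron" matches "blue heron".
    return " ".join(word.casefold().split()).encode("utf-8")


def enroll_code_word(code_word: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a code word that was agreed in person."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", _normalize(code_word), salt, ITERATIONS)
    return salt, digest


def matches_code_word(spoken: str, salt: bytes, digest: bytes) -> bool:
    """Check what the caller said against the stored hash."""
    candidate = hashlib.pbkdf2_hmac("sha256", _normalize(spoken), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how close a guess was.
    return hmac.compare_digest(candidate, digest)


salt, digest = enroll_code_word("Blue Heron")
print(matches_code_word("blue heron", salt, digest))    # caller knows the word
print(matches_code_word("blue herring", salt, digest))  # caller is guessing
```

A code word only helps if it was agreed over a trusted channel beforehand and is never said on a call the other side initiated without being asked.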

How a typical deepfake call unfolds

  1. Research: The attacker finds your role on LinkedIn and identifies your CEO/manager.
  2. Sample: The attacker collects 3-10 seconds of audio of the person to be impersonated, e.g. from a podcast clip.
  3. Clone: A cloud service builds a voice model in minutes, often for under USD 30.
  4. Call: The attacker calls you, speaking as the CEO in the cloned voice. Background noise (car, airport, meeting) masks small imperfections.
  5. Request: An urgent wire transfer, a password reset, a data handoff, always under time pressure.

Real case - 2024

A finance employee in Hong Kong transferred about USD 25 million after a video conference with the "CFO" and several "colleagues". Every other participant was a deepfake - voices, faces, micro-reactions included. No second-channel verification was attempted before the money moved.

A quick chat message or call to the real person over a second channel would have exposed the fraud.

Tells that even good AI clones can't fake yet

Even as voices improve, signals remain:

  • Breathing errors: AI voices breathe in unnatural spots, or not at all.
  • Suspiciously clean audio: Real calls have ambient noise. A "studio-clean" call is a flag, although attackers may add background noise on purpose to hide flaws.
  • Delayed reaction to interruption: Clones respond half a beat late, or repeat themselves mechanically.
  • Inability to improvise: Ask about a shared memory, an insider detail. Clones fail at spontaneous context.

Three moves when a "known" call gets unusual

  1. Hang up politely: "Let me call you right back." No pretext needed.
  2. Verify on a second channel: Call the known number, ping Slack/Teams, walk over.
  3. Act only after verification: Even under pressure. Real bosses understand caution; scammers push.
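
The three moves above can be written down as a simple gate for a payment or help-desk workflow. This is a minimal sketch under stated assumptions: `KNOWN_CONTACTS`, `Request`, `Verification`, and `may_act` are hypothetical names, and the phone numbers are fictional. It treats a verification as independent only if you reached the person in person or via a contact that was saved before the call.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical directory of contacts saved BEFORE any suspicious call.
KNOWN_CONTACTS = {
    "ceo": {"phone": "+1-202-555-0100", "chat": "@ceo"},
}


@dataclass
class Request:
    claimed_identity: str  # who the caller says they are
    inbound_number: str    # caller ID; logged only, never trusted
    action: str            # e.g. "wire transfer"


@dataclass
class Verification:
    channel: str       # "phone", "chat", or "in_person"
    contact_used: str  # number or handle YOU dialed or messaged
    confirmed: bool    # did the real person confirm the request?


def is_independent(request: Request, verification: Verification) -> bool:
    """True if the check went through a contact from the saved directory."""
    if verification.channel == "in_person":
        return True
    saved = KNOWN_CONTACTS.get(request.claimed_identity, {})
    return saved.get(verification.channel) == verification.contact_used


def may_act(request: Request, verification: Optional[Verification]) -> bool:
    """Act only after a confirmed check on an independent channel."""
    if verification is None:
        return False
    return verification.confirmed and is_independent(request, verification)


req = Request("ceo", "+1-202-555-0147", "wire transfer")
print(may_act(req, None))                                         # no check yet
print(may_act(req, Verification("phone", "+1-202-555-0147", True)))  # called the caller back
print(may_act(req, Verification("phone", "+1-202-555-0100", True)))  # called the saved number
```

Note that the caller ID plays no part in the decision: a number supplied by the call itself is never accepted as the verification contact.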

Request for discretion

'Don't tell anyone.' Classic manipulation - real instructions tolerate transparency.

Voice perfect, no personal detail

Clones imitate voice, not spontaneous shared memory.

Pressure to act on the call

'Do it now, I'll stay on the line.' Legitimate contacts accept callbacks.

Caller ID matches - voice slightly 'off'

Spoofing the number is trivial. Voice + number together isn't proof.
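
The four warning signs above could also be kept as a short checklist, for example in a call-handling script. The sketch below is illustrative only: the keys, the function names, and the rule that a single warning sign is enough to stop are assumptions, not part of the guidance itself.

```python
# Warning signs from the section above, keyed by hypothetical short names.
RED_FLAGS = {
    "asks_for_discretion": "Request for discretion",
    "no_personal_detail": "Voice perfect, no personal detail",
    "pressure_to_act_now": "Pressure to act on the call",
    "caller_id_ok_voice_off": "Caller ID matches, voice slightly 'off'",
}


def triggered_flags(observations: dict[str, bool]) -> list[str]:
    """Return the labels of all warning signs that were observed."""
    return [label for key, label in RED_FLAGS.items() if observations.get(key, False)]


def next_step(observations: dict[str, bool]) -> str:
    # Assumption: one warning sign is already enough to stop and verify.
    if triggered_flags(observations):
        return "hang up and verify on a second channel"
    return "no warning sign noted; still verify any unusual request"


call = {"asks_for_discretion": True, "pressure_to_act_now": True}
print(triggered_flags(call))
print(next_step(call))
```

An empty checklist result is not a clearance: the absence of warning signs says nothing about who is actually on the line.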

The simple rule

A voice on the phone is a hint of identity - not proof. Proof requires a second, independent channel.

This rule held before AI. It is essential now.
