How AI Text Detection Actually Works
- AI detectors use two approaches: statistical (perplexity, burstiness) and behavioral (opinion drift, specificity). Behavioral signals are harder to fake.
- Perplexity is the most-cited signal, but it's model-dependent and easily gamed by paraphrasing.
- The behavioral signals Content Trace weights most heavily reflect actual cognitive patterns - things that require a mind actively working through a problem.
- No detector is 100% accurate. Short texts, heavily edited AI content, and structured human writing all reduce reliability.
- A breakdown across 32 signals tells you far more than a single number - look at which sections score low, not just the aggregate.
I'll be honest - when I first started looking into AI detection, I assumed it was basically magic. You paste text, the algorithm does something opaque, a number comes out. I didn't question it too hard because the numbers seemed roughly right. Then I actually dug into how these tools work, and came away with a much more complicated opinion. Some of what I found was reassuring. Some of it wasn't.
The short version: AI detection works better than chance, and anyone who tells you it's either foolproof or useless is oversimplifying. The longer version is more interesting - and more useful if you actually want to understand what these scores mean.
Content Trace spreads its 32 signals across 8 weighted sections - from cognitive fingerprinting (16%) to statistical proxies (8%).
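To make the weighting idea concrete, here's a minimal sketch of how sectioned scores might roll up into one aggregate. Only the two section names and their weights come from the breakdown above; the scores and the aggregation rule are illustrative assumptions, not Content Trace's actual method.

```python
# Sketch: rolling per-section scores into a single weighted aggregate.
# 0.0 = strongly AI-like, 1.0 = strongly human-like (invented convention).
section_scores = {
    "cognitive_fingerprinting": 0.35,   # weak: little sign of live thinking
    "statistical_proxies": 0.80,        # strong: bursty, varied text
}
section_weights = {
    "cognitive_fingerprinting": 0.16,
    "statistical_proxies": 0.08,
}

def weighted_aggregate(scores, weights):
    """Weighted mean over whatever sections are present."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

print(round(weighted_aggregate(section_scores, section_weights), 3))
```

Notice what the aggregate does here: a weak cognitive score and a strong statistical score average out to a bland middle value. That's exactly why the per-section breakdown is more informative than the single number.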
The two approaches - and why most tools use the wrong one
There are two ways to detect AI text. The first is statistical: measure properties of the writing itself - how predictable the word choices are, how uniform the sentence lengths run, how much vocabulary varies. These signals emerge from how language models work at a mechanical level. LLMs generate text by picking statistically likely continuations at each step. That process leaves fingerprints.
The second approach is behavioral: look for the presence or absence of things that human writers do naturally. Do opinions shift mid-argument? Are there specific details that feel autobiographical rather than illustrative? Does the writer seem to be figuring something out, or delivering a pre-formed answer?
Most commercial detectors lean almost entirely on the statistical approach because it's faster to compute and easier to explain. That's a mistake, in my view - though I came to this conclusion slowly, because the statistical signals are real. The problem is they're the easiest to game. Humans who write in structured styles (academics, lawyers, technical writers) score badly on them. AI text that's been lightly paraphrased can dodge them entirely. Behavioral signals are harder to fake because they're rooted in something AI genuinely doesn't do: think while writing.
What perplexity actually measures - and where it breaks down
Perplexity is the signal everyone in this space talks about. The concept is straightforward: given what came before in a sentence, how surprising is the next word? Language models assign probabilities to every possible next token. Low perplexity means the text was predictable. High perplexity means the writing was full of unexpected choices.
Human writers have higher perplexity. Not because we're trying to be unpredictable, but because we're not optimizing for statistical safety. We use the weird word that fits better. We reference something specific that shifts the expected trajectory of the paragraph. These aren't conscious choices - they're just what happens when a real person's mind is engaged in putting thought into language.
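The arithmetic behind perplexity is simple enough to show directly: it's the exponential of the average negative log-probability per token. The probabilities below are made up for illustration, not real model output.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability.
    token_probs: the probability a language model assigned to each
    token that actually appeared (toy numbers here)."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

predictable = [0.9, 0.8, 0.85, 0.9]   # model saw every word coming
surprising  = [0.2, 0.05, 0.3, 0.1]   # unexpected word choices

print(perplexity(predictable))  # low: machine-smooth text
print(perplexity(surprising))   # high: full of surprises
```

A useful intuition: if every token had probability 0.5, perplexity is exactly 2 - the model was effectively choosing between two equally likely options at each step.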
The model-dependency problem nobody mentions
Here's the catch that took me a while to really internalize: perplexity is measured relative to a specific model. A sentence that's low-perplexity to GPT-4 might be high-perplexity to a smaller model. This means detector accuracy depends heavily on which model the detector was calibrated against, and as frontier models improve, older detectors become less reliable against new AI output. Most detector companies don't acknowledge this publicly.
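You can see the model-dependency with two toy unigram models. Everything below is invented - the vocabulary, the probabilities, the models; the point is only that identical text yields different perplexities depending on which model is doing the scoring.

```python
import math

def unigram_perplexity(tokens, model):
    """Perplexity of a token sequence under a unigram probability model.
    Unknown tokens get a tiny floor probability."""
    logp = sum(math.log(model.get(t, 1e-6)) for t in tokens)
    return math.exp(-logp / len(tokens))

text = ["the", "heuristic", "misfires"]

# Two invented models: the "large" one has seen rarer words more often.
large_model = {"the": 0.05, "heuristic": 1e-3, "misfires": 5e-4}
small_model = {"the": 0.05, "heuristic": 1e-5, "misfires": 5e-6}

# Same text, very different perplexities - so a threshold calibrated
# against one model misjudges text scored by the other.
print(unigram_perplexity(text, large_model))
print(unigram_perplexity(text, small_model))
```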
This is also why burstiness - the variance in sentence length and complexity across a text - tends to be a more durable signal than raw perplexity. Burstiness doesn't depend on any particular model; it's a property of the text itself. Human writing is naturally bursty. AI writing is unnaturally smooth. That pattern holds even as the underlying models evolve.
"Burstiness - the unpredictable swing between short and long sentences - is one of the hardest things for AI to fake, even intentionally." - Content Trace, Statistical Proxies signal
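A crude burstiness measure is easy to sketch - the standard deviation of sentence lengths. Real detectors use richer variants (clause complexity, punctuation rhythm), so treat this as the idea, not an implementation:

```python
import re
import statistics

def burstiness(text):
    """Std deviation of sentence lengths in words - a crude proxy."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths)

human = ("I checked. Twice, actually, because the first run looked wrong "
         "and I did not trust it. Nothing.")
ai = ("The results were verified carefully. The process was repeated twice. "
      "The outcome remained consistent throughout.")

print(burstiness(human))  # higher: short and long sentences mixed
print(burstiness(ai))     # lower: uniform sentence lengths
```

The human sample swings from a 2-word sentence to a 14-word one and back to a single word; the AI-style sample holds a steady 5 words per sentence. That rhythm difference is what the signal captures.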
The behavioral signals that actually matter
Here's what I find genuinely interesting about behavioral detection: the signals aren't arbitrary. They reflect something real about how human cognition shows up in writing - patterns that emerge from having a mind actively working through a problem.
Opinion drift and self-correction
When a human writer works through an argument, their thinking often changes in motion. They start a paragraph committed to one position and end it somewhere slightly different, because writing clarified something they hadn't noticed. Sometimes they catch themselves. Sometimes they just let the drift stand - because it's honest.
AI doesn't do this. The model commits to a conclusion before the first word is produced and executes toward it. The result is writing that's technically logical but lacks the texture of actual thought. No moment where the writer surprised themselves. No awkward pivot.
Specificity that feels accidentally true
Human writers reach for concrete details - a particular number, a specific place, a named person. These specifics serve two functions: they make the writing credible, and they make it personal. AI reaches for illustrative generalities because it has no real experiences to draw from. It can invent specifics, but invented specifics have a different texture - too clean, too perfectly illustrative. Real specifics are slightly awkward. That imperfect fit is part of what makes them feel true.
You can run any piece of writing through Content Trace to see how it scores on specificity, opinion drift, and all the other behavioral signals - free, no account required. It won't give you a verdict. It'll give you a breakdown, which is more useful.
Why no detector is 100% - and why that's okay
I want to push back on something I see constantly: the framing that AI detectors are only useful if they're perfect. That's not how we evaluate any other diagnostic tool.
A radiologist reading an X-ray isn't right 100% of the time. A fraud detection system isn't right 100% of the time. The question is whether the signal is better than chance, whether the error patterns are systematic in ways you can account for, and whether the tool is honest about its limitations. Good AI detectors can meet that bar - but only if they show their work instead of collapsing everything into a single number that implies certainty it doesn't have.
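For comparison, this is how any diagnostic gets evaluated in practice. The counts below are invented for illustration, but the metrics are the standard ones - and note that the false-positive rate (humans wrongly flagged) is a different question from overall accuracy:

```python
# Invented confusion-matrix counts for a hypothetical detector.
tp, fp, fn, tn = 85, 10, 15, 90   # AI flagged, human flagged, AI missed, human cleared

precision = tp / (tp + fp)  # when it says "AI", how often is it right?
recall    = tp / (tp + fn)  # how much AI text does it actually catch?
fpr       = fp / (fp + tn)  # how often do humans get wrongly flagged?

print(f"precision={precision:.2f} recall={recall:.2f} false-positive rate={fpr:.2f}")
```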
If you want to go deeper on what separates AI and human writing at a linguistic level, Why AI Writing Sounds Different gets into the specific patterns that trained readers pick up on intuitively. And if you're working with AI drafts and want to make them publishable, How to Humanize AI Content is the practical side of everything covered here.