How AI Text Detection Actually Works
- AI detectors use two approaches: statistical (perplexity, burstiness) and behavioral (opinion drift, specificity). Behavioral signals are harder to fake.
- Perplexity is the most-cited signal, but it's model-dependent and easily gamed by paraphrasing.
- The behavioral signals Content Trace weights most heavily reflect actual cognitive patterns - things that require a mind actively working through a problem.
- No detector is 100% accurate. Short texts, heavily edited AI content, and structured human writing all reduce reliability.
- A breakdown across 32 signals tells you far more than a single number - look at which sections score low, not just the aggregate.
I'll be honest - when I first started looking into AI detection, I assumed it was basically magic. You paste text, the algorithm does something opaque, a number comes out. I didn't question it too hard because the numbers seemed roughly right. Then I actually dug into how these tools work, and came away with a much more complicated opinion. Some of what I found was reassuring. Some of it wasn't.
The short version: AI detection works better than chance, and anyone who tells you it's either foolproof or useless is oversimplifying. The longer version is more interesting - and more useful if you actually want to understand what these scores mean.
Content Trace spreads its 32 signals across 8 weighted sections - from cognitive fingerprinting (16%) to statistical proxies (8%).
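To make the weighting idea concrete, here's a minimal sketch of how sectioned scores might roll up into one aggregate. Only the two section names and their weights come from the breakdown above; the scores and the aggregation rule are illustrative assumptions, not Content Trace's actual method.

```python
# Sketch: rolling per-section scores into a single weighted aggregate.
# 0.0 = strongly AI-like, 1.0 = strongly human-like (invented convention).
section_scores = {
    "cognitive_fingerprinting": 0.35,   # weak: little sign of live thinking
    "statistical_proxies": 0.80,        # strong: bursty, varied text
}
section_weights = {
    "cognitive_fingerprinting": 0.16,
    "statistical_proxies": 0.08,
}

def weighted_aggregate(scores, weights):
    """Weighted mean over whatever sections are present."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

print(round(weighted_aggregate(section_scores, section_weights), 3))
```

Notice what the aggregate does here: a weak cognitive score and a strong statistical score average out to a bland middle value. That's exactly why the per-section breakdown is more informative than the single number.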
The two approaches - and why most tools use the wrong one
There are two ways to detect AI text. The first is statistical: measure properties of the writing itself - how predictable the word choices are, how uniform the sentence lengths run, how much vocabulary varies. These signals emerge from how language models work at a mechanical level. LLMs generate text by picking statistically likely continuations at each step. That process leaves fingerprints.
The second approach is behavioral: look for the presence or absence of things that human writers do naturally. Do opinions shift mid-argument? Are there specific details that feel autobiographical rather than illustrative? Does the writer seem to be figuring something out, or delivering a pre-formed answer?
Most commercial detectors lean almost entirely on the statistical approach because it's faster to compute and easier to explain. That's a mistake, in my view - though I came to this conclusion slowly, because the statistical signals are real. The problem is they're the easiest to game. Humans who write in structured styles (academics, lawyers, technical writers) score badly on them. AI text that's been lightly paraphrased can dodge them entirely. Behavioral signals are harder to fake because they're rooted in something AI genuinely doesn't do: think while writing.
What perplexity actually measures - and where it breaks down
Perplexity is the signal everyone in this space talks about. The concept is straightforward: given what came before in a sentence, how surprising is the next word? Language models assign probabilities to every possible next token. Low perplexity means the text was predictable. High perplexity means the writing was full of unexpected choices.
Human writers have higher perplexity. Not because we're trying to be unpredictable, but because we're not optimizing for statistical safety. We use the weird word that fits better. We reference something specific that shifts the expected trajectory of the paragraph. These aren't conscious choices - they're just what happens when a real person's mind is engaged in putting thought into language.
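The arithmetic behind perplexity is simple enough to show directly: it's the exponential of the average negative log-probability per token. The probabilities below are made up for illustration, not real model output.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability.
    token_probs: the probability a language model assigned to each
    token that actually appeared (toy numbers here)."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

predictable = [0.9, 0.8, 0.85, 0.9]   # model saw every word coming
surprising  = [0.2, 0.05, 0.3, 0.1]   # unexpected word choices

print(perplexity(predictable))  # low: machine-smooth text
print(perplexity(surprising))   # high: full of surprises
```

A useful intuition: if every token had probability 0.5, perplexity is exactly 2 - the model was effectively choosing between two equally likely options at each step.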
The model-dependency problem nobody mentions
Here's the catch that took me a while to really internalize: perplexity is measured relative to a specific model. A sentence that's low-perplexity to GPT-4 might be high-perplexity to a smaller model. This means detector accuracy depends heavily on which model the detector was calibrated against, and as frontier models improve, older detectors become less reliable against new AI output. Most detector companies don't acknowledge this publicly.
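You can see the model-dependency with two toy unigram models. Everything below is invented - the vocabulary, the probabilities, the models; the point is only that identical text yields different perplexities depending on which model is doing the scoring.

```python
import math

def unigram_perplexity(tokens, model):
    """Perplexity of a token sequence under a unigram probability model.
    Unknown tokens get a tiny floor probability."""
    logp = sum(math.log(model.get(t, 1e-6)) for t in tokens)
    return math.exp(-logp / len(tokens))

text = ["the", "heuristic", "misfires"]

# Two invented models: the "large" one has seen rarer words more often.
large_model = {"the": 0.05, "heuristic": 1e-3, "misfires": 5e-4}
small_model = {"the": 0.05, "heuristic": 1e-5, "misfires": 5e-6}

# Same text, very different perplexities - so a threshold calibrated
# against one model misjudges text scored by the other.
print(unigram_perplexity(text, large_model))
print(unigram_perplexity(text, small_model))
```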
This is also why burstiness - the variance in sentence length and complexity across a text - tends to be a more durable signal than raw perplexity. Burstiness doesn't depend on any particular model; it's a property of the text itself. Human writing is naturally bursty. AI writing is unnaturally smooth. That pattern holds even as the underlying models evolve.
"Burstiness - the unpredictable swing between short and long sentences - is one of the hardest things for AI to fake, even intentionally." - Content Trace, Statistical Proxies signal
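A crude burstiness measure is easy to sketch - the standard deviation of sentence lengths. Real detectors use richer variants (clause complexity, punctuation rhythm), so treat this as the idea, not an implementation:

```python
import re
import statistics

def burstiness(text):
    """Std deviation of sentence lengths in words - a crude proxy."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths)

human = ("I checked. Twice, actually, because the first run looked wrong "
         "and I did not trust it. Nothing.")
ai = ("The results were verified carefully. The process was repeated twice. "
      "The outcome remained consistent throughout.")

print(burstiness(human))  # higher: short and long sentences mixed
print(burstiness(ai))     # lower: uniform sentence lengths
```

The human sample swings from a 2-word sentence to a 14-word one and back to a single word; the AI-style sample holds a steady 5 words per sentence. That rhythm difference is what the signal captures.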
The behavioral signals that actually matter
Here's what I find genuinely interesting about behavioral detection: the signals aren't arbitrary. They reflect something real about how human cognition shows up in writing - patterns that emerge from having a mind actively working through a problem.
Opinion drift and self-correction
When a human writer works through an argument, their thinking often changes in motion. They start a paragraph committed to one position and end it somewhere slightly different, because writing clarified something they hadn't noticed. Sometimes they catch themselves. Sometimes they just let the drift stand - because it's honest.
AI doesn't do this. The model commits to a conclusion before the first word is produced and executes toward it. The result is writing that's technically logical but lacks the texture of actual thought. No moment where the writer surprised themselves. No awkward pivot.
Specificity that feels accidentally true
Human writers reach for concrete details - a particular number, a specific place, a named person. These specifics serve two functions: they make the writing credible, and they make it personal. AI reaches for illustrative generalities because it has no real experiences to draw from. It can invent specifics, but invented specifics have a different texture - too clean, too perfectly illustrative. Real specifics are slightly awkward. That imperfect fit is part of what makes them feel true.
You can run any piece of writing through Content Trace to see how it scores on specificity, opinion drift, and all the other behavioral signals - free, no account required. It won't give you a verdict. It'll give you a breakdown, which is more useful.
Why no detector is 100% - and why that's okay
I want to push back on something I see constantly: the framing that AI detectors are only useful if they're perfect. That's not how we evaluate any other diagnostic tool.
A radiologist reading an X-ray isn't right 100% of the time. A fraud detection system isn't right 100% of the time. The question is whether the signal is better than chance, whether the error patterns are systematic in ways you can account for, and whether the tool is honest about its limitations. Good AI detectors can meet that bar - but only if they show their work instead of collapsing everything into a single number that implies certainty it doesn't have.
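For comparison, this is how any diagnostic gets evaluated in practice. The counts below are invented for illustration, but the metrics are the standard ones - and note that the false-positive rate (humans wrongly flagged) is a different question from overall accuracy:

```python
# Invented confusion-matrix counts for a hypothetical detector.
tp, fp, fn, tn = 85, 10, 15, 90   # AI flagged, human flagged, AI missed, human cleared

precision = tp / (tp + fp)  # when it says "AI", how often is it right?
recall    = tp / (tp + fn)  # how much AI text does it actually catch?
fpr       = fp / (fp + tn)  # how often do humans get wrongly flagged?

print(f"precision={precision:.2f} recall={recall:.2f} false-positive rate={fpr:.2f}")
```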
If you want to go deeper on what separates AI and human writing at a linguistic level, Why AI Writing Sounds Different gets into the specific patterns that trained readers pick up on intuitively. And if you're working with AI drafts and want to make them publishable, How to Humanize AI Content is the practical side of everything covered here.