LLM-Generated Passwords Expose Major Security Flaws with Predictability, Repetition, and Weakness

In Cybersecurity News – original news source: cybersecuritynews.com

Large language models, commonly known as LLMs, are increasingly being asked to generate passwords — and new research has shown that the passwords they produce are far weaker than they appear.

A password like G7$kL9#mQ2&xP4!w may look convincingly random, but it carries a fundamental flaw that standard password-strength tools consistently miss.

A Secure Password Generated by Nano Banana Pro (Source – Irregular)

The core problem lies in how LLMs actually work. Secure password generation relies on a cryptographically secure pseudorandom number generator, or CSPRNG, which selects characters from a truly uniform distribution — meaning each character has an equal chance of being picked.

LLMs, by contrast, are trained to predict the most likely next token based on what came before. That prediction process is, by design, fundamentally incompatible with true randomness.

Irregular analysts tested password generation across several major models — the latest versions of GPT, Claude, and Gemini — and identified clear, repeatable patterns across all results.

In 50 independent runs with Claude Opus 4.6, only 30 unique passwords appeared, and one sequence, G7$kL9#mQ2&xP4!w, was generated 18 times, i.e., in 36% of runs.

GPT-5.2 produced passwords that nearly all started with the letter “v,” while Gemini 3 Flash consistently produced passwords beginning with “K” or “k.” These are not minor quirks — they reflect predictable biases that an attacker could directly exploit.

The issue goes beyond ordinary users asking chatbots for help. Coding agents like Claude Code, Codex, and Gemini-CLI have been found generating passwords this way during software development tasks, sometimes without the developer ever requesting them.

In “vibe-coding” environments — where code is built and deployed without close review — these weak credentials can slip straight into production systems undetected.

How Weak Are These Passwords?

To understand just how weak these passwords are, researchers applied the Shannon entropy formula and used log-probability data pulled directly from the models.
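The underlying formula is standard: given a set of outcome probabilities, Shannon entropy is H = −Σ p·log2(p). A minimal sketch using the reported run counts as a toy distribution (the researchers' per-character, log-probability-based estimate is more involved; the split of the remaining runs below is an assumption for illustration):

```python
import math

def shannon_entropy_bits(probs):
    # H = -sum(p * log2(p)) over the outcome probabilities.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy distribution: one password seen in 18 of 50 runs, the other
# 32 runs assumed spread evenly over the 29 remaining passwords.
probs = [18 / 50] + [(32 / 50) / 29] * 29
print(round(shannon_entropy_bits(probs), 2))  # → 4.05 bits
```

Even under this generous even-spread assumption, the whole-password distribution carries only a handful of bits — nowhere near what a uniform random generator would produce.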

A properly built 16-character password is expected to carry around 98 bits of entropy — a measure of strength that makes brute-force cracking essentially impossible within any realistic timeframe.
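That baseline is simply length times bits per character: a password drawn uniformly from an alphabet of N symbols carries log2(N) bits per position. A quick check, assuming a 72-symbol alphabet (letters, digits, and a handful of specials):

```python
import math

def password_entropy_bits(length: int, alphabet_size: int) -> float:
    # Each uniformly chosen character contributes log2(|alphabet|) bits.
    return length * math.log2(alphabet_size)

print(round(password_entropy_bits(16, 72), 1))  # → 98.7 bits
```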

LLM-Generated Password Character Statistics (Source – Irregular)

Claude Opus 4.6’s passwords showed only an estimated 27 bits of entropy, and GPT-5.2’s 20-character passwords were even more concerning at roughly 20 bits — low enough to crack in seconds on a standard machine.

Estimated Entropy per Character Position (Source – Irregular)
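To see why roughly 20 bits is trivially crackable while 98 bits is not, compare worst-case search times; the 10^9 guesses-per-second rate below is an illustrative assumption, not a figure from the research:

```python
def worst_case_crack_seconds(entropy_bits: float, guesses_per_second: float) -> float:
    # Exhausting all 2^bits candidates at a fixed guess rate.
    return 2 ** entropy_bits / guesses_per_second

rate = 1e9  # hypothetical: one GPU attacking a fast hash
print(worst_case_crack_seconds(20, rate))  # ~0.001 seconds
print(worst_case_crack_seconds(98, rate))  # ~3.2e20 seconds, about 10^13 years
```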

Changing the temperature setting offered no solution. Running Claude at its maximum temperature of 1.0 still yielded the same repeated patterns, and reducing it to 0.0 caused the same password to appear every single time.

Researchers also found that LLM-generated password prefixes like K7#mP9 and k9#vL appear in public GitHub repositories and online technical documents.

Security teams should audit and rotate any credentials that AI tools or coding agents may have generated.

Developers should configure agents to use cryptographically secure methods, such as openssl rand or /dev/random, and review all AI-generated code for hardcoded passwords before deployment.
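In Python, for example, the standard-library `secrets` module wraps the OS CSPRNG and is a drop-in way to do this; the alphabet below is an illustrative choice, not a recommendation from the research:

```python
import secrets
import string

def generate_password(length: int = 16) -> str:
    # secrets draws from the OS CSPRNG (e.g. /dev/urandom), so every
    # character is selected from a uniform distribution.
    alphabet = string.ascii_letters + string.digits + "!#$%&*@^"
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(generate_password())
```

Unlike an LLM's next-token sampling, every call here is independent and uniform, so no prefix or character position is more likely than any other.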
