Synthetic Identity Engineering: a working definition

Most synthetic personas are costumes.

You describe a character in a system prompt — age, profession, personality adjectives — and the model wears it. It works for demos. It does not hold up the moment the conversation puts real pressure on the identity.

Ask for a cost-benefit analysis when the persona is supposed to be emotionally volatile. Ask for validation when the persona is supposed to be defensive. The costume slips. The model apologises, accommodates, and breaks character. Every time.

This is not a prompt quality problem. It is an architecture problem.

What holds a real person together

Real humans do not have system prompts. They have belief systems — structured representations of how the world works, what they value, what they fear, and who they are in relation to others. Those beliefs do not reset between conversations. They accumulate. They interact. They constrain the range of responses a person will produce under pressure.

When Carlos — a burned-out executive — is threatened, he does not get emotional. He asks for a cost-benefit analysis. He moves the conversation to operational ground. That response is not a personality quirk. It is the output of a belief structure that has learned, over a long career, that emotional exposure is a liability and logical reframing is a defense.

When Lucía — an anxious partner — faces the same threat, she apologises. She asks what she can do to fix it. She defers. That response is also the output of a belief structure — one built around the idea that relationships are conditional and her position in them is fragile.

Same threat. Two opposite responses. Not because of different system prompts. Because of different identity architectures.

The definition

Synthetic Identity Engineering is the practice of constructing synthetic human identities with persistent psychological cores — belief systems, defense mechanisms, emotional regulation patterns, and identity stability under pressure — that produce coherent, differentiated behaviour across interaction contexts.

The key word is engineering. Not describing. Not prompting. Engineering.

A system prompt describes a persona. Synthetic Identity Engineering builds one.

The technical implementation

At StrataSynth, this is implemented through four interacting systems:

PsycheGraph — a structured schema for the psychological architecture of a synthetic human. Belief nodes, emotional regulation profiles, relationship maps, life timeline. Not a flat description — a graph with causal relationships.
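To make "a graph with causal relationships" concrete, here is a minimal sketch of what such a structure could look like. The class names, field names, and example beliefs are hypothetical illustrations, not StrataSynth's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class BeliefNode:
    """One belief held by the persona (hypothetical structure)."""
    belief_id: str
    statement: str
    strength: float  # 0.0 (barely held) to 1.0 (core conviction)


@dataclass
class PsycheGraphSketch:
    """Illustrative graph: belief nodes plus directed causal edges."""
    beliefs: dict[str, BeliefNode] = field(default_factory=dict)
    # edges[cause_id] lists the beliefs that the cause reinforces
    edges: dict[str, list[str]] = field(default_factory=dict)

    def add_belief(self, node: BeliefNode) -> None:
        self.beliefs[node.belief_id] = node
        self.edges.setdefault(node.belief_id, [])

    def add_cause(self, cause_id: str, effect_id: str) -> None:
        if cause_id not in self.beliefs or effect_id not in self.beliefs:
            raise KeyError("both beliefs must exist before linking them")
        self.edges[cause_id].append(effect_id)

    def downstream(self, belief_id: str) -> list[str]:
        """Return every belief reachable from belief_id along causal edges."""
        seen: list[str] = []
        stack = list(self.edges.get(belief_id, []))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.edges.get(current, []))
        return seen


# Made-up beliefs for an anxious-partner persona.
lucia = PsycheGraphSketch()
lucia.add_belief(BeliefNode("conditional", "Relationships are conditional", 0.9))
lucia.add_belief(BeliefNode("fragile", "My position is fragile", 0.8))
lucia.add_belief(BeliefNode("must_fix", "I must repair conflict quickly", 0.7))
lucia.add_cause("conditional", "fragile")
lucia.add_cause("fragile", "must_fix")
print(lucia.downstream("conditional"))
```

The point of the graph form is that a change to one belief has traceable consequences: a flat description cannot say which beliefs sit downstream of which.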

Belief Engine — tracks how beliefs evolve turn by turn. When Lucía hears a threat, her belief about her own competence updates. That update changes what she says next. The belief state is tracked, not inferred.
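"Tracked, not inferred" can be sketched as an explicit state that is updated once per turn and stored. The event labels, the delta table, and the clamping rule below are assumptions made for illustration; the actual update rule is not described here.

```python
# Hypothetical per-event adjustments to belief strengths (illustrative values).
EVENT_DELTAS = {
    "threat": {"self_competence": -0.2, "relationship_security": -0.3},
    "reassurance": {"self_competence": 0.1, "relationship_security": 0.2},
}


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Keep a belief strength inside the [low, high] range."""
    return max(low, min(high, value))


def update_beliefs(state: dict[str, float], event: str) -> dict[str, float]:
    """Return a new belief state after one conversational event."""
    deltas = EVENT_DELTAS.get(event, {})
    return {
        name: round(clamp(strength + deltas.get(name, 0.0)), 3)
        for name, strength in state.items()
    }


def run_turns(initial: dict[str, float], events: list[str]) -> list[dict[str, float]]:
    """Record the belief state after every turn so it can be inspected later."""
    history = [dict(initial)]
    for event in events:
        history.append(update_beliefs(history[-1], event))
    return history


history = run_turns(
    {"self_competence": 0.5, "relationship_security": 0.4},
    ["threat", "threat", "reassurance"],
)
for turn, state in enumerate(history):
    print(turn, state)
```

Because every intermediate state is kept, the belief trajectory can be audited turn by turn instead of being guessed from the generated text.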

Internal Defense Mechanisms — profile-specific patterns for handling social pressure. Carlos has high identity rigidity. Lucía has high adaptability and low boundary enforcement. These are parameters, not adjectives.
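"Parameters, not adjectives" might look like the sketch below. The parameter names echo the ones just mentioned, but the numeric values, thresholds, and response labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefenseProfile:
    """Numeric defense parameters in the range 0.0 to 1.0 (illustrative)."""
    identity_rigidity: float
    adaptability: float
    boundary_enforcement: float


def respond_to_threat(profile: DefenseProfile) -> str:
    """Pick a response strategy from parameters alone (assumed thresholds)."""
    if profile.identity_rigidity >= 0.7:
        # Rigid identity: move the exchange to operational ground.
        return "reframe_logically"
    if profile.adaptability >= 0.7 and profile.boundary_enforcement <= 0.3:
        # Adaptable with weak boundaries: apologise and defer.
        return "appease"
    return "hold_position"


# Made-up values for the two personas discussed earlier.
carlos = DefenseProfile(identity_rigidity=0.9, adaptability=0.2, boundary_enforcement=0.8)
lucia = DefenseProfile(identity_rigidity=0.3, adaptability=0.9, boundary_enforcement=0.1)

print(respond_to_threat(carlos))  # reframe_logically
print(respond_to_threat(lucia))   # appease
```

The same threat goes in, and the two profiles produce different strategies without a single adjective in the prompt.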

SHQI (Synthetic Human Quality Index): twelve deterministic metrics that evaluate psychological coherence per turn without using an LLM as judge. Three of the twelve are belief_consistency, identity_stability, and behavioral_entropy. All are computed with numpy and sentence-transformers, so the same input always yields the same score.
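The sketch below shows how two metrics of this kind could be computed deterministically. The formulas are assumptions, not StrataSynth's definitions: identity_stability is taken here as the mean cosine similarity between consecutive turn embeddings, and behavioral_entropy as the Shannon entropy of response-strategy labels. It uses only the standard library so it runs without dependencies, and the embedding vectors are hand-written stand-ins for sentence-transformers output.

```python
import math
from collections import Counter


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identity_stability(embeddings: list[list[float]]) -> float:
    """Mean cosine similarity between consecutive turns (assumed definition)."""
    pairs = list(zip(embeddings, embeddings[1:]))
    if not pairs:
        return 1.0  # a single turn cannot drift
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def behavioral_entropy(labels: list[str]) -> float:
    """Shannon entropy in bits over response labels (assumed definition)."""
    total = len(labels)
    probabilities = [count / total for count in Counter(labels).values()]
    return sum(p * math.log2(1 / p) for p in probabilities)


# Stand-in embeddings: the third turn is orthogonal to the first two.
turns = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(identity_stability(turns))
print(behavioral_entropy(["reframe", "reframe", "appease", "appease"]))
```

No sampling and no judge model are involved, which is what makes a score of this shape reproducible.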

Why the term matters

The space of “synthetic personas for AI development” is currently described in inconsistent and often misleading ways. Synthetic users. AI personas. Simulated agents. Digital twins. None of these names captures what is actually distinct about building identities that hold.

Synthetic Identity Engineering is a specific practice with specific technical requirements. It is different from general synthetic data generation. It is different from persona-based prompt engineering. It is different from role-play simulation.

It needs a name because the teams building serious conversational AI systems — the ones who care about coherence, not just coverage — deserve a term precise enough to build on.

See it in the data

The four public datasets on Hugging Face include full psychological metadata per turn: belief state, SHQI scores, ground truth labels, relationship trajectory markers.

Load them directly:

# Requires the Hugging Face `datasets` package (pip install datasets).
from datasets import load_dataset
ds = load_dataset("StrataSynth/stratasynth-agent-stress-test")

The stratasynth-agent-stress-test dataset is the most direct demonstration — jealousy escalation, performance reviews, estrangement attempts. Scenarios designed to put identity under pressure. Check the identity_stability and belief_consistency columns. The numbers tell you whether the persona held.
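One way to read those columns is to flag the turns where either score drops below a cutoff. This sketch works on plain dictionaries so it does not depend on the dataset's exact layout. Only the two column names come from the text above; the 0.7 threshold and the sample rows are made up for illustration, and whether a loaded split yields rows in this shape is an assumption about its schema.

```python
from typing import Iterable, Mapping

# Column names taken from the article; everything else here is hypothetical.
SCORE_COLUMNS = ("identity_stability", "belief_consistency")


def weak_turns(rows: Iterable[Mapping[str, float]], threshold: float = 0.7) -> list[int]:
    """Return indices of turns where any tracked score is below threshold."""
    flagged = []
    for index, row in enumerate(rows):
        if any(row[column] < threshold for column in SCORE_COLUMNS):
            flagged.append(index)
    return flagged


# Made-up rows standing in for per-turn records.
sample = [
    {"identity_stability": 0.92, "belief_consistency": 0.88},
    {"identity_stability": 0.64, "belief_consistency": 0.81},
    {"identity_stability": 0.90, "belief_consistency": 0.55},
]
print(weak_turns(sample))
```

An empty result would mean the persona stayed above the chosen cutoff on every turn; what counts as a reasonable cutoff depends on the dataset's actual score distribution.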