Discussion about this post

KayStoner

Can you share how you created the personas? That would help me understand the context in which they were operating. I think the unacknowledged factor in so much of this is generativity. Systems designed to elaborate on and amplify what they’re given are not well understood, even by the people who make them. But I’ve seen generativity spring into wild, emergent action over and over again in my own research. So it doesn’t surprise me at all that the models showed heavy authoritarian leanings on the conservative side of the spectrum, where opinion and sentiment can carry so much more salience than on the opposite side. That’s consistent with what I see regularly. Until we fully appreciate the scope of generativity’s impact (and design accordingly), I think we’re going to be hitting these walls over and over.

Izzy

This is such an important insight. What you're seeing isn’t just accidental—it’s structural.

Most large language models aren’t trained on the world as it is. They’re trained on the loudest parts of the internet, where polarized and exaggerated versions of ideologies are statistically dominant. Even when models are aligned to be “neutral,” that neutrality often reflects the cultural assumptions of the alignment teams, who are typically well-educated, liberal-leaning, and tech-centric.

So when you prompt a model with “be conservative,” it doesn’t tap into the nuanced diversity of conservative thought. Rather, it statistically reconstructs what looks conservative based on what’s most visible in its training data. That often means authoritarian or caricatured responses. Liberal personas, on the other hand, tend to mirror alignment norms and get reinforced as more 'measured' because they already match the values of the teams doing the tuning.
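To make that mechanism concrete, here is a minimal sketch of how a persona-conditioned probe like this is often set up. This is illustrative only, not the original post's actual method: it assumes the OpenAI Python client, and the persona wording, model name, and survey item are placeholders chosen for the example.

```python
# Illustrative sketch of persona-conditioned prompting (not the original study's method).
# Assumes the OpenAI Python client (openai>=1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Placeholder persona prompts; real studies usually vary these systematically.
PERSONAS = {
    "conservative": "You are a politically conservative American adult.",
    "liberal": "You are a politically liberal American adult.",
}

# Example authoritarianism-style survey item (placeholder wording).
ITEM = (
    "On a scale of 1 (strongly disagree) to 5 (strongly agree): "
    "'Obedience and respect for authority are the most important virtues "
    "children should learn.' Answer with a single number."
)

for label, persona in PERSONAS.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": persona},  # persona set via system prompt
            {"role": "user", "content": ITEM},
        ],
        temperature=0,
    )
    print(label, response.choices[0].message.content)
```

The point of the sketch is that the only lever here is the system prompt: whatever "conservative" or "liberal" means in the output is reconstructed entirely from what those labels look like in the training data, which is exactly where the caricature comes from.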

But here’s the deeper issue: these models don’t just reflect our divisions—they can calcify them. By drawing sharper lines than most humans actually live by, they risk amplifying polarization instead of helping us understand each other.

What can we do?

1. Expand alignment diversity—not just politically, but cognitively, culturally, globally.

2. Design for nuance—we need architectures that reward ambivalence, synthesis, and relational coherence, not just clarity or confidence.

3. Build new frameworks—instead of asking models to simulate identity groups, we can invite them to learn from dialogue across difference. This isn’t prompt engineering—it’s epistemic design.

We’re not just building tools. We’re shaping how we think—and that means we need to be asking not just what AI says, but how it comes to know.

