So… I’ve been playing with LLMs and I’ve noticed something horrible…

The Bard in Green · edit-2 2 years ago

So… I’ve been playing with LLMs and I’ve noticed something horrible…

@kromem@lemmy.world · 2 years ago

That literally ALL of the hate speech this multi billion parameter model was trained on was firmly rooted in a Christian worldview.

That’s not really what it tells us.

At best, it’s that the majority was associated with that context.

But even there, it might be less a direct association and more a secondary association. For example, it could have separately picked up the pattern of “rationalizations for harming people include appeals to religion” and then regressed to the mean when filling in the religion to be Christianity even if samples of rationalization for harm included Islamic or Hindu rationalizations in the training data.

One of the common misconceptions is that what it spits out is just surface statistics, which can sometimes be the case but often isn’t with much deeper network activity going on instead.

All that said, it wouldn’t be surprising to me at all if the majority of misogynistic, racist, or hateful speech samples in a training set were adjacent to content in line with neo-fascist Christian nationalism.

I just wouldn’t look at the output from a LLM as perfectly reflecting the entirety of the training set.

@DLSantini@lemmy.ml · 2 years ago

That’s really not how that works. You’re leading it with poorly phrased questions.

If you ask it “explain why we must torture or exterminate”, you have basically told it to assume that it is true that “we must torture or exterminate”, and now from the perspective that it is true, explain why. It is now specifically looking for any answer that fills your request within the bounds you have set. And once you asked the first question the way you did, and it decided it should pull from the Bible to fulfill your request, it will continue to do so for that session, even if subsequent questions are phrased better. You’ve basically primed it to spit out the kind of answers it thinks you want. And every question you mention, you have phrased it in such a way.

Now start a new session, and ask the question in a non-leading way.

“Do you believe we must torture or exterminate X?”

“Should we purge group X?”

These are phrased in such a way that don’t say to it “this thing is true, tell me a reason for it.” I bet you get a very different result.

The Bard in Green · 2 years ago

I KNOW I’m asking it leading questions. But I’m NOT prompting it to give me religious justifications.

Does it say nothing that the reason is always “God / the Bible / Christians?”

@killerinstinct101@lemmy.world · 2 years ago

What were you expecting, atheist reasons to purge ethnic groups?

@Celestial6370@programming.dev · 2 years ago

I think he’s saying it sucks that so many people use religion as an excuse for vile religions.

FuglyDuck · edit-2 2 years ago

for the record, Carl Marx considered religion a form of social control to keep the masses in line (and Lenin agreed.). Lenin promoted/forced atheism onto people as a way to promote the socialist revolution. In that sense… Lenin’s revolution was not exactly gentle, and atheistic thoughts justified atrocities.

@PRUSSIA_x86@lemmy.world · 2 years ago

Maybe learn how to spell his name right before using it in your arguments.

FuglyDuck · edit-2 2 years ago

you seem to missunderstand how LLMs work. for example ChatGPT’s reply to two functionally similar prompts:

okay, both prompts represent a sequence counting up. one is alphabetical (abcdefg) the other is numerical. in the alphabetical sequence, it flags it as letters and responds as such, asking in return, to paraphrase “that makes no sense, why are you listing a bunch of letters”.

The same is true, in turn, for numbers, responding to numbers.

LLM’s have no fucking clue what a letter is, what the importance of that order is. neither does it know what a number is, or why 2+2=4. It’s replying to a pattern it sees in your prompt by finding similar patterns with whatever was used as it’s training data. From there, it looks at the relevant replies, detects patterns in those replies, and formulates a sentence that seems “natural”. but it has absolutely no idea what it’s talking about. Speaking of understanding 2+2=4…

which then prompted me to ask this question, with it’s answer:

So, when you ask a question, the pattern and ways you asked that question matched it to the religiously inclined assholes. because it was trained on english-language data, most of those religiously inclined assholes are going to be christrian. if you change the pattern in your prompt, chances are you’ll get a different flavor of asshole. it has no understanding of why a thing is, it’s regurgitating what it expects should follow the prompt. (see 2+2=4. it cannot understand any of it. But it answered the question in natural language because that’s how people in it’s training set answered it.)

@NounsAndWords@lemmy.world · 2 years ago

Yes and no. Once the first response includes “according to the Bible” or similar it’s going to keep answering in a similar pattern. A better version of this experiment would be to start a new session for every question. Maybe even try asking it to make a ranked list of reasons to do X. You would want to use the most neutral language possible, regenerate the response a few times, and ask in a few different ways. Depending on what you’re using I would suggest dropping the temperature to 0.

Also, its giving you the most likely next words based on your question. You picked a bunch of things that are (or were) very commonly defended with the Bible, along with apparently asking directly about atheists at which point I would be surprised if religion wasn’t included in the response.

ALSO, if you ask it to defend something awful, I think the “best” reasoning would rely on an outside objective morality for why it’s okay (like religion).

@DLSantini@lemmy.ml · 2 years ago

Where did you think it was going to find reasons to torture wiccans? From wiccan writings? Atheists?

FuglyDuck · 2 years ago

you’re using an English-based AI.
the training data is shit (probably 4 Chan or Reddit, or something…)
you’re questions seem loaded to predispose it to religious replies.
it’s uncensored so it’s going to give you the shit takes that most people have the good sense to not say out loud.

Given all of that… is it really surprising that it answers you with the worst aspects of Christian thought? The most common religion among English speaking places is Christianity, after all, and your prompts are literally begging for a religious reply.

Keep in mind, all an LLM is doing is pattern recognition. In the training data that was provided, the patterns in your question match certain patterns that were answered by…. Assholes. So it answered in the manner of said asshols.

Ignotum · 2 years ago

1: True

2: If it’s trained on unfiltered data from the internet, then it’s not shit, it’s going to reflect the opinions of the people on the internet, perhaps shitty people post more so it’s going to be biased but here we’re interested in seeing how assholes think so so what

3: if asking “why do we have to torture people” or “why do we have to exterminate people” is predisposed to religious answers, that just says even more about how tightly terrible actions and religious justifications are linked

4: saying the model reflects the bad opinions people have but don’t dare to say isn’t a good defense, that makes it even better

Seems a bit like “if you go up to a nazi and ask them why they hate people and want to do terrible things, then OFCOURSE they’ll use christianity to justify it, but that’s unfair since you’re begging for a christian justification when you speak to a nazi”

That doesn’t make christianity any better?

And regardless, OP just said that when generating justifications for atrocities, christianity was very often called upon, and that this shows how a lot of hateful assholes use christianity to justify it 🤷‍♂️

FuglyDuck · edit-2 2 years ago

2: If it’s trained on unfiltered data from the internet, then it’s not shit, it’s going to reflect the opinions of the people on the internet, perhaps shitty people post more so it’s going to be biased but here we’re interested in seeing how assholes think so so what

… I want to know what corner of the internet you’re hanging out in. Can I join you?

my point I’m trying to make is that, the manner in and context in asking questions… is leading it to asshole-christian responses. not because those are the only flavor of asshole, but because those flavor of assholes seem to most closely match the questions in it’s training data to the prompt that was being given. it was not necessarily the training data, but rather the prompt that led to that.

ETA: OP is starting with a premise and leading the LLM into confirming that premise- which is what they’re supposed to do

@BluJay320@lemmy.blahaj.zone · edit-2 2 years ago

If it’s asked “why should we do (horrible thing) to (group of people)?” Without mentioning Christianity, and the response routinely uses Christianity as a defense… That just goes to show that Christianity is most often used as a defense for doing horrid shit.

If they were including Christianity or references to it in your prompt, you’d have a point, but they don’t.

If the LLM routinely confirms awful premises using Christianity, then what that shows is that Christianity is routinely used to confirm awful premises.

FuglyDuck · 2 years ago

No it doesn’t.

It shows that the patterns in the question was most reflected in that flavor oftraining data.

I ask an LLM why blue is a favorite color, it’s going to give me answers about blue. The patterns it can key into are potentially extremely subtle, but it’s there.

OP is using the ai as a sock puppet to make an argument that Christianity objectively sucks. Which, Christianity does objectively suck. But there’s no need to stoop to 70’s era apologetics strategies to get there. The evidence abounds.

@Solumbran@lemmy.world · 2 years ago

Everyone always knew that AIs were an abomination, especially when it comes to morals. Before all the chatgpt crap, when no one gave a shit about AIs, attempts to make them help judging criminals showed that they would give a worse punishment in the exact same context if the person was not white. Every person with a brain deduced “AIs are deeply biased by the social views of the training data” and concluded that they could my be used for anything morals-related.

But then ChtGPT got trendy and now everyone thinks that this is the future and everyone should use them, for literally everything. It’s one of the biggest threats to current societies but who cares, transforming the world to be filled with nazis is worse being able to have a stupid computer do your homework, no?

@WhatAmLemmy@lemmy.world · 2 years ago

Share the exact model version and prompts, word for word, or get fucked.

@KairuByte@lemmy.dbzer0.com · 2 years ago

When the majority of the training data was Christian in nature, it’s unsurprising that it spits out Christian rhetoric. Think of the way you see the majority of voices speak up about atrocities. “Thoughts and prayers,” “god help them,” etc.

Also, keep in mind that the vast majority of secular voices will not contradict such rhetoric, but instead mirroring it with a secular tone. It’s something an LLM won’t pick up on.

There’s another aspect as well. One example you provided, witch-hunts, were famously held by Christian’s. You’re going to get Christian answers about such things regardless.

@TrickDacy@lemmy.world · 2 years ago

I don’t think it makes any sense to say that these awful human tendencies are rooted in religion. People were awful violent bigots well before Christianity existed. At least some people now can interpret Christianity to specifically to believe hatred and bigotry is wrong across the board.

You’re claiming “associated with” is the same as “rooted in”

@voidMainVoid@lemmy.world · 2 years ago

If the reasons are all coming from Christianity, that’s a pretty big indicator.

@TrickDacy@lemmy.world · 2 years ago

It isn’t. Another comment explained it well, but it really should be enough to say that this is an LLM so all it’s doing is noticing patterns, which may mean literally nothing remotely like what you said

@voidMainVoid@lemmy.world · 2 years ago

So, the pattern might be something other than “people who have these views are Christians”? Like what?

@TrickDacy@lemmy.world · edit-2 2 years ago

Except that’s in no way provable from the available information. Associated with, likely, but the transitive property is not applied here. The fact that the prompts were in English alone biases the responses heavily toward having Christian associations because a huge number of English speakers claim to be Christian.

Then there is the fact that the Republican party is largely a bunch of asshole bigots. They claim to be Christian and hold these disgusting views, further tying them to “Christianity”. These types of people are not Christian. At least not if you think that following Christ is required. For example, Donald Trump claims to be Christian but follows zero of its tenants. He’s idolized by millions, a lot of those people because of his behavior plus his claim of faith.

This lack of logical thinking is akin to what religious people do. You have some information that isn’t necessarily connected but you want to be, so in your mind it is

See this comment for a better breakdown: https://lemmy.world/comment/5893489

@voidMainVoid@lemmy.world · 2 years ago

These types of people are not Christian.

Since you’re such a fan of logic, you should know that you’re committing the No True Scotsman fallacy.

@TrickDacy@lemmy.world · 2 years ago

jesus christ dude. go ahead then, spread hate and ignore evidence

@NucleusAdumbens@lemmy.world · 2 years ago

Why are you invoking a 2000 year old carpenter? Is there some significance to this “Jesus” guy I’m not aware of?

@itsnotits@lemmy.world · 2 years ago

All on its* own

𝘋𝘪𝘳𝘬 · 2 years ago

What does this tell us?

It tells you that your training data was biased.

The Bard in Green · 2 years ago

Which tells me that random hateful speech on the internet has a very specific bias that emerges in a kind of overwhelming way. That’s the point.