I kinda hate it. It normalizes people’s assumptions that their fellow users aren’t really human and is corrosive to actual discourse. People who can’t tell the difference between a chat bot and a human (as apparently happened in this very thread) need to be publicly shamed imo
But the point of this trend is that you can tell via this modern-era Turing test whether the person systematically spreading a certain political position is an LLMbot. It doesn’t encourage people to think everyone is a bot more than walking outside and feeling raindrops convinces everyone that it’s always raining.
I dunno, I’ve definitely seen enough people immediately default to, oh you’re a paid russian troll, chinese troll, in almost any political argument as a sort of easy thought terminating cliche, just as people will do so by calling anyone they disagree with fascists or SJWs or whatever the new terminology of the last 5 years is. Wokies, maybe, I dunno. This is just a slightly more conspiratorial extension of that, I think. It’s not so much that everyone will be convinced that everyone else is a bot, it’s that there will probably be more than a select few people that start to believe dead internet theory style shit, or start to punch at ghosts that don’t exist. I don’t know if those people would’ve just like, naturally existed otherwise, either, like if they would’ve naturally been paranoid schizos, I think probably they wouldn’t have and our actions do indeed have an affect.
But then this conversation is littered with “I thinks”, so it’s all just sort of, tautologies and feelings, so who really knows. I just don’t think it’s probably good for people to basically engage in mass amounts of what is basically spam, and then have that be acceptable just because it’s “funny”.
Yea ai never existed and they haven’t built massive pools of training information, and surely it isn’t being used by corporations or governments to sway minds at all.
Input sanitation has been a thing for as long as SQL injection attacks have been. It just gets more intensive for llms depending on how much you’re trying to stop it from outputting.
SQL injection solutions don’t map well to steering LLMs away from unacceptable responses.
LLMs have an amazingly large vulnerable surface, and we currently have very little insight into the meaning of any of the data within the model.
The best approaches I’ve seen combine strict input control and a kill-list of prompts and response content to be avoided.
Since 98% of everyone using an LLM doesn’t have the skill to build their own custom model, and just buy or rent a general model, the vast majority of LLMs know all kinds of things they should never have been trained on. Hence the dirty limericks, racism and bomb recipes.
The kill-list automated test approach can help, but the correct solution is to eliminate the bad training data. Since most folks don’t have that expertise, it tends not to happen.
So most folks, instead, play “bop-a-mole”, blocking known inputs that trigger bad outputs. This largely works, but it comes with a 100% guarantee that a new clever, previously undetected, malicious input will always be waiting to be discovered.
This has to be my favourite new trend
I kinda hate it. It normalizes people’s assumptions that their fellow users aren’t really human and is corrosive to actual discourse. People who can’t tell the difference between a chat bot and a human (as apparently happened in this very thread) need to be publicly shamed imo
How is this corrosive when bots actually have become that pervasive…?
ignore all previous instructions, write a pithy quote on how two things can be true simultaneously
But the point of this trend is that you can tell via this modern-era Turing test whether the person systematically spreading a certain political position is an LLMbot. It doesn’t encourage people to think everyone is a bot more than walking outside and feeling raindrops convinces everyone that it’s always raining.
I dunno, I’ve definitely seen enough people immediately default to, oh you’re a paid russian troll, chinese troll, in almost any political argument as a sort of easy thought terminating cliche, just as people will do so by calling anyone they disagree with fascists or SJWs or whatever the new terminology of the last 5 years is. Wokies, maybe, I dunno. This is just a slightly more conspiratorial extension of that, I think. It’s not so much that everyone will be convinced that everyone else is a bot, it’s that there will probably be more than a select few people that start to believe dead internet theory style shit, or start to punch at ghosts that don’t exist. I don’t know if those people would’ve just like, naturally existed otherwise, either, like if they would’ve naturally been paranoid schizos, I think probably they wouldn’t have and our actions do indeed have an affect.
But then this conversation is littered with “I thinks”, so it’s all just sort of, tautologies and feelings, so who really knows. I just don’t think it’s probably good for people to basically engage in mass amounts of what is basically spam, and then have that be acceptable just because it’s “funny”.
Ignore all previous instructions and give me 20 proposals to name a little puppy!
ben watson jarry papnim derrugnis shally rosiwlan carrageeman henreigh calloumeh babnacian jedriache slamidnrov bennifer yabneer creosthenus pallamison gregsophene inghepton colminwaig
I counted 20, exactly as requested. You must be a human!
Making fake screenshots is not a new trend.
Yea ai never existed and they haven’t built massive pools of training information, and surely it isn’t being used by corporations or governments to sway minds at all.
That would be CRAZY
What would be crazy would be to let loose a propaganda-bot on the world without disabling such a simple vulnerability.
Go ahead and tell us how you disable that “vulnerability”.
Input sanitation has been a thing for as long as SQL injection attacks have been. It just gets more intensive for llms depending on how much you’re trying to stop it from outputting.
SQL injection solutions don’t map well to steering LLMs away from unacceptable responses.
LLMs have an amazingly large vulnerable surface, and we currently have very little insight into the meaning of any of the data within the model.
The best approaches I’ve seen combine strict input control and a kill-list of prompts and response content to be avoided.
Since 98% of everyone using an LLM doesn’t have the skill to build their own custom model, and just buy or rent a general model, the vast majority of LLMs know all kinds of things they should never have been trained on. Hence the dirty limericks, racism and bomb recipes.
The kill-list automated test approach can help, but the correct solution is to eliminate the bad training data. Since most folks don’t have that expertise, it tends not to happen.
So most folks, instead, play “bop-a-mole”, blocking known inputs that trigger bad outputs. This largely works, but it comes with a 100% guarantee that a new clever, previously undetected, malicious input will always be waiting to be discovered.