• YeetPics@mander.xyz
      link
      fedilink
      arrow-up
      1
      ·
      4 months ago

      Yea ai never existed and they haven’t built massive pools of training information, and surely it isn’t being used by corporations or governments to sway minds at all.

      That would be CRAZY

      • Peppycito@sh.itjust.works
        link
        fedilink
        arrow-up
        0
        ·
        4 months ago

        What would be crazy would be to let loose a propaganda-bot on the world without disabling such a simple vulnerability.

          • nondescripthandle@lemmy.dbzer0.com
            link
            fedilink
            arrow-up
            0
            ·
            edit-2
            4 months ago

            Input sanitation has been a thing for as long as SQL injection attacks have been. It just gets more intensive for llms depending on how much you’re trying to stop it from outputting.

            • MajorHavoc@programming.dev
              link
              fedilink
              arrow-up
              1
              ·
              edit-2
              4 months ago

              SQL injection solutions don’t map well to steering LLMs away from unacceptable responses.

              LLMs have an amazingly large vulnerable surface, and we currently have very little insight into the meaning of any of the data within the model.

              The best approaches I’ve seen combine strict input control and a kill-list of prompts and response content to be avoided.

              Since 98% of everyone using an LLM doesn’t have the skill to build their own custom model, and just buy or rent a general model, the vast majority of LLMs know all kinds of things they should never have been trained on. Hence the dirty limericks, racism and bomb recipes.

              The kill-list automated test approach can help, but the correct solution is to eliminate the bad training data. Since most folks don’t have that expertise, it tends not to happen.

              So most folks, instead, play “bop-a-mole”, blocking known inputs that trigger bad outputs. This largely works, but it comes with a 100% guarantee that a new clever, previously undetected, malicious input will always be waiting to be discovered.