Watch: How Anthropic found tricks to make AI give answers it shouldn’t

If you build it, people will try to break it. Sometimes even the people building it are the ones who break it. Such is the case at Anthropic, whose latest research demonstrates an interesting vulnerability in current LLM technology. More or less, if you keep at a question, you can break down the guardrails and end up with large language models telling you things they were designed not to tell you, such as how to build a bomb.

Of course, with the advancement of open source AI technology, you can run your own LLM locally and ask it whatever you want, but for more consumer-grade products, it’s an issue worth thinking about. What’s interesting about artificial intelligence today is the rapid pace at which it is advancing, and how well (or poorly) we as a species are doing at understanding what we’re building.

If you’ll allow me the thought, I wonder whether, as LLMs and other new AI model types get smarter and bigger, we’ll see more questions and problems of the kind Anthropic has outlined. Perhaps I’m repeating myself. But the closer we get to more general artificial intelligence, the more it should resemble a thinking entity rather than a computer that we can program, right? If so, might we have a harder time pinning down edge cases, to the point where that work becomes unfeasible? Anyway, let’s talk about what Anthropic has shared recently.
