For over ten years, computer scientist Randy Goebel and his colleagues in Japan have been quietly running one of the most revealing experiments in artificial intelligence: a legal reasoning competition based on the Japanese bar exam. The challenge asks AI systems to retrieve the relevant laws and then answer the core question at the heart of every legal case: was the law broken or not? That yes/no decision, it turns out, is where AI stumbles hardest. This struggle has profound implications
Our colleagues at Sentinel Labs have provided yet another piece of great research and analysis. As Large Language Models (LLMs) are increasingly incorporated into software-development workflows, they also have the potential to become powerful new tools for adversaries; as defenders, it is important that we understand the implications of their use and how it changes the dynamics of the security space.
In Sentinel’s research, they wanted to understand how LLMs are being used and how analysts could s
The cybersecurity company ESET has disclosed that it discovered an artificial intelligence (AI)-powered ransomware variant codenamed PromptLock. Written in Go (Golang), the newly identified strain uses OpenAI's gpt-oss:20b model locally via the Ollama API to generate malicious Lua scripts in real time. The open-weight language model was released by OpenAI earlier this month. "PromptLock leverages Lua scripts generated from hard-coded prompts to enumerate the local filesystem, inspect target
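To make the underlying mechanism concrete, here is a minimal, benign sketch of what calling a locally hosted model through the Ollama API looks like from Go. The model name, the hard-coded prompt, and the default localhost:11434 endpoint are illustrative assumptions; nothing below reproduces PromptLock's prompts or payload logic. The point is simply that a compiled binary can obtain freshly generated script text at runtime from a model running on the same machine, with no outbound network traffic.

```go
// Minimal sketch (assumed defaults): send a hard-coded prompt to a local
// Ollama instance and read back the generated text.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// generateRequest mirrors the JSON body accepted by Ollama's /api/generate endpoint.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

// generateResponse captures only the field we need from Ollama's reply.
type generateResponse struct {
	Response string `json:"response"`
}

func main() {
	// A benign, hard-coded prompt standing in for whatever text a caller might embed.
	reqBody, err := json.Marshal(generateRequest{
		Model:  "gpt-oss:20b",
		Prompt: "Write a short Lua script that prints the current date.",
		Stream: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Ollama listens on localhost:11434 by default; the model runs entirely locally.
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(reqBody))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		log.Fatal(err)
	}

	// The generated Lua source arrives as plain text in the "response" field.
	fmt.Println(out.Response)
}
```

Because everything happens against a local endpoint, this pattern leaves no API keys, cloud logs, or unusual outbound connections for defenders to flag, which is part of what makes locally run open-weight models attractive in this scenario.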
A proof-of-concept attack detailed by Neural Trust demonstrates how bad actors can manipulate LLMs into producing prohibited content without issuing an explicitly harmful request. Named "Echo Chamber," the exploit uses a chain of subtle prompts to bypass existing safety guardrails by manipulating the model's emotional tone and contextual assumptions. Developed by Neural Trust researcher Ahmad Alobaid, the attack hinges on context poisoning. Rather than directly asking the model to generate in