AI Red Teaming

IEEE GAISS October 28, 2026 - October 30, 2026 event upcoming

GAISS 2026: IEEE GenAI for Secure Systems

GAISS 2026 is an IEEE conference at the University of Texas at Austin focused on generative AI for secure systems, including red teaming, blue-team automation, governance, and agentic secure AI.

DEF CON August 6, 2026 - August 9, 2026 event upcoming

DEF CON 34 / AI Village 2026

DEF CON 34 takes place in Las Vegas and is expected to include AI security activity through villages, workshops, contests, and community-led research tracks as schedules firm up.

AI Red Teaming, Prompt Injection, Adversarial ML

Black Hat August 1, 2026 - August 6, 2026 event upcoming

Black Hat USA 2026 AI Summit

Black Hat USA 2026 includes an AI Summit and security briefings in Las Vegas focused on how artificial intelligence is changing digital defense.

AI Red Teaming, Agent Security, Adversarial ML

The Hacker News AI Security June 11, 2026 news

New Attacks Trick OpenClaw AI Agent Into Running Code and Leaking Secrets

Two security teams have shown, in separate research published this week, that OpenClaw, the popular self-hosted AI agent, can be driven to run attacker-controlled code or hand over sensitive data through ordinary-looking inputs. Imperva buried instructions inside shared contacts, vCards, and location pins that the agen

Prompt Injection, Agent Security, AI Red Teaming

Microsoft Security Blog June 10, 2026 news

Turn specs into evals for any agent with ASSERT

Adaptive Spec-driven Scoring for Evaluation and Regression Testing (ASSERT) is an open-source framework for converting natural language behavior requirements into executable evaluations of AI models and agents.

AI Red Teaming, Prompt Injection, Agent Security

The Hacker News AI Security June 10, 2026 news

Anthropic Releases Claude Fable 5, Its Most Powerful AI Yet, With Cyber Safeguards

On June 9, Anthropic released Claude Fable 5, the most capable model it has ever made, generally available. It also did something unusual: it shipped one model as two products, split not by capability but by a layer of safety classifiers. Fable 5 goes to the public. Its twin, Claude Mythos 5, the same underlying model

Prompt Injection, Agent Security, AI Red Teaming

Microsoft Security Blog June 9, 2026 news

Reconstructing AI activity in investigations

Learn how to investigate AI activity in Microsoft 365 Copilot and Azure AI services using a structured, telemetry-driven approach. This playbook helps security teams reconstruct events, assess data exposure, and detect potential threats faster.

AI Red Teaming, Prompt Injection, Agent Security

Microsoft Security Blog June 8, 2026 news

AI brands as bait: How threat actors are using the AI hype in social engineering

As threat actors operationalize AI to accelerate attacks, they are also leveraging the wider global interest around AI itself as a social engineering lure.

AI Red Teaming, Prompt Injection, Agent Security

Microsoft Security Blog June 5, 2026 news

Securing CI/CD in an agentic world: Claude Code GitHub Action case

Microsoft Threat Intelligence identified a prompt injection pathway in Claude Code GitHub Action that allowed access to workflow secrets under specific conditions. This research examines the attack chain, responsible disclosure process, Anthropic's mitigation, and guidance for securing AI-powered CI/CD workflows.

AI Red Teaming, Prompt Injection, Agent Security

Microsoft Security Blog June 4, 2026 news

Updating the taxonomy of failure modes in agentic AI systems: What a year of red teaming taught us

A surge in real-world attacks against agentic AI systems is reshaping how we think about risk. Based on 12 months of red teaming, this update introduces seven new failure modes, from supply chain compromise to goal hijacking, and the practical mitigations teams need now.

AI Red Teaming, Prompt Injection, Agent Security

OpenAI News June 3, 2026 news

Introducing new capabilities to GPT-Rosalind

GPT-Rosalind advances life sciences research with enhanced biological reasoning, medicinal chemistry expertise, genomics analysis, and experimental workflow capabilities.

AI Compliance, AI Red Teaming, Model Evaluation

Microsoft Security Blog June 3, 2026 news

Preinstall to persistence: Inside the Red Hat npm Miasma credential-stealing campaign

A large-scale npm supply chain attack compromised over 90 versions of @redhat-cloud-services packages, silently infecting CI/CD environments and developer systems. The malicious code steals credentials from GitHub, cloud platforms, and local machines, then spreads like a worm by republishing trusted packages. Discover

AI Red Teaming, Prompt Injection, Agent Security

Microsoft Security Blog June 2, 2026 news

Microsoft Build 2026: Securing code, agents, and models across the development lifecycle

Discover how Microsoft enables fast, secure AI development with MDASH and new security capabilities.

AI Red Teaming, Prompt Injection, Agent Security

Gartner June 1, 2026 - June 3, 2026 event event archive

Gartner Security & Risk Management Summit 2026

Gartner Security & Risk Management Summit 2026 brings CISOs and security leaders together in National Harbor, Maryland, with tracks covering AI, cyber risk, application security, data security, operations, privacy, and governance.

AI Compliance, Agent Security, AI Red Teaming

Anthropic Research May 22, 2026 analysis

Project Glasswing: An initial update

Anthropic reports early Project Glasswing results using Mythos Preview with infrastructure partners and external testers, including large-scale vulnerability discovery and a cautious disclosure posture.

AI Red Teaming, Agent Security, Model Evaluation

Krebs on Security May 12, 2026 news

Patch Tuesday, May 2026 Edition

Artificial intelligence platforms may be just as susceptible to social engineering as human beings, but they are proving remarkably good at finding security vulnerabilities in human-made computer code. That reality is on full display this month with some of the more widely-used software makers -- including Apple, Googl

AI Red Teaming

Anthropic April 29, 2026 framework

Anthropic Responsible Scaling Policy v3.2

Anthropic’s current Responsible Scaling Policy page lists v3.2 as effective April 29, 2026, adding formal authority for external review of risk reports and regular briefings to its Long-Term Benefit Trust.

Current notes, events, and source material