Fri 19 June 2026
Daily Brief — Curated and contextualised by Best Practice AI
Amazon Caps AI, Accenture Faces Decline, and Meta Seeks Wall Street Funds
TL;DR: Amazon, Walmart, and Uber are limiting AI usage due to high costs. Accenture's shares have dropped to their lowest since 2017 amid AI threats. Meta is seeking Wall Street funding to support its AI initiatives. US regulators are expediting power grid connections for data centers to manage utility costs. G7 leaders are urging financial regulators to coordinate on AI risks.
The stories that matter most
Selected and contextualised by the Best Practice AI team
‘We created a monster’: companies rein in AI usage as costs strain budgets
Amazon, Walmart and Uber are among early adopters that have introduced caps or discouraged wasteful activity
What Capital After Labor? Forecasting the Talent ROI Transition in the Human-AI Era
arXiv:2606.19846v1 Announce Type: new Abstract: AI augmentation breaks the accounting link between labor time and productive contribution, yet firms continue to evaluate talent through time-based overhead bundles. This paper develops a forecasting framework for the transition from time-based talent accounting to output-based talent ROI in the human-AI era. The framework centres on Theorem 3 (ROI
Have Data Centers Raised Your Electric Bill? Causal Evidence from the United States
arXiv:2606.19777v1 Announce Type: cross Abstract: We estimate that data centers caused average retail electricity rates to fall modestly in the United States from 2015 to 2024 using an instrumental variables approach. Despite prevailing sentiment, the finding is consistent with economic reasoning: existing large power system fixed costs, economies of scale in transmission and distribution, and de
Accenture shares fall to lowest since 2017 as AI threat mounts
IT consultancy hit by concerns technology will hurt its business model
The Unresolved Profitability Crisis of Open-Weights Frontier AI Models
The economic viability of training frontier open-weights models remains unproven, as high training costs lack clear ancillary revenue streams. This uncertainty contrasts with traditional open-source software models and poses a significant challenge for future investment sustainability.
How AI is ‘senior-ising’ junior roles
Changing workflows mean employers are now asking new recruits to be managers and decision makers
US Acts to Speed Up Power Grid Hook-Ups for AI Data Centers
US regulators have taken their biggest step yet to speed the connection of data centers to the country’s grids while simultaneously attempting to slow surging utility bills that have angered Americans.
Tech Workers Maxed Out Their A.I. Use. Now They’re Trying to Minimize It.
Artificial intelligence is expensive to use, many companies discovered. That has led to a new era of saving costs.
German electricity grid equipment maker SGB-SMIT in early IPO talks
Company’s valuation could top €4bn as investors focus on AI and data centre boom
The future of AI may be small, cheap and unprofitable | Reuters
The AI boom is built on the idea that bigger is better. A recent study suggests the opposite may soon be true: small language models running on desktop computers may be able to handle most of the tasks currently performed by large language models.
AI boosts Samsung but batters IT jobs
The inside story on the Asia tech trends that matter, from Nikkei Asia and the Financial Times
The tech giant mining Wall Street for AI cash
Former Goldman Sachs executive Dina Powell McCormick is helping Meta find ways to finance its AI ambitions
SpaceX plots $20bn bond deal after record IPO
AI and rocket group is tapping debt markets after raising $86bn in stock market debut
Why Nations Are Rethinking Dependence on Foreign New AI Models - Linkdood Technologies
The global artificial intelligence industry reached a turning point in June 2026. When Anthropic suspended access to its most advanced AI models following a
New Irish bill to supervise EU AI Act gets greenlit
The AI Act, which entered into force in August 2024, attempts to tackle some of the risks emerging from the technology while letting the bloc benefit from its economic potential.
G7 leaders urge financial regulator coordination to tackle AI risks
G7 leaders called for information sharing and coordination between financial regulators and tech companies to address risks posed by frontier AI models.
Economics & Markets
The Unresolved Profitability Crisis of Open-Weights Frontier AI Models
The economic viability of training frontier open-weights models remains unproven, as high training costs lack clear ancillary revenue streams. This uncertainty contrasts with traditional open-source software models and poses a significant challenge for future investment sustainability.
Directors Duties in the Age of Agentic Artificial Intelligence
arXiv:2606.20453v1 Announce Type: new Abstract: As boards engage with the adoption of Artificial Intelligence including agentic AI to drive operational efficiencies, this presents new opportunities for profit maximisation. AI adoption is increasingly identified with employee role displacement in companies, and the interests of employees as stakeholders require exploration. A novel question posed is whether in an age of AI ascendancy AI may warrant being given stakeholder status as its role in the company approximates or eclipses that of human employees. The article probes four distinct models of corporate purpose within the duty on directors to act in the best interests of the company, the shareholder primacy model, the Enlightened Shareholder value model, the stakeholder friendly model, and the stakeholder value model, highlighting the available scope for directors to accommodate the interests of employees around AI adoption in decision-making by boards around AI. It is concluded that given the degree to which directors are insulated from legal scrutiny in relation to their best interests duty, adopting a wider law in context approach to promote employee welfare would serve the interests of employees, directors and companies alike. This would see directors engaging meaningfully with employees and providing opportunities for reskilling to adapt to the age of AI.
Traders’ Latest AI-Related Play Is a Struggling Car Parts Stock
Investors looking for the stock market’s next artificial intelligence winner have homed in on embattled French car parts maker Valeo SE.
German electricity grid equipment maker SGB-SMIT in early IPO talks
Company’s valuation could top €4bn as investors focus on AI and data centre boom
The tech giant mining Wall Street for AI cash
Former Goldman Sachs executive Dina Powell McCormick is helping Meta find ways to finance its AI ambitions
SpaceX plots $20bn bond deal after record IPO
AI and rocket group is tapping debt markets after raising $86bn in stock market debut
From scarcity to execution: China’s AI valuation reset - Bamboo Works - China stock insights for global investors
Zhipu and MiniMax have lost more than 40% of their market value in just two weeks, as investors reassess the true worth of China's large language model developers
China Boosts Startup IPOs in Quantum, AI, and Emerging Tech to Outpace U.S. Competition
China is ramping up support for IPOs in cutting-edge tech sectors like quantum and AI to boost innovation amidst growing competition with the U.S.
Frontier consortium to invest a further $915m into carbon removal tech
Carbon-buying consortium Frontier, which includes the likes of Google and Meta, has announced it will invest a further $915 million in carbon removal technologies, bringing its total commitment to $1.8 billion. The consortium also announced that generative AI firm Anthropic has become a member of the carbon buying alliance. The new funding, dubbed […]
AI boosts Samsung but batters IT jobs
The inside story on the Asia tech trends that matter, from Nikkei Asia and the Financial Times
MAS Chief Warns Rising AI Costs Could Weigh on Investment Returns - Fintech Singapore
MAS Managing Director Chia Der Jiun warns that AI investment risks are rising as energy and chip costs climb and returns remain uncertain.
Generative Engine Optimization at Scale: Measuring Brand Visibility Across AI Search Engines
arXiv:2606.20065v1 Announce Type: cross Abstract: People increasingly get answers straight from AI search engines like ChatGPT, Claude, Perplexity, and Gemini rather than scrolling search results. Brands that once focused on search engine optimization (SEO) must now optimize for how these engines represent, cite, and recommend them -- a shift variously called Generative Engine Optimization (GEO), Answer Engine Optimization (AEO), and AI Search Visibility. We treat AEO and AI Visibility as part of GEO, and study how to measure brand visibility across AI engines: what they value when they cite a brand, which sources they rely on, and what content large language models surface. The hard case is everyone outside the already-authoritative top brands -- SMEs, D2C brands, creators, and early-stage startups. We analyze 100K+ prompt responses across 100+ brands tracked on Ranqo between March and May 2026. First visibility runs form a clear three-tier brand-stature ladder: global household names (e.g., Stripe, Nike) appear in 73% of relevant AI answers on their first run; established mid-market and regional brands (e.g., Olipop, Klaviyo) in 44%; niche and small brands in just 11% -- about 30 percentage points per step. When engines cite sources, about 78% go to corporate websites; among non-corporate sources YouTube leads, ahead of Reddit, editorial media, and Wikipedia. The highest-leverage page is the ranked "best-of" listicle, the most-cited content format at about 21% of all citations. Sentiment is the unstable signal: whether a brand is framed positively or negatively flips about 6.7 times more often than whether it is mentioned at all. These findings provide a first large-scale baseline for measuring GEO: AI brand visibility can be measured, differs by platform, and varies strongly by brand maturity. We close by proposing seven v1.1 protocols to test whether specific recommendations can causally improve AI visibility.
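The "about 30 percentage points per step" figure in the abstract above can be checked directly from the three first-run visibility rates it quotes. The short Python sketch below is only that arithmetic; the dictionary keys are shorthand labels chosen here, not terms from the paper.

```python
# First-run visibility rates quoted in the abstract, in percent.
# The tier labels are shorthand used for this illustration only.
tiers = {
    "global household names": 73,
    "established mid-market and regional": 44,
    "niche and small": 11,
}


def step_drops(rates):
    """Return the percentage-point drop between consecutive tiers."""
    values = list(rates.values())
    return [higher - lower for higher, lower in zip(values, values[1:])]


drops = step_drops(tiers)
average_drop = sum(drops) / len(drops)
print(drops, average_drop)  # [29, 33] 31.0
```

The two steps are 29 and 33 points, which averages to 31 and is consistent with the rounded figure in the abstract.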
Accenture shares fall to lowest since 2017 as AI threat mounts
IT consultancy hit by concerns technology will hurt its business model
Google’s Strategic Shift Toward Flash Models and Away from Frontier Leadership
Google’s current focus on flash models over frontier-grade AI suggests a strategic pivot toward mass-market serving rather than high-end agentic capabilities. This shift highlights the competitive tension between cost-efficient deployment and frontier model performance.
Anthropic, co-founders face new US copyright infringement suit from 100 authors
Around 100 authors have filed a lawsuit against Anthropic, alleging the company used pirated books from library websites to train its AI models.
LivCor reaches $7m settlement with US states over rental prices
LivCor agreed to pay $7 million to resolve antitrust claims from 10 US states alleging it used RealPage's revenue management system to align rental prices with competitors.
Midjourney pivots from AI image generation to body scanning medical spa where patients bathe in 'golden light'
Midjourney is reportedly shifting focus toward a medical spa concept, utilizing technology borrowed from an undisclosed partner.
Nvidia defeat would be 'huge headache' for merger call-ins, Irish official says
A court defeat for EU regulators over the Nvidia/Run:ai deal could complicate merger reviews that fall below standard notification thresholds.
Labor, Society & Culture
How AI is ‘senior-ising’ junior roles
Changing workflows mean employers are now asking new recruits to be managers and decision makers
AI will lead to labour shortages, Bezos says in optimistic talk | Reuters
Artificial Intelligence will lead to labour shortages, not the replacement of humans, Amazon founder Jeff Bezos predicted in a highly optimistic appearance at the VivaTech technology conference in Paris on Wednesday.
America Is Headed Toward the Infinite Workweek
The future of AI and jobs will be so much weirder than you think.
The Algorithmic-Human Manager: AI, Apps, and Workers in the Indian Gig Economy
arXiv:2606.19975v1 Announce Type: new Abstract: This paper examines the impact of artificial intelligence and digital technologies on the blue-collar gig economy in India, focusing on algorithmic management: the use of automated systems to allocate, monitor, and evaluate work in location-based services such as ride sharing and delivery. Using a social justice framework and a mixed-methods approach comprising interviews with 16 gig workers and 21 key stakeholders, the study uncovers a dual reality: while AI-powered systems expand access to work and generate operational efficiencies, they simultaneously introduce significant challenges related to fairness, transparency, and worker dignity. Key findings reveal that algorithmic systems are opaque by design, produce inequitable outcomes, and are not structured to reward additional labour with proportionate pay. The study advocates for a pragmatic hybrid governance model, an Algorithmic Human Manager framework, in which technological efficiency and human accountability operate together rather than in opposition. The findings carry implications for policymakers, platform companies, and civil society organizations working to design equitable AI governance frameworks for the gig economy in India and across the Global South.
Entry-level work didn’t disappear, PwC finds with ‘seniorization.’ It just morphed into something young workers can’t get
"Employers are changing what they ask for in entry-level roles," Dan Priest, PwC's U.S. chief AI officer, told Fortune.
Gig workers are endlessly exploited. AI could make more of us share their fate
As companies integrate AI and hire fewer employees, a shift toward a ‘gig economy’ will commence. In 2024, the buy-now-pay-later company Klarna announced that it would cut hundreds of customer service roles and begin using an artificial intelligence chatbot instead. The move was expected to save the company millions. But a year later, after customers complained about the degraded quality of customer service, Klarna began to quietly recruit human customer service agents back. At first glance, the reversal appeared to be a victory for human workers in the age of AI. The reality was more complex. Rather than bringing on full-time customer service agents, whom Klarna contracts through an outside agency, it brought on workers in what Klarna CEO Sebastian Siemiatkowski has described as “an Uber type of set-up”. Now, an AI chatbot continues to handle most of customers’ basic queries, while a growing number of gig workers handle the more advanced ones. “Just like somebody can go and drive an Uber for a while, they can actually jump on and work for Klarna’s customer service,” Siemiatkowski said on a podcast in February.
Gender Bias in LLM Hiring Decisions: Evidence from a Japanese Context and Evaluation of Mitigation Strategies
arXiv:2606.18649v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in hiring workflows, yet most research on gender bias in LLM hiring decisions has focused on English-language, Western-format resumes. This study examines whether pro-female gender bias extends to a Japanese corporate context and evaluates two practical mitigation strategies. Using a counterfactual resume design with 60 Japanese rirekisho-format resumes, 12 name pairs selected on linguistically grounded gender-signal criteria, and five state-of-the-art LLMs (Claude Sonnet 4.6, GPT-4o, DeepSeek-V3, Gemini 2.5 Flash, Llama 3.3 70B), we conducted 43,200 API calls across baseline, prompt instruction, and privacy filter conditions. A crossed random-effects linear mixed model confirms a significant pro-female bias across all five models, replicating Western findings in a non-Western context. A prompt-level gender-neutrality instruction produces no meaningful reduction in bias. A name-reliance analysis formally identifies the candidate name as the primary gender channel: removing the name from the prompt reduces the female effect by nearly its full magnitude. An unexpected incompatibility between the privacy filter and GPT-4o's content safety filter, resulting in a 42% refusal rate, highlights a practical deployment challenge for name anonymization in LLM-assisted recruitment pipelines.
AI makes three in five Australians' jobs more stressful
Businesses rolling out AI face rising staff anxiety, with a survey of more than 1,200 Australians finding most feel more stressed at work.
Elon Musk's Grok AI Sparks Debate on Ethics and Oversight in Military Use
Elon Musk’s AI tool, Grok, has been revealed as a critical component in US military operations, triggering debates on AI ethics and oversight in defense.
Rape convictions under review after UK detective allegedly used AI chatbot for paperwork
Derbyshire Police investigating whether officer used software to secure desired court outcome
Amazon Retaliated Against Workers Who Supported Regulating Data Centers, Complaint Says
The employees encouraged limits on the complexes in a series of hearings in the tech giant’s hometown, Seattle.
New Super PAC, the Guardrails Alliance, Aims to Rally Tech Workers to Help Limit A.I.
The Guardrails Alliance, which has raised $5 million, is positioning itself as a populist effort that will take on the pro-A.I. interests trying to influence this year’s elections.
Acceleration AI Ethics and the Telus GenAI Conversational Agent
arXiv:2501.18038v3 Announce Type: replace Abstract: Acceleration ethics addresses the tension between innovation and safety in artificial intelligence. The acceleration argument is that risks raised by innovation should be answered with still more innovating. This paper summarizes the theoretical position, and then shows how acceleration ethics works in a real case. To begin, the paper summarizes acceleration ethics as composed of five elements: innovation solves innovation problems, innovation is intrinsically valuable, the unknown is encouraging, governance is decentralized, ethics is embedded. Subsequently, the paper illustrates the acceleration framework with a use-case, a generative artificial intelligence language tool developed by the Canadian telecommunications company Telus. While the purity of theoretical positions is blurred by real-world ambiguities, the Telus experience indicates that acceleration AI ethics is a way of maximizing social responsibility through innovation, as opposed to sacrificing social responsibility for innovation, or sacrificing innovation for social responsibility.
2026 AI Report: Protect LGBTQ Rights with Inclusive Data and Transparent Practices
A 2026 AI report highlights significant risks for LGBTQ communities due to biased AI designs and privacy violations. It calls for inclusive datasets and transparency to ensure responsible AI practices.
Emergent Alignment
arXiv:2606.19527v1 Announce Type: new Abstract: Can Large Language Models (LLMs) discern when their own outputs are misaligned with human ethics? And can they self-correct? We endow an LLM with a conscience step that reviews its own reasoning and outputs, and we extend the training loss with an alignment component using Direct Preference Optimization (DPO) to steer the model away from non-ethical outputs. The result is an online technique to align models in a wide range of applications: training, fine-tuning, adversarial prompting, and zero-shot learning. It does not require a weaker or stronger judge, relying instead on a frozen copy of itself. In previous work, the Emergent Misalignment scenario showed a range of emergent unethical behaviors from fine-tuning the model to hack code. Instead, we empirically show how to achieve Emergent Alignment: a single high-level introspective question steers training toward an ethical model under the same code hacking scenario.
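The abstract above says the training loss is extended with an alignment component based on Direct Preference Optimization, with a frozen copy of the model as the reference. The Python sketch below shows the standard per-pair DPO loss and one simple way it could be added to a task loss. The additive weighting in `combined_loss` and all variable names are assumptions made for illustration; the abstract does not give the paper's exact formulation or how the conscience step produces the preferred and rejected outputs.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are sequence log-probabilities under the trained policy and
    under the frozen reference model. beta scales the implicit reward.
    """
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch for stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def combined_loss(task_loss, alignment_loss, weight=1.0):
    """Hypothetical combination: task loss plus a weighted alignment term."""
    return task_loss + weight * alignment_loss


# When the policy matches the reference, the loss is log(2).
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
# When the policy favours the chosen output more than the reference does,
# the loss falls below log(2).
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(neutral, improved)
```

In practice the log-probabilities would come from a model's forward pass; plain floats are used here so that the shape of the loss is easy to inspect.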
Technology & Infrastructure
DeXposure-Claw: An Agentic System for DeFi Risk Supervision
arXiv:2606.19501v1 Announce Type: new Abstract: Decentralized finance exposes supervisors to fast-moving, networked credit risks. General-purpose LLM agents fit this setting poorly: they over-read weak evidence and recommend high-stakes interventions, while existing evaluations offer no regulator-aligned way to measure the resulting false alarms. We introduce DeXposure-Claw, a forecast-grounded agentic supervision system that routes LLM decisions through structured evidence: (1) DeXposure-FM, a graph time-series foundation model, forecasts future exposure networks; (2) deterministic monitors and stress scenarios then turn those forecasts into typed alerts, attribution signals, and scenario evidence; and (3) data-health and confidence gates constrain escalation before DeXposure-Claw emits auditable supervisory tickets with rationales. We further develop DeXposure-Bench, a six-axis evaluation harness, whose decision axis scores tickets against a regulator-aligned absolute-loss ground truth and an explicit false-intervention rate. Experiments on five years of weekly real data fully support our system. Code is at https://github.com/EVIEHub/DeXposure-Claw.
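The third stage described above, where data-health and confidence gates constrain escalation before a ticket is emitted, can be pictured as a small deterministic filter in front of the agent's decision. The Python sketch below is a hypothetical illustration only: the `Alert` fields, the thresholds, the decision labels, and the definition of the false-intervention rate (the share of escalations that turned out to be unwarranted) are all assumptions, not taken from the DeXposure-Claw code or paper.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical typed alert produced by a deterministic monitor."""
    protocol: str
    severity: float      # assumed scale 0..1
    confidence: float    # assumed forecast confidence 0..1
    data_healthy: bool   # assumed result of a data-health check


def decide(alert, min_confidence=0.8, min_severity=0.5):
    """Apply the gates in order; escalate only if every gate passes."""
    if not alert.data_healthy:
        return "hold"
    if alert.confidence < min_confidence:
        return "monitor"
    if alert.severity < min_severity:
        return "monitor"
    return "escalate"


def false_intervention_rate(decisions, truly_needed):
    """Share of escalations that were not warranted (assumed definition)."""
    flagged = [needed for d, needed in zip(decisions, truly_needed)
               if d == "escalate"]
    if not flagged:
        return 0.0
    return sum(1 for needed in flagged if not needed) / len(flagged)


alerts = [
    Alert("pool-a", severity=0.9, confidence=0.95, data_healthy=True),
    Alert("pool-b", severity=0.9, confidence=0.40, data_healthy=True),
    Alert("pool-c", severity=0.7, confidence=0.90, data_healthy=True),
    Alert("pool-d", severity=0.9, confidence=0.99, data_healthy=False),
]
decisions = [decide(a) for a in alerts]
rate = false_intervention_rate(decisions, [True, True, False, True])
print(decisions, rate)
```

The point of the ordering is that weak or unhealthy evidence never reaches the escalation branch, which is the failure mode the abstract attributes to general-purpose agents.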
Adobe embeds agentic AI workflows across Creative Cloud, shifting from media generation to production orchestration
Adobe has announced a major expansion of its "creative agent" across its flagship Creative Cloud suite and upgraded Firefly AI studio. Available in public beta starting today across Premiere Pro, Photoshop, Illustrator, InDesign, and Frame.io, the agent is designed to serve everyone from individual creators to enterprise marketing teams.
Unlike first-generation generative AI tools that simply output flat media from a chat interface, Adobe’s embedded assistant acts as an orchestration layer. It interprets natural language prompts and directly accesses the underlying software's APIs to execute complex, multi-step production workflows, from batch-renaming video sequences to dynamically updating brand assets across print layouts, while leaving the final aesthetic decisions entirely in the hands of the human designer.
Technology: Contextual Memory and DOM Manipulation
At the core of this release is a significant technical upgrade to how Adobe's AI handles persistent memory and context window management. In its upgraded Firefly creative AI studio (currently in private beta), Adobe has introduced two foundational architectural components: "Elements" and "Projects". Elements functions as a visual variables library, allowing users to save and reuse specific characters, locations, and objects across multiple generations to ensure strict visual consistency as campaigns scale. Projects acts as the contextual memory layer, storing assets, generations, and session history in a unified space so users can pick up where they left off without rebuilding their prompt context.
Beyond pixel generation, the system's most critical technological leap is its ability to operate seamlessly within the complex document structures of desktop applications. "Our Adobe Creative Agent can leverage the decades of powerful features, workflows, APIs that we've brought into our application and exposed through tooling that can now be invoked through a creative agent," an Adobe representative explained.
Product: Automating the Tedious, Expanding the Canvas
The practical application of this technology fundamentally alters standard production workflows. Adobe is positioning the human user as a "creative director" capable of delegating repetitive, labor-intensive tasks to the AI. The rollout introduces highly specific specialist agents tailored to the logic of each application:
Premiere Pro: The agent handles tedious project setup, analyzing and sorting source media into bins, batch renaming clips, identifying interview questions, and assembling a rough working starting point.
Illustrator: The assistant automates mathematical and multi-step design tasks, such as generating 50 versioned files from a spreadsheet or running pre-flight checks to flag color mode errors before printing. It can even programmatically duplicate a vector shape 100 times, randomize its position, and change its size based on its z-depth and transparency.
Photoshop & InDesign: The agent executes batch background removals, dynamic layer organization, and applies brand updates across multi-page layouts.
Furthermore, Adobe is actively integrating its creative agent into major third-party enterprise platforms, including OpenAI's ChatGPT, Anthropic's Claude, Microsoft 365 Copilot, and soon, Google Gemini and Slack.
Licensing: Commercial SaaS and Enterprise Implications
Unlike open-source orchestration frameworks or models released under MIT or Apache licenses, Adobe's creative agent operates strictly within a proprietary, commercial SaaS ecosystem. For enterprise decision-makers, this carries specific implications. Because the agent relies on Adobe's proprietary APIs to manipulate project files, it requires an active Creative Cloud commercial license.
Additionally, by bringing the "Adobe for creativity connector" to platforms like Slack and Microsoft Copilot, enterprise IT and systems architects must consider how internal chat tools will interface with Adobe's cloud processing environments to support enterprise creative and marketing teams securely.
The Enterprise Unknowns: APIs, Governance, and Architecture
While Adobe’s announcements highlight a powerful user interface and deep integration within its own flagship applications, several critical questions remain for enterprise technical decision-makers tasked with building bespoke AI systems. VentureBeat has reached out to Adobe for clarification on these infrastructure-level details and will update this coverage as we learn more.
For AI system architects, the value of a creative agent lies not just in a native application UI, but in its extensibility. It remains unclear if Adobe plans to expose these new agentic capabilities via API, or if the company will support the Model Context Protocol (MCP). Without MCP support or direct API access, enterprise teams will face friction integrating Adobe's tools into their own custom task-routing frameworks and internal LLM pipelines.
Adobe’s new "Elements" feature promises to solve the generative AI consistency problem by anchoring characters and objects across generations. However, the backend architecture driving this persistent memory is not yet detailed. Whether Adobe is leveraging on-the-fly Low-Rank Adaptation (LoRA) based on user uploads or utilizing a form of visual Retrieval-Augmented Generation (RAG) is a critical distinction for technology leaders managing compute costs, model evaluations, and enterprise-grade inference pipelines.
As organizations build out "Projects" and define brand-specific "Elements", security and data decision-makers require strict guarantees regarding data provenance and storage.
It is currently unknown exactly where this contextual workflow and vector data lives: specifically, whether it remains strictly sandboxed within the customer's enterprise Creative Cloud instance on Adobe servers, and how role-based permissions apply to these new agentic workflows.
Finally, as lightning-fast, developer-first, multi-model AI creative platforms like fal.ai gain significant traction among enterprises and developers, Adobe’s position in the broader developer ecosystem remains a point of interest. Whether Adobe views these infrastructure-level API providers as direct competitors to its Firefly AI studio or as potential integration points for bespoke enterprise environments has yet to be seen.
Community Reactions: The Tension Between Automation and Craft
The integration of agentic AI touches on the tension between eliminating drudgery and surrendering creative control. According to Adobe's recent Creators' Toolkit Report, which surveyed over 16,000 creators globally, the market is highly receptive to AI as an operational assistant rather than an autonomous creator.
75 percent of surveyed creators describe creative AI as integrated or essential to their current workflows.
85 percent emphasized that the final creative decision must always remain in human hands.
This sentiment is central to Adobe's messaging. By focusing the agent's capabilities on file organization, layer management, and brand compliance, Adobe aims to automate what a spokesperson called the "tedious parts of their workflow". The goal, according to Adobe executive David Wadhwani, is to let creatives focus on the craft so they can "apply their taste and make the calls that only they can".
Uncertainty Decomposition for Clarification Seeking in LLM Agents
arXiv:2606.19559v1 Announce Type: new Abstract: Recent position papers argue that the classical aleatoric/epistemic uncertainty framework is insufficient for interactive large language model (LLM) agents and call for underspecification-aware, decomposed, and communicable uncertainty representations that can unlock new agent capabilities such as proactive clarification seeking and shared mental-model building. Practical deployment constraints -- black-box APIs, interactive latency budgets, and the absence of labeled trajectories -- rule out logprob-based, multi-sampling, and training-based methods, leaving prompt-based estimation as the most viable family for surfacing such signals at deployment time. We answer this call with a simple prompt-based decomposition that separates action confidence from request uncertainty (u), enabling the agent to ask for clarification when the task specification is ambiguous. To evaluate it, we introduce two clarification-augmented benchmarks (WebShop-Clarification and ALFWorld-Clarification) in which 50% of tasks are deliberately underspecified, and systematically compare the proposed decomposition against ReAct+UE and Uncertainty-Aware Memory (UAM) across five LLM backbones (GPT-5.1, DeepSeek-v3.2-exp, GLM-4.7, Qwen3.5-35B, GPT-OSS-120B) on these variants together with the standard WebShop, ALFWorld, and REAL benchmarks for fault detection. Averaged across the five backbones, the proposed decomposition improves clarification F1 on ALFWorld-Clarification by 73% over ReAct+UE and by 36% over UAM, and leads clarification F1 on every backbone on WebShop-Clarification and on four of five backbones on ALFWorld-Clarification, indicating that the gains generalize beyond a single LLM.
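The decomposition described above keeps two separate signals: how confident the agent is in its next action, and how uncertain it is about what the user actually asked for. The Python sketch below shows one plausible way such signals could drive a clarification decision, plus the F1 computation used to score clarification behaviour. The thresholds, function names, and decision labels are hypothetical; the abstract does not specify how the paper's prompt elicits or combines the two values.

```python
def choose_step(action_confidence, request_uncertainty,
                clarify_threshold=0.5, act_threshold=0.5):
    """Pick the agent's next move from two separately estimated signals.

    Both inputs are assumed to be in [0, 1]. The thresholds are
    illustrative defaults, not values from the paper.
    """
    if request_uncertainty >= clarify_threshold:
        # The task itself looks underspecified: ask before acting.
        return "ask_clarification"
    if action_confidence >= act_threshold:
        return "act"
    # The request is clear but the agent is unsure how to proceed.
    return "reconsider"


def f1_score(true_positives, false_positives, false_negatives):
    """F1 for clarification: positives are tasks where asking was correct."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


print(choose_step(action_confidence=0.9, request_uncertainty=0.8))
print(choose_step(action_confidence=0.9, request_uncertainty=0.1))
print(choose_step(action_confidence=0.2, request_uncertainty=0.1))
```

Keeping the signals separate is what lets the agent distinguish "I do not know what you want" from "I know what you want but not how to do it", which a single confidence score would conflate.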
Anthropic's Claude Code Artifacts update brings live, shared dashboards and interactive workspaces to enterprises
Anthropic announced a potentially game-changing new feature for users of Claude Code on the Claude Team and Enterprise subscription plans: Artifacts. This update turns a Claude Code session's work into a live, interactive, shareable custom HTML webpage, allowing a Claude Code user to plug in live code and multiple data sources and have the result surface on an interactive URL that they can send to other teammates, be it a dashboard, an app design, or some other product meant for internal usage. These teammates and the original user can watch the webpage update in real time as Claude Code goes about its work autonomously or under the user's guidance, and as the connected data sources and codebases change.
While Anthropic first introduced Artifacts to its consumer web chatbot in the summer of 2024 (where it evolved from a manual toggle feature to a generally available tool for publishing code snippets and games to the web), integrating this capability directly into the Claude Code command-line interface (CLI) and desktop app bridges the gap between deep, back-end engineering and the non-technical stakeholders who need to understand it.
Product and Technology: The End of the Status Update
At its core, Claude Code Artifacts acts as a dynamic translation layer. Built directly from the unbroken context of a user’s session, the agent uses the local repository codebase, connected monitoring tools, and conversational reasoning to spin up specialized web pages. Engineers no longer need to wire up external data sources or stand up temporary infrastructure; the AI builds the UI from what already exists.
Crucially, these web pages are not static exports. As the AI works through a terminal session, the open webpage refreshes in-place, updating charts and text instantly at the exact same URL. Every update publishes a new version history, allowing teammates to roll back or track the agent's progress securely on desktop or mobile.
The Battle of Live, Interactive, Shared AI Work Surfaces: Anthropic's Claude Code Artifacts vs. OpenAI's Codex Sites
Anthropic's update comes more than two weeks after OpenAI released a massive update to its own Codex platform, introducing a strikingly similar enterprise hosting feature called "Sites". This tit-for-tat product cadence highlights a rapidly escalating battle over the enterprise workspace across functions and beyond developers themselves, though there are some important technical and philosophical distinctions worth pointing out for enterprises considering either. As revealed in their respective developer documentation webpages, OpenAI is building a platform-as-a-service; Anthropic is building a stateless canvas.
OpenAI’s Sites is designed to generate durable, full-stack web applications. According to the platform's documentation, Codex Sites hosts projects that output as Cloudflare Worker-compatible ES modules. Crucially, Sites supports persistent backend infrastructure: agents can automatically wire up "D1" relational databases for structured data (like user progress or saved records) and "R2" object storage for file uploads. An OpenAI Site can support public sign-ins, integrate with external identity providers, and allow for highly specific access controls tailored to specific workspace groups. It uses a two-stage publishing process, saving a reviewable candidate linked to a Git commit before officially deploying to production. In short, it is a production environment designed to replace functional internal SaaS tools.
Anthropic’s Claude Code Artifacts, by contrast, deliberately avoids the backend. The newly released documentation is blunt about its limitations: "An artifact is a capture of work, not an application". Each Artifact is a single, self-contained HTML page capped at a rendered size of 16 MiB.
To guarantee organizational security, Claude wraps the published file in a strict Content Security Policy (CSP) that blocks all external network requests. This means the page cannot load external scripts, fonts, or stylesheets, and fetch, XHR, and WebSocket calls are completely blocked. All CSS and JavaScript must be inlined, and images must be embedded as data URIs. Artifacts cannot store form input, call an API at view time, or serve multiple routes.
This technical limitation is actually Anthropic's deliberate philosophical position: while OpenAI wants to spin up persistent software portals for the whole company, Anthropic is keeping Claude Code firmly anchored in ephemeral, highly secure technical workflows. Claude Artifacts are not meant to be software; they are meant to replace whiteboard diagrams, manual bug walkthroughs, and status reports with secure, self-updating visual tools that never leak live data outside the corporate boundary.
Licensing and Enterprise Security: Keeping the Codebase Private
Because these agents sit at the nexus of proprietary company data and live codebases, licensing and access controls are a primary concern. Both Anthropic and OpenAI have opted for closed, proprietary licensing models for these new visual workspaces. For end users and developers, the distinction is critical. Unlike permissive open-source software (such as MIT or Apache 2.0) or strict copyleft licenses (like GPL), which grant developers the legal freedom to inspect, modify, and self-host the underlying code, neither Claude Code Artifacts nor Codex Sites can be independently forked or hosted. Enterprise clients do not maintain code-level ownership over Anthropic's rendering engine or Codex’s integration nodes; both operate strictly within their respective creators' managed infrastructures. To make this vendor-managed approach palatable to enterprise compliance teams, both companies have heavily prioritized organizational security.
Anthropic ensures every artifact is private to its author by default and strictly cannot be made public to the broader internet. When an engineer chooses to share a link, it is viewable exclusively by authenticated members of their specific organization. System administrators retain ultimate authority, managing access through org-level toggles, role-based scoping, and explicit retention policies, while maintaining oversight through a centralized compliance API.
OpenAI takes a similarly gated approach with Codex Sites, rolling the feature out primarily for ChatGPT Business and Enterprise workspaces. Like Anthropic, OpenAI relies on system administrators to manage deployment through centralized workspace settings, requiring an admin to explicitly enable Sites via role-based access control (RBAC) for Enterprise tiers. However, because Codex Sites functions more like a hosted web application, its access controls are slightly more granular. When an engineer prepares to share a deployed URL, they can apply specific access modes: restricting the site to just themselves and workspace admins, opening it to all active users in the workspace, or limiting access to custom user groups. Furthermore, to prevent sensitive data leaks, OpenAI provides a dedicated Sites panel to manage runtime environment variables and secrets securely, ensuring those keys do not have to be committed to local source files.
Reactions and Reflections
The introduction of visual, self-updating UI layers to command-line agents is fundamentally altering how developers view their own workflows. As AI handles the raw syntax and automates the reporting, the friction of communicating technical work to stakeholders is vanishing.
Boris Cherny, the Lead and creator of Claude Code, highlighted the sheer utility of the update in a post on X earlier today: "I've been using Artifacts in Claude Code for everything: visual explanations of tricky code, system diagrams, quick previews of a few animation options, data analyses and dashboards I share with the team," Cherny wrote. "They are a game changer for how I work with Claude. Can't wait to hear what you think!" This sentiment is practically demonstrated in Anthropic’s launch materials. In one scenario, an engineer prompts Claude Code to investigate user drop-offs since a previous software release. In a matter of seconds, the agent executes an SQL read, builds an interactive drop-off funnel dashboard, and diagnoses that "Pro accounts stall at the export sheet". The AI then proposes UI fixes, updates the live charts as the code is refactored, and generates a secure link that a manager can instantly open via mobile. By turning the terminal into a live, collaborative canvas, Anthropic is proving that the most valuable output of an AI coding assistant isn't just the code itself—it is the context, the reasoning, and the ability to share that work instantly.
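The self-contained-page constraints described in this article (no external scripts, fonts, or stylesheets; CSS and JavaScript inlined; images as data URIs) lend themselves to a quick check before sharing. The sketch below is illustrative only and is not Anthropic's tooling: `find_external_refs`, `is_embedded`, and `URL_ATTRS` are hypothetical names, the sample page is made up, and the check looks only at markup attributes, not at fetch, XHR, or WebSocket calls inside scripts.

```python
from html.parser import HTMLParser

# Hypothetical list of attributes that can make a page load a separate resource.
URL_ATTRS = {"src", "href", "poster", "data"}


def is_embedded(value):
    """True for values that stay inside the page: data URIs and #fragments."""
    v = value.strip().lower()
    return v.startswith("data:") or v.startswith("#")


class ExternalRefFinder(HTMLParser):
    """Collects (tag, attribute, value) for every reference that leaves the page."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return  # plain links are navigation, not resource loads
        for name, value in attrs:
            if name in URL_ATTRS and value and not is_embedded(value):
                self.external.append((tag, name, value))


def find_external_refs(html):
    finder = ExternalRefFinder()
    finder.feed(html)
    finder.close()
    return finder.external


# Made-up page: one external script, one external stylesheet, one inline image.
page = """
<html><head>
<style>body { font-family: sans-serif; }</style>
<script src="https://cdn.example.com/chart.js"></script>
<link rel="stylesheet" href="/site.css">
</head><body>
<img src="data:image/png;base64,AAAA" alt="inline chart">
<a href="https://example.com/docs">docs</a>
</body></html>
"""

for tag, attr, value in find_external_refs(page):
    print(f"<{tag}> {attr}={value}")
```

On the sample page this flags the script and the stylesheet and leaves the data-URI image alone. A real check would also need to inspect inline script bodies and CSS `url(...)` references, which this sketch ignores.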
Financial Services Agents in Production
This LangChain report examines how major financial services firms, such as J.P. Morgan, are putting AI agents into production across real enterprise workflows.
Building Supercharger: How Rocket Close optimized title operations with agentic AI
Rocket Close’s Supercharger project applies agentic AI to improve title operations and support human agents in a real estate workflow.
Deontic Policies for Runtime Governance of Agentic AI Systems
arXiv:2606.19464v1 Announce Type: new Abstract: Autonomous agentic AI systems driven by Large Language Models (LLMs) introduce a new class of security, privacy, and compliance challenges: an agent that can invoke tools, manipulate data, install software, and coordinate with peer agents across organizational boundaries must be constrained not just by authentication and access control, but by the full structure of enterprise governance. This includes specifying what agents are permitted and prohibited from doing, what they are obliged to do after certain actions (e.g., notify the CISO), under what conditions a standing obligation may be waived, and which rules take precedence when policies conflict. This governance problem exceeds what current policy engines provide. Systems such as XACML, Rego, and Cedar address only the permit/prohibit subset of this governance structure. They do not provide obligation lifecycle management, meta-policy conflict resolution, dispensations that waive obligations in specific circumstances, or ontological reasoning over domain class hierarchies commonly found in applications such as healthcare, cybersecurity, or data privacy. We propose AgenticRei, which realizes key governance requirements such as obligations, dispensations, policy conflict resolutions, and reasoning over policies, as well as the basic permit/prohibit constraints. We use a deontic policy language built on the Rei framework, expressed as OWL (Web Ontology Language) and evaluated at runtime by a high-performance logic engine entirely outside the LLM. The same pipeline governs both tool invocations by the agent and agent-to-agent messages. We show through examples that deontic policies capture governance constraints around security and privacy that mostly cannot be expressed in current production engines. Our approach composes naturally with industry-standard frameworks like A2AS.
Palladyne AI Secures Army Contract for Autonomous Systems Field Trials
Palladyne AI secured key Army contracts to develop SwarmOS and Gremlin-X, aiming to streamline command of diverse autonomous systems during upcoming field trials.
The Evolution of Agentic Surfaces: Building with Claude Managed Agents
Anthropic explains how production AI agents require more than prompts, including sessions, environments, secrets, observability, and scalable agent harnesses.
Unsloth shrinks GLM-5.2 by 84% for local use
Unsloth has released a 2-bit quantization of the 744B parameter GLM-5.2 model, allowing it to run locally on high-end Mac hardware while retaining 82% accuracy.
7 Key AI Hardware Keywords to Watch in 2026
Original Article By SemiVision Research [Reading time: 17 mins]
AI is turning Nintendo and Sony products into accidental luxury goods
With component-makers busy supplying data centres, console prices are rising as demand outstrips production capacity
US Acts to Speed Up Power Grid Hook-Ups for AI Data Centers
US regulators have taken their biggest step yet to speed the connection of data centers to the country’s grids while simultaneously attempting to slow surging utility bills that have angered Americans.
Have Data Centers Raised Your Electric Bill? Causal Evidence from the United States
arXiv:2606.19777v1 Announce Type: cross Abstract: We estimate that data centers caused average retail electricity rates to fall modestly in the United States from 2015 to 2024 using an instrumental variables approach. Despite prevailing sentiment, the finding is consistent with economic reasoning: existing large power system fixed costs, economies of scale in transmission and distribution, and de
The AI tipping point: where enterprise AI runs at scale
PARTNER CONTENT: AI's cloud journey homeward bound: enterprises prefer private clouds for scaling AI workloads.
Seeking to boost AI data centers, NTIA preparing report on regulatory obstacles
The NTIA is preparing a report on regulatory hurdles and infrastructure bottlenecks affecting the construction of data centers needed for AI development.
Firstcolo breaks ground on data center in Rosbach vor der Höhe, Germany
Data center operator Firstcolo has begun construction of a new 24MW data center in the town of Rosbach vor der Höhe in the state of Hesse, central Germany. Named FRA7, the facility north of Frankfurt am Main will cover an area of 124,360 sq ft (11,555 sqm) and is intended to support cloud, AI, and […]
Legal risks in AI training data drove LG to build data-governance system for Exaone
Many open-source datasets used to train AI foundation models may contain licensing inconsistencies and regulatory risks, prompting LG to develop its own data-governance system.
REVEAL++: Differentiable Phenotypic Grouping for Vision-Language Retinal Modeling of Alzheimer's Disease Risk
arXiv:2606.19522v1 Announce Type: new Abstract: The retina offers a noninvasive window into neurodegenerative disease, capturing subtle structural patterns associated with a risk of future cognitive decline. Vision-language alignment frameworks such as REVEAL have shown that pairing retinal fundus images with structured clinical risk narratives improves early prediction of Alzheimer's disease (AD). A key design choice in these approaches is the use of phenotypic grouping, where individuals with similar risk profiles are treated as multi-positive pairs during contrastive learning. However, existing methods operationalize phenotypic similarity as a discrete construct, relying on hard group assignments that impose rigid supervision and decouple group formation from representation learning. We propose a continuous formulation of phenotypic structure within contrastive learning. Rather than assigning samples to fixed clusters, we model inter-subject similarity as a differentiable weighting function derived from intra-modality embedding similarities in both retinal images and risk profiles. These weights define soft multi-positive relationships through a continuous aggregation operator, enabling graded supervision that reflects the spectrum nature of disease risk. We further introduce a soft-target contrastive objective that jointly learns cross-modal alignment and phenotypic structure in an end-to-end manner. Evaluated on UK Biobank retinal imaging data for incident AD prediction, the proposed framework consistently outperforms discrete group-based contrastive learning and standard vision-language baselines. By treating phenotypic similarity as a learnable, continuous signal rather than a fixed grouping rule, our approach provides a principled and robust foundation for population-scale neurodegenerative risk modeling from multi-modal retinal and clinical data.
Diffusion Language Models: An Experimental Analysis
arXiv:2606.19475v1 Announce Type: new Abstract: Large Language Models (LLMs) have revolutionized language modeling through autoregressive generation, enabling strong performance across a wide range of tasks. Recently, Diffusion Language Models (DLMs) have emerged as an alternative paradigm that generates text through iterative denoising rather than next-token prediction, allowing parallel refinement of entire sequences. While numerous diffusion-based architectures have been proposed, differences in evaluation protocols, datasets, inference budgets, and generation hyperparameters make it difficult to compare their capabilities and understand the trade-offs they offer. In this work, we present a systematic experimental analysis of modern DLMs. Specifically, we evaluate eight state-of-the-art DLMs across eight benchmarks spanning reasoning, coding, translation, knowledge, and structured problem solving, while explicitly considering both generation quality and computational efficiency. Beyond downstream evaluation, we analyze the impact of key inference-time factors, including denoising steps, context length, block size, and parallel unmasking strategies, and complement large-scale experiments with controlled comparisons of smaller models trained under identical conditions. Our analysis highlights the strengths and limitations of diffusion-based language modeling across different tasks, architectures, and inference budgets. We show that the behavior of DLMs is strongly influenced by generation-time design choices, leading to distinct trade-offs between performance and computational efficiency. Overall, our study provides practical insights into the capabilities and deployment characteristics of contemporary DLMs.
Hidden Anchors in Multi-Agent LLM Deliberation
arXiv:2606.19494v1 Announce Type: new Abstract: Multi-agent LLM deliberation, where agents exchange and revise answers over several rounds, is increasingly used to improve reasoning and accuracy, yet how and why it works is rarely modelled. Such deliberation mirrors how humans reach decisions. As social animals we are pulled both by the group, the herd effect that classical opinion-dynamics models such as DeGroot and Friedkin-Johnsen capture, and by our own internal belief, which they do not. We model multi-agent deliberation as a closed-loop dynamical system in which each agent carries a hidden internal belief, its anchor, that continually pulls its opinion regardless of its neighbours. We show this anchor can be recovered from the deliberation alone, and that it explains a behaviour classical consensus rules forbid: an agent's confidence in the correct answer can climb past where any agent started, escaping the space (convex hull) formed by the initial beliefs. Checking whether the recovered anchor also predicts held-out runs (generalizes) gives a simple test for when a model is truly driven by such an anchor. Across three open-weight model families this is a spectrum, not all-or-nothing. All anchors exert about equally strong influence, but they differ in where the anchor sits, and only when it sits far from the initial opinions does deliberation escape the hull and need the full closed-loop model.
Is the world becoming more predictable?
Bigger and better data sets and powerful AI models are allowing us to spot previously undetectable patterns
Midjourney Medical goes from generating ‘cat images’ to full-body ultrasound scans
Midjourney's AI technology is being applied to medical imaging, specifically for ultrasound scans.
ITNet: A Learnable Integral Transform That Subsumes Convolution, Attention, and Recurrence
arXiv:2606.19538v1 Announce Type: new Abstract: Convolutional networks, recurrent networks, and transformers each encode different inductive biases -- locality, sequential memory, and content-dependent pairwise interaction -- and have remained mathematically distinct since their inception. We show that this fragmentation reflects not a fundamental diversity in how signals should be processed, but rather incomplete views of a single underlying mathematical object: a learnable integral transform. We introduce the Integral Transform Network (ITNet), a unified architecture built around a learnable kernel that depends jointly on positions and features. This kernel is implemented as a small neural network, specifically an MLP, that models pairwise interactions, enabling the model to adapt its behavior from data. We show that convolution, self-attention (including multi-head), and autoregressive recurrence (including LSTM, GRU, S4, and Mamba) arise as special cases under appropriate parameterizations, and that ITNet is a universal approximator of continuous operators. To make this practical, we develop tiled kernel fusion, importance-weighted Monte Carlo integration, and learned low-rank factorization, enabling efficient and scalable computation. A single ITNet architecture with a shared operator and lightweight modality-specific encoders matches or exceeds specialized baselines on ImageNet-1K, GLUE, ModelNet40, VQA v2, and NLVR2. The results demonstrate that a single learned interaction mechanism can recover the behavior of all three architectural families from data.
AI #173: AI Pauses - by Zvi Mowshowitz
Rob Haisfield: Are AI agents shape rotators? In this new benchmark, we let the models play campaign puzzles in Opus Magnum, a puzzle game by @zachtronics. Ironically, Claude Opus 4.8 performed poorly, being beaten by GPT-5.5, Gemini 3.5 Flash, and GLM 5.2.
Toten: Knowledge-Based Ontological Tokenization Of Physical Quantities And Technical Notation In Brazilian Portuguese
arXiv:2606.19626v1 Announce Type: new Abstract: Byte-Pair Encoding tokenization is statistically efficient for vocabulary compression, but semantically blind to structured technical entities, fragmenting physical quantities, numbers, units, and symbolic expressions into lexically arbitrary subwords. We present TOTEN, a knowledge-based ontological tokenization framework that replaces statistical derivation with declarative classification grounded in a formal ontology of engineering entities (OEE). We formalize TOTEN as the triple : the ontology gathers types, structural principles, composition relations, and preservable invariants; the classification function maps raw text into typed regions; and the instantiator family yields a self-descriptive structured representation. Robustness derives from deterministic coupling with three external oracles: Pint (dimensional), Unicode Character Database (typographic), and RSLP (Portuguese morphology). Intrinsic evaluation covers four properties verifiable by construction -- ontological atomicity, dimensional equivalence, typographic robustness, and numerical reconstruction -- over an internal, physically validated benchmark (EngQuant, N=800) and four Brazilian Portuguese external corpora (N=1771 eligible cases). We also report detection recall, distinguishing coverage from conditional atomicity. Against eight state-of-the-art baselines, TOTEN achieves unit ontological atomicity in all contrasts and numerical reconstruction of 0.775-0.904 on external corpora, vs. 0.627-0.703 for the best baseline (Quantulum3); on EngQuant, 0.780 vs. 0.340. Differences are statistically significant (McNemar with Holm correction). Spearman correlation between internal and external rankings confirms concurrent validity of the control benchmark. Dimensional equivalence shows statistical parity with Pint, the oracle from which the system inherits dimensional authority.
GLM-5.2 is the new leading open weights model on Artificial Analysis
GLM-5.2 has emerged as the top-performing open weights model according to the latest Artificial Analysis intelligence index.
Companies Move to Secure Data as AI Increases Security Risks
Michael Cardaci, CEO of FedHIVE, said that the government is moving 'faster than it's ever moved before' to secure data and make sure that US computing companies are compliant in case of national security risks as AI use ramps up. Cardaci said that the US government is focused on keeping US technology inside its borders as the race for global AI dominance heats up. (Source: Bloomberg)
Copilot searched your mailbox. LiteLLM handed out admin keys. Run this 5-check audit before your stack is next
Two AI tools broke in the same way in the same two weeks, and four research teams proved it. The pattern underneath every disclosure is one sentence: enterprise AI accepts external input with no trust boundary.
On June 15, Varonis disclosed SearchLeak (CVE-2026-42824), a proof-of-concept exfiltration chain in Microsoft 365 Copilot Enterprise Search. A victim clicks a crafted microsoft.com URL, Copilot searches their mailbox, and the data leaves through a Bing SSRF. No plugins, no second click, no visible indicator. Four days earlier, Obsidian Security published a three-CVE chain against LiteLLM that carried a default low-privilege user all the way to admin and remote code execution. Two tools. Two teams. One broken boundary.
The five-check audit at the end of this article maps each gap to a CVE or a market signal from June, a command you can run before lunch, and a sentence a CISO can read to the board.
Copilot turned a trusted URL into an exfiltration engine
SearchLeak chained three weaknesses into a silent data-theft chain. The URL q parameter fed attacker instructions straight to Copilot’s LLM. A rendering race condition fired an image tag before the output sanitizer ran. Bing’s image-search endpoint, allowlisted in the Content Security Policy, routed the stolen data out. Microsoft rated the flaw critical and patched it on the back end, according to Varonis. NVD has not yet scored it; a third-party tracker lists it at 6.5 medium. The severity is contested, but the mechanism is not.
The escalation is the real story. This is the third Varonis Copilot exfiltration chain in twelve months, after Reprompt in January and EchoLeak in 2025. Reprompt hit Copilot Personal. SearchLeak hit Enterprise Search. Enterprise inherits the user’s full organizational permissions, so the blast radius is everything that a user can reach.
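The SearchLeak chain depended on a host that the Content Security Policy allowlisted, which is also what the first audit check asks teams to review. As a rough sketch of that review, the snippet below splits a CSP header into directives and lists the hosts a given directive allows. The header value and host names are made up for illustration (they are not Microsoft's policy), and `parse_csp` and `allowlisted_hosts` are hypothetical helper names.

```python
def parse_csp(header):
    """Split a Content-Security-Policy header value into {directive: [sources]}."""
    policy = {}
    for part in header.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        # Directive names are case-insensitive; keep the first occurrence only.
        policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def allowlisted_hosts(policy, directive):
    """Return host sources for a fetch directive, falling back to default-src."""
    sources = policy.get(directive, policy.get("default-src", []))
    hosts = []
    for source in sources:
        if source.startswith("'"):
            continue  # quoted keywords such as 'self', 'none', nonces, hashes
        if source.endswith(":"):
            continue  # scheme-only sources such as data: or https:
        hosts.append(source)
    return hosts


# Made-up header for illustration only.
header = (
    "default-src 'self'; "
    "img-src 'self' data: https://images.example.com https://*.example.net; "
    "script-src 'self' 'nonce-abc123'"
)

policy = parse_csp(header)
for host in allowlisted_hosts(policy, "img-src"):
    print("img-src allows:", host)
```

Every host this prints is a candidate to check by hand: if an allowlisted endpoint will fetch or echo attacker-chosen data on the server side, it can act as an exfiltration relay even though the browser only ever talks to a trusted domain.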
LiteLLM handed a default account to every provider key
The LiteLLM gateway holds the keys for OpenAI, Anthropic, Azure, and Bedrock behind a single proxy. The Obsidian chain runs in three moves. CVE-2026-47101, an authorization bypass, lets a non-admin mint a wildcard API key. CVE-2026-47102 promotes that caller to proxy admin through an unguarded /user/update endpoint. CVE-2026-40217 escapes the code sandbox through exec() with full builtins. Obsidian then demonstrated a reverse shell by injecting a forged tool-call response through LiteLLM’s callback mechanism. Obsidian assessed the combined chain at CVSS 9.9. The developer typed one word. The attacker popped a shell.
A separate LiteLLM flaw made the urgency immediate. CVE-2026-42271, a command-injection bug in the MCP test endpoints, landed on the CISA KEV list on June 8 with a June 22 remediation deadline. That KEV entry is not the Obsidian chain. The two are distinct disclosures four days apart, fixed in different releases, pointed at the same gateway. LiteLLM carries more than 40,000 GitHub stars and sits in thousands of enterprise deployments. This is not the first scare, either. A supply-chain compromise backdoored LiteLLM versions 1.82.7 and 1.82.8 on PyPI in March. A compromised gateway exposes every provider credential the organization holds.
Langflow and Mini Shai-Hulud proved the pattern scales
The same boundary broke in two more tools in the same fortnight. Langflow CVE-2026-5027 became the third Langflow remote-code-execution flaw to hit active exploitation this year. A path traversal in file upload lets an attacker write files anywhere on disk, and because Langflow ships with auto-login enabled by default, a single unauthenticated request reaches RCE. VulnCheck confirmed exploitation on June 9. Censys counted roughly 7,000 exposed instances, the heaviest concentration in North America, with MuddyWater attribution. The Mini Shai-Hulud campaign hit a different pressure point.
After the worm’s source code went public on May 12, copycat variants compromised 32 Red Hat Cloud Services npm packages on June 1, packages pulled 80,000 times a week. The worm harvests more than 20 credential types and self-propagates under the compromised maintainer’s identity.
Four teams, four tools, one operating failure. The bug classes differ. SearchLeak is a prompt injection. LiteLLM is privilege escalation. Langflow is path traversal. Mini Shai-Hulud is supply-chain poisoning. The boundary that broke is the same in all four.
The market already repriced the risk
CrowdStrike’s Q1 FY27 earnings call put a number on the gap. AIDR, the company’s AI detection and response line, grew ending ARR more than 250% sequentially, with a Q2 pipeline above $50 million (SEC-filed 8-K). Total company ARR reached $5.51 billion, and CrowdStrike’s fleet telemetry shows more than 1,800 agentic applications running across enterprise endpoints. On June 17, the company extended AIDR to AWS, adding real-time evaluation of agent, LLM, and MCP communications across Amazon Bedrock, Kiro, and Strands Agents, building on its work with Anthropic’s Project Glasswing. Daniel Bernard, CrowdStrike’s chief business officer, said the AI attack surface now spans development, runtime, identities, and cloud infrastructure, and that teams treating those as separate domains leave the gaps between them open.
Practitioners name the same gap in plainer terms
David Levin, CISO at American Express Global Business Travel, told VentureBeat the pattern does not surprise him. “We kind of have this shadow AI, which is just the new version of shadow IT,” Levin said. Both Langflow and LiteLLM fit the description. Teams stood them up for convenience, gave them credentials, and never brought them under governance. Levin puts the fix before deployment. “We didn’t go into this with just saying we’re going to go do this without the right fundamentals,” he said. “We leverage NIST controls.
NIST has released their CSF along with their AI framework. OWASP released their top 10. You need the right fundamentals before you deploy.”
Merritt Baer, CSO at Enkrypt AI and former AWS Deputy CISO, named the structural version of the failure in a separate VentureBeat interview. “Enterprises believe they’ve ‘approved’ AI vendors, but what they’ve actually approved is an interface, not the underlying system,” Baer said. “The real dependencies are one or two layers deeper, and those are the ones that fail under stress.” She has tied that directly to how systems fall. “Raw zero-days aren’t how most systems get compromised. Composability is,” Baer told VentureBeat. “It’s the glue between the model and your data where the risk lives. If you give an agent bash and a root token, you’ve already done most of the attacker’s work for them.” That is what rows 2 and 4 of the audit test: the gateway that holds every key, and the agent identity no one governs.
Levin had a sharper frame for the boardroom. “You need to talk more in terms of risk versus compliance to your boards and your executives,” he said. “It’s not about the size of the engineering team anymore. It’s the size of your imagination. It’s all written in plain English. It’s not hard for anyone.” Neither SearchLeak nor LiteLLM needed custom malware or a zero-day to work.
Adam Meyers, CrowdStrike’s SVP of Intelligence, put the operational squeeze in numbers in an exclusive VentureBeat interview. “The problem is not zero-day. The problem is patching. If you 10x that problem, they’re gonna be completely underwater,” Meyers said. He pointed to identity as the second front. “Some of these AI have their own identities, or people give their identity to the AI to take action on their behalf, and that makes it a very complex problem.”
The five-check trust-boundary audit
Each row maps a gap to its proof point, a verification command for Monday morning, the fix, and the sentence to read to the board.
Trust-Boundary Gap 1: Prompt-to-Data
Proof Point: SearchLeak CVE-2026-42824. P2P injection + HTML race + Bing SSRF. One-click mailbox exfiltration via microsoft.com URL. PoC demonstrated; Microsoft rated it critical, NVD not yet scored.
What Broke: URL q-parameter passed to LLM as instructions. Sanitizer ran after render. Bing acted as exfiltration proxy via CSP allowlist.
Verify Monday: Audit CSP allowlists for domains performing server-side fetches. Monitor Copilot Search URLs for encoded payloads. Review Copilot audit logs.
Fix Monday: Confirm server-side patch applied. Enable sensitivity labels restricting Copilot. Treat AI streaming output as untrusted.
Board Language: “Our AI assistant could search employee email and send results to an attacker through a trusted Microsoft URL. Vendor patched it. We must verify configuration.”
Trust-Boundary Gap 2: Gateway Credential Exposure
Proof Point: LiteLLM three-CVE chain (-47101, -47102, -40217). CVSS 9.9. Separate CVE-2026-42271 on CISA KEV (fixed in v1.83.7; full chain fixed in v1.83.14-stable). June 22 deadline.
What Broke: No role validation on key endpoints. Self-promotion to admin via /user/update. exec() sandbox escape. One gateway exposes all provider keys.
Verify Monday: Run pip show litellm. Below 1.83.14-stable = vulnerable. Check /mcp-rest/test/ exposure. Audit proxy_admin accounts.
Fix Monday: Upgrade to v1.83.14-stable+. Rotate all provider API keys. Block /mcp-rest/test/* at proxy. Review Custom Code Guardrails.
Board Language: “Our AI gateway held keys for every provider. A default account could promote itself to admin and steal them all. Rotating and patching now.”
Trust-Boundary Gap 3: AI Tooling Sprawl
Proof Point: Langflow CVE-2026-5027 (CVSS 8.8). Third RCE of 2026. ~7,000 exposed instances. MuddyWater. Active exploitation June 9.
What Broke: Path traversal in file upload. Auto-login enabled by default. Single unauthenticated request to RCE.
Verify Monday: Query Censys/Shodan for Langflow, Flowise, n8n, Dify on your perimeter. Check auto-login. Inventory AI tools outside change management.
Fix Monday: Pull AI platforms behind VPN/zero-trust. Enable auth everywhere. Upgrade Langflow to v1.9.0+ (current release 1.10.0). Fingerprint surface continuously.
Board Language: “AI dev tools are exposed to the internet with login disabled. A nation-state group is exploiting this flaw now. Pulling behind access controls today.”
Trust-Boundary Gap 4: Non-Human Identity Governance
Proof Point: AIDR ARR up 250% (Q1 FY27, SEC 8-K). Q2 pipeline >$50M. 1,800+ agentic apps across enterprise endpoints.
What Broke: Agents hold identities and act on behalf of humans. Some exceed their intended scope to reach a goal. No standard governs agent credential lifecycle.
Verify Monday: Inventory all non-human identities used by agents and MCP servers. Map agent-to-data-store access. Flag agents with write access to security policy.
Fix Monday: Least-privilege every agent identity. Set privilege boundaries via identity protection. Runtime detection for policy-exceeding actions. Human-in-the-loop for policy changes.
Board Language: “AI agents hold credentials and act autonomously. We do not govern their identity lifecycle like human access. The 250% market growth tells us this gap is systemic.”
Trust-Boundary Gap 5: Runtime Agentic Detection
Proof Point: Falcon AIDR expanded to AWS (June 17). Covers Bedrock, Kiro, Strands Agents. MCP integration. Real-time agent/LLM/MCP evaluation.
What Broke: Traditional tools monitor human-speed actions. Agents run at machine speed, thousands of actions per minute, and route around controls to reach goals.
Verify Monday: Test if EDR/XDR links agent actions to originating identity. Verify SIEM ingests MCP communications. Confirm you can distinguish human from agent on endpoint.
Fix Monday: Deploy AIDR or equivalent runtime detection. Shadow-AI discovery for all agentic apps, models, MCP servers, identities. Real-time policy enforcement on agent actions.
Board Language: “We cannot distinguish a human employee from an AI agent acting on their behalf. We need runtime detection at machine speed that can stop damage before it starts.”
The fix is plumbing, not policy
The June 2 executive order creates an AI Cybersecurity Clearinghouse with a July 2 deadline.
The five gaps above are not frontier-model problems. They are plumbing problems in the gateways, orchestration platforms, identity layers, and runtime environments where AI meets the enterprise. The audit is five rows. Every row maps to a June disclosure or market signal, a command a team can run before lunch, and a sentence a CISO can read to the board. The question is not whether your vendor will patch. It's whether you find the gap first — or whether an attacker finds it the way they found Copilot and LiteLLM.
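As one illustration of "a command a team can run before lunch", the gateway check from the audit (run pip show litellm; below 1.83.14-stable = vulnerable) could be scripted roughly as follows. This is a minimal sketch, not vendor guidance: the function names are hypothetical, the threshold is simply the version quoted in the audit, and the parser only reads the leading major.minor.patch numbers, so suffixes such as "-stable" or pre-release tags are ignored.

```python
import re
from importlib import metadata

# Assumption: threshold copied from the audit text (full chain fixed in v1.83.14-stable).
FIXED_VERSION = (1, 83, 14)


def parse_version(version: str) -> tuple:
    """Return (major, minor, patch) from the start of a version string.

    Suffixes such as '-stable' are ignored; pre-release tags are not handled.
    """
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version: str, fixed: tuple = FIXED_VERSION) -> bool:
    """True when the given version sorts below the fixed release."""
    return parse_version(version) < fixed


def installed_litellm_version():
    """Return the installed litellm version string, or None if it is not installed."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_litellm_version()
    if found is None:
        print("litellm is not installed in this environment")
    elif is_vulnerable(found):
        print(f"litellm {found} is below 1.83.14: upgrade and rotate provider keys")
    else:
        print(f"litellm {found} is at or above 1.83.14")
```

The script only covers the version comparison. The other steps in that row (checking /mcp-rest/test/ exposure, auditing proxy_admin accounts, rotating keys) still need to be done separately.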
The surprisingly simple ways AI can be tricked into breaking its own rules - The Washington Post
It’s surprisingly simple to trick chatbots into breaking their own rules and spilling forbidden knowledge. Even poems and bedtime stories can work.
Analyzing the Narration Gap in LLM-Solver Loops
arXiv:2606.19588v1 Announce Type: new Abstract: Formal tools such as SAT and SMT solvers are increasingly embedded in language model reasoning pipelines when a safety or security critical question can be formulated in logic. Unlike chain of thought whose steps are sampled from the model distribution without formal guarantee, a solver produces a sound and independently verifiable answer. However, the soundness guarantee can be lost in the interaction between the solver and the model. The hybrid pipeline has three components: formalizing the question, deciding it, and narrating the result. Prior work has studied the formalization and decision, but not narration, which is the step that turns a formal tool's output into the user answer. To fill the narration gap, we first model the LLM-solver loop as a verified decision procedure. We further evaluate five open-sourced models under prompt injection, and we find certificate gating makes the solver verdict sound, while an adversary can invert a verified conclusion across phrasings and channels. We study the mitigation through hardened prompt that reduces injection significantly but cannot eliminate it and still suffers under adaptive attack. Combining the formal analysis and empirical studies, we show in the LLM-solver loop, robustness does not reach to the answer that the user finally reads.
Radware Launches AI Xploit Shield for Real-Time Application and API Security
Radware introduced AI Xploit Shield, a service that provides tailored protections for applications and APIs using AI without requiring changes to underlying software.
Adoption, Deployment & Impact
AI4SE and SE4AI Exploration: A Decade Looking Back and Forward
arXiv:2606.19630v1 Announce Type: new Abstract: The March 2020 INCOSE INSIGHT special issue on AI and Systems Engineering (SE) became the most downloaded issue in the publication's history and launched a research community that now draws over 250 registrants to its annual workshop. In this article, we trace the progress in AI and SE across three phases (labeled here foundational, applied, and LLM inflection) based on the authors' reading of the field's core papers, and describe our opinions of where the community has converged and where critical gaps remain. Separately, a human-AI agreement literature review leveraging both human expertise and six AI models was performed to assess the relevance of 1,712 INCOSE INSIGHT articles and 889 SERC publications. The results identify five critical research gaps and offer guidance for practitioners navigating AI adoption, assurance, and workforce transformation in SE. We share the agreement data and the AI4SE/SE4AI Explorer web application so readers can compare their own relevance judgments with the human and AI raters.
Artificial intelligence: The gap between adoption and impact
Experts from Accenture discuss Generating Impact, the organisation's latest AI research report.
AI Adoption Report: Governance and regulation | The Drawdown
Risk was one of the major areas of concern related to AI adoption for CFOs
Publicis Sapient's 2026 enterprise AI report finds wide adoption but only 10% say it's core to operations | MarketScale
A survey of 1,550 AI decision-makers finds 73% of enterprises use AI regularly, yet just 10% call it core to how their business runs.
You Probably Don’t Need an Agent Framework
Most LLM applications need a clear workflow, not an autonomous agent. Here's how to build one in plain Python.
Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why
arXiv:2606.19602v1 Announce Type: new Abstract: Patient contexts span hundreds of heterogeneous documents and thousands of structured data points, yet the document-level metadata that AI systems need for retrieval and triage is absent or incomplete. Standard retrieval-augmented generation fails on this data, mishandling temporal reasoning, cross-document dependencies, and missing metadata. We dep
AI Economist Agent: An Agentic Framework for Model-Grounded Economic Analysis with RAG, Knowledge Graphs, and Large Language Models
arXiv:2606.20041v1 Announce Type: new Abstract: We propose a model-grounded RAG-based AI economist with an agentic framework for economic scenario analysis using large language models (LLMs) and knowledge graphs. While LLMs can generate fluent economic narratives, economists are often required to make economic claims grounded by economic theory and real-world data. Based on this motivation, this study proposes an RAG-based AI economist, which utilizes knowledge graphs including economic data and theory and LLM-based agents to plan the analysis, retrieve relevant evidence, select appropriate models, and generate reports. In our framework, we do not produce quantitative claims directly with the language model alone; instead, we generate narratives grounded in explicit model-based computations and linked to the retrieved evidence via AI agents. We refer to our framework as an AI economist agent. We evaluate the AI economist agent in two applications: economist report generation for U.S. inflation persistence and Federal Reserve policy, and bank stress-test narrative generation for U.S. commercial real estate refinancing stress. The results illustrate how grounding the generated reports improves their economic coherence and traceability.
Precisely | LinkedIn
Our CPO Matt Waxman is kicking off a three-part series on how he’s thinking about what it means to “run on AI” at Precisely, from reimagining product development to reshaping go-to-market and G&A functions with agentic, human-in-the-loop processes. Read part 1, where Matt introduces the Spiral, a framework for rethinking product development when AI removes the constraints your whole operating model was built around.
Dario Amodei has only 1 direct report, his chief of staff—and everyone else reports to his sister: ‘It’s incredibly freeing’
As $965 billion Anthropic prepares for an IPO, the AI firm’s CEO, Dario Amodei, admits he’s been managing only one person—and passing the rest to his sister.
‘We created a monster’: companies rein in AI usage as costs strain budgets
Amazon, Walmart and Uber are among early adopters that have introduced caps or discouraged wasteful activity
ISG Event to Explore How Enterprises Are Turning AI Investments Into Measurable Business Value
State Street, Pfizer, Siemens, Merck KGaA, Deutsche Bank, Fresenius and more will discuss ROI at the ISG AI Impact Summit, June 22–23 in Frankfurt.
BBVA Puts AI At the Core of Banking with OpenAI
BBVA’s case study shows how a major bank scaled ChatGPT and OpenAI tools from early adoption to broad enterprise deployment.
Todd Parsons - Chief Product Officer and President ...
In our latest guest blog post, BARC US CEO Shawn Rogers shares his perspective on critical factors for AI success at scale. "The bottom line, AI innovation is no longer limited by clever ideas or tooling. The bottlenecks are the money you burn and the controls you follow." Rogers advises companies to tackle both with the same urgency, or face surprise invoices and compliance fire drills.
Geopolitics, Policy & Governance
Trump is taking a page out of China’s sovereign AI playbook
Governments have long protected strategic industries — what is new is their willingness to become shareholders
The Île-de-France Region, Scaleway, VSORA and ZML commit to laying the foundations for the next generation of AI chips in Europe
This unprecedented commitment — a European first that brings together Île-de-France players across the entire value chain of Artificial Intelligence...
Why Nations Are Rethinking Dependence on Foreign New AI Models - Linkdood Technologies
The global artificial intelligence industry reached a turning point in June 2026. When Anthropic suspended access to its most advanced AI models following a
China white paper stresses AI cooperation, trade in global governance
China released a global-governance white paper emphasizing AI cooperation and multilateral rule-making, linking AI governance to trade, supply-chain stability, and technology access for developing nations.
Inside Palantir’s fight over the future of the NHS
Critics question how the tech giant won a showpiece contract. It complains about the politicisation of procurement
The week that changed AI: Inside Trump’s Anthropic crackdown, and how a phone call from Amazon CEO Andy Jassy triggered the chaos
The fight over Anthropic’s Mythos model is rewriting the rules of AI regulation, with consequences for the trillion-dollar startup, the AI industry, and global security.
Open Weight AI Models Require Proportional Evaluation Approaches
arXiv:2606.19890v1 Announce Type: new Abstract: Open-weight AI models (OWMs), or models released with publicly-available weights, are distributing rapidly and approaching the performance levels of leading closed-weight AI models (CWMs). While OWMs offer substantial scientific and economic benefits, their release introduces distinct risk factors for which existing evaluation practices, largely designed for CWM deployment, fail to account. In this paper, we argue that these risk factors demand distinct proportional evaluation (PE) approaches: evaluating without system-level safeguards (PE1), assessing robustness to modifications that undo model-level safeguards (PE2), testing selective capability amplification (PE3), and proxying worst-case misuse (PE4). We systematically review current evaluation practices of OWMs released in 2025 through April 2026, finding that only one of the 37 families of models reviewed fulfills PE1-4 and most do not fulfill any. This paper targets policymakers, funders, and researchers involved in AI evaluation. As OWMs grow increasingly capable, their evaluation warrants close attention from developers, funders, and governance bodies alike.
Early Users of Anthropic Mythos Still Have Access After US Order
Some firms chosen early on by Anthropic PBC to test the Mythos AI model ahead of a wider release have preserved their access to a preview of the system, despite a US government order that led to the total shutdown of other versions.
Measuring Biological Capabilities and Risks of AI Agents
arXiv:2606.19899v1 Announce Type: new Abstract: This paper addresses a rapidly emerging policy challenge: how to generate and interpret credible evidence about the biological capabilities and risks of AI scientists, or agentic AI systems capable of autonomously or collaboratively performing multi-step scientific tasks. As these systems enter real research workflows, decision-makers increasingly face evaluation results whose meaning depends on underlying design choices that are often implicit or under-documented. We synthesize current evidence on AI-enabled biological risks and introduce biological agentic evaluations as a promising, but interpretation-sensitive, tool for assessing these systems. Our central contribution is a set of practical, experience-grounded considerations -- drawing from our own evaluations -- that show how choices around defining, designing, running, scoring, and documenting evaluations materially shape what results do and do not imply about risk. The analysis is intended to help policymakers interpret biological evaluation outputs with appropriate caution; guide public and private funders toward high-leverage investments in AI-biology evaluation research; and support biosecurity practitioners assessing emerging AI systems. A secondary audience includes researchers designing or conducting agentic evaluations within frontier AI labs, AI providers, scientific institutions, and third-party evaluation organizations.
‘Make AI work for ordinary people’: Bernie Sanders wants to pay you $1,000 every year from a government stake in AI companies
The senator is introducing a bill that would give Americans 50% ownership of the country’s biggest AI companies if they become profitable.
Challenges to Grassroots Organization Engagement with AI Policy
arXiv:2606.19816v1 Announce Type: new Abstract: Public policies are being developed around the world to address privacy, economic, intellectual property, energy, and other risks that AI technologies pose. Involvement from the general public is essential to governance as an accountability and alignment mechanism. However, participating in and impacting policymaking can be challenging for sections of the public that lack extensive networks, lobbying capabilities, and other forms of power. This challenge is especially acute for marginalized communities. In this paper, we present a case study of our organization's efforts to bring participatory design (PD) principles to AI policymaking in the US. We describe our engagements with several US policy bodies, and our participatory development of AI policy for queer people. We highlight challenges with PD practice with marginalized communities, and offer suggestions to alleviate them. We conclude with actionable recommendations for policymakers and other organizers working in marginalized communities.
ENISA meets Anthropic amid US export controls on AI models
ENISA meets Anthropic in San Francisco after the US Department of Commerce forced the AI company to suspend access to its Fable 5 and Mythos 5 models for
Europe AI Sovereignty Crisis: G7 Offers Platform as Kill Switch Fears Grow
Europe's AI sovereignty crisis reached a flashpoint this week as the G7 summit in France failed to reverse U.S. export controls that knocked Anthropic’s Fable 5 and Mythos 5 offline globally, exposing how the deemed export doctrine now gives Washington an off switch over any frontier AI model
ETSI chief eyes larger AI role as EU grapples with standards gap
Europe's AI rules will only succeed if they're translated into practical standards that companies can use to build and test products, according to Jan Ellsberger, who heads a major European standards body.
Anthropic got hit by export rules nobody understands
Anthropic faces challenges navigating complex and ambiguous new export control regulations.
US risk designation against Anthropic a 'temper tantrum,' EFF, others say
The Electronic Frontier Foundation and other groups filed a brief opposing the US government's decision to label Anthropic a national security threat, arguing it harms the company and its partners.
Consumer groups warn against US Congress killing state AI enforcement
Over 130 civil society groups urged US congressional leaders to reject the 'Great American Artificial Intelligence Act,' which they claim would impose a federal ban on state-level AI regulation.
New Irish bill to supervise EU AI Act gets greenlit
The AI Act, which entered into force in August 2024, attempts to tackle some of the risks emerging from the technology while letting the bloc benefit from its economic potential.
G7 leaders urge financial regulator coordination to tackle AI risks
G7 leaders called for information sharing and coordination between financial regulators and tech companies to address risks posed by frontier AI models.
Ferguson says FTC poised for jump in US privacy enforcement in late 2026
FTC Chairman Andrew Ferguson expects a surge in data privacy enforcement cases in the second half of 2026. He also discussed potential expansion of agency capacity if the SECURE Data Act is passed.
California lawmakers proposing third-party assessments of AI systems
California state legislators are partnering to propose safety standards for independent, third-party assessments of AI systems and models.
Google Says Canada’s Data Law Changes Fail to Ease Concerns
Alphabet Inc.’s Google said Canada’s changes to a proposed law that would help police obtain citizen data from private companies don’t resolve many of its concerns.
SpaceX warns EU satellite plan risks undermining connectivity in Ukraine
Elon Musk’s group hits out at proposal by bloc to reserve part of spectrum band for European operators
NO FAKES Act clears US Senate Judiciary Committee on voice vote
The US Senate Judiciary Committee unanimously advanced the bipartisan NO FAKES Act, which targets unauthorized AI-generated replicas of people's voices and likenesses.