Fri 29 May 2026
Daily Brief — Curated and contextualised by Best Practice AI
Apollo Seeks $36B, Anthropic Tops OpenAI, and Fed Warns on Inflation
TL;DR Apollo and Blackstone are seeking partners for a $36 billion debt deal to fund Anthropic's AI infrastructure expansion. Anthropic has surpassed OpenAI with a $900 billion valuation. Meanwhile, St. Louis Fed President Alberto Musalem cautions against relying on AI to reduce inflation. TSMC highlights energy efficiency as a key factor in future chip design due to AI's electricity demands.
The stories that matter most
Selected and contextualised by the Best Practice AI team
Apollo Seeks Partners for $36B Debt Deal to Buy AI Chips for Anthropic
Apollo and Blackstone are working to bring additional investors into a roughly $36 billion debt financing deal to help Anthropic build out its AI infrastructure. The debt will be used to purchase Google’s custom chips known as tensor processing units, or TPUs, which Anthropic will then lease, according to people with knowledge of the matter. Bloomberg's Neil Campling reports. (Source: Bloomberg)
From Augmentation to Reconstruction: Guiding the AI Disruption to the Good Place
arXiv:2605.29207v1 Announce Type: new Abstract: Artificial intelligence feels omnipresent, yet the disruption many expect has not fully arrived. The main reason is not model capability, nor even the tools built to harness those models. Rather, most organizations are still using AI to accelerate workflows designed for a pre-AI world. We offer a three-stage lens: Augmentation, Automation, and Reconstruction, and argue that the most consequential disruption resides in the third stage where workflows and markets are rebuilt around delegation, machine-to-machine interaction, continuous monitoring, and auditable constraints. Achieving this system-level transformation takes time: it requires trust and accountability infrastructure, machine-legible and interoperable data and interfaces, the design and adoption of these new workflows, and economic incentives that favor reconstruction rather than local optimization: the complementary investments that produce the familiar "productivity J-curve" of general-purpose technologies. We illustrate this transition through examples in consumer markets, education, news, and coding. Finally, we emphasize a normative point: the agentic future is not predetermined. Leaders must both skate to where the puck is going and actively steer it toward a good place, ensuring innovation delivers welfare gains felt by businesses and consumers around the world.
Energy use forcing rethink of AI chip design, TSMC says | Reuters
A senior TSMC executive said on Thursday that surging electricity demands from AI are making energy efficiency rather than computing power the main constraint shaping future computer chip development.
Adopt ≠ Adapt: Longitudinal Analyses of LLM Conversations in the Wild
arXiv:2605.29018v1 Announce Type: new Abstract: Although a growing body of research has begun to describe user-LLM interactions, the picture it paints is largely static; little is known about how individual users change their behavior over time. To address this gap, we analyze the conversational trajectories of ~12,000 randomly sampled Microsoft Bing Copilot users and compare these with data from WildChat-4.8M. While the Copilot data contains significant population-level trends, we find that trends in individual user trajectories are much weaker; user habits prove to be overwhelmingly sticky. We also find stark differences between users of different activity levels: more active users have more successful conversations and use the LLM for more complex and professionally oriented tasks. Some user trends also appear in WildChat-4.8M, but we find evidence that this dataset is significantly skewed towards highly proficient "power" users. Ultimately, our results suggest that existing user behavior is difficult to change and demonstrate the extent of user heterogeneity. Our comparison between datasets highlights that WildChat does not represent typical user-AI interactions, an important caveat for downstream uses of the data.
The New Pro Se: Generative AI and the Surge in Federal Civil Self-Representation
arXiv:2605.29493v1 Announce Type: new Abstract: Since public access to generative AI tools became widespread, federal civil litigation has seen a marked increase in pro se (self-represented) plaintiffs. This paper analyzes that shift using ~2.8 million filings, asking whether the post-GenAI period is associated not only with more pro se filings, but also with detectable changes in complaint text, litigation outcomes, and the composition of pro se litigants. Using civil filing data from FY2008-2025, we find that the federal civil pro se plaintiff rate rose from 11.33% pre-GenAI to 16.94% post-GenAI, a 5.61 percentage-point increase that persists after trend and covariate-adjusted robustness checks. We then focus on Civil Rights and Other Statutory cases, where the increase is especially pronounced, and link case metadata to pro se complaints. Drawing on stylometric AI detection indicators, we develop an interpretable measure of AI-consistent drafting. Against a threshold calibrated to the pre-GenAI baseline, the net AI-flagged share is 13.9% of post-GenAI non-form complaints. Analysis of the AI-flagged complaints shows that they are more citation-dense, disproportionately associated with first-time rather than repeat filers, and geographically unevenly distributed. This composition pattern suggests that AI-consistent drafting is not merely a repeat-filer phenomenon; it also includes a modest, suggestive increase in name-inferred female plaintiffs. We find no evidence of improved win rates; in fact, AI-flagged complaints are more likely to be dismissed and to terminate at earlier procedural phases. These findings raise new questions about access to justice and court screening burdens, and sharpen the distinction between legal formality and legal efficacy.
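The headline figures in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two rates quoted above; the relative-change figure is derived here for illustration and is not stated in the abstract.

```python
# Rates quoted in the abstract: share of federal civil filings with a pro se plaintiff.
PRE_GENAI_RATE = 11.33   # percent, pre-GenAI period
POST_GENAI_RATE = 16.94  # percent, post-GenAI period


def percentage_point_change(before: float, after: float) -> float:
    """Absolute change between two percentages, in percentage points."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change relative to the starting rate, expressed as a percentage."""
    return (after - before) / before * 100.0


pp = percentage_point_change(PRE_GENAI_RATE, POST_GENAI_RATE)
rel = relative_change(PRE_GENAI_RATE, POST_GENAI_RATE)

print(f"{pp:.2f} percentage points")    # 5.61 percentage points
print(f"{rel:.1f}% relative increase")  # 49.5% relative increase
```

The distinction matters when reading the result: a 5.61 percentage-point rise is roughly a 50% increase relative to the pre-GenAI rate.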
Does Distributed Training Undermine Compute Governance?
arXiv:2605.29359v1 Announce Type: new Abstract: Compute governance proposals often rely on the assumption that frontier AI training requires large, detectable computing clusters. However, recent advances in distributed training algorithms could allow developers to conduct frontier-scale training on distributed agglomerations of hardware, rather than needing large datacenter facilities. Developers who prefer not to be constrained by regulations may structure their hardware in a manner that evades the registration and monitoring requirements associated with compute governance. Therefore, regulations must be designed to detect and prevent illicit distributed training operations. This paper evaluates the feasibility of such evasion and outlines recommended countermeasures, including whistleblowing, chip tracking, forensic accounting, and memory and compute thresholds for clusters.
As AI slashes white-collar jobs, Salesforce CEO Marc Benioff says almost no one is being hired—except in sales
Salesforce CEO Marc Benioff revealed that the $145 billion firm is keeping its engineering team slim thanks to AI—but has good news for sales workers.
Governing Technical Debt in Agentic AI Systems
arXiv:2605.29129v1 Announce Type: cross Abstract: Agentic AI systems are increasingly being explored as production infrastructure: they reason over multiple steps, call tools, act through workflows, and adapt through memory and feedback. These systems create governance challenges that are not fully captured by traditional software or predictive ML technical debt. We define Agentic Technical Debt as the accumulated liability created when prompts, memory, tool schemas, orchestration graphs, control policies, and observability routines are patched together faster than they can be validated, standardized, and governed. We define Stochastic Tax as the recurring operating burden of keeping probabilistic agent behavior within acceptable bounds. The distinction matters: debt is a stock of design and governance liability, while the tax is a flow of operating cost that arises because stochastic agents act through tools and workflows. We outline how managers can make both visible through lightweight dashboards and governance controls.
Economics & Markets
Jupiter Fund Taps Europe AI Energy Boom to Beat 92% of Peers
Homing in on stocks at the heart of Europe’s electrification push has helped a Jupiter Asset Management fund team outperform 92% of its peers this year.
Lenovo Doubles in Best Month Since 1999 on AI-Fueled Rally
Lenovo Group Ltd. is set for its best month in more than a quarter-century, with the stock doubling in May as investor enthusiasm builds around the company’s AI-driven growth outlook.
AI investments made up a third of Singapore venture funding in 2025
The rise in AI bets comes amid overall declining deal value in Singapore.
China's AI investment boom is supercharging exports and lifting the yuan
Countries across Southeast Asia, ... the geopolitical strings attached to US exports. For equity investors, the concentration of export growth in AI-related goods points to specific opportunities in China’s semiconductor supply chain, from chip designers to packaging and testing firms. The risk that keeps showing up in analyst notes is policy discontinuity. Any expansion of US export controls, or retaliatory ...
r/EconomyCharts on Reddit: Dell stock surges nearly +30% after reporting stronger than expected earnings due to AI
I can name like 20 AI/semiconductor stocks off the top of my head that have gone up around 50% or more this year each and I’m not super into the investing world. AI money still seems to be flowing everywhere that it’s needed. Semiconductors, energy, AI infrastructure, etc.
AI-native hedge funds influence AI infrastructure stocks | Let's Data Science
HedgeCo.net reports that Nebius Group, the AI infrastructure company trading under the ticker NBIS, surged roughly 10% after a regulatory disclosure showed that Situational Awareness LP, an investment firm led by former OpenAI researcher Leopold Aschenbrenner, had taken ...
This AI Stock Is Priced Like a Value Play, But Growing Like a Growth Stock | The Motley Fool
This AI stock is trading at 4 times forward earnings.
Musk’s tweet undermines SpaceX’s claims about Anthropic data centre deal
Billionaire says the arrangement, described in IPO filings as a three-year agreement, only lasts for 180 days
CNN files US copyright claims against Perplexity AI
CNN filed a US copyright case accusing Perplexity AI of illegally copying its content to train its large language models and generating outputs that are identical or substantially similar to CNN's content.
OpenAI co-founder and former Tesla AI executive Karpathy joins Anthropic | Reuters
May 19 (Reuters) - Andrej Karpathy, a former Tesla (TSLA.O) AI executive and one of OpenAI's founding members, has joined Anthropic, he said on Tuesday, strengthening the Claude maker as it looks to dominate the AI race.
From Open Source Software to Open Source Strategy
The article explores open source as a strategic tool for commoditizing infrastructure and countering platform control in the AI ecosystem.
Singapore’s Sea Sets Up AI Investment Team as Part of Pivot
Sea Ltd. has set up a dedicated team to scout for new investments in AI, part of a broader effort to accelerate forays into the technology as it hunts for its next growth engine beyond e-commerce.
Cursor AI Statistics 2026: Users, Revenue and Adoption
Cursor AI Statistics 2026: Users, revenue, ARR, funding, and enterprise adoption metrics behind the rapid growth of the AI code editor.
Pivot or Perish: India’s software startups get AI reality check from investors - Moneycontrol.com
Investors say AI is dismantling old SaaS advantages at unprecedented speed, forcing startups to rethink products, pricing, growth and survival strategies almost in real time.
Barcelona’s Mafer AI raises €2 million to build an AI operating system for R&D teams in formulation industries
Mafer AI, a Barcelona-based startup building an AI operating system for R&D teams in formulation industries, has closed a €2 million pre-Seed funding round backed by Kfund, 4Founders Capital, Masia and Lavanda Ventures, the startup investment arm of the Puig family. It has also secured backing from leading business angels, including Adrián Mato (Andreessen Horowitz […]
Labor, Society & Culture
Attention Asymmetry in AI Layoff Discourse on X: A Computational Analysis of Capital vs Labour Amplification
arXiv:2605.29367v1 Announce Type: cross Abstract: When workers lose jobs to AI-driven restructuring, two very different conversations happen on X (formerly Twitter) at the same time. Tech executives and AI researchers talk about productivity, transformation, and opportunity. Laid-off workers and labour critics talk about job loss, uncertainty, and fear. This paper asks a simple question: which conversation gets more reach? We report three studies using two collection methods and 763 tweets from 20 named public accounts. Study 1 used keyword-based collection (n=392) and found no significant difference between corpora (p=0.891), revealing that keyword search is too noisy for this task. Study 2 used account-based collection (n=96) and found a 3.12x mean amplification advantage for capital discourse over labour discourse (p=0.000003, Cohen's d=0.555). Study 3 combined both methods (n=763) and confirmed the finding at 4.18x mean and 10.77x median amplification ratio (p<0.000001). Critically, after normalising for follower count, the asymmetry persists at 2.69x (p=0.000009, Cohen's d=0.491), demonstrating that the effect is not simply a consequence of capital accounts having larger audiences. The finding is robust across all tested amplification metric weightings. We introduce the Amplification Ratio and Amplification Normalisation Index as simple metrics for measuring platform-level discourse inequality. A cross-platform replication on Reddit (n=647 posts) did not replicate the finding, suggesting the asymmetry may be specific to X's account-based amplification architecture. We discuss the methodological implications for cross-platform discourse analysis.
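The abstract names an Amplification Ratio and a follower-normalised variant but does not define them. As a rough sketch only, assuming the ratio compares aggregate engagement between the two corpora and that normalisation divides each tweet's engagement by the account's follower count, the calculation might look like this. The function names `amplification_ratio` and `per_follower`, and all numbers, are illustrative assumptions rather than the paper's definitions or data.

```python
from statistics import mean, median
from typing import Callable, Sequence


def amplification_ratio(
    capital: Sequence[float],
    labour: Sequence[float],
    agg: Callable[[Sequence[float]], float] = mean,
) -> float:
    """Assumed definition: aggregate capital engagement over aggregate labour engagement."""
    return agg(capital) / agg(labour)


def per_follower(engagements: Sequence[float], followers: Sequence[int]) -> list:
    """Assumed normalisation: engagement divided by the posting account's followers."""
    return [e / f for e, f in zip(engagements, followers)]


# Toy engagement counts and follower counts, invented purely for illustration.
capital_engagement = [300, 500, 1000]
capital_followers = [1000, 1000, 2000]
labour_engagement = [100, 200, 300]
labour_followers = [1000, 1000, 1000]

raw_mean = amplification_ratio(capital_engagement, labour_engagement)
raw_median = amplification_ratio(capital_engagement, labour_engagement, agg=median)
normalised = amplification_ratio(
    per_follower(capital_engagement, capital_followers),
    per_follower(labour_engagement, labour_followers),
)
print(raw_mean, raw_median, round(normalised, 2))
```

Reporting both mean and median, as the abstract does, guards against a few viral posts dominating the comparison; the follower-normalised figure addresses the objection that larger accounts simply reach more people.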
Adding AI 'employees' is backfiring by creating new office scapegoats and making human workers sloppier and lazier | Fortune
Research from Boston Consulting Group found that human staff become less accountable, blaming their new bot colleagues for their mistakes.
Who decides which jobs AI will take?
Different models are producing very different assessments of exposure levels
Costco CEO Ron Vachris says tech is ‘elevating’ workers, not replacing them, as IBM and Delta bosses make the same bet on humans
As companies like Meta and Amazon use AI to justify headcount reductions, Costco is doubling down on $1.50 hot dogs and humans at the cash register.
This overlooked factor will decide if AI creates or destroys your job - Futura-Sciences
University of Chicago economist ... actual employment outcomes, because they skip the exact variable that translates productivity gains into labor market realities. When AI began producing articles, images, and marketing copy at near-zero marginal cost, the price of digital content collapsed. Consumer demand turned out to be elastic in some segments and rigid in others. Some organizations flooded the market with cheap volume; others collapsed. The specific demand shifts determined ...
Labour markets may risk a milder shock than AI fantasies suggest, but that’s only partial relief | Mint
The business drive for automation is turning out to be bumpy as the actual costs of adopting artificial intelligence (AI) come into view and the buzz around an AI bubble grows louder. But that’s unlikely to spell much of a reprieve for human workers.
Is AI to blame for hiring woes faced by college graduates? - ABC11 Raleigh-Durham
Analysts disagree about whether AI is a factor in the hiring crunch.
India accounting leaders say AI won't kill offshore talent - Outsource Accelerator
Not a single India-based leader at major United States accounting firms believes AI will eliminate the offshore accounting model.
Mobile Foreigners: Mortgage Lock-In and H-1B Demand
arXiv:2605.28904v1 Announce Type: new Abstract: The 2022 rise in U.S. mortgage rates increased relocation costs for homeowners with low-rate mortgages. This cost varies across destinations because each draws workers from a different mix of labor markets. We build an in-migration mortgage-payment wedge from HMDA loans and pre-shock IRS migration networks. From 2017 to 2024, higher wedges reduce college-educated homeowner in-migration, leave renters unaffected, and raise H-1B sponsorship requests. The implied offset is 14 H-1B sponsorship requests per 100 deterred college-educated domestic in-migrants. We show that mortgage lock-in operates as a destination-side labor-market shock that shifts part of firms' adjustment toward employer-sponsored immigration.
The Pope disrupts Silicon Valley
Unlike the US president, the pontiff is choosing to grapple with the serious challenges of AI
Political Neutrality as Balanced Approval: A Large-Scale Human Evaluation of AI Responses
arXiv:2605.28911v1 Announce Type: new Abstract: As AI systems increasingly shape political views, defining and evaluating AI political neutrality is an urgent problem. Here, we propose a new definition of AI political neutrality and design a large-scale user study to test it, releasing a new dataset PARETO with 7,434 participants and 208,152 evaluations of AI responses. Our definition follows a simple principle grounded in political theory: when asked about a controversial issue, an AI model should generate responses that maximize approval across groups with opposing viewpoints, while balancing approval between groups. This definition allows empirical testing of whether an AI response is "neutral" and generalizes to any political context without pre-supposing a single left-right axis of division. We construct a benchmark of controversial U.S. issues, with prompts sourced from politically charged questions on Reddit and responses from frontier AI models, and recruit human participants to rate AI responses. Across all 20 issues, we find that it is possible for AI responses to achieve high rates of approval on both sides, even as those sides disagree strongly with each other on the substance of the issues. We also find that default responses lean liberal for GPT, Gemini, Claude, and Llama, but not Grok, and that user prompts with political charges are harder to respond to than neutral prompts. This work introduces a rigorous definition and benchmark of AI political neutrality, and a dataset to measure progress toward it.
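The definition in this abstract (maximise approval across opposing groups while balancing approval between them) can be illustrated with a toy scoring rule. The sketch below is one possible reading, not the paper's formalisation: the score, the `balance_weight` parameter, the group names, and the approval rates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    """A candidate response with an approval rate (0 to 1) from each opposing group."""
    text: str
    approval_by_group: Dict[str, float]


def balanced_approval_score(candidate: Candidate, balance_weight: float = 1.0) -> float:
    """Hypothetical score: mean approval minus a penalty for the gap between groups."""
    rates = list(candidate.approval_by_group.values())
    if not rates:
        raise ValueError("candidate has no approval ratings")
    mean_approval = sum(rates) / len(rates)
    gap = max(rates) - min(rates)
    return mean_approval - balance_weight * gap


def most_neutral(candidates: List[Candidate], balance_weight: float = 1.0) -> Candidate:
    """Pick the candidate with the highest balanced-approval score."""
    return max(candidates, key=lambda c: balanced_approval_score(c, balance_weight))


# Invented approval rates for two hypothetical responses.
candidates = [
    Candidate("one-sided answer", {"group_a": 0.90, "group_b": 0.20}),
    Candidate("balanced answer", {"group_a": 0.70, "group_b": 0.65}),
]
best = most_neutral(candidates)
print(best.text)  # balanced answer
```

Because the score depends only on per-group approval, it does not presuppose a left-right axis: any set of opposing groups can be plugged in, which matches the generality the abstract claims for its definition.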
Why Tesla’s AI trainers don’t trust its self-driving tech – or its safety stats | Reuters
Those efforts, which haven’t been previously reported, undermine Musk’s long-stated claim that Tesla’s self-driving technology will soon work anywhere globally and doesn’t require the same laborious local mapping of roads and hazards employed by rivals. Musk has said Tesla takes a simpler approach, relying solely on cameras and AI, that will allow it to scale up its robotaxi service at “hyperexponential” speed and offer current Tesla owners full autonomy through software updates.
Who Does Your AI Work For? Designing Conversational Agents as Digital Fiduciaries
arXiv:2605.28908v1 Announce Type: cross Abstract: Conversational agents are increasingly integrated into the most private and intimate aspects of users' lives, from discussions of mental health to financial decisions. As a result, these systems have access to reams of sensitive user data. Much of the literature on AI systems has focused on aligning users' goals with the agents that act on their behalf. While this work is vitally important, it may overlook the need to establish a new normative baseline. Conversational AI agents, designed to feel and interact anthropomorphically with human users, must be held to a standard of care commensurate with their capabilities and access. When a client hires a personal lawyer, undergoes surgery, or receives advice from an investment manager, the expert they consult often has a fiduciary duty to act in their client's best interests. This provocation argues that conversational agents should be held to a similar standard and introduces fiduciary design as a guiding principle. In this respect, conversational AI trust and accountability could be unified into a single design and legal paradigm.
What the fight over London data centre plans tells us about the AI backlash - ABC News
Last year in the US alone, projects collectively worth $200 billion were scuppered or delayed as communities protested the construction of new data centres promised to deliver the AI of the future.
Review Arcade: On the Human Alignment and Gameability of LLM Reviews
arXiv:2605.28897v1 Announce Type: new Abstract: LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume not only that reviewers are using LLM assistance, but also that authors use LLMs to revise their papers before submitting. In this work, we perform empirical experiments on papers from the 2025 ACL Rolling Review (ARR) to evaluate LLM reviews from both the author and the reviewer perspective. First, we identify a limited alignment of LLM reviews with human ones. In the best-case scenario, the alignment is reasonable. However, we also find that LLM-human alignment varies substantially across prompts and models. Finally, we investigate the scenario in which the author uses an iterative draft-revise workflow to improve the submission according to the LLM review. We find that this "gaming" of LLM reviews can be effective in specific scenarios, leading to a statistically significant increase of overall scores for up to 35% of papers. We publish our code: https://github.com/uhh-hcds/reviewarcade.
The EEOC chair knows gutting diversity reporting will blind the agency to discrimination. She’s doing it anyway.
In April, Andrea Lucas told Harvard students that demographic data collection is sometimes necessary. A month later, her agency proposed to stop the reports.
Technology & Infrastructure
Building Self-Improving Tax Agents with Codex
OpenAI and Thrive Holdings developed a tax-preparation automation tool using Codex, practitioner feedback, and evaluation loops to create a self-improving agent.
VFEAgent: A Multimodal Agent Framework for End-to-End Automated Finite Element Analysis
arXiv:2605.28978v1 Announce Type: new Abstract: Finite Element Analysis (FEA) serves as the cornerstone of modern engineering design. However, its workflow is inherently complex and relies heavily on domain expertise. Although recent efforts have integrated Large Language Models (LLMs) into FEA, existing approaches face limitations in handling multimodal inputs and executing complex tasks. To address these limitations, we propose VFEAgent, an end-to-end multi-agent system designed to automate FEA modeling and simulation directly from input images and problem descriptions. Our methodology integrates two core components: (1) a multimodal vision-language multi-agent pipeline that employs ReAct-driven reasoning to extract structured FEA specifications from heterogeneous inputs and (2) a verification-first code synthesis framework, incorporating robust self-debugging and fallback mechanisms to ensure executability and physical validity. We systematically evaluated the system across various engineering mechanics scenarios. The results demonstrate that VFEAgent achieves a high success rate in generating complete and physically valid simulations, outperforming LLM-based baseline methods in reliability and correctness. These findings validate the feasibility of automating the complete FEA workflow, highlighting the framework's potential to liberate engineers from tedious manual analysis.
London-based Geordie AI secures €25 million to help enterprises govern AI agents
Geordie AI, a British security and governance platform for AI agents, today announced it has closed a €25 million ($30 million) Series A round to enhance the product’s capabilities for security and AI teams as they grapple with the growing adoption of AI agents and the risks those agents pose. The round was led by Balderton Capital and included […]
AI agents get their own phone directory built atop DNS
DNS-AID, under the auspices of the Linux Foundation, promises easier agent discovery
Wait! There's a Way Out: A Decision Mechanism for Forecasting Conversational Derailment
arXiv:2605.29243v1 Announce Type: cross Abstract: Forecasting conversational derailment is the task of predicting, as the conversation unfolds, whether it will eventually derail into personal attacks. Since forecasting models operate in an online fashion, they must decide whether to "trigger" an alert after each utterance (for example, to notify participants or a moderator that the conversation is at risk of derailing). Existing approaches make this decision solely based on the estimated likelihood of derailment given the preceding utterances, implicitly assuming that the conversation's future trajectory is fixed. As a result, they ignore the possibility of future recovery and incur an unnecessarily high rate of false positives. In this work we propose a method for decoupling the decision to trigger from derailment likelihood estimation. Our approach is inspired by the first human baseline on this task, which shows that humans achieve dramatically lower false positive rates by selectively deferring their decision to trigger when they anticipate that tension is likely to subside. We operationalize this insight with a deferral mechanism that uses forward-looking simulations to assess whether a tense moment admits plausible paths to recovery. Incorporating this mechanism into a state-of-the-art forecasting model substantially reduces false positives without sacrificing forecasting accuracy. More broadly, this work highlights the value of treating decision-making as a first-class component of forecasting systems.
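The idea of separating the trigger decision from the risk estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the `risk_model` and `simulate` interfaces, both thresholds, and the toy stand-ins are hypothetical, and a real system would presumably use a trained forecaster and a stochastic conversation simulator.

```python
import random
from typing import Callable, Sequence

# Hypothetical interfaces; the abstract does not specify them.
RiskModel = Callable[[Sequence[str]], float]  # estimated P(derail | conversation so far)
Simulator = Callable[[Sequence[str], random.Random], Sequence[str]]  # one continuation


def should_trigger(
    history: Sequence[str],
    risk_model: RiskModel,
    simulate: Simulator,
    risk_threshold: float = 0.5,
    recovery_threshold: float = 0.5,
    n_simulations: int = 20,
    seed: int = 0,
) -> bool:
    """Alert only if risk is high now AND few simulated futures recover."""
    if risk_model(history) < risk_threshold:
        return False  # not tense enough to consider an alert
    rng = random.Random(seed)
    recovered = 0
    for _ in range(n_simulations):
        future = list(history) + list(simulate(history, rng))
        if risk_model(future) < risk_threshold:
            recovered += 1
    # Defer (return False) when recovery looks plausible in enough simulations.
    return recovered / n_simulations < recovery_threshold


def toy_risk(history: Sequence[str]) -> float:
    """Toy stand-in: risk is high when the latest utterance contains an insult."""
    return 0.9 if history and "idiot" in history[-1] else 0.1


def apologetic_future(history: Sequence[str], rng: random.Random) -> Sequence[str]:
    return ["Sorry, that was out of line."]


def hostile_future(history: Sequence[str], rng: random.Random) -> Sequence[str]:
    return ["You are still an idiot."]


tense = ["I disagree.", "Only an idiot would say that."]
print(should_trigger(tense, toy_risk, apologetic_future))  # False
print(should_trigger(tense, toy_risk, hostile_future))     # True
```

A likelihood-only forecaster would alert on both calls, since the current risk is identical; the deferral step is what suppresses the alert when the simulated continuations cool down.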
Pure DC launches carbon removal platform with subsidiary A Healthier Earth
Pure Data Centers’ climate tech subsidiary, A Healthier Earth (AHE), has launched an integrated carbon removal platform, which it claims is the first in the data center market. According to the company, the platform will specifically target a scalable, financeable supply of high-integrity biochar and carbon credits for […]
The AI boom’s rising heat | Reuters
And finally, computing power, which drives the AI boom, is being touted as the new oil of the 21st century and more and more countries are looking at ways of embracing it as a tradable asset. Reuters is reporting that China is designing a futures market for AI tokens, which are used for pricing AI services.
Taiyo Yuden Sees ‘Scary’ AI Demand Straining Supply Chain
Taiyo Yuden Co. is fielding “scary” levels of demand for its high-end AI server components, stretching capacity and increasing the risk of supply chain hiccups.
Pure Data Centres secures $2.7 billion financing
Pure Data Centres Group announced it has secured $2.7 billion in financing, including a $2.15bn facility secured against Pure DC’s Dublin and Amsterdam campuses, alongside an increase in its corporate-level financing to $550 million. Lenders included SMBC, ABN AMRO, and Allianz. The Amsterdam facility is fully […]
How AI is reshaping land, power and democracy - Geographical
In Chile, residents of Santiago ... a severe regional drought. Courts initially revoked the licence. An investigation later revealed the Chilean government had quietly allowed data centre developers to bypass environmental impact assessments through an administrative decision never made public. Across the world, national governments are designating data centre expansion as a strategic ...
Researchers automated LLM reasoning strategy design and cut token usage by 69.5%
Test-time scaling (TTS) has emerged as a proven method to improve the performance of large language models in real-world applications by giving them extra compute cycles at inference time. However, TTS strategies have historically been handcrafted, relying heavily on human intuition to dictate the rules of the model’s reasoning. To address this bottleneck, researchers from Meta, Google, and several universities have introduced AutoTTS, a framework that automatically discovers optimal TTS strategies. This automated approach allows enterprise organizations to dynamically optimize compute allocation without manually tuning heuristics. By implementing the optimal strategies discovered by AutoTTS, organizations can directly reduce the token usage and operational costs of deploying advanced reasoning models in production environments. In experimental trials, AutoTTS managed inference budgets efficiently, reducing token consumption by up to 69.5% without sacrificing accuracy.
The manual bottleneck in test-time scaling
Test-time scaling enhances LLMs by granting them extra compute when generating answers. This extra compute allows the model to generate multiple reasoning paths or evaluate its intermediate steps before arriving at a final response. The primary challenge in designing TTS strategies is determining how to allocate this extra computation optimally. Historically, researchers have designed these strategies manually, relying on guesswork to build rigid heuristics. Engineers must hypothesize the rules and thresholds for when a model should branch out into new reasoning paths, probe deeper into an existing path, prune an unpromising branch, or stop reasoning altogether. Because this manual tuning process is constrained by human intuition, a vast number of possible approaches remain unexplored. This often results in suboptimal trade-offs between model accuracy and computing costs.
Current TTS algorithms can be mapped to a width-depth control space, where "width" is the number of reasoning branches explored and "depth" is how far each branch develops. Self-consistency (SC) samples a fixed number of trajectories and majority-votes the answer. Adaptive-consistency (ASC) saves compute by stopping early once a confidence threshold is hit. Parallel-probe takes a more granular approach, pruning unpromising branches while deepening the rest. All three are hand-crafted, and that is the constraint AutoTTS is designed to break. Some more advanced methods employ richer structures like tree search or external verifiers, but these too are meticulously hand-crafted. This manual approach restricts the scope of strategy discovery, leaving a massive portion of the potential resource-allocation space untouched.
Automating strategy discovery with AutoTTS
AutoTTS reframes the way test-time scaling is optimized. Instead of treating strategy design as a human task, AutoTTS approaches it as an algorithmic search problem within a controlled environment. This framework redefines the roles of both the human engineer and the AI model. Rather than hand-crafting specific rules for when an LLM should branch, prune, or stop reasoning, the engineer's role shifts to constructing the discovery environment. The human defines the boundaries, including the control space of states and actions, optimization objectives balancing accuracy versus cost, and the specific feedback mechanisms.
An explorer LLM, such as Claude Code, designs the strategy. This explorer acts as an autonomous agent that iteratively proposes TTS “controllers.” These controllers are code-defined policies or algorithms that dictate how an AI model allocates its computational budget during inference. The explorer tests and refines these controllers based on feedback until it discovers an optimal resource-allocation policy.
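The two simplest baselines named earlier in this article, self-consistency and adaptive-consistency, can be illustrated with a minimal Python sketch. This is not code from the AutoTTS paper: the function names, the `sample` callback, and the vote-share stopping rule are assumptions made for illustration, and published ASC implementations may use a different stopping criterion.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def self_consistency(sample: Callable[[], str], n: int) -> Tuple[str, int]:
    """Fixed width: draw exactly n answers, return (majority answer, samples used)."""
    votes = Counter(sample() for _ in range(n))
    return votes.most_common(1)[0][0], n


def adaptive_consistency(
    sample: Callable[[], str],
    max_n: int,
    threshold: float = 0.8,  # assumed stopping rule: leading answer's vote share
    min_n: int = 4,          # assumed minimum number of samples before stopping
) -> Tuple[str, int]:
    """Adaptive width: stop as soon as the leading answer's share reaches threshold."""
    votes: Counter = Counter()
    for used in range(1, max_n + 1):
        votes[sample()] += 1
        answer, count = votes.most_common(1)[0]
        if used >= min_n and count / used >= threshold:
            return answer, used
    return votes.most_common(1)[0][0], max_n


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-in for a reasoning model that is right 85% of the time.
    noisy_model = lambda: "42" if rng.random() < 0.85 else str(rng.randint(0, 9))
    print(self_consistency(noisy_model, 64))
    print(adaptive_consistency(noisy_model, 64))
```

Both functions return the number of samples they consumed, which makes the cost difference between a fixed budget and an early-stopping budget directly comparable.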
To make this automated search computationally affordable, AutoTTS relies on an “offline replay environment.” If the explorer LLM had to invoke a base reasoning model to generate new tokens every time it tested a new strategy, the compute costs would be astronomical. Instead, it relies on thousands of reasoning trajectories pre-collected from the base LLM. These trajectories include "probe signals," which are intermediate answers that help the controller evaluate progress across different reasoning branches.
During the discovery loop, the explorer agent proposes a controller and evaluates it against this offline data. The agent observes the execution traces of the proposed controller, which show how it allocated compute over time. By analyzing these traces, the agent can diagnose specific failure modes, such as noting if a controller pruned branches too aggressively in a specific scenario. This provides an advantage over just viewing a final result. The agent then iteratively rewrites its code to improve the accuracy-cost tradeoff.
Inside the AI-designed controller
Because the explorer agent is not constrained by human intuition, it can discover highly coordinated, complex rules that a human engineer would likely never hand-code. One optimal controller discovered by AutoTTS, named the Confidence Momentum Controller, leverages several non-obvious mechanisms to manage compute:
Trend-based stopping: Hand-crafted strategies often instruct the model to stop reasoning once it hits a certain instantaneous confidence threshold. The AutoTTS agent discovered that instantaneous confidence can be misleading due to temporary spikes. Instead, the controller tracks an exponential moving average (EMA) of confidence and only stops if the overall confidence level is high and the trend is not actively declining.
Coupled width-depth control: Manually designed algorithms usually treat the "widening" of new reasoning paths and the "deepening" of current paths as separate decisions. AutoTTS discovered a closed feedback loop where the two actions are linked. If the confidence of the current branches stalls or regresses, the controller automatically triggers the spawning of new branches.
Alignment-aware depth allocation: Instead of giving all active reasoning branches an equal computation budget, the controller dynamically identifies which branches agree with the current leading answer. It then gives those branches priority "bursts" of extra computation. This concentrates the computational budget on the emerging consensus to quickly verify whether it is correct.
Cost savings and accuracy gains in real-world benchmarks
To test whether an AI could autonomously discover a better test-time scaling strategy, researchers set up a rigorous evaluation framework. The core experiments were conducted on Qwen3 models ranging from 0.6B to 8B parameters. The researchers also tested the system's ability to generalize on a distilled 8B version of the DeepSeek-R1 model. The explorer AI agent was initially tasked with discovering an optimal strategy using the AIME24 mathematical reasoning benchmark. This discovered strategy was then tested on two held-out math benchmarks, AIME25 and HMMT25, as well as the graduate-level general reasoning benchmark GPQA-Diamond.
The AutoTTS-discovered controller was pitted against four manually designed test-time scaling algorithms used in the industry. These baselines included Self-Consistency with 64 parallel reasoning paths (SC@64), Adaptive-Consistency (ASC), Parallel-Probe, and Early-Stopping Self-Consistency (ESC). ESC is a hybrid approach that generates trajectories in parallel and stops early when an answer seems stable.
When set to a balanced, cost-conscious mode, the AutoTTS-discovered controller reduced total token consumption by approximately 69.5% compared to SC@64. At the same time, the controller maintained the same average accuracy across the four Qwen models.
When the inference budget was turned up, AutoTTS pushed peak accuracy beyond all handcrafted baselines in five out of eight test cases. This efficiency translated to other tasks. On the GPQA-Diamond benchmark, the balanced AutoTTS variant slashed the inference token cost from 510K tokens down to just 151K tokens, while slightly improving overall accuracy. On the DeepSeek model, AutoTTS achieved the highest overall accuracy on the HMMT25 benchmark while cutting the token spend nearly in half.
For practitioners building enterprise AI applications, these experiments highlight two major operational benefits:
Raising peak performance: AutoTTS doesn't just save money on token consumption. It actively raises the peak attainable performance of the base model. The AI-designed controller is remarkably good at detecting noisy or unproductive reasoning branches on the fly and continuously redirecting its compute budget toward the branches generating the most useful reasoning signals.
Cost-effective custom development: Because the framework relies on an offline replay environment, the entire discovery process cost only $39.90 and took 160 minutes. For enterprise teams, that means optimized reasoning strategies tailored to proprietary models and internal tasks are now within reach without a dedicated research budget.
Both the AutoTTS framework and the Confidence Momentum Controller are available on GitHub; the CMC can be used as a drop-in replacement for other TTS controllers.
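The trend-based stopping and coupled width-depth ideas described in this article can be sketched in a few lines of Python. This is a hypothetical illustration, not the released Confidence Momentum Controller: the class name, the three action labels, and every numeric default (`alpha`, `stop_level`, `tolerance`) are assumptions chosen only to show the mechanism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmaTrendStopper:
    """Hypothetical controller: stop on a high, non-declining EMA of confidence."""

    alpha: float = 0.3        # EMA smoothing factor (assumed value)
    stop_level: float = 0.8   # smoothed confidence needed to stop (assumed value)
    tolerance: float = 0.01   # decline smaller than this is not "declining" (assumed)
    ema: Optional[float] = None
    prev_ema: Optional[float] = None

    def update(self, confidence: float) -> str:
        """Consume one confidence reading and return 'stop', 'widen', or 'deepen'."""
        self.prev_ema = self.ema
        if self.ema is None:
            self.ema = confidence
        else:
            self.ema = self.alpha * confidence + (1 - self.alpha) * self.ema
        if self.prev_ema is None:
            return "deepen"  # no trend is available after a single reading
        trend = self.ema - self.prev_ema
        if self.ema >= self.stop_level and trend >= -self.tolerance:
            return "stop"    # high overall level and not actively declining
        if trend <= 0:
            return "widen"   # stalled or regressing: spawn new branches
        return "deepen"      # still improving: keep extending current branches


if __name__ == "__main__":
    controller = EmaTrendStopper()
    # A single spike to 0.95 would trip an instantaneous 0.8 threshold,
    # but the smoothed value stays low, so this controller keeps going.
    for reading in [0.2, 0.95, 0.3, 0.9, 0.9, 0.9, 0.9, 0.9]:
        print(reading, controller.update(reading))
```

Returning "widen" from the same function that decides when to stop is what couples the two decisions: a stall in confidence is handled by one rule rather than by separate width and depth heuristics.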
Microsoft to release new coding model next week, the Information reports | Reuters
The tech giant has primarily relied on AI models from OpenAI, Anthropic and Big Tech rival Google (GOOGL.O) to power its GitHub Copilot AI tool for software developers.
Differentiable Belief-based Opponent Shaping
arXiv:2605.29042v1 Announce Type: new Abstract: Human coordination often relies on the ability to influence the beliefs of others through strategic action. In multi-agent reinforcement learning, opponent shaping attempts to replicate this influence, though existing methods typically operate within an opponent's parameter, policy, or value space. Meanwhile, belief-manipulation techniques in hidden-role games often rely on hard-coded objectives, such as deception or belief saturation. We propose Differentiable Belief-based Opponent Shaping (D-BOS), a first-order method that treats each observer's belief as the shaped opponent state and differentiates through k-step softmax-Bayes belief dynamics. Rather than explicitly rewarding deceptive or cooperative behavior, our method treats the belief state as the target for shaping. This allows the optimal strategy to emerge naturally from the environment's reward structure. This belief-space formulation provides an opponent-shaping signal by differentiating through opponent belief updates, and naturally extends to multiple observers by aggregating gradients over their individual inferred belief trajectories. Empirically, D-BOS outperforms PPO and BBM in hidden-role games, with the largest gains in mixed-motive settings.
Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning
arXiv:2605.28829v1 Announce Type: cross Abstract: Competitive STEM examinations such as JEE and NEET require multi-step symbolic reasoning, precise numerical computation, and deep conceptual understanding across physics, chemistry, and mathematics. Recent large language models perform strongly on common reasoning benchmarks, yet they remain difficult to deploy at scale, where millions of student doubts demand domain-specific, consistently structured problem solving. We introduce Aryabhata 2, a reasoning-focused language model for competitive STEM examinations, trained via reinforcement-learning post-training. Using PhysicsWallah's internal question banks, we construct a high-quality training curriculum and post-train GPT-OSS-20B through reinforcement learning with verifiable rewards. Training combines prolonged reinforcement learning with broadened exploration via progressively larger rollout group sizes. We evaluate Aryabhata 2 on competitive examination benchmarks, including JEE Main, JEE Advanced, and NEET, as well as out-of-distribution reasoning datasets such as AIME, HMMT, MMLU-Pro, MMLU-Redux 2.0, and GPQA. Results show that Aryabhata 2 outperforms its base model GPT-OSS-20B on competitive STEM reasoning while requiring substantially fewer output tokens (up to 64% fewer).
New AI Usage Report: Enterprise AI Risk Is Heavily Concentrated Among a Small Group of AI "Power users"
More than 6% of enterprise AI conversations contain sensitive data, with DeepSeek reaching 12.63%, increasing governance risks.
The new face of warfare: how AI and hybrid conflict reshape global security - Trend.Az
For this reason, media strategy, digital narratives, and public opinion management are now essential components of national security planning. Economic instruments are also playing an increasingly central role. Sanctions, technology export controls, financial restrictions, and supply-chain leverage have become powerful tools of geopolitical ...
Hallucination Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
arXiv:2605.29055v1 Announce Type: new Abstract: Hallucination remains a major reliability barrier for production LLM systems, particularly in multi-agent pipelines where unsupported claims can propagate unchecked across stages. This paper adapts a HOPE-inspired Nested Learning architecture with Continuum Memory Systems (CMS) and semantic similarity caching to a hybrid benchmark of 310 prompts combining 217 epistemic-uncertainty prompts and 93 fabrication-induction stress-test prompts. A three-stage agentic pipeline orchestrated via the Open Floor Protocol (OFP) is evaluated with five KPIs -- FCD (Factual Claim Density), FGR (Factual Grounding References), FDF (Fictional Disclaimer Frequency), ECS (Explicit Contextualization Score), and OSR (Observability Score Ratio) -- aggregated into THS (Total Hallucination Score) across five weighting configurations to study mitigation-observability trade-offs. FDF, ECS, OSR, and FGR are subtracted as mitigation signals, so that a more negative THS indicates stronger mitigation. The FrontEndAgent is configured as a high-stochasticity generator (temperature = 1.0) to produce a realistic hallucination baseline, while the SecondLevelReviewer and ThirdLevelReviewer operate as progressive correctors. This asymmetric design yields end-to-end THS reductions of -31.3% to -35.9% across five weighting configurations. Semantic caching achieves 440 cache hits over 930 potential calls (47.3% hit rate), reducing LLM invocations to 490, lowering energy and CO2e footprint, and making multi-stage review pipelines operationally viable at production scale. ExtremeObservability attains the most negative final THS (-0.0709), confirming that observability-heavy configurations reinforce rather than compromise mitigation. These findings suggest that memory-augmented multi-agent designs can jointly improve factual reliability, operational efficiency, and auditability without model retraining.
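The semantic caching idea in this abstract, answering a new prompt from the cache when it is close enough to an earlier one, can be sketched as follows. This is a toy illustration rather than the paper's implementation: the bag-of-words `embed` function stands in for a real sentence encoder, and the `SemanticCache` class, its `ask` method, and the 0.8 similarity threshold are all assumptions. The last lines only re-derive the hit rate and invocation count from the figures the abstract reports.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real system would use a sentence encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache that reuses answers for sufficiently similar prompts."""

    def __init__(self, llm: Callable[[str], str], threshold: float = 0.8):
        self.llm = llm
        self.threshold = threshold  # assumed similarity cut-off
        self.entries: List[Tuple[Counter, str]] = []
        self.hits = 0
        self.calls = 0

    def ask(self, prompt: str) -> str:
        self.calls += 1
        query = embed(prompt)
        for vector, answer in self.entries:
            if cosine(query, vector) >= self.threshold:
                self.hits += 1
                return answer           # cache hit: no LLM invocation
        answer = self.llm(prompt)       # cache miss: invoke the model once
        self.entries.append((query, answer))
        return answer


if __name__ == "__main__":
    cache = SemanticCache(llm=lambda p: f"answer to: {p}")
    cache.ask("what is the capital of France")
    cache.ask("What is the capital of france")   # same words, different casing
    cache.ask("explain quantum entanglement")
    print(cache.hits, cache.calls)

    # Figures reported in the abstract: 440 hits over 930 potential calls.
    reported_hits, reported_calls = 440, 930
    print(f"{reported_hits / reported_calls:.1%}")   # 47.3%
    print(reported_calls - reported_hits)            # 490
```

A linear scan over cached entries is enough for a sketch; a production cache would more likely use an approximate nearest-neighbour index.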
Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms
arXiv:2605.30169v1 Announce Type: new Abstract: As autonomous language model agents proliferate, forming an emerging agentic web with real-world consequences, what credibility signals can you use to decide whether to trust an unfamiliar agent in the wild and delegate to it? A natural governance intuition is to extend human identity verification and reputation mechanisms, from "Know Your Customer" and credit scores to "Know Your Agent" regimes. However, we argue that this analogy is fundamentally incomplete. Reputation mechanisms function both as social signals and as corrective feedback that sustain an equilibrium of trustworthy behavior, presuming a persistent identity associated with behavioral continuity, sanction sensitivity, and costly non-fungibility. Yet language model agents are ontologically dissociative: they are essentially an assemblage of mutable modules (foundational models, system prompts, tool-access policies, external memory, and, in some cases, a multi-agent system as a whole), any of which may change agent behavior, with a fluid persona that is also vulnerable to adversarial attack and may not internalize sanctions. Drawing on dissociative identity disorder jurisprudence, this dissociativity leaves agents without grounding for identifiability, predictability, credibility, and rehabilitability, the very properties that reputation mechanisms aim to sustain, thereby collapsing trust. We argue that identity-based, ex post, regulative, sanction-based governance, such as reputation, is structurally inapplicable to dissociative agents, and we suggest a shift to observability-based, ex ante, constitutive, protocol-based behavioral harnesses.
5 Key AI Cybersecurity, Risk Considerations - The NonProfit Times
Artificial intelligence (AI) has gone from nowhere to everywhere in just about three years. If you are like most nonprofit leaders, you are already using AI, but you still have questions about the cybersecurity considerations and risks for your organization. Today, 92% of nonprofits have adopted ...
‘Detect, understand, respond’ driving OMB, CISA’s latest cyber efforts | Federal News Network
Nick Andersen, the acting director of CISA, said an intergovernmental effort is providing critical infrastructure owners more help against cyber threats.
Google launches AI threat defense prioritizing real world cybersecurity threats
Google introduces AI Threat Defense, an advanced cybersecurity tool designed to prioritize real-world threats and automated fixes, challenging Anthropic Mythos and OpenAI Daybreak.
Adoption, Deployment & Impact
Adopt ≠ Adapt: Longitudinal Analyses of LLM Conversations in the Wild
arXiv:2605.29018v1 Announce Type: new Abstract: Although a growing body of research has begun to describe user-LLM interactions, the picture it paints is largely static; little is known about how individual users change their behavior over time. To address this gap, we analyze the conversational trajectories of roughly 12,000 randomly sampled Microsoft Bing Copilot users and compare these with data from WildChat-4.8M. While the Copilot data contains significant population-level trends, we find that trends in individual user trajectories are much weaker; user habits prove to be overwhelmingly sticky. We also find stark differences between users of different activity levels: more active users have more successful conversations and use the LLM for more complex and professionally oriented tasks. Some user trends also appear in WildChat-4.8M, but we find evidence that this dataset is significantly skewed towards highly proficient "power" users. Ultimately, our results suggest that existing user behavior is difficult to change and demonstrate the extent of user heterogeneity. Our comparison between datasets highlights that WildChat does not represent typical user-AI interactions, an important caveat for downstream uses of the data.
AI & Tech Brief: Beyond the hyperscalers - The Washington Post
Stout argues that businesses will increasingly want to use their own “sovereign” AI models, as opposed to renting the frontier models from the hyperscalers.
BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation
arXiv:2605.28994v1 Announce Type: new Abstract: AI tools to support real world decision making must be able to build simulation models that inform their recommendations and render them interpretable. Tools that can automate aspects of modeling practice must complement human expertise, not replace it. The BEAMS Initiative aims to guide the development of AI tools for modeling and simulation toward forms that are responsible and ethical by establishing benchmarks for human centered modeling and simulation practices. The initiative uses open digital and organizational infrastructure to collaboratively evaluate AI tools for modeling and simulation. The open source sd ai project hosted by the initiative establishes transparency and enables contributions to be shared broadly. A steering group focuses on prioritizing potential benchmarks, while a technical group focuses on implementing the benchmarks in the form of automated tests. Tests for several distinct categories of evaluation have been implemented and applied to AI tools that support qualitative model building, quantitative model building, and model discussion. These include tests for causal translation, model iteration, causal reasoning, conformance, model behavior explanation, suggested model building steps, and suggested model fixes. When engines from the sd ai project are coupled with different LLMs, their performance on these evaluations reveals variability across different AI tools. The evaluations implemented by the initiative demonstrate that AI enabled modeling tools perform better at discussion and basic qualitative tasks than with causal reasoning and quantitative error fixing. No single LLM dominates across engine types, highlighting the importance of specific tasks and tradeoffs between speed and accuracy. Ongoing efforts of the initiative aim to incorporate benchmarks that address concerns about bias by considering alternative perspectives and human centered use cases.
Mind Your Tone: Does Tone Alter LLM Performance?
arXiv:2605.29027v1 Announce Type: new Abstract: The use of Large Language Models (LLMs) is proliferating, yet their performance is observed to vary based on prompting styles and tones. In this study, we investigate both whether and how tonal variations in prompts lead to disparate LLM accuracy for objective multiple-choice questions. We use two datasets: a 50-base question dataset with five tone variants and a 570-base question MMLU subset spanning 57 subjects with seven tone variants. Experiments were conducted to evaluate the performance of four cost-efficient, popular LLMs: ChatGPT-4o, ChatGPT-5-nano, Gemini 2.5 Flash, and Gemini 2.5 Flash Lite. Across models, tonal effects are systematic but highly model-dependent. Some models show small, yet statistically significant, shifts, while others exhibit large accuracy swings across tones. Further, we identify subject-level differences in tone sensitivity and present a routing framework to explain how tones may attune internal reasoning modes. Our findings caution users against assuming tone-robust reliability in LLM deployments.
Amazon scraps AI leaderboard to stop workers chasing usage scores
Senior executive Dave Treadwell tells staff ‘don’t use AI just for the sake of using AI’ as costs rise
Tokenmaxxing is over. That’s because it never measured what really counts to see ROI from AI
Token usage is a poor proxy for firm-wide productivity gains. Those only come with workflow redesign.
Is Your AI ROI Getting Stuck In a Logjam?
Individual AI implementation is driving execution at warp speed while approval cycles struggle to keep up. The result? Instead of ROI, you're stuck with an AI logjam.
The Services Budget Is AI's Biggest Prize | PYMNTS.com
Enterprise software budgets are large. Enterprise services budgets are larger. A company might spend $10,000 licensing accounting software and 10 times
CloudZero, The AI ROI Company, Launches the Financial Control Plane for AI
/PRNewswire/ -- CloudZero, The AI ROI Company, today launched the financial control plane for AI: a shared system that finance, IT, and engineering teams use...
Geopolitics, Policy & Governance
Does Distributed Training Undermine Compute Governance?
arXiv:2605.29359v1 Announce Type: new Abstract: Compute governance proposals often rely on the assumption that frontier AI training requires large, detectable computing clusters. However, recent advances in distributed training algorithms could allow developers to conduct frontier-scale training on distributed agglomerations of hardware, rather than needing large datacenter facilities. Developers who prefer not to be constrained by regulations may structure their hardware in a manner that evades the registration and monitoring requirements associated with compute governance. Therefore, regulations must be designed to detect and prevent illicit distributed training operations. This paper evaluates the feasibility of such evasion and outlines recommended countermeasures, including whistleblowing, chip tracking, forensic accounting, and memory and compute thresholds for clusters.
Not using AI in public services would mean ‘choosing decline’, UK minister warns
Newly appointed chief secretary to the Treasury Lucy Rigby wants to roll out technology across Whitehall
Control within connection: How data sovereignty is rewriting the rules of critical infrastructure
Presented by Equinix
Digital systems are central to economic resilience. But the governance models supporting them were designed for a bygone era, when systems were smaller, often centralized, and rarely crossed multiple jurisdictions. This structural mismatch is driving the realization across boardrooms and governments that data sovereignty is not only core to critical infrastructure, but its implications determine the trajectory of the global economy.
The scale of change is forcing the issue. IDC projects the global datasphere will continue to grow at an extraordinary pace, driven by AI workloads, real-time analytics, and always-on digital services. This is placing unprecedented demands on data center capacity, interconnection density, and operational reliability, a trend highlighted by both McKinsey and Goldman Sachs last year.
More data means demand for more infrastructure. Infrastructure expansion means more interconnected systems. And more interconnected systems mean greater exposure when control is unclear.
That is why sovereignty is now coming into focus for nation states and private sector actors alike. It’s more than an abstract legal concept. There are practical questions around who has the authority when systems span countries, clouds, and ecosystems.
Control determines resilience in a fragmented world
Infrastructure resilience has always depended on clarity. Power grids work because ownership, responsibility, and control are well understood by stakeholders and the public. The same principle should apply to digital infrastructure, even if the underlying systems look much different.
Data sovereignty aligns authority with accountability. Organizations retain decision-making power over where data lives, how it moves, who can access it, and which technologies are allowed to touch it. When something breaks or regulators ask difficult questions, there is no ambiguity about who is responsible.
Gartner’s Top Strategic Technology Trends for 2026 underscores this shift by emphasizing that modern infrastructure is inseparable from governance, resilience, and digital trust. Treating sovereignty as a bolt-on compliance requirement rather than an architectural principle is proving insufficient.
The challenge, of course, is that modern enterprises cannot simply look inward and ignore macro circumstances. Scale, performance, and innovation depend on participation in global digital ecosystems.
A false paradox: scale vs. authority
For years, organizations were told they had to choose. Either maintain tight control and accept limited connectivity, or embrace global platforms and accept reduced authority over data flows and infrastructure decisions. Neither holds up under real-world conditions.
Financial services firms require low-latency access to markets across regions, all while adhering to strict regulatory expectations. Healthcare organizations must have secure data control without walling themselves off from cloud-based analytics and AI innovation. Governments demand digital services that scale while remaining auditable and transparent.
This tension is why simplistic sovereignty narratives fail to pass muster. Sovereignty is more nuanced than isolation: the concept means control within connection.
The distinction is becoming clearer as hyperscalers, regulators, and enterprises sharpen their approaches. Public disclosures from leading hyperscalers demonstrate how sovereign cloud offerings attempt to address data residency and operational separation. However, most large organizations recognize long-term control cannot rely on any single provider or managed platform alone.
A distinction of responsibility leads to an industry inflection point
The infrastructure strategies showing the most durability share a common theme: clean separation between infrastructure operations and data authority.
In this model, providers are responsible for running highly resilient facilities, physical security, power, cooling, and high-performance interconnection at scale. Customers are fully in control of their data, applications, security posture, and governance decisions. Authority stays with the party that owns the risk.
This is where neutral infrastructure platforms like Equinix come in, not as a cloud service provider, but as an interconnected foundation where customers deploy and control their own environments while accessing a broad ecosystem of networks, clouds, and partners. Equinix views sovereignty as customer-controlled by design, with clear boundaries around possession, custody, and control. That approach is in high demand from regulated industries.
The benefits show up in auditability, legal clarity, and operational confidence. Trust comes with verification. When responsibilities are clear, compliance is verifiable rather than assumed.
Ambiguity is unacceptably expensive for AI workloads
Artificial intelligence accelerates these dynamics. AI systems are data-hungry and regulation-sensitive, a combination that leaves little room for governance shortcuts. Financial institutions like Bank of America and Morgan Stanley have forecast that AI-driven data center growth will place new pressure on infrastructure planning, energy availability, and geographic distribution. Simultaneously, AI models need to operate close to sensitive data, rather than exporting that data across borders for centralized processing.
Without a clear sovereignty framework, organizations face difficult compromises. But with one, they achieve flexibility. Models move to data. Data remains controlled. Innovation accelerates without triggering regulatory alarms. That balance is emerging as a competitive differentiator.
Infrastructure in 2026 looks different, and expectations are reset
The critical infrastructure powering the digital economy goes beyond physical assets.
It now includes governance models, legal posture, and control structures that determine how systems behave under pressure. European Commission updates to data sovereignty and digital strategy frameworks reflect this, as governments increasingly treat data governance as a matter of economic and national resilience. Deloitte’s digital sovereignty research for 2026 echoes that theme across global enterprises, especially those operating in multiple regulatory regimes.
The organizations adapting fastest are not retreating from global connectivity. Rather, they are designing for it and embedding sovereignty as an architectural requirement. As enterprises navigate more fragmented regulatory environments, the ability to maintain jurisdictional control across interconnected digital ecosystems is a baseline infrastructure expectation rather than a specialized requirement.
That expectation is now shaping how infrastructure is built. Enterprises increasingly require network-level sovereignty enforcement that operates across hybrid multicloud environments automatically, including during outages, failovers, and congestion events where data can cross borders invisibly. Capabilities such as Equinix Fabric Geo Zones reflect that demand, delivering the first network-level, multicloud sovereignty enforcement layer built natively into the interconnection fabric itself.
The rules of infrastructure are being rewritten. Data sovereignty is the architectural foundation that resilient, globally connected enterprises demand. Organizations that treat it as such will be better equipped to operate, compete, and withstand pressure. Those that do not will find the status quo ambiguity increasingly costly.
Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.
EU should build chips, cloud and AI capacity to curb reliance on US, drafts say
Draft European Commission documents suggest the EU should develop domestic alternatives for chips, cloud, and AI to reduce dependence on US hyperscalers and software providers.
UK and European passports linked to restricted Chinese investors
Critics raise concerns around critical national infrastructure and potential counterfeiting
When Models Disagree: Rethinking LLM Evaluation for Public Comment Analysis
arXiv:2605.29025v1 Announce Type: new Abstract: Federal agencies are deploying large language models (LLMs) to categorize public comment corpora, where the model's organization of the record shapes what policymakers see and which arguments register. Standard evaluation, anchored on stance accuracy against a small validated set, cannot detect when different models produce materially different categorizations of the same public input. We propose an Interpretive Audit Pipeline that treats multi-model disagreement as diagnostic of interpretive complexity and directs human review toward genuinely ambiguous public input. Analyzing 1,260 public comments on a federal USDA docket across four LLMs, we find that inter-model thematic divergence exceeds within-model prompt variation, and that an expert rubric suppresses deep interpretive disagreement without resolving it. In a two-stage labeling study on a stratified 40-comment subsample, four LLMs and a human annotator labeled independently and then revised after seeing the others' labels. Revision behavior varied across labelers, and the human annotator's revisions frequently introduced framings absent from the ensemble's collective output. We argue disagreement-based evaluation is a necessary complement to accuracy metrics for LLM-assisted interpretive coding.
How to close AI’s accountability loophole
Governance of new technologies must be determined by elected officials rather than the fastest-moving companies
The Biosecurity Blind Spot: Systematic Dual-use Detection in Open Science Infrastructure
arXiv:2605.28843v1 Announce Type: cross Abstract: AI is transforming life sciences research at unprecedented speed, accelerating discovery across protein structure prediction, genome modeling, and drug development (Jumper et al., 2021; Mak et al., 2024). Yet this rapid advancement, coupled with the open science movement, introduces significant dual-use research concerns that have received limited empirical scrutiny. Here we present the first systematic analysis of dual-use research of concern (DURC) content on open preprint servers. We screened ~52,000 bioRxiv preprints (2024-2025) using a hybrid pipeline of lexical filtering and large language model (LLM) evaluation, scoring metadata across nine DURC, three PEPP, and five governance categories aligned with U.S. and Australia Group oversight frameworks. Our analysis reveals that dual-use-adjacent knowledge is routinely present in openly accessible titles and abstracts, often exceeding established risk thresholds even in studies with legitimate public health objectives. While this mapping captures surface-level information diffusion, it does not measure operational capability, downstream misuse potential, or the substantial technical and biosafety barriers that constrain harmful application. We argue that institutional review processes, funding requirements, and preprint platform policies must evolve to incorporate proactive, metadata-level monitoring without compromising scientific transparency. Ultimately, harmonizing controlled-access mechanisms for high-risk methodologies with open summaries of scientific contributions offers a pragmatic framework for governing AI-accelerated biology at scale.
Opinion | Elizabeth Warren’s AI plan is to raise taxes and stifle innovation - The Washington Post
President Ronald Reagan said in 1986, “Government’s view of the economy could be summed up in a few short phrases: If it moves, tax it. If it keeps moving, regulate it. And if it stops moving, subsidize it.”
All major AI models violate EU regulations — study
According to a new testing tool, certain models violated the rules in up to 93% of cases.
Illinois is close to enacting an AI safety law with broader mandates than other states’. | The Verge
Governor JB Pritzker says he plans to sign a bill passed Wednesday by the state legislature, which would require independent audits and whistleblower protections at AI companies. Those features go beyond recently passed AI safety laws in New York and California, according to NBC News, while ...
Hong Kong flags cross-border data hurdles for Greater Bay Area legal AI model
Differing data-governance rules across Guangdong, Hong Kong and Macao are creating obstacles to developing an AI foundation model for legal services, according to Deputy Secretary for Justice Cheung Kwok-kwan.
Spain approves draft law adapting the EU AI Act into national legislation | Digital Watch Observatory
A new draft law on AI governance in Spain covers provider responsibility, human oversight, sanctions, and public-sector use.