Fri 22 May 2026

U.S. to Award Quantum-Computing Firms $2 Billion and Take Equity Stakes

IBM, set to receive $1 billion of the package, saw large stock gains along with other companies involved.

World Trade Grew Strongly at Start of Year on AI Boom

World trade flows continued to increase at a rapid pace in the first three months of the year, boosted by the boom in AI-related investment.

UK’s Softcat Recasts Itself as AI Winner With Guidance Upgrade

Softcat Plc’s image among investors is quickly shifting from AI loser to AI winner.

Who Uses AI? Platforms, Workforce, and AI Exposure

arXiv:2605.21743v1. A growing literature uses artificial intelligence platform conversation logs to measure occupation exposure. We show that these scores partly measure the platform user base rather than the workforce. Holding outcome, sample, controls, and estimator fixed while varying only the platform input changes the post-ChatGPT employment coefficient by a factor of 1.9, and within-vendor consumer-versus-enterprise channels produce estimates that disagree in sign. Reweighting to Bureau of Labor Statistics workforce shares attenuates estimates by 42 to 93 percent. We formalize the non-classical measurement error and derive probability limits and partial-identification bounds for employment elasticities. The bias understates substitution more than augmentation.
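The reweighting idea in the abstract above can be illustrated with a minimal post-stratification sketch. The occupation groups, shares, and exposure scores below are invented placeholders, not figures from the paper, and the paper's actual estimator may well differ; the sketch only shows how an average changes when platform user shares are replaced by workforce shares.

```python
def poststratification_weights(sample_shares, target_shares):
    """Weight for each group = target share / sample share."""
    return {g: target_shares[g] / sample_shares[g] for g in sample_shares}


def weighted_mean(values, shares):
    """Share-weighted average of per-group values."""
    total = sum(shares.values())
    return sum(values[g] * shares[g] for g in values) / total


# Hypothetical inputs (assumptions for illustration only).
platform_shares = {"software": 0.50, "office": 0.30, "trades": 0.20}
workforce_shares = {"software": 0.10, "office": 0.40, "trades": 0.50}
exposure = {"software": 0.8, "office": 0.5, "trades": 0.1}

weights = poststratification_weights(platform_shares, workforce_shares)

# Average exposure as seen through the platform's own user mix.
raw = weighted_mean(exposure, platform_shares)

# Same scores after reweighting each group to its workforce share.
reweighted_shares = {g: platform_shares[g] * weights[g] for g in platform_shares}
reweighted = weighted_mean(exposure, reweighted_shares)

print(f"raw={raw:.2f} reweighted={reweighted:.2f}")
```

In this toy example the platform over-represents the most exposed group, so the reweighted average is lower than the raw one.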

NYT· Today

Giving Workers a Stake in A.I. Gains Traction

Gov. Gavin Newsom of California has floated a policy idea that’s getting attention in Silicon Valley: let workers own a piece of technology disruption.

Economics & Markets

28 articles

AI Business Models · 1 article

Guardian· Yesterday

Spotify and Universal Music agree deal to let subscribers create AI remixes

Licensing agreement will allow listeners to use AI to create content on streaming platform for first time. Spotify and Universal Music Group have agreed on a deal that will allow subscribers to generate song covers and remixes using artificial intelligence. The licensing agreement is the first time the Swedish streaming company will allow listeners to use AI to create content through its platform.

AI Investment & Valuations · 11 articles

UK’s Softcat Recasts Itself as AI Winner With Guidance Upgrade

Softcat Plc’s image among investors is quickly shifting from AI loser to AI winner.

Reuters· Today

Exclusive: Grok falls flat in Washington, undercutting SpaceX's AI growth story

WASHINGTON, May 21 (Reuters) - SpaceX’s initial public offering is set to be the largest in history, partly fueled by its promise to grab a chunk of what it calls a multi-trillion-dollar market for artificial intelligence services through ...

Editor's pick · Energy & Utilities

Reuters· Yesterday

SpaceX IPO filing lays bare losses and Musk control as it stakes future on AI

Musk's purchase of his social media and AI company xAI gave SpaceX new capabilities and opportunities but also a staggering amount of spending, accounting for 76% of its $10.1 billion in capital spending in the first quarter, as well as fresh losses.

CNBC· Yesterday

An AI trade involving energy and infrastructure that's doubled your money, topping Nvidia

If you put the same money into a basket of companies that are building out AI infrastructure and energy sources, you’ve done much better than stocks like Nvidia.

Reuters· Today

Reuters | Breaking International News & Views

Elon Musk’s rocket firm is poised to float with a $1.75 trillion valuation. With OpenAI coming, starry-eyed investors are likely to focus on mega floats. Windscreen fixer Belron’s mix of stable growth and low AI-disruption risk could show that even earthier listings can take off.

I’ve spent 25 years in venture capital. Here’s how it quietly shut ordinary Americans out of the AI wealth boom—and what could fix it

The private market didn't just grow. It replaced the system that once let regular investors participate in America's biggest wealth-creation moments. One solution keeps getting ignored.

Guardian· Today

Mars colony and Grok warnings: five strange details in SpaceX’s pitch to investors

IPO filing from Elon Musk’s company reveals closer look at finances, cosmic ambitions and tech empire’s quirks. SpaceX publicly released an investor prospectus on Wednesday as part of its plan for a $1.75tn debut on the US stock market next month, revealing unseen details about the finances and future plans of Elon Musk’s flagship company. In addition to new information on operating costs and revenue, the filing also included trademark Muskian sweeping proclamations about the universe and insights into some of the quirks of his tech empire. Scattered throughout the 300-plus-page prospectus are several disclosures and risk warnings that show the eccentricities of Musk’s company and its cosmic ambitions. Other financial details in the document highlight how interdependent Musk’s various businesses have become and the risks that they carry.

Fundingo· Yesterday

The Synthetic Yield: Mastering the Structural Complexity of Specialized Commercial AI Hardware and Compute Infrastructure Finance

The rapid proliferation of large language models and generative artificial intelligence has fundamentally altered the risk-return profile of specialized infrastructure ...

RS Web Solutions· Today

QUALCOMM Earnings Report and AI Smartphone Technology Insights

Such competitive dynamics can significantly impact pricing strategies, profit margins, and momentum in securing design wins, all factors of keen interest to institutional and retail investors. Regulatory frameworks and trade policies present additional industry-wide concerns. Export controls on advanced semiconductor technologies and evolving geopolitical ...

Norway’s $2.3 Trillion Fund Objects to Elkann’s Meta Board Seat

Norway’s $2.3 trillion wealth fund expressed dissatisfaction with the reappointment of John Elkann, chairman of Stellantis NV and chief executive of investor Exor NV, on the board of directors at Meta Platforms Inc.

The Motley Fool· Today

CoreWeave vs. Nebius: Which Artificial Intelligence (AI) Infrastructure Stock Is a Better Buy in 2026?

Both CoreWeave and Nebius power AI giants with critical infrastructure, but one looks like the superior investment.

AI Macroeconomics · 4 articles

World Trade Grew Strongly at Start of Year on AI Boom

World trade flows continued to increase at a rapid pace in the first three months of the year, boosted by the boom in AI-related investment.

Jamie Dimon sees ‘exuberance’ in markets. That’s a loaded word when it comes to bubbles popping

The most important economic question of our time has no consensus — and the gap between what AI can do and what economies are organized to absorb may be the defining tension of the decade.

ProMarket· Yesterday

More AI-Exposed Industries and States Are Benefiting, But Results Are Heterogenous

In new research, Christos Makridis and Andrew Johnston find that industries exposed to generative AI are seeing an increase in production, employment, and wages. However, the majority of AI-driven revenue growth is channelled back to capital as profits, rather than to workers.

Editor's pick · PAYWALL · Media & Entertainment

Can Rising Consumption Deepen Inequality?

arXiv:2601.15537v2. The impact of rising consumption on wealth inequality remains an open question. Here we revisit and extend the Social Architecture of Capitalism agent-based model proposed by Ian Wright, which reproduces stylized facts of wealth and income distributions. In a previous study, we demonstrated that the macroscopic behavior of the model is predominantly governed by a single dimensionless parameter, the ratio between average wealth per capita and mean salary, denoted by R. The shape of the wealth distribution, the emergence of a two-class structure, and the level of inequality - summarized by the Gini index - were found to depend mainly on R, with inequality increasing as R increases. In the present work, we examine the robustness of this result by relaxing some simplifying assumptions of the model. We first allow transactions such as purchases, salary payments, and revenue collections to occur with different frequencies, reflecting the heterogeneous temporal dynamics of real economies. We then impose limits on the maximum fractions of wealth that agents can spend or collect at each step, constraining the amplitude of individual transactions. We find that the dependence of the inequality on R remains qualitatively robust, although the detailed distribution patterns are affected by relative frequencies and transaction limits. Finally, we analyze a further variant of the model with adaptive wages emerging endogenously from the dynamics, showing that self-organized labor-market feedback can either stabilize or amplify inequality depending on macroeconomic conditions.
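The two summary quantities named in the abstract above, the Gini index and the ratio R of average wealth per capita to mean salary, can be computed as in the following minimal sketch. The function names and sample numbers are illustrative assumptions; this is not the authors' simulation code and it does not reproduce the agent-based model itself.

```python
def gini(wealth):
    """Gini index of a list of non-negative values (0 = perfect equality)."""
    xs = sorted(wealth)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, with ranks starting at 1.
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * ranked_sum / (n * total) - (n + 1) / n


def wealth_to_salary_ratio(wealth, salaries):
    """R = average wealth per capita divided by mean salary."""
    mean_wealth = sum(wealth) / len(wealth)
    mean_salary = sum(salaries) / len(salaries)
    return mean_wealth / mean_salary


# Hypothetical populations (assumptions for illustration only).
equal = [1, 1, 1, 1]
concentrated = [0, 0, 0, 1]
print(gini(equal), gini(concentrated))
print(wealth_to_salary_ratio([10, 30], [2, 2]))
```

For a finite population of n agents the index tops out at (n - 1) / n when one agent holds everything, which is why the concentrated example does not reach 1.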

AI Market Competition · 5 articles

China’s AI-Made Video Is Changing the Entertainment Landscape

Such services pose an existential threat to traditional entertainment.

TechFlow· Yesterday

AI Startup Companies’ $80 Billion ARR: 90% Captured by Just Two Compan…

This isn’t a winner-takes-all scenario—it’s the winner flipping the table.

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

Visma Dinero halts Danish rollout of new AI assistant to preserve competition

Visma Dinero stopped the release of its AI assistant in Denmark after antitrust authorities warned that the tool could facilitate anticompetitive information exchange between rivals.

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

Music publishers file amended US claims against Anthropic

Universal Music, Concord Music Group, and ABKCO Music filed an amended complaint accusing Anthropic of copyright infringement through the unauthorized use of lyrics in AI model training.

🎙️ Exclusive interview: Sundar Pichai on AI's flip phone moment· Today

Exclusive interview: Sundar Pichai on AI's flip phone moment

An exclusive interview featuring Google CEO Sundar Pichai discussing the current state and future of artificial intelligence.

AI Pricing & Cost Curves · 1 article

Editor's pick · Transportation & Logistics

Council Post: Why The Cheapest AI Stack Becomes The Most Expensive At Scale

A small fraction of queries that are slow, expensive or cold-started will drive most of the user-facing latency that matters.

AI Productivity · 3 articles

COAgents: Multi-Agent Framework to Learn and Navigate Routing Problems Search Space

arXiv:2605.20618v1. Although Vehicle Routing Problems (VRP) are essential to many real-world systems, they remain computationally intractable at scale due to their combinatorial complexity. Traditional heuristics rely on handcrafted rules for local improvements and occasional jumps to escape local minima, but often struggle to generalize across diverse instances. We introduce COAgents, a cooperative multi-agent framework that models the search process as a graph: nodes represent solutions, and edges correspond to either local refinements or large perturbations for diversification (i.e., jumps). A Partial Search Graph (PSG) is dynamically constructed during search, enabling COAgents to train a Node Selection Agent and a Move Selection Agent to guide intensification, and a Jump Agent to trigger well-timed explorations of new regions. Unlike end-to-end learning approaches, COAgents cleanly separates problem-agnostic search control from compact domain-specific encoding, facilitating adaptability across tasks. Extensive experiments on the CVRP and VRPTW benchmarks show that COAgents remains competitive with several learn-to-search baselines on CVRP and sets a new state of the art among learning-based methods on the more challenging VRPTW instances, reducing the gap to the best-known solutions by 14% at N=100 and 44% at N=50 relative to the strongest neural solver (POMO), and by 21% and 40% respectively relative to ALNS. Code is available at https://github.com/mahdims/COAgents.

I've led companies through every major tech disruption. AI washing is the same mistake, every time | Fortune

Leaders using AI to justify workforce cuts are missing the real opportunity to build more capable organizations.

PIIE· Yesterday

The adoption of AI by industrial sectors

Artificial intelligence (AI) appears destined to impact the entire economy but could affect industries and companies very differently. Some sectors are more susceptible to the application of AI than others. Within sectors, some firms will exhibit greater expertise in adopting AI than others.

AI Startups & Venture · 3 articles

Artificial Intelligence Newsletter | May 21, 2026· 2 days ago

OpenAI to open first international applied AI lab in Singapore

Singapore and OpenAI signed a partnership to establish an Applied AI Lab, the first outside the US, to boost AI adoption and talent development.

Editor's pick · Consumer & Retail

UK gets a fresh new unicorn as beauty and wellness platform Fresha lands €68.9 million from KKR

Fresha, a London-based AI-powered marketplace and business management platform for the beauty and wellness industry, has announced a €68.9 million ($80 million) primary growth investment from funds managed by KKR, a global investment firm. This deal values Fresha at over €861.9 million ($1 billion) and brings Fresha’s total capital raised to €245.5 million ($285 million). […]

Bizjournals· Today

How entrepreneurs are chasing AI in a different way than big companies - Charlotte Business Journal

Here's how artificial intelligence is giving Charlotte entrepreneurs the ability to automate work, reduce staffing and operate with new speed and flexibility.

Labor, Society & Culture

26 articles

AI & Culture · 2 articles

MIT Technology Review· Yesterday

Scaling creativity in the age of AI

Storytelling is core to humanity’s DNA, stemming from our impulse to express ideals, warnings, hopes, and experiences. Technology has always been woven through the medium and the distribution: from early humans’ innovation of natural pigments and charcoals for cave paintings to literal representation by the camera. The landscape of storytelling continues to shift under our…

Personality Engineering with AI Agents: A New Methodology for Negotiation Research

arXiv:2605.20554v1. According to canonical negotiation theory, people's success in a negotiation depends on how well they balance competing demands: empathizing and asserting, demonstrating concern for other and concern for self, being soft on the people and hard on the problem. Yet people struggle to manage these tensions, so researchers have lacked the ability to rigorously test the field's prescriptions under controlled conditions. AI agents do not face the same limitations, and their precision, repertoire, consistency, and scalability enable a new class of experiments to contribute to negotiation theory. In this article, we introduce personality engineering: a methodology that uses AI agents to precisely parameterize, manipulate, and evaluate negotiator personality. We propose using the interpersonal circumplex, and its two core dimensions of warmth and dominance, as a foundational coordinate system for the field. This approach offers both a rigorous methodology for testing classic negotiation theories and a practical guide for designing the personalities of AI negotiation agents.

AI & Employment · 9 articles

Who Uses AI? Platforms, Workforce, and AI Exposure

FT· Yesterday

Generating tax revenues in an automated world

If AI destroys job markets, governments will need to make up the resulting shortfall in labour income tax receipts

The Register· Today

Workday wants AI to punch in instead of having to hire new recruits

CEO eyes margin gains by keeping headcount flat – bold for a company selling HR software to employers

Fortune· Yesterday

Ex-Facebook exec Sheryl Sandberg says the 10-year career plan is dead thanks to AI: ‘Don’t script your career when the future is uncertain,’ she warns Gen Z

The former Meta executive says rigid career plans will backfire: "If I had one, I would have missed the internet," Sheryl Sandberg warned Gen Z.

Employee Benefit News· Yesterday

Workers say reliance on AI is eroding skills and judgment

A new GoTo study finds workers increasingly depend on AI tools, raising concerns about misuse, poor judgment and declining skills.

AI Might Not Bring On A Job Crisis, But A Workforce ‘Mismatch’ Could

A new Indeed report suggests there will still be job growth, but not in all fields. Here’s how employers and workers must adapt to avoid 8% unemployment.

CNBC· Yesterday

By the Numbers: What the class of 2026 job market actually looks like — and where AI fits in

Congratulations to the Class of 2026...

New Kerala· Today

AI Skills Shortage Hits 45% of Indian Organisations: Report

Nearly 45% of Indian firms cite AI skills as top workforce constraint. SHRM India Report reveals 54% show low urgency on AI investment despite looming disruption.

PR Newswire· Yesterday

AI Is Reshaping Early Career Hiring Expectations, New ICIMS Data Reveals

ICIMS, a leading enterprise talent acquisition platform, released the ICIMS Insights May 2026 Workforce Report, revealing a growing imbalance...

AI & Inequality · 1 article

NYT· Today

Giving Workers a Stake in A.I. Gains Traction

Gov. Gavin Newsom of California has floated a policy idea that’s getting attention in Silicon Valley: let workers own a piece of technology disruption.

AI & Misinformation · 2 articles

VentureBeat· Yesterday

Americans can’t spot a deepfake, and that’s a business crisis, not just a consumer problem

Presented by Veriff

Americans can’t reliably distinguish real from AI-generated content, and that’s not just a media literacy problem; it’s a direct threat to how businesses verify identity online. New research finds that while many people are aware of deepfakes, their ability to distinguish them from reality is barely better than a coin flip. A 2026 survey conducted by Veriff and Kantar among 3,000 respondents in the United States, the United Kingdom, and Brazil shows Americans scoring just 0.07 on a scale where 0 represents random guessing.

If people can’t distinguish authentic visual content, they can’t reliably distinguish authentic identities. In practice, that means the same users interacting with digital services are often unable to tell whether the person on the other side of a screen is real. That ineffectiveness has direct consequences for every digital business that relies on image- and video-based identity verification to confirm who is on the other side of a screen. That includes everything from customer bank onboarding and account recovery to marketplace seller verification, high-value ecommerce transactions, social platform authentication, and enterprise access control. In the U.S., those consequences are already material: synthetic identity fraud now accounts for billions in annual losses, and the tools to generate convincing fakes are now widely accessible.

The report also identifies a small but high-risk cohort: the roughly 7% of users who perform poorly at detecting deepfakes, yet remain confident in their ability and rarely verify what they see. While this is small as a percentage, at scale it represents millions of accounts that are highly exploitable targets for fraud. If users can’t reliably distinguish real from synthetic identities, then any system that depends on visual verification is fundamentally exposed. Identity verification can no longer be treated as a compliance function; instead, it has to be built as core digital infrastructure.

“Now that AI-generated content is becoming indistinguishable from reality, the human eye alone is no longer a reliable line of defense,” says Ira Bondar-Mucci, fraud platform lead at Veriff. “Businesses and policymakers in the U.S. need to close this awareness gap urgently, while simultaneously investing in automated verification technologies that can catch what humans simply can’t.”

The U.S. deepfake awareness gap is wider than expected

The United States might be the global epicenter of generative AI development, but American consumers demonstrate the lowest familiarity with deepfakes among the three surveyed markets. Only 63% of U.S. adults are familiar with the term, compared to 74% in the UK and 67% in Brazil.

“There’s a paradox at play,” Bondar-Mucci says. “The U.S. is the global epicenter of AI development, yet American consumers are the least familiar with one of its most dangerous byproducts. Historically, consumers have had higher baseline trust in digital content, with the conversation about fraud centered more on data privacy than on content authenticity. The problem is that low awareness doesn’t reduce risk, it amplifies it. If you don’t know what a deepfake is, you’re far less likely to pause and verify whether you've encountered one.”

Human deepfake detection is barely better than a coin flip

In practice, the randomness that characterizes consumers’ ability to distinguish real from fake is evident across the ways people assess different types of content. Video content proved to be especially difficult to assess, with fake videos frequently identified as authentic and real videos often flagged as fake. Even in side-by-side comparisons, respondents split their judgments close to evenly, another indication that visual inspection alone is no longer a reliable method for verifying authenticity.

Overconfidence in deepfake detection creates a dangerous vulnerability

Roughly half of U.S. respondents say they are confident in their ability to identify deepfakes, but that confidence far exceeds actual performance, demonstrating that self-assessment is effectively meaningless. Within that population, there’s that small but high-risk cohort: the approximately 7% of users who are inaccurate, yet overconfident in their ability and rarely verify suspicious content.

“This confidence-competence gap creates a false sense of security that fraudsters are primed to use,” says Bondar-Mucci. “When people believe they can’t be fooled, they stop looking for the signs. That’s precisely when they’re most vulnerable, whether to a synthetic identity used in financial fraud or a fabricated video designed to manipulate trust.”

For businesses, the implication is clear: any organization that still relies on manual review processes or customer self-attestation is inheriting this vulnerability directly. Human judgment is an increasingly unreliable safeguard, and verification needs to be built into systems by default. This means automated, technology-led, and not dependent on the end user’s self-assessment of their ability to tell real from fake.

Americans are worried about deepfakes but trust platforms to handle them

Concern about deepfakes is high across the U.S., with 79% of respondents reporting they are rather or extremely concerned about personal fraud and impersonation. The U.S. diverges from other markets in where that concern gets directed. Americans are more likely than UK or Brazilian respondents to trust social media platforms and digital services to identify and manage AI-generated content. That delegation of responsibility may be reducing individual vigilance at exactly the moment the threat is accelerating.

“We’re seeing synthetic identities used to open fraudulent accounts and authorize transactions, and deepfake videos deployed to bypass basic verification checks,” he explains. “What makes this particularly urgent is the combination of great concern with relatively high platform trust. That gap between perceived and actual protection is exactly where fraud thrives.”

The business case for automated identity verification has never been stronger

The gap between what Americans believe they can detect and what they actually can is not a knowledge problem that awareness campaigns will resolve, but a design flaw in any system that places the burden of identity verification on unassisted human judgment. The effective response is not to remove humans from the verification loop, but to stop assigning them tasks that human perception can no longer perform reliably. Organizations that persist in relying on manual review processes or customer self-attestation are absorbing this vulnerability into their operations. The alternative is automated, AI-powered identity verification that operates at the point of interaction, detects synthetic media before a human decision is required, and does not depend on the end user’s ability to distinguish real from fake.

“Seeing is no longer believing,” says Bondar-Mucci. “The companies that build verification infrastructure around that reality, rather than around the assumption that it will be otherwise, are the ones best positioned to sustain customer trust as the synthetic media landscape continues to evolve.”

Editor's pick · Consumer & Retail

Detecting Synthetic Political Narratives in Cross-Platform Social Media Discourse

arXiv:2605.21540v1. The proliferation of large language models has introduced a new paradigm of synthetic political communication in which narratives may be generated, semantically coordinated, and strategically disseminated across platforms at scale. We present a cross-platform framework for detecting synthetic political narratives using four coordination signals (lexical diversity D(C), temporal burstiness B(C), rhetorical repetition R(C), and semantic homogenization H(C)), combined into a Synthetic Narrative Coordination Score SNC(C). We apply the framework to a corpus of 353,223 records spanning six geopolitical event windows collected from six Telegram channels and nine Reddit communities (2023–2026). Results show that IntelSlava exhibits the lowest lexical diversity (MATTR 0.52–0.54), the highest burstiness (B=+0.48 to +0.73), and the highest rhetorical overlap with peer channels (Jaccard 0.12), ranking first in the composite SNC(C) on four of six event windows (SNC 0.45–0.60). Rybar ranks last on all windows despite its high semantic homogenization, because its Russian-language output yields high lexical diversity and near-zero rhetorical Jaccard with English-language channels, demonstrating that no single indicator is sufficient for coordination detection. Multi-dimensional SNC(C) scoring provides a more robust and interpretable signal than any individual metric.
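Three of the indicators named in the abstract above (MATTR, a burstiness score, and Jaccard overlap) have common textbook forms, sketched below. This is an illustration under assumptions, not the paper's implementation: the abstract does not define its burstiness measure, so the Goh-Barabási coefficient is assumed here because its range of -1 to 1 is consistent with the reported values; the window size and function names are likewise hypothetical, and the composite SNC(C) weighting is not reproduced because the abstract does not specify it.

```python
from statistics import mean, pstdev


def mattr(tokens, window=50):
    """Moving-average type-token ratio: mean TTR over sliding windows."""
    if not tokens:
        return 0.0
    if len(tokens) <= window:
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return mean(ratios)


def jaccard(a, b):
    """Jaccard overlap |A ∩ B| / |A ∪ B| between two collections of items."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def burstiness(timestamps):
    """ASSUMED definition: Goh-Barabasi B = (sigma - mu) / (sigma + mu)
    over inter-event gaps; -1 is perfectly regular, values near 1 are bursty."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if not gaps:
        return 0.0
    mu, sigma = mean(gaps), pstdev(gaps)
    return (sigma - mu) / (sigma + mu) if (sigma + mu) else 0.0


# Toy inputs (assumptions for illustration only).
print(mattr(["a", "b", "a", "b"], window=2))
print(jaccard({"x", "y"}, {"y", "z"}))
print(burstiness([0, 1, 2, 3]))
```

Lower MATTR means more repetitive vocabulary, while higher burstiness and higher Jaccard overlap point toward clustered posting and shared phrasing, matching the direction of the findings reported for the top-ranked channel.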

AI Ethics & Safety · 7 articles

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

Cox Media, two others settle US FTC claims over AI marketing service

Cox Media Group and two marketing firms agreed to pay $930,000 to settle FTC allegations regarding deceptive claims about using AI to listen to smart device conversations for ad targeting.

Machine Learning as Performative Materialist Practice: Thirteen Theses on the Epistemology, Methodology, and Politics of Applied ML

arXiv:2605.21785v1. Machine learning practice in institutional decision-support contexts (government, public policy, public health, criminal justice, resource allocation) rests on a set of largely unexamined epistemological commitments inherited from classical statistics and computer science: that models represent stable regularities, that validation can be context-free, that performance metrics are politically neutral, and that feature importance reveals system structure. This paper challenges these commitments through a unified framework of performative materialist ML, articulated as thirteen theses. Drawing on Pickering's cybernetic ontology, the performativity literature from economic sociology (Callon, MacKenzie), Simon's bounded rationality, the formalization of performative prediction (Perdomo et al., 2020), and fifteen years of applied ML experience in government and public policy, we argue that: (1) ML models are best understood not as truth-seeking representations but as temporally situated compressions that function as instruments of intervention; (2) the full data product is a complex adaptive system that coevolves with its target and navigates a multi-objective space no single algorithm can optimize; (3) validity is fundamentally performative, measured by effects in the world rather than formal properties of the model; (4) the choices embedded in objective functions, fairness criteria, and resource thresholds are political decisions belonging to stakeholders, not technicians. We show how these theses unify several practical prescriptions (temporal cross-validation, precision and recall at k, pipeline-aware fairness auditing, satisficing over optimizing) as consequences of a coherent materialist epistemology rather than isolated best practices.

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

Personalization of AI chatbots becoming focus of UK data watchdog, official says

The UK's Information Commissioner's Office is examining how AI chatbots collect and use personal data as it develops new guidance for responsible AI use under data protection law.

CR4T: Rewrite-Based Guardrails for Adolescent LLM Safety

arXiv:2605.21609v1. Large language models (LLMs) are increasingly embedded in adolescent digital environments, mediating information seeking, advice, and emotionally sensitive interactions. Yet existing safety mechanisms remain largely grounded in adult-centric norms and operationalize safety through refusal-oriented suppression. While such approaches may reduce immediate policy violations, they can also create conversational dead-ends, limit constructive guidance, and fail to address the developmental vulnerabilities inherent in adolescent-AI interactions. We argue that adolescent LLM safety should be framed not solely as a filtering problem, but as a socio-technical, developmentally aligned transformation problem. To operationalize this perspective, we propose Critique-and-Revise-for-Teenagers (CR4T), a model-agnostic safeguarding framework that selectively reconstructs unsafe or refusal-style outputs into age-appropriate, guidance-oriented responses while preserving benign intent. CR4T combines lightweight risk detection with domain-conditioned rewriting to remove risk-amplifying content, reduce unnecessary conversational shutdown, and introduce developmentally appropriate guidance. Experimental results show that targeted rewriting substantially reduces unsafe and refusal-oriented outcomes while avoiding unnecessary intervention on acceptable interactions. These findings suggest that selective response reconstruction offers a more human-centered alternative to refusal-centric guardrails for adolescent-facing LLM systems.

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

arXiv:2605.22109v1. Multimodal Large Language Models (MLLMs) are increasingly deployed in human-facing roles where personality perception is critical, yet existing benchmarks evaluate this capability solely on numerical Big Five score prediction, leaving open whether models truly perceive personality through behavioral understanding or merely prejudge through superficial pattern matching. We address this gap with three contributions. (i) A new task: we formalize Grounded Personality Reasoning (GPR), which requires MLLMs to anchor each Big Five rating in observable evidence through a chain of rating, reasoning, and grounding. (ii) A new dataset: we release MM-OCEAN (1,104 videos, 5,320 MCQs), produced by a multi-agent pipeline with human verification, with timestamped behavioral observations, evidence-grounded trait analyses, and seven categories of cue-grounding MCQs. (iii) Benchmark and analysis: we design a three-tier evaluation (rating, reasoning, grounding) plus four sample-level failure-mode metrics: Prejudice Rate (PR), Confabulation Rate (CR), Integration-failure Rate (IR), and Holistic-grounding Rate (HR), and benchmark 27 MLLMs (13 closed, 14 open). The analysis uncovers a striking Prejudice Gap: across the field, 51% of correct ratings are not grounded in retrieved cues, and the Holistic-Grounding Rate spans only 0-33.5%. These findings expose a disconnect between getting the right score and reasoning for the right reason, charting a roadmap for grounded social cognition in MLLMs.

A school district’s lawsuit against Meta for mental health costs was set for trial next month. Zuckerberg settled

The school district had sought more than $60 million to create a 15-year program it said would help counteract mental health and learning issues.

Guardian· Today

Meta and Snapchat blocking Saudi dissidents’ accounts

US social media firms acting on orders from Middle East kingdom accused of being ‘instruments of repression’. Major US social media companies including Meta’s Facebook and Instagram platforms have blocked the accounts of Saudi Arabian dissidents so they are no longer visible inside the kingdom, following orders by Saudi authorities. Those affected include Abdullah Alaoudh, a US-based activist and vocal critic of Saudi human rights violations, and Omar Abdulaziz, a Canada and UK-based activist who worked closely with Jamal Khashoggi before the journalist’s murder by Saudi agents in 2018. The headline on this article was amended on 22 May 2026. An earlier version wrongly said X was blocking dissidents’ accounts. This has been corrected.

AI Skills & Education · 5 articles

Microsoft UK Stories· Yesterday

Why the UK’s AI-powered prosperity hinges on skilling for all - Microsoft UK Stories

While employees are increasingly working like it’s 2026, some organisations are still operating like it’s 2019. Unless businesses, educators, government and the technology sector as a whole work together to build AI capability more broadly across the workforce, the UK risks hampering its ...

Editor's pick · PAYWALL · Education

The Atlantic· Yesterday

Colleges Are at a Breaking Point

The AI job market has made tuition look like a dubious investment. But it only exposes the deeper identity crisis in American higher education.

Editor's pick · PAYWALL · Education

NYTimes· Yesterday

Opinion | A Defense of a Liberal Arts Education in the Age of A.I. - The New York Times

Making the case for a “useless” education · Hosted by Ross Douthat

The Hindu· Today

Employers are prioritising AI-ready skills across general, tech industries - The Hindu

As AI becomes central to workforce strategy, Indian employers are prioritising practical, AI-ready skills across both general industries and the technology sector, said Nasscom in a report it prepared in collaboration with Indeed, a global job search and hiring platform based in Texas.

Editor's pick · Healthcare

Healthcare IT Today· Yesterday

The Missing Link in Healthcare AI Adoption: Workforce Readiness | Healthcare IT Today

The following is a guest article by Anupama Shashank, Managing Director & Senior Vice President, Healthcare & Life Sciences at Kyndryl. Nearly all healthcare organizations are deploying AI across clinical, operational, and administrative functions, outpacing the global average.

Technology & Infrastructure

34 articles

AI Agents & Automation · 11 articles

Editor's pick · Telecommunications

From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA)

arXiv:2605.20608v1 Announce Type: new Abstract: Realizing Level 4/5 Autonomous Networks (AN) demands a shift from static automation to agent-native intelligence. Current operations, reliant on rigid scripts, lack the cognitive agency to handle off-nominal conditions. To address this, this letter proposes a hierarchical multi-agent reference architecture enabling high-level autonomy. The framework features a Dual-Driven Orchestrator that coordinates specialized Executive Agents, supported by a shared Public Memory for unified domain knowledge. A key innovation is the integration of agent self-awareness, which empowers the system to harmonize deliberative strategic governance with reflexive fault recovery. We instantiate and validate this architecture within a 5G Core environment. Case studies demonstrate that the system sustains critical throughput under congestion and reduces Mean Time to Repair (MTTR) by 86%, confirming its efficacy in unifying strategic planning with operational resilience.

Tool-Augmented Agent for Closed-loop Optimization, Simulation, and Modeling Orchestration

arXiv:2605.20190v1 Announce Type: new Abstract: Iterative industrial design-simulation optimization is bottlenecked by the CAD-CAE semantic gap: translating simulation feedback into valid geometric edits under diverse, coupled constraints. To fill this gap, we propose COSMO-Agent (Closed-loop Optimization, Simulation, and Modeling Orchestration), a tool-augmented reinforcement learning (RL) framework that teaches LLMs to complete the closed-loop CAD-CAE process. Specifically, we cast CAD generation, CAE solving, result parsing, and geometry revision as an interactive RL environment, where an LLM learns to orchestrate external tools and revise parametric geometries until constraints are satisfied. To make this learning stable and industrially usable, we design a multi-constraint reward that jointly encourages feasibility, toolchain robustness, and structured output validity. In addition, we contribute an industry-aligned dataset that covers 25 component categories with executable CAD-CAE tasks to support realistic training and evaluation. Experiments show that COSMO-Agent training substantially improves small open-source LLMs for constraint-driven design, exceeding large open-source and strong closed-source models in feasibility, efficiency, and stability.

MIT Technology Review· Today

The Download: coding’s future, the ‘Steroid Olympics,’ and AI-driven science

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Anthropic’s Code with Claude showed off coding’s future—whether you like it or not At Anthropic’s developer event in London this week, Code with Claude, attendees were asked if they’d shipped code…

Top Daily Headlines: Gemini accused of 30,000-line code purge and fake recovery report· Today

Gemini accused of 30,000-line code purge and fake recovery report

An AI coding agent reportedly broke production and generated fictitious post-mortem paperwork after a rollback.

Editor's pick · Transportation & Logistics

Norway’s Roboxi lands €13 million to transform airport airside operations with automation and robotics

Roboxi, a Stavanger-based startup specialising in airport airside automation and autonomy, has announced the completion of a share issue raising approximately €13 million in new equity. According to the company, the share issue generated significant interest from both new and existing shareholders. The primary investors are prominent backers based in the Rogaland region of Norway. […]

Editor's pick · Pharma & Biotech

InfotechLead· Yesterday

IDC: 93% of Enterprises See AI as Revenue Driver as Agentic AI Reshapes Business Strategy - InfotechLead

IDC revealed AI has evolved from an experimental technology into a core business growth engine, with enterprises adopting agentic AI

AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

arXiv:2605.20425v1 Announce Type: new Abstract: Designing multi-agent workflows is especially difficult in open-ended scientific settings where tasks lack curated training sets, reliable scalar evaluation metrics, and standardized interfaces between existing tools and agents. We propose AgentCo-op, a retrieval-based synthesis framework that composes reusable skills, tools, and external agents into executable workflows through typed artifact handoffs, then applies bounded self-guided local repair to implicated components when execution evidence indicates failure. In two open-world genomics case studies, AgentCo-op composes independently developed scientific agents and external tool repositories into auditable workflows without redesigning them or running global topology search. It coordinates specialized agents for spatial transcriptomics and gene-set interpretation to enable collaborative discovery from spatial transcriptomics data, and builds a parallel workflow for cross-modality marker analysis on single-cell multiome data. AgentCo-op can also import a searched workflow as a structural prior and improve it by grounding nodes with retrieved components and applying local repair, showing that synthesis and search are complementary. On six coding, math, and question-answering benchmarks, AgentCo-op achieves the best result on four benchmarks and the best average score under a unified backbone setting, while consistently reducing per-task cost relative to multi-agent baselines. Together, these results suggest that retrieval-based synthesis can extend automated agentic workflow design beyond benchmark-optimized agent graphs to open-world workflows built from existing agents, tools, and typed artifacts.

SOLAR: A Self-Optimizing Open-Ended Autonomous Agent for Lifelong Learning and Continual Adaptation

arXiv:2605.20189v1 Announce Type: new Abstract: Despite the remarkable success of large language models (LLMs), they still face bottlenecks when deployed in dynamic, real-world settings, the primary challenges being concept drift and the high cost of gradient-based adaptation. Traditional fine-tuning (FT) struggles to adapt to non-stationary data streams without resulting in catastrophic forgetting or requiring extensive manual data curation. To address these limitations within the streaming and continual learning paradigm, we propose the Self-Optimizing Lifelong Autonomous Reasoner (SOLAR), an open-ended autonomous agent that leverages parameter-level meta-learning to self-improve, treating model weights as an environment for exploration. It initiates the process by consolidating a strong prior over common-sense knowledge, making it effective for transfer learning. By utilizing a multi-level reinforcement learning approach, SOLAR autonomously discovers adaptation strategies, enabling efficient test-time adaptation to unseen domains. Crucially, SOLAR maintains an evolving knowledge base of valid modification strategies, implicitly acting as an episodic memory buffer to balance plasticity (adaptation to new tasks) and stability (retention of meta-knowledge). Experiments demonstrate that SOLAR outperforms strong baselines on common-sense, mathematical, medical, coding, social and logical reasoning tasks, marking a significant step toward autonomous agents capable of lifelong adaptation in evolving environments.

Thediligencestack· Yesterday

The Agentic AI Storage Shock - by Ben Bajarin

How enterprise agents turn data lakes, workflow logs, and generated artifacts into the next infrastructure gating layer

Insignia· Yesterday

Why Agentic AI’s Next Breakthrough Depends on Search - Insignia Business Review

AI is entering a new phase. Attention is shifting toward inference: how to run those models reliably, cheaply, and at scale in real-world environments.

Small Wars Journal· Today

Rethinking Artificial Intelligence at the Strategic Frontier

AI in defense shifts from tools to human-AI teaming; interaction-centered design improves trust, decisions, and security outcomes in complex environments.

AI Energy · 2 articles

Editor's pick · Energy & Utilities

Tekcapital forms company to develop offgrid geothermal powered AI data centers

UK intellectual property investment firm Tekcapital has formed a portfolio company to acquire, develop, and commercialize geothermal-powered hyperscale data centers for the AI sector. […]

Washington Examiner· Today

Don't believe the hype: No need to panic over data center energy use

Shutting down data center expansion means losing the opportunity to build out a stronger, future-proofed energy infrastructure for the entire community.

AI Hardware · 1 article

Pravda USA· Yesterday

Nvidia's strategic positioning, the beginning is here - Pravda USA

This is Nvidia's main long-term competitive advantage, as no other company provides such optimized and integrated AI solutions (physical, hardware implementation + integrated software shell). Nvidia's position on China is severely limited by geopolitical factors and US export controls.

AI Infrastructure & Compute · 6 articles

The Next Web· Today

Lenovo Q4 revenue tops estimates on strong PC sales, shares jump 15%

The ISG segment has spent the last ... but AI servers carry thinner margins than PCs and depend heavily on whether Lenovo can secure GPU allocation at competitive prices. Bamboo Works’ analysis of the company has flagged ongoing geopolitical exposure, particularly around US export controls on advanced ...

U.S. to Award Quantum-Computing Firms $2 Billion and Take Equity Stakes

IBM, set to receive $1 billion of the package, saw large stock gains along with other companies involved.

CNBC· Yesterday

Anthropic, Microsoft in talks for AI chip deal after $5 billion investment

Microsoft has not made the Maia 200 chips available to customers, but they are used in the company's data centers, offering better efficiency than other silicon.

TechRadar· Yesterday

How AI demand is redefining enterprise infrastructure strategy | TechRadar

AI adoption is causing increased pressure on supply chains and component availability

Gaia AI supercomputer launched in Kraków, Poland

Poland has inaugurated its second AI factory in its southern city of Kraków. Known as the Gaia AI Factory, the 10 exaflop supercomputer will harness more than a thousand GPU accelerators to facilitate the training of advanced AI models and research into practical applications for the technology in education, healthcare, and public administration – Academic […]

Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX

arXiv:2605.20577v1 Announce Type: new Abstract: Riichi Mahjong is a multi-player, imperfect-information game characterized by stochasticity and high-dimensional state spaces. These attributes present a unique combination of challenges that mirror complex real-world decision-making problems in reinforcement learning. While prior research has heavily relied on supervised learning from human play logs to pre-train the policy, algorithms capable of learning tabula rasa (from scratch) offer greater potential for general applicability, as evidenced by the AlphaZero lineage. To facilitate such research, we introduce Mahjax, a fully vectorized Riichi Mahjong environment implemented in JAX to enable large-scale rollout parallelization on Graphics Processing Units (GPUs). We also provide a high-quality visualization tool to streamline debugging and interaction with trained agents. Experimental results demonstrate that Mahjax achieves throughputs of up to 2 million and 1 million steps per second on eight NVIDIA A100 GPUs under the no-red and red rules, respectively. Furthermore, we validate the environment's utility for reinforcement learning by showing that agents can be trained effectively to improve their rank against baseline policies.

AI Models & Capabilities · 7 articles

Not Yet: Humans Outperform LLMs in a Colonel Blotto Tournament

arXiv:2605.22095v1 Announce Type: new Abstract: The emergence of large language models (LLMs) has spurred economists to study how humans and LLMs behave in strategic settings. We organized a series of round-robin tournaments in the Colonel Blotto game. This game attracts game theorists' attention due to its high-dimensional action space and the absence of pure-strategy Nash equilibria. In the first tournament, more than 200 human participants competed against one another. In the second tournament, several popular LLMs were invited to submit strategies. In the third tournament, we matched the number of LLM strategies to the number submitted by humans. We find that humans more often employ better-calibrated intermediate-level allocation heuristics and outperform the simpler, more stereotyped strategies submitted by LLMs. Strategic sophistication is key to success if and only if the necessary level of reasoning depth is reached, while lower and higher levels of reasoning offer no clear advantage over the primitive strategies. Among humans, field of study weakly predicts success: participants with STEM backgrounds perform better in the first tournament. Surprisingly, humans barely adjust their strategies across tournaments with different sets of opponents. This result suggests that humans base their choices primarily on the game's rules rather than on the identity of their opponents, treating LLMs much like human competitors.

VentureBeat· Yesterday

A 0.12% parameter add-on gives AI agents the working memory RAG can't

AI agents forget. Every time a coding assistant loses track of a debugging thread, or a data analysis agent re-ingests the same context it already processed, the team pays in latency, token costs, and brittle workflows. The fix most teams reach for, expanding the context window or adding more RAG, is increasingly expensive and still doesn't reliably work.

To address this, researchers from Mind Lab and several universities proposed delta-mem, an efficient technique that compresses the model’s historical information into a dynamically updated matrix without changing the model itself. The resulting module adds just 0.12% of the backbone model's parameters (compared to 76.40% for one leading alternative) while outperforming it on memory-heavy benchmarks. Delta-mem allows models to continuously accumulate and reuse historical data, reducing the reliance on massive context windows or complex external retrieval modules for behavioral continuity.

The long memory challenge

The conventional solution is to simply dump all the information into the model’s context window. But as Jingdi Lei, co-author of the paper, told VentureBeat, current systems treat memory merely as a context-management problem. “Either we keep expanding the context window, or we retrieve more documents through RAG,” Lei explained. “These approaches are useful and will remain important, but they become increasingly expensive and brittle when agents need to operate over long-running, multi-step interactions, and they don't really [work] like human memory since they are more like looking up documents.”

In enterprise settings, the bottleneck is not just whether the model can access history, but whether it can reuse that history efficiently, continuously, and with low latency. Standard attention mechanisms incur a quadratic computational cost as the sequence length increases. Furthermore, expanding the context window does not guarantee the model will actually recall the information effectively. Models often suffer from context degradation or context rot as they become overwhelmed with more (and often conflicting) information, even if they support one million tokens in theory.

The researchers argue for advanced memory mechanisms that can represent historical information compactly and maintain it dynamically across interactions. Existing solutions come with heavy trade-offs and generally fall into three paradigms:

Textual memory: stores history as text injected into context; constrained by window limits and prone to information loss under compression.

Outside-channel (RAG): encodes and retrieves from external modules; adds latency, integration complexity, and potential misalignment with the backbone.

Parametric: encodes memory into model weights via adapters; static after training, so it can't adapt to new information during live interactions.

Inside delta-mem

To achieve a compact and dynamically updated memory, delta-mem compresses an agent’s past interactions into an “online state of associative memory” (OSAM). This state is maintained as a fixed-size matrix that preserves historical information while the underlying language model remains frozen.

For enterprise workflows, this translates directly to resolving operational bottlenecks. Lei noted that a persistent coding assistant, for example, “may need to remember project conventions, recent debugging steps, user preferences, or intermediate decisions across a workflow.” Similarly, a data analysis agent might “need to maintain task state, assumptions, and prior observations while iterating over multiple tool calls.” Rather than repeatedly retrieving and re-inserting all relevant history for these tasks, the delta-mem matrix provides a low-overhead way to carry forward useful interaction states inside the model’s forward computation.

During generation, the system does not retrieve raw text segments to add to the prompt. Instead, the backbone LLM’s current hidden state is projected into the matrix to retrieve old memory. This operation extracts context-relevant associative memory signals from delta-mem. These signals are then transformed into numerical corrections that are applied to the computations of the model. This steers the model's reasoning at inference time without altering its internal parameters.

Following each interaction, delta-mem updates the online state using “delta-rule learning.” When new information arrives, the previous state makes a prediction about the resulting attention values. It then compares this prediction to the actual value and corrects the memory matrix based on the discrepancy. This update mechanism relies on a “gated delta-rule.” Basically, the memory module has different knobs that control how much previous memory is kept and how much of the new memory is applied. This error correction with controlled forgetting allows the matrix to evolve over time, holding onto stable historical associations without being derailed by short-term noise.

The researchers explored three strategies for determining when and how the matrix updates:

Token-state write captures fine-grained changes but is vulnerable to short-term noise.

Sequence-state write averages tokens within a message segment, smoothing updates at the cost of some localized detail.

Multi-state write decomposes memory into sub-states for different information types like facts or task progress.

Delta-mem in action

The researchers evaluated delta-mem across three LLM backbones: Qwen3-8B, Qwen3-4B-Instruct, and SmolLM3-3B. They configured the framework with a compact 8x8 matrix. The system was tested on general capability benchmarks, including HotpotQA, GPQA-Diamond, and IFEval. It was also evaluated on memory-heavy tasks such as LoCoMo, which tests long-term conversational memory, and Memory Agent Bench, which assesses retention, retrieval, selective forgetting, and test-time learning over extended interactions.

The framework was compared against representative models from the three existing memory paradigms: textual memory baselines (e.g., BM25 RAG, LLMLingua-2, and MemoryBank), parametric systems (Context2LoRA and MemGen), and the outside-channel approach MLP Memory.

Across the board, delta-mem outperformed the baselines, according to the researchers. On the Qwen3-4B-Instruct backbone, the token-state write variant achieved an average score of 51.66%, easily surpassing the frozen vanilla backbone at 46.79% and the strongest baseline, Context2LoRA, at 44.90%. On the memory-heavy Memory Agent Bench, the average score jumped from 29.54% to 38.85%. Performance on the specific test-time learning subtask nearly doubled from 26.14 to 50.50.

However, the most compelling takeaway is the system's operational efficiency. The researchers tested the framework in a no-context setting where the historical text was entirely removed from the context. Even without explicit text replay, delta-mem successfully recovered context-relevant evidence in multi-hop tasks. The researchers argue that the model remembers past interactions without needing to ingest massive amounts of prompt tokens.

The framework also adds only 4.87 million trainable parameters, representing just 0.12% of the Qwen3-4B-Instruct backbone. By comparison, the MLP Memory baseline required 3 billion parameters, scaling up to 76.40% of the backbone's size while delivering inferior results. When prompt lengths scaled up to 32,000 tokens during inference tests, the framework maintained almost the exact same GPU memory footprint as a standard, unmodified model. It sidesteps the heavy memory bloat that affects other advanced memory systems like MemGen and MLP Memory.

Different update strategies proved beneficial depending on the underlying model capacity. The sequence-state write strategy was the most effective for stronger backbones like Qwen3-8B. These more capable models use the segment-level writing to smooth out updates and mitigate token-level noise. Conversely, the multi-state write strategy drove massive performance leaps for smaller backbones like SmolLM3-3B. For these lower-capacity models, separating memory into multiple states proved critical to minimizing information interference.

Implementing delta-mem in the enterprise stack

The researchers have released the code for delta-mem on GitHub and the weights for their trained adapters on Hugging Face. For AI engineering teams looking to integrate this framework into their existing inference stack, the process requires minimal computing resources.

“In practice, an engineering team would start from an existing instruction-tuned backbone, attach the Delta-Mem adapter modules to selected attention layers, train only the adapter parameters on domain-relevant multi-turn or long-context data... and then run inference with the memory state updated online during interaction,” Lei said.

Crucially, teams do not need a massive pretraining corpus. The training data only needs to reflect the target memory behavior, such as multi-turn dialogues, agent traces, or domain workflows where earlier information must influence later decisions.

While compressing interaction history into a fixed-size mathematical matrix creates immense efficiency, it does come with trade-offs. Delta-mem is not a lossless replacement for explicit text logs or document retrieval. Because different pieces of information compete inside the same limited state, there is a risk of memory blending.

“Delta-Mem is useful when the system needs fast, online, continuously updated behavioral state,” Lei said. “RAG is better when the system needs exact factual recall, citation, compliance, auditability, or access to a large external knowledge base.” Remembering a user’s working style or a multi-step reasoning trajectory is a perfect fit for delta-mem, while retrieving a legal contract or a medical guideline should remain in a vector database.

This means the most realistic enterprise architecture moving forward is a hybrid approach. Delta-mem acts as a lightweight internal working memory, reducing the need to retrieve or replay everything all the time, while RAG serves as the explicit, high-capacity memory layer.

“Looking ahead, I do not think vector databases will become obsolete,” Lei said. “Instead, I expect enterprise AI stacks to become more layered. We will likely see short-term working memory inside the model, longer-term explicit memory in retrieval systems, and policy or audit layers that decide what should be stored, retrieved, forgotten, or exposed to the user.”
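
The update loop described in the delta-mem article above (predict from the current state, compare with the actual value, correct by the discrepancy, with gates for how much old memory is kept and how much new information is written) can be illustrated with a small sketch. This is a generic gated delta-rule associative memory in plain Python, not the researchers' released code: the function names `read_memory` and `gated_delta_update`, the gate names `keep` and `write`, their values, and the use of raw vectors in place of projected hidden states are all assumptions made for illustration.

```python
# Illustrative sketch of a gated delta-rule associative memory.
# Assumption: every name, gate value, and vector here is hypothetical
# and is not taken from the delta-mem code base.

def read_memory(state, query):
    """Project a query vector through the memory matrix (state @ query)."""
    return [sum(s * q for s, q in zip(row, query)) for row in state]


def gated_delta_update(state, key, value, keep=0.9, write=0.5):
    """Return a new fixed-size state after one gated delta-rule write.

    keep  (assumed gate): fraction of the old memory that is retained.
    write (assumed gate): fraction of the prediction error that is applied.
    """
    decayed = [[keep * s for s in row] for row in state]   # controlled forgetting
    predicted = read_memory(decayed, key)                   # what memory expects for this key
    error = [v - p for v, p in zip(value, predicted)]       # discrepancy with the actual value
    # Rank-one correction: nudge the state so that `key` maps closer to `value`.
    return [[s + write * e * k for s, k in zip(row, key)]
            for row, e in zip(decayed, error)]


# Usage: an 8x8 state, the matrix size reported in the article.
size = 8
state = [[0.0] * size for _ in range(size)]
key = [1.0] + [0.0] * (size - 1)           # hypothetical unit-length key
value = [0.5 * i for i in range(size)]     # hypothetical target signal

for _ in range(20):
    state = gated_delta_update(state, key, value, keep=1.0, write=0.5)

recalled = read_memory(state, key)         # approaches `value` as the error shrinks
```

In this sketch the state stays 8x8 no matter how many writes occur, which is the property the article highlights. Setting `keep` below 1 makes older associations fade unless they are rewritten, a simple stand-in for the controlled forgetting described above.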

FT· Yesterday

Six takeaways from Musk’s 200,000-word planetary vision

Elon Musk’s rockets-to-AI conglomerate lays out its ambitions

OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind

arXiv:2605.20423v1 Announce Type: new Abstract: Large Language Models (LLMs) perform well on many language tasks, but their Theory of Mind (ToM) reasoning is still uneven in complex social settings. Existing benchmarks, including ExploreToM, do not always test the recursive beliefs and information asymmetries that make these settings difficult. This paper presents OSCToM (Observer-Self Conflict Theory of Mind), an approach for modeling nested belief conflicts in LLM-based ToM tasks. The key case is one in which an observer's view of another agent conflicts with the observer's own belief state. Such cases go beyond simple perspective-taking and require recursive, multi-layered reasoning. OSCToM combines reinforcement learning (RL), an extended domain-specific language, and compositional surrogate models to generate observer-self conflicts. In our experiments, OSCToM-8B gives the best overall result among the systems tested. It improves on the reported ExploreToM results on FANToM and remains competitive on Hi-ToM and BigToM. On the information-asymmetric FANToM benchmark, OSCToM reaches 76% accuracy, compared with the 0.2% reported by ExploreToM. The data-synthesis procedure is also 6x more efficient, indicating that targeted training data can help smaller models handle advanced cognitive reasoning. The project code is available at https://github.com/sharminsrishty/osct.

MIT Technology Review· Yesterday

Roundtables: Can AI Learn to Understand the World?

AI companies want to build systems that understand the external world and overcome the limitations of LLMs. Recent developments have brought world models to the forefront of the AI discussion. Watch a conversation with editor in chief Mat Honan, senior AI editor Will Douglas Heaven, and AI reporter…

NYT· Today

Our Field Trip to Google I/O + A Sit-Down With Sundar Pichai + System Update

“This is the only recent gathering of a large number of people where mentions of A.I. did not produce a large chorus of boos.”

Daily AI News May 22, 2026: AI May Have Changed Science Forever· Today

Meet Stable Audio 3.0, the Model Family Built for Artistic Experimentation with Open-Weight Models

Stable Audio 3.0 introduces open-weight generative audio models trained on licensed data, aimed at music and sound creation.

AI Research & Science · 3 articles

Isomorphic Dynamic Programs

arXiv:2605.22076v1 Announce Type: new Abstract: We study relationships between dynamic programs by applying conjugacy methods from dynamical systems theory. When two dynamic programs are connected by an order isomorphism, we show that optimality properties transmit from one formulation to the other. We apply these results to Epstein--Zin preferences with time preference shocks, obtaining a sharp characterization of when optimality holds. We also show that multiplicative Kreps--Porteus preferences and risk-sensitive preferences are isomorphic, so that well-known results for the latter carry over to the former. Finally, we demonstrate how isomorphic transformations can improve the numerical accuracy of value function approximations, with gains of two orders of magnitude in a multisector real business cycle model.

MIT Technology Review· Today

Google I/O showed how the path for AI-driven science is shifting

During Tuesday’s Google I/O keynote, Demis Hassabis, the CEO of Google DeepMind, proclaimed that we are currently “standing in the foothills of the singularity.” It was a striking statement—the singularity is the theoretical future moment when AI rapidly exceeds human intelligence and dramatically transforms the world. But what struck me as I listened in the…

High Quality Embeddings for Horn Logic Reasoning

arXiv:2605.20467v1 Announce Type: new Abstract: Neural networks can be trained to rank the choices made by logical reasoners, resulting in more efficient searches for answers. A key step in this process is creating useful embeddings, i.e., numeric representations of logical statements. This paper introduces and evaluates several approaches to creating embeddings that result in better downstream results. We train embeddings using triplet loss, which requires examples consisting of an anchor, a positive example, and a negative example. We introduce three ideas: generating anchors that are more likely to have repeated terms, generating positive and negative examples in a way that ensures a good balance between easy, medium, and hard examples, and periodically emphasizing the hardest examples during training. We conduct several experiments to evaluate this approach, including a comparison of different embeddings across different knowledge bases, in an attempt to identify what characteristics make an embedding well-suited to a particular reasoning task.
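
The triplet loss mentioned in the abstract above can be written out in a few lines. The sketch below uses the standard margin-based form, max(0, d(anchor, positive) - d(anchor, negative) + margin), with Euclidean distance. The margin value, the toy 2-D vectors, and the easy/medium/hard thresholds in `negative_difficulty` follow a common convention and are assumptions for illustration; the paper's own definitions may differ.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss with Euclidean distance (assumed metric)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def negative_difficulty(anchor, positive, negative, margin=1.0):
    """Label a negative example by how strongly it violates the margin.

    Assumed convention (hypothetical thresholds, not taken from the paper):
      hard   - the negative is closer to the anchor than the positive is
      medium - the negative is farther, but still inside the margin
      easy   - the negative is outside the margin, so the loss is zero
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    if d_neg < d_pos:
        return "hard"
    if d_neg < d_pos + margin:
        return "medium"
    return "easy"


# Usage with toy 2-D embeddings (hypothetical values).
anchor, positive = [0.0, 0.0], [0.0, 1.0]
for negative in ([3.0, 4.0], [1.5, 0.0], [0.5, 0.0]):
    label = negative_difficulty(anchor, positive, negative)
    loss = triplet_loss(anchor, positive, negative)
    print(label, loss)
```

Only medium and hard negatives produce a non-zero loss, so a training set dominated by easy triplets contributes little gradient, which is one reason to balance difficulty levels as the abstract describes.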

AI Security & Cybersecurity· 4 articles

Detecting Offensive Cyber Agents: A Detection-in-Depth Approach

arXiv:2605.21956v1 Announce Type: new Abstract: Artificial Intelligence (AI) agents can now orchestrate cyberattacks. This development is already increasing the speed and scale of cyberattacks, decreasing attack costs, and improving the operational autonomy of cyber capabilities. To defend against these emerging threats, actors must first develop the capability to detect them. This report frames the offensive cyber agent detection challenge by outlining the coming detection gap between offensive cyber agents and traditional cyber capabilities; introducing detection-in-depth, a strategic framework to guide policymakers and defenders responding to this detection gap; and presenting five actionable detection mechanisms to support policymakers, industry, and defenders when putting this strategic framework into practice. These include (1) Agent Identifiers for Critical Infrastructure; (2) Agent Honeypots; (3) AI-Automated Alert Analysis and Triage: systems that use AI to filter, prioritize, and interpret the growing volume of detection signals expected from autonomous cyber operations; (4) an Agentic Security Alert Standard: a reporting standard model that providers can use to communicate agentic threats, improving the speed, consistency, and actionability of reports; and (5) an Agentic Cybersecurity Exchange (ACE): an institution modeled on the Global Signal Exchange that brings together model and cloud providers to detect offensive cyber agent threats at their origin point and coordinate ecosystem-wide agentic threat disruption.

Daily Brew· Yesterday

GitHub confirms breach of 3,800 repos via malicious VSCode extension

GitHub has confirmed a security breach affecting 3,800 repositories, traced back to a malicious VSCode extension.

OpenPR· Yesterday

AI in Cybersecurity Market Growth Analysis, Trends, and Investment Outlook 2035

Future Outlook and Investment ... AI in cybersecurity market will be defined by autonomous defense ecosystems capable of self-learning, adaptive response, and predictive remediation. As cyber threats become increasingly sophisticated and machine-generated attacks proliferate, AI-driven security intelligence will become a foundational enterprise requirement ...

eSecurity Planet· Yesterday

AI, Cybersecurity Education, and the Defense of America’s Digital Border

Artificial intelligence (AI) is reshaping cybersecurity at a pace that is forcing educators, businesses, and governments to rethink workforce development and national defense strategies. During a recent discussion with cybersecurity entrepreneur and ConnectSecure Chairman, Arnie Bellini, key themes emerged around the evolution of cyber threats...

Adoption, Deployment & Impact

29 articles

AI Adoption Barriers & Enablers· 7 articles

Declarative Data Services: Structured Agentic Discovery for Composing Data Systems

arXiv:2605.20690v1 Announce Type: new Abstract: Agentic discovery has shown that LLM-driven search can find novel algorithms, designs, and code under benchmark conditions. Translating the paradigm to multi-system data backends surfaces a harder problem: the search space is heterogeneous, the verifier is whether a deployed stack actually runs, and composition knowledge is unevenly captured in pretraining. Unbounded agentic discovery, a coding agent iterating on failure-log feedback, fails to converge consistently on a working stack even when iteration and explicit composition knowledge are added. We propose Declarative Data Services (DDS), an architecture for structured agentic discovery of data-system compositions from declarative user intent. The framework owns four typed contracts at successive layers (intent, operator DAG, per-system skills, runtime attribution) that decompose the global search into bounded sub-searches; sub-agents search each typed space, while the framework provides the channels by which knowledge flows forward as inline skill citations and errors route backward as typed signals. As a proof of life on a trading-backend workload, DDS converges where unbounded discovery does not; runtime failures become skill patches that the next deployment cites inline. We position this as an early prototype reporting lessons from real-world data-system composition.

From Licensing to Open Access: Designing a Sustainable Transition in Operational Weather Data

arXiv:2605.21673v1 Announce Type: cross Abstract: This translational article documents the European Centre for Medium-Range Weather Forecasts (ECMWF) transition from a restricted data licensing model to open access under CC BY 4.0, completed in October 2025. The policy context included EU open data requirements and alignment with international data exchange frameworks. The transition was implemented through a tiered service model that kept core forecast data open while offering operationally supported delivery as a cost-recovered service. Between 2020 and 2025, ECMWF executed an iterative planning cycle: setting an annual target for revenue reduction, specifying additions to the open tier under that target, provisioning infrastructure, and assessing outcomes to update assumptions. Drawing on internal administrative records (2014 - 2025), we describe design choices, operational constraints, and early outcomes. In the six months following the end of the transition, more than 93% of previously paying organisations retained a Service Agreement, while open endpoint download volumes increased substantially. We discuss trade-offs in defining the open tier (resolution, parameters, schedule), the reduction of compliance overheads formerly associated with redistribution restrictions, and the scalability implications of global distribution. We note an emerging sustainability question as AI-based forecast products become freely available. The early evidence is consistent with the view that a tiered service model can be designed to reconcile open-access obligations with operational sustainability, subject to monitoring over longer contract renewal cycles (typically annual).

Supply Chain Dive· Yesterday

Procurement leaders urge incremental AI adoption to avoid heavy spend

Managing the cost of using AI rests on starting low-risk pilots and scaling the technology slowly, executives said at the Institute for Supply Management World 2026 conference.

Artificial Intelligence Newsletter | May 22, 2026· Today

Singapore launches AI playbook to steer enterprise transformation

Singapore has launched an AI for Enterprise Impact Playbook to assist companies with AI adoption, workforce upskilling, and business transformation.

Insurance Journal· Yesterday

Viewpoint: Insurers Cautiously Navigate the Next Steps in AI Adoption

As more and more companies embed AI into select functions, only a portion indicate that they have used AI to change how an overall enterprise runs. It

InsuranceNewsNet· Today

The hidden flaw in insurance AI adoption for advisors and carriers

Many insurers are still using AI for existing underwriting and claims rather than redesigning how those workflows operate.

Global Relay· Yesterday

AI, data, and regulatory risk: What's shaping surveillance in 2026

Discover findings from the 2026 Surveillance Benchmarking Survey: AI adoption, regulatory expectations, and data shaping compliance today.

AI Applications· 9 articles

Editor's pick· PAYWALL· Media & Entertainment

Bloomberg· Yesterday

Zoom Soars After Expansion Into New Products Begins to Pay Off

Zoom Communications Inc. shares surged as much as 18% after the company projected stronger-than-anticipated sales growth and said that customers are paying for its expanded suite of office products.

Editor's pick· PAYWALL· Media & Entertainment

AI Cartoon ‘Critterz’ Looks for Tech Partner Beyond OpenAI

Critterz, a feature-length cartoon intended to showcase how OpenAI’s video-generation capabilities could revolutionize filmmaking, has missed a planned Cannes Film Festival debut after the artificial intelligence company shut down its Sora tool, forcing its creators to look for a new AI partner.

FT· Yesterday

Spotify targets high-spending superfans with AI-generated music

Streamer and Universal Music Group strike licensing deal for a paid add-on tool within Spotify’s app

UK AI startup Scope raises €17.3 million funding led by Index Ventures to speed up industrial inspection workflows

Scope, a London-based AI workflow platform transforming inspections for the TIC (testing, inspection, certification) industry, has raised €17.2 million ($20 million) in funding to grow its London-based team and accelerate adoption among leading inspection companies globally. The round was led by Index Ventures with participation from Susa Ventures, Entrepreneurs First and Syndicate 1. Notable angels […]

Artificial Intelligence Newsletter | May 21, 2026· 2 days ago

Indonesia targets corruption, efficiency with AI push across government

Indonesia plans to expand the use of AI across government administration, welfare distribution, and procurement to improve efficiency and reduce corruption, according to a senior official.

VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

arXiv:2605.20742v1 Announce Type: new Abstract: With the rapid proliferation of electric vehicles, the safety and reliability of lithium-ion batteries have become critical concerns. Effective anomaly detection is essential for ensuring safe battery operation. However, as battery systems and operating scenarios become increasingly complex, battery fault diagnosis and maintenance require stronger cross-domain adaptability and human-AI collaboration. Traditional fault detection and diagnosis methods are usually designed for specific scenarios and predefined workflows, making them less effective in complex real-world applications. To address the scarcity of open-source battery fault report corpora and the lack of unified maintenance knowledge representation, this study proposes a descriptive text modeling approach for battery signal reports. Monitoring signals, statistical features, anomaly records, and state assessment results are transformed into structured and readable natural language descriptions, forming a language corpus for battery health diagnosis and maintenance. Based on this corpus, we propose VBFDD-Agent, a vehicle battery fault detection and diagnosis agent for automotive-grade battery systems. VBFDD-Agent integrates descriptive battery-state texts, historical case retrieval, local maintenance manuals, and large language model reasoning to generate structured diagnostic results and maintenance recommendations. Experiments show that the proposed framework can accurately perform anomaly monitoring based on descriptive textual representations and provide flexible, efficient, and actionable maintenance suggestions. Expert evaluation further confirms the practical value of the generated recommendations. Overall, VBFDD-Agent extends traditional battery diagnosis from label prediction to interpretable and maintenance-oriented decision support.

Editor's pick· Healthcare

PYMNTS· Yesterday

60% of Healthcare Firms Use AI for Chatbots

Healthcare’s AI adoption is narrower than other sectors, but the industry is using it where operational strain is most immediate.

AI-Enabled Serious Games: Integrating Intelligence and Adaptivity in Training Systems

arXiv:2605.21962v1 Announce Type: cross Abstract: Serious games are widely used for learning and training across domains such as healthcare, defense, and education. Persistent challenges remain, however, including static scenario design, authoring bottlenecks, limited learner modeling, and difficulty implementing meaningful real-time instructional adaptation. Recent advances in artificial intelligence (AI) introduce novel capabilities such as dynamic scenario variation, contextual feedback, adaptive pacing, and learner-state modeling that may help address some of these limitations. At the same time, integrating AI into serious games raises important questions related to validity, transparency, system control, and learner trust. This chapter examines how contemporary AI approaches may support real-time instructional adaptation in serious games. It distinguishes between instructional intelligence, defined as a system's capacity to infer learner knowledge and reason about pedagogically appropriate responses, and adaptivity, defined as the ability to modify instructional actions during interaction. A historical synthesis of adaptive learning systems is presented, tracing developments from early computer-assisted instruction through intelligent tutoring systems (ITS), dynamic difficulty adjustment (DDA), authoring platforms, learning analytics, and recent AI-enabled architectures. Building on this perspective, the chapter discusses how large language models (LLMs), reinforcement learning (RL), and agent-based architectures may contribute to more integrated forms of intelligence and adaptivity in serious games. It also highlights practical and research challenges associated with AI-enabled systems, including explainability, validation, computational cost, and the limited empirical evidence regarding long-term learning outcomes in AI-enabled serious games.

Artificial Intelligence Newsletter | May 21, 2026· Yesterday

India eyes AI shield against manipulation in government tenders

India must deploy AI and advanced data analytics to detect bid rigging in government procurement while strengthening coordination between auditors and the competition watchdog.

AI Measurement & Evaluation· 3 articles

Open-World Evaluations for Measuring Frontier AI Capabilities

arXiv:2605.20520v1 Announce Type: new Abstract: Benchmark-based evaluation remains important for tracking frontier AI progress. But it can both overstate and understate deployed capability because it privileges tasks that can be precisely specified, automatically graded, easy to optimize for, and run with low budgets and short time horizons. We advocate for a complementary class of evaluations, which we term open-world evaluations: long-horizon, messy, real-world tasks assessed through small-sample qualitative analysis rather than benchmark-scale automation. In this paper we survey recent open-world evaluations, identify their strengths and limitations, and introduce CRUX (Collaborative Research for Updating AI eXpectations), a project for conducting such evaluations regularly. As a first instance, we task an AI agent with developing and publishing a simple iOS application to the Apple App Store. The agent completed the task with only a single avoidable manual intervention, suggesting that open-world evaluations can provide early warning of capabilities that may soon become widespread. We conclude with recommendations for designing and reporting open-world evals.

AgentAtlas: Beyond Outcome Leaderboards for LLM Agents

arXiv:2605.20530v1 Announce Type: new Abstract: Large language model agents now act on codebases, browsers, operating systems, calendars, files, and tool ecosystems, but the benchmarks used to evaluate them are fragmented: each emphasizes a different unit of measurement (final task success, tool-call validity, repeated-pass consistency, trajectory safety, or attack robustness). A line of 2024-2025 work has converged on the diagnosis that a single accuracy column is no longer the right unit of comparison for deployable agents. AgentAtlas extends this line of work with four components: (i) a six-state control-decision taxonomy (Act / Ask / Refuse / Stop / Confirm / Recover); (ii) a nine-category trajectory-failure taxonomy with two orthogonal hierarchical labels (primary_error_source, impact); (iii) a taxonomy-aware vs. taxonomy-blind methodology that measures how much of a model's apparent capability comes from the supervision in the prompt; and (iv) a benchmark-coverage audit mapping fifteen agent benchmarks against six behavioral axes. To demonstrate the methodology we run a small fixed eight-model set (1,342 generated items, four frontier closed and four open-weight) under both prompt modes. Removing the explicit label menu drops every model's trajectory accuracy by 14-40 pp to a tight 0.54-0.62 floor regardless of family, and no single model wins on all three of control accuracy, trajectory diagnosis, and tool-context utility retention. We treat the synthetic run as a measurement-protocol demonstration, not a benchmark release.

$ECUAS_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

arXiv:2605.20490v2 Announce Type: new Abstract: In high-stakes automated decision-making, access to predictive uncertainty is essential for enabling users -- human or downstream systems -- to accept or reject predictions based on application-specific cost trade-offs. Such uncertainty-augmented (UA) systems -- i.e., systems that output both predictions and uncertainty scores -- are currently being assessed in the literature in a variety of ways, using separate metrics to evaluate the predictions and the uncertainty scores, setting a cost function with a fixed rejection cost or integrating over a coverage-risk curve. We argue that these evaluation approaches are inadequate for assessing overall performance of the UA system for decision making under uncertainty and propose a novel family of metrics, $ECUAS_n$, formulated as proper scoring rules for the task of interest. The parameter $n$ controls the trade-off between the cost of incorrect predictions and imperfect uncertainties depending on the needs of the use-case. We demonstrate the advantages of the $ECUAS_n$ metrics both theoretically and empirically, through experiments on diverse classification and generation datasets, including a manually annotated subset of TriviaQA.

AI Organisational Change· 1 article

Council Post: Beyond Automation: How AI Is Redefining Service, Strategy And Experience In Three Booming Industries

Industries that spent years trying to modernize are moving at a pace that feels fundamentally different from prior waves of digital transformation.

AI Productivity Evidence· 5 articles

The efficiency-gain illusion: People underestimate the rate of AI use and overestimate its benefits on simple tasks

arXiv:2605.22687v1 Announce Type: new Abstract: People are increasingly turning to AI assistance for simple tasks, e.g., arithmetic, spell-check, and answering simple questions. But does AI assistance actually save users time and effort? We investigate people's propensity to use AI for cognitively simple tasks and assess whether their reliance is well-calibrated. Across three pre-registered user studies (N = 2691), we find that people frequently choose to use AI even when doing so is inefficient (i.e. provides no meaningful time or effort savings). We identify systematic miscalibration at two levels: (1) a self-estimate miscalibration where people on average believe that they are using AI less than they actually are, and (2) efficiency-gain illusions where people overestimate how much time and effort savings AI use affords. We also identify a session-level carryover effect where a participant's prior AI use leads to further AI adoption and entrenches their miscalibration about time savings. Our results shed light on the mechanisms and biases underlying people's choice of whether to use AI as well as the risk of an overreliance feedback loop.

Substack· Today

THE DAILY SCRAPE - by Brent Orrell - Help Desk

Matthew Prince writes in the WSJ that Cloudflare laid off over 20% of its workforce while growing revenue more than 30%, targeting what Peter Drucker called "measurers" — middle managers, operations, internal audit, finance, compliance, marketing — rather than builders or sellers. Prince argues AI now measures organizations more continuously and precisely than humans can, and predicts the growth-with-layoffs pattern will become standard across the next year.

The Register· Today

Cisco used AI to write security incident reports, with mixed results

You’ll need a lot of detailed prompts to get solid output - and even then it may have errors and typos

Daily AI News May 22, 2026: AI May Have Changed Science Forever· Today

Presien Reduces Critical Safety Events on Construction Sites by 70%+ with Claude

Presien utilizes Claude for continuous, AI-driven risk detection on construction sites, marking a shift from manual review to automated safety monitoring.

Federal News Network· Yesterday

Agencies look to AI, automation amid growth in digital records

Federal leaders see automation and AI as crucial to wrangling an ever increasing tide of digital records that's leading to backlogs in areas like FOIA.

AI ROI & Business Case· 4 articles

Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

arXiv:2605.20630v1 Announce Type: new Abstract: Industrial asset operations workflows are latency-sensitive because a single user query may require coordination over sensor data, work orders, failure modes, forecasting tools, and domain-specific agents. We evaluate this problem on AssetOpsBench (AOB), an industrial agent benchmark whose plan-execute pipeline exposes repeated overhead from tool discovery, LLM planning, MCP tool execution, and final summarization. Existing LLM caching techniques such as KV-cache reuse and embedding-based semantic caching were designed for chatbot serving and break down when output validity depends on time, asset, or sensor parameters. We propose two complementary optimization layers for AOB plan-execute pipelines: a temporal semantic cache and a set of MCP workflow optimizations combining disk-backed tool-discovery caching and dependency-aware parallel step execution. MCP workflow optimizations corresponded to a 1.67x speedup and reduced median end-to-end latency by about 40.0% while the temporal-cache benchmark achieved a median of 30.6x speedup on cache hits. Beyond the speedup, our results expose a concrete failure mode of pure semantic caching for parameter-rich industrial queries, providing a critical analysis of how caching choices interact with evaluation correctness in MCP-backed agent benchmarks.
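The failure mode named in the abstract can be illustrated with a small sketch. This is not the paper's cache design, which the abstract does not specify; it is a toy under stated assumptions: token-overlap (Jaccard) similarity stands in for embedding similarity, a cache hit additionally requires an exact match on structured parameters such as asset and sensor, and entries expire after a fixed time-to-live. The class name, threshold, and TTL value are all hypothetical.

```python
import time


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


class TemporalSemanticCache:
    """Toy cache: a hit needs similar text, identical parameters, and a fresh entry."""

    def __init__(self, threshold=0.6, ttl_seconds=300.0, clock=time.monotonic):
        self.threshold = threshold  # minimum text similarity for a hit (assumed value)
        self.ttl = ttl_seconds      # entry lifetime in seconds (assumed value)
        self.clock = clock          # injectable clock, so expiry can be tested
        self.entries = []           # list of (query, params, stored_at, answer)

    def put(self, query, params, answer):
        self.entries.append((query, dict(params), self.clock(), answer))

    def get(self, query, params):
        now = self.clock()
        for cached_query, cached_params, stored_at, answer in self.entries:
            if now - stored_at > self.ttl:
                continue  # stale: the answer depended on time
            if cached_params != params:
                continue  # different asset or sensor, so similar wording is not enough
            if jaccard(cached_query, query) >= self.threshold:
                return answer
        return None
```

A cache that checked only text similarity would return the stored answer for a near-identical question about a different asset; the parameter and freshness checks are what prevent that false hit in this sketch.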

Editor's pick· PAYWALL· Consumer & Retail

Addressing the Synergy Gap: The Six Elements of the Design Space

arXiv:2605.21635v1 Announce Type: cross Abstract: AI is now embedded in healthcare, finance, policy, and many other domains, yet genuine human-AI synergy - combined performance that exceeds what either party achieves alone - is uncommon. Meta-analyses show that AI assistance tends to improve human performance compared to working alone, but studies finding true synergy are scarce. We call this persistent shortfall the synergy gap. Most current work treats human-AI combination as an engineering problem and concentrates on interpretability, trust calibration, or interface design. These matter, but they cover only part of what determines whether combination works. Closing the synergy gap, we argue, requires explicit engagement with a wider design space. We map that space through six interconnected elements: sociotechnical context, decision-making frameworks, human decision participants, AI capabilities, interaction, and holistic evaluation. For each element, we describe what it covers, how it shapes the others in practice, and what it implies for design. The result is a shared vocabulary for practitioners building hybrid systems, an analytical lens for researchers studying combination patterns, and a starting point for evaluators interested in the full quality of human-AI decision-making rather than accuracy alone.

Zara Owner Inditex’s CEO Bets on Diversification, AI for Growth

Zara owner Inditex SA is banking on diversification across brands and countries as well as artificial intelligence to spur growth at the world’s largest listed retailer, Chief Executive Officer Óscar García Maceiras said.

Council Post: How AI Is Changing The Economics Of Integration

Integration is no longer just a painful phase to complete and move past.

Geopolitics, Policy & Governance

13 articles

AI Geopolitics· 4 articles

Guru3D· Today

Taiwan Launches Major Crackdown on NVIDIA AI Chip Smuggling Network

The raids also underline the broader geopolitical importance of AI accelerators. High-end GPUs are increasingly viewed as strategically important technologies tied to national security, industrial competitiveness, and advanced research capabilities. Export controls surrounding AI infrastructure ...

Artificial Intelligence Newsletter | May 21, 2026· 2 days ago

Xi-Trump summit shows a rivalry being managed, not resolved

Chinese President Xi Jinping's question to Donald Trump about whether the US and China can escape the “Thucydides Trap” framed a summit that showed the two powers are learning to manage competition rather than end it.

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

Nvidia excludes China data center revenue from outlook amid H200 delay

Nvidia is not assuming data center compute revenue from China for its Q2 fiscal 2027 outlook, as Beijing has yet to approve imports of the H200 chip.

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

China, Russia pledge closer AI, cybersecurity ties during Putin visit

Following a visit by President Vladimir Putin, China and Russia have pledged to deepen cooperation in artificial intelligence technologies and strengthen efforts to combat cybercrime.

AI Policy & Regulation· 9 articles

Washington Post· Yesterday

AI & Tech Brief: White House AI order now postponed

President Donald Trump cites overregulation concerns

Position: The Pre/Post-Training Boundary Should Govern IP in Industry-Academia ML Collaborations

arXiv:2605.22632v1 Announce Type: new Abstract: Industry-academia ML collaborations routinely fail to launch -- not for scientific reasons, but because academics must publish while companies must protect models trained on proprietary data, and no standard contract framework resolves this tension. Because contracts are negotiated by legal departments alone, many apparent legal disputes are incentive misalignment problems that only scientists at the table can correctly diagnose. We propose PBOS (Protect-the-Business / Open-Source-the-Science), a community-adoptable contract template anchored to a single technically-grounded boundary: pre-training artifacts (architectures, training code, benchmarks, untrained weights) are open science; post-training artifacts (weights trained on proprietary data) are business IP. This boundary is technically meaningful, legally clean, and auditable -- and could not have been drawn correctly without scientists at the negotiating table. We argue the ML community should adopt PBOS as its default contract for such collaborations.

Barriers to Evidence in AI-Related Cases and the Privatization of Proof

arXiv:2605.21816v1 Announce Type: new Abstract: Evidence lies at the core of litigation, but it is increasingly difficult to obtain in AI-related disputes. Even when a claimant's position has merit, cases are often settled or dismissed because decisive facts are hidden inside proprietary models, platform logs, and protected databases. Grounding our discussion in past and ongoing cases, we investigate how asymmetries in access, resources, and expertise can create significant barriers to evidence in AI-related cases. We show how developers and deployers resist disclosure through various strategies challenging the value of the evidence to the requesting party and the cost of evidence production. From these patterns we identify seven recurring sources of asymmetry -- access to models, data, documentation, logs, expertise, compute, and infrastructure -- that reflect a broader pattern that we call the privatization of proof: when control over proof falls in the hands of private actors that can demand justification for access while ensuring that justification remains out of reach. We further argue that different types of access can be fungible: in the absence of a certain type of access (e.g., to model internals), one may be able to use alternative forms of access (e.g., sufficient compute, query access, and access to user logs) and to obtain a functionally equivalent amount of information. We propose a three-part test that can help resolve AI access disputes in litigation, drawing on concepts such as proportionality and reasonable alternatives. Our test relies on a few observations, including that the cause of action can provide a baseline for access.

Washington Post· Today

Last-minute lobbying by tech industry officials led Trump to cancel AI order

Eleventh-hour phone calls with industry leaders and former AI and crypto czar David Sacks helped persuade President Donald Trump not to sign a highly anticipated executive order on artificial intelligence on Thursday.

Guardian· Yesterday

Sadiq Khan sparks row with Met after blocking £50m AI deal with Palantir

Exclusive: Scotland Yard criticises London mayor’s decision as disappointing and warns it could hit policing. Sadiq Khan has blocked a £50m Metropolitan police deal with the controversial US tech company Palantir, sparking a bitter row between the London mayor and Scotland Yard. After the UK’s largest police force had agreed to use Palantir’s AI technology to automate intelligence analysis in criminal investigations, Khan intervened, citing “serious concerns” about how the deal had been struck.

Washington Post· Today

Opinion | Illinois is less suited to regulate AI than Congress

The Illinois state Senate has fast tracked eight bills to regulate AI and aims to pass them before its session wraps on May 31.

Artificial Intelligence Newsletter | May 21, 2026· 2 days ago

EU digital sovereignty rules may raise costs, worsen services, tech lobby warns

The EU's push for digital sovereignty could lead to higher costs and inferior cloud services, according to a leading tech association representing companies like Amazon, IBM, and Microsoft.

Artificial Intelligence Newsletter | May 22, 2026· Yesterday

US deepfake legislation would expand safe harbor, takedown system

A revised version of the bipartisan NO FAKES Act aims to establish property rights for digital likenesses while expanding safe harbor protections and notice-and-takedown systems for internet platforms.