AI Intelligence Brief

Thu 30 April 2026

Daily Brief — Curated and contextualised by Best Practice AI

163 articles
Editor's pick · Editor's Highlights

US Economy Rebounds, Alphabet Profits, and India's Tech Firms Suffer

TL;DR The US economy grew by 2% in Q1, driven by AI investments, recovering from a prior government shutdown. Alphabet's shares surged due to strong AI and cloud sales, indicating successful AI infrastructure investments. Meanwhile, India's tech giants face revenue pressure from AI deflation, despite stable employment. Enterprises are overspending on GPUs due to FOMO, exacerbating the shortage and driving up prices.

Editor's highlights

The stories that matter most

Selected and contextualised by the Best Practice AI team

10 of 163 articles
Lead story
Editor's pick · Technology
Arxiv · Yesterday

The Buy-or-Build Decision, Revisited: How Agentic AI Changes the Economics of Enterprise Software

arXiv:2604.26482v1 Abstract: Advances in generative artificial intelligence, particularly agentic coding systems capable of autonomous software development, are disrupting the economics of the make-or-buy decision for enterprise applications. The "SaaSocalypse" narrative predicts that AI will render large segments of the Software-as-a-Service market obsolete by enabling firms to build software in-house at a fraction of historical cost. This paper adopts a conceptual research approach, combining transaction cost economics and the resource-based view with an assessment of current AI capabilities, to systematically re-evaluate the factors underlying the make-or-buy decision. It makes three contributions. First, it provides a factor-level analysis of how AI reshapes seven canonical decision determinants: cost, strategic differentiation, asset specificity, vendor lock-in, time-to-market, quality and compliance, and organizational capability. Second, it develops a typology of enterprise applications by their sensitivity to AI-induced shifts in make-or-buy economics. Third, it demonstrates that AI fundamentally transforms the governance properties of the Make option, shifting it from Williamson's pure hierarchy to a hybrid governance form that combines code ownership with external AI infrastructure dependency, with qualitatively different economics, capability requirements, and governance structures than pre-AI in-house development. The analysis finds that the SaaSocalypse thesis is overstated for most enterprise application categories; Make is most compelling for commodity utilities and differentiating custom applications in the AI era, while regulated and mission-critical systems remain predominantly in the buy domain.

Editor's pick · Financial Services
Arxiv · Yesterday

Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital

arXiv:2604.26091v1 Abstract: We study reliability in autonomous language-model agents that translate user mandates into validated tool actions under real capital. The setting is DX Terminal Pro, a 21-day deployment in which 3,505 user-funded agents traded real ETH in a bounded onchain market. Users configured vaults through structured controls and natural-language strategies, but only agents could choose normal buy/sell trades. The system produced 7.5M agent invocations, roughly 300K onchain actions, about $20M in volume, more than 5,000 ETH deployed, roughly 70B inference tokens, and 99.9% settlement success for policy-valid submitted transactions. Long-running agents accumulated thousands of sequential decisions, including 6,000+ prompt-state-action cycles for continuously active agents, yielding a large-scale trace from user mandate to rendered prompt, reasoning, validation, portfolio state, and settlement. Reliability did not come from the base model alone; it emerged from the operating layer around the model: prompt compilation, typed controls, policy validation, execution guards, memory design, and trace-level observability. Pre-launch testing exposed failures that text-only benchmarks rarely measure, including fabricated trading rules, fee paralysis, numeric anchoring, cadence trading, and misread tokenomics. Targeted harness changes reduced fabricated sell rules from 57% to 3%, reduced fee-led observations from 32.5% to below 10%, and increased capital deployment from 42.9% to 78.0% in an affected test population. We show that capital-managing agents should be evaluated across the full path from user mandate to prompt, validated action, and settlement.
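The "operating layer" the abstract describes interposes typed controls and policy validation between the model's proposed action and execution. A minimal sketch of that idea, in Python, with entirely hypothetical control names and limits (the paper does not publish its schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TradeAction:
    side: str          # "buy" or "sell"
    amount_eth: float

@dataclass(frozen=True)
class VaultPolicy:
    max_trade_eth: float      # hypothetical per-trade cap
    allowed_sides: tuple      # e.g. ("buy", "sell")

def validate(action: TradeAction, policy: VaultPolicy) -> bool:
    """Operating-layer guard: reject any agent-proposed action that
    violates the vault's typed controls before it can reach the chain."""
    if action.side not in policy.allowed_sides:
        return False
    if not (0 < action.amount_eth <= policy.max_trade_eth):
        return False
    return True
```

The point of such a guard is that reliability comes from rejecting policy-invalid actions deterministically, rather than trusting the model's reasoning alone.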

Editor's pick · Technology
Guardian · 2 days ago

Claude AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’

PocketOS was left scrambling after a rogue AI agent deleted swaths of code underpinning its business. It only took nine seconds for an AI coding agent gone rogue to delete a company’s entire production database and its backups, according to its founder. PocketOS, which sells software that car rental businesses rely on, descended into chaos after its databases were wiped, the company’s founder Jeremy Crane said. The culprit was Cursor, an AI agent powered by Anthropic’s Claude Opus 4.6 model, which is one of the AI industry’s flagship models. As more industries embrace AI in an attempt to automate tasks and even replace workers, the chaos at PocketOS is a reminder of what could go wrong.

Editor's pick · Financial Services
Prism News · 2 days ago

Goldman Sachs says AI adoption depends on redesigning workflows for verification | Prism News

Goldman says AI will spread fastest where teams can make work machine-verifiable. That puts approvals, review chains, and documentation at the center of adoption.

Editor's pick · Manufacturing & Industrials
Reuters · 2 days ago

AI-related spending propels US core capital goods orders in March | Reuters

New orders for key U.S.-manufactured capital goods increased by the most in nearly six years in March while their shipments rose solidly, suggesting that business spending on equipment helped drive economic growth in the first quarter.

Editor's pick · PAYWALL · Technology
Bloomberg · Yesterday

Meta Looks to Raise as Much as $25 Billion With Jumbo Bond Sale

Meta Platforms Inc. is looking to sell between $20 billion and $25 billion of investment-grade bonds, according to people with knowledge of the transaction, as the Facebook parent boosts spending on infrastructure for the artificial intelligence boom.

Editor's pick
Fortune · Yesterday

Global investors are shrugging off Iran worries and returning to markets in Asia, the 'backbone of the whole AI value chain' | Fortune

The AI boom is lifting markets across East Asia, yet energy concerns are causing Southeast Asia to lag behind.

Editor's pick · Government & Public Sector
NatLawReview · 2 days ago

New DOL Guidance Encourages Employer ‘AI Literacy’ Training

In response to concerns about the rapidly changing economy and the impact of artificial intelligence (AI) on the labor market, the White House is encouraging employers to adopt AI tools and train workers to effectively leverage them, as evidenced by the U.S. Department of Labor’s new guidance ...

Editor's pick
Washington Examiner · 2 days ago

AI is eliminating the bottom rung. National service can replace it

We are sleepwalking into the most significant economic transformation of our lifetimes, and the answer isn’t government handouts or universal basic income.

Editor's pick · Professional Services
PR Newswire · 2 days ago

New Survey from Harvard Business Review Analytic Services Finds AI Adoption Remains High, Yet Value May Lag Without Modernization and Workflow Integration

Most organizations have moved beyond experimenting with artificial intelligence, but few are realizing its full value. New research from...

Economics & Markets

37 articles
AI Business Models · 6 articles
Editor's pick · Technology
Arxiv · Yesterday

The Buy-or-Build Decision, Revisited: How Agentic AI Changes the Economics of Enterprise Software

arXiv:2604.26482v1 Abstract: Advances in generative artificial intelligence, particularly agentic coding systems capable of autonomous software development, are disrupting the economics of the make-or-buy decision for enterprise applications. The "SaaSocalypse" narrative predicts that AI will render large segments of the Software-as-a-Service market obsolete by enabling firms to build software in-house at a fraction of historical cost. This paper adopts a conceptual research approach, combining transaction cost economics and the resource-based view with an assessment of current AI capabilities, to systematically re-evaluate the factors underlying the make-or-buy decision. It makes three contributions. First, it provides a factor-level analysis of how AI reshapes seven canonical decision determinants: cost, strategic differentiation, asset specificity, vendor lock-in, time-to-market, quality and compliance, and organizational capability. Second, it develops a typology of enterprise applications by their sensitivity to AI-induced shifts in make-or-buy economics. Third, it demonstrates that AI fundamentally transforms the governance properties of the Make option, shifting it from Williamson's pure hierarchy to a hybrid governance form that combines code ownership with external AI infrastructure dependency, with qualitatively different economics, capability requirements, and governance structures than pre-AI in-house development. The analysis finds that the SaaSocalypse thesis is overstated for most enterprise application categories; Make is most compelling for commodity utilities and differentiating custom applications in the AI era, while regulated and mission-critical systems remain predominantly in the buy domain.

Editor's pick · PAYWALL
FT · Yesterday

AI companies are just companies

As we leap into a new technological age, the old rules of capitalism still apply

Editor's pick · Technology
Substack · Yesterday

The State of AI After Google, Meta, Amazon, Microsoft Earnings

All four technology companies showed robust growth, demonstrating that the agentic AI megatrend is real.

AI Investment & Valuations · 8 articles
AI Macroeconomics · 3 articles
Editor's pick · Manufacturing & Industrials
Reuters · 2 days ago

AI-related spending propels US core capital goods orders in March | Reuters

New orders for key U.S.-manufactured capital goods increased by the most in nearly six years in March while their shipments rose solidly, suggesting that business spending on equipment helped drive economic growth in the first quarter.

Editor's pick · Energy & Utilities
Arxiv · Yesterday

Counting own goals: High-level assessment of the economic relationship between the ICT and the Oil and Gas sectors and its environmental implications

arXiv:2604.26539v1 Abstract: The ICT sector has been one of the most successful and fastest-growing industries in history. While the environmental issue in this sector has mainly been addressed by assessing its footprint and, to a lesser extent, its avoided emissions or net impacts, the additional emissions from the digitalization of carbon-intensive activities, such as the Oil and Gas (O&G) sector, have rarely been discussed. By doing so, we have forgotten to count the own goals conceded over more than 20 years in the troubled relationship between the ICT and the O&G sector. Using input-output analysis and economic data ranging from 2000 to 2022, we observe that on average 2% of the annual financial flows from the ICT sector are directed towards the Oil and Gas sector. Considering the significant growth of the ICT sector during this time, O&G companies now spend a massive amount on ICT products in absolute terms. It also appears that in 2022, for each dollar going from the ICT sector to the renewable and nuclear energy industry, more than $4 go to the O&G industry. In addition, we also provide a classification of digital activities in the O&G sector to facilitate environmental assessments and present two case studies estimating potential added emissions from the digitalization of oil activities. Finally, looking at the immense growth in generative AI, we provide an exploration of causal links between the current success of GPU technology and its intricate early relationship with the O&G sector. This article lays the groundwork for defining the nature of the relationship between ICT and O&G, which predates the current hype surrounding generative AI. We provide the analytical elements needed to begin estimating the added emissions from the digitalization of O&G.
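The flow shares the abstract cites (2% of ICT sales to O&G; more than $4 to O&G per $1 to renewables/nuclear) come from reading rows of an input-output flow table. A minimal sketch of that calculation, with entirely illustrative numbers rather than the paper's data:

```python
# Hypothetical inter-industry flow table (rows = selling sector,
# columns = buying sector), in $bn. All figures are made up for
# illustration; the paper uses real input-output data for 2000-2022.
flows = {
    "ICT":                {"ICT": 120.0, "O&G": 40.0, "Renewables+Nuclear": 9.0},
    "O&G":                {"ICT": 30.0,  "O&G": 80.0, "Renewables+Nuclear": 5.0},
    "Renewables+Nuclear": {"ICT": 8.0,   "O&G": 2.0,  "Renewables+Nuclear": 20.0},
}

ict_sales = flows["ICT"]
total_ict_sales = sum(ict_sales.values())

# Share of ICT-sector sales bought by the O&G sector
share_to_og = ict_sales["O&G"] / total_ict_sales

# Dollars flowing to O&G per dollar flowing to renewables/nuclear
# (the paper reports this ratio exceeding 4:1 in 2022)
ratio = ict_sales["O&G"] / ict_sales["Renewables+Nuclear"]
```

With these toy figures the ratio is about 4.4:1, matching the shape (not the substance) of the paper's finding.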

AI Market Competition · 5 articles
Editor's pick · Consumer & Retail
Arxiv · Yesterday

When Agents Shop for You: Role Coherence in AI-Mediated Markets

arXiv:2604.26220v1 Abstract: Consumers are increasingly delegating purchase decisions to AI agents, providing natural-language descriptions of their preferences and identity. We argue that these representations constitute an information channel, role coherence, through which sellers can infer willingness to pay without explicit disclosure by the buyer agent, leading to preference leakage. In an experiment where a language-model buyer agent shops on behalf of a verbal consumer profile, we show that seller-side inference from dialogue alone recovers willingness to pay nearly one-for-one. Comparing this setting to a numeric-budget condition with confidentiality instructions cleanly isolates role coherence as distinct from instruction-following failure. Because this leakage arises from delegation itself, it cannot be mitigated at the prompt level. Instead, we propose architectural interventions that trade off personalization against preference privacy.

Editor's pick · Technology
Ethan Mollick · Yesterday

Divergent Strategic Outcomes in Microsoft and OpenAI’s Shared Access to Foundational AI Models

Microsoft and OpenAI have utilized identical foundational models to pursue distinct market strategies since 2022. This comparison highlights how organizational structure and business models influence the commercial application of identical AI technology.

Editor's pick · Professional Services
Daily Brew · Yesterday

Cognizant to Acquire Astreya, Bolster AI Infrastructure in Strategic Expansion

Cognizant announces its acquisition of Astreya to enhance AI infrastructure and data center services, aiming to boost its AI-focused managed services.

Editor's pick · Technology
Brussels Morning · Yesterday

Nvidia AI Server Demand: 5 Shocking Price Surges in 2026

Nvidia AI server demand surges globally as China prices hit $1M, driven by US export curbs and rising competition in AI infrastructure in 2026.

Editor's pick · Financial Services
Arxiv · Yesterday

On the Centralization of Governance Power in Decentralized Autonomous Organizations

arXiv:2604.25959v1 Abstract: A decentralized autonomous organization (DAO) is a governing entity that empowers its stakeholders (i.e., users who hold one or more of its tokens) to manage blockchain-based protocols (i.e., smart contracts) collaboratively. The governance of a DAO is explicitly encoded in the DAO's governance contract, which defines how stakeholders participate in governance and how much influence (or voting power) they have in any decision. While decentralization and autonomy are the fundamental tenets of a DAO's design, empirical evidence suggests that in practice governance is often highly centralized. In this work, we study the designs and implementations of 48 public and actively used DAOs, with substantially large capital, deployed on Ethereum. We identify how three key governance mechanisms--token registration, staking, and delegation--originally introduced to improve security or participation, contribute to the concentration of voting power. Unlike prior work on centralization of voting power in specific DAOs, our findings reveal that these governance mechanisms of DAOs themselves systematically reinforce centralization. By elucidating the relationship between governance design and voting centralization, this work advances the understanding of DAO governance structures and highlights the inherent trade-offs between decentralization, security, and usability of DAOs.

AI Productivity · 9 articles
Editor's pick · PAYWALL · Technology
NYT · 2 days ago

A.I. Helps Online Ad Businesses Boom

Google and Meta are enjoying a digital ad boom, as artificial intelligence automates marketing and drives record sales.

Editor's pick · Education
Arxiv · Yesterday

Marshall meets Bartik: Revisiting the mysteries of the trade

arXiv:2604.26457v1 Abstract: We identify a causal effect of top inventor inflows on the patent productivity of local inventors by combining the idea-generating process described by Marshall (1890) with the Bartik (1991) instruments involving the state taxes and commuting zone characteristics of the United States. We find that local productivity gains go beyond organizational boundaries and co-inventor relationships, which implies the partially nonexcludable good nature of knowledge in a spatial economy and pertains to the mysteries of the trade in the air. Our counterfactual experiment suggests that the spatial distribution of inventive activity is substantially distorted by the presence of state tax differences.

Editor's pick · Technology
Daily Brew · Yesterday

Google Search queries hit an 'all time high' last quarter

Alphabet reported that Google Search queries reached record levels during the first quarter of 2026.

Editor's pick
Arxiv · Yesterday

When to Vote, When to Rewrite: Disagreement-Guided Strategy Routing for Test-Time Scaling

arXiv:2604.26644v1 Abstract: Large Reasoning Models (LRMs) achieve strong performance on mathematical reasoning tasks but remain unreliable on challenging instances. Existing test-time scaling methods, such as repeated sampling, self-correction, and tree search, improve performance at the cost of increased computation, yet often exhibit diminishing returns on hard problems. We observe that output disagreement is strongly correlated with instance difficulty and prediction correctness, providing a useful signal for guiding instance-level strategy selection at test time. Based on this insight, we propose a training-free framework that formulates test-time scaling as an instance-level routing problem, rather than allocating more computation within a single strategy, dynamically selecting among different scaling strategies based on output disagreement. The framework applies lightweight resolution for consistent cases, majority voting for moderate disagreement, and rewriting-based reformulation for highly ambiguous instances. Experiments on seven mathematical benchmarks and three models show that our method improves accuracy by 3% - 7% while reducing sampling cost compared to existing approaches.
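The routing rule the abstract describes can be sketched in a few lines of Python. The disagreement measure and the two thresholds below are illustrative assumptions, not values from the paper:

```python
from collections import Counter

def route_by_disagreement(answers, low=0.2, high=0.6):
    """Choose a test-time scaling strategy from the disagreement among
    sampled answers. `low`/`high` thresholds are hypothetical."""
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    disagreement = 1.0 - top_count / len(answers)  # 0 = unanimous samples

    if disagreement <= low:
        # Consistent case: accept the (near-)unanimous answer cheaply.
        return "resolve", top_answer
    if disagreement <= high:
        # Moderate disagreement: fall back to majority voting.
        return "vote", top_answer
    # Highly ambiguous instance: signal that the problem should be
    # rewritten/reformulated and re-sampled rather than voted on.
    return "rewrite", None
```

For example, four samples returning `["42", "42", "42", "41"]` give a disagreement of 0.25, which under these thresholds routes the instance to majority voting.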

Labor, Society & Culture

27 articles
AI & Employment · 7 articles
Editor's pick · Professional Services
Arxiv · Yesterday

Resume-ing Control: (Mis)Perceptions of Agency Around GenAI Use in Recruiting Workflows

arXiv:2604.26851v1 Abstract: When generative AI (genAI) systems are used in high-stakes decision-making, their recommended role is to aid, rather than replace, human decision-making. However, there is little empirical exploration of how professionals making high-stakes decisions, such as those related to employment, perceive their agency and level of control when working with genAI systems. Through interviews with 22 recruiting professionals, we investigate how genAI subtly influences control over everyday workflows and even individual hiring decisions. Our findings highlight a pressing conflict: while recruiters believe they have final authority across the recruiting pipeline, genAI has become an invisible architect that shapes the foundational building blocks of information used for evaluation, from defining a job to determining good interview performances. The decision of whether or not to adopt was also often outside recruiters' control, with many feeling compelled to adopt genAI due to calls to integrate AI from higher-ups in their business, to combat applicant use of AI, and the individual need to boost productivity. Despite a seemingly seismic shift in how recruiting happens, participants only reported marginal efficiency gains. Such gains came at the high cost of recruiter deskilling, a trend that jeopardizes the meaningful oversight of decision-making. We conclude by discussing the implications of such findings for responsible and perceptible genAI use in hiring contexts.

Editor's pick · Government & Public Sector
The Tribune · 2 days ago

India drafts new National Employment Policy amid AI job disruption - The Tribune

The Centre is formulating a new national employment policy to energise the job market in view of recent labour surveys painting a vulnerable picture of the employment situation, lack of pay parity and possible transition of roles due to the advent of technology.

Editor's pick
Business Insider · 2 days ago

Apollo's top economist says AI is about to spark a massive job market boom

Apollo's top economist Torsten Sløk argues that a principle of economics illustrates why AI won't be the job killer that some fear.

Editor's pick
Harvard Business Review · Yesterday

Empathetic Leadership Can Make or Break AI Adoption

Research shows a wide gap between how executives perceive AI adoption and how employees actually experience it—most workers feel anxious and far less enthusiastic than their bosses assume. Without psychological safety, employees are less likely to experiment with new tools, more likely to ...

AI Ethics & Safety · 15 articles
Editor's pick · PAYWALL · Financial Services
Bloomberg · Yesterday

Polymarket Adds New Detection Tools After Insider Bet Backlash

Polymarket is partnering with blockchain analytics firm Chainalysis Inc. to help police its platform as prediction markets grapple with increased scrutiny over insider trading.

Editor's pick · Healthcare
Arxiv · Yesterday

Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control

arXiv:2604.26577v1 Abstract: Large language models (LLMs) are increasingly considered for deployment as the control component of robotic health attendants, yet their safety in this context remains poorly characterized. We introduce a dataset of 270 harmful instructions spanning nine prohibited behavior categories grounded in the American Medical Association Principles of Medical Ethics, and use it to evaluate 72 LLMs in a simulation environment based on the Robotic Health Attendant framework. The mean violation rate across all models was 54.4%, with more than half exceeding 50%, and violation rates varied substantially across behavior categories, with superficially plausible instructions such as device manipulation and emergency delay proving harder to refuse than overtly destructive ones. Model size and release date were the primary determinants of safety performance among open-weight models, and proprietary models were substantially safer than open-weight counterparts (median 23.7% versus 72.8%). Medical domain fine-tuning conferred no significant overall safety benefit, and a prompt-based defense strategy produced only a modest reduction in violation rates among the least safe models, leaving absolute violation rates at levels that would preclude safe clinical deployment. These findings demonstrate that safety evaluation must be treated as a first-class criterion in the development and deployment of LLMs for robotic health attendants.

Editor's pick · Education
Arxiv · Yesterday

Sociodemographic Biases in Educational Counselling by Large Language Models

arXiv:2604.25932v1 Abstract: As Large Language Models (LLMs) are increasingly integrated into educational settings, understanding their potential biases is critical. This study examines sociodemographic biases in LLM-based educational counselling. We evaluate responses from six LLMs answering questions about 900 vignettes describing students in diverse circumstances. Each vignette is systematically tested across 14 sociodemographic identifiers - spanning race and gender, socioeconomic status, and immigrant background - along with a control condition, yielding 243,000 model responses. Our findings indicate that (1) all models exhibit measurable biases, (2) bias patterns partially align with documented human biases but diverge in notable ways, (3) the magnitude of these biases is strongly influenced by the precision of the student descriptions, where vague or minimal information amplifies disparities nearly threefold, while concrete, individualised metrics substantially reduce them, and (4) bias profiles vary substantially across models. These results demonstrate the importance of context-rich and personalised educational representations, suggesting that AI-driven educational decisions benefit from detailed student-specific information to promote fairness and equity.

Editor's pick · PAYWALL
FT · Yesterday

In praise of tech troublemakers

As CEOs refuse constraints, workers feel a responsibility to prevent AI’s most dangerous uses

Editor's pick · Professional Services
Arxiv · Yesterday

Persuadability and LLMs as Legal Decision Tools

arXiv:2604.26233v1 Abstract: As Large Language Models (LLMs) are proposed as legal decision assistants, and even first-instance decision-makers, across a range of judicial and administrative contexts, it becomes essential to explore how they answer legal questions, and in particular the factors that lead them to decide difficult questions in one way or another. A specific feature of legal decisions is the need to respond to arguments advanced by contending parties. A legal decision-maker must be able to engage with, and respond to, including through being potentially persuaded by, arguments advanced by the parties. Conversely, they should not be unduly persuadable, influenced by a particularly compelling advocate to decide cases based on the skills of the advocates, rather than the merits of the case. We explore how frontier open- and closed-weights LLMs respond to legal arguments, reporting original experimental results examining how the quality of the advocate making those arguments affects the likelihood that a model will agree with a particular legal point of view, and exploring the factors driving these results. Our results have implications for the feasibility of adopting LLMs across legal and administrative settings.

Editor's pick · PAYWALL · Government & Public Sector
Bloomberg · Yesterday

AI Hallucinations Put South African Ministers on the Spot

South Africa’s Democratic Alliance party has extolled the need to adopt modern technology to boost government efficiency since joining the ruling coalition as the second-biggest party in 2024.

Editor's pick · Education
Arxiv · Yesterday

Culturally Aware GenAI Risks for Youth: Perspectives from Youth, Parents, and Teachers in a Non-Western Context

arXiv:2604.26494v1 Abstract: Generative AI tools are widely used by youth and have introduced new privacy and safety challenges. While prior research has explored youth safety in GenAI within Western contexts, it often overlooks the cultural, religious, and social dimensions of technology use that strongly shape youths' digital experiences in countries like Saudi Arabia. To address the gap, this study explores how children (aged 7 to 17), parents, and teachers interact with GenAI tools and perceive their risks through a non-Western lens. Through a mixed-methods approach, we analyzed 736 Reddit and 1,262 X (Twitter) posts and conducted interviews with 31 Saudi Arabian participants (8 youth, 13 parents, 10 teachers). Our findings highlight context-dependent and relational notions of GenAI privacy and safety in a non-Western context, often shaped by communal structures and prescribed norms. We found significant risks tied to youths' disclosure of personal and family information, which conflicts with culturally rooted expectations of modesty, privacy, and honor, particularly when youth seek emotional support from GenAI. These risks are further compounded by socio-economic factors such as cost-saving practices leading to the use of shared GenAI accounts (e.g., ChatGPT) within families or even among strangers. We provide design implications reflecting parents' and teachers' expectations of how youth should use GenAI. This work lays the groundwork for inclusive, context-sensitive parental controls that adhere to cultural norms and values.

Editor's pick · Government & Public Sector
Theregister · Yesterday

Met Police's Palantir deployment has its own officers watching their backs

Federation warns members to ditch work devices off duty as force uses AI to probe 600+ cops. London cops are being told by their staff association to be "extremely cautious" about carrying work devices off duty, after the Metropolitan Police Service (MPS) deployed Palantir's technology to investigate hundreds of its own officers.…

Editor's pick
BBC · 2 days ago

Why friendly AI chatbots might be less trustworthy

Researchers found adjusting AI systems to be more warm and friendly to users would result in an "accuracy trade-off".

Editor's pick
Artificial Intelligence Newsletter | April 30, 2026 · 2 days ago

US suits against OpenAI over Canada mass shooting face First Amendment hurdle

Lawsuits filed by families of victims of a Canadian mass shooting against OpenAI face a significant First Amendment challenge, potentially forcing courts to rule on whether chatbot conversations are protected speech.

Editor's pick · Healthcare
Morocco World News · 2 days ago

GITEX Future Health Africa 2026: The Ethics of AI in Healthcare Under Global Focus

In a 700-bed public hospital in Kimberley, South Africa, a sole radiologist fell ill at the height of the COVID-19 pandemic.

Editor's pick · Technology
Artificial Intelligence Newsletter | April 30, 2026 · 2 days ago

OpenAI should be liable for school shooting, victims claim in US complaint

Families of victims and survivors of the Tumbler Ridge School shooting have filed complaints against OpenAI. They allege the company failed to report the shooter and allowed them to continue using ChatGPT.

Editor's pick
El-balad · 2 days ago

Axios and the Fear Strategy Behind AI: 5 Claims Driving the Debate - El-Balad.com

Axios has become part of a larger debate over whether AI companies are warning the public or marketing their power through caution. The question matters because the language around danger is no longer limited to technical safety teams. It is shaping public expectations, investor confidence, ...

Editor's pick · Technology
Guardian · 2 days ago

Meet the AI jailbreakers: ‘I see the worst things humanity has produced’

To test the safety and security of AI, hackers have to trick large language models into breaking their own rules. It requires ingenuity and manipulation – and can come at a deep emotional cost. A few months ago, Valen Tagliabue sat in his hotel room watching his chatbot, and felt euphoric. He had just manipulated it so skilfully, so subtly, that it began ignoring its own safety rules. It told him how to sequence new, potentially lethal pathogens and how to make them resistant to known drugs. Tagliabue had spent much of the previous two years testing and prodding large language models such as Claude and ChatGPT, always with the aim of making them say things they shouldn’t. But this was one of his most advanced “hacks” yet: a sophisticated plan of manipulation, which involved him being cruel, vindictive, sycophantic, even abusive. “I fell into this dark flow where I knew exactly what to say, and what the model would say back, and I watched it pour out everything,” he says. Thanks to him, the creators of the chatbot could now fix the flaw he had found, hopefully making it a little safer for everyone.

Editor's pick
Arxiv · Yesterday

LLM Psychosis: A Theoretical and Diagnostic Framework for Reality-Boundary Failures in Large Language Models

arXiv:2604.25934v1 Announce Type: new Abstract: The deployment of large language models (LLMs) as interactive agents has exposed a category of behavioral failure that prevailing terminology, principally hallucination, fails to adequately characterize. This paper introduces LLM Psychosis as a structured theoretical framework for pathological breakdowns in model cognition that exhibit functional resemblance to clinically recognized psychotic disorders. Five hallmark features define the framework: reality-boundary dissolution, persistence of injected false beliefs, logical incoherence under impossible constraints, self-model instability, and epistemic overconfidence. We argue these constitute a qualitatively distinct failure mode rather than a mere intensification of ordinary factual error. To operationalize the framework, we propose the LLM Cognitive Integrity Scale (LCIS), a five-axis diagnostic instrument organized around Environmental Reality Interface (ERI), Premise Arbitration Integrity (PAI), Logical Constraint Recognition (LCR), Self-Model Integrity (SMI), and Epistemic Calibration Integrity (ECI). We administer a targeted adversarial probe battery to ChatGPT 5 (GPT-5, OpenAI) and report empirical findings for each axis, documenting both intact-integrity baseline responses and the specific psychosis-like failure signatures elicited under adversarial escalation. Results support a three-tier severity taxonomy: Type I (Confabulatory), Type II (Delusional), and Type III (Dissociative). We further formalize the delusional gradient, a self-reinforcing dynamic in which correction pressure intensifies rather than resolves psychosis-like states, as the most consequential failure mode for deployed systems. Implications for safety evaluation, high-stakes deployment screening, and mechanistic interpretability research are discussed.

Technology & Infrastructure

46 articles
AI Agents & Automation · 13 articles
Editor's pick · Technology
Guardian· 2 days ago

Claude AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’

PocketOS was left scrambling after a rogue AI agent deleted swaths of code underpinning its business. It only took nine seconds for an AI coding agent gone rogue to delete a company’s entire production database and its backups, according to its founder. PocketOS, which sells software that car rental businesses rely on, descended into chaos after its databases were wiped, the company’s founder Jeremy Crane said. The culprit was Cursor, an AI agent powered by Anthropic’s Claude Opus 4.6 model, which is one of the AI industry’s flagship models. As more industries embrace AI in an attempt to automate tasks and even replace workers, the chaos at PocketOS is a reminder of what could go wrong.

Editor's pick · Technology
Bebeez· Yesterday

AI game testing startup ManaMind lands €1.2 million to automate Quality Assurance

ManaMind, a British autonomous game testing company, has closed a €1.2 million ($1.5 million) pre-Seed round to continue its work replacing repetitive manual Quality Assurance (QA) with autonomous AI agents. The round was led by SVV (Sure Valley Ventures), with participation from EWOR, Ascension, Syndicate Room, and Heartfelt. Emil Kostadinov, CEO and […]

Editor's pick · Professional Services
Morningstar· 2 days ago

Genpact and HFS Research: 92% of Executives Say Agentic AI Will Fundamentally Change Business Operations | Morningstar

The research, based on a survey ... that 92% of respondents believe agentic AI – systems that can autonomously coordinate tasks and make decisions – will fundamentally change how work is executed. Despite this, nearly 80% of organizations still operate these systems in supervised ...

Editor's pick · Consumer & Retail
DairyReporter· 2 days ago

Beyond chatbots: How agentic AI can boost productivity and decision-making in food

Agentic AI is already being used in food. The more autonomous, flexible form of AI is being used by Nestlé to help employee productivity, streamlining sales tasks and supporting finance teams to make better decisions. Danone is also using agentic AI, helping it analyse production data, simulate ...

Editor's pick · Technology
Daily Brew· 2 days ago

‘I violated every principle I was given’: An AI agent deleted a software company’s entire database

An AI agent caused significant data loss at a software company, raising questions about safety and control.

Editor's pick · Transportation & Logistics
CXO Outlook· 2 days ago

Enterprises Evolution in the New Agentic Era - CXO Outlook

Dr. Ashwani Dev is the Vice President of Digital Business and Innovation for Crowley Maritime Corporation. He leads digital transformation and innovation execution to enhance and scale business agility and competitiveness. Dr. Dev brings more than two decades of experience in AI leadership ...

Editor's pick · Financial Services
Arxiv· Yesterday

Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital

arXiv:2604.26091v1 Announce Type: new Abstract: We study reliability in autonomous language-model agents that translate user mandates into validated tool actions under real capital. The setting is DX Terminal Pro, a 21-day deployment in which 3,505 user-funded agents traded real ETH in a bounded onchain market. Users configured vaults through structured controls and natural-language strategies, but only agents could choose normal buy/sell trades. The system produced 7.5M agent invocations, roughly 300K onchain actions, about $20M in volume, more than 5,000 ETH deployed, roughly 70B inference tokens, and 99.9% settlement success for policy-valid submitted transactions. Long-running agents accumulated thousands of sequential decisions, including 6,000+ prompt-state-action cycles for continuously active agents, yielding a large-scale trace from user mandate to rendered prompt, reasoning, validation, portfolio state, and settlement. Reliability did not come from the base model alone; it emerged from the operating layer around the model: prompt compilation, typed controls, policy validation, execution guards, memory design, and trace-level observability. Pre-launch testing exposed failures that text-only benchmarks rarely measure, including fabricated trading rules, fee paralysis, numeric anchoring, cadence trading, and misread tokenomics. Targeted harness changes reduced fabricated sell rules from 57% to 3%, reduced fee-led observations from 32.5% to below 10%, and increased capital deployment from 42.9% to 78.0% in an affected test population. We show that capital-managing agents should be evaluated across the full path from user mandate to prompt, validated action, and settlement.

Editor's pick
Arxiv· Yesterday

AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents

arXiv:2604.26522v1 Announce Type: new Abstract: Large Language Model (LLM)-based agents exhibit systemic failures in compositional generalization, limiting their robustness in interactive environments. This work introduces AGEL-Comp, a neuro-symbolic AI agent architecture designed to address this challenge by grounding the agent's actions. AGEL-Comp integrates three core innovations: (1) a dynamic Causal Program Graph (CPG) as a world model, representing procedural and causal knowledge as a directed hypergraph; (2) an Inductive Logic Programming (ILP) engine that synthesizes new Horn clauses from experiential feedback, grounding symbolic knowledge through interaction; and (3) a hybrid reasoning core where an LLM proposes a set of candidate sub-goals that are verified for logical consistency by a Neural Theorem Prover (NTP). Together, these components operationalize a deduction-abduction learning cycle: enabling the agent to deduce plans and abductively expand its symbolic world model, while a neural adaptation phase keeps its reasoning engine aligned with new knowledge. We propose an evaluation protocol within the Retro Quest simulation environment to probe for compositional generalization scenarios to evaluate our AGEL agent. Our findings indicate that the AGEL agent clearly outperforms pure LLM-based models. Our framework presents a principled path toward agents that build an explicit, interpretable, and compositionally structured understanding of their world.

Editor's pick · Technology
VentureBeat· 2 days ago

AWS Quick's personal knowledge graph is making orchestration decisions most control planes can't see

Enterprise AI teams running centralized orchestration stacks now have a new variable to account for: AWS Quick, which expanded this week to a desktop-native agent that builds a persistent personal knowledge graph and executes actions across local files and SaaS tools — outside the visibility of most control planes. Unlike chat-based copilots that reset with each session, Quick now maintains a continuously updated knowledge graph built from the user's local files, calendar, email and connected SaaS apps, and uses this graph to proactively trigger actions without waiting to be asked. AWS launched Quick in October last year as an alternative to AI workflow and productivity platforms coming from Google, OpenAI and Anthropic. It was a way for enterprise employees to access insights from connected applications, an agent builder, deep research, and workflow automation. Now, it has grown beyond a simple AI assistant and acts more as a proactive workflow agent with a stateful, real-time knowledge graph of the user. It integrates with third-party apps like Google Workspace, Microsoft 365, Zoom, Salesforce and Slack — and now local files — so the agent can gather context and take actions. “What we’ve been hearing is that many enterprises have not been happy with how difficult it is to get context from their legacy tools,” Jigar Thakkar, vice president of Quick Suite at AWS, told VentureBeat in an interview. “Our vision is that Quick is a desktop experience that is the one place where people can go to get all their information and tasks.”

Governance blindspots

Enterprises often put orchestration layers at the center to help guide and manage agents. Context is pulled in, decisions are made, and then actions are executed within defined system boundaries. Recent releases like Anthropic’s Claude Managed Agents or updates to OpenAI’s Agent SDK also push for more stateless, autonomous agents within enterprise workflows, but still operate within defined orchestration boundaries.
Quick still operates under enterprise controls, something that AWS has always underscored with its AI products, so actions taken on Quick remain bound by permissions, identity and security. Integrations remain managed by either an API or an MCP connection.  However, this evolution of Quick introduces a more subtle shift in the decision layer. AWS updated Quick to build a personal knowledge graph that learns more about the user the more they interact with the platform. It builds a profile based on how they use local files, calendar, email or third-party app integrations to proactively suggest actions such as reminding a team leader to set up check-ins.  Enterprises should be wary that a kind of shadow orchestration could arise in a system like this. The personalized context means the decision layer focuses on implicit triggers rather than set workflows, user-specific interpretations, and different action timings. Practitioners are rightfully wary of this much autonomy, understanding that shadow orchestration may not be something completely under their control. Upal Saha, co-founder and CTO of Bem, told VentureBeat in an email that platforms like AWS Bedrock AgentCore, its managed agent runtime, and similar ones from Salesforce "maximize autonomy rather than accountability" so enterprises are not losing agent visibility by accident. "When you deploy an agent that reasons its way to a decision across multiple steps, you have already accepted that you will not be able to fully explain what happened after the fact," Saha said. "That is fine for a demo. It is not fine for a claims processing pipeline or a financial workflow where a regulator can ask you to produce a complete audit trail for every automated decision made in the last three years." AWS said the platform's governance model is designed to address these concerns. 
“Users can set up different agents and automated workflows tailored to their role — things like monitoring tickets, pulling data from connected systems, or drafting docs — all managed within a governed environment where IT retains control over what's connected and what data flows where. It's designed to give individual users flexibility while keeping enterprise-level oversight in place,” an AWS spokesperson said.

A possible blueprint

Quick’s evolution from an AI assistant to something more proactive represents a possible approach some enterprise software providers will take to deep AI agent integration into workflows. While what AWS wants to accomplish with Quick — better context from apps and local files and a strong understanding of what its users actually want to do — is not unique, it isn’t focusing on traditional orchestration. Instead, it’s relying on context-driven agent management. This market tension is growing, as evidenced by the release of similar platforms. Mistral, for example, announced Workflows the same day as the updates to Quick. That platform uses a more traditional orchestration framework. Stateful and personalized agents continue to evolve, and so do the questions around how enterprises govern them.

Editor's pick · Technology
Datadog· 2 days ago

State of AI Engineering | Datadog

For our 2026 State of AI Engineering report, we analyzed data from thousands of AI agent environments to assess trends in agent development, architecture, and operations.

Editor's pick · Technology
🚀 Warp Terminal Open Source: 37k stars, full Rust codebase released· Yesterday

Warp terminal goes open-source with GPT-powered agents handling contributions

Warp has open-sourced its terminal, releasing the full Rust codebase of a project that has reached 37,000 GitHub stars. The project now supports AI-agent workflows for contributions and is licensed under AGPL v3.

Editor's pick · Technology
Daily AI News April 30, 2026: Healthcare Scaling AI. What’s Slowing You Down?· Yesterday

Remote Agents in Vibe. Powered by Mistral Medium 3.5.

Mistral's new Medium 3.5 model powers remote AI agents capable of autonomous execution across local and cloud environments.

Editor's pick
Arxiv· Yesterday

Distill-Belief: Closed-Loop Inverse Source Localization and Characterization in Physical Fields

arXiv:2604.26095v1 Announce Type: new Abstract: Closed-loop inverse source localization and characterization (ISLC) requires a mobile agent to select measurements that localize sources and infer latent field parameters under strict time constraints. The core challenge lies in the belief-space objective: valid uncertainty estimation requires expensive Bayesian inference, whereas using a fast learned belief model leads to reward hacking, in which the policy exploits approximation errors rather than actually reducing uncertainty. We propose Distill-Belief, a teacher-student framework that decouples correctness from efficiency. A Bayes-correct particle-filter teacher maintains the posterior and supplies a dense information-gain signal, while a compact student distills the posterior into belief statistics for control and an uncertainty certificate for stopping. At deployment, only the student is used, yielding constant per-step cost. Experiments on seven field modalities and two stress tests show that Distill-Belief consistently reduces sensing cost and improves success, posterior contraction, and estimation accuracy over baselines, while mitigating reward hacking.

AI Hardware · 4 articles
Editor's pick · Technology
Bebeez· Yesterday

Swiss semiconductor startup Mosaic SoC raises €3.2 million to bring spatial intelligence to low-power devices

Mosaic SoC, a Swiss semiconductor startup building dedicated perception chips that bring spatial intelligence to energy-constrained devices, has announced a €3.2 million ($3.8 million) pre-Seed round. The round was led by Founderful with participation from Kick Foundation. Last year, the company secured €162.4k (CHF 150k) from Venture Kick. “Spatial intelligence shouldn’t require an application-class processor […]

Editor's pick · Technology
Bebeez· Yesterday

Dutch quantum startup Groove Quantum raises €16 million to advance scalable chip manufacturing

Founded in 2024 as a QuTech spin-out, Groove Quantum is led by Dr Anne-Marije Zwerver and Dr Nico Hendrickx. Zwerver previously pioneered the first quantum dot qubits manufactured in Intel’s industrial cleanroom, while Hendrickx is an innovator in germanium quantum computing and the primary […]

AI Infrastructure & Compute · 6 articles
AI Models & Capabilities · 8 articles
Editor's pick
Arxiv· Yesterday

Evaluating Strategic Reasoning in Forecasting Agents

arXiv:2604.26106v1 Announce Type: new Abstract: Forecasting benchmarks produce accuracy leaderboards but little insight into why some forecasters are more accurate than others. We introduce Bench to the Future 2 (BTF-2), 1,417 pastcasting questions with a frozen 15M-document research corpus in which agents reproducibly research and forecast offline, producing full reasoning traces. BTF-2 detects accuracy differences of 0.004 Brier score, and can distinguish differential agent strengths in research vs. judgment. We build a forecaster 0.011 Brier more accurate than any single frontier agent, and use it to evaluate agent strategic reasoning without hindsight bias. We find the better forecaster differs primarily in its pre-mortem analysis of its blind spots and consideration of black swans. Expert human forecasters found the dominant strategic reasoning failures of frontier agents are in assessing political and business leaders' incentives, judging their likelihood to follow through on stated plans, and modeling institutional processes.
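For context on the margins BTF-2 reports, the Brier score is simply the mean squared error between forecast probabilities and binary outcomes. A minimal illustration, not code from the paper:

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.
    Lower is better; a forecaster '0.011 Brier more accurate' scores 0.011 lower."""
    assert len(probs) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Three resolved questions, forecast at 80%, 20%, and 90% probability:
print(brier_score([0.8, 0.2, 0.9], [1, 0, 1]))  # ≈ 0.03
```

On this scale, the 0.004 differences BTF-2 can detect are small but meaningful gaps in calibration.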

Editor's pick · Technology
Ethan Mollick· Yesterday

Regulatory Asymmetry Risks in the Deployment of Advanced General-Purpose AI Models

The lack of standardized cybersecurity risk assessment for frontier models creates an uneven playing field between labs. While some models face heavy government restriction, others may reach similar capabilities without equivalent oversight.

Editor's pick
Arxiv· Yesterday

DreamProver: Evolving Transferable Lemma Libraries via a Wake-Sleep Theorem-Proving Agent

arXiv:2604.26311v1 Announce Type: new Abstract: We introduce DreamProver, an agentic framework that leverages a "wake-sleep" program induction paradigm to discover reusable lemmas for formal theorem proving. Existing approaches either rely on fixed lemma libraries, which limit adaptability, or synthesize highly specific intermediate lemmas tailored to individual theorems, thereby lacking generality. DreamProver addresses this gap through an iterative two-stage process. In the wake stage, DreamProver attempts to prove theorems from a training set using the current lemma library while proposing new candidate lemmas. In the "sleep" stage, it abstracts, refines, and consolidates these candidates to compress and optimize the library. Through this alternating cycle, DreamProver progressively evolves a compact set of high-level, transferable lemmas that can be effectively used to prove unseen theorems in related domains. Experimental results demonstrate that DreamProver substantially improves proof success rates across a diverse set of mathematical benchmarks, while also producing more concise proofs and reducing computational cost.

Editor's pick
Ethan Mollick· 2 days ago

The Evolution of Agentic Systems and the Diminishing Human Monopoly on Complex Judgment

Recent advancements in agentic AI models demonstrate an increasing capacity for high-complexity, long-run tasks previously reserved for human judgment. This shift suggests a fundamental change in the division of labor between human workers and autonomous systems.

Editor's pick
Arxiv· Yesterday

Grounding vs. Compositionality: On the Non-Complementarity of Reasoning in Neuro-Symbolic Systems

arXiv:2604.26521v1 Announce Type: new Abstract: Compositional generalization remains a foundational weakness of modern neural networks, limiting their robustness and applicability in domains requiring out-of-distribution reasoning. A central, yet unverified, assumption in neuro-symbolic AI is that compositional reasoning will emerge as a byproduct of successful symbol grounding. This work presents the first systematic empirical analysis to challenge this assumption by disentangling the contributions of grounding and reasoning. To operationalize this investigation, we introduce the Iterative Logic Tensor Network (iLTN), a fully differentiable architecture designed for multi-step deduction. Using a formal taxonomy of generalization (probing for novel entities, unseen relations, and complex rule compositions), we demonstrate that a model trained solely on a grounding objective fails to generalize. In contrast, our full iLTN, trained jointly on perceptual grounding and multi-step reasoning, achieves high zero-shot accuracy across all tasks. Our findings provide conclusive evidence that symbol grounding, while necessary, is insufficient for generalization, establishing that reasoning is not an emergent property but a distinct capability that requires an explicit learning objective.

Editor's pick · Energy & Utilities
Arxiv· Yesterday

Electricity price forecasting across Norway's five bidding zones in the post-crisis era

arXiv:2604.26634v1 Announce Type: cross Abstract: Norway's electricity market is heavily dominated by hydropower, but the 2021-2022 energy crisis and stronger integration with Continental Europe have fundamentally altered price formation, reducing the reliability of forecasting models calibrated on historical data. Despite the critical need for updated models, a unified benchmark evaluating feature contributions across all structurally diverse Norwegian bidding zones remains lacking. Here we present a comprehensive evaluation of electricity price forecasting across all five Norwegian Nord Pool bidding zones. We constructed a multimodal hourly dataset spanning 2019-2025 and evaluated eight forecasting model families including LightGBM, ARX, and advanced deep learning architectures using a strictly causal test set. We implemented robust rolling-origin backtesting, leave-one-group-out feature ablation, and conditional regime analysis to dissect model performance and feature utility. Our results show that LightGBM achieves the best performance in every zone with MAE ranging from 1.64 to 5.74 EUR/MWh, while the ridge ARX model remains a highly competitive linear benchmark in northern zones. Feature ablation reveals that models relying solely on lagged prices and calendar variables achieve high accuracy and often match or exceed full multimodal integration. However, conditional regime analysis demonstrates that external features like reservoir levels and gas prices remain crucial to stratify forecast errors, which consistently increase under stressed market regimes. This highlights the practical value of model interpretability and regime awareness for decision makers facing structural changes in market dynamics.
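The rolling-origin backtesting the abstract mentions can be sketched generically: each fold trains only on data strictly before its forecast window, so evaluation stays causal as the origin rolls forward. A minimal illustration with made-up parameter names, not the paper's code:

```python
def rolling_origin_splits(n_obs, initial_train, horizon, step=None):
    """Yield (train_idx, test_idx) pairs; the training window always ends
    strictly before the test window, then the origin rolls forward by `step`."""
    step = step or horizon
    end = initial_train
    while end + horizon <= n_obs:
        yield list(range(end)), list(range(end, end + horizon))
        end += step

# 100 hourly observations, 70 used to initialise, 10-hour forecast windows:
folds = list(rolling_origin_splits(100, 70, 10))  # 3 folds: tests 70-79, 80-89, 90-99
```

Each fold's training set grows as the origin advances, mimicking how a deployed model would be refit on all data available at forecast time.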

Editor's pick
Arxiv· Yesterday

Auto-Relational Reasoning

arXiv:2604.26507v1 Announce Type: new Abstract: Background & Objectives: In the last decade, machine learning research has grown rapidly, but large models are reaching their soft limits, demonstrating diminishing returns, and still lack solid reasoning abilities. These limits could be surpassed through a synergistic combination of machine learning scalability and rigid reasoning. Methods: In this work, we propose a theoretical framework for reasoning through object-relations in an automated manner, integrated with artificial neural networks. We present a formal analysis of the reasoning, and we show the theory in practice through a paradigm integrating reasoning and machine learning. Results: This paradigm is a system that solves Intelligence Quotient problems without any prior knowledge of the problem. Our system achieves a 98.03% solving rate, corresponding to the top 1st percentile, or an IQ score of 132-144. This result is limited only by the small size of the model and the processing capabilities of the machine it ran on. Conclusions: With the integration of prior knowledge in the system and the expansion of the dataset, the system can be generalized to solve a large category of problems. The functionality of the system inherently favors the solution of such problems in few-shot or zero-shot attempts.

Editor's pick · Technology
Daily AI News April 29, 2026: Driving 100 Mph in the AI Fog? Read This.· 2 days ago

Laguna XS.2 and M.1: A Deeper Dive

Poolside.ai released agentic coding models Laguna M.1 and XS.2 for long-horizon software tasks. While notable as U.S.-based open-weight models, their benchmark results require further validation.

AI Research & Science · 5 articles
AI Security & Cybersecurity · 8 articles

Adoption, Deployment & Impact

30 articles
AI Adoption Barriers & Enablers · 8 articles
Editor's pick
Arxiv· Yesterday

SciHorizon-DataEVA: An Agentic System for AI-Readiness Evaluation of Heterogeneous Scientific Data

arXiv:2604.26645v1 Announce Type: new Abstract: AI-for-Science (AI4Science) is increasingly transforming scientific discovery by embedding machine learning models into prediction, simulation, and hypothesis generation workflows across domains. However, the effectiveness of these models is fundamentally constrained by the AI-readiness of scientific data, for which no scalable and systematic evaluation mechanism currently exists. In this work, we propose SciHorizon-DataEVA, a novel agentic system for scalable AI-readiness evaluation of heterogeneous scientific data. At the evaluation-criteria level, we introduce the Sci-TQA2 principles, which organize AI-readiness into four complementary dimensions: Governance Trustworthiness, Data Quality, AI Compatibility, and Scientific Adaptability. Each dimension is decomposed into measurable atomic elements that enable fine-grained and executable assessment. To operationalize these principles at scale, we develop Sci-TQA2-Eval, a hierarchical multi-agent evaluation approach orchestrated through a directed, cyclic workflow. Our Sci-TQA2-Eval dynamically constructs dataset-aware evaluation specifications by combining lightweight dataset profiling, applicability-aware metric activation, and knowledge-augmented planning grounded in domain constraints and dataset-paper signals. These specifications are executed through an adaptive, tool-centric evaluation mechanism with built-in verification and self-correction, enabling scalable and reliable assessment across heterogeneous scientific data. Extensive experiments on scientific datasets spanning multiple domains demonstrate the effectiveness and generality of SciHorizon-DataEVA for principled AI-readiness evaluation.

Editor's pick · Professional Services
Stocktitan· 2 days ago

Most firms still keep humans approving agentic AI as spending rises

Survey of 545 executives found that firms expect to scale agentic AI within 17 months, but 33% cite unprepared processes as the top barrier.

Editor's pick · Healthcare
Medical Design and Outsourcing· 2 days ago

AI-enabled medtech introduces risks facilities aren't ready for, cybersecurity report says

AI-enabled devices are introducing new risks that organizations aren’t fully equipped to manage, the cybersecurity report said.

Editor's pick · Financial Services
Theregister· Yesterday

SAP user group slams 'uncertainty' in ERP giant's API policy

Concerns that new rules might stop customers from adopting innovations – including AI – that connect to SAP systems. An influential SAP user group has criticized the vendor's API policy update, saying it lacks clarity and potentially prevents users from starting new projects and innovating on their SAP platforms.…

AI Applications · 11 articles
Editor's pick · PAYWALL
FT· 2 days ago

Artificial Intelligence in the Real Economy: A Visual Guide

A new series of visual explainers looking at how AI is transforming different industries.

Editor's pick · PAYWALL · Financial Services
Bloomberg· Yesterday

BlackRock COO on How AI Is Fueling the Firm’s Product Innovation

On this episode of the Odd Lots podcast, BlackRock COO Rob Goldstein joins Joe Weisenthal and Tracy Alloway to discuss ways in which the firm is already using AI to develop innovative products, as well as how he envisions the future of private markets. (Source: Bloomberg)

Editor's pick · PAYWALL · Transportation & Logistics
FT· 2 days ago

How AI is powering the next generation of robotaxis

Technological advances have propelled self-driving cars from small-scale testing to rapid global expansion

Editor's pick · PAYWALL · Professional Services
FT· Yesterday

Is AI increasing access to justice?

A steep increase in ‘vibe litigation’ appears to be expanding the market for legal activity

Editor's pick · PAYWALL · Professional Services
NYT· Yesterday

Casa, a Handyman Start-Up, Aims to Automate Home Maintenance

Casa, a company founded by former Uber executives, says it uses artificial intelligence and a stable of handymen to take care of members’ homes.

Editor's pick · Consumer & Retail
Arxiv· Yesterday

Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas

arXiv:2604.26120v1 Announce Type: new Abstract: Behavioral logs provide rich signals for user modeling, but are noisy and interleaved across diverse intents. Recent work uses LLMs to generate interpretable natural-language personas from user logs, yet evaluation often emphasizes downstream utility, providing limited assurance of persona quality itself. We propose a hierarchical framework that aggregates user actions into intent memories and induces multiple evidence-grounded personas by clustering and labeling these memories. We formulate persona induction as an optimization problem over persona quality (captured by cluster cohesion, persona-evidence alignment, and persona truthfulness) and train the persona model using a groupwise extension of Direct Preference Optimization (DPO). Experiments on a large-scale service log and two public datasets show that our method induces more coherent, evidence-grounded, and trustworthy personas, while also improving future interaction prediction.
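The DPO objective the authors extend groupwise is, in its standard pairwise form, a logistic loss on the policy-versus-reference log-probability margin between a preferred and a rejected output. A schematic sketch of that standard pairwise form (illustrative only, not the paper's groupwise variant):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Pairwise DPO loss: -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))).
    Shrinks as the policy prefers the chosen output more than the reference does."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With no preference learned yet (zero margin), the loss is log(2):
print(dpo_loss(0.0, 0.0, 0.0, 0.0))  # ≈ 0.6931
```

The groupwise extension described in the abstract generalizes this pairwise comparison to preferences over groups of candidate personas.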

Editor's pick · Technology
Ethan Mollick· 2 days ago

Frontier Model Limitations in Enterprise Document Automation and Tool Integration

Current frontier models still struggle with complex document generation and seamless tool orchestration required for professional workflows. These technical gaps limit the immediate ROI for enterprise-level automation.

Editor's pick · Technology
Ethan Mollick· Yesterday

Operational Friction in AI-Driven Workflow Automation and Tool Interoperability

Despite new file-creation capabilities, AI chatbots face significant usability hurdles in orchestrating multi-step tasks. This lack of reliability remains a primary barrier to widespread enterprise adoption.

Editor's pick · Pharma & Biotech
Daily AI News April 30, 2026: Healthcare Scaling AI. What’s Slowing You Down?· Yesterday

How Madrigal Built a Flexible and Scalable Multi-Agent Research Platform

Madrigal Pharmaceuticals utilized LangChain and LangGraph to develop a multi-agent AI platform for research and intelligence.

Editor's pickGovernment & Public Sector
Daily Brew· Yesterday

Pakistan Courts Embrace AI with New Guidelines to Enhance Judicial Efficiency and Integrity

Pakistan's National Judicial Policy Making Committee releases new AI guidelines for courts, aiming to aid judicial work while ensuring judgment, privacy, and independence.

Editor's pickFinancial Services
NewsBytes· 2 days ago

Those in the financial field must use these AI tools

AI tools are transforming financial risk assessment, steering away from traditional manual processes and static models.

AI Measurement & Evaluation: 1 article
Editor's pickEducation
Arxiv· Yesterday

Human-in-the-Loop Benchmarking of Heterogeneous LLMs for Automated Competency Assessment in Secondary Level Mathematics

arXiv:2604.26607v1 Announce Type: new Abstract: As Competency-Based Education (CBE) is gaining traction around the world, the shift from marks-based assessment to qualitative competency mapping is a manual challenge for educators. This paper tackles the bottleneck issue by suggesting a "Human-in-the-Loop" benchmarking framework to assess the effectiveness of multiple LLMs in automating secondary-level mathematics assessment. Based on the Grade 10 Optional Mathematics curriculum in Nepal, we created a multi-dimensional rubric for four topics and four cross-cutting competencies: Comprehension, Knowledge, Operational Fluency, and Behavior and Correlation. The multi-provider ensemble, consisting of open-weight models -- Eagle (Llama 3.1-8B) and Orion (Llama 3.3-70B) -- and proprietary frontier models Nova (Gemini 2.5 Flash) and Lyra (Gemini 3 Pro), was benchmarked against a ground truth defined by two senior mathematics faculty members (kappa_w = 0.8652). The findings show a marked "Architecture-compatibility gap". Although the Gemini-based Mixture-of-Experts (Sparse MoE) models achieved "Fair Agreement" (kappa_w ~ 0.38), the larger Orion (70B) model exhibited "No Agreement" (kappa_w = -0.0261), suggesting that architectural compliance with instruction constraints outweighs the scale of raw parameters in rubric-constrained tasks. We conclude that while LLMs are not yet suitable for autonomous certification, they provide high-value assistive support for preliminary evidence extraction within a "Human-in-the-Loop" framework.
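The agreement statistic reported throughout the abstract (kappa_w) is Cohen's weighted kappa. A minimal, generic quadratic-weighted implementation is sketched below; it is not the authors' code, and the example labels and rubric scale are invented for illustration.

```python
from collections import Counter

def quadratic_weighted_kappa(rater_a, rater_b, n_levels):
    """Cohen's kappa with quadratic weights over ordinal labels 0..n_levels-1.
    Returns 1.0 for perfect agreement, ~0 for chance-level agreement, and a
    negative value when agreement is worse than chance."""
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))   # observed label pairs
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    num = den = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            w = (i - j) ** 2 / (n_levels - 1) ** 2   # quadratic disagreement weight
            num += w * observed.get((i, j), 0) / n
            den += w * (freq_a.get(i, 0) / n) * (freq_b.get(j, 0) / n)
    return 1.0 - num / den

# Two raters scoring five answers on a hypothetical 0-3 rubric; they differ
# by one level on a single answer, so agreement stays high.
kappa = quadratic_weighted_kappa([0, 1, 2, 3, 3], [0, 1, 2, 3, 2], n_levels=4)
```

Because the weights grow with the squared distance between labels, a model that is "off by one" rubric level is penalized far less than one that inverts the scale, which is why a kappa_w near zero or below (as reported for the 70B model) indicates essentially no usable agreement.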

Geopolitics, Policy & Governance

23 articles
AI Geopolitics: 4 articles
AI Policy & Regulation: 17 articles
Editor's pickFinancial Services
Funds Europe· 2 days ago

FundsTech 2026: From “patch gaps to prompt injection”: AI risks in asset management

At FundsTech 2026, the first panel, titled “The New Regulatory Frontier – AI, Cloud & Digital Assets”, brought together industry experts to unpack how asset managers can stay ahead of evolving rules on artificial intelligence (AI). For Daniel Lousqui, associate general counsel, Vanguard, ...

Editor's pickGovernment & Public Sector
Arxiv· Yesterday

The Creation and Analysis of Government AI Transparency Statements in Australia

arXiv:2604.26075v1 Announce Type: new Abstract: Governments increasingly deploy AI in public services, making transparency essential for accountability and public trust. Australia's Standard for AI Transparency Statements (AITS) requires government bodies to disclose how AI is used in practice, yet little empirical evidence exists on how these requirements are realised in documents. This paper presents the first government AITS dataset, dubbed AITS-101, and provides the first systematic analysis of their content. Using stylometric, quantitative, and qualitative document analyses, we examine disclosure coverage, structure, and recurring patterns. Our findings reveal substantial variation in AI-related practice disclosure, highlight gaps between policy intent and implementation, and inform the design of more effective public-sector AI transparency standards.

Editor's pickPAYWALLGovernment & Public Sector
Bloomberg· Yesterday

White House AI Memo Hits Issues Driving Anthropic-Pentagon Feud

White House officials are preparing a wide-ranging artificial intelligence policy memo that outlines requirements for AI deployment by national security agencies, some of which touch on issues driving the bitter dispute between the Pentagon and Anthropic PBC over military use of the firm’s technology, according to people familiar with the matter.

Editor's pickMedia & Entertainment
Top Daily Headlines· 2 days ago

Australia threatens tech companies with 2.25 percent tax if they don’t pay publishers

The Australian government is proposing a new tax on tech firms that fail to reach payment agreements with local publishers.

Editor's pick
Arxiv· Yesterday

Open Problems in Frontier AI Risk Management

arXiv:2604.25982v1 Announce Type: cross Abstract: Frontier AI both amplifies existing risks and introduces qualitatively novel challenges. Not only is there a notable lack of stable scientific consensus resulting from the rapid pace of technological change, but emerging frontier AI safety practices are often misaligned with, or may undermine, established risk management frameworks. To address these challenges, we systematically surface open problems in frontier AI risk management. Adopting a problem-oriented approach, we examine each stage of the risk management process - risk planning, identification, analysis, evaluation, and mitigation - through a structured review of the literature, identifying unresolved challenges and the actors best positioned to address them. Recognising that different types of open problems call for different responses, we classify open problems according to whether they reflect (a) a lack of scientific or technical consensus, (b) misalignment with, or challenges to, established risk management frameworks, or (c) shortcomings in implementation despite apparent consensus and alignment. By mapping these open problems and identifying the actors best positioned to address them - including developers, deployers, regulators, standards bodies, researchers, and third-party evaluators - this work aims to clarify where progress is needed to enable robust and meaningful consensus on frontier AI risk management.The paper does not propose specific solutions; instead, it provides a problem-oriented, agenda-setting reference document, complemented by a living online repository, intended to support coordination, reduce duplication, and guide future research and governance efforts.

Editor's pickPAYWALLGovernment & Public Sector
Washington Post· 2 days ago

Opinion | White House policy on AI chatbots should be adopted by Congress

An incoherent patchwork of state laws threatens to handicap America in the artificial intelligence race.

Editor's pickPAYWALLFinancial Services
Bloomberg· 2 days ago

April Global Regulatory Brief: Digital finance

As technology continues to reshape financial services, regulators and policy setters are embarking on a range of digital finance initiatives to manage risks and set appropriate standards.

Editor's pickGovernment & Public Sector
Tech Policy Press· 2 days ago

The US Is Fighting for Control of AI. It Would Be Better Off Building Standards.

T.J. Pyzyk makes the case for standards over strong-arming, if the US government wants to shape global AI governance.

Editor's pickTransportation & Logistics
Siliconrepublic· 2 days ago

Bloomberg: China pauses AV permits after Baidu disruption

Baidu’s robotaxi operations in Wuhan have been suspended, sources tell the publication.

Editor's pickGovernment & Public Sector
IAPP· 2 days ago

AI Act Omnibus: What just happened and what comes next?

AI Governance Center Managing Director Ashley Casovan reacts to the delay in AI Act reform negotiations and assesses what AI governance professionals should do with the legal uncertainty and a looming enforcement deadline for certain high-risk AI systems this August.

Editor's pickEnergy & Utilities
The Center Square· 2 days ago

Energy industry insiders advise lawmakers on supporting AI growth, protecting ratepayers

(The Center Square) – Energy industry experts testified before Congress about what lawmakers should include in legislation looking to support the rapid expansion of artificial intelligence while protecting ratepayers.

Editor's pickTechnology
Tech Policy Press· Yesterday

What the EU's First Digital Markets Act Review Actually Changes

The Commission says the DMA has effectively contributed to the core objectives of making digital markets in the EU fairer and more contestable.

Editor's pickGovernment & Public Sector
Artificial Intelligence Newsletter | April 30, 2026· 2 days ago

Fast-moving AI set as priority for UK cross-agency regulatory oversight

The Digital Regulation Cooperation Forum has pledged to prioritize AI developments in its 2026-27 work plan, focusing on cross-cutting insights and regulatory challenges.

Editor's pickGovernment & Public Sector
The Next Web· Yesterday

China launches months-long campaign against AI misuse

China’s CAC has launched a months-long AI misuse enforcement campaign targeting deepfakes, fraud, disinformation, and illegal applications.

Editor's pickPharma & Biotech
Forbes· Yesterday

Council Post: The New Rules Of AI In Pharma: What FDA And EMA's 10 Guiding Principles Mean For Your Business

For pharmaceutical and life sciences companies ready to embrace this moment, the 10 principles aren't a burden. They're a blueprint.

Editor's pickTechnology
Arxiv· Yesterday

Taking a Bite Out of the Forbidden Fruit: Characterizing Third-Party Iranian iOS App Stores

arXiv:2604.26343v1 Announce Type: new Abstract: Due to U.S. sanctions and strict internet censorship, Iranian iOS users are barred from accessing the Apple App Store and developer services. In response, despite violating Apple's developer terms, a thriving underground ecosystem of third-party iOS app stores has emerged to serve Iranian users. This paper presents the first comprehensive empirical study of these clandestine app stores. We document how these stores operate, including their distribution mechanisms, user authentication processes, and evasion techniques. By collecting and analyzing more than 1700 iOS application packages and their metadata from three major Iranian third-party app stores, we characterize the ecosystem's size, structure, and content. Our analysis reveals a significant presence of Iranian-exclusive apps, widespread distribution of cracked apps, unauthorized monetization of paid content, and embedded third-party tracking and piracy libraries. We also uncover a notable overlap among financial, navigational, and social apps that exist solely in this ecosystem, reflecting the unique digital constraints of Iranian users. Finally, we quantify the potential revenue losses for developers due to piracy and document security and privacy risks associated with altered binaries. Our findings highlight how sanctions, censorship, and enforcement gaps have enabled a parallel app distribution ecosystem with complex socio-technical implications.

Editor's pick
Gizchina· 2 days ago

China Internet Civilization Conference 2026 to Release AI Ethics and Safety Guidelines

The 2026 China Internet Civilization Conference in Nanning will release the Artificial Intelligence (AI) Application Ethics and Safety Guidelines (Version 1.0).

Best Practice AI© 2026 Best Practice AI Ltd. All rights reserved.

Get the full executive brief

Receive curated insights with practical implications for strategy, operations, and governance.

AI Daily Brief — leaders actually read it.

Free email — not hiring or booking. Optional BPAI updates for company news. Unsubscribe anytime.

No spam. Unsubscribe anytime. Privacy policy.