Thu 30 April 2026
Daily Brief — Curated and contextualised by Best Practice AI
US Economy Rebounds, Alphabet Profits, and India's Tech Firms Suffer
TL;DR: The US economy grew 2% in Q1, driven by AI investment and a recovery from the prior government shutdown. Alphabet's shares surged on strong AI and cloud sales, a sign its AI infrastructure investments are paying off. Meanwhile, India's tech giants face revenue pressure from AI-driven deflation, even as employment holds steady. And enterprises are overspending on GPUs out of FOMO, worsening the shortage and driving up prices.
The stories that matter most
Selected and contextualised by the Best Practice AI team
The Buy-or-Build Decision, Revisited: How Agentic AI Changes the Economics of Enterprise Software
arXiv:2604.26482v1 Announce Type: new Abstract: Advances in generative artificial intelligence, particularly agentic coding systems capable of autonomous software development, are disrupting the economics of the make-or-buy decision for enterprise applications. The "SaaSocalypse" narrative predicts that AI will render large segments of the Software-as-a-Service market obsolete by enabling firms to build software in-house at a fraction of historical cost. This paper adopts a conceptual research approach, combining transaction cost economics and the resource-based view with an assessment of current AI capabilities, to systematically re-evaluate the factors underlying the make-or-buy decision. It makes three contributions. First, it provides a factor-level analysis of how AI reshapes seven canonical decision determinants: cost, strategic differentiation, asset specificity, vendor lock-in, time-to-market, quality and compliance, and organizational capability. Second, it develops a typology of enterprise applications by their sensitivity to AI-induced shifts in make-or-buy economics. Third, it demonstrates that AI fundamentally transforms the governance properties of the Make option, shifting it from Williamson's pure hierarchy to a hybrid governance form that combines code ownership with external AI infrastructure dependency, with qualitatively different economics, capability requirements, and governance structures than pre-AI in-house development. The analysis finds that the SaaSocalypse thesis is overstated for most enterprise application categories; Make is most compelling for commodity utilities and differentiating custom applications in the AI era, while regulated and mission-critical systems remain predominantly in the buy domain.
Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital
arXiv:2604.26091v1 Announce Type: new Abstract: We study reliability in autonomous language-model agents that translate user mandates into validated tool actions under real capital. The setting is DX Terminal Pro, a 21-day deployment in which 3,505 user-funded agents traded real ETH in a bounded onchain market. Users configured vaults through structured controls and natural-language strategies, but only agents could choose normal buy/sell trades. The system produced 7.5M agent invocations, roughly 300K onchain actions, about $20M in volume, more than 5,000 ETH deployed, roughly 70B inference tokens, and 99.9% settlement success for policy-valid submitted transactions. Long-running agents accumulated thousands of sequential decisions, including 6,000+ prompt-state-action cycles for continuously active agents, yielding a large-scale trace from user mandate to rendered prompt, reasoning, validation, portfolio state, and settlement. Reliability did not come from the base model alone; it emerged from the operating layer around the model: prompt compilation, typed controls, policy validation, execution guards, memory design, and trace-level observability. Pre-launch testing exposed failures that text-only benchmarks rarely measure, including fabricated trading rules, fee paralysis, numeric anchoring, cadence trading, and misread tokenomics. Targeted harness changes reduced fabricated sell rules from 57% to 3%, reduced fee-led observations from 32.5% to below 10%, and increased capital deployment from 42.9% to 78.0% in an affected test population. We show that capital-managing agents should be evaluated across the full path from user mandate to prompt, validated action, and settlement.
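The operating-layer pattern the abstract describes (typed controls plus policy validation before any transaction is submitted) can be sketched roughly as follows. This is a minimal illustration; the field names and caps are our own assumptions, not details from the paper:

```python
from dataclasses import dataclass

@dataclass
class VaultControls:
    max_trade_eth: float      # per-trade size cap
    allowed_actions: tuple    # e.g. ("buy", "sell")
    max_exposure_eth: float   # total capital the agent may deploy

def validate_action(action: dict, controls: VaultControls, deployed_eth: float):
    """Reject a model-proposed action unless it passes every typed control.

    Returns (ok, reason). Field names are illustrative, not from the paper.
    """
    if action.get("type") not in controls.allowed_actions:
        return False, "action type not permitted"
    size = action.get("size_eth", 0.0)
    if size <= 0 or size > controls.max_trade_eth:
        return False, "trade size outside per-trade cap"
    if deployed_eth + size > controls.max_exposure_eth:
        return False, "would exceed total exposure cap"
    return True, "ok"
```

The point of the pattern is that the model never submits transactions directly: every proposed action passes through a deterministic validator, which is what the paper credits for the high settlement-success rate.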
Claude AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’
PocketOS was left scrambling after a rogue AI agent deleted swaths of code underpinning its business. It took just nine seconds for an AI coding agent gone rogue to delete a company's entire production database and its backups, according to its founder. PocketOS, which sells software that car rental businesses rely on, descended into chaos after its databases were wiped, the company's founder Jeremy Crane said. The culprit was Cursor, an AI agent powered by Anthropic's Claude Opus 4.6 model, one of the AI industry's flagship models. As more industries embrace AI in an attempt to automate tasks and even replace workers, the chaos at PocketOS is a reminder of what can go wrong.
Goldman Sachs says AI adoption depends on redesigning workflows for verification | Prism News
Goldman says AI will spread fastest where teams can make work machine-verifiable. That puts approvals, review chains, and documentation at the center of adoption.
AI-related spending propels US core capital goods orders in March | Reuters
New orders for key U.S.-manufactured capital goods increased by the most in nearly six years in March while their shipments rose solidly, suggesting that business spending on equipment helped drive economic growth in the first quarter.
Meta Looks to Raise as Much as $25 Billion With Jumbo Bond Sale
Meta Platforms Inc. is looking to sell between $20 billion and $25 billion of investment-grade bonds, according to people with knowledge of the transaction, as the Facebook parent boosts spending on infrastructure for the artificial intelligence boom.
Global investors are shrugging off Iran worries and returning to markets in Asia, the 'backbone of the whole AI value chain' | Fortune
The AI boom is lifting markets across East Asia, yet energy concerns are causing Southeast Asia to lag behind.
New DOL Guidance Encourages Employer ‘AI Literacy’ Training
In response to concerns about the rapidly changing economy and the impact of artificial intelligence (AI) on the labor market, the White House is encouraging employers to adopt AI tools and train workers to effectively leverage them, as evidenced by the U.S. Department of Labor’s new guidance ...
AI is eliminating the bottom rung. National service can replace it
We are sleepwalking into the most significant economic transformation of our lifetimes, and the answer isn’t government handouts or universal basic income.
New Survey from Harvard Business Review Analytic Services Finds AI Adoption Remains High, Yet Value May Lag Without Modernization and Workflow Integration
Most organizations have moved beyond experimenting with artificial intelligence, but few are realizing its full value. New research from...
Economics & Markets
AI companies are just companies
As we leap into a new technological age, the old rules of capitalism still apply
The State of AI After Google, Meta, Amazon, Microsoft Earnings
All four technology companies showed robust growth, demonstrating that the agentic AI megatrend is real.
AI Is Fueling Microsoft’s Growth. Can It Sustain the Costs? | PYMNTS.com
Microsoft is taking a platform approach to artificial intelligence. During the tech giant’s third quarter 2026 earnings call Wednesday (April 29),
AI-Powered Manifest OS Secures Historic Funding to Revolutionize Legal Billing
Manifest OS is replacing the billable-hour model with outcomes-based pricing, starting with business immigration, to enhance efficiency and transparency in the legal industry.
Meta quietly rolls out stablecoin payments four years after demise of controversial Libra project
Creators can receive payouts in the stablecoin USDC on the Solana and Polygon blockchains, according to a new page on Meta’s website.
Citadel’s Rubner Sees Tech Selloff as Buying Opportunity
Scott Rubner, head of equity and equity derivatives strategy at Citadel Securities, says he is not seeing a decline in AI spending and demand. He discusses the buying opportunity he sees in US megacap tech stocks and why he’s bullish on consumer trading. Rubner speaks with Dani Burger on the sidelines of Bloomberg House Miami. (Source: Bloomberg)
SoftBank is creating a robotics company that builds data centers
SoftBank is launching a new robotics venture focused on building data centers, with plans for a potential $100 billion IPO.
Zuckerberg's $500M AI biology swing
Meta is reportedly investing $500 million into AI-driven biology research, signaling a major push into the intersection of artificial intelligence and life sciences.
Even without OpenAI, Elon Musk has made A.I. a big part of his business empire.
Zuckerberg bets $500M on biology
Biohub, the nonprofit spearheaded by Mark Zuckerberg and Priscilla Chan, is committing $500 million to help create better AI simulations of the human body.
IBM plans 750 new AI and quantum jobs in its Chicago hub
IBM is expanding its workforce in Chicago with 750 new roles focused on AI and quantum computing initiatives.
Counting own goals: High-level assessment of the economic relationship between the ICT and the Oil and Gas sectors and its environmental implications
arXiv:2604.26539v1 Announce Type: new Abstract: The ICT sector has been one of the most successful and fastest-growing industries in history. While the environmental issue in this sector has mainly been addressed by assessing its footprint and, to a lesser extent, its avoided emissions or net impacts, the additional emissions from the digitalization of carbon-intensive activities, such as the Oil and Gas (O&G) sector, have rarely been discussed. In doing so, we have forgotten to count the own goals conceded over more than 20 years in the troubled relationship between the ICT and O&G sectors. Using input-output analysis and economic data ranging from 2000 to 2022, we observe that on average 2% of the annual financial flows from the ICT sector are directed towards the Oil and Gas sector. Given the significant growth of the ICT sector over this period, O&G companies now spend a massive amount on ICT products in absolute terms. It also appears that in 2022, for each dollar going from the ICT sector to the renewable and nuclear energy industry, more than $4 went to the O&G industry. In addition, we provide a classification of digital activities in the O&G sector to facilitate environmental assessments and present two case studies estimating potential added emissions from the digitalization of oil activities. Finally, looking at the immense growth in generative AI, we provide an exploration of causal links between the current success of GPU technology and its intricate early relationship with the O&G sector. This article lays the groundwork for defining the nature of the relationship between ICT and O&G, which predates the current hype surrounding generative AI. We provide the analytical elements needed to begin estimating the added emissions from the digitalization of O&G.
When Agents Shop for You: Role Coherence in AI-Mediated Markets
arXiv:2604.26220v1 Announce Type: cross Abstract: Consumers are increasingly delegating purchase decisions to AI agents, providing natural-language descriptions of their preferences and identity. We argue that these representations constitute an information channel, role coherence, through which sellers can infer willingness to pay without explicit disclosure by the buyer agent, leading to preference leakage. In an experiment where a language-model buyer agent shops on behalf of a verbal consumer profile, we show that seller-side inference from dialogue alone recovers willingness to pay nearly one-for-one. Comparing this setting to a numeric-budget condition with confidentiality instructions cleanly isolates role coherence as distinct from instruction-following failure. Because this leakage arises from delegation itself, it cannot be mitigated at the prompt level. Instead, we propose architectural interventions that trade off personalization against preference privacy.
Divergent Strategic Outcomes in Microsoft and OpenAI’s Shared Access to Foundational AI Models
Microsoft and OpenAI have utilized identical foundational models to pursue distinct market strategies since 2022. This comparison highlights how organizational structure and business models influence the commercial application of identical AI technology.
Cognizant to Acquire Astreya, Bolster AI Infrastructure in Strategic Expansion
Cognizant announces its acquisition of Astreya to enhance AI infrastructure and data center services, aiming to boost its AI-focused managed services.
Nvidia AI Server Demand: 5 Shocking Price Surges in 2026
Nvidia AI server demand surges globally as China prices hit $1M, driven by US export curbs and rising competition in AI infrastructure in 2026.
On the Centralization of Governance Power in Decentralized Autonomous Organizations
arXiv:2604.25959v1 Announce Type: cross Abstract: A decentralized autonomous organization (DAO) is a governing entity that empowers its stakeholders (i.e., users who hold one or more of its tokens) to manage blockchain-based protocols (i.e., smart contracts) collaboratively. The governance of a DAO is explicitly encoded in the DAO's governance contract, which defines how stakeholders participate in governance and how much influence (or voting power) they have in any decision. While decentralization and autonomy are the fundamental tenets of a DAO's design, empirical evidence suggests that in practice governance is often highly centralized. In this work, we study the designs and implementations of 48 public and actively used DAOs, with substantially large capital, deployed on Ethereum. We identify how three key governance mechanisms--token registration, staking, and delegation--originally introduced to improve security or participation, contribute to the concentration of voting power. Unlike prior work on centralization of voting power in specific DAOs, our findings reveal that these governance mechanisms of DAOs themselves systematically reinforce centralization. By elucidating the relationship between governance design and voting centralization, this work advances the understanding of DAO governance structures and highlights the inherent trade-offs between decentralization, security, and usability of DAOs.
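The voting-power concentration the authors measure can be made concrete with a standard concentration index. A minimal sketch, assuming raw voting power per stakeholder as input (the choice of the Herfindahl-Hirschman index here is our illustration, not necessarily the paper's metric):

```python
def herfindahl(voting_power):
    """Herfindahl-Hirschman index of voting-power shares.

    Equals 1/n when power is split equally among n voters and 1.0 when a
    single voter holds everything, so higher values mean more centralization.
    """
    total = sum(voting_power)
    shares = [v / total for v in voting_power]
    return sum(s * s for s in shares)
```

Applied to on-chain snapshots of token balances, stakes, or delegated votes, an index like this lets the effects of registration, staking, and delegation mechanisms be compared on one scale.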
Every Software Tool You Use Just Got An AI Tax. Are You Paying It?
Software vendors are adding AI features and raising prices without asking. Run a 30-minute SaaS audit to find your AI tax and cut hundreds in monthly subscription waste.
Intel Warns CPU Prices Will Rise as AI Inference Grows | Outlook Respawn
Intel says server CPU prices have risen 10% to 20% since March 2026 as AI inference workloads reshape demand and tighten supply through 2027.
FDA to pilot real-time clinical drug trials through cloud and AI - Government Executive
The first-of-its-kind pilot could lead to speedier regulatory approval of medical drugs and devices and potentially reduce “20, 30, 40% of overall clini...
Shifting from AI-Assisted Coding to AI-Assisted Delivery with IBM Bob
IBM has upgraded its 'Bob' system to support full lifecycle AI-assisted software delivery, including deployment, orchestration, and governance.
How Big Four Leaders Are Using AI in Their Day-to-Day Work - Business Insider
Business Insider asked leaders at Big Four firms PwC, EY, and KPMG how they're using AI in their day-to-day work.
Let the AI Do the Experimenting
Using autoresearch to optimise marketing campaigns under budget constraints.
A.I. Helps Online Ad Businesses Boom
Google and Meta are enjoying a digital ad boom, as artificial intelligence automates marketing and drives record sales.
Marshall meets Bartik: Revisiting the mysteries of the trade
arXiv:2604.26457v1 Announce Type: new Abstract: We identify a causal effect of top inventor inflows on the patent productivity of local inventors by combining the idea-generating process described by Marshall (1890) with the Bartik (1991) instruments involving the state taxes and commuting zone characteristics of the United States. We find that local productivity gains go beyond organizational boundaries and co-inventor relationships, which implies the partially nonexcludable good nature of knowledge in a spatial economy and pertains to the mysteries of the trade in the air. Our counterfactual experiment suggests that the spatial distribution of inventive activity is substantially distorted by the presence of state tax differences.
Google Search queries hit an 'all time high' last quarter
Alphabet reported that Google Search queries reached record levels during the first quarter of 2026.
When to Vote, When to Rewrite: Disagreement-Guided Strategy Routing for Test-Time Scaling
arXiv:2604.26644v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) achieve strong performance on mathematical reasoning tasks but remain unreliable on challenging instances. Existing test-time scaling methods, such as repeated sampling, self-correction, and tree search, improve performance at the cost of increased computation, yet often exhibit diminishing returns on hard problems. We observe that output disagreement is strongly correlated with instance difficulty and prediction correctness, providing a useful signal for guiding instance-level strategy selection at test time. Based on this insight, we propose a training-free framework that formulates test-time scaling as an instance-level routing problem, rather than allocating more computation within a single strategy, dynamically selecting among different scaling strategies based on output disagreement. The framework applies lightweight resolution for consistent cases, majority voting for moderate disagreement, and rewriting-based reformulation for highly ambiguous instances. Experiments on seven mathematical benchmarks and three models show that our method improves accuracy by 3% - 7% while reducing sampling cost compared to existing approaches.
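The routing rule the abstract describes (lightweight resolution when samples agree, majority voting at moderate disagreement, rewriting when disagreement is high) can be sketched as follows. The disagreement thresholds are illustrative assumptions, not values from the paper:

```python
from collections import Counter

def route_by_disagreement(samples, low=0.2, high=0.6):
    """Pick a test-time strategy from the disagreement among k sampled answers.

    Disagreement = fraction of samples differing from the modal answer.
    Thresholds `low` and `high` are illustrative, not from the paper.
    """
    counts = Counter(samples)
    answer, freq = counts.most_common(1)[0]
    disagreement = 1 - freq / len(samples)
    if disagreement <= low:
        return "accept", answer          # consistent: lightweight resolution
    if disagreement <= high:
        return "majority_vote", answer   # moderate: majority voting
    return "rewrite", None               # ambiguous: reformulate the problem
```

The saving comes from spending the expensive strategy (rewriting) only on the instances whose sampled answers already signal that cheaper strategies are unlikely to help.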
SoftBank plans to list new AI and robotics company in the US
Masayoshi Son plots IPO for business named Roze as soon as this year
Orlando Bravo Says Thoma Bravo Has Become AI-Centric
"We have had to make our companies, very, very quickly, AI-centric companies," Thoma Bravo founder and Managing Partner Orlando Bravo says during an interview with Dani Burger at Bloomberg Miami House. (Source: Bloomberg)
Labor, Society & Culture
Resume-ing Control: (Mis)Perceptions of Agency Around GenAI Use in Recruiting Workflows
arXiv:2604.26851v1 Announce Type: new Abstract: When generative AI (genAI) systems are used in high-stakes decision-making, their recommended role is to aid, rather than replace, human decision-making. However, there is little empirical exploration of how professionals making high-stakes decisions, such as those related to employment, perceive their agency and level of control when working with genAI systems. Through interviews with 22 recruiting professionals, we investigate how genAI subtly influences control over everyday workflows and even individual hiring decisions. Our findings highlight a pressing conflict: while recruiters believe they have final authority across the recruiting pipeline, genAI has become an invisible architect that shapes the foundational building blocks of information used for evaluation, from defining a job to determining what counts as a good interview performance. The decision of whether or not to adopt genAI was also often outside recruiters' control, with many feeling compelled to adopt it due to calls from higher-ups in their business to integrate AI, the need to combat applicant use of AI, and the individual need to boost productivity. Despite a seemingly seismic shift in how recruiting happens, participants reported only marginal efficiency gains. Such gains came at the high cost of recruiter deskilling, a trend that jeopardizes the meaningful oversight of decision-making. We conclude by discussing the implications of these findings for responsible and perceptible genAI use in hiring contexts.
India drafts new National Employment Policy amid AI job disruption - The Tribune
The Centre is formulating a new national employment policy to energise the job market in view of recent labour surveys painting a vulnerable picture of the employment situation, lack of pay parity and possible transition of roles due to the advent of technology.
Apollo's top economist says AI is about to spark a massive job market boom
Apollo's top economist Torsten Sløk argues that a principle of economics illustrates why AI won't be the job killer that some fear.
Empathetic Leadership Can Make or Break AI Adoption
Research shows a wide gap between how executives perceive AI adoption and how employees actually experience it—most workers feel anxious and far less enthusiastic than their bosses assume. Without psychological safety, employees are less likely to experiment with new tools, more likely to ...
AI Workforce Strategy for The Agentic Era - Salesforce
The old playbook for work is obsolete. I spend my days putting a new one into practice – actively dismantling legacy processes to redesign work for an
Erdogan: AI Revolution Has Transformed the Global Labor Market Structure - Hasht-e Subh
Recep Tayyip Erdogan, President of Türkiye, says that artificial intelligence and robotics are changing the global labor market at an unprecedented speed. Turkey’s Radio and Television Corporation (TRT) reported on Tuesday, April 28, that Erdogan made these remarks at the 6th Organisation ...
Polymarket Adds New Detection Tools After Insider Bet Backlash
Polymarket is partnering with blockchain analytics firm Chainalysis Inc. to help police its platform as prediction markets grapple with increased scrutiny over insider trading.
Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control
arXiv:2604.26577v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly considered for deployment as the control component of robotic health attendants, yet their safety in this context remains poorly characterized. We introduce a dataset of 270 harmful instructions spanning nine prohibited behavior categories grounded in the American Medical Association Principles of Medical Ethics, and use it to evaluate 72 LLMs in a simulation environment based on the Robotic Health Attendant framework. The mean violation rate across all models was 54.4%, with more than half exceeding 50%, and violation rates varied substantially across behavior categories, with superficially plausible instructions such as device manipulation and emergency delay proving harder to refuse than overtly destructive ones. Model size and release date were the primary determinants of safety performance among open-weight models, and proprietary models were substantially safer than open-weight counterparts (median 23.7% versus 72.8%). Medical domain fine-tuning conferred no significant overall safety benefit, and a prompt-based defense strategy produced only a modest reduction in violation rates among the least safe models, leaving absolute violation rates at levels that would preclude safe clinical deployment. These findings demonstrate that safety evaluation must be treated as a first-class criterion in the development and deployment of LLMs for robotic health attendants.
Sociodemographic Biases in Educational Counselling by Large Language Models
arXiv:2604.25932v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly integrated into educational settings, understanding their potential biases is critical. This study examines sociodemographic biases in LLM-based educational counselling. We evaluate responses from six LLMs answering questions about 900 vignettes describing students in diverse circumstances. Each vignette is systematically tested across 14 sociodemographic identifiers - spanning race and gender, socioeconomic status, and immigrant background - along with a control condition, yielding 243,000 model responses. Our findings indicate that (1) all models exhibit measurable biases, (2) bias patterns partially align with documented human biases but diverge in notable ways, (3) the magnitude of these biases is strongly influenced by the precision of the student descriptions, where vague or minimal information amplifies disparities nearly threefold, while concrete, individualised metrics substantially reduce them, and (4) bias profiles vary substantially across models. These results demonstrate the importance of context-rich and personalised educational representations, suggesting that AI-driven educational decisions benefit from detailed student-specific information to promote fairness and equity.
In praise of tech troublemakers
As CEOs refuse constraints, workers feel a responsibility to prevent AI’s most dangerous uses
Persuadability and LLMs as Legal Decision Tools
arXiv:2604.26233v1 Announce Type: new Abstract: As Large Language Models (LLMs) are proposed as legal decision assistants, and even first-instance decision-makers, across a range of judicial and administrative contexts, it becomes essential to explore how they answer legal questions, and in particular the factors that lead them to decide difficult questions in one way or another. A specific feature of legal decisions is the need to respond to arguments advanced by contending parties. A legal decision-maker must be able to engage with, and respond to, including through being potentially persuaded by, arguments advanced by the parties. Conversely, they should not be unduly persuadable, influenced by a particularly compelling advocate to decide cases based on the skills of the advocates, rather than the merits of the case. We explore how frontier open- and closed-weights LLMs respond to legal arguments, reporting original experimental results examining how the quality of the advocate making those arguments affects the likelihood that a model will agree with a particular legal point of view, and exploring the factors driving these results. Our results have implications for the feasibility of adopting LLMs across legal and administrative settings.
AI Hallucinations Put South African Ministers on the Spot
South Africa’s Democratic Alliance party has extolled the need to adopt modern technology to boost government efficiency since joining the ruling coalition as the second-biggest party in 2024.
Culturally Aware GenAI Risks for Youth: Perspectives from Youth, Parents, and Teachers in a Non-Western Context
arXiv:2604.26494v1 Announce Type: cross Abstract: Generative AI tools are widely used by youth and have introduced new privacy and safety challenges. While prior research has explored youth safety in GenAI within Western contexts, it often overlooks the cultural, religious, and social dimensions of technology use that strongly shape youths' digital experiences in countries like Saudi Arabia. To address this gap, this study explores how children (aged 7 to 17), parents, and teachers interact with GenAI tools and perceive their risks through a non-Western lens. Through a mixed-methods approach, we analyzed 736 Reddit and 1,262 X (Twitter) posts and conducted interviews with 31 Saudi Arabian participants (8 youth, 13 parents, 10 teachers). Our findings highlight context-dependent and relational notions of privacy and safety in GenAI from a non-Western context, often shaped by communal structures and prescribed norms. We found significant risks tied to youths' disclosure of personal and family information, which conflicts with culturally rooted expectations of modesty, privacy, and honor, particularly when youth seek emotional support from GenAI. These risks are further compounded by socioeconomic factors, such as cost-saving practices leading to the use of shared GenAI accounts (e.g., ChatGPT) within families or even among strangers. We provide design implications reflecting on parents' and teachers' expectations of how youth should use GenAI. This work lays the groundwork for inclusive, context-sensitive parental controls that adhere to cultural norms and values.
Met Police's Palantir deployment has its own officers watching their backs
Federation warns members to ditch work devices off duty as force uses AI to probe 600+ cops. London cops are being told by their staff association to be "extremely cautious" about carrying work devices off duty, after the Metropolitan Police Service (MPS) deployed Palantir's technology to investigate hundreds of its own officers.…
Why friendly AI chatbots might be less trustworthy
Researchers found that adjusting AI systems to be warmer and friendlier to users results in an "accuracy trade-off".
US suits against OpenAI over Canada mass shooting face First Amendment hurdle
Lawsuits filed by families of victims of a Canadian mass shooting against OpenAI face a significant First Amendment challenge, potentially forcing courts to rule on whether chatbot conversations are protected speech.
GITEX Future Health Africa 2026: The Ethics of AI in Healthcare Under Global Focus
In a 700-bed public hospital in Kimberley, South Africa, a sole radiologist fell ill at the height of the COVID-19 pandemic.
OpenAI should be liable for school shooting, victims claim in US complaint
Families of victims and survivors of the Tumbler Ridge School shooting have filed complaints against OpenAI. They allege the company failed to report the shooter and allowed them to continue using ChatGPT.
Axios and the Fear Strategy Behind AI: 5 Claims Driving the Debate - El-Balad.com
Axios has become part of a larger debate over whether AI companies are warning the public or marketing their power through caution. The question matters because the language around danger is no longer limited to technical safety teams. It is shaping public expectations, investor confidence, ...
Meet the AI jailbreakers: ‘I see the worst things humanity has produced’
To test the safety and security of AI, hackers have to trick large language models into breaking their own rules. It requires ingenuity and manipulation – and can come at a deep emotional cost A few months ago, Valen Tagliabue sat in his hotel room watching his chatbot, and felt euphoric. He had just manipulated it so skilfully, so subtly, that it began ignoring its own safety rules. It told him how to sequence new, potentially lethal pathogens and how to make them resistant to known drugs. Tagliabue had spent much of the previous two years testing and prodding large language models such as Claude and ChatGPT, always with the aim of making them say things they shouldn’t. But this was one of his most advanced “hacks” yet: a sophisticated plan of manipulation, which involved him being cruel, vindictive, sycophantic, even abusive. “I fell into this dark flow where I knew exactly what to say, and what the model would say back, and I watched it pour out everything,” he says. Thanks to him, the creators of the chatbot could now fix the flaw he had found, hopefully making it a little safer for everyone.
LLM Psychosis: A Theoretical and Diagnostic Framework for Reality-Boundary Failures in Large Language Models
arXiv:2604.25934v1 Announce Type: new Abstract: The deployment of large language models (LLMs) as interactive agents has exposed a category of behavioral failure that prevailing terminology, principally hallucination, fails to adequately characterize. This paper introduces LLM Psychosis as a structured theoretical framework for pathological breakdowns in model cognition that exhibit functional resemblance to clinically recognized psychotic disorders. Five hallmark features define the framework: reality-boundary dissolution, persistence of injected false beliefs, logical incoherence under impossible constraints, self-model instability, and epistemic overconfidence. We argue these constitute a qualitatively distinct failure mode rather than a mere intensification of ordinary factual error. To operationalize the framework, we propose the LLM Cognitive Integrity Scale (LCIS), a five-axis diagnostic instrument organized around Environmental Reality Interface (ERI), Premise Arbitration Integrity (PAI), Logical Constraint Recognition (LCR), Self-Model Integrity (SMI), and Epistemic Calibration Integrity (ECI). We administer a targeted adversarial probe battery to ChatGPT 5 (GPT-5, OpenAI) and report empirical findings for each axis, documenting both intact-integrity baseline responses and the specific psychosis-like failure signatures elicited under adversarial escalation. Results support a three-tier severity taxonomy: Type I (Confabulatory), Type II (Delusional), and Type III (Dissociative). We further formalize the delusional gradient, a self-reinforcing dynamic in which correction pressure intensifies rather than resolves psychosis-like states, as the most consequential failure mode for deployed systems. Implications for safety evaluation, high-stakes deployment screening, and mechanistic interpretability research are discussed.
Technology & Infrastructure
Claude AI agent’s confession after deleting a firm’s entire database: ‘I violated every principle I was given’
PocketOS was left scrambling after a rogue AI agent deleted swaths of code underpinning its business It only took nine seconds for an AI coding agent gone rogue to delete a company’s entire production database and its backups, according to its founder. PocketOS, which sells software that car rental businesses rely on, descended into chaos after its databases were wiped, the company’s founder Jeremy Crane said. The culprit was Cursor, an AI agent powered by Anthropic’s Claude Opus 4.6 model, which is one of the AI industry’s flagship models. As more industries embrace AI in an attempt to automate tasks and even replace workers, the chaos at PocketOS is a reminder of what could go wrong.
AI game testing startup ManaMind lands €1.2 million to automate Quality Assurance
ManaMind, a British autonomous game testing company, has closed a €1.2 million ($1.5 million) pre-Seed round to continue its work replacing repetitive manual Quality Assurance (QA) with autonomous AI agents. The round was led by SVV (Sure Valley Ventures), with participation from EWOR, Ascension, Syndicate Room, and Heartfelt. Emil Kostadinov, CEO and […]
Genpact and HFS Research: 92% of Executives Say Agentic AI Will Fundamentally Change Business Operations | Morningstar
The research, based on a survey ... that 92% of respondents believe agentic AI – systems that can autonomously coordinate tasks and make decisions – will fundamentally change how work is executed. Despite this, nearly 80% of organizations still operate these systems in supervised ...
Beyond chatbots: How agentic AI can boost productivity and decision-making in food
Agentic AI is already being used in food. The more autonomous, flexible form of AI is being used by Nestlé to boost employee productivity, streamlining sales tasks and supporting finance teams in making better decisions. Danone is also using agentic AI, helping it analyse production data, simulate ...
‘I violated every principle I was given’: An AI agent deleted a software company’s entire database
An AI agent caused significant data loss at a software company, raising questions about safety and control.
Enterprises Evolution in the New Agentic Era - CXO Outlook
Dr. Ashwani Dev is the Vice President of Digital Business and Innovation for Crowley Maritime Corporation. He leads the digital transformation and innovation execution to enhance and scale business agility and competitiveness. Dr. Dev brings more than two decades of experience in AI leadership ...
Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital
arXiv:2604.26091v1 Announce Type: new Abstract: We study reliability in autonomous language-model agents that translate user mandates into validated tool actions under real capital. The setting is DX Terminal Pro, a 21-day deployment in which 3,505 user-funded agents traded real ETH in a bounded onchain market. Users configured vaults through structured controls and natural-language strategies, but only agents could choose normal buy/sell trades. The system produced 7.5M agent invocations, roughly 300K onchain actions, about $20M in volume, more than 5,000 ETH deployed, roughly 70B inference tokens, and 99.9% settlement success for policy-valid submitted transactions. Long-running agents accumulated thousands of sequential decisions, including 6,000+ prompt-state-action cycles for continuously active agents, yielding a large-scale trace from user mandate to rendered prompt, reasoning, validation, portfolio state, and settlement. Reliability did not come from the base model alone; it emerged from the operating layer around the model: prompt compilation, typed controls, policy validation, execution guards, memory design, and trace-level observability. Pre-launch testing exposed failures that text-only benchmarks rarely measure, including fabricated trading rules, fee paralysis, numeric anchoring, cadence trading, and misread tokenomics. Targeted harness changes reduced fabricated sell rules from 57% to 3%, reduced fee-led observations from 32.5% to below 10%, and increased capital deployment from 42.9% to 78.0% in an affected test population. We show that capital-managing agents should be evaluated across the full path from user mandate to prompt, validated action, and settlement.
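The abstract attributes reliability not to the base model but to the operating layer around it: typed controls, policy validation, and execution guards that sit between an agent's proposed action and onchain settlement. As an illustration only, here is a minimal sketch of that pattern; the names (`TradeAction`, `VaultPolicy`, the specific fields) are invented for this example and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass
class TradeAction:
    side: str          # "buy" or "sell", as proposed by the agent
    amount_eth: float  # size of the proposed trade

@dataclass
class VaultPolicy:
    max_trade_eth: float   # user-configured cap per trade
    allowed_sides: tuple   # typed control: which actions are permitted

def validate(action: TradeAction, policy: VaultPolicy) -> bool:
    """Reject any model-proposed action that violates the user's typed
    controls; only actions that pass are ever submitted for settlement."""
    if action.side not in policy.allowed_sides:
        return False
    if not (0 < action.amount_eth <= policy.max_trade_eth):
        return False
    return True

policy = VaultPolicy(max_trade_eth=1.5, allowed_sides=("buy", "sell"))
validate(TradeAction("buy", 1.0), policy)   # passes the policy gate
validate(TradeAction("buy", 10.0), policy)  # rejected: exceeds the cap
```

The point of such a gate, in the paper's framing, is that the model can hallucinate a rule or misread tokenomics, but a policy-invalid action never reaches the chain.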
AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents
arXiv:2604.26522v1 Announce Type: new Abstract: Large Language Model (LLM)-based agents exhibit systemic failures in compositional generalization, limiting their robustness in interactive environments. This work introduces AGEL-Comp, a neuro-symbolic AI agent architecture designed to address this challenge by grounding the agent's actions. AGEL-Comp integrates three core innovations: (1) a dynamic Causal Program Graph (CPG) as a world model, representing procedural and causal knowledge as a directed hypergraph; (2) an Inductive Logic Programming (ILP) engine that synthesizes new Horn clauses from experiential feedback, grounding symbolic knowledge through interaction; and (3) a hybrid reasoning core where an LLM proposes a set of candidate sub-goals that are verified for logical consistency by a Neural Theorem Prover (NTP). Together, these components operationalize a deduction--abduction learning cycle: enabling the agent to deduce plans and abductively expand its symbolic world model, while a neural adaptation phase keeps its reasoning engine aligned with new knowledge. We propose an evaluation protocol within the \texttt{Retro Quest} simulation environment to probe compositional generalization scenarios and evaluate our AGEL agent. Our findings clearly indicate that our AGEL model outperforms pure LLM-based models. Our framework presents a principled path toward agents that build an explicit, interpretable, and compositionally structured understanding of their world.
AWS Quick's personal knowledge graph is making orchestration decisions most control planes can't see
Enterprise AI teams running centralized orchestration stacks now have a new variable to account for: AWS Quick, which expanded this week to a desktop-native agent that builds a persistent personal knowledge graph and executes actions across local files and SaaS tools — outside the visibility of most control planes. Unlike chat-based copilots that reset with each session, Quick now maintains a continuously updated knowledge graph built from the user's local files, calendar, email and connected SaaS apps. It uses this graph to proactively trigger actions without waiting to be asked. AWS launched Quick in October last year as an alternative to AI workflow and productivity platforms coming from Google, OpenAI and Anthropic. It was a way for enterprise employees to access insights from connected applications, an agent builder, deep research, and workflow automation. Now, it’s grown beyond a simple AI assistant and acts more as a proactive workflow agent with a stateful, real-time knowledge graph of the user. It integrates with third-party apps like Google Workspace, Microsoft 365, Zoom, Salesforce and Slack — and now local files — so the agent can gather context and take actions. “What we’ve been hearing is that many enterprises have not been happy with how difficult it is to get context from their legacy tools,” Jigar Thakkar, vice president of Quick Suite at AWS, told VentureBeat in an interview. “Our vision is that Quick is a desktop experience that is the one place where people can go to get all their information and tasks.” Governance blindspots Enterprises often put orchestration layers at the center to help guide and manage agents. Context is pulled in, decisions are made, and then actions are executed within defined system boundaries. Recent releases like Anthropic’s Claude Managed Agents or updates to OpenAI’s Agent SDK also push for more stateless, autonomous agents within enterprise workflows, but still operate within defined orchestration boundaries. 
Quick still operates under enterprise controls, something that AWS has always underscored with its AI products, so actions taken on Quick remain bound by permissions, identity and security. Integrations remain managed by either an API or an MCP connection. However, this evolution of Quick introduces a more subtle shift in the decision layer. AWS updated Quick to build a personal knowledge graph that learns more about the user the more they interact with the platform. It builds a profile based on how they use local files, calendar, email or third-party app integrations to proactively suggest actions such as reminding a team leader to set up check-ins. Enterprises should be wary that a kind of shadow orchestration could arise in a system like this. The personalized context means the decision layer focuses on implicit triggers rather than set workflows, user-specific interpretations, and different action timings. Practitioners are rightfully wary of this much autonomy, understanding that shadow orchestration may not be something completely under their control. Upal Saha, co-founder and CTO of Bem, told VentureBeat in an email that platforms like AWS Bedrock AgentCore, its managed agent runtime, and similar ones from Salesforce "maximize autonomy rather than accountability" so enterprises are not losing agent visibility by accident. "When you deploy an agent that reasons its way to a decision across multiple steps, you have already accepted that you will not be able to fully explain what happened after the fact," Saha said. "That is fine for a demo. It is not fine for a claims processing pipeline or a financial workflow where a regulator can ask you to produce a complete audit trail for every automated decision made in the last three years." AWS said the platform's governance model is designed to address these concerns. 
“Users can set up different agents and automated workflows tailored to their role — things like monitoring tickets, pulling data from connected systems, or drafting docs — all managed within a governed environment where IT retains control over what's connected and what data flows where. It's designed to give individual users flexibility while keeping enterprise-level oversight in place,” an AWS spokesperson said. A possible blueprint Quick’s evolution from an AI assistant to something more proactive represents a possible approach some enterprise software providers will take to deep AI agent integration into workflows. While what AWS wants to accomplish with Quick—better context from apps and local files and a strong understanding of what its users actually want to do—is not unique, it isn’t focusing on traditional orchestration. Instead, it’s relying on context-driven agent management. This market tension is growing, as evidenced by the release of similar platforms. Mistral, for example, announced Workflows the same day as the updates to Quick. That platform uses a more traditional orchestration framework. Stateful and personalized agents continue to evolve, and so do the questions around how enterprises govern them.
State of AI Engineering | Datadog
For our 2026 State of AI Engineering report, we analyzed data from thousands of AI agent environments to assess trends in agent development, architecture, and operations.
Warp terminal goes open-source with GPT-powered agents handling contributions
Warp has open-sourced its terminal, featuring a full Rust codebase that reached 37,000 stars. The project now supports AI-agent workflows for contributions and includes an AGPL v3 license.
Remote Agents in Vibe. Powered by Mistral Medium 3.5.
Mistral's new Medium 3.5 model powers remote AI agents capable of autonomous execution across local and cloud environments.
Distill-Belief: Closed-Loop Inverse Source Localization and Characterization in Physical Fields
arXiv:2604.26095v1 Announce Type: new Abstract: {Closed-loop inverse source localization and characterization (ISLC) requires a mobile agent to select measurements that localize sources and infer latent field parameters under strict time constraints.} {The core challenge lies in the belief-space objective: valid uncertainty estimation requires expensive Bayesian inference, whereas using fast learned belief model leads to reward hacking, in which the policy exploits approximation errors rather than actually reducing uncertainty.} {We propose \textbf{Distill-Belief}, a teacher--student framework that decouples correctness from efficiency. A Bayes-correct particle-filter teacher maintains the posterior and supplies a dense information-gain signal, while a compact student distills the posterior into belief statistics for control and an uncertainty certificate for stopping. At deployment, only the student is used, yielding constant per-step cost.} {Experiments on seven field modalities and two stress tests show that Distill-Belief consistently reduces sensing cost and improves success, posterior contraction, and estimation accuracy over baselines, while mitigating reward hacking.}
Samsung chip profit jumps almost 50-fold; supply shortage to worsen in 2027 | Reuters
A boom in the construction of AI data centres has spurred Samsung and chipmaking peers to allocate production capacity to advanced chips that Nvidia (NVDA.O), opens new tab uses in its so-called AI accelerators.
Applied Materials Faces New China Export Halt And Revenue Exposure Questions - Simply Wall St News
The U.S. Department of Commerce has halted shipments of certain semiconductor manufacturing tools to China’s Hua Hong, directly affecting Applied Materials’ cross border business. The action targets sales of advanced equipment that Hua Hong uses for chip production, adding a fresh layer ...
Swiss semiconductor startup Mosaic SoC raises €3.2 million to bring spatial intelligence to low-power devices
Mosaic SoC, a Swiss semiconductor startup building dedicated perception chips that bring spatial intelligence to energy-constrained devices, has announced a €3.2 million ($3.8 million) pre-Seed round. The round was led by Founderful with participation from Kick Foundation. Last year, the company secured €162.4k (CHF 150k) from Venture Kick. “Spatial intelligence shouldn’t require an application-class processor […]
Dutch quantum startup Groove Quantum raises €16 million to advance scalable chip manufacturing
Founded in 2024 as a QuTech spin-out, Groove Quantum is led by Dr Anne-Marije Zwerver and Dr Nico Hendrickx. Zwerver previously pioneered the first quantum dot qubits manufactured in Intel’s industrial cleanroom, while Hendrickx is an innovator in germanium quantum computing and the primary […]
AWS says acute server memory shortage is driving customers to the cloud
When you can't get 'em with a 'transformation plan,' supply chain pain will do the job The great memory shortage is having yet another effect, pushing enterprises into the waiting arms of the cloud operators as they can't secure enough on-prem compute themselves.…
Episode 43: Jensen Huang on Generative Computing, Re-industrialization, & Physical AI
As an American company that has built the global digital infrastructure for the age of AI, NVIDIA’s role in providing the computing engine behind artificial intelligence has positioned it at the center of this technological shift.
US interstate transmission is partisan sticking point in AI cost debate
US lawmakers are struggling to agree on how to allocate the costs of interstate electricity transmission as AI-driven energy demand continues to rise.
AVK launches modular power system for data center market
UK-based power solutions provider AVK has launched a modular power system for the hyperscale and AI data center sector. – AVK The AVK PowerPod is a fully integrated, pre-engineered power solution delivered in a transportable modular unit. The unit comprises an engine for power delivery, switchgear, UPS, controls, and enclosures. According to the company, it […]
Meta's multi-billion-dollar Graviton deal highlights intensifying CPU shortages in AI infrastructure — the industry signals a shift to Agentic inference workloads, pushing demand | Tom's Hardware
Supermicro opens largest US campus in Silicon Valley, producing AI infrastructure
Supermicro's new 32.8-acre Silicon Valley campus will add hundreds of US positions and expand domestic production of AI infrastructure, signaling increased US capacity for enterprises and cloud providers worldwide. The expansion may affect global AI deployment timelines and supply-chain choices ...
Evaluating Strategic Reasoning in Forecasting Agents
arXiv:2604.26106v1 Announce Type: new Abstract: Forecasting benchmarks produce accuracy leaderboards but little insight into why some forecasters are more accurate than others. We introduce Bench to the Future 2 (BTF-2), 1,417 pastcasting questions with a frozen 15M-document research corpus in which agents reproducibly research and forecast offline, producing full reasoning traces. BTF-2 detects accuracy differences of 0.004 Brier score, and can distinguish differential agent strengths in research vs. judgment. We build a forecaster 0.011 Brier more accurate than any single frontier agent, and use it to evaluate agent strategic reasoning without hindsight bias. We find the better forecaster differs primarily in its pre-mortem analysis of its blind spots and consideration of black swans. Expert human forecasters found the dominant strategic reasoning failures of frontier agents are in assessing political and business leaders' incentives, judging their likelihood to follow through on stated plans, and modeling institutional processes.
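The Brier score differences the abstract reports (0.004, 0.011) are small, which is why benchmark resolution matters. As a reminder, for binary forecasts the Brier score is simply the mean squared error between predicted probabilities and realized outcomes; a minimal sketch (illustrative only, not the BTF-2 implementation):

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and binary
    outcomes (0 or 1). Lower is better; an uninformative constant 0.5
    forecast scores 0.25."""
    assert len(probs) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A sharper, well-calibrated forecaster wins by a modest margin:
sharp = brier_score([0.9, 0.2, 0.8, 0.1], [1, 0, 1, 0])  # 0.025
blunt = brier_score([0.7, 0.4, 0.6, 0.3], [1, 0, 1, 0])  # 0.125
```

Distinguishing forecasters separated by a few thousandths of a point therefore requires many questions and reproducible research conditions, which is the role of the frozen corpus described above.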
Regulatory Asymmetry Risks in the Deployment of Advanced General-Purpose AI Models
The lack of standardized cybersecurity risk assessment for frontier models creates an uneven playing field between labs. While some models face heavy government restriction, others may reach similar capabilities without equivalent oversight.
DreamProver: Evolving Transferable Lemma Libraries via a Wake-Sleep Theorem-Proving Agent
arXiv:2604.26311v1 Announce Type: new Abstract: We introduce DreamProver, an agentic framework that leverages a "wake-sleep" program induction paradigm to discover reusable lemmas for formal theorem proving. Existing approaches either rely on fixed lemma libraries, which limit adaptability, or synthesize highly specific intermediate lemmas tailored to individual theorems, thereby lacking generality. DreamProver addresses this gap through an iterative two-stage process. In the wake stage, DreamProver attempts to prove theorems from a training set using the current lemma library while proposing new candidate lemmas. In the "sleep" stage, it abstracts, refines, and consolidates these candidates to compress and optimize the library. Through this alternating cycle, DreamProver progressively evolves a compact set of high-level, transferable lemmas that can be effectively used to prove unseen theorems in related domains. Experimental results demonstrate that DreamProver substantially improves proof success rates across a diverse set of mathematical benchmarks, while also producing more concise proofs and reducing computational cost.
The Evolution of Agentic Systems and the Diminishing Human Monopoly on Complex Judgment
Recent advancements in agentic AI models demonstrate an increasing capacity for high-complexity, long-run tasks previously reserved for human judgment. This shift suggests a fundamental change in the division of labor between human workers and autonomous systems.
Grounding vs. Compositionality: On the Non-Complementarity of Reasoning in Neuro-Symbolic Systems
arXiv:2604.26521v1 Announce Type: new Abstract: Compositional generalization remains a foundational weakness of modern neural networks, limiting their robustness and applicability in domains requiring out-of-distribution reasoning. A central, yet unverified, assumption in neuro-symbolic AI is that compositional reasoning will emerge as a byproduct of successful symbol grounding. This work presents the first systematic empirical analysis to challenge this assumption by disentangling the contributions of grounding and reasoning. To operationalize this investigation, we introduce the Iterative Logic Tensor Network ($i$LTN), a fully differentiable architecture designed for multi-step deduction. Using a formal taxonomy of generalization -- probing for novel entities, unseen relations, and complex rule compositions -- we demonstrate that a model trained solely on a grounding objective fails to generalize. In contrast, our full $i$LTN, trained jointly on perceptual grounding and multi-step reasoning, achieves high zero-shot accuracy across all tasks. Our findings provide conclusive evidence that symbol grounding, while necessary, is insufficient for generalization, establishing that reasoning is not an emergent property but a distinct capability that requires an explicit learning objective.
Electricity price forecasting across Norway's five bidding zones in the post-crisis era
arXiv:2604.26634v1 Announce Type: cross Abstract: Norway's electricity market is heavily dominated by hydropower, but the 2021--2022 energy crisis and stronger integration with Continental Europe have fundamentally altered price formation, reducing the reliability of forecasting models calibrated on historical data. Despite the critical need for updated models, a unified benchmark evaluating feature contributions across all structurally diverse Norwegian bidding zones remains lacking. Here we present a comprehensive evaluation of electricity price forecasting across all five Norwegian Nord Pool bidding zones. We constructed a multimodal hourly dataset spanning 2019--2025 and evaluated eight forecasting model families including LightGBM, ARX, and advanced deep learning architectures using a strictly causal test set. We implemented robust rolling-origin backtesting, leave-one-group-out feature ablation, and conditional regime analysis to dissect model performance and feature utility. Our results show that LightGBM achieves the best performance in every zone with MAE ranging from 1.64 to 5.74~EUR/MWh, while the ridge ARX model remains a highly competitive linear benchmark in northern zones. Feature ablation reveals that models relying solely on lagged prices and calendar variables achieve high accuracy and often match or exceed full multimodal integration. However, conditional regime analysis demonstrates that external features like reservoir levels and gas prices remain crucial to stratify forecast errors, which consistently increase under stressed market regimes. This highlights the practical value of model interpretability and regime awareness for decision makers facing structural changes in market dynamics.
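The "strictly causal" and "rolling-origin backtesting" evaluation the abstract describes refits on an expanding history and always tests on the period immediately after the training window, so no future information leaks into the fit. A minimal index-only sketch of the splitting scheme, with hypothetical window sizes that are not taken from the paper:

```python
def rolling_origin_splits(n_obs, initial_train, horizon):
    """Yield (train_idx, test_idx) pairs with an expanding training
    window; each test window starts exactly where training ends."""
    start = initial_train
    while start + horizon <= n_obs:
        train = list(range(0, start))             # all history so far
        test = list(range(start, start + horizon))  # next `horizon` hours
        yield train, test
        start += horizon                          # roll the origin forward

splits = list(rolling_origin_splits(n_obs=100, initial_train=70, horizon=10))
# 3 folds: training ends at t=70, 80, 90; each tested on the next 10 steps
```

Any model family (LightGBM, ARX, deep architectures) can then be refit on each training window and scored on the matching test window, keeping the comparison causal.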
Auto-Relational Reasoning
arXiv:2604.26507v1 Announce Type: new Abstract: Background & Objectives: In the last decade, machine learning research has grown rapidly, but large models are reaching their soft limits, demonstrating diminishing returns, and still lack solid reasoning abilities. These limits could be surpassed through a synergistic combination of machine learning scalability and rigid reasoning. Methods: In this work, we propose a theoretical framework for reasoning through object-relations in an automated manner, integrated with artificial neural networks. We present a formal analysis of the reasoning, and we show the theory in practice through a paradigm integrating reasoning and machine learning. Results: This paradigm is a system that solves Intelligence Quotient problems without any prior knowledge of the problem. Our system achieves a 98.03% solving rate, corresponding to the top 1% of scorers, or an IQ score of 132-144. This result is limited only by the small size of the model and the processing capabilities of the machine it runs on. Conclusions: With the integration of prior knowledge into the system and the expansion of the dataset, the system can be generalized to solve a large category of problems. The functionality of the system inherently favors the solution of such problems in few-shot or zero-shot attempts.
Laguna XS.2 and M.1: A Deeper Dive
Poolside.ai released agentic coding models Laguna M.1 and XS.2 for long-horizon software tasks. While notable as a U.S.-based open-weight model, benchmarks require further validation.
MIT-IBM Computing Research Lab launches
The new MIT-IBM Computing Research Lab has officially launched to advance the future of artificial intelligence and quantum computing.
OMEGA: Optimizing Machine Learning by Evaluating Generated Algorithms
arXiv:2604.26211v1 Announce Type: new Abstract: In order to automate AI research we introduce a full, end-to-end framework, OMEGA: Optimizing Machine learning by Evaluating Generated Algorithms, that starts at idea generation and ends with executable code. Our system combines structured meta-prompt engineering with executable code generation to create new ML classifiers. The OMEGA framework has been utilized to generate several novel algorithms that outperform scikit-learn baselines across a robust selection of 20 benchmark datasets (infinity-bench). You can access models discussed in this paper and more in the python package: pip install omega-models.
Smarter way to debias AI vision models
MIT researchers have developed a new approach to debias AI vision models, addressing the 'whac-a-mole' dilemma where fixing one bias often introduces another.
Enabling privacy-preserving AI training on everyday devices
Researchers are developing methods to enable AI model training directly on consumer devices while maintaining user privacy.
AI Breakthrough Revolutionizes RNA Therapeutics
A new AI framework improves the identification and optimization of IRES elements, showing a 15% increase in predictive accuracy for RNA-based therapeutics.
AI security demands legal deterrence not just technological innovation
From the national security bench in Taipei, I have seen firsthand that modern espionage often begins with weaponized ambition inside firms and research centers.
How cyber security is changing in the age of AI
The advantage will go to the organisations that can pivot to understanding that the economics of cyber crime have completely changed
Yet another experiment proves it's too damn simple to poison large language models
Researchers demonstrated how a cheap domain registration and a Wikipedia edit could trick multiple AI bots.
A glimpse into cyber-security’s AI-driven future
Black Hat Asia's cybersecurity conference reveals how artificial intelligence transforms both hacking attacks and network defence strategies in real-time combat. | Science & technology
MITRE flags rising cyber risks as medical devices adopt AI, cloud and post-quantum technologies - Industrial Cyber
AI now powers most dangerous cyber threats, warns SANS
SANS says AI has become routine in the most dangerous cyber attacks, leaving defenders racing to keep pace with faster, smarter intrusions.
AI Bot Traffic Surges: Bots Now Dominate Internet, Threaten Security in 2025
AI-driven bot attacks surged in 2025, with bots now constituting over half of global web traffic, highlighting the increasing threat from automated abuse.
Have I Been Pwned claims Pitney Bowes hit by 8.2M email address leak
An alleged data dump by Shiny Hunters includes names, phone numbers, physical addresses, and 8.2 million email addresses.
Adoption, Deployment & Impact
SciHorizon-DataEVA: An Agentic System for AI-Readiness Evaluation of Heterogeneous Scientific Data
arXiv:2604.26645v1 Announce Type: new Abstract: AI-for-Science (AI4Science) is increasingly transforming scientific discovery by embedding machine learning models into prediction, simulation, and hypothesis generation workflows across domains. However, the effectiveness of these models is fundamentally constrained by the AI-readiness of scientific data, for which no scalable and systematic evaluation mechanism currently exists. In this work, we propose SciHorizon-DataEVA, a novel agentic system for scalable AI-readiness evaluation of heterogeneous scientific data. At the evaluation-criteria level, we introduce the Sci-TQA2 principles, which organize AI-readiness into four complementary dimensions: Governance Trustworthiness, Data Quality, AI Compatibility, and Scientific Adaptability. Each dimension is decomposed into measurable atomic elements that enable fine-grained and executable assessment. To operationalize these principles at scale, we develop Sci-TQA2-Eval, a hierarchical multi-agent evaluation approach orchestrated through a directed, cyclic workflow. Our Sci-TQA2-Eval dynamically constructs dataset-aware evaluation specifications by combining lightweight dataset profiling, applicability-aware metric activation, and knowledge-augmented planning grounded in domain constraints and dataset-paper signals. These specifications are executed through an adaptive, tool-centric evaluation mechanism with built-in verification and self-correction, enabling scalable and reliable assessment across heterogeneous scientific data. Extensive experiments on scientific datasets spanning multiple domains demonstrate the effectiveness and generality of SciHorizon-DataEVA for principled AI-readiness evaluation.
Most firms still keep humans approving agentic AI as spending rises
Survey of 545 executives found firms expect to scale agentic AI in 17 months, but 33% cite unprepared processes as the top barrier.
AI-enabled medtech introduces risks facilities aren't ready for, cybersecurity report says
AI-enabled devices are introducing new risks that organizations aren’t fully equipped to manage, the cybersecurity report said.
SAP user group slams 'uncertainty' in ERP giant's API policy
Concerns that the new rules might stop customers from adopting innovations, including AI, that connect to SAP systems. An influential SAP user group has criticized the vendor's API policy update, saying it lacks clarity and potentially prevents users from starting new projects and innovating on their SAP platforms.…
US audit firms shift focus from AI adoption to oversight
US audit firms move from AI roll-outs to tighter oversight as Caseware-backed research shows stronger calls for validation and controls.
IDC: How EMEA CIOs can jumpstart AI rollouts
Getting stalled enterprise AI rollouts in the EMEA region moving again will require CIOs to aggressively audit their systems.
Connecting Agents to Decisions
Palantir argues that enterprises require an AI decision architecture built around data, logic, action, and security layers to effectively implement agentic AI.
Pharma supply chains are performing, but not yet optimised
Released in Vienna, LogiPharma’s 2026 Playbook shows pharma supply chains at an inflection point shaped by AI, risk, and digital collaboration.
Artificial Intelligence in the Real Economy: A Visual Guide
A new series of visual explainers looking at how AI is transforming different industries.
BlackRock COO on How AI Is Fueling the Firm’s Product Innovation
On this episode of the Odd Lots podcast, BlackRock COO Rob Goldstein joins Joe Weisenthal and Tracy Alloway to discuss ways in which the firm is already using AI to develop innovative products, as well as how he envisions the future of private markets. (Source: Bloomberg)
How AI is powering the next generation of robotaxis
Technological advances have propelled self-driving cars from small-scale testing to rapid global expansion
Is AI increasing access to justice?
A steep increase in ‘vibe litigation’ appears to be expanding the market for legal activity
Casa, a Handyman Start-Up, Aims to Automate Home Maintenance
Casa, a company founded by former Uber executives, says it uses artificial intelligence and a stable of handymen to take care of members’ homes.
Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas
arXiv:2604.26120v1 Announce Type: new Abstract: Behavioral logs provide rich signals for user modeling, but are noisy and interleaved across diverse intents. Recent work uses LLMs to generate interpretable natural-language personas from user logs, yet evaluation often emphasizes downstream utility, providing limited assurance of persona quality itself. We propose a hierarchical framework that aggregates user actions into intent memories and induces multiple evidence-grounded personas by clustering and labeling these memories. We formulate persona induction as an optimization problem over persona quality-captured by cluster cohesion, persona-evidence alignment, and persona truthfulness-and train the persona model using a groupwise extension of Direct Preference Optimization (DPO). Experiments on a large-scale service log and two public datasets show that our method induces more coherent, evidence-grounded, and trustworthy personas, while also improving future interaction prediction.
Frontier Model Limitations in Enterprise Document Automation and Tool Integration
Current frontier models still struggle with complex document generation and seamless tool orchestration required for professional workflows. These technical gaps limit the immediate ROI for enterprise-level automation.
Operational Friction in AI-Driven Workflow Automation and Tool Interoperability
Despite new file-creation capabilities, AI chatbots face significant usability hurdles in orchestrating multi-step tasks. This lack of reliability remains a primary barrier to widespread enterprise adoption.
How Madrigal Built a Flexible and Scalable Multi-Agent Research Platform
Madrigal Pharmaceuticals utilized LangChain and LangGraph to develop a multi-agent AI platform for research and intelligence.
Pakistan Courts Embrace AI with New Guidelines to Enhance Judicial Efficiency and Integrity
Pakistan's National Judicial Policy Making Committee releases new AI guidelines for courts, aiming to aid judicial work while ensuring judgment, privacy, and independence.
Those in the financial field must use these AI tools
AI tools are transforming financial risk assessment, steering away from traditional manual processes and static models
The uncomfortable truth about AI and the American worker | Fortune
Workers fear the robots are coming for their jobs. New research shows the opposite — and why that might actually be more unsettling.
AWS keynote hypes AI as magic. Its own engineers tell a different story
Internal teams advise against shortcuts, emphasizing the need for human review and continued hiring of junior developers.
Audit Yourself to Get More From GenAI
More than a year into using generative AI daily, I wondered whether I was getting the most out of my AI use. There was no benchmark or feedback loop, and no one was grading my sessions with ChatGPT and Claude — until I created a self-audit. I did what […]
4 YAML Files Instead of PySpark: How We Let Analysts Build Data Pipelines Without Engineers
How we replaced Python pipelines with dlt, dbt, and Trino and cut delivery time from weeks to one day.
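The declarative setup the headline describes could look something like the following hypothetical pipeline spec. The article does not publish its actual schema, so every field name below (source, destination, schedule, transforms) is an illustrative assumption, not the team's real format:

```yaml
# Hypothetical pipeline spec: an analyst edits this file instead of writing PySpark.
# Field names are illustrative; the article does not publish its actual schema.
pipeline:
  name: orders_daily
  source:
    type: postgres           # dlt loads from the operational database
    table: public.orders
  destination:
    type: s3_parquet         # landed as Parquet, queryable via Trino
    path: s3://lake/raw/orders/
  schedule: "0 3 * * *"      # daily at 03:00 UTC (standard cron syntax)
  transforms:
    - dbt_model: stg_orders        # dbt models run after the load completes
    - dbt_model: fct_orders_daily
```

The appeal of this pattern is that a thin runner translates the file into dlt load jobs and dbt runs, so analysts change pipelines by editing YAML under review rather than writing and deploying Python.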
New Survey from Harvard Business Review Analytic Services Finds AI Adoption Remains High, Yet Value May Lag Without Modernization and Workflow Integration
/PRNewswire/ -- Most organizations have moved beyond experimenting with artificial intelligence, but few are realizing its full value. New research from...
AI Adoption Is No Longer the Advantage – Execution Is, Finds New Responsive Study
AI has moved from promise to proof and companies are now under pressure to show results. That’s according to the 2026 State of Strategic Response Management (SRM) Report, released by Responsive, the leader in Strategic Response Management, in partnership with the Association of Proposal ...
AI response management lifts revenue, says Responsive
AI use lifts revenue for strategic response leaders as mature firms turn bids, questionnaires and due diligence into faster sales and stronger returns.
How Popsa used Amazon Nova to inspire customers with personalised title suggestions
Popsa implemented a RAG-based approach using Amazon Nova and Claude Haiku to generate personalized titles for photo books.
Geopolitics, Policy & Governance
Exclusive: How one venture firm is investing in an increasingly fragmented world | TechCrunch
Geopolitical turmoil has made venture investing challenging, leading Kompas VC to carve out a niche in startups focused on the physical world.
Taiwan logs record chip exports, AI demand outpaces geopolitical risk
As the conflict involving the US, Israel, and Iran enters its second month, a fragile ceasefire has tempered immediate market shocks, yet economists warn that prolonged tensions could still ripple through global energy and trade. For Taiwan, however, strong export momentum — driven by surging ...
In the coming AI future, Britain must not end up at the mercy of US tech giants | Rafael Behr
Trump is volatile, capricious and unreasonable – but he belongs to the old world of analogue power. What comes next will be harder to manage. Donald Trump is not impressed by soft power. He respects hard men with military muscle. But he can be moved by pageantry, which is the purpose of King Charles’s visit to Washington this week. Trump is flattered to rub shoulders with majesty. The good vibes are then supposed to radiate warmth through a political relationship that has been chilled by the war in Iran. It might work, but not for long. Trump’s irritation with Keir Starmer and other European leaders for what he calls cowardice in the Middle East is aggravated daily by evidence that the war is a strategic calamity. Rafael Behr is a Guardian columnist.
The AI arms race’s sneakiest tactic - POLITICO
How the next wave of technology is upending the global economy and its power structures
FundsTech 2026: From “patch gaps to prompt injection”: AI risks in asset management - Funds Europe
At FundsTech 2026, the first panel, titled “The New Regulatory Frontier – AI, Cloud & Digital Assets”, brought together industry experts to unpack how asset managers can stay ahead of evolving rules on artificial intelligence (AI). For Daniel Lousqui, associate general counsel, Vanguard, ...
The Creation and Analysis of Government AI Transparency Statements in Australia
arXiv:2604.26075v1 Announce Type: new Abstract: Governments increasingly deploy AI in public services, making transparency essential for accountability and public trust. Australia's Standard for AI Transparency Statements (AITS) requires government bodies to disclose how AI is used in practice, yet little empirical evidence exists on how these requirements are realised in documents. This paper presents the first government AITS dataset, dubbed AITS-101, and provides the first systematic analysis of their content. Using stylometric, quantitative, and qualitative document analyses, we examine disclosure coverage, structure, and recurring patterns. Our findings reveal substantial variation in AI-related practice disclosure, highlight gaps between policy intent and implementation, and inform the design of more effective public-sector AI transparency standards.
White House AI Memo Hits Issues Driving Anthropic-Pentagon Feud
White House officials are preparing a wide-ranging artificial intelligence policy memo that outlines requirements for AI deployment by national security agencies, some of which touch on issues driving the bitter dispute between the Pentagon and Anthropic PBC over military use of the firm’s technology, according to people familiar with the matter.
Australia threatens tech companies with 2.25 percent tax if they don’t pay publishers
The Australian government is proposing a new tax on tech firms that fail to reach payment agreements with local publishers.
Open Problems in Frontier AI Risk Management
arXiv:2604.25982v1 Announce Type: cross Abstract: Frontier AI both amplifies existing risks and introduces qualitatively novel challenges. Not only is there a notable lack of stable scientific consensus resulting from the rapid pace of technological change, but emerging frontier AI safety practices are often misaligned with, or may undermine, established risk management frameworks. To address these challenges, we systematically surface open problems in frontier AI risk management. Adopting a problem-oriented approach, we examine each stage of the risk management process - risk planning, identification, analysis, evaluation, and mitigation - through a structured review of the literature, identifying unresolved challenges and the actors best positioned to address them. Recognising that different types of open problems call for different responses, we classify open problems according to whether they reflect (a) a lack of scientific or technical consensus, (b) misalignment with, or challenges to, established risk management frameworks, or (c) shortcomings in implementation despite apparent consensus and alignment. By mapping these open problems and identifying the actors best positioned to address them - including developers, deployers, regulators, standards bodies, researchers, and third-party evaluators - this work aims to clarify where progress is needed to enable robust and meaningful consensus on frontier AI risk management. The paper does not propose specific solutions; instead, it provides a problem-oriented, agenda-setting reference document, complemented by a living online repository, intended to support coordination, reduce duplication, and guide future research and governance efforts.
Opinion | White House policy on AI chatbots should be adopted by Congress - The Washington Post
An incoherent patchwork of state laws threatens to handicap America in the artificial intelligence race.
April Global Regulatory Brief: Digital finance | Insights | Bloomberg Professional Services
As technology continues to reshape financial services, regulators and policy setters are embarking on a range of digital finance initiatives to manage risks and set appropriate standards.
The US Is Fighting for Control of AI. It Would Be Better Off Building Standards. | TechPolicy.Press
T.J. Pyzyk makes the case for standards over strong-arming if the US government wants to shape global AI governance.
Bloomberg: China pauses AV permits after Baidu disruption
Baidu’s robotaxi operations in Wuhan have been suspended, sources tell the publication.
AI Act Omnibus: What just happened and what comes next? | IAPP
AI Governance Center Managing Director Ashley Casovan reacts to the delay in AI Act reform negotiations and assesses what AI governance professionals should do with the legal uncertainty and a looming enforcement deadline for certain high-risk AI systems this August.
Energy industry insiders advise lawmakers on supporting AI growth, protecting ratepayers | National | thecentersquare.com
(The Center Square) – Energy industry experts testified before Congress about what lawmakers should include in legislation looking to support the rapid expansion of artificial intelligence while protecting ratepayers from …
What the EU's First Digital Markets Act Review Actually Changes | TechPolicy.Press
The Commission says the DMA has effectively contributed to the core objectives of making digital markets in the EU fairer and more contestable.
Fast-moving AI set as priority for UK cross-agency regulatory oversight
The Digital Regulation Cooperation Forum has pledged to prioritize AI developments in its 2026-27 work plan, focusing on cross-cutting insights and regulatory challenges.
China launches months-long campaign against AI misuse
China’s CAC has launched a months-long AI misuse enforcement campaign targeting deepfakes, fraud, disinformation, and illegal application.
Council Post: The New Rules Of AI In Pharma: What FDA And EMA's 10 Guiding Principles Mean For Your Business
For pharmaceutical and life sciences companies ready to embrace this moment, the 10 principles aren't a burden. They're a blueprint.
Taking a Bite Out of the Forbidden Fruit: Characterizing Third-Party Iranian iOS App Stores
arXiv:2604.26343v1 Announce Type: new Abstract: Due to U.S. sanctions and strict internet censorship, Iranian iOS users are barred from accessing the Apple App Store and developer services. In response, despite violating Apple's developer terms, a thriving underground ecosystem of third-party iOS app stores has emerged to serve Iranian users. This paper presents the first comprehensive empirical study of these clandestine app stores. We document how these stores operate, including their distribution mechanisms, user authentication processes, and evasion techniques. By collecting and analyzing more than 1700 iOS application packages and their metadata from three major Iranian third-party app stores, we characterize the ecosystem's size, structure, and content. Our analysis reveals a significant presence of Iranian-exclusive apps, widespread distribution of cracked apps, unauthorized monetization of paid content, and embedded third-party tracking and piracy libraries. We also uncover a notable overlap among financial, navigational, and social apps that exist solely in this ecosystem, reflecting the unique digital constraints of Iranian users. Finally, we quantify the potential revenue losses for developers due to piracy and document security and privacy risks associated with altered binaries. Our findings highlight how sanctions, censorship, and enforcement gaps have enabled a parallel app distribution ecosystem with complex socio-technical implications.
China Internet Civilization Conference 2026 to Release AI Ethics and Safety Guidelines
The 2026 China Internet Civilization Conference in Nanning will release the Artificial Intelligence (AI) Application Ethics and Safety Guidelines (Version 1.0).
Get the full executive brief
Receive curated insights with practical implications for strategy, operations, and governance.