Mon 8 June 2026
Daily Brief — Curated and contextualised by Best Practice AI
AI Agents Reshape Work, PepsiCo Delivers Autonomously, and Barclays Sounds Alarm
TL;DR AI agents are transforming knowledge work by shifting focus from execution to verification. PepsiCo has deployed 41 driverless trucks in the US, marking a significant step in logistics automation. Barclays warns that the AI-driven stock rally is entering a 'warning zone' amid inflation and interest rate concerns. Meanwhile, AI companies are aggressively expanding into each other's markets, seeking new revenue streams.
The stories that matter most
Selected and contextualised by the Best Practice AI team
How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope
arXiv:2606.07489v1 Announce Type: cross Abstract: Frontier AI systems are bridging the gap between intelligence and utility by shifting from conversational assistants to autonomous agents that execute tasks end to end. Using production data from Perplexity's Search and Computer products, we study this transition by examining how AI agents accelerate and reshape knowledge work. Three key empirical findings emerge. First, using sessions with near-identical initial query pairs as natural experiments for the same underlying task attempted with both products, Computer performs 26 minutes of autonomous work per user session, versus 33 seconds for Search. Computer automates task decomposition and execution that Search users might otherwise manually orchestrate and implement. As a result, Computer shifts follow-up query distribution toward higher-order work such as verification and extension. Autonomy also increases execution quality, with per-query dissatisfaction rates 55% lower on Computer than on Search. Second, due to its autonomy advantage, Computer reduces completion time from 269 to 36 minutes on matched tasks, lowering estimated time and cost by 87% and 94%, respectively, compared to humans equipped with Search alone. Third, Computer changes the scope of work that users attempt: Computer queries more often cross occupational boundaries, require higher-order cognition, draw on broader expertise, take the form of composite tasks that bundle interdependent subtasks into a single query, and unlock work activities that are essentially absent from Search usage among the same users. Together, the evidence indicates that AI agents accelerate workflows, enhance output quality, reduce costs, and expand the breadth and depth of automated work.
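A quick arithmetic check of the headline figures quoted above, as a minimal sketch. Only the time numbers (269 and 36 minutes, 26 minutes and 33 seconds) come from the abstract; the helper name `percent_reduction` is hypothetical, and the 94% cost figure cannot be re-derived from the abstract alone.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Completion time on matched tasks, in minutes, as quoted in the abstract.
time_saved = percent_reduction(269, 36)

# Autonomous work per session: 26 minutes (Computer) vs 33 seconds (Search).
autonomy_ratio = (26 * 60) / 33

print(f"time reduction: {time_saved:.1f}%")
print(f"autonomous work ratio: {autonomy_ratio:.1f}x")
```

The time reduction rounds to the 87% reported, and the autonomy gap works out to roughly 47 times more autonomous work per session.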
Generative Models Erode Human Temporal Learning Through Market Selection
arXiv:2606.06572v1 Announce Type: cross Abstract: We argue that modern generative models create structural risks for knowledge and cultural production at current, sub-AGI capability levels. We define Human Temporal Learning (HTL) as path-dependent knowledge accumulation through sustained engagement with problems over time. Generative outputs increasingly resemble HTL-intensive work in surface features, so verifying whether a given output reflects genuine human learning grows costly relative to its expected benefit. Once verification loses economic justification, evaluators reward outputs regardless of production mode, and producers who invested years of learning compete on price against outputs that cost almost nothing to generate. We call this pathway value collapse and formalize it through a costly-inspection framework. Cross-domain evidence from academic publishing, legal practice, content platforms, and software security maps onto four stages of verification erosion. Alignment success is orthogonal. Better-aligned models narrow observable gaps between human and AI outputs, making source verification harder and intensifying competitive pressure against HTL-intensive work even when individual AI outputs improve.
Driverless Trucks Are Here—and They’re Delivering Bags of Doritos
PepsiCo has 41 trucks on the road in Arizona, Texas and Arkansas, bringing the technology into the mainstream.
Measuring Agents in Production
arXiv:2512.04123v4 Announce Type: replace Abstract: LLM-based agents already operate in production across many industries, yet we lack an understanding of what technical methods make deployments successful. We present the first systematic study of Measuring Agents in Production, MAP, using first-hand data from agent developers. We conducted 20 case studies via in-depth interviews and surveyed 86 deployed systems practitioners across 26 domains. We investigate why organizations build agents, how they build them, how they evaluate them, and their top development challenges. Our study finds that production agents are built using simple, controllable approaches: 68% execute at most 10 steps before human intervention, 70% rely on prompting off-the-shelf models instead of weight tuning, and 74% depend primarily on human evaluation. Reliability (consistent correct behavior over time) remains the top development challenge, which practitioners currently address through systems-level design. MAP documents the current state of production agents, providing the research community with visibility into deployment realities and underexplored research avenues.
Are we at the start of a new investment super-cycle?
AI, clean energy and defence spending are reinforcing each other, amplifying potential spend
The AI Trade Just Entered 'Warning Zone,' According To Barclays - Benzinga
Barclays says the AI-fueled stock rally has entered a "warning zone" as inflation and rate concerns persist.
AI Companies Are Rapidly Expanding Into Each Other's Markets - Business Insider
Companies are ruthlessly invading each other's turf. Ever-increasing valuations mean companies need to find new sources of revenue.
Is this the dawn of the Tokenpocalypse?
An analysis of the current state of AI token usage and potential future constraints.
Economics & Markets
Nvidia CEO Says Tech Stock Selloff Is a Buying Opportunity | The Pulse 6/8/2026
"The Pulse With Francine Lacqua" is all about conversations with high profile guests in the beating heart of global business, economics, finance and politics. Based in London, we go wherever the story is, bringing you exclusive interviews and market-moving scoops. Today's guests: Lindsay Rosner, Goldman Sachs Asset Management Head of Multi-Sector Investing; Euan Blair, Multiverse CEO & Founder; Kanishka Narayan, UK AI & Online Safety Minister (Source: Bloomberg)
‘We didn’t see this coming’: Wall Street eats its forecasts as stocks sell off globally on fear of AI bubble ahead of SpaceX IPO
Everything you need to know before you reach the office this morning.
₹43,000 Crore Withdrawn by FPIs as AI Trends Shift Investment Focus, ETGovernment
Foreign Portfolio Investors (FPIs) withdrew nearly ₹43,000 crore from Indian equities in the first week of June, driven by a global shift towards AI investments and a weakening rupee. Total outflows have reached ₹2.67 lakh crore in 2026, surpassing last year's figures.
Higher rates, AI boom and structural shifts shaping H2 outlook - Funds Europe
Asset manager Carmignac expects the second half of 2026 to be defined by persistently high long-term interest rates, AI-driven investment and a more fragmented global economic landscape.
The Wall Street Journal’s inaugural Best Companies for the Future ranking is dominated by tech firms
This new ranking also showcases tech giants Alphabet, Microsoft, Meta and Cisco, propelled by high scores for innovation, financial strength and AI readiness.
Saving Siri: After two years of stumbles, is Apple's AI moment here? | Reuters
Siri debuted in 2011 and is accessible through the bulk of Apple's installed base of 2.5 billion devices, but hundreds of millions of consumers have been chatting with apps from OpenAI and Anthropic instead. In China and elsewhere, consumers are turning to AI agents - bots that can carry out complex tasks on behalf of human users - to manage daily schedules and take care of rote tasks.
Apple Expected to Detail Its A.I. Plans at Conference
For the second time, the company is expected to explain its artificial intelligence plans. Unlike some rivals, it is not reorganizing around the technology.
WhatsApp comes under Turkish scrutiny over AI chatbots
Meta Platforms is facing an investigation in Turkey, where the national antitrust watchdog is examining whether the integration of Meta AI into WhatsApp creates illegal impacts on rival providers.
🔮 The AI boom is becoming an entrepreneurship boom #577
Plus, Russia's Bryan Johnson, camping chips, dreaming AI++
AI company PhysicsX raises $300m in Series C funding round
PhysicsX has its headquarters in London, with an additional office in New York, and a presence in California's Bay Area and Singapore.
Dublin-based AI IP start-up Midnight Labs backed by Sony
Expansion in Japan will enable Midnight to operate in a country that is 'uniquely vulnerable to AI-generated copyright infringement' due to 'sophisticated digital piracy syndicates' operating at 'unprecedented scale'.
Ghent’s Companion.energy raises €7.8 million to help large enterprises manage energy in real time
Companion.energy, a Ghent-based startup helping large industrial and commercial enterprises manage energy in real time, has closed a €7.8 million Seed round to deepen the platform’s multi-asset, multi-market optimisation capabilities, and expand commercially across Europe, starting with Germany and Spain. The round was led by Realyze Ventures (Germany) and Pi Labs (United Kingdom), with participation […]
Labor, Society & Culture
US Tech Workers Are Paying the Price of the AI Boom
The information sector is dominant in profit expansion — and one of the worst places to find work.
Walmart tells workers that AI will improve their jobs, not steal them
Retailer’s embrace of artificial intelligence comes amid anxiety that the technology will create mass redundancies
HSBC CEO Elhedery on GCC, Job Market & Banking Amid AI
HSBC's Georges Elhedery said the GCC has emerged from recent turmoil. He also told Francine Lacqua that ‘human judgment’ is vital as AI gathers pace. (Source: Bloomberg)
AI and the reinvention of work: Why some sectors are hiring more, not less - Digital Journal
Even though 70% of cybersecurity tasks can be automated, the sector is projected to grow 30% by 2030 thanks to AI.
Department of Work and Pensions' answer to AI job fears is a bot to polish your CV
Whitehall says Work Assistant will help jobseekers apply around the clock – provided employers don't mind machine-written applications
An AI-Driven Socio-Technical Framework for Performance Management in Teleworking Environments
To further interpret these dynamics, ... how AI-enabled performance systems may reconfigure power relations within organizations. In particular, algorithmic performance systems can increase visibility and standardization in evaluation processes, which may reduce managerial discretion and shift interpretive authority toward algorithmic outputs. As a result, changes introduced ...
Opinion | Why human intelligence still matters in the age of AI - The Washington Post
RLDX-1, a dexterity-first AI model for robot hands, powers a robot demonstrating packing and sorting at the Computex exhibition in Taipei on June 3.
The Geography of Algorithmic Judgment: LLM Intermediaries, Place Identity, and Racial Steering in Housing Search
arXiv:2606.06694v1 Announce Type: cross Abstract: Large language models (LLMs) are rapidly assuming an intermediary role in housing search through the integration of listing platforms within conversational interfaces, mediating access to information, search, and recommendations within urban settings. We expand on prior work on racial steering in LLMs by conducting a behavioral audit of seven open-weight and closed-source LLMs across four U.S. cities, testing location recommendations across three iterative prompting conditions that progressively add lifestyle preference context and reflect fair housing paired-testing methodologies. We find that steering is an emergent behavior of the model's interpretive license rather than primarily a static property. Steering results from the interaction of a user's identity, preference articulation, and the spatial logic that a model has internalized about learned representations of place, preference, and opportunity in a given city, and how different types of users relate to it. While steering was present, it was not uniform in direction or magnitude across evaluated conditions. Preference-conditioned testing often increased or reconfigured the number of models that exhibited steering behaviors relative to baseline conditions, suggesting that LLMs may interpret what the same housing preference means differently depending on the racial identity of the user. Our findings also demonstrate that the city is not a neutral testing unit for LLM evaluation in place-based sectors, and results from one local market cannot be assumed to generalize to another. Local and domain expertise will be required in the housing sector to ensure that legal and institutional commitments to fair housing are not undermined while adopting AI tools that mediate spatial access.
Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation
arXiv:2606.06514v1 Announce Type: new Abstract: Machine learning systems deployed in high-stakes socioeconomic settings routinely display bias. We formalize bias as a symmetry-breaking operation: a classifier is fair if its outputs remain invariant under the counterfactual operation of switching a sensitive attribute, with merit features held fixed. We implement loss-based regularization as a symmetry-restoring mechanism and evaluate the framework on four synthetic datasets with varying levels of noise, correlation, and bias. The framework achieves upwards of 90% violation reduction, with accuracy costs around 5%. This framework does not require causal graph knowledge, is computationally lightweight, and generalizes to any sensitive attribute definable as a bit-flip, making it suitable for contexts where local sources of discrimination remain absent from mainstream benchmarks.
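The fairness-as-symmetry idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy logistic classifier, the squared-difference penalty, and the names `score`, `symmetry_penalty` and `violation_rate` are all assumptions made for illustration. The sketch flips the sensitive bit with the merit features held fixed, then measures how much the output moves.

```python
import math

def score(weights, bias, merit, sensitive):
    """Logistic score of a toy linear classifier; the last weight multiplies the sensitive bit."""
    z = bias + sum(w * x for w, x in zip(weights[:-1], merit)) + weights[-1] * sensitive
    return 1.0 / (1.0 + math.exp(-z))

def symmetry_penalty(weights, bias, rows):
    """Mean squared change in score when the sensitive bit is flipped, merit held fixed."""
    diffs = [(score(weights, bias, m, s) - score(weights, bias, m, 1 - s)) ** 2 for m, s in rows]
    return sum(diffs) / len(diffs)

def violation_rate(weights, bias, rows, threshold=0.5):
    """Fraction of rows whose thresholded decision changes under the bit-flip."""
    flips = sum(
        (score(weights, bias, m, s) >= threshold) != (score(weights, bias, m, 1 - s) >= threshold)
        for m, s in rows
    )
    return flips / len(rows)

# Hypothetical rows: (merit features, sensitive bit).
rows = [((0.2, 0.1), 0), ((0.9, 0.4), 1), ((-0.3, 0.5), 0), ((0.1, -0.2), 1)]
biased = [1.0, 1.0, 2.0]     # non-zero weight on the sensitive bit breaks the symmetry
symmetric = [1.0, 1.0, 0.0]  # zero weight on the sensitive bit restores it

print(violation_rate(biased, -1.0, rows), symmetry_penalty(biased, -1.0, rows))
print(violation_rate(symmetric, -1.0, rows), symmetry_penalty(symmetric, -1.0, rows))
```

In a training loop, a penalty of this kind would typically be added to the task loss with a tunable weight, so that the optimiser trades a little accuracy for invariance. The specific regulariser the authors use is not stated in the abstract.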
Brit fraudsters using AI to doctor 'evidence' in motor insurance claims
Policy-holders increasingly turn fender benders into much more by sprinkling in their favorite AI chatbots, Aviva says
‘A driver of political violence’: how the breakneck AI boom is fueling anti-tech extremism
Backlash against AI is taking an extremist turn, following in the footsteps of earlier techno-pessimist militants. When a 20-year-old man from Texas was arrested earlier this year for allegedly trying to burn down OpenAI’s headquarters and Sam Altman’s house, authorities found an anti-AI manifesto alongside his lighter and a jug of kerosene. It was one of a spate of attacks that has caused alarm among researchers, the tech industry and law enforcement about the rise of anti-tech extremism. In April, an Italian “nature pilled” Instagram influencer was arrested in Rome and charged with plotting a series of anti-tech attacks that took inspiration from Ted “The Unabomber” Kaczynski. Two self-described “ecofascists” who carried out a deadly anti-Muslim attack on a mosque in San Diego last month also cited “AI slop” and JD Vance’s ties to Palantir as motivations for their violence in their manifesto. An Indianapolis city councilor woke up earlier this year to gunshots being fired into his home before finding a note that read “NO DATA CENTERS”.
How AI reshapes criminal investigations for South Korean prosecutors
The growing use of AI-generated and manipulated evidence has prompted South Korea's prosecution service to develop forensic tools to detect deepfakes, voice clones, and other digital forgeries.
Beyond Tool Adoption: A Practical Five-Stage Developmental Continuum for AI Literacy in Higher Education
arXiv:2606.00038v4 Announce Type: replace Abstract: Artificial intelligence (AI) literacy is increasingly recognized as a foundational competency for all university graduates. Yet students' engagement with AI tools often clusters at two extremes: avoidance driven by fear, mistrust, ethical concern, or lack of access, and uncritical reliance that produces fluent output while masking misunderstanding. Existing AI literacy frameworks provide valuable competency definitions, but most offer limited guidance for diagnosing where learners begin and how they progress toward responsible, critical engagement. This paper proposes a five-stage AI Literacy Continuum: 0) Not Yet Engaged, 1) Uncritical Use, 2) Informed Use, 3) Critical Evaluation, and 4) Improvement, which describes developmental orientations toward AI use in higher education. The continuum complements dimensional frameworks by providing educators with a practical diagnostic and instructional pathway aligned with international frameworks, including UNESCO and OECD. We present a design-based implementation case from North Carolina State University, where credit-bearing courses and intensive hands-on workshops engaged more than 330 participants between Fall 2024 and Spring 2026. Because the implementation did not use a validated pre/post instrument or comparison group, we frame the findings as observational and practice-based: participants exhibited behaviors consistent with movement from non-engagement or uncritical use toward informed engagement, while sustained and discipline-embedded experiences produced stronger evidence of critical evaluation and improvement-oriented practice. We discuss curricular pathways, opportunity considerations, assessment strategies, and argue that AI literacy should be understood not as tool adoption alone but as a developmental capacity to understand, evaluate, and responsibly apply AI systems in disciplinary and societal contexts.
The AI PM Career Path in 2026 — What to Learn, What's Obsolete, and Where the Jobs Are - FourWeekMBA
The AI PM career path in 2026: what's obsolete (PRDs, stakeholder management, quarterly roadmaps), what to learn (frontier prototyping, overhang bets), where the jobs are.
Technology & Infrastructure
Sorry, I’m Not Available. Talk to the A.I. Version of Me.
The hot new productivity hack for C.E.O.s and Harvard professors? A.I. twins that answer questions and attend meetings.
Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory
arXiv:2606.06523v1 Announce Type: new Abstract: Equipping Large Language Models (LLMs) to execute reliable multi-step workflows has become a central challenge in artificial intelligence. Despite recent advances in LLMs' agentic capabilities, most agent systems still lack formal methods for specifying, verifying, and debugging their workflow and execution trajectories. This challenge mirrors a long-standing problem in mathematics, where the ambiguity of natural languages (NLs) motivates the development of formal languages (FLs). Inspired by this paradigm, we propose Lean4Agent, to the best of our knowledge the first framework that uses Lean4, a dependent-type FL, to model and verify agent behavior. Lean4Agent launches FormalAgentLib, an extensible Lean4 library for formally modeling and verifying agent workflows' semantic consistency under explicit assumptions, and enabling localization of execution-time failures revealed by trajectories. Building on FormalAgentLib, we further develop LeanEvolve, which applies results in FormalAgentLib to revise workflows to enhance their capability. Extensive experiments on a hard problem subset of SWE-Bench-Verified and a subset of ELAIP-Bench across 5 leading LLMs indicate that the verification-passing workflows outperform the failing ones by an average of 11.94%, and LeanEvolve further improves SWE performance by 7.47% on average. Furthermore, Lean4Agent establishes a foundation for a new field of using expressive dependent-type FL to formally model and verify agent behavior.
AEGIS: A Backup Reflex for Physical AI
arXiv:2606.06660v1 Announce Type: new Abstract: Long-horizon robot manipulation tends to fail gradually: one bad step degrades the state, and the policy spirals into a basin from which it cannot recover. The failure is often visible before it happens. We introduce AEGIS (Activation-probe Early-warning, Gated Inference Switching), a selective escalation method that uses a lightweight probe on a weak policy's frozen activations to detect high-risk steps while there is still time to act. When the probe flags a step, control switches to a stronger separate policy, but only for the steps that need it. On LIBERO-Spatial, AEGIS recovers 10.1% of the trajectories the weak policy alone loses, versus 4.6% for budget-matched blind escalation and 5.1% for a random-trigger placebo. These gains are significant under one-sided exact paired McNemar tests with Holm-Bonferroni adjustment over three pre-registered contrasts: +5.4pp over blind escalation, p=8.5e-6; +5.0pp over random triggering, p=1.0e-4; paired-trajectory bootstrap CIs exclude zero. AEGIS activates the stronger policy on only 38% of steps, so the lever is timing rather than compute. The probe clears its precondition with an early-window AUROC of 0.764, 95% CI [0.70, 0.84], read from the weak-policy path over the first 30% of trajectory steps before any handoff. We pre-register the full analysis plan, including a conditional recovered-task-rate estimand and explicit kill criteria, and confirm the result on 700 common-random-number episodes per arm, with nA-fail=646.
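The selective-escalation idea described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the AEGIS implementation: the linear probe with a sigmoid output, the 0.5 threshold, the two-dimensional activations, and the names `probe_risk` and `gate_steps` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

def probe_risk(activation: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Hypothetical linear probe with a sigmoid output, read from the weak policy's activations."""
    z = bias + sum(w * a for w, a in zip(weights, activation))
    return 1.0 / (1.0 + math.exp(-z))

def gate_steps(
    activations: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
    threshold: float,
) -> Tuple[List[str], float]:
    """Choose a policy per step: escalate to the strong policy only when the probe flags risk."""
    choices = [
        "strong" if probe_risk(a, weights, bias) >= threshold else "weak"
        for a in activations
    ]
    return choices, choices.count("strong") / len(choices)

# Hypothetical per-step activations for one trajectory.
trajectory = [(0.1, 0.0), (0.2, 0.1), (1.5, 1.2), (2.0, 1.8), (0.3, 0.2)]
choices, escalated = gate_steps(trajectory, weights=(1.0, 1.0), bias=-2.0, threshold=0.5)
print(choices, escalated)
```

The point of the design, as the abstract frames it, is timing: the stronger policy runs only on flagged steps, so the escalation fraction stays well below 100% of steps.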
Sustainability by Design in Decentralized Autonomous Organizations: An Empirical Review of Governance, Innovation, and Institutional Design
arXiv:2606.05667v1 Announce Type: cross Abstract: Recent innovation theories on economics remain largely grounded in assumptions of hierarchical firms and closed organizational boundaries, offering limited insight into how innovation unfolds within decentralized, digitally native organizations. Decentralized Autonomous Organizations (DAOs) represent an emerging form of innovation ecosystem characterized by blockchain-based transparency, open participation, and token-driven governance, in which sustainability can be embedded directly into organizational design. This study compares two standards, ERC-8004 and Google A2A, which address the same agent interoperability question; the former is governed by a DAO and the latter by a corporate consortium. They are examined through an LLM-powered comparative pipeline for large-scale governance discourse analysis, integrating automated annotation, neural topic modeling, and multi-layer network analysis to study socio-technical power structures. The study provides evidence-based insights for scholars, policymakers, and designers seeking to align innovation, technological governance, and sustainability in future organizational forms.
OpenSkill: Open-World Self-Evolution for LLM Agents
arXiv:2606.06741v1 Announce Type: new Abstract: Self-evolving agents require adaptation after deployment, but existing approaches assume a usable learning loop, such as curated skills, successful trajectories, or verifier signals. Real open-world deployments may provide none of these, offering only a task prompt. In this work, we study open-world self-evolution, where an agent must build both its skills and its own verification signals from scratch, using open-world resources but no target-task supervision. We propose OpenSkill, a framework that bootstraps this loop: it acquires grounded knowledge and verification anchors from documentation, repositories, and the web, synthesizes them into transferable skills, and refines those skills against self-built virtual tasks grounded in the anchors rather than in target answers. The open world thus supplies both the knowledge to be learned and a supervision-independent practice environment, with target-task supervision reserved for final evaluation. Across three benchmarks and two target agents, OpenSkill attains the best automated pass rate while satisfying the no-supervision constraint. Analysis shows its skills transfer across models without model-specific adaptation, and its self-built verifier aligns with ground-truth outcomes despite never accessing them.
Majority of US’s new AI datacenters to be built on drought-hit land
Guardian analysis finds facilities to be built in some of the driest areas as outcry grows over water needed to power AI. A record-shattering drought has racked much of the US. But the artificial intelligence industry is pushing ahead regardless, with the majority of planned datacenters set to be built in drought-ridden locations, a Guardian analysis has found. About two-thirds of upcoming datacenters, which typically require a large amount of water to operate, are set to be built in places that have been among the driest in the country over the past year.
Brookfield Head of AI Infrastructure on Bloomberg TV
Brookfield Head of AI Infrastructure and Head of Europe Sikander Rashid joins Bloomberg's Tom Mackenzie to kick off London Tech Week as part of Bloomberg Brief. (Source: Bloomberg)
Nebius signs 22MW capacity agreement with Kao Data in the UK
European AI cloud company Nebius has signed on to lease 22MW of capacity from Kao Data’s Harlow data center campus in the UK. The company has also pledged a £1.7 billion ($2.26bn) investment in the UK, through which it will establish three new Nvidia infrastructure deployments. The three new sites […]
Inside India's Rs 10,372-crore sovereign AI race: 2 infrastructure stocks powering the 38,000-GPU boom - Stock Insights News | The Financial Express
A ₹10,372 crore AI mission. Over 38,000 GPUs already deployed. As India races to build its own AI ecosystem, a handful of companies are quietly building the infrastructure powering this transformation.
COMPUTEX 2026 Exposes AI Infrastructure’s Real Constraints - HostingJournalist.com
From optical modules to chips: China's tech supply chains sustain global AI growth - Xinhua
Exclusive: The semiconductor battle behind AI data centers and EVs
Beneath the rapid expansion of electric vehicles and artificial intelligence infrastructure, a quieter battle is unfolding in the semiconductor supply chain.
Accelerated Fourier SAT (AFSAT): Fully Realising a GPU-based Symmetric Pseudo-Boolean SAT Solver
arXiv:2606.06641v1 Announce Type: new Abstract: We present Accelerated Fourier SAT (AFSAT), a GPU-accelerated solver for pseudo-Boolean satisfiability based on continuous local search (CLS). AFSAT realises the proof-of-concept approach, FastFourierSAT, into a fully-engineered solver supporting any heterogeneous mixture of symmetric constraint types and lengths within a single problem instance. Using the JAX compiler, AFSAT leverages pure function composition, automatic vectorisation, automatic differentiation, and just-in-time (JIT) compilation to perform massively parallel CLS across batches of candidate assignments. We demonstrate substantially improved numerical stability, runtime performance, and memory efficiency over the proof-of-concept. We achieve this by way of identifying and addressing various limitations that arise from memory latency and floating-point representation, as well as leveraging automatic parallelisation and compact representations. The inherent representational and stability limitations of floating point are partially addressed by a tailored discrete Fourier transform implementation. We achieve near-linear throughput when scaling to multiple accelerators via JAX array sharding.
A Study of Parallel Continuous Local Search
arXiv:2606.06656v1 Announce Type: new Abstract: We study parallel Continuous Local Search (CLS) as a solution approach for Boolean satisfiability problems with symmetric pseudo-Boolean (PB) constraints. Here, the $n$-variable PB-satisfiability problem is relaxed to a continuous optimisation problem with a differentiable objective function on an $n$-dimensional hypercube. For satisfiable instances, the global minimisers of this optimisation problem correspond to satisfying assignments of the SAT problem at hand. We present several novel findings via empirical experiments: (i) redundant constraints can inhibit rather than accelerate convergence; (ii) CLS shows promise as a sub-solver in hybridised settings, quickly completing partial assignments; and (iii) local search rapidly converges to a stable distribution of solution quality (i.e., degree of satisfaction), due to saddle-dense objectives where additional solver steps yield diminishing returns. Our findings inform practical uses of CLS for SAT on modern accelerator hardware.
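The relaxation described in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: it uses plain CNF clauses (an "at least one literal true" constraint, one simple symmetric case), a multilinear "probability the clause is falsified" objective rather than the authors' objective, and sequential random restarts in place of a parallel batch on an accelerator. The function names are hypothetical.

```python
import random

def literal_value(x, lit):
    """Relaxed truth value of a DIMACS-style literal: x_i, or 1 - x_i if negated."""
    v = x[abs(lit) - 1]
    return v if lit > 0 else 1.0 - v

def objective_and_grad(clauses, x):
    """Sum over clauses of prod(1 - literal value), and its gradient w.r.t. x."""
    total = 0.0
    grad = [0.0] * len(x)
    for clause in clauses:
        falsity = [1.0 - literal_value(x, lit) for lit in clause]
        prod = 1.0
        for f in falsity:
            prod *= f
        total += prod
        for k, lit in enumerate(clause):
            others = 1.0
            for j, f in enumerate(falsity):
                if j != k:
                    others *= f
            # A positive literal contributes (1 - x_i), a negated one contributes x_i.
            grad[abs(lit) - 1] += -others if lit > 0 else others
    return total, grad

def satisfies(clauses, bits):
    """True if the 0/1 assignment makes at least one literal true in every clause."""
    return all(any((bits[abs(l) - 1] == 1) == (l > 0) for l in clause)
               for clause in clauses)

def cls_solve(clauses, n, restarts=16, steps=500, lr=0.25, seed=0):
    """Projected gradient descent on [0, 1]^n from several random starting points."""
    rng = random.Random(seed)
    for _ in range(restarts):
        x = [rng.random() for _ in range(n)]
        for _ in range(steps):
            bits = [1 if v >= 0.5 else 0 for v in x]
            if satisfies(clauses, bits):
                return bits
            _, grad = objective_and_grad(clauses, x)
            # Gradient step, then clip back onto the hypercube.
            x = [min(1.0, max(0.0, v - lr * g)) for v, g in zip(x, grad)]
    return None  # no satisfying assignment found within the budget
```

In this toy objective a value of zero means every clause has a literal fixed to true, so rounding yields a satisfying assignment. A `None` result only means the search budget ran out, not that the instance is unsatisfiable.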
Our systems editor flew all the way to Taiwan and still couldn't get away from AI
Every show now is an AI show, and that included this year's Computex
Benchmarking LLMs for Community Governance Simulation with Life-history Narratives
arXiv:2605.23783v2 Announce Type: replace Abstract: Effective community governance hinges on understanding what specific residents think and need. Recent work has used large language models (LLMs) to simulate human respondents, offering a scalable, reproducible way to study human attitudes and behaviors at low cost. However, these studies typically prompt the model with just a few demographic variables (age, gender, income), simulating only general role types. This is insufficient for community governance, where decisions depend on the views of specific residents. We bridge this gap with an integrated research framework covering dataset, benchmark, algorithm, and system. The dataset comprises approximately 1.2 million characters of first-person narrative collected through two-hour semi-structured interviews with each of 92 residents in an urban community, organized around nine community-governance domains. The benchmark probes 18 mainstream LLMs across four prompting strategies and shows that adding rich life-history profiles meaningfully raises fidelity above the no-profile baseline, but this gain comes with more input tokens per call from the longer prompts they require. The algorithm, curriculum-LoRA, is a parameter-efficient personalization framework that, by closing this fidelity-cost gap, matches the strongest baseline's fidelity at roughly 10x lower per-call cost and Pareto-dominates every configuration tested. The system integrates curriculum-LoRA into a closed-loop policy-evaluation pipeline. Together, these results bring individual-level LLM-based resident simulation within reach of resource-constrained local administrations, enabling community-governance decisions to be systematically pre-evaluated in silico before real-world deployment.
DiBS: Diffusion-Informed Branch Selection
arXiv:2606.06518v1 Announce Type: new Abstract: Sudoku is a representative constraint satisfaction problem that requires global structural reasoning under strict discrete constraints. Existing work on solving Sudoku follows two dominant approaches: traditional heuristic solvers and deep learning solvers. These suffer from two complementary limitations: learning-based solvers lack hard correctness guarantees, while complete symbolic solvers remain prone to long-tail search. To address these shortcomings, we propose DiBS, a diffusion-model-guided approach to branch selection during search. Specifically, DiBS keeps the symbolic solver complete and uses the diffusion model as a branch-ordering guide. The core method ranks candidate values given the current partial assignment and a lightweight consistency signal. We also provide a theoretical analysis of how and why the method works. Experiments on the challenging Royle 17-clue Sudoku benchmark show that DiBS substantially reduces search cost relative to strong heuristic baselines, especially in nodes, backtracks, and long-tail percentiles. These results confirm that learned global guidance is effective on hard instances where branch-order mistakes are most expensive. All code is available at https://github.com/shanxierdan/DiBS.
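The idea of keeping a symbolic solver complete while letting a learned model only reorder branches can be sketched as follows. This is an illustration under stated assumptions, not the DiBS implementation: it uses a 4x4 Sudoku instead of the 17-clue 9x9 benchmark, and the `scorer` argument is a hypothetical stand-in for the diffusion model's ranking of candidate values.

```python
def legal_values(grid, r, c):
    """Values 1..4 not yet used in the row, column, or 2x2 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(4)}
    br, bc = 2 * (r // 2), 2 * (c // 2)
    used |= {grid[i][j] for i in range(br, br + 2) for j in range(bc, bc + 2)}
    return [v for v in range(1, 5) if v not in used]

def solve(grid, scorer, stats):
    """Complete backtracking search; scorer only decides the order of branches."""
    for r in range(4):
        for c in range(4):
            if grid[r][c] == 0:
                stats["nodes"] += 1
                candidates = legal_values(grid, r, c)
                # Hypothetical guide: higher score means "try this value first".
                candidates.sort(key=lambda v: scorer(grid, r, c, v), reverse=True)
                for v in candidates:
                    grid[r][c] = v
                    if solve(grid, scorer, stats):
                        return True
                    grid[r][c] = 0  # undo and try the next candidate
                return False  # every legal value failed, so backtrack
    return True  # no empty cell left

def is_valid_solution(grid):
    """Every row, column, and 2x2 box contains exactly the values 1..4."""
    full = {1, 2, 3, 4}
    rows = all(set(row) == full for row in grid)
    cols = all({grid[r][c] for r in range(4)} == full for c in range(4))
    boxes = all({grid[r][c] for r in range(br, br + 2) for c in range(bc, bc + 2)} == full
                for br in (0, 2) for bc in (0, 2))
    return rows and cols and boxes
```

Because every legal candidate is still tried eventually, a poor scorer can only increase the node count. It cannot make the solver miss a solution, which is the correctness property the abstract attributes to keeping the symbolic solver complete.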
mmPISA-bench: Do LLMs Reason Equally Well Across 43 Languages?
arXiv:2606.07069v1 Announce Type: cross Abstract: We introduce mmPISA-bench, a compact high-quality multilingual reasoning benchmark derived from the OECD Programme for International Student Assessment (PISA). The benchmark consists of 25 multiple-choice questions that require reasoning in order to be answered correctly. Each question is provided in official human translations to 43 languages and complemented with machine-translated counterparts (i.e., 2,150 data points in total). We evaluate two mainstream proprietary LLMs across languages, reasoning effort levels, and translation types in terms of their ability to answer the questions correctly. Our results show that modern LLMs can reason effectively across all evaluated languages and achieve accuracy comparable to human test-takers, with some performance variation across the covered languages. We further find that machine-translated questions do not degrade accuracy relative to official human translations, which suggests that high-quality machine translation (synthetic data) might often be adequate for large-scale multilingual reasoning evaluations where official translations are not available. Finally, we analyze token usage and related inference cost and find that LLM usage in some languages is simultaneously more expensive and less accurate.
A Geometric Account of Activation Steering through Angle-Norm Decomposition
arXiv:2606.06735v1 Announce Type: new Abstract: Linear activation steering has gained popularity as a simple and empirically effective way to control language model behavior. More recently, spherical steering paradigms have been proposed to address limitations of additive interventions, often motivated by the assumption that hidden-state norm does not carry concept-relevant information. In this work, we revisit this assumption through a controlled empirical study designed to disentangle the roles of angular and radial components. We show that steering methods differ mainly in how they couple two geometric effects: changing a token's angular alignment with a concept direction and changing its hidden-state norm. Across seven language models, we find that concepts are represented primarily in angular structure, supporting the motivation for spherical methods, but that norm remains important for the stability and downstream effects of steering. Our results explain why interventions with similar concept-level effects can behave differently, and suggest that activation steering should be parameterized by interpretable angular and radial components of the intervention, rather than by a single additive coefficient that entangles these two effects.
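A small numerical sketch can make the angle-norm distinction concrete. It is only an illustration: the two-dimensional vectors, the function names, and the `polar_steer` parameterization are assumptions, not the paper's method. It shows that additive steering changes both the angle to the concept direction and the norm, whereas a rotation within the plane spanned by the hidden state and the concept direction can change the angle while the norm is set separately.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def angle_and_norm(h, v):
    """Return (angle in radians between h and v, Euclidean norm of h)."""
    r = norm(h)
    cos = dot(h, v) / (r * norm(v))
    return math.acos(max(-1.0, min(1.0, cos))), r

def additive_steer(h, v, alpha):
    """Classic additive intervention: h + alpha * v."""
    return [x + alpha * y for x, y in zip(h, v)]

def polar_steer(h, v, delta_angle, norm_scale):
    """Rotate h toward v by delta_angle within span{h, v}, then rescale its norm."""
    v_len = norm(v)
    v_hat = [y / v_len for y in v]
    a = dot(h, v_hat)                        # component of h along v
    perp = [x - a * y for x, y in zip(h, v_hat)]
    b = norm(perp)                           # component of h orthogonal to v
    if b == 0.0:                             # h is parallel to v: only rescale
        return [norm_scale * x for x in h]
    u_hat = [p / b for p in perp]
    r = math.hypot(a, b)
    theta = max(0.0, math.atan2(b, a) - delta_angle)  # do not rotate past v
    return [norm_scale * r * (math.cos(theta) * y + math.sin(theta) * u)
            for y, u in zip(v_hat, u_hat)]
```

For example, with `h = [3.0, 4.0]` and `v = [1.0, 0.0]`, `additive_steer(h, v, 2.0)` gives `[5.0, 4.0]`, which is both closer in angle to `v` and longer than `h`. `polar_steer(h, v, 0.2, 1.0)` reduces the angle by 0.2 radians and leaves the norm at 5.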
Inside a Brain-Chip Startup in China
Brain-computer interfaces could dramatically alter how humans interact with technology. Startups around the world are racing to get their devices approved for commercial use. China may be in the lead. (Source: Bloomberg)
Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics
arXiv:2606.06533v1 Announce Type: new Abstract: What would it mean to have a scientific understanding of AI? Models are not static objects: they are snapshots of time-evolving processes shaped by data, objectives, architectures, and optimization dynamics. Yet much of AI research treats models as fixed artifacts, analyzing behaviors after training rather than asking why they emerge. This position paper argues that a science of AI must move beyond post-hoc fixes and study the training dynamics that produce model behavior. Such a science should support progressively stronger forms of understanding: predicting outcomes from early training signals, intervening when trajectories go wrong, and ultimately designing training procedures that more reliably produce desired properties. Scaling laws have made prediction routine for loss; the challenge is extending this success to capabilities, biases, robustness, and safety-relevant behaviors. We articulate requirements for such theories grounded in the history and philosophy of science, examine progress in mechanistic interpretability, fairness, memorization, and simplicity bias, and identify concrete open problems.
CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions
arXiv:2606.06526v1 Announce Type: new Abstract: Large language models have made substantial progress on mathematical reasoning, but existing benchmarks typically evaluate well-specified problems with final answers, step-by-step solutions, or complete proofs. They do not capture collaborative open-problem solving: a setting in which participants propose partial arguments, identify gaps or errors in prior steps, repair flawed reasoning, and gradually synthesize incremental contributions into a proof. We introduce CrowdMath, a dataset of 164 expert-annotated progress chains from the MIT PRIMES--Art of Problem Solving (AoPS) CrowdMath program (2016-2025), a collaborative research initiative whose discussions have led to peer-reviewed publications. Each chain traces a multi-participant forum discussion from an open-problem statement to a completed proof. Posts are labeled by their functional roles in the evolving solution process, including partial progress, proof completion, erroneous reasoning, and error identification. We define evaluation tasks and benchmark six frontier models. Models achieve 83-88% accuracy on next-post prediction, suggesting that they can follow the local flow of mathematical discussion. However, they struggle to identify the functional significance of individual contributions with the best model achieving only 0.42 macro-F1 on post-role classification. CrowdMath exposes a gap between solving well-specified mathematical problems and understanding collaborative mathematical progress as it unfolds.
Growing number of AI hallucinations that are appearing in academic papers
Concerns are rising as AI-generated hallucinations increasingly infiltrate academic research and publications.
Attack Selection in Agentic AI Control Evaluations Meaningfully Decreases Safety
arXiv:2606.06529v1 Announce Type: new Abstract: An attacker that strategically chooses when to attack is much harder to catch than one that attacks indiscriminately. AI control is a safety framework for deploying capable but untrusted AI agents under the oversight of a weaker, trusted monitor and a limited human audit budget. Control evaluations stress-test these protocols by pitting a red-team attack policy against the blue-team monitor, but current evaluations typically assume attackers that do not strategically select when to attack. We study this capability, attack selection, in agentic settings by decomposing attack decisions into a start policy, which decides when an attacker should attack, and a stop policy, which decides when an attacker should abort an ongoing attack. Across two agentic settings, BashArena and LinuxArena, both policies substantially lower measured empirical safety without changing the underlying attack capability. At a 1% audit budget, our start policy reduces safety by 20pp on both BashArena and LinuxArena, and our stop policy reduces safety by 20pp on BashArena and 28pp on LinuxArena. These reductions should be interpreted as upper bounds on the effect of attack selection. Existing control evaluations may therefore yield overly optimistic safety estimates against selective attackers. We recommend that future evaluations, system cards, and safety cases elicit attack selection to produce more realistic safety estimates.
Why Lightspeed and Wiz’s Assaf Rappaport bet $37 million on an AI-powered cyberattacker
After frontier AI models exposed thousands of previously unknown vulnerabilities, startup A is building autonomous defenses designed for an era of AI-native attackers.
SafeGene: Reusable Adapters for Transferable Safety Alignment
arXiv:2606.06519v1 Announce Type: new Abstract: Open-weight LLMs are increasingly fine-tuned into customized assistants, but downstream fine-tuning can weaken safety alignment and make models more vulnerable to malicious prompts, even when the training data is not intentionally harmful. This creates a recurring safety recovery problem as target models are repeatedly updated with new task data or user interactions. We propose SafeGene, a reusable safety-adapter module designed for cross-task reuse within each architecture-compatible model family. Rather than treating safety recovery as a model-specific repair step, SafeGene treats safety capability as an independent, reusable adapter representation decoupled from task-specific updates. This representation is obtained from aligned--degraded model discrepancies, refined into task-transferable safety vectors through data-aware layer selection, and expressed in each downstream task-adapted model via few-shot layer-wise coefficient recalibration. Experiments across multiple model families, downstream tasks, and safety judges show that SafeGene-enhanced models reduce harmful response rates while maintaining downstream performance, outperforming representative safe adaptation methods in safety--utility trade-off.
New AI espionage powers trigger Putin camera scare
Russia paused surveillance system after killing of Iran’s Supreme Leader exposed how AI can be used on CCTV data to target enemies
What Your Posts Reveal: A Benchmark and Agentic Framework for User-Level Privacy Leakage on Social Media
arXiv:2606.06784v1 Announce Type: cross Abstract: Public social media posts can reveal private information through weak cues scattered across text, images, or metadata. Such leakage is often cumulative and cross-post: cues that appear harmless in isolation may jointly expose a user's home, workplace, or routine. However, current research lacks a unified benchmark for user-level multimodal privacy leakage and an evaluation metric that captures exposure severity beyond binary accuracy. To address these gaps, we propose SopriBench, a synthetic benchmark guided by leakage patterns abstracted from a private reference corpus of Rednote and Instagram accounts, covering 50 user profiles and 1,569 images with attributes, contextual sensitivity, granularity, leakage type, inference difficulty, and supporting evidence. We further introduce the Privacy Exposure Score (PES), which weights value granularity by contextual sensitivity. Inspired by abductive reasoning, we introduce Argus, a training-free agentic framework for cumulative leakage inference. Argus forms hypotheses from accumulated evidence, verifies supporting evidence, and aggregates cross-post cues into privacy profiles, achieving 0.55 PES, a 25% improvement over the strongest baseline, with the largest gain on cross-post leakage.
OpenAI unveils Lockdown Mode to protect sensitive data from prompt injection attacks
OpenAI has introduced a new 'Lockdown Mode' designed to harden its models against prompt injection and data leakage vulnerabilities.
Adoption, Deployment & Impact
Driverless Trucks Are Here—and They’re Delivering Bags of Doritos
PepsiCo has 41 trucks on the road in Arizona, Texas and Arkansas, bringing the technology into the mainstream.
Apple’s Revamped AI Is a Long-Term Play in a Down PC, Mobile Phone Market
A revived Siri unlikely to spur iPhone, Mac sales in the near future
Why Your Next Diagnosis May Be Guided by an A.I. Helper
OpenEvidence, a fast-growing start-up, is using artificial intelligence to help doctors find answers to clinical questions for diagnosis and treatment.
Uber Opens London Waitlist for Wayve Robotaxis Ahead of Launch
Uber Technologies Inc. has set up a waitlist for customers interested in riding in a robotaxi in London, ahead of a planned commercial debut in the UK this year.
CARVE-Q: Quantum-Proposed, Classically Certified Interactive Driving Repair
arXiv:2606.06531v1 Announce Type: new Abstract: The critical question after a correct driving veto is not only whether a maneuver is unsafe, but whether the blocked interaction admits a lawful, auditable, and responsibility-bounded repair. Prediction and game-theoretic planners can suggest plausible cooperation, yet they do not return a proof that the repair respects hard rules, right-of-way, cost allocation, and ego fallback. We introduce CARVE, Certified Affordable Repair of Vetoed maneuvers via Envelopes, a certificate architecture for prediction-free interactive repair. Given a vetoed maneuver, CARVE constructs a finite repair lattice and emits a structured certificate recording the binding rule, selected joint repair, right-of-way-scaled cooperation envelope, responsibility-weighted cost split, and ego-only fallback. This certificate view reveals the algorithmic bottleneck: multi-owner repair induces a product lattice $M = \prod_j |\mathcal{A}_j|$. We therefore introduce CARVE-Q, a verifier-shielded quantum-AI search layer that applies quantum minimum finding only to this black-box lattice while leaving all safety authority classical. In the conservative verifier-oracle model, exact classical minimum finding requires $\Theta(M)$ queries in the worst case, whereas Durr-Hoyer/Grover minimum finding uses $O(\sqrt{M})$ oracle queries with high probability. We prove verifier-shielded certificate soundness, priority non-elicitation, black-box query separation, and finite-precision reversible-oracle constructibility. We then demonstrate state-vector minimum finding on CARVE repair oracles up to 65,536 assignments and validate certificate preservation on Lanelet2-grounded INTERACTION replay with 100% right-of-way respect, 100% blame consistency, and zero priority false positives. The result is a trust-bounded quantum-AI pattern for certified autonomy: quantum proposes; CARVE certifies.
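The query-count gap quoted in this abstract can be checked with a few lines of arithmetic. The sketch below only computes the lattice size $M = \prod_j |\mathcal{A}_j|$ and its square root. The choice of eight owners with four options each is a hypothetical configuration picked to reproduce the 65,536-assignment figure, and the constant factors hidden by the $O(\sqrt{M})$ bound are ignored.

```python
import math

def lattice_size(option_counts):
    """Size of the product lattice: the product of each owner's option count."""
    return math.prod(option_counts)

def query_scaling(option_counts):
    """Worst-case classical query count M next to the sqrt(M) quantum scaling."""
    m = lattice_size(option_counts)
    return {
        "M": m,
        "classical_worst_case": m,   # Theta(M): every assignment may need checking
        "sqrt_M": math.isqrt(m),     # integer floor of sqrt(M), constants omitted
    }
```

Under these assumptions, `query_scaling([4] * 8)` reports a lattice of 65,536 assignments with a square root of 256, which is the scale of the separation the abstract describes.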
‘Poisoned’ AI: the ChatGPT shopping scams that lead to fake websites
Buyers are ripped off after assuming online stores were genuine because they are recommended by an AI tool. You want to buy a new bag, so you ask ChatGPT for help. You have always liked Russell & Bromley, so you ask ChatGPT what is popular there at the moment. The artificial intelligence (AI) assistant gives you cross body, shoulder, casual and formal options with the prices listed beside them. You click through from the sources to what looks like the official Russell & Bromley site and buy your new bag, which is conveniently on sale.
Evidence-Based Intelligent Diagnostic and Therapeutic Visualization System with Large Language Models: Multi-Turn Interaction and Multimodal Treatment Plan Generation
arXiv:2606.06869v1 Announce Type: new Abstract: Aim: Existing AI-assisted traditional Chinese medicine diagnostic tools suffer from opaque reasoning processes, passive interaction, and limited treatment plan presentation. This study proposes a knowledge-enhanced visual diagnostic system to improve the transparency and interpretability of syndrome differentiation and treatment. Methods: The system is built upon a Neo4j knowledge graph comprising 241 syndromes, 1,263 symptoms, and 2,485 relations. It incorporates a four-stage symptom matching pipeline (exact, semantic, fuzzy, and large language model verification), an information gain-driven proactive questioning strategy optimized with genetic algorithms, and a multimodal treatment presentation integrating artificial intelligence-generated illustrations, three-dimensional meridian-acupoint models, and evidence-based literature. Results: Knowledge graph constraints reduced non-standard outputs by 32%. Case studies validated the effectiveness of the interactive workflow across patient self-assessment, clinician-assisted diagnosis, and traditional Chinese medicine education. Automated paired-comparison evaluation across 30 cases further demonstrated significant improvements in diagnostic trust (Cohen's d = 1.82, p < 0.001), reduced cognitive load (improvements in four of five dimensions), and higher credibility of evidence-based references (4.21 vs. 2.95). Conclusions: The proposed system enhances the transparency of traditional Chinese medicine diagnostic reasoning and the interpretability of treatment plans through knowledge graph-driven visualization and multimodal interaction, offering a practical solution for trustworthy artificial intelligence-assisted traditional Chinese medicine applications.
HKJudge: A Legal Discourse-Annotated Corpus for Interpreting What Courts Find, How They Reason, and What They Rule
arXiv:2606.06679v1 Announce Type: cross Abstract: Court judgments are central to legal practice and jurisprudence, yet discourse analysis of Hong Kong judgments has received limited attention, owing largely to the absence of expert-annotated corpora. We introduce the Hong Kong Judgment Discourse Dataset (HKJudge), the first sentence-level expert-annotated legal discourse corpus. HKJudge includes criminal judgments across all five levels of HK's court hierarchy, comprising $\sim$290k sentences and $\sim$6.5 million tokens, fully annotated by legal linguistics experts. We design a two-tier discourse schema that captures what facts a court finds, how it reasons, and what it rules. At the sentence level, each sentence is assigned one of 26 rhetorical roles. At the span level, sentences are further annotated with three sentencing elements (charge, imprisonment term, fine). Ten legal linguistics annotators produced the annotations with an inter-annotator agreement of $\kappa = 0.8$. We formulate two tasks on HKJudge, termed rhetorical role classification and legal element extraction, and provide the first benchmark evaluation of four BERT-based models, two open-source LLMs under zero-shot and fine-tuning settings, and four commercial LLMs on both tasks. Our work demonstrates the value of sentence-level discourse annotation for modeling the structure of HK judgments and provides a rich data foundation for future work on legal judgment prediction. The HKJudge dataset and code are available at https://github.com/xuanxixi/HKJudge.
Tech leaders are moving beyond AI hype: Here’s what’s actually working
Senior technology leaders from Mars, Orange, Reckitt, and Saint-Gobain discuss how to turn AI ambition into enterprise transformation.
Enterprise AI adoption low due to high token usage, low RoI: Cognizant CEO | Company News - Business Standard
Cognizant Chief Executive Officer (CEO) Ravi Kumar has said the big gap between what artificial intelligence (AI) can do in enterprises and a company’s AI adoption rate is due to high token consumption over the last few years without linking it to return on investments (RoI).
Geopolitics, Policy & Governance
The Trump administration might take an equity stake in OpenAI
Reports suggest the Trump administration is considering acquiring an equity stake in OpenAI as part of its broader AI strategy.
Who is Sriram Krishnan? Indian-origin Trump AI adviser backed by Elon Musk set to leave White House role - The Times of India
He was involved in drafting policy initiatives aimed at improving US competitiveness in AI, including regulatory reforms and proposals to reduce fragmented state-level AI laws. He also contributed to broader strategy discussions focused on national security, innovation, and global competition ...
Long march to tech supremacy - OrissaPOST
As scientists, we had the uneasy privilege of witnessing China’s rise earlier than most. Long before a country’s regional or global dominance shows up in macroeconomic aggregates and stock valuations, it can be inferred from the kinds of signals that scientists notice: scholarly publications, ...
India Launches AI-Driven Mineral Exploration Hub to Boost Strategic Resource Security
India's Geological Survey is launching a new centre in Bengaluru to enhance mineral exploration using AI and machine learning.
South Korea taps former Naver CEO Han Seong-sook for prime minister
President Lee Jae Myung nominated former Naver CEO Han Seong-sook as prime minister, expecting her to drive the government's AI transformation agenda and support small businesses.
Washington wants a piece of OpenAI
The U.S. government is increasingly focused on OpenAI's operations and influence in the AI sector.
Britain’s questionable reliance on Palantir
Government should use the best technology for any task, but be wary of ‘lock-ins’
We must not grant AI agents legal personhood
What kind of sanctions could keep a non-human corporation in check?
OECD chief urges governments not to go it alone on digital taxation
Mathias Cormann warns against fragmented approach as more countries consider duties on tech giants
Starmer gives tech firms ultimatum to block explicit images on children’s phones
Companies such as Apple and Google have until September to install software or face legislation, says PM. Apple and Google have been given until September to install software that blocks explicit images on children’s mobile phones or face legislation enforcing its requirement, Keir Starmer said on Monday. The prime minister said tech companies must activate nudity-detection algorithms or other technical solutions on smartphones and tablets to prevent users taking photos or sharing images of genitalia unless they are verified as adults.
US antitrust laws already ‘adaptable’ for AI, Calif. enforcer says
Competition policy stakeholders calling for new regulatory frameworks for AI would do well to remember US antitrust laws are adaptable to handle innovative technologies, a California DOJ enforcer said.
Brit maritime agency heralds fresh global rules for crewless cargo ships
If you thought driverless cars were bad, imagine a 200,000 ton container ship
US SEC should cancel 2023 cyber disclosure rules, banking group says
A major banking group urged the SEC to eliminate its 2023 cybersecurity disclosure rules, arguing that the current four-day reporting window is irresponsible given the threats posed by frontier AI models.
Europe’s third way on AI is easier said than done | Euractiv
The Commission’s sovereignty package is ambitious. But ownership is not the same as control: China’s and India’s different experiences show why
Indonesia updates e-commerce rules, adds AI and ride-hailing provisions
Indonesia's Trade Ministry has revised e-commerce regulations to include new rules governing the use of AI in product promotion and marketing, alongside other digital governance updates.