Mon 8 June 2026

How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope

arXiv:2606.07489v1 Announce Type: cross Abstract: Frontier AI systems are bridging the gap between intelligence and utility by shifting from conversational assistants to autonomous agents that execute tasks end to end. Using production data from Perplexity's Search and Computer products, we study this transition by examining how AI agents accelerate and reshape knowledge work. Three key empirical findings emerge. First, using sessions with near-identical initial query pairs as natural experiments for the same underlying task attempted with both products, Computer performs 26 minutes of autonomous work per user session, versus 33 seconds for Search. Computer automates task decomposition and execution that Search users might otherwise manually orchestrate and implement. As a result, Computer shifts follow-up query distribution toward higher-order work such as verification and extension. Autonomy also increases execution quality, with per-query dissatisfaction rates 55% lower on Computer than on Search. Second, due to its autonomy advantage, Computer reduces completion time from 269 to 36 minutes on matched tasks, lowering estimated time and cost by 87% and 94%, respectively, compared to humans equipped with Search alone. Third, Computer changes the scope of work that users attempt: Computer queries more often cross occupational boundaries, require higher-order cognition, draw on broader expertise, take the form of composite tasks that bundle interdependent subtasks into a single query, and unlock work activities that are essentially absent from Search usage among the same users. Together, the evidence indicates that AI agents accelerate workflows, enhance output quality, reduce costs, and expand the breadth and depth of automated work.
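As a quick sanity check on the headline figures in this abstract, the short sketch below recomputes the time reduction from the two completion times quoted (269 and 36 minutes) and the ratio of autonomous work per session (26 minutes versus 33 seconds). It assumes the reported 87% is a simple relative reduction, which the abstract does not state explicitly; the 94% cost figure cannot be recomputed from the numbers given here.

```python
# Figures quoted in the abstract above; variable names are illustrative.
search_minutes = 269     # completion time on matched tasks with Search alone
computer_minutes = 36    # completion time on matched tasks with Computer

# Relative reduction in completion time (assumed definition).
time_saved = (search_minutes - computer_minutes) / search_minutes
print(f"time reduction: {time_saved:.1%}")

# Autonomous work per session: 26 minutes (Computer) vs 33 seconds (Search).
autonomy_ratio = (26 * 60) / 33
print(f"autonomous work ratio: {autonomy_ratio:.0f}x")
```

Under that assumption the reduction comes to about 86.6%, consistent with the 87% reported, and Computer performs roughly 47 times as much autonomous work per session as Search.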

Editor's pick · PAYWALL · Transportation & Logistics

Generative Models Erode Human Temporal Learning Through Market Selection

arXiv:2606.06572v1 Announce Type: cross Abstract: We argue that modern generative models create structural risks for knowledge and cultural production at current, sub-AGI capability levels. We define Human Temporal Learning (HTL) as path-dependent knowledge accumulation through sustained engagement with problems over time. Generative outputs increasingly resemble HTL-intensive work in surface features, so verifying whether a given output reflects genuine human learning grows costly relative to its expected benefit. Once verification loses economic justification, evaluators reward outputs regardless of production mode, and producers who invested years of learning compete on price against outputs that cost almost nothing to generate. We call this pathway value collapse and formalize it through a costly-inspection framework. Cross-domain evidence from academic publishing, legal practice, content platforms, and software security maps onto four stages of verification erosion. Alignment success is orthogonal. Better-aligned models narrow observable gaps between human and AI outputs, making source verification harder and intensifying competitive pressure against HTL-intensive work even when individual AI outputs improve.

WSJ· Today

Driverless Trucks Are Here—and They’re Delivering Bags of Doritos

PepsiCo has 41 trucks on the road in Arizona, Texas and Arkansas, bringing the technology into the mainstream.

Measuring Agents in Production

arXiv:2512.04123v4 Announce Type: replace Abstract: LLM-based agents already operate in production across many industries, yet we lack an understanding of what technical methods make deployments successful. We present the first systematic study of Measuring Agents in Production, MAP, using first-hand data from agent developers. We conducted 20 case studies via in-depth interviews and surveyed 86 deployed systems practitioners across 26 domains. We investigate why organizations build agents, how they build them, how they evaluate them, and their top development challenges. Our study finds that production agents are built using simple, controllable approaches: 68% execute at most 10 steps before human intervention, 70% rely on prompting off-the-shelf models instead of weight tuning, and 74% depend primarily on human evaluation. Reliability (consistent correct behavior over time) remains the top development challenge, which practitioners currently address through systems-level design. MAP documents the current state of production agents, providing the research community with visibility into deployment realities and underexplored research avenues.

Are we at the start of a new investment super-cycle?

AI, clean energy and defence spending are reinforcing each other, amplifying potential spend

Benzinga· Today

The AI Trade Just Entered 'Warning Zone,' According To Barclays

Barclays says the AI-fueled stock rally has entered a "warning zone" as inflation and rate concerns persist.

Business Insider· Today

AI Companies Are Rapidly Expanding Into Each Other's Markets

Companies are ruthlessly invading each other's turf. Ever-increasing valuations mean companies need to find new sources of revenue.

Is this the dawn of the Tokenpocalypse?

An analysis of the current state of AI token usage and potential future constraints.

Economics & Markets

18 articles

AI Business Models · 1 article

Editor's pick · PAYWALL · Financial Services

FT: ChatGPT getting a ‘superapp’ revamp before OpenAI hits IPO

ChatGPT is reportedly moving away from chatbots to agents that perform tasks.

AI Investment & Valuations · 6 articles

Nvidia CEO Says Tech Stock Selloff Is a Buying Opportunity | The Pulse 6/8/2026

"The Pulse With Francine Lacqua" is all about conversations with high profile guests in the beating heart of global business, economics, finance and politics. Based in London, we go wherever the story is, bringing you exclusive interviews and market-moving scoops. Today's guests: Lindsay Rosner, Goldman Sachs Asset Management Head of Multi-Sector Investing; Euan Blair, Multiverse CEO & Founder; Kanishka Narayan, UK AI & Online Safety Minister (Source: Bloomberg)

‘We didn’t see this coming’: Wall Street eats its forecasts as stocks sell off globally on fear of AI bubble ahead of SpaceX IPO

Everything you need to know before you reach the office this morning.

Economictimes· Yesterday

₹43,000 Crore Withdrawn by FPIs as AI Trends Shift Investment Focus

Foreign Portfolio Investors (FPIs) withdrew nearly ₹43,000 crore from Indian equities in the first week of June, driven by a global shift towards AI investments and a weakening rupee. Total outflows have reached ₹2.67 lakh crore in 2026, surpassing last year's figures.

Funds Europe· Today

Higher rates, AI boom and structural shifts shaping H2 outlook

Asset manager Carmignac expects the second half of 2026 to be defined by persistently high long-term interest rates, AI-driven investment and a more fragmented global economic landscape.

Benzinga· Today

The AI Trade Just Entered 'Warning Zone,' According To Barclays

Barclays says the AI-fueled stock rally has entered a "warning zone" as inflation and rate concerns persist.

WSJ· Today

The Wall Street Journal’s inaugural Best Companies for the Future ranking is dominated by tech firms

This new ranking also showcases tech giants Alphabet, Microsoft, Meta and Cisco, propelled by high scores for innovation, financial strength and AI readiness.

AI Macroeconomics · 2 articles

Are we at the start of a new investment super-cycle?

AI, clean energy and defence spending are reinforcing each other, amplifying potential spend

Multiperiod Groundwater Markets

arXiv:2605.26363v2 Announce Type: replace-cross Abstract: Motivated by the emergence of local groundwater exchanges, we construct and analyze stochastic models of dynamic groundwater markets. Our primary focus is endogenizing the price formation and groundwater pumping strategies in a closed market with stochastic groundwater allocations and opportunities for intertemporal transfer through rights banking. In our model, several agents, interpreted as farmers or agricultural districts, make competitive decisions on water consumption to produce a basket of goods, as well as on trading allocations among themselves, or banking them for future periods. We define the respective discrete-time non-zero-sum non-cooperative game and construct its sub-game perfect Nash equilibria characterized by the groundwater price process $\{p^\circ(t)\}$. We furthermore construct an algorithm to determine equilibrium strategies and prices through a machine learning approach on top of best-response iterations. Extensive numerical experiments illustrate dynamic phenomena, including the role of groundwater recharge dynamics, agents' risk aversion and groundwater allocations. Our model provides insights into competitive effects in environmental markets with banking features.

AI Market Competition · 4 articles

Reuters· Today

Saving Siri: After two years of stumbles, is Apple's AI moment here?

Siri debuted in 2011 and is accessible through the bulk of Apple's installed base of 2.5 billion devices, but hundreds of millions of consumers have been chatting with apps from OpenAI and Anthropic instead. In China and elsewhere, consumers are turning to AI agents - bots that can carry out complex tasks on behalf of human users - to manage daily schedules and take care of rote tasks.

NYT· Today

Apple Expected to Detail Its A.I. Plans at Conference

For the second time, the company is expected to explain its artificial intelligence plans. Unlike some rivals, it is not reorganizing around the technology.

Business Insider· Today

AI Companies Are Rapidly Expanding Into Each Other's Markets

Companies are ruthlessly invading each other's turf. Ever-increasing valuations mean companies need to find new sources of revenue.

Artificial Intelligence Newsletter | June 8, 2026· 3 days ago

WhatsApp comes under Turkish scrutiny over AI chatbots

Meta Platforms is facing an investigation in Turkey, where the national antitrust watchdog is examining whether the integration of Meta AI into WhatsApp creates illegal impacts on rival providers.

AI Pricing & Cost Curves · 1 article

Is this the dawn of the Tokenpocalypse?

An analysis of the current state of AI token usage and potential future constraints.

AI Startups & Venture · 4 articles

Exponentialview· Yesterday

🔮 The AI boom is becoming an entrepreneurship boom #577

Plus, Russia's Bryan Johnson, camping chips, dreaming AI++

AI company PhysicsX raises $300m in Series C funding round

PhysicsX has its headquarters in London, with an additional office in New York, and a presence in California's Bay Area and Singapore.

Editor's pick · Energy & Utilities

Dublin-based AI IP start-up Midnight Labs backed by Sony

Expansion in Japan will enable Midnight to operate in a country that is 'uniquely vulnerable to AI-generated copyright infringement' due to 'sophisticated digital piracy syndicates' operating at 'unprecedented scale'.

Bebeez· Today

Ghent’s Companion.energy raises €7.8 million to help large enterprises manage energy in real time

Companion.energy, a Ghent-based startup helping large industrial and commercial enterprises manage energy in real time, has closed a €7.8 million Seed round to deepen the platform’s multi-asset, multi-market optimisation capabilities, and expand commercially across Europe, starting with Germany and Spain. The round was led by Realyze Ventures (Germany) and Pi Labs (United Kingdom), with participation […]

Labor, Society & Culture

22 articles

AI & Employment · 8 articles

US Tech Workers Are Paying the Price of the AI Boom

The information sector is dominant in profit expansion — and one of the worst places to find work.

Editor's pick · PAYWALL · Consumer & Retail

Generative Models Erode Human Temporal Learning Through Market Selection

Editor's pick · PAYWALL · Financial Services

Walmart tells workers that AI will improve their jobs, not steal them

Retailer’s embrace of artificial intelligence comes amid anxiety that the technology will create mass redundancies

HSBC CEO Elhedery on GCC, Job Market & Banking Amid AI

HSBC's Georges Elhedery said the GCC has emerged from recent turmoil. He also told Francine Lacqua that human judgment is vital as AI gathers pace. (Source: Bloomberg)

Digital Journal· Yesterday

AI and the reinvention of work: Why some sectors are hiring more, not less

Even though 70% of cybersecurity tasks can be automated, the sector is projected to grow 30% by 2030 thanks to AI.

Theregister· Today

Department of Work and Pensions' answer to AI job fears is a bot to polish your CV

Whitehall says Work Assistant will help jobseekers apply around the clock – provided employers don't mind machine-written applications

Editor's pick · PAYWALL · Manufacturing & Industrials

MDPI· Today

An AI-Driven Socio-Technical Framework for Performance Management in Teleworking Environments

To further interpret these dynamics, ... how AI-enabled performance systems may reconfigure power relations within organizations. In particular, algorithmic performance systems can increase visibility and standardization in evaluation processes, which may reduce managerial discretion and shift interpretive authority toward algorithmic outputs. As a result, changes introduced ...

Washington Post· Yesterday

Opinion | Why human intelligence still matters in the age of AI

RLDX-1, a dexterity-first AI model for robot hands, powers a robot demonstrating packing and sorting at the Computex exhibition in Taipei on June 3.

AI & Inequality · 2 articles

Learning Fair Demand Models

arXiv:2606.06830v1 Announce Type: new Abstract: Data-driven pricing is increasingly prevalent in sectors such as airlines, lending, insurance, and retail. By learning demand models from customer features and setting prices accordingly, these systems may generate discriminatory outcomes that raise fairness concerns. This leads to fundamental questions - how and where should systems incorporate fairness considerations in the pricing pipeline, and how does it ultimately affect societal outcomes? To answer these, we study a stylized model where a seller has a two-stage decision pipeline comprising linear demand model estimation followed by price optimization. The seller considers fairness notions in training loss, price, and demand, under both parity-wise and Rawlsian perspectives. We show that equalizing training loss across consumer groups leads to multiple solutions, which in turn can result in undesirable outcomes despite being a standard approach in fair machine learning. Focusing instead on fairness applied directly to prices or demand, we compare two strategies that enforce fairness in either the demand estimation stage or the price optimization stage. For parity-wise fairness, we characterize when each strategy yields higher social welfare under small fairness levels. We show that when market sizes and prices in the dataset are similar, imposing price fairness in the estimation stage is more beneficial to consumers, whereas imposing demand fairness in the optimization stage yields better consumer outcomes. For Rawlsian fairness, the two strategies coincide exactly. Lastly, we extend our model to alternate demand functions and conduct a case study using real-world vaccine pricing data.

Editor's pick · Healthcare

Editor's pick · Media & Entertainment

AI ‘digital twins’ are transforming heart care but will they work for women?

Sumesh Sasidharan of the Faculty of Medicine at Aix-Marseille Université explores how transformations in medtech may not impact all patients equally.

AI & Misinformation · 2 articles

Editor's pick · Media & Entertainment

Meta made its own AI-generated clickbait news feed

Meta has reportedly created an internal news feed populated by AI-generated clickbait content.

Editor's pick · Consumer & Retail

Viral AI Video of Trump Sparks Debate Over Political Messaging and AI's Role in Politics

A new AI-generated video glorifying Donald Trump, posted on Truth Social, has sparked debates over AI's role in political messaging.

AI Ethics & Safety · 5 articles

The Geography of Algorithmic Judgment: LLM Intermediaries, Place Identity, and Racial Steering in Housing Search

arXiv:2606.06694v1 Announce Type: cross Abstract: Large language models (LLMs) are rapidly assuming an intermediary role in housing search through the integration of listing platforms within conversational interfaces, mediating access to information, search, and recommendations within urban settings. We expand on prior work on racial steering in LLMs by conducting a behavioral audit of seven open-weight and closed-source LLMs across four U.S. cities, testing location recommendations across three iterative prompting conditions that progressively add lifestyle preference context and reflect fair housing paired-testing methodologies. We find that steering is an emergent behavior of the model's interpretive license rather than primarily a static property. Steering results from the interaction of a user's identity, preference articulation, and the spatial logic that a model has internalized about learned representations of place, preference, and opportunity in a given city, and how different types of users relate to it. While steering was present, it was not uniform in direction or magnitude across evaluated conditions. Preference-conditioned testing often increased or reconfigured the number of models that exhibited steering behaviors relative to baseline conditions, suggesting that LLMs may interpret what the same housing preference means differently depending on the racial identity of the user. Our findings also demonstrate that the city is not a neutral testing unit for LLM evaluation in place-based sectors, and results from one local market cannot be assumed to generalize to another. Local and domain expertise will be required in the housing sector to ensure that legal and institutional commitments to fair housing are not undermined while adopting AI tools that mediate spatial access.

Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation

arXiv:2606.06514v1 Announce Type: new Abstract: Machine learning systems deployed in high stakes socioeconomic settings routinely display bias. We formalize bias as a symmetry breaking operation: a classifier is fair if its outputs remain invariant under the counterfactual operation of switching a sensitive attribute, with merit features held fixed. We implement loss based regularization as a symmetry restoring mechanism and evaluate the framework on four synthetic datasets with varying levels of noise, correlation, and bias. The framework achieves upwards of 90% violation reduction, with accuracy costs around 5%. This framework does not require causal graph knowledge, is computationally lightweight, and generalizes to any sensitive attribute definable as a bit-flip, making it suitable for contexts where local sources of discrimination remain absent from mainstream benchmarks.
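To make the symmetry idea in this abstract concrete, here is a minimal sketch, not the authors' implementation. It measures how much a scorer's output changes when a binary sensitive attribute is flipped with the other features held fixed, and shows how such a term could be added to a task loss. The scorers, the feature layout, and the weight `lam` are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence


def flip_bit(row: Sequence[float], idx: int) -> List[float]:
    """Return a copy of row with the binary feature at idx switched (0 <-> 1)."""
    flipped = list(row)
    flipped[idx] = 1.0 - flipped[idx]
    return flipped


def symmetry_violation(
    score: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    idx: int,
) -> float:
    """Mean absolute change in score when the sensitive bit is flipped."""
    gaps = [abs(score(r) - score(flip_bit(r, idx))) for r in rows]
    return sum(gaps) / len(gaps)


def regularized_loss(task_loss: float, violation: float, lam: float = 1.0) -> float:
    """Task loss plus a symmetry penalty; lam is a hypothetical trade-off weight."""
    return task_loss + lam * violation


# Hypothetical toy scorers: feature 0 is a merit feature, feature 1 the sensitive bit.
def fair_score(row: Sequence[float]) -> float:
    return 0.8 * row[0]


def biased_score(row: Sequence[float]) -> float:
    return 0.8 * row[0] + 0.3 * row[1]


rows = [[0.2, 0.0], [0.9, 1.0], [0.5, 1.0]]
fair_gap = symmetry_violation(fair_score, rows, idx=1)
biased_gap = symmetry_violation(biased_score, rows, idx=1)
print(fair_gap, biased_gap)
```

A scorer that ignores the sensitive bit has zero violation, while the biased toy scorer shifts by its coefficient on that bit; minimizing the regularized loss would push a trainable model toward the first case.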

Theregister· Today

Brit fraudsters using AI to doctor 'evidence' in motor insurance claims

Policy-holders increasingly turn fender benders into much more by sprinkling in their favorite AI chatbots, Aviva says

Guardian· Yesterday

‘A driver of political violence’: how the breakneck AI boom is fueling anti-tech extremism

Backlash against AI is taking an extremist turn, following in the footsteps of earlier techno-pessimist militants. When a 20-year-old man from Texas was arrested earlier this year for allegedly trying to burn down OpenAI’s headquarters and Sam Altman’s house, authorities found an anti-AI manifesto alongside his lighter and a jug of kerosene. It was one of a spate of attacks that has caused alarm among researchers, the tech industry and law enforcement about the rise of anti-tech extremism. In April, an Italian “nature pilled” Instagram influencer was arrested in Rome and charged with plotting a series of anti-tech attacks that took inspiration from Ted “The Unabomber” Kaczynski. Two self-described “ecofascists” that carried out a deadly anti-Muslim attack on a mosque in San Diego last month also cited “AI slop” and JD Vance’s ties to Palantir as motivations for their violence in their manifesto. An Indianapolis city councilor woke up earlier this year to gunshots being fired into his home before finding a note that read “NO DATA CENTERS”.

Artificial Intelligence Newsletter | June 8, 2026· Today

How AI reshapes criminal investigations for South Korean prosecutors

The growing use of AI-generated and manipulated evidence has prompted South Korea's prosecution service to develop forensic tools to detect deepfakes, voice clones, and other digital forgeries.

AI Skills & Education · 3 articles

Beyond Tool Adoption: A Practical Five-Stage Developmental Continuum for AI Literacy in Higher Education

arXiv:2606.00038v4 Announce Type: replace Abstract: Artificial intelligence (AI) literacy is increasingly recognized as a foundational competency for all university graduates. Yet students' engagement with AI tools often clusters at two extremes: avoidance driven by fear, mistrust, ethical concern, or lack of access, and uncritical reliance that produces fluent output while masking misunderstanding. Existing AI literacy frameworks provide valuable competency definitions, but most offer limited guidance for diagnosing where learners begin and how they progress toward responsible, critical engagement. This paper proposes a five-stage AI Literacy Continuum (0: Not Yet Engaged, 1: Uncritical Use, 2: Informed Use, 3: Critical Evaluation, 4: Improvement) that describes developmental orientations toward AI use in higher education. The continuum complements dimensional frameworks by providing educators with a practical diagnostic and instructional pathway aligned with international frameworks, including UNESCO and OECD. We present a design-based implementation case from North Carolina State University, where credit-bearing courses and intensive hands-on workshops engaged more than 330 participants between Fall 2024 and Spring 2026. Because the implementation did not use a validated pre/post instrument or comparison group, we frame the findings as observational and practice-based: participants exhibited behaviors consistent with movement from non-engagement or uncritical use toward informed engagement, while sustained and discipline-embedded experiences produced stronger evidence of critical evaluation and improvement-oriented practice. We discuss curricular pathways, opportunity considerations, and assessment strategies, and argue that AI literacy should be understood not as tool adoption alone but as a developmental capacity to understand, evaluate, and responsibly apply AI systems in disciplinary and societal contexts.

FourWeekMBA· Yesterday

The AI PM Career Path in 2026 — What to Learn, What's Obsolete, and Where the Jobs Are

The AI PM career path in 2026: what's obsolete (PRDs, stakeholder management, quarterly roadmaps), what to learn (frontier prototyping, overhang bets), where the jobs are.

The Economic Times· Today

How global AI forums are redefining university credibility

As AI reshapes higher education, universities are increasingly being evaluated on their readiness for an AI-driven future. Beyond traditional measures of excellence, engagement with global AI ecosystems and conversations is emerging as a key signal of relevance.

Public Attitudes to AI · 2 articles

The coming rise of anti-AI populism

Anxiety about the technology is set to generate a political backlash

What Do People Actually Want From AI? Mapping Preference Plurality

arXiv:2606.06674v1 Announce Type: cross Abstract: Large Language Models (LLMs) are often fine-tuned through Reinforcement Learning from Human Feedback (RLHF) to align with people's preferences and values. However, this method has known limitations: it aggregates conflicting preferences, often relies on unrepresentative samples, and uses only binary comparisons. Analysing 1,500 open-ended responses from the PRISM dataset across 75 countries, we examine what people actually want from AI systems and reveal concrete failures of current methods. We find that different people want different things: most values are requested by fewer than a quarter of respondents, with truthfulness the sole exception at 49%. Furthermore, the same words hide divergent meanings: when people describe what they mean by "truthfulness", they reveal distinct, potentially incompatible, epistemological bases, as some ask for sourced claims, some for expert opinions, and some even ask for unpopular views. Certain capabilities, namely how human-like a model behaves, and some features, like AI guardrails, are outright controversial, with some desiring them and others rejecting them. We additionally find that people often use contextual distinctions (what AI should do "by default" versus "if requested") that binary comparisons cannot capture. These findings expose fundamental problems in current alignment practices. When 49% request truthfulness but define it differently, this is unlikely to be captured by a single reward model. The persistence of high hallucination rates in well-funded models, despite users' clear demands for accuracy, suggests that current methods fail to identify actual preferences. This paper sheds light on the situated, contested, imperfect signals that are currently being flattened into universal preference models, a practice others have characterised as epistemic violence.

Technology & Infrastructure

32 articles

AI Agents & Automation · 7 articles

Editor's pick · PAYWALL · Professional Services

How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope

NYT· Yesterday

Sorry, I’m Not Available. Talk to the A.I. Version of Me.

The hot new productivity hack for C.E.O.s and Harvard professors? A.I. twins that answer questions and attend meetings.

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

arXiv:2606.06523v1 Announce Type: new Abstract: Equipping Large Language Models (LLMs) to execute reliable multi-step workflows has become a central challenge in artificial intelligence. Despite recent advances in LLMs' agentic capabilities, most agent systems still lack formal methods for specifying, verifying, and debugging their workflow and execution trajectories. This challenge mirrors a long-standing problem in mathematics, where the ambiguity of natural languages (NLs) motivates the development of formal languages (FLs). Inspired by this paradigm, we propose Lean4Agent, to the best of our knowledge the first framework that uses Lean4, a dependent-type FL, to model and verify agent behavior. Lean4Agent launches FormalAgentLib, an extensible Lean4 library for formally modeling and verifying agent workflows' semantic consistency under explicit assumptions, and for localizing execution-time failures revealed by trajectories. Building on FormalAgentLib, we further develop LeanEvolve, which applies results from FormalAgentLib to revise workflows and enhance their capability. Extensive experiments on a hard problem subset of SWE-Bench-Verified and a subset of ELAIP-Bench across 5 leading LLMs indicate that the verification-passing workflows outperform the failing ones by an average of 11.94%, and LeanEvolve further improves SWE performance by 7.47% on average. Furthermore, Lean4Agent establishes a foundation for a new field of using expressive dependent-type FL to formally model and verify agent behavior.

AEGIS: A Backup Reflex for Physical AI

arXiv:2606.06660v1 Announce Type: new Abstract: Long-horizon robot manipulation tends to fail gradually: one bad step degrades the state, and the policy spirals into a basin from which it cannot recover. The failure is often visible before it happens. We introduce AEGIS (Activation-probe Early-warning, Gated Inference Switching), a selective escalation method that uses a lightweight probe on a weak policy's frozen activations to detect high-risk steps while there is still time to act. When the probe flags a step, control switches to a stronger separate policy, but only for the steps that need it. On LIBERO-Spatial, AEGIS recovers 10.1% of the trajectories the weak policy alone loses, versus 4.6% for budget-matched blind escalation and 5.1% for a random-trigger placebo. These gains are significant under one-sided exact paired McNemar tests with Holm-Bonferroni adjustment over three pre-registered contrasts: +5.4pp over blind escalation, p=8.5e-6; +5.0pp over random triggering, p=1.0e-4; paired-trajectory bootstrap CIs exclude zero. AEGIS activates the stronger policy on only 38% of steps, so the lever is timing rather than compute. The probe clears its precondition with an early-window AUROC of 0.764, 95% CI [0.70, 0.84], read from the weak-policy path over the first 30% of trajectory steps before any handoff. We pre-register the full analysis plan, including a conditional recovered-task-rate estimand and explicit kill criteria, and confirm the result on 700 common-random-number episodes per arm, with nA-fail=646.
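The selective-escalation idea in this abstract can be sketched in a few lines. This is an illustrative toy, not the AEGIS code: the per-step risk scores stand in for the output of a probe on the weak policy's activations, and the threshold value is a hypothetical choice. The sketch only shows the gating logic, where control passes to the stronger policy on flagged steps and stays with the weak policy otherwise.

```python
from typing import List, Sequence, Tuple


def select_policy(risk_score: float, threshold: float) -> str:
    """Pick the policy for one step: escalate only when the probe flags risk."""
    return "strong" if risk_score >= threshold else "weak"


def gate_episode(
    risk_scores: Sequence[float], threshold: float
) -> Tuple[List[str], float]:
    """Return the per-step policy choice and the fraction of steps escalated."""
    choices = [select_policy(s, threshold) for s in risk_scores]
    strong_fraction = choices.count("strong") / len(choices)
    return choices, strong_fraction


# Hypothetical probe outputs for a five-step episode (not data from the paper).
risk_scores = [0.1, 0.2, 0.7, 0.9, 0.3]
choices, strong_fraction = gate_episode(risk_scores, threshold=0.5)
print(choices, strong_fraction)
```

In this toy episode only the two high-risk steps are escalated, which mirrors the abstract's point that the gain comes from when the stronger policy is used rather than from using it everywhere.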

Measuring Agents in Production

Sustainability by Design in Decentralized Autonomous Organizations: An Empirical Review of Governance, Innovation, and Institutional Design

arXiv:2606.05667v1 Announce Type: cross Abstract: Recent innovation theories on economics remain largely grounded in assumptions of hierarchical firms and closed organizational boundaries, offering limited insight into how innovation unfolds within decentralized, digitally native organizations. Decentralized Autonomous Organizations (DAOs) represent an emerging form of innovation ecosystem characterized by blockchain-based transparency, open participation, and token-driven governance, in which sustainability can be embedded directly into organizational design. This study compares two standards, ERC-8004 and Google A2A, which address the same agent interoperability question; the former is governed by a DAO and the latter by a corporate consortium. They are examined through an LLM-powered comparative pipeline for large-scale governance discourse analysis, integrating automated annotation, neural topic modeling, and multi-layer network analysis to study socio-technical power structures. The study provides evidence-based insights for scholars, policymakers, and designers seeking to align innovation, technological governance, and sustainability in future organizational forms.

OpenSkill: Open-World Self-Evolution for LLM Agents

arXiv:2606.06741v1 Announce Type: new Abstract: Self-evolving agents require adaptation after deployment, but existing approaches assume a usable learning loop, such as curated skills, successful trajectories, or verifier signals. Real open-world deployments may provide none of these, offering only a task prompt. In this work, we study open-world self-evolution, where an agent must build both its skills and its own verification signals from scratch, using open-world resources but no target-task supervision. We propose OpenSkill, a framework that bootstraps this loop: it acquires grounded knowledge and verification anchors from documentation, repositories, and the web, synthesizes them into transferable skills, and refines those skills against self-built virtual tasks grounded in the anchors rather than in target answers. The open world thus supplies both the knowledge to be learned and a supervision-independent practice environment, with target-task supervision reserved for final evaluation. Across three benchmarks and two target agents, OpenSkill attains the best automated pass rate while satisfying the no-supervision constraint. Analysis shows its skills transfer across models without model-specific adaptation, and its self-built verifier aligns with ground-truth outcomes despite never accessing them.

AI Energy · 1 article

AI Magazine· Today

The Environmental Impact of Google Search's AI Overviews

AI Overviews are becoming increasingly common in Google Search, prompting questions about energy use, transparency, user choice and the open web

AI Infrastructure & Compute · 10 articles

Editor's pick · PAYWALL · Energy & Utilities

Guardian· Today

Majority of US’s new AI datacenters to be built on drought-hit land

Guardian analysis finds facilities to be built in some of the driest areas as outcry grows over water needed to power AI. A record-shattering drought has racked much of the US. But the artificial intelligence industry is pushing ahead regardless, with the majority of planned datacenters set to be built in drought-ridden locations, a Guardian analysis has found. About two-thirds of upcoming datacenters, which typically require a large amount of water to operate, are set to be built in places that have been among the driest in the country over the past year.

Brookfield Head of AI Infrastructure on Bloomberg TV

Brookfield Head of AI Infrastructure and Head of Europe Sikander Rashid joins Bloomberg's Tom Mackenzie to kick off London Tech Week as part of Bloomberg Brief. (Source: Bloomberg)

Bebeez· Today

Nebius signs 22MW capacity agreement with Kao Data in the UK

European AI cloud company Nebius has signed on to lease 22MW of capacity from Kao Data’s Harlow data center campus in the UK. The company has also pledged a £1.7 billion ($2.26bn) investment in the UK, through which it will establish three new Nvidia infrastructure deployments. The three new sites […]

The Financial Express· Yesterday

Inside India's Rs 10,372-crore sovereign AI race: 2 infrastructure stocks powering the 38,000-GPU boom

A ₹10,372 crore AI mission. Over 38,000 GPUs already deployed. As India races to build its own AI ecosystem, a handful of companies are quietly building the infrastructure powering this transformation.

Hostingjournalist· Yesterday

COMPUTEX 2026 Exposes AI Infrastructure’s Real Constraints


Xinhua· Yesterday

From optical modules to chips -- China's tech supply chains sustain global AI growth


DIGITIMES· Today

Exclusive: The semiconductor battle behind AI data centers and EVs

Beneath the rapid expansion of electric vehicles and artificial intelligence infrastructure, a quieter battle is unfolding in the semiconductor supply chain.

Accelerated Fourier SAT (AFSAT): Fully Realising a GPU-based Symmetric Pseudo-Boolean SAT Solver

arXiv:2606.06641v1 Announce Type: new Abstract: We present Accelerated Fourier SAT (AFSAT), a GPU-accelerated solver for pseudo-Boolean satisfiability based on continuous local search (CLS). AFSAT realises the proof-of-concept approach, FastFourierSAT, into a fully-engineered solver supporting any heterogeneous mixture of symmetric constraint types and lengths within a single problem instance. Using the JAX compiler, AFSAT leverages pure function composition, automatic vectorisation, automatic differentiation, and just-in-time (JIT) compilation to perform massively parallel CLS across batches of candidate assignments. We demonstrate substantially improved numerical stability, runtime performance, and memory efficiency over the proof-of-concept. We achieve this by way of identifying and addressing various limitations that arise from memory latency and floating-point representation, as well as leveraging automatic parallelisation and compact representations. The inherent representational and stability limitations of floating point are partially addressed by a tailored discrete Fourier transform implementation. We achieve near-linear throughput when scaling to multiple accelerators via JAX array sharding.

A Study of Parallel Continuous Local Search

arXiv:2606.06656v1 Announce Type: new Abstract: We study parallel Continuous Local Search (CLS) as a solution approach for Boolean satisfiability problems with symmetric pseudo-Boolean (PB) constraints. Here, the $n$-variable PB-satisfiability problem is relaxed to a continuous optimisation problem with a differentiable objective function on an $n$-dimensional hypercube. For satisfiable instances, the global minimisers of this optimisation problem correspond to satisfying assignments of the SAT problem at hand. We present several novel findings via empirical experiments: (i) redundant constraints can inhibit rather than accelerate convergence; (ii) CLS shows promise as a sub-solver in hybridised settings, quickly completing partial assignments; and (iii) local search rapidly converges to a stable distribution of solution quality (i.e., degree of satisfaction), due to saddle-dense objectives where additional solver steps yield diminishing returns. Our findings inform practical uses of CLS for SAT on modern accelerator hardware.
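
As a rough illustration of the relaxation described in the abstract above, the sketch below runs projected gradient descent on a tiny instance. It is a simplified stand-in and not the paper's method: it uses plain CNF clauses rather than symmetric PB constraints, works on the cube [0, 1]^n, and the names `objective`, `gradient`, and `continuous_local_search` are hypothetical. It also assumes no variable appears twice in one clause.

```python
import numpy as np

# A clause is a list of signed 1-based variable indices: 3 means x3, -3 means NOT x3.
# Toy instance (an assumption for illustration): satisfied by x1=1, x2=0, x3=1.
CLAUSES = [[1, 2], [-1, 3], [-2, -3], [2, 3]]


def literal_values(x, clause):
    """Relaxed truth value of each literal in the clause for a point x in [0, 1]^n."""
    return np.array([x[abs(l) - 1] if l > 0 else 1.0 - x[abs(l) - 1] for l in clause])


def objective(x, clauses=CLAUSES):
    """Sum over clauses of the product of (1 - literal value).

    At a 0/1 vertex this equals the number of unsatisfied clauses, so for a
    satisfiable instance the global minimisers are the satisfying assignments.
    """
    return float(sum(np.prod(1.0 - literal_values(x, c)) for c in clauses))


def gradient(x, clauses=CLAUSES):
    """Analytic gradient of the multilinear objective."""
    g = np.zeros_like(x)
    for c in clauses:
        falsity = 1.0 - literal_values(x, c)
        for k, l in enumerate(c):
            others = np.prod(np.delete(falsity, k))
            # d(1 - value)/dx is -1 for a positive literal and +1 for a negated one.
            g[abs(l) - 1] += (-1.0 if l > 0 else 1.0) * others
    return g


def continuous_local_search(x0, steps=200, lr=0.1):
    """Fixed-step gradient descent, clipped back into the cube after every step."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = np.clip(x - lr * gradient(x), 0.0, 1.0)
    return x


x_final = continuous_local_search([0.9, 0.2, 0.6])
```

From this asymmetric start the iterate is pushed to the vertex (1, 0, 1), where the objective is zero. A perfectly symmetric start such as (0.5, 0.5, 0.5) instead stalls at a non-integral stationary point on this instance, which loosely echoes the abstract's remark about saddle-dense objectives.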

Theregister· Yesterday

Our systems editor flew all the way to Taiwan and still couldn't get away from AI

Every show now is an AI show, and that included this year's Computex

AI Models & Capabilities · 4 articles

Benchmarking LLMs for Community Governance Simulation with Life-history Narratives

arXiv:2605.23783v2 Announce Type: replace Abstract: Effective community governance hinges on understanding what specific residents think and need. Recent work has used large language models (LLMs) to simulate human respondents, offering a scalable, reproducible way to study human attitudes and behaviors at low cost. However, these studies typically prompt the model with just a few demographic variables (age, gender, income), simulating only general role types. This is insufficient for community governance, where decisions depend on the views of specific residents. We bridge this gap with an integrated research framework covering dataset, benchmark, algorithm, and system. The dataset comprises approximately 1.2 million characters of first-person narrative collected through two-hour semi-structured interviews with each of 92 residents in an urban community, organized around nine community-governance domains. The benchmark probes 18 mainstream LLMs across four prompting strategies and shows that adding rich life-history profiles meaningfully raises fidelity above the no-profile baseline, but this gain comes with more input tokens per call from the longer prompts they require. The algorithm, curriculum-LoRA, is a parameter-efficient personalization framework that, by closing this fidelity-cost gap, matches the strongest baseline's fidelity at roughly 10x lower per-call cost and Pareto-dominates every configuration tested. The system integrates curriculum-LoRA into a closed-loop policy-evaluation pipeline. Together, these results bring individual-level LLM-based resident simulation within reach of resource-constrained local administrations, enabling community-governance decisions to be systematically pre-evaluated in silico before real-world deployment.

DiBS: Diffusion-Informed Branch Selection

arXiv:2606.06518v1 Announce Type: new Abstract: Sudoku is a representative constraint satisfaction problem that requires global structural reasoning under strict discrete constraints. Existing work on solving Sudoku mainly follows two dominant approaches: traditional heuristic solvers and deep learning solvers. However, they suffer from two complementary limitations: learning-based solvers lack hard correctness guarantees, while complete symbolic solvers are still prone to long-tail search. To address these shortcomings, we propose a novel diffusion model-guided approach, termed DiBS, for the branch selection step of the search process. Specifically, DiBS keeps the symbolic solver complete and uses the diffusion model as a branch-ordering guide. The core method ranks candidate values given the current partial assignment and a lightweight consistency signal. Furthermore, we provide an in-depth theoretical proof to reveal how and why it works. Experiments on the challenging Royle 17-clue Sudoku benchmark show that DiBS substantially reduces search cost relative to strong heuristic baselines, especially in nodes, backtracks, and long-tail percentiles. These results also confirm that learned global guidance is effective on hard instances where branch-order mistakes are most expensive. All code is available at https://github.com/shanxierdan/DiBS.
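
The idea of keeping a complete solver while letting a learned model only reorder branches can be sketched as follows. This is not the DiBS implementation: `placeholder_rank` is a hypothetical stand-in for the diffusion model's ranking, the cell-selection heuristic is a standard fewest-candidates rule chosen here for illustration, and the puzzle is a widely circulated easy example rather than a Royle 17-clue instance.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]  # 0 marks an empty cell


def candidates(grid: Grid, r: int, c: int) -> List[int]:
    """Digits not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def pick_cell(grid: Grid) -> Optional[Tuple[int, int]]:
    """Choose the empty cell with the fewest candidates, or None if the grid is full."""
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return None
    return min(empties, key=lambda rc: len(candidates(grid, *rc)))


def solve(grid: Grid, rank: Callable[[Grid, int, int, List[int]], List[int]]) -> bool:
    """Complete backtracking search; `rank` only reorders the values tried at a cell."""
    cell = pick_cell(grid)
    if cell is None:
        return True
    r, c = cell
    for d in rank(grid, r, c, candidates(grid, r, c)):
        grid[r][c] = d
        if solve(grid, rank):
            return True
        grid[r][c] = 0  # undo and try the next value
    return False


def placeholder_rank(grid: Grid, r: int, c: int, values: List[int]) -> List[int]:
    """Hypothetical guide: try digits that appear least often in the grid first."""
    counts = {d: sum(row.count(d) for row in grid) for d in values}
    return sorted(values, key=lambda d: counts[d])


ROWS = ["530070000", "600195000", "098000060",
        "800060003", "400803001", "700020006",
        "060000280", "000419005", "000080079"]
PUZZLE = [[int(ch) for ch in row] for row in ROWS]

grid = [row[:] for row in PUZZLE]
solved = solve(grid, placeholder_rank)
```

Because `rank` returns a permutation of the candidates, every value is still tried eventually, so a poor ranking costs extra search but never correctness. That is the sense in which the solver stays complete.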

mmPISA-bench: Do LLMs Reason Equally Well Across 43 Languages?

arXiv:2606.07069v1 Announce Type: cross Abstract: We introduce mmPISA-bench, a compact high-quality multilingual reasoning benchmark derived from the OECD Programme for International Student Assessment (PISA). The benchmark consists of 25 multiple-choice questions that require reasoning in order to be answered correctly. Each question is provided in official human translations into 43 languages and complemented with machine-translated counterparts (i.e., 2,150 data points in total). We evaluate two mainstream proprietary LLMs across languages, reasoning effort levels, and translation types in terms of their ability to answer the questions correctly. Our results show that modern LLMs can reason effectively across all evaluated languages and achieve accuracy comparable to human test-takers, with some performance variation across the covered languages. We further find that machine-translated questions do not degrade accuracy relative to official human translations, which suggests that high-quality machine translation (synthetic data) might often be adequate for large-scale multilingual reasoning evaluations where official translations are not available. Finally, we analyze token usage and related inference cost and find that LLM usage in some languages is simultaneously more expensive and less accurate.
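
The stated total of 2,150 data points is consistent with the design described in the abstract, assuming every one of the 43 languages has both an official and a machine-translated version of all 25 questions. A quick check, with variable names chosen here for illustration:

```python
questions = 25
languages = 43
translation_types = 2  # official human translation and machine translation (assumed for every language)

data_points = questions * languages * translation_types
print(data_points)  # 2150
```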

A Geometric Account of Activation Steering through Angle-Norm Decomposition

arXiv:2606.06735v1 Announce Type: new Abstract: Linear activation steering has gained popularity as a simple and empirically effective way to control language model behavior. More recently, spherical steering paradigms have been proposed to address limitations of additive interventions, often motivated by the assumption that hidden-state norm does not carry concept-relevant information. In this work, we revisit this assumption through a controlled empirical study designed to disentangle the roles of angular and radial components. We show that steering methods differ mainly in how they couple two geometric effects: changing a token's angular alignment with a concept direction and changing its hidden-state norm. Across seven language models, we find that concepts are represented primarily in angular structure, supporting the motivation for spherical methods, but that norm remains important for the stability and downstream effects of steering. Our results explain why interventions with similar concept-level effects can behave differently, and suggest that activation steering should be parameterized by interpretable angular and radial components of the intervention, rather than by a single additive coefficient that entangles these two effects.
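
To make the angle-norm distinction concrete, here is a minimal sketch with toy vectors. It is not the paper's parameterization: the function names, the three-dimensional hidden state `h`, and the concept direction `v` are all assumptions chosen so the numbers are easy to follow.

```python
import numpy as np


def angle_and_norm(h, v):
    """Return (angle in radians between h and v, Euclidean norm of h)."""
    cos = float(h @ v / (np.linalg.norm(h) * np.linalg.norm(v)))
    return float(np.arccos(np.clip(cos, -1.0, 1.0))), float(np.linalg.norm(h))


def additive_steer(h, v, alpha):
    """Additive steering: add alpha times the unit concept direction."""
    return h + alpha * v / np.linalg.norm(v)


def spherical_steer(h, v, theta):
    """Rotate h toward v by theta radians within span(h, v), keeping the norm of h."""
    u = h / np.linalg.norm(h)
    v_hat = v / np.linalg.norm(v)
    w = v_hat - (v_hat @ u) * u   # part of v orthogonal to h
    w = w / np.linalg.norm(w)     # assumes h and v are not collinear
    return np.linalg.norm(h) * (np.cos(theta) * u + np.sin(theta) * w)


h = np.array([3.0, 0.0, 4.0])  # toy hidden state with norm 5
v = np.array([0.0, 1.0, 0.0])  # toy concept direction, orthogonal to h

added = additive_steer(h, v, alpha=5.0)
rotated = spherical_steer(h, v, theta=np.pi / 4)

for name, vec in [("original", h), ("additive", added), ("spherical", rotated)]:
    angle, norm = angle_and_norm(vec, v)
    print(f"{name:9s} angle={np.degrees(angle):5.1f} deg  norm={norm:.3f}")
```

In this toy case both interventions bring the state to 45 degrees from the concept direction, but the additive one also grows the norm from 5 to about 7.07 while the rotation leaves it at 5. That is one way two interventions with the same angular effect can still differ.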

AI Research & Science · 4 articles

Inside a Brain-Chip Startup in China

Brain-computer interfaces could dramatically alter how humans interact with technology. Startups around the world are racing to get their devices approved for commercial use. China may be in the lead. (Source: Bloomberg)

Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics

arXiv:2606.06533v1 Announce Type: new Abstract: What would it mean to have a scientific understanding of AI? Models are not static objects: they are snapshots of time-evolving processes shaped by data, objectives, architectures, and optimization dynamics. Yet much of AI research treats models as fixed artifacts, analyzing behaviors after training rather than asking why they emerge. This position paper argues that a science of AI must move beyond post-hoc fixes and study the training dynamics that produce model behavior. Such a science should support progressively stronger forms of understanding: predicting outcomes from early training signals, intervening when trajectories go wrong, and ultimately designing training procedures that more reliably produce desired properties. Scaling laws have made prediction routine for loss; the challenge is extending this success to capabilities, biases, robustness, and safety-relevant behaviors. We articulate requirements for such theories grounded in the history and philosophy of science, examine progress in mechanistic interpretability, fairness, memorization, and simplicity bias, and identify concrete open problems.

CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions

arXiv:2606.06526v1 Announce Type: new Abstract: Large language models have made substantial progress on mathematical reasoning, but existing benchmarks typically evaluate well-specified problems with final answers, step-by-step solutions, or complete proofs. They do not capture collaborative open-problem solving: a setting in which participants propose partial arguments, identify gaps or errors in prior steps, repair flawed reasoning, and gradually synthesize incremental contributions into a proof. We introduce CrowdMath, a dataset of 164 expert-annotated progress chains from the MIT PRIMES--Art of Problem Solving (AoPS) CrowdMath program (2016-2025), a collaborative research initiative whose discussions have led to peer-reviewed publications. Each chain traces a multi-participant forum discussion from an open-problem statement to a completed proof. Posts are labeled by their functional roles in the evolving solution process, including partial progress, proof completion, erroneous reasoning, and error identification. We define evaluation tasks and benchmark six frontier models. Models achieve 83-88% accuracy on next-post prediction, suggesting that they can follow the local flow of mathematical discussion. However, they struggle to identify the functional significance of individual contributions with the best model achieving only 0.42 macro-F1 on post-role classification. CrowdMath exposes a gap between solving well-specified mathematical problems and understanding collaborative mathematical progress as it unfolds.

Editor's pick · Defense & National Security

Daily Brew· Today

Growing number of AI hallucinations that are appearing in academic papers

Concerns are rising as AI-generated hallucinations increasingly infiltrate academic research and publications.

AI Security & Cybersecurity · 6 articles

Editor's pick · Defense & National Security

Attack Selection in Agentic AI Control Evaluations Meaningfully Decreases Safety

arXiv:2606.06529v1 Announce Type: new Abstract: An attacker that strategically chooses when to attack is much harder to catch than one that attacks indiscriminately. AI control is a safety framework for deploying capable but untrusted AI agents under the oversight of a weaker, trusted monitor and a limited human audit budget. Control evaluations stress-test these protocols by pitting a red-team attack policy against the blue-team monitor, but current evaluations typically assume attackers that do not strategically select when to attack. We study this capability, attack selection, in agentic settings by decomposing attack decisions into a start policy, which decides when an attacker should attack, and a stop policy, which decides when an attacker should abort an ongoing attack. Across two agentic settings, BashArena and LinuxArena, both policies substantially lower measured empirical safety without changing the underlying attack capability. At a 1% audit budget, our start policy reduces safety by 20pp on both BashArena and LinuxArena, and our stop policy reduces safety by 20pp on BashArena and 28pp on LinuxArena. These reductions should be interpreted as upper bounds on the effect of attack selection. Existing control evaluations may therefore yield overly optimistic safety estimates against selective attackers. We recommend that future evaluations, system cards, and safety cases elicit attack selection to produce more realistic safety estimates.

Why Lightspeed and Wiz’s Assaf Rappaport bet $37 million on an AI-powered cyberattacker

After frontier AI models exposed thousands of previously unknown vulnerabilities, startup A is building autonomous defenses designed for an era of AI-native attackers.

Editor's pick · PAYWALL · Defense & National Security

SafeGene: Reusable Adapters for Transferable Safety Alignment

arXiv:2606.06519v1 Announce Type: new Abstract: Open-weight LLMs are increasingly fine-tuned into customized assistants, but downstream fine-tuning can weaken safety alignment and make models more vulnerable to malicious prompts, even when the training data is not intentionally harmful. This creates a recurring safety recovery problem as target models are repeatedly updated with new task data or user interactions. We propose SafeGene, a reusable safety-adapter module designed for cross-task reuse within each architecture-compatible model family. Rather than treating safety recovery as a model-specific repair step, SafeGene treats safety capability as an independent, reusable adapter representation decoupled from task-specific updates. This representation is obtained from aligned--degraded model discrepancies, refined into task-transferable safety vectors through data-aware layer selection, and expressed in each downstream task-adapted model via few-shot layer-wise coefficient recalibration. Experiments across multiple model families, downstream tasks, and safety judges show that SafeGene-enhanced models reduce harmful response rates while maintaining downstream performance, outperforming representative safe adaptation methods in safety--utility trade-off.

FT· Today

New AI espionage powers trigger Putin camera scare

Russia paused surveillance system after killing of Iran’s Supreme Leader exposed how AI can be used on CCTV data to target enemies

What Your Posts Reveal: A Benchmark and Agentic Framework for User-Level Privacy Leakage on Social Media

arXiv:2606.06784v1 Announce Type: cross Abstract: Public social media posts can reveal private information through weak cues scattered across text, images, or metadata. Such leakage is often cumulative and cross-post: cues that appear harmless in isolation may jointly expose a user's home, workplace, or routine. However, current research lacks a unified benchmark for user-level multimodal privacy leakage and an evaluation metric that captures exposure severity beyond binary accuracy. To address these gaps, we propose SopriBench, a synthetic benchmark guided by leakage patterns abstracted from a private reference corpus of Rednote and Instagram accounts, covering 50 user profiles and 1,569 images with attributes, contextual sensitivity, granularity, leakage type, inference difficulty, and supporting evidence. We further introduce the Privacy Exposure Score (PES), which weights value granularity by contextual sensitivity. Inspired by abductive reasoning, we introduce Argus, a training-free agentic framework for cumulative leakage inference. Argus forms hypotheses from accumulated evidence, verifies supporting evidence, and aggregates cross-post cues into privacy profiles, achieving 0.55 PES, a 25% improvement over the strongest baseline, with the largest gain on cross-post leakage.

Daily Brew· 2 days ago

OpenAI unveils Lockdown Mode to protect sensitive data from prompt injection attacks

OpenAI has introduced a new 'Lockdown Mode' designed to harden its models against prompt injection and data leakage vulnerabilities.

Adoption, Deployment & Impact

16 articles

AI Adoption Barriers & Enablers · 2 articles

PYMNTS· Today

The AI You’re Given vs. the AI You Chose

Eighty-six million Americans now use AI at work, and most carry it home. Give workers a choice and the order of finish changes completely.

Editor's pick · PAYWALL · Transportation & Logistics

Bangkok Post· Today

Accenture touts advance of enterprise AI in the region

Enterprise artificial intelligence (AI) is leaving the sandbox, shifting from back-end pilots to large-scale, customer-facing deployments.

AI Applications · 8 articles

WSJ· Today

Driverless Trucks Are Here—and They’re Delivering Bags of Doritos

PepsiCo has 41 trucks on the road in Arizona, Texas and Arkansas, bringing the technology into the mainstream.

Editor's pick · PAYWALL · Healthcare

Apple’s Revamped AI Is a Long-Term Play in a Down PC, Mobile Phone Market

A revived Siri unlikely to spur iPhone, Mac sales in the near future

NYT· Today

Why Your Next Diagnosis May Be Guided by an A.I. Helper

OpenEvidence, a fast-growing start-up, is using artificial intelligence to help doctors find answers to clinical questions for diagnosis and treatment.

Editor's pick · PAYWALL · Transportation & Logistics

Editor's pick · Transportation & Logistics

Uber Opens London Waitlist for Wayve Robotaxis Ahead of Launch

Uber Technologies Inc. has set up a waitlist for customers interested in riding in a robotaxi in London, ahead of a planned commercial debut in the UK this year.

Editor's pick · Consumer & Retail

CARVE-Q: Quantum-Proposed, Classically Certified Interactive Driving Repair

arXiv:2606.06531v1 Announce Type: new Abstract: The critical question after a correct driving veto is not only whether a maneuver is unsafe, but whether the blocked interaction admits a lawful, auditable, and responsibility-bounded repair. Prediction and game-theoretic planners can suggest plausible cooperation, yet they do not return a proof that the repair respects hard rules, right-of-way, cost allocation, and ego fallback. We introduce CARVE, Certified Affordable Repair of Vetoed maneuvers via Envelopes, a certificate architecture for prediction-free interactive repair. Given a vetoed maneuver, CARVE constructs a finite repair lattice and emits a structured certificate recording the binding rule, selected joint repair, right-of-way-scaled cooperation envelope, responsibility-weighted cost split, and ego-only fallback. This certificate view reveals the algorithmic bottleneck: multi-owner repair induces a product lattice $M = \prod_j |\mathcal{A}_j|$. We therefore introduce CARVE-Q, a verifier-shielded quantum-AI search layer that applies quantum minimum finding only to this black-box lattice while leaving all safety authority classical. In the conservative verifier-oracle model, exact classical minimum finding requires $\Theta(M)$ queries in the worst case, whereas Durr-Hoyer/Grover minimum finding uses $O(\sqrt{M})$ oracle queries with high probability. We prove verifier-shielded certificate soundness, priority non-elicitation, black-box query separation, and finite-precision reversible-oracle constructibility. We then demonstrate state-vector minimum finding on CARVE repair oracles up to 65,536 assignments and validate certificate preservation on Lanelet2-grounded INTERACTION replay with 100% right-of-way respect, 100% blame consistency, and zero priority false positives. The result is a trust-bounded quantum-AI pattern for certified autonomy: quantum proposes; CARVE certifies.
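
The query separation in the abstract can be illustrated with simple arithmetic. The action-set sizes below are hypothetical (the abstract does not say how the 65,536 assignments are composed), and the square-root figure shows scaling only, since the $O(\sqrt{M})$ bound hides constant factors.

```python
import math

# Hypothetical example: 4 interacting road users ("owners"), each with 16 candidate repair actions.
action_set_sizes = [16, 16, 16, 16]

M = math.prod(action_set_sizes)  # size of the product lattice, M = prod_j |A_j|
classical_worst_case = M         # Theta(M): every joint assignment may need one oracle query
quantum_scale = math.isqrt(M)    # sqrt(M) scaling, constant factors omitted

print(M, classical_worst_case, quantum_scale)  # 65536 65536 256
```

Adding one more owner with 16 actions multiplies M by 16 but the square-root term by only 4, which is why the product structure is the bottleneck the abstract singles out.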

Guardian· Yesterday

‘Poisoned’ AI: the ChatGPT shopping scams that lead to fake websites

Buyers are ripped off after assuming online stores were genuine because they are recommended by an AI tool. You want to buy a new bag and so you ask ChatGPT for help. You have always liked Russell & Bromley so you ask ChatGPT what is popular there at the moment. The artificial intelligence (AI) assistant gives you cross body, shoulder, casual and formal options with the prices listed beside them. You click through from the sources to what looks like the official Russell & Bromley site and buy your new bag, which is conveniently on sale.

Editor's pick · Healthcare

Evidence-Based Intelligent Diagnostic and Therapeutic Visualization System with Large Language Models: Multi-Turn Interaction and Multimodal Treatment Plan Generation

arXiv:2606.06869v1 Announce Type: new Abstract: Aim: Existing AI-assisted traditional Chinese medicine diagnostic tools suffer from opaque reasoning processes, passive interaction, and limited treatment plan presentation. This study proposes a knowledge-enhanced visual diagnostic system to improve the transparency and interpretability of syndrome differentiation and treatment. Methods: The system is built upon a Neo4j knowledge graph comprising 241 syndromes, 1,263 symptoms, and 2,485 relations. It incorporates a four-stage symptom matching pipeline (exact, semantic, fuzzy, and large language model verification), an information gain-driven proactive questioning strategy optimized with genetic algorithms, and a multimodal treatment presentation integrating artificial intelligence-generated illustrations, three-dimensional meridian-acupoint models, and evidence-based literature. Results: Knowledge graph constraints reduced non-standard outputs by 32%. Case studies validated the effectiveness of the interactive workflow across patient self-assessment, clinician-assisted diagnosis, and traditional Chinese medicine education. Automated paired-comparison evaluation across 30 cases further demonstrated significant improvements in diagnostic trust (Cohen's d = 1.82, p < 0.001), reduced cognitive load (improvements in four of five dimensions), and higher credibility of evidence-based references (4.21 vs. 2.95). Conclusions: The proposed system enhances the transparency of traditional Chinese medicine diagnostic reasoning and the interpretability of treatment plans through knowledge graph-driven visualization and multimodal interaction, offering a practical solution for trustworthy artificial intelligence-assisted traditional Chinese medicine applications.

HKJudge: A Legal Discourse-Annotated Corpus for Interpreting What Courts Find, How They Reason, and What They Rule

arXiv:2606.06679v1 Announce Type: cross Abstract: Court judgments are central to legal practice and jurisprudence, yet discourse analysis of Hong Kong judgments has received limited attention, owing largely to the absence of expert-annotated corpora. We introduce the Hong Kong Judgment Discourse Dataset (HKJudge), the first sentence-level expert-annotated legal discourse corpus. HKJudge includes criminal judgments across all five levels of HK's court hierarchy, comprising $\sim$290k sentences and $\sim$6.5 million tokens, fully annotated by legal linguistics experts. We design a two-tier discourse schema that captures what facts a court finds, how it reasons, and what it rules. At the sentence level, each sentence is assigned one of 26 rhetorical roles. At the span level, sentences are further annotated with three sentencing elements (charge, imprisonment term, fine). Ten legal linguistics annotators produced the annotations with an inter-annotator agreement of $\kappa = 0.8$. We formulate two tasks on HKJudge, termed rhetorical role classification and legal element extraction, and provide the first benchmark evaluation of four BERT-based models, two open-source LLMs under zero-shot and fine-tuning settings, and four commercial LLMs on both tasks. Our work demonstrates the value of sentence-level discourse annotation for modeling the structure of HK judgments and provides a rich data foundation for future work on legal judgment prediction. The HKJudge dataset and code are available at https://github.com/xuanxixi/HKJudge.

AI Measurement & Evaluation · 1 article

We should be getting better at AI by now

From cancelled novels to legal fines, the scale of blunders only halfway through the year suggests the opposite

AI Organisational Change · 1 article

AI disruption arrived 6 years early—now executives are drawing the line

New data shows no job is immune from AI disruption. But leaders say the real challenge isn't technology—it's governance.

AI Productivity Evidence · 1 article

Detective scaffolding for within-session reasoning development: a three-phase framework evaluated in polymer engineering and pre-university outreach

arXiv:2606.07279v1 Announce Type: cross Abstract: This paper presents a detective scaffolding framework -- a three-phase instructional sequence (Hypothesis Activation -> Evidence Structuring -> Causal Integration) in which engineering students investigate a realistic industrial defect scenario using staged in-class polls as designed evidence probes. Unlike conventional uses of student response systems for engagement, the framework positions each poll as an Evidence-Centred Design instrument targeting a specific reasoning capability. In the primary implementation, 80 Year 3 polymer engineering students progressed from prior-knowledge-driven misconception (71% attributing defects to temperature) to complete root-cause convergence (100% identifying humidity; Fisher's exact test, $p < .001$) across four sequenced prompts within a single 90-minute lecture slot. A dual-accuracy analysis revealed that at one intermediate stage, textbook-correct and analytically valid responses diverged, illustrating why conventional scoring can misrepresent reasoning quality. In a transferability study, 26 Year 12 students with no engineering background achieved identical root-cause identification rates across two adapted scenarios, with significant gains in data-analysis confidence and AI explanation ability. The results suggest that the pedagogical structure, rather than disciplinary content, drives the convergence effect, implying portability across disciplines and educational levels.

AI ROI & Business Case · 3 articles

Tech leaders are moving beyond AI hype: Here’s what’s actually working

Senior technology leaders from Mars, Orange, Reckitt, and Saint-Gobain discuss how to turn AI ambition into enterprise transformation.

Business Standard· Yesterday

Enterprise AI adoption low due to high token usage, low RoI: Cognizant CEO

Cognizant Chief Executive Officer (CEO) Ravi Kumar has said the big gap between what artificial intelligence (AI) can do in enterprises and a company’s AI adoption rate is due to high token consumption over the last few years without linking it to return on investments (RoI).

Henrico Dolfing· Yesterday

The AI Cost Stack: Why Professional Services Firms Are Looking for AI Savings in the Wrong Place

Professional-services firms increasingly discover that AI creates an entirely new layer of operational work. Models must be connected to data sources. Outputs must be integrated into workflows. Systems must interact with document repositories, ERP environments, CRM platforms, audit applications, ...

Geopolitics, Policy & Governance

17 articles

AI Geopolitics · 2 articles

FourWeekMBA· Yesterday

OpenAI Gives the EU a Cybersecurity-Specific GPT — The Geopolitics of Model Access - FourWeekMBA

OpenAI gives the EU a cybersecurity-specific GPT-5.5-Cyber model. Model access is becoming a geopolitical tool — building strategic dependency.

Guardian· Today

Silicon Valley including Meta has embraced Maga politics, says Nick Clegg

Meta’s former head of global affairs says executives pivoted right in some cases for ‘rather more self-interested’ reasons. Silicon Valley companies including Meta have decided to embrace Maga politics, some for “rather more self-interested” reasons, the former UK deputy prime minister Nick Clegg has said. Clegg, who spent nearly seven years at Meta as the head of global affairs, told The Rest is Money podcast that it felt like “a very good time for me to move on” when he left the company in March 2025, three months into the second Trump administration.

AI National Strategy · 5 articles

Daily Brew· 2 days ago

The Trump administration might take an equity stake in OpenAI

Reports suggest the Trump administration is considering acquiring an equity stake in OpenAI as part of its broader AI strategy.

Times of India· Yesterday

Who is Sriram Krishnan? Indian-origin Trump AI adviser backed by Elon Musk set to leave White House role

He was involved in drafting policy initiatives aimed at improving US competitiveness in AI, including regulatory reforms and proposals to reduce fragmented state-level AI laws. He also contributed to broader strategy discussions focused on national security, innovation, and global competition ...

OrissaPOST· Today

Long march to tech supremacy

As scientists, we had the uneasy privilege of witnessing China’s rise earlier than most. Long before a country’s regional or global dominance shows up in macroeconomic aggregates and stock valuations, it can be inferred from the kinds of signals that scientists notice: scholarly publications, ...

Daily Brew· Today

India Launches AI-Driven Mineral Exploration Hub to Boost Strategic Resource Security

India's Geological Survey is launching a new centre in Bengaluru to enhance mineral exploration using AI and machine learning.

Artificial Intelligence Newsletter | June 8, 2026· Today

South Korea taps former Naver CEO Han Seong-sook for prime minister

President Lee Jae Myung nominated former Naver CEO Han Seong-sook as prime minister, expecting her to drive the government's AI transformation agenda and support small businesses.

AI Policy & Regulation

10 articles

🇺🇸 Washington wants a piece of OpenAI· Today

Washington wants a piece of OpenAI

The U.S. government is increasingly focused on OpenAI's operations and influence in the AI sector.

Editor's pick · PAYWALL · Government & Public Sector

Britain’s questionable reliance on Palantir

Government should use the best technology for any task, but be wary of ‘lock-ins’

Editor's pick · PAYWALL · Government & Public Sector

FT· Today

We must not grant AI agents legal personhood

What kind of sanctions could keep a non-human corporation in check?

FT· Today

OECD chief urges governments not to go it alone on digital taxation

Mathias Cormann warns against fragmented approach as more countries consider duties on tech giants

Guardian· Today

Starmer gives tech firms ultimatum to block explicit images on children’s phones

Companies such as Apple and Google have until September to install software or face legislation, says PM. Apple and Google have been given until September to install software that blocks explicit images on children’s mobile phones or face legislation enforcing its requirement, Keir Starmer said on Monday. The prime minister said tech companies must activate nudity-detection algorithms or other technical solutions on smartphones and tablets to prevent users taking photos or sharing images of genitalia unless they are verified as adults.

Artificial Intelligence Newsletter | June 8, 2026· 3 days ago

US antitrust laws already ‘adaptable’ for AI, Calif. enforcer says

Competition policy stakeholders calling for new regulatory frameworks for AI would do well to remember that US antitrust laws are adaptable enough to handle innovative technologies, a California DOJ enforcer said.

Editor's pick · Transportation & Logistics

Theregister· Yesterday

Brit maritime agency heralds fresh global rules for crewless cargo ships

If you thought driverless cars were bad, imagine a 200,000 ton container ship

Artificial Intelligence Newsletter | June 8, 2026· 3 days ago

US SEC should cancel 2023 cyber disclosure rules, banking group says

A major banking group urged the SEC to eliminate its 2023 cybersecurity disclosure rules, arguing that the current four-day reporting window is irresponsible given the threats posed by frontier AI models.