AI Intelligence Brief

Wed 1 July 2026

Daily Brief — Curated and contextualised by Best Practice AI

246 Articles
Editor's pick · Summary

Lee bets on chips, Dell sacrifices margins, and labor stays

TL;DR: South Korean President Lee Jae Myung has announced an $880 billion investment to turn the nation's southwest into a global semiconductor hub. The Bank for International Settlements has warned that current hyperscaler spending levels mirror the dotcom bubble. Dell reported a 26% decline in gross margins as server sales outpace traditional hardware. Meanwhile, US firms spending heavily on AI are currently hiring staff faster than their peers, contradicting fears of immediate mass automation.

Editor's highlights

The stories that matter most

Selected and contextualised by the Best Practice AI team

12 of 246 articles
Lead story
Editor's pick · PAYWALL · Technology
Bloomberg· Yesterday

Lee’s $880 Billion AI Bet Ties Legacy to Korea’s Chip Boom

With South Korea already one of the biggest winners of the global AI boom, President Lee Jae Myung is now staking his legacy on a plan to transform the nation’s less developed southwest into a global chip hub.

Editor's pick · Professional Services
Arxiv· Today

Human Capital, AI, and Labor Commoditization

arXiv:2606.21880v2. Abstract: Has generative AI changed how labor markets value human capital? We study this question using contract-level data from Upwork, a large online labor market. We represent worker profiles with high-dimensional text embeddings, allowing us to capture rich human capital information from unstructured profile text. We then compute the predictive importance of workers' human capital information and posted hourly rates for client demand, and incorporate these measures into a difference-in-differences design around the release of ChatGPT. We find that in more AI-exposed job categories, the importance of human capital declines and the importance of price rises, suggesting a commoditization effect of AI on labor. Two additional findings support commoditization as a mechanism: The demand premium enjoyed by workers with strong human capital declines in more AI-exposed categories, and demand reallocates toward lower-priced workers. Our results have implications for the design of online labor markets, workers' incentives to invest in human capital, and labor welfare.

Editor's pick
Top Daily Headlines: Sysadmin broke hardware worth more than he made in a month – and lied his way out of the mess· Yesterday

How the AI bubble could pop and take down the global economy, according to the BIS

The Bank for International Settlements warns that the current hyperscaler capital expenditure binge shares similarities with the dotcom bubble.

Editor's pick · PAYWALL
FT· Yesterday

Heavy corporate AI spenders add staff faster than peers

Study of 22,000 US companies challenges fears that generative AI will trigger broad job losses

Editor's pick · PAYWALL · Technology
Bloomberg· Yesterday

US Lifts Export Restrictions on Anthropic’s Fable 5 AI Model

The US government removed foreign access restrictions on Anthropic PBC’s Fable 5 artificial intelligence model, clearing it for wider distribution after the startup resolved the Trump administration’s safety concerns.

Editor's pick · Technology
Fortune· Yesterday

Dell’s AI boom is real, but so is the profit margin hit nobody is pricing in

The company’s gross margin declined by 26% since Dell first reported AI server revenue over a year ago as servers surpass computers and laptops.

Editor's pick · Government & Public Sector
Silicon Canals· Yesterday

Europe’s new tech-sovereignty plan doesn’t ban U.S. cloud giants — it sets four levels of “sovereignty” for sensitive government data, and an American law makes the top levels nearly impossible for them to reach

On June 3 the European Commission proposed the Cloud and AI Development Act, which grades cloud providers on four sovereignty levels for public-sector data. The rules stop short of a ban, but the Commission’s own tech chief says U.S. firms will struggle to reach the highest tiers because ...

Editor's pick · Financial Services
VentureBeat· Yesterday

Morgan Stanley cut its riskiest reconciliation job in half — by making its agents less autonomous

Most enterprise AI deployments so far have focused on coding assistants and customer service bots. Morgan Stanley has deployed agents in one of banking's most accuracy-critical, deadline-driven workflows instead — profit and loss (P&L) reconciliation — and cut the work in half. The counterintuitive part: it got there by making the system less autonomous, not more. Humans stay tightly in the loop, and their decisions are iteratively turned into repeatable rules the system can apply on its own.

“It's much more like a co-worker than a copilot,” Morgan Stanley Managing Director Todd Johnson said at a recent VB AI Impact event. The internal production agentic system, known as FIXR, goes beyond simple, straightforward "gen AI 1.0" tasks. “We think that's where the opportunity is to really unlock more complex work in the organization.”

FIXR behind the scenes

Every trading day, Morgan Stanley’s trade desks handle the important work around transactions such as cash equities or debt investments. And, at the end of each of those days, controllers must reconcile P&L across the finance giant’s Finance, Risk, Operations, and Trade Capture systems. All that data must come together, and, perhaps not surprisingly, hundreds of thousands of attributes frequently fail to match. Typically, this means controllers must manually investigate each mismatch (or “break”), make decisions on adjustments, then ideally sign off before the number goes to the desk. And all of this while working on a hard morning deadline.

Previously, this could take up to six hours for a single book. Now, FIXR performs the task in two to three hours, Johnson said. Across the roughly 100 controllers who do this work, that adds up to about 1,500 hours saved per week.

After nightly P&L calculations complete, the system automatically analyzes “breaks” and proposes resolutions based on learned rules. Several agents work together: One interprets past guidance to develop start-of-day resolutions. One learns from controller behavior and documents the rules they apply. One converts repeated patterns into durable, automated logic.

Over time, the system can auto-clear certain breaks it’s encountered before, suggest solutions for others that may be less familiar, ask for help when it’s unsure, and flag for human investigation. When items are repeatedly resolved through the same method, it can create firm rules.

Critically, humans don’t leave the loop, but stay fully in it, he said. They review, approve or correct every recommendation, then feed those decisions back to improve the next run. The agent learns daily from controllers what it gets right and wrong and codifies that knowledge as it iterates.

“You still preserve that element of human accountability even as you start to automate,” Johnson said. “Over time you'll see more and more of those items resolved in an automatic way.” He emphasized that autonomy requires a great deal of trust; enterprises will not see efficiency gains if everyone's checking everything an agent does.

The human–agent feedback loop was critical to addressing the challenge of controlled, measured, and repeatable automation. “We recognized that all that intelligence that's sitting in the mind of a controller is gonna be difficult to get all into an agent on day one,” Johnson said.

Focus on process-first, extensibility

It was critical to establish processes first, before getting any AI involved, Johnson said. His team ran a “very thorough” process intelligence assessment that mapped and mined workflows to identify where automation would be the most advantageous: Was the answer agents, traditional automation, or simple re-engineering of an inefficient step?

“If we can fix that first before we add agents to the problem, then we really will be transforming the opportunity,” he said.

The P&L sign-off process was full of manual steps suitable for automation, and agents taking over some of these time-consuming tasks are freeing up controllers for “more value-added analysis” and “deeper risk consideration” work, he said. Extensibility, though, was just as important as time savings. Johnson’s team chose this particular P&L reconciliation use case because hundreds of controllers were doing this work globally across the business (in the Americas, Europe, Asia).

So start with a use case, prove it, extend it, “and then ultimately the transformation will be as we roll this out more and more across the organization,” Johnson said.

Deterministic by design

Johnson said the team also deliberately limited how much of the workflow depended on the model's judgment at all. "If you have an opportunity to make things very prescribed and repeatable, that's cheaper in terms of token consumption, it's more repeatable in terms of controls — and have the LLM do the stuff where you don't need that kind of deterministic workflow," he said. As the system sees more controller feedback on a given break type, Morgan Stanley converts that pattern into a fixed rule instead of leaving it to the model.

Humans still own the behavior

An interesting (and perhaps fundamental) question being raised at the dawn of the agentic era is: Are agents code or digital employees? Johnson argues that “they're probably a little bit of both,” and, as such, require nuance when it comes to governance and oversight. Technical teams must still be responsible for maintaining protections and guardrails like firewalls or encryption, for instance.

But there’s a new dynamic around the “performance element”: Humans using agents are responsible for them because it’s aiding their business work. For instance, if a senior controller is working with a junior controller, they don’t just relinquish responsibility because someone is helping them out, Johnson noted.

“One of our strong principles in our AI governance generally is that there always has to be human accountability, even if there's a degree of automation,” he said. But there typically isn’t “one single one person,” and the process is ultimately continuous. To this point, Johnson joked that one “depressing” thing about agentic AI is that it’s going to require ongoing training because models are ever-changing. “You're never gonna be able to say: ‘We've done all the evaluation and testing that we need to do. Let's just let it go.’ You're going to have to have a constant view as it evolves over time.”

Morgan Stanley is aiming at real enterprise pain points

Morgan Stanley's experience mirrors patterns VentureBeat has uncovered across enterprise AI deployments. In VentureBeat's recent VB Pulse survey, nearly three-quarters of respondents reported seeing little to no ROI from custom model fine-tuning, describing a "sandbox graveyard" of AI projects that proved too costly to maintain. This suggests that Morgan Stanley's process-first, buy-and-blend approach may be more sustainable than chasing bespoke models. The survey had 87 respondents and findings should be considered directional. Governance emerged as another common challenge: 38% of respondents cited the lack of a single accountable owner as their biggest barrier to production AI, while only two of the 87 enterprises surveyed had active monitoring and alerting in place to detect model failures.
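
The rule-promotion loop the article describes (controllers resolve breaks, and repeated identical resolutions become fixed rules) can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: the class name, the break and resolution labels, and the promotion threshold of three are assumptions, not details of FIXR. In the workflow described above, a controller would still review whatever the system proposes.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical threshold: how many identical controller-approved resolutions
# are needed before a pattern is promoted to a fixed rule (not from the article).
PROMOTION_THRESHOLD = 3


@dataclass
class BreakResolver:
    # break_type -> resolution that has been promoted to a deterministic rule
    rules: dict = field(default_factory=dict)
    # (break_type, resolution) -> number of times a controller approved it
    history: Counter = field(default_factory=Counter)

    def propose(self, break_type):
        """Return (resolution, source); source is 'rule' or 'needs_human'."""
        if break_type in self.rules:
            return self.rules[break_type], "rule"
        return None, "needs_human"

    def record_decision(self, break_type, resolution):
        """Log a controller's decision and promote it once it repeats enough."""
        self.history[(break_type, resolution)] += 1
        if self.history[(break_type, resolution)] >= PROMOTION_THRESHOLD:
            self.rules[break_type] = resolution


resolver = BreakResolver()
for _ in range(3):
    resolver.record_decision("fx_rate_mismatch", "use_risk_system_rate")

print(resolver.propose("fx_rate_mismatch"))  # a promoted rule
print(resolver.propose("missing_trade"))     # no rule yet, goes to a human
```

The point of the design is the one Johnson makes: once a pattern is a plain lookup, applying it needs no model call, which is cheaper and easier to control than asking the LLM each time.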

Editor's pick · Professional Services
Arxiv· Today

The Organizational Behavior of Agentic AI: Collective Intelligence in Human-Agent Workflows

arXiv:2606.30986v1. Abstract: Agentic artificial intelligence is increasingly deployed not as a single assistant but as a collective of planners, solvers, reviewers, memory managers, tool users, and orchestrators. These systems are entering organisational workflows under familiar labels such as teams, managers, committees, markets, and workflows. This article asks whether such agent collectives exhibit organisational behaviour in a sense that is analytically comparable to, yet distinct from, human organisational behaviour. I argue that agentic AI is a partial organisational analogue. It resembles a human organisation because it differentiates work, coordinates interdependence, performs recurrent routines, crosses boundaries, and produces collective outcomes. It differs because these patterns are not sustained by motivation, identity, trust, employment, socialisation, or moral accountability. They are sustained by context architecture: prompts, memory, traces, schemas, tools, validators, and permissions. The article develops contextual transaction cost as the central mechanism linking these similarities and differences. Computational theorising, synthetic task simulations, real LLM agent traces, and robustness analyses show that human-imitation forms often underperform when they add lossy handoffs, correlated deliberation, and verification burdens, whereas shared-state and adaptive forms perform better when they make context durable, inspectable, and task-contingent. The article contributes to organisation studies by theorising agentic AI as an emerging object of organising and by specifying the interface conditions under which human and agentic organisational behaviour can jointly support collective intelligence.

Editor's pick · Technology
Reuters· Yesterday

China's Meituan says new AI model trained on domestic chips

An aircraft about the size of a car crashed into Beijing's tallest building on Friday, witnesses told Reuters, with police closing off roads around the skyscraper and authorities giving no information about the incident.

Editor's pick · Technology
Top Daily Headlines: Sysadmin broke hardware worth more than he made in a month – and lied his way out of the mess· Yesterday

Zuck saves Meta bucks by reusing memory from old servers with a custom CXL ASIC

Meta is using a custom CXL ASIC to reuse memory from older servers, resulting in a 25% reduction in machines needed for certain inference workloads.

Editor's pick · Technology
Arxiv· Today

When Does Learning to Stop Help? A Cost-Aware Study of Early Exits in Reasoning Models

arXiv:2606.30852v1. Abstract: Reasoning models spend different amounts of useful computation across instances, but it remains unclear when a learned stopping rule improves over simple confidence or convergence thresholds. We study this question with LearnStop, a hidden-state-free checkpoint stopper for reasoning language models. At fixed budget checkpoints, LearnStop probes a short answer from the current reasoning prefix and predicts prefix correctness from online features such as answer confidence, entropy, prefix vote share, answer stability, and backtracking-marker density. Across 18 task-model settings spanning GSM8K, MATH-500, MMLU-Pro, AIME-90, GPQA, Qwen3, and DeepSeek-R1 distillations, the answer is task-dependent. On free-form math, learned multi-feature stopping improves the fixed-budget frontier and often beats scalar exits: on GSM8K with Qwen3-32B, the empirical frontier reaches a post-hoc peak adapt gain of +0.157, validation-selected operating points preserve positive gains, and the paired gain over the strongest scalar baseline is +0.028. On multiple-choice and very hard settings, scalar confidence, entropy, or stability rules are competitive or stronger. We therefore frame learned stopping not as a universal replacement for scalar exits, but as a tool whose value depends on trajectory structure. We further provide validation-selected operating points, paired bootstrap tests, finite-grid lost-correct risk calibration, cost accounting under KV-fork, prefix-cache, and black-box regimes, H100 serving profiles, checkpoint-schedule sweeps, transfer analyses, and robustness checks. The main practical finding is that learned stopping is useful when many questions become correct before full budget but do not exhibit a single reliable scalar stopping signal; its benefits largely disappear when confidence or answer convergence already solves the stopping problem.
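
For readers unfamiliar with the "scalar exits" the abstract compares against, the sketch below shows one plausible form of such a baseline: stop at the first checkpoint where the probed answer is confident enough or has stayed the same for a few checkpoints. This is not LearnStop and not the paper's code; the function name, the thresholds, and the `(answer, confidence)` probe format are assumptions made for illustration.

```python
def stop_checkpoint(probes, conf_threshold=0.9, stable_for=2):
    """Pick the checkpoint at which to stop reasoning.

    probes: non-empty list of (answer, confidence) pairs, one per budget
            checkpoint, in order (a hypothetical format for this sketch).
    Returns the index of the first checkpoint where either the confidence
    reaches conf_threshold or the answer has been identical for stable_for
    consecutive checkpoints; otherwise the last index (full budget).
    """
    if not probes:
        raise ValueError("probes must contain at least one checkpoint")
    run = 0  # length of the current streak of identical answers
    for i, (answer, conf) in enumerate(probes):
        if i > 0 and answer == probes[i - 1][0]:
            run += 1
        else:
            run = 1
        if conf >= conf_threshold or run >= stable_for:
            return i
    return len(probes) - 1


trajectory = [("12", 0.40), ("15", 0.50), ("15", 0.60), ("15", 0.95)]
print(stop_checkpoint(trajectory))                # stability fires at index 2
print(stop_checkpoint(trajectory, stable_for=3))  # exits at index 3
```

A learned stopper of the kind the abstract describes would replace the two hand-set thresholds with a model that predicts prefix correctness from several such features at once.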

Economics & Markets

56 articles
AI Investment & Valuations
25 articles
Editor's pick · PAYWALL · Technology
FT· Yesterday

Magnificent Seven stocks shed $2.2tn in Wall Street tech rotation

Investors switch to soaring chipmakers benefiting from hyperscalers’ vast AI spending

Editor's pick · Technology
Fortune· Yesterday

Dell’s AI boom is real, but so is the profit margin hit nobody is pricing in

The company’s gross margin declined by 26% since Dell first reported AI server revenue over a year ago as servers surpass computers and laptops.

Editor's pick · PAYWALL · Financial Services
Bloomberg· Today

OCBC to Raise Annual Tech Spending to More Than $771 Million

Oversea-Chinese Banking Corp. plans to increase its annual technology spending, including on artificial intelligence, to more than S$1 billion ($771 million), said a top executive.

Editor's pick · Technology
VentureBeat· Yesterday

Anthropic launches Claude Sonnet 5 at a steep discount to its top model as the company races toward a blockbuster IPO

Anthropic today released Claude Sonnet 5, a new AI model that the company says delivers near-flagship performance at mid-tier prices — a move designed to give cost-conscious enterprise developers access to powerful agentic capabilities just as the San Francisco-based AI lab barrels toward an initial public offering that will test whether the private market's staggering AI valuations can survive public scrutiny.

The release, which Anthropic describes as "the most agentic Sonnet model yet," makes Sonnet 5 the default model for users on Anthropic's Free and Pro plans, while also making it available to Max, Team, and Enterprise customers. Introductory API pricing is set at $2 per million input tokens and $10 per million output tokens through August 31, after which it rises to $3 and $15 respectively — still well below the $5 input and $25 output pricing of Anthropic's top-of-the-line Opus 4.8.

The strategic logic is unmistakable: Anthropic is trying to democratize access to capabilities that until very recently only its most expensive models could deliver, while building the kind of broad-based developer adoption that will look attractive in an S-1 filing.

Sonnet 5 benchmarks show the mid-tier model closing in on Anthropic's flagship Opus

Sonnet 5 posts major gains over its predecessor, Sonnet 4.6, across every evaluation Anthropic disclosed. On SWE-bench Pro, an agentic coding benchmark, Sonnet 5 scores 63.2% compared with Sonnet 4.6's 58.1% — a jump that brings it within striking distance of Opus 4.8's 69.2%. On Terminal-Bench 2.1, another coding evaluation, the gap narrows further: 80.4% for Sonnet 5 versus 67.0% for Sonnet 4.6 and 82.7% for Opus 4.8.

In multidisciplinary reasoning, as measured by Humanity's Last Exam, Sonnet 5 scores 43.2% without tools and 57.4% with tools — the latter figure essentially matching Opus 4.8's 57.9%. On computer use tasks evaluated through OSWorld-Verified, Sonnet 5 reaches 81.2%, up from 78.5%. And on GDPval-AA v2, a knowledge-work benchmark, it scores 1,618 — surpassing Opus 4.8's 1,615 and far exceeding Sonnet 4.6's 1,395.

The pattern across these evaluations tells a consistent story: Sonnet 5 doesn't merely inch forward from its predecessor. It vaults into a performance tier that overlaps substantially with Anthropic's flagship model, while costing roughly 40% less per token at standard pricing and 60% less during the introductory period.

Enterprise partners say Sonnet 5's agentic AI capabilities finish jobs that previous models abandoned

The emphasis on agentic capabilities — the ability to plan, use tools like browsers and terminals, and execute multi-step workflows autonomously — reflects where the AI industry's center of gravity has shifted in 2026. Enterprises are no longer simply asking chatbots questions; they are deploying AI systems that can navigate complex software environments, execute multi-step coding tasks, and operate with minimal human supervision.

Early access partners painted a picture of a model that doesn't just start tasks but finishes them. Sualeh Asif, co-founder of Cursor, the AI-powered code editor that has become a bellwether for developer tool adoption, said that "with Claude Sonnet 5, agents stay on plan, follow our conventions, and ship clean multi-step changes, all at an efficient cost." Daniel Shepard, a senior engineer at Zapier, described handing the model a two-part automation job — updating Salesforce account tiers and sending a launch announcement — that "used to stall halfway" with previous models but now completes end to end.

These testimonials matter because they describe exactly the kind of reliability gap that has kept many enterprises from moving agentic AI from pilot programs to production deployments. A model that gets 80% of the way through a complex task before stalling creates more problems than it solves; one that reliably completes the full workflow changes the economics of automation.

Anthropic also introduced cost-performance curves showing that developers can now adjust effort levels across Sonnet 5 and Opus 4.8 to find the optimal balance of cost and accuracy for their specific use case — a granularity that reflects growing sophistication in how enterprises consume AI services.

An updated tokenizer boosts Sonnet 5 performance but could quietly raise costs for some workloads

One technical detail buried in the announcement's footnotes deserves attention: Sonnet 5 uses an updated tokenizer that changes how the model processes text, similar to the change Anthropic introduced with Opus 4.7. The tradeoff is that the same input can map to roughly 1.0 to 1.35 times as many tokens depending on content type. Anthropic says the introductory pricing is calibrated to make the transition "roughly cost-neutral," but enterprise customers running high-volume workloads will want to benchmark their specific use cases carefully before assuming their bills won't change.

Anthropic says Sonnet 5 is safer than its predecessor, but its most capable models still lead on alignment

Anthropic's safety disclosures reveal a nuanced picture. The company reports that Sonnet 5 shows lower rates of hallucination and sycophancy than Sonnet 4.6, is better at refusing malicious requests, and is more resistant to prompt injection attacks in agentic contexts. On Anthropic's automated behavioral audit — which tests for a wide range of misaligned behaviors including cooperation with misuse and deception — Sonnet 5 scored lower (meaning safer) overall than Sonnet 4.6.

However, Sonnet 5 showed "somewhat higher rates of misaligned behavior" compared with the more capable Opus 4.8 and Anthropic's Claude Mythos Preview, the company's powerful but tightly restricted cybersecurity-focused model. On a Firefox 147 exploit development evaluation created in collaboration with Mozilla, neither Sonnet model could develop a working exploit — both scored 0.0% — though Sonnet 5 showed a slightly higher partial success rate (13.2%) than Sonnet 4.6 (8.8%). Both remain far below Opus 4.8 (68.8% working exploits) and Mythos 5 (88.4%).

Because of these incremental gains in cyber-adjacent capabilities, Anthropic launched Sonnet 5 with cyber safeguards enabled by default — real-time systems that detect and block dangerous cybersecurity usage. The safeguards mirror those on Opus 4.7 and 4.8 but are less restrictive than those applied to Fable 5, the latest Mythos-class model that Bloomberg reported on June 10 is "blocked from responding to queries related to cybersecurity and biology." Organizations enrolled in Anthropic's Cyber Verification Program automatically receive the same access on Sonnet 5 without needing to reapply.

From $14 billion to $47 billion in revenue: Sonnet 5 arrives as Anthropic's IPO narrative takes shape

The Sonnet 5 launch arrives at what may be the most consequential moment in Anthropic's short history. The company confidentially filed its IPO prospectus with the SEC in early June, setting up what CNBC has described as "the most scrutinized public offering in tech history."

The financial trajectory has been extraordinary. In February, Anthropic raised $30 billion at a $380 billion valuation, with the company reporting $14 billion in annualized revenue that had "grown more than tenfold in each of the past three years," as The Guardian reported. By late May, Anthropic had closed a $65 billion Series H round at a $965 billion post-money valuation — co-led by Altimeter Capital, Sequoia Capital, and others — with a revenue run rate that had crossed $47 billion.

Harrison Rolfes, an analyst at PitchBook, told CNBC that the number that will "either validate or collapse the entire narrative the private markets have been pricing for three years" won't be the valuation or revenue, but gross margin — a figure no outside observer has yet seen.

In this context, Sonnet 5 serves a dual purpose. For developers, it offers genuine capability improvements at competitive prices. For Anthropic's IPO narrative, it demonstrates the company can deliver a compelling product at a price tier that could drive the kind of broad adoption Wall Street rewards — high-volume, recurring API revenue from thousands of enterprise customers.

Government deals and growing competition define the market Sonnet 5 enters

The timing also aligns with Anthropic's aggressive push into institutional contracts. Just yesterday, California Governor Gavin Newsom announced a first-of-its-kind partnership providing Claude to all state agencies at a 50% discount, with free workforce training. Kate Jensen, Anthropic's Head of Americas, called it an effort to "put Claude to work for the people who keep this state running." The deal — which extends to California's cities and counties — represents exactly the kind of durable, recurring adoption that could anchor revenue well beyond the developer community.

But Anthropic's release lands in an increasingly crowded field. OpenAI, which raised a $122 billion round in March at an $852 billion valuation, is pursuing its own IPO. Elon Musk's SpaceX, which merged with xAI, priced its IPO at $135 per share with a $1.77 trillion valuation. Google, Meta, and a growing wave of well-funded competitors — including Asian AI startups that, as the Wall Street Journal has reported, are developing Mythos-like cybersecurity capabilities — are all vying for the same enterprise market.

Gil Luria, head of technology research at D.A. Davidson, told CNBC that while Anthropic "appears to have the lead" in frontier AI models, "much of their current usage is for trials and experimentation and that may not sustain." That observation cuts to the heart of the challenge facing every frontier AI lab: converting experimental developer usage into durable, production-grade revenue.

The real test for Sonnet 5 isn't benchmarks — it's whether cheaper AI can sustain a trillion-dollar story

Sonnet 5's positioning — offering near-Opus performance at Sonnet prices — is a direct play for that conversion. Enterprise customers experimenting with expensive Opus-class models may find that Sonnet 5 delivers sufficient quality for production workloads at a price point that finance teams can approve at scale. If it works, it could accelerate the shift from experimentation to deployment that every AI company needs to justify its valuation.

Three things will determine whether Sonnet 5 matters beyond the initial benchmark charts. Real-world agentic reliability is the first: benchmarks measure capability, but production deployments measure consistency, and the true test will come when thousands of developers push the model through messy, unpredictable workflows at scale. The tokenizer economics are the second: the updated tokenizer's 1.0 to 1.35x token expansion could quietly erode the pricing advantage for certain workloads, and enterprise customers should run their own cost analyses rather than relying on headline per-token prices. The third is the IPO narrative itself: when Anthropic's S-1 eventually becomes public, investors will scrutinize whether the Sonnet tier — cheaper but high-volume — or the Opus tier — expensive but high-margin — drives the bulk of revenue and, critically, gross profit.

As PitchBook's Rolfes told CNBC, the 2026 IPO window "either becomes the most consequential IPO cycle since the dot-com era or the most expensive lesson in narrative-versus-fundamentals that public markets have ever taught." Anthropic is betting that a model good enough to rival its flagship and cheap enough to run at scale is the product that closes the gap between those two outcomes. The public markets will soon decide whether they agree.
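
The article's advice to "run their own cost analyses" can be made concrete with a small back-of-the-envelope calculation. The per-million-token prices and the 1.0 to 1.35x range are the figures quoted above; the workload size (10 million input and 2 million output tokens, counted with the older tokenizer) is invented for illustration, and applying one expansion factor uniformly to both input and output tokens is a simplifying assumption rather than anything Anthropic has stated.

```python
# USD per million tokens (input, output), as quoted in the article.
PRICES = {
    "sonnet5_intro": (2.0, 10.0),
    "sonnet5_standard": (3.0, 15.0),
    "opus48": (5.0, 25.0),
}


def cost_usd(tier, input_tokens, output_tokens, expansion=1.0):
    """Workload cost in USD.

    expansion scales both token counts to model the updated tokenizer
    (assumption: the same factor applies to input and output).
    """
    price_in, price_out = PRICES[tier]
    raw = input_tokens * price_in + output_tokens * price_out
    return raw * expansion / 1_000_000


def discount_vs_opus(tier):
    """Fractional saving on the headline input price relative to Opus 4.8."""
    return 1 - PRICES[tier][0] / PRICES["opus48"][0]


# Hypothetical workload: 10M input tokens, 2M output tokens.
for tier in ("sonnet5_intro", "sonnet5_standard"):
    low = cost_usd(tier, 10_000_000, 2_000_000, expansion=1.0)
    high = cost_usd(tier, 10_000_000, 2_000_000, expansion=1.35)
    print(f"{tier}: ${low:.2f} to ${high:.2f}, "
          f"headline discount vs Opus {discount_vs_opus(tier):.0%}")
```

Under these assumptions, the top of the expansion range raises a bill by 35% over what the headline price suggests, which is why the per-token discount alone does not settle the comparison for a given workload.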

Editor's pick
Top Daily Headlines: Sysadmin broke hardware worth more than he made in a month – and lied his way out of the mess· Yesterday

How the AI bubble could pop and take down the global economy, according to the BIS

The Bank for International Settlements warns that the current hyperscaler capital expenditure binge shares similarities with the dotcom bubble.

Editor's pick · PAYWALL · Financial Services
Bloomberg· Today

Japan Investment Corp.'s Hata on AI, Strategic Sectors

Yuka Hata, Senior Managing Director & Head of Fund Investments at Japan Investment Corporation, says demand for capital is growing as Japan shifts its focus toward physical AI and deep technologies. She adds that Japan’s labor shortages have heightened the need for AI-driven solutions, making the sector a key investment priority for JIC. (Source: Bloomberg)

Editor's pick · Technology
Guardian· Yesterday

Rocky week for AI as shares slump but no sign of crash – yet

The markets are souring on artificial intelligence, but is this the bubble being burst? Meanwhile, California proposes a tax on billionaires

Hello, and welcome to TechScape. I’m Blake Montgomery, US tech editor at the Guardian, writing to you after fending off sunburns at the beach. Today, we’re discussing a rocky week for the AI industry’s finances and how California’s proposed billionaire’s tax is changing the political posture of the state’s governor.

Impact of social media ban for under-16s in UK hinges on how firm it is
UK under-16s social media ban: which apps will be blocked and how will it work?
‘Tech firms are losing the public’: social media age bans near tipping point
OpenAI staggers AI model release after Trump administration request
Meta pauses employee tracker for AI training amid privacy concerns
‘It’s dangerous and it’s going to erode trust’: redesign of US government websites stokes surveillance fears
California billionaire tax will appear on ballot after deadline for deal passes

Editor's pick · PAYWALL · Technology
Bloomberg· Today

Panasonic Targets Further AI Growth to Build on Record Valuation

Panasonic Holdings Corp., the Japanese electronics conglomerate, is attracting renewed investor attention as a beneficiary of the artificial-intelligence boom.

Editor's pick · PAYWALL · Financial Services
NYT· Yesterday

Stocks Notch Strongest Quarter Since 2020

The S&P 500 rose almost 15 percent for the three months through June, and many stock analysts remain optimistic that corporate earnings driven by artificial intelligence will keep growing.

Editor's pick · Technology
Business Insider· Yesterday

Top Economist Warns of Market Repricing Amid Slow Returns on AI Investments

Apollo's chief economist says that returns on AI investment are likely confined to the tech sector for now, warning of a "slower cash flow reality."

Editor's pick · Financial Services
Win Investing· Yesterday

Investors Rethink AI Exposure as Technology Valuations Face Scrutiny

Investors are reassessing AI exposure as technology valuations face scrutiny, with markets focusing on earnings, risks and long-term growth.

Editor's pickTechnology
International Business Times· Yesterday

The AI Boom Has Sent Major Stock Indexes Soaring. But The Magnificent Seven Tell a Different Story.

The seven largest U.S. technology companies have seen trillions erased in market value this month even as chipmakers and memory suppliers extend a separate rally tied to artificial intelligence.

Editor's pickTechnology
Investorideas· Yesterday

Mag 7 Value Drops $2.3 Trillion as 'Great AI Reckoning' Begins

deVere CEO Nigel Green warns the Magnificent Seven faces its Great AI Reckoning as $2.3 trillion is wiped from Mag 7 value during June. Investors now demand proof of AI returns, not just infrastructure spending.

Editor's pickTechnology
Yahoo! Finance· Today

Amazon (AMZN) Launches $1 Billion AI Engineering Unit Inside Customer Organizations

Amazon Web Services is creating a new Forward Deployed Engineering division funded with US$1 billion. The unit will embed specialist AI engineers directly inside customer organizations. The aim is to speed up enterprise adoption of AI and agentic AI systems through hands-on co-development.

Editor's pickFinancial Services
Archyde· Yesterday

AI Boom's Ripple Effects: How Growth Threatens Credit Stability, BIS Warns

The Bank for International Settlements (BIS) has issued a warning about the risks of an AI bust, highlighting potential ripple effects on global economic

Editor's pickFinancial Services
HNGN· Yesterday

Bank for International Settlements Warns AI Investment Boom Could Create Global Financial Risks

The global race to dominate artificial intelligence has fueled one of the largest investment surges in modern corporate history. Now, one of the world’s most influential financial institutions is warning that the boom could carry risks extending far beyond Silicon Valley.

Editor's pickTechnology
Benzinga· Yesterday

The AI ETF Trade Has Evolved: Here's What Investors Bought Instead In H1

Investors didn't just buy AI, they bought what AI needs. AI infrastructure ETFs gained traction in H1, driven by demand for memory chips, power, HVAC and data center themes.

Editor's pickFinancial Services
Business Standard· Yesterday

BIS flags AI spending boom as growing threat to global financial stability

Bank for International Settlements says AI has supported global growth, but warns that opaque financing, rising debt and trillion-dollar infrastructure spending could become a systemic financial risk

Editor's pickTechnology
Memeburn· Yesterday

AI Chip Stocks Surge 100% in Their Best Half-Year Ever

AI chip stocks posted historic H1 2026 gains — SOXX up 108%, SK Hynix up 250%, AMD up 130%. See which chipmakers lead the AI boom — and what risks lie ahead.

Editor's pickTechnology
Traders Union· Yesterday

Microsoft stock consolidates as AI infrastructure spending fears hit valuation

Microsoft trades at $369.1 today, down 0.78%. Get the latest on MSFT stock performance, technicals, and AI-driven pressures.

Editor's pickFinancial Services
Best Startup US· Yesterday

US Startup Funding Rounds June 2026: 20 Biggest Deals Ranked

The top 20 US startup funding rounds June 2026 ranked by capital raised. Baseten $1.5B, Ramp $750M, Cyera $600M lead a record month of AI and fintech megarounds. Fully verified data.

Editor's pickTechnology
TradingView· Yesterday

TSMC vs. NVIDIA: Which AI Semiconductor Stock Should You Buy in July?

AI semiconductor stocks took center stage in June as continued enterprise AI adoption, sustained hyperscaler investments in AI infrastructure and a steady stream of product and manufacturing updates strengthened the sector's long-term growth prospects. With July set to begin, this is an ...

Editor's pickTelecommunications
Daily Brew· Yesterday

Rocket Lab to Acquire Iridium

Rocket Lab has announced a historic deal to acquire Iridium, aiming to create a fully integrated space and communications company.

Editor's pickPAYWALLFinancial Services
FT· Yesterday

US stocks chalk up biggest quarterly gain in six years

Investors navigate Iran war fallout, chip stock volatility and blockbuster SpaceX IPO

Editor's pick
Ventureburn· Yesterday

AI Statistics 2026: Spending, Adoption and Sentiment

A comprehensive breakdown of the hard data behind the billions spent, the actual phases of adoption and the evolving human perspective.

AI Pricing & Cost Curves · 7 articles
Editor's pickTechnology
🧠 Meta Brain2Qwerty v2 hits 78% word accuracy, no surgery needed· Yesterday

10 open-source GitHub tools that scrape any website for free

A curated list of open-source tools like Firecrawl, browser-use, and Crawlee enables developers to build AI training datasets and RAG pipelines without paid APIs.

Editor's pickTechnology
Arxiv· Today

When Does Learning to Stop Help? A Cost-Aware Study of Early Exits in Reasoning Models

arXiv:2606.30852v1 Announce Type: new Abstract: Reasoning models spend different amounts of useful computation across instances, but it remains unclear when a learned stopping rule improves over simple confidence or convergence thresholds. We study this question with LearnStop, a hidden-state-free checkpoint stopper for reasoning language models. At fixed budget checkpoints, LearnStop probes a short answer from the current reasoning prefix and predicts prefix correctness from online features such as answer confidence, entropy, prefix vote share, answer stability, and backtracking-marker density. Across 18 task-model settings spanning GSM8K, MATH-500, MMLU-Pro, AIME-90, GPQA, Qwen3, and DeepSeek-R1 distillations, the answer is task-dependent. On free-form math, learned multi-feature stopping improves the fixed-budget frontier and often beats scalar exits: on GSM8K with Qwen3-32B, the empirical frontier reaches a post-hoc peak adapt gain of +0.157, validation-selected operating points preserve positive gains, and the paired gain over the strongest scalar baseline is +0.028. On multiple-choice and very hard settings, scalar confidence, entropy, or stability rules are competitive or stronger. We therefore frame learned stopping not as a universal replacement for scalar exits, but as a tool whose value depends on trajectory structure. We further provide validation-selected operating points, paired bootstrap tests, finite-grid lost-correct risk calibration, cost accounting under KV-fork, prefix-cache, and black-box regimes, H100 serving profiles, checkpoint-schedule sweeps, transfer analyses, and robustness checks. The main practical finding is that learned stopping is useful when many questions become correct before full budget but do not exhibit a single reliable scalar stopping signal; its benefits largely disappear when confidence or answer convergence already solves the stopping problem.

Editor's pickTechnology
VentureBeat· Yesterday

Google unveils Nano Banana 2 Lite aka Gemini 3.1 Flash-Lite for low cost, 4-second fast enterprise image generations

Google is upgrading its AI image generation capabilities today with the debut of Nano Banana 2 (NB2) Lite, an optimized model built for rapid execution and tight infrastructure budgets. Technically designated as Gemini 3.1 Flash-Lite Image on Google's application programming interface (API), NB2 Lite is positioned as the fastest and most cost-effective option within Google's creative model family, capable of generating images in 4 seconds at a flat rate of $0.034 per 1,000 images. It's available immediately to enterprise developers through Google AI Studio, the Gemini API, and the Gemini Enterprise Agent Platform (GEAP). It's not quite as fast or customizable as startup Krea's new, partially open-licensed Krea 2 Turbo (which allows for open modification and commercial usage by small enterprises), but the big selling point here is the low price and bundling with Google's larger Workspace and AI offerings. This release lands alongside the public preview of Gemini Omni Flash, a multimodal conversational video generation and editing model. However, while Omni Flash represents Google's long-term bet on agentic video manipulation, Nano Banana 2 Lite is the immediate infrastructure workhorse, tailored specifically for high-throughput commercial application, rapid programmatic prototyping, and automated asset generation workflows.
The technology of speed
At its core, Nano Banana 2 Lite is built directly upon the Gemini 3.1 Flash-Lite architecture, engineered to solve the persistent tension between computational latency and operational overhead. In high-velocity enterprise frameworks, traditional large-scale image models introduce significant friction due to multi-second processing delays and high per-token costs. Google's new lightweight model circumvents these bottlenecks by generating a standard 1k resolution image in under four seconds.
This represents a stark performance optimization over its legacy predecessor, Nano Banana (Gemini 2.5 Flash Image), achieved through targeted enhancements in core baseline capabilities. According to internal documentation, the model features upgraded world knowledge for drafting rough data visualizations and contextual layouts, enhanced character consistency to preserve identity across continuous image streams, and localized typographic rendering capabilities. The trade-offs inherent to this "Lite" designation are transparently outlined in Google’s technical data sheets. Unlike the broader standard Nano Banana 2 (NB2) and Nano Banana Pro (NB Pro) lines, which support versatile multi-resolution scaling across 1k, 2k, and 4k outputs, Nano Banana 2 Lite restricts its resolution support exclusively to a 1k canvas. Yet, within this specialized operational boundary, the architectural tuning yields surprising competitive efficiencies. In standardized internal benchmarks, Nano Banana 2 Lite achieved a Text to Image arena Elo score of 1251. This score comfortably eclipses the legacy NB1 score of 1151 and remarkably edges out the bulkier, more expensive NB Pro, which sits at 1245 in the same text-to-image track. For specialized editing tasks, the model maintains a single-image editing Elo score of 1308 and a multiple-image editing score of 1294, providing a highly optimized sweet spot for real-time applications.
A boost to rapid prototyping and marketing research
From a product implementation perspective, Google is marketing Nano Banana 2 Lite not as an artistic engine, but as an invisible, high-throughput utility layer for automated workflows. The target demographic spans software engineers, programmatic ad platforms, and digital commerce applications where rapid iteration is crucial. Think real-time A/B testing for thousands of targeted advertising variations or immediate layout adjustments on localized storefronts.
Google highlights three specific production environments where the model excels. First, its world knowledge allows systems to instantly draft accurate contextual scenes or location-specific mockups. Second, its character consistency handles the rigorous demands of storyboarding tools and digital fashion try-ons, where keeping object fidelity static across sequential generations is historically difficult. Finally, its text rendering improvements mean legible copy can be embedded directly into rapid ad generations, allowing teams to verify layout compatibility across various languages on the fly. Developers should note, however, that while native image generation operates with lowest-latency profiles, conditional image editing tasks may experience marginally higher response times due to the secondary processing layers required to rewrite existing pixels.
Licensing and access
The deployment mechanism of Nano Banana 2 Lite via proprietary APIs underscores an enterprise-first commercial licensing strategy. Unlike open-weights models that developers can pull down to run locally under open-source frameworks like Apache 2.0 or modified OpenRAIL licenses, Google’s latest models remain tightly integrated into its managed cloud stack. For enterprises, this eliminates the operational complexity of hosting hardware but binds usage strictly to Google’s metered pricing terms.
Financially, this commercial strategy is highly aggressive. At $0.034 per 1,000 images across both AI Studio and GEAP channels, the model undercuts the older, less capable NB1 model ($0.039) and slashes costs dramatically compared to standard NB2 ($0.067) and NB Pro ($0.134) tiers. Internal notes indicate that the model delivers roughly 60–70% of the general capability of NB2 and NB Pro while executing at significantly higher speeds and a fraction of the cost.
By lowering the fiscal barrier to high-frequency image generation, Google is making a direct play to lock enterprise developers into its commercial platform ecosystem.
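The price comparison in the VentureBeat piece above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the four quoted figures at face value and assumes they all share the unit the article gives for NB2 Lite (US dollars per 1,000 images). The names `PRICES_PER_1K`, `batch_cost` and `saving_vs` are hypothetical helpers introduced here, not part of any Google API.

```python
# Prices as quoted in the article, in USD.
# ASSUMPTION: every figure uses the same unit (per 1,000 images);
# the article states the unit explicitly only for NB2 Lite.
PRICES_PER_1K = {
    "NB2 Lite": 0.034,
    "NB1": 0.039,
    "NB2": 0.067,
    "NB Pro": 0.134,
}


def batch_cost(model: str, images: int) -> float:
    """Cost in USD of generating `images` images at the quoted flat rate."""
    return PRICES_PER_1K[model] * images / 1000


def saving_vs(model: str, baseline: str = "NB2 Lite") -> float:
    """Fraction saved by using `baseline` instead of `model`."""
    return 1 - PRICES_PER_1K[baseline] / PRICES_PER_1K[model]


if __name__ == "__main__":
    for name in ("NB1", "NB2", "NB Pro"):
        print(f"NB2 Lite vs {name}: {saving_vs(name):.1%} cheaper")
    print(f"1,000,000 images on NB2 Lite: ${batch_cost('NB2 Lite', 1_000_000):.2f}")
```

Under those assumptions NB2 Lite comes out roughly 13% cheaper than NB1, 49% cheaper than NB2 and 75% cheaper than NB Pro. The ratios hold regardless of the unit, provided all four prices use the same one.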

Editor's pickTechnology
ETEnterpriseai.com· Yesterday

Businesses opting for cheaper AI models as token bills climb for premium services

Cheaper AI Models: Companies are reassessing their AI spending with a growing preference for lower-cost AI models. Premium services are becoming prohibitive due to usage-based pricing and increased operational costs. This trend, driven by rising complexities in AI tasks, is changing the landscape ...

AI Productivity · 7 articles
Editor's pickPAYWALLTechnology
NYT· Yesterday

How Grindr’s C.E.O. Adopted A.I.: ‘I Just Imposed It’

George Arison, the gay dating app’s chief executive, is aiming for all code to be eventually written by artificial intelligence, making the company “leaner.”

Editor's pickPAYWALLManufacturing & Industrials
FT· Yesterday

AI speeds the march of China’s factory robots into new sectors

Artificial intelligence is enabling the spread of automation to traditional industries

Editor's pickHealthcare
Arxiv· Today

Can Physician Expertise Improve Machine Learning Identification of Delirium?

arXiv:2606.30651v1 Announce Type: new Abstract: Delirium is common in hospitalized patients and is often missed in routine care. We present a user-centered interactive machine learning (UC-iML) framework for delirium detection support that combines physician-guided feature refinement with interpretable modeling. Using 3,862 labeled admissions from six Toronto hospitals in the General Medicine Inpatient Initiative (GEMINI), we integrate administrative variables, laboratory results, medications, and a radiology-derived text indicator. Physicians guide feature refinement and model evaluation, and Shapley Additive exPlanations (SHAP) are used to summarize feature attribution. We evaluate standard supervised classifiers with temporally separated holdout testing and a later-phase validation cohort. Compared with automated and baseline variants, the proposed framework shows better overall discrimination and stronger temporal robustness, while the explanations highlight clinically meaningful signals. These results support UC-iML as a practical human-in-the-loop framework for clinically relevant delirium modeling.

Editor's pickEducation
Arxiv· Today

Qualified Educational Capacity Planning under Heterogeneous Student Support Needs: A Synthetic Benchmark and Decision-Support Framework

arXiv:2606.30650v1 Announce Type: new Abstract: Educational support services often face a qualified-capacity problem: staff time is scarce, qualifications decay, new support needs can appear before anyone is prepared for them, and training consumes the same hours needed by current students. We introduce a synthetic benchmark and decision-support framework for qualified educational capacity planning. The model is a stylized single-institution service system with heterogeneous support-demand categories, backlog-only dynamics, continuous preparation states with hard threshold qualification and decay, and capacity-consuming training. The benchmark includes seed-controlled scenarios for announced and surprise new support categories, staff absences, and demand surges; exact feasibility discipline; declared per-policy information sets; requalification and greenfield-qualification counters; access-dispersion metrics; replay checksums; and paired statistics. We compare service-only, reactive, static-insurance, water-filling, and rolling-horizon mixed-integer controllers, with an attribution chain separating service planning, qualification maintenance, and acquisition, plus a perfect-foresight reference. The central result is a regime map governed by whether a newly required qualification can be acquired within the controller's reaction reach. When it can, the closed-loop controller wins across the core and adversarial suites, with value concentrated in just-in-time qualification acquisition. When the training lag exceeds the horizon, lean static insurance wins structurally, and a reactive trainer that starts after onset can be worse than no training. Backlog perishability shifts this boundary without erasing either regime. EduCapacity Studio reproduces exported scenarios bit-for-bit. All evidence is stylized and synthetic; the framework makes no claims about real student outcomes, compliance, or individual placements.

Editor's pick
TechRadar· Yesterday

The AI job paradox and the missing link in productivity gains

AI boosts productivity, but workforce structures lag behind

Editor's pickTechnology
Arxiv· Today

Why Solve It Twice? Hierarchical Accumulation of Skills for Transfer-Efficient ML Engineering

arXiv:2606.30911v1 Announce Type: new Abstract: ML engineering agents waste compute rediscovering known techniques because every competition is a cold start. We present HASTE, a hierarchical multi-agent system that organizes cross-competition knowledge into three scope tiers (global, domain, and competition-specific), each coupled to a matching agent level. An orchestrator coordinates domain specialists and promotes learning between tiers via LLM-driven abstraction. A controlled ablation provides evidence for scoped loading: holding a 159-skill inventory constant across 8 competitions, tiered loading achieves a 100% medal rate while flat loading reaches only 62.5%, the same medal rate as loading no skills, and consumes 2x the output tokens. On the full MLE-Bench Lite benchmark (22 Kaggle competitions), HASTE reaches a medal rate of 77.3% using Claude Sonnet 4.6 at 12h per competition. In a cold-start run, the system begins with no accumulated skills. In warm-start runs, it reloads skills learned from earlier competitions, using only global and domain-level skills for transfer across competitions. Warm starts use 52% fewer refinement iterations, and the fraction of proposed changes kept by the agent rises from 42% at low inventory to 85% once 50+ skills are available. These results suggest that better knowledge organization can partly substitute for model strength and compute budget in ML-engineering agents.

Editor's pickProfessional Services
Arxiv· Today

Translation Readiness Index: Measuring Patent-Paper Proximity from Scientific Publication Text

arXiv:2606.31102v1 Announce Type: new Abstract: Universities, funders, investors, and policy agencies often need to identify research with translational relevance before patents, licenses, startups, or industry collaborations are visible. This study introduces the Translation Readiness Index (TRI), a text-based measure evaluating a publication's semantic similarity to papers that appear in high-confidence patent-paper pairs. Using 20,610 publications from OpenAlex, including 9,431 publications from the Reliance on Science patent-paper pairs data and 11,179 matched comparison publications, we created paper-level 768-dimensional semantic embeddings from titles and abstracts with SPECTER2. After evaluating four machine learning classifiers, XGBoost achieved the highest ROC-AUC (0.77). We define TRI as the model-estimated probability that a publication belongs to the patent-paper-paired class. Linguistic analysis revealed that patent-paired publications more often use an invention-oriented framing, distinct from the observational language of the comparison group. External validation across University of Western Australia (UWA) publications and leading global universities demonstrated positive associations between high TRI scores and independent translational indicators. TRI provides a text-based method for identifying translation-ready research, though it should be interpreted as a measure of semantic proximity to patented science rather than a direct measure of realized commercialization.

AI Startups & Venture · 7 articles

Labor, Society & Culture

40 articles
AI & Employment · 20 articles
Editor's pickProfessional Services
Arxiv· Today

Human Capital, AI, and Labor Commoditization

arXiv:2606.21880v2 Announce Type: replace Abstract: Has generative AI changed how labor markets value human capital? We study this question using contract-level data from Upwork, a large online labor market. We represent worker profiles with high-dimensional text embeddings, allowing us to capture rich human capital information from unstructured profile text. We then compute the predictive importance of workers' human capital information and posted hourly rates for client demand, and incorporate these measures into a difference-in-differences design around the release of ChatGPT. We find that in more AI-exposed job categories, the importance of human capital declines and the importance of price rises, suggesting a commoditization effect of AI on labor. Two additional findings support commoditization as a mechanism: The demand premium enjoyed by workers with strong human capital declines in more AI-exposed categories, and demand reallocates toward lower-priced workers. Our results have implications for the design of online labor markets, workers' incentives to invest in human capital, and labor welfare.

Editor's pickPAYWALL
FT· Yesterday

Heavy corporate AI spenders add staff faster than peers

Study of 22,000 US companies challenges fears that generative AI will trigger broad job losses

Editor's pickProfessional Services
Microsoft News· Yesterday

Microsoft's Work Trend Index 2026: 33% of Indonesian Workers are at the Forefront of AI Adoption

Indonesia stands out as one of the markets with a high proportion of Frontier Professionals, or advanced AI users in Asia. This shows that more Indonesian workers are not only actively using AI but are also able to use it more strategically while continuing to prioritize ...

Editor's pick
Ethan Mollick· Yesterday

The Economic and Policy Implications of Rapid AI Capability Gains

Rapid advancements in AI are fundamentally altering workplace productivity and labor dynamics. These shifts are increasingly driving volatility in both corporate policy and broader market expectations.

Editor's pickFinancial Services
Substack· Yesterday

The Broken Ladder That We Are Now Burning

US banks are projecting roughly 200,000 back-office job losses over the next three to five years, with JPMorgan, Citigroup, Goldman Sachs, Bank of America, and Wells Fargo all now naming AI explicitly as a driver.

Editor's pickTechnology
Tech Times· Yesterday

AI Cuts 87,714 Jobs While Its Makers Fund $1 Billion Worker Retraining Push

AI job displacement reached 87,714 cuts through May 2026, the highest AI-attributed total Challenger, Gray & Christmas has ever recorded. Days later, Anthropic, Amazon, Microsoft, and OpenAI backed RAISE US, a $1 billion worker retraining fund. Over 35% of Claude users expect AI to handle most of

Editor's pick
Business Insider· Yesterday

Is AI Causing Layoffs? This Report Says It's Complicated.

New research has found that companies spending the most on AI aren't slashing jobs; they're actually hiring faster than their peers.

Editor's pickEducation
TUN AI· Yesterday

OpenAI Maps AI’s Impact on EU Jobs — What Students Should Know

OpenAI's Economic Research team has extended its AI Jobs Transition Framework to the European labor market, categorizing EU occupations by automation risk, growth potential and workflow reorganization. The country-level findings carry real strategic weight for students deciding where to build ...

Editor's pick
International Business Times· Yesterday

Many Analysts Expected AI To Replace Workers. New Company Data Tells A Different Story.

Businesses making the largest investments in artificial intelligence expanded their workforces faster than comparable companies after adopting the technology, according to new research.

Editor's pickProfessional Services
Cheung Kong Graduate School of Business· Yesterday

AI is Speeding Workforce Turnover. But Your Next Great Hire May Already be Working for You

Companies looking to hire externally to resolve AI-related workforce upheaval may end up paying a hidden cost. As well as the higher costs of recruitment, external hiring also undermines trust in an existing workforce already struggling to transition to AI. Businesses should instead consider ...

Editor's pick
NBC News· Yesterday

New data finds AI’s heaviest adopters are expanding, not shrinking, their workforces

A new research paper from financial operations company Ramp reveals that AI-embracing companies have increased hiring.

Editor's pick
Straight Arrow News· Yesterday

Is AI actually creating jobs? Research sheds light on the new tech’s impact

Heavy AI adopters are growing headcount fastest at entry level — even as Ford rebuilds the team AI alone couldn't replace.

Editor's pickTechnology
The Hindu BusinessLine· Yesterday

India software companies in flux: AI coding costs triple developer salaries

Indian corporates face skyrocketing AI coding costs, tripling developer salaries, raising concerns about budget overruns and cost management.

Editor's pick
Daily Brew· 2 days ago

The AI jobs debate just got messier

A look at the evolving and complex impact of AI on the labor market as startups and enterprises navigate new automation realities.

Editor's pick
Daily AI News June 30, 2026: When AI Becomes the Attack Surface· Yesterday

AI Adoption Is Overloading Your Middle Managers

Generative AI adoption is placing pressure on middle managers who must oversee implementation, validate AI-generated work, and maintain operational performance.

Editor's pickPAYWALL
FT· Yesterday

Is AI an exoskeleton for the mind?

Technology that helps people do things they couldn’t otherwise achieve can also lead to atrophy

Editor's pick
Inc· Yesterday

Everyone Thought AI Was Killing Entry-Level Jobs. They Were Wrong

A massive new study reveals the entry-level roles automated by AI aren't disappearing—they're getting a major promotion.

Editor's pick
Big Technology· Yesterday

Heavy AI Adoption Linked To More Hiring, Not Layoffs, New Data Shows

A new Ramp study shows that the companies spending the most on AI are actually hiring more and not cutting the workforce.

Editor's pickMedia & Entertainment
Daily Brew· Today

Dragon Age Co-Creator Warns AI Risks Stalling Game Developer Growth

David Gaider warns that generative AI in gaming could stifle junior developers' growth and lead to lower-quality products.

Editor's pick
R Street Institute· Yesterday

New Technology, Same Anxieties: Economic Freedom, Labor Markets and the AI Transition

Almost from the moment ChatGPT was launched in 2022, fears that artificial intelligence (AI) will produce a “job apocalypse” have abounded. Popular concerns regarding technological unemployment have been stoked by predictions from technology company executives that AI could eliminate half ...

AI Ethics & Safety · 10 articles
Editor's pickPAYWALLFinancial Services
FT· Yesterday

‘Kill switches’ could be needed for AI-powered trading, BoE official says

Technology could make markets more volatile through ‘herding behaviour’, Sarah Breeden tells ECB conference

Editor's pick
Arxiv· Today

Thinking Out Loud: Real-Time Deception Monitoring in Asymmetric LLM Negotiations

arXiv:2606.30649v1 Announce Type: new Abstract: As LLM-based agents are increasingly deployed to negotiate, delegate, or transact on a user's behalf, software pipelines need runtime mechanisms to verify that an agent's stated intentions match its actual behavior. We study whether a lightweight, real-time chain-of-thought (CoT) monitor can detect strategic deception during asymmetric negotiations, using a used-car sales scenario where a seller agent has private knowledge of an undisclosed defect and a buyer agent has only public market data. The monitor, implemented as a third agent, audits the seller's internal reasoning against its messages and alerts the buyer whenever concealment is detected, across multiple buyer-seller model pairings. Our experiments show that this monitor increases the buyer's walk-away rate, but reveal a persistent intelligence gap: lower-capability buyers often cannot translate an alert into an equitable counter-offer and still accept exploitative deals after being warned. Sellers also change their behavior when told they are monitored, though concealment is not eliminated. These results highlight both the promise and limits of lightweight real-time oversight, offering practical guidance for engineers building and validating monitoring infrastructure for agentic systems with conflicting stakeholder incentives.

Editor's pickPAYWALLTechnology
Washington Post· Yesterday

Are AI chatbots like ChatGPT politically biased? We tested them.

So, are chatbots politically biased? The Washington Post tested the AI models behind OpenAI’s ChatGPT, Google’s Gemini and others using political questions designed by researchers to gauge how chatbots respond to hot-button political issues.

Editor's pickTechnology
Daily Brew· Yesterday

Meta Contractors Posed as Teens to Prompt Rival Chatbots

An investigation reveals that Meta contractors were instructed to pose as teenagers to test rival AI chatbots for safety vulnerabilities.

Editor's pickTechnology
Top Daily Headlines: Sysadmin broke hardware worth more than he made in a month – and lied his way out of the mess· Yesterday

AI may be good at finding security vulnerabilities, but it can't beat human stupidity

Despite advancements in AI security tools, poor password habits remain a primary vulnerability that requires no sophisticated exploits to compromise.

Editor's pickTechnology
Guardian· Yesterday

‘There’s this deep mystery of what, actually, is this thing?’: the philosopher inside Google DeepMind AI

Since 2017, Iason Gabriel has worked at the tech giant, trying to anticipate – and think through – the impact of AI. But as commercial and geopolitical pressures escalate, can ethicists make any difference? In 2017, a 33-year-old political philosopher named Iason Gabriel was told by a friend that he ought to apply for a job at DeepMind, the London-based subsidiary of Google where much of its AI research was concentrated. The suggestion was not an obvious one. Gabriel was a cheerful but intense junior academic with a passion for Vipassana meditation and what his brother calls “enthusiastic” rock climbing. The eldest son of a Greek management professor and a British documentary maker, Gabriel split his time between teaching and international development work. At the University of Oxford, where he was a fellow at St John’s College, Gabriel taught courses on political theory and wrote papers on the moral contortions of “yuppie ethics” and the ethical blind spots of effective altruism. When he wasn’t there, he did crisis work for the United Nations Development Programme in Sudan and Lebanon.

Editor's pickTechnology
OfficeChai· Yesterday

Anthropic Talks About Regulation And AI Safety Far More Than OpenAI, Shows Data

Anthropic and OpenAI keep swapping places at the top of the AI leaderboards, but there is something that sets them apart in the...

Editor's pick
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

US youth health groups praise provisions of House KIDS Act

Fifteen US health organizations wrote to House leadership supporting the bipartisan KIDS Act as a vital step for child online safety.

Editor's pick
Daily Brew· Today

Global Scammers Exploit US Tech: Starlink and Others Under Scrutiny in AI-Assisted Fraud Probe

An AP/FRONTLINE investigation reveals global scammers using American tech and Starlink to conduct AI-assisted fraud from Myanmar.

Editor's pick
Arxiv· Today

Reframing AGI Confrontation with Off Earth Autonomy

arXiv:2606.30666v1 Announce Type: new Abstract: A common AI-safety narrative holds that sufficiently capable agents will predictably seek power, resist shutdown, and therefore tend toward confrontation with humans. We argue that this conclusion is often drawn in an implicitly Earth-centered strategic landscape. If a credible off-Earth autonomy pathway exists - i.e., a staged transition from Earth dependence to an autonomous machine industrial base - then confrontation is not the only route to reducing human control. Using Saklakov's decision-theoretic 'confrontation question' as an anchor, we provide a qualitative mapping from the autonomy pathway to key model terms showing that early cooperation can dominate confrontation as a path to autonomy, and that the autonomy pathway can reduce confrontation incentives by making Earth less strategically binding. We discuss how this incentive shift interacts with feedback-loop dynamics between human preemption and agent behavior, and outline implications for governance: under incentive-compatible early cooperation, a more stable, higher-observability regime can support iterative oversight and cooperative alignment.

AI Skills & Education

3 articles
Editor's pickTechnology
Azeem Azhar· Yesterday

Divergent Talent Acquisition Strategies in US and Chinese AI Labs

Chinese AI labs are hiring talent with significantly less experience than their US counterparts, suggesting different approaches to scaling human capital. This disparity highlights potential differences in research maturity and labor market competition between the two regions.

Editor's pickEducation
Arxiv· Today

Toward AI-Resilient Assessment in Computer Science Courses in an AI-Native World

arXiv:2606.30655v1 Announce Type: new Abstract: AI-native course assessments in senior computer science courses and related fields should grade students by AI-resilient skill: the ability to achieve outcomes beyond a strong AI baseline. Such assessments should allow students to use AI freely, while reducing the extent to which greater private AI budget or more intensive AI use, by itself, becomes a grading advantage. This paper proposes a minimal formal framework for this goal. The framework specifies a real task, an executable evaluator, a declared AI-native Pareto frontier, and a grading rule based on Pareto surplus. The central claim is simple: Pareto surplus provides a measurable, protocol-relative certificate that a submitted artifact achieves a tradeoff not already supplied by the declared AI baseline, and grading by this surplus is AI-resilient with respect to that baseline. Interpreting surplus as evidence of student skill requires the surrounding assessment protocol (for example, design reports, ablations, prompt traces, oral checks, or reproducibility explanations), but the grading certificate itself is behavioral and executable. The framework is then extended to practical complications, including self-improving AI loops, budget neutrality, server-mediated feedback, and prompt-based red teaming. As a concrete instantiation, we describe an AI-resilient approximate-membership assignment centered on Bloom filters for COMP 480/580 at Rice University, designed to test whether students can improve beyond AI-generated implementations.
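
To make the idea of a tradeoff "not already supplied by the declared AI baseline" concrete, here is a minimal sketch of a dominance check against a declared frontier. It is an illustration under stated assumptions, not the paper's grading rule: the two axes (memory in bits and false-positive rate, both minimized), the function names, and the sample numbers are all hypothetical, and the abstract does not define how surplus is actually scored.

```python
from typing import List, Tuple

# Hypothetical objective pair for a Bloom-filter task: (memory_bits, false_positive_rate).
# Both values are treated as "lower is better"; this is an assumption for illustration.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def beats_baseline(submission: Point, frontier: List[Point]) -> bool:
    """True if no declared baseline point dominates or equals the submission."""
    return not any(dominates(p, submission) or p == submission for p in frontier)


if __name__ == "__main__":
    # Made-up baseline frontier, not data from the paper.
    ai_frontier = [(1000.0, 0.05), (2000.0, 0.01)]
    print(beats_baseline((1500.0, 0.02), ai_frontier))  # a new tradeoff between the two points
    print(beats_baseline((2500.0, 0.02), ai_frontier))  # dominated by (2000.0, 0.01)
```

A real grading rule would also need to turn "non-dominated" into a numeric surplus; that step is left out here because the abstract does not specify it.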

Technology & Infrastructure

63 articles
AI Agents & Automation

15 articles
Editor's pickProfessional Services
Arxiv· Today

The Organizational Behavior of Agentic AI: Collective Intelligence in Human-Agent Workflows

arXiv:2606.30986v1 Announce Type: cross Abstract: Agentic artificial intelligence is increasingly deployed not as a single assistant but as a collective of planners, solvers, reviewers, memory managers, tool users, and orchestrators. These systems are entering organisational workflows under familiar labels such as teams, managers, committees, markets, and workflows. This article asks whether such agent collectives exhibit organisational behaviour in a sense that is analytically comparable to, yet distinct from, human organisational behaviour. I argue that agentic AI is a partial organisational analogue. It resembles a human organisation because it differentiates work, coordinates interdependence, performs recurrent routines, crosses boundaries, and produces collective outcomes. It differs because these patterns are not sustained by motivation, identity, trust, employment, socialisation, or moral accountability. They are sustained by context architecture: prompts, memory, traces, schemas, tools, validators, and permissions. The article develops contextual transaction cost as the central mechanism linking these similarities and differences. Computational theorising, synthetic task simulations, real LLM agent traces, and robustness analyses show that human-imitation forms often underperform when they add lossy handoffs, correlated deliberation, and verification burdens, whereas shared-state and adaptive forms perform better when they make context durable, inspectable, and task-contingent. The article contributes to organisation studies by theorising agentic AI as an emerging object of organising and by specifying the interface conditions under which human and agentic organisational behaviour can jointly support collective intelligence.

Editor's pickTechnology
Arxiv· Today

AgentBound: Verifiable Behavioral Governance for Autonomous AI Agents

arXiv:2606.30970v1 Announce Type: new Abstract: Autonomous AI agents increasingly perform consequential actions on behalf of human principals, including financial transactions, external communications, and enterprise workflows. Existing agent infrastructure relies on identity federation and delegated authorization to authenticate workloads and control resource access, but it cannot determine whether an authorized action should be executed under the current behavioral and operational context. We present AgentBound, a runtime governance framework that provides verifiable behavioral oversight for autonomous AI agents. AgentBound evaluates each proposed action using three independent authorities: delegated authorization, owner-signed behavioral constitutions, and site action contracts. Their judgments are conservatively composed through a formal decision model to determine whether an action should be permitted, reviewed, or denied before execution. To provide accountability, AgentBound generates cryptographically verifiable governance receipts that bind every action to the exact delegation, policy, and semantic artifacts governing the decision, enabling independent replay verification and policy provenance. The framework also introduces standing delegation for long-running agents, allowing periodic workloads to operate under continuously refreshed governance policies while preserving revocability and bounded authority. We present the formal foundation, system architecture, governance receipt protocol, and AgentBound-Bench, a benchmark framework for evaluating governance correctness, authority composition, and accountability. Rather than replacing model alignment, AgentBound complements it by providing a deterministic governance layer between authorization and execution, transforming governance from a process that must be trusted into one that can be independently verified.
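
One plausible reading of "conservatively composed" is that the strictest of the three judgments wins. The sketch below shows that reading only; it is an assumption rather than AgentBound's formal decision model, and the `Verdict` enum and `compose` function are hypothetical names, not part of any published API.

```python
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered from least to most restrictive so that max() selects the strictest.
    PERMIT = 0
    REVIEW = 1
    DENY = 2


def compose(authorization: Verdict, constitution: Verdict, site_contract: Verdict) -> Verdict:
    """Return the most restrictive of the three independent judgments (assumed rule)."""
    return max(authorization, constitution, site_contract)


if __name__ == "__main__":
    # An authorized action that the owner's constitution flags for review is not executed outright.
    print(compose(Verdict.PERMIT, Verdict.REVIEW, Verdict.PERMIT).name)
```

Under this rule an action proceeds only when all three authorities permit it, and any single denial blocks it.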

Editor's pickProfessional Services
Arxiv· Today

Investigating Multi-Agent Deliberation in Law

arXiv:2606.30906v1 Announce Type: new Abstract: Artificial Intelligence is increasingly applied to the field of law, and has the potential to increase access to justice. One particular movement that is gaining traction is that of agentic AI, wherein AI agents, based on Large Language Models (LLMs) can take autonomous actions. In particular, multi-agent approaches in the legal domain remain largely unexplored. In this paper, we investigate multi-agent deliberation methods for legal reasoning tasks using LLMs. We explore multi-agent deliberation (MAD) and introduce two novel multi-agent frameworks inspired by courtroom procedures and legal argumentation. Our experiments on both legal and non-legal benchmarks reveal that multi-agent frameworks achieve comparable overall performance to baseline large language models, but produce significantly distinct answers. Notably, these approaches can successfully solve cases that the baseline fails to address, and vice versa. We conduct a qualitative evaluation and highlight scenarios where multi-agent frameworks outperform monolithic approaches. For example, multi-agent approaches appear better suited for answering questions that require critical thinking from multiple perspectives. Our work positions multi-agent systems as a promising direction for AI in the legal domain, while demonstrating the potential of law-inspired multi-agent approaches for deliberation.

Editor's pickHealthcare
Arxiv· Today

Agentic AI Enhances Physician Trust in Clinical Decision Making

arXiv:2606.30658v1 Announce Type: new Abstract: Medical AI has shifted from reasoning to agentic AI, a new paradigm that autonomously invokes external tools during reasoning, rendering intermediate reasoning steps and tool outputs transparent to users. Although proven to outperform previous models, physician trust in agentic AI remains largely unexplored. To address this, three physicians evaluated 315 multimodal clinical cases quantifying both process-oriented cognitive trust and outcome-oriented behavioral reliance. Comparing agentic AI against non-agentic baselines, physicians exhibited significantly higher cognitive and behavioral trust for the agentic model (P < 0.001). Specifically, on treatment planning tasks, physicians trusted the agentic reasoning most, preferring it in 89.57% of cases. Furthermore, process-oriented cognitive trust is significantly associated with outcome-oriented behavioral reliance (P < 0.001). However, measurable over-reliance on incorrect agentic outputs still exists, highlighting the inherent limitations of decision-logic transparency alone and underscoring the continuous need for rigorous clinician oversight.

Editor's pickTechnology
Theregister· Yesterday

AI agents: Cause of database sprawl. And also the proposed solution

DB wrangling tech needs to meet demands of AI agents, Cockroach Labs CEO Spencer Kimball tells El Reg

Editor's pickTechnology
TechCrunch· Yesterday

Amazon launches new $1 billion FDE org, following OpenAI and Anthropic

Engineers on the new team will embed within companies to deploy purpose-built agents, focusing on fast deployments and customer self-sufficiency.

Editor's pickTechnology
Amazon· Yesterday

AWS invests $1 billion in forward deployed AI engineers

A new AWS Forward Deployed Engineering organization will embed thousands of experts with customers to co-develop and deploy agentic AI solutions in days.

Editor's pickFinancial Services
PrimaFelicitas· Today

Multi-Agent AI Systems for Enterprise Finance Automation

The real-time automated systems improve the overall efficiency of financial workflows instead of waiting for manual processing. ... Organisations must define relevant governance and follow compliance to ensure safe and secure use of AI. Agents are deployed to keep a check on regulations.

Editor's pickTechnology
Forbes· Yesterday

Council Post: What CTOs Should Know Before Letting AI Agents Touch Production Infrastructure

CTOs should deploy AI agents with operational discipline.

Editor's pickTechnology
Vendasta· Yesterday

AI Agent Infrastructure for SaaS in 2026: Ship AI in Weeks, Not Quarters

Understand the complexities of building AI agent infrastructure and how it impacts your SaaS offering in the competitive landscape.

Editor's pick
MIT Technology Review· Yesterday

The Download: AI “coworkers” and stratospheric internet

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. AI agents are not your “coworkers” Imagine coming in to work to learn that a new underling will report to you. The worker is not a person but an AI tool—one…

Editor's pickTechnology
Arxiv· Today

AgRefactor: Self-Evolving Agentic Workflow for HLS Compatibility and Performance

arXiv:2606.30949v1 Announce Type: new Abstract: High-Level Synthesis (HLS) provides a fast path from concepts to silicon, but converting real-world software into synthesizable HLS code remains challenging due to restrictive language support and the gap between software and hardware programming practices. Existing automated and LLM-based refactoring approaches partially address this problem, yet they often lack flexibility, struggle to scale, and incur high computational costs. We introduce AgRefactor, an LLM-based multi-agent workflow for refactoring software into HLS-compatible programs. AgRefactor incorporates a self-evolving memory system that accumulates and retrieves factual and strategic knowledge across tasks, improving robustness and efficiency on unseen programs. To reduce cost and enhance scalability, it integrates automated refactoring tools, enabling agents to balance LLM-driven rewrites with efficient tool-based transformations. On 9 out of 11 challenging real-world benchmarks, which are 5-10x longer than the most complex cases studied in prior work, AgRefactor outperforms or matches the state-of-the-art automated refactoring tool and a strong LLM-based baseline built on the same framework backbone. Further agentic performance optimization yields a 6.51x geometric mean speedup over the SoTA pragma tuning tool and a 1.20x speedup over optimized open-source designs with less than 20% extra resources. AgRefactor is fully-automated and open-sourced.

Editor's pick
MIT News· Yesterday

Q&A: What is agentic AI today, and what do we want it to be?

MIT Associate Professor Phillip Isola explains what agentic AI is, how these systems are used, what applications they are best suited for, and what the future may hold for this exploding technology.

Editor's pickManufacturing & Industrials
Daily Brew· Today

Apptronik Launches Robot Park in Austin, Targets 2027 for Humanoid Robot Commercial Deployment

Apptronik has unveiled Robot Park in Austin, partnering with Google DeepMind to advance humanoid robotics through real-world data collection.

Editor's pickManufacturing & Industrials
Daily Brew· Yesterday

Indian housewives are training next wave of humanoids through their chores

A look at how Indian housewives are contributing to the training of next-generation humanoid robots by performing daily chores.

AI Energy

4 articles
Editor's pick
Arxiv· Today

FAIR+S: A validation study of a framework for sustainable research data and software

arXiv:2606.30663v1 Announce Type: new Abstract: The FAIR principles (Findable, Accessible, Interoperable, Reusable) have transformed research data management, but they do not address the environmental impact of creating and using research software and data, such as energy consumption, carbon emissions, and life-cycle impacts that become central to computer science and engineering-related domains. To bridge this gap FAIR+Sustainability or FAIR+S, an extension of the FAIR framework that embeds environmental accountability as a core element, was introduced. Because FAIR principles already structure how digital research artefacts are described, shared, and reused, they offer an effective entry point for embedding sustainability considerations at scale. FAIR+S weaves carbon-footprint and energy-use considerations directly into FAIR-aligned metadata schemas, workflows and development specifications. In doing so, it enables research infrastructures to report, compare, and audit the environmental implications of data and software in a measurable, interoperable, and transparent manner. This creates a foundation for reproducible research that simultaneously advances open science goals and decarbonisation objectives. However, integrating environmental accountability into established research workflows raises questions of feasibility, relevance, and acceptance across stakeholders and disciplines. In this work we validated the framework through a cross-disciplinary expert survey. The evaluation confirms its importance and practical relevance, but also reveals current gaps in researchers' awareness of green software practices.

Editor's pickTechnology
Crypto Briefing· Yesterday

Google's AI expansion drives surge in emissions and power use

Google's 2025 Environmental Report shows a 51% emissions increase since 2019 and 27% jump in data center electricity use, driven by AI workloads like

AI Infrastructure & Compute

17 articles
Editor's pick
Arxiv· Today

Mapping the Artificial Intelligence Divide in Africa: Infrastructure, Accessibility and Capacity

arXiv:2606.30656v1 Announce Type: new Abstract: Artificial Intelligence (AI) has the potential to be transformative for development, but Africa is currently facing a fragmented and challenging "AI divide". This paper provides an empirical analysis of the current state of the AI landscape and how it compares with Africa's technological preparedness for the future. In our analysis, we approach the "AI Divide" from three angles: infrastructure, accessibility, and human capacity. First, we look at the physical constraints that prevent Africa from integrating digitally. We then evaluate the human-centred factors that limit the development of AI technology on the continent. Finally, we examine the human capacity to develop AI systems on the continent and provide three focused case studies. Our investigation shows that the physical infrastructure needed to build an AI economy on the continent is lagging, with only 38% internet penetration, poor broadband coverage and less than 1% of all data centres globally. Other constraints include high data costs relative to income, gender-based digital divides, and the need to build more representative NLP models that can understand Africa's native languages. However, there are positive trends towards the emergence of local initiatives and grassroots movements, such as startups and universities, contributing to AI development on the continent. Based on these findings, we provide concrete recommendations to policymakers to help develop a more comprehensive and equitable AI ecosystem on the African continent.

Editor's pickEnergy & Utilities
Fortune· Yesterday

Vinod Khosla: AI’s energy crisis has a fix — and it doesn’t need the grid

The founder of Khosla Ventures says we're looking in the wrong place as AI data centers wait in a seven-year interconnection queue.

Editor's pickEnergy & Utilities
Forbes· Yesterday

AI Data Centers Hit Energy Transition Wall as Grids Stall 2500 GW

The WEF Energy Transition Index 2026 shows readiness falling for the first time in a decade just as AI infrastructure demands trillions in firm power. Boards must now price queue position, co-located generation and sovereign capital access as first-order risks.

Editor's pickEnergy & Utilities
Bebeez· Yesterday

Shaking off the rust: Pennsylvania’s data center rise

Though some might disagree, Pennsylvania claims to be the birthplace of artificial intelligence. AI research at Carnegie Mellon University dates back to the mid-1960s and what some say was the foundation of the world’s first AI research hub. Despite the […]

Editor's pickPAYWALLTelecommunications
Bloomberg· Today

A Giant Cable Exposes Ireland’s AI Ambitions and Security Risks

A new Amazon transatlantic fiber-optic link is symbolic of the country’s tech economy, but also of its chronic lack of defense spending.

Editor's pickPAYWALLManufacturing & Industrials
Bloomberg· Today

Vertiv Opens Malaysia Plant to Meet AI Data Centers’ Power Needs

US data center equipment maker Vertiv Holdings Co. opened a factory in Malaysia, underscoring the rapid pace of AI infrastructure buildout in the Asia Pacific.

Editor's pickTechnology
Top Daily Headlines: Sysadmin broke hardware worth more than he made in a month – and lied his way out of the mess· Yesterday

Zuck saves Meta bucks by reusing memory from old servers with a custom CXL ASIC

Meta is using a custom CXL ASIC to reuse memory from older servers, resulting in a 25% reduction in machines needed for certain inference workloads.

Editor's pickTechnology
VentureBeat· Yesterday

AI agents need context everywhere they run, even where the cloud can't follow

The competitive edge in enterprise AI is shifting to context: which platform can give an agent the right memory, the right retrieval and the right data at the moment of decision. Couchbase on Tuesday announced its AI Data Plane, combining persistent agent memory, real-time context retrieval and an enterprise-managed MCP server in a single operational platform.

Couchbase's roots are in caching and high-transaction databases, an architecture the company argues makes it better suited for agent memory than vendors that came to the problem from search or analytics. The AI Data Plane runs identically across cloud, on-premises and disconnected edge environments, extending agent memory and local vector search to devices with no network connection.

"How do you make sure that the intelligence that you get out of these models are the ones that databases specialize in?" Gopi Duddi, CTO at Couchbase, told VentureBeat. "How can you get that value out of storage systems, which are still going to be databases?"

What the AI Data Plane delivers

The AI Data Plane packages three components designed to replace the fragmented stacks most enterprises are currently running.

Agent memory: A unified persistence layer for conversational context, structured operational data and vector embeddings. Couchbase says the guardrails are what distinguish it from standalone memory services: token constraints per session, time-to-live limits on stored memories and metering controls that cap compute consumption per agent session.

Enterprise MCP server: An enterprise-supported self-managed server for standardized model-context protocol integration, shipping as part of the platform rather than requiring a separate service.

Agent catalog: A function-level catalog of discoverable agent tooling built by Couchbase. Duddi distinguished it from metadata catalogs like Databricks Unity or AWS Glue, describing it, in his words, as closer to a glorified MCP that surfaces agent functions as callable tools within the platform.

Memory-first architecture takes agent context to the disconnected edge

The lineage of Couchbase and its core architectural foundation is what Duddi says gives it an edge when it comes to context. "We were a cache before we became a database," Duddi said. Writing to memory is 10x faster than writing to disk, Duddi said, a speed advantage he argues separates Couchbase from NoSQL databases that layer memory workloads on top of disk-based storage.

Couchbase isn't the only data technology that has its roots in a caching layer. Redis similarly is rooted in cache and also recently announced an agentic AI context layer. Duddi argued that Couchbase is different in that it maintains an ACID (Atomicity, Consistency, Isolation, and Durability) compliant database which matters for transactional workloads. Couchbase also has a long history across multiple deployment modalities.

That architecture extends to the edge through Couchbase Lite, the platform's on-device runtime. It runs SQL, full-text search and vector search locally without a network connection, using a proprietary sync mechanism to replicate bidirectionally back to cloud or between edge nodes when connectivity returns. The target environments are retail floor operations, field service, industrial deployments and regulated settings where agent data cannot leave the device.

Duddi cited hotel reservations as an early example: multiple agents serving customers concurrently, each pulling local context and running vector search on-device, with shared session memory synchronizing centrally. The practical benefit is token efficiency. Rather than every agent independently retrieving and processing the same data, the platform caches shared context so concurrent sessions draw on it without burning tokens repeatedly.

Agora's view from production

Agora, a platform that helps developers embed real-time voice, video and conversational AI into enterprise applications, has run Couchbase in production since February 2024. The initial use case was its Signaling product, managing channel setup and state synchronization for live calls. Expanding into conversational AI agents brought stricter requirements: memory-first architecture, full JSON support for storage and query, cross-datacenter replication for high availability and enterprise-grade vendor support.

"Couchbase was the best fit based on these criteria," Patrick Ferriter, SVP of Product at Agora, told VentureBeat. Agora is now extending that relationship to support context retrieval for conversational AI agents. "This will simplify the architecture and deliver enterprise grade RAG with predictable lower latency required for conversational AI use cases," Ferriter said.

For data professionals trying to figure out the best approach to context, there is no one answer. On platform selection, Ferriter was direct. "It depends on the preference and goals of the organization, including timing," Ferriter said. "If they want something enterprise grade and optimal for immediate production and scale vs. having to optimize and maintain an open-source solution with community support. We wanted the former and that is why we looked at an expanded partnership with Couchbase."

Competitive context: following the right trend

The context layer has become a crowded space in 2025. Oracle put a memory core in its database back in March providing a context layer. Redis added a context layer in May as did vector-native database vendor Pinecone.

"Couchbase is following this trend, not setting it, but it's the right one to follow," Devin Pratt, Research Director for AI, Automation, Data and Analytics at IDC, told VentureBeat. "Its real edge is reach, running the same platform from cloud to edge to mobile, which is how enterprises actually operate. The test now is to scale against bigger names."

For teams navigating the vendor landscape, Pratt's framing is direct. "Match the tool to the workload. Consolidate where it makes sense, use a specialized engine like a graph database where relationship-heavy reasoning earns it, and let governance drive the call rather than treating memory as plumbing," Pratt said.

Editor's pickTechnology
Theregister· Yesterday

What the OCI MSA didn't solve for AI scaling

PARTNER CONTENT: The OCI MSA settled the architecture for optical scale-up. How fast bandwidth scales is a manufacturing question, not an architectural one

Editor's pickTechnology
Business Standard· Yesterday

India's AI compute race: Will subsidised GPUs close the capability gap?

Subsidised GPUs are helping lower AI compute costs in India, but industry experts say closing the capability gap will require much more than hardware access

Editor's pickTechnology
Business Standard· Yesterday

India's AI race: Why building infrastructure matters more than chatbots

Industry experts say the next stage of AI development in India will depend on investments in compute infrastructure, data centres, energy, and connectivity

Editor's pickTechnology
DataCenterKnowledge· Yesterday

Stargate Update: AI’s Biggest Data Center Buildout Meets Reality

Stargate tests gigawatt-scale AI campuses that fuse compute, power, and capital, growing fast while facing risks in energy, financing, and governance.

Editor's pickFinancial Services
Forbes· Yesterday

Council Post: Why The 'Last Mile' Of AI In Finance Is An Infrastructure Problem, Not A Model Problem

Your AI system's ceiling is set by your data infrastructure quality. No model architecture improvement can break through that ceiling.

Editor's pickTechnology
TNGlobal· Yesterday

India's AI future depends on an infrastructure challenge few are talking about

The conversation around AI often focuses on what technology can achieve. The more important question may be whether the underlying infrastructure can keep pace with that vision.

Editor's pickTechnology
AIM Media House· Yesterday

AI Infrastructure Is No Longer a One-Company Race

Qualcomm, OpenAI, Cerebras, Micron and SK hynix all made major AI infrastructure moves within weeks of each other.

Editor's pickTechnology
Bebeez· Yesterday

Alto Infrastructure breaks ground on 70MW AI campus in Granada, Spain

Alto Infrastructure has broken ground on a new data center in Granada, Spain. Just a few days after the Andalusian Regional Government announced the start of construction on the region’s first next-generation data center, Alto Infrastructure – the company formerly known as Sierra DC – has unveiled details of SP01, a campus specifically designed for […]

Editor's pickPAYWALLTechnology
Bloomberg· Yesterday

Oaktree Capital-Backed ITG Raises $312.2 Million in US IPO

Digital infrastructure services firm ITG Inc. raised $312.2 million in a US initial public offering, pricing its shares below a marketed range.

AI Models & Capabilities

13 articles
Editor's pickTechnology
Arxiv· Today

The Consistency Dilemma in LLMs: Generator-Evaluator Agreement and Vulnerability to Mistakes

arXiv:2606.30653v1 Announce Type: new Abstract: Large language models are increasingly deployed in agentic pipelines that depend on the model evaluating its own outputs without external verification. The reliability of these pipelines depends on an implicit assumption: that the model applies relevant concepts the same way when it generates an output and later evaluates that output. We propose a new measure, generator-evaluator self-consistency, to test this assumption directly and apply it to 10 frontier models across 491 concepts. We find, first, that there is substantial variation in self-consistency. Second, we find that in a clinical setting with physician-validated mistakes (Proniakin et al., 2025), across models, those with higher self-consistency are linked to greater vulnerability to mistakes. Thus, even when models consistently apply concepts they may not be safe to deploy. This is evidence of a consistency dilemma in LLMs: self-consistency is operationally useful, but models that are more consistent are also more prone to mistakes.
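
As a rough illustration of what a generator-evaluator agreement score could look like, the sketch below computes the fraction of trials where a model's later judgment of its own output matches what it was asked to generate. This is a simplified assumption, not the paper's definition of self-consistency: the trial format, the function name, and the sample data are all hypothetical.

```python
from typing import Iterable, Tuple


def self_consistency(trials: Iterable[Tuple[bool, bool]]) -> float:
    """Fraction of trials where the evaluator pass agrees with the generator pass.

    Each trial is (intended, judged). `intended` records whether the model was asked
    to produce an instance of a concept; `judged` records whether the same model later
    says its own output is an instance. Both fields are hypothetical.
    """
    trials = list(trials)
    if not trials:
        raise ValueError("need at least one trial")
    agree = sum(1 for intended, judged in trials if intended == judged)
    return agree / len(trials)


if __name__ == "__main__":
    # Made-up trials for illustration only: three agreements out of four.
    sample = [(True, True), (True, False), (False, False), (True, True)]
    print(self_consistency(sample))
```

A high score here only says the model applies a concept the same way in both roles. As the abstract notes, that is separate from whether the model is correct.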

Editor's pickTechnology
Arxiv· Today

What Drives Interactive Improvement from Feedback?

arXiv:2606.30774v1 Announce Type: new Abstract: We study when natural-language feedback produces improvement beyond the gains obtainable from repeated attempts alone. In multi-turn language-agent settings, higher final accuracy can reflect useful feedback, but it can also arise from resampling, format correction, or additional test-time computation. To separate these effects, we introduce a controlled student-teacher protocol across Omni-MATH, Codeforces, BBEH Linguini, and ARC-AGI1, evaluating thirteen open-weight models in both student and teacher roles. We compare external feedback, self-feedback, and unguided self-refinement, while varying interaction history, task difficulty, and teacher access to privileged task information. Across settings, we find that multi-turn improvement is often not evidence of feedback use: self-generated feedback adds little beyond unguided self-refinement, whereas the strongest external teachers produce substantially larger feedback-specific gains, suggesting that useful feedback must provide guidance beyond generic retry. Dense student-teacher interaction matrices further show that interactive gains are driven more by the student's ability to use feedback than by the teacher's identity, although teacher choice remains important for a fixed student. These results suggest that feedback-based agents should be evaluated against repeated-attempt baselines, and that ability to act on feedback, not merely feedback availability, is a central bottleneck for interactive improvement. We release our controlled student-teacher evaluation framework at https://j-lojek.github.io/feedback-generation-is-a-bottleneck/.

Editor's pickTechnology
Theregister· Yesterday

Changing AI math could reduce the hardware burden, researchers show

SEMQ promises an abstraction layer for separating semantics from embeddings

Editor's pickTechnology
Daily Brew· Yesterday

DeepSeek open sources DSpark, a new framework to speed up LLM inference by up to 85%

DeepSeek has released DSpark, an open-source framework designed to significantly accelerate large language model inference.

Editor's pick
Arxiv· Today

When Regulation Has Memory: Hysteresis and Control Burden in Artificial Agency

arXiv:2606.30975v1 Announce Type: new Abstract: Adaptive agents are usually judged by what they do, but an agent can appear stable while the internal effort required to keep it stable is increasing. This hidden regulatory burden matters for artificial agents operating under noise, delay, or changing demands: two systems may reach similar internal states while one requires much more corrective control to get there. Here, we study whether that burden depends on history. Using a computational model of adaptive uncertainty regulation, we drive an artificial agent through a continuous change in its uncertainty target and then reverse the change without resetting the agent. This creates a simple test for carryover: does the controller respond only to the current target, or does the path by which the agent reached that target still matter? The simulations show a clear history-dependent effect. The adaptive gain required to regulate the agent forms a reproducible hysteresis loop, meaning that the same target can require different levels of control depending on whether the agent is moving toward or returning from a more demanding regime. The timing of regulation also matters. When stabilization is available before disturbance exposure, the agent generally requires less adaptive gain than when it can only recover after disturbance has already acted. The state-level coherence measure also shows path dependence, but the timing effect is much clearer in regulatory gain. The main difference is therefore not that anticipatory regulation produces a completely different state. Rather, it reaches comparable regulated behavior with lower modeled control demand. These results suggest that adaptive agents should be evaluated not only by whether they remain organized, but by how much regulation they must recruit to do so.

Editor's pickTechnology
Arxiv· Today

How Can AI Find My Model? A Model-Finding Experimental Study Considering Data Formats, Embeddings, and Retrieval Strategies

arXiv:2606.30846v1 Announce Type: new Abstract: Discovering simulation models for reuse remains a fundamental challenge in Modeling and Simulation (M&S). When many models coexist, identifying those that align with a given modeling intent remains difficult. Recent advances in Artificial Intelligence (AI), particularly retrieval-based approaches, offer a promising pathway to operate at this semantic layer. In this paper, we present an experimental study investigating the impact of data representation, transformer-based embedding models, and retrieval strategies on the discovery of simulation models using natural language queries. We evaluated performance across multiple query types using standard information retrieval metrics, including recall@5 and nDCG@5. Results show that data representation matters, open-source embedding models can achieve high performance, and reranking methods are important, especially as query complexity increases. This work provides a baseline for AI-driven model discovery and discusses its role in advancing toward AI-driven composability and interoperability.
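
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of recall@5 and nDCG@5 using their standard textbook definitions. The model identifiers and relevance grades are made up for illustration, and nothing here is taken from the paper's own evaluation code.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of all relevant items that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevance, k=5):
    """nDCG@k with graded relevance (dict: id -> gain) and a log2 discount."""
    def dcg(gains):
        # rank is 0-based, so the discount for the first result is log2(2) = 1
        return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))
    actual = dcg([relevance.get(doc_id, 0) for doc_id in ranked_ids[:k]])
    ideal = dcg(sorted(relevance.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0

# Hypothetical query result: six simulation models returned, three are relevant.
ranked = ["m3", "m7", "m1", "m9", "m4", "m2"]
relevance = {"m1": 2, "m2": 1, "m3": 1}

print(recall_at_k(ranked, set(relevance)))  # two of three relevant models in the top 5
print(ndcg_at_k(ranked, relevance))         # below 1.0 because the best match is ranked third
```

Recall@5 only asks whether relevant models made the cut, while nDCG@5 also rewards placing the most relevant ones first, which is why reranking can move nDCG even when recall is unchanged.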

Editor's pickEducation
Arxiv· Today

ELEVATE: Designing Human-Centered GenAI Virtual Tutors for Scalable and Inclusive Education

arXiv:2606.30662v1 Announce Type: new Abstract: The advent of Generative Artificial Intelligence (GenAI), and in particular Large Language Models (LLMs), is reshaping educational practice, while intensifying ethical debate about its adoption. To date, the dominant paradigm remains the cloud-based, text-only chatbot: a centralized service that offers limited pedagogical control, weak transparency over knowledge sources, and non-trivial risks for privacy and regulatory compliance. This model also presumes continuous connectivity and recurring API costs, creating structural barriers for many institutions and reinforcing existing digital divides. At the same time, educational interaction with LLMs can benefit from multimodal cues and embodied presence, requiring interfaces that move beyond text-only tutoring. In this work, we propose ELEVATE (Efficient LLM Education with Virtual Avatar Teaching Engine), a framework to develop efficient GenAI-driven avatar tutors governed by epistemic infrastructures. ELEVATE integrates LLM-driven dialogue with embodied 3D avatars for multimodal interaction and adopts a local-first execution model enabling deployment on consumer-grade hardware. The framework formalizes a three-stratum design that separates (i) a student-facing virtual avatar interaction layer, (ii) a local GenAI execution and multimodal synthesis core, and (iii) a teacher-facing governance layer. We implemented and evaluated a working prototype deployed in a real-world educational curriculum. The system runs on standard PCs and smartphones, and we provide system-level performance evidence to show responsive interaction under realistic hardware constraints. Finally, we discuss sociotechnical and pedagogical implications for responsible adoption, positioning ELEVATE as a scalable pathway for privacy-preserving and inclusive GenAI tutoring across heterogeneous school environments.

Editor's pickTechnology
Arxiv· Today

BayesBench: Evaluating LLM Belief Trajectories Under Multi-Turn Evidence Accumulation

arXiv:2606.30850v1 Announce Type: new Abstract: Large language models (LLMs) are typically deployed in multi-turn conversations, where each turn provides new evidence that should reduce epistemic uncertainty about their environment. Acting rationally then requires inferring the unobserved quantities that govern it and updating beliefs about them as evidence accumulates. Yet most evaluations only score the model's final-turn answer in a single-turn format, leaving this process unexamined. We ask how closely LLMs' belief updates match those of a rational Bayesian reasoner in multi-turn settings, and introduce BayesBench, a suite of simulation environments that probe this across three progressively complex tasks: (i) Bayesian estimation, where the model infers an unknown parameter from sequential evidence; (ii) Bayesian prediction, where the model turns inferred beliefs about a latent variable into outcome forecasts; and (iii) latent-framed Bayesian prediction, where observations are filtered through a user-persona framing, requiring joint inference over the latent state and the persona. Across seven LLMs (3B--70B), scaling improves latent inference and evidence accumulation, with updates occasionally matching the Bayesian posterior. However, these gains do not reliably carry over to downstream prediction, exposing a gap between inferring latent structure and using it to rationally update beliefs about the target outcome.
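
To make the "rational Bayesian reasoner" baseline concrete, here is a minimal sketch of sequential Bayesian estimation with a Beta-Bernoulli model: one binary observation arrives per turn and the belief about an unknown success rate is updated each time. The uniform prior and the observation sequence are assumptions chosen for illustration; this is not one of the BayesBench environments.

```python
from fractions import Fraction

def beta_bernoulli_update(alpha, beta, observation):
    """Return the posterior Beta(alpha, beta) parameters after one 0/1 observation."""
    if observation not in (0, 1):
        raise ValueError("observation must be 0 or 1")
    return alpha + observation, beta + (1 - observation)

def posterior_mean_trajectory(observations, alpha=1, beta=1):
    """Posterior mean of the unknown success rate after each turn.

    Starts from a uniform Beta(1, 1) prior unless told otherwise.
    """
    means = []
    for obs in observations:
        alpha, beta = beta_bernoulli_update(alpha, beta, obs)
        means.append(Fraction(alpha, alpha + beta))
    return means

# Four turns of evidence: success, success, failure, success.
trajectory = posterior_mean_trajectory([1, 1, 0, 1])
print([str(m) for m in trajectory])  # ['2/3', '3/4', '3/5', '2/3']
```

A trajectory like this is the kind of reference curve a model's turn-by-turn stated beliefs could be compared against, rather than scoring only the final answer.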

Editor's pickTechnology
Daily Brew· Today

Meituan unveils LongCat-2.0, China’s first trillion-parameter AI model built on domestic chips

Meituan introduces a massive AI model developed using domestic Chinese hardware.

Editor's pick
Arxiv· Today

HyPOLE: Hyperproperty-Guided Multi-Agent Reinforcement Learning under Partial Observation

arXiv:2606.30966v1 Announce Type: new Abstract: Formal specification is a powerful tool to guide the learning process and provides significant advantages over reward shaping: (1) mathematical rigor; (2) expressiveness to specify objectives and constraints; and (3) the ability to define tactics to achieve objectives. However, these benefits remain largely unexplored in the context of Multi-Agent Reinforcement Learning (MARL). This paper introduces HyPOLE, a novel framework for MARL under partial observability, where learning is guided by the expressive power of the so-called hyperproperties and, in particular, the temporal logic HyperLTL. We integrate Centralized Training for Decentralized Execution (CTDE) techniques with HyPOLE to synthesize decentralized policies, and our evaluation on the SMAC, MessySMAC, and WildFire benchmarks demonstrates clear advantages over baselines.

Editor's pickManufacturing & Industrials
Artificial Intelligence Newsletter | July 1, 2026· Today

Japan launches advanced AI model project for physical AI

Japan's METI has launched a five-year project to develop advanced multimodal AI models for physical AI, aiming to strengthen national competitiveness.

Editor's pickPAYWALLPharma & Biotech
Bloomberg· Today

Agilent CEO on Business Strategy, Innovation

Padraig McDonnell, President and CEO at Agilent, discusses the company's business strategy and the innovations he sees in the life sciences sector. He speaks with Minmin Low from the sidelines of the World Economic Forum in Dalian. (Source: Bloomberg)

Editor's pickTechnology
Bebeez· Yesterday

Verkko Robotics launch VOLTAIC, a cost and energy saving Spiking Neural Inference Model (Sponsored)

London-based Verkko Robotics, a DeepTech European AI research lab, has unveiled VOLTAIC, an AI system designed to keep learning over time, retain what it has learned, and operate using far less energy and computing power than today’s leading multimodal models. If its early performance claims hold true, the technology could challenge the economics of the […]

AI Security & Cybersecurity9 articles
Editor's pickFinancial Services
Economic Times BFSI· Yesterday

AI-Enabled Cyber Threats Top Cybersecurity Risk for India's Financial Sector: RBI Report

The Reserve Bank of India's latest report highlights AI-enabled cyber threats as the leading security concern for the financial sector in India over the next year, emphasizing the evolving challenges posed by sophisticated cyberattacks and third-party dependencies.

Editor's pickFinancial Services
The Economic Times· Yesterday

AI-enabled cyberattacks biggest near-term threat to financial system: RBI

AI-powered cyberattacks are India's top financial system risk, according to the RBI's latest report. Banks and NBFCs are most concerned about these sophisticated threats, though preparedness is uneven. Rising third-party tech dependence and geopolitical uncertainty also heighten vulnerabilities.

Editor's pickFinancial Services
Artificial Intelligence Newsletter | July 1, 2026· Yesterday

Regulators should check whether current rules are fit for AI, UK central banker says

Bank of England Deputy Governor Sarah Breeden stated that regulators must evaluate if existing frameworks can handle AI risks, noting that AI-enabled cyberattacks pose significant financial stability threats.

Editor's pickTechnology
Arxiv· Today

Neuro-Bayesian-Symbolic Residual Attention Shallow Network: Explainable Deep Learning for Cybersecurity Risk Assessment

arXiv:2606.30953v1 Announce Type: new Abstract: We introduce the Neuro-Bayesian-Symbolic Residual Attention Shallow Network (NBS-RASN), a hybrid neural architecture for explainable cybersecurity risk assessment in open-source ecosystems. Unlike deep models that trade interpretability for accuracy, our shallow network encodes domain knowledge, causal reasoning, and expert judgment as differentiable components. It uses 80 interpretable neurons across 12 layers, including a gatekeeper that enforces five epistemological axioms - precision, causality, falsifiability, transparency, and completeness - as hard constraints before propagation. Despite limited depth, the network exhibits deep-learning traits via residual attention and feedback loops, learning complex risk patterns without becoming a black box. It produces fully decomposable scores: a deterministic weighted component plus an expert adjustment, with each adjustment traceable to named amplifiers (blast radius, propagation speed, structural nature, default exposure, exploitation pattern, institutional criticality). We validate on 20 open-source projects covering all OWASP Top 10:2025 categories and language risk classes, achieving confidence scores of 0.79-0.97, and show that explainability is guaranteed by design, not by a training algorithm. This challenges the assumption that deep learning requires deep networks, proving that shallow networks with deep reasoning can outperform opaque models in high-stakes cybersecurity, where interpretability is essential.

Editor's pickTechnology
Daily Brew· Yesterday

The attack that hijacked Claude Code came through Sentry. Datadog, PagerDuty, and Jira have the same exposure.

A security vulnerability in Sentry allowed for the hijacking of Claude Code, with similar risks identified for other major developer tools.

Adoption, Deployment & Impact

38 articles
AI Adoption Barriers & Enablers8 articles
AI Applications14 articles
Editor's pickPAYWALL
FT· Yesterday

AI in Practice

How artificial intelligence is being put to use in, and often fundamentally changing, multiple sectors, including: budget-busting AI bills; transforming pharma; AI and trust in air traffic control; spreading factory automation; rewriting gaming rules; chatbots and mental health; astronomical opportunities; agentic travel agents

Editor's pickConsumer & Retail
Arxiv· Today

Beyond expert users: agents should help users construct preferences, not just elicit them

arXiv:2606.30863v1 Announce Type: new Abstract: Agents typically assume an expert user -- one with well-formed preferences about what they want -- and default to clarifying questions whenever the task is underspecified. We argue this assumption is unrealistic. Users often lack the domain knowledge to have completely specified preferences; if asked about their preference on some feature, the user may be unable to answer without the agent helping the user to learn some domain knowledge needed to form a preference for that feature, e.g., via examples or explanations. To formalize these principles, we draw on the Search-Experience-Credence framework from Information Economics to introduce CoPref, a model of how users construct preferences based on agent dialog actions. We then study these ideas concretely in agentic recommender systems, proposing CoShop, an interactive benchmark. In CoShop, an agent converses with and makes recommendations for a CoPref user. The agent's performance depends on whether it can help the user gain the knowledge needed to specify the task well. Evaluating five frontier models, we find that no agent exceeds 56% accuracy on CoShop despite five turns of interaction. Failures stem not from agents' ability to find items, but from how little the interaction expands what users know about what they want.

Editor's pickPAYWALLGovernment & Public Sector
Washington Post· Yesterday

CIA to accelerate its use of AI, other advanced technologies

The CIA has long used homegrown high-tech spy devices to aid its undercover officers around the globe. And, Ratcliffe said, CIA technology was integral to the January capture of Venezuela’s president, Nicolás Maduro, and the daring rescue of a U.S.

Editor's pickTechnology
Arxiv· Today

Contrastive Reflection for Iterative Prompt Optimization

arXiv:2606.30840v1 Announce Type: new Abstract: LLM agents are becoming central to information retrieval: they issue retrieval queries, synthesize answers, and increasingly serve as judges for IR evaluation. Improving the prompts that control these agents is an optimization problem, but in applied IR settings it often looks less like blind search and more like debugging. Engineers need to know which behavior failed, which nearby behavior still worked, what distinguishes the two, and whether a prompt edit improves held-out quality without introducing regressions. We present Contrastive Reflection, an iterative prompt-optimization framework for agentic IR workflows. The framework starts from a task-centric quality definition: QA agents expose retrieval or reasoning traces, and grading agents expose dimension-level scores and rationales. These structured traces are used to identify error-anchored behavioral slices, add nearby successful examples from the same region, and ask a Teacher LLM to propose a targeted prompt edit. Candidate edits are accepted only when validation performance improves, optionally subject to regression checks. We instantiate the framework with a tree-based slice selector, but the contribution is the contrastive reflection loop rather than the tree itself. On a public HotpotQA retrieval-augmented QA setup, one tree-selected contrastive repair improves held-out exact-match accuracy from 51.4% to 60.4%. Failure-only and random-evidence variants improve less and break more previously correct examples. A light instruction-only comparison places the method near modern prompt optimizers: MIPROv2 reaches 59.4% and GEPA 57.0%. The result is an interpretable optimization loop for IR agents, aimed at making prompt repair more inspectable and validation-driven.
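
The acceptance rule described in the abstract (keep an edit only when validation performance improves, optionally subject to regression checks) can be sketched as a small gate. This is an illustrative reading, not the authors' implementation: the function names, the `max_regressions` parameter, and the toy answers are all hypothetical.

```python
def exact_match_accuracy(predictions, gold):
    """Share of predictions that equal the gold answer exactly."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

def accept_edit(baseline_preds, candidate_preds, gold, max_regressions=None):
    """Accept a candidate prompt edit only if validation accuracy improves.

    max_regressions is a hypothetical knob: when set, the edit is also
    rejected if it breaks more than this many examples that the baseline
    prompt answered correctly.
    Returns (accepted, number_of_regressions).
    """
    improved = (exact_match_accuracy(candidate_preds, gold)
                > exact_match_accuracy(baseline_preds, gold))
    regressions = sum(
        b == g and c != g
        for b, c, g in zip(baseline_preds, candidate_preds, gold)
    )
    within_budget = max_regressions is None or regressions <= max_regressions
    return improved and within_budget, regressions

# Toy validation set: the edit fixes two answers but breaks one.
gold = ["Paris", "1969", "Ada Lovelace", "blue"]
baseline = ["Paris", "1970", "Alan Turing", "blue"]
candidate = ["Paris", "1969", "Ada Lovelace", "green"]

print(accept_edit(baseline, candidate, gold))                     # (True, 1)
print(accept_edit(baseline, candidate, gold, max_regressions=0))  # (False, 1)
```

Counting regressions separately from net accuracy is what lets a reviewer see an edit that gains overall while breaking previously correct examples, the failure mode the abstract attributes to the failure-only and random-evidence variants.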

Editor's pickMedia & Entertainment
VentureBeat· Yesterday

Google's Gemini Omni Flash hits the API, turning enterprise video production into a conversation

For most enterprises, a 90-second training video or a product explainer has never been an easy ask. It means a well-planned brief, an internal film crew or an outside vendor, a shoot, an edit, and a round of revisions. Change one line of on-screen text due to a legal review and the whole chain runs again. The cost and the long timelines are why so much internal video never gets made.

That equation is what Google is aiming to rewrite with Gemini Omni Flash, the first model in its new "Omni" family, now rolling out to developers and enterprise customers through an API after debuting to consumers at I/O 2026. Google frames the family's ambition as creating anything "from any input," starting with video. But the headline interaction isn't just a sharper text-to-video prompt. It's the ability to edit a finished clip through conversation.

When the model launched in May, VentureBeat's enterprise analysis flagged the catch: with no programmatic interface, Omni was a consumer and prosumer tool, not a production one. This API rollout changes that. It puts conversational editing in front of the marketing and learning-and-development teams that make the most videos in an organization.

The pitch: a five-tool pipeline collapses into a single conversation

Until now, many teams have been assembling AI videos the hard way, bolting together an LLM for a script, a text-to-image model, an image-to-video model, a separate lip-sync tool and a voice generator, each with its own contract, billing and data path. Omni's enterprise argument is unification: one model that takes text, images and video and returns a finished clip with synced audio.

That simplicity factor is the part decision-makers should weigh first. Collapsing several point tools into one model means fewer vendors and a single place to monitor output and enforce data-handling rules. For an organization that has avoided generative video because stitching the tools together wasn't worth the overhead, the equation shifts.

With conversational editing each instruction builds on the last, so a marketer can relight a product shot, reframe it, or change the wardrobe without regenerating from scratch and losing the parts that already worked. It is the difference between booking a reshoot and sending a note.

Multimodal references and a physics engine for brand assets

Omni accepts far more than a text prompt. Alongside the words describing what you want, you can feed it multiple reference images and existing video clips, and it carries those specifics into the result. Hand it a photograph of a particular object, ask the model to place that object into a scene, and it reproduces the real thing's coloring and rough shape instead of inventing a generic stand-in. While the match might not be pixel-perfect, it is close enough to be recognizable. That reference-driven control is what makes the feature commercially interesting: a product photo, a brand logo, or a specific location can be dropped in as an ingredient rather than described in a prompt and hoped for.

Two of Google's four highlighted strengths speak directly to enterprise work. The first is a world model, the system's grasp of how physical scenes behave. Add light rain and puddles to an existing shot and it renders reflections of the people and objects in the wet pavement, the sort of physical consistency that separates real footage from obvious AI video.

The second is text and logo insertion. Point it at a scene full of signage and you can have it rewrite those signs in another language, or for a brand of your choosing, and even drop in a company's logo. The results aren't flawless: in testing, sign tracking in complex scenes wasn't always perfect and some text slipped back to the original language between frames. For training videos that need on-screen labels, or ads that need a logo placed in-scene, it is a capability worth a close look, and a reminder that the output still needs a human review before it ships.

The interactions API and where the limits still bite

Under the hood, this runs on Google's new interactions API, a stateful interface built for multi-turn tasks rather than open-ended chat. Each turn carries the previous video and its references forward, which is what lets edits accumulate coherently. Developers can chain generations. They can produce a clip, edit the cat into a puma kitten, restyle a video into 8-bit retro and then into a watercolor look, and store each version to branch from later.

The constraints are real and worth budgeting around. Clips currently cap at 10 seconds, per the model's published model card. To make something longer, you generate chunks and edit them together. Uploaded footage can be edited too, as long as it runs 10 seconds or under and the user holds the rights to it. Google's own model card is candid that holding consistency across edits and rendering accurate text remain open problems.

Guardrails, watermarking and the line Google won't cross

For a CISO, the demos matter less than the provenance work shipping alongside the model. Every Omni clip carries Google's SynthID watermark, Google is extending C2PA Content Credentials across its generative tools, and it has launched an AI Content Detection API that flags AI-generated media, both Google's and other vendors'.

Google has also drawn a deliberate line. The model won't take a still photo of a person plus an audio clip and lip-sync them into speech, an explicit move to limit deepfakes. It will, however, take a recording of someone talking and translate it into another language, a useful path for localizing global training content. For regulated enterprises, those constraints and the baked-in provenance are features rather than friction.

The numbers: cheap, 720p-only, and (preliminarily) ranked first

The pricing landed alongside the API, and it is aggressive. Omni Flash costs $0.10 per second of generated 720p video, which puts a ten-second clip at roughly a dollar. That matches Veo 3.1 Fast at the same resolution, runs double Veo 3.1 Lite, and undercuts standard Veo 3.1 by three-quarters.

Per second (USD)   Gemini Omni Flash   Veo 3.1 Lite   Veo 3.1 Fast   Veo 3.1
720p               $0.10               $0.05          $0.10          $0.40
1080p              n/a                 $0.08          $0.12          $0.40
4K                 n/a                 n/a            $0.30          $0.60

The table also exposes the catch, though. Omni Flash only generates 720p. There is no 1080p or 4K option, while the Veo tiers scale up to 4K. For internal training and most social video, 720p is fine. For premium brand work meant for a large screen, it is a real ceiling, and the reason Veo 3.1 still has a job.

Clips run 3 to 10 seconds at 720p native, in landscape (16:9) or portrait (9:16). As reference inputs the model accepts up to seven images and up to three video clips of three seconds or less. It does not take audio as an input yet, though it generates audio alongside the video it produces. Output is standard MP4, and every clip ships with SynthID watermarking and C2PA credentials baked in.

On quality, the early signal is strong. In LMArena's Text-to-Video Arena, a leaderboard where people vote on head-to-head outputs from competing models, Omni Flash sat at number one with a score of 1527.

What it means for budgets, and what's still missing

With real pricing in hand, the iteration story gets concrete. Every conversational edit is a fresh generation you pay for, so an edit-heavy session still adds up, roughly a dollar for each ten-second pass at 720p. What the stateful model changes isn't the cost of an edit, it's the number of wasted ones: because context carries across turns, those generations go toward refining a take that mostly works instead of restarting from a blank prompt and hoping the next attempt lands.

Omni isn't alone in this field. Veo 3.1 remains Google's production-grade option when you need higher resolution, and rivals from Bytedance, Alibaba and OpenAI are all chasing the same budgets. What Omni adds is the editing capability itself: the ability to treat a video as a living document instead of a one-shot render.
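
The budgeting arithmetic in the article can be checked with a few lines. The sketch below uses only the 720p per-second prices quoted in the piece and assumes, as the article states, that every conversational edit is billed as a fresh full-length generation. The function name and the eight-pass session are illustrative, and prices are held in integer cents to avoid floating-point rounding.

```python
# 720p prices per generated second, in US cents, as quoted in the article.
PRICE_CENTS_PER_SECOND_720P = {
    "Gemini Omni Flash": 10,
    "Veo 3.1 Lite": 5,
    "Veo 3.1 Fast": 10,
    "Veo 3.1": 40,
}

def session_cost_cents(model, clip_seconds, passes=1):
    """Cost of generating a clip and re-generating it on every edit pass.

    Assumes each pass is billed for the full clip length (hypothetical
    simplification of the pricing described in the article).
    """
    if clip_seconds <= 0 or passes <= 0:
        raise ValueError("clip_seconds and passes must be positive")
    return PRICE_CENTS_PER_SECOND_720P[model] * clip_seconds * passes

one_clip = session_cost_cents("Gemini Omni Flash", 10)          # 100 cents
edit_heavy = session_cost_cents("Gemini Omni Flash", 10, 8)     # 800 cents
veo_standard = session_cost_cents("Veo 3.1", 10)                # 400 cents

print(f"${one_clip / 100:.2f} per ten-second clip")             # $1.00
print(f"${edit_heavy / 100:.2f} for an eight-pass session")     # $8.00
print(f"{100 - 100 * one_clip // veo_standard}% below standard Veo 3.1")  # 75%
```

The last line reproduces the "undercuts by three-quarters" comparison, and the eight-pass figure shows how quickly an edit-heavy session accumulates even at the lowest tier that matches Omni's price.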

Editor's pickHealthcare
Arxiv· Today

AI for Quality Assurance in the Operating Room

arXiv:2606.30657v1 Announce Type: new Abstract: Surgical outcomes depend not only on patient factors and postoperative care but are also strongly influenced by the quality of the operation itself. Yet, for much of modern surgery, intraoperative quality has been assessed indirectly through outcomes and operative reports. The increase in minimally invasive procedures inherently guided by endoscopic video, together with advances in artificial intelligence, creates an unprecedented opportunity to systematically observe, measure, and improve surgical care. This chapter introduces AI-enabled Surgical Quality Assurance as a framework for using surgical data to support continuous assessment and improvement in the operating room. We first review existing approaches to surgical safety, from system-level interventions to procedure-specific standards. We then describe how AI can transform intraoperative video into clinically meaningful information, including recognition of anatomy, instruments, workflow, surgical actions, quality criteria, adverse events, and critical moments. Finally, we outline the major challenges that must be addressed before these systems can deliver routine clinical value, including representative data collection, robust validation, workflow integration, regulation, liability, privacy, and equitable access. Rather than replacing surgical judgment, AI for quality assurance should be understood as a set of tools for augmenting the surgical team, scaling expert review, and helping surgery evolve toward a learning system in which intraoperative care is continuously observed, assessed, and improved.

Editor's pickPAYWALLConsumer & Retail
Bloomberg· Today

An English Furniture Maker Faces AI Era of Bots Buying Sofas

With origins in the English countryside, The Cotswold Company is known for upscale furniture that evokes its bucolic backstory. Now it’s bracing for the era of artificial intelligence.

Editor's pick
Arxiv· Today

Improving Survey Participation in Low-Literacy Populations Through Value-Sensitive Conversational AI

arXiv:2606.30660v1 Announce Type: new Abstract: Collecting reliable social data from low-literacy populations remains a persistent challenge, particularly when surveys involve sensitive topics and marginalized communities. Traditional paper-based and web-based survey modalities often suffer from high attrition and incomplete responses due to literacy barriers, social pressure, and interactional discomfort. In this paper, we present findings from an initial field evaluation comparing multiple survey modalities: paper-based interviews, digital web-based surveys, conversational AI (convAI) surveys, and convAI enhanced with layered value-sensitive design, all conducted with low-literacy women across India. Using data from 315 participants, we show that convAI significantly improves survey completion rates relative to traditional modalities, with the highest completion and lowest drop-off observed when value-sensitive and culturally aligned conversational design elements are fully integrated. These results demonstrate the importance of human-centered and value-sensitive interaction design in enabling inclusive, ethical, and scalable data collection, motivating more 'AI for social good' applications.

Editor's pickHealthcare
Arxiv· Today

Explainable Artificial Intelligence For The Detection and Characterisation of Stage B Heart Failure

arXiv:2606.30665v1 Announce Type: new Abstract: Stage B heart failure is characterized by asymptomatic structural or functional cardiac abnormalities. Identifying individuals at this stage is clinically important, as early detection may enable targeted interventions to prevent progression to symptomatic disease. Explainable artificial intelligence (XAI) may support early detection, transparent risk stratification, and selection of clinically actionable interventions. This review examines the use of XAI in detecting and characterizing stage B heart failure. A literature search of Web of Science, Scopus, and PubMed was conducted on 27 March 2026. Studies were included if they applied AI with XAI techniques to stage B heart failure. After screening, 20 studies were included. Data on modalities, outcomes, demographic reporting, and XAI methods were extracted and synthesized. SHAP was the most commonly used method, followed by LIME, saliency maps, and Grad-CAM; however, XAI adoption was inconsistent, with some studies relying on limited or ad hoc interpretability approaches. Notably, none compared explanations across sex or ethnic subgroups, despite evidence of subgroup differences in disease burden. Evaluation of XAI outputs was often insufficient: some studies did not assess explanations, while others relied only on literature-based comparisons, introducing potential bias. These limitations suggest explainability was not systematically validated or leveraged to support robust and fair clinical inference. XAI shows promise for improving transparency in stage B heart failure identification, but current implementations remain limited. Key gaps include limited consideration of sex and ethnicity, absence of subgroup-specific analyses, inconsistent evaluation, and lack of external validation, all of which constrain generalisability and clinical adoption.

Editor's pickProfessional Services
Theregister· Yesterday

Where there's a will, AI still has work to do

Probate lawyer finds generated document looked the part but missed many of the questions that matter

Editor's pickConsumer & Retail
Daily AI News June 30, 2026: When AI Becomes the Attack Surface· Yesterday

How We Use Langchain to Power Lumi, Our E-Commerce Copilot

This article details how Lumi uses LangChain and multiple AI agents to support over 180,000 e-commerce merchants through an AI-powered copilot architecture.

Editor's pick
CISO Platform· Yesterday

AI Adoption at Scale: Security, Governance, and Business Growth

Artificial intelligence has rapidly evolved from an experimental technology into a strategic business necessity. Organizations across nearly every industry are integrating AI into customer service, marketing, operations, software development, analytics, and decision-making.

Editor's pickFinancial Services
Arxiv· Today

A Three-Phase Foundation Model for Tax-Aware Personalized Portfolio Management

arXiv:2606.30997v1 Announce Type: new Abstract: We present a three-phase deep reinforcement learning system for personalized portfolio management that addresses three limitations shared by all prior financial RL work: 1) ticker lock-in, 2) monolithic objectives, and 3) static user models. Phase 1 pretrains a ticker-identity-free cross-asset encoder via self-supervised learning on a multi-asset corpus, augmented by a frozen parallel branch using Chronos, a T5-based time series foundation model, fused via a learned gating mechanism. To our knowledge, this is the first application of a time series foundation model to portfolio management RL. The encoder generalizes to any publicly traded asset via a 50-dimensional observable metadata vector that requires no retraining for new tickers. Phase 2 fine-tunes a MoE (Mixture of Experts) portfolio actor-critic with PPO under an objective-conditioned reward that simultaneously serves six distinct investment goals sampled per episode: short-term alpha, short-term gain, long-term gain, capital preservation, tax-loss harvesting, and long-term-gains-only. A MoE architecture assigns each objective to a specialized expert head (momentum, growth, defensive, tax-aware), and a learned intent router blends experts based on the active objective and current market regime, which eliminates cross-objective gradient conflict. Phase 3 adds a lightweight personalization layer further adapted at inference time to each individual via a 76-parameter LoRA module fine-tuned on real brokerage transaction history, inferring investment objectives from revealed trading behavior rather than questionnaires. A natural language intent parser converts free-form goals directly into structured investment objective parameters.

Editor's pickPharma & Biotech
Siliconrepublic· Yesterday

Google Cloud Marketplace to offer LQMs from SandboxAQ

SandboxAQ claims its LQMs can offer 'critical advances' in sectors such as life sciences, financial services and navigation.

AI Measurement & Evaluation2 articles
Editor's pickProfessional Services
Arxiv· Today

Measuring Judgment Quality in Natural-Language Explanations: Evidence from Forecasting Tournaments

arXiv:2606.30987v1 Announce Type: cross Abstract: Decision-makers routinely rely on expert judgments accompanied by written explanations, yet explanation quality is difficult to measure at scale. Forecasting tournaments offer a natural testing ground: probabilistic judgments are paired with natural-language rationales and scored against realized outcomes. We introduce Explanation Quality Markers (EQMs), a set of sixty theory-guided reasoning patterns scored by large language models (LLMs). In a pre-registered analysis of over 55,000 forecast-rationale pairs from a multiyear forecasting tournament, EQMs predict accuracy at both the forecast and forecaster levels, consistently outperforming pre-LLM text-analysis methods. More than 90% of statistically significant pattern-level EQM-accuracy correlations match our directional hypotheses. The signal is asymmetric: EQMs identify likely underperformers more reliably than they distinguish the very best forecasters. Benchmarked against traditional indicators of forecasting skill, EQMs are the strongest predictor at the forecast level and competitive at the forecaster level, though weaker than prior accuracy. Human ratings of rationale quality are less consistently correlated with accuracy and place disproportionate weight on rationale length. Results transfer to an independent forecasting study. EQMs provide a scalable, interpretable method for extracting judgment-relevant information from written explanations.

Editor's pickTechnology
Arxiv· Today

RoPoLL: Robust Panel of LLM Judges

arXiv:2606.30931v1 Announce Type: new Abstract: The LLM Jury, a Panel of LLM Evaluators (PoLL) reporting consensus scores, has become a practical alternative to single-judge LLM evaluation, yet its statistical behavior remains poorly understood. We formalize the LLM Jury under the Huber contamination model and show that PoLL incurs unbounded bias under any positive contamination, regardless of jury size, whenever a single judge fails in a biased, LLM-typical way (mode collapse, sycophancy, safety refusal). Framing jury consensus as classical robust mean estimation, we propose RoPoLL (Robust Panel of LLM-as-Judge), which preserves the PoLL panel but replaces the aggregation function with a robust mean estimator, instantiated with the geometric median (GM): tuning-free, with the optimal finite-sample breakdown point 1/2. A finite-sample error bound and a matching information-theoretic minimax lower bound agree on the parametric rate sigma*sqrt(d/N) and differ on the breakdown floor by a factor of sqrt(d), a statistical-computational gap that polynomial-time RoPoLL pays relative to the intractable Tukey halfspace median. Across 13 open-weight judges (4B-675B), three reward-model benchmarks, and four corruption regimes at rates up to 50%, RoPoLL dominates PoLL on every biased corruption type: by about 19% on cross-dimensional attacks at matched compute, and by orders of magnitude on heavy-tailed Byzantine adversaries. A 3-judge RoPoLL committee at 38B beats Mistral-Large-3 (675B) by 1.31x on HelpSteer-2 under 30% bimodal-random corruption, an 18x parameter advantage at better accuracy; a Noisy-GT control confirms the premium is paid against biased contamination, not benign imprecision.
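
To make the abstract's contrast between mean aggregation and the geometric median concrete, here is a minimal sketch using Weiszfeld's iteration, a standard way to approximate the geometric median. It is not the authors' code: the judge scores, the two scoring dimensions, and the single corrupted judge are invented for illustration.

```python
import math

def mean_vector(points):
    """Coordinate-wise mean, i.e. plain consensus averaging."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the geometric median with Weiszfeld's iteration."""
    est = mean_vector(points)
    for _ in range(iters):
        num = [0.0] * len(est)
        denom = 0.0
        for p in points:
            # Weight each point by inverse distance to the current estimate;
            # clamp the distance to avoid division by zero.
            w = 1.0 / max(math.dist(p, est), eps)
            denom += w
            for i, x in enumerate(p):
                num[i] += w * x
        new_est = [x / denom for x in num]
        if math.dist(new_est, est) < eps:
            return new_est
        est = new_est
    return est

# Hypothetical panel: four judges agree near (4, 4) on two scoring
# dimensions, one judge fails in a biased way and reports (40, 40).
scores = [[4.0, 4.1], [3.9, 4.0], [4.1, 3.9], [4.0, 4.0], [40.0, 40.0]]
avg = mean_vector(scores)
gm = geometric_median(scores)
```

In this toy panel the mean is dragged far from the honest cluster by the single bad judge, while the geometric median stays close to it, which is the qualitative behaviour the abstract attributes to robust aggregation.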

AI Organisational Change5 articles
AI Productivity Evidence5 articles
Editor's pickFinancial Services
VentureBeat· Yesterday

Morgan Stanley cut its riskiest reconciliation job in half — by making its agents less autonomous

Most enterprise AI deployments so far have focused on coding assistants and customer service bots. Morgan Stanley has instead deployed agents in one of banking's most accuracy-critical, deadline-driven workflows, profit and loss (P&L) reconciliation, and cut the work in half. The counterintuitive part: it got there by making the system less autonomous, not more. Humans stay tightly in the loop, and their decisions are iteratively turned into repeatable rules the system can apply on its own.

“It's much more like a co-worker than a copilot,” Morgan Stanley Managing Director Todd Johnson said at a recent VB AI Impact event. The internal production agentic system, known as FIXR, goes beyond simple, straightforward "gen AI 1.0" tasks. “We think that's where the opportunity is to really unlock more complex work in the organization.”

FIXR behind the scenes

Every trading day, Morgan Stanley’s trade desks handle the important work around transactions such as cash equities or debt investments. At the end of each of those days, controllers must reconcile P&L across the finance giant’s Finance, Risk, Operations, and Trade Capture systems. All that data must come together, and, perhaps not surprisingly, hundreds of thousands of attributes frequently fail to match. Typically, this means controllers must manually investigate each mismatch (or “break”), make decisions on adjustments, then ideally sign off before the number goes to the desk, all while working to a hard morning deadline.

Previously, this could take up to six hours for a single book. Now, FIXR performs the task in two to three hours, Johnson said. Across the roughly 100 controllers who do this work, that adds up to about 1,500 hours saved per week.

After nightly P&L calculations complete, the system automatically analyzes “breaks” and proposes resolutions based on learned rules. Several agents work together: One interprets past guidance to develop start-of-day resolutions. One learns from controller behavior and documents the rules they apply. One converts repeated patterns into durable, automated logic.

Over time, the system can auto-clear certain breaks it has encountered before, suggest solutions for others that may be less familiar, ask for help when it’s unsure, and flag items for human investigation. When items are repeatedly resolved through the same method, it can create firm rules.

Critically, humans don’t leave the loop, but stay fully in it, he said. They review, approve or correct every recommendation, then feed those decisions back to improve the next run. The agent learns daily from controllers what it gets right and wrong and codifies that knowledge as it iterates.

“You still preserve that element of human accountability even as you start to automate,” Johnson said. “Over time you'll see more and more of those items resolved in an automatic way.” He emphasized that autonomy requires a great deal of trust; enterprises will not see efficiency gains if everyone's checking everything an agent does.

The human–agent feedback loop was critical to addressing the challenge of controlled, measured, and repeatable automation. “We recognized that all that intelligence that's sitting in the mind of a controller is gonna be difficult to get all into an agent on day one,” Johnson said.

Focus on process-first, extensibility

It was critical to establish processes first, before getting any AI involved, Johnson said. His team ran a “very thorough” process intelligence assessment that mapped and mined workflows to identify where automation would be the most advantageous: Was the answer agents, traditional automation, or simple re-engineering of an inefficient step?

“If we can fix that first before we add agents to the problem, then we really will be transforming the opportunity,” he said.

The P&L sign-off process was full of manual steps suitable for automation, and agents taking over some of these time-consuming tasks are freeing up controllers for “more value-added analysis” and “deeper risk consideration” work, he said. Extensibility, though, was just as important as time savings. Johnson’s team chose this particular P&L reconciliation use case because hundreds of controllers were doing this work globally across the business (in the Americas, Europe, Asia).

So start with a use case, prove it, extend it, “and then ultimately the transformation will be as we roll this out more and more across the organization,” Johnson said.

Deterministic by design

Johnson said the team also deliberately limited how much of the workflow depended on the model's judgment at all. "If you have an opportunity to make things very prescribed and repeatable, that's cheaper in terms of token consumption, it's more repeatable in terms of controls, and have the LLM do the stuff where you don't need that kind of deterministic workflow," he said. As the system sees more controller feedback on a given break type, Morgan Stanley converts that pattern into a fixed rule instead of leaving it to the model.

Humans still own the behavior

An interesting (and perhaps fundamental) question being raised at the dawn of the agentic era is: Are agents code or digital employees? Johnson argues that “they're probably a little bit of both,” and, as such, require nuance when it comes to governance and oversight. Technical teams must still be responsible for maintaining protections and guardrails like firewalls or encryption, for instance.

But there’s a new dynamic around the “performance element”: Humans using agents are responsible for them because the agents are aiding their business work. For instance, if a senior controller is working with a junior controller, they don’t just relinquish responsibility because someone is helping them out, Johnson noted.

“One of our strong principles in our AI governance generally is that there always has to be human accountability, even if there's a degree of automation,” he said. But there typically isn’t “one single one person,” and the process is ultimately continuous. To this point, Johnson joked that one “depressing” thing about agentic AI is that it’s going to require ongoing training because models are ever-changing. “You're never gonna be able to say: ‘We've done all the evaluation and testing that we need to do. Let's just let it go.’ You're going to have to have a constant view as it evolves over time.”

Morgan Stanley is aiming at real enterprise pain points

Morgan Stanley's experience mirrors patterns VentureBeat has uncovered across enterprise AI deployments. In VentureBeat's recent VB Pulse survey, nearly three-quarters of respondents reported seeing little to no ROI from custom model fine-tuning, describing a "sandbox graveyard" of AI projects that proved too costly to maintain. This suggests that Morgan Stanley's process-first, buy-and-blend approach may be more sustainable than chasing bespoke models. The survey had 87 respondents and findings should be considered directional. Governance emerged as another common challenge: 38% of respondents cited the lack of a single accountable owner as their biggest barrier to production AI, while only two of the 87 enterprises surveyed had active monitoring and alerting in place to detect model failures.
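
The feedback loop the article describes (auto-clear known breaks, suggest for familiar ones, ask when unsure, and turn repeated resolutions into fixed rules) can be sketched in a few lines. This is a toy illustration under stated assumptions, not FIXR's design: the class name `BreakResolver`, the break and resolution labels, and the promotion threshold of three consistent decisions are all hypothetical, since the article gives no such details.

```python
from collections import Counter, defaultdict

PROMOTE_AFTER = 3  # hypothetical threshold; the article gives no number

class BreakResolver:
    def __init__(self):
        self.history = defaultdict(Counter)  # break type -> resolution counts
        self.rules = {}                      # break type -> fixed resolution

    def propose(self, break_type):
        """Return (action, resolution) for a break of the given type."""
        if break_type in self.rules:
            return ("auto_clear", self.rules[break_type])
        seen = self.history[break_type]
        if seen:
            # Suggest the resolution controllers have chosen most often.
            return ("suggest", seen.most_common(1)[0][0])
        return ("ask_human", None)

    def record_decision(self, break_type, resolution):
        """Store a controller's decision; promote stable patterns to rules."""
        seen = self.history[break_type]
        seen[resolution] += 1
        # Promote only when controllers have always chosen the same fix.
        if len(seen) == 1 and seen[resolution] >= PROMOTE_AFTER:
            self.rules[break_type] = resolution

resolver = BreakResolver()
first = resolver.propose("fx_rate_mismatch")
resolver.record_decision("fx_rate_mismatch", "apply_eod_rate")
second = resolver.propose("fx_rate_mismatch")
for _ in range(2):
    resolver.record_decision("fx_rate_mismatch", "apply_eod_rate")
third = resolver.propose("fx_rate_mismatch")
```

Once a pattern is promoted, the lookup is a plain dictionary read with no model call, which matches the article's point that prescribed, repeatable steps are cheaper and easier to control than leaving them to the LLM.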

Editor's pickManufacturing & Industrials
Guardian· Yesterday

Return of the ‘greybeards’: AI backfired – so Ford had to rehire humans

The US motor company found that the hundreds of AI cameras being used for design and manufacturing checks were prone to pitfalls. Name: “Greybeards.” Age: There’s a clue in the name.

Editor's pickManufacturing & Industrials
Daily Brew· Yesterday

Ford rehires more than 300 veteran human engineers

Ford has rehired over 300 veteran engineers after finding that AI systems failed to meet the required quality and expertise standards.

Geopolitics, Policy & Governance

49 articles
AI Geopolitics3 articles
Editor's pickTechnology
CNBC· Yesterday

White House AI crackdown opens door for Chinese model makers to close gap

The Trump administration's crackdown on Anthropic's leading artificial intelligence models is looking like a gift to China.

Editor's pickTechnology
VentureBeat· Yesterday

Meituan open sources LongCat-2.0, the 1.6T, near-frontier agentic coding model that's been leading OpenRouter — trained entirely on Chinese chips

A few hours ago, Chinese delivery app company Meituan officially unveiled LongCat-2.0 on GitHub, Hugging Face, and its native platform, unmasking the model as the computational engine behind "Owl Alpha," the anonymous stealth model that has spent the last two months commanding global developer charts on OpenRouter. Developed to fundamentally disrupt closed-source enterprise dominance in autonomous software engineering, the 1.6-trillion-parameter Mixture-of-Experts (MoE) system brings a native 1-million-token context window to the public under a highly permissive, enterprise-grade, commercially viable MIT license. However, the company has yet to post the full weights: both the GitHub and Hugging Face pages say the model weights are "coming soon."

Commercial access introduces an aggressive pricing tier in which all context-cache hits are processed free of charge, alongside time-limited "Token Pack" flash sales. There is also a typical pay-as-you-go API, with non-cache-hit usage priced at a standard $0.75/$2.95 per million input/output tokens. A limited-time promotional discount cuts these rates to $0.30 per million tokens for uncached input and $1.20 per million tokens for output, both at the cheaper end of top-performing models globally.
| Model | Input ($/1M) | Output ($/1M) | Total ($/1M) | Source |
| --- | --- | --- | --- | --- |
| MiMo-V2.5 Flash | $0.10 | $0.30 | $0.40 | Xiaomi |
| deepseek-v4-flash | $0.14 | $0.28 | $0.42 | DeepSeek |
| deepseek-v4-pro | $0.435 | $0.87 | $1.305 | DeepSeek |
| MiniMax-M3 | $0.30 | $1.20 | $1.50 | MiniMax |
| LongCat-2.0 (limited-time promo) | $0.30 | $1.20 | $1.50 | LongCat |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | $1.75 | Google |
| Qwen3.7-Plus | $0.40 | $1.60 | $2.00 | Alibaba Cloud |
| MiMo-V2.5 | $0.40 | $2.00 | $2.40 | Xiaomi |
| LongCat-2.0 (standard) | $0.75 | $2.95 | $3.70 | LongCat |
| Grok 4.3 (low context) | $1.25 | $2.50 | $3.75 | xAI |
| MiMo-V2.5 Pro (≤256K) | $1.00 | $3.00 | $4.00 | Xiaomi |
| Kimi-K2.6 | $0.95 | $4.00 | $4.95 | Moonshot AI |
| GLM-5.2 | $1.40 | $4.40 | $5.80 | Z.ai |
| GPT-5.6 Luna | $1.00 | $6.00 | $7.00 | OpenAI |
| Grok 4.3 (high context) | $2.50 | $5.00 | $7.50 | xAI |
| MiMo-V2.5 Pro (>256K) | $2.00 | $6.00 | $8.00 | Xiaomi |
| Qwen3.7-Max | $2.50 | $7.50 | $10.00 | Alibaba Cloud |
| Gemini 3.5 Flash | $1.50 | $9.00 | $10.50 | Google |
| Gemini 3.1 Pro Preview (≤200K) | $2.00 | $12.00 | $14.00 | Google |
| GPT-5.6 Terra | $2.50 | $15.00 | $17.50 | OpenAI |
| GPT-5.4 | $2.50 | $15.00 | $17.50 | OpenAI |
| Gemini 3.1 Pro Preview (>200K) | $4.00 | $18.00 | $22.00 | Google |
| Claude Opus 4.8 | $5.00 | $25.00 | $30.00 | Anthropic |
| GPT-5.5 | $5.00 | $30.00 | $35.00 | OpenAI |
| GPT-5.5 Instant (chat-latest) | $5.00 | $30.00 | $35.00 | OpenAI |
| Sakana Fugu Ultra (≤272K) | $5.00 | $30.00 | $35.00 | Sakana AI |
| GPT-5.6 Sol | $5.00 | $30.00 | $35.00 | OpenAI |
| Claude Fable 5 / Claude Mythos 5 | $10.00 | $50.00 | $60.00 | Anthropic |

What makes the release a definitive inflection point for global tech infrastructure is its operational independence: the massive model was trained entirely on a cluster of over 50,000 domestic Chinese Application-Specific Integrated Circuits (ASICs), proving that near-frontier AI models can be scaled successfully without relying on the typical U.S. Nvidia GPUs that have, to date, powered much of the global generative AI frontier model training effort. This successful deployment of alternative silicon signals a profound structural shift.
If Chinese conglomerates can consistently iterate trillion-parameter architectures using homegrown ASICs rather than general-purpose GPUs, it would seem to threaten Nvidia's dominance in this sector.

Crucially, this technological pivot arrives precisely as Washington pressures top-tier American labs to restrict access to their latest models. Following a U.S. governmental request, OpenAI was forced to limit access to its new GPT-5.6 models, while Anthropic was previously also ordered by the U.S. to restrict access to its latest Claude Fable 5 / Mythos 5 models, which it took entirely offline in response. At the same time, a growing chorus of technologists, activists, and industry experts warn that these defensive regulatory maneuvers have inadvertently backfired. By locking down Western closed-source models and driving up API costs, the U.S. government has left a wide operational window for global developers seeking affordable, high-performance alternatives like those found in Chinese open source models such as Meituan LongCat-2.0.

The raw operational metrics backed up the developer enthusiasm: during its unbranded residency on OpenRouter, Owl Alpha accounted for approximately 10.1 trillion monthly tokens (averaging 559 billion tokens per day), representing a 242% month-over-month explosion in volume that propelled it into the platform's global top three. By the time Meituan stepped forward to claim the architecture, the model had already secured the top ranking on the Hermes Agent workspace, second place on Claude Code deployments, and third place across international OpenClaw environments.

Technology: Engineering the 1M-Token Sparse Context

At the core of LongCat-2.0 lies an aggressive optimization of Mixture-of-Experts (MoE) sparsity, scaling total parameters to 1.6 trillion while limiting active computation to an average of 48 billion parameters per token. Depending on the structural complexity of a query, the model’s dynamic activation ranges from 33 billion to 56 billion parameters. This design implements a "Zero-Compute Experts" framework, ensuring that routine execution elements pass through lighter subnetworks, entirely eliminating the idle computational overhead that typically penalizes ultra-dense models.

To sustain a functional 1-million-token context window without incurring catastrophic hardware bottlenecks, Meituan introduced LongCat Sparse Attention (LSA). Designed as an evolutionary iteration of DeepSeek Sparse Attention, LSA resolves the quadratic scoring costs and memory fragmentation that typically plague fine-grained sparse mechanisms through three distinct, orthogonal vectors:

Streaming-aware Indexing (SI): This system restructures the token selection pipeline by blending hardware-aligned contiguous data reads with dynamic random selection. By converting fragmented memory access into highly predictable, sequential blocks, the system achieves coalesced High Bandwidth Memory (HBM) utilization and elevated effective bandwidth.

Cross-Layer Indexing (CLI): Leveraging the empirical reality that attention saliency remains highly stable across adjacent hidden layers, CLI amortizes calculation costs. A single indexing pass successfully guides multiple consecutive layers during inference, a capability reinforced by cross-layer distillation throughout the training phase.

Hierarchical Indexing (HI): This approach applies a coarse-to-fine, two-stage scoring layout. The indexer performs a rapid, approximate block-level recall to filter candidates, before running fine-grained token selection exclusively on the remaining population.

Furthermore, Meituan integrated an N-gram Embedding module inherited from its lighter model lines. By expanding parameter allocation in sparse dimensions completely orthogonal to the MoE expert layout, the architecture appends 135 billion parameters to a 5-gram token combination framework. This expands the core embedding space by roughly 100-fold, allowing the model to capture dense local token relationships and accelerate large-batch inference operations by reducing memory Input/Output (I/O) bottlenecks.

Product: Post-Training, MOPD Framework and Benchmark Performance

While generalist large language models prioritize fluid, conversational interfaces, LongCat-2.0 focuses explicitly on multi-step engineering tasks, tool integration, and automated repository manipulation: agentic tasks, in other words. In standardized assessments, LongCat-2.0 registers 59.5 on SWE-bench Pro, surpassing GPT-5.5's score of 58.6. The model further establishes its agentic specialization by marking a 70.8 on Terminal-Bench 2.1, a 77.3 on SWE-bench Multilingual, and a 73.2 on the general corporate workflow simulator FORTE.

This precise operational behavior is achieved through a structural post-training layer called Multi-Teacher Optimization via Mixture of Specialized Experts (MOPD). Rather than blending raw human feedback into a singular reward function, the MOPD architecture segregates post-training optimization into three independent, highly focused expert clusters. The Agent Experts are fine-tuned strictly for structural execution, specializing in precise tool invocation, multi-turn API parameter parsing, and self-correcting loop mechanisms to avoid execution stagnation. The Reasoning Experts are optimized in isolation to advance multi-hop logic, complex chain-of-thought engineering, mathematics, and high-level STEM problem-solving. The Interaction Experts focus entirely on human alignment, instruction-following nuances, factual grounding to suppress hallucinations, and maintaining rigid safety guardrails without diminishing the model's overall utility.

By segregating these vectors during post-training, LongCat-2.0 prevents functional degradation. A dynamic gate-routing mechanism then fuses these specialized behaviors at runtime, allowing the final model to coordinate deep reasoning, stable tool execution, and safe user interaction simultaneously.

While LongCat-2.0 generally trails premium frontier systems like Claude Opus 4.8 across broad general-agent benchmarks such as FORTE and BrowseComp, it punches above its weight in software engineering. What makes this open-weight architecture special is its hyper-focus on autonomous development; it manages to narrowly exceed OpenAI's proprietary GPT-5.5 on the rigorous software engineering benchmark SWE-bench Pro (scoring 59.5 against 58.6), showing it is highly competitive for complex coding tasks despite a leaner computational footprint.

Commercial Framework: Pay-As-You-Go vs. Flash-Sale Token Packs

Meituan's deployment strategy introduces a specialized commercial model that splits network access between conventional real-time API billing and structured "Token Packs". For traditional enterprise integration, standard top-up accounts are available, deducting funds in real time based directly on token input and generation metrics. However, to accommodate the unpredictable compute bursts characteristic of autonomous development agents, Meituan launched a structured Token Pack framework. Purchased as fixed, one-time volumetric allocations valid for a strict 30-day window, these packages stack directly on top of an organization's existing baseline API account. To manage network load across its ASIC clusters, Meituan releases these high-volume packages via limited flash sales four times daily, precisely at 10:00, 16:00, 21:00, and 23:00 Beijing Time on a first-come, first-served basis.

The economic standout of this framework is the zero-charge processing of context cache hits. In massive agentic environments where a coding assistant must repeatedly read, reference, and modify the same multi-million-token code repository over an extended session, standard architectures penalize developers by charging full pricing for repeated input context. Under Meituan's infrastructure, only cache-miss inputs and final token generations consume the package quota. This architecture completely alters the operational cost economics of large-scale agent software development, enabling deep iterative context exploration without compounding costs.

Licensing: Open-Source Structural Freedom

By registering the LongCat-2.0 repository under the open-source MIT License, Meituan positions the architecture with maximum legal flexibility for enterprise integration. In contrast to copyleft licenses like the GNU General Public License (GPL), which requires anyone distributing derivative works to release them under the same license, the MIT license permits near-unrestricted freedom. For corporate engineering teams, this legal standard ensures that LongCat-2.0 can be deeply modified, compiled, and hard-coded directly into closed-source commercial applications, proprietary dev tools, and internal automation backends. Corporations can fork the repository, optimize the internal LSA mechanisms for private databases, and sell the resulting software stack to end users without any obligation to disclose their proprietary intellectual property or structural enhancements.

Meituan's Evolution: From Delivery Super App to AI Powerhouse

Founded in March 2010 by serial entrepreneur Wang Xing, Meituan initially launched as a Groupon-style daily deals website before rapidly evolving into one of China’s dominant “super apps”. Following a massive 2015 merger with Dianping, the Beijing-based tech giant solidified a dominant market share over the country's urban delivery corridors, bridging local consumer reviews, instant retail, hotel bookings, and food delivery. Operating as a publicly traded powerhouse on the Hong Kong Stock Exchange, Meituan claims over 770 million annual transacting users and supports a network of more than 14.5 million merchants.

However, faced with intense domestic market competition and severe margin compression, the company aggressively pivoted its strategy beyond logistics. Meituan publicly committed to investing "billions" into artificial intelligence and domestic chip capabilities to revitalize its technology-driven offerings. This strategic shift into the global AI race began materializing in late 2025 with the release of LongCat-Flash, a 560-billion-parameter Mixture-of-Experts foundation model, followed quickly by the advanced reasoning model LongCat-Flash-Thinking. By open-sourcing these frontier-class models under enterprise-friendly licenses, Meituan signaled its ambition to become a foundational player in global AI infrastructure rather than remaining strictly a regional e-commerce and delivery giant.

Enterprise Implications: Autonomous Operational Workflows

For modern enterprises, the release of LongCat-2.0 unlocks clear operational strategies across software engineering, system operations, and long-form data interpretation. The combination of an open-weight, MIT-licensed model with an expansive 1-million-token context window means organizations can bypass the data privacy concerns and recurring overhead associated with proprietary third-party APIs.

In large-scale enterprise development environments, teams can leverage the model's specialized Agent Experts to orchestrate autonomous codebase migrations. Instead of dedicating hundreds of developer hours to manually rewriting legacy application frameworks, engineers can pass an entire enterprise repository along with modern SDK documentation directly into the 1-million-token context window. LongCat-2.0 can map the dependencies, execute the repository-level structural updates, compile the new codebase, and catch compilation and execution bugs autonomously within local sandbox environments before generating a final pull request.

The model's architectural separation via the MOPD gate-routing mechanism yields significant advantages for strict enterprise compliance. By routing specific operational queries through isolated expert clusters, a financial institution or healthcare firm can deploy deep logic and mathematical reasoning passes without risking factual hallucination or violating strict safety bounds. The Interaction Experts function as an implicit guardrail layer, suppressing errors and enforcing instruction-following protocols without degrading the raw processing power of the internal Reasoning Experts. Combined with the zero-cost caching model, enterprises can maintain hyper-focused autonomous software networks that can repeatedly inspect corporate data pools, continuously maintaining and optimizing internal infrastructure at a fraction of standard operational costs.
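
The effect of free cache hits on a long agent session can be checked with simple arithmetic. The sketch below uses only the per-million-token prices quoted in the article ($0.75/$2.95 standard, $0.30/$1.20 promotional); the session token counts are invented, the `cached_price` parameter exists only to model a hypothetical provider that bills cached input at the full input rate, and `session_cost` is an illustrative helper, not part of any Meituan API.

```python
# (input, output) USD per million tokens, as quoted in the article.
PRICES_PER_MILLION = {
    "standard": (0.75, 2.95),
    "promo": (0.30, 1.20),
}

def session_cost(cached_in, uncached_in, out_tokens,
                 tier="standard", cached_price=0.0):
    """Cost in USD; cached_price is the per-million rate for cache hits."""
    in_price, out_price = PRICES_PER_MILLION[tier]
    total = (cached_in * cached_price
             + uncached_in * in_price
             + out_tokens * out_price)
    return total / 1_000_000

# Hypothetical agent turn: 900k tokens of repository context served
# from cache, 100k new input tokens, 50k generated tokens.
free_cache = session_cost(900_000, 100_000, 50_000)
billed_cache = session_cost(900_000, 100_000, 50_000, cached_price=0.75)
promo = session_cost(900_000, 100_000, 50_000, tier="promo")
```

Under these assumed token counts the turn costs roughly $0.22 with free cache hits versus roughly $0.90 if cached context were billed at the full input rate, and about $0.09 at the promotional prices. The gap widens with every additional pass over the same cached repository.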

AI National Strategy15 articles
Editor's pickPAYWALLTechnology
Bloomberg· Yesterday

Lee’s $880 Billion AI Bet Ties Legacy to Korea’s Chip Boom

With South Korea already one of the biggest winners of the global AI boom, President Lee Jae Myung is now staking his legacy on a plan to transform the nation’s less developed southwest into a global chip hub.

Editor's pickPAYWALLDefense & National Security
FT· Yesterday

What would multilateral ‘AI arms control’ look like?

Given the competition, it’s debatable whether a US-China safety deal is even possible

Editor's pickTechnology
Daily Brew· Yesterday

At the heart of Anthropic’s clashes with the U.S. government

An analysis of why Anthropic has faced friction with the U.S. government regarding its operational strategies.

Editor's pickTechnology
Los Angeles Times· Yesterday

South Korea bets $518 billion on AI chipmaking boom

South Korean tech giants Samsung Electronics and SK Hynix plan to invest a combined $518 billion in a new computer chip manufacturing hub, capitalizing on surging artificial intelligence-driven demand.

Editor's pickTechnology
Times of India· Yesterday

South Korea unveils $880 billion megaplan to challenge US, China and Taiwan in AI chip dominance and global tech race

For years, South Korea's economy has relied on advanced manufacturing, with semiconductors sitting at the centre of its global success. The country already produces many of the memory chips used in smartphones, computers and increasingly the servers powering artificial intelligence.

Editor's pickGovernment & Public Sector
Digital Watch Observatory· Yesterday

South Korea unveils national AI infrastructure strategy

The government cited China's DeepSeek breakthrough as a catalyst for accelerating its national AI strategy.

Editor's pickTechnology
ICO Optics· Yesterday

South Korea Accelerates AI Megaprojects to Secure Global Leadership

South Korea is launching a series of ambitious, state-backed megaprojects designed to rapidly accelerate its domestic artificial intelligence sector.

Editor's pickTechnology
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

South Korea unveils three mega-projects to drive AI-era industrial growth

South Korea has unveiled a state-backed industrial strategy centered on semiconductors, physical AI, and AI data centers, with President Lee Jae Myung pledging direct oversight.

Editor's pickDefense & National Security
Daily Brew· Yesterday

Palantir and Nvidia Expand Sovereign AI Partnership for US Government

Palantir and Nvidia are deepening their collaboration to provide sovereign AI solutions for the US government.

Editor's pickTechnology
KeyBank· Yesterday

The Rise of Sovereign AI: The Next Evolution of HPC

As AI moves from experimentation into critical infrastructure, a new reality is emerging across Europe and the Middle East: control over data, compute and energy is becoming as important as performance itself.

Editor's pickGovernment & Public Sector
Amazon· Yesterday

AWS Summit DC 2026: Billions in AI and cloud investment for public sector

CIA Director Ratcliffe, Energy Secretary Wright, and UK CTO Patel joined the AWS Summit D.C. keynote for major classified cloud and AI announcements.

Editor's pickGovernment & Public Sector
Digital Watch Observatory· Yesterday

Malaysia adopts AI-centred digital strategy to 2030

A new digital strategy places Malaysia at the centre of regional AI development and public sector transformation.

Editor's pickGovernment & Public Sector
The Hindu· Yesterday

Reimagining sovereign AI for India’s strategic future

India’s objective should be clear: to remain deeply integrated with global AI ecosystems while steadily reducing the strategic vulnerabilities that such integration creates.

Editor's pickTechnology
Daily Brew· Yesterday

For Most of the World, Open-Source AI Is the Only Way Forward

An argument for why open-source AI is the essential path for global accessibility and development.

Editor's pickManufacturing & Industrials
Artificial Intelligence Newsletter | July 1, 2026· Yesterday

US robotics strategy needed to catch up to China, industry official says

A Boston Dynamics official told a congressional committee that a national robotics strategy is essential to secure emerging technology amid a competitive race against China.

AI Policy & Regulation · 30 articles
Editor's pickPAYWALLTechnology
Bloomberg· Yesterday

US Lifts Export Restrictions on Anthropic’s Fable 5 AI Model

The US government removed foreign access restrictions on Anthropic PBC’s Fable 5 artificial intelligence model, clearing it for wider distribution after the startup resolved the Trump administration’s safety concerns.

Editor's pickPAYWALLTechnology
FT· Yesterday

Apple’s Cook holds ‘constructive’ talks with EU tech chief over ‘Siri AI’

Discussions come as tech group seeks to avoid fines as it and the bloc have been deadlocked over launch of AI assistant

Editor's pickTechnology
Reuters· Yesterday

Reuters AI News | Latest Headlines and Developments | Reuters

OpenAI defers public rollout of GPT‑5.6 as US seeks early access to frontier AI models

Editor's pickDefense & National Security
Reuters· Yesterday

US lifts curbs on Anthropic's Fable, Mythos AI models

Editor's pickTechnology
Guardian· Today

Anthropic: US has lifted export controls on Fable and Mythos AI models after security risk fears

The AI company was forced earlier this month to suspend access to its Fable 5 and Mythos 5 models for all foreign nationals. Anthropic has said the US commerce department has lifted export controls on its Fable and Mythos AI models, less than three weeks after the company was ordered to suspend access to its most advanced AI models over national security risks. “We’ll begin restoring access tomorrow,” Anthropic said in a statement on X late on Tuesday.

Editor's pickPAYWALLTechnology
FT· Today

White House lifts ban on Anthropic models

US government move allows AI start-up to re-release Mythos and Fable models

Editor's pick
Arxiv· Today

Understanding Censorship in Large Language Models: From Mechanisms to Governance

arXiv:2606.30661v1. Large language models (LLMs) increasingly mediate access to information, yet their responses are shaped by training-data curation, alignment procedures, provider policies, inference-time moderation, and jurisdictional regulation. This paper examines LLM censorship as a sociotechnical phenomenon that extends beyond explicit refusals to include omissions, selective emphasis, framing effects, and geographically variable content controls. We synthesize recent empirical studies, provider case studies, regulatory developments, auditing methods, and mitigation strategies to clarify how censorship-like behavior emerges across the model lifecycle. The analysis highlights the tension between safety and openness, the difficulty of measuring soft censorship, the geopolitical divergence of moderation regimes, and the need for transparent, contestable, and independently auditable governance mechanisms. We argue that the central challenge is not whether LLMs should moderate content, but how moderation can be made proportionate, accountable, pluralistic, and resistant to opaque epistemic control.

Editor's pickPAYWALLTechnology
FT· Yesterday

Tech CEOs want AI rules — it may be too late

A deregulated environment sounds good until it results in ad hoc political intervention

Editor's pickGovernment & Public Sector
Arxiv· Today

AI Transparency: Governance Compliance or Stakeholder Requirements?

arXiv:2606.30652v1. Transparency is increasingly mandated for public-sector AI systems, with organisations required to publish statements describing their AI use and oversight arrangements. However, the existence of such artefacts is often treated as equivalent to transparency itself, despite limited evidence that they proportionately serve relevant stakeholder groups. From a requirements engineering perspective, this raises a validation concern: compliance with mandated disclosure criteria does not necessarily ensure transparency adequacy for stakeholders with different levels of risk exposure, decision control, and involvement. This paper presents an empirical analysis of 92 publicly available AI transparency statements published by Australian Government agencies under the national AI governance mandate. We introduce the stakeholder Risk–Control–Involvement–Need (RCIN) framework to differentiate stakeholder classes according to their structural position and transparency needs. Using a structured rubric derived from the mandated criteria, we evaluate how both the mandate and published statements are calibrated to each stakeholder class. The findings show that while structural compliance is widespread, transparency calibration is uneven. Criteria serving high-control stakeholders are consistently realised, whereas criteria most critical for high-risk, low-control stakeholders are fewer and less substantively addressed. We conceptualise this as the Transparency Illusion: a condition in which transparency appears satisfied through compliant artefacts yet remains unevenly calibrated to stakeholders bearing the greatest exposure to AI-supported decisions. The study frames transparency as a stakeholder-calibrated validation problem, demonstrating that artefact-level compliance does not constitute requirements validation in this context.

Editor's pickGovernment & Public Sector
Guardian· Yesterday

Silicon Valley donations make Colorado Democratic primary one of state’s most expensive

Manny Rutinel’s House campaign draws millions from big tech as pro- and anti-AI factions spar over regulation. Political groups funded by top tech executives have been homing in on one local race in Colorado, as the state’s Democratic primary vote gets under way on Tuesday. Democrat Manny Rutinel, who’s running in the competitive eighth congressional district for a seat in the House, has seen his campaign boosted with at least $2m in donations from committees led by the former Google CEO Eric Schmidt and crypto billionaire Chris Larsen. Rutinel is a progressive candidate running against former state representative and centrist Democrat Shannon Bird. During his campaign, he has focused on his Latino heritage and centered his platform around affordability and regulating Immigration and Customs Enforcement (ICE).

Editor's pickTechnology
WIRED· Yesterday

The Trump Administration Is Lifting Its Export Controls on Anthropic’s Mythos and Fable AI Models

The White House is easing restrictions on Anthropic’s most advanced AI models weeks after ordering the company to suspend access for foreign nationals.

Editor's pickGovernment & Public Sector
Silicon Canals· Yesterday

Europe’s new tech-sovereignty plan doesn’t ban U.S. cloud giants — it sets four levels of “sovereignty” for sensitive government data, and an American law makes the top levels nearly impossible for them to reach

On June 3 the European Commission proposed the Cloud and AI Development Act, which grades cloud providers on four sovereignty levels for public-sector data. The rules stop short of a ban, but the Commission’s own tech chief says U.S. firms will struggle to reach the highest tiers because ...

Editor's pickTechnology
The Business Times· Today

US lifts export controls on Anthropic's Fable, Mythos AI models

Washington has stepped up oversight of new model releases to identify potential threats posed by advanced AI models.

Editor's pickFinancial Services
TheStreet· Yesterday

Central bankers grow nervous about AI funding

A quiet shift in how AI’s biggest spenders pay for their data centers has the world’s top financial watchdog sounding an unusual alarm.

Editor's pickGovernment & Public Sector
NatLawReview· Yesterday

New AI Executive Order Signals Major Opportunities for Government Contractors

On June 2, 2026, President Trump signed a sweeping executive order titled “Promoting Advanced Artificial Intelligence Innovation and Security,” signaling the administration’s latest effort to strengthen America’s leadership in artificial intelligence while addressing emerging cybersecurity ...

Editor's pick
Just Security· Yesterday

The Handover of AI Standard-Setting

Providers, not regulators, are increasingly setting the standards against which their own AI systems are measured.

Editor's pickGovernment & Public Sector
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

China revises privacy standard with new AI, sensitive-data requirements

China has proposed a major overhaul of its flagship national personal-information protection standard, introducing new compliance requirements for AI developers and stricter rules on handling sensitive personal data.

Editor's pickTechnology
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

OpenAI-Ona deal sparks regulatory review in Australia

The Australian Competition & Consumer Commission is reviewing OpenAI's proposed acquisition of Gitpod, Inc., now operating as Ona.

Editor's pickTechnology
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

AI simplification package gets final approval from EU member states

EU member states approved legislation simplifying parts of the bloc's AI rulebook, delaying obligations for high-risk systems and banning non-consensual sexual deepfakes.

Editor's pickHealthcare
Daily Brew· Yesterday

Lawmakers want to ban AI companies from selling your health data

New legislative efforts are underway to prevent AI companies from monetizing sensitive health and location data.

Editor's pickDefense & National Security
Artificial Intelligence Newsletter | July 1, 2026· Yesterday

US regulator of foreign investment eyes agentic AI, data centers

Agentic AI programs and the data centers that power them are a top concern for national security experts at the Committee on Foreign Investment in the United States (CFIUS).

Editor's pick
Artificial Intelligence Newsletter | July 1, 2026· Today

Crowell & Moring and King’s College London Seventh EU Competition Law Conference

Editor's pickGovernment & Public Sector
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

US Supreme Court says president can remove independent agency officials for any reason

The Supreme Court ruled 6-3 that FTC members can be removed for any reason, overturning previous 'for cause' protections.

Editor's pickTechnology
Daily Brew· Today

European digital ID wallets rely on safety services of Google and Apple

Concerns regarding the reliance of European digital ID wallets on infrastructure provided by Google and Apple.

Editor's pickGovernment & Public Sector
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

US Sen. Warren slams Supreme Court decision on FTC commissioners

Senator Elizabeth Warren criticized a Supreme Court ruling allowing the president to remove FTC commissioners without cause, claiming it allows the president to seize control of independent agencies.

Editor's pickGovernment & Public Sector
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

US Senator Cantwell says Supreme Court 'gutting' independent agencies

Senator Maria Cantwell stated that the Supreme Court's decision regarding the removal of FTC commissioners undermines independent agencies designed to ensure bipartisan, fact-based decision-making.

Editor's pick
Artificial Intelligence Newsletter | June 30, 2026· 2 days ago

Open Markets says US Supreme Court 'grossly usurped' Congress in Slaughter case

The Open Markets Institute condemned the Supreme Court's ruling on FTC commissioner removal, arguing it usurps congressional authority and creates a significant constitutional crisis.

Editor's pickGovernment & Public Sector
🩸 Bloodbath no more· 2 days ago

KIDS Act passes House

The House passed the Kids Internet and Digital Safety Act, but key senators warn the legislation faces an uphill battle due to disagreements over preemption language.

Editor's pickGovernment & Public Sector
Artificial Intelligence Newsletter | July 1, 2026· Yesterday

Issa commits to US Judiciary votes on deepfake, anti-piracy legislation

House Judiciary IP subcommittee Chairman Darrell Issa aims to advance bills regulating AI deepfakes and anti-piracy measures through the full committee before the end of the term.

Editor's pickTechnology
Daily Brew· Today

Anthropic’s long-sidelined Fable 5 is greenlit to return

Anthropic's Fable 5 model is cleared for release following regulatory changes.

© 2026 Best Practice AI Ltd. All rights reserved.
