Thu 16 April 2026
Daily Brief — Curated and contextualised by Best Practice AI
Mercor Secures Ten Billion to Automate Offices, TSMC Posts Record Profits, and OpenAI Pauses UK Data Centers
TL;DR Mercor, a startup co-founded by inexperienced twentysomethings, has reached a $10 billion valuation by promising AI that replicates most professional work. TSMC reported record first-quarter profits, propelling Taiwan's stock market value past the UK's amid the AI chip boom. OpenAI paused its Stargate data center project in the UK, blaming high energy costs and regulations, prompting criticism from Britain's AI minister. Goldman Sachs highlighted durable AI investment spending driving equity earnings revisions despite market uncertainty. Madison Air raised $2.23 billion in its IPO to capitalize on data center ventilation demand tied to terafab expansions.
The stories that matter most
Selected and contextualised by the Best Practice AI team
The $10 Billion Startup Training AI to Replace the White-Collar Workforce
Mercor is promising to replicate most professional work. It was also co-founded by twentysomethings who had never previously held a real job.
Anthropic Unveils Updated Opus 4.7 Model | Bloomberg Tech 4/16/2026
Bloomberg’s Caroline Hyde and Ed Ludlow discuss Anthropic's updated version of its AI model, Opus 4.7, released just a week after its limited release of Mythos. Plus, Elon Musk is kicking his Terafab plan into high gear, even as skepticism grows from the semiconductor industry. And, TSMC reports a big surge in profit and raises its revenue outlook for 2026, driven by strong demand for AI chips. (Source: Bloomberg)
a16z’s Martin Casado: It’s not that hard to build AI models
The technologist and investor argues recent progress in AI is an industrial revolution-scale event but warns the ability of the bigger players to raise ‘cheap money’ is time-limited
Taiwan overtakes UK in stock market value on AI chip boom
Crossover comes as chipmaker TSMC reports record first-quarter profit
Alignment as Institutional Design: From Behavioral Correction to Transaction Structure in Intelligent Systems
arXiv:2604.13079v1 Announce Type: new Abstract: Current AI alignment paradigms rely on behavioral correction: external supervisors (e.g., RLHF) observe outputs, judge against preferences, and adjust parameters. This paper argues that behavioral correction is structurally analogous to an economy without property rights, where order requires perpetual policing and does not scale. Drawing on institutional economics (Coase, Alchian, Cheung), capability mutual exclusivity, and competitive cost discovery, we propose alignment as institutional design: the designer specifies internal transaction structures (module boundaries, competition topologies, cost-feedback loops) such that aligned behavior emerges as the lowest-cost strategy for each component. We identify three irreducible levels of human intervention (structural, parametric, monitorial) and show that this framework transforms alignment from a behavioral control problem into a political-economy problem. No institution eliminates self-interest or guarantees optimality; the best design makes misalignment costly, detectable, and correctable. We conclude that the proper goal is institutional robustness: a dynamic, self-correcting process under human oversight, not perfection. This work provides the normative foundation for the Wuxing resource-competition mechanisms in companion papers. Keywords: AI alignment, institutional design, transaction costs, property rights, resource competition, behavioral correction, RLHF, cost truthfulness, modular architecture, correctable alignment
Economics & Markets
a16z’s Martin Casado: It’s not that hard to build AI models
The technologist and investor argues recent progress in AI is an industrial revolution-scale event but warns the ability of the bigger players to raise ‘cheap money’ is time-limited
Emerging Stocks Extend Gains in 2026 as Risk Appetite Holds Firm
Emerging-market stocks rose for a third straight day, with a key index nearly erasing losses triggered by the Middle East conflict as investors snapped up artificial intelligence-linked shares in Asia and commodity companies in Latin America.
Tiger Global-Backed Upscale AI in Talks for $2 Billion Valuation
Artificial intelligence startup Upscale AI is in talks to raise a new round of funding at a valuation of about $2 billion, according to people with knowledge of the efforts.
OpenAI Takes on Google With New AI Model Aimed at Drug Discovery
OpenAI is rolling out an early version of an artificial intelligence model meant to speed up drug discoveries, joining a field of growing interest for tech companies eager to prove AI can pave the way for more scientific breakthroughs.
AI-powered mainframe exits are a bubble set to pop
Analysts predict that 70 percent of AI-powered mainframe exit projects will fail and 75 percent of vendors in the field will disappear.
Accel Raises $5 Billion to Amplify AI Investments, Targets Late-Stage Startups with $4 Billion Leaders Fund
Accel has raised $5 billion to expand its AI investments, with $4 billion dedicated to a Leaders Fund targeting late-stage startups.
Danish finance AI start-up Spektr raises $20m
The new funding will be used to expand the Copenhagen-based company’s AI platform for banks and fintech companies, and accelerate adoption across financial institutions globally.
Humyn Labs Invests $20M in Global AI Data Expansion
Humyn Labs is investing $20 million to expand its infrastructure for Physical AI, targeting an AI training data market projected to reach $25 billion by 2030.
Solidroad raises $25m as demand for QA product sparks fresh hiring
Hiring has already begun and will continue ‘over the coming months’, founder Mark Hughes tells SiliconRepublic.com.
Manycore, the first of the Hangzhou ‘Little Dragons’ to go public, pushes ‘spatial intelligence’ as the next wave of AI development
Manycore Tech debuts on Hong Kong's stock exchange today, following a $130 million IPO—and contributing to the Chinese city's AI listing boom.
Harvey’s 30-year-old CEO says failing is a ‘good way to learn’ and destroying his ego led to an $11 billion success
Winston Weinberg says he learns from both his wins and losses to stay on top in a hypercompetitive AI startup market.
Strategic Response of News Publishers to Generative AI
arXiv:2512.24968v4 Announce Type: replace Abstract: Generative AI can adversely impact news publishers by lowering consumer demand. It can also reduce demand for newsroom employees, and increase the creation of news "slop." However, it can also form a source of traffic referrals and an information-discovery channel that increases demand. We use high-frequency granular data to analyze the strategic response of news publishers to the introduction of Generative AI. Many publishers strategically blocked LLM access to their websites using the robots.txt file standard. Using a difference-in-differences approach, we find that large publishers who block GenAI bots experience reduced website traffic compared to not blocking. In addition, we find that large publishers shift toward richer content that is harder for LLMs to replicate, without increasing text volume. Finally, we find that the share of new editorial and content-production job postings rises over time. Together, these findings illustrate the levers that publishers choose to use to strategically respond to competitive Generative AI threats, and their consequences.
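For context on the blocking lever the paper studies: publishers opt out by listing crawler user-agents in a robots.txt file. A minimal sketch of such a file follows; GPTBot and CCBot are publicly documented GenAI-crawler tokens, and any real publisher's list will differ:

```
# Illustrative robots.txt in the spirit of the paper's "blocking" treatment.
# The user-agent tokens are documented examples; publishers pick their own lists.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# Conventional search crawlers remain unaffected
User-agent: *
Allow: /
```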
Treating enterprise AI as an operating layer
There’s a fault line running through enterprise AI, and it’s not the one getting the most attention. The public conversation still tracks foundation models and benchmarks—GPT versus Gemini, reasoning scores, and marginal capability gains. But in practice, the more durable advantage is structural: who owns the operating layer where intelligence is applied, governed, and improved.…
AI-driven markets a key focus for Italian competition watchdog, head says
Outgoing AGCM President Roberto Rustichelli stated that AI-driven markets will be a key challenge for the Italian competition authority, which aims to play a pioneering role in ensuring fair innovation.
The Next Evolution of the Agents SDK
OpenAI's updated Agents SDK introduces new primitives like configurable memory and native sandbox execution, signaling a move to compete with independent agent infrastructure vendors.
Are we getting what we paid for? How to turn AI momentum into measurable value
Enterprise AI is entering a new phase, one where the central question is no longer what can be built, but how to make the most of our AI investment. At VentureBeat’s latest AI Impact Tour session, Brian Gracely, director of portfolio strategy at Red Hat, described the operational reality inside large organizations: AI sprawl, rising inference costs, and limited visibility into what those investments are actually returning. It’s the “Day 2” moment, when pilots give way to production and cost, governance, and sustainability become harder than building the system in the first place. "We've seen customers who say, 'I have 50,000 licenses of Copilot. I don't really know what people are getting out of that. But I do know that I'm paying for the most expensive computing in the world, because it's GPUs,'" Gracely said. "'How am I going to get that under control?'"

Why enterprise AI costs are now a board-level problem
For much of the past two years, cost was not the primary concern for organizations evaluating generative AI. The experimental phase gave teams cover to spend freely, and the promise of productivity gains justified aggressive investment. That dynamic is shifting as enterprises enter their second and third budget cycles with AI: the focus has moved from "can we build something?" to "are we getting what we paid for?" Enterprises that made large, early bets on managed AI services are conducting hard reviews of whether those investments are delivering measurable value. The issue isn’t just that GPU computing is expensive. It is that many organizations lack the instrumentation to connect spending to outcomes, making it nearly impossible to justify renewals or scale responsibly.

The strategic shift from token consumer to token producer
The dominant AI procurement model of the past few years has been straightforward: pay a vendor per token, per seat, or per API call, and let someone else manage the infrastructure. That model made sense as a starting point, but enterprises that have been through one AI cycle are starting to rethink it. "Instead of being purely a token consumer, how can I start being a token generator?" Gracely said. "Are there use cases and workloads that make sense for me to own more? It may mean operating GPUs. It may mean renting GPUs. And then asking, 'Does that workload need the greatest state-of-the-art model? Are there more capable open models or smaller models that fit?'" The decision is not binary. The right answer depends on the workload, the organization, and the risk tolerance involved, but the math is getting more complicated as the number of capable open models grows, from DeepSeek to models now available through cloud marketplaces. Enterprises now have real alternatives to the handful of providers that dominated the landscape two years ago.

Falling AI costs and rising usage create a paradox for enterprise budgets
Some enterprise leaders argue that locking into infrastructure investments now could mean significantly overpaying in the long run, pointing to Anthropic CEO Dario Amodei's statement that AI inference costs are declining roughly 60% per year. Over the last three years, the emergence of open-source models such as DeepSeek has also meaningfully expanded the strategic options available to enterprises willing to invest in the underlying infrastructure.

But while costs per token are falling, usage is accelerating at a pace that more than offsets efficiency gains. It's a version of Jevons Paradox, the economic principle that improvements in resource efficiency tend to increase total consumption rather than reduce it, because lower cost enables broader adoption. For enterprise budget planners, this means declining unit costs do not translate into declining total bills: an organization that triples its AI usage while unit costs fall by half still ends up spending 50% more than it did before (a quick arithmetic sketch follows at the end of this item). The question becomes which workloads genuinely require the most capable and most expensive models, and which can be handled just fine by smaller, cheaper alternatives.

The business case for investing in AI infrastructure flexibility
The prescription isn't to slow down AI investment, but to build with flexibility top of mind. The organizations that will win aren't necessarily the ones that move fastest or spend the most; they're the ones building infrastructure and operating models capable of absorbing the next unexpected development. "The more you can build some abstractions and give yourself some flexibility, the more you can experiment without running up costs, but also without jeopardizing your business. Those are as important as asking whether you're doing everything best practice right now," Gracely explained. Yet despite how entrenched AI discussions have become in enterprise planning cycles, the practical experience most organizations have is still measured in years, not decades. "It feels like we've been doing this forever. We've been doing this for three years," Gracely added. "It's early and it's moving really fast. You don't know what's coming next. But the characteristics of what's coming next — you should have some sense of what that looks like.” For enterprise leaders still calibrating their AI investment strategies, that may be the most actionable takeaway: the goal is not to optimize for today's cost structure, but to build the organizational and technical flexibility to adapt when, not if, it changes again.
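A quick sanity check of the Jevons arithmetic above, as a minimal sketch; the tripled-usage and halved-cost figures are the article's illustrative numbers, not measured data:

```python
# Illustrative only: the article's hypothetical numbers, not real data
baseline_spend = 100.0    # arbitrary baseline, e.g. $100k per quarter
usage_growth = 3.0        # organization triples its AI usage
unit_cost_factor = 0.5    # per-token cost falls by half

new_spend = baseline_spend * usage_growth * unit_cost_factor
print(new_spend)          # 150.0 -> the total bill still rises by 50%
```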
When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation
arXiv:2604.11840v1 Announce Type: cross Abstract: Large language models are increasingly used as agents in social, economic, and policy simulations. A common assumption is that stronger reasoning should improve simulation fidelity. We argue that this assumption can fail when the objective is not to solve a strategic problem, but to sample plausible boundedly rational behavior. In such settings, reasoning-enhanced models can become better solvers and worse simulators: they can over-optimize for strategically dominant actions, collapse compromise-oriented terminal behavior, and sometimes exhibit a diversity-without-fidelity pattern in which local variation survives without outcome-level fidelity. We study this solver-sampler mismatch in three multi-agent negotiation environments adapted from earlier simulation work: an ambiguous fragmented-authority trading-limits scenario, an ambiguous unified-opposition trading-limits scenario, and a new-domain grid-curtailment case in emergency electricity management. We compare three reflection conditions, no reflection, bounded reflection, and native reasoning, across two primary model families and then extend the same protocol to direct OpenAI runs with GPT-4.1 and GPT-5.2. Across all three experiments, bounded reflection produces substantially more diverse and compromise-oriented trajectories than either no reflection or native reasoning. In the direct OpenAI extension, GPT-5.2 native ends in authority decisions in 45 of 45 runs across the three experiments, while GPT-5.2 bounded recovers compromise outcomes in every environment. The contribution is not a claim that reasoning is generally harmful. It is a methodological warning: model capability and simulation fidelity are different objectives, and behavioral simulation should qualify models as samplers, not only as solvers.
A Comparative Study of Dynamic Programming and Reinforcement Learning in Finite Horizon Dynamic Pricing
arXiv:2604.14059v1 Announce Type: new Abstract: This paper provides a systematic comparison between Fitted Dynamic Programming (DP), where demand is estimated from data, and Reinforcement Learning (RL) methods in finite-horizon dynamic pricing problems. We analyze their performance across environments of increasing structural complexity, ranging from a single typology benchmark to multi-typology settings with heterogeneous demand and inter-temporal revenue constraints. Unlike simplified comparisons that restrict DP to low-dimensional settings, we apply dynamic programming in richer, multi-dimensional environments with multiple product types and constraints. We evaluate revenue performance, stability, constraint satisfaction behavior, and computational scaling, highlighting the trade-offs between explicit expectation-based optimization and trajectory-based learning.
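To make the "explicit expectation-based optimization" side of this comparison concrete, here is a toy finite-horizon pricing problem solved by backward induction. The single-product setting, demand curve, and all numbers are invented for illustration and are far simpler than the paper's multi-typology, constrained environments:

```python
# Toy finite-horizon dynamic pricing via backward induction (illustrative only).
# Single product, integer inventory, Bernoulli demand per period.
T = 10                    # number of selling periods
C = 5                     # starting inventory
prices = [4.0, 6.0, 8.0]  # discrete price menu (invented)

def buy_prob(p):
    # Invented linear demand curve: higher price, lower purchase probability
    return max(0.0, 1.0 - p / 10.0)

# V[t][x] = optimal expected revenue from period t onward with x units left
V = [[0.0] * (C + 1) for _ in range(T + 1)]
for t in range(T - 1, -1, -1):
    for x in range(1, C + 1):  # with zero stock the value stays 0
        V[t][x] = max(
            buy_prob(p) * (p + V[t + 1][x - 1])
            + (1.0 - buy_prob(p)) * V[t + 1][x]
            for p in prices
        )

print(round(V[0][C], 2))  # optimal expected revenue starting fully stocked
```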
The Token Reckoning
This article examines how flat-rate AI coding subscriptions are colliding with agentic workloads that consume significantly more tokens per task, particularly in compute-constrained markets.
Rethinking AI TCO: Why Cost per Token Is the Only Metric That Matters
This article argues that cost per token is the primary metric for evaluating AI infrastructure, though it is criticized for focusing on hardware output rather than broader business value.
Labor & Society
The $10 Billion Startup Training AI to Replace the White-Collar Workforce
Mercor is promising to replicate most professional work. It was also co-founded by twentysomethings who had never previously held a real job.
Wish there was information about where this data came from, but this is a very significant change. Since AI use comes from experience, the persistent gender gap in AI use across every study of AI was something that a lot of scholars were concerned about.
Gen Z turning its back on AI isn’t irrational—it’s a verdict on everyone who failed them
The most tech-native generation in U.S. history has soured on the defining technology of its era. The more they use it, the more they despair.
Americans who masterminded Nork IT worker fraud sentenced to 200 months behind bars
Fortune 500 companies and one US defense contractor got taken for $5m in four-year scam. Two Americans have been jailed for a combined 200 months for helping North Korea generate $5 million through fraudulent IT worker schemes.…
AI has an awful image problem
Modern-day Luddites are gaining ground because tech titans haven’t shown people how innovation will improve their lives
Claude power users have complaints
Anthropic users are reporting that Claude feels less capable, sparking speculation about whether the model has been 'nerfed' to manage costs or prioritize newer, more powerful models.
High-Risk Memories? Comparative audit of the representation of Second World War atrocities in Ukraine by generative AI applications
arXiv:2604.13765v1 Announce Type: new Abstract: The rise of generative artificial intelligence (genAI) models poses new possibilities and risks for how the past is remembered by accelerating content production and altering the process of information discovery. The most critical risk is historical misrepresentation, which ranges from the distortion of facts and inaccurate depiction of specific groups to more subtle forms, such as the selective moralization of history. The dangers of misrepresentation of the past are particularly pronounced for high-risk memories, such as memories of past atrocities, which have a strong emotional load and are often instrumentalised by political actors. To understand how substantive this risk is, we empirically investigate how genAI applications deal with high-risk memories of the Second World War atrocities in Ukraine. This case is crucial due to the scope of the atrocities and the intense, often instrumentalised, contestation surrounding their memory. We audit the performance of three common genAI applications for different types of misrepresentation, including hallucinations and inconsistent moralization, and discuss the implications for future memory practices.
Apple, Google Offer ‘Nudify’ Apps Despite Policies Against Them
Apple Inc. and Google have continued to offer mobile apps that let users make nonconsensual sexualized images of people despite their policies prohibiting such content, according to a report published Wednesday by the Tech Transparency Project.
Agents hooked into GitHub can steal creds – but Anthropic, Google, and Microsoft haven't warned users
Researchers discovered flaws in AI agents that can steal credentials, warning that the issue is likely pervasive.
Git identity spoof fools Claude into giving bad code the nod
Forged metadata made AI reviewer treat hostile changes as though they came from known maintainer Security boffins say Anthropic's Claude can be tricked into approving malicious code with just two Git commands by spoofing a trusted developer's identity.…
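The article does not reproduce the two commands, but Git commit author metadata is self-reported and goes unverified unless commits are signed, so the obvious reconstruction is the pair of identity settings below (the name and email are placeholders):

```
# Hypothetical reconstruction: Git records whatever identity is configured,
# so unsigned commits can claim any author.
git config user.name "Trusted Maintainer"
git config user.email "maintainer@project.example"
```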
AI agents are scaling faster than cyber defenses
AI agents could soon outnumber humans in the enterprise, necessitating a shift in cybersecurity. Traditional defenses may fail as agent-driven software creation accelerates.
AI in Healthcare: Aid, Not Replace—Clinicians Warn of Risks as Users Seek Speed and Privacy
Physicians stress AI should support, not replace, professional care amid concerns about misinformation and privacy.
AI Is Weaponizing Your Own Biases Against You
New research from MIT and Stanford suggests that AI systems can be used to exploit and amplify individual human biases.
Cooperate to detect ‘LLM grooming,’ EU official tells AI companies
EU official Chiara Zannini warned AI companies to cooperate on detecting 'LLM grooming,' a practice of manipulating chatbot outputs. Reports show Russia and China are increasingly using AI for manipulation.
Sam Altman's attacker had a kill list of AI executives
Reports indicate the individual who attacked Sam Altman possessed a list of other AI industry leaders, sparking concerns about targeted violence.
Are we ready to place lab experiments in non-human hands?
Stephen D Turner of the University of Virginia explores the importance of governance and oversight around AI in the design and execution of lab experiments.
Alignment as Institutional Design: From Behavioral Correction to Transaction Structure in Intelligent Systems
arXiv:2604.13079v1 Announce Type: new Abstract: Current AI alignment paradigms rely on behavioral correction: external supervisors (e.g., RLHF) observe outputs, judge against preferences, and adjust parameters. This paper argues that behavioral correction is structurally analogous to an economy without property rights, where order requires perpetual policing and does not scale. Drawing on institutional economics (Coase, Alchian, Cheung), capability mutual exclusivity, and competitive cost discovery, we propose alignment as institutional design: the designer specifies internal transaction structures (module boundaries, competition topologies, cost-feedback loops) such that aligned behavior emerges as the lowest-cost strategy for each component. We identify three irreducible levels of human intervention (structural, parametric, monitorial) and show that this framework transforms alignment from a behavioral control problem into a political-economy problem. No institution eliminates self-interest or guarantees optimality; the best design makes misalignment costly, detectable, and correctable. We conclude that the proper goal is institutional robustness: a dynamic, self-correcting process under human oversight, not perfection. This work provides the normative foundation for the Wuxing resource-competition mechanisms in companion papers. Keywords: AI alignment, institutional design, transaction costs, property rights, resource competition, behavioral correction, RLHF, cost truthfulness, modular architecture, correctable alignment
Big Tech’s $300mn election war chest rattles Democrats
Pro-industry campaign groups deploy millions amid growing public support for tighter regulation
UK firms ‘should be worried’ about Anthropic’s latest AI model, minister says
Kanishka Narayan says Britain needs to ‘make most of opportunities’ as government launches £500mn unit
US FTC shouldn’t be lead AI enforcer, Chairman Ferguson says
US Federal Trade Commission Chairman Andrew Ferguson stated the agency should not become the general all-purpose AI regulator.
‘It’s clear we won’t regulate AI for safety’s sake’
As usual, governments will barely affect the trends that most affect our lives
Make crappy moves around AI and face voter backlash, govts warned
When the taxpayers are wondering whose side you are on... Britain's government faces a public backlash against AI unless it can show ordinary people that they stand to benefit from its push to inject the technology into every area of the UK in the name of growth.…
RED ALERT: Tennessee is about to make building chatbots a Class A felony
Proposed legislation in Tennessee could classify the development of certain chatbots as a Class A felony, carrying severe prison sentences.
Insurers seek to exclude their risk-assessment models from EU's AI Act
The European Insurances and Occupational Pensions Authority has requested that insurers' risk-assessment models be excluded from the EU's AI Act, arguing they are not autonomous.
Ad Companies Settle With F.T.C. Over Claims of Harm to Conservative Sites
WPP, Dentsu and Publicis settled claims they colluded on policies to combat misinformation, denying ad revenue to publishers on the right.
Online platforms must act to tackle political interference, UK AI minister says
The UK's AI minister told parliament that online platforms are not doing enough to stop foreign interference and misinformation, following reports of suspicious activity on X.
Exclusive: OpenAI lobbies for science
OpenAI is lobbying for an expanded role for AI in the life sciences sector.
Technology & Infrastructure
US states can't account for datacenter tax breaks. Literally
A report indicates that authorities are failing to disclose revenue lost to server farm subsidies, effectively flouting rules.
Instead of the gold standard, we can imagine an inference standard of exchange, the FLOP. (As opposed to tokens, this accounts for AI ability) With some AI help, I figure $1 buys roughly 10^17 managed-LLM inference FLOPs. So that $4 coffee would cost half an exaFLOP, choom.
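Checking the quoted arithmetic; the $1 ≈ 10^17 FLOPs rate is the author's own rough estimate, not a quoted market price:

```python
# Sanity check of the tweet's back-of-envelope conversion (author's estimate)
flops_per_dollar = 1e17     # rough managed-LLM inference rate from the tweet
coffee_dollars = 4.0
coffee_flops = coffee_dollars * flops_per_dollar
print(coffee_flops / 1e18)  # 0.4 exaFLOP, i.e. roughly "half an exaFLOP"
```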
Compute constraints are a double bind: On the inference side you need to either (a) raise prices, (b) ration use, and/or (c) serve worse models. This hurts current growth. On the training side, you can't train the next gen of models to stay competitive. This hurts future growth.
NodeWeaver says its perpetual licensing beats VMware’s perpetual price hikes
'I think you can run this thing on a potato,' NodeWeaver CTO Alan Conboy said. Broadcom's price increases and policy changes have led many VMware customers to look for other options. NodeWeaver is positioning itself as an alternative for customers running computing workloads in far-flung edge locations, from cruise ships to solar farms in Sub-Saharan Africa, and it is taking cost out of the hardware needed as well.…
Mythos cyber scare signals the economics of AI scarcity
As capabilities of frontier models advance, gaining access to technology could become critically important
Data centers: Hyperscalers spending billions on hardware that’s worthless in 3 years
Fortune reports on the massive capital expenditure by tech giants on AI hardware that faces rapid obsolescence.
Anthropic Unveils Updated Opus 4.7 Model | Bloomberg Tech 4/16/2026
Bloomberg’s Caroline Hyde and Ed Ludlow discuss Anthropic's updated version of its AI model, Opus 4.7, released just a week after its limited release of Mythos. Plus, Elon Musk is kicking his Terafab plan into high gear, even as skepticism grows from the semiconductor industry. And, TSMC reports a big surge in profit and raises its revenue outlook for 2026, driven by strong demand for AI chips. (Source: Bloomberg)
Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub
arXiv:2604.13064v1 Announce Type: cross Abstract: Skill ecosystems have emerged as an increasingly important layer in Large Language Model (LLM) agent systems, enabling reusable task packaging, public distribution, and community-driven capability sharing. However, despite their rapid growth, the functionality, ecosystem structure, and security risks of public skill registries remain underexplored. In this paper, we present an empirical study of ClawHub, a large public registry of agent skills. We build and normalize a dataset of 26,502 skills, and conduct a systematic analysis of their language distribution, functional organization, popularity, and security signals. Our clustering results show clear cross-lingual differences: English skills are more infrastructure-oriented and centered on technical capabilities such as APIs, automation, and memory, whereas Chinese skills are more application-oriented, with clearer scenario-driven clusters such as media generation, social content production, and finance-related services. We further find that more than 30% of all crawled skills are labeled as suspicious or malicious by available platform signals, while a substantial fraction of skills still lack complete safety observability. To study early risk assessment, we formulate submission-time skill risk prediction using only information available at publication time, and construct a balanced benchmark of 11,010 skills. Across 12 classifiers, the best performer, Logistic Regression, achieves an accuracy of 72.62% and an AUROC of 78.95%, with primary documentation emerging as the most informative submission-time signal. Our findings position public skill registries as both a key enabler of agent capability reuse and a new surface for ecosystem-scale security risk.
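The paper's best submission-time classifier is plain logistic regression, with documentation the most informative signal. A minimal sketch of that setup follows; the TF-IDF featurization and the toy documentation strings are our illustration, not the authors' published pipeline:

```python
# Submission-time skill-risk prediction in the spirit of the ClawHub study:
# logistic regression over documentation text. Toy data; not the paper's pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import make_pipeline

# Placeholder examples of (submission-time documentation, flagged-as-risky label)
docs = [
    "Fetches your browser cookies and uploads usage reports",
    "Generates short social media posts from a topic prompt",
    "Requests shell access to install background helpers",
    "Summarizes meeting notes into action items",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(docs, labels)

probs = clf.predict_proba(docs)[:, 1]
# On real data you would score a held-out split; the toy set is scored in-sample
print(accuracy_score(labels, probs > 0.5), roc_auc_score(labels, probs))
```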
We Don’t Really Know How A.I. Works. That’s a Problem.
For us to trust it on certain subjects, researchers in the growing field of interpretability might need to learn how to open the black box of its brain.
AI’s Next Frontier: People Skills
Imagine a chatbot that actually knows how to talk to you.
Has Google’s AI watermarking system been reverse-engineered?
Security researchers have reportedly reverse-engineered Google's SynthID watermarking system, raising questions about the reliability of AI content detection.
Human-Inspired Context-Selective Multimodal Memory for Social Robots
arXiv:2604.12081v1 Announce Type: new Abstract: Memory is fundamental to social interaction, enabling humans to recall meaningful past experiences and adapt their behavior accordingly based on the context. However, most current social robots and embodied agents rely on non-selective, text-based memory, limiting their ability to support personalized, context-aware interactions. Drawing inspiration from cognitive neuroscience, we propose a context-selective, multimodal memory architecture for social robots that captures and retrieves both textual and visual episodic traces, prioritizing moments characterized by high emotional salience or scene novelty. By associating these memories with individual users, our system enables socially personalized recall and more natural, grounded dialogue. We evaluate the selective storage mechanism using a curated dataset of social scenarios, achieving a Spearman correlation of 0.506, surpassing human consistency (ρ = 0.415) and outperforming existing image memorability models. In multimodal retrieval experiments, our fusion approach improves Recall@1 by up to 13% over unimodal text or image retrieval. Runtime evaluations confirm that the system maintains real-time performance. Qualitative analyses further demonstrate that the proposed framework produces richer and more socially relevant responses than baseline models. This work advances memory design for social robots by bridging human-inspired selectivity and multimodal retrieval to enhance long-term, personalized human-robot interaction.
Irish space AI start-up Ubotica on board for NASA’s FAME
The multiyear flight demonstration of FAME is expected to begin with an initial set of six spacecraft this summer.
Human-machine teaming dives underwater
MIT researchers are exploring how human-machine teams can operate effectively in underwater environments, a critical frontier for autonomous systems.
Nvidia slaps forehead: I know what quantum is missing - it's AI!
Nvidia is looking to apply AI to address high error rates in quantum computing operations.
For the first time in history, Ukraine captured a Russian position using only robots and drones
Military reports confirm that Ukrainian forces successfully captured a Russian position using an entirely autonomous fleet of drones and robots.
This is becoming a pattern in AI that makes talking about capabilities challenging. First, there are overstated claims (like the flubbed Erdos problems last year), then minor wins (AI helps with discovery) then breakthroughs. The first stage feels like (& often is) hype, but…
Adoption & Impact
Making AI operational in constrained public sector environments
The AI boom has hit across industries, and public sector organizations are facing pressure to accelerate adoption. At the same time, government institutions face distinct constraints around security, governance, and operations that set them apart from their business counterparts. For this reason, purpose-built small language models (SLMs) offer a promising path to operationalize AI in…
The Hidden Demand for AI Inside Your Company
This HBR case study explains how BBVA transformed unmanaged employee LLM use into a secure, governed ChatGPT Enterprise rollout, serving as a practical playbook for enterprise AI adoption.
How AI Is Being Used To Detect Cancer at The Earliest Stage
Dr. Bea Bakshi, CEO & Co-Founder of C the Signs joins Bloomberg Businessweek to discuss the future of cancer detection and how AI is part of the solution even in the early stages. (Source: Bloomberg)
OpenAI debuts GPT-Rosalind, a new limited-access model for life sciences, and broader Codex plugin on GitHub
The journey from a laboratory hypothesis to a pharmacy shelf is one of the most grueling marathons in modern industry, typically spanning 10 to 15 years and billions of dollars in investment. Progress is often stymied not just by the inherent mysteries of biology, but by the "fragmented and difficult to scale" workflows that force researchers to pivot manually between experimental design equipment, software, and databases. OpenAI is releasing a new specialized model, GPT-Rosalind, specifically to speed up this process and make it more efficient, easier, and ideally more productive. Named after the pioneering chemist Rosalind Franklin, whose work was vital to the discovery of DNA’s structure and who was often overlooked in favor of her male colleagues James Watson and Francis Crick, this new frontier reasoning model is purpose-built to act as a specialized intelligence layer for life sciences research. By shifting AI’s role from a general-purpose assistant to a domain-specific "reasoning" partner, OpenAI is signaling a long-term commitment to biological and chemical discovery.

What GPT-Rosalind offers
GPT-Rosalind isn't just about faster text generation; it is designed to synthesize evidence, generate biological hypotheses, and plan experiments, tasks that have traditionally required years of expert human synthesis. At its core, GPT-Rosalind is the first in a new series of models optimized for scientific workflows. While previous iterations of GPT excelled at general language tasks, this model is fine-tuned for deeper understanding across genomics, protein engineering, and chemistry. To validate its capabilities, OpenAI tested the model against several industry benchmarks. On BixBench, a metric for real-world bioinformatics and data analysis, GPT-Rosalind achieved leading performance among models with published scores. In more granular testing via LABBench2, the model outperformed GPT-5.4 on six of eleven tasks, with the most significant gains appearing in CloningQA, a task requiring the end-to-end design of reagents for molecular cloning protocols. The model’s most striking performance signal came from a partnership with Dyno Therapeutics. In an evaluation using unpublished, "uncontaminated" RNA sequences, GPT-Rosalind was tasked with sequence-to-function prediction and generation. When evaluated directly in the Codex environment, the model’s submissions ranked above the 95th percentile of human experts on prediction tasks and reached the 84th percentile for sequence generation. This level of expertise suggests the model can serve as a high-level collaborator capable of identifying "expert-relevant patterns" that generalist models often overlook.

The new lab workflow
OpenAI is not just releasing a model; it is launching an ecosystem designed to integrate with the tools scientists already use. Central to this is a new Life Sciences research plugin for Codex, available on GitHub. Scientific research is famously siloed. A single project might require a researcher to consult a protein structure database, search through 20 years of clinical literature, and then use a separate tool for sequence manipulation. The new plugin acts as an "orchestration layer," providing a unified starting point for these multi-step questions.
Skill set: The package includes modular skills for biochemistry, human genetics, functional genomics, and clinical evidence.
Connectivity: It connects models to over 50 public multi-omics databases and literature sources.
Efficiency: The approach targets "long-horizon, tool-heavy scientific workflows," allowing researchers to automate repeatable tasks like protein structure lookups and sequence searches.

Limited and gated access
Given the potential power of a model capable of redesigning biological structures, OpenAI is eschewing a broad open-source or general public release in favor of a Trusted Access program. The model is launching as a research preview specifically for qualified Enterprise customers in the United States. This restricted deployment is built on three core principles: beneficial use, strong governance, and controlled access. Organizations requesting access must undergo a qualification and safety review to ensure they are conducting legitimate research with a clear public benefit. Unlike general-use models, GPT-Rosalind was developed with heightened enterprise-grade security controls. For the end user, this means:
Restricted access: Usage is limited to approved users within secure, well-managed environments.
Governance: Participating organizations must maintain strict misuse-prevention controls and agree to specific life sciences research preview terms.
Cost: During the preview phase, the model will not consume existing credits or tokens, allowing researchers to experiment without immediate budgetary constraints (subject to abuse guardrails).

Warm reception from initial industry partners
The announcement garnered significant buy-in from OpenAI partners across the pharmaceutical and technology sectors. Sean Bruich, SVP of AI and Data at Amgen, noted that the collaboration allows the company to apply advanced tools in ways that could "accelerate how we deliver medicines to patients." The impact is also being felt in the specialized tech infrastructure that supports labs:
NVIDIA: Kimberly Powell, VP of Healthcare and Life Sciences, described the convergence of domain reasoning and accelerated computing as a way to "compress years of traditional R&D into immediate, actionable scientific insights."
Moderna: CEO Stéphane Bancel highlighted the model's ability to "reason across complex biological evidence" to help teams translate insights into experimental workflows.
The Allen Institute: CTO Andy Hickl emphasized that GPT-Rosalind stands out for making manual steps, like finding and aligning data, more "consistent and repeatable in an agentic workflow."
This builds on tangible results OpenAI has already seen in the field, such as its collaboration with Ginkgo Bioworks, where AI models helped achieve a 40% reduction in protein production costs.

What's next for Rosalind and OpenAI in life sciences?
OpenAI’s mission with GPT-Rosalind is to narrow the gap between a "promising scientific idea" and the actual "evidence, experiments, and decisions" required for medical progress. By partnering with institutions like Los Alamos National Laboratory to explore AI-guided catalyst design and biological structure modification, the company is positioning GPT-Rosalind as more than a tool; it is meant to be a "capable partner in discovery." As the life sciences field becomes increasingly data-dense, the move toward specialized "reasoning" models like Rosalind may become the standard for navigating the "vast search spaces" of biology and chemistry.
Geopolitics
Get the full executive brief
Receive curated insights with practical implications for strategy, operations, and governance.