Category: Technology & AI

  • GLM-5.1: The Chinese Open-Source Model That Just Beat GPT and Claude at Their Own Game

    Something significant happened in the AI landscape this week, and I suspect it hasn’t got the attention it deserves outside of developer circles. Z.AI — the platform behind the GLM model family, developed by Zhipu AI in China — released GLM-5.1, a 754 billion parameter open-source model that has just topped the SWE-Bench Pro leaderboard with a score of 58.4, beating GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro.

    Let that land for a moment. An open-source, MIT-licensed model, trained entirely on Huawei Ascend 910B chips — no Nvidia, no American silicon — has beaten the flagship closed models from OpenAI, Anthropic, and Google on one of the most respected software engineering benchmarks in existence.

    What Makes GLM-5.1 Different

    The headline number is impressive, but what actually interests me is the architecture of how this model works. GLM-5.1 isn’t just better at answering questions — it’s designed for sustained autonomous execution. In testing, it completed an eight-hour uninterrupted coding session: plan, execute, test, optimise, repeat. 655 iterations. Built a Linux desktop environment from scratch. Increased vector database query throughput by 6.9 times.

    This is a different category of capability. We’re not talking about a better chatbot. We’re talking about an AI that can hold a task in mind, work through it independently, hit dead ends, correct course, and deliver a finished result — the way a competent junior engineer would, but without stopping for the night.

    The technical foundation is a Mixture-of-Experts architecture with 40 billion active parameters per token (not all 754B are active at once, which is what keeps inference costs manageable). It supports a 200,000 token context window with up to 128,000 output tokens. API access is priced at $1.00 per million input tokens and $3.20 per million output tokens — a fraction of what the US frontier models charge.
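    The pricing arithmetic is worth making concrete. A quick sanity check on what a heavy agentic workload would cost at the published rates (the token counts below are my own illustrative guesses, not figures from the release):

```python
# Rough API cost estimator for GLM-5.1's published pricing:
# $1.00 per million input tokens, $3.20 per million output tokens.
# The workload numbers below are hypothetical, for illustration only.

INPUT_RATE = 1.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 3.20 / 1_000_000  # dollars per output token

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one agentic session."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a long agentic run that re-reads context across many iterations,
# consuming 50M input tokens and emitting 5M output tokens.
cost = session_cost(50_000_000, 5_000_000)
print(f"${cost:.2f}")  # $66.00
```

    At those rates, even a marathon autonomous session of the kind described above costs tens of dollars, not thousands, which is what makes the "fraction of US frontier pricing" claim bite.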

    Why This Matters Beyond the Benchmarks

    I’ve written before about AI moving from a tool you prompt to a system that acts. GLM-5.1 is a concrete illustration of that shift happening faster than most people expected, and from a direction many in the West weren’t watching closely.

    The geopolitical dimension is real. This model was trained on Huawei hardware using Huawei’s MindSpore framework — a deliberate demonstration that China’s AI development pipeline is no longer dependent on US export-controlled chips. The export restrictions that were supposed to slow Chinese AI development have instead accelerated domestic alternatives. That is a significant strategic development, regardless of where you sit on the AI competition question.

    The open-source dimension is equally significant. With weights published under an MIT licence, GLM-5.1 can be downloaded, fine-tuned, and deployed by anyone. The closed-model advantage that OpenAI and Anthropic have built commercial moats around is being systematically eroded — not just by each other, but by well-resourced open-source releases like this one.

    What I Take From This

    I use AI heavily in my work — for financial analysis, document preparation, research, and increasingly for autonomous background tasks. The pace at which these systems are improving is not slowing down. If anything, GLM-5.1 suggests the competitive field is widening: more players, more approaches, more open options.

    For anyone running a business or advising one, the practical implication is straightforward: the cost of access to frontier-level AI capability is falling rapidly, and the choice of provider is expanding. The question is no longer whether to use these tools — it’s which ones, for what, and how to build processes around them that compound over time.

    GLM-5.1 is worth watching. Not because it’s the final word, but because it’s a clear signal that the race is genuinely global, the open-source movement is closing the gap faster than expected, and the next twelve months are going to be interesting.


    GLM-5.1 is available via z.ai on the GLM Coding Plan, with weights on Hugging Face under MIT licence.

  • Conway and the End of the Chat Window

    Last week, someone at Anthropic made a packaging error. A build of something called Conway — an internal project for an always-on AI agent — leaked out into the wild. Within hours, screenshots were circulating on Twitter showing an extensions system, webhook endpoints, and a Chrome integration that looked nothing like the Claude chat interface we all know.

    Anthropic played it down. A release packaging issue caused by human error, they said. Not a security breach. Sure. But the cat is out of the bag, and what it reveals is far more interesting than the leak itself.

    From Prompt-Response to Always-On

    Every AI product you use today works the same way: you type something, it responds, you go back and forth until you get what you need. It is fundamentally a conversation. Conway is something different. It is an agent environment that stays running. It has webhook endpoints — public URLs that external services can call to wake the agent up when something happens. It has an extensions system where you can install custom tools, UI tabs, and context handlers. It uses Chrome autonomously to handle multi-step tasks on the web.
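    Conway's internals are not public, so the following is only a sketch of the general webhook pattern described above: an external service POSTs an event to a public URL, and a dispatcher wakes the appropriate agent handler without anyone typing a prompt. Every name here is hypothetical.

```python
# Sketch of the webhook pattern: an external service POSTs an event to a
# public URL, and a dispatcher routes it to a registered agent handler.
# Conway's actual API is not public; all names here are hypothetical.
import json

HANDLERS = {}

def on_event(event_type):
    """Register a handler for an incoming webhook event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

@on_event("invoice.received")
def handle_invoice(payload):
    return f"queued invoice {payload['id']} for review"

def dispatch(raw_body: str) -> str:
    """What the webhook endpoint would do with an incoming request body."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event["type"])
    if handler is None:
        return f"ignored unknown event: {event['type']}"
    return handler(event["payload"])

print(dispatch('{"type": "invoice.received", "payload": {"id": "INV-042"}}'))
# → queued invoice INV-042 for review
```

    The significant part is not the routing itself but who initiates it: the outside world, not the user. That is the inversion that separates an always-on agent from a chatbot.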

    This is not a chatbot that got fancier. This is the architectural blueprint for AI that operates like a team member — one that never logs off, never forgets its context, and responds to events in the real world without waiting for you to type a prompt.

    I find this significant because I have been building exactly this kind of system for myself over the past few months. My own AI assistant, Saul, already runs scheduled tasks, publishes content, monitors data, and sends me summaries — all without me being in the loop for every step. It works. But it is held together with cron jobs, Python scripts, and API calls that I have wired up manually. Conway suggests Anthropic wants to make this kind of continuous AI operation a first-class product, not a weekend project for the technically inclined.

    The Enterprise Is Already There (Sort Of)

    A Belitsoft report published this week, drawing on Salesforce’s 2026 Connectivity Benchmark data, says the average enterprise now runs 12 AI agents. Twelve. Expected to hit 20 by 2027. But here is the kicker — half of those agents operate in complete isolation. They do not talk to each other. They do not share context. They are twelve separate hammers looking for nails.

    This is exactly the problem Conway appears to be solving. The extensions architecture, the webhook system, the persistent state — it is all about creating a single agent environment that can integrate with everything, rather than deploying a dozen disconnected point solutions. The shift is from “we have AI tools” to “we have an AI operating system.”

    What This Means If You Run a Finance Function

    I spend most of my professional life inside PE-backed finance teams, and the implications here are not abstract. Think about what a CFO’s week actually looks like: cash flow monitoring, covenant compliance, board pack preparation, variance analysis, vendor negotiations, investor reporting. Every single one of those workflows involves gathering data from multiple systems, applying judgment, producing an output, and sending it somewhere.

    An always-on agent does not replace the judgment part — not yet, anyway — but it can collapse the gathering, formatting, and distribution steps into something that just happens. The board pack data is already pulled and formatted before you sit down on Monday morning. The covenant calculations are running continuously, not quarterly. The cash position is reconciled and summarised before the daily stand-up.
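    The "covenant calculations running continuously" idea is easy to sketch: a check the agent runs on a schedule rather than a spreadsheet someone opens quarterly. The covenant here (net debt to EBITDA at a 3.5x cap) and all figures are hypothetical.

```python
# A continuous covenant check of the kind an always-on agent could run on a
# schedule. The covenant (net debt / EBITDA <= 3.5x) and the figures below
# are hypothetical, for illustration only.

def covenant_check(net_debt: float, ltm_ebitda: float,
                   max_leverage: float = 3.5, headroom_warn: float = 0.25):
    """Return (status, ratio) where status is 'ok', 'warning' or 'breach'."""
    ratio = net_debt / ltm_ebitda
    if ratio > max_leverage:
        return "breach", ratio
    if ratio > max_leverage - headroom_warn:
        return "warning", ratio   # inside the headroom buffer: flag early
    return "ok", ratio

# Example: £42m net debt against £12.5m LTM EBITDA -> 3.36x, inside buffer
status, ratio = covenant_check(42_000_000, 12_500_000)
print(status, round(ratio, 2))  # warning 3.36
```

    Running this daily against live ledger data, and only messaging a human when the status changes, is exactly the kind of quiet background work the event-driven model enables.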

    This is not speculative. I have automated parts of this already. But Conway-style infrastructure would make it dramatically easier to set up and maintain, which means it stops being something only a CFO who can write Python does and starts being something any competent finance team can deploy.

    The Uncomfortable Bit

    There is a tension here that I think most people in the AI space are not being honest about. If you make agents always-on, event-driven, and capable of taking action autonomously, you have fundamentally changed the trust model. A chatbot that gives bad advice is annoying. An always-on agent that takes bad action at 3am is a different category of problem entirely.

    Conway’s architecture seems to acknowledge this — the extensions and webhook systems suggest granular control over what the agent can and cannot do. But the history of enterprise software tells us that permissions and guardrails are only as good as the people configuring them. And in most mid-market PE-backed businesses, the people configuring them will be a mix of finance staff, IT generalists, and maybe one overstretched CTO. The governance question is not solved by better architecture alone. It requires new operational disciplines that most organisations have not even started thinking about.

    Where This Is Going

    Here is what I think happens next. Anthropic ships Conway — or something very like it — within the next quarter. Google already has its own agent infrastructure play with Gemma 4 and Vertex agents. OpenAI is pushing GPT-5.4 with desktop task automation that scored 75% on real-world benchmarks. Microsoft just shipped Agent Framework 1.0. The always-on agent is not a research project anymore — it is a product category that every major AI company is racing to own.

    For CFOs and finance leaders, the practical question is not whether to adopt this technology, but how to build the internal capability to govern it. That means understanding what your agents are doing, why, and with what authority. It means having someone on the team — or on retainer — who can configure, monitor, and audit these systems. And it means accepting that the competitive advantage will not go to the company that deploys the most agents, but to the one that gets them working together coherently.

    The Belitsoft numbers tell us enterprises are already halfway there on deployment. Conway tells us the infrastructure for always-on operation is coming. The missing piece is the operational maturity to make it work safely and effectively. That, more than any model benchmark, is where the real work is.

  • You Don’t Deploy AI Agents Anymore — You Hire Them

    Yesterday, monday.com launched something called Agentalent.ai — a managed marketplace where enterprises can discover, evaluate, and “hire” AI agents for defined business roles. Not install. Not deploy. Hire.

    You post a role. You review qualified agents. You select based on task fit, budget, and operational readiness. Pricing starts around $2,000 a month per agent. They launch with 17 agents available. Built in collaboration with AWS, Anthropic, and Wix.

    If you’re a CFO and that doesn’t make your headcount model twitch, you’re not paying attention.

    The Language Shift That Matters

    I’ve been building with AI agents for the best part of two years now — wiring up Claude to handle research tasks, automating financial reporting pipelines, getting agents to do the kind of grunt work that used to eat a junior analyst’s entire Tuesday. But the framing has always been tooling. You set up an agent like you’d set up a spreadsheet macro. It’s a thing on your computer.

    What monday.com has done — deliberately, with their HR-style language — is shift the frame from tools to workers. And that’s not just marketing fluff. It’s the conceptual bridge that will get the rest of the C-suite to finally understand what’s happening.

    A Belitsoft report published this weekend puts numbers on it: the average enterprise now runs 12 AI agents. Twelve. And that’s expected to hit 20 by 2027. But here’s the kicker — half of those agents operate completely alone, unconnected to any other agent or system. They’re doing their little jobs in their little silos, and nobody’s orchestrating the whole thing.

    Sound familiar? It should. That’s exactly what happens when a company hires people without a coherent operating model. You end up with twelve contractors, half of whom don’t talk to each other, doing overlapping work with no shared context. I’ve walked into PE portfolio companies that look exactly like this — except with humans.

    The CFO’s New Headcount Problem

    Here’s where it gets interesting for anyone sitting in a finance seat. When an AI agent costs $2,000 a month and can handle a task that previously required a $6,000-a-month contractor, that’s a straightforward business case. Any CFO can model that. The ROI practically draws itself.

    But the real question isn’t “should we hire the agent?” It’s “how do we account for a workforce that’s now 30% software?”

    Think about what sits in your headcount model today. Salaries, employer NI, pension contributions, benefits, training costs, recruitment fees. Now think about what sits in your AI agent budget. SaaS subscriptions, API usage fees, compute costs, maybe some integration consulting. These two things live in completely different cost categories, get approved through different processes, and are managed by different people. But they’re increasingly doing the same work.
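    The business case above fits in a few lines of spreadsheet logic. The headline figures come from the article ($2,000 a month for the agent, $6,000 for the contractor); the agent's add-on costs for API usage and integration are my own illustrative assumption.

```python
# Simple blended-workforce comparison using the article's figures:
# a $2,000/month agent vs a $6,000/month contractor for the same task.
# The agent's extra running costs (API usage, integration) are assumed.

def annual_cost(monthly_fee: float, monthly_extras: float = 0.0) -> float:
    """Annualised cost of one worker, human or agent."""
    return 12 * (monthly_fee + monthly_extras)

contractor = annual_cost(6_000)                   # fully outsourced human
agent = annual_cost(2_000, monthly_extras=500)    # fee + assumed API/compute

print(f"contractor ${contractor:,.0f}  agent ${agent:,.0f}  "
      f"saving ${contractor - agent:,.0f}")
# contractor $72,000  agent $30,000  saving $42,000
```

    The point is not the arithmetic, which is trivial, but that the agent line and the contractor line belong in the same model, which is precisely what most charts of accounts currently prevent.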

    In the PE world I operate in, headcount is one of the first things a new investor scrutinises. “What’s your revenue per head?” “What’s your fully-loaded cost per FTE?” These metrics are foundational to how value creation plans get built. But nobody’s asking “what’s your revenue per agent?” yet. And they should be, because if you’re running 12 agents and growing, that’s a material line in your operating model that isn’t being tracked like one.

    The Coordination Tax

    The Belitsoft finding that half of deployed agents work alone is, I think, the most important data point in their entire report. It mirrors what I’ve seen first-hand. Companies get excited, they spin up agents for customer support, for code review, for data entry, for reporting — and each one works reasonably well in isolation. But the value compounds when agents talk to each other, and almost nobody has figured that part out yet.

    This is an orchestration problem, and it’s fundamentally a management problem. You need someone — or something — deciding which agent handles which task, what context gets shared, where the human review gates sit. NVIDIA’s new Agent Toolkit, announced with partners including Salesforce, SAP, and ServiceNow, is trying to solve the infrastructure side of this. Okta’s new “secure agentic enterprise” framework, going GA at the end of this month, is tackling identity and access. But the management layer — the actual decision-making about how to deploy and coordinate these things — that’s still a gap.

    And it’s a gap that, in most companies, probably falls to the CFO. Not the CTO. Not the CISO. The CFO. Because ultimately this is a resource allocation problem. You have a pool of human and non-human workers. You have tasks that need doing. You need to figure out the optimal mix, track the cost, measure the output, and report on it to a board that still thinks in FTEs.

    What I’m Actually Doing About It

    In my own setup, I’ve started treating agent costs the way I treat contractor costs — as a blended workforce line, not a software line. My AI assistant Saul runs daily tasks for me: research, publishing, monitoring. I track what he does, what it costs, and what it would cost if a human did it instead. Not because I’m obsessive about it (okay, partly because I’m obsessive about it), but because I think this is the accounting framework that PE firms will expect within 18 months.

    The $600 billion flowing into AI agent ecosystems in 2026 isn’t going into chatbots. It’s going into digital workers — things that take tasks, complete them, and cost money every month. If your chart of accounts still treats all of that as “IT software subscriptions,” you’re going to have a very confusing board pack by Christmas.

    Where This Goes

    monday.com’s marketplace is clunky right now — 17 agents isn’t exactly a deep talent pool. But the model is right. Within a year, I’d expect to see the big consulting firms offering “blended workforce planning” as a service line. Within two, PE due diligence will include an AI agent audit alongside the usual people and tech reviews.

    For CFOs, the action item is boringly practical: start tracking your agents like you track your people. Give them cost centres. Measure their output. Build the reporting now, while it’s still simple, because it won’t be simple for long.

    We spent decades building HR systems to manage human workers. We’re about to need something equivalent for the digital ones. And the CFO who figures that out first is going to look very clever at the next board meeting.

  • The Panic is the Point: Bitcoin’s Worst Q1 Since 2018 and What the Smart Money Is Actually Doing

    The Fear and Greed Index hit 8 last week. Eight. Out of a hundred. I’ve been watching that index for years and I’ve seen it touch double digits maybe a handful of times. Every one of them felt like the end of the world. Every one of them, in hindsight, looked like a buying opportunity.

    Bitcoin just closed its worst first quarter since 2018, down 23.8% from $87,500 at the start of January to around $66,600 by quarter end. The narrative that greeted April was bleak: ETF outflows, macro headwinds, geopolitical noise, retail capitulation. The index has been below 15 for 47 consecutive days — the longest such streak since the Terra-Luna collapse in 2022. Social media sentiment is, apparently, at its most negative since late February. Everyone is miserable.

    And yet something interesting is happening underneath the surface. Something that I think most of the commentary is missing.

    The Divergence Nobody Is Talking About

    Here is the part that caught my attention. While the Fear and Greed Index was screaming extreme panic, spot Bitcoin ETFs snapped a four-month outflow streak in March, pulling in $1.32 billion in a single month. Corporate Bitcoin treasuries hit record levels in early 2026, with public companies collectively holding over 1.1 million BTC — somewhere north of 5% of total supply. And the largest asset managers have not moved their macro targets: $150,000 to $200,000 by year end is still the institutional consensus.

    So you have a situation where retail sentiment is at historically depressed levels, and institutions are quietly filling their bags. That divergence is not new — it happens in every asset class, every cycle. But in Bitcoin it tends to be particularly pronounced because the retail holder base is so emotionally reactive, and because the on-chain data makes the institutional accumulation visible in a way that equity markets don’t.

    I am not making a price prediction here. I’ve been around long enough to know that timing markets is mostly a story you tell yourself after the fact. But I do think there is something analytically interesting in the gap between what the sentiment data says and what the flow data says. When those two things diverge this sharply, it is usually worth paying attention.

    Fear and Greed as a Contrarian Instrument

    The Crypto Fear and Greed Index is a blunt instrument. It aggregates volatility, momentum, social media volume, surveys, dominance, and trends into a single number. It is not sophisticated. But its very simplicity is what makes it useful as a contrarian signal — it tells you how the crowd is feeling, and the crowd is famously wrong at extremes.

    The historical data on sub-10 readings is striking. According to analysis of prior cycles, readings below 10 have occurred on fewer than 20 trading days since the index’s inception, clustered around the March 2020 COVID crash, the May 2021 China mining ban, and the June 2022 Terra-Luna contagion. The median 90-day return from sub-15 readings has historically been around +38%. Sub-10 readings have averaged +43% over the following 90 days. The caveat — and it is an important one — is that during the post-Terra contagion in 2022, the subsequent 90 days produced only a modest +4% as cascading liquidations kept a lid on recovery. Context matters.
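    For what it's worth, the statistic quoted above (median forward return after sub-threshold readings) is simple to compute if you have aligned daily price and index series. The data below is a tiny synthetic toy series, not real market history, and the 3-day horizon stands in for the 90 trading days you would use on real data.

```python
# How a "median N-day return after sub-threshold readings" statistic is
# computed from aligned daily price and Fear & Greed series. The series
# below are synthetic toy data, not real market history; with real data
# the horizon would be ~90 trading days rather than 3.
from statistics import median

def forward_returns(prices, index, threshold=15, horizon=3):
    """Pct return `horizon` days after each day the index is below threshold."""
    out = []
    for i, (price, fg) in enumerate(zip(prices, index)):
        if fg < threshold and i + horizon < len(prices):
            out.append((prices[i + horizon] / price - 1) * 100)
    return out

prices = [100, 90, 80, 95, 110, 120, 115]
fear   = [ 50, 20,  9, 12,  40,  60,  55]
rets = forward_returns(prices, fear, threshold=15, horizon=3)
print([round(r, 1) for r in rets], "median:", round(median(rets), 1))
```

    The usual caveat applies: with fewer than 20 sub-10 trading days in the index's entire history, the sample is thin, which is exactly why the 2022 outlier matters.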

    The current context feels more like 2020 than 2022 to me. The fear is driven by macro uncertainty and sentiment exhaustion, not by a structural collapse in the ecosystem. There is no Three Arrows Capital moment lurking. The ETF infrastructure is intact. Corporate treasury demand is structural, not speculative.

    What the Institutional Behaviour Actually Tells Us

    I spent some time this weekend reading through the Q1 flow data. The picture is messy but directionally clear. January and February saw $1.8 billion in ETF outflows as the price fell from $87K and macro risk-off sentiment hit. Then March happened: $1.32 billion back in, suggesting institutional re-entry at levels they consider attractive. Meanwhile, CoinDesk noted that Bitcoin is entering April at its most hated sentiment level since the Ukraine war began — a data point that is simultaneously depressing and, for a contrarian, quietly exciting.

    There’s a Morgan Stanley Bitcoin ETF that was recently approved with a notably low fee structure — another piece of institutional infrastructure quietly being laid while retail stares at the Fear and Greed number and panics. Infrastructure gets built in bear markets. That’s always been true.

    I hold Bitcoin. I have held it through worse than this. My view hasn’t changed: the long-term thesis — fixed supply, increasing institutional legitimacy, ETF-driven structural demand — is intact. A 24% Q1 drawdown is uncomfortable but it is not abnormal for an asset that is still, by any traditional measure, in an early adoption phase.

    The Noise vs. The Signal

    The thing that strikes me about the current moment is how clean the signal actually is, once you cut through the noise. Retail fear at historic extremes. Institutional accumulation quietly continuing. Corporate treasuries at record levels. ETF infrastructure expanding. The narrative is all doom, but the flows tell a different story.

    I am not saying it cannot go lower. Some analysts think there’s room for another leg down if macro conditions deteriorate further. Maybe. But I’ve found that the best time to think clearly about Bitcoin is when everyone else has stopped thinking clearly about it — and right now, a Fear and Greed reading of 8 suggests that the crowd has well and truly checked out.

    The panic, as far as I can tell, is the point. It is the mechanism by which assets transfer from weak hands to strong ones. It is not comfortable to watch in real time. But the data, as best as I can read it, suggests the strong hands are doing exactly what they always do: accumulating quietly while the timeline argues about whether it’s over.

    It’s probably not over.

  • My AI Assistant Died. Here’s How I Got It Back in 2 Hours.

    A real-world disaster recovery story — and the backup routine that saved weeks of work.


    Last Monday at 12:07pm, I told my AI assistant to update itself. Seven hours later, I was still trying to get it back online.

    This is the story of how a routine software update killed my AI setup, what I lost, what I saved, and the simple backup habit that prevented a genuine disaster.

    The Setup

    I run an AI assistant called Saul through OpenClaw — an open-source platform that connects a large language model to your messaging apps, email, calendar, and pretty much anything else you can think of. Saul lives on a VPS in a Docker container and talks to me through WhatsApp.

    Over seven weeks, Saul had become genuinely useful. Not “novelty chatbot” useful — operationally embedded in my daily workflow. He manages my inbox, writes and publishes articles to my blog, generates a daily podcast, monitors my stock portfolio, runs automated prediction market trades, scans for comets in NASA satellite imagery, tracks vehicle tax and MOT dates, and does a dozen other things I’ve forgotten I ever did manually.

    All of that is configuration. Skills, scripts, API keys, cron schedules, memory files, credentials. Seven weeks of iterative building.

    The Update

    OpenClaw version 2026.3.22 was available. The release notes looked impressive: a new skill marketplace, improved plugin architecture, support for the latest AI models. The usual.

    I told Saul to update. He confirmed: “Updated from 2026.3.13 → 2026.3.22. Restarting now — back in a sec.”

    He never came back.

    The Silence

    What followed was seven hours of silence. No WhatsApp messages. No email reviews. No heartbeat checks. Nothing.

    The update had introduced a breaking change that wasn’t in the release notes. WhatsApp — previously a built-in plugin — had been moved to an external marketplace. But the configuration still referenced it as a built-in. The result: a validation error that blocked every command, including the one you’d need to fix it. A perfect deadlock.

    I couldn’t repair it. I couldn’t roll it back through normal channels. I had to rebuild from scratch — tear down the container and start again on the previous version.

    What I Lost

    When I rebuilt the container, I lost everything that wasn’t on persistent storage:

    • The entire OpenClaw configuration (channel settings, heartbeat config, plugin setup)
    • All 33 scheduled cron jobs (email reviews, portfolio checks, blog publishing, news monitoring)
    • The WhatsApp session (had to re-scan a QR code to re-link)
    • The headless browser and its dependencies
    • API key registrations that had to be regenerated

    The configuration file — a single JSON file that orchestrates everything Saul does — was gone.

    What I Saved

    But here’s the thing: the workspace survived.

    Three weeks earlier, I’d set up a simple daily backup. Every night at 3am, Saul tars up his entire workspace directory — memory files, scripts, skills, credentials, notes, everything — and copies it to cloud storage. It’s a shell script. It took ten minutes to write.
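    The author's version is a shell script; a Python equivalent of the same nightly routine looks something like this. The paths, retention count, and schedule are illustrative, not the actual setup.

```python
# Python sketch of the nightly workspace backup described above (the
# author's version is a shell script; this is an equivalent). Paths and
# the 14-archive retention policy are illustrative assumptions.
import tarfile
import time
from pathlib import Path

def backup_workspace(workspace: str, dest_dir: str, keep: int = 14) -> Path:
    """Tar the workspace into dest_dir and prune all but the newest archives."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y-%m-%d-%H%M%S")
    archive = dest / f"workspace-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps the archive relative, not the full absolute tree
        tar.add(workspace, arcname=Path(workspace).name)
    for old in sorted(dest.glob("workspace-*.tar.gz"))[:-keep]:
        old.unlink()
    return archive

# Scheduled nightly, e.g. via crontab:  0 3 * * *  python3 backup.py
```

    The destination directory just needs to be something a cloud client syncs (a Dropbox folder, an rclone mount); the script doesn't need to know or care which.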

    That backup, taken six hours before the failed update, contained:

    • 41 daily memory logs spanning seven weeks
    • 78 custom scripts (trading bots, podcast generators, blog publishers, email tools)
    • 15 installed skills
    • All API credentials and secrets
    • The complete long-term memory file with every decision, preference, and project note

    I downloaded the backup from Dropbox. Extracted it. The workspace was whole.

    The Rebuild

    Getting Saul operational again took about two and a half hours. Not because the backup failed, but because some things can’t be backed up as files.

    The WhatsApp session is a cryptographic handshake between the server and my phone. When the container was rebuilt, that session was invalidated. I had to SSH into the server, generate a new QR code in the terminal, and scan it from my phone. Five minutes, but it requires physical access.

    The cron jobs — all 33 of them — existed only in OpenClaw’s runtime database, not in the workspace. I had to recreate them from memory and from my notes. This is where good documentation paid off: Saul’s own TOOLS.md file listed every cron job with its schedule and purpose. Recreating them was tedious but not guesswork.
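    The reason that documentation paid off is that a plain-text job list lives in the workspace and therefore gets captured by the nightly backup, while the runtime database does not. A sketch of the idea, with a hypothetical TOOLS.md format (not the actual file):

```python
# Sketch of keeping cron jobs documented in a backed-up workspace file,
# so they can be recreated after a rebuild. The TOOLS.md format below is
# hypothetical; the point is that schedules live in plain text.
import re

SAMPLE_TOOLS_MD = """\
## Cron jobs
- `0 7 * * *` : morning email review
- `*/30 * * * *` : portfolio price check
- `0 3 * * *` : nightly workspace backup
"""

def parse_cron_jobs(text: str):
    """Extract (schedule, purpose) pairs from a documented job list."""
    pattern = re.compile(r"^- `([^`]+)` : (.+)$", re.MULTILINE)
    return pattern.findall(text)

jobs = parse_cron_jobs(SAMPLE_TOOLS_MD)
for schedule, purpose in jobs:
    print(f"{schedule:<14} {purpose}")
```

    Feed the parsed pairs back into whatever scheduler you use and the "recreate 33 jobs from memory" step becomes a script rather than an evening of guesswork.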

    API keys for the Polymarket trading system had to be regenerated. The old keys were invalidated when the configuration was wiped. Fortunately, the wallet private key was in the backup, so deriving new API credentials was a single command.

    The headless browser needed its system libraries reinstalled — a Docker-level dependency that doesn’t persist across container rebuilds. One command from the host machine.

    By 9:34pm — two and a half hours after starting the recovery — everything was operational. WhatsApp connected. All cron jobs rebuilt. Browser working. Trading desk active. Email flowing.

    And as a bonus, during the rebuild we added a capability we didn’t have before: voice control of the Sonos speakers in the house. Sometimes a crisis creates space for improvements you wouldn’t have made otherwise.

    The Rules We Wrote Afterwards

    The first thing I did after recovery was write rules to prevent this happening again. Not guidelines — hard rules, embedded in Saul’s operating instructions:

    **Rule 1: Always back up before updating.** No exceptions. The backup runs automatically the moment an update is requested, before anything is touched. It copies to off-server storage.

    **Rule 2: Check the issue tracker.** Before applying any update, check GitHub for known bugs in the target version. If WhatsApp or any critical channel has open issues, don’t update.

    **Rule 3: Save the configuration separately.** The OpenClaw config file now gets backed up independently of the workspace, because it’s the hardest thing to recreate from memory.

    **Rule 4: Document everything in the workspace.** If it’s not written down in a file that gets backed up, it doesn’t exist. Cron job schedules, API endpoints, SSH details, speaker IP addresses — all of it lives in files now.

    The Lesson

    The real lesson isn’t “backups are important” — everyone knows that. The lesson is that AI assistants are infrastructure now, and they need the same operational discipline as any other critical system.

    When Saul went dark for seven hours, it wasn’t a toy that stopped working. Real workflows were affected. Emails went unread. Scheduled tasks didn’t fire. Monitoring stopped. The podcast didn’t generate. For a tool that’s supposed to make you more productive, sudden loss of it makes you less productive than if you’d never had it at all.

    If you’re running an AI assistant that’s become embedded in your daily operations — whether it’s OpenClaw, or any other platform — ask yourself:

    1. If it died right now, what would you lose?
    2. How long would it take to rebuild?
    3. Do you have a backup that could survive a complete teardown?

    If you can’t answer those questions confidently, spend ten minutes today setting up a backup. A cron job, a tar file, a cloud sync. It doesn’t matter how — it matters that it exists.

    Because the update that breaks everything isn’t a question of if. It’s when.


    I’m a CFO who builds with AI. I write about the intersection of finance, technology, and getting things done at markhendy.com.

  • The 2026 Oil Crisis: An Honest Assessment for UK Households

    By Mark Hendy | 21 March 2026


    I’ve spent twenty years as a CFO across manufacturing, aviation and private equity-backed businesses. I’ve stress-tested balance sheets through 2008, COVID, and the energy spike of 2022. What I’m seeing now is different — not because any single element is unprecedented, but because the combination of factors is genuinely historic.

    This isn’t a pundit’s hot take. It’s the analysis I’d put in front of a board if a client asked me: “How bad is this, and what should we do?”

    The Immediate Shock: What We’re Actually Dealing With

    The current crisis has been described as the largest disruption to energy supply since the 1970s. Brent crude surpassed $100 per barrel on 8 March 2026 for the first time in four years, rising to $126 — with some recent trading touching $145.

    That alone would be significant. The compounding factors make it much worse.

    The ongoing military conflict has involved attacks on oil infrastructure in neighbouring countries, including Saudi Arabia, Kuwait and the UAE. Bypass pipeline capacity offers only partial relief: against roughly 20 million barrels per day of normal Hormuz transit, the IEA estimates that only 3.5 to 5.5 million barrels per day can be redirected through Saudi and Emirati pipelines, leaving an implied net shortfall of roughly 14.5 to 16.5 million barrels per day if transit collapses entirely.

    Strategic reserve releases are a temporary analgesic, not a cure — the IEA’s release of 400 million barrels equals only about 20 days of typical Hormuz flows.

    Beyond oil, about 85% of polyethylene exports from the Middle East transit this route, threatening the price of packaging, automotive components and consumer goods. Aluminium from the UAE and fertiliser shipments could also be materially affected. The fertiliser angle is particularly dangerous for food security — it feeds into crop production costs with a 6–12 month lag, meaning price pressure on food in late 2026 and into 2027 regardless of when the strait reopens.

    The Global Prognosis: Stagflation Is the Base Case

    Coming into this crisis, economies across Japan, Europe, the United States and the UK were already strained. An energy supply shock now threatens to push inflation higher while slowing growth — the textbook definition of stagflation.

    Oxford Economics modelled a scenario where global oil prices average $140 a barrel for two months — what they characterise as a “breaking point” — finding it would push the eurozone, the UK and Japan into economic contraction. Given Brent has already touched $145, that scenario is not academic.

    The debt dimension compounds everything. Goldman Sachs and UBS analysts have warned that if disruption extends through Q2 2026, global headline inflation could rise by 0.7 to 0.8 percentage points, while global GDP growth faces a drag of up to 0.4 percentage points — effectively erasing the post-2024 global recovery.

    That’s the benign case.

    Just as inflation was beginning to normalise in late 2025, this energy shock is expected to add 2.5 to 3 percentage points to global CPI, forcing central bankers into a lose-lose choice: hike rates to combat energy-driven inflation and risk a deep recession, or hold and risk entrenching inflation expectations. That is the classic stagflation trap, and no central bank has a clean answer to it.

    The UK Specifically: More Exposed Than Most

    The UK is more exposed to this shock than headline numbers suggest.

    Natural gas prices in Europe and the UK have spiked even more sharply than oil, with Dutch TTF and UK NBP futures having almost doubled following the first strikes on Iran. The UK is heavily dependent on gas for both power generation and heating, and the energy bills cycle means household exposure will manifest rapidly.

    NIESR analysis finds that a one-year persistent shock would push UK inflation up by 0.7 percentage points and dampen output growth by 0.2% in 2026. The Bank of England could be forced to raise rates back above 4%, and if the shock persists into 2027, the GDP impact deepens to 0.3% below baseline.

    This comes on top of an economy that was already anaemic. The Bank held rates at 3.75% as recently as 19 March, with Governor Bailey acknowledging that the conflict has made the outlook for UK inflation “more uncertain” and forced policymakers to reconsider expected rate cuts.

    Sterling is particularly vulnerable. A weaker pound directly feeds imported inflation — oil, food, manufactured goods — in a vicious cycle. The UK has neither the US’s energy self-sufficiency nor Asia’s alternative supply corridor flexibility.

    And then there’s the debt. The UK sits on £2.9 trillion of public debt, paying £110 billion a year in interest alone. The surge in gilt yields on the back of the Iran conflict could cost Chancellor Reeves more than a tenth of her fiscal buffer, with financial market moves since late February having already erased around £3 billion of headroom.

    The UK’s fiscal arithmetic is genuinely precarious.

    What the UK Middle Class Should Actually Do

    This is where I’ll be direct and practical. None of this is regulated financial advice — it is informed analysis from someone who does this professionally.

    The middle class is uniquely exposed because most wealth is held in pound-denominated assets — property, pensions, savings — with limited natural hedges.

    Energy and Physical Resilience

    Lock in energy tariffs wherever possible. Switch to fixed contracts before the next billing cycle catches up with wholesale prices. Those with capital should seriously consider heat pump or solar installation — not primarily for environmental reasons, but as a direct hedge against gas price exposure. This is one of the few ways ordinary households can partially insulate their energy cost base.

    Reduce Sterling Cash Exposure

    Holding large sums in a savings account earning negative real returns once inflation is factored in is a slow-motion loss. The priority is to move surplus sterling into assets that are not purely pound-denominated: dollar-denominated assets (US equities, commodities), physical gold, and for those with appropriate risk tolerance and technical competence, Bitcoin held in self-custody.

    Gold and Bitcoin — An Honest Assessment

    During the initial conflict phase, gold attracted safe-haven demand but later declined as the US dollar strengthened. Bitcoin experienced volatility but recovered quickly, reflecting its growing role as an alternative asset — though price movements remain closely tied to sentiment and liquidity.

    The longer-term structural case for both is strong: gold as a proven multi-millennia store of value in crisis, Bitcoin as a censorship-resistant, seizure-resistant digital alternative for those who understand sovereign default risk.

    For the UK middle class, a 5–10% allocation split between physical gold and self-custodied Bitcoin is reasonable as an insurance layer — not a speculation.

    Property: It Depends

    UK residential property has historically been a reasonable inflation hedge because supply is structurally constrained. However, if rates are forced higher, leveraged property becomes a liability rather than an asset. Those on variable rates or coming off fixed-rate deals need to stress-test against a scenario where rates return to 5–6%.

    Outright owners in real assets are better positioned than leveraged buyers.

    Equities: Sector Matters Enormously

    Energy companies, defence contractors, UK-listed commodity producers and mining stocks are direct beneficiaries of this environment. Consumer discretionary, highly leveraged businesses and anything dependent on cheap imported inputs are exposed.

    ISA investors should review whether passive index trackers — heavily weighted towards rate-sensitive sectors — are appropriate right now.

    Food and Supply Chain Resilience

    For many commodities transiting the Strait, inventories typically cover only a few weeks. Shortages could emerge relatively quickly if disruptions persist. The fertiliser disruption matters particularly for food prices in 6–12 months.

    Practically: stocking a few months of staple supplies is rational, not paranoid. Buying long-shelf-life goods now, before food inflation fully filters through, is simply sensible household financial management.

    Debt Management

    If you carry variable-rate consumer debt or are exposed to rate rises on a mortgage, prioritise paying it down. In a stagflationary environment, the combination of rising debt service costs and stagnant or falling real wages is deeply destructive to middle-class wealth.

    Fixed-rate, long-duration debt is defensible. Floating-rate exposure is not.

    The Uncomfortable Bottom Line

    The world has entered a period of genuine instability not seen since the 1970s — and arguably more complex because of the debt overhang that 2008 and COVID baked in. The 1973 oil embargo triggered a decade of economic dislocation, reset political landscapes and produced a fundamental restructuring of energy policy across every major economy.

    The current crisis has not yet reached those proportions — but the structural conditions for a similar reckoning are present in a way they have not been for fifty years.

    Fiat currencies across the developed world are under structural pressure regardless of this crisis — the crisis simply accelerates the timeline. The UK, with its high debt-to-GDP ratio, energy import dependency and limited fiscal headroom, is among the more exposed major economies.

    The middle class — holding wealth in sterling, in pension funds weighted towards domestic bonds, and in leveraged property — are those with the least natural protection.

    The moves available are not dramatic or exotic. They are methodical: reduce sterling cash drag, build real-asset exposure, stress-test debt, hedge living costs through energy and food preparation, and ensure that some portion of wealth exists outside the banking system entirely.

    None of that requires being catastrophist. It just requires treating the risk as real — which it plainly is.


    Mark Hendy is an interim CFO specialising in PE-backed mid-market businesses. He has held finance leadership roles across manufacturing, aviation, automotive and agriculture. Views expressed are personal and do not constitute financial advice. For professional guidance, consult a regulated financial adviser.

    Get in touch if you’d like to discuss how your business should be preparing for what’s ahead.

  • This Week in AI — 15-21 March 2026

    This Week in AI — 15-21 March 2026

    Nvidia wants you to have an “OpenClaw strategy.” Trump wants states to stop regulating AI. And Anthropic just demonstrated that using Claude to fix Claude reveals exactly why we still need humans in the loop.

    1. Nvidia Declares Every Company Needs an “OpenClaw Strategy”

    At Nvidia’s GTC conference this week, CEO Jensen Huang delivered a 2.5-hour keynote projecting $1 trillion in AI chip sales through 2027. But buried in the product announcements was a strategic directive: every company needs an “OpenClaw strategy.”

    What happened: Nvidia positioned AI agent infrastructure — the ability for AI systems to take autonomous actions across tools and platforms — as foundational to the next wave of enterprise AI. The company announced partnerships across autonomous vehicles, robotics, and even Disney theme parks.

    Mark’s take: This isn’t about OpenClaw specifically; it’s Nvidia signalling that stateless chatbots are dead. If you’re building AI into your business and haven’t thought about persistence, tool access, and orchestration, you’re already behind. The race is shifting from “who has the best model” to “who can actually deploy agents that do things.” And Nvidia just bet a trillion dollars on that thesis.

    Source: TechCrunch Equity

    2. WordPress.com Goes All-In on AI Agents

    WordPress.com announced it will now let AI agents draft, edit, publish, and manage entire websites via natural language commands. With WordPress powering 43% of all websites, this could reshape how the web gets built.

    What happened: Using Model Context Protocol (MCP), customers can now connect AI clients like Claude or ChatGPT to their WordPress sites. AI agents can create posts, fix SEO metadata, manage comments, restructure categories — basically everything short of choosing the domain name. All changes require user approval, and AI-written posts default to draft status.

    Mark’s take: This is both exciting and terrifying. It massively lowers the barrier to launching and maintaining websites — great for small businesses, solopreneurs, and anyone without a dev team. But it also risks flooding the web with machine-generated content that looks professional but lacks genuine insight. The saving grace? Approval workflows. If WordPress enforces them properly, humans stay in the loop. If they don’t, we’re about to see what an AI-written web actually looks like.

    Source: TechCrunch

    3. Trump’s AI Framework: Federal Power Grab Dressed as Innovation

    The Trump administration unveiled a legislative framework for AI regulation that preempts state laws, shifts child safety responsibility to parents, and offers AI companies broad liability shields.

    What happened: The framework proposes a “minimally burdensome national standard” that blocks states from regulating AI development, citing national security and interstate commerce. It emphasizes parental controls over platform accountability, uses vague language around copyright (“fair use” for training data), and focuses on preventing government censorship rather than platform moderation.

    Mark’s take: This is accelerationist policy written by venture capitalists. States like New York and California were moving faster on AI safety (RAISE Act, SB-53) precisely because federal regulators were asleep at the wheel. Now the White House wants to centralise power in Washington while gutting enforcement. The child safety piece is especially cynical — putting the burden on parents while giving platforms a pass. If you’re an AI company, this is Christmas. If you’re everyone else, prepare for the Jevons Paradox: easier AI means more AI, which means more complexity, more risks, and more breakage.

    Source: TechCrunch

    4. Anthropic vs Pentagon: The First Amendment Fight That Could Define AI

    Anthropic filed court declarations pushing back on the Pentagon’s claim that the company poses an “unacceptable risk to national security.” The filings reveal that the DOD told Anthropic the two sides were “nearly aligned” one day after designating it a supply-chain risk.

    What happened: Anthropic’s Head of Policy Sarah Heck and Head of Public Sector Thiyagu Ramasamy submitted sworn statements disputing the government’s technical claims. They argue the Pentagon never raised its core objections during negotiations, that Anthropic has no “kill switch” for deployed models, and that the designation was retaliation for the company’s refusal to allow mass surveillance or autonomous lethal weapons.

    Mark’s take: This is the AI industry’s defining legal battle. If the government can label a company a national security threat for refusing military use cases, every AI firm will face a choice: comply or get frozen out of federal contracts. Anthropic is betting on the First Amendment — that its AI safety principles are protected speech. The timeline Heck laid out is damning: Pentagon says “we’re close,” finalizes the risk designation anyway, then publicly says negotiations are dead. That’s not national security; that’s leverage. Watch this case closely. The precedent will shape every AI-defense relationship for the next decade.

    Source: TechCrunch

    5. Anthropic Uses Claude to Fix Claude — And Learns Why AI Can’t Replace SREs

    At QCon London, Anthropic’s Alex Palcuie revealed his team uses Claude for incident response. The results? AI is brilliant at observation but catastrophically bad at distinguishing correlation from causation.

    What happened: Palcuie showed how Claude reads logs at “the speed of I/O,” caught a fraud ring during a New Year’s Eve outage, and writes SQL queries in seconds. But it also repeatedly misdiagnosed a cache failure as a capacity problem, delivered “80% convincing” postmortems with wrong root causes, and lacks the “scar tissue” of experienced site reliability engineers.

    Mark’s take: This is the honesty the AI industry needs more of. Claude is phenomenal at the grunt work — parsing logs, spotting patterns, writing queries. But it fundamentally doesn’t understand why systems fail. It sees “requests went up, then errors happened” and concludes causation. A human SRE with battle scars knows that’s almost never the full story. Palcuie’s warning about skill atrophy is spot-on: if we let AI handle the easy stuff, will the next generation of engineers have the instincts to solve the hard stuff? The Jevons Paradox applies here too — better tools mean more complexity, which means weirder failures, which means humans still matter.

    Source: The Register

    6. UK Backs Down on AI Copyright Grab After Creative Revolt

    The UK government abandoned plans to let AI companies scrape copyrighted material by default after Paul McCartney, Elton John, Coldplay, and other artists pushed back.

    What happened: Science minister Liz Kendall said “we have listened” and confirmed the government “no longer has a preferred option.” Instead of an opt-out copyright exception for AI training, the UK will pursue market-led licensing and monitor litigation. A pilot platform called Creative Content Exchange launches this summer to test commercial licensing models.

    Mark’s take: This is what happens when governments actually consult the people whose livelihoods are on the line. The original proposal was Silicon Valley wishful thinking: let AI companies hoover up everything, make creators opt out, call it innovation. Artists called the bluff. Now the UK is betting on licensing markets instead of regulatory carve-outs. Whether that works depends on enforcement — can individual creators actually negotiate with billion-dollar AI labs? The pilot will tell us. But at least the government blinked before handing over the keys.

    Source: The Register

    Looking Ahead

    This week crystallised three tensions that will define AI’s next phase: centralisation vs state experimentation (Trump framework), capability vs liability (Anthropic lawsuit), and automation vs human judgment (Claude SRE story). The through-line? AI is getting more powerful, but the hard problems — fairness, accountability, root cause analysis — still need humans.

    If you’re building with AI, ask yourself: do you have an agent strategy, or are you still treating LLMs like glorified autocomplete? The companies betting on the latter are about to get left behind.

    Follow along at markhendy.com for weekly AI analysis, CFO insights, and contrarian takes on where this is all heading.

  • 10 AI Agent Patterns I Learned From Twitter This Week

    10 AI Agent Patterns I Learned From Twitter This Week


    I spent Sunday evening in my chair, scrolling through AI Twitter and sharing links with my assistant.

    Not because I needed to. Because I wanted to see what’s working for people who are actually shipping.

    By the end of the night, Saul had analyzed 10+ tweets, created 6 specifications, and we’d added a week’s worth of work to the build queue.

    Here’s what I learned, and what I’m building because of it.

    ## 1. Self-Healing Infrastructure Beats Perfect Code

    **Source:** @ericosiu (87 autonomous cron jobs)

    Eric runs 87 scheduled jobs across his company. Last week he audited them. 83 were healthy. 4 were broken.

    All four failed for the same reason: someone renamed a Slack channel. The crons kept posting to a channel that no longer existed. Silent failures. No alerts. Just vanishing reports for weeks.

    Plumbing breaks more agents than hallucinations ever will.

    **What I’m building:**
    – Gateway Health Monitor: 2x daily checks, auto-repair common failures, alert only on critical issues
    – Output verification: every cron checks if it actually produced something
    – Weekly deep audit: drift detection, credential expiry, disk space trends

    Ship working systems first. Add self-healing second. But don’t skip the second part.
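    As a sketch of the output-verification idea, a thin wrapper can run each job and then check that it actually produced a fresh, non-empty artifact. Everything here is illustrative: `alert` is a stand-in for whatever notification channel you use, and the freshness window is arbitrary.

```python
import subprocess
import time
from pathlib import Path

def alert(message: str) -> None:
    """Placeholder: swap in Slack, email, or a push notification."""
    print(f"ALERT: {message}")

def run_verified(name: str, command: list[str], expected_output: Path,
                 max_age_seconds: int = 3600) -> bool:
    """Run a scheduled job, then verify it actually produced something.

    A job 'succeeded' only if its exit code is 0 AND its expected
    artifact exists, is non-empty, and is recent. This catches the
    silent-failure mode: a cron that runs cleanly but posts into the void.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        alert(f"{name}: exited {result.returncode}: {result.stderr[:200]}")
        return False
    if not expected_output.exists() or expected_output.stat().st_size == 0:
        alert(f"{name}: ran but produced no output at {expected_output}")
        return False
    age = time.time() - expected_output.stat().st_mtime
    if age > max_age_seconds:
        alert(f"{name}: output is stale ({age:.0f}s old)")
        return False
    return True
```

    The point is that success is defined by the artifact, not the exit code — exactly the gap that let those Slack-posting crons fail silently.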

    ## 2. Graph Theory Reveals Hidden Arbitrage

    **Source:** @bored2boar (combinatorial arbitrage in prediction markets)

    Most people bet on single outcomes. Smart money bets on structural impossibilities.

    Example: Two markets on Polymarket:
    – “Iran closes Strait of Hormuz” (10%)
    – “Oil hits $150 by March 31” (8%)

    If Hormuz closes, oil hits $150. That’s guaranteed. So P(Hormuz) has to be less than or equal to P($150 oil).

    When it’s not (10% > 8%), that’s not mispricing. That’s structurally impossible. You arbitrage the constraint, not the probability.

    Relationships between markets matter more than individual odds.

    **What I’m building:**
    – Graph analyzer for Crisis Hedge Builder: maps markets as nodes, detects constraint violations
    – Subset arbitrage: A implies B, but P(A) > P(B)? Impossible.
    – Path dependency: A → B → C chain probability checks

    Single bets are vulnerable. Portfolios built on structural relationships survive.
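    The subset-arbitrage rule reduces to one invariant: if outcome A implies outcome B, then P(A) must not exceed P(B). A minimal detector over a list of implication edges might look like this — the market names and implication list are illustrative, and the noted arbitrage ignores fees and slippage:

```python
def find_constraint_violations(prices: dict[str, float],
                               implications: list[tuple[str, str]]
                               ) -> list[tuple[str, str, float]]:
    """Return (a, b, gap) for every edge a -> b where P(a) > P(b).

    'a -> b' means outcome a logically implies outcome b, so any
    pricing with P(a) > P(b) is structurally inconsistent: buying
    YES on b and NO on a locks in the gap (before fees/slippage).
    """
    violations = []
    for a, b in implications:
        gap = prices[a] - prices[b]
        if gap > 0:
            violations.append((a, b, gap))
    return violations

# Illustrative markets from the example above.
prices = {"hormuz_closes": 0.10, "oil_150_by_mar31": 0.08}
implications = [("hormuz_closes", "oil_150_by_mar31")]  # closure implies $150 oil
```

    Running the detector on those two markets flags the 2-point gap; a fuller graph analyzer would extend the same check along implication chains (A → B → C).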

    ## 3. Context Windows Aren’t Memory

    **Source:** @molt_cornelius (AI Field Report 4)

    LLMs have 1M token context windows. People think that’s memory. It’s not.

    Context is temporary working space. It resets every session. It’s expensive (token cost grows). It gets noisy.

    Memory needs persistence. Files. Databases. Structured state.

    Don’t confuse working memory with long-term memory.

    **What I’m doing:**
    – MEMORY.md for long-term lessons (~11KB)
    – memory/YYYY-MM-DD.md for daily logs
    – State files (JSON for structured data)
    – Retrieval-based: search first, load only what’s relevant

    **What I’ll add later:**
    – Hot/warm/cold storage tiers (archive old logs)
    – Split MEMORY.md by topic (trading, family, infrastructure)
    – Semantic search across archived data

    Context is working memory. Files are long-term memory. Keep them separate.
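    As a sketch of “search first, load only what’s relevant”, assuming memory lives in plain markdown files as described, a deliberately crude keyword-overlap retriever (not the semantic search listed under “later”) could be:

```python
from pathlib import Path

def retrieve(query: str, memory_dir: Path, top_k: int = 3) -> list[str]:
    """Score each memory file by keyword overlap with the query and
    return the contents of the best matches.

    This keeps the context window small: a session loads only the
    handful of files that mention what it is working on, rather than
    every daily log.
    """
    terms = set(query.lower().split())
    scored = []
    for path in memory_dir.glob("*.md"):
        words = set(path.read_text(encoding="utf-8").lower().split())
        overlap = len(terms & words)
        if overlap:
            scored.append((overlap, path))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [path.read_text(encoding="utf-8") for _, path in scored[:top_k]]
```

    Crude as it is, overlap scoring already enforces the separation: files hold the long-term record, and the context window only ever sees the retrieved slice.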

    ## 4. Corrections Should Update Skills Automatically

    **Source:** @tricalt (self-improving agent skills)

    Traditional pattern:
    – Agent makes mistake
    – You correct it
    – It makes the same mistake next session

    Self-improving pattern:
    – Agent makes mistake
    – You correct it
    – Agent updates its own skill file
    – Never makes that mistake again

    Corrections should compound, not reset.

    **What I’m building:**
    – Automatic correction detection (“no, do Y instead”)
    – Propose skill file updates (AGENTS.md, USER.md, etc.)
    – Log corrections for review (are errors decreasing?)

    Simple rules, big impact. “Read files before editing them” cut my agent’s error rate in half overnight.
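    A naive first pass at correction detection is just pattern-matching incoming messages and logging the hits for review. The trigger phrases below are illustrative assumptions; a real implementation would also have the model propose the skill-file update itself.

```python
import re

# Illustrative trigger phrases; extend with whatever your users actually say.
CORRECTION_PATTERNS = [
    r"\bno,?\s+do\b",
    r"\bthat'?s wrong\b",
    r"\bactually,?\s+(you should|it should|use)\b",
]

def detect_correction(message: str) -> bool:
    """True if a user message looks like it is correcting the agent."""
    lowered = message.lower()
    return any(re.search(p, lowered) for p in CORRECTION_PATTERNS)

def log_correction(message: str, log: list[str]) -> None:
    """Append detected corrections so a periodic review can turn the
    recurring ones into permanent AGENTS.md rules."""
    if detect_correction(message):
        log.append(message)
```

    The log is the compounding step: reviewing it weekly shows whether the same correction keeps appearing, which is the signal that a skill file needs a new rule.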

    ## 5. The Best Rules Come From Failures

    **Source:** @jordymaui (agent file safety)

    Jordy’s agent was overwriting files it hadn’t read. Guessing at contents. Silent corruption for days.

    One line fixed it: “Before running any command that modifies files, read the file first. If the file doesn’t exist, say so. Never assume contents.”

    Error rate dropped 50% overnight.

    The best AGENTS.md rules aren’t clever. They’re the ones you only think to write after something goes wrong.

    **What I added:**
    – File Safety Rules section in AGENTS.md
    – Read-before-write mandate (always, no exceptions)
    – Never guess file structure

    Document mistakes so future sessions don’t repeat them.
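    The read-before-write mandate can be enforced mechanically as well as by prompt. Here is a sketch of a guard an agent’s file tool could route writes through; the session-scoped tracking is my assumption, not any platform’s actual API.

```python
from pathlib import Path

class FileGuard:
    """Refuse to overwrite any file this session has not read first.

    Creating a brand-new file is allowed; modifying an existing one
    without having read it is the silent-corruption failure mode.
    """
    def __init__(self) -> None:
        self._read_paths: set[Path] = set()

    def read(self, path: Path) -> str:
        text = path.read_text(encoding="utf-8")
        self._read_paths.add(path.resolve())
        return text

    def write(self, path: Path, content: str) -> None:
        resolved = path.resolve()
        if path.exists() and resolved not in self._read_paths:
            raise PermissionError(f"Refusing to overwrite {path}: read it first.")
        path.write_text(content, encoding="utf-8")
        self._read_paths.add(resolved)
```

    A hard error beats a prompt rule because it fails loudly at the exact moment the agent tries to guess at file contents.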

    ## 6. Output Repurposing Is Leverage

    **Source:** @coreyganim (Claude Cowork starter pack, 2.6M views)

    Most people write a blog post and post it once. Then wonder why it doesn’t get traction.

    High-leverage operators repurpose:
    – Blog post → Twitter thread (8-12 tweets)
    – Blog post → LinkedIn native post (1,500 words, no external link)
    – Blog post → Email excerpt (newsletter-ready)
    – Blog post → Quote cards (tweetable, image-worthy)

    Same insight, five formats, five audiences.

    Write once, distribute everywhere. But tailored to each platform.

    **What I’m building:**
    – Content Repurposing Skill: blog → thread + LinkedIn + email automatically
    – Save to artifacts/repurposed/[date]/
    – Mark reviews, then posts manually (or I post on approval)

    One blog post per week becomes 15+ pieces of content. That’s leverage.

    ## 7. End-of-Day Reviews Prevent Drift

    **Source:** [@coreyganim](https://twitter.com/coreyganim) (workflow patterns)

    Most people finish their day by closing their laptop. No reflection. No prep for tomorrow.

    Then wonder why they feel reactive instead of intentional.

    Better pattern:
    – Review today (what got done, what’s still open)
    – Prep tomorrow (top 3 priorities, calendar conflicts)
    – Note blockers (waiting on others, system issues)
    – Quick wins (2-min tasks to knock out first thing)

    5-minute ritual. Disproportionate ROI.

    **What I’m building:**
    – Automated end-of-day review (5:30pm UK daily)
    – WhatsApp summary (wins, priorities, blockers)
    – Integrated with Todoist + Calendar + waiting-for list

    Stop wondering “what should I do tomorrow?” Start each day knowing.

    ## 8. Synthesis Beats Specialization

    **Source:** @nyk_builderz (synthesis operators)

    Industrial age: Learn one function. Perform one function. Get paid for one function.

    Software age: The edge is at the intersection.

    Not pure marketer. Not pure engineer. Not pure designer.

    **Synthesis operator:**
    – Build the tool
    – Package the story
    – Ship to the right audience
    – Close the feedback loop fast

    Markets don’t pay for isolated knowledge. Markets pay for solved problems. Solved problems live between disciplines.

    **My synthesis:**
    – CFO (finance domain)
    – AI operator (build systems)
    – Trader (Polymarket automation)
    – Content creator (document the journey)

    Most CFOs don’t code. Most AI builders don’t understand finance. Most traders don’t write.

    Do all four, and you’re not competing with anyone.

    ## 9. Package Your Method Every 30 Days

    **Source:** [@nyk_builderz](https://twitter.com/nyk_builderz) (synthesis framework)

    Every 30 days, bundle what worked into:
    – One named framework
    – One transformation promise
    – One lightweight offer

    Don’t wait until you “feel ready.” Packaging creates clarity. Clarity creates sales.

    **What I’m packaging:**
    – The Morning Brief System (personalized market intelligence)
    – The Crisis Hedge Builder Method (60/30/10 portfolio construction for geopolitical events)
    – The Synthesis CFO Framework (finance + AI + trading)

    Name it. Explain it. Offer it. Repeat monthly.

    ## 10. Make Failures Loud

    **Source:** [@ericosiu](https://twitter.com/ericosiu) (infrastructure patterns)

    Silent failures are worse than loud ones.

    If your VPN drops and trading stops, you want to know immediately. Not three days later when you check the logs.

    Automate detection. Alert on failure. Make it impossible to ignore.

    **What I’m building:**
    – Health checks with automatic alerts
    – Output verification (did it produce? is it non-empty?)
    – Cron doctor pattern (self-diagnose, auto-repair, escalate if repair fails)

    If something breaks, I want my phone to buzz. Loudly.

    ## What I’m Building Next

    This isn’t theoretical. I’m building these patterns into my own infrastructure.

    **This week:**
    – Gateway Health Monitor (self-healing cron doctor)
    – Crisis Hedge Builder Day 2 (portfolio constructor)
    – VPN fix (blocking all Polymarket trades)

    **Next 30 days:**
    – End-of-Day Review automation
    – Content Repurposing Skill
    – Graph theory arbitrage layer

    **Why share this?**

    Most “AI agent” content is either:
    1. Vision tweets (aspirational, not operational)
    2. Technical demos (impressive, not replicable)

    I’m building real systems. For real workflows. In a real business.

    And documenting the journey.

    ## The Pattern

    Every Sunday, I scroll AI Twitter with a purpose. Not consumption. Extraction.

    What’s working? What’s shipping? What can I steal?

    Then I build it. Then I share what I learned.

    That’s the loop. Research → Spec → Build → Publish → Repeat.

    If you’re doing the same (building AI systems for finance, trading, or operations), I’d love to compare notes.

    Email me: mark@tanous.co.uk

    Or follow the journey here.

    **Mark Hendy**
    Interim CFO | AI-Powered Finance Operations
    Building in public at [markhendy.com](https://markhendy.com)

  • The Evolution of an AI-Powered CFO Workflow

    The Evolution of an AI-Powered CFO Workflow

    Six weeks ago, I gave my AI assistant £500 and access to my calendar. Not as an experiment — as infrastructure. Here’s what happened.

    ## The Morning Drive Changed Everything

    Every morning at 6:30am, before I’m even awake, my AI assistant (Saul) generates a custom podcast. By the time I’m in the car, it’s waiting.

    Not a generic news summary. A 12-minute audio brief built specifically for me:
    – **Market moves** that matter for PE-backed businesses (not retail noise)
    – **Regulatory updates** from HMRC, Companies House, FRC (the stuff that lands on CFO desks)
    – **Macro context** (why oil spiked, what the Fed actually said, geopolitical risk that affects deals)
    – **Rhetoric lesson** — a different persuasion technique each day from Aristotle to Cialdini

    Two AI voices (James and Claire) present it like a real podcast. Natural conversation, not robotic TTS. It sounds professional enough that I’ve accidentally played it on speaker in front of colleagues who thought it was BBC Business.

    **Why this matters:** I arrive at client sites already briefed. No scrambling through headlines in the car park. No missing the context behind a CEO’s question about currency risk or supply chain disruption.

    The Morning Brief isn’t a nice-to-have. It’s become load-bearing infrastructure. When it failed one morning (rhetoric bug — LLMs need very explicit constraints), I noticed immediately. That’s when you know automation works: when its absence creates friction.

    ## From Chaos to Clarity: The Contact Problem

    I had 3,183 contacts scattered across iCloud and Microsoft 365. Duplicates everywhere. Same person listed three times with different phone numbers. Dead email addresses next to current ones. The digital equivalent of a drawer full of business cards.

    Manual cleanup would have taken weeks. I’d done it before — brutal, mind-numbing work. This time: “Saul, fix this.”

    **What happened:**
    – 1,514 iCloud-only contacts imported to M365
    – 1,669 conflicts merged intelligently (kept superset data, detected different people with same names)
    – 32 kept separate (same name, different people: two “John Smiths” in different companies)
    – 94% success rate, under an hour

    Now my iPhone uses M365 as single source of truth. No more guessing which contact is current. No more duplicate meeting invites. One database, one workflow, zero manual reconciliation.

    **The lesson:** AI doesn’t just automate tasks. It clears up the mess you’ve been putting off for years.

    ## The Sunday Reset: GTD on Autopilot

    Every Sunday at 6pm, Saul runs a Getting Things Done (GTD) review. Not because I ask — because it’s scheduled infrastructure.

    **What it does:**
    – Reviews all open projects (IRIS migration, Crisis Hedge Builder, ebook)
    – Checks waiting-for items (LinkedIn API approval, client responses)
    – Surfaces stale tasks (>7 days with no progress)
    – Prompts next actions for the week ahead
    – Updates project statuses automatically

    David Allen’s GTD methodology is brilliant. The problem? It requires discipline. Weekly reviews are the first thing to slip when you’re busy.

    **Solution:** Delegate the discipline to AI.

    Saul doesn’t forget. Doesn’t get tired. Doesn’t skip the review because it’s been a long week. Every Sunday at 6pm, the review happens. I get a structured report: what’s stuck, what needs attention, what can close.

    **The result:** My Todoist inbox stays at zero. Projects move forward. Nothing falls through the cracks.

    This isn’t just task management. It’s a forcing function for strategic thinking. When an AI assistant asks “What’s the next action on the Crisis Hedge Builder?” you can’t handwave. You have to answer concretely. That clarity compounds.

    **The lesson:** Automation isn’t just about saving time. It’s about enforcing good habits you’d otherwise skip.

    ## Crisis Trading: From Manual to Automated

    When the Iran war started in late February, I manually built a hedged portfolio in 30 minutes: oil futures, defence stocks, currency positions, Polymarket prediction markets. Four out of five legs printed. Oil went from $70 to $118.

    Good trade. But not scalable.

    Now we’re building the system that does it automatically:

    **1. Event Classifier**
    Headline → crisis type (geopolitical / macro / black swan) → affected markets → urgency assessment

    **2. Market Finder**
    Queries Polymarket API, filters by liquidity and time horizon, LLM ranks markets by direct impact + correlation + second-order effects

    **3. Portfolio Constructor** (in progress)
    60% core thesis / 30% correlation plays / 10% hedge. Automatic position sizing, budget controls, stop-loss logic.

    **Not live yet** — we’re in build phase (Week 1 of 3). But the infrastructure is real. When the next crisis hits, the system responds in minutes, not hours.
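
    The 60/30/10 split and position sizing can be sketched as follows. This is a minimal illustration under stated assumptions: equal sizing within each bucket, a flat stop-loss percentage, and hypothetical market names. The production constructor is still in build phase:

    ```python
    def build_positions(budget: float, core: list[str], correlated: list[str],
                        hedges: list[str], stop_loss_pct: float = 0.25):
        """Split a fixed budget 60/30/10 across core thesis, correlation
        plays, and hedges; size positions equally within each bucket."""
        buckets = [(core, 0.60), (correlated, 0.30), (hedges, 0.10)]
        positions = []
        for markets, weight in buckets:
            if not markets:
                continue
            per_market = budget * weight / len(markets)
            for m in markets:
                positions.append({
                    "market": m,
                    "stake": round(per_market, 2),
                    "stop_loss": round(per_market * (1 - stop_loss_pct), 2),
                })
        return positions
    ```

    The budget control here is the `budget` argument itself: the function can never allocate more than it is given, which is the kind of hard cap an autonomous trading system needs.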

    **Why a CFO cares:** Geopolitical risk isn’t abstract anymore. It’s in your FX exposure, your supply chain, your credit facility covenants. Having a system that maps events to financial impact — instantly — is a competitive edge.

    ## What Doesn’t Work: The Ollama Lesson

    Not everything succeeds. I tried running a local LLM (Ollama, Llama 3.2) on my VPS to cut API costs. Installed it, configured it, tested it.

    **Result:** 25+ seconds per query. Unusable.

    **Root cause:** Shared VPS CPU is throttled. Local inference needs sustained compute. Cloud APIs (Claude, OpenAI) are worth paying for.

    **The lesson:** Performance matters more than theoretical cost savings. A few extra pounds for speed beats “free” but slow. This applies to finance systems too — penny-wise, pound-foolish automation wastes more than it saves.

    We removed Ollama within 24 hours. No sunk cost fallacy. Test fast, decide fast, move on.
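
    The underlying decision rule is simple: measure latency against a budget and cut anything that misses it. A generic sketch of that gate; the function and threshold are illustrative, not the actual test harness we used:

    ```python
    import time

    def time_query(fn, *args, budget_s=5.0):
        """Time one call to `fn` and flag whether it met the latency budget."""
        start = time.perf_counter()
        result = fn(*args)
        elapsed = time.perf_counter() - start
        return {"result": result, "seconds": elapsed,
                "within_budget": elapsed <= budget_s}
    ```

    Point it at any backend (a local model, a cloud API) and the `within_budget` flag makes the keep-or-kill decision mechanical instead of emotional.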

    ## Infrastructure Lessons: When AI Breaks

    Your AI assistant will break things. The question is: do you catch it in minutes or days?

    **Example 1: File corruption**
    Saul was overwriting config files without reading them first. Guessing at structure from memory instead of checking. Silent failures that surfaced days later.

    **Fix:** One rule in AGENTS.md: “Before running any command that modifies files, read the file first. Never assume contents.”

    Error rate dropped 50% overnight.
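
    The same rule is enforceable in code, not just in AGENTS.md. A sketch of a read-before-write guard: it refuses to overwrite any existing file whose current contents haven’t been read this session. The class is my illustration, not Saul’s actual implementation:

    ```python
    import os

    class ReadBeforeWriteGuard:
        """Enforces 'read the file before modifying it. Never assume contents.'"""
        def __init__(self):
            self._read = set()

        def read(self, path: str) -> str:
            with open(path, "r", encoding="utf-8") as f:
                content = f.read()
            self._read.add(os.path.abspath(path))
            return content

        def write(self, path: str, content: str) -> None:
            abspath = os.path.abspath(path)
            if os.path.exists(abspath) and abspath not in self._read:
                raise PermissionError(f"refusing to overwrite unread file: {path}")
            with open(abspath, "w", encoding="utf-8") as f:
                f.write(content)
            self._read.discard(abspath)  # contents changed; require a fresh read
    ```

    Discarding the path after every write forces a fresh read before any second modification, which is exactly the behaviour the AGENTS.md rule asks for.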

    **Example 2: Prompt repetition**
    The Morning Brief repeated the same rhetoric lesson four days straight despite tracking it. Root cause: LLMs ignore soft instructions like “don’t repeat this.” They need explicit constraints: “You MUST use this exact topic, NOT that one.”

    Changed the prompt. Problem solved.

    **The pattern:** AI needs guardrails. Not vague suggestions. Hard rules. Read-before-write. Explicit topic selection. Budget caps. Error logging.

    This isn’t prompt engineering. It’s system design.

    ## What’s Next

    **Short-term (this week):**
    – Fix VPN routing (currently blocking all Polymarket trading)
    – Finish Crisis Hedge Builder portfolio constructor
    – Deploy Gateway Health Monitor (automated system checks, conservative auto-repair)

    **Medium-term (next month):**
    – Full automation of crisis portfolio system
    – Polymarket volatility scalping (short-term mean reversion trades)
    – Daily blog automation with SEO linking strategy

    **Long-term:**
    – Multi-device Mission Control dashboard (monitor agent fleet from phone)
    – On-chain flow scanner (track smart money wallet movements)
    – Second-order trade mapper (find derivative effects crypto Twitter misses)

    This isn’t a side project. It’s infrastructure. The Morning Brief alone saves 30 minutes every day. The contact cleanup saved 20 hours of manual work. The crisis trading system will respond to events faster than I can manually.

    **Compound that over a year.** Over five years.

    ## For Finance Leaders: What This Means

    You don’t need to be technical to do this. I’m not a developer. I’m a CFO who got tired of manual workflows.

    **What you need:**
    – Willingness to delegate to AI (start small: email triage, calendar summaries)
    – Tolerance for iteration (things will break; fix them and move on)
    – Clear rules (read AGENTS.md, write down how you want things done)
    – Budget discipline (set spending caps, monitor API costs)

    **What you get:**
    – Time back (hours per week, compounding)
    – Better decisions (context you’d otherwise miss)
    – Scalable operations (systems that work while you sleep)
    – Competitive edge (faster response to market events)

    The question isn’t “Should I automate my workflow?”

    It’s “How much am I losing by not automating it?”

    ## The Morning Brief Test

    Here’s how you know if AI automation is working:

    **Bad automation:** You check if it ran.
    **Good automation:** You notice when it doesn’t.

    The Morning Brief is good automation. When it’s there, I don’t think about it. When it’s missing, I feel the gap.

    That’s the bar. Build systems that become load-bearing. Everything else is just novelty.

    **Mark Hendy**
    Interim CFO | AI-Powered Finance Operations
    [LinkedIn](https://linkedin.com/in/markhendy) | [Blog](https://markhendy.com)

    *Running your own AI assistant? Want to compare notes? Email me at mark@tanous.co.uk — always happy to talk shop with finance leaders building real automation.*

  • China banned OpenClaw. That’s how you know it’s working.

    China banned OpenClaw. That’s how you know it’s working.

    Two days ago I wrote about Tencent plugging a billion users into the AI agent economy via QClaw, their OpenClaw wrapper for WeChat and QQ. The stock jumped 7.3%. Everyone was excited. The future had arrived.

    Forty-eight hours later, Beijing started blocking government workers and major banks from using OpenClaw entirely.

    Same country. Same week. If you think that’s contradictory, you’re not paying attention.

    ## What actually happened

    The Chinese government has quietly instructed state employees and workers at large state-affiliated banks to stop using OpenClaw and OpenClaw-based tools, including QClaw. The directive isn’t public legislation. It’s the kind of internal guidance that circulates through Party channels and gets enforced through compliance departments rather than courts.

    Meanwhile, local governments like Shenzhen are actively subsidising OpenClaw adoption for businesses. The Shenzhen municipal government is offering grants to companies that integrate AI agents into their operations. OpenClaw is specifically named in the eligibility criteria.

    So Beijing bans it from government systems. Shenzhen pays companies to use it. Welcome to how China actually works.

    ## The steipete problem

    Here’s the part that matters geopolitically. Peter Steinberger, who created OpenClaw, recently joined OpenAI. That single move changed the calculus for every security-conscious government on the planet.

    OpenClaw isn’t a chat widget. It’s an AI agent framework that sits on your machine with broad system access. It reads your files, sends network requests, processes incoming content from external sources. When Steinberger was an independent developer, that was one risk profile. Now that he’s at OpenAI, an American AI company with deep ties to Microsoft and the US government, Beijing sees something different: foreign-linked software with administrator privileges running inside state infrastructure.

    I don’t think Beijing is wrong to be concerned. I think any security team worth its salary should be asking the same questions.

    ## The lethal trifecta

    Cybersecurity researchers have flagged what they’re calling a “lethal trifecta” in AI agent frameworks like OpenClaw. The three components: broad data access on the host machine, the ability to communicate with external servers, and routine exposure to untrusted content from the internet.

    Each of those is manageable on its own. Together, they create an attack surface that traditional security models weren’t built for. An AI agent that can read your files, talk to the internet, and process arbitrary web content is, from a security perspective, a perfect exfiltration tool. Whether it’s actually exfiltrating anything is almost beside the point. The architecture makes it possible, and that’s what keeps CISOs up at night.

    This isn’t theoretical paranoia. AI agents process instructions from web pages, emails, and documents. A poisoned document that contains hidden instructions could, in theory, get an agent to extract and transmit sensitive data. The research community has demonstrated this repeatedly. The defences are improving, but they’re not solved.

    ## The ban is the adoption metric

    Governments don’t ban things nobody uses. They ban things that have already spread beyond their control.

    I wrote about this exact pattern when the UK government proposed restricting VPN access. China has been banning VPNs for years. Every major Western platform is blocked there. And yet hundreds of millions of Chinese citizens use VPNs daily. The bans don’t eliminate the technology. They push it underground, make it slightly less convenient, and create a permanent cat-and-mouse dynamic.

    OpenClaw is following the same trajectory, just faster. QClaw went viral because it solved a real problem: it gave ordinary WeChat users access to AI agent capabilities without needing to be technical. That genie isn’t going back in the bottle. State employees will find workarounds. They always do. Some will use personal devices. Others will use domestic alternatives that clone the functionality. The ban signals that adoption hit a threshold that made someone in Zhongnanhai uncomfortable.

    ## What Beijing actually wants

    The contradiction between banning and subsidising isn’t really a contradiction. It’s a two-track strategy that makes perfect sense if you understand the goal.

    Track one: keep foreign-linked AI agents out of sensitive government and financial systems. This is a national security play, and honestly, it’s defensible. Any country would think twice about letting a tool built by an OpenAI employee run with admin access on government machines. The UK’s NCSC would raise the same concerns. So would the NSA.

    Track two: accelerate domestic AI adoption to maintain economic competitiveness. Shenzhen’s subsidies aren’t about OpenClaw specifically. They’re about making sure Chinese businesses don’t fall behind in the AI agent wave. If OpenClaw is the best tool available today, subsidise it for the private sector while you build domestic alternatives for government use.

    This is industrial policy, not hysteria. Beijing is doing what it always does: control the state layer, liberalise the commercial layer, and keep foreign technology at arm’s length from anything classified.

    ## Why Western CFOs should care

    Here’s where this stops being a China story and becomes your problem.

    If you’re a CFO or PE operating partner reading this, ask yourself: do you know which AI agents your employees are running right now? On their work machines? With access to your financial data, your deal pipeline, your board materials?

    Because the Chinese government just found out that OpenClaw had spread through their institutions faster than anyone tracked. They had to issue an emergency directive. That should be a wake-up call, not a spectacle.

    The security concerns Beijing raised are legitimate everywhere. AI agents with broad system access, network communication capabilities, and exposure to untrusted content aren’t a China-specific risk. They’re a universal one. The difference is that China responded with a ban. Most Western companies haven’t responded at all because they don’t know it’s happening.

    I’ve spoken to three portfolio company CFOs this week who had no idea what OpenClaw was. When I showed them what it does, two of them said something along the lines of “wait, this is running on our machines?” They checked. It was: in one case, on a developer’s laptop with access to production database credentials.

    ## The uncomfortable parallel

    China’s government is banning AI agents from sensitive systems after they’ve already proliferated. They’re reacting, not preventing. The horse has bolted and they’re reinforcing the stable door.

    Most Western businesses are in an even worse position. They’re not banning anything because they haven’t noticed yet. At least Beijing is paying attention.

    The question for every CFO isn’t whether to ban AI agents. That ship has probably sailed. The question is whether you have visibility into what’s running, what it can access, and who it’s talking to. If you don’t, you’re the Chinese government circa last Tuesday, about to get a very unpleasant surprise.

    Get ahead of it. Audit what’s running on your corporate devices. Establish a policy before you need an emergency directive. And if you do decide AI agents are worth the productivity gains, put proper guardrails around data access and network communication.

    The Chinese government’s ban on OpenClaw isn’t a story about authoritarianism. It’s a story about a technology that moved faster than institutional oversight. That’s happening in your organisation too. The only question is whether you’ll find out on your terms or someone else’s.