GLM-5.1: The AI That Works While You Sleep — And Then Some

There’s a particular kind of AI announcement that makes me sit up. Not the ones that claim to beat GPT on some benchmark no-one’s heard of. Not the ones with slick demos that quietly ignore the bit where it falls over. The ones that matter are the ones where someone shows you the receipts — actual tasks, actual time, actual results.

GLM-5.1, released last week by Z.ai (formerly Zhipu AI), is one of those.

Eight Hours. Unattended.

Let me frame what I mean. Most AI coding tools work in short bursts. You ask them to write a function, review some code, draft a test. Good assistants. But you’re still the loop-closer — the one who notices it’s gone sideways, resets the context, redirects the prompt.

GLM-5.1 does something materially different. Z.ai ran it for eight hours straight building a Linux-style desktop environment from scratch — file browser, terminal, games. No handholding. It planned, executed, hit blockers, revised its approach, iterated. Hundreds of times. The claim isn’t “it wrote code”. The claim is “it didn’t give up”.

That’s a different category of capability.

The Numbers That Matter

I’m a CFO. I like numbers. Here are the ones worth paying attention to:

  • Vector database optimisation: 600+ iterations, 6,000+ tool calls, 21,500 queries per second — six times the previous best
  • GPU kernel tuning: 1,000+ turns, 3.6× speedup on ML workloads
  • SWE-Bench Pro: 58.4% — ahead of both GPT-5.4 (57.7%) and Claude Opus 4.6 (57.3%)

That last one is significant. This isn’t some niche Chinese model playing catch-up. On one of the hardest software engineering benchmarks going, it’s edging out the models most people consider the gold standard. And it’s open source — MIT licence, weights on Hugging Face, deployable on your own infrastructure.

Why This Matters to Finance and Business

I’ve written before about the shift from AI as assistant to AI as agent. GLM-5.1 is the clearest demonstration yet of what agentic AI actually looks like in practice.

Think about the workflows in a finance function that are genuinely tedious:

  • Building and debugging complex financial models
  • Writing and testing data pipeline logic
  • Iterating on management information templates
  • Automating reconciliation scripts

These aren’t tasks that fail at step one. They fail at step seven, when the edge case appears. They fail when the data format changes. They fail when the logic that worked last month doesn’t work this month. The human overhead isn’t writing the first version — it’s the iteration.
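To make that concrete, here’s a minimal sketch of the kind of reconciliation logic I mean — matching ledger entries to bank statement lines. The field names and the matching key are illustrative, not any particular system’s schema; the point is the edge case (duplicate references) that only bites on iteration seven:

```python
from collections import Counter

def reconcile(ledger, statement):
    """Match ledger entries to statement lines by (reference, amount).

    Returns (matched, unmatched_ledger, unmatched_statement).
    Illustrative only -- real reconciliations key on more fields.
    """
    # Count statement lines by key so duplicate references -- the
    # "step seven" edge case -- aren't silently double-matched.
    remaining = Counter(
        (s["ref"], round(s["amount"], 2)) for s in statement
    )
    matched, unmatched_ledger = [], []
    for entry in ledger:
        key = (entry["ref"], round(entry["amount"], 2))
        if remaining[key] > 0:
            remaining[key] -= 1
            matched.append(entry)
        else:
            unmatched_ledger.append(entry)
    # Whatever's left on the statement side had no ledger counterpart.
    unmatched_statement = [k for k, n in remaining.items() for _ in range(n)]
    return matched, unmatched_ledger, unmatched_statement
```

A naive version that matches on reference alone passes the happy path and breaks the month a supplier invoices the same reference twice — exactly the sort of revision loop an agent has to survive unattended.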

If a model can sustain goal-directed effort over hundreds of iterations without losing the thread, that’s not incrementally better. That’s a different class of tool.

The Open Source Angle

Z.ai releasing this under MIT licence is genuinely interesting. The dominant models — OpenAI, Anthropic, Google — are all closed. You pay for API access, you accept their terms, you live with their rate limits and pricing changes.

An open-source model that competes on performance changes the calculus for enterprise deployment. You can run it on-premise. You control the data. You don’t get a pricing change email in March telling you costs are going up 40% in April.

For PE-backed businesses with sensitive financial data and legitimate concerns about feeding that data into third-party APIs — this matters.

What I’d Watch

GLM-5.1 isn’t perfect. It trails on some pure reasoning benchmarks. The long-horizon capability, while impressive, assumes the task is well-defined enough for autonomous execution — genuinely ambiguous strategic questions still need a human in the loop. And “it ran for eight hours” cuts both ways: great if it’s right, expensive if it’s wrong.

But the trajectory is clear. Each successive model generation extends the horizon over which AI can operate without human intervention — GLM-5 pushed that horizon out, and GLM-5.1 pushes it further. The question isn’t whether agentic AI is coming to professional services and finance — it’s whether you’re planning for it.

If you want to know how I’m already using AI agents in day-to-day finance work, read this.

The Practical Bit

If you want to experiment: it’s on Hugging Face, runs via vLLM or SGLang, and integrates with standard agentic frameworks. For those less inclined to self-host, API access is through api.z.ai. The coding-focused subscription plan is $10/month — less than a decent lunch.
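For the self-hosting route, a vLLM deployment looks roughly like this. Treat the model identifier and the flag values as assumptions — check the actual model card on Hugging Face for the published name and recommended serving settings:

```shell
# Model id assumed for illustration -- verify against the Z.ai card on Hugging Face.
# Starts an OpenAI-compatible endpoint on localhost:8000.
vllm serve zai-org/GLM-5.1 \
    --tensor-parallel-size 8 \
    --max-model-len 131072

# Any OpenAI-style client can then talk to it:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "zai-org/GLM-5.1",
         "messages": [{"role": "user",
                       "content": "Draft a bank reconciliation script."}]}'
```

The OpenAI-compatible endpoint is the practical win: existing agentic frameworks and client libraries point at it with a one-line base-URL change, no vendor SDK required.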

I’ll be testing it against some of the financial automation tasks I currently route through Claude. I’ll report back.


Mark Hendy is an interim CFO working with PE-backed businesses. He writes about AI, finance, and the intersection of the two at markhendy.com. Follow on LinkedIn.
