Meet Your Users Where They Are

Be honest: how many AI products do you actually use? For me, the vast majority of my usage is captured by ChatGPT/Claude, Codex/Claude Code, and the AI responses in Google. What do all of these have in common? Each is an application built by the same company that builds the foundation model inside it. In a future post I'll dig into why I think that is, but for now let's take it as given: there is some structural reason that most users will encounter AI mostly through a big tech company's product. If that is the case, what does it mean for life science software teams?

It means that your existing software stack is not sufficient. Agent harnesses like Codex and Claude Code may not be able to integrate with your application at all, and where they can, they won't do it in a cost-efficient way. These agents were built to drive a general-purpose computer, not to navigate the bespoke scientific software your team has built.

But you also shouldn't build your own agent harness. This era of software agents has only just begun. As I mentioned, a small number of large, well-resourced, deeply technical companies have built the successful agents so far, and they may have a structural advantage you cannot match. In almost every case it's a wiser course to figure out how to integrate with the AI products your users already know than to build a harness from scratch.

Best practices for agentic software haven't solidified. Consider the pace. Every six months or so a new leading model arrives, and while the improvements are real, this is a very different situation from the one we enjoyed during the Moore's law era. Back then you could develop against the reasonable expectation that a faster processor would show up and fix your application's performance issues for free, with no real effort on your part. LLMs are a different beast. They are non-deterministic, and it takes sophisticated engineering to understand any given model's strengths and weaknesses. When a new model ships, you'll have to re-run your evals (you have those, right?) at minimum, and you may have to re-architect your harness entirely. The foundation model companies will have already done that tuning for their own harnesses before you've even gotten API access.

So in this early era, you integrate with the AI applications that already exist. And as readers of this newsletter will have guessed, that means MCP servers.

Like a REST API, an MCP server is an interface to a web application. The REST API you have today was built for deterministic, non-agentic software: a frontend running in a browser, or a CLI. It is relatively rigid, taking structured data in and handing structured data back. An MCP server serves this purpose and more. It can return a high-level summary instead of a raw payload, suggest prompts for the agent's next turn, and shape its responses into formats that spend fewer tokens.
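To make the "summary instead of a raw payload" idea concrete, here is a minimal sketch in plain Python. It deliberately avoids any particular MCP SDK; the function name `summarize_hits`, the `id`/`score` record fields, and the wording of the suggested next steps are all hypothetical, chosen only for illustration.

```python
from statistics import mean


def summarize_hits(records, top_n=3):
    """Collapse a raw result list into a compact, agent-friendly summary.

    `records` is assumed to be a list of dicts with "id" and "score" keys
    (a hypothetical shape, not taken from any real API).
    """
    if not records:
        return {"summary": "No hits.", "top": [], "next_steps": []}

    # Rank by score so the agent sees the most relevant rows first.
    ranked = sorted(records, key=lambda r: r["score"], reverse=True)
    top = [{"id": r["id"], "score": round(r["score"], 2)} for r in ranked[:top_n]]

    avg = mean(r["score"] for r in records)
    return {
        # One short sentence instead of hundreds of raw rows.
        "summary": f"{len(records)} hits, mean score {avg:.2f}",
        "top": top,
        # Suggested prompts for the agent's next turn.
        "next_steps": [
            f"Ask for details on {top[0]['id']}",
            "Request the full table as a download",
        ],
    }
```

A REST endpoint would typically hand back every record; a tool built this way hands the agent a sentence, a short ranked list, and a hint about what to ask next, which is usually far cheaper in tokens.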

As a concrete example, one of the first things you'll discover when building an MCP server is the necessity of batching your backend work. An agent left to its own devices will happily call your tools one at a time, and the latency of all those sequential round trips accumulates until the user gives up and closes the tab.
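A rough sketch of what batching can look like, assuming a tool that accepts a list of identifiers rather than a single one. `FakeBackend`, `fetch_many`, and `lookup_genes` are hypothetical names standing in for whatever your real service exposes; the backend here only counts round trips so the effect is visible.

```python
class FakeBackend:
    """Stand-in for a real service (hypothetical); counts round trips."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def fetch_many(self, ids):
        # One network round trip, however many ids are requested.
        self.round_trips += 1
        return {i: self.data.get(i) for i in ids}


def lookup_genes(backend, gene_ids):
    """Batched tool: resolve many ids with a single backend call."""
    # De-duplicate while preserving the order the agent asked in.
    unique = list(dict.fromkeys(gene_ids))
    found = backend.fetch_many(unique)
    return {
        "results": {g: v for g, v in found.items() if v is not None},
        # Report misses explicitly so the agent does not retry one by one.
        "missing": [g for g in unique if found[g] is None],
    }
```

The design choice is in the tool signature: if the only tool you expose takes one id, the agent has no option but to loop. Exposing a list-shaped input lets it do the sensible thing in one call.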

UI Components Over the Wire

Perhaps the most exciting possibility is that an MCP server can return custom UI components for the agent to render inline.

This is where my earlier argument about tools as interface primitives comes back around. A tool call can be more than a bolus of data and backend logic that hides behind a citation chip in a chat. It can also be a small, self-contained piece of interface that renders inline, giving your users a high-bandwidth way to interact with your application.

Imagine a scientist asks the agent to run a differential expression analysis. Today the best you can hope for is a summary paragraph with maybe a static image bolted on. But if your MCP server returns a custom UI component, the agent can hand back an interactive volcano plot the user can hover over, filter, and click into. A query can return a protein structure the user can rotate. A thousand-row results table can come back sortable and searchable rather than truncated to the first ten lines.
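As a sketch of the results-table case, the snippet below builds a tool result that carries both a short text summary and an HTML fragment as an embedded resource. The overall shape is loosely modeled on how MCP tool results can embed resources, but treat the details as assumptions: the `ui://example/results-table` URI is made up, whether and how a given host renders HTML inline varies, and sorting or search would need a script or host support that this example omits.

```python
from html import escape


def table_component(rows, columns):
    """Render rows as a plain HTML table, escaping every cell."""
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(r[c]))}</td>" for c in columns) + "</tr>"
        for r in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


def results_tool_result(rows, columns):
    """Build a tool result with a text fallback plus an HTML component.

    The dict layout and the URI are illustrative assumptions; check the
    MCP specification and your target host before relying on them.
    """
    return {
        "content": [
            # Text fallback for hosts that cannot render the component.
            {"type": "text", "text": f"{len(rows)} rows; table attached."},
            {
                "type": "resource",
                "resource": {
                    "uri": "ui://example/results-table",  # hypothetical URI
                    "mimeType": "text/html",
                    "text": table_component(rows, columns),
                },
            },
        ]
    }
```

Keeping the text fallback alongside the component means the same tool still works in a host that only understands plain text.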

So where does that leave a life science company? You aren't going to out-build the foundation model labs at the harness, and you don't need to. Your users are already inside ChatGPT and Claude. Meet them there.