AI-n’t Simple

Tags
AI
Technical
Performance
Published
July 31, 2025
Author
Landry Yoder
When LLMs first arrived, they promised simplicity: plug in a model and watch it handle everything, from customer support to coding, research, and planning.
But as Tomasz Tunguz points out, the complexity didn't disappear; it just moved.

Perception vs Reality

The reality of building AI-driven applications feels a lot like the hidden technical debt found at more mature SaaS companies. There are early signals that powering your business with AI falls back on a familiar pattern: instead of relying purely on AI to reason through tasks, we write software around it. Want to triage emails? You'll likely build workflows that create Asana tasks, log updates in HubSpot, and trigger alerts, most of it deterministic. AI plays a role, but it doesn't carry the whole load.
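A minimal sketch of that pattern, with hypothetical function names and integrations: the LLM makes exactly one narrow decision (the category), and plain deterministic code handles everything downstream.

```python
# Hypothetical email-triage pipeline: the LLM only classifies;
# all routing and side effects are deterministic code.

def classify_email(body: str) -> str:
    """Stand-in for a single LLM call that returns a category label."""
    # In practice: call your model provider here.
    if "invoice" in body.lower():
        return "billing"
    if "bug" in body.lower():
        return "support"
    return "general"

def triage(body: str) -> list[str]:
    """Deterministic routing around the one AI decision."""
    category = classify_email(body)
    actions = []
    if category == "billing":
        actions.append("create_asana_task")   # e.g. POST to the Asana API
    if category == "support":
        actions.append("log_hubspot_update")  # e.g. POST to the HubSpot API
    actions.append("send_alert")              # always notify the channel
    return actions
```

If the model misclassifies, the blast radius is one wrong label, not an unbounded chain of AI decisions.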

Context Engineering: Just the Beginning

Agents, like humans, need deep context to work well. They need to know how your CRM is structured, what goes in each field, and how your team communicates. Feeding all that into an LLM is expensive. These models are hungry, and context is costly.
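One common response to that cost is to budget context explicitly. Here's a toy sketch (word count stands in for a real tokenizer): keep only the highest-priority snippets that fit under a token budget.

```python
# Toy context budgeter: pack highest-priority snippets first,
# stopping when the token budget is spent. Word count is a crude
# stand-in for a real tokenizer.

def n_tokens(text: str) -> int:
    return len(text.split())

def pack_context(snippets: list[tuple[int, str]], budget: int) -> list[str]:
    """snippets: (priority, text) pairs; higher priority is packed first."""
    packed, used = [], 0
    for _, text in sorted(snippets, key=lambda s: -s[0]):
        cost = n_tokens(text)
        if used + cost <= budget:
            packed.append(text)
            used += cost
    return packed
```

The point isn't this particular heuristic; it's that deciding *what* the model sees becomes ordinary software you own and tune.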
As your system grows, so do the tradeoffs. Tool calling breaks down when you’re dealing with more than a dozen services. You need smarter routing, which often means training a traditional ML model just to decide what tool to use. Then comes observability — logging what happened, why it happened, and how to course-correct when it doesn’t.
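That routing layer can be surprisingly unglamorous. A sketch of the idea, with a tiny bag-of-words scorer standing in for the "traditional ML model" (tool names and training phrases are illustrative):

```python
from collections import Counter

# Toy tool router: score each tool by vocabulary overlap with the
# query, so the expensive LLM never has to pick among dozens of
# tools itself. Tools and phrases below are made up for the sketch.

TRAINING = {
    "calendar": ["schedule a meeting", "book a call next week"],
    "crm":      ["update the contact record", "log this deal in the crm"],
    "search":   ["find documents about pricing", "search the knowledge base"],
}

def _bow(text: str) -> Counter:
    return Counter(text.lower().split())

# Precompute one word-count "centroid" per tool from its phrases.
CENTROIDS = {
    tool: sum((_bow(p) for p in phrases), Counter())
    for tool, phrases in TRAINING.items()
}

def route(query: str) -> str:
    """Return the tool whose training vocabulary best overlaps the query."""
    q = _bow(query)
    def score(tool: str) -> int:
        return sum(min(q[w], CENTROIDS[tool][w]) for w in q)
    return max(CENTROIDS, key=score)
```

In production you'd train a real classifier on logged queries, but the shape is the same: a cheap, fast model gates access to the expensive one.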
You also need guardrails: blocking inappropriate responses, managing token usage, rate-limiting runaway costs. And knowledge retrieval becomes mandatory — using RAG (retrieval-augmented generation), vector databases like LanceDB or Pinecone, and increasingly, graph-based tools for more structured queries.
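The retrieval step itself reduces to nearest-neighbor search over embeddings. A minimal sketch with toy three-dimensional vectors standing in for what an embedding model (and a store like LanceDB or Pinecone) would give you:

```python
import math

# Minimal RAG retrieval step: rank documents by cosine similarity
# to the query embedding, then stuff the top-k into the prompt.
# Vectors and document ids are toy stand-ins.

DOCS = {
    "refund_policy": [0.9, 0.1, 0.0],
    "api_reference": [0.1, 0.9, 0.1],
    "brand_guide":   [0.0, 0.2, 0.9],
}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the top-k document ids to include in the model's context."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]
```

A vector database does exactly this, just at scale and with approximate indexes; graph-based tools layer structured relationships on top when similarity alone isn't enough.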
Then there’s memory. It’s not just about storing chat history. Users expect persistence — like remembering how you want your data visualizations labeled or your dashboards themed. Today, that memory lives in markdown files, config scripts, and increasingly structured agent formats (.gemini, .claude).
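Stripped down, that kind of memory is just preferences written to disk between sessions. A sketch, assuming a JSON file for simplicity (real agent formats like .gemini and .claude are richer, and the filename here is made up):

```python
import json
from pathlib import Path

# Sketch of persistent agent memory: key-value preferences saved
# between sessions so the agent doesn't re-ask. File name and
# schema are illustrative, not a real agent format.

MEMORY_FILE = Path("agent_memory.json")

def remember(key: str, value: str) -> None:
    """Persist one preference, merging with anything already saved."""
    memory = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}
    memory[key] = value
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

def recall(key: str, default: str = "") -> str:
    """Load a preference from a prior session, or fall back to a default."""
    if not MEMORY_FILE.exists():
        return default
    return json.loads(MEMORY_FILE.read_text()).get(key, default)
```

The hard part isn't storage; it's deciding what's worth remembering and injecting it back into the context window at the right moment.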

Mistakes to Avoid

As mentioned in previous posts, here are some of the common traps we’ve seen:
  • Assuming the AI will “just figure it out” without structured system prompts, memory schemas, or scoped tasks
  • Underestimating product integration complexity, including webhooks, API endpoints, latency handling, and deterministic fallbacks
  • Skipping UX, evaluation, and deployment planning, especially in systems where AI failure cases must be caught or redirected
  • Neglecting observability and guardrails, leading to surprise costs, hallucinations, or inconsistent outputs
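The last two traps share one cure: validate the model's output and fall back deterministically when it's off-schema. A sketch, with `call_model` as a stand-in for a real LLM call:

```python
# Sketch of a deterministic fallback: if the model's output fails
# validation, use a safe default path instead of shipping a
# hallucination downstream.

VALID_PRIORITIES = {"low", "medium", "high"}

def call_model(ticket: str) -> str:
    """Stand-in LLM call; real output may be malformed or off-schema."""
    return "urgent!!!"  # simulate an out-of-schema response

def assign_priority(ticket: str) -> str:
    raw = call_model(ticket).strip().lower()
    if raw in VALID_PRIORITIES:
        return raw
    return "medium"  # deterministic fallback, plus a place to log the miss
```

The same validate-then-fallback shape applies to structured outputs everywhere: tool arguments, JSON payloads, routing decisions.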
 
We’re not just building with LLMs — we’re engineering around them. The magic box turns out to be an iceberg. The model is just the visible tip. Underneath: routing, orchestration, retrieval, observability, evals, governance, and controls.