Our default 2026 AI stack · Growvate Journal

People ask us this constantly: "what should I use?" The answer is always boring — pick the thing that will let your team ship, not the thing that will look best in a tech blog post. With that caveat, here's our default stack as of Q2 2026.

The model layer

Primary: Claude Sonnet 4

Our default for everything that involves reasoning, long context, or following complex instructions. It's better than GPT-4o at multi-step tasks, follows system prompts more reliably, and is significantly cheaper for high-volume use. We've shipped 28 of our last 30 projects on Sonnet 4.

For deep reasoning: Claude Opus 4

Used for the 5% of calls where you need real depth — financial analysis, legal reasoning, complex code review. We don't use Opus for production user-facing flows because of latency, but we use it inside async pipelines and offline evaluation.

For dirt-cheap volume: Claude Haiku

When you need to classify, route, or do quick extraction over millions of items, Haiku is 1/10th the cost of Sonnet and good enough. Use it for routing prompts to other models, summarising chunks, or doing first-pass labeling.
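
A minimal sketch of that routing step, assuming the cheap model has already been prompted to return a single task label. The label set, the tier names, and the pickTier helper are all hypothetical, and the tier names are placeholders rather than real model IDs. The rule that deep tasks only reach Opus outside user-facing flows follows the latency point above.

```typescript
// Tier names are placeholders, not real model IDs: map them to whatever
// model identifiers your provider currently lists.
type Tier = "haiku" | "sonnet" | "opus";

interface RouteDecision {
  tier: Tier;
  reason: string;
}

// Hypothetical label set the cheap classifier is prompted to choose from.
const TIER_FOR_LABEL = new Map<string, Tier>([
  ["classify", "haiku"],
  ["extract", "haiku"],
  ["summarise", "haiku"],
  ["general", "sonnet"],
  ["deep_analysis", "opus"],
]);

// Turn the classifier's raw text output into a routing decision.
function pickTier(rawLabel: string, opts: { interactive: boolean }): RouteDecision {
  const label = rawLabel.trim().toLowerCase();
  const tier = TIER_FOR_LABEL.get(label);

  // Anything the classifier was not asked to produce falls back to the default tier.
  if (tier === undefined) {
    return { tier: "sonnet", reason: `unrecognised label "${label}", using default` };
  }

  // Keep the slowest tier out of user-facing flows; async pipelines may use it.
  if (tier === "opus" && opts.interactive) {
    return { tier: "sonnet", reason: "deep task in a user-facing flow, avoiding latency" };
  }

  return { tier, reason: `label "${label}"` };
}
```

In practice the rawLabel argument would be the text returned by the cheap model, so the fallback branch matters: a router that throws on an unexpected label turns a classification hiccup into an outage.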

What we use OpenAI for

Whisper for speech-to-text (still the best), text-embedding-3-large for embeddings, and GPT-4o for tasks that specifically need image understanding inside a chat loop. We don't use GPT-4o as a primary text model anymore.

"Use Claude as your default, OpenAI for specific specialties, Google for specific specialties." That's our entire model strategy in 2026.

The orchestration layer

For agent workflows: TypeScript + Anthropic SDK

We don't use LangChain anymore. The abstractions get in the way for production work and the framework keeps reshaping itself. Plain TypeScript + the Anthropic SDK + good function-calling discipline is more code but significantly more maintainable.
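
One way to read "function-calling discipline" is: validate every tool input before running anything, and always hand the model a result, even on failure. The sketch below assumes that reading. The block shapes mirror the tool_use and tool_result content blocks of the Anthropic Messages API, but nothing here imports the SDK, and defineTool, dispatchToolUse, and the get_order_status tool are hypothetical names made up for illustration. Handlers are synchronous only to keep the example short; real ones would usually be async.

```typescript
// Shapes modelled on the tool_use / tool_result content blocks of the Messages API.
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: unknown;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

interface RegisteredTool {
  description: string;
  execute: (input: unknown) => string;
}

// Pair a validator with a handler so the handler only ever sees checked input.
function defineTool<I>(
  description: string,
  parse: (input: unknown) => I,
  run: (input: I) => string,
): RegisteredTool {
  return { description, execute: (input) => run(parse(input)) };
}

// Always return a tool_result: failures go back to the model instead of crashing the loop.
function dispatchToolUse(block: ToolUseBlock, tools: Map<string, RegisteredTool>): ToolResultBlock {
  const tool = tools.get(block.name);
  if (!tool) {
    return { type: "tool_result", tool_use_id: block.id, content: `Unknown tool: ${block.name}`, is_error: true };
  }
  try {
    return { type: "tool_result", tool_use_id: block.id, content: tool.execute(block.input) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { type: "tool_result", tool_use_id: block.id, content: message, is_error: true };
  }
}

// Hypothetical example tool with a hand-written validator.
function parseOrderInput(input: unknown): { orderId: string } {
  if (typeof input === "object" && input !== null) {
    const orderId = (input as { orderId?: unknown }).orderId;
    if (typeof orderId === "string" && orderId.length > 0) {
      return { orderId };
    }
  }
  throw new Error("get_order_status expects { orderId: string }");
}

const exampleTools = new Map<string, RegisteredTool>([
  [
    "get_order_status",
    defineTool("Look up the status of an order", parseOrderInput, ({ orderId }) => `Order ${orderId}: shipped`),
  ],
]);
```

The extra code buys a single choke point: every tool call, valid or not, passes through dispatchToolUse, which is where logging, tracing, and retries can live.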

For non-engineering workflows: n8n

When the workflow doesn't need custom code (say, Shopify to a model to Slack), self-hosted n8n is faster to ship and easier for the client's ops team to maintain after handoff. Zapier is fine for very simple flows; n8n is the right call once there are 4+ steps.

For RAG: pgvector + custom

We've migrated away from Pinecone for almost all client work. Postgres with pgvector handles up to ~5M embeddings comfortably, it's already in the client's stack, and it has zero per-query cost. Pinecone is still right for some very specific cases (multi-tenant SaaS embeddings at scale), but it's not the default anymore.
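
For a sense of how little the "custom" part can be, here is a sketch of a similarity query builder. It assumes a hypothetical chunks table with id, content, and a pgvector embedding column; those names are ours for illustration. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives cosine similarity. The returned `{ text, values }` object follows the parameterised query shape that node-postgres accepts, though the builder itself has no dependencies.

```typescript
interface SqlQuery {
  text: string;
  values: unknown[];
}

// pgvector accepts a vector as a text literal such as '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding must be a non-empty array of finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

// Build a parameterised top-k query ordered by cosine distance (closest first).
// Table and column names are hypothetical.
function buildSimilarityQuery(embedding: number[], limit: number): SqlQuery {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  return {
    text:
      "SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity " +
      "FROM chunks ORDER BY embedding <=> $1::vector LIMIT $2",
    values: [toVectorLiteral(embedding), limit],
  };
}
```

Passing the vector as a bound parameter rather than interpolating it into the SQL string keeps the query safe to reuse. The search is still an ordinary Postgres query, so it can be joined and filtered alongside the rest of the client's data.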

The infrastructure layer

Hosting: Vercel for Next.js apps, Cloudflare Workers for edge stuff. Fly.io if there's GPU work.
Database: Postgres on Supabase or Neon. Both fine. We pick based on the client's existing relationships.
Auth: Clerk for B2B SaaS. Supabase Auth if Postgres is already there.
Files: Cloudflare R2 or S3. R2 wins on cost for large objects.
Observability: Sentry for errors, PostHog for product analytics, Datadog or BetterStack for infra.
LLM observability: Langfuse self-hosted. We've tried most of them. Langfuse is currently the cleanest.

The eval layer

This is where almost no one invests enough. We use:

Promptfoo for batch eval runs against a frozen test set
Langfuse for production-trace-based evals
LLM-as-judge with Claude Opus 4 for rubric-based scoring of generative outputs
Human review queues for the top 5% most ambiguous outputs (a Slack-based mini-tool we built)
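
As an illustration of the last item, here is one possible way to pick which outputs go to human review. It assumes the judge returns a rubric score between 0 and 1, and it treats "ambiguous" as "closest to the pass threshold". Both are our assumptions for this sketch, and selectForHumanReview is a hypothetical helper, not the Slack tool itself.

```typescript
interface JudgedOutput {
  id: string;
  score: number; // rubric score from the LLM judge, assumed to lie in [0, 1]
}

// Return the `fraction` of outputs whose judge score sits closest to the
// pass threshold, i.e. the ones the judge was least decisive about.
function selectForHumanReview(
  outputs: JudgedOutput[],
  passThreshold = 0.5,
  fraction = 0.05,
): JudgedOutput[] {
  if (outputs.length === 0) return [];

  // Always send at least one item so small batches still get a human look.
  const count = Math.max(1, Math.ceil(outputs.length * fraction));

  // Copy before sorting so the caller's array is left untouched.
  return [...outputs]
    .sort((a, b) => Math.abs(a.score - passThreshold) - Math.abs(b.score - passThreshold))
    .slice(0, count);
}
```

Distance from the threshold is a crude proxy. If the judge can be run more than once per output, disagreement between runs is another signal worth trying.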

What we don't use (and why)

LangChain. Abstractions don't match real-world failure modes. Hard to debug.
AutoGen / CrewAI. Agent orchestration frameworks are still too unstable for production.
Vector DBs other than pgvector and Pinecone. Most of the others solve problems most clients don't have.
Fine-tuning by default. Sonnet 4 with a good system prompt + RAG beats fine-tuned smaller models in 90% of cases. Only fine-tune when the cost math forces it.

This will be wrong in 6 months

Every line of this article will need revising by Q4 2026. The stack churns. The principles don't:

Pick the boring choice.
Optimise for ability to ship, not theoretical maximums.
Invest in evals before sophistication.
Stay one layer below the bleeding edge — last quarter's frontier is this quarter's stable.

If you'd like our latest version of this list or want our take on a stack you're considering, just ask — happy to share what we've shipped.
