Next.js is a great choice for many projects, but not every React app needs SSR or a full framework. Sometimes all you need is a couple of protected endpoints: keep an API key off the client, verify auth, cap requests.
Vercel already runs serverless functions alongside static sites. A single rewrite rule plus Hono gives you auth, rate limiting, and AI integration in the same repo, without changing your frontend stack at all.
ERMate, an open-source AI-powered ER diagram tool, uses exactly this pattern.
## Project Structure
```
/
├── api/
│   ├── index.ts        # single Vercel Function entry point
│   ├── middleware/
│   │   └── auth.ts
│   └── routes/
│       └── chat.ts
├── src/                # React app
└── vercel.json
```
`api/index.ts` is the single Vercel Function entry point. `src/` is the React app. They share the repo and the deploy, nothing else.
One rewrite rule funnels everything to one function:
```json
// vercel.json
{
  "rewrites": [
    { "source": "/api/:path*", "destination": "/api/index.ts" }
  ]
}
```
## The Entry Point
Hono handles routing inside that single function. Auth middleware runs on every route before anything else.
```typescript
// api/index.ts
import { Hono } from 'hono'
import { chatRoute } from './routes/chat.js'
import { authMiddleware } from './middleware/auth.js'

const app = new Hono().basePath('/api')

app.use('*', authMiddleware())
app.route('/chat', chatRoute)

export default app
```
## Auth with Clerk
The middleware wraps Clerk's Hono adapter. Your secret key never leaves the server.
```typescript
// api/middleware/auth.ts
import { clerkMiddleware } from '@hono/clerk-auth'

export const authMiddleware = () =>
  clerkMiddleware({
    publishableKey: process.env.VITE_CLERK_PUBLISHABLE_KEY,
    secretKey: process.env.CLERK_SECRET_KEY,
  })
```
Inside a route, extract the user and gate the handler:
```typescript
// api/routes/chat.ts
import { Hono } from 'hono'
import { getAuth } from '@hono/clerk-auth'

export const chatRoute = new Hono()

chatRoute.post('/', async (c) => {
  const auth = getAuth(c)
  if (!auth?.userId) return c.json({ error: 'Unauthorized' }, 401)
  // safe to proceed
})
```
## Rate Limiting (no external service)
A plain in-memory fixed window is enough for most apps. It resets on cold start, which is fine for soft limits.
One caveat: if Vercel spins up multiple warm instances, each has its own Map, so the effective limit multiplies by instance count. For a soft abuse deterrent that's acceptable; for strict enforcement, swap this out for an atomic external store like Redis or Vercel KV to avoid cross-instance races.
```typescript
const windows = new Map<string, { count: number; resetAt: number }>()

export function checkRateLimit(userId: string) {
  const now = Date.now()
  const entry = windows.get(userId)

  if (!entry || now > entry.resetAt) {
    windows.set(userId, { count: 1, resetAt: now + 60_000 })
    return { ok: true }
  }

  if (entry.count >= 20) return { ok: false, error: 'Too many requests.' }

  entry.count++
  return { ok: true }
}
```
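To see the window behave, here is a self-contained sketch with the limiter inlined. The 20-per-minute numbers mirror the snippet above; the `retryAfter` field is an assumption added for illustration (handy for a `Retry-After` header), not something the original returns.

```typescript
// Fixed-window limiter, inlined from above (20 requests per 60 s per user).
const windows = new Map<string, { count: number; resetAt: number }>()

function checkRateLimit(userId: string, now = Date.now()) {
  const entry = windows.get(userId)
  if (!entry || now > entry.resetAt) {
    windows.set(userId, { count: 1, resetAt: now + 60_000 })
    return { ok: true as const }
  }
  if (entry.count >= 20) {
    // Hypothetical extra field: seconds until the window resets.
    return { ok: false as const, retryAfter: Math.ceil((entry.resetAt - now) / 1000) }
  }
  entry.count++
  return { ok: true as const }
}

// Simulate one user hammering the endpoint within a single window.
const t0 = Date.now()
const results = Array.from({ length: 25 }, () => checkRateLimit('user_1', t0))
const allowed = results.filter((r) => r.ok).length
console.log(allowed) // 20 — the remaining 5 calls are rejected
```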
## AI Integration
The route calls the Vercel AI SDK after auth and rate checks pass. The model, tools, and system prompt stay server-side only.
```typescript
import { generateText, stepCountIs } from 'ai'

// inside the chat route handler
const result = await generateText({
  model: 'deepseek/deepseek-v3.2',
  system: systemPrompt,
  prompt,
  tools,
  stopWhen: stepCountIs(10),
})

return c.json({
  text: result.text,
  actions: result.steps.flatMap((s) => s.toolResults.map((t) => t.output)),
})
```
The client gets back plain JSON, applies the actions to local state, and never touches the model directly.
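Applying those actions on the client can be a plain reducer over local state. The `Action` shape below is hypothetical (ERMate's real action types aren't shown here); it only illustrates the pattern of replaying server-returned actions against client state.

```typescript
// Hypothetical action shape for illustration -- the real app defines its own.
type Entity = { name: string }
type Action =
  | { type: 'addEntity'; name: string }
  | { type: 'removeEntity'; name: string }

// Pure reducer: replay the server's actions over the current local state.
function applyActions(entities: Entity[], actions: Action[]): Entity[] {
  return actions.reduce((state, action) => {
    switch (action.type) {
      case 'addEntity':
        return [...state, { name: action.name }]
      case 'removeEntity':
        return state.filter((e) => e.name !== action.name)
    }
  }, entities)
}

const next = applyActions([], [
  { type: 'addEntity', name: 'User' },
  { type: 'addEntity', name: 'Order' },
  { type: 'removeEntity', name: 'User' },
])
console.log(next.map((e) => e.name)) // [ 'Order' ]
```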
## Local Development
Vite proxies `/api` to a local server:
```typescript
// vite.config.ts
server: {
  proxy: { '/api': 'http://localhost:3001' }
}
```
```typescript
// api/dev.ts
import { serve } from '@hono/node-server'
import app from './index.js'

serve({ fetch: app.fetch, port: 3001 })
```
Run both in parallel: `pnpm dev` and `pnpm dev:api`. No Docker, no extra config.
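The two scripts might look like this in `package.json`; using `tsx` as the watcher is an assumption here, and any Node-compatible TypeScript runner works:

```json
{
  "scripts": {
    "dev": "vite",
    "dev:api": "tsx watch api/dev.ts"
  }
}
```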
## When This Fits (and When It Doesn't)
This setup works well for SPAs that need a thin secure backend: AI calls, webhook receivers, or a simple data proxy. The React app stays a React app. The API is a separate concern in a separate directory. They share the repo and the deploy pipeline, and nothing else.
Next.js makes more sense when you need SSR, ISR, or tight server/client component integration. The tradeoffs of this lighter approach:
- In-memory rate limiting resets on cold start. Good enough for soft limits, not for strict quotas.
- One Vercel Function handles all API routes. Fine for low traffic; split into separate files if you need per-route isolation.
- No built-in database. Add one (Postgres, KV, etc.) when you need persistence.
