Next.js is a great choice for many projects, but not every React app needs SSR or a full framework. Sometimes all you need is a couple of protected endpoints: keep an API key off the client, verify auth, cap requests.
Vercel already runs serverless functions alongside static sites. A single rewrite rule plus Hono gives you auth, rate limiting, and AI integration in the same repo, without changing your frontend stack at all.
ERMate, an open-source AI-powered ER diagram tool, uses exactly this pattern.
## Project Structure
```
/
├── api/
│   ├── index.ts        # single Vercel Function entry point
│   ├── middleware/
│   │   └── auth.ts
│   └── routes/
│       └── chat.ts
├── src/                # React app
└── vercel.json
```
`api/index.ts` is the single Vercel Function entry point. `src/` is the React app. They share the repo and the deploy, nothing else.
One rewrite rule funnels everything to one function:
```json
// vercel.json
{
  "rewrites": [
    { "source": "/api/:path*", "destination": "/api/index.ts" }
  ]
}
```
## The Entry Point
Hono handles routing inside that single function. Auth middleware runs on every route before anything else.
```typescript
// api/index.ts
import { Hono } from 'hono'
import { chatRoute } from './routes/chat.js'
import { authMiddleware } from './middleware/auth.js'

const app = new Hono().basePath('/api')

app.use('*', authMiddleware())
app.route('/chat', chatRoute)

export default app
```
## Auth with Clerk
The middleware wraps Clerk's Hono adapter. Your secret key never leaves the server.
```typescript
// api/middleware/auth.ts
import { clerkMiddleware } from '@hono/clerk-auth'

export const authMiddleware = () =>
  clerkMiddleware({
    publishableKey: process.env.VITE_CLERK_PUBLISHABLE_KEY,
    secretKey: process.env.CLERK_SECRET_KEY,
  })
```
Inside a route, extract the user and gate the handler:
```typescript
// api/routes/chat.ts
import { Hono } from 'hono'
import { getAuth } from '@hono/clerk-auth'

export const chatRoute = new Hono()

chatRoute.post('/', async (c) => {
  const auth = getAuth(c)
  if (!auth?.userId) return c.json({ error: 'Unauthorized' }, 401)
  // safe to proceed
})
```
## Rate Limiting (no external service)
A plain in-memory fixed window is enough for most apps. It resets on cold start, which is fine for soft limits.
One caveat: if Vercel spins up multiple warm instances, each has its own Map, so the effective limit multiplies by instance count. For a soft abuse deterrent that's acceptable; for strict enforcement, swap this out for an atomic external store like Redis or Vercel KV to avoid cross-instance races.
```typescript
const windows = new Map<string, { count: number; resetAt: number }>()

export function checkRateLimit(userId: string) {
  const now = Date.now()
  const entry = windows.get(userId)

  if (!entry || now > entry.resetAt) {
    windows.set(userId, { count: 1, resetAt: now + 60_000 })
    return { ok: true }
  }

  if (entry.count >= 20) return { ok: false, error: 'Too many requests.' }

  entry.count++
  return { ok: true }
}
```
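To see the window behave, here is a self-contained sketch with the limiter inlined. The 20-per-minute numbers mirror the snippet above; the `retryAfter` field is an assumption added for illustration (handy for a `Retry-After` header), not something the original returns.

```typescript
// Fixed-window limiter, inlined from above (20 requests per 60 s per user).
const windows = new Map<string, { count: number; resetAt: number }>()

function checkRateLimit(userId: string, now = Date.now()) {
  const entry = windows.get(userId)
  if (!entry || now > entry.resetAt) {
    windows.set(userId, { count: 1, resetAt: now + 60_000 })
    return { ok: true as const }
  }
  if (entry.count >= 20) {
    // Hypothetical extra field: seconds until the window resets.
    return { ok: false as const, retryAfter: Math.ceil((entry.resetAt - now) / 1000) }
  }
  entry.count++
  return { ok: true as const }
}

// Simulate one user hammering the endpoint within a single window.
const t0 = Date.now()
const results = Array.from({ length: 25 }, () => checkRateLimit('user_1', t0))
const allowed = results.filter((r) => r.ok).length
console.log(allowed) // 20 — the remaining 5 calls are rejected
```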
## AI Integration
The route calls the Vercel AI SDK after auth and rate checks pass. The model, tools, and system prompt stay server-side only.
```typescript
import { generateText, stepCountIs } from 'ai'

// inside the chat route handler
const result = await generateText({
  model: 'deepseek/deepseek-v3.2',
  system: systemPrompt,
  prompt,
  tools,
  stopWhen: stepCountIs(10),
})

return c.json({
  text: result.text,
  actions: result.steps.flatMap((s) => s.toolResults.map((t) => t.output)),
})
```
The client gets back plain JSON, applies the actions to local state, and never touches the model directly.
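Applying those actions on the client can be a plain reducer over local state. The `Action` shape below is hypothetical (ERMate's real action types aren't shown here); it only illustrates the pattern of replaying server-returned actions against client state.

```typescript
// Hypothetical action shape for illustration -- the real app defines its own.
type Entity = { name: string }
type Action =
  | { type: 'addEntity'; name: string }
  | { type: 'removeEntity'; name: string }

// Pure reducer: replay the server's actions over the current local state.
function applyActions(entities: Entity[], actions: Action[]): Entity[] {
  return actions.reduce((state, action) => {
    switch (action.type) {
      case 'addEntity':
        return [...state, { name: action.name }]
      case 'removeEntity':
        return state.filter((e) => e.name !== action.name)
    }
  }, entities)
}

const next = applyActions([], [
  { type: 'addEntity', name: 'User' },
  { type: 'addEntity', name: 'Order' },
  { type: 'removeEntity', name: 'User' },
])
console.log(next.map((e) => e.name)) // [ 'Order' ]
```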
## Local Development
Vite proxies `/api` to a local server:
```typescript
// vite.config.ts
server: {
  proxy: { '/api': 'http://localhost:3001' }
}
```
```typescript
// api/dev.ts
import { serve } from '@hono/node-server'
import app from './index.js'

serve({ fetch: app.fetch, port: 3001 })
```
Run both in parallel: `pnpm dev` and `pnpm dev:api`. No Docker, no extra config.
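The two scripts might look like this in `package.json`; using `tsx` as the watcher is an assumption here, and any Node-compatible TypeScript runner works:

```json
{
  "scripts": {
    "dev": "vite",
    "dev:api": "tsx watch api/dev.ts"
  }
}
```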
## When This Fits (and When It Doesn't)
This setup works well for SPAs that need a thin secure backend: AI calls, webhook receivers, or a simple data proxy. The React app stays a React app. The API is a separate concern in a separate directory. They share the repo and the deploy pipeline, and nothing else.
Next.js makes more sense when you need SSR, ISR, or tight server/client component integration. The tradeoffs of this lighter approach:
- In-memory rate limiting resets on cold start. Good enough for soft limits, not for strict quotas.
- One Vercel Function handles all API routes. Fine for low traffic; split into separate files if you need per-route isolation.
- No built-in database. Add one (Postgres, KV, etc.) when you need persistence.
