---
title: "Next.js Batch LLM Evaluator"
sidebarTitle: "Batch LLM Evaluator"
description: "This example Next.js project evaluates multiple LLM models using the Vercel AI SDK and streams updates to the frontend using Trigger.dev Realtime."
---
import RealtimeLearnMore from "/snippets/realtime-learn-more.mdx";
This demo is a full stack example that uses the following:
- A Next.js app with Prisma for the database.
- Trigger.dev Realtime to stream updates to the frontend.
- Works with multiple LLM models using the Vercel AI SDK (OpenAI, Anthropic, XAI).
- Distributes work across multiple tasks using the new `batch.triggerByTaskAndWait` method.
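To make the fan-out pattern concrete, here is a minimal sketch of how an orchestrator task can distribute work to per-model tasks with `batch.triggerByTaskAndWait`. The task IDs, payload shape, and stubbed return values are illustrative assumptions, not the exact code from the repo.

```typescript
import { batch, task } from "@trigger.dev/sdk/v3";

// Hypothetical per-model evaluation task (the real one would call the
// Vercel AI SDK and stream the model's response).
export const anthropicEval = task({
  id: "anthropic-eval",
  run: async (payload: { prompt: string }) => {
    return { model: "anthropic", text: "..." };
  },
});

export const openaiEval = task({
  id: "openai-eval",
  run: async (payload: { prompt: string }) => {
    return { model: "openai", text: "..." };
  },
});

export const evaluateModels = task({
  id: "evaluate-models",
  run: async (payload: { prompt: string }) => {
    // Each item pairs a task with its payload. The call resolves once
    // every child run has finished, returning the results in order.
    const { runs } = await batch.triggerByTaskAndWait([
      { task: anthropicEval, payload: { prompt: payload.prompt } },
      { task: openaiEval, payload: { prompt: payload.prompt } },
    ]);
    return runs;
  },
});
```

Because the batch items reference the task objects directly, the payload types are checked at compile time for each child task.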
<Card
  title="View the Batch LLM Evaluator repo"
  icon="GitHub"
  href="https://github.com/triggerdotdev/examples/tree/main/batch-llm-evaluator"
>
  Click here to view the full code for this project in our examples repository on GitHub. You can
  fork it and use it as a starting point for your own project.
</Card>
<video
  controls
  className="w-full aspect-video"
  src="https://content.trigger.dev/batch-llm-evaluator.mp4"
/>
- View the Trigger.dev task code in the `src/trigger/batch.ts` file.
- The `evaluateModels` task uses the `batch.triggerByTaskAndWait` method to distribute the task to the different LLM models.
- It then passes the results through to a `summarizeEvals` task that calculates some dummy "tags" for each LLM response.
- We use a `useRealtimeRunsWithTag` hook to subscribe to the different evaluation task runs in the `src/components/llm-evaluator.tsx` file.
- We then pass the relevant run down into three different components for the different models:
  - The `AnthropicEval` component: `src/components/evals/Anthropic.tsx`
  - The `XAIEval` component: `src/components/evals/XAI.tsx`
  - The `OpenAIEval` component: `src/components/evals/OpenAI.tsx`
- Each of these components then uses `useRealtimeRunWithStreams` to subscribe to the different LLM responses.
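As a rough sketch of the frontend side, a per-model component can subscribe to a run's streamed output with `useRealtimeRunWithStreams`. The stream key (`"llm"`), prop names, and types below are assumptions for illustration, not the exact code from the repo.

```typescript
"use client";

import { useRealtimeRunWithStreams } from "@trigger.dev/react-hooks";

// Assumed stream shape: the task streams string chunks under the "llm" key.
type Streams = { llm: string };

export function ModelEval({ runId, accessToken }: { runId: string; accessToken: string }) {
  const { run, streams } = useRealtimeRunWithStreams<any, Streams>(runId, {
    accessToken, // a public access token scoped to this run
  });

  // Each stream key holds the chunks received so far; join them to get
  // the partial response text.
  const text = (streams.llm ?? []).join("");

  return (
    <div>
      <p>Status: {run?.status}</p>
      <p>{text}</p>
    </div>
  );
}
```

The same pattern repeats for each of the three model components, with the parent passing down the matching run's ID and token.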