In this tutorial, we’ll build a modern web application that converts images to detailed text descriptions using Claude’s vision capabilities. We’ll leverage the power of Next.js 15’s App Router and the Vercel AI SDK to create a responsive, real-time streaming application.
Prerequisites
- Basic knowledge of React and TypeScript
- Node.js installed on your machine
- An Anthropic API key
- Basic familiarity with Next.js
Project Setup
First, let’s create a new Next.js project with TypeScript and Tailwind CSS support:
```shell
npx create-next-app@latest image-to-text --typescript --tailwind --app
cd image-to-text
```
Install the required dependencies:
```shell
npm install ai @anthropic-ai/sdk
```
Building the API Endpoint
Create a new API route at `app/api/chat/route.ts`. This endpoint will handle image uploads and communicate with Claude:
```typescript
import { AnthropicStream, StreamingTextResponse } from 'ai';
import { Anthropic } from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export const runtime = 'edge';

export async function POST(req: Request) {
  // useChat posts { messages, data }; the image data URL arrives under `data`
  const { data } = await req.json();
  const image: string = data.image;

  // A data URL looks like "data:image/png;base64,....": extract the media
  // type and the raw base64 payload instead of assuming every upload is JPEG
  const mediaType = image.substring(image.indexOf(':') + 1, image.indexOf(';'));
  const base64Data = image.split(',')[1];

  const response = await anthropic.messages.create({
    model: 'claude-3-haiku-20240307',
    max_tokens: 1024,
    // stream: true is required so AnthropicStream receives a streaming response
    stream: true,
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'image',
            source: {
              type: 'base64',
              // Claude accepts JPEG, PNG, GIF, and WebP images
              media_type: mediaType as 'image/jpeg' | 'image/png' | 'image/gif' | 'image/webp',
              data: base64Data,
            },
          },
          {
            type: 'text',
            text: 'Describe this image in detail.',
          },
        ],
      },
    ],
  });

  const stream = AnthropicStream(response);
  return new StreamingTextResponse(stream);
}
```
Creating the Frontend Interface
Replace the contents of `app/page.tsx` with a responsive UI that handles image uploads and displays Claude's responses:
```tsx
'use client';

import { useState } from 'react';
import { useChat } from 'ai/react';

export default function ImageToText() {
  const [image, setImage] = useState<string | null>(null);
  const { messages, isLoading, append } = useChat();

  // Read the selected file as a base64 data URL for both preview and upload
  const handleImageUpload = (e: React.ChangeEvent<HTMLInputElement>) => {
    const file = e.target.files?.[0];
    if (file) {
      const reader = new FileReader();
      reader.onload = (e) => {
        const base64 = e.target?.result as string;
        setImage(base64);
      };
      reader.readAsDataURL(file);
    }
  };

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!image) return;
    // Send the image in the request's `data` field, alongside the chat message
    await append(
      {
        role: 'user',
        content: 'Analyze this image',
        id: Date.now().toString(),
      },
      {
        data: { image },
      }
    );
  };

  return (
    <div className="max-w-2xl mx-auto p-4">
      <h1 className="text-2xl font-bold mb-4">Image to Text Converter</h1>
      <form onSubmit={handleSubmit} className="space-y-4">
        <div className="space-y-2">
          <label className="block">
            <span className="text-gray-700">Upload Image</span>
            <input
              type="file"
              accept="image/*"
              onChange={handleImageUpload}
              className="mt-1 block w-full"
            />
          </label>
        </div>
        {image && (
          <div className="mt-4">
            <img
              src={image}
              alt="Uploaded preview"
              className="max-w-md mx-auto rounded"
            />
          </div>
        )}
        <button
          type="submit"
          disabled={!image || isLoading}
          className="px-4 py-2 bg-blue-500 text-white rounded disabled:opacity-50"
        >
          {isLoading ? 'Analyzing...' : 'Analyze Image'}
        </button>
      </form>
      <div className="mt-8 space-y-4">
        {messages.map((message) => (
          <div
            key={message.id}
            className={`p-4 rounded ${
              message.role === 'assistant'
                ? 'bg-gray-100'
                : 'bg-blue-100'
            }`}
          >
            {message.content}
          </div>
        ))}
      </div>
    </div>
  );
}
```
Deployment
- Create a `.env.local` file:

```
ANTHROPIC_API_KEY=your_api_key_here
```
- Deploy to Vercel:

```shell
vercel deploy
```
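Note that `.env.local` values stay on your machine; the key must also exist in the Vercel project's environment variables. One way to add it, assuming the Vercel CLI is installed and linked to the project (you can also use the project settings in the Vercel dashboard):

```shell
vercel env add ANTHROPIC_API_KEY
```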
Conclusion
We’ve built a modern, responsive image-to-text converter using cutting-edge technologies. The application demonstrates the power of combining Claude’s vision capabilities with Next.js and the Vercel AI SDK. This foundation can be extended to build more complex AI-powered image analysis tools.