robots.txt in Next.js — Controlling Search Engine Access to Your Site

The robots.txt file in Next.js defines which parts of your site search engine crawlers can access. You can create it as a static file or generate it dynamically using a special Route Handler. You can also customize rules for specific bots like Googlebot or Bingbot. This article explains how to define the file, its structure, and advanced configuration options.

robots.txt • MetadataRoute • crawler rules • sitemap

~2 min read • Updated Oct 29, 2025

1. What Is robots.txt?


The robots.txt file follows the Robots Exclusion Protocol and tells search engine crawlers (like Googlebot or Bingbot) which URLs they may crawl and which they should stay away from. In Next.js (App Router), the file lives at the root of the app directory.
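
Concretely, both the static and the generated versions of the file sit directly inside app/, next to your root layout. A rough sketch of a typical project layout (file names other than robots.txt/robots.ts are just the usual Next.js conventions; use either the static or the generated file, not both):

my-app/
  app/
    layout.tsx
    page.tsx
    robots.txt   ← static file (Section 2), or
    robots.ts    ← generated with code (Section 3)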


2. Defining a Static File


You can manually create app/robots.txt with rules like:

User-Agent: *
Allow: /
Disallow: /private/
Sitemap: https://acme.com/sitemap.xml
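
To confirm what crawlers will actually receive, you can request the file from a running dev server. A minimal sketch, assuming the default Next.js dev address of http://localhost:3000 (adjust as needed) and Node 18+, which ships fetch built in:

// check-robots.mts — a quick check; run with: npx tsx check-robots.mts
const res = await fetch('http://localhost:3000/robots.txt')
console.log(res.status)        // expect 200
console.log(await res.text())  // should print the rules shown above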

3. Generating robots.txt with Code


Use the special robots.ts Route Handler:

// app/robots.ts
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/private/',
    },
    sitemap: 'https://acme.com/sitemap.xml',
  }
}

Output:

User-Agent: *
Allow: /
Disallow: /private/
Sitemap: https://acme.com/sitemap.xml
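
A practical benefit of the code-based version is that nothing has to be hard-coded. The sketch below is one way to do this, assuming a NEXT_PUBLIC_SITE_URL environment variable holds the site's base URL (that variable name is an assumption, not part of the example above):

// app/robots.ts
import type { MetadataRoute } from 'next'

// Assumed environment variable, e.g. NEXT_PUBLIC_SITE_URL=https://acme.com
const baseUrl = process.env.NEXT_PUBLIC_SITE_URL ?? 'https://acme.com'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/private/',
    },
    // The sitemap URL now follows whatever domain the deployment uses
    sitemap: `${baseUrl}/sitemap.xml`,
  }
}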

4. Customizing Specific Crawlers


You can define rules for individual bots using an array:

// app/robots.ts
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: ['/'],
        disallow: '/private/',
      },
      {
        userAgent: ['Applebot', 'Bingbot'],
        disallow: ['/'],
      },
    ],
    sitemap: 'https://acme.com/sitemap.xml',
  }
}

Output:

User-Agent: Googlebot
Allow: /
Disallow: /private/
User-Agent: Applebot
Disallow: /
User-Agent: Bingbot
Disallow: /
Sitemap: https://acme.com/sitemap.xml

5. Robots Object Structure


The MetadataRoute.Robots type supports the following fields (a combined sketch follows the list):

  • rules: userAgent, allow, disallow, crawlDelay
  • sitemap: URL or array of sitemap URLs
  • host: optional host domain
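
A minimal sketch that combines all three top-level fields; the crawl delay, the second sitemap URL, and the host value below are illustrative assumptions rather than part of the earlier examples:

// app/robots.ts
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        disallow: '/private/',
        crawlDelay: 10, // advisory; many crawlers ignore Crawl-delay
      },
    ],
    // A single URL or an array of URLs is accepted
    sitemap: [
      'https://acme.com/sitemap.xml',
      'https://acme.com/blog/sitemap.xml',
    ],
    host: 'https://acme.com', // optional preferred host
  }
}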

Conclusion


The robots.txt file in Next.js gives you precise control over how search engine crawlers access your site. Whether you define it statically or generate it with code, it lets you steer well-behaved crawlers away from private routes, point them to your sitemap, and tailor rules for individual bots.


Written & researched by Dr. Shahin Siami