Sitemap Validator

Parse any sitemap.xml or sitemap-index file with DOMParser and validate its structure against the sitemaps.org protocol. Flags missing <loc>, invalid lastmod, bad changefreq, out-of-range priority, duplicates, more than 50,000 URLs, and files larger than 50 MB. Extracts every URL with TSV export. 100% offline.

What is the Sitemap Validator?

Sitemap Validator parses any sitemap.xml or sitemap-index file with DOMParser (browser-native and safe: no script execution) and validates the structure against the sitemaps.org protocol. It distinguishes between `<urlset>` (a list of URLs) and `<sitemapindex>` (a list of child sitemaps), validates the recommended `xmlns` attribute, and walks every entry.

Per-URL checks:

  • Missing or empty `<loc>` (error).
  • `<loc>` not an absolute URL (warning).
  • `<loc>` using http:// instead of https:// (info).
  • `<loc>` over 2,048 characters (warning).
  • Invalid `<lastmod>`, meaning not in the W3C Datetime profile of ISO 8601 that sitemaps.org specifies (warning).
  • Invalid `<changefreq>`, meaning not one of always/hourly/daily/weekly/monthly/yearly/never (warning).
  • Out-of-range `<priority>`, which must be 0.0-1.0 (warning).
  • Duplicate `<loc>` values (warning).

File-level checks:

  • Empty document (error).
  • No root element or the wrong root element (error).
  • More than 50,000 URLs (warning; the sitemaps.org per-sitemap limit, which Google also enforces).
  • File larger than 50 MB (warning; the per-sitemap size limit, measured uncompressed).

Every URL is extracted to a filterable table with TSV export.
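The `<loc>` checks described above can be sketched as a small pure function. This is an illustrative sketch, not the tool's actual source: the names `Issue`, `checkLoc`, and `MAX_LOC_LENGTH` are hypothetical, and the "absolute URL" test is simplified to "starts with a scheme followed by `://`".

```typescript
type Severity = "error" | "warning" | "info";

interface Issue {
  severity: Severity;
  message: string;
}

// Length threshold taken from the description above.
const MAX_LOC_LENGTH = 2048;

// Hypothetical helper: checks one <loc> value.
// `seen` collects every value checked so far so duplicates can be flagged.
function checkLoc(loc: string, seen: Set<string>): Issue[] {
  const value = loc.trim();
  if (value === "") {
    // Nothing else can be checked without a URL.
    return [{ severity: "error", message: "missing or empty <loc>" }];
  }

  const issues: Issue[] = [];
  if (!/^[a-z][a-z0-9+.-]*:\/\//i.test(value)) {
    // Simplified assumption: absolute means "scheme://...".
    issues.push({ severity: "warning", message: "<loc> is not an absolute URL" });
  } else if (/^http:\/\//i.test(value)) {
    issues.push({ severity: "info", message: "<loc> uses http:// instead of https://" });
  }
  if (value.length > MAX_LOC_LENGTH) {
    issues.push({ severity: "warning", message: "<loc> is longer than 2,048 characters" });
  }
  if (seen.has(value)) {
    issues.push({ severity: "warning", message: "duplicate <loc>" });
  }
  seen.add(value);
  return issues;
}
```

Passing the same `Set` to every call is what makes duplicate detection work across the whole file; a fresh set per call would never report a duplicate.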

How to use it

  1. Paste your sitemap.xml content — `<urlset>` or `<sitemapindex>`.
  2. Read the issue list: errors (red) are blocking problems; warnings (amber) suggest fixes; infos (blue) are advisories.
  3. Browse the extracted URLs in the table — filter by substring.
  4. Copy all URLs as a newline-separated list or download TSV for spreadsheet review.
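As a rough illustration of step 4, the two export formats could be produced like this. The `SitemapRow` shape and the function names are assumptions made for this sketch; the column order follows the loc / lastmod / changefreq / priority layout the page describes.

```typescript
// Hypothetical row shape: one entry per extracted URL.
interface SitemapRow {
  loc: string;
  lastmod?: string;
  changefreq?: string;
  priority?: string;
}

const TSV_COLUMNS = ["loc", "lastmod", "changefreq", "priority"] as const;

// Tabs and line breaks inside a value would break the row layout,
// so they are collapsed to a single space. Missing values become empty cells.
function tsvCell(value: string | undefined): string {
  return (value ?? "").replace(/[\t\r\n]+/g, " ");
}

// Header line followed by one tab-separated line per row.
function toTsv(rows: SitemapRow[]): string {
  const header = TSV_COLUMNS.join("\t");
  const lines = rows.map((row) =>
    TSV_COLUMNS.map((col) => tsvCell(row[col])).join("\t"),
  );
  return [header, ...lines].join("\n");
}

// Plain newline-separated URL list.
function toUrlList(rows: SitemapRow[]): string {
  return rows.map((row) => row.loc).join("\n");
}
```

Keeping empty cells for missing fields means every line has the same number of columns, which is what spreadsheets expect when importing TSV.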

Benefits

  • DOMParser-based — safe parsing, no script execution.
  • Distinguishes `<urlset>` (URL list) and `<sitemapindex>` (sitemap-of-sitemaps).
  • Validates the recommended `xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"` attribute.
  • Per-URL checks: missing/empty <loc>, absolute URL, https vs http, length, ISO 8601 lastmod, valid changefreq, 0.0-1.0 priority, duplicates.
  • File-level checks: > 50,000 URLs and > 50 MB warnings (the sitemaps.org per-sitemap limits, which Google also enforces).
  • Every URL extracted to a filterable, sortable table.
  • TSV export with loc / lastmod / changefreq / priority for spreadsheet review.
  • Issue colour-coding: errors red, warnings amber, infos blue.
  • Runs 100% in your browser — your sitemap is never uploaded.

Frequently asked questions

What's the difference between urlset and sitemapindex?

A `<urlset>` is a regular sitemap: a list of `<url>` entries, each with a `<loc>` (the URL). A `<sitemapindex>` is a sitemap-of-sitemaps: a list of `<sitemap>` entries, each with a `<loc>` pointing to another sitemap. Index files are how large sites split a sitemap into multiple files (each file is capped at 50,000 URLs).
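One way to tell the two apart is to look at the root element of the parsed document. The sketch below is an assumption about how such a check could look, not the tool's code; `DocLike`, `RootLike`, and `classifySitemap` are hypothetical names. The function only needs the root's `localName` and `namespaceURI`, so it accepts any object with that shape, including a browser `Document`.

```typescript
type SitemapKind = "urlset" | "sitemapindex" | "unknown";

const SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

// Minimal structural view of a parsed XML document.
interface RootLike {
  localName: string;
  namespaceURI: string | null;
}
interface DocLike {
  documentElement: RootLike | null;
}

function classifySitemap(doc: DocLike): { kind: SitemapKind; hasNamespace: boolean } {
  const root = doc.documentElement;
  if (root === null) {
    return { kind: "unknown", hasNamespace: false };
  }
  let kind: SitemapKind = "unknown";
  if (root.localName === "urlset") {
    kind = "urlset";
  } else if (root.localName === "sitemapindex") {
    kind = "sitemapindex";
  }
  // The xmlns attribute shows up as the root element's namespace URI.
  return { kind, hasNamespace: root.namespaceURI === SITEMAP_NS };
}
```

In a browser this could be called as `classifySitemap(new DOMParser().parseFromString(xml, "application/xml"))`. Any other root element, including the error document DOMParser produces for malformed XML, falls through to `"unknown"`.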

Why is my <lastmod> flagged as invalid?

Sitemaps.org requires W3C Datetime format, a profile of ISO 8601. Acceptable: `2026-05-31` (date only) or `2026-05-31T14:30:00Z` (date and time with a time zone designator). Not acceptable: `31/05/2026`, `May 31 2026`, `2026-5-31`. The issue message names the offending value and its URL.
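A minimal sketch of such a check, assuming only the two forms listed above are accepted (a complete date, or a date and time with a time zone designator). W3C Datetime also permits year-only and year-month values, which this sketch deliberately leaves out. `isValidLastmod` and `LASTMOD_PATTERN` are hypothetical names.

```typescript
// YYYY-MM-DD, optionally followed by Thh:mm[:ss[.fraction]] and a
// required time zone designator (Z or +hh:mm / -hh:mm).
const LASTMOD_PATTERN =
  /^(\d{4})-(\d{2})-(\d{2})(?:T(\d{2}):(\d{2})(?::(\d{2})(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2}))?$/;

function isValidLastmod(value: string): boolean {
  const m = LASTMOD_PATTERN.exec(value.trim());
  if (m === null) return false;

  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  // Day 0 of the following month is the last day of this month.
  const daysInMonth = new Date(Date.UTC(year, month, 0)).getUTCDate();
  if (month < 1 || month > 12 || day < 1 || day > daysInMonth) return false;

  // Time groups are undefined for date-only values; treat them as 0.
  const hour = Number(m[4] ?? "0");
  const minute = Number(m[5] ?? "0");
  const second = Number(m[6] ?? "0");
  return hour <= 23 && minute <= 59 && second <= 59;
}
```

The regular expression alone would accept impossible dates such as `2026-02-30`, which is why the numeric range checks follow it. The time zone offset itself is only checked for shape, not for range.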

What are valid changefreq values?

always, hourly, daily, weekly, monthly, yearly, never. Anything else is a warning. Note that Google now mostly ignores changefreq; it crawls based on observed behaviour rather than your hints.
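The check amounts to membership in a fixed list. In this sketch the comparison is exact and lowercase, on the assumption that the validator follows the protocol's lowercase spelling; `CHANGEFREQ_VALUES` and `isValidChangefreq` are hypothetical names.

```typescript
const CHANGEFREQ_VALUES = [
  "always",
  "hourly",
  "daily",
  "weekly",
  "monthly",
  "yearly",
  "never",
] as const;

type Changefreq = (typeof CHANGEFREQ_VALUES)[number];

// Type guard: narrows a raw string to the Changefreq union when it matches.
function isValidChangefreq(value: string): value is Changefreq {
  return (CHANGEFREQ_VALUES as readonly string[]).includes(value);
}
```

For example, `isValidChangefreq("weekly")` passes, while `"fortnightly"` and `"Daily"` do not.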

What does Google do with <priority>?

Mostly ignore it. Priority was meant to suggest relative importance, but it's been gamed too much. Modern Google uses other signals. Still — invalid values (outside 0.0-1.0) are spec violations.
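A range check for `<priority>` could look like the sketch below. `isValidPriority` is a hypothetical name, and requiring a plain decimal (no sign, no exponent) is an assumption of this sketch.

```typescript
function isValidPriority(value: string): boolean {
  const text = value.trim();
  // Require a plain decimal such as "0.5" or "1".
  // Without this, Number("") would evaluate to 0 and pass the range check.
  if (!/^\d+(\.\d+)?$/.test(text)) return false;
  const n = Number(text);
  return n >= 0 && n <= 1;
}
```

With this, `"0.5"`, `"0"`, and `"1.0"` are accepted, while `"1.5"`, `"-0.1"`, `"high"`, and an empty value are rejected.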

Why is a sitemap with more than 50,000 URLs a problem?

This is the per-sitemap limit in the sitemaps.org protocol, which Google also enforces. Beyond that, split the URLs into multiple sitemaps and reference them from a `<sitemapindex>`. Many CMSes do this automatically when you cross the threshold.
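If your CMS does not split sitemaps for you, the process is mechanical: cut the URL list into chunks of at most 50,000 and list the resulting files in an index. The sketch below assumes you decide the child sitemap URLs yourself; `chunkUrls`, `escapeXml`, and `buildSitemapIndex` are hypothetical helper names.

```typescript
const MAX_URLS_PER_SITEMAP = 50_000;

// Splits a URL list into consecutive chunks of at most `size` entries.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_SITEMAP): string[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

// URLs placed in XML must have these five characters entity-escaped.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds a <sitemapindex> document that points at the child sitemap files.
function buildSitemapIndex(childSitemapUrls: string[]): string {
  const entries = childSitemapUrls.map(
    (url) => `  <sitemap>\n    <loc>${escapeXml(url)}</loc>\n  </sitemap>`,
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</sitemapindex>",
  ].join("\n");
}
```

Each chunk would then be written out as its own `<urlset>` file, and the index lists one `<sitemap>` entry per file.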

And the 50 MB limit?

This is the per-sitemap file size limit: 50 MB uncompressed, i.e. 52,428,800 bytes. If your sitemap has long URLs and rich metadata, you may hit it before the 50,000-URL count. The fix is the same: split into multiple sitemaps.
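The limit is measured in bytes, not characters, so a string's `length` can under-count when URLs contain non-ASCII text. A minimal sketch, assuming the sitemap is UTF-8 encoded as the protocol requires; `utf8ByteLength` and `exceedsSizeLimit` are hypothetical names.

```typescript
// 50 MB expressed in bytes.
const MAX_SITEMAP_BYTES = 50 * 1024 * 1024;

// Size of the text once encoded as UTF-8.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

function exceedsSizeLimit(xml: string): boolean {
  return utf8ByteLength(xml) > MAX_SITEMAP_BYTES;
}
```

Encoding the whole string allocates a second copy of it in memory, which is acceptable for a one-off check on a pasted file but worth knowing for very large inputs.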

Does it check that the URLs actually return 200?

No — that requires HTTP requests, which we deliberately avoid (you'd hit CORS, rate limits, and we'd need a backend). For URL liveness, use the Broken Link Checker tool or run a CLI like `wget --spider` against the URL list.

What if my sitemap is gzipped (.xml.gz)?

Currently you need to gunzip locally first. A future version may add browser-side gunzip via the Compression Streams API.

Can I validate a sitemap by URL?

Not in this version — paste the XML content. Direct URL fetch would hit CORS on most sitemap servers and require a backend.

Does it understand video / news / image sitemap extensions?

Partially — we don't reject those, but we don't validate their specific tags either. The standard <url>, <loc>, <lastmod>, <changefreq>, <priority> are checked; extension tags pass through.

Is anything uploaded?

No. DOMParser-based parsing and all validation run entirely in your browser.
