What is an XML Sitemap?

An XML sitemap is a file that lists the important URLs on a website and tells search engines where to find them. It acts as a direct communication channel between a website and search engine crawlers, providing a structured list of pages and, optionally, the date each one was last modified. Where robots.txt tells crawlers what they should not crawl, an XML sitemap tells them which URLs the site considers important. Submitting a sitemap to Google Search Console is one of the most important first steps in any technical SEO setup.

How does an XML sitemap work?

An XML sitemap works by listing URLs in a structured format that search engine crawlers can read and process independently of how the site is linked. When a crawler visits a site, it follows links to discover pages. That discovery process works well for pages that are well-linked internally, but it is imperfect. Orphaned pages with no inbound links, new pages published recently, and deep pages that are many clicks from the homepage may be missed or crawled infrequently if Google relies solely on link-following to discover them.


An XML sitemap removes that uncertainty. It tells Google explicitly which pages exist on the site and provides additional metadata for each URL: the date the page was last modified, which signals to crawlers whether the page has changed since the last crawl and whether a revisit is needed. That freshness signal is particularly valuable for sites that update content frequently, because it helps Google prioritize recrawling updated pages over pages that have not changed.
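
As a minimal sketch of what such a file contains, the Python below builds a two-URL sitemap using only the standard library. The URLs and the date are placeholders, and build_sitemap is a hypothetical helper name, not part of any platform's tooling. The urlset, url, loc, and lastmod elements come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML for (url, lastmod) pairs; lastmod is a YYYY-MM-DD string or None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # lastmod is optional; include it only when the date is known.
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Placeholder entries for illustration only.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "2024-05-01"),
    ("https://www.example.com/services", None),
])
print(sitemap_xml)
```

Platforms that generate the sitemap automatically produce the same structure; the sketch is only meant to show what crawlers actually read.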


The file typically sits at the root of the domain, accessible at yourdomain.com/sitemap.xml, and is submitted to Google Search Console to ensure Google processes it. Submitting the sitemap in Search Console also provides ongoing monitoring: the Sitemaps report shows when the sitemap was last read, how many URLs were discovered in it, and whether any errors were encountered during processing, and the Page indexing report can be filtered by sitemap to show how many of the submitted URLs were indexed. That data is not available when Google discovers pages through crawling alone.


On Wix, the XML sitemap is generated automatically and updated as pages are published, updated, or removed. When the site is connected to Google Search Console through the Wix SEO Setup Checklist, the sitemap is submitted automatically. Verifying that the sitemap is being processed correctly, that it contains the right URLs, and that the submitted URL count matches the expected number of indexable pages is one of the first checks in any Wix technical SEO audit. For the full indexing setup that the sitemap sits within, the Wix indexing guide covers every configuration step including sitemap verification in Search Console.


What should and should not be in an XML sitemap?

An XML sitemap should contain every URL that Google should index and rank. It should not contain URLs that Google should ignore. That distinction sounds straightforward but is where most sitemap configuration errors occur.


Pages that belong in the sitemap are the ones that have genuine SEO value and are set to index. Service pages, blog posts, product pages, category pages, the homepage, and any other page targeting a specific search query should all be listed. These are the pages that benefit from the crawl priority signal that sitemap inclusion provides, particularly new pages that have not yet accumulated inbound links.


Pages that should not be in the sitemap are those that are set to noindex, those that redirect to other URLs, those that return 404 errors, and those that have no ranking value. A sitemap containing noindex pages sends a contradictory signal: the sitemap says the page is important while the meta tag says do not include it in search results. Google has to reconcile that conflict, which creates unnecessary confusion. A sitemap containing redirect URLs wastes crawl budget on URLs that simply pass the crawler to a destination rather than delivering content. A sitemap with 404 URLs signals poor site maintenance and can delay processing of the valid URLs listed alongside them.
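
The exclusion rules above can be sketched as a small filter. This is an illustrative sketch, not a crawler: it assumes the HTTP status and noindex state of each page have already been collected by some other means, and the Page class, the exclusion_reason function, and the example URLs are all hypothetical. The "no ranking value" rule is an editorial judgment and is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Page:
    url: str
    status: int      # HTTP status returned when the URL is requested directly
    noindex: bool    # True if the page carries a noindex directive

def exclusion_reason(page: Page) -> Optional[str]:
    """Return why a page should stay out of the sitemap, or None if it belongs."""
    if 300 <= page.status < 400:
        return "redirect"
    if page.status == 404:
        return "not found"
    if page.noindex:
        return "noindex"
    return None

# Placeholder data for illustration only.
pages = [
    Page("https://www.example.com/services", 200, False),
    Page("https://www.example.com/old-services", 301, False),
    Page("https://www.example.com/thank-you", 200, True),
    Page("https://www.example.com/deleted", 404, False),
]

sitemap_urls = [p.url for p in pages if exclusion_reason(p) is None]
print(sitemap_urls)
```

Returning a reason rather than a plain yes or no makes it easier to see which rule removed each URL during a review.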


On ecommerce sites, the sitemap question becomes more complex. Product variant URLs, filtered navigation pages, and paginated category pages that are canonicalized to their base URLs should not appear in the sitemap because their canonical configuration already tells Google which URL to treat as authoritative. Including them adds URL count without adding indexation benefit and can dilute the clarity of the sitemap as a signal of site priority.
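
One way to apply that rule, sketched under the assumption that each URL's declared canonical has already been extracted into a mapping: keep only the URLs that are their own canonical. The function name self_canonical_urls and the shop URLs are hypothetical.

```python
def self_canonical_urls(canonical_of):
    """Keep only URLs whose declared canonical is the URL itself."""
    return sorted(url for url, canonical in canonical_of.items() if url == canonical)

# Placeholder mapping of URL -> canonical URL declared on that page.
canonical_of = {
    "https://shop.example.com/shirts": "https://shop.example.com/shirts",
    "https://shop.example.com/shirts?color=blue": "https://shop.example.com/shirts",
    "https://shop.example.com/shirts?page=2": "https://shop.example.com/shirts",
}

sitemap_candidates = self_canonical_urls(canonical_of)
print(sitemap_candidates)
```

In this example the filtered and paginated variants drop out and only the base category URL remains as a sitemap candidate.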


On Wix, the automatically generated sitemap covers main pages and blog posts but may require review for CMS-driven dynamic pages, collection pages, and any pages that have been manually set to noindex. For the Wix-specific sitemap and indexing configuration, the Wix indexing guide covers what to verify and how to fix the most common sitemap-related indexing gaps.

How does an XML sitemap relate to crawl budget?

The relationship between XML sitemaps and crawl budget is direct but often misunderstood. A sitemap does not increase the total crawl budget Google allocates to a domain. What it does is help Google use that budget more efficiently by providing a clear signal of which pages matter most.


Crawl budget is the amount of crawl attention Google allocates to a specific website over a given period. For small sites with fewer than a few hundred pages, crawl budget is rarely a limiting factor. For larger sites, particularly ecommerce stores with thousands of product pages, publishers with large content archives, and any site with URL proliferation from parameters or faceted navigation, crawl budget management determines which pages get crawled frequently and which get neglected.


A well-configured sitemap contributes to crawl efficiency by removing ambiguity about which URLs are the most important. When Google has a clear, accurate sitemap containing only indexable, canonical, live URLs, it can prioritize those pages in its crawl queue rather than spending time discovering and evaluating low-value URLs through link-following alone. Pages listed in the sitemap therefore tend to be crawled more reliably than pages Google has to discover through link-following alone, although sitemap inclusion is a hint rather than a guarantee of crawling or indexing.


A poorly configured sitemap creates the opposite effect. A sitemap containing redirected URLs, noindex pages, 404 errors, and low-value parameter variants tells Google the list is unreliable and reduces how much weight it places on the sitemap as a crawl priority signal. In those cases, Google falls back on its own crawl assessment rather than trusting the sitemap, which means the carefully curated list of important pages may not receive the crawl priority the site owner intended.


For Wix sites with active blog programmes publishing multiple posts per month, a correctly configured sitemap ensures that new content is discovered and indexed quickly rather than waiting for Google to find it through internal links. For the full technical setup that connects sitemap, crawlability, and indexation on Wix, the Wix technical SEO guide covers each element in sequence.

How do you submit and monitor an XML sitemap?

Submitting and monitoring an XML sitemap is one of the most straightforward technical SEO tasks available, and it is one of the highest-return actions for a site that has never done it. The process is consistent across every major platform and takes under fifteen minutes for most business websites.


The starting point is locating the sitemap. On most platforms the sitemap is accessible at yourdomain.com/sitemap.xml. On Wix, the sitemap is automatically generated and accessible at that standard path. On WordPress with Yoast SEO or Rank Math, the sitemap is generated by the plugin and its location is declared in the robots.txt file. On Framer and Webflow, the sitemap is generated automatically by the platform. On Shopify, the sitemap is generated automatically at the standard path and includes separate sections for products, collections, pages, and blog posts.
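
Where the sitemap location is declared in robots.txt, it appears on a line beginning with Sitemap:. The sketch below extracts those lines from robots.txt content that has already been downloaded. The function name sitemap_urls_from_robots and the sample file are illustrative assumptions.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

# Placeholder robots.txt content for illustration only.
sample_robots = """\
User-agent: *
Disallow: /cart
# sitemap declared below
Sitemap: https://www.example.com/sitemap_index.xml
"""

declared = sitemap_urls_from_robots(sample_robots)
print(declared)
```

If the list comes back empty, the standard yourdomain.com/sitemap.xml path is the next place to check.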


Submitting the sitemap to Google Search Console is done through the Sitemaps section under the Indexing menu. Enter the sitemap URL, click submit, and Google begins processing. The Sitemaps report then shows the processing status, how many URLs were discovered in the file, and whether any errors or warnings were flagged. The Page indexing report, filtered to that sitemap, shows how many of the submitted URLs were actually indexed. A significant gap between submitted and indexed URLs is a signal worth investigating. It indicates that Google is finding pages in the sitemap but declining to index them, which points to content quality, canonical configuration, or technical issues on those specific pages.


Monitoring the sitemap is an ongoing task rather than a one-time setup. New pages need to be added to the sitemap, removed pages need to be excluded, and redirect or error URLs that accumulate over time need to be cleaned out. On platforms that generate sitemaps automatically like Wix, Framer, and Shopify, this happens without manual intervention for most standard page types. For CMS-driven pages, custom URL patterns, or pages that are manually set to noindex, a quarterly sitemap review confirms that the file reflects the current intended index structure of the site. For the full diagnosis of sitemap-related indexing issues on Wix, the Wix indexing guide covers the Search Console verification steps in detail.
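
A periodic review of this kind comes down to comparing two lists: the URLs currently in the sitemap and the URLs the site owner intends to have indexed. The sketch below assumes both lists have already been gathered; sitemap_drift and the example URLs are hypothetical.

```python
def sitemap_drift(sitemap_urls, intended_urls):
    """Compare the live sitemap against the intended set of indexable URLs."""
    sitemap_set, intended_set = set(sitemap_urls), set(intended_urls)
    return {
        "missing_from_sitemap": sorted(intended_set - sitemap_set),
        "should_be_removed": sorted(sitemap_set - intended_set),
    }

# Placeholder data for illustration only.
drift = sitemap_drift(
    sitemap_urls=[
        "https://www.example.com/",
        "https://www.example.com/old-offer",
    ],
    intended_urls=[
        "https://www.example.com/",
        "https://www.example.com/new-service",
    ],
)
print(drift)
```

An empty result in both lists means the sitemap still matches the intended index structure.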

How does an XML sitemap differ from an HTML sitemap?

XML sitemaps and HTML sitemaps serve different audiences and different purposes. Both list pages on a website, but the format, location, and function of each are entirely distinct.


An XML sitemap is written for search engine crawlers, not human visitors. It is a machine-readable file containing URLs and metadata that crawlers process to understand what exists on a site and when pages were last updated. Visitors never see it directly. It exists purely to support search engine discovery and crawl efficiency. The file is typically linked from the robots.txt file and submitted to Google Search Console so it can be processed and monitored directly.
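
To illustrate what "machine-readable" means in practice, the sketch below reads the URLs and last-modified dates out of a small sitemap with Python's standard library. The sample XML and the read_sitemap name are assumptions made for the example; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def read_sitemap(xml_text):
    """Return a list of (loc, lastmod) pairs; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries

# Placeholder sitemap content for illustration only.
sample_sitemap = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-05-01</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/first-post</loc>
  </url>
</urlset>
"""

entries = read_sitemap(sample_sitemap)
print(entries)
```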


An HTML sitemap is a page on the website designed for human visitors. It presents a visual overview of the site's structure, listing pages in a navigable format that helps users find content that might not be obvious from the main navigation. HTML sitemaps were more widely used in the early years of SEO when they contributed meaningfully to internal link structure and crawlability. Their SEO value today is modest for most sites because search engines have significantly improved their ability to discover pages through link-following and XML sitemap processing. They still serve a user experience function on large sites with complex content architecture where visitors may need a structured overview to find specific sections.


The practical relationship between the two is that neither replaces the other. An XML sitemap is a technical requirement for any site serious about SEO. An HTML sitemap is a usability decision that depends on the complexity of the site and whether visitors genuinely benefit from a structured content overview. For most service business websites with clear navigation and a manageable number of pages, an HTML sitemap adds marginal value. For large editorial sites, ecommerce stores with extensive category structures, or complex multi-service agency sites, an HTML sitemap supports both user navigation and internal linking depth simultaneously.

When does it make sense to get help with XML sitemap configuration?

XML sitemap setup is one of the most accessible technical SEO tasks for most business owners. On Wix, Framer, Shopify, and Webflow, the sitemap is generated automatically. Submitting it to Google Search Console takes under five minutes. For a new site or a straightforward service business website, that starting point is entirely manageable without specialist involvement.


Where specialist involvement produces results that self-configuration cannot match is diagnosis, accuracy at scale, and the interaction between sitemap configuration and other technical SEO elements.


The most common scenario where specialist help adds clear value is a site where the submitted sitemap URL count in Search Console differs significantly from the expected number of indexable pages. A site with 80 pages where Search Console shows 40 submitted and 20 indexed has a configuration problem that requires investigation across noindex settings, canonical tags, content quality, and sitemap accuracy simultaneously. Identifying which layer is causing the gap, whether the sitemap is listing the wrong URLs, whether pages are correctly set to index, or whether Google is finding and rejecting pages for content reasons, requires a systematic audit rather than a single fix.
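
The arithmetic behind that example can be laid out explicitly. The sketch below splits the gap into its two layers using the 80, 40, and 20 figures from the paragraph above. It assumes every indexed page counted is also one of the submitted pages, and indexation_gaps is a hypothetical helper name.

```python
def indexation_gaps(expected_indexable, submitted, indexed):
    """Split the overall gap into a sitemap layer and an indexing layer.

    Assumes indexed <= submitted <= expected_indexable.
    """
    return {
        "not_in_sitemap": expected_indexable - submitted,
        "submitted_not_indexed": submitted - indexed,
        "indexed_share_of_expected": indexed / expected_indexable,
    }

gaps = indexation_gaps(expected_indexable=80, submitted=40, indexed=20)
print(gaps)
```

Here 40 pages never reached the sitemap and a further 20 were submitted but not indexed, so only a quarter of the expected pages are indexed. The two numbers point to different layers of the audit: the first to sitemap accuracy, the second to noindex, canonical, or content issues.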


Large sites with multiple content types benefit from structured sitemap management that goes beyond the platform defaults. An ecommerce site with product pages, collection pages, blog posts, and landing pages may benefit from segmented sitemaps that allow monitoring by content type in Search Console. A multilingual site needs sitemap configuration that correctly represents each language version with consistent hreflang alignment. These configurations require deliberate planning rather than relying on automatic generation.
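
Segmented sitemaps are usually tied together with a sitemap index file, which the sitemaps.org protocol defines using sitemapindex and sitemap elements. The sketch below builds one for three content-type sitemaps. The file names and the build_sitemap_index helper are assumptions for illustration, and hreflang annotations are not covered here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemap_urls):
    """Return a sitemap index that points at one child sitemap per content type."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="unicode")

# Placeholder child sitemap locations for illustration only.
index_xml = build_sitemap_index([
    "https://shop.example.com/sitemap-products.xml",
    "https://shop.example.com/sitemap-collections.xml",
    "https://shop.example.com/sitemap-blog.xml",
])
print(index_xml)
```

Each child sitemap can then be monitored separately, which is what makes segmentation by content type useful.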


Platform migrations are the clearest trigger for immediate sitemap review. A site that migrates to a new platform needs its sitemap resubmitted in Search Console under the new domain or URL structure. If the old sitemap is still submitted alongside the new one, or if the new sitemap contains old URLs that now redirect, Google receives conflicting signals about what the current site contains. Resolving those conflicts is one of the first post-migration technical tasks.
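
A quick post-migration check is to compare the new sitemap against the redirect map created for the move. The sketch below assumes the redirect map is available as a dictionary of old URL to new URL; stale_sitemap_entries and the example URLs are hypothetical.

```python
def stale_sitemap_entries(new_sitemap_urls, redirects):
    """Return (url, target) pairs for sitemap URLs that now redirect elsewhere."""
    return [(url, redirects[url]) for url in new_sitemap_urls if url in redirects]

# Placeholder migration data for illustration only.
redirects = {
    "https://www.example.com/services.html": "https://www.example.com/services",
    "https://www.example.com/about-us.html": "https://www.example.com/about",
}
new_sitemap_urls = [
    "https://www.example.com/services",
    "https://www.example.com/about-us.html",  # old URL left in the new sitemap
]

stale = stale_sitemap_entries(new_sitemap_urls, redirects)
print(stale)
```

Any pair returned is a sitemap entry that should be replaced with its redirect target before the sitemap is resubmitted.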


We Optimizz includes sitemap verification and submission as part of every technical SEO engagement and every platform migration. If your Search Console is showing indexation gaps or your sitemap data looks inconsistent, the free SEO scan identifies the most visible issues as a starting point. For a deeper diagnosis, book a free discovery call and we will review your sitemap and indexation data live.

Do you need help with your XML Sitemap?

A poorly configured sitemap costs you indexation and crawl efficiency. We Optimizz sets up, verifies, and monitors XML sitemaps across Wix Studio, WordPress, Framer, Webflow, and Shopify — and fixes the gaps that prevent important pages from being indexed. 894 websites delivered across 35+ countries.
