Google Search Console Page Indexing Diagnostics: An Ecommerce Playbook
The Page Indexing report is the most actionable diagnostic Google gives ecommerce stores — but the status messages are cryptic. Here's the practical playbook for resolving Crawled Not Indexed, Discovered Not Indexed, Soft 404s, and Page With Redirect issues on Shopify, WooCommerce, and custom platforms.
Google Search Console's Page Indexing report is the closest thing to a direct line into how Google sees an ecommerce site. Every URL that Google has discovered for the site falls into one of two buckets: indexed or not indexed. The "not indexed" bucket is broken into 20+ status reasons, and each one is a specific diagnostic that points to a specific fix. The trouble is that the status names are written in Google-speak: "Crawled - currently not indexed," "Discovered - currently not indexed," "Page with redirect," "Soft 404." Each one means something different and requires different action.
The patterns below come from auditing hundreds of ecommerce stores. They cover the seven status reasons that account for 90%+ of indexing issues on Shopify, WooCommerce, BigCommerce, and custom-platform stores.
1. Crawled - Currently Not Indexed
Google fetched the page, processed it, and decided not to put it in the index. This is the most common and most diagnostically useful status. It signals one of three things:
- Thin content — the page has too little unique content for Google to consider valuable. Product pages with a one-line description and a few images are the classic case. The fix: add structured product specs, materials, dimensions, care instructions, FAQ blocks, and rich descriptions.
- Near-duplicate content — the page is too similar to other pages on the site. Color variants, size variants, and category pages with overlapping product sets are common offenders. The fix: canonicalize variants, write unique category copy, or consolidate.
- Low perceived authority — the page is on a domain without enough trust signals to merit indexing. The fix is harder — build internal links from indexed pages, earn external backlinks, and improve site-wide content quality.
To diagnose: pick 10-20 URLs from the Crawled Not Indexed report, view them side-by-side, and identify the pattern. If all 20 are product variants of the same parent product, the fix is canonicalization. If all 20 are different products with one-line descriptions, the fix is content depth.
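The pattern-spotting step above can be partly automated. The sketch below assumes the exported URLs are already loaded into a Python list. The helper names `url_pattern` and `top_patterns` are illustrative, not part of any Search Console tooling. It reduces each URL to its first path segment plus the names of its query parameters, then counts the most common shapes.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_pattern(url: str) -> str:
    """Reduce a URL to a coarse pattern: first path segment plus sorted query keys."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    prefix = "/" + segments[0] if segments else "/"
    # Parameter names (not values) often reveal variant or filter URLs.
    keys = sorted({pair.split("=", 1)[0] for pair in parts.query.split("&") if pair})
    return prefix + ("?" + "&".join(keys) if keys else "")


def top_patterns(urls: list[str], n: int = 5) -> list[tuple[str, int]]:
    """Return the n most common URL patterns with their counts."""
    return Counter(url_pattern(u) for u in urls).most_common(n)
```

If one pattern such as `/products?variant` dominates the output, that points toward the canonicalization fix; a spread of distinct product paths points toward content depth.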
2. Discovered - Currently Not Indexed
Google knows the URL exists (from a sitemap, a link, or some other discovery source) but has not yet fetched it. This is a crawl budget issue, not a content quality issue. Common ecommerce causes:
- Faceted navigation URL explosion: filter parameters like ?color=blue&size=M&sort=price-asc generate millions of theoretical URLs. Google sees them in the sitemap or via internal links, tries to schedule a crawl, and never gets around to them.
- Internal search result pages indexed: every search someone runs becomes a potential URL. These should be blocked from indexing entirely.
- Pagination depth: page 50+ of a category listing has low link equity flowing to it and rarely gets crawled.
- Slow server response: if your pages take >2 seconds to respond, Google reduces crawl rate to avoid overloading the server. Fewer crawls means more URLs sitting in Discovered.
The fix sequence: (1) ensure faceted nav URLs are canonicalized or blocked, (2) ensure internal search pages have noindex meta tags, (3) prune the sitemap to only include canonical URLs, (4) improve TTFB to give Google budget headroom.
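Steps (1) and (3) both depend on knowing the canonical form of a faceted URL. As a minimal sketch, assuming the facet parameters are named `color`, `size`, and `sort` (taken from the example above; a real store will have its own list), the following strips those parameters and deduplicates the result for sitemap use. The function names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed facet/sort parameter names; replace with the ones your store uses.
FACET_PARAMS = frozenset({"color", "size", "sort"})


def canonical_url(url: str, facet_params: frozenset[str] = FACET_PARAMS) -> str:
    """Drop facet parameters and the fragment, keeping any other query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in facet_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def sitemap_urls(urls: list[str]) -> list[str]:
    """Deduplicated canonical URLs, in first-seen order, suitable for a sitemap."""
    return list(dict.fromkeys(canonical_url(u) for u in urls))
```

The same `canonical_url` value can feed the canonical tag on each filtered page, so the tag and the sitemap cannot drift apart.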
3. Page With Redirect
This status flags URLs that redirect to other URLs. Two patterns:
- Expected redirects: old URLs you've explicitly redirected to new ones. These should appear in the report and you don't need to act on them.
- Unexpected redirects: URLs that you didn't intend to redirect but are redirecting due to misconfiguration. The classic cases: trailing-slash inconsistency (/product redirecting to /product/), HTTP-to-HTTPS for URLs that already exist on HTTPS, and language/locale redirects fired on URLs that should be language-neutral.
To diagnose: export the list, sort by URL pattern, and look for canonical URLs that appear here. Canonical URLs should never redirect — if they do, you're losing crawl budget on the redirect and confusing Google about which URL is authoritative.
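The check described above amounts to a set intersection. This sketch assumes two lists are available: the URLs exported from the Page With Redirect report, and the URLs you consider canonical (for example, those in the sitemap). `redirecting_canonicals` is a hypothetical helper name.

```python
def normalize(url: str) -> str:
    """Trim whitespace so exported rows compare cleanly."""
    return url.strip()


def redirecting_canonicals(redirect_export: list[str], canonical_urls: list[str]) -> list[str]:
    """Return canonical URLs that also appear in the Page With Redirect export."""
    redirected = {normalize(u) for u in redirect_export if u.strip()}
    canonical = {normalize(u) for u in canonical_urls if u.strip()}
    return sorted(redirected & canonical)
```

Any URL this returns is declared canonical somewhere yet redirects in practice, so either the declaration or the redirect rule needs to change.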
4. Soft 404
The page returned HTTP 200 (OK) but Google thinks it's a "not found" page. Common ecommerce causes:
- Out-of-stock product pages with no content — if your OOS page strips the product details and just shows "Sorry, this is out of stock," Google may treat it as a 404. Keep the product details visible and add a "notify me when back in stock" form.
- Empty category pages — categories with no current products often render an empty grid. Google sees this as a not-found page. Either redirect to a parent category or merge with another category.
- Pagination beyond available products — page 12 of a category with only 8 pages of products. Some platforms render this as an empty page rather than 404-ing.
- JavaScript-rendered content not loading — if the product details only appear after a JavaScript fetch, and that fetch fails or is too slow, Google sees an empty shell. Server-side render or pre-render critical content.
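For the pagination case, one option is to have the listing handler return a real 404 once the requested page runs past the last page of products. The sketch below is platform-agnostic and rests on assumptions: pages are 1-based, and `per_page` defaults to 24, an arbitrary illustrative value. It covers pagination only; an empty category still resolves to a 200 on page 1 and needs the redirect or merge described above.

```python
import math


def listing_status(page: int, product_count: int, per_page: int = 24) -> int:
    """Suggested HTTP status for page `page` (1-based) of a category listing."""
    if page < 1 or per_page < 1:
        return 404
    # An empty category still has a first page; treat page 1 as the last page.
    last_page = max(1, math.ceil(product_count / per_page))
    return 200 if page <= last_page else 404
```

With 8 pages of 24 products, `listing_status(12, 192)` returns 404 rather than rendering an empty grid with a 200.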
5. Duplicate, Google Chose Different Canonical Than User
You set a canonical tag, but Google ignored it and chose a different page as the canonical. This is Google overriding your signal because:
- The page you canonicalized to is too different from this page — Google won't accept a canonical that points to dissimilar content. A canonical from a "red sneakers" product page to a generic "sneakers" category page won't be honored.
- The canonical target has weaker signals than the current page — if the page you're trying to canonicalize away has more backlinks or internal links than the target, Google flips them.
- Conflicting canonical signals — the page's HTML canonical, HTTP header canonical, and sitemap inclusion contradict each other.
The fix: review each case individually. Sometimes Google is right and you should swap which URL is the canonical. Sometimes you have conflicting signals (canonical tag says X, sitemap includes Y, hreflang points to Z) and need to align them.
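Checking for conflicting signals can be scripted once you have a page's HTML, its HTTP Link header value, and the set of sitemap URLs. The following is a rough sketch using only the Python standard library. The function names are hypothetical, the Link header parsing is deliberately simplified, and hreflang is not covered.

```python
import re
from html.parser import HTMLParser


class _CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.href is None:
            self.href = attributes.get("href")


def html_canonical(html: str):
    parser = _CanonicalParser()
    parser.feed(html)
    return parser.href


def header_canonical(link_header: str):
    """Extract the rel="canonical" target from a Link header value (simplified)."""
    match = re.search(r'<([^>]+)>\s*;\s*rel="?canonical"?', link_header or "", re.I)
    return match.group(1) if match else None


def canonical_conflicts(url: str, html: str, link_header: str, sitemap_urls: set) -> list:
    """List mismatches between the HTML tag, the HTTP header, and the sitemap."""
    targets = {t for t in (html_canonical(html), header_canonical(link_header)) if t}
    issues = []
    if len(targets) > 1:
        issues.append("html and header canonicals disagree")
    if targets and url in sitemap_urls and url not in targets:
        issues.append("sitemap lists a URL that canonicalizes elsewhere")
    return issues
```

An empty list means these three signals agree for that URL. It does not say whether Google will accept the canonical, since similarity and link signals still apply.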
6. Alternate Page With Proper Canonical Tag
This is generally a healthy status: variants and parameter URLs that correctly point to a canonical. You don't need to act on these, but you should check that the count stays roughly stable. A sudden 10x increase suggests a new parameter pattern has emerged (often from a newly installed app) and should be reviewed.
7. Excluded by 'noindex' Tag
Verify that everything in this bucket is actually meant to be noindex. Common ecommerce no-indexing patterns:
- Account pages (login, profile, orders)
- Checkout flow pages
- Returns portal pages
- Internal search results
- Filter/sort URLs
- Thank-you and confirmation pages
If product pages, category pages, or blog posts appear here, something is wrong. A common bug: a developer adds noindex during a redesign for the staging site and forgets to remove it before launch. Check robots meta tags on representative pages.
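Spot-checking robots meta tags on representative pages can be done with a small script. This sketch assumes you already have each page's HTML and, optionally, its X-Robots-Tag header value; fetching is left out. `is_noindex` is a hypothetical helper, and it ignores user-agent prefixes in the header for simplicity.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if tag == "meta" and name in {"robots", "googlebot"}:
            content = (attributes.get("content") or "").lower()
            self.directives.update(d.strip() for d in content.split(",") if d.strip())


def is_noindex(html: str, x_robots_tag: str = "") -> bool:
    """True if the page or its X-Robots-Tag header carries noindex (or none)."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    header = {d.strip() for d in x_robots_tag.lower().split(",") if d.strip()}
    # "none" is shorthand for "noindex, nofollow".
    return bool({"noindex", "none"} & (parser.directives | header))
```

Running this over one product page, one category page, and one blog post after each deploy is a cheap guard against a staging noindex reaching production.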
The Page Indexing Diagnostic Workflow
A monthly 20-minute audit:
- Open Search Console → Indexing → Pages
- Note the total Indexed count vs prior month — track the trend
- Click each Not Indexed status reason, export the affected URLs
- For each reason, identify the top URL pattern (15-20 sample URLs is enough)
- Diagnose the cause using the playbook above
- Fix the underlying issue (content, canonicals, robots, sitemap, response codes)
- Request reindexing for high-priority URLs via the URL Inspection tool
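To make the month-over-month comparison less manual, the per-status counts can be recorded each month and compared. The sketch below assumes you keep the counts as plain dictionaries keyed by status name. `status_changes` and its `ratio` threshold of 2.0 are illustrative choices, not Search Console features.

```python
def status_changes(
    previous: dict[str, int], current: dict[str, int], ratio: float = 2.0
) -> list[tuple[str, int, int]]:
    """Return (status, before, after) for statuses whose count grew by `ratio` or more."""
    flagged = []
    for status in sorted(set(previous) | set(current)):
        before = previous.get(status, 0)
        after = current.get(status, 0)
        # max(before, 1) lets a status that newly appears still be flagged.
        if after > before and after >= ratio * max(before, 1):
            flagged.append((status, before, after))
    return flagged
```

A jump in "Alternate page with proper canonical tag" from 100 to 1,000 would be flagged here, which matches the kind of sudden increase worth reviewing.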
The StoreVitals scan catches the on-site signals that drive these GSC statuses — canonical tags, noindex meta tags, robots.txt rules, redirect chains, soft 404 patterns, and thin-content pages. Run a scan before tackling a GSC indexing audit to surface the structural issues first; the GSC report will validate the impact after.