Feb 16, 2023

Do I need to fix all "pages that aren't indexed"? (aka 'Page Indexing Issues')

When looking at the 'Page Indexing' report in Search Console, you can often see a number of URLs in the gray 'Not Indexed' section ... this can look scary.

You may also get emails notifying you that Google has identified pages affected by 'indexing issues', which can likewise make it appear that something needs fixing.

In reality, many/most of them are probably benign and not really 'affecting' the ranking of the website. I.e. the fact that some URLs are non-indexed is not 'dragging down' the ranking of the site as a whole.

However, if a 'real' page is not indexed for whatever reason, then it is not able to get any search traffic, as it won't appear in results!
 
So before panicking, it may be worth looking through the affected URLs and classifying whether they need fixing or not ... there is no 'one size fits all' answer. You kinda have to go through the URLs one-by-one, and determine if it is RIGHT for each not to be indexed. If so, then it can be dropped from further consideration.

  • As you progress, you will probably hit upon 'patterns' (i.e. all similar URLs should be treated the same!), so in reality you shouldn't have to go through literally all URLs. Another complication is that you could have a mix of situations within one 'Reason'. E.g. you may have URLs in the 'Not Found' section - having looked through the URLs, possibly the majority of them are correct as 404 (no further action). But there could be a few in the list that could be fixed. Which is why you need to look at the URLs within a 'Reason', rather than treat each Reason as one issue.
 
 
One tip would be to download the data (within each 'Reason', use the 'Export' function) and use a spreadsheet to keep track of your evaluation process. You do need to export each 'Reason' separately, and you only get up to 1000 URLs per status.
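
If you prefer a script to copy-and-paste, the exports can be merged into one tracking sheet. This is only a rough sketch: it assumes each Reason was exported as CSV with a column named "URL" (check your own export, the header may differ), and the function names and output columns are made up for illustration.

```python
import csv
import io


def build_tracking_rows(exports):
    """Merge per-Reason exports into one list of tracking rows.

    `exports` maps a Reason label (typed in by you) to the CSV text of
    that Reason's export. The "URL" column name is an assumption.
    """
    rows = []
    for reason, csv_text in exports.items():
        for record in csv.DictReader(io.StringIO(csv_text)):
            url = (record.get("URL") or "").strip()
            if url:
                # 'class' and 'notes' start empty, to be filled in by hand
                rows.append({"url": url, "reason": reason, "class": "", "notes": ""})
    return rows


def write_tracking_sheet(rows, out_file):
    """Write the merged rows as CSV, ready to open in a spreadsheet."""
    writer = csv.DictWriter(out_file, fieldnames=["url", "reason", "class", "notes"])
    writer.writeheader()
    writer.writerows(rows)
```

To use it, read each exported file into a string, build the `exports` dict yourself, and pass an open file (e.g. `open("tracking.csv", "w", newline="")`) to `write_tracking_sheet`.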

You could classify each individual URL as one of:
A) Perfectly correct to not be indexed, no further thought needed
B) Maybe not totally correct in this status, but still OK being not indexed
C) Actually a proper and canonical URL for some content, and warrants further attention.

Only URLs in C need looking at!
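
Once you have spotted a few 'patterns', a small script can apply them for you and leave only the unmatched URLs for manual review. A minimal sketch follows; the rules in it are purely hypothetical examples (they assume a site that is canonical on https, with a /cart page and /feed URLs), so replace them with patterns that are true for your own site.

```python
import re

# Hypothetical example rules: (pattern, class). The first match wins.
# These are assumptions for illustration, not rules that suit every site.
RULES = [
    (re.compile(r"^http://"), "A"),   # assumes https:// is the canonical protocol
    (re.compile(r"/cart/?$"), "A"),   # non-content page
    (re.compile(r"/feed/?$"), "B"),   # feeds: fine being not indexed
]


def classify(url, rules=RULES):
    """Return the class of the first matching rule, or '?' if none match."""
    for pattern, label in rules:
        if pattern.search(url):
            return label
    return "?"


def urls_needing_review(urls, rules=RULES):
    """URLs no pattern covers; these still need looking at by hand."""
    return [u for u in urls if classify(u, rules) == "?"]
```

Note that no rule assigns C automatically. Deciding that a URL really is a proper canonical page is left as a manual judgement.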

Some examples of URLs that belong in A:
  • Old URLs - pages no longer active for whatever reason. Then 404 Not Found is the correct status.
  • Junk URLs - sometimes Google can end up crawling URLs that don't represent pages (and never have) - but as long as they present a status like 404, that is all fine.
  • Non-Canonical URLs - In practice there may be multiple URLs to access the same content, including via redirects. It's normal for the non-canonical or 'alternate' URLs to be not-indexed - and the fact that the alternate URL is not indexed does not affect indexing of the page on another URL.
    ... the differences may be subtle, might just be a missing trailing slash, or capitalisation
  • Domain/Protocol Variations - If your site is indexed on say https://www... then it could be quite normal for the http:// and/or non-www URLs to be non-indexed, as they are non-canonical.
  • Non-Content Pages - A site may have a technically functional page that actually has no content to index. An example would be a 'shopping cart' page. It works for users, but has no content for search engines to index. These are often blocked with a 'noindex' tag.
  • Duplicate Pages - Sometimes a site might have multiple pages for the same purpose. E.g. both a category and a tag for the same subject. They may be different content-wise, but do exactly the same thing for users: a list of blog posts.
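
For the non-canonical and domain/protocol variations, a script can help spot not-indexed URLs that are probably just alternates of a URL you already know is indexed. The sketch below is an assumption-heavy helper, not anything Search Console does: it ignores the protocol, a leading "www.", capitalisation and a trailing slash when comparing. Beware that on some servers paths ARE case-sensitive, so treat matches as candidates to check rather than proof.

```python
from urllib.parse import urlsplit


def variation_key(url):
    """Reduce a URL to a key ignoring protocol, 'www.', case and trailing slash."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.lower().rstrip("/")
    return (host, path, parts.query)


def likely_alternates(not_indexed, indexed):
    """Map each not-indexed URL to an indexed URL that shares its key, if any."""
    indexed_by_key = {variation_key(u): u for u in indexed}
    matches = {}
    for url in not_indexed:
        key = variation_key(url)
        if key in indexed_by_key:
            matches[url] = indexed_by_key[key]
    return matches
```

Here `indexed` is whatever list of indexed URLs you have to hand (for example from the 'Indexed' part of the report). Any URL that finds a match is a good candidate for group A.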

Some examples of URLs that belong in B:
  • Low Content Pages - sometimes a site might have pages that are indexable, just that they don't make good landing pages. In practice, it simply doesn't matter that they aren't indexed.
  • Similar Pages - again, may have pages that are effectively the same as some other URL, and while they are their own page, they are not sufficiently different to need indexing.
  • Feeds and Sitemaps - These days few people use search engines to look for RSS feeds, so while Googlebot will sometimes index feeds, there is no real benefit. Similarly, sitemaps can be marked noindex to prevent them cluttering results.
  • Intermediate Pages - e.g. category pages. If item/product pages are indexed, it may not matter that the category pages aren't indexed. Google can just send visitors direct to item pages, without needing the category page indexed.
  • Product Variations - particularly with E-Commerce sites, it is common to have lots of very similar pages, e.g. because there are lots of very similar products. In practice, it's difficult to tell the pages apart. Google may not be able to index them all, as they are not unique and distinctive enough - it wouldn't be able to send users to the right one. Instead Google tries to index a cross sample (users can then find the closest match in Google, and use the site's own search/browse functions to find the exact item they are after).
  • Misclassified URLs - can have URLs technically in the wrong 'Reason'. E.g. might have a URL detected as 'Redirect Error', but actually the redirect is functioning fine. But even if Google classified it 'correctly', it would just be in 'Page with Redirect', which is also 'not indexed'. So while it would be more 'correct' classified as something else, it is still 'not indexed', so there is no real benefit to trying to get it moved, and you can put it in group B :)
(You can see above there is lots of overlap between A and B, and there can be a fine dividing line between them. In practice, as you are only looking at C, it doesn't really matter which of A/B you use!)


As for C, a key point before declaring a particular URL as canonical is to check the content isn't simply indexed on another URL. I.e. even though you might consider the URL canonical, and hence the one that should be indexed, Google might have stumbled upon some other alternate URL and indexed that. In that case, because the page is indexed elsewhere, you can treat the URL as B.



... so having whittled down to a hopefully smaller list of just URLs in 'C' status, you can then start thinking about what, if anything, to do about them. The exact solution will vary depending on the cause and what 'Reason' Google has put the URL in.
(with a reminder that just because you consider the URL as C, Google might have another opinion and effectively put it in A or B!)

There are tips about progressing with URLs from each Reason in the official documentation.

If still stuck, and want help figuring out what to do with URLs you put into C, then start a new thread here. Don't forget to include some sample URLs, the exact 'Reason' Google has classed them under for not being indexed, and, if you have made any recent changes to try to address it, how long ago.
Last edited Feb 16, 2023
All Replies
Aug 13, 2024
Supplemental question:

Do I need to use the 'Validate Fix' button in the 'Page Indexing' report?

... validation can be useful to check if a change you have made 'worked'.
But as noted above, a lot of the time, URLs are already in the right (or at least acceptable) status, and no fix is needed.

  • Validation is typically used to confirm that an action (like fixing something) has resulted in the expected change. Validation would be appropriate if you made a change (fixing an error) and expect the page's status to reflect that change (like becoming operational again).

Here is an example:

Many websites prefer to use https:// as their main address (often called the "canonical" URL) for security reasons. When someone tries to access the site using the non-secure http:// version, the server will often redirect them to the https:// version automatically. This explains why http://example.com/ might be flagged as a "Page with redirect."

This is a situation where validation wouldn't be useful.

For a URL that's already functioning correctly (e.g. http:// redirecting to https://), validation isn't necessary. There's no change expected, so validation would likely fail simply because the status remained the same.

In short, validation is best suited for situations where a change is made and its success needs to be verified. It's not ideal for confirming the existing state of something that's already working as intended.

Or more specifically, there are two scenarios when 'Validate Fix' is not appropriate - for slightly different reasons.

  • The URL(s) are already in the correct status (like the above example of non-canonical URLs) - A in the classification outlined above. As nothing is broken, there is nothing to fix. If you do use validation anyway, it is likely to 'Fail'.

  • The URL is OK not being indexed (B in the above list). You COULD use validation to check that the status changes to the correct status. But in practice it is not worth it. The URL(s) would still be non-indexed anyway. But more critically, because the URLs are already non-indexed, they are going to be very low priority for recrawling (validation checks that the status changes when Google next crawls) - so validation will be very slow.

... so you should really only use Validation if all the URLs are in group C above, and you have performed a fix.
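
If you kept a tracking sheet with an A/B/C class per URL, that rule of thumb can be written down as a tiny check. This is just the advice above restated as code; the function name and inputs are made up for illustration.

```python
def should_validate(classes, fix_applied):
    """Rule of thumb: validate only if a fix was made and every URL is class C.

    `classes` is the list of A/B/C labels you gave the URLs under one Reason.
    """
    return fix_applied and bool(classes) and all(c == "C" for c in classes)
```

For example, a Reason whose URLs are a mix of A and C would return False. In that case validation would likely 'Fail' on the A URLs even if the fix for the C URLs worked.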

Last edited Aug 13, 2024