Sorting XML Sitemap URLs by Folder Depth

This can be handy if for example, you only want a list of products and they reside on folders that are 3 “/” deep into your URLs/Domain

For example:

  • Myshop.com/categorypages/subcategorypage/productpage/

I only want the URLs that reside at the third level – i.e. /productpage/

  1. Go to your XML sitemap – usually at Myshop.com/sitemap.xml
  2. Right click and “save as” – save on your computer
  3. Open Excel
  4. Go to the Developer Tab (you might need to add this as it’s not there by default)
  5. Click “Import”
  6. Browse to find your sitemap.xml and import it into Excel
  7. This usually pulls all your URLs into column 1 and other info like priority into separate columns
  8. Delete all the columns except the first one with your URLs in it
  9. Remove the https:// from the URLs with “find and replace” – On “Home” tab under “Find & Select” on the right
  10. In cell B2 add the function:
=LEN(A2)-LEN(SUBSTITUTE(A2,"/",""))

11. Drag the formula down the rest of column B

12. You can now order column B by the number of “/” found in each URL

If different categories have different folder structures then you can conditionally format and use different colours for different categories and then do a multiple criteria sort – by colour, then folder depth (column B)

You can download an example spreadsheet with the formula in here

Excel – Paste Filtered Data into Filtered Cells

Filter the data so that it shows the cells that you want to paste

In the example below, we’ve filtered column B and want to paste in column E

  • Highlight the data in the cells/column you want to copy
  • At the same time, highlight the cells where you want to paste
    – so hold down CTRL so you can select the cells in column B as well as column E
  • Go to the menu/ribbon and on the right of the “Home” tab click “Fill”
  • Now click “Left” or “right” depending on where the column is in relation to the original data

So in the example above, we want to paste the data in a column to the right, so click on “right”

That should be the data pasted:

Video below shows how this is done. The demo begins at about 52 seconds:

Redirects – Check All URLs end with a slash

When doing redirects, for a new site –

sometimes, when you crawl the old/current site you will find a few rogue URLs that start with http: not https: and don’t end with a backslash.

 

If you’re doing a find and replace job, to create the destination URLs, this can mess with your final URLs.

Search “http:” to ensure, that if applicable, all the final URLs use the correct http protocol

Then use the formula below, to check the all end with a backslash

 

=IF(RIGHT(A1,1)=”/”,TRUE,FALSE)

 

Excel_backslash_formula

Excel – Removing the last Character

Let’s say you have a long list of websites, and you need to remove the trailing slash from the end of each of them…you can use this formula to do it:

 

=LEFT(A1,LEN(A1)-1)

 

Trim

This works a lot better than find and replace – especially if you have https:// in the URL still – as find and replace “/” will get rid of the all the “/”