Semantic Search & SEO – Using Python and Google Colab

Semantic search adds context and meaning to search results. For example, if someone is searching for “Lego” – do they want to buy Lego toys, or see a Lego movie or TV show (Ninjago is great)? Another example might be “Tesla” – do people want to see the latest self-driving car, or learn more about Tesla the scientist and inventor?

  • Make sure you understand search intent and any confusing searches like Tesla (inventor or car?), Jaguar (car or animal?), etc.
  • Look for structured data opportunities
  • Optimise internal links – especially if you are using a “Pillar Post” and “Cluster Page” structure
  • Follow traditional on page SEO best practices with headers, meta titles, alt tags etc

SMA Marketing have done a cool YouTube video about Semantic Search and they recommend tools including:

  • Wordlift
  • Frase
  • Advanced Custom Fields for WordPress
  • Google Colab with spaCy

Before you publish a post – look at the search results for the keyword(s) you are optimising the post for. Check in incognito in Chrome to remove most of the personalisation of the results.

For any answer boxes or snippets, you can click the “3 dots” to get information about the results:

As well as the snippets, you can click the 3 dots next to any organic result. Here’s another result for “MMA training program pdf” with some additional information:

With this in mind – if you are looking to rank for “MMA training program pdf” then you will want to include the search terms highlighted in the “About this result” box: mma, training, program, pdf and ideally LSI keywords “workout” and “plan”.
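If you want to check a draft against those terms before publishing, here is a minimal Python sketch. The function name `missing_terms` and the sample draft are made up for illustration; the term list is the one from the example above. It only looks for exact whole-word matches, so treat it as a rough check rather than anything Google actually does.

```python
import re


def missing_terms(page_text, terms):
    """Return the terms that do not appear as whole words in page_text."""
    # Lower-case the copy and split it into simple alphanumeric words.
    words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    return [t for t in terms if t.lower() not in words]


# Terms from the "About this result" example, plus the two related terms.
target_terms = ["mma", "training", "program", "pdf", "workout", "plan"]

# Hypothetical draft copy, purely for illustration.
draft = "Download our free MMA training program as a PDF."

print(missing_terms(draft, target_terms))
```

Anything the function returns is a term you might want to work into the copy, assuming it reads naturally.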

It’s also a good idea to scroll down to the bottom of the SERP and check out the “related searches”

Take a look too at any breadcrumb results that pull through below the organic listings. Combining all this information will give you a good idea as to what Google understands by your search query and what people are looking for too.

Semantic Search & NLP

This is a bit techy, but thankfully, the guy at SMA Marketing (thank you if you’re reading this) has put together a file/load of python code that does most of the work for us. You can find it here – https://colab.research.google.com/drive/1PI6JBn06i3xNUdEuHZ9xKPG3oSRi1AUm?usp=sharing#scrollTo=uatWEoHp5nxZ

Hover over [1] and click the play icon that appears (highlighted yellow in screenshot below)

When that section has finished loading and refreshing, scroll down to the “Installation tensorflow + transformers + pipelines” section and click the play icon there.

When that’s finished doing its thing, scroll down again, and add your search query to the uQuery_1: section:

Add your query and then press the “play” button on the left hand side opposite the uQuery_1 line

You should then see the top 10 organic results from Google on the left hand side – in the form of a list of URLs

Next, you can scrape all the results by scrolling down to the “Scraping results with Trafilatura” section and hover over the “[ ]” and press play again:

Next, when the scraping of results is done – scroll down to “Analyze terms from the corpus of results” section and click the play button that appears when you hover over “[ ]”

Next, when that’s done, click the play button on the section full of code starting with:

df_1['top_result'] = ['Top 3' if x <= 3 else 'Positions 4 – 10' for x in df_1['position']] # add top_result = True when position <=3

Finally – scroll down and click the play button on the left of the “Visualizing the Top Results” section.

On the right hand side where it says “Top Top 3” and lists a load of keywords/terms – these are frequent and meaningful (apparently) terms used in the top 3 results for your search term.

Below that, you can see the terms used in the results from 4-10

Terms at the top of the graph are used frequently in the top 3 results e.g. “Mini bands”

Terms on the right are used frequently by the results in positions 4-10

From the graph above, I can see that for the search term “resistance bands” the top 3 results are using some terms, not used by 4-10 – including “Mini bands”, “superbands” “pick bodylastics”

  • If you click on a term/keyword in the graph – a ton of information appears just below:

e.g. if I click “mini bands”


It’s interesting that “mini bands” is not featured at all in the results positioned 4-10

If you were currently ranking in position 7 for example, you’d probably want to look at adding “mini bands” into your post or product page
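To get a feel for what the comparison is doing, here is a rough Python sketch that finds two-word terms used in the top results but not in the lower ones. This is not the code from the Colab notebook (that has its own term extraction); the function names and sample texts are made up for illustration.

```python
import re


def bigrams(text):
    """Return the set of two-word terms found in a piece of text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {" ".join(pair) for pair in zip(words, words[1:])}


def terms_only_in_top(top_texts, lower_texts):
    """Two-word terms that appear in the top results but not the lower ones."""
    top = set().union(*(bigrams(t) for t in top_texts))
    lower = set().union(*(bigrams(t) for t in lower_texts))
    return sorted(top - lower)


# Hypothetical page copy standing in for scraped results.
top_3 = ["Mini bands are great", "We tested mini bands"]
positions_4_to_10 = ["Resistance bands are great"]

print(terms_only_in_top(top_3, positions_4_to_10))
```

A real version would want to ignore filler phrases, which is why the pinch of salt still applies.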

You can now go to the left-side-bar and click “Top 25 Terms” and click the “play icon” to refresh the data:


Obviously – use your experience etc and take the results with a pinch of salt – some won’t be relevant.

Natural Language Processing

Next, click on “Natural Language Processing” in the side-menu

Click the “play” icons next to the section starting df_entity =df_1[df_1['position'] < 6] and the section below.

When they have finished running click the play icon next to “Extracting Entities”

Click “play” on the “remove duplicates” section and again on the “Visualising Data” section

This should present you with a colourful table, with more terms and keywords – although for me most of the terms weren’t relevant in this instance 😦

You can also copy the output from the “Extracting the content from Top 5” section:

Then paste it into the DEMO/API for NLP that Google have created here:

https://cloud.google.com/natural-language#section-2

You can then click the different tabs/headings and get some cool insights


Remember to scroll right down to the bottom, as you’ll find some additional insights about important terms and their relevance

The Google NLP API is pretty interesting. You can also copy and paste your existing page copy into it, and see what Google categorises different terms as, and how “salient” or important/relevant it thinks each term is. For some reason, it thinks “band” is an organisation in the above screenshot. You can look to improve the interpretations by adding relevant contextual copy around the term on the page, and by using schema and internal links.

Quickly Scrape SERP Results for SEO

Go to Google.com

Press CTRL+D to bookmark the page

Add it to your bookmarks bar if possible

Copy the code below:

javascript:var a = document.getElementsByTagName('a'), arr = '';for(var i=0; i<a.length; i++) if (a[i].ping && !a[i].href.includes('google'))arr +=('<p>' + a[i].href + '</p>');var newWindow = window.open();newWindow.document.write(arr);newWindow.document.close();


Right click on the bookmark and click “Edit”

Where it says “URL” paste in the code

Rename the bookmark “URL Extractor”

Save the bookmark

That’s it!

Test the URL Extractor – do a Google Search and then click on the Bookmark you’ve just created

You should now have a new tab with all of the URLs listed in it

Change your Google Search settings to show 100 results – you can just copy and paste the tab full of extracted URLS into Google Sheets

Once in Google Sheets, you can easily extract the Meta Titles, Descriptions and keywords

You just need to find and replace the URL in these formulas:

=importxml("https://blackbeltwhitehat.com","//title")
=importxml("https://blackbeltwhitehat.com/","//meta[@name='description']/@content")
=importxml("https://blackbeltwhitehat.com/","//meta[@name='keywords']/@content")

=IMPORTXML("https://blackbeltwhitehat.com/","//h1")
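If you would rather do the same extraction outside Google Sheets, here is a minimal Python sketch using only the standard library. The class name `MetaExtractor` and the sample HTML are made up for illustration; in practice you would feed it the HTML you have downloaded for each URL.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title>, meta description and <h1> text from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.h1 = []
        self._current = None  # tag whose text we are currently collecting

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._current = "title"
        elif tag == "h1":
            self._current = "h1"
            self.h1.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1[-1] += data


# Hypothetical page source, purely for illustration.
sample_html = (
    "<html><head><title>My Page</title>"
    '<meta name="description" content="A short summary.">'
    "</head><body><h1>Hello there</h1></body></html>"
)

parser = MetaExtractor()
parser.feed(sample_html)
print(parser.title, "|", parser.description, "|", parser.h1)
```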

Data Studio Fields & Filters – Notes

Once you’ve created a field – add it to your chart as a Dimension

Mixed Case URLs

Having a mix of URL cases in the letters can faff with your data as D.S. might think each capitalisation variation is a new URL:

  • Click on the Data Tab to the right – then “Create new field” or “Add a Field”
  • Name the field “Lower Source” type in “LOWER(“
  • Then add “Source” from “Available Fields” to the left
  • Click “Save” and add “Lower Source” to the table as a Dimension
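To see why this matters, here is a small Python sketch of the same idea. The rows are made-up sample data; the point is that the two capitalisations of one URL only get counted together once they are lower-cased, which is what the LOWER field does in the report.

```python
from collections import Counter

# Hypothetical (source, sessions) rows with mixed-case URLs.
rows = [("/Blog/SEO-Tips", 10), ("/blog/seo-tips", 5), ("/contact", 2)]

raw = Counter()      # what you get without normalising
lowered = Counter()  # what you get after lower-casing

for source, sessions in rows:
    raw[source] += sessions
    lowered[source.lower()] += sessions

print(dict(raw))
print(dict(lowered))
```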

Concatenate Data

  • Create or “Add a Field”
  • In the Formula box type “CONCAT” and then select the fields you want to use
  • Close the formula with the final “)” and save it

Internal Site Searches – Extract Search Query

Make the data look nicer and get the search term on its own in the table.

  • Select “Add a Field”
  • Use REGEXP_EXTRACT formula to pull out the search terms and get rid of “/search?q”
  • Use the REGEX shown below:
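As a rough sketch of the extraction in Python: the pattern below is my own assumption, based on internal search URLs that look like /search?q=term, so adjust it to match your site. The function name `extract_query` is made up, and turning “+” back into spaces is a Python extra that the regex alone does not do.

```python
import re
from urllib.parse import unquote_plus

# Assumed URL shape: /search?q=<term>, optionally followed by &more=params
SEARCH_PATTERN = re.compile(r"/search\?q=([^&]+)")


def extract_query(page_path):
    """Return the search term from an internal search URL, or None."""
    match = SEARCH_PATTERN.search(page_path)
    if not match:
        return None
    # Decode "+" and %-escapes so the term reads like the original search.
    return unquote_plus(match.group(1))


print(extract_query("/search?q=boxing+gloves&page=2"))
print(extract_query("/about-us"))
```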

Search Queries – Pull Out Questions – What, Why, When?

  • Create / Add a New Field
  • Add REGEX as shown below and save
  • Add New Field as a Dimension to the table
  • Create a table/data Filter so that you include only table rows that equate to “True” in the new dimension/column
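Here is a minimal Python sketch of the “is this a question?” check. The pattern and the list of question words are assumptions for illustration, so swap in whichever words matter for your reports.

```python
import re

# Assumed question words; matches only when the query starts with one of them.
QUESTION_PATTERN = re.compile(r"^(who|what|why|when|where|how)\b", re.IGNORECASE)


def is_question(query):
    """True if the search query starts with a question word."""
    return bool(QUESTION_PATTERN.search(query.strip()))


queries = ["what is semantic search", "resistance bands", "How to wrap hands"]
print([q for q in queries if is_question(q)])
```

The `\b` stops words like “whatever” or “however” being counted as questions.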

CASE Formulas in Data Studio

CASE formulas are basically “If this, then do that” formulas

When X happens, Then do Y

You can use the CASE Formula to classify and group channels together

For example:

WHEN query matches “who”, then display text “Who?” in the table

  • In the Formula – you need a “catch all” default for when nothing is true in the criteria
  • If the search doesn’t contain “who, what, why” etc. then:
    ELSE “others”

    So if the search term, doesn’t match any of the REGEX criteria – classify it as “others”
  • Save the CASE statement
  • Add the new field – to the table or report
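The same “first match wins, otherwise fall back” logic can be sketched in Python. The rules and labels other than “Who?” and “others” are assumptions for illustration, and the function name `classify` is made up.

```python
import re

# Ordered rules: the first pattern that matches decides the label,
# just like the WHEN lines in a CASE statement.
RULES = [
    (re.compile(r"\bwho\b", re.IGNORECASE), "Who?"),
    (re.compile(r"\bwhat\b", re.IGNORECASE), "What?"),
    (re.compile(r"\bwhy\b", re.IGNORECASE), "Why?"),
]


def classify(query):
    """Return the label of the first matching rule, or the catch-all."""
    for pattern, label in RULES:
        if pattern.search(query):
            return label
    return "others"  # the ELSE branch


print(classify("who is the best boxer"))
print(classify("boxing gloves"))
```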

Date Ranges & Filter Controls

Taken me ages to work this out – I’ve only just twigged:

You can add Comparisons, Search Boxes, Drop Down Menus and all sorts to your Data Studio Reports

For controls and filters to work – you need the chart or table to have “Default Date Range” set to “Auto”

See my post about REGEX for SEO here.

Ta

If you want a pre-made Data Studio template, you can send me some money and hope for the best.

How to Create a Topic Cluster for SEO

Using Ahrefs with Wikipedia

  • Find the most relevant Wikipedia page, for example:

https://en.wikipedia.org/wiki/Association_football

  • Add the Wikipedia URL and click “submit”
  • Click “Copy to Clipboard”
  • Paste all the keywords/topics into a Google Sheet and remove any that aren’t relevant
  • In Ahrefs (or a similar tool like SEMRush’s KW Manager) – go to KW explorer
  • Paste in the keywords/topics
  • In Ahrefs – click “matching terms” then “Questions”

Using SEMRush with UGC

  • Copy the homepage URL of a UGC* website such as Pinterest or Quora
  • Put the URL in the main “search bar” in SEMRush
  • Go to “Organic Research” using the sidebar
  • Click “Positions”
  • Add a Keyword to the “Filter by Keyword” search box
  • Click a relevant Keyword in the table of results
  • Under “Questions” click “View all….”
  • Go back and click other relevant keywords
  • Click the “Questions” tab again – make a note or download all the relevant questions
  • Organise KWs and questions into main pillar posts and “cluster content”

See Hubspot’s article about Pillar Pages and Cluster Content – here

*UGC – User Generated Content

Google Ads Reporting Script

This script is no longer live, but I had it in my account, I’m pasting it here for safekeeping and future reference as it is pretty handy:

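To use the script:

Change SPREADSHEET_URL near the top of the script to your own Google Sheet URL, and set the email address in the sheet’s “email” named range (the script reads the recipient from there rather than from RECIPIENT_EMAIL)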

  • Sign into Google Ads
  • Click on “Tools & Settings” on the top menu near the right hand side
  • Click “scripts”
  • Click the addition (+) sign to add a new script
  • Paste in the below script – save – preview and then run

// Copyright 2015, Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

/**
 * @name Account Summary Report
 *
 * @overview The Account Summary Report script generates an at-a-glance report
 *     showing the performance of an entire Google Ads account. See
 *     https://developers.google.com/google-ads/scripts/docs/solutions/account-summary
 *     for more details.
 *
 * @author Google Ads Scripts Team [adwords-scripts@googlegroups.com]
 *
 * @version 1.1
 *
 * @changelog
 * - version 1.1
 *   - Add user-updateable fields, and ensure report row ordering.
 * - version 1.0.4
 *   - Improved code readability and comments.
 * - version 1.0.3
 *   - Added validation for external spreadsheet setup.
 * - version 1.0.2
 *   - Fixes date formatting bug in certain timezones.
 * - version 1.0.1
 *   - Improvements to time zone handling.
 * - version 1.0
 *   - Released initial version.
 */

var RECIPIENT_EMAIL = 'drewgriffiths@live.com';

var SPREADSHEET_URL = 'https://docs.google.com/spreadsheets/d/1mNjc7iJWOIq580DMLf6rYKT8xRAcl2MynnGSY29UiXY/edit#gid=3';

/**
 * Configuration to be used for running reports.
 */
var REPORTING_OPTIONS = {
  // Comment out the following line to default to the latest reporting version.
  apiVersion: 'v201809'
};

/**
 * To add additional fields to the report, follow the instructions at the link
 * in the header above, and add fields to this variable, taken from the Account
 * Performance Report reference:
 * https://developers.google.com/adwords/api/docs/appendix/reports/account-performance-report
 */
var REPORT_FIELDS = [
  {columnName: 'Cost', displayName: 'Cost'},
  {columnName: 'AverageCpc', displayName: 'Avg. CPC'},
  {columnName: 'Ctr', displayName: 'CTR'},
  {columnName: 'Impressions', displayName: 'Impressions'},
  {columnName: 'Clicks', displayName: 'Clicks'}
];

function main() {
  Logger.log('Using spreadsheet - %s.', SPREADSHEET_URL);
  var spreadsheet = validateAndGetSpreadsheet();
  spreadsheet.setSpreadsheetTimeZone(AdsApp.currentAccount().getTimeZone());
  spreadsheet.getRangeByName('account_id_report').setValue(
      AdsApp.currentAccount().getCustomerId());

  var yesterday = getYesterday();
  var date = getFirstDayToCheck(spreadsheet, yesterday);

  var rows = [];
  var existingDates = getExistingDates();

  while (date.getTime() <= yesterday.getTime()) {
    if (!existingDates[date]) {
      var row = getReportRowForDate(date);
      rows.push([new Date(date)].concat(REPORT_FIELDS.map(function(field) {
        return row[field.columnName];
      })));
      spreadsheet.getRangeByName('last_check').setValue(date);
    }
    date.setDate(date.getDate() + 1);
  }

  if (rows.length > 0) {
    writeToSpreadsheet(rows);

    var email = spreadsheet.getRangeByName('email').getValue();
    if (email) {
      sendEmail(email);
    }
  }
}

/**
 * Retrieves a lookup of dates for which rows already exist in the spreadsheet.
 *
 * @return {!Object} A lookup of existing dates.
 */
function getExistingDates() {
  var spreadsheet = validateAndGetSpreadsheet();
  var sheet = spreadsheet.getSheetByName('Report');

  var data = sheet.getDataRange().getValues();
  var existingDates = {};
  data.slice(5).forEach(function(row) {
    existingDates[row[1]] = true;
  });
  return existingDates;
}

/**
 * Sorts the data in the spreadsheet into ascending date order.
 */
function sortReportRows() {
  var spreadsheet = validateAndGetSpreadsheet();
  var sheet = spreadsheet.getSheetByName('Report');

  var data = sheet.getDataRange().getValues();
  var reportRows = data.slice(5);
  if (reportRows.length) {
    reportRows.sort(function(rowA, rowB) {
      if (!rowA || !rowA.length) {
        return -1;
      } else if (!rowB || !rowB.length) {
        return 1;
      } else if (rowA[1] < rowB[1]) {
        return -1;
      } else if (rowA[1] > rowB[1]) {
        return 1;
      }
      return 0;
    });
    sheet.getRange(6, 1, reportRows.length, reportRows[0].length)
        .setValues(reportRows);
  }
}

/**
 * Append the data rows to the spreadsheet.
 *
 * @param {Array<Array<string>>} rows The data rows.
 */
function writeToSpreadsheet(rows) {
  var access = new SpreadsheetAccess(SPREADSHEET_URL, 'Report');
  var emptyRow = access.findEmptyRow(6, 2);
  if (emptyRow < 0) {
    access.addRows(rows.length);
    emptyRow = access.findEmptyRow(6, 2);
  }
  access.writeRows(rows, emptyRow, 2);
  sortReportRows();
}

function sendEmail(email) {
  var day = getYesterday();
  var yesterdayRow = getReportRowForDate(day);
  day.setDate(day.getDate() - 1);
  var twoDaysAgoRow = getReportRowForDate(day);
  day.setDate(day.getDate() - 5);
  var weekAgoRow = getReportRowForDate(day);

  var html = [];
  html.push(
    '<html>',
      '<body>',
        '<table width=800 cellpadding=0 border=0 cellspacing=0>',
          '<tr>',
            '<td colspan=2 align=right>',
              "<div style='font: italic normal 10pt Times New Roman, serif; " +
                  "margin: 0; color: #666; padding-right: 5px;'>" +
                  'Powered by Google Ads Scripts</div>',
            '</td>',
          '</tr>',
          "<tr bgcolor='#3c78d8'>",
            '<td width=500>',
              "<div style='font: normal 18pt verdana, sans-serif; " +
              "padding: 3px 10px; color: white'>Account Summary report</div>",
            '</td>',
            '<td align=right>',
              "<div style='font: normal 18pt verdana, sans-serif; " +
              "padding: 3px 10px; color: white'>",
               AdsApp.currentAccount().getCustomerId(), '</div>',
            '</td>',
            '</tr>',
          '</table>',
          '<table width=800 cellpadding=0 border=0 cellspacing=0>',
            "<tr bgcolor='#ddd'>",
              '<td></td>',
              "<td style='font: 12pt verdana, sans-serif; " +
                  'padding: 5px 0px 5px 5px; background-color: #ddd; ' +
                  "text-align: left'>Yesterday</td>",
              "<td style='font: 12pt verdana, sans-serif; " +
                  'padding: 5px 0px 5px 5px; background-color: #ddd; ' +
                  "text-align: left'>Two Days Ago</td>",
              "<td style='font: 12pt verdana, sans-serif; " +
                  'padding: 5px 0px 5px 5px; background-color: #ddd; ' +
                  "text-align: left'>A week ago</td>",
            '</tr>');
  REPORT_FIELDS.forEach(function(field) {
    html.push(emailRow(
        field.displayName, field.columnName, yesterdayRow, twoDaysAgoRow,
        weekAgoRow));
  });
  html.push('</table>', '</body>', '</html>');
  MailApp.sendEmail(email, 'Google Ads Account ' +
      AdsApp.currentAccount().getCustomerId() + ' Summary Report', '',
      {htmlBody: html.join('\n')});
}

function emailRow(title, column, yesterdayRow, twoDaysAgoRow, weekAgoRow) {
  var html = [];
  html.push('<tr>',
      "<td style='padding: 5px 10px'>" + title + '</td>',
      "<td style='padding: 0px 10px'>" + yesterdayRow[column] + '</td>',
      "<td style='padding: 0px 10px'>" + twoDaysAgoRow[column] +
          formatChangeString(yesterdayRow[column], twoDaysAgoRow[column]) +
          '</td>',
      "<td style='padding: 0px 10px'>" + weekAgoRow[column] +
          formatChangeString(yesterdayRow[column], weekAgoRow[column]) +
          '</td>',
      '</tr>');
  return html.join('\n');
}


function getReportRowForDate(date) {
  var timeZone = AdsApp.currentAccount().getTimeZone();
  var dateString = Utilities.formatDate(date, timeZone, 'yyyyMMdd');
  return getReportRowForDuring(dateString + ',' + dateString);
}

function getReportRowForDuring(during) {
  var report = AdsApp.report(
      'SELECT ' +
          REPORT_FIELDS
              .map(function(field) {
                return field.columnName;
              })
              .join(',') +
          ' FROM ACCOUNT_PERFORMANCE_REPORT ' +
          'DURING ' + during,
      REPORTING_OPTIONS);
  return report.rows().next();
}

function formatChangeString(newValue,  oldValue) {
  var x = newValue.indexOf('%');
  if (x != -1) {
    newValue = newValue.substring(0, x);
    var y = oldValue.indexOf('%');
    oldValue = oldValue.substring(0, y);
  }

  var change = parseFloat(newValue - oldValue).toFixed(2);
  var changeString = change;
  if (x != -1) {
    changeString = change + '%';
  }

  if (change >= 0) {
    return "<span style='color: #38761d; font-size: 8pt'> (+" +
        changeString + ')</span>';
  } else {
    return "<span style='color: #cc0000; font-size: 8pt'> (" +
        changeString + ')</span>';
  }
}

function SpreadsheetAccess(spreadsheetUrl, sheetName) {
  this.spreadsheet = SpreadsheetApp.openByUrl(spreadsheetUrl);
  this.sheet = this.spreadsheet.getSheetByName(sheetName);

  // what column should we be looking at to check whether the row is empty?
  this.findEmptyRow = function(minRow, column) {
    var values = this.sheet.getRange(minRow, column,
        this.sheet.getMaxRows(), 1).getValues();
    for (var i = 0; i < values.length; i++) {
      if (!values[i][0]) {
        return i + minRow;
      }
    }
    return -1;
  };
  this.addRows = function(howMany) {
    this.sheet.insertRowsAfter(this.sheet.getMaxRows(), howMany);
  };
  this.writeRows = function(rows, startRow, startColumn) {
    this.sheet.getRange(startRow, startColumn, rows.length, rows[0].length).
        setValues(rows);
  };
}

/**
 * Gets a date object that is 00:00 yesterday.
 *
 * @return {Date} A date object that is equivalent to 00:00 yesterday in the
 *     account's time zone.
 */
function getYesterday() {
  var yesterday = new Date(new Date().getTime() - 24 * 3600 * 1000);
  return new Date(getDateStringInTimeZone('MMM dd, yyyy 00:00:00 Z',
      yesterday));
}

/**
 * Returned the last checked date + 1 day, or yesterday if there isn't
 * a specified last checked date.
 *
 * @param {Spreadsheet} spreadsheet The export spreadsheet.
 * @param {Date} yesterday The yesterday date.
 *
 * @return {Date} The date corresponding to the first day to check.
 */
function getFirstDayToCheck(spreadsheet, yesterday) {
  var last_check = spreadsheet.getRangeByName('last_check').getValue();
  var date;
  if (last_check.length == 0) {
    date = new Date(yesterday);
  } else {
    date = new Date(last_check);
    date.setDate(date.getDate() + 1);
  }
  return date;
}

/**
 * Produces a formatted string representing a given date in a given time zone.
 *
 * @param {string} format A format specifier for the string to be produced.
 * @param {date} date A date object. Defaults to the current date.
 * @param {string} timeZone A time zone. Defaults to the account's time zone.
 * @return {string} A formatted string of the given date in the given time zone.
 */
function getDateStringInTimeZone(format, date, timeZone) {
  date = date || new Date();
  timeZone = timeZone || AdsApp.currentAccount().getTimeZone();
  return Utilities.formatDate(date, timeZone, format);
}

/**
 * Validates the provided spreadsheet URL to make sure that it's set up
 * properly. Throws a descriptive error message if validation fails.
 *
 * @return {Spreadsheet} The spreadsheet object itself, fetched from the URL.
 */
function validateAndGetSpreadsheet() {
  if ('YOUR_SPREADSHEET_URL' == SPREADSHEET_URL) {
    throw new Error('Please specify a valid Spreadsheet URL. You can find' +
        ' a link to a template in the associated guide for this script.');
  }
  var spreadsheet = SpreadsheetApp.openByUrl(SPREADSHEET_URL);
  var email = spreadsheet.getRangeByName('email').getValue();
  if ('foo@example.com' == email) {
    throw new Error('Please either set a custom email address in the' +
        ' spreadsheet, or set the email field in the spreadsheet to blank' +
        ' to send no email.');
  }
  return spreadsheet;
}

Bob’s your uncle

The code was originally on the Google Developer’s site here – https://developers.google.com/adwords/scripts/docs/solutions/ad-performance

Please Google, don’t bankrupt me for using the script on here! I’ll take it down if it upsets thee.

SEO Technical Audit Checklist (advanced)

The idea of technical SEO is to minimise the work of bots when they come to your website to index it on Google and Bing. Look at the build, the crawl and the rendering of the site.

Tools Required:

  • SEO Crawler such as Screaming Frog or DeepCrawl
  • Log File Analyzer – Screaming Frog has this too
  • Developer Tools – such as the ones found in Google Chrome – View>Developer>Developer Tools
  • Web Developer Toolbar – giving you the ability to turn off Javascript
  • Search Console
  • Bing Webmaster Tools – shows you geotargetting behaviour, gives you a second opinion on security etc.
  • Google Analytics – With onsite search tracking *

    *Great for tailoring copy and pages. Just turn it on and add the search query parameter

Tech SEO 1 – The Website Build & Setup

The website setup – a neglected element of many SEO tech audits.

  • Storage
    Do you have enough storage for your website now and in the near future? You can work this out by taking your average page size (times 1.5 to be safe), multiplied by the number of pages and posts, multiplied by (1 + growth rate/100).

For example, a site with an average page size of 1MB, 500 pages and an annual growth rate of 50%:

1MB × 1.5 × 500 × 1.5 = 1,125MB of storage required for the year.
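The same sum as a small Python sketch, in case you want to plug in your own numbers. The function name and parameter names are made up for illustration. Note that a growth multiplier of 1.5 corresponds to a growth rate of 50% under the formula (1 + growth rate/100).

```python
def storage_needed_mb(avg_page_mb, pages, growth_pct, safety=1.5):
    """Estimate storage for the year in MB.

    avg_page_mb: average page size in MB
    pages:       number of pages and posts
    growth_pct:  expected annual growth as a percentage (50 means 50%)
    safety:      safety factor applied to the page size
    """
    return avg_page_mb * safety * pages * (1 + growth_pct / 100)


print(storage_needed_mb(avg_page_mb=1, pages=500, growth_pct=50))
```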

You don’t want to be held to ransom by a webhost, because you have gone over your storage limit.

  • How is your site Logging Data?
    Before we think about web analytics, think about how your site is storing data.
    As a minimum, your site should be logging the date, the request, the referrer, the response and the User Agent – this is inline with the W3 Extended Format.

When, what it was, where it came from, how the server responded and whether it was a browser or a bot that came to your site.
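To show what those fields look like in practice, here is a minimal Python sketch that pulls them out of one log line. It assumes the common Apache/Nginx “combined” layout rather than the W3C Extended format, so your own logs may need a different pattern; the sample line is made up.

```python
import re

# Assumed layout: IP, two ident fields, [date], "request", status, bytes,
# "referrer", "user agent" (the Apache/Nginx "combined" format).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of the logged fields, or None if the line doesn't fit."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log line, purely for illustration.
sample = (
    '66.249.66.1 - - [10/Nov/2021:06:25:14 +0000] "GET /blog/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"'
)

entry = parse_log_line(sample)
print(entry["date"], entry["request"], entry["status"], entry["user_agent"])
```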

  • Blog Post Publishing
    Can authors and copywriters add meta titles, descriptions and schema easily? Some websites require a ‘code release’ to allow authors to add a meta description.
  • Site Maintenance & Updates – Accessibility & Permissions
    Along with the meta stuff – how much access does each user have to the code and backend of a website? How are permissions built in?
    This could and probably should be tailored to each team and their skillset.

    For example, can an author of a blog post easily compress an image?
    Can the same author update a menu (often not a good idea)
    Who can access the server to tune server performance?

Tech SEO 2 – The Crawl

  • Google Index

Carry out a site: search and check the number of pages compared to a crawl with Screaming Frog.

With a site: search (for example, search in Google for site:businessdaduk.com) – don’t trust the number of pages that Google tells you it has found, scrape the SERPs using Python or Link Clump:

Too many or too few URLs being indexed – both suggest there is a problem.
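Once you have both lists of URLs, comparing them is a simple set operation. Here is a minimal Python sketch; the function name and the example URLs are made up, and it assumes you have already exported the scraped site: results and the crawl as plain lists.

```python
def compare_index_to_crawl(indexed_urls, crawled_urls):
    """Show which URLs appear in one list but not the other."""
    # Strip trailing slashes so /page and /page/ are treated as the same URL.
    indexed = {u.rstrip("/") for u in indexed_urls}
    crawled = {u.rstrip("/") for u in crawled_urls}
    return {
        "indexed_not_crawled": sorted(indexed - crawled),
        "crawled_not_indexed": sorted(crawled - indexed),
    }


# Hypothetical URL lists, purely for illustration.
indexed = ["https://example.com/", "https://example.com/old-page"]
crawled = ["https://example.com", "https://example.com/new-page/"]

print(compare_index_to_crawl(indexed, crawled))
```

URLs that are indexed but not crawled are often old or orphaned pages; URLs that are crawled but not indexed are the ones to investigate first.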

  • Correct Files in Place – e.g. Robots.txt
    Check these files carefully. Google says spaces are not an issue in Robots.txt files, but many coders and SEOers suggest this isn’t the case.

XML sitemaps also need to be correct, in place and submitted to Search Console. Be careful with the <lastmod> tag: lots of websites have lastmod but don’t update it when they update a page or post.
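A quick way to eyeball the lastmod values is to list them next to each URL. Here is a minimal Python sketch using the standard library; the function name and the sample sitemap are made up for illustration, and it assumes a standard sitemap (not a sitemap index).

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_lastmods(xml_text):
    """Return {url: lastmod or None} for every <url> entry in a sitemap."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        if loc is None:
            continue  # skip malformed entries with no <loc>
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        result[loc.strip()] = lastmod.strip() if lastmod else None
    return result


# Hypothetical sitemap, purely for illustration.
sample_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2021-11-01</lastmod></url>
  <url><loc>https://example.com/blog/</loc></url>
</urlset>
"""

print(sitemap_lastmods(sample_sitemap))
```

If every page shows the same date, or dates older than your last edits, the lastmod values probably aren’t being maintained.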

  • Response Codes
    Checking response codes with a browser plugin or Screaming Frog works 99% of the time, but to go next level, try using curl and command line. Curl avoids JS and gives you the response header.

Type in curl -I and then the URL

e.g.

curl -I https://businessdaduk.com/

You need to download cURL which can be a ball ache if you need IT’s permission etc.

Anyway, if you do download it and run curl, your response should look like this:

Next enter an incorrect URL and make sure it results in a 404.

  • Canonical URLs
    Each ‘resource’ should have a single canonical address.

Common causes of canonical issues include sharing URLs/shortened URLs, tracking URLs and product option parameters.
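To see how tracking URLs create duplicates of the same resource, here is a minimal Python sketch that strips tracking parameters and leaves the rest of the URL alone. The list of parameters is an assumption for illustration (add your own), and the function name is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; extend it to suit your setup.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "fbclid",
}


def strip_tracking(url):
    """Return the URL without tracking parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


print(strip_tracking(
    "https://example.com/shoes?colour=red&utm_source=news&gclid=abc"
))
```

Running your crawled or logged URLs through something like this shows how many “different” URLs are really one page.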

The best way to check for any canonical issues is to check crawling behaviour and do this by checking log files.

You can check log files and analyse them with Screaming Frog – the first 1,000 log events can be analysed with the free version (at time of writing).

Most of the time, your host will have your logfiles in the cPanel section, named something like “Raw Access”. The files are normally zipped with gzip, so you might need a piece of software to unzip them or just allow you to open them – although often you can still just drag and drop the files into Screaming Frog.

The Screaming Frog log file analyser, is a different download to the SEO site crawler – https://www.screamingfrog.co.uk/log-file-analyser/

If the log files are in the tens of millions, you might need to go next level nerd and use grep in Linux command line
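If grep isn’t your thing, Python can stream through a big gzipped log one line at a time without loading it all into memory. This is a minimal sketch: the function name is made up, it assumes the request is the first quoted field on each line (as in the common “combined” format), and it only matches on the text “Googlebot”, so it won’t spot fake bots spoofing the user agent.

```python
import gzip
from collections import Counter


def googlebot_hits(log_path):
    """Count requests per URL path for lines that mention Googlebot."""
    hits = Counter()
    # "rt" reads the gzip file as text, one line at a time.
    with gzip.open(log_path, "rt", encoding="utf-8", errors="replace") as log:
        for line in log:
            if "Googlebot" not in line:
                continue
            quoted = line.split('"')
            if len(quoted) > 1:
                request = quoted[1].split()  # e.g. ["GET", "/blog/", "HTTP/1.1"]
                if len(request) >= 2:
                    hits[request[1]] += 1
    return hits


# Usage (hypothetical file name):
# for path, count in googlebot_hits("access.log.gz").most_common(20):
#     print(count, path)
```

If the same page shows up under lots of slightly different URLs, that is usually the canonical problem showing itself.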

Read more about all things log file analysis-y on Ian Lurie’s Blog here.

This video tutorial about Linux might also be handy. I’ve stuck it on my brother’s old laptop. Probably should have asked first.

With product IDs, and other URL fragments, use a # instead of a ? to add tracking.

Using rel-canonical is a hint, not a directive. It’s a work around rather than a solution.

Remember also, that the server header, can override a canonical tag.

You can check your server headers using this tool – http://www.rexswain.com/httpview.html (at your own risk like)


Tech SEO 3 – Rendering & Speed

  • Lighthouse
    Use Lighthouse, but run it from the command line or in a browser with no add-ons. If you are not into Linux, use Pingdom, GTmetrix and Lighthouse, ideally in a browser with no add-ons.

    Look out for too much code, but also invalid code. This might include things such as image alt tags, which aren’t marked up properly – some plugins will display the code just as ‘alt’ rather than alt=”blah”
  • Javascript
    Despite what Google says, all the SEO professionals whose work I follow state that client-side JS is still a site speed problem and potential ranking factor. Only use JS if you need it and use server-side JS.

    Use a browser add-on that lets you turn off JS and then check that your site is still fully functional.

  • Schema

Finally, possibly in the wrong place down here – but use Screaming Frog or Deepcrawl to check your schema markup is correct.

You can add schema using the Yoast or Rank Math SEO plugins

The Actual Tech SEO Checklist (Without Waffle)

Basic Setup

  • Google Analytics, Search Console and Tag Manager all set up

Site Indexation

  • Sitemap & Robots.txt set up
  • Check appropriate use of robots tags and x-robots
  • Check site: search URLs vs crawl
  • Check internal links pointing to important pages
  • Check important pages are only 1 or 2 clicks from homepage
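For the click-depth check, here is a minimal Python sketch that works out how many clicks each page is from the homepage using a breadth-first search. The link map is made-up sample data; in practice you would build it from a crawl export of internal links.

```python
from collections import deque


def click_depths(links, home):
    """Return {page: minimum number of clicks from the homepage}."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link map: page -> pages it links to.
site_links = {
    "/": ["/blog/", "/shop/"],
    "/blog/": ["/blog/post-1/"],
    "/blog/post-1/": ["/shop/product-a/"],
    "/shop/": [],
}

depths = click_depths(site_links, "/")
print({page: d for page, d in depths.items() if d > 2})
```

Anything printed is more than 2 clicks deep; pages missing from the result entirely can’t be reached from the homepage at all.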

Site Speed

Tools – Lighthouse, GTMetrix, Pingdom

Check – Image size, domain & http requests, code bloat, Javascript use, optimal CSS delivery, code minification, browser cache, reduce redirects, reduce errors like 404s.

For render blocking JS and stuff, there are WordPress plugins like Autoptimize and the W3 Total Cache.

Make sure there are no unnecessary redirects, broken links or other shenanigans going on with status codes. Use Search Console and Screaming Frog to check.

Site UX

Mobile Friendly Test, Site Speed, time to interactive, consistent UX across devices and browsers

Consider adding breadcrumbs with schema markup.

Clean URLs


Make sure URLs include a keyword, are short, and use a dash/hyphen (-) between words

Secure Server HTTPS

Use a secure server, and make sure the unsecure version redirects to it

Allow Google to Crawl Resources

Google wants to crawl your external CSS and JS files. Use the URL Inspection tool in Search Console (the replacement for “Fetch as Google”) to check what Googlebot sees.

Hreflang Attribute

Check that you are using and implementing hreflang properly.

Tracking – Make Sure Tag Manager & Analytics are Working

Check tracking is working properly. You can check tracking code is on each webpage with Screaming Frog.

Internal Linking

Make sure your ‘money pages’ or most profitable pages, get the most internal links

Content Audit

Redirect or unpublish thin content that gets zero traffic and has no links. **Note on this: I had decent content that had no visits. I updated the H1 with a celebrity’s name and now it’s one of my best performing pages, so it’s not always a good idea to delete zero-traffic pages.**

Consider combining thin content into an in depth guide or article.

Use Search Console to see what keywords your content ranks for, what new content you could create (based on those keywords) and where you should point internal links.

Use Google Analytics data regarding internal site searches for keyword and content ideas 💡

Update old content

Fix meta titles and meta description issues – including low CTR

Find & Fix KW cannibalization

Optimize images – compress, alt text, file name

Check proper use of H1 and H2

See what questions etc. are pulled through into the rich snippets and answer these within your content.

Do you have E-A-T? Expertise, Authoritativeness and Trustworthiness?

https://www.semrush.com/blog/seo-checklist/

You can download a rather messy Word Doc Template of my usual SEO technical checklist here:

https://smallbusinessdad.files.wordpress.com/2021/11/drewseotemplate.docx

It uses Screaming Frog, SEMRush and Search Console

SEO – Use Search Console to Create Blog Posts that Rank

Go to Search Console

  • Click “Performance” in the side bar
  • Click “Position”
  • Click “Pages” (near the bottom-third of the page on the left)
  • Click on a high-performing post in terms of Impressions and Clicks in Google
  • With the specific page/post selected, click on queries
  • Make a note of all relevant queries in the top 100
  • See if these queries can be added to the ranking post
  • Find any queries that are not directly related to your post
  • Create a new post specifically about this/these queries (if you rank for it without a specific post – you’ll rank better with a specific post for that query)
  • In the original post – put an internal link to the new post
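
If you export the queries for a page, a crude first pass at sorting them can be done in Python. This is a sketch only: it just checks whether every word of a query already appears in the post, and the `post_text` and `queries` values are made up for illustration. You would still eyeball the results.

```python
import re

def split_queries(queries, post_text):
    """Split queries into ones the post already covers and ones it does not."""
    post_words = set(re.findall(r"[a-z0-9]+", post_text.lower()))
    covered, uncovered = [], []
    for query in queries:
        words = re.findall(r"[a-z0-9]+", query.lower())
        if all(word in post_words for word in words):
            covered.append(query)
        else:
            uncovered.append(query)
    return covered, uncovered

# Hypothetical post copy and Search Console queries
post_text = "A free MMA training program you can download as a PDF."
queries = ["mma training program pdf", "mma training program", "bjj diet plan"]

covered, uncovered = split_queries(queries, post_text)
print(uncovered)  # ['bjj diet plan']
```

Queries in `uncovered` are the ones to either work into the existing post or, if they are off-topic for it, give a post of their own.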

Building Your Brand – Why Blog, Why Use Display Ads etc?

“You’ll never get yourself off the treadmill of paid ads, if you don’t build your brand”

Someone on a Search Podcast, 2019


It’s very easy to dismiss online content, blogs, image assets and even display ads as pretty much useless – because you don’t have the instant gratification of seeing leads and/or sales.

This is completely understandable; especially if you have a background in sales – where your job has been to ‘finish off the lead’ and get a sale.

However, if you are in it for the long (or medium) run, then building your brand is a must. Whether you are a tradesman or a giant corporation, your brand’s reputation and the brand-awareness is your safety net when it comes to consistent website traffic, leads & sales.

It takes time to build a brand. Once it is built, though, the visitors who come to you direct because they know who you are cost you effectively nothing, or at least very little compared with some of the cost per click on Google Search Ads these days.

Building a brand is not easy however. Take my other blog for example – Blackbeltwhitehat.com

The blog has over 600 pages of content, lots of it really long, in-depth and time-consuming to produce. The site has 5,000-10,000 visitors per month, but virtually nobody comes to my website via a branded search on Google.

This could be down to one specific reason – the domain name is crap and hard to remember.

I’ve bought a few more memorable domains (like WokeMMA.com “Woke” being an ironic term for self-awareness used in the MMA & Jiu Jitsu communities) and I am currently weighing up the time & effort of re-branding everything like GoogleMyBusiness, TrustPilot etc. – plus all my back-links currently point to blackbeltwhitehat.com (I’m aware of 301s etc. but I’ll still definitely see a drop in rankings).

My blog is ultimately a hobby that I’ve invested less than $50 into over 6 years.  But if I had some more budget – I’d put together a plan to build my brand online…

How to Build a Brand Online

First, make sure you know your target audience and do a SWOT analysis. Then set specific goals to establish some brand KPIs.

Here are some ideas on what to do next:

  • Get a relevant, easy to remember domain name!

Learn from my mistake: a short, catchy domain name is an easy win if you are starting from scratch. A lot of the best and most obvious domain names will be taken, however, so you’ll have to do some research first. If you are just starting out, don’t name your business until you secure your domain name!

  • Display Ads

Depending on your niche, you can set tiny max CPC bids in some instances – and they’ll still get thousands of impressions for very little spend. Gmail ads work particularly well for (potential) low CPM (cost per 1000 impressions).

Rotate your display ads’ design & colours to stop people ignoring them due to ‘banner blindness’.
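
The CPM maths is simple enough to sanity-check in a couple of lines. A rough sketch; the spend and impression numbers are made up for illustration.

```python
def cpm(cost, impressions):
    """Cost per 1,000 impressions."""
    return cost * 1000 / impressions

# Hypothetical campaign: 4.00 spent for 20,000 impressions
campaign_cpm = cpm(4.00, 20_000)
print(f"CPM: {campaign_cpm:.2f}")
```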

  • Blog & Outreach

Blogs are great for reaching people who are researching a potential purchase.

For example, I landed on Perfect Keto’s blog a few times whilst researching Exogenous Ketones. I then ended up buying their branded product on Keto-pro.co.uk because, for whatever reason, I trusted their brand.

Create great content, with statistics, images and video – and then outreach it – i.e. send it to relevant blogs and websites.

If you can afford it, use “PR-Level” outreach and contact national newspapers etc. This can be done via websites such as Gorkana.

If your content gets links too, then great: that’s good for Search Engine Optimisation (SEO). Doing some of your own exclusive research and generating tables of statistics is great for earning back-links naturally, i.e. passively.

So consider doing some market research using Google Surveys etc. These guys calculated RV/Campervan depreciation in value, just by looking at vehicles for sale online, and got hundreds of back-links.

To turn blogs into direct sales, you can also use relevant ‘CTA’ images below your blog.

For example, if you post a blog about the Walking Paths of Snowdonia on your Snowdonia-based bed & breakfast website, consider adding a relevant & clickable ‘book now’ and/or ‘get your free brochure’ button with an eye-catching image at the bottom of the post. Many people now do this with newsletter sign-up pop-ups, which are a bit annoying but do work.

Before you start a blog, do your keyword research.

  • Create Tools

Content is great – but tools tend to do better than copy. For example, NerdWallet’s top page in terms of organic traffic – is their mortgage calculator.

  • Reviews

As well as brand awareness, you want some social-proofing of your brand. Start with a free account on Trustpilot and GoogleMyBusiness

  • Video & Social Media

The number 1 mistake people make on social media is to harp on about their brand all the time. Be entertaining, provide useful information and insightful comments. If you are over-promotional, people will not follow you. Build some authority by providing helpful insights that your target market will appreciate.

Videos & podcasts can be costly in terms of time. If you don’t want to set up your own podcast, guest appearances on other people’s podcasts can generate valuable awareness and also back-links to your website (important for Search Engine Optimisation/Rankings).

  • Build an amazing product and/or service

This is your foundation and one of the reasons that Apple is so successful. An LSD-fueled Steve Jobs came up with some amazing ideas and concepts. The brand also turned itself into a unique hybrid of tech & fashion thanks to its pioneering products.

The big, light-up apple on the back of MacBooks was no doubt a design aimed at building brand awareness too!

Please note – I realise this blog has a rubbish social media following. But that’s due to lack of time/money investment. I generally just use this blog as somewhere to record my thoughts & to remember how to do all things marketing related. E.g. here are my notes so I remember how to use Screaming Frog to scrape OG tags.

See my 2019 guide to Keyword Research by Clicking Here.

KW Research for SEO [Updated June 2022]

Hello,

Just thought I’d write a post about how I go about doing my SEO & PPC Keyword Research these days.

  1. Add head term to Google search bar in Chrome

Make a note of the suggested searches & predicted searches

Screenshot: predicted searches in the address/search bar
Screenshot: related/suggested searches at the bottom of the results page

2. Google ‘Alphabet Soup’

Put in your main head term and then add “a”, then “b”, then “c” and so on.

Make a note of the relevant suggestions
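
If typing 26 searches by hand sounds tedious, the list of "alphabet soup" queries is easy to generate in Python. This sketch only builds the search strings to paste into Google; it does not fetch any suggestions, and the `alphabet_soup` helper name is just something I made up for the example.

```python
import string

def alphabet_soup(head_term):
    """Build 'head term a', 'head term b', ... 'head term z'."""
    return [f"{head_term} {letter}" for letter in string.ascii_lowercase]

queries = alphabet_soup("jeff bridges")
print(queries[:3])  # ['jeff bridges a', 'jeff bridges b', 'jeff bridges c']
```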

For product related KWs, this also works if you go to the desktop version of the Google Play Store.

After each search, check out the related search terms at the bottom of each SERP.

3. Have a Quick Look on Reddit & Amazon

Have a look at any relevant subreddits on Reddit – e.g. https://www.reddit.com/r/jeffbridges/

Do a quick search for “KW” site:reddit.com

Have a look on Amazon, just search your head term and see what products appear

In the Amazon results for this example, “The Dude and the Zen Master” might be a decent KW.

Other forums can help too. For example, when looking for KWs for my MMA blog, I’ll look on sherdog forums for trending & frequent topics.

4. Add Competitor Domains to Ubersuggest

https://neilpatel.com/ubersuggest/ or use the SEMRush plugin.

If necessary, we could export the KWs in this report, and then filter in Excel for those containing “Jeff Bridges”
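
The same filter can be done in Python instead of Excel. A minimal sketch, assuming the export is a CSV with a `Keyword` column; the real column names depend on the tool, and the rows below are made-up sample data.

```python
import csv
import io

# Hypothetical export; swap this for open("your-export.csv", newline="")
export = io.StringIO(
    "Keyword,Volume\n"
    "jeff bridges movies,5400\n"
    "the big lebowski quotes,2900\n"
    "Jeff Bridges age,1900\n"
)

def filter_keywords(csv_file, phrase):
    """Keep rows whose Keyword column contains the phrase (case-insensitive)."""
    rows = csv.DictReader(csv_file)
    return [row for row in rows if phrase.lower() in row["Keyword"].lower()]

matches = filter_keywords(export, "Jeff Bridges")
print([row["Keyword"] for row in matches])
```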

5. Upload your Final KW List to Keyword Planner

https://ads.google.com/intl/en_uk/home/tools/keyword-planner/

A final note on search volumes.

For some blogs and websites, even keywords with 0 monthly searches may be relevant.

My other blog, blackbeltwhitehat.com, has built all of its traffic off KWs that Google KW Planner says have 0 searches.

It all depends on how authoritative your website and your competitors are. You can go after bigger, more popular KWs if you are a huge website with a DA (Domain Authority) of 90. It’s a different ball game if you are running a personal blog with a DA of 15.

Try and include a number of the relevant searches in your articles etc.

Check Competitors

If you have a tool like semrush.com, check what keywords competitors are ranking for.

If your head keyword is “football goals”, see who’s ranking top 5 for that term and see what other keywords the top URLs are ranking for.

If you don’t have an SEO tool, you can just check the copy, meta title and meta description to see what keywords the top websites are trying to optimise for.