How to Get SEO Budgets Approved and Technical SEO Fixes Prioritized and Implemented

If you have a technical background like me and need to explain the value of your work to potential clients or senior management, you are likely very familiar with this situation:

You: “We have 10 million missing meta descriptions on the site. We need to fix them asap or we are not going to increase our rankings and hit our goals!”

Your boss: “Ok. How much is it going to cost to fix?”

You: “Here is the spreadsheet. Only 200k dollars, and it will take 10 months to complete.”

Your boss: “How much do we stand to gain after we do this?”

You: “Heh … I don’t know. It is not possible to estimate this.”

Your boss: “Ok. We need to prioritize other projects where we know upfront the potential return.”

The situation is similar in consulting discussions with potential clients. Traffic and revenue increases are a major concern for marketing and sales leaders, but budgets are prioritized where there is a clear potential return on investment. While investment in SEO activities continues to increase, it remains a fraction of the investment in paid search. Investment in SEO has remained what I call gambling money: a small fraction of the marketing budget allocated to important but high-risk investments that are not expected to pay off. A big consequence of this is that many companies pay for SEO audits and then do nothing about them when they realize the effort required to implement the recommendations.

For the past 5 years, my business has been primarily selling technical SEO services to marketing leaders at medium and large businesses, and I came up with an SEO unclaimed revenue estimation framework that has helped me communicate my value clearly. I call this framework The SEO Pipeline. In this post, I’m going to share it with you in the hope it can help you create solid business cases to get your budgets and projects approved.

Sales and Marketing Pipelines

Credit: http://bit.ly/2jMahCA

Sales and marketing people are intimately familiar with the concept of funnels and pipelines. It is a very powerful concept to 1) see what prospects to prioritize; 2) see the total revenue potential of the existing opportunities; and, 3) determine what additional work is required to hit sales targets.

Now, you might be wondering: “what does this have to do with technical SEO issues?” Stay with me for a bit, and you will find out. I promise it will be worth it.

The basic sales pipeline consists of three major parts:

The Body: Sales Steps

The Brain: The Probability of Closure

The Heart: Weighted Target

The Body: Sales Steps

A sales pipeline represents all of the steps you need to take in order to sell a product or a service to your customer. It’s a process in which every action needs to be made in a particular, predetermined order, as you successfully close each stage of the pipeline before moving on to the next one.  

It always depends on whose definition you’re looking at, but experts generally agree the fewer stages in the pipeline, the better. Typically, these five stages are present in most sales pipelines:

  1. Initial Contact
  2. Qualification
  3. Meeting
  4. Proposal
  5. Closing

The Brain: The Probability of Closure

This part represents the probability of selling your product during a particular sales step. The probability of closing increases the further down in the pipeline you are.

The Heart: Weighted Target

“Weighted target is equal to the sum of the total opportunity value in each sales step multiplied by the probability of closure for that step.” – David Brock partnersinexcellenceblog.com

Let’s jump to a relatively similar idea in the SEO space and you will see how I connected both ideas to come up with my approach to SEO revenue estimation.

SEO 101

Every SEO worth hiring understands how search engines work, but to keep this explanation complete, let’s review a quick summary of the basics. Feel free to jump to the SEO pipeline section if you already know this.

Google runs Googlebot, a distributed system that fetches pages and all sorts of resources from every site it can discover around the web. This process runs continuously and is called crawling.

You can review Googlebot’s daily activity on your website inside Google Search Console > Crawl > Crawl Stats.

As Dawn Anderson accurately points out, the Crawl Stats report is not limited to “pages”, but it includes any page resources (images, JavaScript, CSS, etc), and also redirects, but it excludes errors and blocked pages.

Google stores the fetched pages in a cache, and you can check if Google grabbed specific pages on your site by typing cache:<url> in a Google search. If you get an error, either the page has not been crawled or Googlebot didn’t like it because it was a duplicate.

Next, Google needs to organize all the fetched pages into an index so they can be found quickly when people search. This process is called indexing.

BTW, Google is moving to prioritize a mobile-first index, as they announced last November. Cindy Krum has a great series on the implications of this move. It is definitely worth a read.

You can review how many pages and resources Google has indexed from your site inside Google Search Console > Google index > Index Status.

This report includes all the pages Google considers canonicals and non-duplicates; however, if your site canonicalization is not set up correctly, you might end up seeing more pages indexed than there should be.

The final step is ranking, and it is the step most people are familiar with. Many marketing leaders I talk to only understand SEO as higher rankings, but before their site pages rank they need to be actually indexed.

Google Search Console offers the best keyword rank checking tool on the planet because among many other cool things, you are able to find pages that rank high, but are not getting any clicks.   

Thanks to this, I like to split the ranking process in two: ranking and results presentation.

Results presentation means that it is not enough to just rank; you also need a compelling message in the search results or people won’t click. You can download the report into Google Sheets and filter it to separate the pages with impressions but no clicks from the pages getting clicks.
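If you would rather script that filter than do it in a spreadsheet, here is a minimal Python sketch. It assumes the export is a CSV with columns named Page, Clicks, and Impressions; those column names are my assumption, so adjust them to match your actual export.

```python
import csv
import io


def split_by_clicks(csv_text):
    """Split exported pages into (impressions but no clicks, pages with clicks).

    Assumes columns named "Page", "Clicks" and "Impressions" (hypothetical
    names; rename them to match the real export).
    """
    no_clicks, with_clicks = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Exports sometimes format numbers with thousands separators.
        clicks = int(row["Clicks"].replace(",", ""))
        impressions = int(row["Impressions"].replace(",", ""))
        if clicks > 0:
            with_clicks.append(row["Page"])
        elif impressions > 0:
            no_clicks.append(row["Page"])
    return no_clicks, with_clicks


if __name__ == "__main__":
    sample = (
        "Page,Clicks,Impressions\n"
        "https://example.com/a,12,340\n"
        "https://example.com/b,0,95\n"
    )
    print(split_by_clicks(sample))
```

The pages in the first list are the ones stuck at the results presentation stage: they rank, but their message is not earning the click.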

So, now that we covered the basics and where to find the fundamental building blocks, let me explain the concept of the SEO pipeline.

The SEO Pipeline

Just like the sales pipeline, our SEO pipeline consists of three major parts:

The Body: The Search Engine Stages

The Brain: The Probability of Search Visitor Action

The Heart: Weighted Target

The Body: The Search Engine Stages

An SEO pipeline represents the steps a search engine takes before your site pages actually receive new search visitors that take action on your site (place an order).

Each step represents a single action or group of actions that need to be taken in order to push pages past that phase of the SEO process and on to the next one within the SEO pipeline.

An SEO pipeline has 5 stages:

  1. Crawling
  2. Indexing
  3. Ranking/relevance
  4. Results presentation
  5. Search visitor action

The Brain: The Probability of Search Visitor Action

The probability of search visitor action shows you the likelihood of pages (or group of pages) that still remain at that step, moving through all the stages and ultimately leading to leads or sales.

The Heart: Weighted Target

Just as salespeople assess how much opportunity sits at each sales step, we came up with a process to assess how much opportunity sits at each stage of the SEO process.

The SEO pipeline process is similar to the already proven process of a sales pipeline. If you look at website pages as prospects, a prospect turns into a lead when the page is crawled and indexed, and into an opportunity as it starts to rank. If it has the right message, it gets a click and turns into an account, so that if you’re selling something there is a possibility of a sale.

Another advantage of an SEO pipeline is that we can estimate how much money there is in the pipeline and the probability of the website making money. In the sales pipeline, the probability of closing is not that high at the initial contact, just like it’s not high at the crawling stage.

The probability of the website making money increases as the pages are found and indexed. This is something we try to assess: the opportunity. Measuring the opportunity on the unclaimed revenue is based on how many pages are left at each of the SEO pipeline stages.

Let’s discuss a couple of practical examples next as you now know where to find the right reports to produce an SEO pipeline for your site.

But before we do that, we need to know where to find pages that resulted in transactions. As we use Google Search Console to estimate most of the data, it is important to also use the Search Console reports in Google Analytics to estimate this. That is Acquisition > Search Console > Landing Pages. Then use the Advanced Filter to limit the pages to the ones with transactions and/or revenue.

Practical Example

If the search engine crawls 8,597 pages, and ultimately 147 of those pages lead to search visitor actions, you have a probability of 1.71% at the crawling stage.

A typical SEO pipeline could have the following probability of search visitor action for each step:

Crawling: 1.71%

8,597 unique* pages crawled

147 pages with transactions (1.71%)

 

Indexing: 4.03%

3,651 pages indexed

147 pages with transactions (4.03%)

 

Ranking/relevance: 6.07%

2,427 pages with search impressions

147 pages with transactions (6.07%)

 

Results presentation: 8.81%

1,668 pages with search visits

147 pages with transactions (8.81%)    

Search visitor action: 100%

 

*In order to get accurate crawling numbers, you need to parse the server logs.

Consider each page on your site as potentially driving or influencing sales. When we have a page in the SEO pipeline, we need to know:

  1. What is the average number of transactions on the page (or page group)?
  2. What is the average order value of the transactions coming through the page (or page group)?

Each page has its own specific value, which can be used to prioritize which pages should make it through all the stages, maximizing the potential results.

To simplify things, we will use the global average values: 3.44 average transactions, $108.70 per transaction. For real results, we need to consider the metrics for the page groups or pages left at each stage.

The opportunity at each stage is represented by the number of pages yet to move to the next one.

 

Crawling: 1.71%

8,597 unique pages crawled

Opportunity: (8597 – 3651) * 1.71% = 85

Opportunity Value: 85 * 3.44 * 108.70 = $31,783.88

 

Indexing: 4.03%

3,651 pages indexed

Opportunity: (3651 – 2427) * 4.03% = 49

Opportunity Value: 49 * 3.44 * 108.70 = $18,322.47

 

Ranking/relevance: 6.07%

2,427 pages with search impressions

Opportunity: (2427 – 1668) * 6.07% = 46

Opportunity Value: 46 * 3.44 * 108.70 = $17,200.68

 

Results presentation: 8.81%

1,668 pages with search visits

Opportunity: (1668 – 147) * 8.81% = 134   

Opportunity Value: 134 * 3.44 * 108.70 = $50,106.35

Search visitor action: 100%

 

The weighted target is the sum of the opportunity values across the SEO stages, where each stage’s opportunity has already been multiplied by its probability of search visitor action. In this case: $117,413.38.
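To make the arithmetic easy to rerun with your own numbers, here is a minimal Python sketch of the calculation. The page counts and averages are the ones from the example; the function and variable names are mine. It uses unrounded probabilities and rounds the opportunity to whole pages, so printed figures can differ from the ones quoted above by a rounding step.

```python
# Page counts per stage, taken from the practical example.
STAGES = [
    ("Crawling", 8597),
    ("Indexing", 3651),
    ("Ranking/relevance", 2427),
    ("Results presentation", 1668),
]
PAGES_WITH_TRANSACTIONS = 147
AVG_TRANSACTIONS = 3.44
AVG_ORDER_VALUE = 108.70


def seo_pipeline(stages, converting_pages, avg_transactions, avg_order_value):
    """Return (rows, weighted_target) for a list of (stage, page_count) pairs."""
    rows = []
    for i, (name, pages) in enumerate(stages):
        # Pages that made it to the next stage; the last stage feeds conversions.
        next_pages = stages[i + 1][1] if i + 1 < len(stages) else converting_pages
        probability = converting_pages / pages
        # Pages still stuck at this stage, weighted by the chance they convert.
        opportunity = round((pages - next_pages) * probability)
        value = opportunity * avg_transactions * avg_order_value
        rows.append((name, probability, opportunity, value))
    weighted_target = sum(row[3] for row in rows)
    return rows, weighted_target


if __name__ == "__main__":
    rows, target = seo_pipeline(
        STAGES, PAGES_WITH_TRANSACTIONS, AVG_TRANSACTIONS, AVG_ORDER_VALUE
    )
    for name, probability, opportunity, value in rows:
        print(f"{name}: {probability:.2%}, {opportunity} pages, ${value:,.2f}")
    print(f"Weighted target: ${target:,.2f}")
```

For real results, replace the global averages with per-stage or per-page-group averages, as discussed above.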

Now, if you are still awake and followed the whole thing, let’s discuss the cool implications of this powerful concept.

Conclusion

Whenever there are projections and estimates, there are assumptions that need to hold true. In our case, we are collecting data from Google Search Console and Google Analytics, plus web server logs to get more accurate crawling numbers. But the fundamental assumption is that the probabilities of pages leading to sales at the end of the SEO pipeline remain stable. The same is true of sales projections estimated from sales funnels. In practice, most organizations know they ultimately close 20-30% of the opportunities in their sales pipeline, and plan accordingly. I think we can develop the same discipline to see what we realistically realize from the SEO pipeline in practice.

We need to accept that it is not possible to get all pages crawled, indexed, and so on. The same limitations exist in marketing and sales funnels. But this funnel concept allows us to: 1) prioritize which SEO activities are the lowest-hanging fruit in terms of producing revenue; and 2) confidently present projects for evaluation and budget approval backed by this simple framework, which most marketing leaders I’ve shown it to understand immediately.

Now, let’s replay that initial conversation with your boss, but this time you are armed with an SEO Pipeline.

You: “We have 10 million missing meta descriptions on the site. We need to fix them asap or we are not going to increase our rankings and hit our goals!”

Your boss: “Ok. How much is it going to cost to fix?”

You: “Here is the spreadsheet. Only 200k dollars, and it will take 10 months to complete.”

Your boss: “How much do we stand to gain after we do this?”

You: “Approximately $1 million …”

You: “See this SEO Pipeline … see how many pages are not getting clicks, and their potential.”

Your boss: “I get it! … You are a genius! Let’s do it”

You: “Ok. Hum … about my promotion??”

Your boss: “Let’s see if this works first :)”

Last, but not least, if you like this concept, and would rather not complete the pipelines manually, consider signing up for our upcoming free, cloud-based SEO auditing tool https://www.ranksense.com/advisor/, which will incorporate this SEO pipeline concept. As an early adopter, you get a generous free tier limit of 10k URLs.

 

Additional Ways to Use Chrome Developer Tools for SEO

I recently read Aleyda’s excellent post about using Chrome Developer Tools for SEO, and as I’m also a big fan of DevTools, I am going to share my own use cases.

Misplaced SEO tags

If you are reviewing correct SEO tag implementations by only using View Source, or running an SEO spider, you might be overlooking an important and interesting issue.

I call this issue misplaced DOM tags, and here is one example http://www.homedepot.com/p/Kidde-Intelligent-Battery-Operated-Combination-Smoke-and-CO-Alarm-Voice-Warning-3-Pack-per-Case-KN-COSM-XTR-BA/202480909

If you check this page using View Source in Chrome or any other browser, you would see the canonical tag correctly placed inside the <HEAD> HTML element.

Similarly, if you check this page using your favorite SEO spider, you’d arrive at the same conclusion. The canonical tag is inside the <HEAD> HTML element, where it should be.

Now, let’s check again using the Chrome Developer Tools Elements tab.

Wait! What?? Surprisingly, the canonical tag appears inside the <BODY> HTML element. This is incorrect, and if this is what Googlebot sees, the canonical tag on this page is effectively useless. Then we go blaming the poor tag saying that it doesn’t work.

Is this a bug in Google Chrome Developer Tools? Let’s review the same page with Firefox and Safari Developer Tools.

You can see the same issue is visible in Firefox and Safari too, so we can safely conclude that it is not a problem with Developer Tools. It is very unlikely all of them would have the same bug. So why is this happening? Does The Home Depot need to fix this?

Let’s first look at how to fix this to understand why it happens.

We are going to save a local copy of this page using the popular command line tool curl. I will explain why it is better to use this tool than to save directly from Chrome.

Once we download the web page, open it in any of the browsers to confirm the problem is still visible in the DevTools. In my case, I didn’t see the issue in Chrome, but saw it in Safari. I’ll come back to the reason for the discrepancy when we discuss why this happens.

Next, in order to correct the issue we will move the SEO meta tags so they are the first tags right after the opening <HEAD> HTML tag.

Now, let’s reload the page in Safari to see if the canonical still shows up inside the <BODY> HTML tag.

Bingo! We have the canonical correctly placed, and visible inside the HTML <HEAD>.

In order to understand why this addresses the issue, we need to understand a key difference between checking pages with View Source, and inside the Elements tab in the web browsers’ DevTools.

The Elements tab has a handy feature that allows you to expand and collapse parent and child elements in the DOM tree of the page. In order for this feature to work, the web browser needs to parse the page and build the tree that will represent the DOM. A common issue with HTML is that it often contains markup errors or tags placed in the wrong places.

For example, if we check the page using https://validator.w3.org/nu/?doc=http%3A%2F%2Fwww.homedepot.com%2Fp%2FKidde-Intelligent-Battery-Operated-Combination-Smoke-and-CO-Alarm-Voice-Warning-3-Pack-per-Case-KN-COSM-XTR-BA%2F202480909

You can see this page “only” has 61 HTML coding errors and warnings.

Fortunately, web browsers expect errors and automatically compensate for them using a process called HTML linting or tidying. A popular tool that does this is https://infohound.net/tidy/ by Dave Raggett at W3C.

The tidying process works by adding missing closing tags, reordering tags, etc. This works flawlessly most of the time, but it can often fail and tags end up in the wrong places. This is precisely what is happening here.

Understanding this allowed me to come up with the lazy trick to move the SEO tags to the beginning of the head, because this essentially bypasses any problems introduced by other tags. 🙂

A more “professional” solution is to at least fix all the errors reported between the HTML <HEAD> tags.
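To find which tag is the likely culprit before editing anything, here is a rough Python sketch. It relies on one rule of HTML parsing that I am confident about: only a short list of elements is accepted inside the head, and the first start tag outside that list makes the browser close the head implicitly, so everything after it lands in the body. The sketch reports that first offending tag and whether a canonical link comes after it. It is a simplification, not a full HTML parser: it assumes scripting is enabled (so it skips noscript content), and the class and function names are mine.

```python
from html.parser import HTMLParser

# Start tags accepted inside <head>; any other start tag makes a browser
# close the head implicitly and continue in <body>.
HEAD_ALLOWED = {
    "base", "basefont", "bgsound", "link", "meta", "noframes",
    "noscript", "script", "style", "template", "title",
}


class HeadAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_head = False
        self.noscript_depth = 0
        self.breaker = None                 # first tag that ends the head early
        self.canonical_after_breaker = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if not self.in_head:
            return
        if tag == "noscript":
            self.noscript_depth += 1
            return
        if self.noscript_depth:
            return  # assumption: scripting on, so noscript content is inert
        if tag not in HEAD_ALLOWED:
            if self.breaker is None:
                self.breaker = tag
            return
        if tag == "link" and self.breaker is not None:
            rel = (dict(attrs).get("rel") or "").lower()
            if "canonical" in rel.split():
                self.canonical_after_breaker = True

    def handle_endtag(self, tag):
        if tag == "noscript" and self.noscript_depth:
            self.noscript_depth -= 1
        elif tag == "head":
            self.in_head = False


def audit_head(html_text):
    """Return (first_offending_tag_or_None, canonical_is_at_risk)."""
    parser = HeadAudit()
    parser.feed(html_text)
    parser.close()
    return parser.breaker, parser.canonical_after_breaker
```

Run it on the raw HTML you saved locally. If the second value is True, the canonical sits after a tag that does not belong in the head, which is consistent with the lazy fix of moving the SEO tags to the top.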

Can we tell if this is affecting Googlebot or not?

It is fair to assume that, as Google is now able to execute JavaScript, Google’s indexing systems need to build DOM trees just like the main browsers do. So, I’d not ignore or overlook this issue.

A simple litmus test to see if the misplaced canonicals are being ignored is to check whether the target page is reporting duplicate titles and/or duplicate meta descriptions in Google Search Console, or not. If it is reporting duplicates, correct the issue as I explained here, use Fetch as Googlebot, and re-submit the page to the index. Then wait and see if the duplicates clear.

 

Following redirect chains

Another useful use case is reviewing automatic redirects from desktop to mobile-optimized websites, or from http to https or vice versa, directly in your browser.

In order to complete the next steps, you need to customize DevTools a little bit.

  1. Tick the checkbox that says “Preserve Log” in the Network tab so the log entries don’t get cleared by the redirects
  2. Right-click on the headers of the Network tab, and select these additional headers: Scheme, Vary, and optionally Protocol to see if the resources are using the newer HTTP/2 protocol

In this example, we opened https://www.macys.com, and you can see we are 301 redirected to http://www1.macys.com, from secure to non-secure, and we can also see that the page provides a Vary header with the value User-Agent. Google recommends the use of this header with this value to tell Googlebot to try refetching the page but with a mobile user agent. We are going to do just that, but within Chrome using the mobile emulation feature.

Before we do that, it is a good idea to clear the site cookies because some sites set “desktop sticky” cookies that prevent the mobile emulation from working after you have opened the site as a desktop user.

Let’s clear the network activity log and get ready to refresh as a mobile user. Remember that we will open the desktop URL to see the redirection.

In this case you can see that Macys correctly 302 redirects to the mobile site at http://m.macys.com, which is consistent with Google’s recommendation.
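If you need to repeat this check across many URLs, the same trace can be scripted. Below is a minimal Python sketch using only the standard library. It requests one hop at a time without following redirects automatically, and records the status code and the Vary header for each hop. The function names and the user agent strings you pass in are my own choices, not anything DevTools or Google prescribes.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url, user_agent):
    """Make one GET request without following redirects.

    Returns (status, headers) with lower-cased header names.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, {k.lower(): v for k, v in resp.getheaders()}
    finally:
        conn.close()


def trace_redirects(url, user_agent, fetch=fetch_once, max_hops=10):
    """Return a list of (url, status, vary_header) tuples, one per hop."""
    hops = []
    for _ in range(max_hops):
        status, headers = fetch(url, user_agent)
        hops.append((url, status, headers.get("vary")))
        location = headers.get("location")
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be a relative URL
    return hops
```

Call trace_redirects once with a desktop user agent string and once with a mobile one, then compare the two chains. The fetch parameter exists so you can swap in a stub when testing without network access.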

 

Sneaky affiliate backlinks

As Aleyda mentioned in her post, we can use DevTools to find hidden text, and some really sneaky spam. Let me share with you a super clever link building trick I discovered a while ago while auditing the links of a client’s competitor. I used our free Chrome DevTools extension as it eliminates most of the manual checks. You can get it from here.  

To most of you, and to most Googlers, this looks like a regular backlink and it doesn’t raise any red flags. The anchor text is “here”, and it is directly in the editorial content like most editorial links. However, coming from an affiliate marketing background, I see the extra tracking parameters can be effectively used to track any sales that come from that link.

I’m not saying they are doing this, but it is relatively easy to convince many unsophisticated bloggers to write about your product, and place affiliate links like this back to your site to get compensated for sales they generated. Sales you would track directly in Google Analytics, and maybe even provide reporting by pulling stats via the GA API.

Now, the clever part is this one: they are likely setting up these tracking parameters in Google Search Console so Googlebot ignores them completely, and it is normal to expect utm_ parameters to be ignored. This trick effectively turns these affiliate links into SEO endorsement links. This is one of the stealthiest affiliate + SEO backlink tricks I’ve seen in many years reviewing backlink profiles!

 

Troubleshooting page speed issues

Let’s switch gears a bit, and discuss page speed from an implementation review perspective.

Let’s review another example to learn how well the website server software or CDN handles caching page resources. Caching page resources in the client browser or CDN layer offers an obvious way to improve page load time. However, web server software needs to be properly configured to handle this correctly.

If a page has been visited before, and the page resources are cached, Chrome sends conditional web requests to avoid refetching them each time.

You can see page resources already cached by looking for ones with the status code 304, which means that they haven’t changed on the server. The web server only sends headers in this case, saving valuable bandwidth and page load time.

The conditional requests are controlled by the If-Modified-Since request header (or If-None-Match when the server provides an ETag). When you tick the option in DevTools to disable the cache, Chrome doesn’t send these extra headers, and you won’t see any 304 status codes in the responses.

This is particularly handy to help troubleshoot page resource changes that users report are not visible.

Finally, it is generally hard to reproduce individual users’ performance problems because there are way too many factors that impact page load time outside of just the coding of the web page.

One way to easily reproduce performance problems is to have users preserve the network log and export the entries in the log as an HAR file. You can learn more about this in this video from Google Developers https://www.youtube.com/watch?v=FmsLJHikRf8.

Google provides a web tool you can use to review HAR files you receive from users here https://toolbox.googleapps.com/apps/har_analyzer/. Make sure to warn users about saving potentially sensitive information in this file.
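If you receive many HAR files, a small script can give you a first summary before you open them in a viewer. HAR is a JSON format, and the sketch below relies on the standard fields log.entries, request.url, response.status, and time (in milliseconds). The function name and the shape of the summary are my own.

```python
import json
from collections import Counter


def summarize_har(har, top=5):
    """Summarize a parsed HAR document (the dict returned by json.load)."""
    entries = har["log"]["entries"]
    statuses = Counter(entry["response"]["status"] for entry in entries)
    # 304 responses are resources the browser revalidated instead of refetching.
    revalidated = [
        entry["request"]["url"]
        for entry in entries
        if entry["response"]["status"] == 304
    ]
    slowest = sorted(entries, key=lambda e: e.get("time", 0), reverse=True)[:top]
    return {
        "statuses": dict(statuses),
        "revalidated": revalidated,
        "slowest": [(e["request"]["url"], e.get("time", 0)) for e in slowest],
    }


def summarize_har_file(path):
    """Load a .har file from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_har(json.load(handle))
```

A user session with no 304 entries at all on a repeat visit is a hint that either caching headers are missing or the user had the cache disabled.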

Bonus: Find mixed http and https content quickly

Aleyda mentioned using DevTools to check for mixed http and https content raising warnings in your browser. Here is a shortcut to identify the problematic resources quickly.

You can type “mixed-content:displayed” in the filter to get the resources without https.
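The DevTools filter only works on a page you have loaded. To scan saved HTML in bulk, here is a rough Python sketch that lists subresources referenced over plain http. It is only an approximation of what the browser flags: it looks at a handful of tags and attributes that I chose, skips regular anchor links because those are navigation rather than page resources, and does not see resources added by JavaScript or CSS.

```python
from html.parser import HTMLParser

# Tags that load a subresource, and the attribute holding its URL.
RESOURCE_ATTRS = {
    "img": "src", "script": "src", "iframe": "src",
    "audio": "src", "video": "src", "source": "src",
    "link": "href",
}


class InsecureResourceFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attr_name = RESOURCE_ATTRS.get(tag)
        if attr_name is None:
            return
        attr_map = dict(attrs)
        if tag == "link":
            # Only stylesheets are fetched; canonical and similar links are not.
            rel = (attr_map.get("rel") or "").lower().split()
            if "stylesheet" not in rel:
                return
        value = attr_map.get(attr_name) or ""
        if value.lower().startswith("http://"):
            self.insecure.append((tag, value))


def find_insecure_resources(html_text):
    """Return (tag, url) pairs for subresources referenced over plain http."""
    finder = InsecureResourceFinder()
    finder.feed(html_text)
    finder.close()
    return finder.insecure
```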

If you are not actively using DevTools in your SEO audits, I hope these extra tips encourage you to get started. And if you are, please feel free to share any cool tips you might have discovered yourself.

How to Get Googlebot to “Teach You” Advanced SEO

I recently worked on an enterprise-level client’s non-SEO related project where the goal was to confirm or deny that their new product:

1)  Was not doing anything that could be considered black hat.

2)  Was providing any SEO benefit for their clients.

The problem you face with projects like this is that Google doesn’t provide enough information, and you cannot post corner-case questions like this in public Webmaster forums. To do so would violate your NDA, and potentially reveal your client’s intellectual property. So, what option do you have left? Well, you set up a honeypot!

A honeypot is a term that comes from the information security industry. Honeypots are a set of files that, to an automated program, appear like regular files, but they allow for the monitoring and “capturing” of specific viruses, e-mail harvesters, etc. In our case, we set up a honeypot with the purpose of detecting and tracking search engine bot behavior in specific circumstances. We also wanted to track the outcome (positive, neutral or negative) in the search engine results pages (SERPs).

Let me walk you through a few ways you can learn advanced SEO by using a honeypot.

Controlling Your Robots: Using the X-Robots-Tag HTTP header with Googlebot

We have discussed before how to control Googlebot via robots.txt and robots meta tags. Both methods have limitations. With robots.txt you can block the crawling of any page or directory, but you cannot control the indexing, caching or snippets. With the robots meta tag you can control crawling, caching and snippets but you can only do that for HTML files, as the tag is embedded in the files themselves. You have no granular control for binary and non-HTML files.

Until now. Google recently introduced another clever solution to this problem. You can now specify robots meta tags via an HTTP header. The new header is the X-Robots-Tag, and it behaves like, and supports the same directives as, the regular robots meta tag: index/noindex, archive/noarchive, snippet/nosnippet and the new unavailable_after directive. This new technique makes it possible to have granular control over crawling, caching, and other functions for any page on your website, no matter the type of content it has: PDF, Word doc, Excel file, zip files, etc.
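How you send the header depends on your server setup. As one illustration, here is a minimal Python sketch of a WSGI middleware that adds an X-Robots-Tag header based on the file extension of the requested path. The extension-to-directive policy is purely hypothetical; only the header name and the directives themselves come from the description above.

```python
import posixpath

# Hypothetical policy: directives to send for each file extension.
DEFAULT_POLICY = {
    ".pdf": "noindex, noarchive",
    ".doc": "nosnippet",
    ".zip": "noindex",
}


def x_robots_middleware(app, policy=DEFAULT_POLICY):
    """Wrap a WSGI app so matching responses carry an X-Robots-Tag header."""

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")
        extension = posixpath.splitext(path)[1].lower()
        directive = policy.get(extension)

        def patched_start_response(status, headers, exc_info=None):
            if directive:
                headers = list(headers) + [("X-Robots-Tag", directive)]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped
```

HTML pages are left untouched in this sketch, since those can still carry the regular robots meta tag.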

Log based link analysis for improved PageRank

While top website analytics packages offer pretty much anything you might need to find actionable data to improve your site, there are situations where we need to dig deeper to identify vital information.

One such situation came to light in a post by randfish of Seomoz.org. He writes about a problem with most enterprise-size websites: they have many pages with no or very few incoming links, and fewer pages that get a lot of incoming links. He later discusses some approaches to alleviate the problem, suggesting primarily linking to link-poor pages from link-rich ones manually, or restructuring the website. I commented that this is a practical situation where one would want to use automation.

Log files are a goldmine of information about your website: links, clicks, search terms, errors, etc. In this case, they can be of great use to identify the pages that are getting a lot of links and the ones that are getting very few. We can later use this information to link from the rich to the poor by manual or automated means.

Here is a brief explanation on how this can be done.

Here is an actual log entry to my site tripscan.com in the extended log format: 64.246.161.30 - - [29/May/2007:13:12:26 -0400] "GET /favicon.ico HTTP/1.1" 206 1406 "http://www.whois.sc/tripscan.com" "SurveyBot/2.3 (Whois Source)" "-"

First we need to parse the entries with a regex to extract the internal page (between GET and HTTP) and the linking page, which appears after the server status code and the response size. In this case, after 206 and 1406.

We then create two maps: one for the internal pages (page and page id), and another for the external pages with incoming links (also page and page id). After that we can create a matrix where we identify the linking relationships between the pages. For example, matrix[23][15] = 1 means there is a link from external page id 15 to internal page id 23. This matrix is commonly known in information retrieval as the adjacency matrix or hyperlink matrix. We want an implementation that can preferably be operated from disk in order to scale to millions of link relationships.

Later we can walk the matrix and create reports identifying the link-rich pages (the pages with many link relationships) and the link-poor pages (the ones with few link relationships). We can define the threshold at some point (e.g., pages with more or fewer than 10 incoming links).
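Here is a minimal Python sketch of those steps. Instead of a dense matrix with numeric ids, it keeps a sparse, in-memory map from each internal page to the set of external pages linking to it, which holds the same information for a small site but is not the disk-based implementation mentioned above. The regex targets the log format shown in the sample entry; the function names and the way I detect internal referrers are my own assumptions.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Matches entries like the sample above: request line, status, size, referrer.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)"'
)


def build_link_map(lines, site_host):
    """Map each requested internal path to the set of external referrers."""
    inbound = defaultdict(set)
    own_hosts = {site_host, "www." + site_host}
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue
        links = inbound[match.group("path")]  # register pages with no links too
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue
        if urlsplit(referrer).hostname in own_hosts:
            continue  # internal navigation, not an incoming link
        links.add(referrer)
    return inbound


def split_by_threshold(inbound, threshold=10):
    """Return (link_rich, link_poor) sets of internal paths."""
    rich = {path for path, refs in inbound.items() if len(refs) >= threshold}
    poor = set(inbound) - rich
    return rich, poor
```

Keep in mind that logs only show links that someone or something actually followed, so this undercounts links that exist but never send a visit.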