The Leaked Google Documents: What They Really Mean for SEO

Ana Gotter

5min read

Earlier this year, the Google Search document leak was all that the SEO world could talk about.

In case you missed it, here are the highlights:

  • Thousands of documents that appear to come from Google’s internal Content API Warehouse were published by an automated bot and later shared with SparkToro co-founder Rand Fishkin in May.
  • Fishkin shared this information publicly in late May.
  • Chaos ensued in the SEO world.

For a few weeks after the leak, hot takes about whether Google lies to us, and about what it all actually means, were everywhere.

Now that the dust has settled, most of us have had a chance to read up on the findings, and some of the most respected thought leaders have spoken. So what does the Google leak actually mean for SEO, and what should SEO experts and content marketers do now?

Let’s take a look.

What The Leaked Documents Revealed

The leak was extensive: it included literally thousands of documents. To keep things relatively simple, let’s look at the findings that are most likely to impact content marketing, search visibility, and SEO overall.

  • Ranking factors: There are 2,596 modules in the API documentation, with 14,014 attributes that seemingly contribute to ranking calculations. The documents don’t specify how Google weights any of these features.
  • Twiddlers: Google has re-ranking functions that “can adjust the information retrieval score of a document or change the ranking of a document.”
  • Content demotions: Site content can be demoted for reasons like SERP signals that indicate user dissatisfaction, poor product reviews, business location, exact match domains, restricted content (like porn), and links that don’t match the target site.
  • Archiving indexed pages: Google apparently keeps a copy of every version of every page it has indexed, which is some powerful change history action. However, Google only uses the last 20 changes made to a URL when analyzing links.
  • Backlinks: The relevance and diversity of backlinks are essential when it comes to ranking potential, and links very much matter.
  • PageRank: Google’s PageRank is still very much active, and your site’s homepage PageRank is considered.
  • Clicks: Google uses multiple measurements like badClicks, goodClicks, lastLongestClicks, and unsquashedClicks to assess successful clicks and a positive (and relevant!) user experience.
  • Authorship: Google stores author information associated with content, determining whether an entity is the author of the content. This likely plays into their EEAT considerations.
  • Site Authority: Google uses something called “siteAuthority,” which may mean that low-quality content on part of a site can affect the site’s overall ranking. For many years, Google denied using site authority scores.
  • Chrome: The chromeInTotal attribute may indicate that Google leverages data from the Chrome browser to impact ranking.
  • Site size: There’s a smallPersonalSite attribute that seems to flag small personal sites or blogs. Google may boost or demote these sites using twiddlers, but that’s uncertain.
  • Recency: Content freshness seems to matter, as Google looks at byline dates, URL dates, and on-page semantic content date.
  • Relevance: Google seemingly looks at whether or not something is a core topic of a website by vectorizing pages and sites and comparing the page embeddings to the site embeddings (siteFocusScore).
  • Page titles: Google’s titlematchScore is believed to measure how well a page’s title aligns with a search query.
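To make the relevance idea above more concrete, here is a minimal sketch of comparing page embeddings to a site embedding. The leaked documents do not say how siteFocusScore is actually calculated, so everything here is an assumption for illustration: the toy three-number vectors, the choice to average page vectors into a site vector, the use of cosine similarity, and the hypothetical function names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Average a list of equal-length vectors, component by component."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def site_focus_report(page_embeddings):
    """Score each page by how close it sits to the site's average embedding.

    Assumption: the "site embedding" is simply the mean of the page embeddings.
    """
    site_embedding = mean_vector(list(page_embeddings.values()))
    return {
        url: cosine_similarity(vector, site_embedding)
        for url, vector in page_embeddings.items()
    }


# Toy embeddings invented for this example; real ones would come from an embedding model.
pages = {
    "/seo-basics": [0.9, 0.1, 0.0],
    "/link-building": [0.8, 0.2, 0.0],
    "/banana-bread-recipe": [0.0, 0.1, 0.9],
}

report = site_focus_report(pages)
for url, score in sorted(report.items(), key=lambda item: item[1]):
    print(f"{url}: {score:.2f}")
```

In this toy data, the off-topic recipe page scores lowest against the site average. That is the intuition behind auditing your own content for topical focus, whatever Google’s real formula looks like.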

So What Does This Mean?

Many people began by insisting that all the leaked documents prove is that Google lies. I actually disagree with this: We’ve always known that Google will never share the entire truth, with a full list of ranking factors and how they’re weighted.

Google also makes rapid and constant changes. While they may have been lying about certain parts of the algorithm (such as using site authority despite denying it for several years), I think a lot of what we’re discovering is that standard good SEO practices still hold true.

For example:

  • We’ve always known that link building was clearly an important ranking factor, and that site authority, relevance, and diversity in backlinks mattered a great deal. That’s still very much true, and should be a key consideration in your long-term SEO strategies.
  • Many content marketing strategies have centered around topical authority, building a website’s reputation as an expert for certain key topics.
  • Google’s EEAT has stressed that having relevant author expertise could impact content ranking.
  • We’ve long suspected that strong on-page experiences and relevance (assessed by metrics like long average time on page and low bounce rates) may impact how Google assesses content.
  • SEO experts have been stressing the importance of relevant page titles and accurate, matching links for years, ensuring that customers end up on pages with content they’re actually looking for.

Is This Data Still Relevant?

The documents indicated that the leaked information was seemingly accurate as of March 2024. It’s possible that some of the findings were things that Google was simply testing, or that they’ve changed things up since then.

While Google’s algorithm is always changing, I think that it’s safe to say that a lot of these general principles are consistent enough with what we’ve already seen and previously thought to be true.

The Takeaways: How to Adapt Your SEO Strategy

The good news is that the majority of the leaked documents confirm what we already knew or suspected. They also affirm that these are some of the best moves you can make with content marketing and SEO right now:

  • Build links. As Chima Mmeje puts it, collect backlinks like infinity stones, gathering them from different but relevant websites.
  • Create topical authority, and consider using content strategies like topic clusters to do so.
  • Create expert-led content that is unique, original, valuable, and more easily trusted; content marketers should be connecting with subject matter experts for interviews at the very least.
  • Update old content, removing content that’s no longer relevant and adding in anything new that is.
  • Ensure that all of your existing site content uses intent-accurate keywords, has correct site links, and follows any updated SEO practices.
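If you want a quick way to act on the “update old content” takeaway, the sketch below flags pages that haven’t been touched in a while. It assumes you can export a list of URLs with their last-updated dates from your CMS. The one-year threshold, the sample inventory, and the find_stale_pages name are all hypothetical choices for illustration, not anything taken from the leaked documents.

```python
from datetime import date


def find_stale_pages(pages, today, max_age_days=365):
    """Return (url, age_in_days) pairs for pages older than max_age_days, oldest first."""
    stale = []
    for url, last_updated in pages.items():
        age = (today - last_updated).days
        if age > max_age_days:
            stale.append((url, age))
    return sorted(stale, key=lambda item: item[1], reverse=True)


# Sample inventory invented for this example; in practice, export this from your CMS.
inventory = {
    "/google-leak-explained": date(2024, 6, 10),
    "/link-building-guide": date(2022, 3, 1),
    "/topic-clusters-101": date(2023, 1, 15),
}

stale = find_stale_pages(inventory, today=date(2024, 9, 1))
for url, age in stale:
    print(f"{url}: last updated {age} days ago")
```

A list like this is only a starting point. Whether a page needs a refresh, a rewrite, or removal is still an editorial call.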

Finally, expert Rand Fishkin has stressed multiple times that it’s important to build a strong brand that can benefit you in search but also independently of Google. That brand can help support your business’s growth even if Google changes the algorithm.

We know that Google will change the algorithm at some point (plenty of sites were impacted by an algorithm update last year), so it’s important to ensure that you aren’t depending entirely on Google and SEO.

Have other defined distribution plans for content, and build a brand using multiple different strategies while tracking important site KPIs using transparent website analytics tools.