Google’s John Mueller answered a question on Reddit about whether Google indexes content placed before the main content as part of the post.
John’s answer addressed part of the question but left the overall question unanswered.
Nevertheless, there is a solution to the person’s problem.
Does GoogleBot “Collect” Content Before the Post?
The person asking the question on Reddit has a theme that uses a specific kind of code called a hook to insert content like advertising in a specific section of a WordPress post.
A hook is a convenient way for a theme or a plugin to make changes to the webpage structure without having to mess around with the WordPress core code itself.
In this situation, the person asking the question has a theme that uses a hook to add a block of content (like an advertisement) before the main post of the webpage.
Their concern was whether Google would see that block of content as a part of the webpage post, the main content.
Cannot Noindex a Section of a Page, But…
Mueller’s right: one cannot noindex a section of a page. But…
There are other options that can help the SEO of the webpage.
The way to do that is to make sure Google knows what part of the page contains the Main Content.
Semantic HTML Elements Can Help SEO
Semantic HTML, for this purpose, consists of HTML elements that tell the browser, assistive devices and Google what the different parts of a webpage are.
Google is already pretty good at understanding what the different sections of a webpage are.
Google generally sees the webpage in terms of:
- Header section (top of the page with logo, etc.)
- Navigation section
- Main Content
- Sidebars
- Footer
For content indexing purposes, everything that is not in the Main Content area can be more or less ignored.
The header, navigation, and footer generally have the same content sitewide. They are not main content and are treated differently by Google (more on this later).
The sidebars may have both unique and sitewide content, but that is not the main content either.
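As a rough illustration, the sections listed above map onto standard semantic HTML elements. The skeleton below is a minimal sketch, not the markup of any particular theme; the site name, link targets and text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Example page</title>
</head>
<body>
  <!-- Header section: logo, site name, etc. -->
  <header>
    <p>Example Site</p>
  </header>

  <!-- Navigation section -->
  <nav>
    <a href="/">Home</a>
    <a href="/about/">About</a>
  </nav>

  <!-- Main Content: only one visible <main> per page -->
  <main>
    <h1>Post title</h1>
    <p>The post itself.</p>
  </main>

  <!-- Sidebar: related but separate content -->
  <aside>
    <p>Recent posts, widgets, etc.</p>
  </aside>

  <!-- Footer -->
  <footer>
    <p>Copyright notice and sitewide links.</p>
  </footer>
</body>
</html>
```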
Make the Main Content Extra Visible
What Google is most interested in is the main content.
Making the location of the main content clear for Google is a good SEO practice.
The essence of SEO is to make the message of the webpage as clear as possible to eliminate the possibility of a mistake on Google’s part.
The <MAIN> HTML Element
The HTML <main> element can be used to mark up a WordPress post. It tells Google that the content enclosed within the <main> element is the Main Content.
A bare-bones outline of a webpage can look like this:
<!DOCTYPE html>
<html>
<head>
  <title>An awesome webpage</title>
</head>
<body>
  <main>
    <h1>Hey Google, index my content!</h1>
    <p>Content for indexing.</p>
    <p>More content for indexing!</p>
  </main>
</body>
</html>
The section enclosed within the <main> element tells Google where your Main Content is.
Everything outside of that semantic HTML element will not be considered a part of the main content.
What If Ad Is Within the Main Content?
If the theme or plugin injects the advertisement within the main content, like before the content begins but within the <main> element, there is something that can be done for that, too.
The <ASIDE> HTML Element
You can use another HTML element called <aside>.
The <aside> element tells Google that all the content enclosed within the <aside></aside> element is not a part of the main content.
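Here is a minimal sketch of that situation, assuming the theme's hook outputs the ad as the first thing inside the <main> element. The class name and the ad text are hypothetical placeholders; the point is only that the injected block is wrapped in <aside>.

```html
<main>
  <!-- Block injected by the theme's hook, wrapped in <aside>
       to mark it as separate from the post itself -->
  <aside class="ad-before-post">
    <p>Advertisement</p>
  </aside>

  <!-- The post itself -->
  <h1>Post title</h1>
  <p>Content for indexing.</p>
</main>
```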
The official HTML specs for the <aside> element state that it can be used for advertising content.
Circling Back to Answering the Reddit Question:
Google can generally identify which content is advertising and which content is main content.
But don’t leave it to chance. Use semantic HTML markup to make it clear for Google.
No, Google does not treat that block as part of the post. Google sees it as essentially boilerplate, content that is not a part of the main content.
Google’s Martin Splitt addressed that in a Search Engine Journal webinar.
If Google can’t tell the difference between the ads and the main content, then maybe there’s something wrong with the code.
For example, if HTML elements that properly belong in the <body> section, like <img>, <div> or <p>, are found in the <head> section, then Googlebot might start indexing the head content as if it were in the <body>.
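To make that failure mode concrete, here is a hypothetical example of the kind of markup to avoid. The <div> does not belong in the <head>; under the HTML parsing rules, a parser treats it as the start of the body, so the <meta> tag that follows it no longer ends up in the head. The banner text and description are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Example page</title>
  <!-- Problem: a body-level element inside <head> -->
  <div>Promo banner</div>
  <!-- Parsed as body content because the <div> above ended the head -->
  <meta name="description" content="Example description">
</head>
<body>
  <main>
    <h1>Post title</h1>
    <p>Content for indexing.</p>
  </main>
</body>
</html>
```

Running the page through an HTML validator is one way to catch this kind of error.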
In general, Google’s pretty good about identifying where the main content is.
However, it’s an SEO best practice to make the page structure exceedingly clear for Google.
You shouldn’t try to hide the advertising. You could probably do it, but the extra code needed just to do that would likely end up slowing the webpage down.
But you can tell Googlebot that it’s an advertisement by using the <aside> element.