Key Data Points When Viewing Source

When you view a webpage’s source, you want to look carefully at several key data points to get an idea of how the search engines see the given page. The data points below are the most important pieces of page meta data that the engines care about.

Meta description:
The quick definition is that a meta description is a short textual description of a webpage, written by the webmaster, that search engines include in search results. Thus, this content is important because it acts as a free ad for the given website.

In the source code, meta descriptions are written as <meta name="description" content="Description Goes Here" />

The best meta descriptions are enticing and are written about the specific subject of the given page. (For example, all of the information you would ever need on adult diapers.)

Optimizing for People, Not Just Search Engines

Remember that not all of your optimization should be aimed at search engine algorithms. To be successful, you also need to optimize for the people who will read the search results you optimize. Meta descriptions and title tags are the place to do that.

Meta descriptions are easy to write for people because the engines do not use them directly for ranking purposes. Instead, they exist to entice people to click the given search result. They should be reviewed by a marketing-minded person (like an SEO) and optimized for people rather than engines.

TIP
Meta descriptions are for people, not necessarily for search engines. I have found the following textual additions very useful for increasing click-through rates in search engine results (a hypothetical example follows the list):

Free Shipping
Low Price Guarantee
Reviews, Pictures, Samples
Interviews
Official Site
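
For instance, a hypothetical meta description that works a couple of these phrases in might look like this:

<meta name="description" content="Shop our full selection of adult diapers. Low Price Guarantee and Free Shipping on every order. Read reviews and see pictures before you buy." />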

When reviewing meta descriptions, I find it useful to pull up the search result in the search engines and compare the meta description to that of the competition. I ask myself if I would click the result compared to others, and if not, I figure out why not and use this information to improve the description.

Meta robots: Meta robots is a page-specific directive for controlling search engine crawlers. It is most useful for keeping specific pages out of the search engines’ indices while still transferring their link value to other pages.

In the source code, meta robots looks like this:
<meta name="robots" content="VALUES" />

Technically, robots is the value of an attribute called name, and the directives go in the content attribute. Because that is a mouthful, SEOs tend to refer to the whole tag as meta robots. Regardless of what you call it, it is better than robots.txt for keeping pages out of the search engines’ indices because it prevents the engines from even listing the URL.
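
For example, a tag that keeps a page out of the indices while still letting its links pass value (the values are discussed in the “Avoiding Black Holes” sidebar later in this section) is written as:

<meta name="robots" content="noindex, follow" />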

Figure 3-2 shows the result of using robots.txt to block a page that has inbound links. It’s fairly rare to see a robots-excluded file show up in the SERPs, but it’s not unheard of (as Figure 3-2 shows), so keep that in mind as you choose your exclusion method. Notice that the URL is still shown in the search results, but none of the meta data (title tag and meta description) appears. This makes the result essentially a waste, so it is better to keep these blocked pages out of the search engine indices entirely.

Figure 3-2: Google result showing page blocked by robots.txt

See how the URL is still present in the index? This happens because the page is blocked by robots.txt but still has links pointing at it. Those links now point at a page the search engines can’t access (SEOs refer to this as a black hole or an “uncrawled reference”), and the result is formatted in a way that is unlikely to be clicked by searchers. Behind the scenes, the search engines find the URL via links but aren’t able to crawl it because it is blocked by robots.txt.

To make matters worse, because the search engines can’t crawl these pages, the pages can’t pass their link value through the links they contain; hence the black hole association. Not only does the given page get no credit for its inbound links, but it also can’t pass that value on to other pages that aren’t blocked.
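
For comparison, the robots.txt rule that creates this kind of black hole is an ordinary disallow (the path here is a hypothetical example):

User-agent: *
Disallow: /blocked-page.html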

Alternatively, meta robots keeps even the URL out of the indices while still allowing the links on the page to pass juice (“noindex, follow”).

Avoiding Black Holes
To avoid the black hole problems associated with robots.txt, meta robots should almost always be set to “index, follow”. (These are in fact the default values; a page with no meta robots tag is treated exactly as if it were set to “index, follow”.)

Exceptions include “noindex, follow”, which is appropriate for duplicate index pages, and “index, nofollow”, where the destinations of links cannot be vouched for (as with user-generated content). There is very little reason to ever use “noindex, nofollow”; you might as well preserve the value of the outgoing links.

Frames:
A frame is an HTML technique for embedding one URL within another. A common example is a help center that keeps its navigation separate from its informational articles. Frames have a very negative impact on SEO. Search engines treat each frame as a completely separate page (as they should) and do not share any link metrics between the frames on a given page. This means that a link pointing to one frame’s URL does not help any of the other frames on the page.

From an SEO perspective, you should avoid frames at all times. If a client has them, educate the client on alternatives such as content displayed using AJAX.

In the source code, a frame is identified by code like the following:

<frameset cols="50%,50%">
<frame src="left-frame.html">
<frame src="right-frame.html">
<noframes>
<p>This is what is displayed to users who don't have
frames and, in some cases, to search engines.</p>
</noframes>
</frameset>

or simply:

<iframe src="example.html" width="100" height="300">
<p>This text is read by engines but not by people
with frames enabled.</p>
</iframe>

Flash and Shockwave: Although the search engines have gotten better at parsing Flash and Shockwave, they are still not a viable, SEO-friendly option for a website. Not only is most of the content obfuscated, but linking is made difficult because websites built in Flash usually lack any kind of site architecture (from a URL perspective).

Flash and Shockwave are usually identified in the source code by something similar to:

<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
codebase="http://active.macromedia.com/flash2/cabs/swflash.cab#version=4,0,0,0"
id="inrozxa" width="100%" height="100%">
<param name="movie" value="welcomenew6.swf">
<param name="quality" value="high">
<param name="bgcolor" value="#FFFFFF">
<embed src="inrozxa.swf" quality="high" bgcolor="#FFFFFF"
width="100%" height="100%" type="application/x-shockwave-flash"
pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash">
</embed>
</object>

Depending on the version, the code you see might be different, but the main indicators are <embed> or <object> tags with an attribute pointing to macromedia.com or adobe.com (the maker of Flash).

The Problem with Flash

The problem with Flash is that it is hard for search engines to parse it and understand what content it contains. When you see Flash with valuable content inside it, it is best to recommend that the client add the content to the HTML page so the search engines can parse it, or switch to an alternative to Flash.

The best potential replacement for Flash may prove to be HTML5. At the time of writing, HTML5 is in its infancy, with only a few major websites using it. HTML5 offers some of the advantages of Flash (such as animation) but is as easy to parse as normal HTML.
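
As a rough sketch of why, consider a page that animates a banner with the HTML5 <canvas> element (a hypothetical snippet): the drawing happens in JavaScript, but the surrounding content remains plain, parseable HTML.

<canvas id="banner" width="300" height="100"></canvas>
<p>All of the information you would ever need on adult diapers.</p>
<script>
// Hypothetical animation code: the drawing lives in JavaScript,
// but the paragraph above stays plain, crawlable HTML.
var ctx = document.getElementById('banner').getContext('2d');
ctx.fillText('Welcome!', 10, 50);
</script>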

JavaScript links:
At the time of writing, JavaScript links are dangerous because their ability to pass juice is not very clear. They can be written in many different ways, and Bing and Google have not said which types of JavaScript links they support. We know that these links are being crawled, that the engines use them for URL “discovery”, and that they pass some link juice, but the relative amount is unknown.

This means that JavaScript links are not a reliable alternative to HTML-based links. They are typically implemented with the location object:

window.location.replace('http://www.example.com');

Depending on the implementation, you might instead see window.location.href = '...' or simply window.location = '...'.

When you encounter JavaScript-based links on clients’ websites, it is best to try to replace them with standard HTML-based links.
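
For example (hypothetical markup), a JavaScript link like the first element below is safer rewritten as the standard anchor shown second, which the engines can crawl and weigh reliably:

<span onclick="window.location.href='http://www.example.com/products';">Products</span>

<a href="http://www.example.com/products">Products</a>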

Page title:
Even though you can see a page’s title at the top of most browser windows, viewing the title tag from within the source code can be very helpful. Does the page title appear within the <head> section and outside of any <script> tags? Does the page have only one title? (You’d be surprised.)
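
A well-formed head section (hypothetical example) contains exactly one title tag, placed directly inside <head> and outside any <script> tags:

<head>
<title>Adult Diapers | Example Store</title>
<meta name="description" content="Shop our full selection of adult diapers." />
</head>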

NOTE Many people are confused by the relationship of Java to JavaScript. My favorite explanation is “Java is to JavaScript what Car is to Carpet.” They are not related beyond the fact that both are computer languages. (Even that is a stretch because JavaScript is only a scripting language.) The name similarity is due to their respective creators really enjoying coffee (Java). It’s as simple as that. Howard Schultz would be proud.

Meta Keywords Are Obsolete
What about meta keywords? Meta keywords refers to a specific meta tag that search engines once used. This meta data is no longer an important metric; it is used by neither Bing nor Google. Don’t waste your time writing it.
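
For reference, the obsolete tag looks like this:

<meta name="keywords" content="keyword one, keyword two" />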