There’s no denying Bill Slawski can be a bit difficult to understand. That doesn’t mean he’s not interesting, and his posts are well worth the read if you can wade through them. So, to save you some time, I’ve summarised a few of his comments about web blocks and linguistic features.
Slawski writes about how a page can be broken down into segments such as the main content, header, footer, advertising, navigation, etc. Each of these blocks can be considered as “separate semantic units” that can be connected or standalone in relation to the page topic (they can also be physically connected or broken up into smaller segments).
In a patent filed on behalf of Microsoft in 2003, this analysis is described as an “…independent approach to detect content structure. It simulates how a user understands web layout structure based on his visual perception (emphasis mine).” If you think about how you read web pages (in a kind of zig-zag pattern, amiright?), the segmentation approach is not far off.
As a writer, I’m interested in the way content is structured, and that includes the selection and placement of words and links. We already know that links in the middle of the page have more weight than those in footers, but what I didn’t know was that a search engine might actually assign PageRank to individual segments.
For example (according to the patent), a section of a page made up of hyperlinked, capitalised words in short phrases, appearing in the sidebar or at the top of the page, indicates the main navigation. It sounds like common sense, but understanding how a search engine sees a page is really essential to SEO. These basic linguistic features – i.e. syntax and punctuation – are the means by which search engines classify and index pages.
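If it helps to see that navigation rule spelled out, here is a rough Python sketch of the idea. To be clear, this is my own toy illustration and not the patent’s method: the `Block` class, the position labels, the three-word limit and the 80% threshold are all assumptions I’ve made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One page segment. The position labels are hypothetical."""
    position: str                     # e.g. "top", "sidebar", "middle", "footer"
    phrases: List[Tuple[str, bool]]   # (phrase text, is it hyperlinked?)


def looks_like_navigation(block: Block, max_words: int = 3,
                          threshold: float = 0.8) -> bool:
    """Guess whether a block is the main navigation.

    Assumed rule of thumb: the block sits at the top or in the sidebar,
    and most of its phrases are short, capitalised and hyperlinked.
    """
    if block.position not in {"top", "sidebar"}:
        return False
    if not block.phrases:
        return False

    hits = 0
    for text, is_link in block.phrases:
        words = text.split()
        is_short = 0 < len(words) <= max_words
        is_capitalised = bool(words) and all(w[0].isupper() for w in words)
        if is_link and is_short and is_capitalised:
            hits += 1

    # Navigation if enough of the phrases match the pattern.
    return hits / len(block.phrases) >= threshold


menu = Block("top", [("Home", True), ("About Us", True), ("Contact", True)])
article = Block("middle", [("Slawski writes about how a page is segmented", False)])
teaser = Block("sidebar", [("Read our latest post about page segmentation", True)])

print(looks_like_navigation(menu))     # True
print(looks_like_navigation(article))  # False
print(looks_like_navigation(teaser))   # False
```

The menu passes because every phrase is a short, capitalised link at the top of the page. The article block fails on position alone, and the sidebar teaser fails because its one link reads like a sentence rather than a menu label.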
*Punctuation Owl is impressed with your new-found wisdom:
So, if you write content for the web, it’s important to keep in mind how a search engine might segment it, but also remember that this patent was filed in 2003. A similar patent from Google followed in 2004. In other words, search engines have been thinking about segmentation for nearly a decade, and they’re continuing to improve their understanding of page semantics all the time. Watch this space!