Semantic HTML and Search Engine Optimization (How to Be a POSH SEO)

By: Joost.De.Valk
Translate: W3CGroup dh20156

Introduction

So what is POSH? No, it's not some new clothing fashion hype among web designers - POSH is the acronym for Plain Old Semantic HTML. The term Semantic HTML is used for a variety of things, but it has its origin in one objective: creating (X)HTML documents using semantic elements and attributes, as opposed to presentational HTML. The term POSH was coined because a group of highly respected web designers wanted a short mnemonic that captures the essence of Semantic HTML.

In this article, I talk about why you should use POSH, exactly what you need to do to implement it (many of you are probably doing this already, and a few of you might not even realize it!) and how you can optimize it to improve SEO for your site. I also take a brief look at Microformats at the end of the article.

Why Should I Use Semantic HTML?

You could just say: "because it's the right thing for the web," but its benefits go far beyond that. For instance, it makes it easier for screenreaders to interpret a page in an order that makes sense to users with visual impairments.

Secondly, SEO and Semantic HTML are close friends. They might sometimes have conflicts of interest, which we'll get to later on in this document, but overall, they're friends. The purpose of SEO is to help search engine spiders better understand what a page is about and therefore categorize it better. Since a search engine spider has even fewer capabilities than a screenreader, it needs even more guidance in determining a page's structure and topic. Good semantic HTML provides just that structure.

Semantic HTML tries to convey meaning through the words and the tags on a page. Try thinking of it this way: the content on the page is the words you speak. The tags provide the structure, the intonation, the pauses and even the looks on your face. Basically, your tags are half your message.

Site Structure

In my previous article on dev.opera.com I talked about site structures, with the aim of providing a clear way for search engines to discover which page on your site discusses which topic. This can be further improved by using Semantic HTML.

Page Structure

A page consists of a title, one or more headings, and content. This content can contain paragraphs of text, lists, quotes, images and tables. All these types of information have their own designated tag(s). We will cover all of those tags, starting with the headings. Use this page about sortable tables as an example to follow along with for the coming points.

Headings, from h1 to h6

A good document has headings and subheadings, because headings make it easier to determine the topic of a page. These headings can range in importance from h1 to h6. To be honest, I never use h5 and h6 myself. I usually have only one h1 tag on a content page. On portal pages, such as blog homepages, you can have multiple h1's, one for each of your articles for example. From a semantic perspective that might be weird, but from an SEO perspective, it's great.

Strict semanticists sometimes suggest that you should only have one h1, two h2's, three h3's etc. I don't agree with that, as I think it's very normal for a document to have more than two h2's. In fact, this document has a lot more of them, and I think it's very well structured.

Very often, designers who have heard a bit about Semantic HTML will fit the name of a site in the header into an h1 tag. On the homepage of a site, that might be a very wise decision. On every other page within your site, you probably have a specific topic, which might be related to your site's name but doesn't have to be. On those sub pages, that topic should be in the h1 tag, and it's wise to put the name of your site into an h4 tag or maybe even a span.

Search engines give the words used in the various headings more weight in determining the topic of a page. The keyword your page is optimized for should appear at least once in an h1 tag, and related keywords should be used in the other headings, as illustrated in Figure 1.

[Image: sample web site screenshot showing sensible use of keywords in headings]

Figure 1: Include keywords in your page headings to improve SEO for your page.
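As a rough sketch of the heading advice above, a sub page might be marked up like this. The site name, topic and keywords are made up for illustration; only the choice of elements follows the text.

```html
<!-- Sub page: the page topic gets the h1, the site name only a span -->
<div id="header">
  <span class="site-name">Example Site</span>
</div>

<!-- Main keyword (hypothetical: "sortable tables") appears in the h1 -->
<h1>Sortable tables with JavaScript</h1>
<p>A short introduction to the topic of the page.</p>

<!-- Related keywords go into the lower-level headings -->
<h2>Making a table sortable</h2>
<p>How the sorting is attached to the table.</p>

<h2>Styling sortable table headers</h2>
<p>How the clickable headers are styled.</p>
```

On the homepage the same site name could sit in the h1 instead, since there the site itself is the topic.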

Images

Images are used in all sorts of ways within documents, and you should apply the proper semantics to them. The only really useful semantic variable on an img tag is the alt attribute, and it should only be filled in if the image adds meaning to the document. If the image is there only for decorative purposes, leave the alt attribute empty. Otherwise, describe what the image is showing in the alt attribute.
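A minimal illustration of the two cases, assuming two hypothetical image files (sales-2008.png and divider.png):

```html
<!-- Image that adds meaning: describe what it shows -->
<img src="sales-2008.png" alt="Bar chart of monthly sales in 2008">

<!-- Purely decorative image: keep the alt attribute, but leave it empty -->
<img src="divider.png" alt="">
```

Note that the decorative image still carries an alt attribute; it is the value that is left empty, not the attribute that is dropped.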

If you're using images to replace text because you want the text to look nicer (image replacement), make sure that you're using normal text in your HTML, and that you replace that text with images by using CSS. You have to do this because neither people with visual impairments nor search engines can read the text in your images. My own preferred method is to apply the image with CSS background-image, and then hide the HTML text using a large negative text-indent (about -1000px or so does the trick). Be careful though: the text in the image should be exactly the same as the text in your document. If it's not, you risk losing a lot of ranking value from the search engines.
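The technique described above could look roughly like this. It is only a sketch: logo.png, its 300x80 size and the id "logo" are assumptions made for the example, and the overflow rule is an extra precaution that the text does not mention.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Image replacement sketch</title>
  <style>
    /* logo.png is a hypothetical 300x80 image showing the words "Example Site" */
    h1#logo {
      width: 300px;
      height: 80px;
      background-image: url(logo.png);
      background-repeat: no-repeat;
      text-indent: -1000px; /* pushes the real text out of view */
      overflow: hidden;     /* keeps the shifted text from showing elsewhere */
    }
  </style>
</head>
<body>
  <!-- The HTML text matches the text in the image exactly -->
  <h1 id="logo">Example Site</h1>
</body>
</html>
```

Screenreaders and search engine spiders still find "Example Site" as ordinary heading text, while sighted visitors see the image.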

Abbreviations and acronyms

As a web designer you're bound to use acronyms or abbreviations - I do it several times in this article. When you do, make sure you provide the written-out version of the term in the title attribute of an abbr or acronym tag. That's good for your keyword density too!
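For instance, the two abbreviations this article leans on most could be marked up as below. The sentence itself is just filler for the example.

```html
<p>
  Good <abbr title="Search Engine Optimization">SEO</abbr> starts with
  <abbr title="Plain Old Semantic HTML">POSH</abbr>.
</p>
```

The sketch uses abbr for both, since the acronym element has been made obsolete in current HTML, while abbr is valid in older and newer doctypes alike.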

Tables

We all know why using tables for laying out web sites is bad, and we also know what they are supposed to be used for - displaying tabular data. Just using basic tables is a big step in the right direction, but there are a number of ways in which you can improve your tables' semantic value, thereby improving your site's SEO further:

- Use table headings (th) for your table's headings (it's really that easy)

- If you can, use the thead, tbody and tfoot sections to properly section your table

- Provide a caption for your table, describing what's in it

The caption and the table headings are a good, and usually natural, place to use some of your document's keywords.
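Put together, the three points above might give a table like this one. The data is invented, and the scope attributes are an optional addition that the list does not mention.

```html
<table>
  <caption>Monthly visitors from search engines</caption>
  <thead>
    <tr>
      <th scope="col">Month</th>
      <th scope="col">Visitors</th>
    </tr>
  </thead>
  <tbody>
    <tr><td>January</td><td>1200</td></tr>
    <tr><td>February</td><td>1500</td></tr>
  </tbody>
  <tfoot>
    <tr><th scope="row">Total</th><td>2700</td></tr>
  </tfoot>
</table>
```

One caveat on ordering: HTML 4.01 and XHTML 1.0 expect tfoot to come before tbody, whereas current HTML places it after, as shown here. Pick the order that matches your doctype.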

Emphasizing your meaning

Remember I said earlier that tags should be the emotion of your text? This is where the real emotion comes in: you can provide emphasis to certain words using em or strong. In the old days, people used b and i for that, but these tags are no longer encouraged, since they imply a specific styling, whereas HTML should only describe structure and meaning (all styling should be done with CSS, of course).

Search engines give more weight to any words marked up using any of these four tags. Overusing them can do more harm than good, and actually cause a loss of emphasis, but if treated with care, they can add an extra dimension to your documents.
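A small sketch of sparing emphasis, reusing advice from the tables section as sample text:

```html
<!-- Avoid: b and i only say how the words should look -->
<p>Use tables for <i>tabular data</i>, and <b>do not</b> use them for layout.</p>

<!-- Prefer: em and strong say that the words are emphasized -->
<p>Use tables for <em>tabular data</em>, and <strong>do not</strong> use them for layout.</p>
```

Both paragraphs usually render the same in a browser's default style sheet; the difference lies in what the markup says about the words.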

A few words on (i)frames

It's quite simple: don't use them. Search engines don't get them, and screenreaders have quite a hard time using them as well.

Conflicts of interest

All of the above rules can be bent a little of course, which is a good thing, as sometimes it's necessary to keep everyone at your organization happy. Say your boss wants a page to have a zappy marketing title you'd rather not have, because it doesn't exactly describe what's on the page, and it pushes your most important keyword to the second heading. If you're in a competitive area, it might be wise to make the page look the way your boss wants, yet use an h2 for the first heading, and an h1 for the second.
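One way the heading swap described above could be done is sketched below. The class names, font sizes and both titles are invented for the example; the point is only that the visual order and the heading levels are decoupled through CSS.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Heading swap sketch</title>
  <style>
    /* Class names are made up for this sketch */
    h2.marketing-title { font-size: 2em; }   /* looks like the main title */
    h1.topic           { font-size: 1.5em; } /* looks like a subtitle */
  </style>
</head>
<body>
  <!-- What the boss sees first -->
  <h2 class="marketing-title">Sorting made sensational!</h2>
  <!-- What the search engine treats as the main heading -->
  <h1 class="topic">Sortable tables with JavaScript</h1>
</body>
</html>
```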

The same goes for iframes and images. If someone really wants you to put a certain block of content on a specific well-ranking page, but you don't want to risk losing focus, you could of course put that content into an iframe or image, and choose not to provide an alternative.

These decisions are up to you in the end - normal semantics should be the basis of your design, and the conflicts should only arise when you're really optimizing your pages.

Not so simple semantic HTML - Microformats

Microformats are also semantic HTML, but they are not exactly simple! At the moment, search engines are hardly using microformats in their algorithms, but that might change. The hCard especially (the HTML version of the vCard) has some very easy and obvious uses for search engines, and I suspect that they will start using those within the next couple of years. You can apply intelligent extra semantics within Microformats using the basic set of HTML elements - for example, a good way of marking up your address hCard is by using the address tag as a container!
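A minimal hCard along those lines, with the address tag as the container, might look like this. The person, company and address are placeholders; the class names (vcard, fn, org, adr and so on) are the ones hCard defines.

```html
<address class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example Web Design</span><br>
  <span class="adr">
    <span class="street-address">1 Example Street</span>,
    <span class="locality">Exampleville</span>,
    <span class="postal-code">12345</span>,
    <span class="country-name">Exampleland</span>
  </span><br>
  <a class="url" href="http://www.example.com/">www.example.com</a>
</address>
```

To a visitor this is just a contact block; the class names are what let a parser pick out the name, organization and address parts.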

Conclusion

By using semantic HTML to mark up your pages, you can create pages that are more accessible, both to people with disabilities and to search engines. Good semantic markup helps search engines to determine what the topic of a page is, and if used together with a good site structure, allows you to push up your web site rankings!

This article is licensed under a Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.



[Last edited by dh20156 on 2009-10-29 11:21 AM]