SearchEngineWatch points to a research paper, Web Spam Taxonomy, by Zoltán Gyöngyi and Hector Garcia-Molina of the Stanford Database Group, which will be presented at the WWW2005 Conference next month.
Unfortunately, the paper applies a blanket definition of spam to most SEO activities and, at times, shows a misguided understanding of what most SEOs do for their clients.
“The activity of some SEOs benefits the whole web community, as they help authors create well-structured, high-quality pages. However, most SEOs engage in practices that we call spamming.”
Is that all they think SEOs can do legitimately? “Create well-structured, high-quality pages”? Interestingly enough, some of the techniques described in this paper could serve as a sort of black-hat SEO cookbook. If nothing else, it’s a very interesting read.
I came across the Wikimedia blacklist of spam domains and was curious whether Google and Yahoo use this kind of information. Here’s another interesting blacklist of URLs that have been used to spam blogs and wikis. The Chongedqed site points out that many owners of these domains are not the spammers themselves; their URLs have simply been used to “spamvertise” on wikis or blogs. I suppose the list is a mixture of disposable domain names and a few unsuspecting sites.
I hope efforts to combat search and web spam are careful not to throw the baby out with the bathwater, as has happened with some of the Google algorithm shifts in the past. If a competitor can get your domain name onto a spam blacklist simply by spamvertising it, that spells trouble for many legitimate sites.
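To make the concern concrete, here is a minimal sketch of how a wiki or blog might screen submitted links against a domain blacklist. This is only an illustration, not how Wikimedia, Google, or Yahoo actually do it: the domain names, the `BLACKLISTED_DOMAINS` set, and the helper functions are all hypothetical.

```python
from typing import Iterable, List
from urllib.parse import urlparse

# Hypothetical blacklist entries, for illustration only.
BLACKLISTED_DOMAINS = {"spam-pills.example", "cheap-links.example"}


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, or an empty string if it has none."""
    return (urlparse(url).hostname or "").lower()


def is_blacklisted(url: str, blacklist: Iterable[str] = BLACKLISTED_DOMAINS) -> bool:
    """True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in blacklist)


def blocked_links(links: Iterable[str]) -> List[str]:
    """Return the submitted links that would be rejected by the blacklist."""
    return [link for link in links if is_blacklisted(link)]
```

Note that a check like this knows nothing about who posted the link. Once a domain is on the list, every page on it is rejected, whether the owner did the spamming or a competitor did it for them.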
Speaking of spam, Google News is getting spammed by “Khalsa News Network,” which somehow got accepted as a Google News source and is now reposting old PRWeb news releases. Andy mentions this too.