I'm not sure if any of you have noticed, since it's buried in the archives, but we've been attracting some strange visitors here at stevenberlinjohnson.com. For the past couple of weeks, people have been posting very short, sometimes slightly ungrammatical, messages in response to a few items I posted months ago. Check out the discussion thread here -- scroll past the opening comments, and you'll see a string of "good site" and "thanks for the information!" These posts have the distinct smell of spam about them -- they're formulaic and targeted at a small number of pages in the archives. But what possible purpose are they serving? What's the incentive for posting a short note of appreciation on a page buried on someone's weblog? Is this a widely recognized phenomenon? Explanations, please....
I don't have comments so I haven't had this problem, but Google PageRank won't get people very far, just to the top of the list of the most annoying XXXXers out there. Best.
Posted by: Zack Lynch | September 11, 2003 at 01:18 AM
It's not a bad idea to close out comments after a set period of time to prevent someone from sneaking spam in to mooch your Googlejuice.
Posted by: Mr. Nosuch | September 11, 2003 at 04:40 AM
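Mr. Nosuch's suggestion of closing comments after a set period could be sketched roughly as follows. This is only an illustration: the 30-day window and the function name `comments_open` are assumptions, since the thread does not name a specific period or say how any weblog tool implements it.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff; the thread does not say how long the period should be.
COMMENT_WINDOW = timedelta(days=30)


def comments_open(posted_at: datetime, now: datetime,
                  window: timedelta = COMMENT_WINDOW) -> bool:
    """Return True while a post is still young enough to accept comments."""
    return now - posted_at <= window


# A ten-day-old post still takes comments; a six-month-old one does not.
print(comments_open(datetime(2003, 9, 1), datetime(2003, 9, 11)))  # True
print(comments_open(datetime(2003, 3, 1), datetime(2003, 9, 11)))  # False
```

Since the spam described here targets posts from months ago, even a generous window would presumably cover most of it.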
They're attempting to increase their Google pagerank by linking their website(s) in your comments. It's an unfortunate trend that's appearing in unmoderated comment threads and guestbooks around the web.
Did you look at the IP addresses? I'd bet that most of them are posted by the same person.
Posted by: Andy Baio | September 11, 2003 at 10:12 AM
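Andy Baio's IP check can be done by hand, but a few lines of code make the pattern easier to see. The sketch below assumes each comment is available as a dictionary with an `ip` field; that layout, the threshold of three, and the sample addresses (taken from the ranges reserved for documentation) are all made up for illustration.

```python
from collections import Counter


def repeat_ips(comments, threshold=3):
    """Return {ip: count} for addresses that posted at least `threshold` comments."""
    counts = Counter(c["ip"] for c in comments)
    return {ip: n for ip, n in counts.items() if n >= threshold}


# Invented sample data, not taken from the real thread.
sample = [
    {"ip": "192.0.2.10", "text": "good site"},
    {"ip": "192.0.2.10", "text": "thanks for the information!"},
    {"ip": "192.0.2.10", "text": "Great Site"},
    {"ip": "198.51.100.7", "text": "They're after your PageRank."},
]

print(repeat_ips(sample))  # {'192.0.2.10': 3}
```

A single address behind many short, formulaic comments would support the guess that one person is posting them, though a shared proxy could produce the same picture.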
Yes, they are after your page rank. Right now I don't think Google distinguishes between links in comments and links in content. So to the search engines it looks like YOU are linking to them.
Comment spam is evil.
Posted by: Daniel Von Fange | September 11, 2003 at 10:37 AM
Google has actually changed page rank so that subpages don't have much weight, not sure if the spammers know that though. The alternative is that they just hope someone clicks on the link to their site.
Posted by: Abe | September 11, 2003 at 11:08 AM
maybe they're just shy? i know i've been guilty of leaving a few simple 'great site' type comments before. the thought of increasing my page rank never even crossed my mind...
great site btw ;
Posted by: warren | September 12, 2003 at 01:04 AM
But if they're after PageRank or want readers to follow the links to their sites, why not comment on NEW posts? Weird.
Posted by: Jill | September 12, 2003 at 03:42 AM
warren, that's why I don't delete the comments--just the URL. But I strongly suspect that anyone who's using "free-dvd-trial dot com" as their URL in the comments is not innocent of comment spamming. :)
Posted by: Liz | September 12, 2003 at 07:15 AM
Hmmm. I noticed a few of those this week, as well. Mostly on pages that already have high pagerank. The URLs the posters provided were clearly commercial, even though the content of comments was benign, so my response was to leave their comments but delete the URLs.
Not a scalable solution, obviously, but for the scattering of problem comments I've had, it works.
Posted by: Liz | September 12, 2003 at 07:30 AM
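Liz's approach of keeping the comment but deleting the URL is a manual judgment call, which is why it doesn't scale. A rough way to automate part of it is a hand-maintained list of hosts already judged to be commercial. Everything here is an assumption for illustration: the comment layout, the function names, and the placeholder host `free-dvd-trial.example`.

```python
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Extract the lowercased host from a URL, with or without a scheme."""
    # urlparse only recognizes the host when a scheme or "//" is present.
    if "//" not in url:
        url = "//" + url
    return (urlparse(url).hostname or "").lower()


def scrub(comment: dict, blocked_hosts: set) -> dict:
    """Return a copy of the comment with its URL blanked if the host is blocked."""
    cleaned = dict(comment)
    if host_of(comment.get("url", "")) in blocked_hosts:
        cleaned["url"] = ""
    return cleaned


# Hypothetical blocklist built up by hand as spam shows up.
blocked = {"free-dvd-trial.example"}
spam = {"name": "Dave", "text": "Great Site...", "url": "http://free-dvd-trial.example/"}
print(scrub(spam, blocked))
# {'name': 'Dave', 'text': 'Great Site...', 'url': ''}
```

The text of the comment survives, so an innocent "great site" is not lost; only the link that would benefit the spammer goes.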
Great Site...Thanks for the information!
Posted by: Dave | September 12, 2003 at 11:45 AM
I'd never thought of comment spam, but admittedly I've wanted to trade off Steven's PageRank, linking to the site to get trackbacks. But, that said, they are legitimate comments related to what he's written, not just attempts to get links.
People in search of PageRank need to get a life. Put some content on your own website and get it standing on its own. It's like the early days of the net where a homepage was an example of how much multimedia you could put on the web.
Posted by: Nicholas Barnard | September 14, 2003 at 08:25 AM
Hmm.
How would you combat this without also becoming a censor? Even asking Google not to spider the comments page (can be done with a meta, and also with a special robots.txt file) is not a good solution: what if people put links in comments that are actually useful?
What if you could put some metadata in the link that described its worth to you (the owner of the page containing the link), which was then modified by your page's normal Google rank?
a href="spam.com" google:rating="-1"...
reed
Posted by: Reed | September 15, 2003 at 09:22 AM
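For readers unfamiliar with the robots.txt option Reed mentions, here is a small sketch of what such a file says and how a well-behaved crawler would read it, using Python's standard `urllib.robotparser`. The `/comments/` and `/archives/` paths and the `example.com` host are hypothetical; the thread does not say how this site lays out its comment pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: comment pages under /comments/, posts under /archives/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Crawlers that honor the file skip the comment pages but still index the posts.
print(parser.can_fetch("Googlebot", "http://example.com/comments/123.html"))  # False
print(parser.can_fetch("Googlebot", "http://example.com/archives/123.html"))  # True
```

This also shows the cost Reed points out: every link on the excluded pages is hidden from the crawler, useful ones included.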
I don't know what degree of customization MT has, but a really simple solution is that the URL doesn't get hyperlinked; people would only have to copy/paste the URL to visit that site, and it would be invisible to Google. Preventing the spidering through metas or robots.txt wouldn't do any good, since "they" wouldn't bother checking if the link is spiderable.
Posted by: quino | September 16, 2003 at 09:16 AM
What's up with that?
robots.txt file is Good solutions.
Good solutions depend on good diagnosis.
Posted by: Mary | September 28, 2003 at 11:05 AM
You can use java redirects also to prevent passing pagerank. In the long run, if they are only using such techniques to promote their sites, they will never make it far. I mean, if that is the primary means of advertising, how awful do you think the site looks? Nobody buys from those sites. The people who do that are inevitably wasting their own time and are their own punishment!
Posted by: Aaron Wall | October 02, 2003 at 08:06 AM
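Aaron Wall doesn't say how the redirect would be set up, so the following is only one possible reading: instead of linking straight to the commenter's site, the page links to a local redirect address that forwards the visitor. The `/out` path and the `to` parameter are invented for this sketch, and whether a given search engine follows such a redirect is not something the thread establishes.

```python
from urllib.parse import urlencode


def via_redirect(target_url: str, redirect_path: str = "/out") -> str:
    """Build a local link that forwards to target_url via a (hypothetical) redirect page."""
    return f"{redirect_path}?{urlencode({'to': target_url})}"


print(via_redirect("http://example.com/page?a=1"))
# /out?to=http%3A%2F%2Fexample.com%2Fpage%3Fa%3D1
```

Readers can still click through, but the page itself no longer contains a direct link to the commenter's site.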
On the subject of PageRank, there is definitely something far more sinister in these spam-like posts, because the majority now on my own site link to invalid domains like some guy's name dot com with an email of the same name at aol.
I believe they are chalking our sites, laying themselves a breadcrumb trail that their warez will leverage at some later date.
As for MT not hyperlinking URLs, that hasn't stopped MrHan's erection drug postings, because I don't think he expects anyone to click on those absurd links, he just wants to leave highly unique strings (like his stage-name). Besides, the posted-by is already a homepage link, and we don't want to lose that if we can help it, it's the very best way to learn more about any interesting commenter.
Posted by: mrG | October 03, 2003 at 11:29 AM
Mary: using robots.txt not only would prevent spammers from finding a blog, it would also prevent normal users from finding it through search engines.
MrG: I was talking specifically about the 'posted by' part. If the URL was visible (not 'hiding' behind the poster's name) but not hyperlinked, spammers wouldn't get any benefit from it, and users would still see the posters' URLs. Sure, maybe the spam wouldn't stop immediately, but sooner or later they would realize that it wasn't working and would spend their valuable time on other ventures.
Posted by: quino | October 10, 2003 at 11:11 AM
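quino's idea for the 'posted by' line, showing the URL as text rather than as a link, might look something like this. The function name and output format are assumptions, not how MT or any other weblog tool actually renders bylines.

```python
from html import escape


def render_byline(name: str, url: str = "") -> str:
    """Show the commenter's URL as plain text instead of wrapping it in a link."""
    if url:
        return f"Posted by: {escape(name)} ({escape(url)})"
    return f"Posted by: {escape(name)}"


print(render_byline("quino", "http://example.com/"))
# Posted by: quino (http://example.com/)
```

Escaping the name and URL also keeps a commenter from sneaking their own link markup into the byline.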