Chaining links

First, links to a pair of pieces I wrote — one new, one old, both for Linux Journal. The former is Linux and Plethorization, a short piece I put up today, and which contains a little usage experiment that will play out over time. The latter is The New Vernacular, dated (no fooling) April 1, 2001. Much of what it says overlaps with the chapter I wrote for O’Reilly’s Open Sources 2.0. You can find that here and here.

I link to those last two pieces because neither of them shows up in a search for searls + glassie on Google, even though my name and that of Henry Glassie are in both. I also like them as an excuse to object to the practice, by WordPress, Flickr and (presumably) others, of adding rel="nofollow" to the links I put in my HTML. I know nofollow is an attribute value with a worthy purpose: to reduce blog and comment spam. But while it reportedly does not influence rankings in Google’s index, it also reportedly has the effect of keeping a page out of the index if it isn’t already there. (Both those reports are at the last link above.)

I don’t know if that’s why those sites don’t show up in a search. [Later… now I do. See the comments below.] But I can’t think of another reason, and it annoys me that the editors in WordPress and Flickr, which I use almost every day, insert the attribute on my behalf. Putting that attribute there is not my intention. And I would like these editors to obey my intentions. Simple as that.
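For anyone who wants to see what an editor has done to their links, here is a minimal sketch in Python that lists which links in a chunk of HTML carry the nofollow value. It uses only the standard library. The class name, the function name and the sample markup (including the example.com link) are made up for illustration; none of this is how WordPress or Flickr work internally.

```python
from html.parser import HTMLParser


class NofollowFinder(HTMLParser):
    """Sort the href of every <a> tag by whether its rel includes nofollow."""

    def __init__(self):
        super().__init__()
        self.nofollow_links = []
        self.followed_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel holds space-separated tokens; compare them case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow_links.append(href)
        else:
            self.followed_links.append(href)


def find_nofollow(html):
    """Return (links marked nofollow, links without it) for an HTML string."""
    finder = NofollowFinder()
    finder.feed(html)
    return finder.nofollow_links, finder.followed_links


# Hypothetical sample: one plain link, one link an editor has rewritten.
sample = (
    '<p><a href="http://www.linuxjournal.com/article/4553">The New Vernacular</a> '
    '<a rel="external nofollow" href="http://example.com/">example</a></p>'
)
nofollow, followed = find_nofollow(sample)
print(nofollow)
print(followed)
```

Feeding it the saved source of a published post would show whether the editor added the attribute to links that were written without it.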

With the help of friends in Berkman’s geek cave I found a way to shut the offending additions off in WordPress (though I can’t remember how right now, sorry). But I don’t know if there’s a way to do the same in Flickr. Advice welcome.

And while we’re at it, I’m still not happy that searches for my surname always ask me if I’ve misspelled it, a recently minted Google feature that I consider a problem and which hasn’t gone away. (To friends at Google reading this, I stand by my original guess that the reason for the change is that “Searles” is somewhat more common than “Searls” as a surname. Regardless, I prefer the old results to the new ones.)



11 responses to “Chaining links”

  1. Hey, I’m doing the search [searls glassie] (I think about 20 minutes after you tweeted about this post) and I see the “The New Vernacular” piece at http://www.linuxjournal.com/article/4553 at #3 for the search.

    I see the “Linux and Plethorization” piece show up for the search [linux plethorization] (looks like that piece doesn’t have the word “glassie” in it).

    For the [searls glassie] search, I believe we’re working on the probability that glassie might also be a search for glassy (for most users, that would be true). If you search for [+searls +glassie] (to search for those exact words), I do see the “Making a New World” piece on the second page of the results.

  2. Ah, here’s the reason why we don’t return the first url you want http://searls.com/doc/os2/docchapter.html for that search: you blocked all search engines from crawling in http://searls.com/robots.txt , which says
    “# go away
    User-agent: *
    Disallow: /”

    Blocking us in robots.txt means that we can’t crawl/index the page, so there’s no way we could know that the page has the word “glassie” on it. If you let us in, I suspect we’d crawl/index/return the page. Nothing to do with the nofollow attribute at all; we’re just blocked in robots.txt.

    By the way, if you do the search for that url in Google with http://www.google.com/search?q=http://searls.com/doc/os2/docchapter.html you’ll see that we know the url exists, but we’re not showing a snippet for it. That’s because we can’t crawl the page and couldn’t generate a snippet.

  3. I don’t know if it would help with the nofollow problem (or if you’ve already done it) but registering the blog at Google’s Blog Search comes to mind:

    http://blogsearch.google.com/ping?hl=en

  4. Wow, Matt. Well done. On a weekend, no less. I suppose that particular robots.txt file dates from ’95 or something. I need to take another look at the searls.com directory and see what I should hide and expose.

    Thanks!

  5. Nice piece of information on “nofollow”, and on Matt’s comments about robots.txt. I still have a lot to learn. Thanks.

  6. No worries–happy to try to help. I chatted with the team about “searls” and “searles.” The problem is that the latter is almost always a helpful spelling suggestion for “searls.” They’re mindful that it impacts you, but they didn’t want to make changes that would hurt the experience for the majority of users. And when they put it that way, it’s difficult to make the case for removing the spelling suggestion when it would hurt the search experience for the majority of users.

  7. Thanks, Matt. Does “helpful spelling suggestion” mean that the majority of people searching for “searls” click on “searles” when it’s suggested?

  8. I didn’t actually understand what the nofollow attribute does. I mean, if I put nofollow on my site, will the sites linking to me stop getting link juice from my site?

  9. Responding to Kevin above…

    The “nofollow” attribute is basically a tag that tells Google to not treat any link marked with that tag as an endorsement for that page. Yes, it means don’t give it any “link juice”. However, it doesn’t stop Google from crawling the link.

    It doesn’t affect sites that link to you, but it does affect sites that you link to from your site.
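The robots.txt explanation in comment 2 can be checked without touching the network. What follows is a rough sketch, assuming Python's standard urllib.robotparser module is a fair stand-in for how a well-behaved crawler reads the file. The rules are the ones quoted in that comment; the helper name can_crawl and the permissive variant are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in comment 2.
BLOCK_ALL = """\
# go away
User-agent: *
Disallow: /
"""

# A hypothetical permissive variant: an empty Disallow blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "http://searls.com/doc/os2/docchapter.html"
blocked_result = can_crawl(BLOCK_ALL, page)
open_result = can_crawl(ALLOW_ALL, page)
print(blocked_result)  # False: every crawler is shut out of every path
print(open_result)     # True: nothing is disallowed
```

With "Disallow: /" under "User-agent: *", every path on the site is off limits to every crawler that honors the file, which matches the explanation that the page's text could not be read regardless of any nofollow attribute.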
