feed-abstract gem updated to support twitter RSS and Atom

I updated my feed-abstract gem to support twitter RSS/Atom, in that it will automatically parse hashtags and turn them into RSS item subjects/categories. Huzzah! This is pretty fun, as it allows tweets to be aggregated into TagTeam seamlessly and they can be remixed, archived, and searched by tag.

You can get at twitter RSS/Atom via URLs like:

https://search.twitter.com/search.atom?q=<url-encoded hashtag>

so:

https://search.twitter.com/search.atom?q=%23rails

I’m sure there are more search parameters available too. If you want RSS, just change the “.atom” to “.rss”.
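For example, language filtering appeared to work via an extra parameter (hedging here – I haven't dug through the full search API docs):

https://search.twitter.com/search.atom?q=%23rails&lang=en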

OpenWRT, rsync, and linux love

I use an rsync / hard link backup system of my own design (but similar in concept to this).  I have it providing 180 days of backups for numerous production machines spread around the internet, along with more permanent external backups provided by spideroak (referral link, but we both get free stuff if you sign up).  My internal backup system serves as my hot backups, so I want it available 24×7 via a remote file mount (sshfs) should I need it.
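The core trick, for the curious, is rsync's --link-dest option. A minimal sketch – not my exact script, and the host and paths are made up:

#!/bin/bash
# Hard-link snapshot backup sketch: files unchanged since the last
# snapshot are hard-linked against it instead of copied, so each run
# only costs the bytes that actually changed.
NOW=$(date +%Y-%m-%d-%H:%M:%S)
LAST=$(ls -1d /backups/myhost/back-* | tail -1)
rsync -a --delete \
  --link-dest="$LAST" \
  backupuser@myhost:/etc/ "/backups/myhost/back-$NOW/"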

The machine I was running it on, though, was WAY overpowered and idles at around 70 watts – this was having a noticeable effect on our electricity bill.  So I put OpenWRT on my asus wl500gp v2, and it now does my backups. It’s silent and fanless, and combined with a good external USB drive it has as much storage as you can afford. It also idles at 5 watts total, device and USB drive combined!

Notes:

  • The external drive is formatted for small files, with a higher inode / block ratio. I just did “mke2fs -T small /dev/partition” from a full linux machine for the format – these options are a better fit for the usage pattern of an rsync / hard link backup system.
  • I installed the openwrt image with the 2.4 kernel because it seems it has better hardware support for this device – it works great.
  • I disabled the wlan and lan, leaving only the wan enabled with a static IP. I port forwarded an external port from my verizon router to allow ssh access from anywhere.
  • I had to install openssh via opkg because the dropbear ssh client doesn’t support outgoing key auth with openssh-style keys – dropbear uses its own key format.
  • I switched the default shell from ash to bash – just too many minor differences for me.
  • rsync is available via opkg. Install it.
  • cron is provided by busybox and has some minor differences in crontab syntax; I could not get @reboot jobs working (a workaround crontab entry is shown below, after the reaper script).
  • USB storage is fairly easy to set up. I found, however, that the external device partitions were recognized at “/dev/discs/disc0/part1” instead of the more traditional “/dev/sda1” locations. No biggie, just odd. You should read and implement the “start on boot” section of the OpenWRT USB storage docs.
  • You can see syslog output via the command “logread”.
  • I needed to slow down the automount process via a “sleep” command to allow the drive to spin up before mounting. Details here. Once I put in that delay, automount worked great (see the sketch after this list).
  • “find” provided by busybox is way limited compared to gnu find, and I can’t seem to locate gnu find in the otherwise complete openwrt repos. Busybox find can’t search based on modification times or link counts – both key to how I implemented my backup system, since I reap backup directories via their modification times to expire old backups. I installed ruby and ruby-core (which contains the ruby stdlib that provides the file / directory classes) and wrote my own little timed reaper. Source is below.
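For the mount delay mentioned above, the relevant bit of my automount script looks roughly like this (the mount point is made up; the device path is the one openwrt gave me):

# Give the USB drive a few seconds to spin up before mounting,
# otherwise the mount fails at boot.
sleep 10
mount /dev/discs/disc0/part1 /mnt/backups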

It’s working great so far – quiet, low-power and fast enough for me.

Stupidly simple timed directory reaper written in ruby

# expire_directories.rb. My backup directory names all look like "back-2012-01-27-04:44:05", 
# hence the regex along with the date check.
require 'find'
require 'fileutils'

days = ARGV[0]

# Guard against a missing argument: nil.to_i is 0, which would
# expire every backup directory immediately.
abort 'Usage: ruby expire_directories.rb DAYS' if days.to_i < 1

cutoff = Time.now - (60 * 60 * 24 * days.to_i)

Find.find('./') do |path|
  if FileTest.directory?(path)
    if path.scan('/').size == 1
      # Top-level directories only, e.g. "./back-2012-01-27-04:44:05"
      if path.match(/back/) && (File.stat(path).mtime < cutoff)
        puts "Removing: #{path}"
        FileUtils.rm_rf(path)
      end
    else
      # Don't descend into the backup directories themselves.
      Find.prune
    end
  end
end

invoked thusly:

cd /some/directory/that/contains/your/backup_directories && ruby ~/bin/expire_directories.rb 60

so pass it the number of days. Be sure you're in the proper directory before running this – it's doing an "rm -rf".
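Busybox cron couldn't handle @reboot, but a plain timed entry works fine – something along these lines (paths illustrative):

# Expire snapshots older than 60 days, nightly at 4:30am.
30 4 * * * cd /mnt/backups/myhost && ruby /root/bin/expire_directories.rb 60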

YaCy – a p2p search engine

So I’m running a YaCy node – which is a pretty awesome project to create a search engine indexed “by the people, for the people.”

YaCy provides a Java servent (server + client) that can index internal resources and external web pages. You have MANY controls over what it indexes, how it indexes it, and the resources allocated to it. There are tons of built-in analytics and logging for the stats geek in you.

It’s still rough, but seems damned promising.  A bonus – it uses jQuery and Solr.

I really like the idea of indexing all the content you care about and also providing that index to the world at large to search, but I have concerns over the long-term impact of more ‘bots crawling the web. I would like to see YaCy figure out a way to minimize its impact on a global level – if every YaCy node is indexing the same sites, it could easily escalate to a DDoS-level problem. Perhaps they’re already working on this issue.

Inferring password storage security through maximum password length requirements

I’d like to follow up on a fairly obvious thought I tweeted today:

If a site has password length limits, it’s probably stored in plaintext. Hashed passwords (of any size) have a predicable length.

Excuse the typo, please. Explanation:

The sane operating procedure for password storage is to use a hash function (md5, sha256, bcrypt and similar) along with a “salt” to one-way hash the password. This hash is a known length no matter how long the password is – for a hexadecimal md5 sum, it’s 32 characters. When a user comes back to authenticate, you take the password they supply and the salt, calculate the hash, and compare it against the hash you have on file. If they match, the user has entered their password properly.

There’s no logical reason to limit a hashed password’s length: a 4 character password and a 4000 character password will both calculate to the same size hash.
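You can see this for yourself from a shell – both of these produce a 64-character hex digest, whether the input is 4 characters or 4000 (the salt value is made up for illustration):

printf '%s' "somesalt:abcd" | sha256sum
printf '%s' "somesalt:$(head -c 4000 /dev/zero | tr '\0' 'x')" | sha256sum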

So if a website has a maximum password length, it means they are probably NOT hashing it, and the limit they apply is to keep you from exceeding the column length of the table they store it in. @abackstrom pointed out here that it’s probably the same situation for disallowed “special characters” – they shouldn’t care what you enter if all they’re doing is hashing your password. If they do care, they are probably storing it directly in a database.

Why is this bad? Many obvious reasons:

  • Your unencrypted password is sitting in a database, waiting to be stolen if/when the site in question is compromised,
  • Many people share passwords amongst websites, so a list of email addresses and unencrypted passwords can lead to many more compromised accounts on sites all over the web,
  • It’s unnecessary. There’s almost no valid reason to store an unhashed password, especially with all the excellent authentication frameworks out there.

What can you do?

If you’re a website user, complain! If a website has a maximum password length or a restriction against special characters: contact them! Tell them you don’t appreciate their lax security. Ask them to support OpenID, ask them if they store passwords unhashed – just let them know this is important.

If you’re a developer, use an authentication library known to do things right in your language of choice. Don’t write your own, busted, authentication system.

Diff’ing wordpress releases

A new wordpress release is out and you’d like to know what’s changed? Sure, you can read the changelog, but is that REALLY enough?

Point your interweb box to: http://core.svn.wordpress.org/tags/ and then use svn to show you exactly what’s changed. Pick the URLs to the tags that represent the two releases you’d like to compare, and then:

svn diff http://core.svn.wordpress.org/tags/3.0.4/ \
http://core.svn.wordpress.org/tags/3.0.5/  > ~/tmp/wp-3.0.4_to_3.0.5.txt

Then inspect “~/tmp/wp-3.0.4_to_3.0.5.txt” to see what’s changed.
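If the full diff is more than you want, svn can also just summarize which files changed:

svn diff --summarize http://core.svn.wordpress.org/tags/3.0.4/ \
http://core.svn.wordpress.org/tags/3.0.5/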

Pre-loaded Image Rollovers with jQuery

I despise implementing image rollovers, but this makes it almost tolerable. Any <img> tag with a “hover” attribute referring to an image URL will have a rollover behavior attached to it. For bonus points, the rollover image will be pre-loaded, so there’s no momentary delay the first time an on-state image is loaded.

In your HTML source:

<img src="/images/button.gif" hover="/images/button_on.gif" />
<img src="/images/another_button.gif" hover="/images/another_button_on.gif" />
jQuery(document).ready(function(){
  jQuery('img[hover]').each(function(){
    // Preload the rollover image - setting src on a detached
    // element is enough to pull it into the browser cache.
    var imageEl = jQuery('<img alt="" />');
    imageEl.attr('src', jQuery(this).attr('hover'));

    jQuery(this).hover(
      function(){
        // swap the image.
        var hoverSrc = jQuery(this).attr('hover');
        var regSrc = jQuery(this).attr('src');
        jQuery(this).attr('src',hoverSrc);
        jQuery(this).attr('hover',regSrc);
      }
    );
  });
});

Thanks jQuery for being so awesome.

A simple bash gpg “password safe”

This is definitely not military grade – but if you need a simple way to manage a GPG-encrypted file containing info you want to protect, this works pretty well for me. I use it to manage a set of passwords on a trusted machine that I can ssh to.

It assumes you’re using a bash-like shell and have a trusted private key available in the account you’re running it on.


#!/bin/bash

KEYS=""

if [ ! -e "$HOME/private" ]
then
    mkdir -m 700 "$HOME/private"
fi

# umask doesn't take a path - set it globally so everything
# created below is readable only by us.
umask 077

# Always delete the unencrypted file at the end of the session. We DO NOT want this hanging around.
trap "rm -f \"$HOME/private/${USER}_private_store.txt\"; chmod 600 \"$HOME/private/${USER}_private_store.txt\"*; exit" INT TERM EXIT

touch "$HOME/private/${USER}_private_store.txt"
chmod 600 "$HOME/private/${USER}_private_store.txt"

gpg --decrypt "$HOME/private/${USER}_private_store.txt.asc" > "$HOME/private/${USER}_private_store.txt"
vim "$HOME/private/${USER}_private_store.txt"

md5sum "$HOME/private/${USER}_private_store.txt" > "$HOME/private/${USER}_private_store.txt.md5sum.new"

if [ -e "$HOME/private/${USER}_private_store.txt.md5sum" ]
then
    if [ "`cmp "$HOME/private/${USER}_private_store.txt.md5sum.new" "$HOME/private/${USER}_private_store.txt.md5sum"`" == "" ]
    then
        clear
        rm -f "$HOME/private/${USER}_private_store.txt.md5sum.new"
        echo 'No changes, not re-encrypting'
        exit
    fi
fi

mv "$HOME/private/${USER}_private_store.txt.md5sum.new" "$HOME/private/${USER}_private_store.txt.md5sum"

echo 'File has changed. Re-encrypting. . .'
gpg -a --encrypt -r "$KEYS" "$HOME/private/${USER}_private_store.txt"
clear

The first time it runs, it’ll create a private directory, start vim, and encrypt the text you enter into vim. On subsequent runs it’ll prompt you for your private key passphrase and repeat the cycle. It won’t re-encrypt if there haven’t been any changes.
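One setup note: KEYS at the top of the script needs the ID or email address of a key you actually hold – “gpg --list-keys” will show what’s available (the address below is made up):

gpg --list-keys
KEYS="you@example.com"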

I’m betting wordpress messes up the code, so here’s the text file: edit_password_safe.sh.