Al Hoang

September 9, 2009

Spamassassin SIGPIPE errors and the zero file mail message mystery

Filed under: fixes, gripe, sysadmin — hoanga @ 1:52 am

A while back I noticed I was definitely losing emails. As one might imagine, this is a scary experience since it raises the question of whether the mail system in use is doing something funny to the mail.

The first place I looked was the mail logs for the SMTP server and other associated daemons. However, I saw nothing in the mail logs, which was not a very comforting thought.

After more investigation I would notice empty files like this every once in a while…

~/Maildir)  ls -la new/
total 4
drwx------   2 al  al   512 Jun  8 00:25 .
drwx------  69 al  al  2048 Jun  8 00:25 ..
-rw-------   1 al  al     0 Jun  8 00:22 1244388142.30600_.myserver.net

This gave me more clues on where to look next. I looked in my Procmail logs for this particular mail ID and noticed the process handling this message had been killed by SIGPIPE:

procmail: Executing "/usr/local/bin/spamassassin"
[84028] warn: spamassassin: killed by SIGPIPE
procmail: [84026] Tue Apr 14 21:45:26 2009
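
If you suspect the same problem, grepping the procmail log for the warning shows how often it has happened. A minimal sketch, assuming procmail logs to ~/procmail.log (the real path is whatever LOGFILE is set to in your .procmailrc); the function name is made up for illustration:

```shell
# Count SIGPIPE kills recorded in a procmail log.
# The default path is an assumption; procmail writes wherever LOGFILE points.
sigpipe_kills() {
    log=${1:-$HOME/procmail.log}
    grep -c 'killed by SIGPIPE' "$log"
}

# Usage:
#   sigpipe_kills                # checks ~/procmail.log
#   sigpipe_kills /path/to/log   # checks another log
# Swap grep -c for grep -n to see the matching lines instead of a count.
```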

Googling dug up the following links that explain it all:

 http://www.nabble.com/Zero-exit-code-aft…
 https://issues.apache.org/SpamAssassin/s…

Verdict:
Upgrade Spamassassin

Since I have upgraded Spamassassin, the zero byte email mystery has resolved itself.
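
To keep an eye out for a recurrence, find can list zero-byte messages directly instead of eyeballing ls output. A small sketch, assuming a standard Maildir layout under ~/Maildir; the helper name is my own invention:

```shell
# List zero-byte messages in the new/ and cur/ folders of a Maildir.
# list_empty_mail is a hypothetical helper; pass the Maildir path as $1.
list_empty_mail() {
    maildir=${1:-$HOME/Maildir}
    # -size 0 matches only files that occupy zero 512-byte blocks, i.e. empty files
    find "$maildir/new" "$maildir/cur" -type f -size 0 2>/dev/null
}

# Usage:
#   list_empty_mail              # checks ~/Maildir
#   list_empty_mail /some/Maildir
```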

June 2, 2009

Using a non-standard port for Capistrano SSH gateways

Filed under: fixes, gripe, ruby, sysadmin, tech — hoanga @ 8:10 am

I have a love-hate affair with Capistrano. It is a great tool if you are a Ruby person and need to do something NOW on a bunch of machines. But the docs are in a constant state of suck from my point of view.

The Capify.org website helps for remembering the ‘simple’ details of what Capistrano can do. But where I waste a lot of my time is on questions like, “How do I set the Capistrano SSH gateway to a non-standard port?” Luckily, Capistrano is written in Ruby, so it is easy enough to glance through the code and eventually find the answer, but this is why good tech docs exist: to give enough context to answer those questions.

To answer my own question, below is a snippet you can add to your Capfile if you need to deploy through an SSH gateway that lives on a non-standard port:

# Add this to your Capfile
# This sets the SSH gateway to a machine called mysshgateway.com on port 22222
set :gateway, 'mysshgateway.com:22222'
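
Before blaming Capistrano for a failed deploy, it can help to confirm that plain ssh reaches the gateway on that port. A rough sketch that splits the same host:port string and prints the matching ssh command; the helper name is hypothetical, the hostname is the example one from above, and the string is assumed to always contain a port:

```shell
# Print an ssh command that tests a gateway given as host:port.
# gateway_ssh_cmd is a made-up helper, not part of Capistrano.
gateway_ssh_cmd() {
    gateway=$1
    host=${gateway%%:*}   # everything before the first colon
    port=${gateway##*:}   # everything after the last colon
    echo "ssh -p $port $host true"
}

gateway_ssh_cmd 'mysshgateway.com:22222'
```

If the printed command logs in and returns without complaint, the gateway side is fine and the problem is more likely in the Capfile.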

April 25, 2009

Glad I’m not the only one who prefers monit over god

Filed under: gripe, ruby, sysadmin — hoanga @ 8:44 am

Seems someone else ran into issues while trying to deploy god.

While I don’t think god sucks, I definitely don’t endorse it. At this point I would only use it under the following conditions:

  • Need for a process monitor tool with more dynamic configuration setups. This is where god really shines against monit’s simpler understanding of what process management is about.
  • The host that needs monitoring can easily spare at least 16MB for a monitoring process. See below on why.
  • I really want an all Ruby solution for all the tools in a system

In general, I am into the whole ‘It is Open Source. If you’re having issues, fix it’ deal so I am not nearly as angry sounding as Brad is about god. However, after having issues with god, I switched to monit for simple process monitoring and restarting. I had far less troubles and got on with other tasks that I considered more important than perfection in a process monitoring system.

For those that are curious here are the issues that I ran into with god:

  • Daemonized Ruby took at least 8MB of RAM for the monitoring process. With RAM the way it is, this is not as big a deal. However, if you are trying to get by on a 128MB VPS host every kilobyte counts.
  • God itself had issues just randomly dying after some time. Tom promptly fixed it after it was reported and that was great. However, it was a little disappointing that a monitoring process just died.
  • Sparse documentation compared to monit’s. Then again this is typical of many Ruby projects and luckily Ruby code is readable enough.
  • Digging up known issues for god required noodling through groups, forums, and blog posts. Would have been nice to just have a friggin’ FAQ like other sys admin-targeted software I have seen.

I also DO agree as has been said in the comments on Brad’s post that it is the responsibility of the deployer of software to handle the issues with whatever they deploy and just deal with it. The reason I say this is because I fell for the hyped up description of god in the beginning and ultimately paid the price when it sucked up my time. I dealt with it but definitely am less impressed with overhyped marketing descriptions of software these days. Personally, I am not a fan of that type of marketing for software since it seems a little disingenuous to me. But that is just me.

December 7, 2008

Getting X working again after swapping hardware on Open Solaris nv100

Filed under: fixes, gripe, solaris, unix — hoanga @ 10:18 am

After having dain bramaged myself for years with Linux usage, I had gotten spoiled into believing an OS should make it simple to do the following:

1. Shutdown computer
2. Swap around hardware components
3. Restart
4. Life is good

However any techie should tell you this is a pipe dream on Windows. Mac users probably have no clue since they never change hardware components and just buy new Macs to solve their problems. Which leaves the lucky OSS *nix variants to try stunts like this.

Being the stubborn person I am, I attempted this with OpenSolaris by swapping out my motherboard. I wanted to do this in order to take advantage of the E7400 Core 2 Duo that I bought a while back. Things almost worked; however, on reboot I was given the dreaded console login screen with a useless keyboard. The following, as far as I know, don’t work…

1. CTRL-ALT-BACKSPACE
2. CTRL-ALT-Fn
3. CTRL-ALT-DEL

Your best bet is to ssh somehow and try to look for clues. Here is what I did…

1. Swap motherboard and stare at dark screen
2. Find out how to boot into single user mode and make sure the kernel isn’t PO-ed or something and find my IP address
3. Move away the X11 configuration that I configured (dual-display) and try rebooting
4. Reboot and find out it isn’t working
5. ssh in and realize it still isn’t working. Move the old dual display X11 config back to /etc/X11/xorg.conf
6. Try restarting gdm with svcadm restart gdm and watch it fail
7. Scritch head some more
8. Try starting X from the SSH session and whoah it works
9. Restart gdm (svcadm restart gdm) and now I get a login screen
10. Realize that I disconnected the left monitor (VGA) to help debug and want it back
11. Logout and log back in. I now have dual screens and a working Solaris install again!
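
The config-shuffling part of the steps above can be sketched as a small helper that moves the X11 config out of the way before a restart attempt. The function name and the .bak suffix are my own choices for illustration; the svcadm and svcs lines are left as comments since they only make sense on the Solaris box itself:

```shell
# Move an X11 config aside and print where the backup went.
# backup_xorg_conf is a hypothetical helper; the default path is the one from the steps above.
backup_xorg_conf() {
    conf=${1:-/etc/X11/xorg.conf}
    backup="$conf.bak"
    [ -f "$conf" ] || { echo "no config at $conf" >&2; return 1; }
    mv "$conf" "$backup" && echo "$backup"
}

# Then, from the SSH session on the OpenSolaris machine:
#   svcadm restart gdm
#   svcs -x gdm    # explains why the service is not running, if it failed
```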


November 29, 2008

Getting KDE 4.1.0 on a Fedora 8 machine when KDE 3 is already there

Filed under: fixes, gripe, linux, stupid — hoanga @ 9:32 am

According to the Fedora FAQ one should be able to update with just this

sudo yum --enablerepo=updates-testing groupupdate "KDE (K Desktop Environment)"

However when I did, I ran into some icons from packages kdepim-3.5.9 and kdegraphics-4.1.0 conflicting with packages crystalsvg-icon-theme and libkipi. Here is a log…

  file /usr/share/icons/crystalsvg/48x48/apps/kpalmdoc.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_contacts.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_date.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_journal.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_mail.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_news.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_notes.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_summary.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_summary_green.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/kontact_todo.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/crystalsvg/64x64/actions/rss_tag.png from install of kdepim-3.5.9-10.fc9.i386 conflicts with file from package crystalsvg-icon-theme-4.0.4-1.fc9.i386
  file /usr/share/icons/hicolor/16x16/apps/kipi.png from install of kdegraphics-4.1.0-3.fc9.i386 conflicts with file from package libkipi-0.1.5-4.fc9.i386
  file /usr/share/icons/hicolor/22x22/apps/kipi.png from install of kdegraphics-4.1.0-3.fc9.i386 conflicts with file from package libkipi-0.1.5-4.fc9.i386
  file /usr/share/icons/hicolor/32x32/apps/kipi.png from install of kdegraphics-4.1.0-3.fc9.i386 conflicts with file from package libkipi-0.1.5-4.fc9.i386
  file /usr/share/icons/hicolor/48x48/apps/kipi.png from install of kdegraphics-4.1.0-3.fc9.i386 conflicts with file from package libkipi-0.1.5-4.fc9.i386

Error Summary
-------------
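
With that many lines it is easy to lose track of which installed packages are actually in the way. A quick sketch that boils saved yum output down to the unique package names on the right-hand side of each conflict; the function name is made up, and the log file is whatever you saved the yum output as:

```shell
# Reduce saved yum output to the installed packages that own conflicting files.
# conflicting_packages is a hypothetical helper; $1 is the saved log file.
conflicting_packages() {
    sed -n 's/.*conflicts with file from package \(.*\)$/\1/p' "$1" | sort -u
}

# Usage:
#   conflicting_packages yum_conflicts.log
```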

I tried deleting these packages manually but that led into an even deeper rat’s nest of dependency hell. So one thing I tried was

$ sudo yum groupremove "KDE (K Desktop Environment)"
$ sudo yum --enablerepo=updates-testing groupinstall  "KDE (K Desktop Environment)"

However, I still got conflicts, so I ran

$ sudo yum remove libkipi
$ sudo yum remove crystalsvg-icon-theme
$ sudo yum --enablerepo=updates-testing groupupdate  "KDE (K Desktop Environment)"

And finally I have got KDE 4.1.0 to install in way too many steps.

I can’t get no satisfying manpages

Filed under: gripe, solaris, stupid — hoanga @ 9:29 am

It seems that most default UNIX-like installations outside of FreeBSD just include craptastic manpages.

This really puts a dent in RTFM. In Linux-land this has been a serious PITA for years (yes, I know how to hunt around for the packages to install the manpages in Redhat-ish and Debian land, but it is still an irritation). It seems my forays with OpenSolaris mirror the fun of Linux. Ah well..

$  man ps
Reformatting page.  Please Wait... done

Miscellaneous                                          missing(x)

     missing - Missing Manual Page

DESCRIPTION
     Unfortunately, this OpenSolaris Developer Preview  does  not
     include  the  manual  page you are looking for.  We're sorry
     and hope to improve upon this situation in future releases.

     Online versions  of  many  manual  pages  are  available  at
     http://docs.sun.com/app/docs/coll/40.17.

SunOS 5.11            Last change: 07/10/25

November 15, 2008

Japanese declining population rates? How come I can’t find a doctor?

Filed under: gripe, japan, stupid — hoanga @ 7:27 am

少子化 is the term used to describe the shrinking population in Japan and the many hardships that will bring to the country economically. I remember hearing some analyses of why this is happening from a social standpoint. I’ve not heard (or really bothered that hard to search for) many proposals on how to fix the problem. However, when I read stories like this, it makes me wonder if some policy makers are out of touch with reality, sometimes.

September 3, 2008

Losing data in the clouds

Filed under: scaling, stupid, tech — hoanga @ 8:23 am

Seems that some cloud vendors (sheesh I really only knew about Google Ape… err App Engine, Amazon’s EC2 service, and GoGrid) have been having some issues watching customer data go up in a poof. oops.

Datacenter Knowledge mentions Flexiscale having issues

The problems for FlexiScale began when one of the main storage volumes was accidentally deleted by an employee during a system upgrade earlier this week.

as well as another company called LinkUp (although I guess it’s down now) having issues too:

a cloud storage previously known as Media Max, which shut down Aug. 8 after losing “an unspecified amount of customer data”

Ouch. As the industry fashionistas and the hordes of zombies (and I guess everyone else pulled in by the vortex they create) move towards fulfilling what Nicholas Carr dubs the Big Switch, I can only surmise that more cloud computing companies will go belly up due to dumb mistakes, poor execution, etc. (name your classic dumb mistake here). This will leave behind the vendors who have spent time at multiple levels (besides just technology) to ensure that the typical mistakes that hit any IT business can be recovered from instead of being the critical wound that kills.


July 14, 2008

On the irritation of trying to run GPG on a remote headless server

Filed under: gripe, stupid, unix — hoanga @ 12:40 am

My short, short suggestion is… don’t. This blog post explains it much better than I can, although I have definitely been the victim of trying to generate a GPG key on a headless server to no avail:

While trying to generate a gpg keypair on a remote server, I discovered I lack entropy. Eventually I had to physically type on the keyboard in order to generate enough random bytes.

I guess in the meantime the best workaround is to generate a GPG key on a workstation somewhere and transfer that to the server in question. A little hokey but I guess if you truly need a GPG key that would be my suggestion.
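
That workaround can be sketched as a one-line pipe: export the secret key on the workstation and import it on the server over ssh. The helper name, key ID, and hostname below are placeholders, and this assumes gpg is installed on both ends and you can ssh to the server:

```shell
# Copy a secret key from the workstation to the server over ssh.
# push_gpg_key is a hypothetical helper; KEYID and the host are placeholders.
push_gpg_key() {
    keyid=$1
    server=$2
    gpg --export-secret-keys --armor "$keyid" | ssh "$server" gpg --import
}

# Usage (after generating the key locally with: gpg --gen-key):
#   push_gpg_key KEYID user@server.example.com
```

Piping straight over ssh means the exported key never sits in a temp file on either machine.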


June 19, 2008

On the future (or lack) of Nitro Web framework

Filed under: Open Source, gripe, programming — hoanga @ 11:05 am

Before there was Rails, there existed other web frameworks for Ruby. One of the promising ones was Nitro; however, it fell into obscurity. While surfing at random I ran across this blog post from one of the main authors of the Nitro Web Framework.

It’s an interesting read since the author laments how Nitro could have been a contender but never took off. The comments are enlightening since it seems there were project (mis)management issues that prevented a community from really forming around Nitro. Anyways, by now, Rails has the lion’s share of the attention, so any hopes would rest on Nitro being so much better that people will switch. My understanding is that it does some things better, but not by a big enough jump to get people interested, unfortunately.


June 18, 2008

Happy half-a-decade blog

Filed under: geek, stupid — hoanga @ 8:24 pm

Yeesh, I just looked at the date of my first post and it’s been a whole 5 years since I started blogging here.

What have I learned? People who blog multiple times a day must either have some kick-ass tools to do it quickly or way too much time. I seem to have neither so this blog gets infrequently updated.

May 13, 2008

Why does RTFM with GNU manpages just suck

Filed under: gripe, stupid, unix — hoanga @ 9:00 am

After looking at a seriously useful tip for cleaning out a clogged postfix queue I puzzled over the -r option in xargs. I’m not an xargs master but I figure if I need it I can always RTFM.

Well, man xargs on my Ubuntu box provides the typical GNU fare and I spend way too long looking for where the -r option is. (The full option is --no-run-if-empty.)

Sadly, I ended up finding it faster with a Google search, on this web page that has some find-fu and xargs, than by looking again at the GNU manpage, where I finally found the option buried just before the EXAMPLES section.
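
For the record, here is what the option does, as a tiny demo. This assumes GNU xargs as shipped on a Linux box; other implementations may behave differently when the input is empty:

```shell
# With empty input, GNU xargs still runs the command once unless -r is given.
printf '' | xargs echo "ran without -r"     # GNU xargs prints: ran without -r
printf '' | xargs -r echo "ran with -r"     # prints nothing, the command is skipped

# With real input, -r changes nothing.
printf 'a\n' | xargs -r echo "got"          # prints: got a
```

That is why -r shows up in queue-cleaning one-liners: if nothing matches, the destructive command at the end of the pipe never runs with an empty argument list.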

I guess I should be happy this time around, there are some GNU manpages that don’t even HAVE documentation (at least compared to the *BSD brethren). Grrrrrr

May 3, 2008

What the fork are you doing Pidgin devs?

Filed under: Open Source, gripe, stupid — hoanga @ 9:58 am

After glancing at the Slashdot post on the forking of pidgin and wasting far too much time slogging through the ticket that caused a bit of strife, I’m pretty sure I will move away from Pidgin until the developers stop being dorks (highly unlikely since they seem to develop only for themselves).

Quick, quick summary of the whole issue: the pidgin developers decided to make the input text box very small and auto-resizing (up to a certain point) based on some fancy heuristic. Quite a few users have jumped up and down and requested that this be made optional; however, the pidgin developers basically said, ‘Go take a hike’. There has been quite a bit of reaction to it, ranging from the creation of a plugin that brings back the old functionality to a full-on fork of the whole project.

In general, forks are a bit of wasted effort for the Open Source community as a whole, but one will never get the ideal situation where all developers just ‘get along’ and combine their energies into the One True Implementation. So, from a pragmatic standpoint, forking seems to get the job done, although with a lot of burnt cycles.

I believe migrating to something like Funpidgin (a fork of Pidgin that aims to listen more to the community) and making sure that it gets enough momentum to stay alive is probably the best answer to extremely stubborn developers wanting to do things their own way at the expense of the ‘users’. However, I guess we’ll just have to wait and see if the fork gets enough energy to keep itself running.

April 30, 2008

One way to clean out a gazillion files in a directory without causing the server to hang on IO

Filed under: fixes, stupid, tech, unix — hoanga @ 5:37 am

Had a case where I had some Rails app that was using files for its session store and had been running like that for months. While it was a careless (and dumb) thing to run it that way, we had to do something about it since it was eating up close to 85% of the system partition. The dumb way to try to do this is below:

nice find /tmp/ -name 'ruby_sess.*' | xargs -n 100 rm -fv

However, the problem is that this causes insanity on IO which is NOT good for a running service. (Luckily this service was merely important instead of critical) So, I present to you my quick and dirty script that will clean up all those nefarious ruby sessions files that have run amok for months.

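# Delete 20 session files every 10 seconds so the disk gets room to breathe.
# The loop never exits on its own; hit Ctrl-C once find stops turning up files.
while true; do
    nice find /tmp/ -name 'ruby_sess.*' | head -n 20 | xargs rm -fv; sleep 10
done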
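
A variant of the loop above that stops by itself once nothing is left might look like the following. This is only a sketch: the function name and its parameters are made up, and it assumes the file names contain no spaces or quotes (true for ruby_sess.* files, not in general):

```shell
# Delete matching files in small batches, pausing between batches,
# and return once no matching files remain.
# purge_in_batches DIR PATTERN [BATCH_SIZE] [PAUSE_SECONDS] is a hypothetical helper.
purge_in_batches() {
    dir=$1 pattern=$2 batch=${3:-20} pause=${4:-10}
    while :; do
        # Grab the next batch of file names; stop when find comes back empty
        files=$(nice find "$dir" -name "$pattern" -type f | head -n "$batch")
        [ -z "$files" ] && break
        printf '%s\n' "$files" | xargs rm -f
        sleep "$pause"
    done
}

# Usage, matching the numbers used above:
#   purge_in_batches /tmp 'ruby_sess.*' 20 10
```

The batch size and pause are the knobs to turn: smaller batches and longer pauses are gentler on IO but take longer to finish.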

April 14, 2008

OS X 10.4.11 update, you suck

Filed under: fixes, gripe, osx, stupid — hoanga @ 10:10 am

After a long time of not updating my trusty old iLamp iMac, I finally updated it to 10.4.11 over the weekend and let it lie. Later on I hear a report that Safari won’t start up.

That’s odd, I’ve never heard of Safari having launch problems before. I check the log and I see something like the following:

Date/Time:      2008-04-14 22:06:45.474 +0900
OS Version:     10.4.11 (Build 8S165)
Report Version: 4

Command: Safari
Path:    /Applications/Safari.app/Contents/MacOS/Safari
Parent:  WindowServer [86]

Version: 3.1 (4525.13)

PID:    20987
Thread: Unknown

Link (dyld) error:

Symbol not found: _WebDatabaseDirectoryDefaultsKey
  Referenced from: /Applications/Safari.app/Contents/MacOS/Safari
  Expected in: /System/Library/Frameworks/WebKit.framework/Versions/A/WebKit

Those types of messages go on and on in Console.app. My first round of Googling brought me to a Macrumors thread that recommended deleting the history.plist and other parts from ~/Library/Safari. I tried removing various files from that directory to no effect, and finally just removed the directory altogether. Nada.

Troubleshooting a PC WITHOUT a web browser really sucks. I suggest everyone try it once in awhile to see just how much the nature of debugging IT problems has changed when you are deprived of a very powerful tool. Luckily, this Mac had a very crufty version of IE lurking so the first thing I tried was a download of Firefox. However, the disk image refused to be mounted! Seems this is also tied in with the security update. So basically, I’m left with using IE as my main tool for debugging on this Mac.

After a bit of googling, it seems that the fix requires the latest OS X security update (available at the Support Download page). For OS X 10.4.x users you can find a link to the latest Universal here.

However, there is a catch. The security update is only available from that page as a Disk Image. Guess what: you can’t open it on the affected Mac. At this point you’re left with 2 options:

  1. Burn the Disk Image to a CD-R/DVD-R and use it that way
  2. Transfer the disk image to another Mac and unarchive it then push it back

I decided on the latter option since I hate wasting a CD-R if I don’t have to. After applying the Security Update and rebooting, Safari boots up and things seem chipper again but that was a serious side trip on something I’ve rarely seen Apple ever mess up. Anti-kudos to Apple for making this update on your ‘legacy’ OS really sucky.
