Posts belonging to Category Computing



Paranoid Junk Storage

The backup of my damaged RAID array finished yesterday afternoon (it only took about three days, rather than the week I’d originally thought).  I spent some time verifying that all the files had been copied and opened a random selection of them to verify their contents to satisfy myself that the backup would be useful* if the RAID array died completely.

After confirming the backup, I took down the system and pulled the bad drive and replaced it, rebooted, ran the RAID BIOS setup tool to add the new drive, and booted into Linux.  Once the system was up, the driver initiated the rebuild and it was a matter of waiting.  Fortunately, it rebuilt without any errors or problems.  So now I have at least double-redundancy for most of my data, and in some cases triple-redundancy. 

I keep the majority of my work files on the Linux RAID-5 system and access them from my desktop via Samba shares.  On the desktop I have the directory set to “Make available offline” so that if the Linux system goes down I can continue working.  The Linux system also serves as a secondary backup of this website (Dreamhost has backups, but it’s prudent to assume they won’t work and keep a backup of your own).  Every night at 1:00am the Linux system uses ssh to invoke a couple of database backup commands that leave tar-gzipped files in a temp directory on my web hosting account.  Then it uses rsync via ssh to back up everything in my account’s home directory.  Now, I also have a cron job on my second Linux system (the one with the scanner) that runs at 4:00am and backs up everything in my home directory on the RAID system (I picked 4:00am to give the web backup plenty of time to complete, although it’s usually done in 5 to 10 minutes).  Both backup jobs email me their output so I can confirm that they ran to completion.  This means that my work files are triple-redundant (on the RAID-5 system, on the desktop, and on the scanner backup system), my regular files (photos, emails, etc.) are double-redundant (on RAID-5 and on the scanner backup), and my web files are quadruple-redundant (web server, Dreamhost backup, RAID-5 backup, scanner backup). 
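For anyone wanting to set up something similar, the nightly web backup described above might be sketched roughly like this.  It is only an illustration under stated assumptions, not the actual job: the host name, account name, local path, and the remote `backup_databases.sh` helper are all hypothetical placeholders, and by default the commands are only printed rather than run.

```python
import subprocess

# Hypothetical placeholders; none of these names come from the real setup.
REMOTE = "user@webhost.example.com"
REMOTE_DB_BACKUP = "~/bin/backup_databases.sh"  # assumed helper that leaves tar-gzipped dumps in a temp directory
LOCAL_MIRROR = "/home/user/backups/website/"


def db_backup_command(remote=REMOTE, script=REMOTE_DB_BACKUP):
    """Build the ssh command that runs the (assumed) dump script on the web host."""
    return ["ssh", remote, script]


def rsync_command(remote=REMOTE, dest=LOCAL_MIRROR):
    """Build the rsync command that pulls the remote home directory over ssh.

    -a preserves permissions and timestamps, -z compresses data in transit.
    """
    return ["rsync", "-az", "-e", "ssh", f"{remote}:./", dest]


def nightly_backup(dry_run=True):
    """Dump the databases first, then copy everything, dumps included."""
    commands = [db_backup_command(), rsync_command()]
    for cmd in commands:
        # Anything printed here ends up in the email cron sends for the job.
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop if a step fails
    return commands
```

A crontab entry along the lines of `0 1 * * * python3 nightly_backup.py` would run a script like this at 1:00am.  Running the database dump before the rsync matters, since the dump files are part of what gets copied.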

That first rsync of my home directory (which took over two days) transferred 120GB and it got me to wondering just what I was using so much space for.  On inspection I found that I was using a lot of space for system backups from old systems that I no longer had (one backup was from 2003).  I also had a bunch of space taken up with software images that I no longer needed (like old SuSE DVD ISOs, J2EE server install packages, application development toolkits, etc).  Deleting all that junk reclaimed 90GB of space. 
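One way to answer the "what is using all this space?" question is to total up each top-level entry in a directory and sort by size.  The following is a minimal sketch, not the inspection actually used here; the function names are made up for illustration.

```python
import os


def usage_by_subdir(root):
    """Return {top-level entry name: total bytes} for everything under root."""
    totals = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            size = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):  # don't count link targets
                        size += os.path.getsize(path)
            totals[entry.name] = size
        elif entry.is_file(follow_symlinks=False):
            totals[entry.name] = entry.stat(follow_symlinks=False).st_size
    return totals


def largest(totals, count=10):
    """Biggest space consumers first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:count]
```

Calling `largest(usage_by_subdir("/home/user"))` (path hypothetical) would list the ten biggest entries, which is usually enough to spot forgotten backups and old install images.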

* Verifying a backup is just as important as having the backup in the first place.  If you rely on a third party to back up your data, it’s a good idea to ask them to retrieve a file for you from time to time just to make sure the backups work.  If they do incremental backups, you may even wish to ask for the file from a specific date. 

I do this now because of an incident that happened a number of years ago when I was still a programmer.  Our development was done on a Unix system.  The source code was in a source code control system, so even if the system died, we’d still have our code.  However, many of us had example code, documentation, and notes in our home directories.  The system was backed up to tape every night, and the operators were diligent about changing the tape and seeing that the backups completed each night. 

I don’t remember why it had to be done (it was either an OS upgrade or replacing a hard drive), but over a weekend an operator was going to have to wipe the system, reinstall, and then restore our data.  Unfortunately, he did not verify the latest backup before wiping the disks.  When he went to restore our files he found that the tape was bad.  The backup software didn’t catch it, so even though it had appeared to complete successfully, the tape was useless.  Luckily he found an earlier backup, but we still lost a month or so of data in our home directories.
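The kind of check described above (confirm every file made it across, then inspect a random selection) can be automated.  The sketch below is one possible approach, not what was done here: it swaps "open the file and look at it" for a SHA-256 comparison, and the function names are invented for illustration.

```python
import hashlib
import os
import random


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def relative_files(root):
    """Set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def verify_backup(source, backup, sample_size=20, rng=None):
    """Return (missing, mismatched) lists of relative paths.

    missing:    files in source that are absent from backup
    mismatched: randomly sampled files whose contents differ
    """
    rng = rng or random.Random()
    src_files = relative_files(source)
    missing = sorted(src_files - relative_files(backup))
    present = sorted(src_files - set(missing))
    sample = rng.sample(present, min(sample_size, len(present)))
    mismatched = sorted(
        rel for rel in sample
        if file_digest(os.path.join(source, rel))
        != file_digest(os.path.join(backup, rel))
    )
    return missing, mismatched
```

Two empty lists are what you hope to see.  A sample only gives partial assurance, of course; hashing every file is more thorough if you have the time.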

Hate When That Happens…

Some knuckleheads calling themselves the “Tornado Digital Security Team” have hacked the KISD website.

Keller school district technology experts are restoring the district’s Web site after computer hackers destroyed hundreds of documents and photos late Tuesday.

The hackers replaced the home page with a page that shows a photo of a tornado and a lightning bolt and the message: “In The Name of God … This is Web site hacked by Tornado Digital Security Team.”

Keller officials are trying to trace the Web site address, www.TORNADO.ir.

It appears the hackers were searching for Web sites with holes in the security firewall, said district spokesman Jason Meyer.

“It looks like some sort of software program was working on our Web site and they’ve been doing it for several months,” he said.

I always hate news stories about computing and security issues, because most of the people who know what the heck really happened aren’t allowed to talk to the public.  I don’t doubt that script kiddies and perhaps even some serious hackers have been testing the firewall.  However, I expect that the techies just told the spokesman about the firewall logs so he’d have something to say to the media. 

The firewall does you no good if your application code has holes in it, since you have to allow traffic bound for the web server to pass through the firewall to access the application (unless you proxy it, but even that’s not totally secure).  I know I get a lot of idiots in China trying to run dictionary attacks against my SSHD (which has to have a port open on the firewall for me to use it).

I noticed that KISD is running a CMS called Joomla!.  Coincidentally enough, it appears that Joomla has a number of high priority security issues that were patched on Monday.

That’s not to say I know for sure that this is what happened, since there are so many ways that security can be compromised, and some quick research shows that these guys have hacked systems other than Joomla! (interestingly enough, they seem to like hacking Arabic sites).  I’d be really interested to see a writeup on what really happened here.

It Seemed Too Easy…

I previously mentioned the problem I had with my RAID system when I powered it back on.  It really got me focused on the fact that RAID isn’t really such a great method for preserving files.  It’s good for a single drive failure, and it’s good for keeping a system online even with a bad drive, but anything beyond that means total data loss.

When I ordered the replacement drive I also ordered a couple of plain old 250GB PATA drives (the ones that got “thunked” onto my doorstep yesterday).  I’d decided that I was going to keep a second system to back up the first before replacing the bad drive and rebuilding the array.

A couple of weeks ago I had resurrected an old system and installed Open SuSE 10.1 on it.  At the time I was mostly playing around with putting a scanner online (a RadioShack PRO-2052 that I got on clearance) using Icecast.  But since this system also had a built-in Highpoint 370 “RAID” controller it meant that I had two extra IDE channels that I could use to create a mirrored drive array using the Linux software RAID tools (the HPT370 is actually just a fancy IDE controller with a little hardware support for RAID, but not a true RAID controller).  Anyhow, the system was working and it was running headless in a corner of my living room (one of my motivations for Icecast and the headless living room setup is that my office is an EMF wasteland, making it difficult to get a decent signal).

I installed the new drives and booted the system.  Once it was on the network again I used Yast2’s partition tool (being headless, I did this using Cygwin’s X-server on a PC) to create the RAID array, format it, and add it to the filesystem table.  It took me a minute or two to figure it out, but once I did I was surprised at how easy it actually turned out to be (select each drive, add a primary unformatted partition of type Linux RAID, select RAID drop-down, select each partition and hit “Add”, select RAID 1, select Reiser filesystem, apply changes). 

I’m always a little suspicious when something is too easy, but it all appeared to be working so I started an rsync of my home directory and left to walk the dog.  When I got back I saw that the drive lights were on solid, so I thought it was working.  A little later I tried to log in, but couldn’t access the system. 

Once I got it rebooted, a bit of examination showed that both of the HPT370’s IDE channels as well as the wireless card were sharing IRQ 11.  Everything is PCI, and it’s technically supposed to be able to share interrupts, but I don’t like interrupt sharing.  Further, with the RAID array and the wireless card on the same interrupt, it would seem to be just asking for trouble.  Every operation involved in copying data seems to be piled up on one interrupt (i.e., receive data on wireless card, write data to hard drive on first channel, write data to drive on second channel).

Unfortunately, this motherboard has a brain damaged implementation of PnP configuration in its BIOS, so that “unsharing” the interrupts involves physically moving devices to different slots.  For example, this motherboard is hardwired to share an interrupt between the HPT370 controller and whatever device is in PCI slot 2, so whatever interrupt you choose for slot 2 (INT PIN 2 assignment in the BIOS), it’s shared with the HPT370.

I also learned that moving a wireless card causes SuSE to “forget” the card (you have to go into Yast’s network configuration and delete the original entry for the adapter and configure the new entry from scratch). 

Once I got through all that nonsense I started the rsync.  It’s not exactly setting any speed records, as it’s using ssh over my wireless LAN.  But so far it’s copied about 18GB without hanging (as opposed to just under 2GB yesterday).  So there’s only another 140GB or so to go.

At this rate it’ll be finished in about a week.  As long as it doesn’t hang.

That’s The Way The RAID Crumbles

Just about two years ago I built a RAID system to use as my media/file server.  I’m fairly paranoid about shutting it down, since I’m afraid that I’ll lose a drive.  But when the A/C went out I shut it down to prevent it from overheating. 

The A/C is now back on (fortunately it turned out to be something simple—the fan start capacitor had burned out and shorted out a couple of other wires, but nothing that couldn’t be fixed), and as I feared, it lost a drive on bootup.  I’ve got a new drive on order, and until it gets here the system is still operational, although it’s no longer fault tolerant. 

At the time I thought that a RAID setup would be a good way to prevent losing data, especially given the costs of tape.  But I’m not so sure anymore.  Even though it allows for a single drive failure, it still makes you nervous to pull a drive out of the array and replace it, then rebuild the array.  You always feel like you’re hanging by a thread.

I’ve got some old systems lying around.  I think I’m going to add some disks to one of them and run it as a backup (maybe a daily rsync or something similar).  Hopefully, both systems wouldn’t die at the same time, should something happen to either one.  Of course, that doesn’t take into account something like a lightning strike.  What I really need is an offsite backup.  Perhaps I could persuade one of my friends with high-speed Internet to let me place a system at their place in return for backing up their data to my systems (i.e. cross-site backups).  Or maybe I’m just being overly paranoid.  But I’m starting to feel like someone carrying a bunch of eggs in a frayed basket, so I’m going to have to do something.

Excessively Chatty

Since my central air conditioning is down I’ve confined myself to working from my bedroom where I’ve installed a small window unit.  Unfortunately, since I can’t cram all of my computer systems in there (and I probably wouldn’t be able to cool the room), I had to shut down everything except a laptop.  This means that my main file and media server is offline.  This is annoying, but I can still work since I have an offline replica on the laptop.  The only problem is that the laptop seems to think that I don’t know this and sees fit to remind me every 30 minutes or so that “dominion is still offline” with one of those little notification balloons.  I wonder who was responsible for this bit of code?  Why do I need to be reminded of something I’m painfully aware of? 

I suppose I could look on it as a learning experience, as I’ve learned where to find the setting that turns that balloon off.  But it brings up a point about user interface design.  I hate annoying pop-ups, notification balloons, etc. that simply repeat stuff that I already know.  A good example of this was an internal support application that our corporate masters forced on us through the automatic update feature they load on our laptops.  This helpful little application decided that it would be good to tell you every time your network connection was lost. 

I found out about this while traveling and spending time in a conference room with spotty wireless coverage and only 5 wired ports for 9 people.  We’d share the wired ports by taking turns using them as needed, which means we had to periodically disconnect the ethernet plug.  Windows already features a little balloon for this along with the little “X” on the adapter in the system tray.  So if I’ve just pulled the plug, and Windows has notified me, why does this app need to do the same again?  Worse, unlike Windows (which allows you to disable notification), the app had absolutely no way to stop being notified for dropped network connections.  The best you could do would be to delay notifications for a while.

I was suitably annoyed by this, so much so that I was moved to submit a rather angry trouble ticket against it.  I was a bit surprised to get an actual phone call from one of the developers about a week later (mainly because I expected them to just ignore the ticket).  It turns out that the development team had received quite a lot of “feedback” along the same lines as mine and that it was quite unexpected to them.  But they did add an option in the last release to turn off notifications.

I think part of the problem with both Windows and with our internal support people is that they think in terms of the lowest common denominator of user.  There’s no way for me to specify that I’ve done this stuff for a while and don’t need excessive handholding.  Since I’d not heard of this new application they were deploying until we got an email saying it would be sent to our systems in the next few weeks, I can’t help but think they didn’t involve users with some technical experience in their trials (if they even bothered with trials).

Anyhow, I suppose being too verbose can be excused provided that it can be changed.  Making something verbose, annoying, and useless to certain users without allowing it to be disabled isn’t excusable.

When It Rains It Pours…

I’ve been a customer of Dreamhost since April, 2000.  Over those six years I have rarely experienced any significant downtime with them.  Their recent troubles really had me doubting their ability to continue that record, though.  They recently reached a milestone of 300,000 domains served and I couldn’t help but wonder if they were overextending themselves.

I know that Bitter was very, very unhappy with Dreamhost and the downtime.  They offered her a free month of hosting, which I suppose is good, but it’s hard to get over being ignored by a hosting company for such a long period.  The cardinal sin of customer service isn’t being unable to solve a problem; it’s keeping your customer in the dark about it.

What was interesting is that I didn’t experience anywhere near the same problems she did.  My sites were up most of the time hers was down.

Anyhow, Dreamhost has put up a detailed rundown of the problems, how they started, and what they’re doing for the future.

Image Theft and Attribution

A while back I decided to ban hotlinkers after my hosting account was hit hard by spammers.  In the grand scheme of things, the hotlinkers weren’t using a lot of bandwidth or CPU.  It was more the principle of the thing that bothered me. 

After Dreamhost changed their hosting plans to give people more bandwidth and disk space it became even less of an issue.  I haven’t taken the time to completely understand their algorithm for adding bandwidth and disk to my account, but since I’ve been a customer since April, 2000, it appears that I’m getting an insane amount of bandwidth, which increases each week:

  79680 MB Disk (Grows 480 MB / week)
      Used: 7179 MB (9.0% – Overage $.10/MB)
  2272 GB BW per Cycle (Grows 16 GB / week)
      Used this Cycle: 1.6 GB (0.1% – Overage: $0.5/GB)

My calculations suggested that the hotlinkers were only using 3-4MB per day, which is pretty much lost in the noise when you have 2272 GB per month (@ 4MB per day, that would be approximately 120MB/month, which is 0.0052% of my allowed bandwidth).  But the idea of people hotlinking my pictures still bugged me a bit, since there is no attribution. 
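For the curious, that back-of-the-envelope figure works out as follows.  The 30-day month and 1024 MB per GB are assumptions on my part; using 1000 MB per GB shifts the answer only in the fourth decimal place.

```python
MB_PER_GB = 1024          # assumption: binary units
DAYS_PER_MONTH = 30       # assumption: a round 30-day billing cycle

hotlink_mb_per_day = 4    # upper end of the 3-4MB/day estimate
allowance_gb = 2272       # bandwidth allowance per cycle

monthly_mb = hotlink_mb_per_day * DAYS_PER_MONTH
share = monthly_mb / (allowance_gb * MB_PER_GB)

print(f"{monthly_mb} MB/month = {share:.4%} of the allowance")
# prints "120 MB/month = 0.0052% of the allowance"
```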

I thought about adding a watermark to the images, but going back and doing that manually would be a pain.  Some searching on Google turned up a handy watermarking CGI script that would add watermarks on the fly using ImageMagick.  The script was pretty easy to get running on my account.  Actually, the hardest part was creating a watermark using the GIMP.  While it’s pretty powerful it’s also a bit difficult to use unless you use it a lot.  I experimented with several transparent designs before I gave up and just used a white background with dark lettering. 

Now, when someone tries to hotlink an image, they end up getting the image, but with text in the bottom right corner that says “Hotlinked from aubreyturner.org!”  While I was at it, I added Bloglines as an allowed referrer.  People who read my site through Bloglines should now be able to see the images (without the watermark) in addition to the text (they may need to clear their browser cache, though).
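The decision at the heart of this (serve the plain image or the watermarked one, based on the referrer) can be sketched in a few lines.  This is not the CGI script itself, which also does the ImageMagick work; it only illustrates the referrer check.  The Bloglines host name and the choice to let requests with no referrer through are assumptions.

```python
from urllib.parse import urlparse

# Hosts allowed to embed images without a watermark.
# "bloglines.com" is an assumed host name for the Bloglines reader.
ALLOWED_HOSTS = {"aubreyturner.org", "bloglines.com"}


def is_allowed_referrer(referer, allowed=ALLOWED_HOSTS):
    """True if the Referer header points at an allowed host or a subdomain of one."""
    if not referer:
        # Assumption: direct requests and browsers that strip the
        # header get the clean image rather than being penalized.
        return True
    host = (urlparse(referer).hostname or "").lower()
    return any(host == a or host.endswith("." + a) for a in allowed)


def image_variant(referer):
    """Which version of the image to serve for this request."""
    return "original" if is_allowed_referrer(referer) else "watermarked"
```

Matching on the parsed host name rather than searching the whole referrer string avoids being fooled by something like `http://example.com/?ref=aubreyturner.org`.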

Original example:

Hotlinked example:

Not exactly pretty, but it’ll do for now.  I originally wanted a watermark that was mostly transparent, but that’s kind of difficult to do unless you highlight the letters with some contrasting colors and then make everything else transparent (if you just use one color, it’s easy for it to get “lost” in the source image, depending on the color in the watermark area).  My GIMP-fu just wasn’t up to the task.

Ten Is Not Enough

I thought I’d try out the CVS online prescription refill form so that my refill would be waiting when I got over to the store. 

They helpfully let you sign up on the sign-in page, but they only allow 10 characters for your User ID.  This is one of my pet peeves, since I have a very common last name, and trying to squeeze out a unique ID that I can remember in 10 characters is a pain.  I’m not really sure why anyone would be so limiting in what they’d allow.  Some old systems have limitations on the length of User IDs, but that typically shows up as an 8 character limit.  If you can allow more than 8, you can generally allow quite a bit more.  And if you’re intending to run a major e-commerce site, then you’d better be prepared to spend the couple of extra dollars that the extra DB storage would cost for a useful User ID length (I tend to specify 32 characters, although there’s nothing magical about this number other than that it works in most cases).

I went ahead and tried my first and middle initial with my last name, which is a common technique for generating a short User ID (I have several Unix accounts with this combination), just on the off chance I’d get lucky.  I didn’t think it would work, and I wasn’t disappointed.  The site helpfully told me that the ID was taken and also added descriptions for each of the fields in the right column.  This was mildly insulting, as it implied that the error was my fault for not entering something correctly.  If I were designing the site, the field descriptions would only be added if there was a gross violation of the validation rules.  If the error was simply that the User ID already existed, then I wouldn’t insult the user with the field descriptions.

The site did correctly remember the information I’d entered, so I didn’t have to re-enter anything.  Except that I noticed that the “send me email crap” checkbox was checked again, despite the fact that I’d unchecked it on the first attempt.  My more cynical side suspects that this is deliberate.  But it’s possible that it’s just an oversight (i.e. they automatically check the box in the form generator and didn’t implement the logic to remember the previous choices).  If I were designing the site, it would remember all of the user’s input, though.
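To make the last two complaints concrete, here is a rough sketch of the form handling I have in mind.  Everything in it is hypothetical (the field names, the allowed-character pattern, the messages); the point is only that a "taken" ID is reported differently from a malformed one, and that every submitted value, checkboxes included, comes back unchanged.

```python
import re

MAX_USER_ID = 32  # the length I tend to specify; nothing magical about it
USER_ID_PATTERN = re.compile(r"[A-Za-z0-9._-]+")  # assumed character rule


def process_signup(form, taken_ids):
    """Validate a signup form and return what the page should re-render.

    form:      dict of submitted values, e.g. {"user_id": ..., "send_email": ...}
    taken_ids: collection of User IDs that already exist
    """
    values = dict(form)       # echo back everything exactly as submitted
    errors = {}
    show_field_help = False   # field descriptions only for real rule violations

    user_id = values.get("user_id", "")
    if (not user_id or len(user_id) > MAX_USER_ID
            or not USER_ID_PATTERN.fullmatch(user_id)):
        errors["user_id"] = "Please enter a valid User ID."
        show_field_help = True
    elif user_id in taken_ids:
        # Not the user's mistake, so no lecture about the field rules.
        errors["user_id"] = "That User ID is already taken."

    return {"values": values, "errors": errors, "show_field_help": show_field_help}
```

Because `values` is a straight copy of the submission, an unchecked “send me email” box stays unchecked when the form is shown again.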

I also couldn’t help but notice that while the email confirmation said that the User ID is case sensitive, they’d case-folded my User ID to all lower case, and the website itself does this at login time. 

My guess is that the site was designed and implemented mostly by programmers, with little user-centered design thought.  Programmers tend to think in case-sensitive terms, because programming languages and text processing functions are this way.  To a computer, the word “Ten” is quite distinct from “ten.”  But most users, especially those who are not computer literate, do not think this way.  I suspect that sometime after the initial rollout they had to go back into the code and make it case-insensitive to avoid User ID collisions.  The simplest way to accomplish this would be to make everything lower case at the time of input (so they don’t have to chase down all of the case-sensitive comparisons in the rest of the code).

Case-folding, in either direction, though, is another pet-peeve of mine.  I operate on the principle that systems should not munge user data without a damn good reason (and I will push back on a requirement for case-folding unless I understand the reason).  The user should get out what he put in, without any fiddling.  This means that when designing a system I usually specify that outputs should reflect the input, without any case-folding.  If comparisons are to be done, I will specify that they are case-insensitive if the input is human-generated. 
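A minimal sketch of that principle, with a made-up class name: the ID is stored exactly as the user typed it, and folding happens only on the lookup key used for comparisons.

```python
class UserRegistry:
    """Stores User IDs as entered; compares them case-insensitively."""

    def __init__(self):
        self._by_key = {}  # casefolded key -> ID exactly as the user typed it

    def register(self, user_id):
        key = user_id.casefold()  # folding is used for comparison only
        if key in self._by_key:
            raise ValueError(f"User ID already taken: {self._by_key[key]}")
        self._by_key[key] = user_id
        return user_id

    def lookup(self, typed):
        """Return the ID as originally entered, or None if unknown."""
        return self._by_key.get(typed.casefold())


registry = UserRegistry()
registry.register("ATurner")
print(registry.lookup("aturner"))  # prints "ATurner"
```

“ATurner” and “aturner” collide, as most users would expect, but the user still gets out what he put in.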

I found all of this in a less than five minute interaction with their site.  Perhaps it’s best for all concerned that I not look at anything else on there.

Leaving The Nest…

After all these years being hosted on my account, The Bitch Girls are moving to WordPress and their own hosting setup. 

I am now going to delete the hosting entry for the domain from my account so that Bitter can add it to hers.  There will be a period while the DNS changes propagate where you may not be able to access the site. 

Receiving Friendly Fire, Returning Same With Smile..

Now I’m starting to get people sending me emails via my contact form who are a bit steamed about supposedly getting spam from me.  Here’s the best, most succinct, example (from a gentleman who goes by the name TIM BLUST (and whose SHIFT-LOCK is locked in high dudgeon mode)):

I DO NOT KNOW HOW YOU GOT MY E-MAIL ADDRESS BUT PLEASE REMOVE ME FROM IT AND DO NOT SEND ME ANYMORE SHIT

Others were a bit more polite or used a bit more verbiage, but this one hit all the highlights:  How did you get my email? -and- Stop sending me emails.

It’s unfortunate that I can’t find a way to channel all the indignation and send it to its deserving target.  If I could figure it out we wouldn’t have any more problems with this spammer, as he would have long ago been reduced to a small pile of ash…

For the more irate ones, I use the following response:

I am not the one who is sending you email.  The sender has FORGED the email sender information to make it appear to have come from a user on my domain.  In general, one should never trust the “From:” address in a spam email, as spammers generally fake these to avoid getting irate emails such as the one I just got from you.  :-(

For more information about TenTenTwelveCorp’s fraudulent emails, please go here:
http://www.aubreyturner.org/index.php?/orglog/tententwelvecorp/

The more polite ones get a bit more explanation (and no frowny).