Douglas Whitaker Posts

Backups (Part 1)

Background

I have suffered massive data loss twice in my life.

Once was in high school, and I knew backing up my files was a good idea. I bought a second hard drive. As I was about to copy my files onto it, someone tripped on the power cord and knocked the spinning hard drive containing all of my data onto the tile floor. Angry sounds ensued, and all my data was gone.

The second time was near the end of my second-to-last semester of undergrad. I used a MacBook at the time with the encrypted file system option that came with OS X 10.6. One day, right before finals week (Murphy's Law), the computer just lost the ability to decrypt it... leaving me with a massive, encrypted .sparsebundle that was essentially useless. Well, not essentially: actually useless.

Data Backups Are NOT Optional! 

In both major data losses I experienced, I had partial, unorganized backups. A DVD of homework here, a flash drive of music and movies there, but nothing cohesive. Nothing deserving of the name "backup". For about the past year, I've been regularly backing up my data. I prioritize backing up data now. I plan for it. If I upgrade the hard drive in my laptop, I'll only do so when I can afford to upgrade the backup drive as well. Internalizing the importance of backups is vital.

When I say that I've been regularly backing up my data, I mean backing up all of my files to an external hard drive: weekly for my desktop and, in the worst case, monthly for my laptop. The reason the laptop is monthly is, frankly, that remembering to use the external hard drive can be hard. Moreover, the projects I work on regularly are copied at somewhat irregular intervals to flash drives, and files are emailed back and forth, so recent changes are often in several places. Not ideal, but I think that this system allows me to say "Yes, I do back up my data."

Or do I?

The current state of my backups

Right now I'm using the built-in "Backup and Restore" feature in Windows 7 and KBackup on Linux.

On Linux, I know that I'm just backing up my home directory and some configuration files I've edited. (Whenever I edit a configuration file, I copy it and the original to /root - this is included in the backups.)

The Windows 7 backup options seem to have some pros and cons (about which I could be mistaken). If a system image is created, ostensibly everything is backed up... but it can only be used in an all-or-nothing fashion. If individual files are backed up, they can be individually restored in the case of accidental deletion but are only useful individually. On my laptop in Windows I back up "files in libraries and personal folders for all users and system image". On my desktop I back up "All local data files". (I got my desktop before my laptop and initially shied away from the system image option because I thought it was akin to the System Restore Points introduced with Windows Me.)

Right now, at the start of my Ph.D. studies I feel pretty good about my responsibility with backing things up... but I will NOT lose my data again without a fight. I will not passively accept my backups as infallible.

The stress test

I decided to delete my data. To simulate a hard drive failure I used DBAN on my desktop: my hard drive was completely erased, just as if I had replaced it with a brand new one. I primarily use my desktop for Netflix, casual gaming (Morrowind, Minecraft, etc.), and browsing the internet; all of my teaching and research is done on my laptop. I checked that my laptop could read my desktop backup and was able to access the files through the Restore option: if everything went wrong, I could still recover what I wanted to. But what would everything going wrong look like?

I guess I should describe what I want from a backup solution at some point, and now is as good as any. My main purpose for backing up is to be there in the case of hardware failure or other major data loss event. I'm not too concerned with losing individual files (knock on wood) because I rarely delete anything, though being able to restore those would be a nice secondary goal. In a crunch, I would like to be able to open up the external hard drive enclosure, swap the backup drive for the original, and boot directly from the backup drive to continue working.

Previously I had created a Windows 7 Repair Disk, so I thought that booting from it would allow me to restore from my backup. While I think this would work, I wasn't able to test it. After the graphical system had loaded and the mouse worked I received Error 0x4001100200001012. I'm not sure what this error is, nor do I care at this point. After retrying it two or three times I gave up and concluded that the disk wasn't working. Lesson learned: always check your repair/recovery disks to make sure they work before you need them!

The Windows 7 Repair Disk didn't work, but I did have Dell Windows 7 System Restore Disks that I had previously checked. Basically the same thing, right?

Wrong. While these disks give me the option to restore my computer to the factory image (which is what I expected them to do), that isn't my goal. After booting the Recovery Disk, I noticed that these disks also allow one to restore the computer from backups. Success! I thought naïvely. But the disks seemed to look only for backups made with Dell's DataSafe software, not the Windows Backup and Restore feature. Lesson learned: don't assume options to restore backups refer to backups made with the system you use!

Fine, I begrudgingly thought. A full system restore to factory settings: at least that would get me to the Windows Backup and Restore software (and a usable computer). A few minutes and a disk change later, I was at the Windows desktop. I opened the Backup and Restore feature and chose to restore all of the files from my external hard drive to their original locations on my desktop... and was immediately struck with the realization that the backup options I had been using were unsuitable for restoring all files at once. I had incorrectly chosen the individual-files-only option. As it was copying the files, I had to choose to Copy and Replace all of the system files. I also ran into Error 0x80070020, indicating that the computer was trying to replace files currently in use. Not ideal. I had it skip those files and move on.

After all of the files finished copying there was a very clear problem: not all of the files copied. Essentially, all files related to installed applications weren't copied. While none of the data was lost, this backup did not allow me to quickly get up and running again as none of the programs I use work.

Moving forward

This stress test of my backup solution taught me that, while my data are sufficiently backed up, they are not as accessible as I would like. Not nearly so. In the end, it took several hours to get to this point which is both too slow and not where I want it to be.

My desired ability to boot from my backup in a crunch also implies that I should be able to read the files individually when the drive is mounted in another computer. This last point is key, because neither option that Windows Backup and Restore has allows this. Backing up files individually results in many individual zip files each containing many files without much clear organization. The system image option results in one large file which is difficult (impossible?) to use to access individual files. I had known about the latter situation before, but I learned the former during this ordeal. Lesson learned: always understand what the backup options actually mean!

I want to change my backup solution for both Windows and Linux. Essentially, I want to be able to clone the hard drive on a regular basis, but not maintain parity with my computer every minute. Keeping the backup identical to the original all the time (like RAID 1) doesn't allow me to restore files in the case of an accidental deletion (a secondary goal). At the same time, creating a clone of the hard drive every week would undoubtedly be a drain on system resources when dealing with large hard drives.

At the moment, I don't know my ideal backup solution, but I will keep looking for it and will post my findings. For now, I'll keep using Windows Backup and Restore and KBackup while having a better understanding of the very real pros and cons associated with them. A summary of my wants and lessons learned is below.

What I want in a backup solution

  • Backup of my entire hard drive occurs regularly (but not a RAID 1 style mirror)
  • Backup is bootable
  • Backup is readable when mounted by another computer
  • Backup is created while the computer is in use (I would prefer to not have to use an external system to periodically mirror my hard drive)
  • Backup software is Free Open Source Software (technically this is optional, but I doubt I'll go with any solution that isn't open source)

Lessons learned

  • Always check critical media before it is needed
  • Always make sure the tools you have do what you think they do
  • Always make sure you fully understand the meaning and implications of the options you have selected
  • Testing your backup system before you depend on it is a good thing
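
One way to act on the last lesson is to record checksums when a backup is made and compare a test restore against them. The following is only a minimal sketch, not a description of the tools used in this post: it assumes a Linux shell with GNU find and sha256sum, and the function names make_manifest and check_manifest are made up for illustration.

```shell
# Hypothetical helpers; the names are illustrative, not part of any backup tool.

make_manifest() {
    # $1 = directory to record.
    # Prints one "checksum  ./relative/path" line per regular file, sorted by path.
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

check_manifest() {
    # $1 = restored directory, $2 = manifest written by make_manifest.
    # Succeeds only if every file listed in the manifest exists under $1
    # and has the same checksum. --quiet suppresses the per-file "OK" lines.
    (cd "$1" && sha256sum --quiet -c -) < "$2"
}

# Example usage (paths are placeholders):
#   make_manifest "$HOME" > /tmp/home.manifest      # at backup time
#   check_manifest /mnt/test-restore /tmp/home.manifest && echo "restore verified"
```

Note that this only checks files listed in the manifest; extra files in the restored tree are not reported, and it says nothing about whether a restored system will boot.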

Problems annotating printed articles

Printing journal articles on an ink jet printer and using yellow highlighters doesn't seem to work well. An alternate solution is needed. Possible alternate solutions include:

  • Changing the printing (printing everything on a toner-based printer?)
  • Changing the method of annotating (maybe different highlighters work better?)
  • Going entirely digital (an e-reader or a tablet?)

While they all have their pros and cons (organization?), the last option will probably win out eventually. I currently have a Nook Simple Touch, but its PDF support is unsatisfactory at best. I might root it for better PDF support, but we'll see. In the end I may just end up buying a Nexus 7.

Either way, this smeared yellow/black ink has got to end.

Grad school update

Add/Drop and the first full week of school are over. A few thoughts from the past month or so:

On teaching...

  • Having taught hundreds of students, some are bound to recognize me. Some excitedly wave to me. Some smile. Some glare. Some say cryptic statements as we pass each other: "Hey Doug: you gave me a B+." (Is that a good or bad thing in your mind?) I guess this goes with the territory. I remember none of their names.
  • I feel a little weird not teaching this semester.

On research...

  • Working with coauthors can be tricky. Working with seven coauthors is trickier.
  • The pace of research can be slow.
  • I'm trying to learn to stand my ground at times. (Come on, let's get this paper out there!)
  • I've got to get better about following up with people.

On technology...

  • I need to remain vigilant to avoid getting stuck in old paradigms. A friend/colleague of mine uses Dropbox to share files with her students (and ostensibly to move her research around to different computers). I have 7 flash drives, several of which are identical. Yeah. Dropbox would have been great for sharing some giant files with my students over the summer (I wanted to distribute JMP to my students because UF has a site license for it, but the short semester and some emails that JMP never responded to precluded me from including it in the course). I've resisted using Dropbox because of privacy and security concerns, but for non-sensitive, non-critical data it seems like a good solution. I've been wanting to investigate running my own Dropbox-like service, but I just... don't have the time. There are some open source alternatives I've looked at, but the time factor is really a key thing. The two main FOSS alternatives I've heard of are SparkleShare and ownCloud, but as I've never used either, consider that more of a starting point for looking.
  • My current work-flow is more or less working for me. My computer boots Linux by default, and 90% of what I do is there. When I need Windows, the flash hard drive is fast enough that I can be in Windows in less than two minutes.

On classes...

  • When three hour blocks become standard, 50 minute classes feel really short.
  • Parking on campus after restrictions are lifted is more awful than I realized. (I'd really like UF to implement an after-hours permit that cost $50/year or something but allowed one to park in any lot that is restricted during the day. Who knows, this might not fix the problem and just cost more money. Something should be done, though.)
  • Textbooks are still really expensive. Many of my textbooks have been available on SpringerLink in previous semesters... this semester, not so much. In fact, an ebook I did find can only be checked out by one person at a time (really EBSCOhost? way to stay relevant).
  • I don't mind homework. I don't like exams.
  • Spanish class is terrifying, mostly because it isn't in English.

On growing up...

  • Because eventually I'll need a REAL ID to fly on an airplane, I headed to the DMV. The DMV really isn't that bad. I mean, it is byzantine, but GatherGoGet.com helps. Having GatheredWentGot, the initial "do you have the right papers?" part was a snap (despite a queue that began outside). Of course, they did mess up my address and hoped that I wouldn't think it was a big deal. All in all, about 75 minutes was all it took to get my new ID. Not terrible.
  • I no longer consider the majority of undergraduates my peers. Somewhere along the way a perspective shift happened and that was that.

I've got a few technology-related posts in the works (mostly about ThinkPad/Linux issues), but I figured a general blog post would be nice now.

Scientific Linux, ThinkPad, and WiFi

It's no secret that Linux doesn't have the same level of hardware support as Windows. In some cases, Linux systems can be run where Windows wouldn't dream of being installed. On the other hand, Windows has great support for many recent devices that Linux doesn't support (either in full or in part). The wireless device on my ThinkPad is:

[Doug@FLASHMAN-SL ~]$ lspci -nn | grep -i real
03:00.0 Network controller [0280]: Realtek Semiconductor Co., Ltd. RTL8188CE 802.11b/g/n WiFi Adapter [10ec:8176] (rev 01)

The appropriate driver for this (found from ELRepo.org) is kmod-r8192ce. I followed the instructions on the website to add ELRepo as a repository and then had it install kmod-r8192ce. After a restart, things seem to be looking good:

[Doug@FLASHMAN-SL ~]$ lsmod | grep 8192
rtl8192ce 51681 0
rtl8192c_common 45957 1 rtl8192ce
rtlwifi 71279 1 rtl8192ce
mac80211 423615 3 rtl8192ce,rtl8192c_common,rtlwifi

While this got the basic wireless working, things were not perfect. After connecting to any network, either secured or unsecured, I would be able to surf the web for a few minutes (or seconds)... followed by nothing. When the connection dropped, my computer would still think I was connected, and disabling and re-enabling the network or disconnecting and reconnecting would not solve the problem. I had read that IPv6 can sometimes interfere with network connectivity if it isn't being used (I am still on IPv4), so I appended ipv6.disable=1 to the kernel line in my /etc/grub.conf file. Still not working great, but marginally better. Eventually, I began using a kernel from the 3.5 branch as opposed to the Scientific Linux 2.6 kernel. Together with the ipv6.disable=1, this seems to result in a solid connection.
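
The grub.conf edit described above can be sketched as a small text filter. This is an illustration under stated assumptions, not the exact steps taken here: it assumes the legacy GRUB format in which each boot entry has a line whose first word is kernel, and the function name add_kernel_param is made up. It reads a grub.conf-style file on standard input and writes the result to standard output, so the real /etc/grub.conf is never modified directly.

```shell
# Hypothetical helper: append a parameter to every "kernel" line that lacks it.
add_kernel_param() {
    # $1 = parameter to add, e.g. ipv6.disable=1
    awk -v param="$1" '
        $1 == "kernel" {
            present = 0
            # Fields 2..NF are the kernel image and its existing parameters.
            for (i = 2; i <= NF; i++) if ($i == param) present = 1
            if (!present) { print $0 " " param; next }
        }
        { print }    # all other lines (and already-patched kernel lines) pass through
    '
}

# Example usage (review the output before replacing anything):
#   add_kernel_param ipv6.disable=1 < /etc/grub.conf > /tmp/grub.conf.new
#   diff /etc/grub.conf /tmp/grub.conf.new
# After a reboot, the active parameters can be inspected with:
#   cat /proc/cmdline
```

Running the filter twice gives the same result as running it once, which makes it safe to reapply after a kernel update adds a new entry.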

Future goals

  • Whenever I update my kernel (and thus grub.conf file), a new kernel becomes the default... which is sometimes a 2.6 branch kernel. When I don't interfere with the start up, it can be frustrating when my internet dies again. I would like to have grub not change the kernel that I am running from 3.5 to 2.6. [Edit 2: For the time being, I am manually updating grub.conf after each kernel update.]
  • I need to check if ipv6.disable really offers benefits when used with the kernel version 3.5.
  • I'd like to figure out a way to display code snippets on wordpress. [Edit: It seems that this functionality is possible through wordpress plugins, but as my blog is currently hosted on wordpress.com it is not possible to use any. A medium-term goal of mine is to create a new website that includes this blog as a feature, so when that happens I'll re-edit this post to add in code snippets.] [Edit 3: I installed the SyntaxHighlighter Evolved plugin after moving to WordPress.org. Apparently, though, this feature is available on WordPress.com, though I didn't figure that out at the time. Of course, one could just use <pre> tags, which I had forgotten about.]
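
The manual grub.conf update mentioned in the first goal could be scripted along these lines. This is a sketch under assumptions, not a tested fix for this machine: it assumes legacy GRUB, where default=N selects the N-th title entry counting from zero, and the function names entry_index, set_default, and pin_default are made up for illustration. Nothing here writes to /etc/grub.conf; the result goes to standard output for review.

```shell
# Hypothetical helpers for choosing the default entry in a legacy grub.conf.

entry_index() {
    # $1 = text to look for in a title line; grub.conf text on stdin.
    # Prints the zero-based index of the first matching title; fails if none match.
    awk -v want="$1" '
        $1 == "title" {
            if (index($0, want) > 0) { print n + 0; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    '
}

set_default() {
    # $1 = entry index; grub.conf text on stdin, rewritten text on stdout.
    awk -v idx="$1" '
        /^default=/ { print "default=" idx; next }
        { print }
    '
}

pin_default() {
    # $1 = text to look for in a title line, $2 = path to a grub.conf-style file.
    idx=$(entry_index "$1" < "$2") || return 1
    set_default "$idx" < "$2"
}

# Example usage, matching titles that contain "(3.5." (the pattern is an assumption
# about how the titles are written):
#   pin_default '(3.5.' /etc/grub.conf > /tmp/grub.conf.new
```

Matching on "(3.5." rather than "3.5" is a deliberate choice in this sketch, since a bare "3.5" could also appear inside a longer 2.6 version string.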

My ThinkPad specs

One of the primary focuses of this blog is going to be the technology (software/hardware/devices/etc.) that I use to be productive. The first few posts I assume will relate to software on my laptop (and fixing issues). Now, because laptop configurations vary considerably, I'm going to post everything I can about my computer so that if others experience similar issues there is a greater likelihood of finding the solution. I have more than one computer, so posts that are primarily concerned with this computer will be tagged with "ThinkPad T420i".

Lenovo ThinkPad T420i Hardware (Model 4177-CTO)

  • Processor: Intel Core i5-2430M processor (dual-core, 2.40GHz, 3MB Cache)
  • Memory: 8GB PC3-10600 1333MHz DDR3, non-parity, dual-channel capable (two 204-pin SO-DIMM sockets) (It may be possible to upgrade this to 16GB based on some things I've read (sources later), but it is currently listed as 8GB as the maximum memory.)
  • Chipset: Mobile Intel QM67 Express Chipset
  • Screen: 14" screen supporting 1600x900 resolution
  • Graphics: Intel HD Graphics 3000 (integrated)
  • Wireless: 1x1 11b/g/n, Wireless LAN PCI Express Half Mini Card, Intel Centrino Wireless-N 1000

Lenovo ThinkPad T420i Software

  • Microsoft Windows 7 Professional (64-bit)
  • Scientific Linux 6.2 (64-bit) (SL is a Red Hat Enterprise Linux clone)

I'm currently trying to switch to Scientific Linux as my primary operating system, so there will certainly be some posts about overcoming issues I've experienced with this (slow) transition. I'll keep this list updated with changes or additional information (or corrections) as I find them.

Below are a few documents I've found on the Lenovo support website which list detailed information about this computer (and related ones). I've posted them here because locating them wasn't obvious on the Lenovo website and, as products age and are no longer supported, the supporting documentation has a habit of disappearing.
