
Tag: technology

On the perils of Express Scribe (software to aid transcription)

As part of the introductory qualitative methods course I am taking, each of us must conduct interviews and transcribe them as part of a larger class project. I recorded the interviews using Easy Voice Recorder Free (for Android), and it worked well for what I needed it to do. (Note to self: Put your cell phone on silent before conducting an interview. The recording device buzzing each time a text message is received is both unprofessional and distracting on the recording.)

As I am not (yet?) a qualitative researcher, I tried to complete the transcribing as inexpensively as possible, and free is the best kind of inexpensive. Rather than using specialty qualitative data analysis software (such as NVivo), I've opted to transcribe and code in Microsoft Word. Simple, but effective enough for a project of this size. (Of course, there is no reason another program such as LibreOffice Writer could not be used to really get at the "free" goal.) To play back the audio, a colleague recommended Express Scribe, a program by NCH Software.

Express Scribe has a free version which allows one to play the audio and control basic functions (Stop, Rewind, Fast-Forward, Play (regular and slow), etc.) using the function (Fn) keys on one's keyboard in lieu of using a pedal, though it also supports pedals. The function keys work even when Express Scribe isn't in focus, allowing one to control the audio playback without leaving Word. Super convenient, and the entire transcribing process was relatively painless thanks, in large part, to Express Scribe.

But it isn't all roses.

When I downloaded the free version of Express Scribe, I didn't realize that wasn't all I was getting. Apparently, the free version of Express Scribe (and possibly the paid version?) includes 'extras.' Let's explore the situation.

The first thing that I noticed is that Express Zip had associated itself with nearly every type of archive (e.g. compressed files) on my system. Furthermore, it had given itself a context-menu (right-click menu) entry as "Extract with Express Zip". The picture below shows what I'm talking about.

Express Zip appears in both the context menu and as the icon for the archive files.
Express Zip has weaseled its way into my computer. (Note that the file type icon is very similar to the icon for Express Scribe.)

'Okay, so what?' you might be inclined to say. Surely this is benevolence from NCH Software - free software that might make our lives easier. Except, when one double-clicks a file that has been associated with Express Zip or chooses "Extract with Express Zip" from the context menu, this is what appears:

A pop-up window saying that Express Zip is an install-on-demand component.
Express Zip isn't even installed! All that is installed is an advertisement for Express Zip.

All "an install-on-demand component is required for this operation" means in this case is that Express Zip isn't even really installed yet - just an advertisement for Express Zip is installed! I was curious as to what all Express Scribe had done to my computer, and pulled up the Set Associations window. (The easiest way that I've found to get to it in Windows 7 is to search for "Set Associations" in the Control Panel window.)

24 file types associated with Express Scribe in the Set Associations window. 20 are boxed as being unreasonable.
The 24 file types associated with Express Scribe. Of these, the 20 boxed in red are unreasonable associations.

Now, of the file types that Express Scribe has oh-so-graciously associated itself with, I count four types that seem reasonable and twenty that are unreasonable (boxed in red above). In fact, Express Scribe (Zip?) doesn't even know what to do with some file types (e.g. .iso, a file type for disc images) and instead describes them as "Unhandled Extension Handler Finder". Oh, joy.

"Now, Doug," you might be tempted to begin saying, "Surely you assented to installing these 'features' when you installed Express Scribe!" My retort would be a resounding, "Not so!" While the inclusion of "extras" is a burgeoning trend in free software (e.g. Oracle's Java attempting to install the Ask Toolbar if the option is not unchecked), I carefully read each page of an install to make sure that shit like this doesn't happen. Excuse the language. But not really. These shenanigans are infuriating to me. In fact, I went back through the installer to see what actually transpired. Check out the next two images.

The License Agreement which gives only a hint about the "install-on-demand" components.
Every box corresponding to optional software that Express Scribe tries to install is unchecked.
Who has two thumbs and unchecked every single box for optional software to install? This guy.

As shown in the images above, even if all boxes for optional software are unchecked, there are still things installed besides Express Scribe. These "install-on-demand" components are only hinted at in the License Agreement, and one may reasonably assume (as I did) that the components referred to were the ones recommended on the following page. They weren't. Let's see what was actually installed.

NCH Software Suite in the Start Menu program list. I boxed Express Scribe in green because this was what I actually wanted to install.

The "NCH Software Suite" comprises no fewer than seventeen install-on-demand components. Keep in mind that none of these seventeen components are actually installed; rather, these are effectively advertisements for them.

So now we have a clear idea of the problem arising from installing Express Scribe. Even when a user is careful and chooses to not select any optional components for installation, Express Scribe infiltrates the system to associate itself with unrelated files to offer you advertisements using 'components' that you did not choose to install. This is the sort of behavior that malware undertakes and, if it walks like a duck and quacks like a duck...

I cannot recommend Express Scribe or any software created by NCH to colleagues. In fact, I will actively recommend against using it whenever possible. I am not presently aware of a free (open-source or otherwise) alternative solution, but I cannot imagine that one does not exist (or that one would be hard to create). If you know of one, please leave a comment saying what it is and where to get it.

A dialog box confirming that the uninstall was completed.
A standard uninstall may do the trick for removing Express Scribe and the NCH Software Suite.

On my main computer (running Windows 7 Professional x64), uninstalling Express Scribe through Programs and Features in Control Panel seemed to remove the NCH Software Suite and the Express Zip context menu entry. I didn't have quite the same luck on another computer I use, and, if I can duplicate the problems, I will put up a guide for eliminating all traces of this software in the situation that a regular uninstall isn't sufficient.

A note to all software developers: I control what is installed on my computer, not you. Sneaking extra software onto my computer isn't cute or clever. Rather, this is the behavior of malicious software. If your software does this, as Express Scribe does, it is malicious - no matter how useful such software might be.

Update 2015-03-23: I'm working on an open-source alternative to Express Scribe called TranscribeSharp. An early preview release is available here.


On a return to blogging after a hiatus

With the winter holiday I returned to my lazy, non-blogging habits. A New Year's resolution did little to change the situation. I suppose one just jumps in, though. I'll try to keep up with things more this semester. Really.

Plans for this semester

I'm currently taking a seminar on statistics education and an introductory course on qualitative methods. While the former is clearly my area of interest, the latter is proving to be more enjoyable than I had anticipated. One of the books for the course is Crotty's The Foundations of Social Research: Meaning and Perspective in the Research Process which is a bit more abstract than I was expecting, focusing on epistemologies and theoretical perspectives. It is a refreshing change, and I'm currently working my way through Feyerabend's Against Method after having my views on post-positivism challenged. (They seemed to be most aligned with Popper before this academic year.) Other plans include a trip to San Diego for LOCUS-related things and In-N-Out Burger, insha'Allah.

Dealing with Protected/Secured PDFs

Occasionally I'll come across a PDF that is Protected/Secured (it says 'SECURED' in the title bar of Adobe Reader). Such PDFs are rather annoying to deal with. I've been using Mendeley to organize the articles/books I've read, and I copy the abstract into the software so that it can be searched. Alas, one journal whose articles I often read secures every single PDF so that copying cannot be done. Really frustrating.

Thankfully, this "secured" state does not require a password to open the file. From what I gather, the restrictions are permission flags stored in the file to disable certain features, and Adobe Reader, upon finding them, respects the file's instructions. Not all software does, and readers that don't respect the flags allow copying without issue. Two such readers are Evince (part of GNOME) and Okular (part of KDE). Both are open source, and both at least have options for disabling the DRM on the files. Both are exceedingly common on Linux and are also available on Windows and many other platforms; if you're just looking for a quick download on Windows, Evince might be better. Either way, problem solved.
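To make the "flags in the file" idea a bit more concrete, here is a minimal Python sketch of how a reader might interpret them. It assumes the flags live in the signed 32-bit integer that the PDF specification calls /P, with bit positions numbered from 1 as I understand them from the spec. The function name and the sample values are purely illustrative; this is not how Adobe Reader, Evince, or Okular is actually implemented.

```python
# Bit positions (1-based) of some user-access permissions in a PDF's /P value,
# as I understand them from the PDF specification (an assumption, not verified
# against any particular reader's source code).
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
}


def decode_permissions(p):
    """Return which operations the signed 32-bit /P value permits."""
    unsigned = p & 0xFFFFFFFF  # reinterpret the signed value as 32 raw bits
    return {
        name: bool((unsigned >> (bit - 1)) & 1)
        for name, bit in PERMISSION_BITS.items()
    }


# Hypothetical sample values: the first clears all four bits above,
# the second sets only the "copy" bit.
print(decode_permissions(-3904))
print(decode_permissions(-3888))
```

A reader that honors the flags would refuse to copy text whenever "copy" comes back False; a reader that ignores them simply never checks.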


Highlighting source code in posts

In an earlier post I had mentioned that I didn't know how to highlight source code in blog posts (syntax highlighting) or even offset code with a monospace font/typeface. Apparently, this is a feature of WordPress.com that I didn't know about, and is available as a plugin called SyntaxHighlighter Evolved for WordPress.org. I installed it, and it seems to do what I want. It supports a variety of languages, but not all that I use (for instance R is supported but not SAS). I'll keep my eyes open for a different plugin, but this works for now.

Of course, if one doesn't want or care about the syntax highlighting, then <pre> and <code> tags should suffice. A useful post from StackOverflow concerning some subtleties of the tags is here.
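One subtlety with the plain-tag approach is that characters such as < and & inside the code have to be escaped, or the browser will treat them as markup. A minimal Python sketch of preparing a snippet for pasting into a post follows; the helper name as_code_block is my own invention, and only html.escape comes from the standard library.

```python
import html


def as_code_block(source):
    """Wrap source code in <pre><code> tags, escaping &, < and > first."""
    # quote=False leaves quotation marks alone; they are harmless in element text.
    return "<pre><code>" + html.escape(source, quote=False) + "</code></pre>"


print(as_code_block("if (x < 3 && y > 1) { run(); }"))
```

The <pre> tag preserves whitespace and line breaks, while <code> marks the content as code, so using both together covers multi-line snippets.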


Moving from WordPress.com to WordPress.org

Because my new department doesn't provide web hosting in the same way that my old department did (and does), I decided to make use of a hosting plan I bought last year. Back in 2011 Hostable.com offered three years of unlimited hosting for $0.99. I caved and signed up, but hadn't used it until now.

Hostable.com uses cPanel, an apparently popular website manager. It seems simple enough to use, and installing WordPress.org (the open source platform for blogging that many sites including WordPress.com use) was a breeze. Importing and Exporting my WordPress.com site was also easy. I picked a new theme (Catch Box), and tweaked a few settings (display name, permanent links, etc.). Nothing fancy.

Some things I did change: I based the Pages on "Disable Sidebar Template" instead of "Default Template" and (using the Discussion checkbox under Screen Options) disabled comments on the pages. Also, pages are ordered using numbers that all default to 0. Quick Edit from the All Pages... page is the way to go for this.

Oh, don't change the WordPress URL and Site URL options if you don't know what they do. I did, and broke the site. I ended up uninstalling and reinstalling, which went quickly. Also, make backups when things are working.

Moving on, I updated the DNS servers at GoDaddy.com (where I bought my domain name) to point to Hostable's (ns1.hostable.com, ns2.hostable.com) and used the "Add on domains" option in cPanel. Now, douglaswhitaker.com redirects here. Changing the WordPress Address (URL) option in the settings page to http://douglaswhitaker.com and... nope, that doesn't work yet. Apparently Site URL had to change, too. Either way, it's working now. So, yeah. New website. Updates coming soon. Maybe.


Backups (Part 1)

Background

I have suffered massive data loss twice in my life.

Once was in high school, and I knew backing up my files was a good idea. I bought a second hard drive. As I was about to copy my files onto it, someone tripped on the power cord and knocked the spinning hard drive containing all of my data onto the tile floor. Angry sounds ensued, and all my data was gone.

The second time was near the end of my second-to-last semester of undergrad. I used a MacBook at the time and was using the encrypted file system option that came with OS X 10.6. One day, right before finals week (Murphy's Law), the computer just lost the ability to decrypt it... leaving me with a massive, encrypted .sparsebundle that was essentially useless. Well, not essentially: actually useless.

Data Backups Are NOT Optional! 

In both major data losses I experienced, I had only partial, unorganized backups. A DVD of homework here, a flash drive of music and movies there, but nothing cohesive. Nothing deserving of the name "backup". For about the past year, I've been regularly backing up my data. I prioritize backing up data now. I plan for it. If I upgrade the hard drive in my laptop, I'll only do so when I can afford to upgrade the backup drive as well. Internalizing the importance of backups is vital.

When I say that I've been regularly backing up my data, I mean backing up all of my files to an external hard drive (weekly for my desktop and monthly (worst-case scenario) for my laptop). The reason the laptop is monthly is, frankly, that it is hard to remember to use the external hard drive. Moreover, the projects I work on regularly are copied at somewhat irregular intervals to flash drives and emailed back and forth, so recent changes are often in several places. Not ideal, but I think that this system allows me to say "Yes, I do back up my data."

Or do I?

The current state of my backups

Right now I'm using the built-in "Backup and Restore" feature in Windows 7 and KBackup on Linux.

On Linux, I know that I'm just backing up my home directory and some configuration files I've edited. (Whenever I edit a configuration file, I copy it and the original to /root - this is included in the backups.)
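The copy-the-original-and-the-edit habit described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name keep_config_copies, the ".orig" suffix, and the idea of calling it once before editing and once after are all hypothetical, and the backup directory is a parameter rather than a hard-coded /root.

```python
import shutil
from pathlib import Path


def keep_config_copies(config_path, backup_dir, suffix=".orig"):
    """Copy a config file into backup_dir, preserving the first version seen.

    Call once before editing (saves the pristine file under NAME + suffix)
    and again after editing (refreshes the plain NAME copy).
    """
    config_path = Path(config_path)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    original = backup_dir / (config_path.name + suffix)
    if not original.exists():
        # Only the first call stores the pristine version.
        shutil.copy2(config_path, original)

    current = backup_dir / config_path.name
    shutil.copy2(config_path, current)  # always refresh the latest version
    return original, current
```

Because the pristine copy is written only once, repeated edits never overwrite it, and the backup directory then travels along with whatever backup tool picks it up.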

The Windows 7 backup options seem to have some pros and cons (about which I could be mistaken). If a system image is created, ostensibly everything is backed up... but it can only be used in an all-or-nothing fashion. If individual files are backed up, they can be individually restored in the case of accidental deletion but are only useful individually. On my laptop in Windows I back up "files in libraries and personal folders for all users and system image". On my desktop I back up "All local data files". (I got my desktop before my laptop and initially shied away from the system image option because I thought it was akin to the System Restore Points introduced with Windows Me.)

Right now, at the start of my Ph.D. studies I feel pretty good about my responsibility with backing things up... but I will NOT lose my data again without a fight. I will not passively accept my backups as infallible.

The stress test

I decided to delete my data. To simulate a hard drive failure I used DBAN on my desktop: my hard drive was completely erased, just as if I had replaced it with a brand new one. I primarily use my desktop for Netflix, casual gaming (Morrowind, Minecraft, etc.), and browsing the internet; all of my teaching and research is done on my laptop. Beforehand, I checked that I could read my desktop backup on my laptop and was able to access the files through the Restore option: if everything went wrong, I could still recover what I wanted to. But what would everything going wrong look like?

I guess I should describe what I want from a backup solution at some point, and now is as good as any. My main purpose for backing up is to be there in the case of hardware failure or other major data loss event. I'm not too concerned with losing individual files (knock on wood) because I rarely delete anything, though being able to restore those would be a nice secondary goal. In a crunch, I would like to be able to open up the external hard drive enclosure, swap the backup drive for the original, and boot directly from the backup drive to continue working.

Previously I had created a Windows 7 Repair Disk, so I thought that booting from it would allow me to restore from my backup. While I think this would work, I wasn't able to test it. After the graphical system had loaded and the mouse worked I received Error 0x4001100200001012. I'm not sure what this error is, nor do I care at this point. After retrying it two or three times I gave up and concluded that the disk wasn't working. Lesson learned: always check your repair/recovery disks to make sure they work before you need them!

The Windows 7 Repair Disk didn't work, but I did have Dell Windows 7 System Restore Disks that I had previously checked. Basically the same thing, right?

Wrong. While these disks give me the option to restore my computer to the factory image (which is what I expected them to do), that isn't my goal. After booting the Recovery Disk, I noticed that these disks also allow one to restore the computer from backups. Success! I thought naïvely. The disks seemed to only look for backups made with Dell's DataSafe software, not the Windows Backup and Restore feature. Lesson learned: don't assume options to restore backups refer to backups made with the system you use!

Fine, I begrudgingly thought. A full system restore to factory settings: at least that would get me to the Windows Backup and Restore software (and a usable computer). A few minutes and a disk change later and I was at the Windows desktop. I opened the Backup and Restore feature and chose to restore all of the files from my external hard drive to their original locations on my desktop... and was immediately struck with the realization that the backup options I had been using for this were unsuitable for restoring all files at once. I had incorrectly chosen the individual-files-only option. As it was copying the files, I had to choose to Copy and Replace all of the system files. I also ran into Error 0x80070020, indicating that the computer was trying to replace files currently in use. Not ideal. I had it skip those files and move on.

After all of the files finished copying there was a very clear problem: not all of the files copied. Essentially, all files related to installed applications weren't copied. While none of the data was lost, this backup did not allow me to quickly get up and running again as none of the programs I use work.

Moving forward

This stress test of my backup solution taught me that, while my data are sufficiently backed up, they are not as accessible as I would like. Not nearly so. In the end, it took several hours to get to this point which is both too slow and not where I want it to be.

My desired ability to boot from my backup in a crunch also implies that I should be able to read the files individually when the drive is mounted in another computer. This last point is key, because neither option that Windows Backup and Restore has allows this. Backing up files individually results in many individual zip files each containing many files without much clear organization. The system image option results in one large file which is difficult (impossible?) to use to access individual files. I had known about the latter situation before, but I learned the former during this ordeal. Lesson learned: always understand what the backup options actually mean!
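Until I find something better, digging a single file out of that pile of zip files by hand can at least be automated. Here is a minimal Python sketch that searches every zip under a backup folder for a file name. It assumes only what I observed above (many zip files, each containing many files); the function name and the example paths are hypothetical, and I have not verified how Windows Backup actually names or nests its archives.

```python
import zipfile
from pathlib import Path


def find_in_backup_zips(backup_dir, filename):
    """Yield (zip path, member name) for every archived file named filename."""
    wanted = filename.lower()
    for zip_path in sorted(Path(backup_dir).rglob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                # Zip member names use forward slashes regardless of platform.
                if member.rsplit("/", 1)[-1].lower() == wanted:
                    yield zip_path, member


# Hypothetical usage: the drive letter and file name are placeholders.
if __name__ == "__main__":
    for zip_path, member in find_in_backup_zips("E:/", "thesis.docx"):
        print(zip_path, member)
```

Once the right archive is known, zipfile.ZipFile.extract can pull out just that member, which is much faster than restoring everything to find one file.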

I want to change my backup solution for both Windows and Linux. Essentially, I want to be able to clone the hard drive on a regular basis, but not maintain parity with my computer every minute. Keeping the backup identical to the original all the time (like RAID 1) doesn't allow me to restore files in the case of an accidental deletion (a secondary goal). At the same time, creating a clone of the hard drive every week would undoubtedly be a drain on system resources when dealing with large hard drives.

At the moment, I don't know my ideal backup solution, but I will keep looking for it and will post my findings. For now, I'll keep using Windows Backup and Restore and KBackup while having a better understanding of the very real pros and cons associated with them. A summary of my wants and lessons learned is below.

What I want in a backup solution

  • Backup of my entire hard drive occurs regularly (but not a RAID 1 style mirror)
  • Backup is bootable
  • Backup is readable when mounted by another computer
  • Backup is created while the computer is in use (I would prefer to not have to use an external system to periodically mirror my hard drive)
  • Backup software is Free Open Source Software (technically this is optional, but I doubt I'll go with any solution that isn't open source)

Lessons learned

  • Always check critical media before it is needed
  • Always make sure the tools you have do what you think they do
  • Always make sure you fully understand the meaning and implications of the options you have selected
  • Testing your backup system before you depend on it is a good thing
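One way to act on these lessons without wiping a drive each time is to restore a backup to a scratch location and compare it against the live files. The following is a minimal Python sketch of that comparison using SHA-256 checksums. The function names are my own, and reading each file fully into memory is a simplification that would need chunked reads for very large files.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in root.rglob("*"):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(original, restored):
    """Return (missing, changed): files absent from, or different in, restored."""
    before = file_digests(original)
    after = file_digests(restored)
    missing = sorted(set(before) - set(after))
    changed = sorted(
        name for name in before.keys() & after.keys()
        if before[name] != after[name]
    )
    return missing, changed
```

An empty "missing" list would have told me up front whether application files were being skipped, and "changed" flags files whose backed-up contents no longer match the live copies (which may simply mean the backup is stale).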