PEAR Text_Diff doesn’t split words on punctuation

The PEAR Text_Diff system’s inline renderer has a silly word-splitting algorithm: it treats only spaces and newlines (\n) as word boundaries.

This causes problems with punctuation. Suppose you are diffing the following two sentences:

The quick cat jumped over the lazy fox.
The quick cat jumped over the lazy dog.

The final rendered output will look like this:

The quick cat jumped over the lazy <del>fox.</del><ins>dog.</ins>

Notice how the period is treated as part of the word? That makes messy markup. This comparison is worse:

The quick cat jumped over the lazy fox, who was totally lazy and should be shot.
The quick cat jumped over the lazy fox.

Here’s how PEAR Text_Diff does the diff:

The quick cat jumped over the lazy <del>fox, who was totally lazy and should be shot.</del><ins>fox.</ins>

This final diff is difficult to read. You are not deleting and reinserting fox; you are just changing what follows it on the right. But because the inline diff renderer only considers space and newline as word boundaries, it treats "fox," and "fox." as two different words and misses this basic punctuation issue.

It took me 1.5 hours of PHP code review to figure out the system, but the fix itself is painfully easy. Edit PEAR/Text/Diff/Renderer/inline.php. At lines 158 and 159 (per the online source code), you’ll see " \n" at the end. That string is the set of word-boundary characters, passed as a mask to the PHP strspn function. Simply add your extra word boundaries (a period and a comma, for example) between the quotes, and the diff engine splits words correctly.
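To see why the mask matters, here is a minimal Python sketch of mask-based word splitting. It is a model of the behaviour described above, not the PEAR source: the helper names strspn, strcspn, and split_on_words are hypothetical stand-ins, and it assumes the splitter pairs a "run of boundary characters" scan with a "run of non-boundary characters" scan, the way PHP's strspn and strcspn do.

```python
def strspn(text, mask, start=0):
    """Length of the run of characters that ARE in mask, beginning at start."""
    end = start
    while end < len(text) and text[end] in mask:
        end += 1
    return end - start


def strcspn(text, mask, start=0):
    """Length of the run of characters that are NOT in mask, beginning at start."""
    end = start
    while end < len(text) and text[end] not in mask:
        end += 1
    return end - start


def split_on_words(text, boundaries=" \n"):
    """Split text into tokens, each made of leading boundary characters plus one word."""
    words = []
    pos = 0
    while pos < len(text):
        lead = strspn(text, boundaries, pos)          # boundary characters before the word
        body = strcspn(text, boundaries, pos + lead)  # the word itself
        words.append(text[pos:pos + lead + body])
        pos += lead + body
    return words


# Default mask: the period stays glued to the last word.
print(split_on_words("the lazy fox."))            # ['the', ' lazy', ' fox.']
# Extended mask (an assumed choice): the period becomes its own token.
print(split_on_words("the lazy fox.", " \n.,"))   # ['the', ' lazy', ' fox', '.']
```

With the extended mask, "fox." and "fox," both start with the same " fox" token, so a word-level diff can keep fox and mark only the punctuation and the text after it as changed.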

I’ve reported this as PHP PEAR bug 16774.

Gallery 3, Windows 2008 R2, and IIS 7

EDIT: Gallery’s maintainers decline to fully support Gallery 3 on IIS. See http://gallery.menalto.com/node/90281 for more info.

Yes, you can run Gallery 3 on Windows 2008 and IIS 7. Here’s how I did it:

  1. Clean install of Windows 2008 R2 x64. NOTE: These days, 32-bit is pretty ridiculous. The instructions below are only guaranteed to work on x64.
  2. Install the Web Server (IIS) role. I think this will also force a portion of the Application Server role to be installed.
  3. Install PHP 5.3. Just go through the default installation steps. I used the latest VC9 x86 Non Thread Safe version from the Windows binary download page.
  4. Install MySQL Community Edition for Windows x64. I used default options through the process.
  5. Download phpMyAdmin. Unzip and copy files to C:\inetpub\wwwroot\phpmyadmin.
  6. Visit http://localhost/phpmyadmin, sign in using your MySQL root account, and create a new database for Gallery 3.
  7. Download Gallery 3. As of this writing, the latest version is beta 2.
  8. Extract files and place in C:\inetpub\wwwroot\gallery3.
  9. If you run Gallery right now, it will squawk about missing some PHP settings that are in its .htaccess file. That file is not read by IIS, so you must apply those settings a different way:
    1. Create C:\inetpub\wwwroot\gallery3\.user.ini and open it with a text editor. (You might need to launch Notepad as administrator because of the protection Windows gives to files in C:\inetpub\.) Yes, you do need the period before user in the filename.
    2. Add these lines:
      short_open_tag = 1
      magic_quotes_gpc = 0
      magic_quotes_sybase = 0
      magic_quotes_runtime = 0
      register_globals = 0
      session.auto_start = 0
      upload_max_filesize = 20M
      post_max_size = 100M
      date.timezone = "America/Chicago"

      Note that date.timezone is set because of an additional problem between Gallery 3’s underlying Kohana framework and PHP 5.3. Use your own time zone in place of America/Chicago.
  10. Create a new directory at C:\inetpub\wwwroot\gallery3\var. Edit its permissions and give the Users and IIS_IUSRS groups Modify permissions. NOTE WELL: Generally, you should use the principle of least privilege and only give enhanced privileges to the smallest number of users possible, which means not the Users group. I’ll revise in the future if I confirm that only IIS_IUSRS–or even a specific account–is all you need.
  11. Set up URL rewriting (the IIS equivalent of mod_rewrite):
    1. Download and install the URL Rewrite Module x64.
    2. In Server Manager, click on Server Manager > Roles > Web Server (IIS) > Internet Information Services (IIS) Manager. To the right, expand your web server, then Sites, and click on your gallery3 directory.
    3. Click URL Rewrite then Import Rules…
    4. Copy the mod_rewrite rules, including the IfModule directives, from the end of Gallery 3’s .htaccess file and paste them into the Rewrite rules field of the Import mod_rewrite rules screen. Remove the # character at the beginning of each line; otherwise, the lines are treated as comments.
    5. Delete the line containing RewriteBase. It is not supported, and the rules will not import until it is removed.
    6. Click Apply on the right hand side.
  12. Now run Gallery 3 setup at http://localhost/gallery3.
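
If you set up more than one server, step 9 can be scripted. The following is a minimal Python sketch, not part of Gallery or PHP: the names USER_INI_SETTINGS, render_user_ini, and write_user_ini are hypothetical, and the values are simply the ones listed in step 9 (including the America/Chicago time zone, which you should change to your own).

```python
from pathlib import Path

# Settings copied from step 9. The time zone is only an example value.
USER_INI_SETTINGS = {
    "short_open_tag": "1",
    "magic_quotes_gpc": "0",
    "magic_quotes_sybase": "0",
    "magic_quotes_runtime": "0",
    "register_globals": "0",
    "session.auto_start": "0",
    "upload_max_filesize": "20M",
    "post_max_size": "100M",
    "date.timezone": '"America/Chicago"',
}


def render_user_ini(settings):
    """Return the settings as "key = value" lines, one per line."""
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


def write_user_ini(site_root, settings=USER_INI_SETTINGS):
    """Write .user.ini into site_root and return the path that was written."""
    target = Path(site_root) / ".user.ini"
    target.write_text(render_user_ini(settings), encoding="ascii")
    return target
```

Calling write_user_ini(r"C:\inetpub\wwwroot\gallery3") from a Python session started as administrator should produce the same file as the manual steps; the elevated session is needed for the same reason Notepad needs it.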

Voilà, you have Gallery 3 on IIS.

This may seem like a lot of steps, but it’s actually not much different than a setup on Ubuntu. It’s easier than how it used to be with IIS 6 or PHP 5.2. Kudos to Microsoft and The PHP Group for a dramatically easier setup process.

Mint.com = fail

This blog post was to be about converting to Mint.com. I’m instead sticking with Microsoft Money.

Mint.com’s philosophy, and biggest failure, is low effort. They want a low-effort user experience, but they have a low-effort technical staff: instead of finding simplified ways of doing complex tasks, they just leave them out!

For example, recurring transactions. Microsoft Money has a “bills” feature that tracks and auto-enters my recurring transactions–paychecks, investments, mortgage payment, church donation, utility bills, etc.

Sure, this is “complicated” because I must manually schedule these transactions. But it removes complexity because they are pre-entered before my monthly bill-paying session.

Mint.com doesn’t have a hint of this. It even lacks logic to suggest recurring transactions–that could have allowed them to simplify an otherwise complex feature.

Another example is manual transactions. Mint.com is reactive: it only has what it downloads from financial service providers. You can’t enter transactions yourself.

That’s a disaster for my checking account. I have no record of a check until it’s deposited!

How do you track outstanding checks, including ones that have sat undeposited for weeks or months? How do you know your true available balance? With Mint.com, it has to be some separate log that you constantly monitor and update. No way, that’s terribly error-prone!

Thanks to Microsoft Money, I don’t bounce checks!

Mint.com, on the other hand, requires a gigantic cash pad, loins girded for overdraft fees, or tricky accounting using other programs.

Mint.com is a fail. Its slick user interface redeems it from epic fail. But behind the user interface is a painfully simplistic system. I can appreciate the complexity of the infrastructure needed to support this system, but I cringe at how little it really does for its users.

Above, I wrote that I’m sticking with Microsoft Money. That is only for now; I don’t know where I’m going. Quicken suffers from a kludgy user interface and Intuit’s anti-consumer business practices. Plus it can’t convert my Money data yet.

Rumor has it that Quicken 2010 will have better Microsoft Money import capabilities. I’m still with Microsoft Money for a few more months.