When Idealism Meets The Real World: Google Reader Was The Last Straw

There was a time when Google was the shining beacon of geekdom, when tales of their crazy interview process, fancy chefs, and 20% time were spoken of in reverent whispers.

I’m realizing now that I held onto that fantasy for a lot longer than was realistic.

While I love my freakishly good job at OpenNMS (lots of working from home, open-source software, good people), Google is the one place I’d always thought I’d at least entertain if the right thing came along.

Last week, I got an email from a Google recruiter (I get one every year or so, just checking in). I told him the usual: that I wasn’t looking to move, but that I’m always interested to hear about opportunities from Google. He replied a few days ago, asking when we could talk.

Then they announced Google Reader was going away.

When I realized I was losing something that I spend at least 60% of my web browsing time in, I finally consciously reevaluated my feelings on Google. And then, I responded to the recruiter:

Hey, sorry it’s taken a bit long to . . .


Be Careful What You Match For, You Might Not Get It

I ran into a really interesting issue with Java regular expression matching while working on a problem for a customer.

OpenNMS has the ability to listen for syslog messages and turn them into OpenNMS events. To configure it, you specify a mapping of substrings or regular expressions to UEIs (OpenNMS’s internal event identifiers).

The customer saw a huge drop in performance going from 1.8.0 to 1.8.1. Basically, the only change to the syslog daemon was the switch from Matcher.matches() to Matcher.find(). The problem was that, back on 1.8.0, they had been writing regular expressions like this:

foo0: .*load test (\\S+) on ((pts\\/\\d+)|(tty\\d+))

…which weren’t matching, because matches() only succeeds when the expression covers the entire string. So they changed it to put .* at the front, so matches() would get it:

.*foo0: .*load test (\\S+) on ((pts\\/\\d+)|(tty\\d+))
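
To make the difference concrete, here is a minimal sketch of how the two expressions behave under matches() and find(). It assumes the doubled backslashes above are Java string escapes; the class name SyslogMatchDemo and the sample log line are made up for illustration and are not taken from OpenNMS.

```java
import java.util.regex.Pattern;

class SyslogMatchDemo {
    // The two expressions from the post, written as Java string literals.
    static final String ORIGINAL = "foo0: .*load test (\\S+) on ((pts\\/\\d+)|(tty\\d+))";
    static final String PREFIXED = ".*" + ORIGINAL;

    // Hypothetical syslog line, invented for this example.
    static final String LINE = "Jan  1 00:00:00 host foo0: starting load test alice on pts/3";

    // matches() succeeds only if the regex covers the whole input.
    static boolean matchesWhole(String regex, String line) {
        return Pattern.compile(regex).matcher(line).matches();
    }

    // find() succeeds if the regex matches any substring of the input.
    static boolean findsAnywhere(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println("matches(), original: " + matchesWhole(ORIGINAL, LINE));  // false
        System.out.println("find(),    original: " + findsAnywhere(ORIGINAL, LINE)); // true
        System.out.println("matches(), prefixed: " + matchesWhole(PREFIXED, LINE));  // true
        System.out.println("find(),    prefixed: " + findsAnywhere(PREFIXED, LINE)); // true
    }
}
```

The original expression fails under matches() only because the line has a timestamp and host in front of "foo0:". The leading .* soaks those up, which is why the workaround did the job with matches().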

Upon upgrading to 1.8.1, they saw an orders-of-magnitude slowdown. The reason is that when you haven’t specified an anchor, find() has to figure out the “right” starting point for the match. In doing so, it spins a LOT compared to matches() and its implicit anchors. It’s very expensive to scan all the way through the string, re-applying the regex at each position, if it turns out there is no match. We . . .
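
One rough way to see this extra work without relying on wall-clock timing is to count how many characters the regex engine reads. The sketch below wraps the input in a CharSequence that counts charAt() calls, then runs a non-matching line through matches() and find(). Everything here is an assumption made for illustration: the names FindVsMatchesCost and CountingSequence, the 2000-character test line, and the idea that character reads are a fair proxy for the cost the customer saw. Exact counts depend on the JDK, so none are quoted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatchesCost {
    // Wraps a String and counts every character the regex engine reads.
    static final class CountingSequence implements CharSequence {
        private final String text;
        long reads = 0;

        CountingSequence(String text) { this.text = text; }

        @Override public int length() { return text.length(); }
        @Override public char charAt(int index) { reads++; return text.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) { return text.subSequence(start, end); }
        @Override public String toString() { return text; }
    }

    static final String UNPREFIXED = "foo0: .*load test (\\S+) on ((pts\\/\\d+)|(tty\\d+))";
    static final String PREFIXED = ".*" + UNPREFIXED;

    // Hypothetical log line that cannot match: just filler characters.
    static String nonMatchingLine(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('x');
        }
        return sb.toString();
    }

    // Returns the number of charAt() calls made while failing to match.
    static long readsFor(String regex, String line, boolean useFind) {
        CountingSequence seq = new CountingSequence(line);
        Matcher m = Pattern.compile(regex).matcher(seq);
        boolean matched = useFind ? m.find() : m.matches();
        if (matched) {
            throw new IllegalStateException("expected a non-matching line");
        }
        return seq.reads;
    }

    public static void main(String[] args) {
        String line = nonMatchingLine(2000);
        System.out.println("matches(), leading .*: " + readsFor(PREFIXED, line, false));
        System.out.println("find(),    leading .*: " + readsFor(PREFIXED, line, true));
        System.out.println("find(),    no prefix:  " + readsFor(UNPREFIXED, line, true));
    }
}
```

With the leading .*, find() retries the whole expression from every starting offset, and each retry lets .* run to the end of the line and backtrack, so the reads grow roughly with the square of the line length. matches() makes a single attempt from the start of the line.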
