Wednesday, March 10, 2010

Instrument Your Apps

Last week a current client called me with an issue on a custom app I built for them over a year ago. I was lucky to remember what the application did, let alone the details of its inner workings. It took less than five minutes to log into the box, find the issue, determine exactly when it had occurred and fix the problem. Literally, 4 minutes, 37 seconds.

The app was instrumented. In this particular instance, it was the simplest kind of instrumentation: exceptions go to the Windows Event Log. This type of instrumentation or logging is the bare minimum for any well-implemented piece of software. When errors happen, you need to log them, and since there are great toolkits out there for you to use (log4net), there is no excuse not to. Find a toolkit you like and master it -- it won't take much time.
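As a rough sketch, that bare-minimum pattern looks something like this in .NET (the "MyCustomApp" source name and the unit of work are placeholders, and registering an event source the first time requires admin rights):

```csharp
using System;
using System.Diagnostics;

try
{
    ProcessNightlyImport();  // stand-in for whatever the app actually does
}
catch (Exception ex)
{
    // register the event source once (admin-only; usually done at install time)
    if (!EventLog.SourceExists("MyCustomApp"))
        EventLog.CreateEventSource("MyCustomApp", "Application");

    // the full exception, including stack trace, goes to the Application log
    EventLog.WriteEntry("MyCustomApp", ex.ToString(), EventLogEntryType.Error);
    throw;  // log it, then let it surface -- don't swallow the error
}
```

With that in place, "find the issue and determine exactly when it occurred" is a two-minute trip through Event Viewer.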

The hardest part of doing good instrumentation is getting started. Once the framework is in place, everything after that is easy. The best way to get started is to pick one piece of information that you really need to know and get instrumentation working around that single point of data. Because I always like to know who's on the system, I normally start with user log-ons and log-offs. By logging each time a user logs on and logs off, I can spot traffic patterns and potential application load issues. I can also report back to clients on how many people are using the system; clients love that kind of hard data as it helps them justify the cost of developing the application in the first place.
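A minimal sketch of that starting point using log4net (the class name and message format here are illustrative, not from the original app):

```csharp
using log4net;

public class SessionAudit
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(SessionAudit));

    public static void LogOn(string userName)
    {
        // one line per event makes it trivial to count sessions later
        Log.InfoFormat("LOGON user={0}", userName);
    }

    public static void LogOff(string userName)
    {
        Log.InfoFormat("LOGOFF user={0}", userName);
    }
}
```

Because each event is a single, consistently formatted line, pulling daily user counts for a client report is a simple grep-and-count exercise.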

Using Recursion in SharePoint

Using Microsoft's SharePoint as a Content Management System can present some challenges. One of the issues I once had with a client was that of accessing data that was only available on the individual content pages themselves. Unlike other types of web applications that store data in custom back end databases, when the information that you are looking for lives on a particular content page, it can be difficult to get at that information if you want to aggregate it in any way.

For this particular client, the information we were looking for was the latitude and longitude of a location that was stored on the content page. We were using Google Maps to present a filterable map of all of the company's locations. Since the locations, and in fact all of the data, existed only on the content pages themselves, we had to find a way to extract that data and store it so that it could be displayed efficiently when a user wanted to look at the map.

Our solution looked like this:

1. A timer job would run every 10 minutes and iterate through all of the levels of the site (the site had a user-defined, ragged hierarchy, so we used recursion to iterate through all of the sub-sites of a given sub-site).
2. The timer job then took the results of the search and dropped them into an XML document that was stored in a standard SharePoint document library.
3. The map page then read from this XML document to produce a map with the locations and the appropriate meta-data for each location.

protected void findProperties(SPWeb w, XElement category)
{
    if (ListExists(w, "Pages"))
    {
        foreach (SPListItem i in w.Lists["Pages"].Items)
        {
            if (i.ContentType.Name == "propertyPageType")
            {
                try
                {
                    // create the XML node for this property page
                    // and add it to the XML document
                }
                catch (Exception ex)
                {
                    // log the bad item, then track it in a separate list
                    // (LogError is a stand-in for the job's logging helper)
                    LogError(
                        "MapRefreshJob - Find Properties / Create Property Node",
                        "Bad Property Data: " + i["Title"] + ex.ToString());
                    badProperties.Add(new XElement("property",
                        new XAttribute("title", i["Title"] != null ? i["Title"].ToString() : "No Title"),
                        new XAttribute("url", i.Web.ServerRelativeUrl + "/" + i.Url)));
                }
            }
        }
    }

    // recurse into every sub-site of this site
    foreach (SPWeb wChild in w.Webs)
    {
        findProperties(wChild, category);
    }
}
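For completeness, the timer job driving steps 1 and 2 can be sketched roughly as follows. This is an assumption-laden outline, not the actual job: the class name, the site URL, and the "MapData" library name are all placeholders, and error handling is elided.

```csharp
using System;
using System.Text;
using System.Xml.Linq;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

public class MapRefreshJob : SPJobDefinition
{
    public MapRefreshJob() : base() { }

    public MapRefreshJob(string jobName, SPWebApplication webApp)
        : base(jobName, webApp, null, SPJobLockType.Job) { }

    public override void Execute(Guid targetInstanceId)
    {
        // placeholder URL -- point this at the real site collection
        using (SPSite site = new SPSite("http://intranet"))
        using (SPWeb root = site.OpenWeb())
        {
            XElement properties = new XElement("properties");
            findProperties(root, properties);  // recursive walk shown above

            // drop the aggregated XML into a standard document library
            root.GetFolder("MapData").Files.Add(
                "locations.xml",
                Encoding.UTF8.GetBytes(properties.ToString()),
                true);  // overwrite the previous snapshot
        }
    }
}
```

The 10-minute cadence would be set when the job is installed, e.g. by assigning an `SPMinuteSchedule` with `Interval = 10` to the job's `Schedule` property and calling `Update()`.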


Fix It (And I Mean *Really* Fix It) or Leave it Alone

I've been meaning to write this bit for years now, and what I've seen this past week has finally convinced me to get this off my chest. I can't count the number of engagements where I've been taken aside by a manager or executive and asked to "help" with a "little problem".

The conversation typically starts out with: "We've got this process around X, and there are three paper forms and one spreadsheet that all of the Production Managers fill out each month . . . . . ". The client proceeds to describe an innocent little process that over the years has morphed into a monstrous, soul devouring monthly chore dreaded by Production Managers everywhere.

This is the part where I get excited; "A business process to automate! Paper forms to eliminate! Busywork to decimate! People to make happy!" I think gleefully to myself.

Then the client hits me with that old one-two punch: "There's this one form, in the middle, that we want to track a little better -- can we whip something up quick and cheap to iron out that one wrinkle?" This is the part where I die inside, just a little bit. Sure, with some bubble gum, some duct tape, and some chicken wire, we could patch something together that is slightly better than what we have now, but why? Why even bother? Either fix it, and I mean *really* fix it. Or just leave it alone.

And even though I know the answer already, I always try: "Look," I say, "there are some really great ways that we could streamline the entire process. Production Managers won't have to look up SKUs (or whatever) in System A and write them on paper, only to have them re-entered into System B by Betty Sue in Accounting."

And I always get the same response, "Oh yeah! That would be great, but we don't
  • have the money, or
  • have the time, or
  • have users that could handle the change, or
  • want to do that now, we're going to do it in three years"
And, because I have a family to feed and a mortgage to pay, I get out the bubble gum, and the duct tape and the chicken wire and proceed to fix up that "one little wrinkle" in an otherwise horrific, soul sucking process. When all I really want to do is yell: "Either fix it, and I mean *really* fix it. Or just leave it alone."

Fix It (And I Mean *Really* Fix It)

If you are going to start tinkering with known bad business processes in your organization, you either have to just fix the entire thing, or don't bother. There are lots of good reasons to improve processes in an organization: getting better, cleaner data; getting data faster; eliminating manual tasks that could be automated to free up time for more productive activities, etc. However, making an incremental improvement to something that is working poorly often isn't worth the effort for a number of reasons.

First, the time it takes for someone to ramp up on a process, and the time it takes to troubleshoot, debug and deploy a patch to an existing process, is always dramatically underestimated. Even a "quick" solution ends up taking a disproportionate amount of time given the normally small incremental improvement it delivers. The return on investment for a small fix normally isn't a positive one. If you are going to get in and figure out the entire sordid mess, you may as well go a little bit further and just fix the thing.

The "just make a quick fix" mentality is normally accompanied by the "we'll just have users test it" and the "it's a small change so it should be easy" mentalities. None of these are good. Trying to fix a part of a broken process will often result in a more broken process rather than a less broken process. To make the small change in the right way, you have to understand the entire system and create a test strategy for the entire system. If you've already done that, why not make the code changes to fix the entire system?

Most importantly, if you don't fix the entire system, (almost) no one cares. An incremental fix to a painful process is like fixing a cavity half-way. Even though the pain is less, everyone is still in pain. The people that do care: the accountants. You've gone to the well once for money for this system and now going back again is going to be twice as hard. And, everyone is still in pain.

Leave It Alone

If you can't *really* fix the problem, then just leave it alone. Explain to the powers that be that to mitigate the risk of breaking the process in yet another way, you'd need to spend more money. Explain that ramping up developers on the process and the system costs money. Essentially, testing and ramp-up are fixed costs -- that money is gone whether you fix 1% or 100% of the problem. Let the problem fester. Leave it alone until it causes so much pain that someone in the organization is willing to pony up the dough to fix it the right way.

Thursday, March 04, 2010

Test Data

I had a client once that had one of the most robust processes I have ever seen for QA'ing code. There were 6, count them, 6 steps. Code moved from:
  1. a Developer's local machine to,
  2. the Proofing server to,
  3. the Test 1 server to,
  4. the Test 2 server to,
  5. the QA server to,
  6. Production (hooray!!!)

Now, if this were for an Air Traffic Control system, or the embedded system on a fighter jet, or even the OS for a machine used in surgery, that would be one thing. But this was to validate business rules from a completely separate system that didn't do a very good job of validating data. But hey, why not be careful, right? Here's the sad part: for all of the byzantine, crazy deployment kung-fu employed -- there were almost always unpleasant surprises when code finally hit production.

But why weren't these issues caught in any of the previous five stages, you ask? One reason: bad test data. The data that was used in steps 1 - 5 was so different from the production data as to be useless for testing the behavior of the application. There are some right ways and some wrong ways to deal with test data; let's take a look.


First and foremost, make sure you are scrambling any personal data that identifies clients or customers. Ask yourself: if my laptop is stolen and the data is accessed, what is the worst that could happen? Obvious things like Social Security numbers, passwords, PINs, names and addresses need to be cleaned so that if the data is lost or stolen, it can't be used for nefarious purposes. This seems pretty obvious, but I've seen clients who don't think twice about dropping the latest backup of production data onto a developer's laptop -- Social Security numbers, names, addresses and yearly salaries of clients intact and unscrubbed.

Beware, though, of shooting yourself in the foot by over-scrubbing your data. If you are doing searches or sorting on Social Security numbers, then flipping them all to 111-11-1111 isn't the best idea. The trick is to scrub the data in such a way as to keep the sample valid and the testing meaningful. Create a scrubbing routine that you can run against your production data that swaps out the first three digits of the SSN from one client with the first three digits from a different client, and does the same for the last four digits; that way you maintain the integrity of the data set, yet won't be in the newspaper if your laptop is stolen.
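A rough sketch of that kind of routine, generalizing the pairwise swap described above into a shuffle of each SSN segment across the whole set (this assumes SSNs formatted as "123-45-6789"; in practice you'd run something like this against the database, not an in-memory list):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SsnScrambler
{
    // Shuffle each segment independently across rows: every output value
    // is built from real segments, so searching and sorting still behave
    // realistically, but no row keeps its original SSN.
    public static List<string> Scramble(List<string> ssns)
    {
        var rnd = new Random();
        var first3 = ssns.Select(s => s.Substring(0, 3)).OrderBy(_ => rnd.Next()).ToList();
        var mid2   = ssns.Select(s => s.Substring(4, 2)).OrderBy(_ => rnd.Next()).ToList();
        var last4  = ssns.Select(s => s.Substring(7, 4)).OrderBy(_ => rnd.Next()).ToList();

        return Enumerable.Range(0, ssns.Count)
            .Select(i => first3[i] + "-" + mid2[i] + "-" + last4[i])
            .ToList();
    }
}
```

Note that with an independent shuffle a row can, by chance, end up near its original value; for small or highly sensitive sets you'd want to explicitly verify no output matches its input.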


Over time, your data sets will change, and most likely grow larger. If you're using a snapshot of data that has been scrubbed, it will be less and less representative of your production data as time goes by. It's a good idea to have a process for reloading production data at regular intervals; that way your search routine that works great on 500,000 records won't fail spectacularly when you move it to production with 2.6 million records.

All in all, your test data needs to look like your production data in all of the ways that are meaningful. Otherwise, all of your testing will be for naught.

Tuesday, January 12, 2010

Customer Experience Strategy

This is the first part of a multi-part series on what I like to call the Customer Experience Strategy as it relates to small to mid-size consulting firms.

I once had an interviewer tell me that at their firm, they like to hire people that can roll out of a moving car, walk into a customer and just start billing.  The ability for a consultant to be autonomous on a client engagement is highly valued by large and small consulting firms and it should be.  Today's market is too competitive for upper management at a firm to have to be involved in the day-to-day operations of each consultant.  Consultants need to be able to operate on their own, and act as the face of the firm to the client, handling both technical issues and business issues, including developing additional business as opportunities arise. 

However, firms and consultants need to be careful that this independence does not translate into a dilution of the firm's brand or the customer's experience.  When a client engages a consultant from company XYZ, in addition to a particular set of technical and business skills that will be unique to the consultant, they should be getting a framework around the consultant's work product that is consistent across all consultants (whether sub-contractors or not) that come from company XYZ.  For a firm, this framework is the Customer Experience Strategy (CES).

A consistent and well executed CES is one of the things that sets true consulting firms apart from run-of-the-mill "body shops".  A good CES is a convergence of good consulting practices and good marketing practices.  In addition to highly qualified people, the CES should define what it means to engage the services of a particular firm.  All too often, firms strive to define themselves as top tier by virtue of having "the best people" and stop there.  While a successful firm must have great people, it must also present the work of these "great people" in a consistent and effective manner or risk losing both new opportunities and profit.

In the next post: Specific elements of a good Customer Experience Strategy.