Several of us saw a demo of Coradiant Truesight yesterday (first mentioned in the BbWorld Monitoring post). I spent most of the demo trying to come up with the name Jeff Goldblum, because one of the team giving the demo had the voice and mannerisms of the actor’s characters. Had he mentioned a butterfly, I definitely would have clapped. The other presenter reminded me of John Hodgman.
Something I had not noticed at the time: a recurring point of having Truesight is to tell our users, “Here is evidence the problem is on your end and not ours.” This assumes the users are rational or will even believe the evidence. Their first preference is that the problem never occurred; a resolution comes second. Preventing every problem, especially issues outside our domain, is probably outside the scope of the budget we receive. So, we are left with resolving the issues. Especially scary are the users who take evidence that the problem is on their end or their ISP’s end to mean, “This is all your fault.”
Resolutions we can offer are:
- Hardware change – We can replace or alter the configuration of the hardware components of the network, storage, database, or application.
- Software change – We can alter the configuration of the software components of the network, storage, database, or application.
- Request a code change from a vendor – We can work with our vendors to get a code change. These take forever to implement.
- Suggest a user resolve the issue –
  - We can provide a workaround (grudgingly accepted; remember, the preferred wish is that the problem never occurred).
  - We can suggest configuration changes the user can make to resolve the problem.
Truesight provides us information to help us try to resolve issues. Describing the information provided as “facts” was a nice touch. At Valdosta State, I gave up on users reporting their browsers accurately and captured the information from the User-Agent header. Similarly, at the USG, I’ve found that the browser version users report disagrees with the User-Agent string ~30% of the time. Heck, they get the name of the class wrong ~40% of the time. My favorite is the report that something took 15 minutes when all I could find was that it took four minutes. Ugh. Because Truesight captures the header info, it ought to be much easier to confirm what users were doing and where problems occurred, and more accurately than the users can describe.
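For illustration, here is a minimal sketch of checking a user's self-reported browser against the User-Agent header. It is not how Truesight does it; the function names and the short pattern list are my own assumptions, and real User-Agent strings have far more variety than these four patterns cover.

```python
import re

# Hypothetical, deliberately short pattern list. Order matters: Chrome's
# User-Agent also contains "Safari", so Chrome is checked before Safari.
_PATTERNS = [
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("Internet Explorer", re.compile(r"MSIE (\d+)")),
    ("Safari", re.compile(r"Version/(\d+).*Safari/")),
]


def browser_from_user_agent(user_agent):
    """Return (browser name, major version), or None if nothing matches."""
    for name, pattern in _PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return name, int(match.group(1))
    return None


def report_matches(reported_name, reported_major, user_agent):
    """True if the user's reported browser agrees with the header."""
    detected = browser_from_user_agent(user_agent)
    if detected is None:
        return False  # cannot confirm the report either way
    name, major = detected
    return name.lower() == reported_name.strip().lower() and major == reported_major


if __name__ == "__main__":
    header = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
    # The user insists they are on Firefox 3; the header says otherwise.
    print(browser_from_user_agent(header))
    print(report_matches("Firefox", 3, header))
```

Running a comparison like this over the help desk tickets is how a figure such as the ~30% disagreement rate could be tallied.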
After receiving all the “facts”, we still have to determine the cause. Truesight helps us understand the scope of the problem: how many users, how many web servers, and how many pages are affected by slowness, and to what degree. As a DBA and administrator, my job of identifying the cause ought to be easier, though how much easier is probably difficult to quantify.
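The scope question above amounts to counting distinct users, servers, and pages among the slow requests. A rough sketch follows, assuming request records with hypothetical `user`, `server`, `page`, and `seconds` fields and an arbitrary five-second threshold for "slow". None of this reflects Truesight's actual data format.

```python
from statistics import median


def summarize_scope(requests, slow_threshold_seconds=5.0):
    """Count how many distinct users, servers, and pages saw slow requests.

    `requests` is a list of dicts with hypothetical keys:
    "user", "server", "page", and "seconds".
    """
    slow = [r for r in requests if r["seconds"] >= slow_threshold_seconds]
    return {
        "slow_requests": len(slow),
        "users": len({r["user"] for r in slow}),
        "servers": len({r["server"] for r in slow}),
        "pages": len({r["page"] for r in slow}),
        # Median response time of the slow requests, as a rough "degree".
        "median_seconds": median([r["seconds"] for r in slow]) if slow else None,
    }


if __name__ == "__main__":
    sample = [
        {"user": "alice", "server": "web1", "page": "/quiz", "seconds": 6.0},
        {"user": "bob", "server": "web1", "page": "/quiz", "seconds": 8.0},
        {"user": "alice", "server": "web2", "page": "/grades", "seconds": 10.0},
        {"user": "carol", "server": "web2", "page": "/home", "seconds": 0.4},
    ]
    print(summarize_scope(sample))
```

If the slow requests cluster on one server or one page, that narrows the search; if they cluster on one user, the problem is more likely on the user's end.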
Part of why: (Mostly speculation.) Problems identified as a spike in anything other than “Host” are external causes. These are causes in front of the device. Causes behind the device are “Host”. If these were more narrowly broken down, then maybe we could better determine the cause. That would require information web browsers typically would not have, like the server processing time, query processing time, or even the health of the servers.