Page View Metric Dying

First, the Metricocracy measured hits. Pictures and other junk on pages inflated the results, so the Metricocracy decided on either unique visitors or page views. Now the Metricocracy wants us to measure attention. Attention is engagement: how much time users spend on a page.

What do we really want to know? Really it is the potential value of the property. The assumption around attention is the longer someone spends on a web site, the more money that site gains in advertisement revenue. The rationale being users who barely glance at pages and spend little time on the site are not going to click ads. Does this really mean users who linger and spend large amounts of time on the site are going to click more ads?

This means to me attention is just another contrived metric which doesn’t measure what is really sought. I guess advertisement companies and the hosts brandishing them really do not want to report the click through rates?

My web browsing habits skew the attention metric way higher than it ought to be. First, I have a tendency to open several items in a window and leave them lingering. While my eyes spent a minute looking at the content, the page spent minutes to hours in a window… waiting for the opportunity. Second, I actively block images from advertisement sources and block Flash except when required.

As a DBA, I find page views also have debatable usefulness. On the one hand, we could use them because they represent a count of objects requiring calls to the database and rendering by application and web server code. Hits represent all requests for all content, simple or complex, so they are more inclusive. Bandwidth throughput represents how much data is sucked out of or pushed into the systems.
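
To make the difference between hits, page views, and bandwidth concrete, here is a minimal Python sketch that counts all three from web server log lines. It assumes Common Log Format and a made-up rule that any request ending in an image, CSS, or JavaScript extension is not a page view. The sample lines and the extension list are illustrative, not our real logs.

```python
import re

# Assumption: Common Log Format; the request is the quoted "METHOD path PROTOCOL" part.
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) [^"]*" (\d{3}) (\d+|-)')

# Assumption: these extensions mark supporting files rather than pages.
ASSET_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def summarize(lines):
    """Return (hits, page_views, bytes_sent) for an iterable of log lines."""
    hits = page_views = bytes_sent = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that do not look like requests
        path, _status, size = match.groups()
        hits += 1
        if size != "-":
            bytes_sent += int(size)
        # Drop any query string before checking the extension.
        if not path.split("?", 1)[0].lower().endswith(ASSET_EXTENSIONS):
            page_views += 1
    return hits, page_views, bytes_sent


SAMPLE = [
    '10.0.0.1 - - [01/Oct/2007:10:00:00 -0400] "GET /course/home HTTP/1.1" 200 5120',
    '10.0.0.1 - - [01/Oct/2007:10:00:01 -0400] "GET /img/logo.gif HTTP/1.1" 200 2048',
    '10.0.0.1 - - [01/Oct/2007:10:00:01 -0400] "GET /css/site.css HTTP/1.1" 304 -',
    '10.0.0.2 - - [01/Oct/2007:10:00:05 -0400] "POST /quiz/submit?id=7 HTTP/1.1" 200 1024',
]

print(summarize(SAMPLE))
```

Every matching line counts as a hit, only the two non-asset requests count as page views, and the byte sizes add up to the bandwidth figure.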

We DBAs also provide supporting information to the project leaders. Currently they look at the number of users or classrooms who have been active throughout the term. Attention could provide another perspective to enhance the overall picture of how much use our systems get.

Cat Finnegan, who conducts research with GeorgiaVIEW tracking data, measures learning effectiveness. To me, that is the ultimate point of this project. If students are learning with the system, then it is successful. If we can change how we do things to help them learn better, then we ought to make that change. If another product can help students learn better, then that is the system we ought to use.

Ultimately, I don’t think there is a single useful metric. Hits, unique users, page views, attention, bandwidth, active users, etc., all provide a nuanced view of what is happening. I’ve used them all for different purposes.

Using An LMS As a Network Drive

Ran across a video describing how to get the WebDAV info in CE6 (aka Blackboard Vista 4 Lite) for the purpose of using CE6 as a network drive.

The narrator says this is a good idea because if the site has good policies, then backups are being made. In the event of a site disaster, you can recover your files from it.

This is a HORRIBLE idea.

  1. An LMS is unlikely to be sized in such a way to store backups of all user content. The IT administration will end up buying more expensive storage for CE6 than for other desktop backup solutions.
  2. By placing content unrelated to classes, you will contribute to making the CE6 site slower.
  3. The IT administration will not be able to recover a single file for you should you make a mistake. They will have to restore the whole database, place a CE6 web server in front of it, and get the one file for you. It’s a more expensive investment in time to recover your content.

Use a backup system to do backups. Use an online instruction system to instruct.

Section Archive Easter Egg

Blackboard support wants a backup of a section to replicate the behavior the instructor claimed. Fine. I started making the backup, but I ended up closing the browser to go home.

This morning, when I checked, the backup was not in the Vista file manager at the course level where I created it. So, I created another backup. That one was created at the course level as the first should have been.

Independently, I checked the web server logs and discovered the first had completed. So, it should have been put in the Vista file manager. The next place I looked was the institution level. Sure enough, the section archive was there.

So, I guess this means there is an operation that takes place in a subsequent web page to move the section archive from the institution learning context to the course learning context. Because the web browser was no longer managing the backup, the later operation was not conducted.

Too bad it’s a feature and not a bug…

Finding Sessions

Clusters can make finding where a user was working a clusterf***. Users end up on a node, but they don’t know which node. Heck, we are ahead of the curve to get user name, date, and time. Usually checking all the nodes in the past few days can net you the sessions. Capturing the session ids in the web server logs usually leads to finding an error in the webct logs. Though not always. Digging through the web server logs to find where the user was doing something similar to the appropriate activity consumes days.
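
The search across nodes can be sketched in Python roughly as below. Everything specific here is an assumption for illustration: the node names, the log line layout (timestamp, user name, session id, path), and the sample data are invented, since real web server log formats depend on how the servers are configured.

```python
from datetime import datetime

# Hypothetical per-node log lines: "ISO-timestamp user session_id path"
NODE_LOGS = {
    "node1": [
        "2007-10-01T09:58:12 jdoe S-111 /login",
        "2007-10-01T10:02:40 asmith S-222 /quiz/start",
    ],
    "node2": [
        "2007-10-01T10:05:03 jdoe S-333 /quiz/submit",
        "2007-10-03T14:00:00 jdoe S-444 /mail",
    ],
}


def find_sessions(node_logs, user, start, end):
    """Return {(node, session_id): [paths]} for one user within [start, end]."""
    found = {}
    for node, lines in node_logs.items():
        for line in lines:
            parts = line.split()
            if len(parts) != 4:
                continue  # ignore lines that do not match the assumed layout
            stamp, who, session_id, path = parts
            when = datetime.fromisoformat(stamp)
            if who == user and start <= when <= end:
                found.setdefault((node, session_id), []).append(path)
    return found


sessions = find_sessions(
    NODE_LOGS, "jdoe", datetime(2007, 10, 1), datetime(2007, 10, 2)
)
for (node, session_id), paths in sorted(sessions.items()):
    print(node, session_id, paths)
```

With the session ids in hand, the next step would be searching the application logs on just those nodes instead of all of them.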

Blackboard Vista captures node information for where the action took place. Reports against the tracking data are more concise, more easily understood, and more quickly compiled. They are fantastic for getting a clear understanding of what steps a user took.

Web server logs contain every hit which includes every page view (well, almost, the gap is another post). Tracking data represents at best 25% of the page views. This problem is perhaps the only reason I favor logs over tracking data. More cryptic data usually means a slower resolution time not faster.

Another issue with tracking is the scope. When profiling student behavior, it is great. The problem is only okay data can be located for instructors while designers and administrators are almost totally under the radar. With the new outer join, what we can get for these oblivious roles has been greatly expanded.

Certainly, I try not to rely too much on a single source of data. Even I sometimes forget to do so.

Dumbfounded By The Numbers

Chancellor Erroll B. Davis Jr. told the Georgia Board of Regents, “We grew essentially by a large university.” The USG gained 10,077 students (my alma mater has ~11,000) in a year. They calculate these fall term to fall term.

In the same fall term to fall term time period, in the same university system, GeorgiaVIEW gained about 59,000 students (assumes 1/10th of 65,000 active user growth are instructors/designers). It’s only 9x the system growth rate. It actually reflects a slowing in the growth rate for GeorgiaVIEW. Partly this is because we are fast approaching the number of potential users. Market penetration becomes more difficult when most people are already using it.

Fortunately, users will become more intelligent in their use over time. So, even though the number of users may plateau, because each user will use the system more, the amount of use will continue to increase.

Unfortunately, another DBA and I consider the number of users a more or less uninformative statistic. It looks good in newspapers as it’s something the general public probably understands. Other numbers mean more for us:

  1. Hits – The count of items downloaded from the web servers. We often use hits as a measure of user activity. Unfortunately, we are only collecting this as daily or monthly values.
  2. Who Is Online (Total / Active) – SQL pulls from the WIO table a count of all the rows (Total) and those whose time in the table is recent (Active). Both have issues… For example, users failing to log out inflate the total. Active has weird spikes, which suggests to me these tables are reaped every 1/2 hour or so.
  3. Storage – Amount of information stored by the users. For example, our storage growth is 2.23 times the previous year (slowing down from 2.25). The number of new users has largely slowed, but the amount of storage staying fairly consistent means to me the users are doing more with the system.
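
As a rough illustration of the Total and Active counts in item 2, here is a Python sketch using an in-memory SQLite table. The table name `wio`, its columns, and the 15-minute definition of "recent" are all assumptions for the example. The real Vista schema and our actual SQL are not shown here.

```python
import sqlite3
from datetime import datetime, timedelta

now = datetime(2007, 10, 1, 12, 0, 0)  # fixed "current time" so the example is repeatable

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the Who Is Online table.
conn.execute("CREATE TABLE wio (username TEXT, last_seen TEXT)")
conn.executemany(
    "INSERT INTO wio VALUES (?, ?)",
    [
        ("jdoe", (now - timedelta(minutes=2)).isoformat()),
        ("asmith", (now - timedelta(minutes=10)).isoformat()),
        ("never_logged_out", (now - timedelta(hours=5)).isoformat()),
    ],
)

# Assumption: "recent" means seen within the last 15 minutes.
cutoff = (now - timedelta(minutes=15)).isoformat()

# Total counts every row, including users who never logged out.
total = conn.execute("SELECT COUNT(*) FROM wio").fetchone()[0]
# Active counts only rows seen since the cutoff.
active = conn.execute(
    "SELECT COUNT(*) FROM wio WHERE last_seen >= ?", (cutoff,)
).fetchone()[0]

print(f"Total: {total}, Active: {active}")
```

The stale row shows how a user who never logged out inflates Total while leaving Active alone, at least until the table is reaped.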

Amy’s presentation at BbWorld 2007 on capacity planning is a much more authoritative approach than this blog post.
🙂

Coradiant TrueSight

Several of us saw a demo of Coradiant TrueSight yesterday (first mentioned in the BbWorld Monitoring post). I spent most of the demo trying to recall the name Jeff Goldblum, as one of the team giving the demo had the voice and mannerisms of the actor’s characters. Had he mentioned a butterfly, I definitely would have clapped. The other presenter reminded me of John Hodgman.

Something I had not noticed at the time: a recurring point of having TrueSight is to tell our users, “Here is evidence the problem is on your end and not ours.” This assumes the users are rational or will even believe the evidence. Their preference is that the problem never occurred; a resolution comes second. Preventing every problem, especially issues outside our domain, probably is outside the scope of the budget we receive. So, we are left with resolving the issues. Especially scary are the users who take evidence the problem is on their end or their ISP’s end to mean, “This is all your fault.”

Resolutions we can offer are:

  1. Hardware change – We can replace or alter the configuration of the hardware components of the network, storage, database, or application.
  2. Software change – We can alter the configuration of the software components of the network, storage, database, or application.
  3. Request a code change from a vendor – We can work with our vendors to get a code change. These take forever to implement.
  4. Suggest a user resolve the issue
    1. We can provide a work around (grudgingly accepted, remember the preferred wish is the problem never occurred).
    2. We suggest configuration changes the user can make to resolve the problem.

TrueSight provides us information to help us try to resolve issues. Describing the information provided as “facts” was a nice touch. At Valdosta State, I gave up on users reporting their browsers accurately and captured the information from the User-Agent header. Similarly, at the USG, I’ve found users’ reports disagree with the User-Agent string about the browser version ~30% of the time. Heck, they have errors in the name of the class ~40% of the time. My favorite is the report that something took 15 minutes when all I could find was that it took four minutes. Ugh. Because TrueSight is capturing the header info, it ought to be much easier to confirm what users were doing and where problems occurred, and more accurately than the users can describe.
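
Capturing the browser from the header rather than from the user’s report might look something like this Python sketch. The patterns are a deliberately small, assumed subset (Firefox, Internet Explorer, Safari) and the sample string is illustrative. Real User-Agent parsing has many more cases than this, and this is not what TrueSight itself does.

```python
import re

# Order matters: check the most specific tokens first.
# Assumption: only these three browser families need recognizing.
PATTERNS = [
    ("Firefox", re.compile(r"Firefox/(\d+(?:\.\d+)*)")),
    ("Internet Explorer", re.compile(r"MSIE (\d+(?:\.\d+)*)")),
    ("Safari", re.compile(r"Version/(\d+(?:\.\d+)*).*Safari/")),
]


def browser_from_user_agent(user_agent):
    """Return (name, version) parsed from a User-Agent string, or (None, None)."""
    for name, pattern in PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return name, match.group(1)
    return None, None


def report_matches(reported_name, reported_version, user_agent):
    """True when the user's report agrees with the header on name and major version."""
    name, version = browser_from_user_agent(user_agent)
    if name is None:
        return False
    same_name = name.lower() == reported_name.lower()
    same_major = version.split(".")[0] == reported_version.split(".")[0]
    return same_name and same_major


ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(browser_from_user_agent(ua))
print(report_matches("Internet Explorer", "7", ua))
```

Here the user says Internet Explorer 7 while the header says 6.0, which is exactly the kind of disagreement described above.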

After receiving all the “facts”, we still have to determine the cause. TrueSight helps us understand the scope of the problem: how many users, how many web servers, and how many pages are affected by slowness, and to what degree. As a DBA and administrator, my job identifying cause ought to be easier, though how much easier is difficult to quantify.

Part of why: (Mostly speculation.) Problems identified as a spike in anything other than “Host” are external causes. These are causes in front of the device. Causes behind the device are “Host”. If these were more narrowly broken down, then maybe we could better determine cause. That would require knowledge web browsers typically would not have, like the server processing time, query processing time, or even the health of the servers.

tag: Blackboard Inc, Coradiant, user agent

A More Usable Usability

Previously I have seen usability described as the ease of using a web site. These four essences of usability are interesting.

I believe that to satisfy customers, a Web site must fulfill four distinct needs:

  • Availability: A site that’s unreachable, for any reason, is useless.
  • Responsiveness: Having reached the site, pages that download slowly are likely to drive customers to try an alternate site.
  • Clarity: If the site is sufficiently responsive to keep the customer’s attention, other design qualities come into play. It must be simple and natural to use – easy to learn, predictable, and consistent.
  • Utility: Last comes utility — does the site actually deliver the information or service the customer was looking for in the first place?

Web Usability: A Simple Framework

The first two items deal with system administration issues like the network, server(s), database, or application. Redundancy and proactively dealing with problems before they impact the system hopefully maximize availability. Optimization for performance hopefully maximizes responsiveness. An unhealthy database could fail to deliver information.

The last two items deal with design issues. More utility issues are likely based in design than tuning.


UPDATE: In my past life as a “Webmaster,” my fingers were dirty in all four aspects of usability. These were my servers and while not my design, I certainly influenced it by cleaning up the HTML and presentation. We created in-house everything except some outsourced photography and the Apache web server.

Blackboard’s Vista is a proprietary application with decent opportunities for instructional designers to provide clarity and utility. As much as it provides, clients often purchase or create additional applications to integrate with Vista to fill in holes Blackboard left. Okay, technically, WebCT left those holes, but Blackboard took the same model with Academic Suite. Blackboard doesn’t really intend to fill in those holes. They should for issues affecting most of their customers on each platform. This is the same approach taken by open source products, with the caveat that third party companies are not filling in the holes; customers are developing their own solutions and providing them back to the community.

The declining responsiveness of Vista over time definitely seems to create one frustrating difficulty for some clients. As the database tables get larger, responsiveness of the sites declines. Ouch. Delete it all… Oh, wait… Can we really do that?

BbWorld Presentation Redux Part II – Monitoring

Much of what I might write in these posts about Vista is knowledge accumulated from the efforts of my coworkers.

This is part two in a series of blog posts on our presentation at BbWorld ’07, on behalf of the GeorgiaVIEW project, Maintaining Large Vista Installations (2MB PPT).

Part one covered automation of Blackboard Vista 3 tasks. Next, let’s look at monitoring.

Several scripts we have written are in place to collect data. One of the special scripts connects to Weblogic on each node to capture data from several MBeans. Other scripts watch for problems with hardware, the operating system, database, and even login to Vista. Each server (node or database) has, I think, 30-40 monitors. A portion of the items we monitor is in the presentation. Every level of our clusters is watched for issues. The data from these scripts are collected into two applications.

  1. Nagios sends us alerts when values from the monitoring scripts on specific criteria fall outside of our expectations. Green means good; yellow means warning; red means bad. Thankfully none in our group are colorblind. Nagios can also send email and pages for alerts. Finding the sweet spot where we get alerted for a problem but avoid false positives perhaps is the most difficult.
  2. An AJAX application two excellent members of our Systems group created, called Stats internally, creates graphs of the same monitored data. Nagios tells us a node failed a test. Stats tells us when the problem started, how long it lasted, and if others also displayed similar issues. We also can use Stats to watch trends. For example, we know there are two daily peaks from watching WIO usage rise to a noonish peak, slough off by ~20%, and peak again in the evening fairly consistently over weeks and months.
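
A monitoring script that feeds Nagios can be sketched like this in Python. Nagios plugins report state through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The load-average metric, the node names, and the warning and critical thresholds here are made-up values for illustration, not the ones we use.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_threshold(value, warn, crit):
    """Map a measured value to a Nagios state; higher values are worse."""
    if value is None:
        return UNKNOWN  # the probe could not get a reading
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_result(name, value, warn, crit):
    """Return (exit_code, one-line status message) for a single monitor."""
    state = check_threshold(value, warn, crit)
    return state, f"{name} {LABELS[state]} - value={value} warn={warn} crit={crit}"


# Hypothetical readings: 1-minute load average per node.
for node, load in [("node1", 1.2), ("node2", 6.5), ("node3", 12.0)]:
    code, message = format_result(f"load[{node}]", load, warn=5.0, crit=10.0)
    print(code, message)

# A real plugin would finish with sys.exit(code) so Nagios sees the state.
```

Finding the sweet spot mentioned in item 1 amounts to tuning `warn` and `crit` per monitor until real problems page us and normal peaks do not.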

We also use AWStats to provide web server log summary data. Web server logs show activity of the users: where they go, how much, etc.

In summary, Nagios gives us a heads up there is a problem. Stats allows us to trend performance of nodes and databases. AWStats allows us to trend overall user activity.

Coradiant TrueSight was featured in the vendor area at BbWorld. This product looks promising for determining where users encounter issues. Blackboard is working with them, but I suspect it’s likely for Vista 4 and CE 6.

We have fantastic data. Unfortunately, interpreting the data proves more complex. Say the load on a server hosting a node starts climbing, hits the point where we get pages, and continues to climb. What does one do? Remove it from the cluster? Restart it? Restarting it will simply shift the work to another node in the cluster. Say the same happens with the database. Restarting the database will kick all the users out of Vista. Unfortunately, Blackboard does not provide a playbook on what to do with every support possibility. Also, if you ask three DBAs, then you will likely get three answers.
😀

It’s important to balance underreaction and overreaction. When things go wrong, people want us to fix the problem. Vista is capable of handling many faults and not handling very similar faults. The linked example was a failed firewall upgrade. I took a similar tack with another firewall problem earlier this week. I ultimately had to restart the cluster that evening because it didn’t recover.

Part three will discuss the node types.

Connotations of a Pronoun

Ezra Freelove, Information Technology

“When she saw that the web address was wrong on letterhead, she helped us correct the problem. Thank you, Ezra!”
Valdosta State University I Caught You Caring

I do recall an occasion while at VSU in which I noticed a memo telling people to go to an address using “www.” when the host didn’t support that as an alias of the host. So I contacted the DNS folks and got new aliases so it would work.

Why she? It suggests whoever wrote this knows very little about me.