Automated Testing

On a call today, our new vendor asked that we verify every web site works before having them apply service packs. Our analyst said, “We can do that.” I pointed out that the problem behind the present concern happens about one time in ten, on one site, on one server of the instance. To catch it, we would need 10 views of the login page for each of 18 sites on each of 30 servers. That is 5,400 page views.

The conundrum came up because when the service pack was applied to the test environment, some sites on one server failed this check. Over time the failures cleared and the sites returned. We have monitoring in place to check that a single site on each server works with a login and logout. This check is super-sensitive to changes. Originally this check was on a functional evaluation site, but it broke every other week because someone changed a color, icon, etc. That was with 7. With 111, we would go mad.

Clearly, I am going to have to develop automated testing to verify the sites on each of their servers before and after service pack application. Too bad the vendor does not make sure everything works after they make changes to our systems.
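
Here is a minimal sketch of the kind of check I have in mind: log in and out of every site on every server enough times to flush out a 1-in-10 failure. All hostnames, paths, form fields, and credentials below are placeholders, not our actual setup. Run before and after the service pack, diffing the two failure lists would show exactly what the patch broke.

```python
# Minimal sketch of a login/logout smoke test across servers and sites.
# Hostnames, paths, form fields, and credentials are placeholders;
# the real check would use our actual URLs and a test account.
import requests

SERVERS = [f"node{n:02d}.example.edu" for n in range(1, 31)]  # 30 servers (hypothetical names)
SITES = [f"site{s:02d}" for s in range(1, 19)]                # 18 sites (hypothetical names)
ATTEMPTS = 10  # the failure showed up roughly 1 time in 10

def check_login_logout(server: str, site: str) -> bool:
    """Log in and out of one site on one server; True if both succeed."""
    base = f"https://{server}/{site}"
    try:
        with requests.Session() as s:
            resp = s.post(f"{base}/login",
                          data={"username": "testuser", "password": "secret"},
                          timeout=30)
            if resp.status_code != 200:
                return False
            resp = s.get(f"{base}/logout", timeout=30)
            return resp.status_code == 200
    except requests.RequestException:
        return False

failures = []
for server in SERVERS:
    for site in SITES:
        for attempt in range(ATTEMPTS):
            if not check_login_logout(server, site):
                failures.append((server, site, attempt))

# 30 servers x 18 sites x 10 attempts = the 5,400 page views from above
print(f"{len(failures)} failed checks out of {len(SERVERS) * len(SITES) * ATTEMPTS}")
for server, site, attempt in failures:
    print(f"FAIL {server}/{site} attempt {attempt + 1}")
```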

State of the LMS

Watched an informative WebEx about The State of the LMS: An Institution Perspective, presented jointly by Delta Initiative and California State University. A true innovator in this market could become the leader.

Market share numbers annoy me. These are always self-reported numbers from a survey. The sample sizes are rarely impressive, and when broken down they don’t really represent the market. DI didn’t post a link to where they got the numbers, just the name of the group. Some digging turned up this Background Information About LMS Deployment from the 2008 Campus Computing Survey. For background information, it is woefully lacking in essentials such as sample size, especially the breakdown of the types of institutions in the categories.

The numbers DI quotes from Campus Computing are very different from what the Instructional Technology Council reports for the same year: Blackboard market share 66% (DI/CC) vs 77% (ITC). An 11-point difference is huge when the next largest competitor sits at 10% (DI/CC).

Other critical information is missing: Are these longitudinal numbers, that is, did the same respondents participate in every year the survey quotes? Or is there high turnover, meaning an almost completely different set of people answers every year, so the survey relies entirely on the randomness of who is willing to respond? If so, the numbers could shift just because people decline to answer, showing Blackboard with reduced market share only because Moodle customers are more willing to respond to questions about their system.

Most of the major LMS products on the market started at a university or as part of a consortium involving universities. I knew the background of most of the products in Figure 1. Somehow I never put that together.

Will another university take the lead and through innovation cause the next big shakeup? I would have thought the next logical step for the DI presentation to address would be the innovative things universities are doing which could have an impact. Phil described Personal Learning Environments (though not by name) as potentially impacting the LMS market, but he was careful to say PLEs are really an unknown. There were no statements about brand-new LMSs recently entering or about to enter the market.

Figure 1: Start year and origin of LMSes. Line thickness indicates market share based on Campus Computing numbers. From the DI WebEx.


When people use my project as an example, it gets my attention. GeorgiaVIEW was described slightly incorrectly on page 26, “Trends: Changing definition of ‘centralization’”.

  1. We do not have an instance per institution, which would carry a significantly higher licensing cost. We do give each institution its own URL to provide consistency for their users. Changing bookmarks, web pages, portals, etc. everywhere a URL is listed is a nightmare, so we try to minimize the impact when we move them by keeping a single unchanging URL. We have 10 instances for the 31 institutions (plus 8 intercampus programs like Georgia ONmyLINE) we host. Learn 9 will not have Vista’s multiple-institution capability, so should we migrate to Learn 9, an instance per institution would have to happen.
  2. We have two primary data centers, not a primary and a backup data center. By having multiple sites, we keep our eggs in multiple baskets.

The primary point about splitting into multiple instances was correct. We performed the two splits because Vista 2 and 3 exhibited performance issues based on both the amount of usage and data. With ten instances we hit 4,500 users active in the past 5 minutes (corrected from 20,000; see the update below) but should be capable of 50,000 based on the sizing documents. We also crossed 50 million hits and 30 million page views. We also grow by over a terabyte a term now. All these numbers are still accelerating (growing faster every year). I keep hoping to find we hit a plateau.

Figure 2: LMS consortia around the United States. From the DI WebEx.


All this growth, in my mind, means people generally find us useful. I would expect fewer active users and less data growth if everyone hated us. Of course, the kids on Twitter think GeorgiaVIEW hates them. (Only when you cause a meltdown.)

UPDATE: Corrected the active users number. We have two measures, active and total. 20,000 is the total of all sessions. 4,500 were active in the past 5 minutes. Thanks to Mark for reading and finding the error!

Page View Metric Dying

First, the Metricocracy measured hits. Pictures and other junk on pages inflated the results, so the Metricocracy settled on either unique visitors or page views. Now the Metricocracy wants us to measure attention. Attention is engagement: how much time users spend on a page.

What do we really want to know? Really, it is the potential value of the property. The assumption behind attention is that the longer someone spends on a web site, the more money that site gains in advertising revenue. The rationale: users who barely glance at pages and spend little time on the site are not going to click ads. But does that really mean users who linger and spend large amounts of time on the site are going to click more ads?

To me this means attention is just another contrived metric which doesn’t measure what is really sought. I guess advertising companies and the hosts brandishing their ads really do not want to report click-through rates?

My web browsing habits skew the attention metric way higher than it ought to be. First, I tend to open several items in windows and leave them lingering. While my eyes spent a minute looking at the content, the page spent minutes to hours in a window… waiting for its opportunity. Second, I actively block images from advertising sources and block Flash except when required.
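
To make concrete why my habits skew it, here is roughly how a naive dwell-time calculation works: a visitor’s time on a page is inferred from the gap until their next request, so a tab left open all afternoon counts as hours of “attention.” This is a toy sketch with fabricated data, not any analytics vendor’s actual algorithm.

```python
# Toy sketch of a naive "attention" (dwell time) calculation.
# Time on a page is inferred as the gap between a visitor's consecutive
# requests -- which is exactly why a lingering open tab inflates it.
from datetime import datetime

# (visitor, timestamp, path) -- fabricated example data
requests_log = [
    ("alice", datetime(2009, 8, 26, 9, 0, 0), "/quotes"),
    ("alice", datetime(2009, 8, 26, 9, 1, 0), "/about"),    # 1 real minute on /quotes
    ("bob",   datetime(2009, 8, 26, 9, 0, 0), "/quotes"),
    ("bob",   datetime(2009, 8, 26, 13, 0, 0), "/contact"), # 4 "attentive" hours: an idle tab
]

dwell = {}        # path -> total seconds of inferred attention
by_visitor = {}   # visitor -> (timestamp, path) of their previous request
for visitor, ts, path in sorted(requests_log, key=lambda r: (r[0], r[1])):
    if visitor in by_visitor:
        prev_ts, prev_path = by_visitor[visitor]
        dwell[prev_path] = dwell.get(prev_path, 0) + (ts - prev_ts).total_seconds()
    by_visitor[visitor] = (ts, path)

for path, seconds in dwell.items():
    print(f"{path}: {seconds / 60:.0f} minutes of 'attention'")
# /quotes: 241 minutes -- bob's idle tab counts the same as real reading
```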

To a DBA, page views also have debatable usefulness. On the one hand, they represent a count of objects requiring calls to the database and rendering by application and web server code. Hits represent all requests for all content, simple or complex, so they are more inclusive. Bandwidth throughput represents how much data is pulled out of or pushed into the systems.
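
For the curious, here is a rough sketch of how all three measures fall out of the same web server log. It assumes the Apache combined log format, and my page-view heuristic (anything that is not a static asset) is a simplification of however the real counting is done.

```python
# Sketch: tallying hits, page views, and bandwidth from an access log.
# Assumes Apache combined log format; treating any request without a
# static-file extension as a "page view" is a rough heuristic.
import re

LINE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)[^"]*" (\d{3}) (\d+|-)')
STATIC = (".gif", ".jpg", ".png", ".css", ".js", ".ico")

hits = page_views = bytes_sent = 0
with open("access_log") as log:
    for line in log:
        m = LINE.match(line)
        if not m:
            continue
        path, status, size = m.groups()
        hits += 1                      # every request is a hit
        if size != "-":
            bytes_sent += int(size)    # bandwidth: response body size
        # crude page-view test: not a static asset
        if not path.split("?")[0].lower().endswith(STATIC):
            page_views += 1

print(f"hits={hits} page_views={page_views} GB={bytes_sent / 1e9:.2f}")
```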

We DBAs also provide supporting information to the project leaders. Currently they look at the number of users or classrooms that have been active throughout the term. Attention could provide another perspective to enhance the overall picture of how much use our systems get.

Cat Finnegan, who conducts research with GeorgiaVIEW tracking data, measures learning effectiveness. To me, that is the ultimate point of this project. If students are learning with the system, then it is successful. If we can change how we do things to help them learn better, then we ought to make that change. If another product can help students learn better, then that is the system we ought to use.

Ultimately, I don’t think there is a single useful metric. Hits, unique users, page views, attention, bandwidth, active users, etc., each provide a nuanced view of what is happening. I’ve used them all for different purposes.

Finding Sessions

Clusters can make finding where a user was working a clusterf***. Users end up on a node, but they don’t know which node. Heck, we are ahead of the curve if we get a user name, date, and time. Usually checking all the nodes for the past few days can net you the sessions. Capturing the session ids in the web server logs usually leads to finding an error in the webct logs, though not always. Digging through the web server logs to find where the user was doing something similar to the reported activity can consume days.
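
The hunt is scriptable, at least. Here is a sketch of the node-by-node search, with node names, log paths, and the session id pattern standing in for whatever the real cluster actually uses:

```python
# Sketch of the node-by-node log hunt. Node names, log paths, and the
# log layout are placeholders for the actual cluster.
import gzip
import re
from pathlib import Path

NODES = [f"node{n:02d}" for n in range(1, 11)]   # hypothetical node names
PATTERN = re.compile(r"jsessionid=(\w+)")        # however session ids appear in the logs

def find_sessions(username: str, date: str):
    """Yield (node, session_id, line) for log lines mentioning the user."""
    for node in NODES:
        log_path = Path(f"/logs/{node}/access_log.{date}.gz")  # assumed layout
        if not log_path.exists():
            continue
        with gzip.open(log_path, "rt", errors="replace") as log:
            for line in log:
                if username in line:
                    m = PATTERN.search(line)
                    if m:
                        yield node, m.group(1), line.rstrip()

for node, session, line in find_sessions("jdoe", "2009-08-26"):
    print(node, session)
    # next stop: grep the webct logs on that node for this session id
```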

Blackboard Vista captures node information for where the action took place. Reports against the tracking data are more concise, more easily understood, and more quickly compiled. They are fantastic for getting a clear understanding of what steps a user took.

Web server logs contain every hit, which includes every page view (well, almost; the gap is another post). Tracking data represents at best 25% of the page views. That gap is perhaps the only reason I favor logs over tracking data. More cryptic data usually means slower resolution time, not faster.

Another issue with tracking is the scope. When profiling student behavior, it is great. The problem is that only mediocre data can be found for instructors, while designers and administrators fly almost totally under the radar. With the new outer join, what we can get for these invisible roles has greatly expanded.
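
The outer-join point is easier to see with a toy example: an inner join between users and tracking rows silently drops anyone who never generated tracking data, which is exactly what happens to designers and administrators. The tables below are made up, not the actual Vista schema.

```python
# Toy illustration of why the outer join matters: made-up tables,
# not the actual Vista tracking schema.
users = [("alice", "student"), ("bob", "instructor"), ("carol", "designer")]
tracking = [("alice", "page view"), ("alice", "quiz"), ("bob", "page view")]

# Inner-join style: only users WITH tracking rows appear, so carol vanishes.
tracked = {user for user, _ in tracking}
inner = [(u, role) for u, role in users if u in tracked]

# Outer-join style: every user appears, with None where no tracking exists,
# so the under-the-radar roles at least show up in the report.
outer = [(u, role, [a for t, a in tracking if t == u] or None) for u, role in users]

print(inner)  # [('alice', 'student'), ('bob', 'instructor')]
print(outer)  # carol shows up with None -- visible, even without activity
```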

Certainly, I try not to rely too much on a single source of data. Even I sometimes forget to do so.

Made Stumbleupon.com?

Traffic to this web site “spiked” yesterday. It only tripled to about 300 page views in a day. Nothing compared to what we get at work. 🙂

I was curious why the sudden burst went almost exclusively to the Quotes to Make You Think page. The referrer for 108 of the 168 visitors that day was stumbleupon.com. The good visitors found other pages as they looked around a bit. Best I can figure, MochiMochii bookmarked my site and five others have indicated they like it.

Wow, if a single review and just a bookmark drive this much traffic, then maybe I am fortunate this page has not hit a top ranking? That could mean thousands of hits daily.

UPDATE 2007-JAN-27: Today, the traffic from these… uh… Stumblers… is over 600 page views and we have over 6 hours left in the day. I am impressed people are coming. This quotes page has always been the most popular since I created it back in 2000 or 2001. Will it hit 1,200 Monday, 20,000 Friday? Where is the ceiling? I should have remembered the principles from work…

UPDATE 2007-JAN-27 b: Ha… Topped out at 4,892. That’ll teach me to think maybe it will slow.