tracking

On the WebCT Users email list (hosted by Blackboard) there is a discussion about a mysterious directory called unmarshall which suddenly appeared. We found it under similar circumstances as others by investigating why a node consumed so much disk space. Failed command-line restores end up in this unmarshall directory.

Unmarshalling in Java jargon means:

converting the byte-stream back to its original data or object [1]

This suspiciously sounds like what a decryption process would use to convert a .bak file into a .zip so something can open the file.

This is the fourth undocumented work space where failed files sit for a while and cause problems, with no forewarning from the vendor.

Previous ones are:

  1. Failed UI backups end up in the weblogic81 (Vista 3, does this still happen in Vista 8?) directory.
  2. Failed tracking data files end up in WEBCTDOMAIN/tracking (Vista 3, apparently no longer stored this way in Vista 4/8 according to CSU-Chico and Notre Dame)
  3. Web Services content ends up in /var/tmp/ in files named Axis####axis. These are caused by a bug in DIME (like MIME) for Apache Axis. No one is complaining about the content failing to arrive, so we presume the files just end up on the system.

The files in #3 were the hardest to diagnose because we had no way to tie the data back to user activity.

Is this all there are? I need to do testing to see which of these I can cross off my list going forward in Vista 8. Failed restores are on it indefinitely for now.
:(

References:

  1. http://www.jguru.com/faq/view.jsp?EID=560072

Clusters can make finding where a user was working a clusterf***. Users end up on a node, but they don’t know which node. Heck, we are ahead of the curve if we get the user name, date, and time. Usually checking all the nodes for the past few days can net you the sessions. Capturing the session ids in the web server logs usually leads to finding an error in the webct logs. Though not always. Digging through the web server logs to find where the user was doing something similar to the appropriate activity consumes days.

Blackboard Vista captures node information for where the action took place. Reports against the tracking data are more concise, more easily understood, and more quickly compiled. They are fantastic for getting a clear understanding of what steps a user took.

Web server logs contain every hit, which includes every page view (well, almost, the gap is another post). Tracking data represents at best 25% of the page views. This problem is perhaps the only reason I favor logs over tracking data. More cryptic data usually means a slower resolution time, not a faster one.

Another issue with tracking is the scope. When profiling student behavior, it is great. The problem is only okay data can be located for instructors while designers and administrators are almost totally under the radar. With the new outer join, what we can get for these oblivious roles has been greatly expanded.

Certainly, I try not to rely too much on a single source of data. Even I sometimes forget to do so.

Blackboard Vista tracks student activity. This tracking data is viewed as a critical feature of Vista. Our instructors depended on the information until we revoked their ability to run reports themselves due to performance issues. Campus administrators can still generate reports (though some still fail). We doubt the solution to this is Blackboard improving the queries to create the reports. We favor deleting tracking data (data preserved outside of Vista) to resolve the performance issues.

We developed SQL reports to look at the tracking data where the user in question was not a student. Yes, the data is limited, but knowing when and where a user was active can help determine where to look in the logs. When we hit the performance issues, we started using these reports where the user interface reports failed to generate.

My understanding was the user interface and SQL reports on tracking were the same. Both looked at the same data. The user interface reports were just sexier wrapped in HTML and using icons. I compared a user interface report to a SQL report. Just prior to doing this, I was thinking, WebCT was stupid for not tracking when students look at the list of assessments. Turns out “Assessment list viewed” was tracked in the user interface all along but was missing in our sqlplus queries. WTF?

The data has to be there. The problem has to be that our approach in sqlplus is inadvertently excluding the information from the reports. Because these reports must be accurate, I’ll crack this nut… Or become nuts myself.

CRACKED THE NUT: So, part of the data WebCT collected was the name of pages. There is a page name table which was inner joined to the user action table. So pages without a name were not reported. George suggested an outer join. I placed it on the page name table which now lets us see the formerly missing tracked actions. For the specific case where I found this, I now get all the missing actions.

Considering a Blackboard (it’s their problem now) feature request to ensure every page in the application has a title. I consider it developer laziness (someone else said worthlessness) that some pages might not have something so core and simple.

ANOTHER TRICK: Oracle’s NVL function displays a piece of text instead of a null value. Awesome for the above.
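Here is a minimal sketch of the outer join and the null replacement, run against an in-memory SQLite database from Python rather than Oracle. The table and column names (`page_name`, `user_action`, and so on) are made up for illustration and are not the real Vista schema. `COALESCE` stands in for Oracle's `NVL` because SQLite has no `NVL`.

```python
import sqlite3

# Hypothetical schema: these table and column names are invented for
# illustration and do not come from the Vista database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_name (page_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_action (
        action_id INTEGER PRIMARY KEY,
        person_id INTEGER,
        page_id INTEGER
    );
    INSERT INTO page_name VALUES (1, 'Assessment viewed');
    INSERT INTO user_action VALUES (10, 42, 1);  -- page has a name
    INSERT INTO user_action VALUES (11, 42, 2);  -- page has no name row
""")

# Inner join: actions on pages without a name silently drop out.
INNER_SQL = """
    SELECT a.action_id, p.name
    FROM user_action a
    JOIN page_name p ON p.page_id = a.page_id
    ORDER BY a.action_id
"""

# Outer join on the page name table keeps every action; COALESCE
# (standing in for NVL) swaps the null name for a piece of text.
OUTER_SQL = """
    SELECT a.action_id, COALESCE(p.name, '(no page name)')
    FROM user_action a
    LEFT OUTER JOIN page_name p ON p.page_id = a.page_id
    ORDER BY a.action_id
"""

inner_rows = conn.execute(INNER_SQL).fetchall()
outer_rows = conn.execute(OUTER_SQL).fetchall()
print(inner_rows)
print(outer_rows)
```

The inner join returns only the action whose page has a name, while the outer join returns both actions, with the placeholder text where the name is missing.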

One of my co-workers says about tracking, “One of the big selling points to [Blackboard] Vista is the wealth of tracking data for auditing, grade challenges, and catching cheaters.” Certainly, Reports and Tracking in Blackboard’s Vista 3 is one of the favorite tools. So making sure the data gets there is critically important.

The tracking data is not immediately written to the database. Instead, it’s staged on each node and applied to the database on the schedule provided by the UI’s System Administrator role. The schedule can be hourly or daily. We normally have these set to upload daily at 5am, which is a slow part of the day. Though, going into splitting our data sets, we had temporarily set them to hourly to ensure the data would be uploaded. Coming out of the split, we lost 3 nodes in 3 hours on the first day of classes due to Java issues. Going back to daily upload of the tracking data brought the rate down to about 2-3 nodes a day. Much better!

This data is staged in the form of files on each node. In the domain directory, locate the tracking directory. In it should be files with the date and hour in the name. Each is rolled at the top of the hour. A file called .active-log contains the name of the file currently being written to, which the node should not upload. The data in these files is a little difficult to interpret. However, you can get the timestamp (Java epoch), learning context id, node, role, action, location, and person id in these logs. Working against the reporting interface or tracking data in the database would yield better results, as this information is matched against more useful information like the WebCT ID, names, etc.
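For reading those timestamps, here is a small Python sketch. It assumes "Java epoch" means milliseconds since 1970-01-01 UTC, which is what Java's `System.currentTimeMillis()` returns. I have not confirmed that against the file format, and the function name is my own.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(millis: int) -> datetime:
    """Convert milliseconds since 1970-01-01 UTC to an aware datetime.

    Assumes the tracking timestamp is in milliseconds; if it were in
    seconds, the result would land in early 1970 and look obviously wrong.
    """
    seconds, ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=ms * 1000)


print(epoch_millis_to_utc(1_000_000_000_123).isoformat())
```

Splitting off the milliseconds with `divmod` avoids floating point rounding on the fractional part.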

So, it’s possible for these files not to get written to the database for one reason or another. One error causes the rejection of the whole file (not just the offending entry). However, attempting to process the same file again does not result in duplicate entries. A short list of problems:

  • Database can only accept entries of a certain length.
  • Improper use of doublequotes.
  • Special characters.
  • Sudden node failure.

Database can only accept entries of a certain length. When an event is too long, the database encounters an error and cannot place the entry into the tracking data. The problem is Vista 3’s SP7 caused the failing length to drop to about 590 characters. An update in hotfix 2 increased the event length to 1151.

Improper use of doublequotes. Designers using doublequotes in object names (even properly paired) cause issues. It’s possible a hotfix intercepts these, as I have not seen a case of this in a really long time.

Special characters. The usual special characters work fine. So your ampersands and colons are fine. It’s the  which bother me. Monday, Amy got a report that a tracking file had not uploaded. It had one event with a length of 1154 which failed (lines of 1151 length did not have an error). The 1154 length one had a text block that looked like “   ” but when I copy that text locally to Windows, it looks like “”. I think the extra spaces bumped the length from 1151 (just under the fail point) to 1154 (just over) and caused the error.
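A quick way to hunt for entries like this before the upload runs is to measure every line. This is a rough Python sketch, not anything Vista ships. The 1151 limit comes from the numbers above. I do not know whether the database counts characters or bytes, nor the encoding of the tracking files, so the function (a name I made up) reports both the character count and the UTF-8 byte count and flags a line if either is over the limit.

```python
def find_long_events(lines, limit=1151):
    """Return (line_number, char_length, utf8_byte_length) for long entries.

    Assumption: an entry fails when it is longer than `limit`. Both the
    character count and the UTF-8 byte count are checked because the post
    does not say which one the database measures.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        entry = line.rstrip("\r\n")  # ignore the line terminator
        chars = len(entry)
        size = len(entry.encode("utf-8"))
        if max(chars, size) > limit:
            flagged.append((number, chars, size))
    return flagged


# Made-up sample entries: one at the limit, one over it, and one that is
# only over the limit once multi-byte characters are counted as bytes.
sample = ["a" * 1151, "b" * 1154, "c" * 1150 + "\u00e9\u00e9"]
print(find_long_events(sample))
```

To run it against a real file, pass the open file object as `lines`. Guessing the wrong encoding when opening the file would change the counts.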

Sudden node failure. As well as we manage things, unexpected problems happen. We had one incident where we lost 6 nodes at once. One aftermath of that event was that a tracking file would not get written to the database for two of the six. It turns out an error had been written to the end of the tracking file that was current at the time. Once the error was removed, the files were processed correctly.

The last two always make me think, “Weird.”