tracking


A common complaint instructors have about CE/Vista is that the Tracking reports don’t have recent enough data. This is what they are shown when selecting the date range:

Select a Date Range for the Report


Including here the most recent time the tracking was processed (which the application already displays to the server administrator in background jobs) would help the instructor know whether the data is as recent as 4:00 am or 1:00pm.

Maybe the time Tracking will next run ought to be displayed to the instructor too, so he or she knows whether it will run within the hour or the next morning. That might cut down on instructors running the report again and again, expecting it to magically show data which won’t be available until many hours later.

Administrators sometimes have to pick the best operational time to run Tracking. We have direct login checks running several times per hour. When Tracking ran every hour and these checks ran at the same time, the time the direct login checks took spiked. Users also complained about poor performance. So we have Tracking run in the wee hours of the morning, when users are generally not on the system.



CE/Vista Reports and Tracking displays summaries of activity. If an instructor seeks to know who clicked on a specific file, then Reports and Tracking falls down on the job.

The Course Instructor role can produce a report of the raw tracking data. However, access to the role falls under the Administration tab, so people running the system need to make a user specifically to enroll themselves at the course level to get the reports. (Annoying.)

Instead, the administrators for my campuses pass requests to generate reports up to my level of support. For providing these I have SQL to produce a report. This example is for users who clicked on a specific file. The values the SQL composer will need to alter are the learning context id and the file name.

-- SQL*Plus formatting: wide lines, long pages, fixed column widths
set lines 200 pages 9999
col user format a20
col action format a32
col pagename format a80

-- Group the output by user and count the actions for each one
clear breaks computes
break on User skip 1
compute count of Action on User

select tp.user_name "User",ta.name "Action",
      to_char(tua.event_time,'MM/DD/RR HH24:MI:SS') "Time",
      NVL(tpg.name,'--') "PageName"
  from trk_person tp, trk_action ta, trk_user_action tua,
      trk_page tpg, learning_context lc
  where tp.id = tua.trk_person_id
    and ta.id = tua.trk_action_id
    -- outer join to the page name table
    and tua.trk_page_id = tpg.id (+)
    and tua.trk_learning_context_id = lc.id
    -- alter these two values: the learning context id and the file name
    and lc.id = 1234567890
    and tpg.name like '%filename.doc%'
  order by tp.user_name,tua.event_time
/

Output

  • User aka tp.user_name – This is the student’s account.
  • Action aka ta.name – This is an artifact of the original script. You might drop it as meaningless from this report.
  • Time aka tua.event_time – Day and time the action took place.
  • PageName aka tpg.name – Confirmation of the file name. Keep it if the query filters on this column with like.

Considerations

I use the learning context id (lc.id aka learning_context.id) because in my multi-institution environment, the same section name could be used in many places. This id ensures I do not get data from multiple sections.

The tricky part is identifying the file name. HTML files generally will show up under the name in the title tag (hope the instructor never updates it). Office documents generally will show as the file name. Here are a couple of approaches to determining what to use for tpg.name (aka trk_page.name).

  1. Look at the file in the user interface.
  2. Run the report without limiting results to any tpg.name. Identify in the results the name you wish to search for and use: tpg.name = 'page name'

Most tracked actions do have a page name. However, some actions do not. This SQL is designed to print a "--" in those cases.



Every time a Vista 3 node is shut down without going through the initiated shut down process, there is a chance of incorrect data being written to the tracking files (in NodeA/tracking/). Normally it leaves strange characters or partial lines at the end of the file. This is the first time I have seen it write the contents of another log instead of the tracking data.

click - 1.0 - 1244228052889 - 1135588340001 - "nova.view.usg.edu-1244227762853-6288" - SSTU - discussion - "compiled-message-viewed" - "page name" - 558711383 -

click - 1.0 - 1244228052891 - 15.0; .NET CLR 1.1.4322)"

2009-04-23      20:58:35        0.0030  xxx.xxx.xxx.xxx    JxH1zg4fZT1LTGcpmyNW    200     GET     /webct/libraryjs.dowebct        locale=en_US    0       "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)"

Even better. The node went down on June 5th at around 3pm. The lines from the other log were from April 23rd at 8:58pm.
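The third field of a tracking line looks like a Unix timestamp in milliseconds. That is an assumption on my part, not something documented here. A minimal sketch for turning one into a readable UTC time, with a hypothetical helper name and assuming GNU date (BSD date would need -r SECONDS instead of -d @SECONDS):

```shell
# Hypothetical helper: convert an epoch-milliseconds value to a UTC time.
# Assumes the tracking timestamp is milliseconds since 1970-01-01 UTC
# and that the date command is GNU date.
epoch_ms_to_utc() {
    date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}
```

Running epoch_ms_to_utc 1244228052889 prints 2009-06-05 18:54:12. If the server clock was on US Eastern time, that is 2:54 pm, which fits the node going down around 3pm.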

Why am I surprised to see new incorrect behavior? Especially when the node was really confused?
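A rough way to spot this kind of damage is to count the fields on each line of a tracking file. The sketch below assumes the fields are separated by " - " and that a complete click record has at least 10 of them. Both numbers come from eyeballing the sample line above, not from vendor documentation, and the function name is made up.

```shell
# Hypothetical helper: list tracking lines with fewer fields than expected.
# Assumptions: fields are separated by " - " and a complete record has
# at least 10 fields (override with a second argument).
find_short_tracking_lines() {
    file=$1
    min_fields=${2:-10}
    # Skip blank lines; print file name, line number, and the short line.
    awk -F' - ' -v min="$min_fields" \
        'NF > 0 && NF < min { printf "%s:%d: %s\n", FILENAME, FNR, $0 }' "$file"
}
```

Partial lines and lines from another log both come up short on fields, so both get listed with their line numbers.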



From 2001 to 2006, Microsoft Outlook was the email client I used for work (and on my home computer to access work stuff). Back then, Exchange was not available, so a number of the features were more hacks than reality. However, it worked pretty well.

When I changed jobs, Netscape and Thunderbird were the pre-installed clients. I opted for Thunderbird. It worked pretty well for me. Calendaring was in MeetingMaker. Everything worked pretty well.

Recently work shifted to Exchange, so going back to Outlook made sense. Maybe because I have so much experience, the transition was not as bad as it might have been. Still… These are gotchas which have annoyed me lately:

  1. Editable subject usability: The emails from our client issue tracking system put the description where it’s hidden. I was really pissed that I could not edit the subject until I figured out that, unlike most software, which changes the shading to show a field is now editable, Outlook just lets me edit at any time. Also, editing the subject after it is used by something else, like a task, changes the email but not the task. (The main reason I want to change them is so they appear correctly in the task list.) Copying to a second email results in the same problem. Apparently I have to either create a new task and copy-n-paste the subject I want or forward the email to myself.
  2. Spacebar moves to next message instead of next new message: I really like the Thunderbird method of skipping to the next unread message when I hit the spacebar at the end of the current message. It even will find the next unread message in another folder. Outlook just advances to the next message.
  3. Boolean is more than OR: I had this fantastic Thunderbird filter which looked for user@ AND domain.tld. Outlook only honors OR. We have 15 admin nodes and databases which send up reports. Alerts and tickets come from a different source and are unaffected by this.
  4. Search ignores special characters: I thought in the past I had sent email to abc-defghi@domain.tld. However, the message bounced, so I searched my email for part of the address “abc-defghi” as it’s not in the address book. I got results which match “abc” not “abc-defghi”. So it ignored the hyphen and everything after. FAIL!
  5. Send email as plain text or paste as plain text: Yes, I know lots of people have HTML capable clients. I hate that Outlook puts my replies in a sickly blue font. When I copy and paste from elsewhere in the message, it changes the font. So then I have to go and do formatting to have a presentable email. I just want to type and send. I don’t care about fonts, colors, etc. If I did, then I would create a web page. … (Added 2009-JUN-03)

That’s it for now.



On the WebCT Users email list (hosted by Blackboard) there is a discussion about a mysterious directory called unmarshall which suddenly appeared. We found it under similar circumstances as others by investigating why a node consumed so much disk space. Failed command-line restores end up in this unmarshall directory.
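For the "why is this node eating disk" part, a quick sketch of how one might rank the directories under a node. The function name is hypothetical and nothing here is specific to Vista:

```shell
# Hypothetical helper: show the largest subdirectories of a directory.
# Sizes are in kilobytes (du -sk); the biggest ends up on the last line.
largest_subdirs() {
    parent=$1
    count=${2:-10}
    du -sk "$parent"/*/ 2>/dev/null | sort -n | tail -n "$count"
}
```

Pointing it at the directory above a node's work areas would put something like a bloated unmarshall directory at the bottom of the list.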

Unmarshalling in Java jargon means:

“converting the byte-stream back to its original data or object” [1]

This suspiciously sounds like what a decryption process would use to convert a .bak file into a .zip so something can open the file.

This is the fourth undocumented work space where failed files sit for a while and cause problems, with no forewarning from the vendor.

Previous ones are:

  1. Failed UI backups end up in the weblogic81 (Vista 3, does this still happen in Vista 8?) directory.
  2. Failed tracking data files end up in WEBCTDOMAIN/tracking (Vista 3, apparently no longer stored this way in Vista 4/8 according to CSU-Chico and Notre Dame)
  3. Web Services content ends up in /var/tmp/ and is named Axis####axis. These files are caused by a bug in DIME (like MIME) for Apache Axis. No one is complaining about the content failing to arrive, so we presume the files just end up on the system.

#3 was the hardest to diagnose because of a lack of an ability to tie the data back to user activity.

Is this all there is? I need to do testing to see which of these I can cross off my list going forward in Vista 8. Failed restores are on it indefinitely for now.
:(

References:

  1. http://www.jguru.com/faq/view.jsp?EID=560072


First Metricocracy measured hits. Pictures and other junk on pages inflated the results so Metricocracy decided on either unique visitors or page views. Now, the Metricocracy wants us to measure attention. Attention is engagement, how much time users spend on a page.

What do we really want to know? Really it is the potential value of the property. The assumption around attention is the longer someone spends on a web site, the more money that site gains in advertisement revenue. The rationale being users who barely glance at pages and spend little time on the site are not going to click ads. Does this really mean users who linger and spend large amounts of time on the site are going to click more ads?

This means to me attention is just another contrived metric which doesn’t measure what is really sought. I guess advertisement companies and the hosts brandishing them really do not want to report the click through rates?

My web browsing habits skew the attention metric way higher than it ought to be. First, I have a tendency to open several items in a window and leave them lingering. While my eyes spent a minute looking at the content, the page spent minutes to hours in a window… waiting for the opportunity. Second, I actively block images from advertisement sources and block Flash except when required.

As a DBA, I find page views also have debatable usefulness. On the one hand we could use them because they represent a count of objects requiring calls to the database and rendering by application and web server code. Hits represent all requests for all content, simple or complex, so they are more inclusive. Bandwidth throughput represents how much data is sucked out or pushed into the systems.

We DBAs also provide supporting information to the project leaders. Currently they look at the number of users or classrooms who have been active throughout the term. Attention could provide another perspective to enhance the overall picture of how much use our systems get.

Cat Finnegan, who conducts research with GeorgiaVIEW tracking data, measures learning effectiveness. To me, that is the ultimate point of this project. If students are learning with the system, then it is successful. If we can change how we do things to help them learn better, then we ought to make that change. If another product can help students learn better, then that is the system we ought to use.

Ultimately, I don’t think there is a single useful metric. Hits, unique users, page views, attention, bandwidth, active users, etc., all provide a nuanced view of what is happening. I’ve used them all for different purposes.



Clusters can make finding where a user was working a clusterf***. Users end up on a node, but they don’t know which node. Heck, we are ahead of the curve to get user name, date, and time. Usually checking all the nodes for the past few days can net you the sessions. Capturing the session ids in the web server logs usually leads to finding an error in the webct logs. Though not always. Digging through the web server logs to find where the user was doing something similar to the appropriate activity consumes days.

Blackboard Vista captures node information for where the action took place. Reports against the tracking data are more concise, more easily understood, and more quickly compiled. They are fantastic for getting a clear understanding of what steps a user took.
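Once a session id is in hand, checking every node's web server log for it can be scripted. A minimal sketch follows; the function name and the directory layout in the usage line are hypothetical, not our actual setup.

```shell
# Hypothetical helper: print the log files which mention a session id.
# Usage: find_session_logs SESSION_ID FILE...
find_session_logs() {
    session_id=$1
    shift
    # -l lists matching files only; -F treats the id as a plain string.
    grep -lF -- "$session_id" "$@"
}
```

For example, find_session_logs SESSIONID /logs/node*/webserver.log* would name the node logs worth reading, assuming each node's logs sit under their own directory.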

Web server logs contain every hit which includes every page view (well, almost, the gap is another post). Tracking data represents at best 25% of the page views. This problem is perhaps the only reason I favor logs over tracking data. More cryptic data usually means a slower resolution time not faster.

Another issue with tracking is the scope. When profiling student behavior, it is great. The problem is only okay data can be located for instructors while designers and administrators are almost totally under the radar. With the new outer join, what we can get for these oblivious roles has been greatly expanded.

Certainly, I try not to rely too much on a single source of data. Even I sometimes forget to do so.



Blackboard Vista tracks student activity. This tracking data is viewed as a critical feature of Vista. Our instructors depended on the information until we revoked their ability to run reports themselves due to performance issues. Campus administrators can still generate reports (though some still fail). We doubt the solution to this is Blackboard improving the queries to create the reports. We favor deleting tracking data (data preserved outside of Vista) to resolve the performance issues.

We developed SQL reports to look at the tracking data where the user in question was not a student. Yes, the data is limited, but knowing when and where a user was active can help determine where to look in the logs. When we hit the performance issues, we started using these reports where the user interface reports failed to generate.

My understanding was the user interface and SQL reports on tracking were the same. Both looked at the same data. The user interface reports were just sexier wrapped in HTML and using icons. I compared a user interface report to a SQL report. Just prior to doing this, I was thinking, WebCT was stupid for not tracking when students look at the list of assessments. Turns out “Assessment list viewed” was tracked in the user interface all along but was missing in our sqlplus queries. WTF?

The data has to be there. The problem has to be that our approach in sqlplus is inadvertently excluding the information from the reports. Because these reports must be accurate, I’ll crack this nut… Or become nuts myself.

CRACKED THE NUT: So, part of the data WebCT collected was the name of pages. There is a page name table which was inner joined to the user action table. So pages without a name were not reported. George suggested an outer join. I placed it on the page name table which now lets us see the formerly missing tracked actions. For the specific case where I found this, I now get all the missing actions.

Considering a Blackboard (it’s their problem now) feature request to ensure every page in the application has a title. I consider it developer laziness (someone else said worthlessness) that some pages might not have something so core and simple.

ANOTHER TRICK: Oracle’s NVL function displays a piece of text instead of a null value. Awesome for the above.



Our awesome sysadmins have put the user agent into our AWStats so we are tracking these numbers now. They discovered something I overlooked. Netscape 4.x is 10 times more used than 7.x or 8.x. Wowsers! Some people really do not give up on the past.

Back in the Netscape is dead post, I used this to count the Netscape 7 hits.

grep Netscape/7 webserver.log* | wc -l

Stupid! Stupid! Stupid! The above requires running for each version of Netscape. This is why I missed Netscape 4.

This is more convoluted, but I think it is a much better approach.

grep Netscape webserver.log* | awk -F'\t' '{print $11}' | sort | uniq -c | sort -n

It looks uglier, but it’s much more elegant. Maybe I ought to make a resolution for 2008 to be elegant in all my shell commands.

This version first pulls any entries with Netscape in the line. Next, the awk piece reports only the user agent string. The first sort puts all the similar entries next to each other so the uniq will not accidentally duplicate. The -c in the uniq counts. The final sort with the -n orders them by the uniq’s count. The largest will end up at the bottom.
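The pipeline above counts whole user agent strings, so the same browser version on different platforms lands on separate lines. A possible variation is to keep only the name/version token before counting. This is a sketch: the function name is hypothetical, the token pattern (name, optional digits, slash, dotted version) is my assumption about what these strings look like, and grep -o is a GNU/BSD extension rather than POSIX.

```shell
# Hypothetical helper: count hits per browser version token.
# Usage: count_browser_versions NAME FILE...
count_browser_versions() {
    name=$1
    shift
    # -o prints only the matched token, -h drops file names, -E enables
    # the extended regex; then count duplicates, smallest count first.
    grep -ohE -- "${name}[0-9]*/[0-9]+(\.[0-9]+)*" "$@" | sort | uniq -c | sort -n
}
```

count_browser_versions Netscape webserver.log* gives one line per distinct token it finds, with the most used at the bottom, same as before.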



I am blogging from the pre-conference GeorgiaVIEW meeting @ Rock Eagle yesterday afternoon and this morning. I enjoy connecting with people around the state of Georgia who use our Vista system. Most of them do not make it to BbWorld. Some hot topics:

  • Alternatives to Blackboard Vista
  • Training
    • Content repository
  • Returning Reports and Tracking to instructors.
    • Some reports still failing. One approach may be to remove tracking data from Vista database and make it available elsewhere.
  • Upgrade to Vista 4. People want a timeline, access to a training instance ASAP, and please do not do an in-place upgrade.
    • Limited shelf life on internals of Vista 3 / 4.0 – 4.1.2
    • More customers have moved or are moving to Vista 4 / CE 6 than a year ago.
    • Can take advantage of new tools available in Vista 4.
    • Data retention – policy, responsibilities (faculty, campus, OIIT)
    • Phased approach – parallel environments; at some point Vista 3 goes away and is no longer available.
    • End of Fall 2008 or Spring 2009.
  • People are both quite happy we are going to Vista 4 and disconcerted at the prospect of having to move to Vista 4, even with the move over a year away (at the worst by April 2009).
    • Export / import of non-SIS created users.
    • Training

Lovely (yeah a real person and she is) says Lovely Freelove would be one of the best names ever.

