DSID-0C090334

Working with our clients on LDAP configuration almost invariably starts with SSL certificates. Sorting out which ones are self-signed and which are intermediate takes a while. The two tools, OpenSSL and keytool, have become my friends. Working with a network admin for the client, I finally saw the legitimate certificate correctly signed by the intermediate certificate, not the self-signed one. That meant I finally saw a new error I had never seen before.

javax.naming.AuthenticationException: [LDAP: error code 49 - 80090308: LdapErr: DSID-0C090334, comment: AcceptSecurityContext error, data 525, user@host.domain.tld:
    at com.sun.jndi.ldap.LdapCtx.mapErrorCode(LdapCtx.java:3041)

Research on the error code DSID-0C090334 indicated the LDAP search username was incorrect. The Blackboard CE/Vista LDAP client lacks conveniences many clients have, such as searching deeper into a tree or across branches. In this case our clients had configured the user as “cn=account”. We looked at other clients, who had something like “cn=account,ou=group,dc=domain,dc=edu”. When we presented this discrepancy as the likely problem, the client suggested a path like the latter for us to try. I entered it and tried our test user.

It worked. They also confirmed it worked. Something to add to the wiki, I guess.
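For the wiki entry, here is a minimal sketch of pulling the “data” sub-code out of an AcceptSecurityContext error like the one above. The lookup table is the commonly documented Active Directory list, not something from the Blackboard documentation, and the function name is made up for illustration. Sub-code 525 means the user was not found, which fits the incomplete “cn=account” path.

```python
import re

# Commonly documented Active Directory sub-codes for LDAP error 49.
# (Assumption: the directory behind the LDAP server is Active Directory.)
AD_BIND_SUBCODES = {
    "525": "user not found",
    "52e": "invalid credentials",
    "530": "not permitted to log on at this time",
    "531": "not permitted to log on at this workstation",
    "532": "password expired",
    "533": "account disabled",
    "701": "account expired",
    "773": "user must reset password",
    "775": "account locked out",
}


def explain_bind_error(message):
    """Return (sub_code, meaning) for an AcceptSecurityContext error, or None."""
    match = re.search(r"AcceptSecurityContext error, data ([0-9a-fA-F]+)", message)
    if match is None:
        return None
    code = match.group(1).lower()
    return code, AD_BIND_SUBCODES.get(code, "unknown sub-code")
```

Feeding it the exception text from the log returns the sub-code and a short meaning, which is quicker than searching on the DSID string each time.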

xmllint

This Linux tool is my new best friend. We get thousands of XML files from our clients for loading user, class, and enrollment information. Some of these clients customize our software or write their own software for generating the XML.

This means we frequently get oddities in the files which cause problems. Thankfully I am not the person who has to verify these files are good. I just get to answer the questions that person has about why a particular file failed to load.

The CE/Vista import process will stop if its validator finds invalid XML. Unfortunately, the error “An exception occurred while obtaining error messages.  See webct.log” doesn’t sound like invalid XML.

Usage is pretty simple:

xmllint --valid /path/to/file.xml | head

  1. If the file is valid, then the whole file is in the output.
  2. If there are warnings, then they precede the whole file.
  3. If there are errors, then only the errors are displayed.

I use head here because our files can be up to 15MB, so this prevents the whole file from going on the screen for the first two situations.

I discovered this in researching how to handle the first error below. It came up again today. So this has been useful for catching errors in client-supplied files that failed to load.

1: parser error : XML declaration allowed only at the start of the document
 <?xml version="1.0" encoding="UTF-8"?>

162: parser error : EntityRef: expecting ';'
<long>College of Engineering &amp&#059; CIS</long>

The number before the colon is the line number. The caret xmllint prints to indicate where on the line an error occurred isn’t accurate, so I ignore it.

My hope is to get this integrated into our processes to validate these files before they are loaded and save ourselves headaches the next morning.
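As a rough idea of what a pre-load check could look like, here is a minimal sketch using only the Python standard library. It checks well-formedness only, so it is not a replacement for xmllint --valid (which also validates against the DTD), but both errors shown above are well-formedness errors. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_data):
    """Return None when the XML parses, else a (line, message) tuple."""
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        line, _column = err.position
        return line, str(err)
    return None


def check_file(path):
    """Read a file as bytes so the parser honors its encoding declaration."""
    with open(path, "rb") as handle:
        return check_well_formed(handle.read())
```

A wrapper around the nightly load could call check_file on each incoming file and skip (or flag) any file that returns a tuple instead of None.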

Failed Sessions

For exactly two months now I have been working on an issue (re-opened on Oct 7, 2009) where sessions appear to die in Blackboard Vista 8.0.2 hf1.

The first time this came up, Blackboard support wanted us to overhaul the session management. BIG-IP documents said attempting this new method was a horrible idea, so we never got on board. We agreed to conduct dupe.pl tests, which showed there wasn’t a problem with session spray, the thing the solution was designed to resolve. Stonewalled, we closed the ticket when the institution reporting it didn’t have any cases to provide us.

So our client with the issue asked us to resume work on it. The key information they provided me was that their users hit /webct/logonDisplay.dowebct. Since they use Single Sign-On (SSO) from a portal, no users should ever hit this page. Investigating these reports, I was able to find a number of cases of users hitting /webct/displayAssessment.dowebct or /webct/displayAssessmentIntro.dowebct as the guest user.

See, the guest user exists at the domain learning context. Users appear as guest before they log in or as they log out. They should not appear as guest when taking a quiz.
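The search for those cases can be sketched roughly as below. The log layout is an assumption: the sketch only requires that the username appears as its own whitespace-separated token on the same line as the request path, which may not match every web server log format. The function name is made up for illustration.

```python
# Assessment pages a guest session should never reach.
ASSESSMENT_PATHS = (
    "/webct/displayAssessment.dowebct",
    "/webct/displayAssessmentIntro.dowebct",
)


def guest_assessment_hits(lines, guest_name="guest"):
    """Return log lines where the guest user requested an assessment page."""
    hits = []
    for line in lines:
        if not any(path in line for path in ASSESSMENT_PATHS):
            continue
        # Assumption: the username is a standalone token in the log line.
        if guest_name in line.split():
            hits.append(line.rstrip("\n"))
    return hits
```

Passing an open log file works because the function only iterates over lines, so the whole log never has to be held in memory.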

So I provided this information to Blackboard with the web server logs. They wanted more cases, so I provided more. More clients reported the issue, so I had plenty of sources. Plus it pointed to this problem affecting at least 4 if not all clusters.

Next, our TSM left, so we were provided a new person unfamiliar with us. It took just the first note to make a huge mistake. “Provide us all the logs from all the nodes.” At 5GB of logs times 14 nodes in a cluster, 70GB of information for an event which took up maybe 10KB seems like overkill. So… No. I like to think of myself as proficient at system administration, which means I can gather whatever specific logs you desire.

Now we come to the second mistake. Please refrain from asking me questions already explained in the ticket. Sure, the ticket has a large amount of information. However, if I can remember what is in the ticket, then so can the people working it.

Unfortunately I had to answer a question about replicating this with: it was based on my log trolling, not actual cases of students complaining. My mistake was not going to the clients to find a description of the problem. Therefore, Blackboard wanted a WebEx so I could explain the same one sentence repeatedly. *headdesk* We agreed on me getting a case where a user could explain the problem.

As luck would have it, I got just such a case a few days later. So I captured the web server log information and sent it along with the user’s description. My laziness resulted in me not trimming the log set down to the period of the error. Therefore, this log set showed a user1 login, a user2 login, then a user1 login again. Blackboard responded this might be a case of sporadic shifting users. Hello! I guess these folks are not used to seeing the SSO login, or they would know the session shifted to another user because… it… logged… in?

Even though I pulled the entries from the f5 log showing the client IP address, Blackboard now wants us to implement a configuration change to the f5 to reflect the browser’s IP in our web server log. Getting such a change isn’t easy for us. Don’t say this is the only way to get client IPs when I… have… sent… you… client IPs. We’ve been at this impasse for 3 weeks. So I get to have another WebEx where I explain the same thing I’ve already written. *headdesk*

Maybe it is finally time to ask the people if they are at all familiar with the known issue which sounds like the issue?

VST-3898: When taking an assessment the session is not kept alive. The student’s session times out forcing the student to restart the assessment or makes them unable to complete the assessment.

We plan to implement the upgrade which resolves this issue next week. So, I am hoping this does resolve it. Also, I am tempted to just close this ticket. Should the institutions find they are still having problems in January when the students have had a few quizzes fail, then I might have forgotten how utterly completely useless Blackboard has been on this issue.

All I ask is:

  1. Know the information in the ticket so I don’t have to copy and paste from the same ticket.
  2. Don’t ask for all the logs. Tell me what logs you want to view.
  3. Don’t tell me something is the only way when I’ve already shown you another way. I’m not an idiot.
  4. Don’t ask me if the f5 log has the cookie when the entries I’ve already sent you don’t have it.

🙁

Linux Adventure Part 2

Linux Adventure Part 1 | Linux Adventure Part 3 [SOLVED]

So far in the story, I tried repairing Windows Vista, which failed to actually give me a working entry into the operating system. The Linux Live CDs were noncommittal forays into Knoppix, CentOS, and Ubuntu. All failed to turn on the wireless. An ethernet cord would have gotten me online.

So I was stuck with pretty much a brick.

My next step was to venture out to the store and buy a hard drive. The Ubuntu CD included an installer, so I used it to install a local copy. Continued research revealed my problem probably was the fact my computer came with a Broadcom 4312 card. (My brother said my problem was trying to use wireless with Linux.)

Without an ethernet connection, I ended up installing the Linux STA drivers from source by downloading them on another machine and transferring them by FTP. No good. Multiple times. I never got the system to recognize them. Other options called for installing a firmware update on the wireless card. The idea of a firmware update to the wireless card leaving me stuck on Linux worried me.

Thankfully I got home to where I have ethernet cords. By this point, I had so completely hosed things that I reinstalled Ubuntu to start over fresh. Now seeing the Internet through the LAN, Ubuntu offered me “restricted” hardware drivers. The b43 set didn’t do anything. The STA set did enable the Wireless option. Even dhclient referenced eth2! However, the wifi status light doesn’t turn on when I enable wireless. Ugh. So the drivers work better but not enough to get it working.

Also, (based on the time stamp of the file I was able to find in a backup of the problem laptop) I haven’t connected a computer to my home network since February, so I didn’t remember what was the password for the network. Finding which computer or external drive contained the information took a few hours. Yay for backups.


Linux Adventure Part 1

For about a week now I’ve been without my personal laptop as anything much more than a brick. I think tonight I am going to copy off the pictures and other important information to my desktop. From there, anything I do to make the situation worse will no longer matter as much.

Monday night, I shut down the laptop. Microsoft Vista Automatic Updates said it was working on some updates post-logout. Rather than babysit, I went to bed. I should have babysat it.
🙁

The next morning, Tuesday, starting the computer told me I had a corrupted or missing \boot\BCD. The Boot Configuration Data file is pretty important, as without one the Windows operating system doesn’t even give me a command prompt. After some research I found out I needed my Windows installation DVD only 250 miles away. This caused me so much distress I even forgot I had a spare computer with me.

So I decided to download a Linux Live CD and use that while stuck away from home. At least I would be able to research the problem and possibly fix it later. The first Live CD I tried was a downloaded iso of a flavor called Knoppix, which I remembered from many years ago. Ick. The fact that Knoppix Adriane is intended for the visually impaired slipped by me, so the computer reading everything aloud got annoying extremely quickly. I finally turned off the reading stuff, but I had a new problem. Wireless wasn’t working.

… And I was out of CD-Rs.

[Image: Macintosh LC III]

A more recent memory: a few years ago, a friend with a barely functioning Macintosh LC III (pictured right) wanted to get her stuff off it. She has brought it up a few times since, most recently to ask me to explain, without any computer-ese, why her Windows computer cannot just read 3.5″ floppies from the Mac. A coworker mentioned a Live CD of CentOS could mount the drive and transfer the data.

So, I downloaded an iso of the CentOS Live CD while I went to the store to get some discs to burn. While starting up CentOS, I downloaded Ubuntu just in case this second Live CD failed. It was a good thing, because the CentOS Live CD was prettier but no improvement at getting on the wireless.

Nor was the Ubuntu Live CD any better.

By this point, I had found a site offering a torrent of a Vista Recovery CD. The quandary was whether to go back to Windows or stick with Linux. A recovery CD off a random web site could just not work or, at worst, infect the non-functioning computer. So I installed BitTorrent and downloaded the recovery CD. I tried the Startup Repair, System Restore, and Command Prompt (to manually rebuild the booter). Since all of these failed, I decided Windows Vista was dead.

So I started looking into how to make Ubuntu work for me.

Linux Adventure Part 2 | Linux Adventure Part 3 [SOLVED]

Useful User Agents

Rather than depend on end users to accurately report the browser used, I look for the user-agent in the web server logs. (Yes, I know it can be spoofed. Power users would be trying different things to resolve their own issues not coming to us.)

Followers of this blog may recall I changed the Weblogic config.xml to record user agents to the webserver.log.

One trick I use is splitting on the double quotes in awk to pull out just the user agent. This information is then sorted by name so uniq -c can count how many of each is present. Finally, I sort again by number with the largest at the top to see which are the most common.

grep <term> webserver.log | awk -F\" '{print $2}' | sort | uniq -c | sort -n -r

This is what I will use when looking for a specific user. If I am looking at a wider range, such as the user agents for hits on a page, then I probably will add the head command to look at the top 20.
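The same pipeline can be sketched in Python for anyone who wants to fold it into a larger script. This is a rough equivalent rather than an exact one: it matches the term as a plain substring instead of a grep regular expression, it skips lines that contain no double quote, and the function name is made up. Like the awk version, it assumes the user agent is the first double-quoted field on the line.

```python
from collections import Counter


def count_user_agents(lines, term, field=1):
    """Count user agents on lines containing term, most common first.

    field=1 is the text between the first pair of double quotes,
    the same piece awk -F\" '{print $2}' prints.
    """
    counts = Counter()
    for line in lines:
        if term not in line:
            continue
        parts = line.split('"')
        if len(parts) > field:
            counts[parts[field]] += 1
    return counts.most_common()
```

Slicing the result (for example the first 20 entries) plays the role of piping to head.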

A “feature” of this is getting the build (Firefox 3.0.11) rather than just the version (Firefox 3). For getting the version, I tend to use something more like this to count a given version in the log.

grep <term> webserver.log | awk -F\" '{print $2}' | grep -c '<version>'

I have yet to see many CE/Vista URIs containing the names of web browsers, so false matches are rare. These are the most common versions one would likely find (what to grep – name – notes):

  1. MSIE # – Microsoft Internet Explorer – I’ve seen 5 through 8 in the last few months.
  2. Firefox # – Mozilla Firefox – I’ve seen 2 through 3.5. There is enough difference between 3 and 3.5 (also 2 and 2.5) I would count them separately.
  3. Safari – Apple/WebKit – In searching for this one, I would add a ‘grep -v Chrome’ to the search to eliminate Google Chrome user agents.
  4. Chrome # – Google Chrome – Only versions 1 and 2.

Naturally there are many, many others. It surprised me to see iPhone and Android on the list.
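The list above can be turned into a small classifier. This is a sketch under the assumption that the log holds standard user-agent strings (“MSIE 7.0”, “Firefox/3.0.11”, “Chrome/2.0.172.33 Safari/530.5”); the function name and the “other” bucket are made up. Chrome is checked before Safari for the same reason as the ‘grep -v Chrome’: Chrome user agents also contain the word Safari.

```python
import re

# Order matters: Chrome must be tested before the plain Safari check.
VERSIONED = (
    ("MSIE", re.compile(r"MSIE (\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+\.\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
)


def classify_user_agent(user_agent):
    """Return a label such as 'MSIE 7', 'Firefox 3.5', 'Chrome 2' or 'Safari'."""
    for name, pattern in VERSIONED:
        match = pattern.search(user_agent)
        if match:
            return f"{name} {match.group(1)}"
    if "Safari" in user_agent:
        return "Safari"
    return "other"
```

Firefox keeps the first two version numbers so 3.0 and 3.5 are counted separately, while the build number is dropped.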

Tracking Specific File Use

CE/Vista Reports and Tracking displays summaries of activity. If an instructor seeks to know who clicked on a specific file, then Reports and Tracking falls down on the job.

The Course Instructor role can produce a report of the raw tracking data. However, granting the role happens under the Administration tab, so people running the system need to make a user specifically to enroll themselves at the course level to get the reports. (Annoying.)

Instead the administrators for my campuses pass requests to generate reports up to my level of support. For providing these I have SQL to produce a report. This example is for users who clicked on a specific file. The learning context id (1234567890) and the file name (filename.doc) are what the SQL composer will need to alter.

set lines 200 pages 9999
col user format a20
col action format a32
col pagename format a80

clear breaks computes
break on User skip 1
compute count of Action on User

select tp.user_name "User",ta.name "Action",
      to_char(tua.event_time,'MM/DD/RR HH24:MI:SS') "Time",
      NVL(tpg.name,'--') "PageName"
  from trk_person tp, trk_action ta, trk_user_action tua,
      trk_page tpg, learning_context lc
  where tp.id = tua.trk_person_id
    and ta.id = tua.trk_action_id
    and tua.trk_page_id = tpg.id (+)
    and tua.trk_learning_context_id = lc.id
    and lc.id = 1234567890
    and tpg.name like '%filename.doc%'
  order by tp.user_name,tua.event_time
/

Output

  • User aka tp.user_name – This is the student’s account.
  • Action aka ta.name – This is an artifact of the original script. You might drop it as meaningless from this report.
  • Time aka tua.event_time – Day and time the action took place.
  • PageName aka tpg.name – Confirmation of the file name. Keep it if the select filters this column with like.

Considerations

I use the learning context id (lc.id aka learning_context.id) because in my multi-institution environment, the same section name could be used in many places. This id ensures I do not get data from multiple sections.

The tricky part is identifying the file name. HTML files generally will show up as the name in the title tag (hope the instructor never updates it). Office documents generally will show up as the file name. Here are a couple of approaches to determining how to use tpg.name (aka trk_page.name).

  1. Look at the file in the user interface.
  2. Run the report without limiting results to any tpg.name. Identify in the results the name you wish to search for and use: tpg.name = 'page name'

Most tracked actions do have a page name. However, some actions do not. This SQL is designed to print a “--” in those cases.
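To experiment with the join logic away from the production database, here is a sketch that runs an ANSI-join version of the report against an in-memory SQLite database. The table and column names come from the SQL above, but the tables are cut down to just those columns, every row is made up, and SQLite is standing in for Oracle (so LEFT JOIN replaces (+) and COALESCE replaces NVL). Leaving page_like unset corresponds to approach 2; note that once the name filter is applied, rows without a page can no longer appear.

```python
import sqlite3

REPORT_SQL = """
SELECT tp.user_name, ta.name, tua.event_time, COALESCE(tpg.name, '--')
  FROM trk_user_action tua
  JOIN trk_person tp ON tp.id = tua.trk_person_id
  JOIN trk_action ta ON ta.id = tua.trk_action_id
  JOIN learning_context lc ON lc.id = tua.trk_learning_context_id
  LEFT JOIN trk_page tpg ON tpg.id = tua.trk_page_id
 WHERE lc.id = ?
   AND (? IS NULL OR tpg.name LIKE ?)
 ORDER BY tp.user_name, tua.event_time
"""


def file_report(conn, learning_context_id, page_like=None):
    """Return report rows; page_like=None lists every tracked action."""
    params = (learning_context_id, page_like, page_like)
    return conn.execute(REPORT_SQL, params).fetchall()


def demo_connection():
    """Build a throwaway database with made-up rows for trying the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE trk_person (id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE trk_action (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE trk_page (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE learning_context (id INTEGER PRIMARY KEY);
        CREATE TABLE trk_user_action (
            trk_person_id INTEGER, trk_action_id INTEGER,
            trk_page_id INTEGER, trk_learning_context_id INTEGER,
            event_time TEXT);
        INSERT INTO trk_person VALUES (1, 'student1'), (2, 'student2');
        INSERT INTO trk_action VALUES (1, 'example-action');
        INSERT INTO trk_page VALUES (1, 'filename.doc'), (2, 'Syllabus');
        INSERT INTO learning_context VALUES (1234567890), (42);
        INSERT INTO trk_user_action VALUES
            (1, 1, 1, 1234567890, '2009-10-07 09:00:00'),
            (2, 1, NULL, 1234567890, '2009-10-07 09:05:00'),
            (2, 1, 1, 42, '2009-10-07 09:10:00'),
            (1, 1, 2, 1234567890, '2009-10-07 09:15:00');
    """)
    return conn
```

Calling file_report(demo_connection(), 1234567890) lists every action in that one section, and adding page_like='%filename.doc%' narrows it to the file in question.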

State of the LMS

Watched an informative WebEx about The State of the LMS: An Institution Perspective presented jointly by Delta Initiative and California State University. A true innovator in this market could become the leader.

Market share numbers annoy me. These are always self-reported numbers from a survey. The sample sizes are almost always not very impressive and, when broken down, don’t really represent the market. DI didn’t post a link to where they got the numbers, just the name of the group. Some digging turned up this Background Information About LMS Deployment from the 2008 Campus Computing Survey. For background information it is woefully lacking in important details such as sample size, especially the breakdown of the types of institutions in the categories.

The numbers DI quotes from CC are very different from what the Instructional Technology Council reports for the same year: Blackboard market share 66% (DI/CC) vs 77% (ITC). An 11-point difference is huge when the next largest competitor is at 10% (DI/CC).

Other missing critical information: Are these longitudinal numbers, aka do the same respondents participate in every year the survey quotes? Or is there a high turnover rate, meaning an almost completely different set of people answers every year, so the survey relies completely on the randomness of who is willing to answer? Could the numbers shift just because people refuse to answer, giving Blackboard reduced market share only because Moodle customers are more willing to respond to questions about it?

Most of the major LMS products on the market started at a university or as part of a consortium involving universities. I knew the background of most of the products in Figure 1. Somehow I never put that together.

Will another university take the lead and through innovation cause the next big shakeup? I would have thought the next logical step to address in the DI presentation would be the innovative things universities are doing which could have an impact. Phil described Personal Learning Environments (not named) as potentially impacting the LMS market, but he was careful to say PLEs really are an unknown. There were no statements about brand new LMSs recently entering or about to enter the market.

Figure 1: Start year and origin of LMSes. Line thickness indicates market share based on Campus Computing numbers. From the DI WebEx.


When people use my project as an example, it gets my attention. GeorgiaVIEW was slightly incorrectly described on page 26 Trends: Changing definition of “centralization”.

  1. We do not have an instance per institution, which would have a significantly higher licensing cost. We do give each institution its own URL to provide consistency for its users. Changing bookmarks, web pages, portals, etc. everywhere a URL is listed is a nightmare, so we try to minimize the impact when we move them by keeping a single unchanging URL. We have 10 instances for the 31 institutions (plus 8 intercampus programs like Georgia ONmyLINE) we host. Learn 9 will not have the Vista multiple institution capability, so should we migrate to Learn 9, an instance per institution would have to happen.
  2. We have two primary data centers, not a primary and a backup data center. By having multiple sites, we keep our eggs in multiple baskets.

The primary point about splitting into multiple instances was correct. We performed the two splits because Vista 2 and 3 exhibited performance issues based on both the amount of usage and data. With ten instances we recently hit 4,500 active users (active in the past 5 minutes) but should be capable of 50,000 based on the sizing documents. We also crossed 50 million hits and 30 million page views. We also grow by over a terabyte a term now. All these numbers are still accelerating (growing faster every year). I keep hoping to find we hit a plateau.

Figure 2: LMS consortia around the United States. From the DI WebEx.


All this growth in my mind means people in general find us useful. I would expect us to have fewer active users and less data growth should everyone hate us. Of course, the kids on Twitter think GeorgiaVIEW hates them. (Only when you cause a meltdown.)

UPDATE: Corrected the active users number. We have two measures: active and total. 20,000 is the total, or all sessions. 4,500 are active in the past 5 minutes. Thanks to Mark for reading and finding the error!

Computer Metaphors

An effective way to explain something is to use a metaphor. This can be especially effective when picking a metaphorical object or behavior with which the audience is already familiar.

The one I see most often is comparing computers to a car. This morning I saw this on an email list describing a person’s experience  migrating to Vista 8 from Vista 3.

It is like I have traded in a familiar (though frustrating) car for one that has the lights, wipers, and radio in new locations.

Also this morning, Vista 8 was compared to a malfunctioning pen forced on faculty who would rather use a better pen. Nevermind all pens are not used exactly the same. (Fountain vs rollerball) Some require more maintenance and care than others.

A coworker always says Free Open Source Software like Sakai or Moodle are free as in free puppies not free beer. Nevermind proprietary bought systems like Blackboard are bought as in bought puppies.
🙂

Weblogic Diagnostics

I noticed one of the nodes in a development cluster was down. So I started it again. The second start failed, so I ended up looking at logs to figure out why. The error in the WebCTServer.000000000.log said:

weblogic.diagnostics.lifecycle.DiagnosticComponentLifecycleException: weblogic.store.PersistentStoreException: java.io.IOException: [Store:280036]Missing the file store file "WLS_DIAGNOSTICS000001.DAT" in the directory "$VISTAHOME/./servers/$NODENAME/data/store/diagnostics"

So I looked to see if the file was there. It wasn’t.

I tried touching a file at the right location and starting it. Another failed start with a new error:

There was an error while reading from the log file.

So I tried copying WLS_DIAGNOSTICS000002.DAT to WLS_DIAGNOSTICS000001.DAT and starting again. This got me a successful startup. Examination of the WLS files revealed the 0 and 1 files have updated time stamps while the 2 file hasn’t changed since the first occurrence of the error.

That suggests to me Weblogic is unaware of the 2 file and only aware of the 0 and 1 files. Weird.

At least I tricked the software into running again.

Some interesting discussion about these files.

  1. Apparently I could have just renamed the files. CONFIRMED
  2. The files capture JDBC diagnostic data. Maybe I need to look at the JDBC pool settings. DONE (See comment below)
  3. Apparently these files grow and add a new file when it reaches 2GB. Sounds to me like we should purge these files like we do logs. CONFIRMED
  4. There was a bug in a similar version causing these to be on by default.
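Before deciding what to purge, it helps to see what is actually in the store directory. Here is a read-only sketch that lists the diagnostic store files with their sizes and modification times; it deletes nothing. The function name is made up, and the directory to pass in is the data/store/diagnostics path from the error message above.

```python
import glob
import os


def diagnostic_store_files(store_dir):
    """Return (file name, size in bytes, mtime) for each WLS_DIAGNOSTICS file."""
    rows = []
    pattern = os.path.join(store_dir, "WLS_DIAGNOSTICS*.DAT")
    for path in sorted(glob.glob(pattern)):
        info = os.stat(path)
        rows.append((os.path.basename(path), info.st_size, info.st_mtime))
    return rows
```

Comparing the modification times across runs would show which files the server is still writing to, the same check I did by hand with the 0, 1, and 2 files.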

Guess that gives me some work for tomorrow.
🙁