Report Just Usernames

Occasionally I’ll want to see the usernames who use something like a user-agent property or were doing something during a range of time. Rather than report all the log lines and pick them out of the data, I use this which Blackboard (or maybe BEA added).

Note  we’ve added user-agents to the webserver.log. The double quote I use as my delimiter in the awk is from us adding the user-agent to the webserver logs.If you have not set up your logs to use this, then you’ll either need to do so or figure out which position is appropriate for you with a space delimiter. The colon in the second awk is where just after the username the log records the reads and writes to the database.

| awk -F\” ‘{print $3}’ | awk -F\: ‘{print $1}’ | sort | uniq

An example usage is a case was escalated to me where a student had trouble taking an assessment. That student was, of course, using Internet Explorer 7, a web browser which prior CE/Vista 8.0.4 was supported. Now it is not. (Could be likely this is reason Blackboard stopped supporting in.) So I was curious how many users are still trying to use this browser.

Useful User Agents

Rather than depend on end users to accurately report the browser used, I look for the user-agent in the web server logs. (Yes, I know it can be spoofed. Power users would be trying different things to resolve their own issues not coming to us.)

Followers of this blog may recall I changed the Weblogic config.xml to record user agents to the webserver.log.

One trick I use is the double quotes in awk to identify just the user agent. This information is then sorting by name to count (uniq -c) how many of each is present. Finally, I sort again by number with the largest at the top to see which are the most common.

grep <term> webserver.log | awk -F\” ‘{print $2}’ | sort | uniq -c | sort -n -r

This is what I will use looking for a specific user. If I am looking at a wider range, such as the user age for hits on a page, then I probably will use the head command to look at the top 20.

A “feature” of this is getting the build (Firefox 3.011) rather than just the version (Firefox 3). For getting the version, I tend to use something more like this to count the found version out of the log.

grep <term> webserver.log | awk -F\” ‘{print $2}’ | grep -c ‘<version>’

I have yet to see many CE/Vista URIs with the names of web browsers. So these are the most common versions one would likely find (what to grep – name – notes):

  1. MSIE # – Microsoft Internet Explorer – I’ve seen 5 through 8 in the last few months.
  2. Firefox # – Mozilla Firefox – I’ve seen 2 through 3.5. There is enough difference between 3 and 3.5 (also 2 and 2.5) I would count them separately.
  3. Safari – Apple/WebKit – In searching for this one, I would add to the search a ‘grep -v Chrome’ or to eliminate Google Chrome user agents.
  4. Chrome # – Google Chrome – Only versions 1 and 2.

Naturally there many, many others. It surprised me to see iPhone and Android on the list.

Diff and Spaces

Amy’s dump of the CE/Vista settings table ended up with a slightly different format than mine. I was able to use sed to rejoin the correct lines. This resulted in two files with spaces on about half of the lines in the file. Ouch.

Thankfully diff has a -b flag to ignore the spaces. Really useful in this case.

Now… To figure out why a pattern I copy and paste from a file is not found with grep.
🙂

Odd Tracking File Recording

Every time a Vista 3 node is shut down without going through the initiated shut down process, there is a chance of incorrect data written to the tracking files (in NodeA/tracking/). Normally it leaves strange characters or partial lines at the end of the file. This is the first time I have seen it write the contents of another log instead of the tracking data.

click – 1.0 – 1244228052889 – 1135588340001 – “nova.view.usg.edu-1244227762853-6288” – SSTU – discussion – “compiled-message-viewed” – “page name” – 558711383 –

click – 1.0 – 1244228052891 – 15.0; .NET CLR 1.1.4322)”

2009-04-23      20:58:35        0.0030  xxx.xxx.xxx.xxx    JxH1zg4fZT1LTGcpmyNW    200     GET     /webct/libraryjs.dowebct        locale=en_US    0       “Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)”

Even better. The node went down on June 5th at around 3pm. The lines from the other log were from April 23rd at 8:58pm.

Why am I surprised to see new incorrect behavior? Especially when the node was really confused?

Preserving CE/Vista Settings

I’ve been asked for notes about this a few times. So here’s a blog post instead.
🙂

A coworker is working on scripting our updates. We lost the Luminis Message Adapter settings in applying the patch to the environment we provide to our clients. Fortunately, those settings are maintained by us not our clients. So I pushed those settings back very easily. Unfortunately, it points to the need to capture the settings for the potential purpose of restoring the settings.

In Oracle databases, this is pretty easy. As the schema user, run the following. It does some intentional things. First, we have multiple institutions, so the breaks make identifying which institution easier. Second, the same label for multiple forms gets confusing, so I am sorting by setting description id under the theory these ids are generated at the time the page is created, so the same tools will float together. (The last modified time stamp is probably unnecessary, I used it in an earlier version and left it just in case Vista for whatever reason added a new setting for the same label instead of modifying the existing one.) This can be spooled both before and after the upgrade. Use diff or WinMerge to compare the versions. Anything lost from the before version should be evaluated for inclusion adding back to the settings.

col lc_name format a50
col setting_value format a80
col label format a80
col lock format 999
col child format 999

clear breaks computes
break on lc_name skip 1

select learning_context.name lc_name, settings_description.label, settings.setting_value,
settings.locked_flag “lock”, settings_description.inheritable_flag “child”
from learning_context, settings, settings_description
where settings.settings_desc_id = settings_description.id
and settings.learning_context_id = learning_context.id
and learning_context.type_code in (‘Server’,’Domain’, ‘Institution’,’Campus’,’Group’)
order by learning_context.name, settings.settings_desc_id
/

An example of the multiple forms issue is external authentication. CE/Vista provides an LDAP (A) and an LDAP (B). The settings_description.label for both is contextmgt.settings.ldap.source. The settings_description.name for both is source. It looks like each of the two identical labels has a different settings.settings_desc_id value depending on whether it is A or B. To me it seems lame to use the same label for two different ids.

The most vulnerable parts of the application to lose settings during an update are the System Integration settings. A mismatched Jar on a node will wipe all the settings associated with that Jar.

However, I can see using this to capture the settings as a backup just in case an administrator or instructor wipes out settings by mistake. Yes, this is scope creep. Create a backup of the settings table to actually preserve the settings.

create table settings_backup_pre_sp2hf1 tablespace WEBCT_DATA as select * from settings;

Contexts: As a server admin, I maintain certain settings and push those down. Each client has control over some other settings and may push those down from the institution context. Maybe some are creating division and group admins? Maybe some instructors are changing things at the course or section levels. I may end up capturing everything?

Restoration: The whole purpose of preserving the settings is to restore them later. There are a couple methods in theory:

  1. Providing the settings to a human to re-enter. The labelling issue makes me question the sanity of trying to explain this to someone.
  2. Update the database directly would just need settings.id ensure it is the right location. Maybe dump out the settings in the format of an update command with labels on each to explain the context? Ugh.

If settings were not so easily lost, then this would be so much easier.

View: Another table of interest is the settings_v view. (Redundant?) The only reason I don’t like this view is it reports the values for every learning context which makes reporting off it much, much longer. For example, the encryption key for a powerlink is listed 8 places in settings/settings_description and 18,769 places in settings_v.

IMS Import Error When Node Is Down

This is what I got when a node was down while I attempted to do an IMS import in Blackboard CE/Vista.

Failed to upload files, exiting.
Cause could include invalid permission on file/directory,
invalid file/directory or
repository related problems

The keywords permission, file, and directory in this would have sent me anywhere but to the right place. The keyword repository made me suspicious the node had a worse issue than just bad permissions. So I looked for the most recent WebCTServer log and found it to be a week old. Verifying the last messages in the log confirmed it had been down for a week.
🙁

To see anything in the log questioning whether or not the node was running would have saved me lots of time this morning.

Added to my .bashrc a couple lines to provide a visual indicator how many are running.

JAVA_RUNNING=`ps -ef | grep [j]ava | grep -c [v]ista`
echo ”  — No. Vista processess running = $JAVA_RUNNING”

Better might even be to have it evaluate whether less than one or more than two (or three) are running. If so, then put something obvious the world is falling. Maybe later. Took me just a couple minutes to write and test what I have. The rest will come after I decide what I really want. 🙂

Also, it wasn’t running because a coworker had run into a situation where the fifth node would not start. She thought maybe it was because the number of connection Oracle would accept was not high enough. I suggested a simple test would be to shut down a node and see if the problem one suddenly works. I happened to be working with the one she shut down for the test. It happens she had just started a script to bring them up when I asked.