xmllint

This Linux tool is my new best friend. We get thousands of XML files from our clients for loading user, class, and enrollment information. Some of these clients customize our software or write their own software for generating the XML.

This means we frequently get oddities in the files which cause problems. Thankfully I am not the person who has to verify these files are good. I just get to answer the questions that person has about why a particular file failed to load.

The CE/Vista import process will stop if its validator finds invalid XML. Unfortunately, the error “An exception occurred while obtaining error messages.  See webct.log” doesn’t sound like invalid XML.

Usage is pretty simple:

xmllint --valid /path/to/file.xml | head

  1. If the file is valid, then the whole file is in the output.
  2. If there are warnings, then they precede the whole file.
  3. If there are errors, then only the errors are displayed.

I use head here because our files can be up to 15MB, so this prevents the whole file from going on the screen for the first two situations.

I discovered this while researching how to handle the first error below. It came up again today. So this has been useful for catching errors in client-supplied files that failed to load.

1: parser error : XML declaration allowed only at the start of the document
 <?xml version="1.0" encoding="UTF-8"?>

162: parser error : EntityRef: expecting ';'
<long>College of Engineering &amp&#059; CIS</long>

The number before the colon is the line number. The caret it uses to indicate where on the line an error occurred isn’t accurate, so I ignore it.

My hope is to get this integrated into our processes to validate these files before they are loaded and save ourselves headaches the next morning.
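
A minimal sketch of what that pre-load check could look like, assuming the incoming files all sit in one directory. The function name check_xml_dir and its directory argument are made up for illustration and are not part of our actual process. It uses xmllint's --noout option so the file itself is never printed and only errors appear. That checks well-formedness, which covers both errors shown above. If the files carry a DTD, --valid could be added.

```shell
#!/bin/sh
# Sketch only: check_xml_dir is a hypothetical helper, not an existing tool.
# Prints OK or FAIL for each *.xml file in the given directory and
# returns 1 if any file failed, 0 otherwise.
check_xml_dir() {
    dir=$1
    failed=0
    for f in "$dir"/*.xml; do
        # If nothing matched, the glob is left as a literal string; skip it.
        [ -e "$f" ] || continue
        # --noout suppresses the copy of the document on stdout;
        # xmllint writes any parser errors to stderr.
        if xmllint --noout "$f"; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            failed=1
        fi
    done
    return "$failed"
}
```

Something like check_xml_dir /path/to/incoming, run before the load, would list the files worth a closer look.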

TED Talk: Taryn Simon

My favorite quote from Taryn is, “Photography threatens fantasy.” Disney uses intricate interior design, photography, and video to construct fantasy. Advertisements, magazines, weddings, and portraits are about showing others the ideal instead of the reality. Have you seen the Dove Evolution video? (This one has music and singing by a Baha’i musician Devon Gundry.) What about the Ralph Lauren photo?

Reality bites. Hard.

(See Taryn Simon photographs secret sites on the TED site)

TED About this talk: Taryn Simon exhibits her startling take on photography — to reveal worlds and people we would never see otherwise. She shares two projects: one documents otherworldly locations typically kept secret from the public, the other involves haunting portraits of men convicted for crimes they did not commit.

Also: Taryn on Charlie Rose, Discomfort Zone (Telegraph)

Useful User Agents

Rather than depend on end users to accurately report the browser used, I look for the user-agent in the web server logs. (Yes, I know it can be spoofed. Power users would be trying different things to resolve their own issues, not coming to us.)

Followers of this blog may recall I changed the Weblogic config.xml to record user agents to the webserver.log.

One trick I use is splitting on the double quotes in awk to pick out just the user agent. The results are then sorted by name so uniq -c can count how many of each are present. Finally, I sort again by number with the largest at the top to see which are the most common.

grep <term> webserver.log | awk -F\" '{print $2}' | sort | uniq -c | sort -n -r

This is what I use when looking for a specific user. If I am looking at a wider range, such as the user agents for hits on a page, then I probably will use the head command to look at the top 20.
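
Wrapped up as a function, the pipeline might look like the sketch below. The name top_agents and its arguments are hypothetical. It assumes, as the pipeline above does, that the user agent is the first double-quoted field on each log line.

```shell
#!/bin/sh
# Sketch only: top_agents is a hypothetical helper.
# Usage: top_agents <term> <logfile> [count]
# Prints the most common user agents on lines matching <term>,
# most frequent first, limited to [count] lines (default 20).
top_agents() {
    term=$1
    log=$2
    count=${3:-20}
    grep -- "$term" "$log" |
        awk -F'"' '{print $2}' |   # second field when splitting on double quotes
        sort |                     # group identical user agents together
        uniq -c |                  # prefix each distinct agent with its count
        sort -n -r |               # largest count at the top
        head -n "$count"
}
```

For example, top_agents <term> webserver.log 20 gives the top 20 for whatever term identifies the page or user.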

A “feature” of this is getting the build (Firefox 3.0.11) rather than just the version (Firefox 3). To get the version, I tend to use something more like this to count how often a version appears in the log.

grep <term> webserver.log | awk -F\" '{print $2}' | grep -c '<version>'

I have yet to see many CE/Vista URIs with the names of web browsers. So these are the most common versions one would likely find (what to grep – name – notes):

  1. MSIE # – Microsoft Internet Explorer – I’ve seen 5 through 8 in the last few months.
  2. Firefox # – Mozilla Firefox – I’ve seen 2 through 3.5. There is enough difference between 3 and 3.5 (also 2 and 2.5) I would count them separately.
  3. Safari – Apple/WebKit – In searching for this one, I would add a ‘grep -v Chrome’ to the search to eliminate Google Chrome user agents.
  4. Chrome # – Google Chrome – Only versions 1 and 2.

Naturally there are many, many others. It surprised me to see iPhone and Android on the list.
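
Here is a rough sketch of counting the four families from the list in one pass. The name count_browsers is hypothetical, and it matches on the browser name only, not a version number. It again assumes the user agent is the first double-quoted field on each line. Safari is counted with Chrome filtered out, since Chrome user agents also contain the word Safari.

```shell
#!/bin/sh
# Sketch only: count_browsers is a hypothetical helper.
# Usage: count_browsers <logfile>
# Prints one "name count" line per browser family.
count_browsers() {
    log=$1
    # Pull out just the user agent strings (first double-quoted field).
    agents=$(awk -F'"' '{print $2}' "$log")
    for name in MSIE Firefox Chrome; do
        printf '%s %s\n' "$name" "$(printf '%s\n' "$agents" | grep -c "$name")"
    done
    # Chrome user agents also mention Safari, so drop them before counting.
    printf 'Safari %s\n' "$(printf '%s\n' "$agents" | grep Safari | grep -v -c Chrome)"
}
```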

Weblogic Diagnostics

I noticed one of the nodes in a development cluster was down. So I started it again. The second start failed, so I ended up looking at logs to figure out why. The error in the WebCTServer.000000000.log said:

weblogic.diagnostics.lifecycle.DiagnosticComponentLifecycleException: weblogic.store.PersistentStoreException: java.io.IOException: [Store:280036]Missing the file store file "WLS_DIAGNOSTICS000001.DAT" in the directory "$VISTAHOME/./servers/$NODENAME/data/store/diagnostics"

So I looked to see if the file was there. It wasn’t.

I tried touching a file at the right location and starting it. Another failed start with a new error:

There was an error while reading from the log file.

So I tried copying WLS_DIAGNOSTICS000002.DAT to WLS_DIAGNOSTICS000001.DAT and starting again. This got me a successful startup. Examination of the WLS files revealed that the 0 and 1 files have updated time stamps while the 2 file hasn’t changed since the first occurrence of the error.

That suggests to me Weblogic is unaware of the 2 file and only aware of the 0 and 1 files. Weird.

At least I tricked the software into running again.

Some interesting discussion about these files.

  1. Apparently I could have just renamed the files. CONFIRMED
  2. The files capture JDBC diagnostic data. Maybe I need to look at the JDBC pool settings. DONE (See comment below)
  3. Apparently these files grow and add a new file when it reaches 2GB. Sounds to me like we should purge these files like we do logs. CONFIRMED
  4. There was a bug in a similar version causing these to be on by default.

Guess that gives me some work for tomorrow.
🙁

TED Talk: Dangers of Serotonin

Jim Fallon has associated damage to the temporal lobe with psychopathic killers. The combination of epigenetic effects, brain damage, and environment appears to involve an MAOA variant on the X chromosome together with experiencing violence around 3 years old.

Males only get the X from their mother, so men are much more likely to be affected. Girls get one X from their mother and one from their father, which dilutes the effect. Bathing the brain in serotonin too early makes it insensitive to serotonin’s calming effect later.

Interesting.

TED Jim Fallon: Exploring the mind of a killer

BBworld From Afar

Staying true to tradition, Blackboard found a great speaker, Seth Godin, with a positive message. Notes people took…

Scott found the best point, I think.

Compliance doesn’t work to create value. Compliant work will always go to the lowest bidder. We can always find someone cheaper to follow the manual. Value is created by doing something different.

See! This is a mind numbingly positive message.

I liked that some people on Twitter pointed to Jeff Longland’s role with VistaSWAT as a leader in the vacuum Blackboard has left open in the community.

Created a Yahoo Pipe for Bbworld09.

UPDATED 2009-07-15:

This TED video has much of the same substance as Godin’s Bbworld keynote.

I’m blogging this.

For about eight months I have participated in a group called the Brunch Bunch here in Athens. We get together to eat and talk. Many conversations drift into the nerdy (my forté?). The locations vary so I have gotten to try new (to me) restaurants. Elizabeth (pictured right) vouched that I am a great guy. Well, these are great people.
🙂

Elizabeth also brought a friend of hers from out of town, Claudia. Claudia smartly has a newer version of my Canon Rebel. I have the XT. She has the XSi (two models newer). The newest is the T1i.

Downtown Athens is a great place to shoot photos. So, we walked around for an hour or so looking in stores to get out of the heat. This is the hat Elizabeth bought from Helix who also had some cool stone candle holders. Native American Gallery had some interesting petroglyph jewelry and gray flower pottery. I’ve got some ideas for gifts to give for upcoming birthdays, holidays, etc.

One of the employees at Helix and Claudia both asked if I had a blog. I’m sure it was because of my shirt! I only admitted to this one and blogging about Blackboard. Though, I guess I have diversified somewhat here. I probably should blog more about local stuff as well. That would mean getting out more as well.

For years, I have been collecting teeshirts from thinkgeek.com. At present the collection consists of:

Some others are on my wishlist. I do have some shirts from other places. By far the most popular is the xkcd sudo comic. I’ve added a few others from xkcd to my wishlist as well.

TED Talk: Clay Shirky: How cellphones, Twitter, Facebook can make history

The tumult in Iran is huge news of late. As a Baha’i, I have seen news of the persecution of Baha’is in Iran step up because of the Internet. Stories crossed the ocean through email. News agencies almost never picked up these stories. As fast as the Iran government could shut down CNN and NYT and BBC reporters, the same government cannot seem to stop dozens who don’t have press credentials or passports to revoke from sharing the message. So the idea of several thousand sharing a similar message and evading the same government doesn’t seem all that surprising to me.

[The Iran unrest] is the first revolution that has been catapulted onto a global stage and transformed by social media. This is it. The big one.

Calling this unrest a revolution seems premature. Still, all this information making it overseas is interesting to watch.

June 15th – Nature Photography Day

For Nature Photography Day 2008, I made an NPD Flickr group and invited a bunch of people. The only rules were to post pictures that were a) taken on June 15th (thank you Flickr / EXIF) and b) about nature or the destruction of nature. Unfortunately, I didn’t pay attention to the group as I should have. So a bunch of nature picture spammers (they post the same picture to dozens of groups) posted hundreds of rule-violating photos to the group pool. A month later I closed posting to the group because the spammers wouldn’t likely stop of their own accord.

Anyway, I forgot about NPD until the day of. No one posted to the group of their own accord. Who remembers after a year? I cleaned out the photos not following the rules. Set calendar reminders a couple weeks in advance to publicize the group. Hoping NPD 2010 will go better.

I’m also considering bending the rules. Maybe close to June 15th is close enough. Something like anywhere in the range June 10th to 20th is close enough? What do you think?

Anyway, here are the pictures from the group:

VistaSWAT

Do you run one of these versions of the former WebCT products?

  • CE4.x
  • CE6.x
  • CE8.x
  • Vista 3.x
  • Vista 4.x
  • Vista 8.x

If so, then you should join us for the next Vista SWAT web conference call Thursday, May 14th (and every other Thursday). We help each other solve issues and better understand how to use / run the product.

To be added to the Vista SWAT e-mail list, please e-mail jeff.longland who uses the uwo.ca domain. He graciously sends out the reminders.

I’m sure the Blackboard acquisition of ANGEL will get discussed.
🙂