Dec
11
BbWorld 2008 Call for Proposals
Filed Under Blackboard Vista, Conferences | 1 Comment
Last year, we three DBAs submitted three proposals thinking one might be accepted. All three were. It’s daunting to think of something because we are behind the times. We run Vista 3.0.7 while almost everyone else is on 4.1.x or higher. Also, we ended up changing our presentations last year because we were not doing the things we thought we would be doing. Ugh.
Presenting at BbWorld or Blackboard Developers Conference is a great professional development opportunity and fabulous way to share your knowledge with your peers. BbWorld® ‘08 Deadline for Proposal Submission: February 22, 2008
Maybe we could do one on:
- Staying Beneath the Threshold of Doom: 6-8 vs. 40 clusters?
- Planning the Largest Vista 3 to 4 Migration
- API Logging: User Connections to Vista Not in Your Logs
- Creating an Audit of User Activity
Dec
2
Coradiant TrueSight
Filed Under Conferences, Work | 3 Comments
Several of us saw a demo of Coradiant TrueSight yesterday (first mentioned in the BbWorld Monitoring post). I spent most of the demo trying to recall the name Jeff Goldblum, as one of the team giving the demo had the voice and mannerisms of the actor’s characters. Had he mentioned a butterfly, then I definitely would have clapped. The other reminded me of John Hodgman.
Something I had not noticed at the time, but a recurring point of having TrueSight is to tell our users, “Here is evidence the problem is on your end and not ours.” This assumes the users are rational or will even believe the evidence. Their first preference is that the problem never occurred; a resolution comes second. Preventing every problem, especially issues outside our domain, probably is outside the scope of the budget we receive. So, we are left with resolving the issues. Especially scary are the users who take evidence the problem is on their end or their ISP’s end to mean, “This is all your fault.”
Resolutions we can offer are:
- Hardware change - We can replace or alter the configuration of the hardware components of the network, storage, database, or application.
- Software change - We can alter the configuration of the software components of the network, storage, database, or application.
- Request a code change from a vendor - We can work with our vendors to get a code change. These take forever to implement.
- Suggest a user resolve the issue:
  - We can provide a workaround (grudgingly accepted; remember, the preference is that the problem never occurred).
  - We can suggest configuration changes the user can make to resolve the problem.
TrueSight provides us information to help us try to resolve issues. Describing the information provided as “facts” was a nice touch. At Valdosta State, I gave up on users reporting their browsers accurately and captured the information from the User-Agent header. Similarly, at the USG, I’ve found that what users report as their browser version disagrees with the User-Agent string ~30% of the time. Heck, they have errors in the name of the class ~40% of the time. My favorite is the report that something took 15 minutes when all I could find was that it took four minutes. Ugh. Because TrueSight is capturing the header info, it ought to be much easier to confirm what users were doing and where problems occurred, more accurately than the users can describe.
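Here is a minimal sketch of checking a user's self-reported browser against the User-Agent header. It is not the code used at Valdosta State or the USG; the function names and the short pattern list are assumptions for illustration, and the patterns only cover a few browsers of the era.

```python
import re

# Ordered patterns: the first match wins. Illustrative only, not exhaustive.
# Opera is checked first because it can also advertise itself as MSIE.
_BROWSER_PATTERNS = [
    ("Opera", re.compile(r"Opera[/ ](\d+(?:\.\d+)*)")),
    ("Internet Explorer", re.compile(r"MSIE (\d+(?:\.\d+)*)")),
    ("Firefox", re.compile(r"Firefox/(\d+(?:\.\d+)*)")),
    ("Safari", re.compile(r"Version/(\d+(?:\.\d+)*).*Safari/")),
]


def browser_from_user_agent(user_agent):
    """Return (browser name, version) guessed from a User-Agent string."""
    for name, pattern in _BROWSER_PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return name, match.group(1)
    return "Unknown", ""


def report_matches(reported_name, reported_version, user_agent):
    """True when the user's report agrees with the header on name and major version."""
    name, version = browser_from_user_agent(user_agent)
    same_name = name.lower() == reported_name.strip().lower()
    same_major = version.split(".")[0] == reported_version.strip().split(".")[0]
    return same_name and same_major
```

Comparing only the major version is a deliberate choice in this sketch: users rarely know the minor version, so a stricter comparison would inflate the disagreement rate.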
After receiving all the “facts”, we still have to determine the cause. TrueSight helps us understand the scope of the problem by showing how many users, how many web servers, and how many pages are affected by slowness, and to what degree. As a DBA and administrator, my job identifying cause ought to be easier, though quantifying how much easier probably is difficult.
Part of why: (Mostly speculation.) Problems identified as a spike in anything other than “Host” are external causes. These are causes in front of the device. Causes behind the device are “Host”. If these were more narrowly broken down, then maybe we could better determine cause. That would require knowledge web browsers typically would not have, like the server processing time, query processing time, or even the health of the servers.
tag: Blackboard Inc, Coradiant, web browser, user agent, support
Oct
26
Rock Eagle Wrap-Up
Filed Under Conferences, Work | Leave a Comment
Index of posts:
- RE 2007: GeorgiaVIEW Meeting (Pre-Conference)
- RE 2007: Birds of a Feather: GeorgiaVIEW Vista
- RE 2007: Top Ten Disruptive Trends
- RE 2007: Birds of a Feather: Luminis
- RE 2007: Administering Sakai
- RE 2007: GeorgiaVIEW Vista File and Content Sharing
- RE 2007: USG Digital Content Repositories: Resources to Share
After this point, I got wrapped up in other things, moderating, fireworks, a Texas Hold ‘Em tournament, and dealing with tickets. The above are all sessions which affect my area even tangentially. Hope you enjoy.
Oct
24
RE 2007: GeorgiaVIEW Meeting (Pre-Conference)
Filed Under Conferences, Work | 2 Comments
I am blogging from the pre-conference GeorgiaVIEW meeting @ Rock Eagle yesterday afternoon and this morning. I enjoy connecting with people around the state of Georgia who use our Vista system. Most of them do not make it to BbWorld. Some hot topics:
- Alternatives to Blackboard Vista
- Training
- Content repository
- Returning Reports and Tracking to instructors.
- Some reports still failing. One approach may be to remove tracking data from Vista database and make it available elsewhere.
- Upgrade to Vista 4. People want a timeline, access to a training instance ASAP, and please, no in-place upgrade.
- Limited shelf life on internals of Vista 3 / 4.0 - 4.1.2
- More customers have moved or are moving to Vista 4 / CE 6 than a year ago.
- Can take advantage of new tools available in Vista 4.
- Data retention - policy, responsibilities (faculty, campus, OIIT)
- Phased approach - parallel environments; at some point Vista 3 goes away and is no longer available.
- End of Fall 2008 or Spring 2009.
- People are both quite happy we are going to Vista 4 and disconcerted at the prospect of having to move to Vista 4, even though it is over a year from now (at worst, by April 2009).
- Export / import of non-SIS created users.
- Training
Lovely (yeah a real person and she is) says Lovely Freelove would be one of the best names ever.
Oct
22
What the Heck Is a Hot Fix?
Filed Under Blackboard Vista, Conferences | Leave a Comment
At the BbWorld Developers’ Conference (Thursday afternoon and Friday morning after BbWorld), there was a session by John Fontaine called What the Heck is a Hotfix? (PPT, audio recording). I’d been meaning to go look for this at the Bb Connections web site where the conference presentations were uploaded. However, I found this through a Bb knowledge base link to eduGarage, which apparently is the new home of the Blackboard Developers Network.
- Ad Hoc Patch - fixes a single issue
- Hot Fix - Multiple, usually related, code changes (5-6 issues)
- Service Pack - Many code changes (50-60 issues)
- New Release (either Application Pack or new version number) - New features and large-scale code changes
Sep
22
BbWorld Presentation Redux Part II - Monitoring
Filed Under Blackboard Vista, Conferences, Work | 1 Comment
Much of what I might write in these posts about Vista is knowledge accumulated from the efforts of my coworkers.
This is part two in a series of blog posts on our presentation at BbWorld ‘07, on behalf of the GeorgiaVIEW project, Maintaining Large Vista Installations (2MB PPT).
Part one covered automation of Blackboard Vista 3 tasks. Next, let’s look at monitoring.
Several scripts we have written are in place to collect data. One of the special scripts connects to WebLogic on each node to capture data from several MBeans. Other scripts watch for problems with hardware, the operating system, the database, and even login to Vista. Each server (node or database) has, I think, 30-40 monitors. A portion of the items we monitor is in the presentation. Every level of our clusters is watched for issues. The data from these scripts are collected into two applications.
- Nagios sends us alerts when values from the monitoring scripts fall outside our expectations on specific criteria. Green means good; yellow means warning; red means bad. Thankfully none in our group are colorblind. Nagios can also send email and pages for alerts. Finding the sweet spot where we get alerted for a problem but avoid false positives is perhaps the most difficult part.
- An AJAX application created by two excellent members of our Systems group, called Stats internally, graphs the same monitored data. Nagios tells us a node failed a test. Stats tells us when the problem started, how long it lasted, and if others also displayed similar issues. We also can use Stats to watch trends. For example, we know there are two daily peaks: WIO usage rises to a noonish peak, sloughs off by ~20%, and peaks again in the evening, fairly consistently over weeks and months.
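The green / yellow / red idea can be sketched as a threshold check. This is not one of our monitoring scripts; the function names and threshold values are hypothetical. The only outside fact relied on is the Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN).

```python
# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value, warn, crit):
    """Map a monitored value to a status; higher values are assumed to be worse."""
    if value is None:
        return UNKNOWN  # the collection script returned no data
    if value >= crit:
        return CRITICAL  # red
    if value >= warn:
        return WARNING  # yellow
    return OK  # green


def plugin_line(name, value, warn, crit):
    """Return (exit code, one-line status text) in the style of a Nagios check."""
    status = classify(value, warn, crit)
    shown = "no data" if value is None else value
    return status, f"{name} {LABELS[status]} - value={shown}"
```

Finding the sweet spot mentioned above amounts to tuning `warn` and `crit` per monitor: set them too low and the pager never stops, too high and the first alert arrives after the users have already noticed.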
We also use AWStats to provide web server log summary data. Web server logs show activity of the users: where they go, how much, etc.
In summary, Nagios gives us a heads up there is a problem. Stats allows us to trend performance of nodes and databases. AWStats allows us to trend overall user activity.
Coradiant TrueSight was featured in the vendor area at BbWorld. This product looks promising for determining where users encounter issues. Blackboard is working with them, but I suspect it’s likely for Vista 4 and CE 6.
We have fantastic data. Unfortunately, interpreting the data proves more complex. Say the load on a server hosting a node starts climbing, hits the point where we get pages, and continues to climb. What does one do? Remove it from the cluster? Restart it? Restarting it will simply shift the work to another node in the cluster. Say the same happens with the database. Restarting the database will kick all the users out of Vista. Unfortunately, Blackboard does not provide a playbook on what to do with every support possibility. Also, if you ask three DBAs, then you will likely get three answers.
It’s important to strike a balance between underreaction and overreaction. When things go wrong, people want us to fix the problem. Vista is capable of handling many faults and not handling very similar faults. The linked example was a failed firewall upgrade. I took a similar tack with another firewall problem earlier this week. I ultimately had to restart the cluster that evening because it didn’t recover.
Part three will discuss the node types.
Sep
9
BbWorld Presentation Redux Part I - Automation
Filed Under Blackboard Vista, Conferences, Work | 1 Comment
Much of what I might write in these posts about Vista is knowledge accumulated from the efforts of my coworkers.
I’ve decided to do a series of blog posts on our presentation at BbWorld ‘07, on behalf of the GeorgiaVIEW project, Maintaining Large Vista Installations (2MB PPT). I wrote the bit about tracking files a while back in large part because of the blank looks we got when I mentioned in our presentation at BbWorld that these files exist. For many unanticipated reasons, these may not be made part of the tracking data in the database.
Automation in this context essentially is the scheduling of tasks to run without a human needing to intercede. Humans should spend time on analysis, not typing commands into a shell.
Rolling Restarts
This is our internal name for restarting a subset (consisting of nodes) of our clusters. The idea is to restart all managed nodes except the JMS node, usually one at a time. Such restarts are conducted for one of two reasons: 1) have the node pick up a setting or 2) have Java discard everything from memory. The latter is why we restart the nodes once weekly.
Like many, I was skeptical of the value of restarting the nodes in the cluster once weekly. Until, as part of the Daylight Saving Time patching, we provided our nodes to our Systems folks (hardware and operating systems) and forgot to re-enable the Rolling Restarts for one batch. Those nodes started complaining about issues into the second week. Putting the Rolling Restarts back in place eliminated the issues. So… Now I am a believer!
One of my coworkers created a script which 1) detects whether or not Vista is running on the node, 2) only if Vista is running does it shut down the node, 3) once down, it starts up the node, and 4) finally checks that it is running. It’s pretty basic.
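The four steps can be sketched roughly as follows. This is not my coworker's script: the function names are made up, and the `is_running`, `stop`, and `start` callables stand in for whatever commands actually check and control a node. The sketch also assumes a node that is not running is left alone rather than started.

```python
import time


def restart_node(node, is_running, stop, start,
                 wait_seconds=5, max_checks=60, sleep=time.sleep):
    """Restart one node following the four steps; return a status string."""
    # 1) Detect whether Vista is running on the node.
    if not is_running(node):
        return "skipped"  # 2) Only a running node gets shut down.
    stop(node)
    # Wait until the node is really down before starting it again.
    for _ in range(max_checks):
        if not is_running(node):
            break
        sleep(wait_seconds)
    else:
        return "stop-timeout"
    # 3) Once down, start the node.
    start(node)
    # 4) Finally, check that it is running.
    for _ in range(max_checks):
        if is_running(node):
            return "restarted"
        sleep(wait_seconds)
    return "start-timeout"


def rolling_restart(nodes, jms_node, **kwargs):
    """Restart every node except the JMS node, one at a time."""
    results = {}
    for node in nodes:
        if node == jms_node:
            continue  # the JMS node is never part of a rolling restart
        results[node] = restart_node(node, **kwargs)
    return results
```

Passing the check and control functions in as arguments keeps the sequencing logic testable without a live cluster. Working one node at a time means the rest of the cluster carries the load while each node cycles.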
Log cleanup to preserve space
We operate on a relatively small space budget. Accumulating logs ad infinitum strikes us as unnecessary. So, we keep a month’s worth of logs for certain ones. Others are rolled by Log4j to keep a certain number. Certain activities can mean only a day’s worth are kept, so we have on occasion increased the number kept for diagnostics. Log4j is so easy and painless.
We use Unix’s find with mtime to look for files older than 30 days with specific file names. We delete the ones which match the pattern.
UPDATE 2007-SEP-18: The axis files in /var/tmp will go on this list, but we will delete any more than a day old.
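For anyone who would rather not rely on find, the same cleanup can be sketched in Python. The function names and any file patterns passed in are hypothetical. Unlike find, this version does not descend into subdirectories, and it defaults to a dry run so nothing is deleted by accident.

```python
import fnmatch
import os
import time


def old_files(directory, pattern, max_age_days, now=None):
    """List regular files in `directory` matching `pattern` and older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    matches = []
    for entry in os.scandir(directory):
        if (entry.is_file(follow_symlinks=False)
                and fnmatch.fnmatch(entry.name, pattern)
                and entry.stat(follow_symlinks=False).st_mtime < cutoff):
            matches.append(entry.path)
    return sorted(matches)


def delete_old_files(directory, pattern, max_age_days, dry_run=True, now=None):
    """Delete matching old files unless dry_run is set; return the paths selected."""
    paths = old_files(directory, pattern, max_age_days, now)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

The 30-day logs and the one-day axis files in /var/tmp would be two calls with different `max_age_days` values and patterns.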
Error reporting application, tracking, vulnerabilities
Any problems we have encountered, we expect to encounter again at some point. We send ourselves reports to stay on top of potentially escalating issues. Specifically, we monitor for the unmarshalled exception from WebLogic and for tracking files that failed to upload, and we used to collect instances of a known vulnerability in Vista. Now that it’s been patched, we are not looking for it anymore.
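A report like this boils down to counting known signatures in the logs. The sketch below shows the idea only: the labels and regular expressions are hypothetical placeholders, since the exact log messages are not given here, and the real patterns would need to come from the actual WebLogic and Vista logs.

```python
import re
from collections import Counter

# Hypothetical signatures; the real log messages will differ.
WATCHED = {
    "unmarshal": re.compile(r"unmarshal", re.IGNORECASE),
    "tracking_upload": re.compile(r"tracking.*upload.*fail", re.IGNORECASE),
}


def count_issues(lines, watched=WATCHED):
    """Count how many log lines match each watched signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in watched.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def format_report(counts, watched=WATCHED):
    """One line per watched signature, including those with zero hits."""
    return "\n".join(f"{label}: {counts.get(label, 0)}" for label in sorted(watched))
```

Listing the zero counts too is intentional: a report that only appears when something is wrong makes it hard to tell "no problems" from "the report did not run".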
Thread dumps
Blackboard at some point will ask for thread dumps from the time the error occurred. Replicating a severe issue strikes us as bad for our users. We have the thread dumps running every 5 minutes and can collect them to provide to Blackboard on demand. No messing with the users for us.
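One common way to get a thread dump on Unix is to send SIGQUIT to the Java process, which makes the JVM print the dump to its standard output without stopping. The sketch below assumes that approach and a known process ID; it is not necessarily how our dumps are scheduled, and the function names are made up.

```python
import os
import signal
import time


def request_thread_dump(pid, send=os.kill):
    """Ask the JVM with this process ID to write a thread dump (Unix only)."""
    send(pid, signal.SIGQUIT)


def dump_periodically(pid, interval_seconds=300, count=None,
                      send=os.kill, sleep=time.sleep):
    """Request a dump every `interval_seconds`; run forever when count is None."""
    taken = 0
    while count is None or taken < count:
        request_thread_dump(pid, send)
        taken += 1
        if count is None or taken < count:
            sleep(interval_seconds)
    return taken
```

The default interval of 300 seconds matches the five-minute schedule. In practice a cron entry calling the single-dump step may be simpler than a long-running loop.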
Sync admin node with backup
We use rsync to keep a spare admin node in sync with the admin node for each production cluster. Should the admin node fail, we have a hot spare.
LDIS batch integration
Because we do not run a single cluster per school and the Luminis Data Integration Suite does not work with multiple schools for Vista 3 (rumor is Utah has it working for Vista 4), we have to import our Banner data in batches. The schools we host send the files, our expert reviews the files and puts them in place. A script finds the files and uploads each in turn. Our expert can sleep at night.
Very soon, we will automate the running of the table analysis.
Anyone have ideas on what we should automate?
Sep
3
Back from TN
Filed Under Conferences, Religion / Baha'i Faith | Leave a Comment
I am home from the Tennessee Baha’i School. I enjoyed the weekend.
Meeting new people is not something I’d normally place high on my list. However, I have yet to go to a Baha’i conference or weekend school where I did not come away feeling happy to have met all those I did. Naturally, since I am horrible with names, I don’t remember the names of 1/2 of them.
I can do better.
Jul
13
BbWorld ‘07
Filed Under Blackboard Vista, Conferences | Leave a Comment
In the last throes of the BbWorld ‘07 Developer’s Conference (the regular conference ended yesterday). Some pictures are in Flickr in my “BbWorld 2007” set. I’ll likely post the rest tonight. Our presentations should be posted soon on the conference site.
Some important ideas of keynotes:
EdVentures in Technology » Notes from BbWorld 2007 in Boston, MA:
Incentives matter
– Steven Leavitt
Arkansawyer » 2007» July» 11 (most comprehensive BbWorld ‘07 blog I’ve found):
Guy Kawasaki
- Make meaning.
- Make mantra.
- Jump to the next curve.
- Roll the DICEE.
- Don’t worry, be crappy.
- Polarize people.
- Let a hundred flowers blossom
- Churn, baby, churn
- Niche thyself
- Follow the 10/20/30 rule
- Don’t let the bozos grind you down
Talked to lots of Blackboard upper management. I haven’t drunk the Kool-Aid.
Jun
26
Obscurity Obsolescence
Filed Under Conferences, University | Leave a Comment
Along the same lines as Lacey’s Travel and Usability post, libraries are not really designed to be very usable. Well… unless you think like a librarian. Who gets an MLIS degree in order to use a library? Okay… I would… bad example.
The below article’s Digital Natives are kids who have played video games all their lives. It’s reporting on a talk given at an ALA conference arguing that librarians should redesign libraries to be friendlier to these Digital Natives (aka more like video games). The strawman argument:
When ‘Digital Natives’ Go to the Library :: Inside Higher Ed:
“The librarian as information priest is as dead as Elvis,” Needham said. The whole “gestalt” of the academic library has been set up like a church, he said, with various parts of a reading room acting like “the stations of the cross,” all leading up to the “altar of the reference desk,” where “you make supplication and if you are found worthy, you will be helped.”
This simile is warped in my experience. When I worked the reference desk, I didn’t so much bestow books upon supplicants as demonstrate how to use the tools. In essence, it was like explaining to a friend who is stuck how to play the game. I had heard of libraries in which non-library employees are not allowed access to the stacks, but I thought them rare.
Maybe instead of librarians playing more video games, students who play video games should actually use those skills when they go to the library? They can master a university library by spending a couple hours a week for a month browsing, identifying patterns, and enjoying the fruits of their efforts: interesting books. For me, “research” meant skimming all books and articles on a topic and tangents to the topic. I could spend a year absorbing knowledge in a good library. Working in the library exposed me to such an enormous wealth of knowledge free for the asking.
Instead, students typically go into a library to find a list of books or articles. They want to spend the minimum amount of effort to accomplish the goal. This certainly is not how they approach video games.




