Should CS Be Required?

Each of the nearly 2,000 freshmen entering Georgia Institute of Technology each year must take a computer science course regardless of their major, says Charles Isbell, associate dean for academic affairs at the school’s College of Computing… Similar to traditional general education requirements such as philosophy or world history, the purpose of such courses is to turn out well-rounded graduates, Isbell says.

“Why you need to take a CS1 … is the same reason why you need to take humanities, why you need to take a science, why you need to take a math,” he says. “It’s not because you’re going to be programming …. it’s because each of those represents a different way of thinking.”

Computer science was not a requirement at my alma mater (not GT). Introduction to Computers was an easy core class lots of students took. The class offered by Mathematics and Computer Science was about the components of a desktop, using Microsoft Office, and making a web page. The College of Education and the College of Business offered their own versions tailored to their disciplines.

At first, I did not want to go through a class on “This is a mouse. This is a keyboard.” At the time I was looking at upgrading from an AT form factor to ATX. Microsoft Word 95 was my fifth word processor. Plus I had made the web site for African American Studies for the university. In the end I took the class because it would improve my GPA. As I expected, it was an easy A, but the instructor did challenge me by having me help the others in the class.

This was not a real CS class though. I had already taken one, FORTRAN, which apparently did not count towards my core to graduate, oddly enough. I took another, Introduction to Programming, where I picked up some Java. Both programming classes gave me novel practice at the time in how I solve problems, plan, and research. They were good for me.

Despite not graduating with a computer degree, I did have a strong computer background and ended up in a computer profession. So my perspective is pretty much skewed in favor of all college students taking computer science classes.

OpenSSL Handshake Chain

One of the questions we ask our clients initiating an engagement to help them set up external authentication from our LMS to their server is, “What is the certificate authority for your SSL certificate?” We have been burned by people purchasing certificates from authorities Java does not support. (And the support is indeed limited compared to, say, Mozilla.)

We were given the name of an intermediate certificate which set off warning klaxons. There are none of these in the cacerts file, the list of root CAs Java uses.

So the clients set up to test. Failures. The error:

javax.naming.CommunicationException: hostname.domain.tld:port [Root exception is javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated

From what I was able to find, the error meant the certificate was not understood. Framed into thinking the intermediate CA was the cause, I started looking at how to make it work. The two potential routes were to get the client to add the intermediate CA to their server or to test ways to complete the chain by adding the intermediate to my client.

More failures.

Amy suggested looking at the certificate on the foreign server by connecting with openssl to get a better idea where it said there was a problem. The command looks like:

openssl s_client -connect hostname:port

The output was pretty clear that it could not understand or trust a self-signed certificate. The “i:” in the last line below is the issuer. This made it clear the certificate was not signed by the intermediate CA we were told about. It was a self-signed certificate. Doh!

depth=0 /CN=hostname.domain.tld
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 /CN=hostname.domain.tld
verify error:num=27:certificate not trusted
verify return:1
depth=0 /CN=hostname.domain.tld
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:/CN=hostname.domain.tld
   i:/DC=tld/DC=domain/CN=domain-NAME-CA

It is clear I need to make checking the certificate on the foreign host part of standard practice. I did some spot checking of previous setups that test against LDAP, and every one has a good certificate chain.
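
Going forward, a quick spot check could look something like this (the hostname, port, and grep pattern are only illustrative):

# Hypothetical spot check; substitute the real host and port.
HOST=hostname.domain.tld
PORT=636
# The certificate chain and any verify errors print near the top of the output.
echo | openssl s_client -connect ${HOST}:${PORT} 2>&1 | grep -E 'verify|s:|i:'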

Smaller Java Cache

One of our campus Blackboard Learning System Vista Enterprise administrators reported having reduced the number of Java cache related issues (failed sessions) by changing the Disk Space Allotment from the 1,000 MB default down to 100 MB. This is found in the Java Control Panel > General tab > Temporary Internet Files: Settings. I am curious whether anyone else has found this to be the case.

The purpose of web browsers having a cache is to speed up use of a web site by not having to download content again. RAM is faster than disk, which is faster than the Internet. (This was especially true in the mid 1990s.) Take a look at this web site. There is the image at the top plus various CSS and JS files. It looks like there are a good 224 KB in CSS, JS, and their supporting images. Rather than download a significant amount of content again, with the appropriate settings a browser will check whether the size changed (assume no changes) or whether it expired (really, that it is stale). If neither is true, then it uses what it already has. This makes my web site load faster for the user. So caching is a very good thing.
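
Those checks happen through HTTP headers. To see the headers a browser uses to decide whether its copy is still good, a HEAD request works; the URL below is just an example:

# A HEAD request returns only the headers, not the content.
# The URL is a placeholder; any CSS or JS file on the site would do.
curl -I http://example.com/style.css
# Look for Last-Modified, ETag, Expires, and Cache-Control in the response.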

Java Plug-in, the client that downloads and renders applets in a web browser, works similarly. It can keep a copy of the applet in a cache. Starting with Java 1.3 there are even parameters placed in the HTML for applet caching. It looks to me like the JavaScript that instantiates the HTML Creator applet (really edit-on(R) Pro by RealObjects) has settings which enable Java to keep it in its cache.

The default cache size of 1,000 MB sounded excessive at first. Do people really reach the point where the whole cache is used? Looking at mine, I have 4 items in Applications from running them on my desktop plus around 2,200 items in Resources. All this takes up only 155 MB. Most of them are tiny files. The largest ones in Resources are from the various Vista clusters I administer. Therefore setting this to 100 MB as recommended probably means these get downloaded more often, with waits on 1 MB+ files. Glad we have a fast Internet connection at work. Sucks to be the students on DSL who follow this advice and use lots of Java-based applets.

If the Java Plug-in cache were buggy, then I could foresee problems with the display of applets: it should download the applet but does not, it should not download the applet but does, the wrong applet is used, a corrupted applet is used. Instead, this seems to be claiming to solve an issue where the web browser lost the session cookie. It seems very unlikely to me that the Java Plug-in could cause a web browser to lose a session cookie, much less that changing the cache size would fix it.

New Root CA

One of our clients introduced a new LDAP server for authentication. Like a good partner, they implemented it in the test environment, found it did not work, and alerted us to the problem. They also informed us the problem was that the new Thawte root CA was not yet included in many operating systems and applications. Indeed, our research showed Sun/Oracle had not included the root CA in their list of valid ones.

Normally we do not add root CAs. A specific case which comes to mind is another client who “bought” a free SSL certificate for higher education from a place in Barcelona. In this case, we decided to go ahead and add it. However, we wanted to do so in a supported manner, so I opened a ticket with Blackboard.

The recommendation was to use keytool to apply the new root CA. (I would not do this without having an open ticket with Blackboard; if anything goes wrong, then at least you can get faster support.) After a couple of modifications of what they gave me, the keytool command worked. Namely, the cacerts file is installed with read-only permission, so it needs write permission to be edited, plus a change around the -keypass option. The process to use if we need to do this again will be:

# Navigate to where the root cert is stored.
cd ${JAVA_HOME}
cd ../lib/security/
# Copy the existing cacerts file to a backup in case we need to revert.
cp cacerts cacerts.bak
# Copy the existing cacerts file to a working copy so not affect running processes while editing.
cp cacerts cacerts.new
# Java sets permissions on the file to 444. Change to 644 to edit.
chmod 644 cacerts.new && ls -l cacerts*
# Set the variable for the name of the file to import.
NEWCERT=newthawteroot1
# Run the import keytool. Will ask for password. Ask the system admin or try the Java default.
${JAVA_HOME}/keytool -import -trustcacerts -alias ${NEWCERT} -file ${NEWCERT}.pem -keystore cacerts.new
# Verify change took.
${JAVA_HOME}/keytool -list -v -keystore cacerts.new | grep -A 10 ${NEWCERT}
chmod 444 cacerts.new && ls -l cacerts*
chmod 644 cacerts && cp cacerts.new cacerts && chmod 444 cacerts && ls -l cacerts*
# Another verify, but on cacerts, would not hurt to make sure it has the new root CA.
${JAVA_HOME}/keytool -list -v -keystore cacerts | grep -A 10 ${NEWCERT}

At this point I asked the institution to test on the node where I made the change. (It happens that the test cluster has a single node in a special pool. I had made this change on that node. It also meant other institutions testing on that cluster were not affected.) We still want to verify introducing this does not affect others using LDAP. I don’t think that it will, but that assurance never comes across to people as reliable when something completely unexpected causes a problem.

Now to write a script to push this change to the other 180 nodes. Should be easy enough, as I copied the new cacerts to our mounted file system. I just need a script to navigate to where the file is stored, make a backup, chmod, copy, chmod, and verify.
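
A rough sketch of that script might look like this (the node names and the mounted path are placeholders, not our real ones):

# Rough sketch; node names and the mounted path are placeholders.
NEWCACERTS=/mnt/shared/cacerts.new
for NODE in node01 node02 node03; do
  ssh ${NODE} "cd \${JAVA_HOME}/../lib/security && \
    cp cacerts cacerts.bak && \
    chmod 644 cacerts && \
    cp ${NEWCACERTS} cacerts && \
    chmod 444 cacerts && \
    ls -l cacerts*"
done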

Weblogic Diagnostics

I noticed one of the nodes in a development cluster was down. So I started it again. The second start failed, so I ended up looking at logs to figure out why. The error in the WebCTServer.000000000.log said:

weblogic.diagnostics.lifecycle.DiagnosticComponentLifecycleException: weblogic.store.PersistentStoreException: java.io.IOException: [Store:280036]Missing the file store file "WLS_DIAGNOSTICS000001.DAT" in the directory "$VISTAHOME/./servers/$NODENAME/data/store/diagnostics"

So I looked to see if the file was there. It wasn’t.

I tried touching a file at the right location and starting it. Another failed start with a new error:

There was an error while reading from the log file.

So I tried copying WLS_DIAGNOSTICS000002.DAT to WLS_DIAGNOSTICS000001.DAT and starting again. This got me a successful startup. Examination of the WLS files revealed the 0 and 1 files have updated time stamps while the 2 file hasn’t changed since the first occurrence of the error.

That suggests to me Weblogic is unaware of the 2 file and only aware of the 0 and 1 files. Weird.

At least I tricked the software into running again.
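
For the record, the workaround boiled down to something like this (the paths come from the error message above; $VISTAHOME and $NODENAME are whatever the node uses):

# Paths come from the error message; run as the user that owns the files.
cd $VISTAHOME/servers/$NODENAME/data/store/diagnostics
# Touching an empty file was not enough; copying an existing store file got a clean start.
cp WLS_DIAGNOSTICS000002.DAT WLS_DIAGNOSTICS000001.DAT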

Some interesting discussion about these files:

  1. Apparently I could have just renamed the files. CONFIRMED
  2. The files capture JDBC diagnostic data. Maybe I need to look at the JDBC pool settings. DONE (See comment below)
  3. Apparently these files grow, and a new file gets added when one reaches 2 GB. Sounds to me like we should purge these files like we do logs (see the sketch after this list). CONFIRMED
  4. There was a bug in a similar version causing these to be on by default.
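
A purge along the lines of item 3 might look something like this (the 30-day retention is an arbitrary guess, and the node should be stopped before removing anything):

# Hypothetical cleanup; run only while the node is stopped.
# Lists and removes diagnostic store files untouched for more than 30 days.
find $VISTAHOME/servers/$NODENAME/data/store/diagnostics \
  -name 'WLS_DIAGNOSTICS*.DAT' -mtime +30 -print -exec rm {} \;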

Guess that gives me some work for tomorrow.
🙁

How Not To Break a Frame

Correct:

<script language="Javascript" type="text/javascript">
if (top != self)
{
top.location = window.location;
}
</script>

Incorrect:

<script language="Javascript" type="text/javascript">
if (top != self)
{
top.location = "/webct/urw/lc18361011.tp0/logonDisplay.dowebct";
}
</script>

The problem with Incorrect is that the address used here is not the address in the location bar. The one in the location bar has the values required to log in. Instead I get something which causes users to be unable to log in. Example: we send someone to http://westga.view.usg.edu. They get redirected to another address in which we provide the glicid, insId, and insName. Correct breaks the frame and gives the browser back the same address. Incorrect breaks the frame and gives the browser back a different, non-functional address. Bad. Bad. Bad.

WebCT Vista 3 used the Correct JavaScript, which just passes back the address used. Blackboard Vista 8 for some reason changed what worked to the Incorrect version.

Yay for first day of classes.
🙁

UPDATE 1:

It gets better… Bb Vista’s Custom Login and Institution List pages are unaffected (aka they use the Vista 3 style JS). Only going to the generated logon page, loginDisplay.dowebct, has the issue.

Recap of Vista Stuff

It has been a hectic week. A recap…

Java certificate fix – Yesterday, August 23rd, the certificate distributed in various Java applets expired. The community discovered the issue and informed Blackboard, who put out a fix for the more current products on August 15th. Many customers are leery of having so little lead time to test, verify, and install a fix. Well, Vista 3.0.7.17 was also reported to have the problem, but Blackboard didn’t provide a fix until the 20th, after I got my TSM to verify it really was still a problem on the 18th. (The corrected 3.0.7.17.8 version was provided August 21st. Why is in the next paragraph.)

The fix for Vista 3 required us to be on 3.0.7.17.8 (hotfix 8, which we had not yet applied), had references to the “webctapp” directory (in Vista 3 it is applications), and distributed a webct.sh script to add updateWar which didn’t work with Vista 3. FAIL. Thankfully we have modified WAR files in the past, so adding the updates was more work but was accomplished before Blackboard provided a corrected version.

To see the Java certificates in Windows: Control Panel > Java > Security > Certificates. The Blackboard ones are verified by Thawte (the Certificate Authority). The old one is issued to Blackboard. The new one is issued to dc.blackboard.com.

Vista 3.0.7.17.8 – This hotfix was released a couple of weeks ago. However, since the priority has been the migration to Vista 8, this was on hold. The previous problem made us step up and throw this into production. The testers went to heroic efforts to get this and the certificate fix tested. Testing was mixed.

  1. Losing the session cookie because of Office 2007 in Internet Explorer. Happened less often post-fix, but still happens in some cases.
  2. Autosignon MAC2. The mode to allow insecure MAC works, giving the one school using it time to update their portal to use MAC2. Originally the plan was to let them work out MAC2 in test.

Slammed by our users…

  1. systemIntegrationApi.dowebct – The school using the autosignon wanted to have the correct consortiaId to create the MAC. Some time back in January they started calling this any time users tried to log in, because a handful (the guess was ~12) had had their username changed and so the autosignon failed. Yes, they sent us 25,000 requests in a busy day (about 20% of the queues were working on these during the day) to handle potentially 12 problems in a term. FAIL.
  2. pmSelfRegister.dowebct – One of the clusters started to have issues. Two nodes went crappy. I looked at the Weblogic console and found all of the failing nodes had no free spots in the queues. 90% of the queues were working on these. Much of this is because the requests were hanging around for at least 4800 seconds (an hour is 3600 seconds). At about 6000 seconds the cluster recovered when the queues cleared. I think the queues cleared because I changed a couple of settings to false:
    • Allow users to register themselves as a Student in a section = false
    • Allow users to register themselves as an Auditor in a section = false

    As I recall, we only had about 22 queue spots open (out of 308) across the whole cluster. We got lucky.

Nothing We Can Do?

A statement by a faculty member to the effect of, “There isn’t anything our IT people can do to resolve this problem. The web application is overloaded,” puts the people running the application on the defensive. The problem turned out to be resolved with changes to the local browser environment (remove all the installs of Java and replace them with a single known good version). In other words, it wasn’t simply that the web application was overloaded, and it could have been resolved by their IT people consulting the knowledge base intended to provide the information for resolving exactly this kind of issue.

The last situation we want to have is an overloaded server before we even hit the heavy usage period.

CE / Vista Undocumented Workspaces

On the WebCT Users email list (hosted by Blackboard) there is a discussion about a mysterious directory called unmarshall which suddenly appeared. We found it under similar circumstances as others, by investigating why a node consumed so much disk space. Failed command-line restores end up in this unmarshall directory.
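
Tracking down what was eating the disk is simple enough with du; the path below is a placeholder for wherever the node keeps its domain files:

# Placeholder path; point it at the node's domain directory.
# Largest directories sort to the bottom.
du -sk /path/to/WEBCTDOMAIN/* | sort -n | tail -20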

Unmarshalling in Java jargon means:

converting the byte-stream back to its original data or object [1]

This sounds suspiciously like what a decryption process would use to convert a .bak file into a .zip so something can open the file.

This is the fourth undocumented workspace where failed files sit for a while and cause problems, with no forewarning from the vendor.

Previous ones are:

  1. Failed UI backups end up in the weblogic81 (Vista 3, does this still happen in Vista 8?) directory.
  2. Failed tracking data files end up in WEBCTDOMAIN/tracking (Vista 3, apparently no longer stored this way in Vista 4/8 according to CSU-Chico and Notre Dame)
  3. Web Services content ends up in /var/tmp/ in files named Axis####axis. These are caused by a bug in DIME (like MIME) for Apache Axis. No one is complaining about the content failing to arrive, so we presume the files just pile up on the system.

#3 was the hardest to diagnose because there was no way to tie the data back to user activity.

Are these all there are? I need to do testing to see which of these I can cross off my list going forward in Vista 8. Failed restores are on it indefinitely for now.
🙁

References:

  1. http://www.jguru.com/faq/view.jsp?EID=560072

IMS Import Error When Node Is Down

This is what I got when a node was down while I attempted to do an IMS import in Blackboard CE/Vista.

Failed to upload files, exiting.
Cause could include invalid permission on file/directory,
invalid file/directory or
repository related problems

The keywords permission, file, and directory in this would have sent me anywhere but to the right place. The keyword repository made me suspicious the node had a worse issue than just bad permissions. So I looked for the most recent WebCTServer log and found it to be a week old. Verifying the last messages in the log confirmed it had been down for a week.
🙁

Having something in the log that questioned whether or not the node was running would have saved me lots of time this morning.

I added a couple of lines to my .bashrc to provide a visual indicator of how many are running.

JAVA_RUNNING=`ps -ef | grep '[j]ava' | grep -c '[v]ista'`
echo "  -- No. Vista processes running = $JAVA_RUNNING"

Better might even be to have it evaluate whether fewer than one or more than two (or three) are running. If so, then print something obvious that the world is falling. Maybe later. Took me just a couple of minutes to write and test what I have. The rest will come after I decide what I really want. 🙂
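
A minimal sketch of that fancier check, with thresholds that are only guesses at what normal looks like:

# Sketch of the threshold check; adjust the limits to whatever counts as normal.
JAVA_RUNNING=`ps -ef | grep '[j]ava' | grep -c '[v]ista'`
if [ "$JAVA_RUNNING" -lt 1 ] || [ "$JAVA_RUNNING" -gt 2 ]; then
  echo "  ** WARNING: $JAVA_RUNNING Vista processes running -- the world may be falling **"
else
  echo "  -- No. Vista processes running = $JAVA_RUNNING"
fi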

Also, it wasn’t running because a coworker had run into a situation where the fifth node would not start. She thought maybe it was because the number of connections Oracle would accept was not high enough. I suggested a simple test would be to shut down a node and see if the problem one suddenly works. I happened to be working with the one she had shut down for the test. It happens she had just started a script to bring them back up when I asked.