Last night was supposed to be easy. Shut down the Vista application, wait for a call after some work, and bring up the app. It was easy… at first. I got the app back online and continued poking after letting others know it was back in service.
Satisfied all was well, I went to bed at 3am or so. Well… I got a phone call. There might be a problem with the earlier maintenance, which was being addressed, but was my application affected? Well, no one could log in, so… yeah. Only the tools I use most often to find issues were unavailable too. Oops.
So I improvised. With netstat, I could see nodes in the application cluster were talking to each other and the database. Logs on one node showed it was failing to contact the database. So, the response should have been to shut down the application, right? As Amy says, “When all you have is a hammer, every problem starts to look like a nail.”
I didn’t make that choice. I decided it would be an overreaction. Instead, I reassured the people asking me what was wrong. When the network stabilized, the application did as well. It took over half an hour for the app to finally resolve the issues. As a result of this funk, it appeared the JMS services might have migrated, so I migrated them back. Hindsight being 20/20, I think even that might have been an overreaction, but it made me feel better to have done something.
Doing something feels better than doing nothing. It just doesn’t resolve the cause. It can easily become the cause of another issue. Two for the price of one?
P.S. I finally got to bed at about 7am, called in to a 1pm meeting, and crashed again after that meeting. I’m still tired. Let’s do it again in a couple of days.