Friday, May 2, 2008

Recent WiscMail Outage

The CIO/Vice Provost for Information Technology sent a detailed message to the Information Technology Committee about Wednesday's WiscMail outage. We will, I'm sure, be getting another update at our next ITC meeting.

Just to quickly translate a few things in the message. First, a "memory leak" basically means that the program keeps holding on to memory it no longer needs, so the longer it runs, the slower it becomes. Memory leaks are bugs that are occasionally encountered in new versions of software, and in general the right thing to do is to go back to the old version. Second, to understand "rebuild the message storage index database", just focus on "rebuild index." If you rip the index out of the back of a book, to replace it you have to start at the beginning of the book and re-read the whole thing. Without the index, the mail server can't find someone's email messages without searching through all of the messages. You can imagine that building an index for 60,000 inboxes takes a little bit of time.
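
For the curious, here is a minimal Python sketch of the book-index idea. It is only an illustration, not how the Sun mail software actually stores its index; the toy data and the names `messages`, `build_index`, `find_by_scanning`, and `find_with_index` are all made up for this example.

```python
from collections import defaultdict

# Toy message store: (owner, subject) pairs in arrival order.
# Hypothetical data, standing in for the real mail store.
messages = [
    ("alice", "Lab schedule"),
    ("bob", "ITC agenda"),
    ("alice", "Re: Lab schedule"),
    ("carol", "Welcome"),
    ("bob", "Meeting notes"),
]


def build_index(store):
    """Rebuild the index: read every message once, noting where each owner's mail sits."""
    index = defaultdict(list)
    for position, (owner, _subject) in enumerate(store):
        index[owner].append(position)
    return dict(index)


def find_by_scanning(store, owner):
    """Without an index: look at every message in the store."""
    return [subject for msg_owner, subject in store if msg_owner == owner]


def find_with_index(store, index, owner):
    """With an index: jump straight to the owner's messages."""
    return [store[position][1] for position in index.get(owner, [])]


index = build_index(messages)
print(find_with_index(messages, index, "bob"))
```

Both lookups return the same messages. The difference is that the scan touches every message on every lookup, while the index pays that cost once, up front, when it is rebuilt.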

All

Now that WiscMail service is restored, I want to give you an update on what happened.

First, let me apologize for the inconveniences this caused you. I am certain that we all count on e-mail for much of what we do every day and having these services unavailable is a significant disruption for us.

Tuesday morning, the WiscMail team installed an update to the Sun software we use to process mail. It was believed to be a fairly modest upgrade. By noon Tuesday, a few of our mail servers were experiencing a "memory leak," which degraded from poor to no performance rather quickly. The mail team worked with the vendor to address the problems and decided to roll back the upgrade to the previous version of software. The team worked around the clock to restore service for our users. Wednesday morning we experienced performance problems that resulted from the rollback, which required the mail team to rebuild the message store index databases. Full IMAP, POP, and webmail email delivery services were restored early afternoon Wednesday.

It is extremely rare that our WiscMail service experiences an outage. I want to assure you that steps are being taken to make sure every future upgrade does not bring this effect. The mail team is working with Sun Microsystems to fully diagnose the error(s), fix the upgrade and reapply it when we have full assurance it will work under current loads. We will also hold a post-incident review to learn from this event and adjust our strategies as appropriate.

Thank you for your patience and trust.


--
Ron Kraemer
Vice Provost for Information Technology
Chief Information Officer (CIO)
University of Wisconsin-Madison
ron.kraemer@cio.wisc.edu


I'm sure that this will bring up some discussions about the future of WiscMail as well. Here are some things to think about.
  • First, WiscMail is not designed to have "no downtime." To truly take it to the next level, we would need to have multiple active data centers. At present, there is an emergency plan, and all data is continuously stored at multiple locations. If something catastrophic were to happen to the Computer Sciences building, not a single email would be lost. However, if the Computer Sciences building can't recover, it would take a few days to totally restore email service from the backup site. It would be VERY expensive both to fully equip the additional data centers and, more importantly, to modify our email system (and the additional software that is necessary to run the email system) to automatically fail over and have just a few minutes at most of downtime. I don't know if there's something to be found in the middle of these two extremes, where we could restore email within, say, a few hours of a catastrophe at Computer Sciences. Clearly, there was no catastrophe at Computer Sciences Wednesday, so there is a lot of work to do with the current system so it can recover quickly even with one active data center.
  • Second, switching to GMail or some other outsourced email provider is not a "no-brainer" in terms of cost. GMail would reduce some costs and would increase others. We don't entirely know how those balance out. In addition, there are a whole host of new legal issues related to data ownership, privacy, and state records requirements that would need to be identified as part of any move. GMail does have multiple data centers, but they too have had the occasional multi-hour outage and are much less forthcoming about those outages.
The big questions will be "How much does email cost?" and "How much is email worth to us?" I don't know how much of this we'll get to discuss at the last ITC meeting of the year, but these issues will certainly be on people's minds, and hopefully we'll be able to have a good discussion on May 16th.

Wednesday, April 30, 2008

Final Post


Dear SGers,

This is my final post as chair of ASM Shared Governance. First, thank you to everyone who attended the transition mixer we had on Tuesday night. It was nice to see old and new faces alike, and I'm very excited to know we have an outstanding crop of new appointees for next year.

As we've been discussing throughout this year, Shared Governance is in transition, and this work will continue into the fall. With new bylaws and a moved appointment process, we have paved the way for a more functional and efficient system. The great work of Travis Weller (Shared Governance Appointment Coordinator), whose position was created only 1.5 years ago, has allowed us to be more robust in our representation - Shared Governance is providing more active and effective input on key university committees than ever before, expanding its reach in helping to create new committees, and streamlining its internal operations.

Starting this fall, the committee will hold for the first time a comprehensive orientation for all new appointees - complete with parliamentary procedure trainings and discussions with other shared governance stakeholders (i.e., faculty, staff, and administration). Also, I hope the next chair will utilize the shared governance board, especially those positions that aim to: a) foster the expansion of student representation, where and when appropriate; and b) educate the student body on what shared governance is and how it affects them as members of the university community.

Shared governance has a rich history at the University of Wisconsin, making this institution one of the most robust when it comes to student representation. Indeed, no other student government in the country partakes in the governance process of its university to the extent that we do. Please keep this in mind when serving next year, and know that students entrust you with a high degree of responsibility to serve in the interests of students to the best of your ability. Good luck, and I'll see you in the fall.

Warm Regards,
Jeff Wright

Union Design Committee - student appointees


ASM Shared Governance is pleased to announce that Jonathan Augelli and Cait Hardy will be representing students on the Union Design Committee. The Committee also appointed three student alternates - Matthew Colvin, Matthew Hansen, and Zachary Ward. The terms of these students begin immediately.

ASM Shared Governance received a total of 20 applications for two voting seats.

The Associated Students of Madison Shared Governance Committee Blog serves as a space for shared governance appointees and the UW-Madison student body to communicate on issues relating to shared governance. As part of their responsibilities as student representatives, appointees will post a report following each meeting attended.