Networking
LITS Annual Report 2004-2005
Networking is responsible for the design and maintenance of the college network and the system administration of many of the host computers that reside on that network. The network and systems are maintained in a state of high reliability and security. We maintain efficient means for users to interact with these systems. This often requires installation or creation of new software applications. It also involves many aspects of instruction for end users and other staff members within LITS, including application development questions.
The Fall, 2004
The student arrival in September means that Networking becomes heavily involved with:
- Answering questions from students and colleagues.
- Helping configure machines that do not network properly.
- Making house calls to fix data ports reported as bad, which often turn out not to be bad.
- Dealing with a variety of networking issues for faculty, staff, and students as the load on all the systems jumps.
Student network registration and update check
We changed the network registration program the previous year so that Windows computer users would download a program written by Kevin Slate that would check the computer system for patches and operating system upgrades. If the computer was not up to date, it would automatically connect to the Microsoft update site and download updates.
In Fall 2004 we continued to spend a great deal of time in the Information Commons helping students get their systems patched and up to date. It was a great deal of work because many computers were far out of date.
Overall, however, we feel it was less of a problem than the previous year. In general, the Fall semester went very well with regard to virus attacks. It is likely that our work getting many of the systems up to date was responsible for the lack of problems.
Some computers remained out of date. The network registration program included a mechanism that let someone opt out of the process of getting their system up to date. Many of these systems had firewalls installed anyway, so there was some protection for them.
The computer system called MHC
During the fall, the Digital Alpha DS20 system, which was MHC, began to have serious load problems. This system was in its last planned year of operation, but it was having severe problems keeping up with email.
The performance of the Digital Alpha was so bad that we decided to accelerate the move to Linux systems over Christmas break and January term.
The Spring, 2005
The decision to move email off of the Digital Alpha DS20 was driven by the miserable performance that started around November 2004. The system onto which we moved the MHC function developed several serious complications.
While we had been using this same operating system for many years on systems such as the webserver and the ISIS servers, these systems used disk storage from other systems. The new system turned out to have serious problems with:
- Disk quotas.
- Operating system bugs triggered by high numbers of users.
- Disk transaction speeds.
- General system tuning parameters.
These problems did not manifest all at once. The disk quota issue and the other operating system bugs caused the new system to crash with noticeable frequency (about every five days). Determining the causes of these problems took well over a month. In fact, during that time, the disk quota issue which caused one kind of system crash was recognized and corrected.
By mid-February the system was staying up for two weeks before a crash or planned reboot was required. For a user community that had become accustomed to these kinds of events occurring less than once a semester, this was a very bad situation.
By March, system stability had become acceptable and crashes had stopped occurring, but disk speed began to become a problem. Between March and the end of the semester, we spent a great deal of time attempting to tune the system to eke out better disk performance.
Independently of the problems on the email server MHC, we developed some disk problems on the Digital Alpha called AMBR, which is our general fileserver for web pages and department space. Although we recovered from this problem rapidly and in line with our disaster recovery plans, it further eroded confidence in the systems in general.
The Summer, 2005
During the spring and into the summer, the primary task we engaged in was developing a Linux system for the email server (MHC) and file server (AMBR) that had sufficiently fast disks and sufficient memory to handle the load.
In next year's annual report, we will be able to say that these efforts were successful and the systems have performed well and reliably (2Feb06).
Prepared by Mike Crowley September 2005