Networking
LITS Annual Report 2002-2003
Networking: an overview
Networking is responsible for the design and maintenance of the college network and the system administration of many of the host computers that reside on that network. The network and systems are maintained in a state of high reliability and security. We maintain efficient means for users to interact with these systems. This often requires installation or creation of new software applications. It also involves many aspects of instruction for end users and other staff members within LITS, including application development questions.
The Summer
We began the year in July in fear. The new structure called Kendade seemed far from complete. We had people to move into that structure. By August 1, we still did not have secure wiring closets for our equipment, and August 7 was the move-in day. On August 5 fiber was not ready in the wiring closets, but by August 7 it was, and we got temporary network electronics in place just as Physics was moving into the building. They had to wear hard hats to get to their offices.
The last week in August arrived along with many students. The various renovations had been completed in Clapp, 1837, Abbey, Ham, 1 Faculty Ln, Pearsons, N. Rockefeller, and S. Rockefeller. Networking was complete August 26.
The most exciting days at the end of August involved looking at Kendade with unfinished classrooms with less than a week to go before scheduled classes were supposed to be held. In the day or so before those classes were held, the rooms were sufficiently complete for network drops to be made live and the classroom computers to become operational.
Although disaster was barely averted in Kendade, another crisis occurred earlier in August.
Construction on Blanchard was continuing. Unfortunately, on August 2, they cut our 60 singlemode fiber run to Pratt. We were able to use our redundant linkage via Mary Woolley to route around the problem. As of July 2003, we still do not have the replacement fiber pulled. This is on the job list of the construction project.
There were other, less exciting but productive aspects of the summer.
Wireless was installed in the main reading room of the library on July 26, 2002.
Fred Kass and Kevin Slate got 17 Woodbridge Street on the network for Student Programs by August 15.
The end of August was also filled with a lot of back and forth with project managers regarding problems in the planned wiring of Blanchard for voice, data, and card access.
The most pleasant part of the summer was on August 19, 2002. We were informed that we were accepted as sponsored participants in Internet2. We were one of the first small liberal arts colleges to achieve this and the only one of the colleges in Five Colleges.
The Fall
The student arrival in September means that Networking becomes heavily involved with:
- Answering questions from students and colleagues.
- Helping configure machines that do not network properly.
- Making house calls to fix bad data ports which often are not bad.
- Dealing with a variety of user issues related to networking for faculty and staff as well as students as the load on all the systems jumps.
In mid September, we helped Doug Vanderpoel get some additional phone lines running over fiber into the Health Center. We did not know at the time how important this piece of technology would become.
By September 13, 2002, 78% of the resident students had gotten on the network and registered their computers. This was a number that we did not reach in 2001 until November. By October, that number was 87% of the residents, and it edged up to a bit more than 90% by the end of the calendar year.
These data are consistent with the survey data we obtain from incoming students. This survey has been done since 1995 and there are more details than I will present here. The most interesting data are how many incoming students said they brought a computer and of those, how many were Macintoshes.
| | Class 1999 | Class 2000 | Class 2001 | Class 2002 | Class 2003 | Class 2004 | Class 2005 | Class 2006 |
|---|---|---|---|---|---|---|---|---|
| Percent with computer | 48 | 50 | 57 | 68 | 74 | 79 | 84 | 89 |
| Percent Macintoshes of the computers | 52 | 42 | 26 | 16 | 12 | 11 | 11 | 10 |
On September 23, 2002, we provided Five Colleges with a plan from Russ Boudreau on how the fiber for the Five College Fiber Network project might traverse this campus.
On October 10, 2002, we were actually hooked up and traffic was flowing to Internet2.
Coincidentally, at that time we received our first Windows popup message spam on many computers across campus. This is an annoying popup message that someone out on the Internet can send to a Windows machine, usually containing some sort of advertisement. We convened at a Five College network meeting and soon decided to block the TCP ports on which those messages could appear.
At the end of October, we began to offer a proxy service so that off-campus users could reach restricted library sites as if they were on campus. By writing some specialized programs for this, we were able to provide this service only to current faculty, staff, and students. This kept us within the licensing parameters of the various databases. We were able to convert the bulk of the web pages on the Mount Holyoke web server to use this service. Through November and December, more and more of the databases were linked this way on the web pages from the III system. This service decreased the need for users to install VPN software on their home computers, a task that was overkill for this kind of service. (The VPN service remains for users who need other kinds of on-campus access, such as file services or secure administrative computing services.)
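The report does not describe how the specialized proxy programs were written. As a rough illustration only, the two pieces described above (an eligibility check and links that route through the proxy) might look like the following sketch. The proxy address, the status names, and both function names are hypothetical.

```python
from urllib.parse import quote

# Hypothetical proxy entry point; the real service address is not given in the report.
PROXY_PREFIX = "https://proxy.example.edu/login?url="

# Only current faculty, staff, and students may use the service (per the report).
ELIGIBLE_STATUSES = {"faculty", "staff", "student"}


def may_use_proxy(status, is_current):
    """Return True only for current members of an eligible group."""
    return is_current and status in ELIGIBLE_STATUSES


def proxied_link(target_url):
    """Rewrite a restricted database URL so that it is fetched via the proxy."""
    return PROXY_PREFIX + quote(target_url, safe="")


if __name__ == "__main__":
    print(may_use_proxy("student", is_current=True))
    print(proxied_link("http://db.example.com/search"))
```

Keeping the eligibility check separate from the link rewriting means the web pages can carry proxied links for everyone, while the licensing restriction is enforced in one place at login time.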
At this time (October 22), we also installed wireless access for the public areas of Kendade.
But not all was well. As we moved into the Fall semester, we began to receive notices of copyright infringement by (primarily) students who were using peer-to-peer (p2p) filesharing programs that were uploading music and videos to the Internet. We began to construct standard messages and procedures for dealing with these alleged infringements.
Related to the p2p filesharing were some severe bandwidth issues. In mid November, the outbound traffic was so large that college business, such as web serving and outbound email, was affected. On November 22, 2002, we limited the aggregate bandwidth from the residence halls to 10 Megabits/second (Mbps). Our total commodity Internet was 15 Mbps. The remaining 5 Mbps was plenty on which to conduct the college business. (Incoming traffic was not as affected by the p2p as was the outbound traffic.) This rate limiting at the campus Cisco router has proven quite successful in managing the traffic without having to resort to purchasing expensive traffic shaping network appliances, devices which are at risk of becoming obsolete as p2p programs encrypt their content.
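The report does not say how the Cisco router enforced the cap. Purely as an illustration of aggregate rate limiting, the sketch below uses a token bucket, one common technique for this. The class name, the burst size, and the packet size are assumptions chosen for the example; only the 15 Mbps and 10 Mbps figures come from the report.

```python
from dataclasses import dataclass

TOTAL_MBPS = 15          # total commodity Internet bandwidth (from the report)
RESIDENCE_CAP_MBPS = 10  # aggregate cap on residence hall traffic (from the report)
RESERVED_MBPS = TOTAL_MBPS - RESIDENCE_CAP_MBPS  # what remains for college business


@dataclass
class TokenBucket:
    """Illustrative token bucket; not the router's actual mechanism."""
    rate_bps: float        # sustained rate, in bits per second
    burst_bits: float      # bucket depth (assumed value, not from the report)
    tokens: float = 0.0    # bits currently available
    last: float = 0.0      # time of the previous packet, in seconds

    def allow(self, packet_bits, now):
        # Refill in proportion to elapsed time, never beyond the bucket depth.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False  # over the limit: the packet is dropped


def accepted_bps(offered_bps, seconds=10.0, packet_bits=12_000):
    """Offer evenly spaced packets at offered_bps and return the accepted rate."""
    bucket = TokenBucket(rate_bps=RESIDENCE_CAP_MBPS * 1e6,
                         burst_bits=1e6, tokens=1e6)
    interval = packet_bits / offered_bps
    count = int(offered_bps * seconds / packet_bits)
    accepted = sum(packet_bits for i in range(count)
                   if bucket.allow(packet_bits, i * interval))
    return accepted / seconds


if __name__ == "__main__":
    for offered in (5e6, 20e6):
        print(f"offered {offered / 1e6:.0f} Mbps, "
              f"accepted {accepted_bps(offered) / 1e6:.2f} Mbps")
```

Traffic offered below the cap passes untouched, while traffic offered above it is held close to 10 Mbps, which is the behavior that left the other 5 Mbps free.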
A Dark Cloud
At the end of September, we began to catch signs of hacked systems. We found that ATLAS, a server for the Center for Environmental Literacy, had been hacked and was serving gigabytes of data to the Internet per day. Because of the amount of bandwidth being consumed, we soon discovered that one of the Facilities Management servers (BG Unit II) was also hacked and distributing copyrighted material to the Internet. We also got an MIT report of a system in the residence halls scanning their network.
These were early indicators of a large problem. We did not yet know how much time it was going to take dealing with insecure systems that had become hacked.
Another Dark Cloud
On October 11, 2002, we were told that our connection from Smith College went out briefly in the afternoon. Later, at 23:01, Ron Peterson had a brief episode where MHC was unreachable from his home. These seemingly anomalous episodes were the first indicators of a problem that went from annoying to very disruptive over the next two months.
As we began to monitor the problem, we found that there were several instances per day of two minute Internet outages to UMass. In the beginning, there were a few per day. By late November, it was running between 5 and 15 per day, with most days running about 10 episodes per day. By early December, it was common to have 12-17 episodes per day.
We not only kept careful data, but we also posted those data to mhc.announce.college at regular intervals with explanations of the various troubleshooting steps that both UMass and Verizon were taking. As the problem dragged on, confidence in the network stability waned, and our frustration grew.
Finally, on December 17, 2002, Verizon failed over one of the cards in the circuit to a backup card and the problem vanished. (This was a test that had actually been suggested early in the process by one of the first Verizon technicians and which many of us feel they should have done at that time.) The problem did not recur while we ran on the backup card and on January 2 they replaced the bad card.
November 2, 2002 -- hacked systems take down the network
The other dark cloud turned into a severe event sometime after 20:00 Saturday, November 2. It turned out that a distributed denial of service (DDoS) attack had been launched from 6 hacked Windows computers. We were quickly able to ascertain the location of these 6 hacked computers from our records. With Public Safety, we went to the computers and disconnected them from the network, leaving notes for the users of those systems. Two were public machines in the library, which we confiscated in order to analyze them. At 01:00 November 3, we emailed the owners of the computers and others in LITS describing the event and what we had done.
The aftermath of that incident was that starting Sunday and for over two weeks, Networking did little else except work on Windows 2000/XP security: tightening up security on existing machines, discovering other hacked machines, and fixing hacked systems. This involved writing programs to fix vulnerabilities remotely and the very time-consuming process of cleaning (or reformatting) hacked systems. (These tasks were very high priority and so also distracted us from the intermittent two-minute circuit outage problems that were also occurring.)
Our first priorities were to get hacked machines off the network and secure vulnerable machines. We discovered a number of college machines that were already hacked and discovered that we had some major vulnerabilities in most of our other systems. With the programs that we had developed, we were able to correct most of these machines remotely.
After the first couple of weeks, we were able to concentrate on identifying vulnerable student machines that were not yet hacked and to instruct the individuals on how to "harden" their systems from attack. This included running our programs to check for hacking activity and to set system parameters to safer settings.
In mid November, some of us attended a UMass computer security conference that discussed many of the issues we had been experiencing and in which we had become reluctant close-to-experts. At the conference were a few experts, but the majority of the people were just beginning to experience what we had. It was nice to know we were not alone, but it also indicated that this was an ongoing, more serious problem than was previously envisioned.
Interestingly, we had closed some of the vulnerabilities by blocking certain TCP ports earlier in the semester in response to the pop-up window events. Closing those ports also increased the level of security on the network by preventing outsiders from discovering machines with these ports open, as is the case with Windows computers. Blocking those ports also reduced the ability to use hacks on those ports to gain entry. While this does not lower the risk from on-campus hackers, most of the risk here is from unknown outsiders as opposed to malicious insiders.
Quarantining systems
In the fall, we adopted a formal procedure called "quarantining" systems. Computers that were quarantined were restricted to on-campus access only. This was applied to several kinds of computers:
- Computers that were hacked or likely hacked.
- Computers for which we had received notification of copyright infringement.
- Computers that were using significantly more than their share of the available bandwidth to off-campus sites.
- Computers which were extremely vulnerable to being hacked, where the registrant of the computer would not properly secure it, so that it constituted a risk to the network.
We emphasize that a quarantine is not a punishment but a protection for either the individual or the general network usage. The quarantine is lifted as soon as the registrant of the computer has rectified the problem. In the case of copyright infringement, this requires a statement that the computer is not distributing copyrighted materials.
The one downside of this method is that the individuals cannot web surf off campus nor can they receive email at other off-campus locations via the web. We are planning on rectifying this by setting up a proxy server so that such machines may get off campus web access through one of our computers.
The quarantine method has worked so well that, with the addition of the proxy server, it is possible that some users who are more concerned about security might be provided with network access that would not have any direct off-campus capabilities.
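The report does not say how the on-campus-only restriction was implemented. As a minimal sketch of the policy described above, the following check allows a quarantined computer to reach campus addresses only. The campus address block, the reason labels, and the function names are all hypothetical.

```python
import ipaddress

# Hypothetical campus address block, used only for illustration.
CAMPUS_NET = ipaddress.ip_network("10.0.0.0/8")

# The four grounds for quarantine listed in the report, as short labels.
REASONS = {"hacked", "copyright_notice", "excess_bandwidth", "unsecured"}

# Registered IP address -> reason for quarantine.
quarantined = {}


def quarantine(ip, reason):
    """Place a registered computer in quarantine for one of the listed reasons."""
    if reason not in REASONS:
        raise ValueError(f"unknown quarantine reason: {reason}")
    quarantined[ip] = reason


def release(ip):
    """Lift the quarantine once the registrant has rectified the problem."""
    quarantined.pop(ip, None)


def is_allowed(src_ip, dst_ip):
    """Quarantined computers may reach on-campus destinations only."""
    if src_ip not in quarantined:
        return True
    return ipaddress.ip_address(dst_ip) in CAMPUS_NET


if __name__ == "__main__":
    quarantine("10.1.2.3", "copyright_notice")
    print(is_allowed("10.1.2.3", "10.9.9.9"))    # on-campus destination
    print(is_allowed("10.1.2.3", "192.0.2.10"))  # off-campus destination
```

A planned proxy server would fit this model naturally: because the proxy itself sits on campus, a quarantined computer could reach it without any change to the rule.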
Data on quarantined systems
The actual number of machines that we found to be hacked was relatively small. We did not keep accurate data in the aftermath of the DDoS attack, but since mid November we have recorded only 7 college computers and 10 student computers that were definitely hacked. I'm afraid this number may be low in that some computers were fixed without our becoming aware of them.
The number of machines for which there were Copyright Infringement Notices was much higher. There were 72 notices. Of those, 71 were for computers registered in the residence halls. During the year, there was a significant increase in the rate of notices:
| Sep | Oct | Nov | Dec | Jan | Feb | Mar | Apr | May |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 5 | 4 | 4 | 10 | 10 | 34 | 1 |
The Spring
The spring semester went a lot easier in that there were fewer crises, other than the telephone cable.
On February 4, 2003, we announced to the campus community that there was a serious problem developing with the phone system cable that connected more than 20 buildings to the telephone switch. The details of that problem are not relevant to this report, but we did work closely with the Telephone Office to help get alternative phone connections over fiber to the more severely affected buildings.
Carr went back online at the end of January and during the semester we made 21 wiring cabinets for the labs live on the ethernet. Each cabinet had between 17 and 58 dataports, with the total count being 618.
During the spring we got back to doing normal system management and other applications work. We improved the network monitoring. Work on the SIS system progressed well and security concerns began to be addressed. The VPN server was upgraded to two servers to provide redundancy as that service took on more importance for the SIS implementation. We also had a number of meetings with Facilities to plan out the summer renovations, especially those in Clapp.
In the middle of the first semester, we announced that we had a web-based email client called IMP, the same one that MIT has standardized on. Through the spring semester, the use of IMP increased, though the use of other email clients such as desktop clients (Netscape, Outlook Express) or shell access (pine) remained relatively constant.
Number of different users, by month:
| Month | Logging in (total) | Logging in (dorm) | Desktop email (IMAP) | Desktop email (POP) | Desktop email (total) | IMP |
|---|---|---|---|---|---|---|
| Nov 01 | 3827 | 1672 | 601 | 340 | 917 | -- |
| Dec 01 | 3757 | 1634 | 600 | 350 | 923 | -- |
| Mar 02 | 3901 | 1713 | 617 | 348 | 941 | 71 |
| Apr 02 | 3746 | 1705 | 625 | 340 | 934 | 53 |
| May 02 | 3874 | 1714 | 611 | 334 | 920 | 98 |
| Nov 02 | 4015 | 1902 | 716 | 295 | 983 | 924 |
| Dec 02 | 3968 | 1810 | 713 | 289 | 974 | 1211 |
| Mar 03 | 3927 | 1838 | 716 | 289 | 978 | 1362 |
| Apr 03 | 3897 | 1865 | 718 | 286 | 977 | 1320 |
| May 03 | 3899 | 1823 | 711 | 286 | 966 | 1630 |
Unfortunately, correlated with this usage pattern was a dramatic change in response of the MHC computer. While the other usage remained relatively constant, IMP use dramatically increased. The MHC webserver that runs IMP does so in a secure fashion. In addition, many of the shell logins were done via secure connections (ssh vs telnet). Secure connections like these require more processing.
Those factors may explain why, in the absence of change in other variables, the performance of MHC dramatically decreased during the spring. There were a few times when the load was such that the system was annoying to use. The effect on the IMP client was proportionally much greater than on other email clients, which are inherently faster.
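To put numbers on "relatively constant" versus "dramatically increased", the short calculation below compares the May 2002 and May 2003 rows of the usage table. It only restates the user counts already reported; it does not measure server load, and the variable and function names are ours.

```python
# Distinct users per month, copied from the usage table (May 2002 vs. May 2003).
may_2002 = {"logins": 3874, "desktop_email": 920, "imp": 98}
may_2003 = {"logins": 3899, "desktop_email": 966, "imp": 1630}


def growth_factor(before, after):
    """Ratio of the later count to the earlier one."""
    return after / before


factors = {name: growth_factor(may_2002[name], may_2003[name])
           for name in may_2002}

if __name__ == "__main__":
    for name, factor in factors.items():
        print(f"{name}: x{factor:.2f}")
```

Logins and desktop email users changed by a few percent at most over the year, while the number of IMP users grew more than sixteenfold.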
The solution we will adopt for Fall 2003 is to have a separate computer run the web-based email client IMP. Unless there are other factors of which we are not aware at this point, this should significantly reduce the load on MHC.
During this year, one of the projects that moved to a lower priority, but only because of more pressing problems, was upgrading the "sendmail" program. This program is at the core of the mail transport system. The upgrade was finally accomplished in April. It was necessary in order to experiment with and install more aggressive spam blocking, marking, and filtering programs.
General
During this tumultuous year, there were a number of other tasks that progressed or were accomplished.
- The SIS server (STAR) was moved toward a production state. An ancillary server (STARLITE) for SIS web display was brought up and tested, though the actual server will be upgraded for production use in 2003-2004.
- Additional disk space was put on AMBR, the general file server for offices, departments, courses, and the web. Even more disk space is required.
- A number of account management functions on AXIS were moved to AMBR, which reduced the reliance on this aging computer system. (AXIS was MHC in 1993 and became AXIS in 1996.)
- A new server (SONIC) was installed for Public Safety's database.
- The concept of Webshell was put into production along with the web-based email client, IMP. Webshell provides some functions via the web which were not otherwise available without a shell login with ssh or telnet. These functions include telephone directory lookup, getting grades, and getting advisee or advisor lists. It is planned that this is the way individuals will change their password, set email forwarding, and set email auto-reply (vacation) messages.
- A program for the web was developed to allow department heads and assistants to update online the telephone book section, Departments, Offices, and Programs.
- For research and development, there were experimental implementations of LDAP, D-Space, proxy serving, and a number of network monitoring tools. Some of the network monitoring tools were put into production and helped us get reports of high bandwidth users per day by IP number.
- The core of the network was moved from the aging 3Com 6012 to the two Cisco 6509 switches.
- The Beowulf cluster for Maria Gomez in Chemistry was installed, and we have begun to put automatic system management functions on that system as we have on our other major systems.
- MHC/AMBR were not upgraded. It had been planned over a year ago that we would upgrade both MHC and AMBR with the last round of Alpha computers. HP had announced that the Alphas would be manufactured into 2006, after which they would be in maintenance mode for some number of years. The impetus for upgrading was not overall system performance but the need to increase disk space and have multiple Gigabit ethernet interfaces. (They currently have 100 Mbps interfaces.)
The load levels in the Spring of 2003 on MHC were unexpected and suggested that a faster system would be advised. However, we decided instead to begin moving functions off of MHC, such as the web serving of IMP and Webshell, onto Linux-based servers which can be outfitted for a fraction of the cost. Maintenance is done by having redundant systems rather than by paying a company to be ready to fix a system.
We are hoping that moving these functions off of MHC will reduce its load to levels we have seen in previous years and avoid the urgency of an upgrade of the system. We will also explore the use of fast Linux systems for the upgrade path of both MHC and AMBR.