Downtime: Tuesday, Sept 15, 2009

Update 3:25pm: Copy operation is complete. E-mail service is back online. It may take an hour or two for all queued mail to be delivered.

Update 2:40pm: Copy operation is 86% complete. Our guess is now 3:20pm +/- 30 minutes.

Update 1:20pm: Copy operation is 57% complete. The past few percentage points have taken wildly differing amounts of time to reach. Therefore, the estimate for a completion time is highly speculative. Our guess is now 3:30pm +/- 1 hour.

Update 11:10am: Copy operation is 42% complete. Our revised estimate for operational e-mail is now 2:15pm.

Update 9:45am: We are in the midst of making a new working copy of the disk image of the e-mail store. We will be able to restart the e-mail server shortly after the copy finishes. Based on how quickly the copy's progress meter has advanced over a few percentage points, we estimate that e-mail will be back up at approximately 2:00pm. We understand that this is a major inconvenience and we will post periodic updates as the copy progresses.
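For readers curious how an estimate like this can be derived from a progress meter, here is a minimal sketch of a linear extrapolation from two readings. It is not necessarily the method used for these updates; the function name `estimate_completion` is hypothetical, and the sample readings (57% at 1:20pm, 86% at 2:40pm) are taken from the updates posted on this page.

```python
from datetime import datetime, timedelta


def estimate_completion(t1, pct1, t2, pct2):
    """Linearly extrapolate when a copy will reach 100%.

    Assumes the copy keeps the average rate observed between the two
    readings, which the updates on this page show is only a rough guess.
    """
    if t2 <= t1 or pct2 <= pct1:
        raise ValueError("second reading must be later and further along")
    rate = (pct2 - pct1) / (t2 - t1).total_seconds()  # percent per second
    remaining_seconds = (100.0 - pct2) / rate
    return t2 + timedelta(seconds=remaining_seconds)


# Two readings from the posted updates: 57% at 1:20pm, 86% at 2:40pm.
first = datetime(2009, 9, 15, 13, 20)
second = datetime(2009, 9, 15, 14, 40)
eta = estimate_completion(first, 57.0, second, 86.0)
print(eta.strftime("%I:%M%p"))
```

With these two readings the extrapolation lands at about 3:18pm, in line with the 3:20pm guess posted at 2:40pm. Because the copy rate varied widely, any such figure should carry a generous margin.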

Update 8:10am: Everything except E-mail service is back up and running. E-mail is down due to issues with the underlying virtual machine infrastructure. We are on the line with VMware now and hope to have this resolved soon.

Basically, we have two pools (the "old" and the "new") of physical storage on which we run virtual machines. Within these virtual machines (VMs) we run various services within the department. Over the past few months we have migrated services from the old storage pool to the new storage pool. The old pool has each VM tied to a particular storage device while the new pool will let us migrate the VMs between devices as needed. On August 25, we were to migrate the last VMs (running e-mail) from the old pool over to the new pool. Due to a bug in the VM control software, this failed and we were forced to get the E-mail VMs running without having much control over them. The primary goal of today's downtime is to complete this migration. For reasons unknown, the simple act of shutting down the virtual machines has put the system in a state where it can't be started cleanly. We are on the line with VMware to address the issue.

 

On Tuesday, September 15, 2009, we will have a scheduled downtime from 4:00am to 8:00am EDT.

This downtime affects all users of the department's computing and networking infrastructure.

During this time, most of the services (e.g., E-mail, web, cycle servers) will be unavailable. E-mail destined to the department will be queued and delivered at the end of the maintenance window.

Scheduled work includes:

  • Updating our virtual machine infrastructure
  • Rebooting remote switches in Sherrerd Hall and 221 Nassau

This work is a follow-up to our previous downtime on August 25 and will increase the stability of our infrastructure in preparation for the Fall semester.

Mail Trouble – 8/25/2009

9:45am: Our e-mail service is currently down. We are working with our vendors to correct the problem.

10:15am: Problem appears to be with the virtual machine software that is hosting our mail server. We have "Gold" service with the VM vendor and are awaiting a return call.

11:00am: We are on the phone with the vendor's technical support.

11:15am: Mail is working again.

Summer 2009 Downtime Schedule

Here is the maintenance schedule for the summer:

Tue, Jun 16, 2009, 4:00am-8:00am
Tue, Jun 30, 2009, 4:00am-8:00am
Tue, Jul 14, 2009, 4:00am-8:00am
Tue, Jul 28, 2009, 4:00am-8:00am
Tue, Aug 11, 2009, 4:00am-8:00am
Tue, Aug 25, 2009, 4:00am-8:00am

During these times we will be performing a variety of update, installation, and maintenance tasks.

Outgoing Mail is Down

Because our mail server was sending large amounts of spam, we have temporarily turned off the SMTP portion of the system while we identify, isolate, and correct the problem. As a result, outgoing mail is disabled. Users are able to receive mail. Users can also create draft messages but these must be manually sent after the SMTP server is re-enabled.

Update 9:07am, April 3, 2009:
Outgoing mail is re-enabled. Downtime was approximately 10 minutes.

Downtime: Tuesday, March 17, 2009

On Tuesday, March 17, 2009, we will have a scheduled downtime from 4:00am to 8:00am EDT.

This downtime affects all users of the department's computing and networking infrastructure.

Scheduled work includes:

  • Moving data in user home directories to a new disk array
  • Updating the software for our e-mail system
  • Rebooting the c2 cluster to pick up some recent configuration changes

This work is part of normal maintenance.

Downtime: Tuesday, January 27, 2009

On Tuesday, January 27, 2009, we will have a scheduled downtime from 4:00am to 8:00am EST.

This downtime affects all users of the department's computing and networking infrastructure.

Scheduled work includes:

  • Reconfiguring the wireless network
  • Updates to our NetApp filer
  • Updates to the OS on the cycles machines
  • Updates to our e-mail system

This work is part of normal maintenance. Note that this downtime overlaps the OIT shutdown of all campus network services. For details, see their posting.

The change that will affect users the most is the reconfiguration of our wireless infrastructure. After the change, all wireless users should use the "csvapornet" SSID. For details, see the CS Guide.

Connectivity Problems

Due to a yet-to-be-identified source, we are seeing very large bursts of connections to large numbers of outside IP addresses. These hour-long bursts occurred at approximately 1:00am and 7:00pm on Sunday, and 1:00am and 7:00am on Monday. Each event filled the firewall connection table and disrupted connections for about 3 hours.

Update: While the source has been identified, we have not been able to reach the user. The traffic began again at 1:00pm today. We have disabled that port. You may notice some delays for a few more minutes while the network settles.

Downtime: Thursday, December 18, 2008

On Thursday, December 18, 2008, we will have a scheduled downtime from 8:00am to 10:00am EST.

This downtime only affects direct and indirect users of the project file server. This includes the web servers, cycle servers, c2 cluster, ftp server, and the ftp mirror.

Note that e-mail, networking, the CVS server, and the database machines will remain operational during this time.

As one of the steps to clean up the file system mess, we will do a final sync between our temporary storage and our re-built production storage.
