Princeton CS Status

CS File System Maintenance, Mon, August 12, 2013, 09:00-16:00

This maintenance is the work rescheduled from July 25, 2013.

Who is affected:
Potentially, all users of CS Department computing and storage infrastructure.

What is happening:
On Thursday, the storage cluster that provides file service for all CS Department public services will undergo upgrades to its network interfaces.

During this maintenance, connections to Samba/CIFS file shares will experience occasional, temporary interruptions. Users may need to re-establish connections to the file share in some cases.

We DO NOT anticipate actual downtime for other services, but since all maintenance involves the risk of broader problems, this announcement serves as a heads-up. In the event of problems with this maintenance, impacts could range up to outages of ALL CS Department public services that rely on the storage cluster (file service, public login hosts, e-mail, web service, etc). Note that wired and wireless networking connectivity will not be affected.

Why is it happening:
This upgrade will provide 10Gbit network connectivity to our storage cluster, getting us one step closer to end-to-end 10Gbit capability for our services.

Update – 14:12 – The upgrade has run into some issues that are resulting in some services being slow to respond or unavailable. We apologize for the trouble, but are working with the vendor presently to get things back in full working order.

Update – 16:40 – All services should be back to normal operation now. Please contact CS Staff if you continue to have problems.

CS File System Maintenance, Mon, August 12, 2013, 09:00-16:00 Read More »

CS Network Downtime, Tuesday, July 30, 2013, 06:00-07:00

All / jrc

Who is affected:
All users of the CS Department Network

What is happening:
The Department\’s primary border firewall will be replaced with a new firewall device. During the replacement, connections from the CS Department to the main campus, the internet, or to portions of the CS Department network which are outside our firewall, will be interrupted. However, as OIT will also be performing router maintenance between 05:00 and 07:00 (see earlier announcement), we don\’t anticipate this work causing much additional impact.

Why is it happening:
This is the second step of our three-step process to provide 10Gbps networking from our network core out to the main campus, and, by extension, the internet. This upgrade will give our firewall the internal capacity to support 10Gbps connections.

CS Network Downtime, Tuesday, July 30, 2013, 06:00-07:00 Read More »

Cancelled: CS File System Maintenance, Thursday, July 25, 2013, 09:00-16:00

All / jrc

Due to technical complications, the decision was made to postpone the work in order to minimize the risk of problems. Those issues having been worked out, we have rescheduled this maintenance for Monday, August 12, 2013.

Who is affected:
Potentially, all users of CS Department computing and storage infrastructure.

What is happening:
On Thursday, the storage cluster that provides file service for all CS Department public services will undergo upgrades to its network interfaces.

During this maintenance, connections to Samba/CIFS file shares will experience occasional, temporary interruptions. Users may need to re-establish connections to the file share in some cases.

Why is it happening:
This upgrade will provide 10Gbit network connectivity to our storage cluster, getting us one step closer to end-to-end 10Gbit capability for our services.

Cancelled: CS File System Maintenance, Thursday, July 25, 2013, 09:00-16:00 Read More »

Wireless networking in all Administrative and Academic Buildings on both Main Campus and Forrestal B Site 6/15/2013 from 06:00 to 10:00

All / cmmiller

This is a repost of an OIT message. This downtime is being performed by OIT but affects wireless networking in the CS Building.

Service(s) affected: Wireless networking in all Administrative and Academic Buildings on both Main Campus and Forrestal B Site 6/15/2013 from 06:00 to 10:00.
Date/time of outage: 06/15/2013 6:00 am
Duration of outage: 4 hours 0 minutes

There will be a wireless networking outage in all administrative and academic buldings on both Main Campus and Forrestal B Site on 6/15/2013 from 06:00 to 10:00 AM. This is needed for a system reconfiguration.

Wireless networking in all Administrative and Academic Buildings on both Main Campus and Forrestal B Site 6/15/2013 from 06:00 to 10:00 Read More »

CS Penguins / Cycles System Downtime / Replacement, Tuesday, June 18, 2013, 06:00-08:00

All / cmmiller

Who is affected:

All users of the CS Department \”cycles\” or \”penguins\” systems (soak, wash, rinse, spin, opus or tux).

What is happening:

On Tuesday morning, the cycles systems will be replaced with newer, faster hardware with more memory. The existing cycles machines will later be added to the ionic compute cluster. At the same time, opus and tux will have OS updates applied.
SPECIAL NOTE: As we are replacing the hardware, the operating systems on the cycles systems will also be reinstalled, so all crontabs will be deleted. You will need to back up your crontabs before the downtime, and put them back after.

Why is it happening:

As part of routine maintenance, and to provide enhanced capabilities, the hardware running our systems is periodically replaced, and OS updates are periodically applied.
Further, OS updates are urgent at this time for all of our Linux systems as a result of recent kernel vulnerabilities for which we have seen active exploits occurring.

CS Penguins / Cycles System Downtime / Replacement, Tuesday, June 18, 2013, 06:00-08:00 Read More »

CS Infrastructure Down: Wed, May 29, 2013

All / scott

4:00pm : We are experiencing a problem with our infrastructure. E-mail, websites, cycle servers, and file systems are impacted. We are investigating.

4:25pm: The CS LDAP servers are currently having some issues, which are causing problems with email and the file system. We are working to address the problem and should have an update soon.

4:40pm: The LDAP Servers have been fixed. The file system and email should start working again in the next 15 minutes.

CS Infrastructure Down: Wed, May 29, 2013 Read More »

DropBox is down

All / jrc

We are working to resolve an issue with DropBox. The site is currently down. We will post status updates here.

Update 11:39 AM – DropBox is now working again. Please let us know if you have any issues.

DropBox is down Read More »

Downtime: Tue, March 19, 2013

All / scott

On Tuesday, March 19, 2013, we will have a scheduled downtime from 6:00am to 8:00am EDT.

Some of this work (bullets \”1\” below) was originally scheduled for March 5, 2013.

Who is affected:

Users of the public login machines (soak, wash, rinse, spin, tux, opus), certain CS websites (jobs, kiosk, msdnaa, search, wiki), database servers, runscript, CAS, lpdrelay (printing), labpc machines (Friend Center fishbowl lab) and the ionic cluster.
Dynamic websites for projects or groups with hostnames of the form project-or-group-name.cs.princeton.edu that are accessing executables (e.g., python, perl, java) on the shared /usr/local filesystem.

What is happening:

These hosts will receive a critical security update and be rebooted.
Websites of the form project-or-group-name.cs.princeton.edu are moving to a new server that no longer mounts the shared /usr/local filesystem.

Why is it happening:

There is a critical kernel security patch available for our Springdale hosts. It addresses a specific security vulnerability. As this is a kernel update, all machines must be rebooted. Actual expected downtime should only be a minute or two for each host to reboot.
The website migration is the next step to decommission the shared /usr/local filesystem. This work has already been done for the cycles and penguin machines and will result in decreased load on our central file server and simplified management of our web server infrastructure. Additional notes about the website migration:
- In the week leading up to the downtime, we will reach out to the owners of as many affected websites as we can identify with instructions on what they need to do for the migration. We anticipate that very few sites will need modification and that those modifications will be minor.
- After the downtime, all projects and groups should double check that their sites are operating as expected. If not, please notify CS Staff immediately and we can assist and/or temporarily move the site back to the old server.

Downtime: Tue, March 19, 2013 Read More »

Downtime: Sat, March 16, 2013

All / scott

On Saturday, March 16, 2013, we will have a scheduled downtime from 6:00am to 8:00am EDT.

Who is affected:

All users of the CS wired network (including PlanetLab at 221 Nassau, CS hosts in CITP in Sherrerd, and the CS section of the data center at 151 Forrestal) as well as users of the OIT wireless network in the CS Building.

What is happening:

We will be updating the firmware on the department\’s gateway router and firewall service module. During this time, there will be no OIT wireless connectivity in the CS Building (as the OIT access points use the CS wired infrastructure) and there will be no Internet connectivity between the CS network and the outside world.

Why is it happening:

This update will address some ongoing network issues affecting a limited number of services.

Downtime: Sat, March 16, 2013 Read More »

OIT Wireless Down: March 6, 2013

All / scott

OIT is reporting that the puwireless is not available to some users. For more details, see their outage message.

OIT Wireless Down: March 6, 2013 Read More »