2024-01-29 – Unplanned Outage

Several services, including DNS and Web services, are experiencing an unplanned outage this morning. Staff are aware, en route, and looking into the issues. We will publish updates as we learn more.

07:50 Update – A problem was located and mitigated with the DNS servers. All services should be returning to normal at this time.
08:10 Update – We are still having issues with the CS DNS servers and are continuing to work on them.
08:59 Update – We are still working on the CS DNS issues. You can use the wireless eduroam network to connect to things outside CS.
09:38 Update – We have tracked down the issue with the CS DNS server, and things should start returning to normal. The CS clusters will remain offline until we can track down an issue with them.

10:02 Update – The clusters are back online. All services should have returned to normal.


[rescheduled] CS Database Downtime, Monday, January 8, 2024, 07:00-10:00

Due to an unforeseen scheduling conflict, this downtime, previously
announced for Tuesday, is being rescheduled by one day to Monday,
January 8th, 2024.

Please contact CS Staff if it causes you undue hardship.

Thank you,
CS Staff

----- Original Message -----
From: “csstaff” <csstaff@cs.princeton.edu>
To: “downtime” <downtime@lists.cs.princeton.edu>
Sent: Wednesday, December 20, 2023 11:10:35 AM
Subject: [downtime] CS Database Downtime, Tuesday, January 9, 2024, 07:00-10:00

Date: Tuesday, January 9, 2024 (07:00-10:00)

Who is affected:
All users of the CS Department "publicdb" database server, including
any dependent web properties, and all users of the CS Department Beowulf
high-performance computing cluster, known as ionic.

All users of CS Department administrative web properties (Dropbox, CS
Guide, the Main website, etc.)

What is happening:
During this window, the "publicdb" database server will be replaced
with a newer server. All existing MariaDB databases will be migrated to the
new server, so no data loss is anticipated. However, while Slurm jobs will
continue, new jobs cannot start during the migration.

In addition, the database server underlying the administrative systems
will be upgraded and replaced. During the upgrade, all database-dependent
administrative systems will be unavailable. This includes the CS Dropbox
service, the main website, the CS Guide, ADM, and any content feeds
provided by CS Staff.

Why is it happening:
The old servers running MariaDB 10.1.24 will be replaced by newer ones
running MariaDB 10.5.22.

The phpMyAdmin web interface will be upgraded from version 4.4.14 to 5.2.1.

This is part of regular maintenance to enhance system performance and
security.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Database Downtime, Tuesday, January 9, 2024, 07:00-10:00

Date: Tuesday, January 9, 2024 (07:00-10:00)

Who is affected:
All users of the CS Department "publicdb" database server, including
any dependent web properties, and all users of the CS Department Beowulf
high-performance computing cluster, known as ionic.

All users of CS Department administrative web properties (Dropbox, CS
Guide, the Main website, etc.)

What is happening:
During this window, the "publicdb" database server will be replaced
with a newer server. All existing MariaDB databases will be migrated to the
new server, so no data loss is anticipated. However, while Slurm jobs will
continue, new jobs cannot start during the migration.

In addition, the database server underlying the administrative systems
will be upgraded and replaced. During the upgrade, all database-dependent
administrative systems will be unavailable. This includes the CS Dropbox
service, the main website, the CS Guide, ADM, and any content feeds
provided by CS Staff.

Why is it happening:
The old servers running MariaDB 10.1.24 will be replaced by newer ones
running MariaDB 10.5.22.

The phpMyAdmin web interface will be upgraded from version 4.4.14 to 5.2.1.

This is part of regular maintenance to enhance system performance and
security.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Ionic Cluster Downtime, Thursday, October 19, 2023, 7:30-9:30

Date: Thursday, October 19, 2023 (7:30-9:30)

Who is affected:
All users of the CS Department Beowulf high performance computing cluster,
known as ionic.

What is happening:
CS Staff will upgrade the cluster management and job scheduling system
Slurm and its database. No reboot will be necessary; thus, we expect to
finish the upgrade earlier than this window. However, the wide time frame
acknowledges the uncertainties involved.

Why is it happening:
The upgrade is necessary to patch against an urgent security bug.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Storage System Downtime, Tuesday, August 29, 2023, 07:30-10:30

Date: Tuesday, August 29, 2023 (07:30-10:30)

Who is affected:
All users of CS Department storage systems, including project spaces, home
directories, and web spaces.

What is happening:
The central file storage cluster will be rebooted a few times during this
window in order to facilitate physical upgrades. During the reboots, file
services will be interrupted, but will resume after the cluster finishes
its boot. This will affect access to the cycles login hosts, CS Department
web services, and CS Department SMB/CIFS services. Email services should
not be affected.

The actual outage is not expected to last the full three-hour window, but
interruptions may occur sporadically during this period.

A reservation has been placed on the ionic cluster to hold all jobs that
would overlap with this maintenance. Jobs will automatically start again
after completion of the work.
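
As a rough illustration of what "overlap with this maintenance" means, the sketch below checks whether a job's run interval intersects the window. It is a minimal sketch, assuming the scheduler compares a job's start time plus its requested time limit against the window; the function name and the sample jobs are hypothetical and are not taken from the cluster's configuration.

```python
from datetime import datetime, timedelta

# Maintenance window from this announcement: Tuesday, August 29, 2023, 07:30-10:30.
WINDOW_START = datetime(2023, 8, 29, 7, 30)
WINDOW_END = datetime(2023, 8, 29, 10, 30)


def overlaps_window(start: datetime, time_limit: timedelta) -> bool:
    """Return True if a job occupying [start, start + time_limit) intersects the window."""
    end = start + time_limit
    return start < WINDOW_END and end > WINDOW_START


# Hypothetical job: 12 hours starting the evening before would still be running at 07:30.
print(overlaps_window(datetime(2023, 8, 28, 20, 0), timedelta(hours=12)))  # True
# Hypothetical job: 2 hours starting at 04:00 ends at 06:00, before the window opens.
print(overlaps_window(datetime(2023, 8, 29, 4, 0), timedelta(hours=2)))  # False
```

In practice this means a job with a shorter requested time limit may still be able to run before the maintenance begins, while longer jobs wait until the work is complete.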

Why is it happening:
The storage cluster will be upgraded during this window. The upgrades will
modernize the backend network of the cluster, as well as add new all-flash
nodes to speed up front-end operations.

This, combined with recent front-end network upgrades, will continue the
improvements to the cluster necessary to prepare for the arrival of the new
SEAS HPC cluster that will be hosted in CS.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS System Downtime – Project Web Server, Wednesday, July 26, 2023, 08:00-10:00

Date: Wednesday, July 26, 2023 (08:00-10:00)

Who is affected:
All users of the CS Department project web space service.

What is happening:
CS Staff will upgrade the web project servers to the latest Springdale 9
distribution.

PHP on this server will be upgraded from version 8.0.13 to 8.1.14, and
Phusion Passenger, the system which allows for support of web application
frameworks, will be upgraded from version 6.0.14 to 6.0.18. There are
several backward-incompatible changes between the PHP versions, and some
project websites will need code adjustments to work properly on the new
server. You can read more about the changes between the PHP versions on
these pages:

www.php.net/manual/en/migration81.php
www.php.net/manual/en/migration81.deprecated.php
www.php.net/manual/en/migration81.incompatible.php

Note the “Backward Incompatible Changes” link, which is worth reviewing to
prepare for your site update.
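
As a starting point for that review, the following is a minimal sketch of a scanner that flags a few names deprecated in PHP 8.1. The list is a small, hand-picked subset and is an assumption of this example, not a complete inventory; the function names `scan_text` and `scan_tree` are hypothetical. The matching is a plain text search, so it can produce false positives (for example, in comments) and will miss anything not on the list. The migration pages above remain the authoritative reference.

```python
import re
from pathlib import Path

# Partial, hand-picked list of functions and constants deprecated in PHP 8.1.
# This is NOT exhaustive; consult the PHP migration pages for the full list.
DEPRECATED_8_1 = [
    "strftime", "gmstrftime", "strptime",
    "date_sunrise", "date_sunset",
    "FILTER_SANITIZE_STRING", "FILTER_SANITIZE_STRIPPED",
]
PATTERN = re.compile(r"\b(" + "|".join(DEPRECATED_8_1) + r")\b")


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, name) pairs for each listed name found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root: Path) -> dict[Path, list[tuple[int, str]]]:
    """Scan every *.php file under root and return only the files with hits."""
    report = {}
    for path in sorted(root.rglob("*.php")):
        hits = scan_text(path.read_text(errors="replace"))
        if hits:
            report[path] = hits
    return report
```

Calling `scan_tree(Path("path/to/your/site"))` (a placeholder path) returns a mapping from each affected file to the lines worth a closer look.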

We don’t anticipate any breaking changes in Phusion Passenger; however, if
you’d like to review some of the newest features, please see the
following link.

blog.phusion.nl/2023/06/12/passenger-6-0-18/

CS Staff is performing a basic review of each project website on the
upgraded web server, and /most/ sites appear to be in good working order.
We will contact site owners directly for sites with obvious compatibility
issues to advise on expected changes. However, as it is impossible for us
to review all possible aspects of your site, we strongly encourage you to
review your site after the upgrade on July 26 to ensure it is working as
expected, as well as review the PHP changes before the upgrade to
anticipate changes you may need to make.

Please note that the above changes apply ONLY to the project websites.
Personal (“tilde”) sites and any other content hosted under
“www.cs.princeton.edu” are not yet affected by this upgrade. If you are
concerned that your site may need substantial change and would like to
review it using the new web server before the upgrade, please contact
[csstaff at cs.princeton.edu] for assistance.

Why is it happening:
This is part of the routine maintenance of the web servers and will bring
newer versions of installed tools and software.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Ionic Cluster Downtime, Wednesday, July 26, 2023, 05:00-17:00

Date: Wednesday, July 26, 2023 (05:00-17:00)

Who is affected:
All users of the CS Department Beowulf high performance computing cluster,
known as ionic.

What is happening:
CS Staff will upgrade the ionic cluster to the latest Springdale 9
distribution. In addition, the cluster management and job scheduling system
Slurm and its database will be upgraded.

SPECIAL NOTE: As we are reloading the Linux servers, all local disk storage
will be wiped, thus resulting in a loss of any data stored in the /scratch
partition. If you have data in /scratch that needs to survive the reload,
please ensure it is copied somewhere safe before the start of the
maintenance.
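
One way to copy a directory out of /scratch is sketched below. This is a minimal sketch using Python's standard library, assuming Python 3.8 or later; the source path `/scratch/your_netid` and the destination under your home directory are hypothetical placeholders, and the function name `copy_scratch` is invented for this example. Choose a destination that is not on the node's local disk and has enough free space.

```python
import shutil
from pathlib import Path


def copy_scratch(src: Path, dest: Path) -> Path:
    """Copy the directory tree at src into dest/<src name> and return the new path."""
    target = dest / src.name
    # dirs_exist_ok=True lets the copy be re-run; files already copied are overwritten.
    shutil.copytree(src, target, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # Hypothetical paths: substitute your own scratch directory and destination.
    source = Path("/scratch/your_netid")
    destination = Path.home() / "scratch-backup"
    if source.is_dir():
        print(f"Copied to {copy_scratch(source, destination)}")
    else:
        print(f"Nothing to copy: {source} does not exist")
```

Because /scratch is local to each server, the copy has to be run on every node where you have data.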

Please note that the downtime window is significantly longer than our usual
windows due to the high-touch nature of OS reinstallations. We expect to
finish the upgrades earlier than this window, but the wide time frame
acknowledges the uncertainties involved.

Why is it happening:
This is part of the routine maintenance and will bring newer versions of
installed tools and software.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Cycles Downtime, Wednesday, July 26, 2023, 08:00-10:00

Date: Wednesday, July 26, 2023 (08:00-10:00)

Who is affected:
All users of the CS Staff-managed public login systems, including the
cycles, courselab, and armlab systems.

What is happening:
CS Staff will upgrade the user-accessible servers in our infrastructure,
including cycles, courselab, and armlab.
The systems will be upgraded to the latest Springdale 9 distribution for
the x86_64 architecture and RockyLinux 9 distribution for the aarch64
architecture (i.e., armlab).

To help ensure a smooth transition, we currently have the new distribution
installed on the following servers for your testing. Please keep in mind
that these servers are only reachable from inside the CS network.

cycles-test
courselab-test
armlab-test

SPECIAL NOTE: As we are reloading the Linux servers, all crontabs will be
deleted. If you have crontabs that you wish to persist, you will need to
back up your crontabs before the downtime and restore them after.
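
The backup and restore can be sketched as follows. This is a minimal sketch that wraps the standard `crontab -l` and `crontab <file>` commands; the function names and the backup file location are hypothetical, and it assumes the backup file is saved somewhere that survives the reload, such as your home directory. Crontabs are per host, so repeat the steps on each system where you have one.

```python
import subprocess
from pathlib import Path


def backup_crontab(dest: Path, run=subprocess.run) -> bool:
    """Write the output of `crontab -l` to dest. Return False if there is no crontab."""
    result = run(["crontab", "-l"], capture_output=True, text=True)
    if result.returncode != 0:
        # `crontab -l` exits non-zero when the user has no crontab installed.
        return False
    dest.write_text(result.stdout)
    return True


def restore_crontab(src: Path, run=subprocess.run) -> None:
    """Install the crontab saved in src, replacing any current one."""
    run(["crontab", str(src)], check=True)
```

Before the downtime, call `backup_crontab(Path.home() / "crontab.backup")` (a placeholder filename); after the downtime, call `restore_crontab` with the same path and confirm the result with `crontab -l`.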

Why is it happening:
This is part of the routine maintenance of the publicly-accessible systems
and will bring newer versions of installed tools and software.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Storage Maintenance, Wednesday, June 14, 2023, 08:15-10:15

Date: Wednesday, June 14, 2023 (08:15-10:15)

Who is affected:
Users of CS Department Disk Storage Facilities, including home directories,
project spaces, and web servers.

What is happening:
During this window, the CS Department’s primary storage cluster will have
its network relocated and upgraded.

NO OUTAGE is expected, but some users may experience brief pauses in
service or the need to disconnect and reconnect mounted filesystems.

Why is it happening:
This is part of a larger project upgrading the capacity of the CS
Department’s research network. This change will vastly increase the
available network bandwidth to the department’s central Isilon storage.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff
