Author name: csstaff

CS Storage Maintenance, Tuesday, July 1, 2025, 06:00-16:00

Date: Tuesday, July 1, 2025 (06:00-16:00)

Who is affected:
All users of the CS department storage or computing facilities.

What is happening:
We are upgrading our storage operating system, which requires the
components of the CS storage system to be rebooted. This upgrade is
expected to be mostly non-disruptive, and department services will continue
as usual, but sporadic moments of interruption may be noticeable at times.

All services that depend upon access to storage share the same minor risk
of performance hiccups for some periods during this window, including cycle
servers, ionic and neuronic clusters, web content, home directories, CIFS,
etc.

Why is it happening:
Our current operating system has reached end-of-life status and needs an
upgrade.

While we do not anticipate any extended service outages, you may find that
there are momentary interruptions, and some connections (especially
CIFS/SMB connections) may need to be reestablished.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Network Emergency Downtime, Monday, June 9, 2025, 07:30-08:00

Date: Monday, June 9, 2025 (07:30-08:00)

Who is affected:
All users of CS Department networked services, including email, web
service, and HPC computing.

What is happening:
Critical backbone switches will be upgraded to newer firmware and
rebooted.

We expect no lasting negative impacts from this maintenance.

Why is it happening:
Firmware upgrades are necessary on a few devices to address bugs that are
slowing progress on our Summer project list.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


Brief CS Email Outage, Thursday, March 27, 2025, 07:00-08:00

Date: Thursday, March 27, 2025 (07:00-08:00)

Who is affected:
Users of CS Department IMAP/POP/Webmail/SMTP services.

What is happening:
Sometime during this window, CS email services will experience a brief
outage while the underlying LDAP service is reconfigured. The actual outage
time is expected to be less than 5 minutes, but the larger window is
scheduled to allow for the unexpected.

Why is it happening:
This outage will allow us to begin the longer process of updating our
Zimbra mail servers to the latest available software release. This is a
necessary first step that is intended to avert a much longer outage later
in the process, by enabling a live upgrade of the rest of the system.

No visible changes are anticipated until after the end of the semester, but
minor systems will be updated before that time to prepare for the major
upgrade in the Summer.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Cycles/Ionic Downtime, Tuesday, March 11, 2025, 07:00-12:00

Date: Tuesday, March 11, 2025 (07:00-12:00)

Who is affected:
All users of the CS Department Beowulf high performance computing cluster
known as ionic.

All users of the CS Staff-managed public login systems, including the
cycles, courselab, and armlab systems.

What is happening:
CS Staff will upgrade the ionic cluster, as well as the cycles, courselab,
and armlab systems, to the latest Red Hat 9 distribution.

Additionally, MATLAB configurations will be updated. Please review the CS
Guide for new instructions:
csguide.cs.princeton.edu/software/matlab

Why is it happening:
This is part of routine maintenance and will bring newer versions of
installed tools and software.

The MATLAB changes will allow us to make multiple versions of the software
available.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Cycles/Ionic/Neuronic System Downtime, Tuesday, January 7, 2025, 07:00-15:00

Date: Tuesday, January 7, 2025 (07:00-15:00)

Who is affected:
All users of the CS Department Beowulf high performance computing clusters,
known as ionic and neuronic.

All users of the CS Staff-managed public login systems, including the
cycles, courselab, and armlab systems.

What is happening:
Ionic and neuronic nodes will have their NVIDIA, CUDA, and kernel drivers
updated to fix GPU-related failures. In addition, the Slurm cluster
management and job scheduling system and its database will be upgraded. No
data loss is anticipated. After the upgrade, machines will be rebooted.

Cycles, courselab, and armlab machines will be rebooted during this window
to clear some defunct user processes interfering with research work.

Why is it happening:
Ionic nodes are experiencing various GPU-related failures. To address these
problems, we will be updating Nvidia, CUDA, and kernel modules.

Additionally, some user processes have entered a defunct state, hindering
research activities. To resolve this, a system reboot is necessary to clear
these processes.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Cycles/Ionic/Neuronic System Downtime, Tuesday, July 2, 2024, 06:00-17:00

Date: Tuesday, July 2, 2024 (06:00-17:00)

Who is affected:
All users of the CS Department Beowulf high performance computing clusters,
known as ionic and neuronic.

All users of the CS Staff-managed public login systems, including the
cycles, courselab, and armlab systems.

What is happening:
During this window, all CS-managed systems (cycles, ionic, neuronic,
courselab, and armlab) will be upgraded to the latest Red Hat release, 9.4.
In addition, the Slurm cluster management and job scheduling system and its
database will be upgraded. No data loss is anticipated.

SPECIAL NOTE:
As we are reloading the Linux servers, all crontabs will be deleted. If you
have crontabs that you wish to persist, you will need to back up your
crontabs before the downtime, and restore them after.

In addition, all local disk storage will be wiped, thus resulting in a loss
of any data stored in the /scratch partition. If you have data in /scratch
that needs to survive the reload, please ensure it is copied somewhere safe
before the start of the maintenance.
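
The crontab backup and restore described in this note could be scripted
roughly as follows. This is a minimal sketch, not a CS Staff tool: the
function names and the backup file name are hypothetical, and it assumes the
standard "crontab -l" and "crontab FILE" commands and a home directory that
is not on the local disk being wiped.

```python
import subprocess
from pathlib import Path


def backup_crontab(dest: Path, run=subprocess.run) -> bool:
    """Save the current user's crontab to dest; return False if there is none."""
    result = run(["crontab", "-l"], capture_output=True, text=True)
    if result.returncode != 0:
        # "crontab -l" exits non-zero when no crontab is installed.
        return False
    dest.write_text(result.stdout)
    return True


def restore_crontab(src: Path, run=subprocess.run) -> None:
    """Install the crontab saved in src, replacing any existing one."""
    run(["crontab", str(src)], check=True)


if __name__ == "__main__":
    # Hypothetical location; any path that survives the reload will do.
    backup_file = Path.home() / "crontab.backup"
    if backup_crontab(backup_file):
        print(f"Saved crontab to {backup_file}")
    else:
        print("No crontab to save")
```

Run the backup before the downtime begins, then call restore_crontab with
the same path once the systems are back.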

Why is it happening:
This is part of regular maintenance to keep systems up-to-date.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


[rescheduled] CS Database Downtime, Monday, January 8, 2024, 07:00-10:00

Due to an unforeseen scheduling conflict, this downtime, previously
announced for Tuesday, January 9, is being moved one day earlier to Monday,
January 8, 2024.

Please contact CS Staff if this change causes you undue hardship.

Thank you,
CS Staff

----- Original Message -----
From: "csstaff" <csstaff@cs.princeton.edu>
To: "downtime" <downtime@lists.cs.princeton.edu>
Sent: Wednesday, December 20, 2023 11:10:35 AM
Subject: [downtime] CS Database Downtime, Tuesday, January 9, 2024, 07:00-10:00

Date: Tuesday, January 9, 2024 (07:00-10:00)

Who is affected:
All users of the CS Department "publicdb" database server, including
any dependent web properties, and all users of the CS Department Beowulf
high-performance computing cluster known as ionic.

All users of CS Department administrative web properties (Dropbox, CS
Guide, the Main website, etc.)

What is happening:
During this window, the "publicdb" database server will be replaced
with a newer server. All existing MariaDB databases will be migrated to the
new server, so no data loss is anticipated. However, while Slurm jobs will
continue, new jobs cannot start during the migration.

In addition, the database server underlying the administrative systems
will be upgraded and replaced. During the upgrade, all database-dependent
administrative systems will be unavailable. This includes the CS Dropbox
service, the main website, the CS Guide, ADM, and any content feeds
provided by CS Staff.

Why is it happening:
The old servers running MariaDB 10.1.24 will be replaced by newer ones
running MariaDB 10.5.22.

The phpMyAdmin web interface will be upgraded from version 4.4.14 to 5.2.1.

This is part of regular maintenance to enhance system performance and
security.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Database Downtime, Tuesday, January 9, 2024, 07:00-10:00

Date: Tuesday, January 9, 2024 (07:00-10:00)

Who is affected:
All users of the CS Department "publicdb" database server, including
any dependent web properties, and all users of the CS Department Beowulf
high-performance computing cluster known as ionic.

All users of CS Department administrative web properties (Dropbox, CS
Guide, the Main website, etc.)

What is happening:
During this window, the "publicdb" database server will be replaced
with a newer server. All existing MariaDB databases will be migrated to the
new server, so no data loss is anticipated. However, while Slurm jobs will
continue, new jobs cannot start during the migration.

In addition, the database server underlying the administrative systems
will be upgraded and replaced. During the upgrade, all database-dependent
administrative systems will be unavailable. This includes the CS Dropbox
service, the main website, the CS Guide, ADM, and any content feeds
provided by CS Staff.

Why is it happening:
The old servers running MariaDB 10.1.24 will be replaced by newer ones
running MariaDB 10.5.22.

The phpMyAdmin web interface will be upgraded from version 4.4.14 to 5.2.1.

This is part of regular maintenance to enhance system performance and
security.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Ionic Cluster Downtime, Thursday, October 19, 2023, 07:30-09:30

Date: Thursday, October 19, 2023 (07:30-09:30)

Who is affected:
All users of the CS Department Beowulf high performance computing cluster,
known as ionic.

What is happening:
CS Staff will upgrade the Slurm cluster management and job scheduling
system and its database. No reboot will be necessary, so we expect to
finish before the end of this window. The wide time frame allows for
unexpected complications.

Why is it happening:
The upgrade is necessary to patch against an urgent security bug.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff


CS Storage System Downtime, Tuesday, August 29, 2023, 07:30-10:30

Date: Tuesday, August 29, 2023 (07:30-10:30)

Who is affected:
All users of CS Department storage systems, including project spaces, home
directories, and web spaces.

What is happening:
The central file storage cluster will be rebooted a few times during this
window in order to facilitate physical upgrades. During the reboots, file
services will be interrupted, but will resume after the cluster finishes
its boot. This will affect access to the cycles login hosts, CS Department
web services, and CS Department SMB/CIFS services. Email services should
not be affected.

Outages are not expected to span the full 3-hour window, but may occur
sporadically during this period.

A reservation has been placed on the ionic cluster to hold all jobs that
would overlap with this maintenance. Jobs will automatically start again
after completion of the work.
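
To judge whether one of your own jobs would be held, the interval check
implied above can be sketched as follows. This is only an illustration of
the overlap test, not Slurm's scheduler logic: the function name is
hypothetical, and it assumes a job is held when its start time plus its
requested time limit would run into the maintenance window.

```python
from datetime import datetime, timedelta

# Maintenance window taken from this announcement.
WINDOW_START = datetime(2023, 8, 29, 7, 30)
WINDOW_END = datetime(2023, 8, 29, 10, 30)


def overlaps_window(start: datetime, time_limit: timedelta) -> bool:
    """True if a job running from start for time_limit intersects the window."""
    end = start + time_limit
    return start < WINDOW_END and end > WINDOW_START


if __name__ == "__main__":
    evening_before = datetime(2023, 8, 28, 22, 0)
    # A 12-hour limit would run until 10:00 and overlap the window.
    print(overlaps_window(evening_before, timedelta(hours=12)))
    # An 8-hour limit would end at 06:00, before the window opens.
    print(overlaps_window(evening_before, timedelta(hours=8)))
```

Under this assumption, requesting a shorter time limit lets a job finish
before the window instead of waiting for the maintenance to complete.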

Why is it happening:
The storage cluster will be upgraded during this window. The upgrades will
modernize the backend network of the cluster, as well as add new all-flash
nodes to speed up front-end operations.

This, combined with recent front-end network upgrades, will continue the
improvements to the cluster necessary to prepare for the arrival of the new
SEAS HPC cluster that will be hosted in CS.

We will post updates to the status page: www.csstaff.org
as necessary.

If this downtime will cause you undue hardship, please contact
csstaff@cs.princeton.edu immediately, so we can discuss options to reduce
any negative impact. Your patience is appreciated.

Sincerely,
CS Staff

