System Status

All Systems Active

Power Maintenance Thursday, June 13th from 6PM to Midnight

Monday, June 10, 2013 - 10:18am

This Thursday, June 13th from 6PM to Midnight, power will be turned off at the Physical Plant Services Building (PPSB) to upgrade the primary feed to the main transformer. CARLI Production servers are in the PPSB Data Center and we will be cutting over to generator power at around 5:30PM to prepare for the power outage.

Campus staff are verifying that the network equipment that connects the PPSB Data Center to the Internet is on generator power and/or UPS. Assuming all networking equipment continues to receive power, there should be no interruption to CARLI services during this maintenance.

Brandon Gant
CARLI System Services

Information on the Network Outage this Morning

Saturday, June 8, 2013 - 5:25pm

This morning at 6:40AM, campus networking staff applied firmware upgrades to various network switches on campus, including those in the CARLI Production Data Center at UIUC. The upgrade failed on our redundant 10Gb networking equipment. Networking staff were able to get the network back online after a few hours, but without redundancy. At 12:10PM, they backed out the firmware updates on our data center switches, which took services offline for another half hour but restored the network to full redundancy. We also had problems with two Sun servers that took their 10Gb networking cards offline each time the switch firmware changed, and those servers had to be rebooted.

Here are the "After Incident" items for us to work on:

1) Campus networking staff are looking into why an announcement did not go out before these changes were made.

2) I will work with networking staff to understand why both sides of our redundant network went offline during this maintenance and see what we can do to avoid this in the future.

3) I need to have a second copy of our Mailing List server running in the CARLI Disaster Recovery Data Center at UIS to make sure we can send out mass e-mails during an outage.

4) The two Sun servers that had problems are already scheduled to be replaced this summer.

Brandon Gant
CARLI System Services

VuFind Local Catalog outage this morning

Friday, April 19, 2013 - 10:40am

At 10:10AM today, the server running the SOLR index containing the VuFind Local catalogs stopped responding. A Java process was consuming all CPU, and the server was restarted to restore service. It looks like the service was offline for 12 minutes.

This has happened before and we think it has something to do with Java Garbage Collection (which happens at random intervals). The solution may be to schedule restarts of this server on a regular basis in the middle of the night.

Brandon Gant
CARLI System Services
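If scheduled restarts turn out to be the fix, a scheduler such as cron would normally handle them. As a minimal sketch of the timing logic only, the function below works out when the next nightly restart would fall. The 3AM restart time and the name `next_restart` are assumptions for illustration, not a decision that has been made.

```python
from datetime import datetime, time, timedelta


def next_restart(now, restart_at=time(3, 0)):
    """Return the next datetime at which the nightly restart should run.

    restart_at defaults to 3AM, a hypothetical "middle of the night" slot.
    """
    candidate = datetime.combine(now.date(), restart_at)
    # If today's slot has already passed, use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# Example: evaluated at the time of this post, the next slot is the following morning.
print(next_restart(datetime(2013, 4, 19, 10, 40)))  # 2013-04-20 03:00:00
```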

VuFind Local Catalog Maintenance Tonight

Tuesday, March 5, 2013 - 11:09am

On Sunday morning, we moved a rebuilt SOLR index of Local Catalogs to the VuFind server. That index has not updated for the past two nights. After 10PM tonight, we will copy the index over again. This will require a reboot of the server, which will take this service offline for a few minutes.

Brandon Gant
CARLI System Services

VuFind Problem 02/07

Thursday, February 7, 2013 - 8:47pm

According to monitoring, the local catalog SOLR index for VuFind went offline at 7:16PM. This caused searches in all local catalogs to display a connection timeout error message ("Data not available for display"). That server has been rebooted and the system appears to be working again.

Brandon Gant
CARLI System Services
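For readers curious what a check like this can look like, here is a minimal sketch of an HTTP probe that treats a timeout or connection failure as "offline". It is not the monitoring actually in use; the function name, the 10-second timeout, and any URL passed to it are assumptions for illustration.

```python
import urllib.request


def index_is_responding(url, timeout_seconds=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200 within the timeout.

    opener is injectable so the check can be exercised without a live server.
    """
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts and refused connections
        # are all subclasses of OSError.
        return False
```

A monitoring job could call this every few minutes against the index's search URL and raise an alert after a failed check.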

Voyager Connection Problems on Tuesday, Jan 29th

Wednesday, January 30, 2013 - 4:37pm

Yesterday (January 29th) at 12:00PM and again at 1:30PM, client connections to Voyager were dropped. The Production CARLI servers are at the University of Illinois at Urbana-Champaign. The campus network received a large amount of external network traffic which caused one of the firewalls to drop all existing external connections to campus. They worked with the firewall vendor and resolved the problem by 5PM.

Brandon Gant
CARLI System Services

Reports Server Updates

Monday, January 14, 2013 - 12:21pm

Right now, the Reports Server is a copy of Production from Saturday (before we took it offline). We are working on getting the Reports Server back in sync with Production and it should be caught up by this afternoon.

Brandon Gant
CARLI System Services

Reports Server Online

Monday, January 14, 2013 - 8:09am

The data transfer is complete and the server has been booted up on the new disk array. We still need to do some verifications on the data today and change a few more cables, but that should not require more downtime.

Brandon Gant
CARLI System Services

Extended Downtime for Reports Server

Sunday, January 13, 2013 - 10:51am

After 10 hours, the data copy process is about 25% complete. At this rate, the Reports Server should be back online Tuesday, January 15th. Sorry about the additional downtime, but we need to get this data migrated off of the older Sun 6140 hardware.

Brandon Gant
CARLI System Services
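For reference, the arithmetic behind a progress estimate like this can be sketched as a straight-line extrapolation. The sketch assumes the copy rate stays constant and takes the timestamp of this post as "now"; the function name is hypothetical. The Tuesday estimate given above is more conservative than the straight-line figure.

```python
from datetime import datetime, timedelta


def estimate_completion(now, elapsed, fraction_done):
    """Extrapolate a finish time, assuming a constant copy rate."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total = elapsed / fraction_done  # 10 hours / 0.25 = 40 hours
    remaining = total - elapsed      # 30 hours still to go
    return now + remaining


eta = estimate_completion(
    now=datetime(2013, 1, 13, 10, 51),  # timestamp of this post
    elapsed=timedelta(hours=10),
    fraction_done=0.25,
)
print(eta)  # 2013-01-14 16:51:00
```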

Reports Server Reboot

Saturday, January 12, 2013 - 1:06pm

Tonight at midnight, I will bring the Reports Server down and kick off a script to migrate data from the current Sun 6140 disk array to our new Dell Compellent disk array. To be ready for tonight, I need to at least reboot the Reports Server so that it will apply kernel changes to see the new disk volumes. I will start the reboot process in the next few minutes.

Brandon Gant
CARLI