Downtime Sunday, August 10th for Patching
Friday, August 1, 2014 - 4:48pm
Sunday, August 10th from 6AM to 10AM, Production Voyager, VuFind, CONTENTdm, and SFX services will be taken offline to apply operating system patches. Downtime for each service should be less than an hour.
If you discover any issues with the services after 10AM, please contact email@example.com.
Network Outage this Saturday, July 12th
Monday, July 7, 2014 - 3:51pm
This Saturday, July 12th between 4AM and Noon, UIUC campus networking staff (CITES) will replace the backbone routers connected to the Production CARLI Data Center.
For some period of time during this upgrade, all CARLI services could be unavailable.
Information about this network event, any status updates, and an announcement that work has been completed can be found on the following website:
After the upgrade, please contact firstname.lastname@example.org if you notice that any CARLI services are not working properly. We will be checking systems and restarting any that were impacted by the network outage. If there are any ongoing issues after the work is completed, they will be posted to http://www.carli.illinois.edu/system-status.
Network Switch Failure
Friday, June 6, 2014 - 9:04pm
At approximately 6:19PM today (Friday, June 6th), one of two redundant network switches in the Production Data Center went into an odd state. All lights were on, but no traffic was moving in the switch. CARLI servers are plugged into both switches for redundancy, but the failure took down all networking. The failed switch was restarted and all services were back online at 6:54PM.
We will need to look closely at the log files to see if we can determine what caused the network switch error and why the redundancy did not protect us from this error.
UIUC Database Outage April 15
Tuesday, April 15, 2014 - 4:28pm
At 10:51AM this morning, our Database Administrator changed the Oracle password for the University of Illinois at Urbana-Champaign (UIUC) database account to run some tests. He thought he was logged into the Oracle Test server, but he was actually logged into the Production Oracle server. This caused Voyager client errors, forced UIUC circulation clients into "offline circ" mode, displayed a "The catalog is not available" message in UIUC's WebVoyage instance, and blocked UIUC's VuFind access. The problem was corrected and UIUC Voyager services were brought back online at 11:12AM and VuFind at 11:29AM.
I apologize for this outage. At our next IT staff meeting we will discuss ways to prevent this type of error from happening again.
Oracle security changes caused Voyager and VuFind problems
Monday, April 14, 2014 - 4:08pm
At 9:30AM yesterday morning (Sunday, April 13th) we made a change to Production Oracle to enhance our database security. This change caused problems in VuFind, so it was backed out by 10AM Sunday morning. The change did not cause any issues in our Test server environment. We have identified what is different between Production and Test and are working to make sure they are identical for future testing.
The work on Sunday also introduced a permissions conflict on some database tables. The effect was that some of our weekend batch jobs did not run properly and will need to be submitted again. Some libraries were also not able to save changes to records or received errors in their Voyager clients. We identified this problem and corrected it at 10AM this morning (Monday, April 14th).
Hopefully we have identified the issues surrounding this security change, but we will wait until Spring semester classes have ended before applying it to Production Oracle again.
No Heartbleed on CARLI Servers
Friday, April 11, 2014 - 4:31pm
The Heartbleed bug in OpenSSL has been all over the news this week. It is a serious enough problem that it even has its own website (www.heartbleed.com). We scanned our systems and we did not find this problem on any of them, so there is no need to worry about changing passwords on CARLI systems at this time.
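As a rough illustration of one kind of check a scan can include, the sketch below flags OpenSSL release strings in the range publicly reported as affected by Heartbleed (1.0.1 through 1.0.1f). The function name is hypothetical, the 1.0.2 beta releases are not covered, and this is not a description of how CARLI scanned its systems. A version string alone is not conclusive, because some vendors backport fixes without changing the version.

```python
import re

# Upstream OpenSSL releases 1.0.1 through 1.0.1f were affected; 1.0.1g carried the fix.
_AFFECTED_RELEASE = re.compile(r"1\.0\.1[a-f]?")


def is_heartbleed_affected_release(version: str) -> bool:
    """Return True if an upstream OpenSSL version string is in the affected range.

    Hypothetical helper for illustration only. It inspects the version string
    and does not probe a running service.
    """
    return _AFFECTED_RELEASE.fullmatch(version.strip()) is not None
```

For example, `is_heartbleed_affected_release("1.0.1e")` is true, while `"1.0.1g"` and `"0.9.8y"` are not flagged.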
We are always looking for ways to improve the performance and security of our services. For example, we were already planning to make some changes to our web servers this summer to improve the strength of our SSL connections (newer versions, better ciphers, Perfect Forward Secrecy). If you have suggestions for improving our services, please contact us at email@example.com.
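To make "newer versions, better ciphers, Perfect Forward Secrecy" concrete, here is a minimal sketch using Python's standard `ssl` module (Python 3.7 or later). It is only an illustration of the idea, not the configuration of CARLI's web servers. The minimum protocol version and the cipher string are assumptions chosen for the example.

```python
import ssl


def build_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context with a protocol floor and forward-secret ciphers."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # "Newer versions": refuse anything older than TLS 1.2 (assumed floor).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # "Better ciphers" with forward secrecy: ECDHE key exchange plus AES-GCM.
    # This restricts the TLS 1.2 suites; TLS 1.3 suites are managed separately
    # by OpenSSL and always use ephemeral key exchange.
    context.set_ciphers("ECDHE+AESGCM")
    return context
```

A real server would also load its certificate and private key with `context.load_cert_chain(...)` before accepting connections.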
UIUC DNS Outage Today (April 4th)
Friday, April 4, 2014 - 6:03pm
The UIUC Domain Name Service (DNS) went offline at approximately 3:06PM today and campus networking staff report that it was brought back online at 3:42PM. The DNS service translates human-friendly names (e.g. voyager.carli.illinois.edu) into computer-friendly addresses (e.g. 184.108.40.206).
Without this service, some CARLI servers are unable to look up addresses so that they can talk to other CARLI servers. The campus Voice-over-IP phone service was also impacted by this outage, which resulted in busy signals when calling our office.
The UIUC campus maintains three DNS servers for redundancy: two located in Urbana and one in Chicago. A bad configuration was replicated across all three servers simultaneously, causing them to go offline. The DNS manager has provided campus IT staff with a list of the issues that occurred today and the changes they will make to avoid these issues in the future. If needed, CARLI IT staff also have the option to set up our own caching DNS servers that we can use along with the UIUC DNS servers.
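The idea behind a caching resolver can be sketched in a few lines of Python. This is an in-process illustration only, not a DNS server and not anything CARLI has deployed. The class name, the 300-second lifetime, and the choice to fall back to an expired answer when the upstream lookup fails are all assumptions made for the example.

```python
import socket
import time
from typing import Callable, Dict, Tuple


def system_lookup(name: str) -> str:
    """Resolve a host name to an IPv4 address using the system resolver."""
    return socket.gethostbyname(name)


class CachingResolver:
    """Hypothetical cache placed in front of an upstream lookup function."""

    def __init__(self,
                 lookup: Callable[[str], str] = system_lookup,
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps host name -> (address, time the answer was stored).
        self._cache: Dict[str, Tuple[str, float]] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # still fresh: answer without asking upstream
        try:
            address = self._lookup(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            if cached is not None:
                return cached[0]  # upstream is down: serve the expired answer
            raise
        self._cache[name] = (address, now)
        return address
```

With a cache like this, names that were looked up before an outage keep resolving while the upstream servers are offline. Names that were never cached still fail, so a cache reduces the impact of an outage rather than removing it.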
RESOLVED: ILDS label problems when using Internet Explorer
Thursday, February 13, 2014 - 8:41am
The issue with Internet Explorer has been resolved. Users should now be able to create labels on the ILDS website when using the Internet Explorer browser.
Description of the problem that has been resolved: A problem creating ILDS labels when using Internet Explorer. If you experience problems with the ILDS website, please try using Firefox or another browser to create labels. CARLI staff are investigating the problem.
SSL Certificate Expiration - Update
Tuesday, January 21, 2014 - 8:42am
As of 10:00 pm on Tuesday, January 21, the SSL Certificate has been renewed and users should no longer see an "expired certificate" warning when accessing online catalogs.
Analysis of the Voyager Issues Thursday Morning (11/07/13)
Friday, November 8, 2013 - 10:45am
Yesterday morning (Thursday, Nov 7th) at around 4:45AM, the disk space on the Voyager server reached 100% usage. All services were still online, but the full disk prevented requests in VuFind and WebVoyage, blocked record updates in the Voyager Staff Clients, and caused other issues. Two problems combined to create this service outage:
First, a non-MARC file was manually generated to diagnose a record loading issue. An automated batch job retrieved this file and began working on it. We have modified the batch job to only retrieve files with a very specific file name structure.
Second, the automated batch job tried to parse the non-MARC data as if it were MARC data. This caused the job to go into an infinite loop writing data to disk (which filled up the disk space). We have modified this batch job to ignore any non-MARC data it may encounter in the future.
Once we understood the problem and cleaned up files to free up disk space, the fastest way to restore service was to reboot the Voyager server.
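The two safeguards described above can be sketched as follows. This is an illustration under stated assumptions, not the actual batch job: the file name pattern and both function names are hypothetical, and the sketch stops at the first chunk that does not look like MARC rather than skipping past it. The validity check relies only on the standard MARC record layout, in which each record starts with a 24-byte leader whose first five characters give the record length in digits, and ends with the record terminator byte 0x1D.

```python
import re
from typing import Iterator

# Hypothetical naming convention; the real pattern is not given in this post.
MARC_FILE_NAME = re.compile(r"[a-z]{3}_bib_\d{8}\.mrc")
LEADER_LENGTH = 24
RECORD_TERMINATOR = 0x1D


def should_retrieve(file_name: str) -> bool:
    """Safeguard 1: only pick up files whose names match the expected structure."""
    return MARC_FILE_NAME.fullmatch(file_name) is not None


def iter_marc_records(data: bytes) -> Iterator[bytes]:
    """Safeguard 2: yield raw records and stop at the first chunk that is not MARC."""
    offset = 0
    while offset < len(data):
        length_field = data[offset:offset + 5]
        if len(length_field) < 5 or not length_field.isdigit():
            return  # leader does not start with a 5-digit length: not MARC
        length = int(length_field)
        end = offset + length
        if length <= LEADER_LENGTH or end > len(data):
            return  # impossible length: stop instead of guessing
        if data[end - 1] != RECORD_TERMINATOR:
            return  # record does not end where its leader says it should
        yield data[offset:end]
        # Each accepted record is longer than 24 bytes, so the offset always
        # moves forward and the loop cannot spin in place.
        offset = end
```

Given a plain-text diagnostic file, `iter_marc_records` yields nothing and returns immediately, so nothing is written to disk on its behalf.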