VuFind

Additional info on the Universal Catalog outage

This morning at 2:53AM, the Universal Catalog database stopped responding. The new Oracle version has a parameter that limits the amount of data that can be written to Archive Logs to 10GB. We have increased that setting for all databases and will monitor it to see if it should be increased further. After making that change, the Universal Catalog started working again at 8:53AM.
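
For readers who want to watch a similar limit on their own systems, the following is a minimal sketch of the kind of threshold check that "monitoring" could involve. It is not the check CARLI runs. The class name, function name, and the 80% warning level are assumptions made for illustration, and the byte counts would have to come from whatever query or report your DBA uses.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes in one gibibyte


@dataclass
class ArchiveLogUsage:
    """Hypothetical snapshot of archive log space: configured limit and bytes used."""
    limit_bytes: int
    used_bytes: int

    @property
    def fraction_used(self) -> float:
        if self.limit_bytes <= 0:
            raise ValueError("limit_bytes must be positive")
        return self.used_bytes / self.limit_bytes


def check_usage(usage: ArchiveLogUsage, warn_at: float = 0.80) -> str:
    """Classify usage as OK, WARN, or FULL (warn_at is an assumed threshold)."""
    fraction = usage.fraction_used
    if fraction >= 1.0:
        return "FULL"
    if fraction >= warn_at:
        return "WARN"
    return "OK"


if __name__ == "__main__":
    # Example values only: a 10GB limit with 9GB already written.
    snapshot = ArchiveLogUsage(limit_bytes=10 * GIB, used_bytes=9 * GIB)
    print(check_usage(snapshot))
```

A check like this only helps if it runs often enough to raise the warning before the limit is reached, so the interval matters as much as the threshold.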

Resolved: Oracle Issues Affecting I-Share 6/17/2015

This issue has been resolved. If you start seeing problems with holdings displays in I-Share, please email support@carli.illinois.edu.

Oracle Issues Affecting I-Share 6/17/2015

This morning we are seeing Oracle issues affecting the display of holdings in I-Share. Searching is not affected. We are actively investigating this issue.

All Systems Back Online

A change made at 5PM had unintended consequences that didn't really manifest until about 7PM tonight. There were so many processes backed up waiting to run that we eventually had to take most systems offline to implement a fix.

We still have more tuning that needs to be done, but hopefully those changes will be scheduled on Sunday mornings from now on.

We have also put in place the threshold changes we were planning to make at 11PM, so there will be no need for another outage tonight.

Brandon Gant
CARLI

Oracle Performance Problems

We see some serious performance problems that are affecting Voyager, VuFind, and WebVoyage.

We have started investigating and will attempt to determine the cause and stabilize the system.

I-Share Voyager Down for Software Upgrade

I-Share Voyager and its VuFind catalog interface are down this weekend (June 12-15) while CARLI staff upgrade the software to Voyager version 9.1.1.

VuFind Maintenance this Sunday Morning (04/04)

This Sunday, April 4th between 6AM and 10AM, the VuFind service (http://vufind.carli.illinois.edu) will experience multiple outages while we upgrade three Linux servers running the SOLR indexes and MySQL database.

If you encounter any problems after 10AM, please report them to support@carli.illinois.edu.

Brandon Gant
CARLI

Voyager 9 Upgrade

Voyager (staff clients, VuFind, and WebVoyage) will be unavailable from 5PM Friday, June 12 through 8AM Monday, June 15 for a system upgrade. See the Upgrade Webpage for more detail.

VuFind Outage, Saturday, Jan 31

On Saturday, January 31st, at 11:53 AM we received reports of VuFind not responding. CARLI System Services staff investigated the problem and determined that it was due to one of VuFind's services exhausting its memory resources. More memory was added and VuFind returned to normal operation by 3:30 PM.

Todd Pavlik
CARLI

Emergency Maintenance Complete

All Production CARLI Linux servers have been patched and rebooted.

Brandon Gant
CARLI

Emergency Maintenance Starting

Due to a new Linux glibc exploit called "GHOST" that could be a serious threat to the security of our systems, Production CARLI servers/services will now start going offline to be patched and rebooted.

I will post another update when this work has been completed. If a service is offline or not working properly after emergency maintenance is completed, please contact support@carli.illinois.edu.

Brandon Gant
CARLI

VuFind Performance this Morning (Monday, Jan 12)

This morning at around 9:20AM, the Production http://vufind.carli.illinois.edu server stopped responding. We tweaked the Apache 2.4 conf settings and it was running smoothly again by 9:50AM.

Our apache2.conf file contains performance tuning parameters, and we discovered that the new Apache 2.4 has separate conf files that set those parameters back to their defaults. We need to make a few more changes to the Apache conf files to clean them up, but I don't see any other issues with the operating system upgrade.
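
As an illustration of how this kind of override can be spotted, here is a rough sketch that scans conf file contents for the same directive set in more than one file. It is a simplification, not the procedure we followed: it ignores Apache's container scoping rules and simply assumes the files are passed in the order Apache reads them, so the last value listed is the one most likely in effect. The file names and values in the example are hypothetical. MaxRequestWorkers and KeepAliveTimeout are real Apache 2.4 directives chosen as examples.

```python
import re

# A directive line is a name followed by its arguments; names are case-insensitive.
DIRECTIVE_RE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9]*)\s+(.+?)\s*$")


def find_settings(files, directives):
    """Return {directive: [(file_name, value), ...]} in the order files are given.

    files is a list of (file_name, file_text) pairs, assumed to be in the
    order Apache reads them. Container scoping (<IfModule>, etc.) is ignored.
    """
    wanted = {d.lower(): d for d in directives}
    found = {d: [] for d in directives}
    for name, text in files:
        for line in text.splitlines():
            stripped = line.strip()
            # Skip blanks, comments, and container open/close lines.
            if not stripped or stripped.startswith(("#", "<")):
                continue
            match = DIRECTIVE_RE.match(line)
            if match and match.group(1).lower() in wanted:
                found[wanted[match.group(1).lower()]].append((name, match.group(2)))
    return found


def overridden(found):
    """Keep only directives that are set in more than one file."""
    return {d: hits for d, hits in found.items()
            if len({file_name for file_name, _ in hits}) > 1}


if __name__ == "__main__":
    # Hypothetical file contents for illustration.
    files = [
        ("apache2.conf", "MaxRequestWorkers 400\nKeepAliveTimeout 2\n"),
        ("mods-enabled/mpm_prefork.conf",
         "<IfModule mpm_prefork_module>\n  MaxRequestWorkers 150\n</IfModule>\n"),
    ]
    for directive, hits in overridden(
            find_settings(files, ["MaxRequestWorkers", "KeepAliveTimeout"])).items():
        print(directive, hits)
```

Any directive reported by overridden() deserves a manual look, since the tuned value in one file may be silently replaced by a default in another.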

VuFind Downtime Sunday Morning Jan 11

This Sunday morning between 6AM and 10AM, we will upgrade the Production VuFind Apache/Linux server. This will cause an outage of at most 30 minutes while the operating system is being upgraded.

We are currently running Ubuntu Server 10.04 64-bit, and support for this release ends in April 2015. We are upgrading this server to 12.04, then to 14.04, which is supported until April 2019. We will also take this opportunity to improve our HTTPS encryption settings in Apache by using the recommendations from https://cipherli.st and by verifying the new settings at https://ssllabs.com.

VuFind Local Catalog Outages

At 7:14AM this morning (Monday, December 8th), the SOLR search index for VuFind "local" catalogs failed (there is a separate SOLR index that handles deduplicated "consortial" searches). The Java process controlling the index was pinned at 100% CPU and no requests were being processed. At 7:34AM, the SOLR service was restarted, which restored service. At 11:22AM the service went offline again and was restarted.

Downtime Sunday, August 10th for Patching

On Sunday, August 10th, from 6AM to 10AM, Production Voyager, VuFind, CONTENTdm, and SFX services will be taken offline to apply operating system patches. Downtime for each service should be less than an hour.

If you discover any issues with the services after 10AM, please contact support@carli.illinois.edu.

Brandon Gant
CARLI

Network Outage this Saturday, July 12th

This Saturday, July 12th, between 4AM and noon, UIUC campus networking staff (CITES) will replace the backbone routers connected to the Production CARLI Data Center.

For some period of time during this upgrade, all CARLI services could be unavailable.

Information about this network event, any status updates, and an announcement that work has been completed can be found on the following website:
http://status.cites.uiuc.edu/SystemStatus/jsp/view_events.jsp?eventId=427

Network Switch Failure

At approximately 6:19PM today (Friday, June 6th), one of two redundant network switches in the Production Data Center went into an odd state. All lights were on, but no traffic was moving in the switch. CARLI servers are plugged into both switches for redundancy, but the failure took down all networking. The failed switch was restarted and all services were back online at 6:54PM.

We will need to look closely at the log files to see if we can determine what caused the network switch error and why the redundancy did not protect us from this error.

Brandon Gant
CARLI

UIUC Database Outage April 15

At 10:51AM this morning, our Database Administrator changed the Oracle password for the University of Illinois at Urbana-Champaign (UIUC) database account to run some tests. He thought he was logged into the Oracle Test server, but he was actually logged into the Production Oracle server. The password change caused Voyager client errors, forced UIUC circulation clients into "offline circ" mode, displayed a "The catalog is not available" message in UIUC's WebVoyage instance, and blocked UIUC's VuFind access.

Oracle security changes caused Voyager and VuFind problems

At 9:30AM yesterday morning (Sunday, April 13th) we made a change to Production Oracle to enhance our database security. This change caused problems in VuFind, so it was backed out by 10AM Sunday morning. The change did not cause any issues in our Test server environment. We have identified what is different between Production and Test and are working to make sure they are identical for future testing.

No Heartbleed on CARLI Servers

The Heartbleed bug in OpenSSL has been all over the news this week. It is a serious enough problem that it even has its own website (www.heartbleed.com). We scanned our systems and we did not find this problem on any of them, so there is no need to worry about changing passwords on CARLI systems at this time.
