Introduction
A leap second is a one-second adjustment that is occasionally applied to UTC (Coordinated Universal Time) in order to keep its time of day close to the mean solar time. The most recent leap second insertion happened on June 30, 2012 at 23:59:60 and caused disruptions in computing systems all around the world. Check out Wired’s article ‘Leap Second’ Bug Wreaks Havoc Across the Web for a little history.
The next leap-second adjustment will happen on June 30, 2015 at 23:59:60. And as far as your Atlassian applications are concerned, you can get out in front of it this time.
Impact
Atlassian applications are indirectly affected by this bug through some of the supported platforms. Basically, computing systems are not always prepared for this additional second. The bug affects a variety of software – notably, certain versions of Java, the Linux kernel, and the MySQL database server. Since Atlassian applications rely on these technologies, there are several points on your system that may require attention.
The impact of this bug may include increased and unexplained CPU usage, application slowdowns, application crashes, and failures on startup. We are not currently aware of this bug directly causing data loss or corruption, but application or server crashes triggered by it could.
Affected environment and products
Atlassian Cloud customers are not affected by this issue as our system administrators have prepared for it in advance.
Since Atlassian applications hosted on your own network are indirectly affected by this bug, you will need to contact the administrators of all systems that interface with Atlassian products (the hosting server, database server, LDAP and mail server, etc.). We also suggest you review and contact all your software products’ vendors to ensure your product version contains the fix for the leap second bug.
Upgrading your Linux kernel, Java, and MySQL to the most recent version supported by your Atlassian application is the best way to prevent any disruption from the leap second bug. Another option reported to fix the issue is to restart the hosting server/VM, but this is usually undesirable in production environments. Check out the “Workaround” section below for more options.
Keep in mind that only systems synchronized to an NTP server – which we do recommend – may be affected.
Suggested fix
Consult your software vendors’ websites and support channels to see whether your versions of their products are affected. Please note that affected products may not be limited to those listed here, so contact your system administrator for further preparation. Applying the fixes below may not be sufficient in your environment – please make sure to perform an analysis locally.
- Linux kernel: Make sure your version of the kernel has the fix for the bug: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6b43ae8a619d17c4935c3320d2ef9e92bdeed05d
- MySQL database server: Certain combinations of MySQL server and kernel versions are affected: https://bugs.mysql.com/bug.php?id=65778
- Java: Fixes available in Java versions: 6u95, 7u80, 8u45. Source: http://www.oracle.com/technetwork/java/javase/tzdata-versions-138805.html
- NTP: Prepare your NTP daemon with leap seconds file: http://support.ntp.org/bin/view/Support/ConfiguringNTP#Section_6.14.
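As a rough illustration of checking a Java version against the list above, here is a minimal shell sketch. It assumes the legacy "1.<major>.0_<update>" version string (for example 1.7.0_79) and that any later update in the same family also carries the fix. The function name java_has_leap_fix is made up for this example.

```shell
#!/bin/sh
# Sketch only: compares a Java version string with the fixed updates
# listed above (6u95, 7u80, 8u45).
# Assumptions: legacy "1.<major>.0_<update>" format; later updates in the
# same family are treated as fixed. The function name is hypothetical.
# Returns 0 = at or above the listed update, 1 = below it,
#         2 = format or family not covered by the list.
java_has_leap_fix() {
    version=$1
    case $version in
        1.[0-9]*.0_[0-9]*) ;;   # looks like 1.7.0_79 or 1.8.0_45-b14
        *) return 2 ;;
    esac
    major=${version#1.}         # "7.0_79"
    major=${major%%.*}          # "7"
    update=${version##*_}       # "79" or "45-b14"
    update=${update%%[!0-9]*}   # drop any non-numeric suffix
    case $major in
        6) min=95 ;;
        7) min=80 ;;
        8) min=45 ;;
        *) return 2 ;;
    esac
    [ "$update" -ge "$min" ]
}

# Example call (not executed here):
#   v=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p')
#   java_has_leap_fix "$v" && echo "fix present" || echo "check with your vendor"
```

A result of "below" or "not covered" is only a prompt to check with your Java vendor; it is not proof that the system will misbehave.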
Although recent versions of software likely fix the bug, it is always best to be prepared with a solution to alleviate the problem. For this, please refer to the “Workaround” section below.
Workaround
If you experience any of the possible symptoms (high CPU usage, application crash, failure on startup), it is safe to apply the following workaround.
Since these updates can be problematic and often require service downtime, it’s best to prepare a workaround. We suggest implementing this on all potentially affected servers that are architecturally linked to service(s) provided by Atlassian applications (for example, database and proxy servers).
- Stop the NTP daemon (if possible, before 2015-07-01 00:00:00 UTC)
- Execute date -s "`date`" as root user, after 2015-07-01 00:00:01 UTC
- Start the NTP daemon
Note: We recommend minimizing the time during which the NTP daemon is stopped.
Source: https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/
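The three steps above can be wrapped in a small script so they are ready to run on each affected server. The sketch below assumes the NTP daemon is controlled with the "service" command and is named ntpd; the name differs between distributions (for example, ntp on Debian and Ubuntu). The NTP_SERVICE and RUN variables and the function name are made up for this example.

```shell
#!/bin/sh
# Sketch only: the workaround steps as one function. Run as root.
# Assumptions: the daemon is managed via "service" and is called "ntpd";
# override NTP_SERVICE if your distribution uses another name.
# Set RUN=echo to print the commands instead of executing them (dry run).
NTP_SERVICE=${NTP_SERVICE:-ntpd}
RUN=${RUN:-}

reset_clock_after_leap() {
    $RUN service "$NTP_SERVICE" stop    # step 1: stop the NTP daemon
    $RUN date -s "$(date)"              # step 2: set the clock to its current value
    $RUN service "$NTP_SERVICE" start   # step 3: start the NTP daemon again
}

# Example (not executed here):
#   RUN=echo; reset_clock_after_leap    # dry run, prints the three commands
#   RUN=;     reset_clock_after_leap    # real run
```

Because the function runs all three steps back to back, it suits the case where symptoms appear after the leap second has passed and keeps the NTP downtime short. To stop the daemon before midnight as described in the first step, run the stop command separately and call the remaining two afterwards.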
Liability
Atlassian is not responsible for fixing this bug as it affects platforms Atlassian applications interface with, not Atlassian applications themselves. We do our best to raise awareness of potential problems, but may not be able to advise on all individual scenarios, so we suggest that you do an impact analysis on your systems.
FAQ
How can I check if I am affected by this bug?
You will need to contact your system administrator to check whether the software your Atlassian applications run on is affected by the leap second bug. The list of affected software is wider than what I mentioned above; the platforms I’ve mentioned are the ones most likely to directly affect Atlassian products. There are various articles (also referenced at the bottom of this one) that offer scripts and methods to check if you are affected.
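As a starting point for that conversation, a short script can collect the versions of the three platforms discussed in this article. This is only a sketch: it assumes java and mysqld are on the PATH of the user running it, and the function name report_versions is made up for this example. It reports versions only and does not decide whether a system is affected.

```shell
#!/bin/sh
# Sketch only: print kernel, Java, and MySQL server versions so they can
# be compared with vendor advisories. Assumes java and mysqld, if
# installed, are on PATH. The function name is hypothetical.
report_versions() {
    echo "kernel: $(uname -r)"
    if command -v java >/dev/null 2>&1; then
        echo "java: $(java -version 2>&1 | head -n 1)"
    else
        echo "java: not found on PATH"
    fi
    if command -v mysqld >/dev/null 2>&1; then
        echo "mysql: $(mysqld --version 2>&1 | head -n 1)"
    else
        echo "mysql: mysqld not found on PATH"
    fi
}

# Example (not executed here):
#   report_versions
```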
What can I do to minimize the impact?
Upgrading your software versions and checking to see if you are affected are the best preventive measures. However, it is best to be prepared to execute the workaround proposed above if necessary.
Can I turn to Atlassian in case I need support on this bug?
As Atlassian applications are indirectly affected by this issue, it is best to reach out to your company’s system administrator, but we’ll do our best to advise you if you raise a ticket at support.atlassian.com.
Where is this issue tracked?
The bug is tracked in various software vendors’ websites and online documentation/forums. Contact your software vendors to see if you are affected.
As Atlassian applications are only indirectly affected through supported platforms (Linux, Java, and MySQL), we are not tracking the leap second bug officially in our issue tracker. This knowledge base article also explains the workaround, which can be applied at run time: JIRA Performance Problem due to System Time Settings. It applies not only to JIRA, but to all Atlassian applications.
I am an Atlassian Cloud customer. What actions do I need to take?
Our system administrators have prepared our systems for the leap-second insertion, so no action is required from Cloud customers.
References
- MySQL bug #65778 – User has reported large and continued spike in CPU after leap second
- Bug 769972 – Java is choking on leap second.
- MySQL Time Zone Leap Second Support
- Oracle support note for Leap Second Hang problem that may result into 100% CPU utilization in Linux environment
- MySQL and the Leap Second, High CPU and the Fix
- Resolve Leap Second Issues in Red Hat Enterprise Linux
- Linux Cron Guide
- http://marc.info/?l=linux-kernel&m=134113577921904
The post What Atlassian application administrators need to know about the “leap second” bug lurking in your systems appeared first on Atlassian Blogs.