Massive drive failures after a datacenter gas attack. A critical MySQL vulnerability you should know about & is Cisco responsible for the death of an MMO?

Plus great questions, our answers & much more!

RSS Feeds:

HD Video Feed | Mobile Video Feed | MP3 Audio Feed | Ogg Audio Feed | iTunes Feed | Torrent Feed

Become a supporter on Patreon:


Show Notes:

Whoosh! That was the sound of your bank’s hard drives being destroyed

  • “ING Bank’s main data center in Bucharest, Romania, was severely damaged over the weekend during a fire extinguishing test. In what is a very rare but known phenomenon, it was the loud sound of inert gas being released that destroyed dozens of hard drives. The site is currently offline and the bank relies solely on its backup data center, located within a couple of miles’ proximity.”
  • “‘The drill went as designed, but we had collateral damage,’ ING’s spokeswoman in Romania told me, confirming the inert gas issue. Local clients were unable to use debit cards or perform online banking operations on Saturday between 1PM and 11PM because of the test. ‘Our team is investigating the incident,’ she said.”
  • “The purpose of the drill was to see how the data center’s fire suppression system worked. Data centers typically rely on inert gas to protect the equipment in the event of a fire, as the substance does not chemically damage electronics, and the gas only slightly decreases the temperature within the data center.”
  • “The gas is stored in cylinders, and is released at high velocity out of nozzles uniformly spread across the data center. According to people familiar with the system, the pressure at ING Bank’s data center was higher than expected, and produced a loud sound when rapidly expelled through tiny holes”
  • “The bank monitored the sound and it was very loud, a source familiar with the system told us. ‘It was as high as their equipment could monitor, over 130 dB.’”
  • “There is still very little known about how sound can cause hard drive failure. One of the first such experiments was made by engineer Brendan Gregg, in 2008, while he was working for Sun’s Fishworks team. He recorded a video in which he explains how shouting in a data center can result in hard drive malfunctions.”
  • The test Brendan did was just a demonstration; the problem they were actually diagnosing in the video was caused by traffic on the street outside their basement office data center. The rumble of a diesel bus engine pulling away from its stop on a regular schedule caused latency spikes on their hard drives.
  • “Researchers at IBM are also investigating data center sound-related inert gas issues. “[T]he HDD can tolerate less than 1/1,000,000 of an inch offset from the center of the data track—any more than that will halt reads and writes”, experts Brian P. Rawson and Kent C. Green wrote in a paper. “Early disk storage had much greater spacing between data tracks because they held less data, which is a likely reason why this issue was not apparent until recently.””
  • “The Bank said it required 10 hours to restart its operation due to the magnitude and the complexity of the damage. A cold start of the systems in the disaster recovery site was needed. “Moreover, to ensure full integrity of the data, we’ve made an additional copy of our database before restoring the system,” ING’s press release reads.”
  • “Over the next few weeks, every single piece of equipment will need to be assessed. ING Bank’s main data center is compromised “for the most part”, a source told us.”
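Brendan’s demo detected the bus rumble as disk latency outliers. As a rough illustration of the measurement idea only (not the original DTrace tooling; the file path and sample counts here are arbitrary assumptions), a short Python probe can time small synchronous writes and report the worst case, which is where vibration-induced head retries would show up:

```python
import os
import statistics
import time

def sample_write_latency(path, samples=50, size=4096):
    """Time small fsync'd writes; returns (mean_ms, worst_ms).

    On a vibrating drive, the worst-case latency spikes far above
    the mean as the head repeatedly misses the track. Illustrative
    sketch only -- a real probe would measure the device under load.
    """
    buf = os.urandom(size)
    latencies = []
    with open(path, "wb") as f:
        fd = f.fileno()
        for _ in range(samples):
            start = time.perf_counter()
            f.write(buf)
            f.flush()
            os.fsync(fd)  # force the write all the way to the device
            latencies.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(latencies), max(latencies)

mean_ms, worst_ms = sample_write_latency("/tmp/latency_probe.bin")
print(f"mean {mean_ms:.2f} ms, worst {worst_ms:.2f} ms")
```

A healthy, quiet drive shows worst-case close to the mean; a drive being shouted at (or blasted by gas nozzles) shows multi-hundred-millisecond outliers.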

Critical MySQL vulnerability

  • “An independent research has revealed multiple severe MySQL vulnerabilities. This advisory focuses on a critical vulnerability with a CVEID of CVE-2016-6662 which can allow attackers to (remotely) inject malicious settings into MySQL configuration files (my.cnf) leading to critical consequences.”
  • “The vulnerability affects all MySQL servers in default configuration in all version branches (5.7, 5.6, and 5.5) including the latest versions, and could be exploited by both local and remote attackers. Both the authenticated access to MySQL database (via network connection or web interfaces such as phpMyAdmin) and SQL Injection could be used as exploitation vectors.”
  • The vulnerability also affects forks of MySQL including MariaDB and Percona
  • “Official patches for the vulnerability are not available at this time for Oracle MySQL server. The vulnerability can be exploited even if security modules SELinux and AppArmor are installed with default active policies for MySQL service on major Linux distributions.”
  • Oracle has decided to not release a patch until their next “Critical Patch Update” in the middle of October
  • How does it work?
  • “The default MySQL package comes with a mysqld_safe script which is used by many default installations/packages of MySQL as a wrapper to start the MySQL service process”
  • This wrapper allows you to specify an alternate malloc() implementation via the MySQL config file (my.cnf), for example to improve performance with the purpose-built allocator from Google’s performance team, or another implementation.
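For reference, the legitimate form of this setting is a my.cnf stanza like the following (the library path is an example assumption; it depends on which allocator package the distribution ships):

```ini
[mysqld_safe]
# mysqld_safe adds this library to LD_PRELOAD before launching mysqld.
# Path is hypothetical; gperftools installs tcmalloc here on many distros.
malloc-lib=/usr/lib/libtcmalloc_minimal.so
```

The exploit injects this same directive, but pointing at an attacker-controlled shared library.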
  • The problem is that many MySQL tutorials, guides, how-tos, and setup scripts chown the my.cnf file to the mysql user. Even most MySQL security guides give this bad advice.
  • “In 2003 a vulnerability was disclosed in MySQL versions before 3.23.55 that
    allowed users to create mysql config files with a simple statement:”
    SELECT * INTO OUTFILE '/var/lib/mysql/my.cnf'
  • “The issue was fixed by refusing to load config files with world-writable permissions as these are the default permissions applied to files created by OUTFILE query.”
  • This issue has been considered fixed for more than 10 years.
  • However, a new vector has appeared:

    mysql> set global general_log_file = '/etc/my.cnf';
    mysql> set global general_log = on;
    mysql> select '
    '> ; injected config entry
    '> [mysqld]
    '> malloc_lib=/tmp/
    '> ';
    1 row in set (0.00 sec)
    mysql> set global general_log = off;

  • If MySQL has permission, it will write that content into that file
  • At this point the config file is technically invalid: it contains the extra general-log lines, and mysqld on its own would reject it. However:
  • “mysqld_safe will read the shared library path correctly and add it to the LD_PRELOAD environment variable before the startup of mysqld daemon. The preloaded library can then hook the libc fopen() calls and clean up the config before it is ever processed by mysqld daemon in order for it to start up successfully.”
  • Another issue is that the mysqld_safe script loads my.cnf from a number of locations, so even if you have properly secured your config file, if one of the other locations is not locked down, MySQL could create a new config file in that location
  • “The vulnerability was reported to Oracle on 29th of July 2016 and triaged by the security team. It was also reported to the other affected vendors including PerconaDB and MariaDB. The vulnerabilities were patched by PerconaDB and MariaDB vendors by the end of 30th of August.”
  • “During the course of the patching by these vendors the patches went into public repositories and the fixed security issues were also mentioned in the new releases which could be noticed by malicious attackers. As over 40 days have passed since reporting the issues and patches were already mentioned publicly, a decision was made to start disclosing vulnerabilities (with limited PoC) to inform users about the risks before the vendor’s next CPU update that only happens at the end of October.”
  • “No official patches or mitigations are available at this time from the vendor. As temporary mitigations, users should ensure that no mysql config files are owned by mysql user, and create root-owned dummy my.cnf files that are not in use. These are by no means a complete solution and users should apply official vendor patches as soon as they become available.”
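The advisory’s interim mitigation, making sure no config file is owned by the mysql user, can be audited with a short script. A minimal sketch, assuming a conventional Linux layout; the candidate path list and the "mysql" account name are assumptions, since mysqld_safe’s actual search order varies by build:

```python
import os
import pwd
import stat

# Typical locations mysqld_safe searches for option files.
# This list is an assumption; check your build's actual search order.
CANDIDATES = ["/etc/my.cnf", "/etc/mysql/my.cnf", "/usr/etc/my.cnf"]

def check(path, mysql_user="mysql"):
    """Flag config files the mysql user could (re)write."""
    if not os.path.exists(path):
        return f"{path}: MISSING - consider a root-owned dummy file"
    st = os.stat(path)
    owner = pwd.getpwuid(st.st_uid).pw_name
    if owner == mysql_user:
        return f"{path}: DANGER - owned by {owner}"
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        return f"{path}: DANGER - group/world writable"
    return f"{path}: ok (owned by {owner})"

for p in CANDIDATES:
    print(check(p))
```

Running `mysqld --verbose --help` prints the exact list of option files your build reads, which is the authoritative source for the CANDIDATES list.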

Bugs in Cisco networking gear at center of hosting company bankruptcy fight

  • “Game of War: Fire Age, your typical melange of swords and sorcery, has been one of the top-grossing mobile apps for three years, accounting for hundreds of millions of dollars in revenue. So publisher Machine Zone was furious when the game’s servers, run by hosting company Peak Web, went dark for 10 hours last October. Two days later, Machine Zone fired Peak Web, citing multiple outages, and later sued.”
  • “Then came the countersuit. Peak Web argued in court filings that Machine Zone was voiding its contract illegally, because the software bug that caused the game outages resided in faulty network switches made by Cisco Systems, and according to Peak Web’s contract with Machine Zone, it wasn’t liable. In December, Cisco publicly acknowledged the bug’s existence—too late to help Peak Web, which filed for bankruptcy protection in June, citing the loss of Machine Zone’s business as the reason. The Machine Zone-Peak Web trial is slated for March 2017.”
  • “There’s buggy code in virtually every electronic system. But few companies ever talk about the cost of dealing with bugs, for fear of being associated with error-prone products. The trial, along with Peak Web’s bankruptcy filings, promises a rare look at just how much or how little control a company may have over its own operations, depending on the software that undergirds it.”
  • “Peak Web, founded in 2001, had worked with companies including MySpace, JDate, EHarmony, and Uber. Under its $4 million-a-month contract with Machine Zone, which began on April 1, 2015, it had to keep Game of War running with fewer than 27 minutes of outages a year, court filings show. According to Machine Zone, the hosting service couldn’t make it a month without an outage lasting almost an hour. Another in August of that year was traced to faulty cables and cooling fans, according to the publisher.”
  • “Cisco’s networking equipment became a problem in September, says a person familiar with Peak Web’s operations, who requested anonymity to discuss the litigation. The company’s Nexus 3000 switches began to fail after trying to improperly process a routine computer-to-computer command, and because Cisco keeps its code private, Peak Web couldn’t figure out why. The person familiar with the situation says Cisco denied Peak Web’s requests for an emergency software fix, and as more switches failed over the next month, the hosting service’s staffers couldn’t move quickly enough to keep critical systems online.”
  • “Finally, late in October, came the 10 hours of darkness. Three people familiar with Peak Web’s operations say the lengthy outage gave the company time to deduce that the troublesome command was reducing the switches’ available memory and causing them to crash. The company alerted Cisco. Machine Zone’s attorneys wrote that Peak Web has “aggressively sought to place the blame elsewhere for its failures” and that it could have prevented the downtime. In December, Cisco confirmed to Peak Web that it had replicated the bug and issued a fix, according to e-mails filed as evidence in the lawsuit.”
  • “Networking equipment such as switches and routers, which carry the world’s internet and corporate data traffic, tend to be especially difficult to fix with a software patch”
  • “In one previously unreported incident, in 2014, a glitch in a Cisco Invicta flash storage system corrupted data and disabled the emergency-room computer systems at Chicago’s Mount Sinai Hospital for more than eight hours, says a person familiar with the incident. Cisco later froze shipments of Invicta equipment and discontinued the product line. In another unreported case, a Cisco server in 2012 overheated inside a data center at chipmaking equipment manufacturer KLA-Tencor, forcing the facility to close and costing the company more than $50 million, according to a person familiar with the matter.”
  • This is definitely a tough spot to be in. I have been on both sides of this, and even in the middle. I use the services of a larger ISP to provide service to my customers, so when a problem lies with that upstream ISP, their SLA only covers a fraction of what I pay them, not what my customers pay me
  • One of the worst cases for me was when an automated configuration error at an upstream ISP changed a bunch of switch ports from gigabit to 100Mbps, severely degrading the performance of our servers and interrupting an important live stream.
  • While our ISP gave us a large credit to cover their screw-up, it didn’t cover the revenue we lost because of it, nor our customer’s even larger losses. That customer left, so we also missed out on all of their future revenue.


Round Up:

Question? Comments? Contact us here!