Digest’s Managing Editor to Speak to NonStop Users in Korea and Taiwan
I have been asked by HP to address the Korea NonStop User Group (KNUG) in Seoul, South Korea, on July 3, 2014, and NonStop users in Taipei, Taiwan, on July 8, on the topic of data-center availability. The title of my talk is “Help! My Data Center is Down!” It describes several disasters that have taken major data centers down for hours and even for days. Amazon, Google, Microsoft, and the U.S. Internal Revenue Service (IRS) have all suffered such disasters. These talks build on my presentation last year at InNUG, the India NonStop User Group.
Data centers run many critical applications that must be continuously available. Downtimes of a few minutes, though painful, might be acceptable; but downtimes of hours or days are absolutely unacceptable. In my talk, I discuss how active/active systems can be used to protect critical applications and can provide recovery times measured in seconds in the event of a massive data-center outage.
In my longer presentation for the Taiwan NonStop users, I also explore Distributed Denial of Service (DDoS) attacks and how they can take down even fault-tolerant systems such as NonStop servers.
These are all topics covered in some detail in our one-day, two-day, and three-day seminars.
Dr. Bill Highleyman, Managing Editor
A major South American stock exchange found that it could not reliably meet its same-day securities-settlement commitment times because trades were reported manually to its clearinghouse. Manual reporting often introduced data-entry errors that were time-consuming to correct. The exchange chose to re-architect its interaction with the clearinghouse to make trade reporting fully automatic. The enhancement eliminated erroneous manual input and enabled the exchange to meet its settlement commitment times.
The re-architected system required heterogeneous data replication between the exchange’s HP NonStop trading system and the clearinghouse’s AIX/Sybase system. The Shadowbase data replication engine from Gravic, Inc., was chosen by the exchange to satisfy this need. Shadowbase software now plays a major role in integrating the many heterogeneous systems in the exchange’s new IT infrastructure as well as providing continuous availability for its mission-critical business services.
This case study highlights the capabilities and flexibility of the Shadowbase data replication product for homogeneous and heterogeneous data replication. The study demonstrates the Shadowbase replication engine’s suitability for business continuity, data integration, and application integration purposes.
Joyent is a high-performance cloud provider aimed at real-time and mobile applications. On Tuesday, May 27, 2014, one of Joyent’s data centers was taken totally offline by an operator error. The first indication of a problem came when the Joyent data center located in Ashburn, Virginia, began to report “transient availability issues.”
After a quick investigation, Joyent administrators discovered the source. An operator had erroneously entered a command to reboot all of the servers in the data center. The operator was performing capacity upgrades to some of the compute nodes in the data center using tools that allowed for remote updating of software. After completing the upgrades, he issued a command to reboot those servers.
Unfortunately, the operator mistyped the command. Instead of rebooting just the servers that he had upgraded, he rebooted every server in the data center. The reboot tools performed no validation to ensure that the operator was “really sure” that he wanted the reboot to be performed against all systems. All of the servers in the data center stopped functioning during the reboot process, and the entire US-East-1 data center went down.
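The missing “really sure” check can be illustrated with a small sketch. This is not Joyent's tooling. The function name `plan_reboot`, the `confirm` callback, and the 25% threshold are all hypothetical, chosen only to show one way a fleet-wide command might be made harder to issue by accident.

```python
def plan_reboot(targets, all_hosts, confirm, max_fraction=0.25):
    """Return the sorted list of hosts to reboot, or raise if the request is refused.

    Hypothetical guard: any request covering more than max_fraction of the
    fleet must be confirmed by typing the exact number of affected hosts.
    """
    targets = set(targets)
    unknown = targets - set(all_hosts)
    if unknown:
        raise ValueError(f"unknown hosts: {sorted(unknown)}")

    if len(targets) > max_fraction * len(all_hosts):
        # Broad request: a bare "y" is not enough. The operator must type
        # the host count, which forces a look at the scope of the command.
        prompt = (f"About to reboot {len(targets)} of {len(all_hosts)} hosts. "
                  "Type the number of hosts to proceed: ")
        if confirm(prompt).strip() != str(len(targets)):
            raise PermissionError("reboot aborted: confirmation did not match")

    return sorted(targets)
```

In interactive use, `confirm` could simply be Python's built-in `input`. Passing it as a parameter keeps the guard testable without a terminal.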
Even though Microsoft’s Windows XP operating system still runs on 25% of the world’s desktop computers and PCs, Microsoft elected to end XP support on April 8, 2014. According to the Payment Card Industry (PCI) standards organization, XP systems no longer comply with the PCI Data Security Standard (DSS) since Microsoft has stopped providing security patches. Merchants still using XP-based systems to process payment cards will not be able to pass the PCI DSS mandatory annual compliance audit.
If you accept payment cards, you have something hackers want: cardholder data. Processing payment cards with Windows XP systems just makes it easier for the hackers to get at that data. They always go for the low-hanging fruit.
Should your organization experience a breach, you will be deemed “non-compliant,” even if you were previously validated to be compliant. Furthermore, you will not be able to effectively pass an ASV network scan because these scans are required to automatically fail unsupported operating systems.
As complex and expensive as migrating from Windows XP may be, the security of the worldwide payment-card system is dependent upon retiring all of the Windows XP systems involved in payment-card processing and replacing them with modern operating systems.
A long series of public-cloud failures has included some of the largest cloud providers – Amazon, Google, and Microsoft, to name a few. These failures emphasize the need to be prepared for public-cloud services to suddenly disappear for hours or days. Even worse, data might disappear.
In this article, we review many of the public-cloud failures and see what we can learn about trusting cloud services and about protecting our applications and data from their faults.
Many clouds provide SLAs guaranteeing three 9s of availability or better. However, a multi-day outage, such as many of the largest clouds have experienced, can reduce cloud availability for the year to two 9s. A several-day outage is extremely painful to an organization even for non-critical applications. It is intolerable for critical applications, many of which cannot withstand outages lasting more than several minutes.
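The arithmetic behind the 9s is easy to check. The sketch below assumes a 365-day year (8,760 hours) and measures availability as uptime divided by total time. The function names are illustrative only and are not taken from any SLA or cloud provider.

```python
import math

HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored for simplicity


def availability(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_hours / period_hours


def allowed_downtime_hours(nines, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by a given number of 9s (3 means 99.9%)."""
    return period_hours * 10.0 ** (-nines)


def count_nines(avail):
    """Number of whole leading 9s in an availability figure."""
    if avail >= 1.0:
        return math.inf
    # The small epsilon absorbs floating-point error at exact boundaries
    # such as 0.999, which should count as three 9s.
    return math.floor(-math.log10(1.0 - avail) + 1e-9)
```

Under these assumptions, three 9s allows about 8.76 hours of downtime per year. A single two-day outage is 48 hours, which gives an availability of about 99.45% for the year. That is two 9s, as stated above.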
Cloud computing is becoming an important resource for increasing numbers of companies. Unfortunately, cloud utilities do not yet provide the availability offered by electrical utilities and telephone services. Until that time, careful thought and planning must go into any decision to utilize cloud services for your applications, whether they are critical or not.
A challenge every issue for the Availability Digest is to determine which of the many availability topics out there win coveted status as Digest articles. We always regret not focusing our attention on the topics we bypass.
Now with our Twitter presence, we don’t have to feel guilty. This article highlights some of the @availabilitydig tweets that made headlines in recent days.
Sign up for your free subscription at https://availabilitydigest.com/signups.htm
Would You Like to Sign Up for the Free Digest by Fax?
Simply print out the following form, fill it in, and fax it to:
+1 908 459 5543
The Availability Digest is published monthly. It may be distributed freely. Please pass it on to an associate.
Managing Editor - Dr. Bill Highleyman email@example.com.
© 2014 Sombers Associates, Inc., and W. H. Highleyman