Read the Digest in PDF. You need the free Adobe Reader.

The digest of current topics on Continuous Availability. More than Business Continuity Planning.

BCP tells you how to recover from the effects of downtime.

CA tells you how to avoid the effects of downtime.

www.availabilitydigest.com

Thanks to This Month's Availability Digest Sponsor

Connect – HP's largest and most engaged IT professional user community.

Join us at the NonStop Advanced Technical Boot Camp in San Jose, October 14th through 16th.

Sunday's Preconference Day includes four in-depth, full-day HP NonStop education seminars.

Monday's and Tuesday's Education Days feature dozens of breakout sessions and vendor talks.

In this issue:

Never Again

Knight Capital Crippled by Software Bug

Best Practices

NonStop Boot Camp is Coming in October

Availability Topics

Court Decides - HP 1, Oracle 0

Linux Leap-Second Bug

Browse through our Useful Links.

Check our article archive for complete articles.

Join us on our Continuous Availability Forum.

Check out our seminars.

Check out our technical writing services.

Can a Software Bug Cripple a Company?

A software bug can, and it did. As we describe in this month’s Never Again article, “Knight Capital Crippled by Software Bug,” the major market maker for the NYSE and NASDAQ stock exchanges flooded the NYSE with erroneous bid and offer quotes at the opening of trading on Wednesday morning, August 1, 2012. To Knight’s horror, its quotes were executed to the tune of USD 21 billion, and it lost USD 440 million in just 45 minutes. The next week, the crippled firm was taken over by a consortium of financial services firms.

This was a failure of the modern technology of high-frequency trading (HFT), in which computerized trading systems can execute trades in microseconds. HFT was the cause of the “flash crash” of 2010. The problem with HFT has become a major thread on our LinkedIn Continuous Availability Forum. The thread was started by Paul Green of Stratus Technologies when he asked, “What advice would you give to the NYSE for avoiding problems with high-frequency trading software?” Dozens of responses have analyzed this problem in some detail.

The thread exemplifies how the IT community can be tapped for valuable insights. We encourage you to start your own threads on the Continuous Availability Forum and to share your experience on other threads.

Dr. Bill Highleyman, Managing Editor

Never Again

Knight Capital Crippled by Software Bug

On Wednesday morning, August 1, 2012, Knight Capital, the Number One market maker for the NYSE and NASDAQ stock exchanges, was virtually wiped out in just forty-five minutes by a software bug. The bug flooded the NYSE with orders at the market open. Billions of dollars were traded, and Knight lost $440 million in unintended trades. Within days, Knight lost 78% of its market value and was acquired by a consortium of other financial-services firms.

With high-frequency trading (HFT), brokerage firms such as Knight can place thousands of buy/sell orders a second via their proprietary, computer-based trading algorithms. Though each order may result in a gain of only fractions of a penny per share, the pennies add up to big profits. But as Knight found out, an HFT system gone awry can also rack up enormous losses in just a few minutes.

Will the exchanges take drastic steps to ensure that incidents such as the Knight debacle do not happen again? Probably not. Exchanges compete with each other and do not want to lose business due to tightened rules not adopted by other exchanges.

Besides, the billions of dollars that brokerage firms have invested in HFT systems and the billions of dollars they stand to make are major barriers to any change.

--more--

Best Practices

NonStop Boot Camp is Coming in October

It’s been several years since the HP NonStop community has had an opportunity to gather in a dedicated setting. The time is coming again. The NonStop Advanced Technical Boot Camp will be held on Sunday, October 14, through Tuesday, October 16, 2012, in San Jose, California.

The three-day session will include a preconference day on Sunday followed by two education days. The education days will include several tracks featuring content presented by HP, customers, and NonStop vendors.

For the NonStop community, the NonStop Boot Camp is an essential continuation of HP Discover 2012, which was held in Las Vegas from June 4th through June 7th. At Discover 2012, attendees learned all that is new in HP’s converged-infrastructure initiative. The purpose of the NonStop Boot Camp is to drill down into all things NonStop.

About 400 people are expected to attend the Boot Camp, including HP staff, customers, analysts, vendors, and other industry professionals.

Also, you can learn about the many new offerings from the NonStop third-party vendors in the Partner Pavilion. And of course, look forward to the Boot Camp receptions to relax and interact with all your peers.

--more--

Availability Topics

Court Decides – HP 1, Oracle 0

On March 12, 2011, after thirty years of close cooperation with HP, Oracle issued a press release declaring that it would no longer support its products on Itanium processors. The Itanium processors form the basis of HP’s new blade systems upon which HP’s predominant operating system, HP-UX, runs. About 140,000 customers run Oracle on HP-UX. HP immediately launched a lawsuit to require Oracle to continue its support of Itanium for all of Oracle’s current and future products.

On August 1, 2012, the Superior Court of the State of California released its decision in that lawsuit. Judge James P. Kleinberg likened Oracle’s arguments to a Seinfeld sitcom – a lot of nothing. He directed Oracle to continue its support of Itanium to the same extent that it had supported HP products in the past, including the porting of all new Oracle product versions.

We summarize the Court’s deliberations and decisions in this article. Though an appeal is likely, it appears that those HP and Oracle customers dependent upon Itanium servers and Oracle databases for part or all of their business systems now have a reasonable expectation that their investments are protected and that there will be continued support and development for the foreseeable future.

Stay tuned for the next installment in the HP/Oracle saga.

--more--

Linux Leap-Second Bug Takes Down Data Centers

What a difference a second can make. At the stroke of midnight the evening of Saturday, June 30, 2012, servers all over the world began to crash. Was this another Stuxnet virus propagated by some rogue government to take down the world’s IT infrastructure?

No. It was caused by a leap second, which is added every few years to keep the world’s clocks synchronized with the earth’s rotation. A bug caused thousands of unpatched Linux systems to choke on the extra second. These servers had to be rebooted, causing hours of downtime at some of the Internet’s most popular sites, including LinkedIn, Mozilla, and Reddit.

The Amadeus ALTEA airline hosting system, which many airlines such as Qantas and Virgin Australia use for passenger check-in and ground services, was down for almost an hour. In some data centers, hundreds of servers had to be rebooted.

Interestingly, another consequence of the bug in large data centers was a sudden spike in power utilization when a good number of servers started to run at full load.

--more--

Sign up for your free subscription at https://availabilitydigest.com/signups.htm

Would You Like to Sign Up for the Free Digest by Fax?

Simply print out the following form, fill it in, and fax it to:

Availability Digest

+1 908 459 5543

Name:

Email Address:

Company:

Title:

Telephone No.:

Address:

____________________________________

The Availability Digest is published monthly. It may be distributed freely. Please pass it on to an associate.

Managing Editor - Dr. Bill Highleyman editor@availabilitydigest.com.