Keystone is a proposed design that could unify and simplify many forms of storage. We plan to do this by building the storage to expect failure and recover quickly.
ROC (Recovery Oriented Computing) is a collection of research projects led by UC Berkeley from about 2001 to 2006. One of its major premises is that systems can be made more robust by making their components fail quickly if they are in trouble. By focusing on MTTR (Mean-Time-To-Repair) rather than MTBF (Mean-Time-Between-Failures), the entire system can offer better service even while things break. This is the approach taken by the very large websites we see today.
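To make the MTTR argument concrete, here is a minimal sketch using the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The hour figures are hypothetical and chosen only for illustration; they are not measurements from Keystone or from the ROC projects.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a component is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical component: fails every 1000 hours, takes 1 hour to repair.
baseline = availability(mtbf_hours=1000.0, mttr_hours=1.0)

# Option A: make the component twice as reliable (double the MTBF).
doubled_mtbf = availability(mtbf_hours=2000.0, mttr_hours=1.0)

# Option B: keep the same flaky component but recover ten times faster.
tenth_mttr = availability(mtbf_hours=1000.0, mttr_hours=0.1)

for label, value in [("baseline", baseline),
                     ("doubled MTBF", doubled_mtbf),
                     ("one-tenth MTTR", tenth_mttr)]:
    print(f"{label:>15}: {value:.6f}")
```

Under these assumed numbers, cutting repair time by a factor of ten improves availability more than doubling the time between failures, which is the intuition behind designing for fast recovery.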
SOFT (Storage Over Flaky Technology) is an acronym I made up to fit the talk title. It represents the trend toward ever simpler and less expensive components. For example, consumer-grade SSD costs about one-eighth as much as enterprise-grade SSD (and is denser in the server). Still, consumer-grade SSD will return an uncorrectable error about 100,000 times as often as enterprise-grade SSD. By designing the architecture of Keystone to store immutable data and surrounding that data with aggressive error checking in software, we can be extremely confident of detecting an error and fetching the desired data from one of the other places it has been stored.
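The detect-and-refetch idea can be sketched as follows. This is an illustration under stated assumptions, not Keystone's design: the `Replica` class, the `write` and `read` functions, and the choice of a SHA-256 digest as the key for each immutable blob are all hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


def digest(data: bytes) -> str:
    """Checksum used both as the blob's key and as its integrity check."""
    return hashlib.sha256(data).hexdigest()


class Replica:
    """Hypothetical in-memory stand-in for one flaky storage location."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)

    def corrupt(self, key: str) -> None:
        """Simulate a media error by flipping the bits of the first byte."""
        data = self._blobs[key]
        self._blobs[key] = bytes([data[0] ^ 0xFF]) + data[1:]


def write(replicas: List[Replica], data: bytes) -> str:
    """Store an immutable blob on every replica; return its checksum key."""
    key = digest(data)
    for replica in replicas:
        replica.put(key, data)
    return key


def read(replicas: List[Replica], key: str) -> bytes:
    """Return the first copy whose checksum matches; skip damaged copies."""
    for replica in replicas:
        data = replica.get(key)
        if data is not None and digest(data) == key:
            return data
    raise IOError("no replica returned a valid copy of " + key)


replicas = [Replica(), Replica(), Replica()]
key = write(replicas, b"immutable payload")
replicas[0].corrupt(key)       # the first copy silently goes bad
recovered = read(replicas, key)  # the bad copy is detected and skipped
```

Because the data is immutable, a checksum computed once at write time stays valid forever, so any copy that fails verification can simply be discarded and read from elsewhere.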
In this talk, we will outline the types of storage used by Salesforce (in addition to our use of Oracle and SANs for the relational database). We will talk about the architectural principles used to provide performance, correctness, and availability even while the components themselves fail. Finally, we will walk through some math about the availability of data, which provides some fun perspective on using SOFT.
About Pat
Pat Helland has been working in distributed systems, databases, transaction processing, scalable systems, and fault tolerance since 1978.
For most of the 1980s, Pat worked at Tandem Computers as the Chief Architect for TMF (Transaction Monitoring Facility), the transaction and recovery engine under NonStop SQL. After 3+ years designing a Cache Coherent Non-Uniform Memory Multiprocessor for HaL Computers (a subsidiary of Fujitsu), Pat moved to the Seattle area to work at Microsoft in 1994. There he was the architect for Microsoft Transaction Server, Distributed Transaction Coordinator, and a high-performance messaging system called SQL Service Broker, which ships with SQL Server. From 2005 to 2007, he was at Amazon working on the product catalog and other distributed systems projects, including contributing to the original design for Dynamo. After returning to Microsoft in 2007, Pat worked on a number of projects including Cosmos, the scalable "Big Data" plumbing behind Bing. While working on Cosmos, Pat architected both a project to integrate database techniques into the massively parallel computations and a very high-throughput event processing engine.
Since early 2012, Pat has worked at Salesforce.com in San Francisco. He is focusing on multi-tenanted database systems and scalable reliable infrastructure for storage.