Speakers :: Ricon

There is a trend towards distributed systems made up of rapidly changing, ephemeral, single-function microservices that can be deployed globally using public clouds. Monitoring tools have tended to focus on a small number of slowly changing monolithic applications in a single datacenter at a time, so a new breed of monitoring tools is emerging: tools that collect data every second, are cloud- and container-aware, and can see the flows of requests across microservices. Testing these tools is a challenge, and this talk will describe how the SimianViz simulator can be used to model interesting architectures and create stress tests and failure scenarios.

About Adrian Cockcroft

Adrian Cockcroft has had a long career working at the leading edge of technology. He's always been fascinated by what comes next, and he writes and speaks extensively on a range of subjects. At Battery, he advises the firm and its portfolio companies about technology issues and also assists with deal sourcing and due diligence.

Before joining Battery, Adrian helped lead Netflix's migration to a large scale, highly available public-cloud architecture and the open sourcing of the cloud-native NetflixOSS platform. Prior to that at Netflix he managed a team working on personalization algorithms and service-oriented refactoring.

Adrian was a founding member of eBay Research Labs, developing advanced mobile applications and even building his own homebrew phone, years before iPhone and Android launched. As a distinguished engineer at Sun Microsystems he wrote the best-selling "Sun Performance and Tuning" book and was chief architect for High Performance Technical Computing.

He graduated from The City University, London with a BSc in Applied Physics and Electronics, and was named one of the top leaders in Cloud Computing in 2011 and 2012 by SearchCloudComputing magazine. He can usually be found on Twitter @adrianco.

Follow Adrian @adrianco

The Internet of Things is not only exciting but has huge potential to benefit society. This talk highlights three key opportunities and associated technical challenges. First, indoor location-based services promise to have a huge impact with a variety of novel and surprising services – especially when we can get highly accurate estimates of a person's or object's indoor location. Second, software-defined infrastructure has contributed to great advancements in the cloud, data center, and other complex distributed systems. We describe how it can potentially provide even greater benefits to IoT via simplification, automation, and agility to roll out new applications and services. Third, while today's popular mobile-to-cloud architecture is highly successful, many IoT applications require an enhancement of this architecture where an additional layer (the "Fog") brings some of the cloud's capabilities to the edge of the network – including support for real-time analytics. We will highlight the technical challenges, recent advances toward their solution, and promising future directions of research.

About John Apostolopoulos

John Apostolopoulos is the VP/CTO of the Enterprise Segment where he drives the technology and architectural direction for Cisco's efforts in the enterprise space. John also directs the Innovation Labs within CTAO whose mission is to drive technology innovations aligned with Cisco's strategic directions. This covers the broad Cisco portfolio including the Internet of Things (IoT), enterprise mobility/BYOD, software defined networking (SDN) and SDN-empowered applications such as collaboration and security, video over wired/wireless networks, and network analytics. In addition, John co-leads Cisco's next-generation Enterprise Architecture effort.

Prior to joining Cisco, John was Lab Director for the Mobile & Immersive Experience Lab at HP Labs. His work spanned novel mobile devices and sensing, client/cloud multimedia computing, multimedia networking, signal processing, immersive video conferencing, SDN, and mobile streaming media content delivery networks for all-IP (4G) wireless networks.

John has received a number of honors and awards: he is an IEEE Fellow and an IEEE SPS Distinguished Lecturer, was named "one of the world's top 100 young (under 35) innovators in science and technology" (TR100) by MIT Technology Review, and received a Certificate of Honor for contributing to the US Digital TV Standard (Engineering Emmy Award 1997). His work on media transcoding in the middle of a network while preserving end-to-end security (secure transcoding) was adopted in the JPSEC standard. He has published over 100 papers, received 5 best paper awards, and has about 65 granted US patents. John also has strong collaborations with the academic community: he was a Consulting Associate Professor of EE at Stanford (2000-09) and is a frequent visiting lecturer at MIT. He received his B.S., M.S., and Ph.D. from MIT.

Follow John @john_apos

In this talk I will present Transformations, a new vision for building principled distributed systems. I will show how searching for the appropriate abstractions leads to a new approach for building and reasoning about distributed systems.

Large-scale distributed systems change and evolve very fast as a result of adding functionality to the original system, deploying new hardware in the datacenter, responding to changes in the workload, and so on. The high rate of change in their design makes distributed systems harder to reason about and implement correctly over time. Consequently, we sought out a way to build and evolve large-scale distributed systems in a principled manner, using what we call Transformations.

Transformations offer a new way to reason about a large-scale distributed system's properties such as correctness, safety, fault-tolerance and availability as the system changes and evolves over time. Transformations introduce an algebra to reason about how distributed systems evolve over time. Moreover, using Transformations, developers can transform an existing large-scale system and change its properties automatically while being able to reason about these changes. During my talk I will cover the main contributions of Transformations by first explaining the principles we use in Transformations and then showing how they can solve very challenging problems we face as distributed system developers: sharding different components of an already replicated system while batching some of the requests and implementing encryption between select components, or taking a system replicated with Paxos and transforming it into a system replicated using Chain Replication.

About Deniz Altinbüken

Deniz Altinbüken is a PhD candidate in Systems at Cornell University, working with Robbert van Renesse. Her interests are in distributed systems and the theory of distributed computing, with a focus on building infrastructure services for large-scale distributed systems. In addition to being a consensus enthusiast, she is currently passionate about building principled distributed systems that are easy to reason about and implement correctly.

Website: http://www.cs.cornell.edu/~deniz/

Follow Deniz @denizaltinbuken

In this talk I plan to discuss the design and implementation of geo-replicated data stores that go beyond weak consistency semantics. Some offer causal consistency guarantees (with eventual convergence guarantees, often called Causal+ Consistency). Others go beyond operations over single objects in the data store to offer transactions that manipulate multiple objects with isolation guarantees. The talk will be divided into three main sections.

In the first section I will cover the motivation for providing stronger semantics than weak consistency in data stores, illustrating the need for either causal consistency or transactional semantics with some simple examples. The talk will briefly cover the design and implementation of several existing solutions and the differences between them in the design space. From this quick overview, the talk will then shift focus to two concrete solutions from academia whose design and implementation I participated in.

In the second section of the talk, I will focus on the design and implementation of ChainReaction, a geo-replicated data store that offers causal+ consistency. ChainReaction was implemented on top of the Fast Array of Wimpy Nodes (FAWN) datastore, which features a one-hop DHT (similar to the Riak one). I will discuss the variant of Chain Replication employed at the core of ChainReaction, which trades write latency for fast causally consistent reads, and explain how this design is extended to a geo-replicated scenario.

In the final section of the talk I will address the design and implementation of Blotter, a geo-replicated data store that supports arbitrary transactions with an isolation level of non-monotonic snapshot isolation. At the core of Blotter we leverage a special configuration of Paxos, which allows a single round-trip among the closest datacenters for committing a transaction.

About João Leitão

João Leitão has a PhD in Computer Engineering (2012) from the Technical Institute of Lisbon. He also holds a Master's degree (2007) and graduated in Computer Engineering from the Faculty of Sciences of the University of Lisbon. He is an Integrated Researcher in the NOVA LINCS laboratory of the NOVA University of Lisbon and an Invited Assistant Professor in the Computer Science department in the Faculdade de Ciências e Tecnologia of the same University.
His research interests are mostly focused on the design and implementation of large-scale systems ranging from cloud-based geo-replicated to peer-to-peer infrastructures, with emphasis on questions related to scale, consistency, fault-tolerance/reliability, efficiency, and security.

He is a professional member of the IEEE and ACM. He has two cats and occasionally (when time allows) practices Iaido and Jodo.

Follow João @jcaleitao

Modern NoSQL systems have made it significantly easier to build complex, constantly evolving applications. These systems' support for data models that allow nested data and dynamic schemas allows developers to quickly prototype and deploy new features. While their support for flexible data models is a boon for developers, several NoSQL systems have eschewed one of the main programmer-friendly abstractions of traditional DBMSs: atomic transactions. Atomicity guarantees that a group of writes to multiple data items is performed in an all-or-nothing fashion. Atomicity frees application developers from writing error-prone corner-case code to deal with scenarios in which only a subset of a transaction's writes succeeds.

The reason that these systems eschew atomic transactions is their seemingly prohibitive performance cost, particularly in distributed settings. In a distributed database system, atomicity requires all the machines involved in a given transaction to coordinate with each other. In particular, if a transaction reads or writes data residing on several machines, then atomicity requires each such machine to coordinate with every other machine involved in the transaction. To circumvent this distributed coordination, several popular NoSQL systems give up on general atomic transactions, and instead restrict their scope to a single key or partition.

This talk takes a step back and asks: does distributed coordination necessarily preclude performant general atomic transactions? The answer is a resounding NO. This talk analyzes the distributed coordination necessary for general atomic transactions, and describes its impact on three key performance and correctness properties: fairness, isolation, and throughput. We will find that there exists a three-way tradeoff between fairness, isolation, and throughput (FIT): a system which supports general atomic transactions can achieve at most two of these three properties simultaneously. Database architects can use the FIT tradeoff to reason about the performance and correctness tradeoffs associated with general atomic transactions in a principled fashion.

About Jose Faleiro

Jose Faleiro is a fourth year PhD student at Yale University, where he is advised by Daniel Abadi. His research interests lie in concurrency control and recovery for main-memory multi-core database systems. More broadly, he is interested in dealing with concurrency in parallel and distributed systems.

Follow Jose @jmfaleiro

Chain replication, a variation of primary-backup replication, is an increasingly common technique for maintaining strong consistency semantics within a data store. To date, more research effort has gone into formalizing the data replication technique than into formalizing management of its metadata: i.e., what decides the chain's order, and what decides when the chain order may change without violating consistency constraints.

This talk introduces Humming Consensus, a new technique for managing chain replication metadata. This metadata defines cluster membership, server order within the chain, and changes to chain order in response to peer failure. All participants in Humming Consensus are equal peers that operate without external assistance from ZooKeeper or another coordination service. Humming Consensus may also be used to manage chain replication-based eventual consistency data stores. The talk will also briefly introduce Basho's new distributed file store, Machi, which relies upon Humming Consensus to operate either in strongly consistent or eventually consistent environments. Attendees will need no prior knowledge of Erlang: the talk will focus on Machi's design and testing methods rather than its implementation internals.

About Scott Lystig Fritchie

Scott Lystig Fritchie met his first UNIX system in 1986 and has almost never met one since that he didn't like. A career detour as a UNIX systems administrator got him neck-deep in messaging systems, e-mail, and Usenet News. He rediscovered full-time programming while at Sendmail, Inc. and has been writing code for, designing, and testing distributed systems ever since.

Scott is a senior software engineer at Basho Japan KK.

Follow Scott @slfritchie

In recent years, researchers and practitioners have been trying to ensure application correctness while providing low-latency and highly available operations in geo-replicated systems. Despite these continuous efforts, it is well known that many types of operations must rely on cross-replica coordination to ensure that application invariants are preserved at all times. This is a fundamental limitation on improving services, as many companies can afford neither the extra latency needed to ensure consistency across replicas nor the reduced availability during faults, since both translate directly into revenue losses. In this talk, we discuss how to reduce cross-replica coordination while still preserving correctness properties by exploiting application semantics.

The talk is divided into two parts. In the first part, we show how to preserve global numeric invariants on top of an existing production KV-store, while moving most coordination outside of the critical path of operation execution. We achieve this using a new replicated data type to maintain the necessary information and show experimental results for a proof-of-concept prototype.

In the second part, we explore the more ambitious vision of providing generic application invariants with virtually no coordination. The fundamental insight is to identify and prevent the execution of concurrent operations, at different sites, that would result in invariant violations when the effects of the operations are merged together. We combine a static analysis tool and an online concurrency control mechanism to avoid unsafe executions.

About Valter Balegas

Valter Balegas is a third-year PhD student at Universidade Nova de Lisboa, currently working under the supervision of Nuno Preguiça, one of the creators of CRDTs. His PhD work focuses on improving correctness properties in geo-replicated systems without diminishing their availability and latency properties. He has recently proposed Explicit Consistency, a new consistency model that allows efficient implementation of applications that preserve invariants on top of weak consistency models. In the past he completed an internship at Basho, working on CRDTs.

Research interests include Distributed Systems, Scalable Data Stores, Geo-Replication and home brewing.

Follow Valter @vbalegas

San Francisco | November 4-6th, 2015

A Distributed Systems Conference

RICON 2015 Speakers

Thanks to all those who submitted and congratulations to the following who have been accepted. Details on the presentation titles and outlines will be posted shortly.

Adrian Cockcroft

Don Rippert

Ben Hindman

Nicholas Weaver

Alex Heneveld

Alex Williams

Armon Dadgar

Bridget Kromhout

Carlos Baquero

Casey Rosenthal

Christopher Meiklejohn

Colin Scott

Damien Krotkine

David Greenberg

Deniz Altinbüken

DeWayne Filppi

Duncan Grazier

Gordon Guthrie

Heather McKelvey

João Leitão

Johan Sommerfeld

John Apostolopoulos

Jon Moore

Jose Faleiro

Kevin Jones

Luis Mariano Guerra

Manav Mishra

Mark Allen

Matt Davis

Matt Ranney

Mike Zaccardo

Mikhail Panchenko

Noah Gift

Sarah Cooper

Scott Lystig Fritchie

Sean Kelly

Valter Balegas

Monitoring and Simulating Microservices

About Adrian Cockcroft

The rise of distributed systems and how to make it stop

About Nicholas Weaver

Internet of Things: Opportunities and Challenges

About John Apostolopoulos

Managing the Basho Data Platform with the Cloudsoft UX

About Mike Zaccardo

About Alex Heneveld

Preemptive, multi-tenant Spark on Mesos

About David Greenberg

Distributed: of Systems and Teams

About Bridget Kromhout

Leveraging Open Technology to Fuel the Digital Enterprise

About Don Rippert

TBD

About Ben Hindman

Distributed Chaos Operations

About Casey Rosenthal

Building Principled Distributed Systems with Transformations

About Deniz Altinbüken

Riak TS for Time Series Data

About Gordon Guthrie

TBA

About Deniz Altinbüken

Nomad: A Distributed, Optimistically Concurrent Scheduler

About Armon Dadgar

TBD

About David Palaitis

Implementation of CRDTs with δ-mutators

About Carlos Baquero

Lasp: A Language for Declarative, Distributed Edge Computation

About Christopher Meiklejohn

Minimizing Faulty Executions of Distributed Systems

About Colin Scott

Events storage and analysis with Riak at Booking.com

About Damien Krotkine

Scaling Globally powered by Riak

About Duncan Grazier