The TimedSDN Project

Tal Mizrahi and Yoram Moses, Technion - Israel Institute of Technology

Abstract. The TimedSDN project explores the use of time and synchronized clocks in Software Defined Networks (SDNs). One of the main goals of this project is to analyze use cases in which using time is beneficial. Both theoretical and practical aspects of timed coordination and synchronized clocks in SDN are analyzed in this work. Some of the products of this work are already incorporated in the OpenFlow specification, and open source prototypes of the main components are publicly available.

Introduction

Network time synchronization has evolved significantly over the last decade. Indeed, the Precision Time Protocol (PTP), defined in the IEEE 1588 standard, can synchronize clocks to a very high degree of accuracy, typically on the order of 1 microsecond. Since its publication in 2008, PTP has matured and has become a common and affordable feature in commodity switches. We argue that since most of the SDN products already have built-in hardware capabilities for accurate clock synchronization, it is only natural to harness this powerful technology to coordinate events in SDNs.

The key concept behind the TimedSDN project is simple: accurate time can be used to coordinate network configuration and policy updates in SDNs. In a nutshell, the SDN controller can invoke time-triggered updates in network switches, allowing near-simultaneous network-wide updates, or allowing the controller to invoke multi-step updates, in which each step is invoked at a different update time.
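As a rough illustration of this idea, the following Python sketch models switches that store pending updates and apply each one only when the local clock reaches its scheduled time. The `Switch` class, its `schedule` and `tick` methods, and the flow and port names are hypothetical; they are not part of OpenFlow or of the TimedSDN prototypes, and the sketch assumes perfectly synchronized clocks.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy switch model (hypothetical): applies updates when their time arrives."""
    name: str
    rules: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # (exec_time, flow, next_hop)

    def schedule(self, exec_time, flow, next_hop):
        # Store the update; nothing changes until the scheduled time.
        self.pending.append((exec_time, flow, next_hop))

    def tick(self, now):
        # Apply every pending update whose scheduled time has been reached.
        due = sorted((p for p in self.pending if p[0] <= now), key=lambda p: p[0])
        self.pending = [p for p in self.pending if p[0] > now]
        for _, flow, next_hop in due:
            self.rules[flow] = next_hop


s1, s2 = Switch("S1"), Switch("S2")

# Near-simultaneous network-wide update: same execution time at every switch.
for sw in (s1, s2):
    sw.schedule(exec_time=100, flow="f1", next_hop="port2")

# Multi-step update: each step is invoked at a different update time.
s1.schedule(exec_time=200, flow="f2", next_hop="port3")
s2.schedule(exec_time=300, flow="f2", next_hop="port4")

for now in (99, 100, 200, 300):
    for sw in (s1, s2):
        sw.tick(now)
    print(now, s1.rules, s2.rules)
```

In this toy model the controller only needs to deliver each update before its execution time; the moment at which the update takes effect is determined by the switch's clock rather than by message arrival.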

Four main aspects of time in SDN are analyzed in this project, as depicted in the four rows of Fig. 1.

Figure 1. The TimedSDN project at a glance.

Why do we need time in SDN?

We briefly present a few key use cases that greatly benefit from using time.

Flow swaps. Time4 is a network update approach that we introduced in [1,2]. In this approach multiple changes are performed at different switches at the same time.

The work of [1,2] considers a class of network update scenarios called flow swaps, and shows that Time4 is the optimal approach for implementing them; Time4 allows fewer packet drops and higher scalability than state-of-the-art approaches. A flow swapping example is presented in Fig. 2.

In this work we presented the lossless flow allocation (LFA) problem, and used a game-theoretic analysis to formally show that flow swaps are inevitable given the dynamic nature of SDN.

Figure 2. Flow swapping: flows need to convert from the 'before' configuration to the 'after' configuration. Updating S₁ and S₂ at the same time is the optimal approach.
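The intuition behind flow swaps can be sketched with a small capacity check. The two-path topology, the path names, and the capacity and demand values below are assumptions made for illustration and are not taken from Fig. 2; the point is only that updating one flow at a time passes through an intermediate state in which both flows share a path.

```python
CAPACITY = 10  # assumed capacity of each path, in arbitrary units
DEMAND = {"flow_a": 10, "flow_b": 10}  # assumed flow demands

before = {"flow_a": "upper", "flow_b": "lower"}
after = {"flow_a": "lower", "flow_b": "upper"}


def overloaded_paths(assignment):
    """Return the paths whose total load exceeds CAPACITY."""
    load = {}
    for flow, path in assignment.items():
        load[path] = load.get(path, 0) + DEMAND[flow]
    return sorted(p for p, total in load.items() if total > CAPACITY)


def intermediate_states(order):
    """Yield the assignment after each step of a one-flow-at-a-time update."""
    state = dict(before)
    for flow in order:
        state[flow] = after[flow]
        yield dict(state)


# Updating one flow at a time overloads a path in the intermediate state.
for order in (["flow_a", "flow_b"], ["flow_b", "flow_a"]):
    first_step = next(intermediate_states(order))
    print(order, "->", overloaded_paths(first_step))

# Updating both flows at the same instant skips the intermediate state.
print("simultaneous ->", overloaded_paths(after))
```

Under these assumed numbers, either sequential order overloads one path after its first step, whereas the simultaneous swap never does.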

Timed consistent updates. A consistent network update [3] is an update that guarantees that every packet is forwarded either according to the previous configuration before the update, or according to the new configuration after the update, but not according to a mixture of the two.

The approach we presented in [4,5] introduces time-triggered multi-phase network updates, which can guarantee consistency while requiring a shorter duration than existing consistent update methods. An example is depicted in Fig. 3.

This approach is shown to reduce the expensive overhead of maintaining duplicate configurations.

Figure 3. A timed consistent update example: when updating from the 'before' to the 'after' configuration, each local update is performed at a different time, T₁, T₂, and T₃.
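One plausible way to pick the update times of a multi-phase update is to space consecutive phases by a guard interval. The rule used below (a bound on the clock error plus a bound on the network delay) is an assumption made for this sketch and is not a formula stated in this article; the function name `phase_times` and all numeric values are hypothetical.

```python
def phase_times(start, num_phases, clock_error_bound, delay_bound):
    """Return one execution time per phase, spaced by a guard interval.

    Assumption: a guard of clock_error_bound + delay_bound is long enough
    for one phase to take effect before the next phase begins.
    """
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")
    guard = clock_error_bound + delay_bound
    return [start + k * guard for k in range(num_phases)]


# Hypothetical values, in microseconds.
t1, t2, t3 = phase_times(start=1_000_000, num_phases=3,
                         clock_error_bound=1, delay_bound=500)
print(t1, t2, t3)
```

With accurate clocks the clock error term is small, so under this assumption the guard interval is dominated by the network delay bound.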

Data plane timestamping. In [6] we argued that in the unique environment of SDN, attaching a timestamp to the header of all packets, i.e., data plane timestamping (DPT), is a powerful feature that can be leveraged by diverse SDN applications. We analyzed three key use cases that demonstrate the advantages of using DPT: (i) network telemetry, (ii) consistent network updates, and (iii) load balancing. We also showed that SDN applications can benefit even from using as little as one bit for the timestamp field.

Coordinated management. As discussed in [7], time can be a valuable tool in coordinating network management, not only in SDNs, but in any centrally managed network. For example, synchronized time can be used to coordinate a configuration update that should be performed at multiple nodes at the same time, or to take a coordinated snapshot of the state of multiple nodes in the network.

Scheduling protocols

One of the main goals of this project is to define extensions to standard network protocols, enabling practical implementations of the concepts we present. We defined a new feature in OpenFlow called Scheduled Bundles [1,2], and a similar capability in NETCONF [7]. These two features enable time-triggered operations in OpenFlow and in NETCONF, respectively. As a result of our work, the capability to perform time-triggered updates has been incorporated into the OpenFlow 1.5 protocol [8], and has been defined for the NETCONF protocol in an experimental IETF RFC [9]. Open source prototypes are available for these extensions [10].

Accurate scheduling methods

One of the main challenges in the use of accurate time is to implement accurate execution of events, i.e., guaranteeing that scheduled network updates are executed as close as possible to the time for which they were scheduled. Even if all the switches have perfectly synchronized clocks, executing events at their scheduled time may be challenging due to the nondeterministic nature of the switches' operating systems, and due to other running tasks. Two accurate scheduling methods were defined and analyzed in this project, TimeFlip and OneClock.

TimeFlip. In [11,12] we introduced TimeFlips, which can be implemented in Ternary Content Addressable Memories (TCAMs). A TimeFlip is a timestamp-based TCAM range in a hardware switch. We showed that TimeFlip is a practical method of implementing accurate time-based network updates. TimeFlip was tested on a real-life device, a Marvell DX switch, showing that a TimeFlip can be performed with sub-microsecond accuracy while requiring very limited TCAM resources.
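To give a feel for what a timestamp-based TCAM range involves, the sketch below covers a range of timestamp values with ternary (value, mask) entries using standard prefix expansion. This is a generic technique used here for illustration; the encoding in [11,12] may differ, and the function names and the 4-bit timestamp width are assumptions.

```python
def range_to_ternary(lo, hi, width):
    """Cover the inclusive range [lo, hi] with (value, mask) ternary entries."""
    entries = []
    full = (1 << width) - 1
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo.
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        # Mask bits set to 1 are compared; bits set to 0 are "don't care".
        mask = full & ~(size - 1)
        entries.append((lo, mask))
        lo += size
    return entries


def matches(timestamp, entries):
    """Return True if the timestamp hits any ternary entry."""
    return any((timestamp & mask) == value for value, mask in entries)


WIDTH = 4  # assumed timestamp field width, in bits
aligned = range_to_ternary(8, 15, WIDTH)    # rule active from time 8 onwards
unaligned = range_to_ternary(5, 15, WIDTH)  # rule active from time 5 onwards
print(len(aligned), len(unaligned))
```

In this toy setting, the aligned start time needs a single entry while the unaligned one needs three, which hints at why the choice of the scheduled time affects how many TCAM entries a time range consumes.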

OneClock. OneClock [7] is a prediction-based scheduling approach that uses timing information collected at runtime to accurately schedule future operations. Three prediction algorithms were analyzed in this work: (i) an average-based algorithm, (ii) a fault-tolerant average (FT-Average), and (iii) a Kalman-Filter-based algorithm.
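The first two predictors can be sketched as follows. The sketch assumes that the quantity being predicted is the latency between invoking an operation and its execution, that the fault-tolerant average discards the `f` lowest and `f` highest samples before averaging, and that the operation is invoked early by the predicted latency; these details, the function names, and the sample values are assumptions made for illustration, and the Kalman-Filter-based algorithm is omitted.

```python
from statistics import mean


def average_predictor(samples):
    """Predict the next latency as the plain average of past samples."""
    return mean(samples)


def ft_average_predictor(samples, f=1):
    """Predict the next latency after discarding the f lowest and f highest samples."""
    if len(samples) <= 2 * f:
        raise ValueError("need more than 2*f samples")
    trimmed = sorted(samples)[f:len(samples) - f]
    return mean(trimmed)


def invocation_time(target_time, predicted_latency):
    """Invoke early, so that execution is expected to land on target_time."""
    return target_time - predicted_latency


# Hypothetical latency measurements in microseconds; 900 is an outlier.
latencies_us = [210, 190, 205, 195, 900]

plain = average_predictor(latencies_us)
robust = ft_average_predictor(latencies_us, f=1)
print(plain, robust, invocation_time(1_000_000, robust))
```

With these sample values a single outlier pulls the plain average well above the typical latency, while the trimmed average stays close to it.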

Clock synchronization

Clock synchronization is an essential piece in the puzzle. ReversePTP [13,14,15] is a clock synchronization scheme that is adapted to the centralized SDN environment; in ReversePTP all nodes (switches) in the network distribute timing information to a single software-based central node (the SDN controller), which tracks the state of all the clocks in the network (see Fig. 4). Thus, all computations and bookkeeping are performed by the central node, whereas the 'dumb' switches are only required to periodically send their current time to the controller. In accordance with the SDN paradigm, the 'brain' is implemented in software, making ReversePTP flexible and programmable from an SDN programmer's perspective.

Interestingly, ReversePTP can be defined as a PTP profile, i.e., a subset of the features of PTP. Consequently, ReversePTP can be implemented by existing PTP-enabled switches.

Figure 4. ReversePTP in SDN: switches distribute their time to the controller. Switches' clocks are not synchronized. For every switch i, the controller knows offset_i, the offset between switch i's clock and the controller's local clock.
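The controller-side bookkeeping can be sketched as a table of per-switch offsets. The sketch assumes a PTP-style two-way timestamp exchange with a symmetric path delay, and defines offset_i as the switch's clock minus the controller's clock; the sign convention, the `OffsetTable` class, and the timestamp values are assumptions made for illustration rather than details given in this article.

```python
class OffsetTable:
    """Hypothetical controller-side table of per-switch clock offsets."""

    def __init__(self):
        self.offset = {}  # switch id -> (switch clock - controller clock)

    def update(self, switch_id, t1, t2, t3, t4):
        """Estimate the offset from one two-way exchange.

        t1: switch sends its time       (switch clock)
        t2: controller receives it      (controller clock)
        t3: controller sends a request  (controller clock)
        t4: switch receives the request (switch clock)
        Assumes the path delay is the same in both directions.
        """
        self.offset[switch_id] = ((t4 - t3) - (t2 - t1)) / 2

    def to_switch_time(self, switch_id, controller_time):
        """Translate a time on the controller's clock to the switch's clock."""
        return controller_time + self.offset[switch_id]


# Hypothetical exchange: switch clock is 50 ahead, one-way delay is 10.
table = OffsetTable()
table.update("S1", t1=1050, t2=1010, t3=1020, t4=1080)
print(table.offset["S1"], table.to_switch_time("S1", 2000))
```

Under these assumptions, the controller can schedule an event for a common instant by sending each switch that instant expressed in the switch's own time, without adjusting any switch clock.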

Conclusion

The TimedSDN project studies various aspects of using synchronized time in SDN. We believe there is great potential for future work, as many SDN applications may benefit from using time and synchronized clocks.

References

[1] T. Mizrahi and Y. Moses, “Software Defined Networks: It’s about time,” in IEEE INFOCOM, 2016.

[2] T. Mizrahi and Y. Moses, “Time4: Time for SDN,” IEEE Transactions on Network and Service Management (TNSM), 2016.

[3] M. Reitblatt, N. Foster, J. Rexford, C. Schlesinger, and D. Walker, “Abstractions for network update,” in ACM SIGCOMM, 2012.

[4] T. Mizrahi, E. Saat, and Y. Moses, “Timed consistent network updates,” in ACM SIGCOMM Symposium on SDN Research (SOSR), 2015.

[5] T. Mizrahi, E. Saat, and Y. Moses, “Timed consistent network updates in software defined networks,” IEEE/ACM Transactions on Networking (ToN), 2016.

[6] T. Mizrahi and Y. Moses, “The case for data plane timestamping in SDN,” in IEEE INFOCOM Workshop on Software-Driven Flexible and Agile Networking (SWFAN), 2016.

[7] T. Mizrahi and Y. Moses, “OneClock to rule them all: Using time in networked applications,” in IEEE/IFIP Network Operations and Management Symposium (NOMS) mini-conference, 2016.

[8] Open Networking Foundation, “OpenFlow switch specification,” Version 1.5.0, 2015.

[9] T. Mizrahi and Y. Moses, “Time Capability in NETCONF,” RFC 7758, IETF, 2016.

[10] “TIME4 source code,” https://github.com/TimedSDN, 2014.

[11] T. Mizrahi, O. Rottenstreich, and Y. Moses, “TimeFlip: Scheduling network updates with timestamp-based TCAM ranges,” in IEEE INFOCOM, 2015.

[12] T. Mizrahi, O. Rottenstreich, and Y. Moses, “TimeFlip: Using Timestamp-based TCAM Ranges to Accurately Schedule Network Updates,” IEEE/ACM Transactions on Networking (ToN), 2016.

[13] T. Mizrahi and Y. Moses, “Using ReversePTP to distribute time in software defined networks,” in International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication (ISPCS), 2014.

[14] T. Mizrahi and Y. Moses, “ReversePTP: A software defined networking approach to clock synchronization,” in ACM SIGCOMM Workshop on Hot topics in Software Defined Networks (HotSDN), 2014.

[15] T. Mizrahi and Y. Moses, “ReversePTP: A clock synchronization scheme for software defined networks,” International Journal of Network Management (IJNM), 2016.

Tal Mizrahi is a switch architect at Marvell, with over 15 years of experience in networking. He has recently completed his PhD at the Technion. Tal is an active participant in the Internet Engineering Task Force (IETF), and in the IEEE 1588 working group. His fields of interest include network protocols, switch and router architecture, time synchronization, and distributed systems.

Yoram Moses is the Israel Pollak academic chair and a professor of electrical engineering at the Technion. His research focuses on distributed and multi-agent systems, with an emphasis on fault-tolerance and on applications of knowledge and time in such systems. He is a co-author of the book Reasoning about Knowledge, and a recipient of the Gödel Prize in 1997 and the Dijkstra Prize in 2009.

Editor:

Eliezer Dekel is a Chief Architect for Huawei Technologies Corporate Reliability Department. He is researching RAS for SDN and NFV. He retired from IBM Research - Haifa as a Senior Technical Staff Member and Chief Architect for Distributed Systems. In this role he focused on developing infrastructure technologies for very large scale distributed systems.

Eliezer Dekel is the editor in chief of EAI Endorsed Transactions on Cloud Systems. He is also an Associate Editor for ACM Computing Surveys and a member of the editorial board of the IEEE SDN Newsletter. Eliezer has served on numerous conference program committees, and has organized or chaired some of these conferences. He has been involved in research in the areas of distributed and fault-tolerant computing, service-oriented technology, and software engineering. He was recently working on technologies for providing Quality of Service, with a focus on dependability, in very large scale multi-tier environments. In this area he initiated, together with colleagues, the very successful International Workshop on Large Scale Distributed Systems and Middleware (LADIS). This workshop, sponsored by ACM, was one of the first workshops to focus on the foundations of "cloud computing." He was an organizer of CloudSlam'09, the first cloud computing virtual conference. Eliezer was also involved in several EU FP7 ICT funded projects.

Eliezer has a Ph.D. and M.Sc. in computer science from the University of Minnesota, and a B.Sc. in mathematics from Ben Gurion University, Israel. Prior to joining IBM Research - Haifa, Eliezer served on the faculty of the University of Texas at Dallas computer science department for more than ten years.