Telemetry and Performance Management in Softwarized Environments

Dan Conde

IEEE Softwarization, May 2017

 

Network telemetry and performance management are a challenge in softwarized environments. Many of the assumptions that made conventional measurement and performance management possible in traditional hardware-based systems start to break down and must be revisited. The issues fall into four areas: capacity, topology, dynamic configurations (ephemeral nature) and intent.

Capacity
Physical architectures reflected the capacity that networks required. As virtualized networks such as overlays become prevalent, and as the number of endpoints varies dynamically with objects such as virtual machines, it is difficult to determine how many connections are going to use a network’s capacity. Although abstractions like VLANs have existed in the past, they were relatively static. Software-defined networks promise to create and release connections more frequently, which makes bandwidth use more difficult to predict. These changes have made it quite challenging to determine baseline usage and to identify abnormal conditions when they occur.
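One simple way to approach a baseline when usage keeps shifting is to compare each new measurement against a short rolling window of recent measurements instead of a fixed threshold. The sketch below is illustrative only: the function name, the window size, the threshold and the sample utilization values are all assumptions, not something taken from a particular monitoring product.

```python
from statistics import mean, stdev


def find_abnormal_samples(samples, window=5, threshold=3.0):
    """Return the indices of samples that deviate from a rolling baseline.

    The baseline for sample i is the mean of the previous `window` samples.
    A sample is flagged when it differs from that mean by more than
    `threshold` standard deviations of the same window.
    """
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat window: any change at all is a deviation.
            if samples[i] != baseline:
                flagged.append(i)
            continue
        if abs(samples[i] - baseline) > threshold * spread:
            flagged.append(i)
    return flagged


# Hypothetical per-minute link utilization in Mbit/s (made-up values).
utilization = [100, 102, 98, 101, 99, 100, 400, 101, 99]
print(find_abnormal_samples(utilization))  # [6]
```

A rolling window adapts as connections are created and released, at the cost of briefly inflating the baseline after a spike. A real deployment would likely need something more robust.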

Topology
With software-defined networks, the logical network topology is no longer strongly associated with the physical topology. This means that telemetry or monitoring based on physical connections needs to be augmented with measurements based on logical connections to obtain a more realistic view of how logical entities communicate with each other.  The over-arching goal is to understand how high-level constructs such as applications or services perform, as opposed to only having access to information about low-level components such as network ports.

Dynamic configurations or Ephemeral nature
With physical networks, endpoints do not get created or destroyed quickly. In software-based networks, however, the endpoints may be virtual machines, virtual switches, or containers, any of which may be created and deleted rapidly. Assumptions about the static nature of devices no longer hold, and monitoring metrics instead need to track these software constructs. Rather than treat these objects as analogues of their physical counterparts, we need to look at them as representing logical services. A virtual switch does behave like a physical switch, but it is more worthwhile to see how it fits into an application or service. Likewise, instead of looking at the traffic from container #123, it is preferable to assess the traffic generated by the set of containers that implement the service “DBLookup”.
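The shift from per-container to per-service measurement can be sketched as a simple aggregation. The container identifiers, byte counts and the "WebFrontend" service name below are hypothetical, as is the assumption that some orchestration layer can supply a container-to-service mapping.

```python
from collections import defaultdict


def traffic_by_service(container_bytes, container_service):
    """Sum per-container byte counts into per-service totals.

    Containers with no known service are grouped under "unknown" so that
    short-lived or unlabeled containers are still accounted for.
    """
    totals = defaultdict(int)
    for container_id, byte_count in container_bytes.items():
        service = container_service.get(container_id, "unknown")
        totals[service] += byte_count
    return dict(totals)


# Hypothetical mapping, e.g. as reported by an orchestrator.
container_service = {"123": "DBLookup", "124": "DBLookup", "200": "WebFrontend"}
# Hypothetical byte counters collected per container.
container_bytes = {"123": 1500, "124": 2500, "200": 700, "999": 50}

print(traffic_by_service(container_bytes, container_service))
# {'DBLookup': 4000, 'WebFrontend': 700, 'unknown': 50}
```

Because the totals are keyed by service name, they remain meaningful even when individual containers are replaced between measurements.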

Intent – implicit or explicitly defined?
The key element for understanding softwarized networks is to distinguish logical from physical connections, to understand how the elements utilize the networks, and to determine how they are  affected by monitoring. Understanding the intent of the application is the foundation of this way of thinking.

This is a complex issue, and we first need to understand why we are performing telemetry, analytics and performance monitoring in the first place.

The common reasons include security, application performance monitoring, troubleshooting and capacity planning. In most of these cases, it is necessary to know the original intent of the applications involved in order to understand how to monitor the network appropriately. For example, a network link must be examined carefully if it connects two active and critical applications, while less attention may be paid to links between dormant systems.
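As a rough illustration of how declared intent could drive monitoring decisions, the sketch below assigns a monitoring level to a link from the stated properties of the two applications it connects. The `AppIntent` fields, the level names and the intermediate "standard" level are assumptions made for this example. The text above only distinguishes links between active, critical applications from links between dormant systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppIntent:
    """Hypothetical, deployment-specific description of an application."""
    name: str
    active: bool
    critical: bool


def monitoring_level(a: AppIntent, b: AppIntent) -> str:
    """Choose how closely to watch the link between two applications."""
    if a.active and b.active and a.critical and b.critical:
        return "detailed"   # both ends active and critical
    if not a.active and not b.active:
        return "minimal"    # both ends dormant
    return "standard"       # assumed default for everything in between


orders = AppIntent("orders", active=True, critical=True)
db_lookup = AppIntent("DBLookup", active=True, critical=True)
archive = AppIntent("archive", active=False, critical=False)
backup = AppIntent("backup", active=False, critical=False)

print(monitoring_level(orders, db_lookup))  # detailed
print(monitoring_level(archive, backup))    # minimal
print(monitoring_level(orders, archive))    # standard
```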

A large gap has developed between the domain of applications and that of infrastructure, such as networks. Network engineers often work at the packet level. Application writers are concerned with their application’s behavior, which relies on transmitting data over network connections. System administrators or DevOps engineers deal with configuring the two to work together. This separation of domains means that each group does not fully understand the intent of the others and is prone to making assumptions.

Some technologies have attempted to bridge those gaps. In commercial data center networks, the best known is Cisco’s ACI, or Application Centric Infrastructure, in which application architectures are codified as declarative intent through a policy model that drives the operation and monitoring of the network through compatible switches.

Other systems that capture similar information include the Topology and Orchestration Specification for Cloud Applications (TOSCA), which is often used to orchestrate network-based services or network function virtualization and can also be used for telco VNFs.

Without going into these systems in detail, it suffices to say that there is widespread acknowledgement of the need for network models. However, there is no widespread agreement on a common model for describing orchestration, or what is sometimes called declarative intent for networks. Until such agreement emerges, models specific to each deployment need to be used. Having some model is better than having none, so it is worth examining how these models work in order to continue to advance the state of the art.

Methods for Monitoring
Many methods are available for monitoring. Some are new, while others have traditionally been available in non-softwarized environments and are now proving to have higher relevance in softwarized environments.

Sensors for capturing telemetry
These may be sensors placed into switches, data gathered from the network via TAPs or network packet brokers, or any other entity that provides network visibility at the packet level.

Some sensors can examine application behavior or storage performance. Network sensors also play a role in providing application metrics: network data describes network performance, and application metrics can be derived by examining applications directly and, especially, by examining how an application uses the network.

Application behavior, intent and performance may be derived from the network through a variety of application performance management (APM) solutions, through network device configurations and statistics, and even through code injection, which inserts pieces of code directly into byte code streams in languages such as Java.

VM appliances
In addition to using traditional hardware-based sensors, it is possible to deploy virtual machines into cloud environments to gather metrics, provided that network traffic is channeled through these virtual machine appliances before moving on to its destination. This channeling is necessary when specialized hardware cannot be deployed into a public cloud environment.

Sensors built into infrastructure
Cloud orchestration software and the underlying foundational software may have built-in monitoring capabilities that can interface with other telemetry solutions. This is similar to traditional operating systems, which make statistics available via interface calls or via special files that are read to obtain the required statistics.
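As one concrete illustration of the "special files" approach, Linux exposes per-interface counters in the text file `/proc/net/dev`. The sketch below parses that format from a sample string so that it runs anywhere. The sample values are made up, and choosing Linux as the example platform is an assumption rather than something the discussion above prescribes.

```python
# Sample in the layout of Linux's /proc/net/dev: two header lines, then one
# line per interface with 8 receive counters followed by 8 transmit counters.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1234      10    0    0    0     0          0         0     1234      10    0    0    0     0       0          0
  eth0:   98765     200    0    0    0     0          0         0     4321     150    0    0    0     0       0          0
"""


def parse_net_dev(text):
    """Return {interface: {"rx_bytes": int, "tx_bytes": int}} from the text."""
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        fields = values.split()
        counters[name.strip()] = {
            "rx_bytes": int(fields[0]),  # first receive column
            "tx_bytes": int(fields[8]),  # first transmit column
        }
    return counters


print(parse_net_dev(SAMPLE)["eth0"])
# {'rx_bytes': 98765, 'tx_bytes': 4321}
```

On a Linux host, the same function could be fed the contents of the real file, read with `open("/proc/net/dev").read()`. Sampling it periodically and taking differences gives a byte rate per interface.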

Summary
Telemetry and monitoring in a softwarized environment are an extension of what has been performed in a physical environment. This is not a sudden shift: the adoption of virtual machines and other software-based environments has already produced a set of solutions that will also apply to systems with higher degrees of softwarization, such as virtualized networks.

However, new challenges arise due to the dynamic, elastic and shared nature of these environments. One of them is that we can no longer deduce the architectural intent by examining the design of the infrastructure. New standards that attempt to capture this intent via higher-level models are emerging, and we should expect to see more developments in this area. By understanding intent and utilizing a behavioral model for the operation of systems, we can monitor and understand the data gathered by various sensors, and interpret how it affects performance, security and other high-level concerns.

 


 

Dan Conde is an analyst covering distributed system technologies including cloud computing and enterprise networking. In this era of IT infrastructure transformation, Dan’s research focuses on the interactions of how and where workloads run, and how end-users and systems connect to each other. Cloud technologies are driving much of the change in IT today. Dan’s coverage includes public cloud platforms, cloud and container orchestration systems, software-defined architectures and related management tools. Connectivity is important to link users and applications to new cloud-based IT. Areas covered include data center, campus, wide-area and software-defined networking, network virtualization, storage networking, network security, internet/cloud networking and related monitoring & management tools. His experience in product management, marketing, professional services and software development provides a broad view into the needs of vendors and end-users.

 

Editor:

Prof. Noël Crespi holds Masters degrees from the Universities of Orsay (Paris 11) and Kent (UK), a diplome d’ingénieur from Telecom ParisTech, a Ph.D and an Habilitation from Paris VI University (Paris-Sorbonne). From 1993 he worked at CLIP, Bouygues Telecom and then at Orange Labs in 1995. He took leading roles in the creation of new services with the successful conception and launch of Orange prepaid service, and in standardisation (from rapporteurship of IN standard to coordination of all mobile standards activities for Orange). In 1999, he joined Nortel Networks as telephony program manager, architecting core network products for the EMEA region. He joined Institut Mines-Telecom in 2002 and is currently professor and Program Director, leading the Service Architecture Lab. He coordinates the standardisation activities for Institut Mines-Telecom at ITU-T, ETSI and 3GPP. He is also an adjunct professor at KAIST, an affiliate professor at Concordia University, and is on the 4-person Scientific Advisory Board of FTW (Austria). He is the scientific director of the French-Korean laboratory ILLUMINE. His current research interests are in Service Architectures, Services Webification, Social Networks, and Internet of Things/Services.
http://noelcrespi.wp.tem-tsp.eu/

 


 

