Title: Models and Software Tools for Managing Network Complexity
Date: May 16, 2018
Researcher Name(s): Xin Sun
University: Ball State University

Long Term Goal(s)

This project seeks to develop quantitative models for measuring network management complexity, and a set of software tools for managing such complexity. Our long-term goals are the following:

Goal 1: We will develop a suite of objective models that accurately quantify the management complexity of a given network by measuring the amount of effort operators need to understand and reason about the network's behavior. Our key insight is that this effort depends directly on characteristics of the network's design and configuration, including the number of moving parts, their interplay, and how they together implement network-wide policies or requirements. Our models will therefore be based on these design and configuration characteristics.

Goal 2: Capitalizing on the complexity models, we will develop a set of software tools for managing the complexity. More specifically, we envision that the tools will be able to:
- Perform what-if analysis of how a proposed design change to an existing network may affect its management complexity, and thus help operators avoid costly quick fixes;
- Compare multiple design proposals for a new network, and enable operators to select the simplest alternative;
- Diagnose an existing network to find the root cause of its high complexity, and allow operators to quickly get to the bottom of the complexity problem.

Background for Long Term Goals

Operator interviews and anecdotal evidence suggest that networks with more complex designs generally require more manual intervention to manage, are more difficult to reason about, predict, and troubleshoot, and are more prone to human error.
For many complex enterprise networks, the amount of management effort required has become the dominant cost of operation [1]. Despite this investment, design mistakes and configuration errors account for more than half of network outages [1][2], 80% of Air Force network vulnerabilities [3], and 65% of all successful cyber-attacks [4]. In a recent white paper titled "What's behind network downtime" by Juniper Networks, it is noted that "Human error is blamed for 50 to 80 percent of network outages. But in this era of complex systems, the cause is probably not incompetence. Human error should be seen not as the direct cause of the problem but as a symptom of complexity. [R]educing and managing network complexity will have the biggest potential impact on network downtime." [2]
Part of the complexity is inherent, given the wide range of operational objectives that these networks must support, including security (e.g., implementing reachability policy), resiliency (e.g., tolerating up to two component failures), safety (e.g., freedom from forwarding loops), and performance. There is also evidence, however, that some network design complexity results from a semantic gap between the high-level design objectives and the diverse set of routing protocols and low-level router primitives that operators must choose from [4]. Operators generally agree that for the same target network, multiple designs often exist that meet the same operational objectives (e.g., security and resiliency requirements), and that some designs are significantly easier to implement and manage than others. However, operators' reasoning about complexity remains qualitative and subjective, and is essentially a black art.

Despite being an intrinsic and critically important property of a network, complexity remains the least understood one. Unlike other network properties such as performance and resiliency, which have been extensively studied and for which models and algorithms have been successfully developed, complexity has received little systematic treatment from the networking community. This project seeks to bridge this gap by creating quantitative models for measuring network complexity and software tools for managing it.

The proposed research will be facilitated by the large configuration dataset the PI has obtained, which includes complete device configuration snapshots from five operational enterprise networks. This valuable dataset will be used to guide and validate the design of the complexity metrics and models and the development of the software tools.
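To make the idea of a quantitative complexity measure concrete, the following is a minimal sketch in Python. It is an illustrative stand-in only, not the models this project proposes to develop: the `Design` class, the scoring rule (one point per configuration component plus a weighted point per reference between components), the weight value, and the component names are all hypothetical assumptions chosen for the example. It shows how a score of this kind could support what-if analysis of a proposed change and a simple ranking of the components that contribute most to the score.

```python
from dataclasses import dataclass, field


@dataclass
class Design:
    """Toy network design: configuration components and the references
    (dependencies) between them. Hypothetical structure for illustration."""
    components: set[str] = field(default_factory=set)
    references: set[tuple[str, str]] = field(default_factory=set)

    def add_reference(self, src: str, dst: str) -> None:
        # Referencing a component implicitly adds both endpoints.
        self.components.update((src, dst))
        self.references.add((src, dst))


def complexity(design: Design, interplay_weight: float = 2.0) -> float:
    """Assumed score: one point per component ("moving part") plus
    interplay_weight points per reference ("interplay")."""
    return len(design.components) + interplay_weight * len(design.references)


def what_if(current: Design, proposed: Design) -> float:
    """Change in the toy score if `proposed` replaced `current`."""
    return complexity(proposed) - complexity(current)


def top_contributors(design: Design, n: int = 3) -> list[tuple[str, int]]:
    """Rank components by the number of references that touch them."""
    counts = {c: 0 for c in design.components}
    for src, dst in design.references:
        counts[src] += 1
        counts[dst] += 1
    # Highest count first; ties broken alphabetically for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Existing design: two routing processes, each using one policy object.
current = Design()
current.add_reference("ospf-1", "acl-10")
current.add_reference("bgp-65000", "route-map-A")

# Proposed quick fix: a new ACL referenced by both routing processes.
proposed = Design(set(current.components), set(current.references))
proposed.add_reference("ospf-1", "acl-11")
proposed.add_reference("bgp-65000", "acl-11")

print(complexity(current))            # 8.0
print(what_if(current, proposed))     # 5.0
print(top_contributors(proposed, 2))  # [('acl-11', 2), ('bgp-65000', 2)]
```

In this toy setting the quick fix raises the score from 8.0 to 13.0, and the ranking points at the newly added ACL as one of the most heavily referenced components. A real model would need to be derived from, and validated against, actual configuration data.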
The PI has worked extensively in the field of network management [5][6][7][8][9][10][11][12][13][14][15], and has gained valuable insights, theoretical techniques, and software tools from this past experience, which will help ensure the success of this project. The proposed research will also be strengthened by the PI's extensive interactions with the operator community.

Intermediate Term Objectives

- Characterize the design and configuration of operational enterprise networks, and distill common design/configuration patterns;
- Analyze the management complexity of the distilled patterns, and abstract the findings into mathematical models;
- Create an algorithm for automatically applying the models and computing the complexity of a given network;
- Develop software tools for what-if analysis and for complexity root-cause diagnosis.

Schedule of Major Steps

We expect the project to start on August 1st, 2018. Below is the estimated timeline of the major steps:
Task | Timeline
Design of preliminary complexity models | August to December 31, 2018
Development of software tool for what-if analysis | January to March 31, 2019
Development of software tool for complexity root-cause diagnosis | April to June 30, 2019
Write-up of the final project report | July 2019

Dependencies

The design and validation of the complexity models depend on the availability of configuration data from operational networks. The PI has successfully secured such data from six networks, and they will be used in every step of the research project.

Major Risks

None.

Budget

One graduate research assistant, for one semester (stipend and tuition): $7,930
Two-week summer pay for the PI: $5,964
Fringe benefits: $1,083
Travel to future S2ERC meetings (both PI and student): $3,380
Indirect cost (10%): $1,643
Total: $20,000

Staffing

PI: Dr. Xin Sun
One graduate student

Category of Current Stage

New proposal
Contacts with Affiliates

None.

Publications and Other Research Products (actual or potential)

We expect multiple publications from this project in peer-reviewed conferences and journals.

References

[1] Z. Kerravala. "As the value of enterprise networks escalates, so does the need for configuration management." The Yankee Group Report, 2004.
[2] Juniper Networks. "What is behind network downtime?" White paper. Available at http://www-935.ibm.com/services/tw/gts/pdf/200249.pdf, 2008.
[3] Center for Strategic and International Studies. "Securing cyberspace for the 44th presidency." Available at http://csis.org/files/media/csis/pubs/081208_securingcyberspace_44.pdf, 2008.
[4] J. Pescatore. Taxonomy of software vulnerabilities. The Gartner Group Report, 2003.
[5] Xin Sun and Geoffrey Xie. An Integrated Systematic Approach to Designing Enterprise Access Control. IEEE/ACM Transactions on Networking. Accepted for publication, 2016.
[6] Xin Sun. A top-down framework for modeling routing design complexity. Chapter in book Redesigning the Future of Internet Architectures. IGI Global. ISBN13: 9781466683716. May 2015.
[7] Dennis Volpano, Xin Sun and Geoffrey Xie. Towards Systematic Detection and Resolution of Network Control Conflicts. In Proceedings of ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN), Chicago, IL, August 2014. pp. 67-72.
[8] Xin Sun and Geoffrey Xie. Minimizing Network Complexity through Integrated Top-down Design. In Proceedings of ACM CoNEXT, Santa Barbara, CA, December 2013. pp. 259-270.
[9] Xin Sun, Sanjay Rao and Geoffrey Xie. Modeling Complexity of Enterprise Routing Design. In Proceedings of ACM CoNEXT, Nice, France, December 2012. pp. 85-96.
[10] Minlan Yu, Xin Sun, Nick Feamster, Sanjay Rao, Jennifer Rexford. A Survey of Virtual LAN Usage in Campus Networks. IEEE Communications Magazine, Volume 49, Issue 7, pp. 98-103, July 2011.
[11] Yu-Wei Sung, Xin Sun, Sanjay Rao, Geoffrey Xie and David Maltz. Towards Systematic Design of Enterprise Networks. IEEE/ACM Transactions on Networking, Volume 19, Issue 3, pp. 695-708, June 2011.
[12] Xin Sun, Jinliang Wei, Sanjay Rao and Geoffrey Xie. A Software Toolkit for Visualizing Enterprise Routing Design. In Proceedings of IEEE Symposium on Configuration Analytics and Automation (SafeConfig), Arlington, VA, October 2011. pp. 1-8.
[13] Xin Sun and Sanjay Rao. A Cost-Benefit Framework for Judicious Enterprise Network Redesign. IEEE INFOCOM (mini-conference), Shanghai, China, April 2011.
[14] Mohammad Hajjat, Xin Sun, Yu-Wei Sung, David Maltz, Sanjay Rao, Kunwadee Sripanidkulchai, and Mohit Tawarmalani. Cloudward Bound: Planning for Beneficial Migration of Enterprise Applications to the Cloud. In Proceedings of ACM SIGCOMM, New Delhi, India, 2010. pp. 243-254.
[15] Mohammad Hajjat, Xin Sun, Yu-Wei Sung, David Maltz, Sanjay Rao, Kunwadee Sripanidkulchai, and Mohit Tawarmalani. Cloudward Bound: Planning for Beneficial Migration of Enterprise Applications to the Cloud. In Proceedings of ACM SIGCOMM, New Delhi, India, 2010. pp. 243-254.