Interactive Analysis of Large Distributed Systems with Scalable Topology-based Visualization

Size: px

Start display at page:

Download "Interactive Analysis of Large Distributed Systems with Scalable Topology-based Visualization"

Georgiana Barrett
6 years ago
Views:

1 Interactive Analysis of Large Distributed Systems with Scalable Topology-based Visualization Lucas M. Schnorr, Arnaud Legrand, and Jean-Marc Vincent Firstname.Lastname@imag.fr Laboratoire d Informatique de Grenoble, University Grenoble-Alpes, France Austin, 2013, April / 25

2 Outline 1 Large Applications Analysis 2 The Scalable Topology-based Visualization 3 Analysis on Case Studies 4 Synthesis 2 / 25

3 Scalable Topology-based Visualization 1 Large Applications Analysis 2 The Scalable Topology-based Visualization 3 Analysis on Case Studies 4 Synthesis 3 / 25

Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today HPC (clusters, petascale systems, soon exascale.

4 Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing 562,960 cores Complex platforms with many open issues resource discovery and monitoring resource & data management energy consumption reduction resource economics application scheduling fault-tolerance and availability scalability and performance decentralized algorithms 4 / 25

5 Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing Complex platforms with many open issues resource discovery and monitoring resource & data management energy consumption reduction resource economics application scheduling fault-tolerance and availability scalability and performance decentralized algorithms 4 / 25

6 Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing Complex platforms with many open issues resource discovery and monitoring resource & data management energy consumption reduction resource economics application scheduling fault-tolerance and availability scalability and performance decentralized algorithms 4 / 25

3+ PetaFLOPS Complex platforms with many open issues resource discovery and monitoring resource & data management energy consumption

7 Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing 540,000+ hosts 7.3+ PetaFLOPS Complex platforms with many open issues resource discovery and monitoring resource & data management energy consumption reduction resource economics application scheduling fault-tolerance and availability scalability and performance decentralized algorithms 4 / 25

8 Large Applications Analysis The Scalable Topology-based Visualization Analysis on Case Studies Synthesis Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today Google Data Center HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing Complex platforms with many open issues resource discovery and monitoring application scheduling resource & data management fault-tolerance and availability energy consumption reduction scalability and performance resource economics decentralized algorithms 4 / 25

9 Large Applications Analysis The Scalable Topology-based Visualization Analysis on Case Studies Synthesis Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today Google Data Center HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing Complex platforms with many open issues resource discovery and monitoring application scheduling resource & data management fault-tolerance and availability energy consumption reduction scalability and performance resource economics decentralized algorithms 4 / 25

Large Applications Analysis The Scalable Topology-based Visualization Analysis on Case Studies Synthesis Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production

10 Large Applications Analysis The Scalable Topology-based Visualization Analysis on Case Studies Synthesis Large-Scale Distributed Systems Large-scale parallel and distributed systems are in production today Google Data Center HPC (clusters, petascale systems, soon exascale...) Grid platforms Peer-to-peer file sharing Distributed volunteer computing Cloud Computing Complex platforms with many open issues resource discovery and monitoring application scheduling resource & data management fault-tolerance and availability energy consumption reduction scalability and performance resource economics decentralized algorithms 4 / 25

11 Large-Scale Distributed Systems Such applications and systems deserve very advanced analysis Their debugging and tuning are technically difficult Their use induce high methodological challenges Some important factors Very large number of components (Hundreds to millions of processing units) Organization in large hierarchical networks Application Performance depends on Resource characteristics (bandwidth, latency, power) Network Topology Keep in mind data locality, avoiding backbone congestion Or heavily use backbones if local networks are slow Observe intra and inter-application congestion Necessity for well-suited analysis tools Explicitly showing where the congestion is The moment (time) and the network links (space) 5 / 25

12 Large-Scale Distributed Systems Such applications and systems deserve very advanced analysis Their debugging and tuning are technically difficult Their use induce high methodological challenges Some important factors Very large number of components (Hundreds to millions of processing units) Organization in large hierarchical networks Application Performance depends on Resource characteristics (bandwidth, latency, power) Network Topology Keep in mind data locality, avoiding backbone congestion Or heavily use backbones if local networks are slow Observe intra and inter-application congestion Necessity for well-suited analysis tools Explicitly showing where the congestion is The moment (time) and the network links (space) 5 / 25

13 Analysis of Large-Scale Distributed Systems Behavior Textual or statistical analysis of traces is usually not sufficient (Topology is not depicted at all) Performance Visualization Analysis: Event Traces are Visually Depicted Gantt-charts :Detailed visualization of causality Space (vertical dimension) shows resources Time (horizontal dimension) show behavior Several tools have Gantt-Charts (to cite a few) Projections (Urbana), Jumpshot (among others) Vampir (Dresden), Paje (Grenoble), Paraver (Barcelona) Difficulty: lacks topological data (or flat representation of the network) Fixed Topology for HPC supercomputers Shows profiling information on the topology Tools: Scalasca (Juelich), ParaProf (Portland),... Difficulty: Visualization scalability? No temporal data. Topological Analysis for Distributed Systems Shows topology with application behavior Tools: Overview Difficulty: scales only to a few dozens of nodes 6 / 25

14 Our contribution Scalable technique for Interactive Visualization Analysis Methodology Multi-scale data aggregation Correlate application behavior with network topology and scale up to thousands of nodes Let analyst decide the aggregation policy Both in time and space Dynamic and interactive graph layout Force-directed algorithm (Barnes-Hut) User interacts with the graph Quick demonstration of the visualization 7 / 25

15 Scalable Topology-based Visualization 1 Large Applications Analysis 2 The Scalable Topology-based Visualization 3 Analysis on Case Studies 4 Synthesis 8 / 25

16 Mapping Trace Metrics to the Graph Nodes Nodes of the graph : Model monitored entities network links, computing nodes, routers,... Edges of the graph : Model a relationship between two monitored entities communications, synchronizations,... Attributes of nodes : Geometric shapes: Hosts are squares, Links are diamonds, Routers are dots Sizes: Depending on trace metrics associated to the node Filling: decomposition of the metric, qualitative information (color), / 25

17 Mapping Trace Metrics to the Graph Example Three different mappings (at timestamps A, B and C) Left shows the trace metrics associated Right shows the corresponding graph visualizations Filling is used as resource utilization Analysing an application timestamp by timestamp? not really useful, not scalable 10 / 25

18 Mapping Trace Metrics to the Graph Example Three different mappings (at timestamps A, B and C) Left shows the trace metrics associated Right shows the corresponding graph visualizations Filling is used as resource utilization Analysing an application timestamp by timestamp? not really useful, not scalable 10 / 25

19 Multi-Scale Data Aggregation Constraints Large amount of trace data Time (microseconds events for long observation periods) Space (structured large-scale hierarchy) Particular behavior and anomalies Appear on different scales Network congestion: Only a part of the topology presents contention Global temporal network congestion A network link is always the contention We need to easily navigate in all scales of trace data in time and space dimensions 11 / 25

20 Multi-Scale Data Aggregation How it works? Assume we measure a given quantity ρ on each resource: { R T R ρ : (r, t) ρ(r, t) Example : ρ(r, t) is the computing power of resource r at time t Assume we can define a neighborhood N Γ, (r, t) (eg hyperrectangle) Then define an approximation F of ρ on N Γ, by R T R F Γ, : (r, t) ρ(r, t )dr dt (1) Remarks : N Γ, (r,t) A summation of the quantity ρ in the neighborhood Γ and Correctly and dynamically deciding the neighborhood is a key that is left to the analyst that dynamically adjust the neiborhood parameter according to the visualization 12 / 25

21 Multi-Scale Data Aggregation How it works? Assume we measure a given quantity ρ on each resource: { R T R ρ : (r, t) ρ(r, t) Example : ρ(r, t) is the computing power of resource r at time t Assume we can define a neighborhood N Γ, (r, t) (eg hyperrectangle) Then define an approximation F of ρ on N Γ, by R T R F Γ, : (r, t) ρ(r, t )dr dt (1) Remarks : N Γ, (r,t) A summation of the quantity ρ in the neighborhood Γ and Correctly and dynamically deciding the neighborhood is a key that is left to the analyst that dynamically adjust the neiborhood parameter according to the visualization 12 / 25

22 Temporal Data Aggregation Temporal neighborhood is a time slice For example, considering a time slice from A 1 to A 2 Trace metrics, temporally aggregated, are mapped to HostA Possible operations Slide the time-slice Animation in time Increase and decrease its size looking for other scales In our approach Analyst can interactively select the time slice Visual feedback is instantaneous 13 / 25

23 Temporal Data Aggregation Temporal neighborhood is a time slice For example, considering a time slice from A 1 to A 2 Trace metrics, temporally aggregated, are mapped to HostA Possible operations Slide the time-slice Animation in time Increase and decrease its size looking for other scales In our approach Analyst can interactively select the time slice Visual feedback is instantaneous 13 / 25

24 Temporal Data Aggregation Temporal neighborhood is a time slice For example, considering a time slice from A 1 to A 2 Trace metrics, temporally aggregated, are mapped to HostA Possible operations Slide the time-slice Animation in time Increase and decrease its size looking for other scales In our approach Analyst can interactively select the time slice Visual feedback is instantaneous 13 / 25

25 Spatial Data Aggregation Spatial neighborhood is the resource hierarchy Grid, Cluster, Machine, Processor, Core Cluster, Rack, U, Processor, Core For example, considering two groups of resources Two spatial aggregation steps In our approach Analyst chooses the level of detail by clicking on nodes Visual feedback is instantaneous (but challenging) How to represent a few thousands nodes interactively? Force-directed graph layout 14 / 25

26 Spatial Data Aggregation Spatial neighborhood is the resource hierarchy Grid, Cluster, Machine, Processor, Core Cluster, Rack, U, Processor, Core For example, considering two groups of resources Two spatial aggregation steps In our approach Analyst chooses the level of detail by clicking on nodes Visual feedback is instantaneous (but challenging) How to represent a few thousands nodes interactively? Force-directed graph layout 14 / 25

27 Spatial Data Aggregation Spatial neighborhood is the resource hierarchy Grid, Cluster, Machine, Processor, Core Cluster, Rack, U, Processor, Core For example, considering two groups of resources Two spatial aggregation steps In our approach Analyst chooses the level of detail by clicking on nodes Visual feedback is instantaneous (but challenging) How to represent a few thousands nodes interactively? Force-directed graph layout 14 / 25

28 Implementing Interactivity Force-directed layout Force-directed parameters Example Barnes-hut algorithm for graph layout Coulomb s repulsion law All nodes have a (positive) charge Hooke s attraction attraction Connected nodes have an attraction force (spring) Changing charge and spring parameters 15 / 25

29 Implementing Interactivity Multiple Scales Dealing with multiple scales Example Changing the mapping to the visualization scale Note that from situation B to C, time-slice is the same But nodes size are different 16 / 25

30 Scalable Topology-based Visualization 1 Large Applications Analysis 2 The Scalable Topology-based Visualization 3 Analysis on Case Studies 4 Synthesis 17 / 25

31 Case Studies and Results Applications NAS-DT Benchmark Analysis Traces obtained with the Simulated MPI (SMPI) tool Non-cooperative Master Worker Applications Traces from the SimGrid toolkit Objective of the evaluation Illustrate the potential of the approach Pin-point network contention Evaluate load balancing Trace metrics mapping Host are squares Network links are diamonds Filling is resource utilization 18 / 25

32 NAS-DT Benchmark Analysis Scenario configuration Analysis NAS-DT Class A White Hole algorithm Two clusters: Adonis (left) and Griffon (right) Processes are allocated sequentially starting at Adonis Central backbone is always the point of contention Changing to smaller time-slices allows us to observe that Possible explanation: bad process deployment of MPI 19 / 25

33 NAS-DT Benchmark Analysis Scenario configuration Analysis NAS-DT Class A White Hole algorithm Two clusters: Adonis (left) and Griffon (right) Processes are allocated sequentially starting at Adonis Central backbone is always the point of contention Changing to smaller time-slices allows us to observe that Possible explanation: bad process deployment of MPI 19 / 25

34 NAS-DT Benchmark Analysis Scenario configuration Analysis NAS-DT Class A White Hole algorithm Two clusters: Adonis (left) and Griffon (right) Processes are allocated sequentially starting at Adonis Central backbone is always the point of contention Changing to smaller time-slices allows us to observe that Possible explanation: bad process deployment of MPI 19 / 25

35 NAS-DT Benchmark Analysis (Best Deployment) New execution with a better deployment Analysis Exploit locality, avoid backbone link Central backbone is no longer a contention point Smaller time-slices shows that it is only at the beginning Reduction of execution time of approximately 20% 20 / 25

36 NAS-DT Benchmark Analysis (Best Deployment) New execution with a better deployment Analysis Exploit locality, avoid backbone link Central backbone is no longer a contention point Smaller time-slices shows that it is only at the beginning Reduction of execution time of approximately 20% 20 / 25

37 NAS-DT Benchmark Analysis (Best Deployment) New execution with a better deployment Analysis Exploit locality, avoid backbone link Central backbone is no longer a contention point Smaller time-slices shows that it is only at the beginning Reduction of execution time of approximately 20% 20 / 25

38 Non-Cooperative Master Worker Applications Scenario configuration Two master-worker applications competing for resources Based on a realistic model of the Grid5000 platform 2170 computing hosts + network links and routers Each application uses a bandwidth-centric optimal strategy Tasks layout for each application First application: tasks are CPU bound cheap to transfer, hard to calculate Second application : tasks are network bound tasks are a little harder to transfer Mapping trace metrics Each application has its own resource utilization variable First application (light gray) Second application (dark gray) Three Expected Behaviors First application should achieve better resource utilization Form of locality from the second application Applications may interfere on computing resources 21 / 25

39 Non-Cooperative Master Worker Applications Scenario configuration Two master-worker applications competing for resources Based on a realistic model of the Grid5000 platform 2170 computing hosts + network links and routers Each application uses a bandwidth-centric optimal strategy Tasks layout for each application First application: tasks are CPU bound cheap to transfer, hard to calculate Second application : tasks are network bound tasks are a little harder to transfer Mapping trace metrics Each application has its own resource utilization variable First application (light gray) Second application (dark gray) Three Expected Behaviors First application should achieve better resource utilization Form of locality from the second application Applications may interfere on computing resources 21 / 25

40 Non-Cooperative Applications Spatial Fixed temporal aggregation, changing spatial aggregation From left to right Hosts (no aggregation), Clusters, Sites, Grid (full aggregation) Hosts Clusters Sites Grid Analysis Host level is too cumbersome Cluster, resource usage favors the CPU-bound application Site, enables to quantify perfectly how much it is Grid, gives the overview for this particular time slice 22 / 25

Analysis Host level is too cumbersome Cluster, resource usage favors the CPU-bound application Site,

41 Non-Cooperative Applications Spatial Fixed temporal aggregation, changing spatial aggregation From left to right Hosts (no aggregation), Clusters, Sites, Grid (full aggregation) Hosts Clusters Sites Grid Analysis Host level is too cumbersome Cluster, resource usage favors the CPU-bound application Site, enables to quantify perfectly how much it is Grid, gives the overview for this particular time slice 22 / 25

42 Non-Cooperative Applications Temporal Clusters Fixed spatial aggregation to Cluster level From left to right, time slice sliding forward These are actually screenshots from an animation t 0 t 1 t 2 t 3 time slice time slice time slice time slice A B C Sites Grid Analysis Resource usage is not uniform in the platform for the CPU-bound app t 3 shows cluster A full of tasks, but not cluster B Site B is full of tasks at t 2 while Site C has to wait until time slice t 3 Possible explanation: single master process, big platform 23 / 25

43 Non-Cooperative Applications Temporal Clusters Fixed spatial aggregation to Cluster level From left to right, time slice sliding forward These are actually screenshots from an animation t 0 t 1 t 2 t 3 time slice time slice time slice time slice A B C Sites Grid Analysis Resource usage is not uniform in the platform for the CPU-bound app t 3 shows cluster A full of tasks, but not cluster B Site B is full of tasks at t 2 while Site C has to wait until time slice t 3 Possible explanation: single master process, big platform 23 / 25

44 Non-Cooperative Applications Temporal Clusters Fixed spatial aggregation to Cluster level From left to right, time slice sliding forward These are actually screenshots from an animation t 0 t 1 t 2 t 3 time slice time slice time slice time slice A B C Sites Grid Analysis Resource usage is not uniform in the platform for the CPU-bound app t 3 shows cluster A full of tasks, but not cluster B Site B is full of tasks at t 2 while Site C has to wait until time slice t 3 Possible explanation: single master process, big platform 23 / 25

45 Scalable Topology-based Visualization 1 Large Applications Analysis 2 The Scalable Topology-based Visualization 3 Analysis on Case Studies 4 Synthesis 24 / 25

46 Synthesis and Open Questions Proposition: Multi-scale aggregation with dynamic graph layout Interactive, analyst can freely choose level of detail Scalable to a few thousands Open-source software Viva Currently being packaged to Debian Issues yet to be addressed Aggregating link utilization is questionable Aggregation leads to data loss which may be harmful, evaluation of an aggregation quality Viva has only a few graphical objects : grammar of graphical objects (ggplot2 philosophy) Thank you for your attention 25 / 25

47 Synthesis and Open Questions Proposition: Multi-scale aggregation with dynamic graph layout Interactive, analyst can freely choose level of detail Scalable to a few thousands Open-source software Viva Currently being packaged to Debian Issues yet to be addressed Aggregating link utilization is questionable Aggregation leads to data loss which may be harmful, evaluation of an aggregation quality Viva has only a few graphical objects : grammar of graphical objects (ggplot2 philosophy) Thank you for your attention 25 / 25

48 Synthesis and Open Questions Proposition: Multi-scale aggregation with dynamic graph layout Interactive, analyst can freely choose level of detail Scalable to a few thousands Open-source software Viva Currently being packaged to Debian Issues yet to be addressed Aggregating link utilization is questionable Aggregation leads to data loss which may be harmful, evaluation of an aggregation quality Viva has only a few graphical objects : grammar of graphical objects (ggplot2 philosophy) Thank you for your attention 25 / 25

Analysis of Program Behavior

Analysis of Program Behavior High Performance Computing, Visualization Lucas Mello Schnorr probably soon (LIG-CNRS INF-UFRGS) 2 nd LICIA Workshop Grenoble, France September 5th, 2012 1/ 25 Introduction