Light-path Monitor System of TWAREN Optical Network
National Center for High-Performance Computing
Speaker: Ming-Chang Liang
INTRODUCTION
TWAREN phase 2
At the end of 2006, TWAREN was upgraded to support more protection methods and better availability; the result is called TWAREN phase 2. Tens of optical switches and hundreds of lightpaths then served as the foundation of the layer 2 VLAN services and the layer 3 IP routing services. In 2008, tens of VPLS switches were added to provide a multipoint VPLS VPN service. Layer 1 lightpaths can be protected by SNCP, layer 2 VLANs by spanning tree recalculation, and layer 2 VPLS by fast reroute. Together, these improvements make TWAREN phase 2 a true hybrid network that provides services at multiple layers with high availability.
Architecture of Optical Network
Architecture of TWAREN phase 2
[Figure: TWAREN phase 2 topology diagram. Site labels: NTU, ASCC, NCCU, NIU, NDHU, NCU, MOEcc, NCNU, NHLTC, NCTU, NCHC, NCHU, NTTU, NTHU, NSYSU, NCKU, CCU. City labels: Taipei, Hsinchu, Taichung, Tainan. Equipment labels: 15454, 15600, 12816, 7609, 7609C, 6509, 3750. Link legend: STM64, STM16, 10GE, GE.]
CTC is not enough for us
TWAREN phase 2 inherently guards against a single point of hardware or circuit failure, so such a failure is less likely to affect actual service provisioning. When a port or circuit fails, we must still determine which lightpaths are affected and then correlate them with the upper-layer services.
DESIGN OF NMS
1st Stage Architecture of NMS
[Figure: block diagram. Labels: Monitor Objs, Control API, GUI & Ticket System, Data Collectors, Traps, MIBs, Syslogs, Net flows, Telnet/SSH, TL1, Fault Detection, Fault Location, Auto Action, Mirror, Threshold Analyzer, Current Status DB, Long Term DB, Threshold DB, Case/Action DB, Interactive, Passive, Report System.]
Lightpaths Monitor System
[Figure: block diagram. Labels: Monitor System in NOC, Alarm trigger, Alarm Email, Database, Trap parser, Light-path data Parser, TWAREN Optical Network, ONS.]
Frequent incident case 1
Frequent incident case 2
Important information in a trap

Value index  Value                  Description
1            Trap agent host name   The hostname of the host that sent this trap
2            Trap agent IP address  The IP address of the host that sent this trap
3            sysuptime              The system uptime of the host that sent this trap
4            snmptrapoid            The mapped OID of this trap
5            Cerent454NodeTime      The ONS clock time (YYYYMMDDhhmmss)
6            cerent454alarmstate    The severity level of this warning (defined by Cisco CerentNotificationClass)
             (1.3.6.1.4.1.3607.6.10.20.30.20.1.80)
cerent454alarmstate

Alarm State Number  Meaning                   Traps that use this alarm state
31                  diagnostic                All traps of ONS
40                  cleared                   All traps of ONS send this state when the fault is resolved
50                  minornonserviceaffecting  1. carrierlossonthelan 2. transportlayerfailure
80                  minorserviceaffecting     All traps of ONS
90                  majorserviceaffecting     1. carrierlossonthelan 2. transportlayerfailure
100                 criticalserviceaffecting  1. lossofsignal 2. lossofframe
The OID of ONS Trap

ONS-15600
Trap Name              MIB OID
lossofsignal           .1.3.6.1.4.1.3607.2.20.0.430
lossofframe            .1.3.6.1.4.1.3607.2.20.0.390
carrierlossonthelan    .1.3.6.1.4.1.3607.2.20.0.220
transportlayerfailure  .1.3.6.1.4.1.3607.2.20.0.3540

ONS-15454
Trap Name              MIB OID
lossofsignal           .1.3.6.1.4.1.3607.6.10.30.0.430
lossofframe            .1.3.6.1.4.1.3607.6.10.30.0.390
carrierlossonthelan    .1.3.6.1.4.1.3607.6.10.30.0.220
transportlayerfailure  .1.3.6.1.4.1.3607.6.10.30.0.3540
DESIGN OF DATABASE
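The trap fields and OIDs listed above suggest how a trap parser could classify an incoming trap. The following is a minimal sketch, not the parser used in the NOC: the function name `classify_trap` and the returned dictionary layout are assumptions made for illustration, while the OIDs and alarm state numbers are copied from the slides.

```python
# OID prefixes per platform and trap-number suffixes, as listed on the slide.
PLATFORM_PREFIXES = {
    "1.3.6.1.4.1.3607.2.20.0": "ONS-15600",
    "1.3.6.1.4.1.3607.6.10.30.0": "ONS-15454",
}
TRAP_NAMES = {
    "430": "lossofsignal",
    "390": "lossofframe",
    "220": "carrierlossonthelan",
    "3540": "transportlayerfailure",
}
# cerent454alarmstate values, as listed on the slide.
ALARM_STATES = {
    31: "diagnostic",
    40: "cleared",
    50: "minornonserviceaffecting",
    80: "minorserviceaffecting",
    90: "majorserviceaffecting",
    100: "criticalserviceaffecting",
}


def classify_trap(snmptrapoid, alarmstate):
    """Hypothetical helper: map a trap OID and alarm state to readable names.

    Returns None when the OID is not one of the eight OIDs on the slide.
    """
    oid = snmptrapoid.strip().lstrip(".")
    prefix, _, suffix = oid.rpartition(".")
    platform = PLATFORM_PREFIXES.get(prefix)
    trap = TRAP_NAMES.get(suffix)
    if platform is None or trap is None:
        return None
    state = int(alarmstate)
    return {
        "platform": platform,
        "trap": trap,
        "state": ALARM_STATES.get(state, "unknown"),
        "cleared": state == 40,  # state 40 is sent when the fault is resolved
    }
```

For example, `classify_trap(".1.3.6.1.4.1.3607.6.10.30.0.430", 100)` identifies a lossofsignal trap from an ONS-15454 in the criticalserviceaffecting state.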
Relationship of Data Tables
Basic Data Tables: Component, People, Location, Unit, Vendor, etc.
Relationship Tables: Circuit, VLAN Services, VPLS Services, ONS Light Path, ONS Cross Connection, etc.
Basic Data Tables

Component Data Table
Component_ID  Parent_C_ID  Name
1             0            TN7609P
12            1            Slot_1
2             0            TP15454
16            2            Slot_3
135           12           Port_9

Vendor Data Table
ID  Name
1   CHT
2   APBT
3   RingLine

People Data Table
ID  Name  Phone       Address  Service_Time  Service_WeekDay
1   John  0939123123  xxxxxxx  8-17          1,3,5
2   Mary  0958123123  xxxxxxx  ALL           ALL

Location Data Table
ID  Name   Address
1   MOEcc  xxxxx
2   NTU    xxxxx

Unit Data Table
ID  Name
1   NCKU
18  THU
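The Component table stores a hierarchy through Parent_C_ID, so a port can be traced back to its slot and chassis. The sketch below illustrates that lookup using the sample rows from the slide. It assumes that Parent_C_ID 0 marks a top-level component, which the sample rows suggest but the slide does not state; the function name `component_path` and the "/" separator are hypothetical.

```python
# Sample rows from the Component Data Table: Component_ID -> (Parent_C_ID, Name)
COMPONENTS = {
    1: (0, "TN7609P"),
    12: (1, "Slot_1"),
    2: (0, "TP15454"),
    16: (2, "Slot_3"),
    135: (12, "Port_9"),
}


def component_path(component_id, components=COMPONENTS):
    """Hypothetical helper: walk Parent_C_ID links up to the top-level component.

    Assumes Parent_C_ID == 0 means "no parent".
    """
    names = []
    seen = set()
    current = component_id
    while current != 0:
        if current in seen:
            raise ValueError(f"cycle detected at component {current}")
        seen.add(current)
        parent, name = components[current]
        names.append(name)
        current = parent
    # Names were collected leaf-first; reverse to read chassis/slot/port.
    return "/".join(reversed(names))
```

With the sample rows, `component_path(135)` resolves Port_9 to its slot and equipment as `TN7609P/Slot_1/Port_9`.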
Port Table

Field      Type     Description
PortID     int      Port Component ID
Type       int      Port type: Ethernet(0), SDH(1)
CardName   varchar  Line card type name
Bandwidth  int      Bandwidth
Status     int      Defined by us
Topology Link Table

Field         Type  Description
NodeA, NodeB  int   The component IDs of the equipment connected by the link
PortA, PortB  int   The component IDs of the ports connected by the link
Cross Connection Table

Field                     Type     Description
CRS                       int      Cross connection unique number
SNCP                      int      SNCP protected? (0: No, 1: Yes)
PortFrom1                 int      Port Component ID - From 1
PortFrom2                 int      Port Component ID - From 2
PortTo1                   int      Port Component ID - To 1
PortTo2                   int      Port Component ID - To 2
ChannelFrom1              int      Port From 1 - Channel ID
ChannelFrom2              int      Port From 2 - Channel ID
ChannelTo1                int      Port To 1 - Channel ID
ChannelTo2                int      Port To 2 - Channel ID
SNCPPathFrom, SNCPPathTo  int      Whether the current SNCP selector uses WORKING(1) or PROTECTION(2)
Size                      int      VC bandwidth size, unit is VC1 (155 Mbps)
Status                    int      Defined by us
CKTID                     varchar  Circuit identification string
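The SNCP selector fields in the Cross Connection table indicate which side of a protected cross connection is carrying traffic. The sketch below shows one way to read them. It assumes that WORKING(1) selects the "1" ports (PortFrom1, PortTo1) and PROTECTION(2) selects the "2" ports (PortFrom2, PortTo2); the slide does not spell out this mapping, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

WORKING = 1      # selector values as given on the slide
PROTECTION = 2


@dataclass
class CrossConnection:
    """A subset of the Cross Connection table fields."""
    crs: int
    sncp: int            # 0: not protected, 1: SNCP protected
    port_from1: int
    port_from2: int
    port_to1: int
    port_to2: int
    sncp_path_from: int = WORKING
    sncp_path_to: int = WORKING


def active_ports(cc):
    """Return the (from, to) port Component IDs currently carrying traffic.

    Assumption: WORKING maps to the "1" ports, PROTECTION to the "2" ports.
    """
    if not cc.sncp:
        return (cc.port_from1, cc.port_to1)
    src = cc.port_from1 if cc.sncp_path_from == WORKING else cc.port_from2
    dst = cc.port_to1 if cc.sncp_path_to == WORKING else cc.port_to2
    return (src, dst)
```

The port IDs passed to `CrossConnection` in any usage would be Component IDs from the Port table; the values used to exercise this sketch are made up.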
Light-Path Table

Field            Type     Description
LightPath        int      Light-path unique number
Name             varchar  Light-path name (CKTID)
PortFrom         int      Port - From
PortTo           int      Port - To
SNCP             int      SNCP protected? (0: No, 1: Yes)
Size             int      VC bandwidth size, unit is VC1 (155 Mbps)
TraceCRS         varchar  Cross-connection path string that the light-path passes through
TraceConfigured  varchar  Configured port path (port Component ID string) that the light-path should pass through
TraceCurrent     varchar  Actual port path (port Component ID string) that the light-path currently passes through
Status           int      Defined by us
Alarm Table

Field        Type     Description
Id           int      Alarm serial number
EventName    varchar  Alarm identifying name
HostName     varchar  The name of the host that sent this alarm
AgentIP      varchar  The IP address of the host that sent this alarm
Category     varchar  The class of this alarm
Severity     varchar  The severity level of this alarm (defined by us)
UpTime       varchar  The uptime of the equipment that sent this alarm
TrapTime     varchar  The time at which this alarm was generated
Interface    varchar  The affected port
AlarmStatus  varchar  The severity level defined by Cisco
LightPath    varchar  The names of the affected light-paths
IMPLEMENTATION
Working steps
1. Build the Port table by reading from the Component table.
2. Send TL1 commands to all ONS nodes.
3. Build the TopologyLink and CrossConnection tables by parsing the TL1 responses.
4. Build the LightPath table by aggregating the Port, TopologyLink, and CrossConnection tables.
5. When traps are received from an ONS, determine the affected ports, then correlate the affected services using the database.
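The final step, correlating a failed port with light-paths, can be illustrated against the TraceCurrent field of the Light-Path table. This is a minimal sketch under stated assumptions: the slides describe TraceCurrent only as a "port Component ID string", so the comma-separated format is assumed, and the function names, light-path names, and port IDs below are all hypothetical.

```python
def parse_trace(trace):
    """Split a trace string into port Component IDs.

    Assumption: IDs are comma-separated; the slides do not give the format.
    """
    return [int(part) for part in trace.split(",") if part.strip()]


def affected_lightpaths(failed_port, lightpaths):
    """Return the names of light-paths whose current path uses failed_port.

    `lightpaths` is an iterable of dicts carrying the Light-Path table
    fields Name and TraceCurrent.
    """
    return sorted(
        lp["Name"]
        for lp in lightpaths
        if failed_port in parse_trace(lp["TraceCurrent"])
    )


# Made-up rows for illustration only.
SAMPLE_LIGHTPATHS = [
    {"Name": "LP-A", "TraceCurrent": "135,140,221,230"},
    {"Name": "LP-B", "TraceCurrent": "136,141,222,231"},
    {"Name": "LP-C", "TraceCurrent": "137,140,223,232"},
]
```

Here `affected_lightpaths(140, SAMPLE_LIGHTPATHS)` reports LP-A and LP-C, the two sample light-paths whose current trace crosses port 140. Matching on TraceCurrent rather than TraceConfigured means a light-path that has already switched away from the failed port is not reported.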
TL1 commands

Command                             Description
ACT-USER::username:123::password;   Log in
RTRV-NE-IPMAP:::123;                Get information about topology link neighbors
RTRV-CRS::ALL:123;                  Get information about cross connections
RTRV-VC::ALL:123;                   Get information about all VC statuses, including SNCP selector status
CANC-USER::username:123:;           Log out
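The command sequence above can be generated per node before it is sent over a TL1 session. The sketch below only builds the command strings exactly as they appear on the slide; it does not open a connection or parse responses, since the slides show neither. The function name and the `ctag` parameter name are assumptions (the value 123 on the slide sits in the position TL1 uses for the correlation tag).

```python
def tl1_session_commands(username, password, ctag="123"):
    """Hypothetical helper: build the five TL1 commands in session order.

    The command layout is copied from the slide; only the username,
    password, and tag are substituted.
    """
    return [
        f"ACT-USER::{username}:{ctag}::{password};",   # log in
        f"RTRV-NE-IPMAP:::{ctag};",                    # topology link neighbors
        f"RTRV-CRS::ALL:{ctag};",                      # cross connections
        f"RTRV-VC::ALL:{ctag};",                       # VC statuses, SNCP selectors
        f"CANC-USER::{username}:{ctag}:;",             # log out
    ]
```

Keeping login first and logout last in one list makes it harder to leave a session open on a node when the retrieval commands change.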
Partial SNCP (1)
[Figure: RA, ONS-A, ONS-B, ONS-C, ONS-D, and RB, with the configured working path and the configured protection path marked.]
Partial SNCP (2)
[Figure: RA, ONS-A, ONS-B, ONS-C, ONS-D, and RB, with the actual working path and the configured protection path marked.]
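The difference between the configured and the actual working path in the two figures corresponds to the TraceConfigured and TraceCurrent fields of the Light-Path table. The sketch below shows one way such a deviation could be detected. It assumes both traces are comma-separated port Component ID strings, which the slides do not confirm; the function names and port IDs are hypothetical.

```python
def _ports(trace):
    # Assumption: traces are comma-separated port Component IDs.
    return [int(part) for part in trace.split(",") if part.strip()]


def path_deviation(trace_configured, trace_current):
    """Compare the configured and current traces of one light-path.

    Returns None when the light-path follows its configured path, otherwise
    a dict listing the ports it has left and the ports it now uses instead.
    """
    configured = _ports(trace_configured)
    current = _ports(trace_current)
    if configured == current:
        return None
    return {
        "left": [p for p in configured if p not in current],
        "joined": [p for p in current if p not in configured],
    }
```

A light-path whose trace changes only in the middle, for example from `11,12,21,22,31,32` to `11,12,21,25,35,32`, has switched on part of its route while the remaining ports stay on the configured path, which is the partial case the figures depict.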
WEB-BASED ALARM LOG SYSTEM
Example 1
Example 1
Example 2
INTEGRATED VISUAL INTERFACE