Multicast routing protocols IGMP IP Group Management Protocol PIM Protocol Independent Multicast MOSPF Multicast OSPF DVMRP DV Multicast Routing Protocol E7310-Multicast-2/Comnet 1
Multicast in local area networks and IGMP E7310-Multicast-2/Comnet 2
Multicast addresses MSB(t) Flat address space Reserved addresses: network 32 bits host 1110 28 bits - multicast group address 224.0.0.1-224.0.0.255 224.0.0.1 224.0.0.2 224.0.0.5 224.0.0.6 224.0.0.9 239.0.0.0-239.255.255.255 239.192.0.0-239.195.255.255 224.0.0.0-239.255.255.255 Class Local network control (TTL=1) All hosts All routers All OSPF routers OSPF designated routers + backup designated r. All RIP-2 routers Administratively scoped multicast (=private) Organization local scope D E7310-Multicast-2/Comnet 3
Sending multicast packets The sender does not need to belong to the multicast group. In broadcast networks only one copy of the multicast packet should be sent Multicast can be implemented as broadcast + filtering Ethernet: broadcast = FF:FF:FF:FF:FF:FF Point-to-point links need no special arrangements ATM needs a centralized server, e.g. the Broadcast and Unknown Server (BUS) in LAN Emulation (LANE) E7310-Multicast-2/Comnet 4
Multicast in broadcast networks Some broadcast networks (e.g. Ethernet) support group addresses The group address is based on the IP address Place low-order 23 bits of IP multicast address into low-order 23 bits of MAC address 01-00-5E-00-00-00 1110 28 bit multicast group address 00000001 00000000 01011110 0 23 bits Overlapping mapping some filtering still required No ARP required: Address Resolution Protocol (ARP) is a telecommunications protocol used for resolution of network layer addresses into link layer addresses E7310-Multicast-2/Comnet 5
Internet Group Management Protocol (IGMP) Purpose: Let routers discover if there are multicast receivers on the network Routers do not need to know the exact members, only whether there are members for a specific group Used locally within a network TTL=1 in all IGMP messages Router with lowest IP address is active on a network Runs directly over IP (protocol type 2) Replaced by Multicast Listener Discovery (MLD) in IPv6 RFC 2236 E7310-Multicast-2/Comnet 6
IGMPv2 - Internet Group Management Protocol Type Max Resp Time Group Address (GA) Checksum à Sent to all systems multicast address 224.0.0.1 à Leave group message sent to all routers address 224.0.0.2 Host waits random time [0...Max Resp Time] prior to response and suppresses its response if it sees another response to the same group Host Membership Query (type 0x11) [General (GA=0) / Group specific (GA 0)] V2 Membership Report (type 0x16) (v2) [Group] V1 Membership Report (type 0x12) (v1) Router Leave Group (type 0x17) Membership Query (type 0x11) [Group specific] Are there other members left? E7310-Multicast-2/Comnet 7
IGMPv3 adds selective reception from sources within a group Host Type=0x11 Max Resp Time Checksum Group Address Reserved + Flags Number of Sources Source Address Source Address Source Address Router Membership Query (type 0x11) Variants: General query (GA=0 and number of sources=0) Group specific query (GA 0 and number of sources=0) Group and source specific query (GA 0 and number of sources>0) V3 Membership Report (type 0x22) Can exclude listed sources within a group or include only listed sources within a group RFC 3376 E7310-Multicast-2/Comnet 8
PIM Protocol Independent Multicast E7310-Multicast-2/Comnet 9
PIM Protocol Independent Multicast Most popular multicast protocol Two modes 1. Dense mode (PIM-DM) 2. Sparse mode (PIM-SM) Independent of any particular unicast routing protocol Uses unicast routing table Simple protocol Assumes the links are symmetric No tunnels Messages sent in IGMP packets E7310-Multicast-2/Comnet 10
PIM Dense Mode For dense multicast groups Dense: The probability is high that a small randomly picked area contains at least a group member, e.g. LAN Based on RPF with pruning, flood-and-prune Principle similar to DVMRP Simpler Less efficient Messages implemented as extensions to IGMP: Hello, Join/Prune, Assert, Graft DVMRP: Distance Vector Multicast Routing Protocol RFC 3973 E7310-Multicast-2/Comnet 11
PIM-DM implementation of RPF Receive multicast packet P from source S to group G on interface U Equal-cost multipath no yes Received from router with largest IP address yes no X U used to send packets to S yes no Send prune(s,g) message on U M(n,S,G)=1 for all n yes Send prune(s,g) message on U no For all n where M(n,S,G) 1: Forward P on interface n M is a cache for prune messages: M(n, S, G) = 1 if a prune(s,g) has been received on interface n Least recently used entries may be dropped E7310-Multicast-2/Comnet 12
PIM-DM Pruning A B R prune S C R D prune E prune F prune G R = receiver E7310-Multicast-2/Comnet 13
PIM-DM Grafting A B R Graft messages are sent when there is a new receiver on a previously pruned branch S C R D graft E F graft G R (new!) R = receiver E7310-Multicast-2/Comnet 14
PIM-DM Pruning on broadcast networks Prune messages sent to all-routers local multicast address (224.0.0.2) S A 3. delay the prune 4. join 5. cancel the prune 1. multicast B D 2. prune C no members Note: this is the only situation when a join message is sent in PIM-DM member E7310-Multicast-2/Comnet 15
PIM-DM Resolving multicasts received on multiple path S multicast A B A and C see prune(s,g) if d other router < d my distance : packets for (S,G) prune interface from another router C assert(s,g,d A ) assert(s,g,d C ) D D receives duplicate packets E7310-Multicast-2/Comnet 16
PIM Sparse Mode Most popular PIM variant, e.g. in IPTV Uses the center-based tree algorithm Evolved from the Core-Based Tree (CBT) protocol Uses the normal unicast routing table build by OSPF or MBGP (both interior and exterior protocol) Rendezvous point (=center) connects the receivers with the senders Receivers must explicitly join Generates a shared unidirectional tree Can switch to source based trees to optimize routes RFC 4601 E7310-Multicast-2/Comnet 17
PIM-SM route entries Route entry includes source address (can be *) group address (can be *) incoming (upstream) interface list of outgoing (downstream) interfaces timers, flags Packets match on the most specific entry 1. (S,G) a specific source in a specific group 2. (*,G) all sources in a specific group 3. (*, *, RP) all groups that hash to a specific RP (Rendezvous point) E7310-Multicast-2/Comnet 18
Messages Join (*, G) To join a group in order to receive traffic from all sources of the group Adds a branch to a shared tree Join (S, G) To receive traffic from a given source of the group Adds a branch to a source-specific tree Prune (*, G) Prunes a branch from the shared tree Prune (S,G, RP) Prunes a source from the shared tree (if the same packets are received on a sourcespecific tree as well) Register To register a sender and send an encapsulated packet to the multicast group Register stop To ask a sender to stop sending Register messages (and instead use the multicast tree) E7310-Multicast-2/Comnet 19
PIM-SM example (1) Join packets are sent toward the RP (Rendezvous point) Address=G, Join=RP, wildcard (WC) bit, RP-tree (RPT) bit, Prune=(empty) Intermediate routers set up (*, G) state and forward the join A B C D E F G H I J RP 1.join (*,G) K L M N Receiver E7310-Multicast-2/Comnet 20
PIM-SM example (2) Senders send packets to RP encapsulated in register messages, RP resends packets on the tree RP may contruct a (S,G) entry, and send periodic joins to the sender Receiver A B 5.join (*,G) C D E 4.register-stop (to M) F G H I J K 1.join (*,G) L RP 3.join (S,G) (to M) 2.register (to RP) M N Receiver Sender E7310-Multicast-2/Comnet 21
PIM-SM example (3) If the last-hop router (K and A) sees many packet from the source, it can switch from a shared tree to a shortest path tree for (S,G). It then sends a join directly to the source, and prunes the previous path. Receiver A B C D E F prune (S, G, RP) K G prune join (S,G) H L I RP M J N Receiver Sender E7310-Multicast-2/Comnet 22
PIM-SM example (4) Copies of the packets are still sent to RP Join/prune messages are sent periodically for each route entry so that they do not time out Receiver A B C D Source based trees + minimize delay + distribute traffic E F G H I J RP K (S,G) L M N Receiver Sender E7310-Multicast-2/Comnet 23
Selection of Rendezvous Point A small group of routers configured as bootstrap router candidates One of them selected as bootstrap router (BSR) for the domain BSR periodically sends Bootstrap messages through the domain A set of routers are configured as candidate RPs for given groups typically same as candidate BSRs Candidate RPs periodically unicast Candidate-RP-Advertisements to the BSR, which includes them in the Bootstrap message Candidate RP s own address Optional group address and mask length The RP is selected by a hash function from the valid candidate RPs All routers use the same hash functions, therefore all routers select the same RP for a given group E7310-Multicast-2/Comnet 24
PIM-SM can interoperate with DVMRP and other multicast protocols PIM Multicast Border Routers (PMBR) connect PIM-SM with other multicast protocols PMBR external group external receivers PMBR Join/Prune (*, *, RP) Register (G, External) Register stop RP RP PMBR E7310-Multicast-2/Comnet 25
Considerations PIM can switch from sparse mode to dense mode Controlled by a parameter, which defines when the group is dense enough Source specific routes require the use of IGMPv3 The RP may be a single point of failure The RP may be a bottle-neck E7310-Multicast-2/Comnet 26
Other variants (based on PIM-SM) 1. Source-Specific Multicast (SSM) Defines channels in terms of source-group pairs several multicast groups can use the same group address Avoids collisions of group addresses Adds security and prevents spamming RFC 3569 2. Bi-directional PIM (BIDIR-PIM) Uses a bidirectional tree Selects a Designated Forwarder (DF) that forwards data to the RP without source-specific routes The main responsibility of the designated forwarder is to decide what packets need to be forwarded upstream toward the rendezvous point. RFC 5015 E7310-Multicast-2/Comnet 27
MOSPF Multicast extensions to OSPF E7310-Multicast-2/Comnet 28
MOSPF Multicast extensions to OSPF Idea: if the location of receivers is known to all routers, multicast should be possible to exactly the receivers only! MOSPF is an extension of OSPF, allowing multicast to be introduced into an existing OSPF unicast routing domain. Unlike DVMRP, MOSPF is not susceptible to the normal convergence problems of distance vector algorithms. MOSPF limits the extent of multicast traffic to group members only Desirable for high-bandwidth multicast applications or limitedbandwidth network links (or both). Unlike OSPF, MOSPF does not support multiple equal-cost paths MOSPF calculates the source-based trees on demand E7310-Multicast-2/Comnet 29
MOSPF can be deployed gracefully Introduce multicast routing by adding a new type of LSA to the OSPF link-state database adding calculations for the paths of multicast packets The introduction of MOSPF to an OSPF routing can be gradual Multicast capability marked with a M-bit in the option flag Routers without multicast capability are ignored in calculating multicast routes MOSPF will automatically route IP multicast datagrams around routers incapable of multicast routing No tunnels there may be a unicast path, but no multicast path MOSPF can be deployed in the MBONE. A MOSPF domain can be attached to the edge of the MBONE, or can be used as a transit routing domain within the MBONE s DVMRP routing system. E7310-Multicast-2/Comnet 30
An MOSPF Routing Domain S1 128.186.1.0/24 128.186.2.0/24 3 3 10 A B 1 1 1 128.186.3.0/24 128.186.6.0/24 1 1 C D 10 F 3 3 G1 10 10 128.186.5.0/24 E G G1 3 128.186.4.0/24 3 G2 G1 E.g. G1 = 226.1.7.6 Group m-lsa created and flooded when e.g. host on 128.186.4.0 joins G1. S2 E7310-Multicast-2/Comnet 31
A Group-membership-LSA is created and flooded when a user joins a multicast group using IGMP LS Type 6 = Group Membership LSA 6=Group Membership LSA Advertised multicast group MOSPF designated router 1=Router, 2=Network Originating router's router ID 32 bits LS age Options=E LS type=6 Link state ID = 226.1.7.6 (group G1) Advertising router = 128.186.4.1 (router E) LS sequence number = 0x80000001 LS checksum length Referenced LS type = 2 (network) Referenced Link State ID = 128.186.4.1 Common LS header E7310-Multicast-2/Comnet 32
MOSPF calculates shortest-path trees on demand B A C E S1 3 A 128.186.1.0/24 3 1 1 128.186.4.0/24 A separate tree is calculated for each combination of group and source Result is stored in Multicast Forwarding Cache Entry Hierarchy reduces the number of calculations 128.186.3.0/24 10 10 G1 B 1 D G 3 G1 B G1 1 1 B F 3 3 128.186.5.0/24 128.186.6.0/24 G1 Lines with label A are pruned when removing redundant shortest paths. Lines with label B are pruned when removing links that do not lead to G1 E7310-Multicast-2/Comnet 33
On demand route calculations use Dijkstra s shortest path first algorithm Calculation is rooted on the source not in the current router as for unicast The sources are not advertised in Group-Membership-LSAs, only learned from the source addresses in multicast data packets For a new multicast, every router performs the same calculation Stub networks do not appear in MOSPF calculation e.g., the stub networks behind router F For equal cost routes, the previous hop router with the highest address is chosen e.g. G chosen instead of E E7310-Multicast-2/Comnet 34
The Multicast Forwarding Cache Entry stores multicast path routing info For each source and group: Router or network for multicast reception List of interfaces, multicasts must be sent Metrics to nearest group member When network conditions change, paths are recalculated Cache entries must be deleted, when changed LSAs are received Router-LSA, Network-LSA (on router or link failure or cost change) Delete all entries since it is not possible to tell which are affected Group-Membership-LSA Delete entries of that group Hierarchy The farther away the change is the fewer cache entries are deleted. When the first packet arrives in a multicast group, the routes are recalculated E7310-Multicast-2/Comnet 35
Inter-area multicast forwarders summarize membership info to the backbone 193.16.1.0/24 Backbone Area 0.0.0.2 D S2 B S1 193.15.6.0/24 A 193.18.3.0/24 Area 0.0.0.1 G1 C E G2 membership G1, G2 193.16.3.0/24 Area-border routers are called interarea multicast forwarders in MOSPF These summarize the memberships of their area (e.g. router C lists groups G1 and G2) to the backbone The inter-area multicast forwarder does not advertise/forward membership info from the backbone to their area Non-backbone areas have no knowledge of group members outside of their own area. E7310-Multicast-2/Comnet 36
For forwarding traffic to the backbone, the interarea multicast forwarders are members of all groups Backbone membership of all groups Area 0.0.0.2 193.16.1.0/24 D S2 B S1 193.15.6.0/24 A 193.18.3.0/24 Area 0.0.0.1 G1 C E G2 193.16.3.0/24 Inter-area multicast forwarders recieve traffic from all groups in their areas (e.g. B from S2) indicated with a W-bit (wildcard) in the router LSA The backbone knows the receivers and forwards to the following interarea multicast forwarder (e.g. traffic from S2 to router C) Also inter-as multicast forwarders (=external routers) are considered as member of all groups E7310-Multicast-2/Comnet 37
S1 à G1 193.16.1.0/24 Backbone Area 0.0.0.2 H G1 S2 Area 0.0.0.3 VL VL F B S4 S1 Two level hierarchy aggregates both sources and group addresses I A 193.18.3.0/24 193.17.1.0/24 193.17.2.0/24 193.15.6.0/24 D S3 G1 C E x Area 0.0.0.0 Area 0.0.0.1 G G2 193.16.3.0/24 In aggregation some info is lost sometimes multicasts are sent needlessly: C à G: to G1 Presence of sources is reported by summary-lsa with MC-bit set: F to H à S3+S4 entry Area border router advertise Group-m-LSAs to backbone (B:G1, D,E,F:G1, C,D,E:G2) no exact location Routers in non-backbone do not know location of group members Wildcard multicast receiver receives all multicasts to all groups E7310-Multicast-2/Comnet 38
MBone E7310-Multicast-2/Comnet 39
MBone an overlay multicast Internet Multicast backbone (MBone) was deployed to support research Enable multicast applications without waiting for full availability of multicasting standards Started in 1992 Overlay network Uses tunnels to link multicast islands Previously with source routing header option Now with IP-in-IP encapsulation R1 R2, IP-in-IP S G, UDP UDP-packet Uses DVMRP and IGMP [ http://www.savetz.com/mbone/ ] E7310-Multicast-2/Comnet 40
MBone overlay is based on workstations running DVMRP G1 W1 G1 192.5.3/24 Tunneling is used to bypass unicast sections of the Internet 128.5.2/24 S1 W2 S3 G1 192.7.1/24 128.5.1/24 E7310-Multicast-2/Comnet 41
Experimental routing protocols have been developed for MBone Tree type Shared tree Source based trees Algorithm Center based tree Flood and prune Domain-wide reports Protocols PIM Sparse* Core Based Tree* DVMRP PIM Dense* MOSPF * These rely on unicast routing protocol to locate multicast sources. (The other ones can route multicast on routes separate from the unicast routes) E7310-Multicast-2/Comnet 42
DVMRP Distance Vector Multicast Routing Protocol E7310-Multicast-2/Comnet 43
DVMRP Distance Vector Multicast Routing Protocol First multicast protocol in the Internet (1988) Distance vector routing protocol similar to RIP Except that sources are like destinations in RIP Routers maintains separate multicast routing tables Uses the reverse-path-forwarding (RPF) algorithm with pruning Nodes exchange Distance in hops (reverse path distance) IP address and mask of source Tunnels explicitly configured with Destination router Cost Threshold RFC 1075 E7310-Multicast-2/Comnet 44
DVMRP is used for multicast routing in the MBone DVMRP messages are IGMP messages (IP protocol=2=igmp, TTL=1) DVMRP header: Type=0x13 Code Checksum Minor vers Major vers Reserved =0xff = 3 Version 3 presented here draft-ietf-idmr-dvmrp-v3-11 Router DVMRP Probe [Code=1] neighbor discovery DVMRP Report [Code=2] route exchange DVMRP Prune [Code=7] cut a branch of multicast tree DVMRP Graft [Code=8] add a branch of multicast tree DVMRP Graft Ack [Code=9] ack of graft reception Router E7310-Multicast-2/Comnet 45
Probes are used for neighbor discovery Router DVMRP Probe [à all-dvmrp-routers 224.0.0.4] Router whole DVMRP routing table [à router s address] Yes My address on list first time Probes are exchanged on tunnel and physical interfaces Contains the list of neighbors on the interface If empty, this is leaf network managed by IGMP Multicasts are not exchanged until two-way neighbor relationship is established Routers see each others versions and capability flags compatibility Keepalive fault detection, restart detection sent each 10s, timeout set at 35s E7310-Multicast-2/Comnet 46
DVMRP uses the concept of dependent downstream routers DVMRP uses the route exchange as a mechanism for upstream routers to determine if any downstream routers depend on them for forwarding from particular source networks Implemented with poison reverse If a downstream router selects an upstream router as the best next hop to a source, it echoes back the route with a metric = original metric + inf E7310-Multicast-2/Comnet 47
Route reports are used to build the source based trees Router DVMRP Report Each DVMRP router periodically (60s) broadcasts to its neighbors - the list of pairs (source, metric) - source aggregation according to CIDR may be used Router Known neighbor yes The receiving MC router calculates the previous hop on each sources multicast path = the DVMRP router that reports shortest distance from the source If equal distance choose smallest IP address Designated forwarder DVMRP Report [inf < metric < 2*inf] (compare to poisonous reverse, inf=32) Downstream Dependent neighbor E7310-Multicast-2/Comnet 48
Reports are processed: Router DVMRP Report [S, metric] Other interfaces Adjusted metric=metric+interface cost If Metric<inf & Adjust metric inf Set adjusted metric to inf If Route is new and Adj metric<inf Add route to RT Delete prune state of more general route Elseif Route exists If Received metric < inf Check if Designated forwarder status for (S,G) changes If Adjusted metric > existing metric From same neighbor: update metric, Schedule flash update for route Elseif Adj.metric < existing metric Update metric for the route If sender was different, update RT, schedule flash updates Elseif Adj.metric = existing metric: refresh route... Elseif Received metric =inf... Elseif Inf < Received Metric < 2 * inf... E7310-Multicast-2/Comnet 49
The multicast algorithm of DVMRP is based on Reverse Path Forwarding (RPF) Multicast packet [from=s, to=g] upstream Received on interface u Router u=rpfinterface(s,g) Yes Reverse Path Forwarding check No X Multicast packets downstream At first multicast from RPF interface, a Forwarding Cache Entry [S,G]: (u,list...) is created using the DVMRP routing table The list contains all downstream routers that have reported dependency on S The router is designated forwarder for downstream nodes If the designated forwarder becomes unreachable, another router assumes the role of designated until it hears from a better candidate E7310-Multicast-2/Comnet 50
List of dependent neighbors is used to minimize the multicast tree Multicast packet [from=s, to=g] Received on interface u Router u=rpfinterface(s,g) Cache = [S,G]:(u,list) Prune [S, G, lifetime] Multicast packet Empty list Yes No Remove Cache Entry Initially list may contain all multicast interfaces except the upstream interface Downstream address is removed from the list if It is a leaf network and G is not in IGMP DB for this phys. network Downstream node has selected another designated forwarder Prune received from all dependent neighbors on this interface No X E7310-Multicast-2/Comnet 51
Prunes minimize the multicast tree u Upstream Router Multicast packets Prune [S,(netmask),G, Lifetime] Dependent Router Leaf network If known dependent neighbor If mask and mask=sent mask with (S,G) Prune all sources in network (S, mask) If prune is already active reset timeout to new value If all dependent neighbors have sent prunes If no group members on the mc-interface Remove u from all Forwarding Cache entries If last u Send prune Prune[S,(m), G] If Mcasts keep arriving (3s) Resend Prune with exponential backoff = double interval each time Remove Cache Entry E7310-Multicast-2/Comnet 52
Grafts are used to grow the tree when a new member joins the group Router Upstream router for [S,G] Graft [S, G] Graft Ack Router The graft is always acknowledged if no multicast, nobody is sending If no ack is received, the graft is resent with exponential backoff retransmissions The graft is forwarded upstream if necessary Exponential backoff is an algorithm that uses feedback to multiplicatively Decrease the rate of some process, in order to gradually find an acceptable rate. E7310-Multicast-2/Comnet 53
On probe timeout caches are flushed Router A DVMRP Probe X Probe timeout Router All routes learned from A à hold-down All downstream dependencies on A are removed If A was designated forwarder, a new one is selected for each (source, group) pair Forwarding cache entries based on A are flushed Graft acks to A are flushed. Downstream dependencies are removed. If the dependence is the last, send prune upstream E7310-Multicast-2/Comnet 54
Route hold-down is a state prior to deleting the route Routes expire on report timeout or when an infinite metric is received An alternate route (that in RIP caused temporary loops) may exist Routers continue to advertise the route with inf metric for 2 report intervals this is the hold-down period All forwarding cache entries for the route are flushed During hold-down, the route may be taken back, if metric <inf, and metric = SAME, and received from SAME router E7310-Multicast-2/Comnet 55
Summary of Multicast Protocols for the Internet Tree type Shared tree Source based trees Algorithm Center based tree Flood and prune Domain-wide reports Protocols PIM Sparse* Core Based tree* DVMRP PIM Dense* MOSPF * These rely on unicast routing protocol to locate multicast sources. (The other ones can route multicast on routes separate from the unicast routes) For shared tree protocols an additional step of finding the Core or Rendezvous Point must be performed. Directories are useful on service management level. E7310-Multicast-2/Comnet 56
Questions about multicast Does flood and prune conform to the idea of Carrier Grade? Does the idea that anyone can send a mc to G conform to the idea of Carrier Grade? Is the principle of source specific MC IP-agnostic? Can source addresses in multicast packets be spoofed? Is multicast a secure communication method? E7310-Multicast-2/Comnet 57