Reported by Joel Halpern/Network Systems Corporation

Minutes of the Routing Over Large Clouds Working Group (ROLC)

Agenda - Tuesday's Session

   o Charter review
   o Scope of work and approaches
   o Modeling, assumptions, and requirements
   o Overview of ongoing work on IS-IS over NBMA networks
   o Description of RIP and BGP over demand circuits

A definition of a ``large cloud'' was presented. The definition was taken directly from the charter. It was noted that the group summary uses the term ``shared media,'' which some people found confusing. However, the formal definition does not use that term, so no change was made to the charter.

In discussion, it was agreed that a large cloud could be a broadcast network, but need not be. Also, a cloud would normally be transitive (i.e., A<->B and B<->C connectivity implies A<->C connectivity), but special cases could arise (e.g., because of policy constraints). The definition includes connection-less large clouds (e.g., SMDS, or a large bridged Ethernet network) and connection-oriented large clouds with signaling (e.g., ATM, Frame-Relay with signaling, or X.25).

It was suggested that, in the connection-oriented case, each entity connected to the cloud must be able to have a certain minimum number of connections (e.g., the extreme case where an entity can have only one connection open at a time is not a large cloud). Thus, POTS and possibly N-ISDN do not qualify as large clouds. (VC management was mentioned as a factor in POTS/ISDN.) It was noted that, while not large clouds, POTS/ISDN need to be dealt with, and should borrow from this work. They do fall within the charter, but will need separate attention.

Discussion of the charter highlighted the need for the working group to strive for a general-purpose solution applicable to all types of large cloud. The solution will consider internetwork-layer(s) over, rather than between, large clouds, but will not prohibit such interworking.

Today's problems with routing over such large clouds were listed as:

   1. The ability of two entities attached to the same large cloud to communicate directly when they do not have a common IP network number, with respect to both:

      (a) The operations of existing protocols between entities attached to a large cloud, and

      (b) The assumptions of routing/information protocols concerning paths between entities attached to large clouds.

   2. Policy restrictions and constraints.

The operation of routing protocols over large clouds is likely to involve the aggregation of routing information. For example, a large cloud with 5000 attached routers has to have aggregation. It was agreed that the working group should aim to solve the more complex case of having MULTIPLE levels of aggregation.

There was discussion of the complication that, in the abstract, one wants to optimize the entire end-to-end routing path, not just the hops across the cloud. It was agreed that while the more general solution was desirable, this group would concentrate on the intra-cloud optimization. The further complication of trying to allow for actual ``costs'' for the paths across the cloud was discussed. That was felt to be more than the group could tackle.

One of the items necessary to achieve the group's goals is the relaxation of the constraints on direct communication between addresses on different IP (sub-)networks (see RFC 1122).

NBMA Networks

Chris Gunner made a presentation on the work being defined for NBMA networks for IS-IS. This involves the use of a Designated Router and both Data-Redirects and Hello-Redirects, with NBMA-addresses being extracted from inside NSAP addresses. The latter is problematic for ATM because the NBMA-address for ATM has the same syntax/structure as a (CLNP) NSAP, and thus cannot be embedded in the (CLNP) NSAP.

IS-IS NBMA will reduce the number of hops across the cloud to one per IS-IS area.
Thus, a ROLC solution is still needed above the IS-IS NBMA in order to obtain a single hop across multiple areas.

Comments on the use of Redirects included:

   o Problems with knowing that routes have changed.
   o The security issue of knowing that Redirects are authentic.
   o The need for timers.
   o Redirects impose less per-packet overhead than the short-cut routing approach of including the address of the entry-point into the cloud in the headers of each packet.
   o Redirects are invoked by data packets, as opposed to the use of a separate query-response interaction (cf. NHRP) in which the data packets and control packets can take different paths.

It was agreed that the working group needs to have a Requirements document to list both what is needed and what is not needed.

``Routing over Demand Circuits - RIP''

There was discussion of the ``Routing over Demand Circuits - RIP'' Internet-Draft, which seeks to avoid the need for N^2 connections between RIP entities wishing to be peers. When the exchange of routing information reaches a stable state, the circuits between peers are terminated, and each peer assumes that while a circuit is down, the information contained in the last RIP update remains valid.

It was observed that BGP is looking at something similar in terms of not invalidating information when it brings a demand circuit down, and not requiring keep-alives in such circumstances to maintain information validity. It was suggested that a possible race-condition exists with 3rd-party announcements.

Agenda - Thursday's Session

   o Discussion of draft-braden-shared-media-00.txt (`Braden draft')
   o Discussion of ietf-rolc-nhrp-00.txt (`NHRP draft')
   o Continued discussion of RIP over demand circuits
   o Discussion of additional work
   o Recruiting of editors

Joel Halpern opened the meeting, and presented the agenda. It was announced that Yakov Rekhter would present the ``Braden draft'' in Robert Braden's absence.

``Braden Draft''

Yakov Rekhter gave an overview of the ``Braden draft''; he stated that the intention of the authors was only to stimulate discussion within the IETF, and not to make any specific proposals. The draft discusses the limitations of the current IP subnet model with respect to `shared media' networks---i.e., networks such as ATM, Frame Relay, etc., where it is possible to have multiple subnets defined on the same medium. The current subnet model allows for direct connectivity between systems on the same medium only if the nodes are within the same subnet---`short cut' or direct routes are precluded. This is the same problem that the ROLC Working Group proposes to solve. The paper proposes four possible solutions to this problem:

   1. Hop-by-hop redirection
   2. Extended routing protocols
   3. Proxy ARP mechanisms
   4. Route query protocols

Yakov noted that any solution to the problem needed to meet certain criteria, including:

   o Interoperability. Modified hosts and routers must interoperate with unmodified nodes.
   o Practicality. Minimal software changes should be required.
   o Security
   o Robustness. The new scheme must be robust against errors in software, configuration, or transmission.

There was general agreement on these criteria.

There was extensive discussion of the limitations of the current model, and of the various solutions. It was noted that there were circumstances where direct routes were not desirable, or where policy constraints might preclude direct routes (e.g., to maintain firewalls, etc.). There was also some discussion about whether it was in fact optimal to have direct routes, but the consensus appeared to be that it was desirable to always have access to (and generally use) the direct path.

There was extensive and wide ranging discussion about the specific proposals, as well as other issues raised by the discussion.

With respect to the hop-by-hop redirection proposal, the number of redirects needs to be limited, since some hosts might be unable or unwilling to set up direct routes. Yakov also noted that changes were required in the host software to allow hosts to accept redirects from routers on different subnets (refer to draft). He also discussed the Extended ARP mechanism described in the draft, whereby the redirect also contains the shared media address of the redirect router, to facilitate the ARP process.

It was noted that the extended routing protocols proposal can be viewed as an optimization of the hop-by-hop proposal. Extended routing protocols allow a single redirect from the first router in the path, since that router has information, obtained through the extended routing protocols, about the final router; the hop-by-hop proposal instead requires a redirect from each router in the route.

A problem identified with the use of redirects was that they would not work when direct routes were needed between two routers in the shared medium network (i.e., the hosts were outside the network, and could not use the information). Routers cannot (and should not, it was noted; it was generally agreed that host routes were a bad idea) listen to redirects. It was agreed that this might require that third party routing information be passed, as in BGP or EGP.

Other questions included:

   o What happens in the presence of aggregation? Only get a direct route to the point of aggregation.
   o Are direct routes optimal? Not necessarily - only get optimal path within the routing domain, not end to end.
   o Does the routing information flow across the same path as the data? Not necessarily.
   o What happens to existing router to router connections if the routing information ceases? Not clear---it may be best to take all paths down.

It was also noted that the problem of route partition within the cloud cannot generally be solved.

John Garrett also noted that a virtue of these proposals is that they use routing to solve a routing problem, rather than some other mechanism such as ARP (as in the NHRP proposal). It was noted that this approach may be safer, and may fit better with policy considerations. Joel stated that he felt that the redirect proposals had limitations, but that they were worthy of consideration.

There was no discussion about the proxy ARP mechanisms proposal, since John Garrett stated that, contrary to the assertion in the ``Braden draft,'' directed ARP had no relation at all to the model presented in the paper. Similarly, Joel noted that NHRP is much more like the route query protocols proposal than the proxy ARP proposal with which the draft associates it.

There was then an extensive discussion about the route query proposal in general, and about the NHRP proposal in particular. Joel presented an overview of NHRP, noting that it proposed to use route queries in place of, or in addition to, the first packet (i.e., data forwarding could go on through the default routers, while the direct route was being found). The route response uses the same path as the route queries, and `cuts through' the route hierarchy. The complications of this scheme arise from its interaction with routing. Yakov noted that this scheme can also be used to cut through from router to router, not only between hosts, since routers can send route queries.

Points raised in the discussion:

   o How will policy restrictions be supported? Not all policies may be able to be supported.
   o Will it be possible to discover autonomous system paths? The route response records router addresses, so it could also record AS path information.
   o Why not have the egress router send the route response directly to the ingress router/host? This could be done, but requires a stronger trust model (i.e., it is more secure for the response to follow the same path as the query).
   o What if the end router cannot, or will not, accept a direct connection (policy issues, lack of connection space, etc.)? NHRP backs up from the final router to the furthest intermediate node (may be none) that is willing to accept a connection. This node could also then attempt a further cut through.
   o Why is this different from router redirect? Route query is controlled by the source, hence it can better control and understand the context of the reply; this is more robust and may require less state.
   o What triggers the route query? This is a complex question, and needs further thought. Maybe a query is always sent if the source is on a cloud?
   o What about staleness of the route information? Periodic checks of the routes are needed.

At this point, Yakov completed his presentation by noting that it was very important to preserve backwards compatibility, and to minimize the impact on the infrastructure. He concluded, however, by stating that the traditional subnet model was causing lots of problems, not just with shared media networks like ATM, but also with mobile IP, etc. Perhaps it is time to abandon the model? There was much support at the meeting for this sentiment.

It was noted that the ``Braden draft'' only addressed half the current limitations of the subnet model (i.e., direct routes), but did not address the increasing problems of configuring hosts with subnet masks. Many felt that it was not appropriate for hosts to have to tackle such issues. Joel asked whether the ROLC group should work with the ``Braden draft'' to generate a document to submit to the IESG to argue the case for abandoning the subnet model. Yakov responded that the MOBILEIP Working Group was already working on such a document.

NHRP Discussion

Juha Heinanen gave a more detailed overview of the NHRP proposal, and led a discussion of issues about it. He also introduced and thanked his co-author, Ramesh Govindan.

Juha noted that the NHRP route request is forwarded between next hop servers (NHS), which COULD also be next hop routers; not all routers need be next hop servers, however. The NH servers would span the same administrative and routing hierarchy as the router network, so that end-to-end routes can be found. The NH servers have permanent connections between themselves (i.e., by PVCs, or by having configured addresses of the adjacent NH servers). He noted that this configuration information was required since a single cloud network could support multiple DISJOINT (logical) NBMAs. Joel noted that the NH servers COULD be co-resident with the classical ARP servers.

It was noted that the NHRP proposal was still tied to the subnet model since it required ``mask and match'' to determine whether the last hop had been reached. It was proposed that the entire model should be abandoned, but Joel noted that this would require, in order to determine whether the last hop had been reached, that there be some kind of `hello' or registration protocol, as with ES-IS or the classical ARP model. Noel Chiappa stated that it would be acceptable for the router to use mask and match, as long as the hosts did not need to; Joel agreed, and noted that NHRP was at least an enabler for getting rid of the subnet model. He also noted that one virtue of the NHRP proposal was that it separated the local registration problem from that of direct route discovery---hosts could use a variety of mechanisms for the local ARP process, including redirects or simply ARPing for everything.

There was also discussion about whether the NHRP proposal should be made network layer independent, or whether it should be re-written for each protocol. There was much discussion of this topic, with the final consensus being that it made sense to spend some time trying to make the proposal generic, so that it would be `easy to steal the technique'.
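
The ``mask and match'' test referred to in the NHRP discussion can be sketched as follows. This is only an illustration, not text from the NHRP draft: the function name is_last_hop, the example prefixes, and the use of Python's standard ipaddress module are all assumptions made for this sketch, and it handles IPv4 only.

```python
import ipaddress


def is_last_hop(destination, attached_networks):
    """Return True if destination lies inside any locally attached prefix.

    Hypothetical helper: the entity applies each attached network's mask
    to the destination ("mask") and compares the result with that
    network's number ("match").
    """
    dest = int(ipaddress.IPv4Address(destination))
    for prefix in attached_networks:
        net = ipaddress.IPv4Network(prefix)
        masked = dest & int(net.netmask)          # mask
        if masked == int(net.network_address):    # match
            return True
    return False


# Example prefixes are invented for illustration only.
attached = ["192.0.2.0/24", "198.51.100.0/25"]
inside = is_last_hop("192.0.2.17", attached)
outside = is_last_hop("198.51.100.200", attached)
print(inside, outside)
```

In this sketch only the entity holding the list of attached prefixes needs to know any masks, which is in the spirit of the comment that routers may use mask and match as long as hosts do not have to.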
It was noted that address registration was protocol dependent, and there was also a request that the IP specific binding be made explicit, in order to facilitate interoperability, and because the IETF only has de jure authority over IP.

Juha noted that the latest proposal has added a route record capability, in order to allow hosts to seek connections along the path to the destination, if the final router was not willing to make a direct connection. He added that not all NH servers need be routers (e.g., some could be route servers), and that only NH servers that were willing and able to forward packets need record their addresses. This comment was in the context of recording in the backwards direction. If we record going forwards, then each record must indicate whether that entity is willing to forward packets. All entities must be recorded, so that the path can be used for the response.

   o Should the route be recorded going forward or backward? Going forward, since the path may not be symmetric (e.g., routing may be asymmetric, and the response should go the same way the request went).
   o What happens if packets are being forwarded (hop by hop) at the same time as the route query? Get a cascade of route queries, hence need to decide when to send a route query.

Joel suggested that connection IDs should be cached and incremented for the number of packets forwarded, and that a route query should only be sent once a threshold has been exceeded. Others suggested that only the initiating host/router should send a query, but it was noted that it was very hard to determine if a router is the first hop. Another suggestion was that the host should set a bit in the PDU which would be cleared by the first hop router. No clear consensus was reached.

NHRP and Routing Protocols

There was then an extensive discussion of the interaction between NHRP and routing protocols.
In particular, much discussion centered around what may happen if routes change (e.g., a better path opens up, a link goes down, etc.), and whether this may lead to loops. It was noted that this problem was particularly bad in the case where the route change occurs outside the NBMA network, but may affect a direct route. (This diagram and problem were presented by John Garrett, courtesy of work he had done on Directed ARP and Shortcut Routing.)

The discussion revolved around the following network diagram, which shows packets being sent from host H1 to host H2, both outside the shared media network, but through a direct route from R1 to R4:

[Diagram: a shared media network containing routers R1, R2, R3, and R4, with a direct (short cut) route between R1 and R4 and host H3 attached to the cloud; routers R7, R6, and R5 form a path outside the cloud, with R5 linked to R4; routers R8 and R9, link L1, and hosts H1 and H2 lie outside the cloud on non-NBMA networks.]

   Hn     - Host
   Rn     - Router
   ooooo  - direct (short cut) route
   ]----[ - Non NBMA network

It was stated that a problem arises if link L1 goes down. Because R4 still has a path to R1, through (R5, R6, etc.), it will forward packets down that path, while R1, still having a path to R4 (the direct route), will simply loop them back to R4---assuming that no routing information is sent down the direct route. There was some discussion about which routing protocols would actually not detect this loop, but it was generally agreed that the problem could arise with some protocols, at least, assuming particular values of route metrics, etc.

Tony Li stated that this problem implies that the ends of the direct connection needed to be told if the current path becomes less optimal, even for apparently unrelated changes. He proposed that a host level IDRP adjacency (`mini-IDRP') be formed between the host and the first hop router, to solve this problem (details of this were apparently given in an e-mail message to the list some months ago).
It was noted, however, that it may be necessary to detail the circumstances under which the connection needs to be changed. Noel also noted that there has to be some mechanism to allow easy identification of which flows might be affected. He proposed that this was a clear argument for flows, but this did not meet with widespread approval. It was agreed that lots of state information might need to be kept.

Joel stated that the fundamental question was what changes should be noted, and who should notice them---i.e., whether or not it was sufficient for only the two end points of the direct route to notice the route change. In order to reduce the `churning' of connections it was noted that only changes within a given level of route aggregation should cause a change within that level. There was no clear resolution of these issues.

John Garrett stated that the real problem was that NHRP violates the fundamental premise of (current) routing, in that the router uses a path (the one found by NHRP) different from the one it learned by routing. Routing does not talk about this path and therefore can produce inconsistency. There was agreement that this was indeed a source of difficulty. There was no agreement as to whether John's Directed ARP solution, no solution at all, or a minimal exchange of routing information across the direct path was the best solution.

Joel Halpern did note that the Directed ARP solution did not work with aggregation, which was an agreed ROLC requirement. John responded that there was a need to write down the set of criteria and requirements for the ROLC work, since the aggregation requirement, for instance, was not stated in the ROLC scope. Joel agreed.
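
The forwarding loop described in the NHRP and Routing Protocols discussion (R4 falling back to the path through R5 and R6 while R1 still uses its direct route to R4) can be illustrated with a small next-hop trace. This is a sketch under stated assumptions, not something presented at the meeting: the function trace_path is hypothetical, and the next-hop table is invented, including the assumption that the ``etc.'' in the path runs through R7 to R1.

```python
def trace_path(next_hop, start, max_hops=16):
    """Follow next-hop entries from start and report whether a loop occurs.

    Returns (path, looped). The walk stops at a node with no entry
    (treated as delivery), when a node repeats, or after max_hops.
    """
    path = [start]
    seen = {start}
    node = start
    while node in next_hop and len(path) <= max_hops:
        node = next_hop[node]
        path.append(node)
        if node in seen:
            return path, True
        seen.add(node)
    return path, False


# Hypothetical next hops for traffic to H2 after link L1 has failed.
next_hop_after_failure = {
    "R4": "R5",  # R4 falls back to the path around the cloud
    "R5": "R6",
    "R6": "R7",  # assumed: the "etc." in the minutes continues via R7
    "R7": "R1",
    "R1": "R4",  # R1 still uses the stale direct (short cut) route
}

path, looped = trace_path(next_hop_after_failure, "R4")
print(" -> ".join(path), "| loop:", looped)
```

The trace only makes the loop visible; it says nothing about which of the remedies discussed (Directed ARP, a minimal routing exchange across the direct path, or `mini-IDRP') would remove the stale entry at R1.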

Attendees

Masuma Ahmed        mxa@mail.bellcore.com
Anthony Alles       aalles@cisco.com
Susie Armstrong     susie@mentat.com
William Barns       barns@gateway.mitre.org
Jim Beers           Jim.Beers@cornell.edu
Nutan Behki         nebhki@newbridge.com
Tom Benkart         teb@acc.com
Scott Brim          Scott_Brim@cornell.edu
Caralyn Brown       cbrown@wellfleet.com
Steve Buchko        stevebu@newbridge.com
Glen Cairns         cairns@mprgate.mpr.ca
Ross Callon         rcallon@wellfleet.com
Lida Carrier        lida@apple.com
John Chang          jrc@uswest.com
J. Noel Chiappa     jnc@lcs.mit.edu
George Clapp        clapp@ameris.ameritech.com
Robert Cole         rgc@qsun.att.com
Michael Collins     collins@es.net
Rob Coltun          rcoltun@ni.umd.edu
Thomas Coradetti    tomc@digibd.com
Matt Crawford       crawdad@fncent.fnal.gov
Michael Davis       mike@dss.com
Thomas Dimitri      tommyd@microsoft.com
Waychi Doo          wcd@berlioz.nsc.com
Ed Ellesson         ellesson@vnet.ibm.com
Robert Enger        enger@seka.reston.ans.net
Dario Ercole        Dario.Ercole@cselt.stet.it
William Fenner      fenner@cmf.nrl.navy.mil
Dennis Ferguson     dennis@ans.net
James Forster       forster@cisco.com
Craig Fox           craig@ftp.com
Dan Frommer         dan@isv.dec.com
John Garrett        jwg@garage.att.com
John Gawf           gawf@compatible.com
Vincent Gebes       vgebes@sys.attjens.co.jp
Eugene Geer         ewg@cc.bellcore.com
Shawn Gillam        shawn@timonware.com
Fengmin Gong        gong@concert.net
Ramesh Govindan     rxg@thumper.bellcore.com
Daniel Grossman     dan@merlin.dev.cdx.mot.com
Chris Gunner        gunner@dsmail.lkg.dec.com
Joel Halpern        jmh@network.com
John Hanratty       jhanratty@agile.com
Dimitry Haskin      dhaskin@wellfleet.com
Marc Hasson         marc@mentat.com
Ken Hayward         Ken.Hayward@bnr.ca
Denise Heagerty     denise@dxcoms.cern.ch
Juha Heinanen       juha.heinanen@datanet.tele.fi
Kathryn Hill        khill@newbridge.com
Robert Hinden       hinden@eng.sun.com
Refael Horev        horev@lannet.com
Kathy Huber         khuber@wellfleet.com
Melanie Humphrey    msh@uiuc.edu
David Jacobson      dnjake@vnet.ibm.com
Ronald Jacoby       rj@sgi.com
B.V. Jagadeesh      bvj@novell.com
Jan-Olof Jemnemo    jan-olof.jemnemo@farsta.trab.se
Merike Kaeo         mkaeo@cisco.com
Akira Kato          kato@wide.ad.jp
Yasuhiro Katsube    katsube@mail.bellcore.com
Hiroshi Kawazoe     kawazoe@trl.ibm.co.jp
Ted Kuo             tik@vnet.ibm.com
Sundar Kuttalingam  sundark@wiltel.com
Mark Laubach        laubach@hpl.hp.com
Tony Li             tli@cisco.com
Robin Littlefield   robin@wellfleet.com
Kanchei Loa         loa@sps.mot.com
Thang Lu            tlu@mcimail.com
Bryan Lyles         lyles@parc.xerox.com
Dan Magorian        magorian@ni.umd.edu
Tracy Mallory       tracym@3com.com
David Marlow        dmarlow@relay.nswc.navy.mil
Jun Matsukata       jm@eng.isas.ac.jp
Keith McCloghrie    kzm@hls.com
Donald Merritt      don@arl.army.mil
Dennis Morris       morris@altair.disa.mil
Robert Moskowitz    3858921@mcimail.com
Sath Nelakonda      sath@lachman.com
Karen O'Donoghue    kodonog@relay.nswc.navy.mil
Zbigniew Opalka     zopalka@agile.com
Maryann Perez       perez@cmf.nrl.navy.mil
Charles Perkins     perk@watson.ibm.com
Drew Perkins        ddp@fore.com
Radia Perlman       perlman@novell.com
James Philippou     japhilippou@eng.xyplex.com
Ram Ramanathan      ramanath@bbn.com
Kenneth Rehbehn     kjr@netrix.com
Yakov Rekhter       yakov@watson.ibm.com
Allen Rochkind      Allen_Rochkind@3com.com
Robert Roden        roden@roden.enet.dec.com
Benny Rodrig        brodrig@rnd-gate.rad.co.il
Shawn Routhier      sar@epilogue.com
Michal Rozenthal    michal@fibronics.co.il
Greg Ruth           gruth@gte.com
Timothy Salo        tjs@msc.edu
Hal Sandick         sandick@vnet.ibm.com
Isil Sebuktekin     isil@nevin.bellcore.com
Michael See         mikesee@vnet.ibm.com
Paul Serice         serice@cos.com
Satya Sharma        ssharma@chang.austin.ibm.com
Vincent Shekher     vin@sps.mot.com
Ming Sheu           msheu@vnet.ibm.com
Uttam Shikarpur     uttam@zk3.dec.com
W. David Sincoskie  sincos@bellcore.com
Keith Sklower       sklower@cs.berkeley.edu
Andrew Smith        asmith@synoptics.com
Tae Song            tae@novell.com
Martha Steenstrup   msteenst@bbn.com
Steve Suzuki        steve@fet.com
George Swallow      gswallow@bbn.com
Susan Thomson       set@bellcore.com
Dono van-Mierop     dono_van_mierop@3mail.3com.com
Thomas Walsh        tomw@kalpana.com
Guy Wells           guyw@uswest.com
Douglas Williams    dougw@vnet.ibm.com
David Woodgate      David.Woodgate@its.csiro.au
Jessica Yu          jyy@merit.edu
Chin Yuan           cxyuan@pacbell.com