Danny's tech notebook | 丹尼技術手札: July 2013

Friday, July 19, 2013

[TRILL] The key point of the Appointed VLAN-x Forwarder

If you have read a lot of TRILL-related documents and still cannot figure out what an Appointed VLAN-x Forwarder is or what it is for, the following excerpts should help.

Whether they run STP or not, the RBridges have to ensure there’s a single point of contact between a VLAN in the STP domain and the backbone, otherwise all the flooded packets would enter the backbone through multiple entry points, resulting in duplicate packets received by the remote hosts (which might break some odd fainthearted protocols running directly on top of L2). One of the RBridges therefore becomes an appointed forwarder for an edge VLAN.

The right-hand part of the figure illustrates the appointed forwarder concept: the RBridges don’t participate in the STP, none of their edge ports are blocked, but only one of the RBridges acts as a forwarder between the edge STP domain and the TRILL backbone (marked with A), all other RBridges ignore packets received through that VLAN (marked with B).

http://blog.ioshints.info/2011/03/trillfabric-path-stp-integration.html

Having multiple RBridges active on a LAN segment could be an issue if they all start forwarding traffic over the TRILL network, as this would cause both traffic duplication and also confusion in terms of the appropriate return path with which to populate the MAC mapping tables. Consequently, RBridges on a VLAN see each other and elect a Designated RBridge (DRB) for the segment, which in turn normally becomes the Appointed Forwarder that is exclusively responsible for sending/receiving frames on that shared segment while all other RBridges effectively are in a kind of standby mode. Technically (i.e. in the protocol specifications) it is possible for a DRB to make other RBridges Appointed Forwarders, but I am not aware of this being implemented yet, and the likelihood is that the DRB will do the AF job itself.

http://lamejournal.com/2011/05/16/layer-2-routing-sort-of-and-trill/

If there are multiple RBridges on the same link, together with end nodes, it is important that only one of them encapsulate a packet from an end node. As illustrated in Figure 9, if both R1 and R2 were to encapsulate a unicast packet from S, two copies would be delivered to the destination. However, if S were to transmit a multidestination packet (such as a multicast, or an unknown destination), then the copy that R1 encapsulates would be forwarded through the campus, received by R2 (which likely would not know that the packet originated on its port to R1), and R2 would decapsulate it. Then R1 would see a native packet from S, exactly as the first copy, and again encapsulate it and send it into the campus.

The hop count in the TRILL header would not solve this loop, because the hop count does not exist while the packet is not encapsulated with a TRILL header.
IS-IS has an election protocol in which one of the RBridges is elected as the Designated RBridge (DRB). In order to allow load-splitting the task of encapsulating and decapsulating traffic, the DRB may delegate the job of encapsulation/decapsulation based on VLAN. In other words, if R1 is DRB, R1 can delegate to R2 the task of encapsulating/decapsulating traffic for a set of VLANs, say VLANs x, y, and z, and delegate to R3 a different set of VLANs, and R1 might handle the rest.
http://www.cisco.com/web/about/ac123/ac147/archived_issues/ipj_14-3/143_trill.html
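
To make the per-VLAN delegation in the excerpt above concrete, here is a minimal Python sketch. The function names and the round-robin split are hypothetical; the excerpt only says that the DRB may delegate by VLAN, not how it picks which RBridge gets which VLAN.

```python
def assign_forwarders(drb, rbridges, vlans):
    """Hypothetical DRB policy: spread VLANs round-robin over the RBridges on the link.

    Returns a dict that maps each VLAN to exactly one appointed forwarder.
    """
    members = sorted(set(rbridges) | {drb})
    return {vlan: members[i % len(members)] for i, vlan in enumerate(sorted(vlans))}


def should_encapsulate(rbridge, vlan, appointments):
    """Only the appointed forwarder for a VLAN encapsulates native frames from it."""
    return appointments.get(vlan) == rbridge


# R1 is the DRB; VLANs 10..40 are shared among R1, R2 and R3.
appointments = assign_forwarders("R1", ["R1", "R2", "R3"], [10, 20, 30, 40])
```

Whatever policy the DRB uses, the property that matters is that each VLAN ends up with exactly one RBridge for which should_encapsulate is true, so a native frame is never encapsulated twice.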

By the way, the lamejournal.com post cited above also introduces the concept of the Designated VLAN. Here is the relevant excerpt:
Some background points that will help to explain things:
1) When RBridges see other RBridges on a multi-access link, they will determine between them which is to be the Designated RBridge (DRB). I should note that on Point-to-Point (P2P) links, no DRB is elected.
2) When an RBridge receives a native (i.e. non-TRILL) frame that it’s going to forward as TRILL-encapsulated, it first adds a 802.1q header to the frame so that the origin VLAN will be known when the frame is decapsulated at the egress RBridge. Thus when the frame format shows the “original Ethernet frame”, it’s really the original frame plus an 802.1q header. You could, if you wanted to make the Shortest Path Bridging folks laugh quietly, liken this a little to QinQ – you’re sending TRILL-encapsulated frames sourced from multiple VLANS over a single VLAN, and inside the TRILL data frame the 802.1q header in the “original” packet means it can be ‘demuxed’ correctly at the other end. Ugh, horrible analogy

3) The reality is that links between RBridges are unlikely to be carrying a single VLAN, but rather they’re likely to be 802.1q trunk links with many VLANs on them. You don’t want to send out TRILL-IS-IS Hellos and run an instance of IS-IS on every VLAN, as that wouldn’t be scalable. It would also be pointless, as TRILL encapsulated frames are not forwarded on the VLAN on which the frame ingressed; rather the TRILL data frames are forwarded on a common VLAN – the Designated VLAN.
So, if we put all that together:
- On any given link, there must be a single VLAN that the RBridges agree to use for the exchange of TRILL-IS-IS and TRILL data.
- On a multi-access link, the DRB dictates what the Designated VLAN will be; other (non-DRB) RBridges on that link MUST use whatever VLAN the DRB dictates.
- On a point-to-point link, the RBridges use tie-break mechanisms to determine whose Designated VLAN should reign supreme (since there’s no DRB)
- The best design obviously would be that you configure all RBridges to prefer the SAME Designated VLAN, so that if the DRB changes, you don’t change Designated VLAN as well.
- You also need to ensure that all RBridges on a link have connectivity to that Designated VLAN. Common sense, really.
So in summary, the Designated VLAN is the VLAN where TRILL-IS-IS really runs, and over which TRILL data forwarding between RBridges occurs. Make sure all RBridges on a link have the same preferred Designated VLAN configured, and ensure they all have connectivity to that VLAN.
http://lamejournal.com/2011/05/16/layer-2-routing-sort-of-and-trill/

Wednesday, July 17, 2013

[OpenFlow] OpenFlow 1.3 Spec Summary

Compared with OF1.0, OF1.3 has more tables and a more complex design. Here I summarize the main items in the OF1.3 spec, including the tables, messages, and so on, so that I can review them more quickly in the future.

OpenFlow Table

Flow Table
+-----------------------------------------------------------------------------------------+
| Match Fields | Priority | Counters | Instructions | Timeouts | Cookie |
+-----------------------------------------------------------------------------------------+

Group Table
+-----------------------------------------------------------------------------+
| Group Identifier | Group Type | Counters | Action Buckets |
+-----------------------------------------------------------------------------+

Group Types

Required: all: Execute all buckets in the group
Optional: select: Execute one bucket in the group.
Required: indirect: Execute the one defined bucket in this group.
Optional: fast failover: Execute the first live bucket.
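
The four group types differ only in which buckets they execute for a packet. Below is a minimal Python sketch of that choice. The bucket shape (a dict with 'actions' and 'live') and the hash-based pick for select are my own assumptions; the spec leaves the select algorithm to the switch.

```python
def select_buckets(group_type, buckets, flow_hash=0):
    """Return the buckets to execute for one packet.

    Each bucket is a dict with an 'actions' list and a 'live' flag (assumed shape).
    """
    if group_type == "all":
        return list(buckets)
    if group_type == "select":
        # The selection algorithm is switch-specific; hashing is an assumption here.
        return [buckets[flow_hash % len(buckets)]] if buckets else []
    if group_type == "indirect":
        if len(buckets) != 1:
            raise ValueError("an indirect group holds exactly one bucket")
        return [buckets[0]]
    if group_type == "fast_failover":
        for bucket in buckets:
            if bucket["live"]:
                return [bucket]
        return []  # no live bucket: nothing is executed
    raise ValueError(f"unknown group type: {group_type}")
```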

Meter Table
+-------------------------------------------------------+
| Meter Identifier | Meter Bands | Counters |
+-------------------------------------------------------+

Meter Bands

Band Type

Drop
Remark DSCP

New Data Structure in Pipeline
+-------------------------------------------------------+
| metadata | packet header | Action Set |
+-------------------------------------------------------+

Instructions
Each flow entry contains a set of instructions that are executed when a packet matches the entry.

Optional Instruction: Meter meter id: Direct packet to the specified meter.

Optional Instruction: Apply-Actions action(s): Applies the specified action(s) immediately, without any change to the Action Set.

Optional Instruction: Clear-Actions: Clears all the actions in the action set immediately.

Required Instruction: Write-Actions action(s): Merges the specified action(s) into the current action set.

Optional Instruction: Write-Metadata metadata / mask: Writes the masked metadata value into the metadata field.

Required Instruction: Goto-Table next-table-id: Indicates the next table in the processing pipeline.
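
To see how these instructions change the state a packet carries through the pipeline, here is a minimal Python sketch. The dict encoding of the instructions and the function name are assumptions made for illustration, and the Meter instruction is left out to keep it short.

```python
def run_instructions(instructions, action_set, metadata):
    """Apply one flow entry's instructions to the per-packet pipeline state.

    instructions: dict keyed by instruction name (an assumed encoding).
    action_set:   dict of action type -> action, so a later write replaces
                  an earlier action of the same type.
    Returns (action_set, metadata, next_table, applied), where applied lists
    the actions executed immediately by Apply-Actions.
    """
    applied = list(instructions.get("apply_actions", []))  # runs now, set untouched
    action_set = dict(action_set)
    if instructions.get("clear_actions"):
        action_set.clear()
    for action_type, action in instructions.get("write_actions", []):
        action_set[action_type] = action  # merge: overwrite the same-type action
    if "write_metadata" in instructions:
        value, mask = instructions["write_metadata"]
        metadata = (metadata & ~mask) | (value & mask)  # only masked bits change
    next_table = instructions.get("goto_table")  # None means the pipeline stops here
    return action_set, metadata, next_table, applied
```

When next_table comes back as None, the pipeline ends and the accumulated action set is what gets executed.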

Action Set

The actions in an action set are applied in the order specified below, regardless of the order that they were added to the set.

copy TTL inwards: apply copy TTL inward actions to the packet

pop: apply all tag pop actions to the packet

push-MPLS: apply MPLS tag push action to the packet

push-PBB: apply PBB tag push action to the packet

push-VLAN: apply VLAN tag push action to the packet

copy TTL outwards: apply copy TTL outwards action to the packet

decrement TTL: apply decrement TTL action to the packet

set: apply all set-field actions to the packet

qos: apply all QoS actions, such as set queue to the packet

group: if a group action is specified, apply the actions of the relevant group bucket(s) in the order specified by this list

output: if no group action is specified, forward the packet on the port specified by the output action
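
The fixed ordering above can be sketched in a few lines of Python. The short action names are labels I made up for this example, and the action set is simplified to one entry per action type.

```python
# Execution order of the action set, following the list above.
ACTION_SET_ORDER = [
    "copy_ttl_in", "pop", "push_mpls", "push_pbb", "push_vlan",
    "copy_ttl_out", "dec_ttl", "set_field", "qos", "group", "output",
]


def execution_order(action_set):
    """Return the action types in the order they are applied to the packet."""
    present = set(action_set)
    if "group" in present:
        present.discard("output")  # output only applies when no group action is set
    return [name for name in ACTION_SET_ORDER if name in present]
```

For example, execution_order({"output", "push_vlan", "pop"}) puts pop before push_vlan and output last, no matter in which order the actions were written into the set.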

Action List

The Apply-Actions instruction and the Packet-out message include an action list.

Actions

Required Action: Output. The Output action forwards a packet to a specified OpenFlow port.
Optional Action: Set-Queue. The set-queue action sets the queue id for a packet.
Required Action: Drop. There is no explicit action to represent drops. Instead, packets whose action sets have no output action are dropped.
Required Action: Group. Process the packet through the specified group.
Optional Action: Push-Tag/Pop-Tag. Switches may support the ability to push/pop tags

Push / Pop VLAN header
Push / Pop MPLS header
Push / Pop PBB header

Optional Action: Set-Field. The various Set-Field actions are identified by their field type and modify the values of respective header fields in the packet.
Optional Action: Change-TTL. The various Change-TTL actions modify the values of the IPv4 TTL, IPv6 Hop Limit or MPLS TTL in the packet.

Set MPLS TTL

8 bits: New MPLS TTL

Decrement MPLS TTL
Set IP TTL

8 bits: New IP TTL

Decrement IP TTL
Copy TTL outwards
Copy TTL inwards

OpenFlow Channel
Controller-to-Switch Message
Handshake:
Features:
Switch-Configuration:
Flow Table Configuration:
Modify-State message:
Multipart message:
Queue-Configuration message:
Read-State:
Packet-out message:
Barrier message:
Role-Request message:
Set-Asynchronous-Configuration message:

Asynchronous Message
Packet-in:
Flow-Removed:
Port-status:
Error:

Symmetric Message
Hello:
Echo Request/Reply:
Experimenter:

Flow Table Modification Messages

Group Table Modification Messages

Meter Modification Messages

Tuesday, July 16, 2013

[TRILL] TRILL Summary for TRILL Test Suite

The following list gives the key points of each test case in the TRILL Interoperability Test Suite document.

IS-IS

For Neighbor Info in Hello Message

All RBridges must become adjacent with one another. TRB0 and TRB1 must list each other as neighbors in their TRILL Hellos on link 1. TRB1 and TRB2 must list each other as neighbors in their TRILL Hellos on link 3. TRB0 and TRB2 must list each other as neighbors in their TRILL Hellos on link 2.

Designated RBridge Election is based on

Priority, with the MAC address as the tiebreaker
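
As a quick illustration of the election rule, here is a minimal Python sketch. It assumes that the higher priority wins and that a tie goes to the numerically higher MAC address, as in the usual IS-IS designated-node election; the tuple layout and the function name are mine.

```python
def elect_drb(candidates):
    """Pick the Designated RBridge on a link.

    candidates: list of (name, priority, mac), mac given as '00:11:22:33:44:55'.
    Assumption: higher priority wins, then the numerically higher MAC address.
    """
    def key(candidate):
        _, priority, mac = candidate
        return (priority, int(mac.replace(":", ""), 16))

    return max(candidates, key=key)[0]
```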

Incremental Deployment Functionality

Nickname Collision is solved by

Priority, IS-IS System ID

Configure TRB1 and TRB2 to have an MTU of 1280 on link 3

The Campus Wide MTU Sz value must be 1280 on all RBridges. The originatingLSPBufferSize in each RBridge’s LSP must be set to 1280.

RBridges perform IP Snooping for multicast data

TES3 sends multicast data for IPv4 multicast group 224.0.6.130 on link 3.

TES0 sends an IGMPv3 report that excludes nothing (i.e. a join) for multicast group 224.0.6.130 on link 0.

The multicast data must reach TES0.

TES0 sends an IGMPv3 report that includes nothing (i.e. a leave) for multicast group 224.0.6.130 on link 0.

The multicast data must not reach TES0.
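
The expected snooping behaviour in this test case can be sketched as a small Python table. The class and method names are hypothetical, and the model tracks only whether a link has any interested receiver, ignoring per-source filtering and timers.

```python
class SnoopingTable:
    """Tracks which links have receivers for each multicast group (simplified)."""

    def __init__(self):
        self.members = {}  # group address -> set of links with receivers

    def on_report(self, link, group, filter_mode, sources=()):
        if filter_mode not in ("include", "exclude"):
            raise ValueError(f"unknown filter mode: {filter_mode}")
        # EXCLUDE (even of nothing) asks for traffic; INCLUDE of nothing is a leave.
        interested = filter_mode == "exclude" or len(sources) > 0
        links = self.members.setdefault(group, set())
        if interested:
            links.add(link)
        else:
            links.discard(link)

    def should_forward(self, group, link):
        return link in self.members.get(group, set())
```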

RBridges receive and transmit BPDUs correctly

Inhibits the appointed forwarder for a period of time between zero and 30 seconds on Root Bridge Change
Sends Topology Change BPDU on change of Appointed Forwarder

When an RBridge ceases to be appointed forwarder for one or more VLANs out a particular port, it SHOULD, as long as it continues to receive spanning tree BPDUs on the port, send topology change BPDUs until it sees the topology change acknowledged in a spanning tree configuration BPDU.

Hop Count Handling

A transit RBridge must decrement the TRILL hop count of encapsulated frames
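
A minimal Python sketch of the hop count handling in a transit RBridge follows. Dropping a frame that arrives with a hop count of zero, and the 6-bit field width, are my reading of the TRILL base protocol and are not stated in these notes; the function name is made up.

```python
def forward_transit(hop_count):
    """Return the hop count to forward with, or None if the frame is dropped."""
    if not 0 <= hop_count <= 0x3F:
        raise ValueError("the TRILL hop count is a 6-bit field")
    if hop_count == 0:
        return None  # assumption: a frame received with hop count 0 is dropped
    return hop_count - 1
```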

RBridge Loss and Link Loss Handling

Unicast Pathway RBridge Loss
Unicast Pathway Link Loss
Distribution Tree Root Loss
Distribution Tree Root Link Loss

TRB1 must notify TRB0 of the link failure through transmission of an updated IS-IS LSP.

Distribution Tree RBridge Loss

TRB0 must be the appointed forwarder on link 0, 1 and 2. TRB1 must be the appointed forwarder on link 3.

Distribution Tree RBridge Link Loss

TRB2 must notify TRB0 of the link failure through transmission of an updated IS-IS LSP.

Shortest Path First Calculation

TRILL distribution trees are calculated with the shortest path first algorithm

Root Choice

The RBridge with the highest priority becomes the root of the distribution tree
If priorities are equal, the RBridge with the higher IS-IS System ID is chosen
The maximum Distribution Tree Root Priority is 0xFFFF
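
The root choice rule can be sketched in Python as a ranking. The hex-string form of the System ID and the function name are assumptions for this example.

```python
MAX_ROOT_PRIORITY = 0xFFFF  # upper bound noted above


def rank_tree_roots(candidates):
    """Order candidates from most to least preferred distribution tree root.

    candidates: list of (system_id, priority) pairs, where system_id is a
    hex string such as "0000000000a1" (an assumed representation).
    """
    for system_id, priority in candidates:
        if not 0 <= priority <= MAX_ROOT_PRIORITY:
            raise ValueError(f"priority out of range for {system_id}")
    # Highest priority first; equal priorities fall back to the higher System ID.
    return sorted(candidates, key=lambda c: (c[1], int(c[0], 16)), reverse=True)


ranked = rank_tree_roots([
    ("0000000000a1", 0x8000),
    ("0000000000b2", 0x8000),
    ("0000000000c3", 0x4000),
])
root = ranked[0][0]
```

Here the first two candidates tie on priority, so the one with the higher System ID ends up first in the ranking.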

Number of Trees to calculate

The number of distribution trees computed must not exceed the maximum number of distribution trees that can be computed
Multiple distribution trees are used for load balancing

Set of Trees to calculate

Advertising a set of roots produces multiple distribution tree roots

Tie Breaking

A nickname can be used to refer to a distribution tree root

No Receivers Pruning

A distribution tree is pruned on a link when there are no receivers on that link.

VLAN