Re: Questions about HWVTEP


Sumit Garg <sumit@...>
 

Please see inline below:
-- 

Sumit Garg

Extreme Networks

sumit@...

+1 (919) 595-4971



From: daya kamath <daya_k@...>
Reply-To: daya kamath <daya_k@...>
Date: Monday, August 24, 2015 at 10:06 AM
To: Sumit Garg <sumit@...>, "Ravi_Sabapathy@..." <Ravi_Sabapathy@...>, "dayavanti.gopal.kamath@..." <dayavanti.gopal.kamath@...>, "shague@..." <shague@...>, "vishal.thapar@..." <vishal.thapar@...>
Cc: "ovsdb-dev@..." <ovsdb-dev@...>, "ravindra.kenchappa@..." <ravindra.kenchappa@...>
Subject: Re: [ovsdb-dev] Questions about HWVTEP

my understanding is partly similar to that of sumit.

creation of the VTEPs themselves is mgmt action, and should be out of scope for this activity. 

[Sumit]:

I broadly agree.

Though I'm a bit thrown off by "VTEP". In this context, does VTEP == the local IP address on a switch (physical or virtual) that is the source or destination IP of the outer VXLAN tunnel?

In theory, each underlay switch in the network can have multiple of these IP addresses. However, IMO, in most cases:
(a) each underlay (physical or virtual) switch in the network would have only one of these IP addresses
(b) the IP address would belong to a "loopback" interface

this value can be published in tunnel_ips field in the physical switch table by the ovsdb-server, and the controller will use these tunnel_ips to build tunnels.

[Sumit]:
Agree.

An underlay switch must publish at least one IP address in the tunnel_ips field.

If a switch publishes multiple IP addresses in the tunnel_ips field, the controller can use any of these IP addresses (typically tunnel_ips[0]) to create VXLAN tunnels. The switch software should not expect any kind of load balancing, redundancy, health monitoring etc. between the multiple IP addresses it publishes.
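
A minimal sketch of the selection rule described above, in Python. The function name `pick_tunnel_ip` and the representation of tunnel_ips as an ordered list of strings are assumptions made for illustration; they are not taken from any controller code.

```python
from typing import Sequence


def pick_tunnel_ip(tunnel_ips: Sequence[str]) -> str:
    """Return the IP the controller uses as the VXLAN tunnel endpoint.

    Hypothetical helper: it simply takes the first published address and
    does no load balancing, redundancy, or health checking.
    """
    if not tunnel_ips:
        # The switch is expected to publish at least one address.
        raise ValueError("switch must publish at least one IP in tunnel_ips")
    return tunnel_ips[0]


if __name__ == "__main__":
    # Illustrative addresses only.
    print(pick_tunnel_ip(["10.0.0.1", "10.0.0.2"]))
```

With this rule, publishing extra addresses changes nothing for the controller unless the first entry changes.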

also agree that creation of the tunnels is not dependent on MACs. the controller can choose to create all tunnels when the l2-gw api is invoked, and the resolution of the underlying MACs can be left to the implementation on the switch.

[Sumit]: Agree

i think the contention comes in when we have a choice of multiple VTEPs on a device and the controller needs to choose which VTEP to use for reaching a remote endpoint. there may not be l3 connectivity between a given pair of VTEPs, in which case some other mechanism would be useful in determining the correct source VTEP for a given dest VTEP. (i am not sure if this is a use case we need to support for now, but good to have a discussion on this anyways)

[Sumit]: I'm in favor of keeping this scenario super simple (KISS) - just pick tunnel_ips[0]. The controller should not try to do anything smart. See my note above.

i think we have the following options -

1. if the VTEPs are in the same subnet, the controller can pick the correct tunnel_ip to build a specific tunnel, based on subnet match (no use of ARP).
however, if the VTEPs are in different subnets, we need some L3 mechanism to figure out the connectivity.

[Sumit]: I'm lost. Subnet implies some kind of netmask. How would this netmask be known by the controller? How would the controller handle the same IP address range subnetted differently (192.168/16 vs. 192.168.10/24)? Again, I prefer no unnecessary logic in the controller - just pick tunnel_ips[0].
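
To make the netmask concern concrete, here is a small Python sketch using the standard `ipaddress` module. The two VTEP addresses and the helper name `same_subnet` are hypothetical; the point is only that a "subnet match" gives different answers depending on a prefix length that bare tunnel_ips addresses do not carry.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses fall in the same network for prefix_len."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


if __name__ == "__main__":
    local_vtep = "192.168.10.5"   # illustrative address
    remote_vtep = "192.168.20.5"  # illustrative address
    print(same_subnet(local_vtep, remote_vtep, 16))  # True
    print(same_subnet(local_vtep, remote_vtep, 24))  # False
```

The same pair of addresses matches under a /16 and does not match under a /24, so the controller would need the mask from somewhere else.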

2. enumerate all possible tunnels in the tunnel table, then monitor the BFD on it, and prune the non-working tunnels based on BFD output.

e.g., on the local switch, i have VTEPs A and B, and i need to build a tunnel to VTEP C. i can create tunnels AC as well as BC, and enable BFD on them. then i prune either AC or BC based on the BFD notification.

in this case, the tunnel table and its scope become important. from the last meeting, my understanding was that it is only used for BFD today. can one of the vendors comment on when the tunnel table is expected to be populated by the controller?


3. rely on MAC learning in the device. so, if the Ucast_Macs_Local table is populated with a physical locator, we use that physical locator's VTEP for a given logical switch when setting up tunnels to the remote endpoints of the same logical network. however, this would not be very useful if MAC learning has not happened yet for a logical switch on the hardware switch.
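
Option 2 above (enumerate every candidate tunnel, then prune on BFD) could be sketched roughly as follows in Python. The function names and the `bfd_up` dictionary are hypothetical stand-ins for whatever BFD status the controller actually receives; nothing here reflects a real controller or OVSDB API.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Tunnel = Tuple[str, str]  # (local VTEP, remote VTEP)


def candidate_tunnels(local_vteps: Sequence[str],
                      remote_vteps: Sequence[str]) -> List[Tunnel]:
    """Enumerate every local/remote VTEP pair as a candidate tunnel."""
    return list(product(local_vteps, remote_vteps))


def prune(candidates: Sequence[Tunnel],
          bfd_up: Dict[Tunnel, bool]) -> List[Tunnel]:
    """Keep only tunnels reported up; tunnels with no report are dropped."""
    return [t for t in candidates if bfd_up.get(t, False)]


if __name__ == "__main__":
    tunnels = candidate_tunnels(["A", "B"], ["C"])  # AC and BC
    # Assumed BFD outcome for illustration: AC is up, BC is down.
    print(prune(tunnels, {("A", "C"): True, ("B", "C"): False}))
```

Note that the number of candidates grows as the product of local and remote VTEP counts, which is one cost of this option compared with always picking tunnel_ips[0].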

any other ideas?


thanks,
daya





From: Sumit Garg <sumit@...>
To: "Ravi_Sabapathy@..." <Ravi_Sabapathy@...>; "dayavanti.gopal.kamath@..." <dayavanti.gopal.kamath@...>; "shague@..." <shague@...>; "vishal.thapar@..." <vishal.thapar@...>
Cc: "ovsdb-dev@..." <ovsdb-dev@...>; "ravindra.kenchappa@..." <ravindra.kenchappa@...>
Sent: Thursday, August 20, 2015 8:47 PM
Subject: Re: [ovsdb-dev] Questions about HWVTEP

I'm confused.

Why is the MAC of the TEP (tunnel end point) IP needed for creating the VXLAN tunnel?

TEP-IP is published in the hardware_vtep schema (Physical_Switch table). It is also published in the LOCAL_MAC tables.

TEP-IP (and its MAC) is part of the underlay network. IMO, the overlay orchestration layer (openstack, ODL etc.) doesn't configure & manage the underlay. That happens using some other means, e.g. CLI for hardware switches, GUI/config files for hypervisors (ESX, linux hosts etc.).
-- 
Sumit Garg
Extreme Networks
+1 (919) 595-4971




From: "Ravi_Sabapathy@..." <Ravi_Sabapathy@...>
Date: Thursday, August 20, 2015 at 10:12 AM
To: "dayavanti.gopal.kamath@..." <dayavanti.gopal.kamath@...>, "shague@..." <shague@...>, "vishal.thapar@..." <vishal.thapar@...>
Cc: "ovsdb-dev@..." <ovsdb-dev@...>, "ravindra.kenchappa@..." <ravindra.kenchappa@...>
Subject: Re: [ovsdb-dev] Questions about HWVTEP

Hi Daya,
 
  I have a general query about the hardware VTEP use case on a hardware switch:
 
Case 1 :
    The VxLAN tunnel should be created in the hardware switch only after the ARP for the tunnel end point IP is resolved. The modules that interact with the OVSDB server and program the hardware should take care of resolving the ARP.
 
Case 2:
    The hardware switch can use an L3 protocol to advertise its tunnel end point IP to the other end of the tunnel, and vice versa.
 
For example,
The tunnel IP can be a loopback IP, and this IP can be advertised to the other end of the tunnel (using BGP/OSPF), and vice versa. After this, the MAC will be resolved and the VxLAN tunnel will be created.
 
 
Correct me if I'm wrong about the above use cases.
Regards,
Ravi
 
From: Dayavanti Gopal Kamath [mailto:dayavanti.gopal.kamath@...]
Sent: Thursday, August 20, 2015 7:21 PM
To: Sabapathy, Ravi <Ravi_Sabapathy@...>; shague@...; Vishal Thapar <vishal.thapar@...>
Cc: ovsdb-dev@...; ravindra.kenchappa@...
Subject: RE: [ovsdb-dev] Questions about HWVTEP
 
We discussed this in a community meeting just after the summit. I think the schema basically assumes MACs will be unique, since it is keyed by MAC address. andre also suggested it would be ok to assume uniqueness for now, since one openstack instance typically does not re-use MACs across its VMs. long term, we all agreed we need to change the table so that the key is mac + logical switch, but for now we are going ahead with this assumption to get something working.
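
A short Python sketch of why the key matters. The first entry reuses the sample values quoted further down in this thread (VXLAN ID 4656, MAC 00:00:01:00:00:01, tunnel IP 36.1.1.1); the second entry and the helper name `index_entries` are hypothetical, added only to show a collision.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Entry = Tuple[str, str, str]  # (mac, logical_switch, locator_ip)


def index_entries(entries: List[Entry],
                  key_fn: Callable[[Entry], Hashable]) -> Dict[Hashable, Entry]:
    """Build a table keyed by key_fn; a later entry with the same key overwrites."""
    table: Dict[Hashable, Entry] = {}
    for entry in entries:
        table[key_fn(entry)] = entry
    return table


entries: List[Entry] = [
    ("00:00:01:00:00:01", "4656", "36.1.1.1"),  # sample values from this thread
    ("00:00:01:00:00:01", "5000", "36.1.1.2"),  # hypothetical second tenant
]

by_mac = index_entries(entries, lambda e: e[0])
by_mac_and_ls = index_entries(entries, lambda e: (e[0], e[1]))

if __name__ == "__main__":
    print(len(by_mac))         # 1: the second tenant's entry replaced the first
    print(len(by_mac_and_ls))  # 2: both tenants keep their own entry
```

Keyed by MAC alone, the same MAC in two logical switches collapses to one row; keyed by mac + logical switch, both survive.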
 
Thanks,
daya
 
From:Ravi_Sabapathy@... [mailto:Ravi_Sabapathy@...]
Sent: Thursday, August 20, 2015 6:51 PM
To: shague@...; Vishal Thapar
Cc: Dayavanti Gopal Kamath; ovsdb-dev@...; ravindra.kenchappa@...
Subject: RE: [ovsdb-dev] Questions about HWVTEP
 
Added in line comments.
 
Regards,
Ravi Shankar
 
From:ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] On Behalf Of Sam Hague
Sent: Thursday, August 20, 2015 6:09 PM
To: Vishal Thapar <vishal.thapar@...>
Cc: Dayavanti Gopal Kamath <dayavanti.gopal.kamath@...>; ovsdb-dev@...; Kenchappa, Ravindra <ravindra.kenchappa@...>
Subject: Re: [ovsdb-dev] Questions about HWVTEP
 
 
 
On Thu, Aug 20, 2015 at 4:22 AM, Vishal Thapar <vishal.thapar@...> wrote:
Hi Ravi,
 
How're you doing? I had a couple of questions that came up in the previous few HWVTEP meetings and was wondering if you could help answer them:
 
1.      If we have two hwvtep devices [OVS or ToR], how do they learn each other’s local macs? Is there some sort of mac learning built into them or do we need to explicitly add one node’s local macs as remote on other one?
I thought some of this would happen via ovsdb. The switch would learn its local macs as normal and populate its local ovsdb macs. Then these entries would get pushed as remote macs in the other ovsdb tables - via the hwvtep netvirt.
 
2.      Are the macs unique? Openstack supports same mac in different tenant networks. Does hwvtep support non-unique macs and if yes how?
      The same MAC can be present in different tenant networks, since each tenant network is a logical network. I am not sure about the implementation details of hardware VTEP in ODL. Having said that, MACs learned from any hardware gateway will be associated with a logical switch / VXLAN ID in the Ucast_Macs_Local and Ucast_Macs_Remote tables.

Sample entry of ucast mac local/remote table:
Total Mac Count:    1
VXLAN ID (Logical_Switch)    MAC                  TUNNEL IP (locator: Physical_Locator)
4656                         00:00:01:00:00:01    36.1.1.1

So, now the MAC 00:00:01:00:00:01 can be associated with a different logical network.
 
Hope you get well soon.
 
Thanks and Regards,
Vishal.

_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev
 



DISCLAIMER:
This e-mail and any attachments to it may contain confidential and proprietary material and is solely for the use of the intended recipient. Any review, use, disclosure, distribution or copying of this transmittal is prohibited except by or on behalf of the intended recipient. If you have received this transmittal in error, please notify the sender and destroy this e-mail and any attachments and all copies, whether electronic or printed.

