Unable to bring up BGP session between ODL and a router - help needed


peter.lucansky@...
 

Hello all,

 

I'm Peter and I'm new here, so I apologize if some information is not presented properly. I am currently working on a PoC with OpenDaylight (ODL) as an SDN controller for an IP/DC transport network. I'd like to use ODL as a BGP route reflector (BGP RR) for a couple of DC leaf switches (or routers, if you wish) and to establish BGPv4 and BGP-EVPN sessions between ODL and the DC switches (routers).

I built my PoC environment on a single VM running Ubuntu Linux with Docker containers, in which I run my DC switches (or routers). ODL is the latest regular release, Phosphorus-SR3, running directly on the VM (on the host); as DC switches I use Arista cEOS. I program ODL through its REST API from Postman, which also runs directly on the host.

 

Before I introduced ODL into my PoC, I validated the concept and networking with Arista cEOS only. I created 5 containers, all with the same Arista cEOS version: two (2) of them play the role of DC leaf switches (I call them PE1 and PE2), one (1) plays the role of DC spine switch (I call it PE3), and the remaining two (2) play the role of CPEs (CPE1 and CPE2). I run IPv4 with OSPF between PE1, PE2 and PE3, and on top of that iBGPv4 peering in which PE1 and PE2 are route-reflector clients of PE3. CPE1 and CPE2 are directly connected to PE1 and PE2, respectively. I enabled iBGP peering for the IPv4 and EVPN address families (control plane), and VXLAN between PE1, PE2 and PE3 as the data plane. Everything works fine: I can build both a Layer 2 EVPN E-Line service between CPE1 and CPE2 and a Layer 3 IP VPN service.

 

The actual PoC topology (a PNG file) is attached to this email.

 

Once I introduced ODL into the PoC (moving the existing BGP RR function from PE3 to ODL), I started by building just a classic BGP peering between ODL and PE1 (I decided to start with plain peering by commenting out the BGP RR client configuration in the ODL configuration). I created a model of the service (here, iBGP peering) in XML format. I split it into 3 parts:

  • Part 1 – BGP binding address and TCP port. I'm not entirely sure whether this part is necessary, but from the ODL documentation I understand that it is.
  • Part 2 – BGP process definition
  • Part 3 – BGP neighbour definition

 

All three (3) parts of the ODL configuration are in a TXT file attached to this email.

 

Once I applied the configurations to ODL (using PUT calls to the REST API from Postman), I got a successful response from ODL, i.e. everything was accepted. However, the BGP peering between ODL and PE1 remains in the ACTIVE state. This is the problem I can't solve, and I'd like to kindly ask you for help.

 

Doing some research on the PE1 side (with tcpdump), I see that ODL refuses the TCP connection from the PE1 router: I see a TCP packet with the [R.] flags. No other message is received from ODL.
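
For readers less familiar with tcpdump output, the small sketch below decodes a flag field such as `[R.]`. The letter-to-flag mapping is tcpdump's standard notation; the function name `decode_tcpdump_flags` is just an illustrative helper. `[R.]` is RST plus ACK, which is what a host typically sends back when nothing is listening on the destination address and port (a reject rule in a firewall can produce the same packet).

```python
# Single-character TCP flags as printed by tcpdump inside "Flags [...]".
TCPDUMP_FLAGS = {
    "S": "SYN",
    "F": "FIN",
    "P": "PSH",
    "R": "RST",
    "U": "URG",
    "W": "CWR",
    "E": "ECE",
    ".": "ACK",
}


def decode_tcpdump_flags(field: str) -> list:
    """Translate a tcpdump flag field such as '[R.]' into flag names."""
    chars = field.strip().strip("[]")
    # Unknown characters (e.g. the word "none") are simply skipped.
    return [TCPDUMP_FLAGS[c] for c in chars if c in TCPDUMP_FLAGS]


print(decode_tcpdump_flags("[R.]"))  # ['RST', 'ACK']
```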

On the ODL side, I see in the logs that ODL is trying to establish BGP peering with the PE router, but there's an error related to the BGP cluster. I'm not sure whether this is relevant; maybe not, because I don't run multiple ODL instances in my PoC, so I think I don't need to set up a cluster.

 

A log file from ODL (as a TXT file) is attached to this email.

 

I played a lot with the binding address and TCP port on ODL. I changed the address to 127.0.0.1 (host address reachable from all containers, verified by ICMP ping) and to 0.0.0.0. The same goes for the TCP port: as recommended in the ODL documentation I started with TCP port 1790, then changed it to TCP 179, and back again. None of these changes brought the peering up. Of course, every change in ODL was matched by a configuration change on the PE1 router. I can ping the IP address of ODL (127.0.0.1) from all PE routers in the containers, and I can ping all IP addresses used by the PE routers from ODL. Because I see the BGP neighbour state ACTIVE, I assume there is no IP connectivity problem, but something else.
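
A successful ping only exercises ICMP, so it does not show whether anything accepts TCP connections on the BGP port. A minimal sketch of a TCP-level check follows; `tcp_port_open` is a hypothetical helper name, and the host and port to test are whatever ODL is bound to in a given setup.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes."""
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (RST), timeouts and unreachable hosts.
        return False
```

Run from a machine that can reach the controller, `tcp_port_open("172.16.12.1", 1790)` would test the ODL-side address and port that appear in the ODL warning quoted in the update further down; those values are taken from that log line and may differ in other setups.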

 

I had a friend of mine check my service model definitions (in XML format). They all look good. When I issue a PUT call to ODL, I get a successful response. If I issue a GET call for what has been configured in ODL, I get back everything I configured, without an error.
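
For anyone who wants to repeat the GET outside Postman, here is a rough sketch using only the Python standard library. Several things in it are assumptions rather than facts from my setup: the RESTCONF path follows the pattern I understand from the ODL BGPCEP user guide, `bgp-example` is a placeholder for the protocol instance name, and `admin`/`admin` on port 8181 are the ODL defaults. `build_get` is a hypothetical helper that only builds the request; it does not send it.

```python
import base64
import urllib.request

# ASSUMPTION: RFC 8040 style path as used in the BGPCEP user guide examples.
# "bgp-example" is a placeholder; use the protocol name from your own config.
NEIGHBORS_PATH = (
    "/rests/data/openconfig-network-instance:network-instances/"
    "network-instance=global-bgp/openconfig-network-instance:protocols/"
    "protocol=openconfig-policy-types%3ABGP,bgp-example/"
    "bgp-openconfig-extensions:bgp/neighbors"
)


def build_get(base_url: str, path: str,
              user: str = "admin", password: str = "admin"):
    """Build (but do not send) an authenticated RESTCONF GET request."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Accept", "application/xml")
    return req


req = build_get("http://127.0.0.1:8181", NEIGHBORS_PATH)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it; the neighbour addresses in the response can then be compared with the address the router actually uses as the source of its BGP session.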

 

Features installed in ODL:

 

  • features-rest
  • features-bgp
  • odl-bgpcep-bgp
  • odl-bgpcep-bgp-evpn

 

I really don't know what's wrong or why I can't bring up BGP peering with ODL. I experimented a lot with different settings on the ODL side, as well as on the PE router side. I always get the same result: BGP neighbour state ACTIVE.

UPDATE on 26th JULY, 2022:
I've investigated my case further and found that the problem could be in BGP peer registration by ODL. While experimenting with the BGP neighbour specification, I got the following warning in ODL:

12:44:28.591 WARN [epollEventLoopGroup-5-1] Channel [id: 0xff27dc1e, L:/172.16.12.1:1790 - R:/172.16.12.2:40611] negotiation failed: BGP peer with ip: IpAddressNoZone{_ipv4AddressNoZone=Ipv4Address{_value=172.16.12.2}} not configured, check configured peers in : StrictBGPPeerRegistry{peers=[IpAddressNoZone{_ipv4AddressNoZone=Ipv4Address{_value=192.0.2.1}}, IpAddressNoZone{_ipv4AddressNoZone=Ipv4Address{_value=192.0.2.5}}]}


I tried to find this registry in ODL and anything about it in the ODL documentation. Unfortunately, I haven't found anything.
I understand that there's no specific registry I have to fill in with this information; rather, BGP peer registration should happen through the neighbour definition (via RESTCONF from Postman).
If so, then I would really need some help here, because I don't know what I'm doing wrong and why ODL doesn't register the BGP peers.
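
To make the warning above easier to read, the sketch below pulls out the remote address of the rejected connection and the peer addresses listed in the registry. The log line is copied from the warning; the function name `find_peer_mismatch` and the regular expressions are illustrative and assume exactly this log format.

```python
import re

# The warning from the ODL log, copied verbatim.
LOG_LINE = (
    "12:44:28.591 WARN [epollEventLoopGroup-5-1] Channel [id: 0xff27dc1e, "
    "L:/172.16.12.1:1790 - R:/172.16.12.2:40611] negotiation failed: "
    "BGP peer with ip: IpAddressNoZone{_ipv4AddressNoZone=Ipv4Address"
    "{_value=172.16.12.2}} not configured, check configured peers in : "
    "StrictBGPPeerRegistry{peers=[IpAddressNoZone{_ipv4AddressNoZone="
    "Ipv4Address{_value=192.0.2.1}}, IpAddressNoZone{_ipv4AddressNoZone="
    "Ipv4Address{_value=192.0.2.5}}]}"
)

IPV4 = r"(\d{1,3}(?:\.\d{1,3}){3})"


def find_peer_mismatch(line: str):
    """Return (remote address of the session, peers listed in the registry)."""
    remote = re.search(r"R:/" + IPV4 + r":\d+", line).group(1)
    # Only look at the part of the line that describes the registry.
    registry_part = line.split("StrictBGPPeerRegistry", 1)[1]
    configured = re.findall(r"Ipv4Address\{_value=" + IPV4 + r"\}", registry_part)
    return remote, configured


remote, configured = find_peer_mismatch(LOG_LINE)
print("session came from:", remote)
print("peers in registry:", configured)
print("source is a configured peer:", remote in configured)
```

One possible reading of the result: the registry is not empty (it holds 192.0.2.1 and 192.0.2.5), but the connection arrived from 172.16.12.2, which is not one of those addresses. That would suggest the neighbour address configured in ODL and the source address the router uses for the session do not match, rather than that no peers are registered at all.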


I would very much appreciate some help and advice. If anything else (log, dump, etc.) is needed, I can provide it. I can set up a remote session with anyone who would like to see the PoC and the actual situation, or who wants to debug the issue with me online. If I missed something important, please let me know and I will update my request.

 

Thank you very much for reading my message and for any response on the issue.
Peter