rgirada [Thu, 13 Jun 2019 17:21:37 +0000 (10:21 -0700)]
pimd: Added cli to generate igmp query.
Fix details:
Added a utility CLI to generate an IGMP query on an interface.
This won't impact the existing query generation based on the
general query interval.
pimd: remove pim and igmp OIFs when ifchannel_delete happens
In a pim-evpn setup (say TORC11<=>TORC12) an mroute can have a mix of
PIM and IGMP joins. The vxlan termination device ipmr-lo is IGMP
joined on termination mroutes and the peerlink-rif can be pim joined
on the same mroute if the MLAG peer (TORC11) loses all its uplinks to
underlay -
root@TORC12:~# net show pim state 239.1.1.101|grep pimreg
1 * 239.1.1.101 uplink-1
pimreg(I ), ipmr-lo( J ), peerlink-3.4094( J )
root@TORC12:~#
When the uplinks come back up on TORC11 it will prune the peerlink-rif
and join the RP (say spine) via the uplinks.
TORC12 is receiving the prune and removing the if_channel
(pim_ifchannel_delete). However it is not removing the OIF from
mfcc_ttl, basically leaving behind a leaked OIF in the forwarding
entry. And this is because it is deriving the owner flag from the
parent upstream entry and incorrectly concluding that all OIFs are
IGMP joined.
This fix flushes out both PIM and IGMP ownership when the ifchannel is
deleted.
There is a second fix in the commit and that is to set the proto mask
correctly (to STAR) for inherited OIFs. (S,G) entries can inherit the
OIF from the (*, G) entry and this decision can change when the pim/igmp
ifchannel is removed. The earlier code was setting the proto-mask
incorrectly to PIM or IGMP.
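A minimal, self-contained sketch of the intent; the flag and helper names below are illustrative stand-ins, not the actual pimd symbols:

#include <stdint.h>

#define PROTO_PIM  0x1  /* stand-in for the PIM ownership flag */
#define PROTO_IGMP 0x2  /* stand-in for the IGMP ownership flag */

struct oif {
    uint32_t proto_mask; /* which protocols "own" this outgoing interface */
    int installed;       /* OIF present in the kernel forwarding entry */
};

/* On ifchannel delete, drop BOTH PIM and IGMP ownership instead of guessing
 * the owner from the parent upstream entry; with no owner left, the OIF is
 * also withdrawn from the forwarding entry (mfcc_ttl). */
static void oif_flush_ownership(struct oif *oif)
{
    oif->proto_mask &= ~(PROTO_PIM | PROTO_IGMP);
    if (!oif->proto_mask)
        oif->installed = 0;
}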
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
(cherry picked from commit d4d1d968dbbe61347393f7dace8b675496ff1024)
pimd: ensure that the oif is removed from all the mroutes pre-vifi deletion
When a link goes down the vifi was being deleted but the OIF stayed
in the OIL with a stale vifi -
root@act-7726-03:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN
Installed Source Group IIF OIL
1 * 239.1.1.111 swp1s1 pimreg(I ), ipmr-lo( J )
1 6.0.0.28 239.1.1.111 lo pimreg( J ), ipmr-lo( *), swp1s1( J )
root@act-7726-03:~# ip link set swp1s1 down
root@act-7726-03:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN
Installed Source Group IIF OIL
1 * 239.1.1.111 swp1s0 pimreg(I ), ipmr-lo( J )
1 6.0.0.28 239.1.1.111 lo ipmr-lo( *), swp1s0( J ), <oif?>( J ) >>>>>>>>
root@act-7726-03:~#
The problem was that, as part of ifchannel_delete, the join state of the
channel was checked to avoid incorrect OIF deletion, and this was preventing
the OIF from being flushed. The fix is to flip the channel join-state to
NOINFO before deleting it.
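A minimal sketch of the ordering change; the types and function names are stand-ins, not the real pimd code:

enum ch_state { CH_NOINFO, CH_JOIN };

struct ifchannel {
    enum ch_state join_state;
};

/* OIF removal that (correctly) refuses to act while the channel is joined. */
static void oif_remove(struct ifchannel *ch)
{
    if (ch->join_state == CH_JOIN)
        return; /* the old flow hit this and left the stale OIF behind */
    /* ... clear the OIF from the mroute's OIL ... */
}

static void ifchannel_delete(struct ifchannel *ch)
{
    /* Flip to NOINFO first so the OIF removal below is not skipped. */
    ch->join_state = CH_NOINFO;
    oif_remove(ch);
    /* ... free the channel ... */
}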
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Philippe Guibert [Mon, 17 Jun 2019 13:17:20 +0000 (15:17 +0200)]
bfdd: avoid double socket initialisation on same netns
When working with a standard vrf backend, bfdd ignores that and tries to
create and configure bfd sockets for each vrf. This fails for the
second vrf discovered, since the network namespace used is the same and
it is not possible to use the same socket settings twice. Handle this case
and avoid reinitialising the sockets.
This patch however does not leverage bfd support for vrf-lite.
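A minimal sketch of the guard; none of the names below are the real bfdd API, they only illustrate the "initialise once per shared namespace" behaviour:

#include <stdbool.h>

struct vrf_ctx {
    const char *name;
    bool shared_netns; /* vrf-lite backend: all VRFs live in one namespace */
};

static bool bfd_sockets_initialised;

static void bfd_socket_init(const struct vrf_ctx *vrf)
{
    /* With a shared network namespace only the first VRF may create and
     * bind the BFD sockets; later VRFs must reuse them instead of failing. */
    if (vrf->shared_netns && bfd_sockets_initialised)
        return;

    /* ... socket(), setsockopt(), bind() ... */
    bfd_sockets_initialised = true;
}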
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Thu, 20 Jun 2019 15:12:35 +0000 (11:12 -0400)]
bgpd: `neighbor X:X::X default-originate` complains about (null)
The `neighbor X:X::X default-originate` command is complaining
that:
The route-map '(null)' does not exist.
Upon inspection of the code we were passing a NULL
string to the lookup. Testing for null gets us this:
donna.cumulusnetworks.com# conf t
donna.cumulusnetworks.com(config)# router bgp 99
donna.cumulusnetworks.com(config-router)# neighbor 2001:1::1:2 remote-as 99
donna.cumulusnetworks.com(config-router)# neighbor 2001:1::1:2 default-originate
donna.cumulusnetworks.com(config-router)# end
donna.cumulusnetworks.com# show run
Building configuration...
Current configuration:
!
frr version 7.2-dev
frr defaults datacenter
hostname donna.cumulusnetworks.com
log stdout
no ipv6 forwarding
!
ip route 4.5.6.7/32 10.50.11.4
!
router bgp 99
neighbor 2001:1::1:2 remote-as 99
!
address-family ipv4 unicast
neighbor 2001:1::1:2 default-originate
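The underlying fix is a NULL check before the route-map lookup; a minimal sketch with illustrative names (not the exact bgpd code):

#include <stdio.h>

struct route_map;

/* Stand-in for the real lookup by name. */
static struct route_map *route_map_lookup(const char *name)
{
    (void)name;
    return NULL;
}

static void default_originate_rmap_check(const char *rmap_name)
{
    /* Only warn about a missing route-map when a name was actually given;
     * a plain default-originate passes no route-map at all. */
    if (rmap_name && !route_map_lookup(rmap_name))
        printf("The route-map '%s' does not exist.\n", rmap_name);
}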
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
cast a bit too broad of a stroke. We should only inform
the user that we were ignoring the RTM_NEWNEIGH FAIL callback
when we believe it was one of our own 5549 entries.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 20 Jun 2019 08:55:47 +0000 (04:55 -0400)]
zebra: Put route in debug dump of rib data
When dumping rib data about a route for `debug rib detail`,
modify the dump command to display the prefix as part
of every line so that we can grep the log
file.
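A minimal sketch of the logging shape; printf() stands in for zebra's zlog_debug() and the fields are illustrative:

#include <stdio.h>

struct route_entry {
    int type;
    unsigned int flags;
    unsigned int metric;
};

/* Every line of the dump repeats the prefix, so grepping the log for
 * "10.1.1.0/24" returns the complete dump for that one route. */
static void rib_debug_dump(const char *prefix_str, const struct route_entry *re)
{
    printf("%s: type %d flags 0x%x\n", prefix_str, re->type, re->flags);
    printf("%s: metric %u\n", prefix_str, re->metric);
}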
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 20 Jun 2019 08:37:57 +0000 (04:37 -0400)]
pimd: Remove output of `debug igmp trace detail` from show commands
There has never been a `debug igmp trace detail` command, but we have
had code to display this output when the appropriate flags were
set. Since this combination can never be enabled, let's remove this code.
Visakha Erina [Wed, 19 Jun 2019 13:38:31 +0000 (06:38 -0700)]
lib: Keep proper count of prefix-list hit-count when used
When a prefix-list is applied to a BGP neighbor to deny the learning
of specific routes, the hit count is showing 0 for BGP even though
the routes are being filtered correctly due
to the configured prefix-list.
Before fix:
c1# show ip prefix-list nag seq 10
ZEBRA: seq 10 permit any (hit count: 0, refcount: 0)
BGP: seq 10 permit any (hit count: 0, refcount: 0)
c1# show ip prefix-list nag seq 5
ZEBRA: seq 5 deny 1.0.1.0/24 (hit count: 0, refcount: 0)
BGP: seq 5 deny 1.0.1.0/24 (hit count: 0, refcount: 0)
Fix: Increment the prefix-list's hit count whenever a rule match occurs.
After Fix:
c1# show ip prefix-list nag seq 10
ZEBRA: seq 10 permit any (hit count: 0, refcount: 0)
BGP: seq 10 permit any (hit count: 6, refcount: 0)
c1# show ip prefix-list nag seq 5
ZEBRA: seq 5 deny 1.0.1.0/24 (hit count: 0, refcount: 0)
BGP: seq 5 deny 1.0.1.0/24 (hit count: 1, refcount: 0)
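A minimal sketch of where the counter is bumped; the field and function names are stand-ins for the lib/plist.c internals:

struct prefix_list_entry {
    long seq;
    unsigned long hitcnt; /* rendered as "hit count" in the vtysh output */
};

static int prefix_list_entry_match(struct prefix_list_entry *pentry)
{
    int matched = 1; /* ... result of the real prefix comparison ... */

    /* Bump the counter at the point of the match, so every daemon that
     * applies the prefix-list (bgpd included) contributes to it. */
    if (matched)
        pentry->hitcnt++;
    return matched;
}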
Don Slice [Wed, 19 Jun 2019 11:22:21 +0000 (11:22 +0000)]
zebra: resolve issue with rnh not evaluating nexthops correctly
Problem discovered in testing that occasionally when an interface
address was flushed, the corresponding route would be removed from
the kernel and zebra but remain in the bgp table and be advertised
to peers. Discovered that when zebra_rib_evaluate_nexthops spun
through the tree list of rns, if the timing and circumstances were
right, it would move elements and miss evaluating some. Changed
from frr_each to frr_each_safe and the problem is now gone.
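The difference is the classic safe-iteration pattern; a minimal generic sketch of it (this is not the frr_each/frr_each_safe implementation from lib/typesafe.h):

#include <stddef.h>

struct node {
    struct node *next;
    int needs_removal;
};

static void remove_node(struct node **head, struct node *n)
{
    (void)head;
    (void)n;
    /* ... unlink and free ... */
}

/* Unsafe walk: if the body removes 'n', 'n->next' is read from freed memory
 * and elements can be skipped - the failure mode described above. */
static void walk_unsafe(struct node **head)
{
    for (struct node *n = *head; n; n = n->next)
        if (n->needs_removal)
            remove_node(head, n);
}

/* Safe walk: the successor is captured before the body runs, so the current
 * element can be removed freely - the guarantee frr_each_safe provides. */
static void walk_safe(struct node **head)
{
    struct node *next;

    for (struct node *n = *head; n; n = next) {
        next = n->next;
        if (n->needs_removal)
            remove_node(head, n);
    }
}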
Ticket: CM-25301
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Donald Sharp [Tue, 18 Jun 2019 19:47:10 +0000 (15:47 -0400)]
zebra: Display a bit better debugging for rnh tracking
Add an expected count for the route node we will be processing
as part of nexthop resolution and modify the type to display
a useful string of what the type is instead of a number.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 18 Jun 2019 13:21:49 +0000 (09:21 -0400)]
bgpd: BGP_ERR_MULTIPLE_INSTANCE_NOT_SET is an impossible condition
This error code is not returned anywhere in the system, as bgp
is by default multiple-instance 'only' now. So remove
the last remaining bits of it from the code base.
Remove BGP_ERR_MULTIPLE_INSTANCE_USED too.
Make bgp_get explicitly return BGP_SUCCESS
instead of 0.
Remove the multi-instance error code too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 6 Jun 2019 00:59:02 +0000 (20:59 -0400)]
bgpd: Fix crash when rd has no data
There exists a state where we may have a rd node but no individual
evpn prefix nodes in the two level table:
(gdb) bt
at bgpd/bgp_evpn_vty.c:1190
filter=FILTER_RELAXED) at lib/command.c:1060
at lib/command.c:1119
vtysh=vtysh@entry=0) at lib/command.c:1273
(gdb) f 5
at bgpd/bgp_evpn_vty.c:1190
1190 bgpd/bgp_evpn_vty.c: No such file or directory.
(gdb) p buf
$1 = "[2]:[0]:[48]:[00:00:00:00:00:00]", '\000' <repeats 240 times>...
(gdb) p json_nroute
$2 = (json_object *) 0x0
(gdb) p rd_header
$3 = 1
(gdb) p buf
$4 = "[2]:[0]:[48]:[00:00:00:00:00:00]", '\000' <repeats 240 times>...
(gdb)
I'm not entirely sure that this is not a `different` problem in that the
rd node should have been removed. But I think preventing the crash
in a show command is probably the right thing to do here.
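A minimal sketch of the defensive check, with illustrative types rather than the real bgp_evpn_vty.c structures:

#include <stdio.h>

struct prefix_node {
    const char *p_str;
    struct prefix_node *next;
};

struct rd_node {
    const char *rd_str;
    struct prefix_node *prefixes; /* may legitimately be empty */
};

static void show_evpn_rd(const struct rd_node *rd)
{
    /* An RD node can exist with no EVPN prefixes under it; bail out before
     * touching the per-prefix data instead of dereferencing NULL. */
    if (!rd->prefixes)
        return;

    for (const struct prefix_node *p = rd->prefixes; p; p = p->next)
        printf("RD %s prefix %s\n", rd->rd_str, p->p_str);
}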
Fixes: #4501
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Thu, 6 Jun 2019 00:53:01 +0000 (20:53 -0400)]
bgpd: Mac rescan on interface up/down efficiency improvements
On interface up/down, bgp stores the mac address of the interface
in a bgp_mac_hash table entry and then initiates a rescan
of the evpn l2vpn table. The problem with this scan is that
it is looking at every item in the table when only 1 mac
has changed. So every up/down event causes some major trauma
in the bgp_update processing.
Modify the mac scanning such that we know which mac has changed
and reprocess only the entries affected by it.
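A minimal sketch of the narrowed rescan; the structures are illustrative, not the actual bgpd ones:

#include <string.h>

struct ethaddr {
    unsigned char octet[6];
};

struct evpn_route {
    struct ethaddr mac;
    int needs_update;
    struct evpn_route *next;
};

/* Only routes carrying the changed MAC are revalidated, instead of walking
 * the whole l2vpn evpn table on every interface up/down. */
static void rescan_for_mac(struct evpn_route *table, const struct ethaddr *mac)
{
    for (struct evpn_route *r = table; r; r = r->next)
        if (memcmp(&r->mac, mac, sizeof(*mac)) == 0)
            r->needs_update = 1;
}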
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 18 Jun 2019 00:16:30 +0000 (20:16 -0400)]
bgpd: Fix memleak of Mac Hash String upon insertion
If we get a callback for an interface change but we do not
actually have to move the mac entry in the hash, then
we were accidentally leaking the Mac Hash String all over
ourselves. Messy Messy!
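A minimal sketch of the ownership rule; the function is illustrative, not the real bgp_mac.c callback:

#include <stdlib.h>
#include <string.h>

static void mac_ifp_change(const char *mac_str, int must_move_entry)
{
    char *key = strdup(mac_str); /* candidate hash-key string */

    if (!key)
        return;

    if (!must_move_entry) {
        /* Nothing to re-insert: the key is unused, so free it here instead
         * of leaking one copy per interface callback. */
        free(key);
        return;
    }

    /* ... move the hash entry; ownership of 'key' passes to the hash ... */
}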
Ticket: CM-25351
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ameya Dharkar [Thu, 16 May 2019 23:40:19 +0000 (16:40 -0700)]
Zebra: Handle FPM connection up/down events
- When the connection with the FPM socket is established, iterate through all the
L3VNIs and send all the RMACs for FPM processing in "zfpm_conn_up_thread_cb".
- We have already handled the connection down event in previous commits. When the FPM
connection goes down, empty mac_q and the FPM mac info hash table in
"zfpm_conn_down_thread_cb".
Ameya Dharkar [Thu, 16 May 2019 22:53:46 +0000 (15:53 -0700)]
Zebra: FPM processing of mac_q and dest_q
- FPM write thread calls "zfpm_build_updates()" to process mac_q and dest_q and
to write update buffer over the FPM socket.
- "zfpm_build_updates()" processes all the update queues one by one in a while
loop (see the sketch after this list). It will break the while loop and return if a
queue processing function returns "FPM_WRITE_STOP", OR the FPM write buffer is
full, OR all the queues are empty (no more updates to process).
- "zfpm_build_route_updates()" dequeues and processes route nodes from "dest_q".
- "zfpm_build_mac_updates()" dequeues and processes MAC nodes from "mac_q"
- These queue processing functions return with "FPM_WRITE_STOP" if the write
buffer is full. Return value is "FPM_GOTO_NEXT_Q" if enough updates are
processed from this queue and we want to move on to the next queue.
- In each call, a queue processing function will process max
"FPM_QUEUE_PROCESS_LIMIT (10000)" updates to avoid starvation of other queues.
- While performing RMAC add/delete for an L3VNI, call "zebra_mac_update" hook.
- This hook call triggers "zfpm_trigger_rmac_update". In this function, we do a
lookup for the RMAC in fpm_mac_info_table. If already present, this node is
updated with the latest RMAC info. Else, a new fpm_mac_info_t node is created
and inserted in the queue and hash data structures.
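A minimal sketch of the dispatch loop described above; apart from the names quoted in this message, the helpers are stand-ins:

enum fpm_result {
    FPM_WRITE_STOP,
    FPM_GOTO_NEXT_Q,
};

#define FPM_QUEUE_PROCESS_LIMIT 10000

static int fpm_write_buffer_full(void) { return 0; }
static int fpm_queues_empty(void) { return 1; }
static enum fpm_result zfpm_build_mac_updates(void) { return FPM_GOTO_NEXT_Q; }
static enum fpm_result zfpm_build_route_updates(void) { return FPM_GOTO_NEXT_Q; }

static void zfpm_build_updates(void)
{
    while (1) {
        if (fpm_write_buffer_full())
            break;
        /* Each builder drains at most FPM_QUEUE_PROCESS_LIMIT entries from
         * its queue before returning FPM_GOTO_NEXT_Q, so no queue starves
         * the others. */
        if (zfpm_build_mac_updates() == FPM_WRITE_STOP)
            break;
        if (zfpm_build_route_updates() == FPM_WRITE_STOP)
            break;
        if (fpm_queues_empty())
            break;
    }
}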
Ameya Dharkar [Thu, 16 May 2019 20:28:25 +0000 (13:28 -0700)]
Zebra: Data structures for RMAC processing in FPM
- FPM MAC structure: This data structure will contain all the information
required for FPM message generation for an RMAC.
struct fpm_mac_info_t {
    struct ethaddr macaddr;
    uint32_t zebra_flags; /* Could be used to build FPM messages */
    vni_t vni;
    ifindex_t vxlan_if;
    ifindex_t svi_if; /* L2 or L3 Bridge interface */
    struct in_addr r_vtep_ip; /* Remote VTEP IP */
    /* Linkage to put MAC on the FPM processing queue. */
    TAILQ_ENTRY(fpm_mac_info_t) fpm_mac_q_entries;
    uint8_t fpm_flags;
};
- Queue structure for FPM processing:
For FPM processing, we build a queue of "fpm_mac_info_t". When RMAC is
added or deleted from zebra, fpm_mac_info_t node is enqueued in this queue
for the corresponding operation. FPM thread will dequeue these nodes one by
one to generate a netlink message.
TAILQ_HEAD(zfpm_mac_q, fpm_mac_info_t) mac_q;
- Hash table for "fpm_mac_info_t"
When zebra tries to enqueue fpm_mac_info_t for a new RMAC add/delete
operation, it is possible that this RMAC is already present in the queue. So,
to avoid multiple messages for duplicate RMAC nodes, insert fpm_mac_info_t
into a hash table.
struct hash *fpm_mac_info_table;
- Before enqueueing any MAC, try to fetch the fpm_mac_info_t from the hash
table first.
- Entry is deleted from the hash table when the node is dequeued.
- For hash table key generation, the parameters used are "mac address" and "vni".
This will provide a fairly unique key for a MAC (fpm_mac_info_hash_keymake).
- Compare function uses "mac address", "RVTEP address" and "VNI" as the key
which is sufficient to distinguish any two RMACs. This compare function is
used for fpm_mac_info_t lookup (zfpm_mac_info_cmp).
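A minimal sketch of the key/compare split described above; the real functions are fpm_mac_info_hash_keymake and zfpm_mac_info_cmp, but the types and bodies below are illustrative:

#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct ethaddr {
    uint8_t octet[6];
};

struct fpm_mac_key {
    struct ethaddr macaddr;
    uint32_t vni;             /* vni_t in zebra */
    struct in_addr r_vtep_ip;
};

/* Key: MAC + VNI only - enough to land duplicate RMAC updates in the same
 * hash bucket. */
static uint32_t mac_info_hash_keymake(const struct fpm_mac_key *m)
{
    uint32_t key = m->vni;

    for (int i = 0; i < 6; i++)
        key = key * 31 + m->macaddr.octet[i];
    return key;
}

/* Compare: MAC + remote VTEP IP + VNI - enough to tell any two RMACs apart. */
static bool mac_info_cmp(const struct fpm_mac_key *a, const struct fpm_mac_key *b)
{
    return memcmp(&a->macaddr, &b->macaddr, sizeof(a->macaddr)) == 0 &&
           a->vni == b->vni &&
           a->r_vtep_ip.s_addr == b->r_vtep_ip.s_addr;
}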