]> git.puffer.fish Git - matthieu/frr.git/log
matthieu/frr.git
4 years agoisisd: Fix usage of uninited memory
Donald Sharp [Tue, 27 Oct 2020 13:59:10 +0000 (09:59 -0400)]
isisd: Fix usage of uninited memory

valgrind is showing a usage of uninited memory:

==935465== Conditional jump or move depends on uninitialised value(s)
==935465==    at 0x159E17: tlvs_area_addresses_to_adj (isis_tlvs.c:4430)
==935465==    by 0x15A4BD: isis_tlvs_to_adj (isis_tlvs.c:4568)
==935465==    by 0x1377F0: process_p2p_hello (isis_pdu.c:203)
==935465==    by 0x1391FD: process_hello (isis_pdu.c:781)
==935465==    by 0x13BDBE: isis_handle_pdu (isis_pdu.c:1700)
==935465==    by 0x13BECD: isis_receive (isis_pdu.c:1744)
==935465==    by 0x49210FF: thread_call (thread.c:1585)
==935465==    by 0x48CFACB: frr_run (libfrr.c:1099)
==935465==    by 0x1218C9: main (isis_main.c:272)
==935465==
==935465== Conditional jump or move depends on uninitialised value(s)
==935465==    at 0x483EEC5: bcmp (vg_replace_strmem.c:1111)
==935465==    by 0x15A290: tlvs_ipv4_addresses_to_adj (isis_tlvs.c:4512)
==935465==    by 0x15A4EB: isis_tlvs_to_adj (isis_tlvs.c:4570)
==935465==    by 0x1377F0: process_p2p_hello (isis_pdu.c:203)
==935465==    by 0x1391FD: process_hello (isis_pdu.c:781)
==935465==    by 0x13BDBE: isis_handle_pdu (isis_pdu.c:1700)
==935465==    by 0x13BECD: isis_receive (isis_pdu.c:1744)
==935465==    by 0x49210FF: thread_call (thread.c:1585)
==935465==    by 0x48CFACB: frr_run (libfrr.c:1099)
==935465==    by 0x1218C9: main (isis_main.c:272)

Effectively we are reallocing memory to hold data.  realloc does not
set the new memory to anything.  So whatever happens to be in the memory
is what is there.  after the realloc happens we are iterating over the
memory just realloced and doing memcmp's to values in it causing these
use of uninitialized memory.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agobgpd: Prevent ecomm memory leak
Donald Sharp [Tue, 27 Oct 2020 19:16:32 +0000 (15:16 -0400)]
bgpd: Prevent ecomm memory leak

There are some situations where we create a ecommunity for
comparing to internal state when we are deleting, but in the
failure cases we would not free up the created memory.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agoisisd: Fix memory leak in copy_tlv_router_cap
Donald Sharp [Tue, 27 Oct 2020 16:40:46 +0000 (12:40 -0400)]
isisd: Fix memory leak in copy_tlv_router_cap

There exists a code path where we would allocate memory
then test a variable and then immediately return NULL.
Prevent memory from leaking in this situation.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agotools: add comment on staticd in daemon config file
Emanuele Bovisio [Thu, 22 Oct 2020 12:47:35 +0000 (14:47 +0200)]
tools: add comment on staticd in daemon config file

staticd is always started, so no need to specify it explicitly

Signed-off-by: Emanuele Bovisio <emanuele.bovisio@eolo.it>
4 years agostaticd: remove redundant checks from vty
Igor Ryzhov [Thu, 22 Oct 2020 14:22:23 +0000 (17:22 +0300)]
staticd: remove redundant checks from vty

These checks are moved to NB layer.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoospfd: fix lsa type-7 continuously refreshed
ckishimo [Thu, 24 Sep 2020 15:36:26 +0000 (08:36 -0700)]
ospfd: fix lsa type-7 continuously refreshed

Having an NSSA ABR redistributing statics, the type-7 LSA are being
continuously refreshed (every ~14 secs). The LSA Seq number keeps
incrementing and the LSA age is going back to 0 when reaching ~14s.

This PR fixes the issue by not forcing the LSA update

However I ignore if the "force" parameter was used in purpose. With this
PR updates are sent in case the metric or metric type are changed

Sep 24 08:54:48 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:02 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:16 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:30 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:44 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:55:58 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:56:12 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:56:26 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.
Sep 24 08:56:40 r2 ospfd[7137]: ospf_flood_through: LOCAL NSSA FLOOD of Type-7.

ip route 2.2.2.2/32 blackhole
router ospf
 network 10.0.23.0/24 area 1
 area 1 nssa
!

r2# conf t
r2(config)# router ospf
r2(config-router)# redistribute static

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
2.2.2.2         10.0.25.2         13 0x8000000f 0x3f17 E2 2.2.2.2/32 [0x0]   <<< Seq: f, age 13

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
2.2.2.2         10.0.25.2          0 0x80000010 0x3d18 E2 2.2.2.2/32 [0x0]   <<< Seq: 10, age 0

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
2.2.2.2         10.0.25.2          3 0x8000001b 0x2723 E2 2.2.2.2/32 [0x0]   <<< Seq: 1b, age 3

Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
4 years agoospfd: External LSA not flushed when area is configured as nssa or stub
Soman K S [Sun, 18 Oct 2020 11:49:32 +0000 (17:19 +0530)]
ospfd: External LSA not flushed when area is configured as nssa or stub

Issue:
When the ospf area is changed from default to nssa or stub, the previously
advertised external LSAs are not removed from the neighbor.
The LSAs remain in database till maxage timeout.

Fix:
Advertise the external LSAs with age set to maxage and flood to the
nssa or stub area.

Signed-off-by: kssoman <somanks@gmail.com>
4 years agobgpd: fix mem leak in router bgp import vrf check
Chirag Shah [Tue, 27 Oct 2020 05:18:46 +0000 (22:18 -0700)]
bgpd: fix mem leak in router bgp import vrf check

==916511== 18 bytes in 2 blocks are definitely lost in loss record 7 of 147
==916511==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==916511==    by 0x4BE0F0A: strdup (strdup.c:42)
==916511==    by 0x48D66CE: qstrdup (memory.c:122)
==916511==    by 0x1E6E31: bgp_vpn_leak_export (bgp_mplsvpn.c:2690)
==916511==    by 0x28E892: bgp_router_create (bgp_nb_config.c:124)
==916511==    by 0x48E05AB: nb_callback_create (northbound.c:869)
==916511==    by 0x48E0FA2: nb_callback_configuration (northbound.c:1183)
==916511==    by 0x48E13D0: nb_transaction_process (northbound.c:1308)
==916511==    by 0x48E0137: nb_candidate_commit_apply (northbound.c:741)
==916511==    by 0x48E024B: nb_candidate_commit (northbound.c:773)
==916511==    by 0x48E6B21: nb_cli_classic_commit (northbound_cli.c:64)
==916511==    by 0x48E757E: nb_cli_apply_changes (northbound_cli.c:281)

Signed-off-by: Chirag Shah <chirag@nvidia.com>
4 years agobgpd: Fix profiles compile issue when not using bfdd
Donald Sharp [Mon, 26 Oct 2020 15:25:28 +0000 (11:25 -0400)]
bgpd: Fix profiles compile issue when not using bfdd

When compiling w/ --enable-bfdd=no we get warnings
about functions not being used.

Add a #if check to include it as needed.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agobgpd: delay local routes until update-delay is over
Don Slice [Wed, 21 Oct 2020 14:46:49 +0000 (07:46 -0700)]
bgpd: delay local routes until update-delay is over

Problem found that turning an update-delay would only delay prefixes
learned from peers by delaying bestpath, but would allow local routes
(network statements or redistributed) to be immediately advertised,
followed by an End of Rib indicator. This fix delays sending local
routes until the update-delay process is completed, which matches
what testing shows other vendors do..

Ticket: CM-31743
Signed-off-by: Don Slice <dslice@nvidia.com>
4 years agozebra: fix double clearing of zif->es_info.es
Anuradha Karuppiah [Wed, 30 Sep 2020 19:03:06 +0000 (12:03 -0700)]
zebra: fix double clearing of zif->es_info.es

This problem was accidentally introduced as a part of another fixup -
[
commit e378f5020d1af1aee587659e5602ee8c07eb05f4 (anuradhak/mh-misc-fixes, mh-misc-fixes)
Author: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Date:   Tue Sep 15 16:50:14 2020 -0700

    zebra: fix use of freed es during zebra shutdown
]

zif->es_info.es is cleared as a part of zebra_evpn_es_local_info_clear so it
cannot be passed around as a pointer from zebra_evpn_local_es_update/del.

Because of this bug removing ES from an interface resulted in
a zebra crash.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
4 years agobgpd: Bgp static routes memory leak
Donald Sharp [Fri, 23 Oct 2020 15:09:51 +0000 (11:09 -0400)]
bgpd: Bgp static routes memory leak

When using MPLS_VPN/EVPN ( or really any two level table/route data structure setup )
FRR is leaking memory on shutdown:

eva# conf
eva(config)# router bgp 329
eva(config-router)# address-family ipv4 vpn
eva(config-router-af)# network 5.6.7.8/32 rd 44:55 label 3293
eva(config-router-af)# end
eva# exit
sharpd@eva ~/frr_coverity (master)> ps -ef | grep frr
root     1186423   10793  0 07:51 pts/1    00:00:00 sudo /usr/lib/frr/zebra --log stdout --log-level debug
frr      1186425 1186423  0 07:51 pts/1    00:00:00 /usr/lib/frr/zebra --log stdout --log-level debug
root     1263168  491694  0 11:10 pts/20   00:00:00 sudo valgrind --leak-check=full /usr/lib/frr/bgpd --log stdout --log-level debug
frr      1263169 1263168 22 11:10 pts/20   00:00:04 /usr/bin/valgrind.bin --leak-check=full /usr/lib/frr/bgpd --log stdout --log-level debug
sharpd   1263214  845829  0 11:10 pts/9    00:00:00 grep --color=auto frr
sharpd@eva ~/frr_coverity (master)> sudo kill -SIGTERM 1263169
sharpd@eva ~/frr_coverity (master)>

gives us this:

==1263169== 304 (40 direct, 264 indirect) bytes in 1 blocks are definitely lost in loss record 61 of 78
==1263169==    at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1263169==    by 0x48DD878: qcalloc (memory.c:110)
==1263169==    by 0x5116D5: bgp_table_init (bgp_table.c:110)
==1263169==    by 0x4EB5C4: bgp_static_set_safi (bgp_route.c:5927)
==1263169==    by 0x4C3382: vpnv4_network (bgp_mplsvpn.c:1911)
==1263169==    by 0x489FBEC: cmd_execute_command_real (command.c:916)
==1263169==    by 0x489F7CB: cmd_execute_command (command.c:976)
==1263169==    by 0x489FD04: cmd_execute (command.c:1138)
==1263169==    by 0x493AF73: vty_command (vty.c:517)
==1263169==    by 0x493AA07: vty_execute (vty.c:1282)
==1263169==    by 0x4939B54: vtysh_read (vty.c:2115)
==1263169==    by 0x492E63C: thread_call (thread.c:1585)

The bgp_static_delete function was not unlocking the right bgp_dest.  This
problem goes away after fixing this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agoisisd: Fix memory leak on shutdown
Donald Sharp [Fri, 23 Oct 2020 00:51:24 +0000 (20:51 -0400)]
isisd: Fix memory leak on shutdown

==935465== 40 bytes in 1 blocks are definitely lost in loss record 71 of 546
==935465==    at 0x483AB65: calloc (vg_replace_malloc.c:760)
==935465==    by 0x48D6611: qcalloc (memory.c:110)
==935465==    by 0x48CFE02: list_new (linklist.c:32)
==935465==    by 0x15DBF0: isis_new (isisd.c:213)
==935465==    by 0x15DAC4: isis_global_instance_create (isisd.c:179)
==935465==    by 0x121892: main (isis_main.c:264)
==935465== 64 (40 direct, 24 indirect) bytes in 1 blocks are definitely lost in loss record 101 of 546
==935465==    at 0x483AB65: calloc (vg_replace_malloc.c:760)
==935465==    by 0x48D6611: qcalloc (memory.c:110)
==935465==    by 0x48CFE02: list_new (linklist.c:32)
==935465==    by 0x15DBE3: isis_new (isisd.c:212)
==935465==    by 0x15DAC4: isis_global_instance_create (isisd.c:179)
==935465==    by 0x121892: main (isis_main.c:264)

On isis shutdown we are seeing the above memory leaks.  Modify
the code to start cleaning this up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: fix unitialized msg header reading at startup
Stephen Worley [Fri, 23 Oct 2020 18:57:29 +0000 (14:57 -0400)]
zebra: fix unitialized msg header reading at startup

Fixes the valgrind error we were seeing on startup due to
initializing the msg header struct:

```
==2534283== Thread 3 zebra_dplane:
==2534283== Syscall param recvmsg(msg) points to uninitialised byte(s)
==2534283==    at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283==    by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283==    by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283==    by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283==    by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283==    by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283==    by 0x493F5CC: thread_call (thread.c:1585)
==2534283==    by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283==    by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283==    by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283==  Address 0x85cd850 is on thread 3's stack
==2534283==  in frame #2, created by nl_batch_read_resp (kernel_netlink.c:1051)
==2534283==
==2534283== Syscall param recvmsg(msg.msg_control) points to unaddressable byte(s)
==2534283==    at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283==    by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283==    by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283==    by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283==    by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283==    by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283==    by 0x493F5CC: thread_call (thread.c:1585)
==2534283==    by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283==    by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283==    by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283==    by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283==  Address 0xa0 is not stack'd, malloc'd or (recently) free'd
==2534283==
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
4 years agozebra: Do not delete nhg's when retain_mode is engaged
Donald Sharp [Thu, 22 Oct 2020 12:02:33 +0000 (08:02 -0400)]
zebra: Do not delete nhg's when retain_mode is engaged

When `-r` is specified to zebra, on shutdown we should
not remove any routes from the fib.  This was a problem
with nhg's on shutdown due to their ref-count behavior.

Introduce a methodology where on shutdown we don't mess
with the nexthop groups in the kernel.  That way on
next startup things will be ok.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agobgpd: fix information strings for vtysh
Emanuele Bovisio [Thu, 22 Oct 2020 14:46:47 +0000 (16:46 +0200)]
bgpd: fix information strings for vtysh

set correct information strings for vtysh.

Signed-off-by: Emanuele Bovisio <emanuele.bovisio@eolo.it>
4 years agobgpd: fix crash in the MH cleanup handling
Anuradha Karuppiah [Tue, 20 Oct 2020 16:26:51 +0000 (09:26 -0700)]
bgpd: fix crash in the MH cleanup handling

The MH datastructures were being released before the paths that were
referencing them. Fix is to do the MH cleanup last.

The MH finish function has also been stripped down to only do a
datastructure cleanup i.e. avoid sending route updates etc.

Ticket: 31376

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
4 years agoospfd: flush type 5 when type 7 is removed
ckishimo [Mon, 28 Sep 2020 21:26:18 +0000 (14:26 -0700)]
ospfd: flush type 5 when type 7 is removed

When the ASBR stops announcing a prefix into the NSSA area, the LSA
type 7 is removed from the area. However the ABR is refreshing the
type 5 in its LSDB while removing the Type 7 LSA. Routers outside
the area do not get an update.

With the following topology: r1---r2---r3, with r3 being the ASBR
announcing type 7 LSA:

r3 configuration
router ospf
 redistribute static
 network 10.0.23.0/24 area 1
 area 1 nssa
!

We stop announcing prefix 3.3.3.3 in the ASBR
r3# conf
r3(config)# router ospf
r3(config-router)# no redistribute static
r3(config-router)#

r2 (ABR)
r2# sh ip os database

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         33.33.33.33     3600 0x8000002f 0x13be E2 3.3.3.3/32 [0x0]  <-- flushed

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2          7 0x8000002f 0x73c7 E2 3.3.3.3/32 [0x0]  <-- refreshed(?)

With PR#7086 the LSA type 5 is flushed from the LSDB in r2 and the change is
announced to routers outside the area (r1)

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         33.33.33.33     3600 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0]  <-- flushed

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2       3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]  <-- flushed

r1# sh ip os da

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2       3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]  <-- flushed

Unfortunately I just realized that with PR#7086 I'm introducing a new bug, as Type-5 LSA
are not being refreshed when reaching MaxAge

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         33.33.33.33       35 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0]  <--- refreshed

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2       3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]  <--- not refreshed!

So this PR should fix the original issue and the bug introduced later, so when stopping
redistribution in the ASBR, both type 5 and type 7 are flushed:

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         33.33.33.33     3600 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0]

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2       3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]

Routers outside the area are also notified

r1# sh ip os da

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2       3600 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0]

Re-enabling redistribution, both LSA will be advertised again

r3# conf
r3(config)# router ospf
r3(config-router)# no redistribute static
r3(config-router)# redistribute static
r3(config-router)#

r2# sh ip os da

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         33.33.33.33       19 0x80000001 0x6f90 E2 3.3.3.3/32 [0x0]

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2         11 0x80000001 0xcf99 E2 3.3.3.3/32 [0x0]

and they are refreshed when reaching MaxAge

                NSSA-external Link States (Area 0.0.0.1 [NSSA])

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         33.33.33.33       10 0x80000002 0x6d91 E2 3.3.3.3/32 [0x0] <-- Seq 2

                AS External Link States

Link ID         ADV Router      Age  Seq#       CkSum  Route
3.3.3.3         10.0.25.2          2 0x80000002 0xcd9a E2 3.3.3.3/32 [0x0] <-- Seq 2

Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
4 years agozebra: add alias for "show ip/ipv6 ro"
Stephen Worley [Mon, 19 Oct 2020 18:08:18 +0000 (14:08 -0400)]
zebra: add alias for "show ip/ipv6 ro"

Add an alias so people can still type `show ip ro`.

It became ambigious in a recent release.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
4 years agolib: Relax usage of `ip prefix-list A.B.C.D/M ge Y`
Donald Sharp [Fri, 9 Oct 2020 00:36:22 +0000 (20:36 -0400)]
lib: Relax usage of `ip prefix-list A.B.C.D/M ge Y`

Currently the prefix length M must be less than Y.
Relax this restriction to allow M to be less than or equal
to Y.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agolib: fix cisco access list wildcard usage
Rafael Zalamena [Fri, 2 Oct 2020 15:47:23 +0000 (12:47 -0300)]
lib: fix cisco access list wildcard usage

Don't attempt to compress the wildcard information to fit a `/M`, but
use its own full 4 byte field.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
4 years agoyang: fix cisco access list network information
Rafael Zalamena [Fri, 2 Oct 2020 15:44:03 +0000 (12:44 -0300)]
yang: fix cisco access list network information

Don't attempt to put the wildcard information into a 1 byte field
otherwise we'll lose information.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
4 years agoisisd: fix check for area-tag modification
Igor Ryzhov [Wed, 14 Oct 2020 20:01:49 +0000 (23:01 +0300)]
isisd: fix check for area-tag modification

Interface area-tag is not supposed to be modified once defined, but the
necessary check is currently broken, because the circuit is never in
init_circ_list if the area-tag is already configured for the interface.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agobgpd: print error when as-path filter doesn't exist
Igor Ryzhov [Wed, 14 Oct 2020 16:56:18 +0000 (19:56 +0300)]
bgpd: print error when as-path filter doesn't exist

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoospfd: Prevent crash if transferring config amongst instances
Donald Sharp [Tue, 13 Oct 2020 12:16:15 +0000 (08:16 -0400)]
ospfd: Prevent crash if transferring config amongst instances

If we enter:

int eth0
  ip ospf area 0
  ip ospf 10 area 0
!

This will crash ospf.  Prevent this from happening.

OSPF instances:

a) Cannot be mixed with non-instance
b) Are their own process.

Since in multi-instance world ospf instances are their own process,
when an ospf processes receives an instance command we must remove
our config( if present ) and allow the new config to be active
in the new process.  The problem here is that if you have not
done a `router ospf` above the lookup of the ospf pointer will
fail and we will just crash.  Put some code in to prevent a crash
in this case.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agoospfd: fix "no ip ospf area"
Igor Ryzhov [Tue, 13 Oct 2020 11:03:42 +0000 (14:03 +0300)]
ospfd: fix "no ip ospf area"

This commit fixes the following behavior:
```
nfware(config)# interface enp2s0
nfware(config-if)# ip ospf area 0
nfware(config-if)# no ip ospf area 0
% [ospfd]: command ignored as it targets an instance that is not running
```

We should be able to use the command without configuring the instance.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoMerge pull request #7383 from mjstapp/fix_router_id_lists_7_5
Donald Sharp [Fri, 23 Oct 2020 23:37:46 +0000 (19:37 -0400)]
Merge pull request #7383 from mjstapp/fix_router_id_lists_7_5

zebra: clean up all router id lists [7.5]

4 years agozebra: clean up all router id lists [7.5]
Mark Stapp [Fri, 23 Oct 2020 20:21:47 +0000 (16:21 -0400)]
zebra: clean up all router id lists [7.5]

Clean up the ipv6 router-id lists associated with a zvrf - these
were being leaked (7.5 version).

Signed-off-by: Mark Stapp <mjs@voltanet.io>
4 years agoMerge pull request #7354 from idryzhov/fix-ospf6-crash
Quentin Young [Fri, 23 Oct 2020 20:13:19 +0000 (16:13 -0400)]
Merge pull request #7354 from idryzhov/fix-ospf6-crash

[7.5] ospf6d: fix crash on message receive

4 years agoospf6d: fix crash on message receive
Igor Ryzhov [Tue, 20 Oct 2020 19:43:31 +0000 (22:43 +0300)]
ospf6d: fix crash on message receive

OSPF6 daemon starts listening on its socket and reading messages right
after the initialization before the ospf6 router is created. If any
message is received, ospf6d crashes because ospf6_receive doesn't
NULL-check ospf6 pointer.

Fix this by opening the socket and reading messages only after the
creation of ospf6 router.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoMerge pull request #7331 from donaldsharp/75_zebra_use_after_free
Donatas Abraitis [Sun, 18 Oct 2020 12:32:11 +0000 (15:32 +0300)]
Merge pull request #7331 from donaldsharp/75_zebra_use_after_free

[7.5]zebra: Fix use after free in debug path

4 years agoMerge pull request #7334 from mjstapp/fix_multi_connected_7_5
Donald Sharp [Sun, 18 Oct 2020 12:29:33 +0000 (08:29 -0400)]
Merge pull request #7334 from mjstapp/fix_multi_connected_7_5

[7.5] zebra: support multiple connected subnets on an interface

4 years agozebra: support multiple connected subnets on an interface
Mark Stapp [Fri, 16 Oct 2020 21:37:09 +0000 (17:37 -0400)]
zebra: support multiple connected subnets on an interface

[7.5 version]
We support configuration of multiple addresses in the same
subnet on a single interface: make sure that zebra supports
multiple instances of the corresponding connected route.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
4 years agozebra: Fix use after free in debug path
Donald Sharp [Fri, 16 Oct 2020 17:51:52 +0000 (13:51 -0400)]
zebra: Fix use after free in debug path

When zebra is running with debugs turned on there
is a use after free reported by the address sanitizer:

2020/10/16 12:58:02 ZEBRA: rib_delnode: (0:254):4.5.6.16/32: rn 0x60b000026f20, re 0x6080000131a0, removing
2020/10/16 12:58:02 ZEBRA: rib_meta_queue_add: (0:254):4.5.6.16/32: queued rn 0x60b000026f20 into sub-queue 3
=================================================================
==3101430==ERROR: AddressSanitizer: heap-use-after-free on address 0x608000011d28 at pc 0x555555705ab6 bp 0x7fffffffdab0 sp 0x7fffffffdaa8
READ of size 8 at 0x608000011d28 thread T0
    #0 0x555555705ab5 in re_list_const_first zebra/rib.h:222
    #1 0x555555705b54 in re_list_first zebra/rib.h:222
    #2 0x555555711a4f in process_subq_route zebra/zebra_rib.c:2248
    #3 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
    #4 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
    #5 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
    #6 0x7ffff7450e9c in thread_call lib/thread.c:1581
    #7 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
    #8 0x55555561a578 in main zebra/main.c:455
    #9 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
    #10 0x5555555e3429 in _start (/usr/lib/frr/zebra+0x8f429)
0x608000011d28 is located 8 bytes inside of 88-byte region [0x608000011d20,0x608000011d78)
freed by thread T0 here:
    #0 0x7ffff768bb6f in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.6+0xa9b6f)
    #1 0x7ffff739ccad in qfree lib/memory.c:129
    #2 0x555555709ee4 in rib_gc_dest zebra/zebra_rib.c:746
    #3 0x55555570ca76 in rib_process zebra/zebra_rib.c:1240
    #4 0x555555711a05 in process_subq_route zebra/zebra_rib.c:2245
    #5 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
    #6 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
    #7 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
    #8 0x7ffff7450e9c in thread_call lib/thread.c:1581
    #9 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
    #10 0x55555561a578 in main zebra/main.c:455
    #11 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
    #0 0x7ffff768c037 in calloc (/lib/x86_64-linux-gnu/libasan.so.6+0xaa037)
    #1 0x7ffff739cb98 in qcalloc lib/memory.c:110
    #2 0x555555712ace in zebra_rib_create_dest zebra/zebra_rib.c:2515
    #3 0x555555712c6c in rib_link zebra/zebra_rib.c:2576
    #4 0x555555712faa in rib_addnode zebra/zebra_rib.c:2607
    #5 0x555555715bf0 in rib_add_multipath_nhe zebra/zebra_rib.c:3012
    #6 0x555555715f56 in rib_add_multipath zebra/zebra_rib.c:3049
    #7 0x55555571788b in rib_add zebra/zebra_rib.c:3327
    #8 0x5555555e584a in connected_up zebra/connected.c:254
    #9 0x5555555e42ff in connected_announce zebra/connected.c:94
    #10 0x5555555e4fd3 in connected_update zebra/connected.c:195
    #11 0x5555555e61ad in connected_add_ipv4 zebra/connected.c:340
    #12 0x5555555f26f5 in netlink_interface_addr zebra/if_netlink.c:1213
    #13 0x55555560f756 in netlink_information_fetch zebra/kernel_netlink.c:350
    #14 0x555555612e49 in netlink_parse_info zebra/kernel_netlink.c:941
    #15 0x55555560f9f1 in kernel_read zebra/kernel_netlink.c:402
    #16 0x7ffff7450e9c in thread_call lib/thread.c:1581
    #17 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
    #18 0x55555561a578 in main zebra/main.c:455
    #19 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-use-after-free zebra/rib.h:222 in re_list_const_first

This is happening because we are using the dest pointer after a call into
rib_gc_dest.  In process_subq_route, we call rib_process() and if the
dest is deleted dest pointer is now garbage.  We must reload the
dest pointer in this case.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agoMerge pull request #7291 from ton31337/feature/bgpd_backports_7.5
Donald Sharp [Tue, 13 Oct 2020 17:37:41 +0000 (13:37 -0400)]
Merge pull request #7291 from ton31337/feature/bgpd_backports_7.5

bgpd: [7.5] Backport maximum-prefix and show bgp neighbor routes for LU fixes

4 years agobgpd: fix show bgp neighbor routes for labeled-unicast
Trey Aspelund [Mon, 12 Oct 2020 19:39:11 +0000 (15:39 -0400)]
bgpd: fix show bgp neighbor routes for labeled-unicast

bgp_show_neighbor_route() was rewriting safi from LU to uni
before checking if the peer was enabled for LU.  This resulted
in the peer's address-family check looking for unicast, which
would always fail for LU peers since unicast + LU are
mutually-exclusive AFIs.
This moves this safi reassignment after the peer AFI check,
ensuring that the peer's address-family check looks for LU
while the call to bgp_show() still uses uni.

-- highlights from manual testing

config:

router bgp 2
 neighbor 1.1.1.1 remote-as external
 neighbor 1.1.1.1 disable-connected-check
 neighbor 1.1.1.1 update-source 2.2.2.2
 !
 address-family ipv4 unicast
  no neighbor 1.1.1.1 activate
 exit-address-family
 !
 address-family ipv4 labeled-unicast
  neighbor 1.1.1.1 activate
 exit-address-family

before:

spine01# show bgp ipv4 unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
spine01# show bgp ipv4 labeled-unicast neighbors 1.1.1.1 routes
% No such neighbor or address family

after:

spine01# show bgp ipv4 unicast neighbors 1.1.1.1 routes
% No such neighbor or address family
spine01# show bgp ipv4 label neighbors 1.1.1.1 routes
BGP table version is 1, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
              i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          Next Hop            Metric LocPrf Weight Path
*> 11.11.11.11/32   1.1.1.1                  0             0 1 i
Displayed  1 routes and 1 total paths

Signed-off-by: Trey Aspelund <taspelund@cumulusnetworks.com>
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
4 years agobgpd: Correctly calculate threshold being reached
Donald Sharp [Mon, 12 Oct 2020 14:36:37 +0000 (10:36 -0400)]
bgpd: Correctly calculate threshold being reached

if (pcout > (pcount * peer->max_threshold[afi][safi] / 100 ))
is always true.  So the very first route received will always
trigger the warning.  We actually want the warning to happen
when we hit the threshold.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agoMerge pull request #7275 from idryzhov/7.5-more-backports
Donald Sharp [Sat, 10 Oct 2020 13:46:16 +0000 (09:46 -0400)]
Merge pull request #7275 from idryzhov/7.5-more-backports

[7.5] more fixes backports

4 years ago*: move "show debugging ..." commands to enable node
Igor Ryzhov [Thu, 1 Oct 2020 14:57:23 +0000 (17:57 +0300)]
*: move "show debugging ..." commands to enable node

Use the same node for "show debugging" commands in all daemons.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years ago*: move "debug ..." commands to enable node
Igor Ryzhov [Thu, 1 Oct 2020 14:49:47 +0000 (17:49 +0300)]
*: move "debug ..." commands to enable node

Use the same node for "debug" commands in all daemons.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agorip(ng)d: fix interfaces cleaning
Igor Ryzhov [Fri, 9 Oct 2020 12:14:58 +0000 (15:14 +0300)]
rip(ng)d: fix interfaces cleaning

rip(ng)d_instance_disable unlinks the vrf from the instance which means
that rip(ng)_interfaces_clean never works, because rip(ng)->vrf is
always NULL there. This leads to the crash #6477.

Clean interfaces before disabling the instance to fix the issue.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agostaticd: To set the default value of blackhole type correctly
vdhingra [Fri, 9 Oct 2020 16:23:14 +0000 (09:23 -0700)]
staticd: To set the default value of blackhole type correctly

When nexthop is allocated, default value of blockhole type
was not getting set, this leads to below problem. The default
value should be in-sync with the deafult value in yang model.

c t
ip route 131.1.1.0/24 Null0

do show running-config
...
!
ip route 131.1.1.0/24 blackhole
!
end

Signed-off-by: vishaldhingra <vdhingra@vmware.com>
4 years agoisisd: move debug variables under ifdef
Igor Ryzhov [Thu, 8 Oct 2020 17:06:27 +0000 (20:06 +0300)]
isisd: move debug variables under ifdef

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoisisd: check for circuit existence on interface addr change
Igor Ryzhov [Thu, 8 Oct 2020 17:05:08 +0000 (20:05 +0300)]
isisd: check for circuit existence on interface addr change

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoisisd: fix incorrect vrf lookups
Igor Ryzhov [Thu, 8 Oct 2020 16:23:08 +0000 (19:23 +0300)]
isisd: fix incorrect vrf lookups

Lookup in C_STATE_NA must be made before the new circuit creation, or it
will be leaked if the isis instance is not found. All other lookups are
unnecessary - we just need to remember the previously used instance.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoisisd: add missing rollback if config is invalid
Igor Ryzhov [Thu, 8 Oct 2020 15:42:01 +0000 (18:42 +0300)]
isisd: add missing rollback if config is invalid

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agobgpd: hide test commands
Igor Ryzhov [Thu, 8 Oct 2020 08:03:25 +0000 (11:03 +0300)]
bgpd: hide test commands

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agovtysh: remove unnecessary include
Igor Ryzhov [Wed, 7 Oct 2020 12:27:12 +0000 (15:27 +0300)]
vtysh: remove unnecessary include

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoMerge pull request #7243 from idryzhov/7.5-backports
Donald Sharp [Fri, 9 Oct 2020 23:38:38 +0000 (19:38 -0400)]
Merge pull request #7243 from idryzhov/7.5-backports

[7.5] backport all recent fixes

4 years agoospf6d: Fix flooding of old copies of self-originated LSAs
Martin Buck [Tue, 29 Sep 2020 21:07:40 +0000 (23:07 +0200)]
ospf6d: Fix flooding of old copies of self-originated LSAs

When receiving old copies (e.g. originated before the local ospf6d was
restarted) of supposedly self-originated LSAs which we previously tried to
flush from the network (by setting them to MaxAge), neither flood them nor
add them to our LSDB. Instead, keep the MaxAge version until we actually
(re-)originate them.

Possible fix for #7030. Testcase in #7168
(tests/topotests/ospf6-dr-no-netlsa-bug7030).

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
4 years agozebra: Make connected routes their own entry on the meta_q
Donald Sharp [Thu, 1 Oct 2020 18:58:37 +0000 (14:58 -0400)]
zebra: Make connected routes their own entry on the meta_q

During quick ifdown / ifup events from the linux kernel there
exists a situation where a prefix that has both a kernel route
and a static route can queued up on the meta-q.  If the static
route happens to point at a connected route for nexthop resolution
and we receive a series of quick up/down events *after* the
static route and kernel route are queued up for rib reprocessing.
Since the static route and kernel route are queued on meta-q 1
and the connected route is also on meta-q 1 there exists a situation
where the connected route will be resolved after the static route
fails to resolve, leaving the static route in a unresolved state.

Add a new queue level and put connected routes on their own level,
since they are the fundamental building blocks of pretty much
all the other routes.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: When processing route_entries ignore unusable routes
Donald Sharp [Wed, 30 Sep 2020 21:55:44 +0000 (17:55 -0400)]
zebra: When processing route_entries ignore unusable routes

When zebra is processing routes to determine what to send
to the rib, suppose we have two routes (a) a route processed
earlier that none of it's nexthops were active and (b)
a route that has good nexthops but has a worse admin distance.

rib_process, would not relook at (a)'s nexthops because
the ROUTE_ENTRY_CHANGED flag was not true and it would
win when compared to (b) because it's admin distance
was better, leaving us with a state where we would
attempt and fail to install route (a) because it
was not valid.

Modify the code to consider the number of nexthops
we have as a determiner if we can use the route.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: Prevent uninstall attempts when new entry is not happy
Donald Sharp [Wed, 30 Sep 2020 21:26:02 +0000 (17:26 -0400)]
zebra: Prevent uninstall attempts when new entry is not happy

In rib_process_update_fib, the function is sent two route entries
the old ( previously installed ) and new ( the one to install )
When the function detects that the new is unusable because
the number of nexthops that are usable for that route is 0,
then we uninstall the old route.  The problem here is that
we should not attempt to uninstall any route that is
not owned by FRR.  Modify the code to not attempt
this behavior

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
4 years agovtysh: fix multiple "domainname" commands in running config
Igor Ryzhov [Thu, 1 Oct 2020 19:19:31 +0000 (22:19 +0300)]
vtysh: fix multiple "domainname" commands in running config

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoisisd: fix missing docstring
Igor Ryzhov [Fri, 2 Oct 2020 15:53:51 +0000 (18:53 +0300)]
isisd: fix missing docstring

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoisisd: fix node for clear commands
Igor Ryzhov [Thu, 1 Oct 2020 14:11:35 +0000 (17:11 +0300)]
isisd: fix node for clear commands

These are only clear commands in FRR available from view node.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years ago*: make failure to decode nht update an error
Quentin Young [Wed, 30 Sep 2020 22:37:15 +0000 (18:37 -0400)]
*: make failure to decode nht update an error

This should never happen; no need to debug guard it and it's not a
warning, if this isn't working then NHT is not working at all.

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
4 years agolib: fix zapi_nexthop_update_decode error rc
Quentin Young [Wed, 30 Sep 2020 22:22:33 +0000 (18:22 -0400)]
lib: fix zapi_nexthop_update_decode error rc

This function returns true on success and false otherwise. Returning -1
on error is equivalent to returning true.

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
4 years agozebra: don't touch mlag read event pointer
Mark Stapp [Wed, 30 Sep 2020 17:24:54 +0000 (13:24 -0400)]
zebra: don't touch mlag read event pointer

Don't touch the mlag read event pointer, it's not safe.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
4 years agobfdd: Make new multihop peer if local-address is unique
Tashana Mehta-Wilson [Tue, 29 Sep 2020 00:47:53 +0000 (13:47 +1300)]
bfdd: Make new multihop peer if local-address is unique

Previously if there were two multihop peers created that had the same
peer address but different local addresses then the second peer to be
created would be merged with the first one and niether would be able to
be deleted. This was due to an issue in the function bfd_key_lookup().
When the second peer was created its key would be sent into the lookup
function and would reach the last section, even though it shouldn't
have. A check has been placed around the section so that it will not be
entered if a peer is multihop.

Signed-off-by: Tashana Mehta-Wilson <tashana.mehta-wilson@alliedtelesis.co.nz>
4 years agopbrd: use bool for pbr_send_pbr_map() return val
Stephen Worley [Wed, 23 Sep 2020 18:17:15 +0000 (14:17 -0400)]
pbrd: use bool for pbr_send_pbr_map() return val

Use a bool as the return val for pbr_send_pbr_map() to make
the code a bit more readable. Dont expect there to be need
for values other than true or false anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
4 years agopbrd: cleanup pbr ifp info if not sent to zebra
Stephen Worley [Thu, 17 Sep 2020 19:34:36 +0000 (15:34 -0400)]
pbrd: cleanup pbr ifp info if not sent to zebra

Properly cleanup the pbr interface data if nothing actually
gets sent to zebra, since we will never get the callback
notification from zapi to issue final deletion.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
4 years agopbrd: add return val for pbr_send_pbr_map()
Stephen Worley [Thu, 17 Sep 2020 19:32:01 +0000 (15:32 -0400)]
pbrd: add return val for pbr_send_pbr_map()

Add a return val so caller can know if something was actually sent to
zebra here. Some things need to be cleanued up by the caller
if we arent getting a callback from zapi.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
4 years agozebra: avoid duplication node in l3vni l2vni-list
Chirag Shah [Sun, 27 Sep 2020 21:09:43 +0000 (14:09 -0700)]
zebra: avoid duplication node in l3vni l2vni-list

With l2vni flap leading to duplicate entry creation
in l3vni's l2vni-list.
Use list sorted add with no duplicates.

root@TORC11:mgmt:~# show evpn vni 4001
VNI: 4001
  Type: L3
  Tenant VRF: vrf1
  State: Up
  ...
  L2 VNIs: 1000 1000 1000 0 0 1002
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# show evpn vni 4001
VNI: 4001
  Type: L3
  Tenant VRF: vrf1
  State: Up
  ...
  L2 VNIs: 1000 1000 1000 0 0 1002 1002

Ticket:CM-31545
Reviewed By:
Testing Done:

With Fix:
Multiple time flaps vni counts remained the same.

root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# ip link set down vx-1002
root@TORC11:mgmt:~# ip link set up vx-1002
root@TORC11:mgmt:~# net show evpn vni 4001
VNI: 4001
  Type: L3
  Tenant VRF: vrf1
  State: Up
  ...
  L2 VNIs: 1000 1002

Signed-off-by: Chirag Shah <chirag@nvidia.com>
4 years agozebra: Make nexthop_active check use the same debug
Donald Sharp [Tue, 29 Sep 2020 11:54:35 +0000 (07:54 -0400)]
zebra: Make nexthop_active check use the same debug

When debugging why a route was not successfully installed into the
rib, it would be preferable that the end user only have to turn
on `debug zebra rib detail` as that is what we have been telling
people to do for the last couple of years.  Consolidate *back*
to this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: Add missing reason we could not make an active_nexthop check
Donald Sharp [Tue, 29 Sep 2020 11:45:19 +0000 (07:45 -0400)]
zebra: Add missing reason we could not make an active_nexthop check

Add a missing reason as to why we are unable to make an active nexthop
check be successful.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agovtysh: fix exit from keychain node
Igor Ryzhov [Mon, 28 Sep 2020 14:17:05 +0000 (17:17 +0300)]
vtysh: fix exit from keychain node

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agovtysh: fix exit from babeld node
Igor Ryzhov [Mon, 28 Sep 2020 14:13:40 +0000 (17:13 +0300)]
vtysh: fix exit from babeld node

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agobuild: remove redundant commas
Igor Ryzhov [Sun, 27 Sep 2020 16:22:02 +0000 (19:22 +0300)]
build: remove redundant commas

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoisisd: guard against adj timer display overflow
Emanuele Di Pascale [Wed, 23 Sep 2020 14:46:44 +0000 (16:46 +0200)]
isisd: guard against adj timer display overflow

An adjacency should be removed when the holdtimer expires, but if the
system is overloaded we may end up doing it late. In the meanwhile vtysh
will display an incorrect value in the show isis neighbor output, due to
an overflow of the unsigned variable used to display the Holdtime, e.g.:

pe1# show isis neighbor
Area test:
 System Id     Interface   L   state   Holdtime  SNPA
 Spirent-1     2.201       1   Down    26        2020.2020.2020
 Spirent-1     2.203       1   Up      21        2020.2020.2020
 Spirent-1     2.204       1   Up      18446744073709551615  2020.2020.2020
 Spirent-1     2.207       1   Up      18446744073709551615  2020.2020.2020
 Spirent-1     2.208       1   Up      18446744073709551615  2020.2020.2020
 Spirent-1     2.209       1   Up      0         2020.2020.2020
 Spirent-1     2.210       1   Up      18446744073709551615  2020.2020.2020
 pe2           12.200      1   Up      30        2020.2020.2020

Guard against that by printing an "Expiring" message instead.

Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
4 years agoisisd: simplify adj_change hook call
Emanuele Di Pascale [Wed, 23 Sep 2020 14:37:21 +0000 (16:37 +0200)]
isisd: simplify adj_change hook call

There is no need to call isis_adj_state_change_hook once per level
in isis_adj_state_change, we can just do it once at the end.

Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
4 years ago*: move all userdata when changing node xpath
Igor Ryzhov [Thu, 24 Sep 2020 18:05:32 +0000 (21:05 +0300)]
*: move all userdata when changing node xpath

The same thing was done for interfaces in commit f7c20aa1f.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agobgpd: Remove dest variable from route_out_vty_flowspec
Donald Sharp [Thu, 24 Sep 2020 12:20:24 +0000 (08:20 -0400)]
bgpd: Remove dest variable from route_out_vty_flowspec

The dest variable was never really used.  Just remove
from the code base.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agobgpd: pbra is already derefed in all paths to this spot
Donald Sharp [Thu, 24 Sep 2020 12:16:57 +0000 (08:16 -0400)]
bgpd: pbra is already derefed in all paths to this spot

The pbra variable is already derefed in all paths to this spot
and as such we cannot be NULL at this point.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agopimd: When bind fails we could leave an open socket
Donald Sharp [Thu, 24 Sep 2020 12:12:49 +0000 (08:12 -0400)]
pimd: When bind fails we could leave an open socket

Clean up the rare situation when bind fails to not
close the fd that was just opened and have the socket
leaked.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agopimd: NULL not 0
Donald Sharp [Thu, 24 Sep 2020 12:10:26 +0000 (08:10 -0400)]
pimd: NULL not 0

When handling data pointers explicity use NULL not
0.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agobgpd: Ensure we do integer size promotions
Donald Sharp [Thu, 24 Sep 2020 12:07:12 +0000 (08:07 -0400)]
bgpd: Ensure we do integer size promotions

When doing multiplication of (int) * (uint_8t) we can
have overflow and end up in a weird state.  Intentionally
upgrade the type then do the math.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: Don't ignore setsockopt return
Donald Sharp [Thu, 24 Sep 2020 11:42:51 +0000 (07:42 -0400)]
zebra: Don't ignore setsockopt return

When attempting to limit the amount of data sent from the kernel
to FRR, some kernels we can run against may not have this ability
in which case the setsockopt will fail.  Notice that in the log.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: fix use of freed es during zebra shutdown
Anuradha Karuppiah [Tue, 15 Sep 2020 23:50:14 +0000 (16:50 -0700)]
zebra: fix use of freed es during zebra shutdown

This problem was reported by the sanitizer -
=================================================================
==24764==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000115c8 at pc 0x55cb9cfad312 bp 0x7fffa0552140 sp 0x7fffa0552138
READ of size 8 at 0x60d0000115c8 thread T0
    #0 0x55cb9cfad311 in zebra_evpn_remote_es_flush zebra/zebra_evpn_mh.c:2041
    #1 0x55cb9cfad311 in zebra_evpn_es_cleanup zebra/zebra_evpn_mh.c:2234
    #2 0x55cb9cf6ae78 in zebra_vrf_disable zebra/zebra_vrf.c:205
    #3 0x7fc8d478f114 in vrf_delete lib/vrf.c:229
    #4 0x7fc8d478f99a in vrf_terminate lib/vrf.c:541
    #5 0x55cb9ceba0af in sigint zebra/main.c:176
    #6 0x55cb9ceba0af in sigint zebra/main.c:130
    #7 0x7fc8d4765d20 in quagga_sigevent_process lib/sigevent.c:103
    #8 0x7fc8d4787e8c in thread_fetch lib/thread.c:1396
    #9 0x7fc8d4708782 in frr_run lib/libfrr.c:1092
    #10 0x55cb9ce931d8 in main zebra/main.c:488
    #11 0x7fc8d43ee09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x55cb9ce94c09 in _start (/usr/lib/frr/zebra+0x8ac09)
=================================================================

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
4 years agozebra: evpn-mh: add error logs on ES processing failures
Anuradha Karuppiah [Wed, 20 May 2020 21:56:36 +0000 (14:56 -0700)]
zebra: evpn-mh: add error logs on ES processing failures

Cleanup some of the XXX added during development of MH.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
4 years agozebra: Increase the read/write mlag buffer sizes
Donald Sharp [Wed, 23 Sep 2020 17:06:08 +0000 (13:06 -0400)]
zebra: Increase the read/write mlag buffer sizes

The read/write mlag buffer sizes of 2k were sufficient
for ~100 S,G notifications at one go.  Increase to 32k
to give us 16 times the space.

Ticket: CM-31576
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: Ensure that message received from mlag will fit
Donald Sharp [Wed, 23 Sep 2020 17:04:20 +0000 (13:04 -0400)]
zebra: Ensure that message received from mlag will fit

If we receive a message that is greater than our buffer
size we are in a situation where both the read and write
buffers are fubar'ed beyond the end.  Assert when we notice
this fact.

Ticket: CM-31576
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: modify mlag code to only need 1 stream when generating data
Donald Sharp [Wed, 23 Sep 2020 16:26:13 +0000 (12:26 -0400)]
zebra: modify mlag code to only need 1 stream when generating data

The normal pattern of writing the type/length at the beginning
of the packet was not being quite followed.  Modify the mlag
code to respect the proper way of doing things and get rid
of a stream_new and copy.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agozebra: stop neigh hold timer when the neigh is deleted
Anuradha Karuppiah [Tue, 26 May 2020 13:24:17 +0000 (06:24 -0700)]
zebra: stop neigh hold timer when the neigh is deleted

The neigh hold timer was firing after the neigh was deleted resulting
in the following crash -
[
    at ./zebra/zebra_evpn_neigh.h:155
    at zebra/zebra_evpn_neigh.c:447
    at lib/thread.c:1578
    at zebra/main.c:488
]

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
4 years agozebra: fix deletion of evpn mh neigh-holdtime
Don Slice [Thu, 4 Jun 2020 15:23:09 +0000 (15:23 +0000)]
zebra: fix deletion of evpn mh neigh-holdtime

Found that the command "evpn mh neigh-holdtime" can be set but
not deleted.  This fix solves the delete process

Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
4 years agozebra: Move debug information gathering to inside guard
Donald Sharp [Wed, 23 Sep 2020 00:47:33 +0000 (20:47 -0400)]
zebra: Move debug information gathering to inside guard

Let's not make the entire `depend_finds` function pay
for the data gathering needed for the debug.  There
are numerous other places in the code that check
the NEXTHOP_FLAG_RECURSIVE and do the same output.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agovrf: VRF_DEFAULT must be 0, remove useless code
Christophe Gouault [Mon, 24 Aug 2020 16:01:15 +0000 (18:01 +0200)]
vrf: VRF_DEFAULT must be 0, remove useless code

Code was added in the past to support a value of VRF_DEFAULT different
from 0. This option was abandoned, the default vrf id is always 0.

Remove this code, this will simplify the code and improve performance
(use a constant value instead of a function that performs tests).

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
4 years agozebra: always display vrf in show ip route json
Christophe Gouault [Thu, 20 Aug 2020 09:15:33 +0000 (11:15 +0200)]
zebra: always display vrf in show ip route json

In route json outputs, always display the vrf even if it is the
default vrf.

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
4 years agozebra: simplify and optimize vrf display in show ip route
Christophe Gouault [Thu, 20 Aug 2020 09:15:33 +0000 (11:15 +0200)]
zebra: simplify and optimize vrf display in show ip route

In all outputs (text and json): simplify and optimize the vrf name
display, use the vrf_id_to_name() handler.

Note: vrf_id_to_name() has a safeguard system that prevents from
crashing when the vrf cannot be found because it changed in some
(unexpected) manner, it returns "n/a".

Note: "vrf n/a" will now be displayed instead of "vrf UNKNOWN" in this
case, like in most other frr components.

This safeguard was missing for show ip route json, so this
optimization also fixes a potential crash.

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
4 years agolib: optimize vrf_id_to_name(VRF_DEFAULT) case
Christophe Gouault [Wed, 26 Aug 2020 14:26:49 +0000 (16:26 +0200)]
lib: optimize vrf_id_to_name(VRF_DEFAULT) case

vrf_id_to_name() looks up in a RB_TREE to find the VRF entry, then
reads the name.

Avoid it for VRF_DEFAULT, which always exists and for which the
translation is straightforward.

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
4 years agozebra: fix show ip route output
Christophe Gouault [Thu, 20 Aug 2020 09:15:33 +0000 (11:15 +0200)]
zebra: fix show ip route output

Variable "show ip route" commands invoke the same helper
(do_show_ip_route), potentially several times.

When asking to dump a non-default vrf, all vrfs or all tables, the
output is messy, the header summarizing abbreviations is repeated
several times, excess line feeds appear, the default table of default
VRF is concatenated to the previous table output...

Normalize the output:

- whatever the case, display the common header at most once, if there
  is at least an entry to dump.

- when using a "vrf all" or "table all" command, prepend a line with
  the VRF and table (even for the default vrf or table).

- when dumping a specific vrf or table, prepend a line with the VRF
  and table.

Example (vrf all)
=================

router# show ip route vrf all
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF main:
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:24:09
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:24:09
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:00:26

VRF private:
S>* 1.1.1.0/24 [1/0] via 10.125.0.2, loop0, 00:00:29
C>* 10.125.0.0/24 is directly connected, loop0, 00:00:42

Example (main vrf)
==================

router# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

C>* 10.0.2.0/24 is directly connected, mgmt0, 00:24:41
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:24:41
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:00:58

Example (specific vrf)
======================

router# show ip route vrf private
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF private:
S>* 1.1.1.0/24 [1/0] via 10.125.0.2, loop0, 00:01:23
C>* 10.125.0.0/24 is directly connected, loop0, 00:01:36

Example (all tables)
====================

router# show ip route table all
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF main table 200:
S>* 4.4.4.4/32 [1/0] via 10.125.0.3, ntfp2, 00:01:51

VRF main table 254:
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:25:34
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:25:34
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:01:51

Example (all vrf, all table)
============================

router# show ip route table all vrf all
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF main table 200:
S>* 4.4.4.4/32 [1/0] via 10.125.0.3, ntfp2, 00:02:15

VRF main table 254:
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:25:58
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:25:58
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:02:15

VRF private table 200:
S>* 2.2.2.0/24 [1/0] via 10.125.0.2, loop0, 00:02:18

VRF private table 254:
S>* 1.1.1.0/24 [1/0] via 10.125.0.2, loop0, 00:02:18
C>* 10.125.0.0/24 is directly connected, loop0, 00:02:31

Example (specific table)
========================

router# show ip route table 200
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

VRF main table 200:
S>* 4.4.4.4/32 [1/0] via 10.125.0.3, ntfp2, 00:05:26

Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
4 years agoospfd: do not generate type 4 LSA from NSSA ABR
ckishimo [Thu, 17 Sep 2020 09:51:26 +0000 (02:51 -0700)]
ospfd: do not generate type 4 LSA from NSSA ABR

In a topology like R1 -- R2 -- R5, with R2 being NSSA ABR and R5 being
ASBR redistributing external routes, the ABR R2 will translate type-7
LSA into type-5 and advertise to the backbone. In the current implementation
R2 is also advertising a type-4 LSA when there is no need.

RFC 3101: "...NSSA's border routers never originate Type-4 summary-LSAs
for the NSSA's AS boundary routers, since Type-7 AS-external-LSAs are
never flooded beyond the NSSA's border..."

With this PR a type-4 LSA will not be advertised

Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
4 years agoripd, ripngd: info -> debug
Donald Sharp [Mon, 21 Sep 2020 11:55:36 +0000 (07:55 -0400)]
ripd, ripngd: info -> debug

There are a couple info messages in rip/ripng that really should
be debugs.  Modify code to be so.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 years agolib: don't execute command if pre-processing hook has failed
Igor Ryzhov [Mon, 21 Sep 2020 13:00:33 +0000 (16:00 +0300)]
lib: don't execute command if pre-processing hook has failed

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agolib: fix regcomp error processing
Igor Ryzhov [Mon, 21 Sep 2020 12:35:56 +0000 (15:35 +0300)]
lib: fix regcomp error processing

 * use actual error code instead of "false"
 * add missing new line

Before:
```
nfware# show interface | include (a]
% Regex compilation error: Success% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```

After:
```
nfware# show interface | include (a]
% Regex compilation error: Unmatched ( or \(
% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
4 years agoospfd : Fix for ospf dead interval and hello due.
Kaushik [Sat, 19 Sep 2020 07:29:25 +0000 (00:29 -0700)]
ospfd : Fix for ospf dead interval and hello due.

1. Ospf dead-interval will be set as 4 times of hello-interval, incase
if it is not set by using "ip ospf dead-interval <dead-val>".
2. On resetting hello-interval using "no ip ospf hello-interval" the
dead interval and hello due will be changed accordingly.

Signed-off-by: Kaushik <kaushik@niralnetworks.com>
4 years agodoc: clarify python and pip2 for ubuntu 20
Mark Stapp [Mon, 14 Sep 2020 13:52:16 +0000 (09:52 -0400)]
doc: clarify python and pip2 for ubuntu 20

Must run the pip2 install script with python2 on ubuntu 20.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
4 years agotests: Add topotest for BGP metric change
Martin Winter [Mon, 14 Sep 2020 19:58:57 +0000 (21:58 +0200)]
tests: Add topotest for BGP metric change

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
4 years agodoc: updated user doc for routemap set metric cmd
David Schweizer [Mon, 14 Sep 2020 15:08:50 +0000 (17:08 +0200)]
doc: updated user doc for routemap set metric cmd

Updated the user documentation to reflect changes made to routemaps "set
metric" VTY shell command.

Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
4 years agolib: fix negating set metric route-map command
David Schweizer [Thu, 10 Sep 2020 07:14:58 +0000 (09:14 +0200)]
lib: fix negating set metric route-map command

Changed negating set metric route-map command to be usable in
conjunction with the affirming command.

Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>