Donald Sharp [Wed, 12 Oct 2022 20:05:23 +0000 (16:05 -0400)]
ospfd: Allow unnumbered and numbered addresses to co-exist better
When forming a neighbor relationship on an interface, ospf is
currently evaluating unnumbered as highest priority, without
any consideration for if you have /32's and non /32's on the
interface. Effectively if I have something like this:
int foo0
ip address 192.168.119.1/24
!
router ospf
network 0.0.0.0/0 area 0
!
ospf will form a neighbor on foo0 if it exists. Now
suppose someone does this:
int foo0
ip address 192.168.120.1/32
This will create the unnumbered interface on foo0 and
the peering will come down immediately.
The problem here is that the original designers of the unnumbered
code for ospf didn't envision end operators mixing and matching
addresses on an interface like this ( for perfectly legitimate
reasons I might add ).
So if ospf has both numbered and unnumbered let's match against
the numbered first and then unnumbered. This solves the problem
Fixes: #6823 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Tue, 11 Oct 2022 17:21:03 +0000 (13:21 -0400)]
lib: Free some memory in scripting subsystem at shutdown
Pre:
staticd: showing active allocations in memory group libfrr
staticd: memstats: Scripting : 16 * (variably sized)
staticd: memstats: Hash : 2 * (variably sized)
staticd: memstats: Hash Bucket : 8 * 32
staticd: memstats: Hash Index : 1 * (variably sized)
staticd: memstats: Link List : 1 * 40
staticd: memstats: Link Node : 1 * 24
staticd: showing active allocations in memory group logging subsystem
staticd: memstats: log file target : 1 * 88
staticd: showing active allocations in memory group staticd
Post:
staticd: showing active allocations in memory group libfrr
staticd: showing active allocations in memory group logging subsystem
staticd: memstats: log file target : 1 * 88
staticd: showing active allocations in memory group staticd
Donatas Abraitis [Tue, 11 Oct 2022 19:36:26 +0000 (22:36 +0300)]
tools: Handle sequence numbers for BGP community (large/ext) in frr-reload.py
If we add/modify community/large/ext lists without sequence numbers, and
doing frr-reload.py, then rules with sequence numbers (show running-config
always adds sequence numbers) will be deleted and new ones will be re-added.
bgpd: Don't check for NULL when removing SRv6 SIDs
When an SRv6 locator is unset, all the SRv6 SIDs allocated from the
locator are removed. Before freeing the memory allocated for an SRv6
SID, we check if the pointer to the SID is `NULL`.
However, checking for `NULL` before freeing memory is useless.
This PR aims to improve the code's readability by removing the
useless `NULL` checks.
Currently, if `bgp max-med on-startup` is configured, after BGP session
is established for the first time, a timer for the specified time is
started. When the timer is expired, an UPDATE message should be sent to
reflect changes in the routes' MED value. The problem is that the routes
are being suppressed because based on the attributes they look like they
have not changed. However, in the case of max-med, the value is copied
to the packet directly from `bgp->maxmed_value`, not from the
attributes. Thus, changes in this case cannot be detected by comparing
attributes.
With this fix, avoid route suppressing when the `max-med on-startup`
timer expires and initiates an UPDATE.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
No mistake, just to unify style for the parameter of function address - remove
ampersand. In current code, only this one place of `hook_register()`s needs
to be made.
Donald Sharp [Fri, 7 Oct 2022 11:51:17 +0000 (07:51 -0400)]
*: Create and use infrastructure to show debugs in lib
There are lib debugs being set but never show up in
`show debug` commands because there was no way to show
that they were being used. Add a bit of infrastructure
to allow this and then use it for `debug route-map`
In order to send correct SRv6 L3VPN advertisement, we need to save
srv6_locator_chunk in vpn_policy. With this information, we can
construct correct SRv6 L3VPN advertisement packets.
Donald Sharp [Fri, 7 Oct 2022 00:17:26 +0000 (20:17 -0400)]
bgpd: Remove unnecessary check for pi and setting type and sub-type
There is code that sets the pi based upon matching it against
the same peer. In this code the type and sub-type are also
compared to the passed in type and sub-type. Let's just use
type and sub-type as that if we have a pi we know type and sub-type
are already correct. This should also make the first iteration
work correctly when the pi has not been created yet when we call
the martian_update function.bgpd: Remove unnecessary check for pi and setting type and sub-type
There is code that sets the pi based upon matching it against
the same peer. In this code the type and sub-type are also
compared to the passed in type and sub-type. Let's just use
type and sub-type as that if we have a pi we know type and sub-type
are already correct. This should also make the first iteration
work correctly when the pi has not been created yet when we call
the martian_update function.
The typesafe hash data structure enforces items to be unique, but their
hash values may still collide. To this extent, when two items have the
same hash value, the compare function is called to see if it returns 0
(aka "equal").
While the _find() function handles this correctly, the _add() function
mistakenly only checked the first item with a colliding hash value for
equality, and if it was inequal proceeded to add the new item. There
may however be additional items with the same hash value collision, one
of which could still compare as equal. In that case, _add() would
mistakenly add the new element, failing to notice the already added
item. Breakage ensues.
Fix by looking for an equal element among *all* existing items with the
same hash value, not just the first.
Philippe Guibert [Wed, 28 Sep 2022 15:23:51 +0000 (17:23 +0200)]
bgpd: improve 'show bgp nexthop' command
- for a given IP nexthop, dump all NH entries, including
colored entries, or entries with an ifindex.
- when a given IP nexthop is requested, the path is displayed.
For better readibility, remove the carriage return between
'Last update' and 'Paths', because ctime() function already
performs carriage return.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Donald Sharp [Tue, 4 Oct 2022 12:51:38 +0000 (08:51 -0400)]
zebra: Allow tunneldump data to work properly on non-supported kernels
When zebra requests tunnel data it is sending a RTM_GETTUNNEL per
interface that is a VXLAN tunnel. If the kernel that is being
used does not support the particular request type then zebra
will get a error message per tunnel request back. Unfortunately
netlink_parse_info *stops* reading on the first error message.
Therefor one kernels that are returning an error message
let's gather all of those errors. This will allow things
like route reads to actually work properly
Fixes: #12056 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
zebra: ignore unspec RetransTimer in RA validation
Section 6.2.7 of RFC 4861 states that a router SHOULD log
inconsistencies in RA information detected on a given link:
```
- Cur Hop Limit values (except for the unspecified value of zero
other inconsistencies SHOULD be logged to system network
management).
- Values of the M or O flags.
- Reachable Time values (except for the unspecified value of zero).
- Retrans Timer values (except for the unspecified value of zero).
- Values in the MTU options.
- Preferred and Valid Lifetimes for the same prefix. If
AdvPreferredLifetime and/or AdvValidLifetime decrement in real
time as specified in Section 6.2.1 then the comparison of the
lifetimes cannot compare the content of the fields in the Router
Advertisement, but must instead compare the time at which the
prefix will become deprecated and invalidated, respectively. Due
to link propagation delays and potentially poorly synchronized
clocks between the routers such comparison SHOULD allow some time
skew.
```
We were not logging inconsistencies if "the unspecified value of zero"
was used for Reachable Time but were logging them for Retrans Timer.
This updates the validation check to also skip the logging of Retrans
Timer inconsistencies if either local/rx value is 0.
When we process a received Router Advertisement we have some logic in
place to detect and log mismatches in a handful of flags/values.
However, these logs do not include what the actual values are, which
means it's up to the operator to grab a packet capture and compare that
against the local configuration...
So let's make life a little easier by including those in the log itself.
Before:
```
2022/09/30 20:37:16 ZEBRA: [KV2V1-7GM7G][EC 4043309149] enp1s0(2): Rx RA - our AdvCurHopLimit doesn't agree with fe80::5054:ff:feca:b085
2022/09/30 20:37:16 ZEBRA: [KS0BP-4GR8K][EC 4043309149] enp1s0(2): Rx RA - our AdvManagedFlag doesn't agree with fe80::5054:ff:feca:b085
2022/09/30 20:37:16 ZEBRA: [RE4EC-VYEJ2][EC 4043309149] enp1s0(2): Rx RA - our AdvOtherConfigFlag doesn't agree with fe80::5054:ff:feca:b085
2022/09/30 20:37:16 ZEBRA: [X6794-9MW18][EC 4043309149] enp1s0(2): Rx RA - our AdvReachableTime doesn't agree with fe80::5054:ff:feca:b085
2022/09/30 20:37:16 ZEBRA: [S1KXC-H8F4W][EC 4043309149] enp1s0(2): Rx RA - our AdvRetransTimer doesn't agree with fe80::5054:ff:feca:b085
```
After:
```
Sep 30 20:45:18 ub20-2 zebra[47487]: [GSW5Z-V7DZN][EC 4043309149] enp1s0(2): Rx RA - our AdvCurHopLimit (14) doesn't agree with fe80::5054:ff:fe9a:e2ca (64)
Sep 30 20:45:18 ub20-2 zebra[47487]: [RHHTS-F96DR][EC 4043309149] enp1s0(2): Rx RA - our AdvManagedFlag (0) doesn't agree with fe80::5054:ff:fe9a:e2ca (1)
Sep 30 20:45:18 ub20-2 zebra[47487]: [MNBY3-FTN6W][EC 4043309149] enp1s0(2): Rx RA - our AdvOtherConfigFlag (0) doesn't agree with fe80::5054:ff:fe9a:e2ca (1)
Sep 30 20:45:18 ub20-2 zebra[47487]: [GG62B-XXWR0][EC 4043309149] enp1s0(2): Rx RA - our AdvReachableTime (20) doesn't agree with fe80::5054:ff:fe9a:e2ca (777)
Sep 30 20:45:18 ub20-2 zebra[47487]: [YG220-D6B4H][EC 4043309149] enp1s0(2): Rx RA - our AdvRetransTimer (13) doesn't agree with fe80::5054:ff:fe9a:e2ca (0)
```
Donald Sharp [Tue, 4 Oct 2022 11:28:51 +0000 (07:28 -0400)]
fabricd: Turn off excessive logging when peering will not come up
When fabricd is configured to use an interface and there will be
no peers out that interface, the log file is filling up with:
Oct 04 10:50:03 host2 fabricd[1444769]: [HHXDJ-1DA93] ISIS-Adj (1): Threeway state change Initializing to Up
Oct 04 10:50:03 host2 fabricd[1444769]: [R18GA-MS9R7] OpenFabric: Started initial synchronization with 1111.1111.1111 on enp1s0f1np1
Oct 04 10:50:06 host2 fabricd[1444769]: [HHXDJ-1DA93] ISIS-Adj (1): Threeway state change Up to Initializing
Oct 04 10:50:07 host2 fabricd[1444769]: [NT6J7-1RYRF] OpenFabric: Initial synchronization on enp1s0f1np1 timed out!
Oct 04 10:50:07 host2 fabricd[1444769]: [R18GA-MS9R7] OpenFabric: Started initial synchronization with 3333.3333.3333 on enp1s0f0np0
Oct 04 10:50:08 host2 fabricd[1444769]: [HHXDJ-1DA93] ISIS-Adj (1): Threeway state change Up to Initializing
Oct 04 10:50:11 host2 fabricd[1444769]: [NT6J7-1RYRF] OpenFabric: Initial synchronization on enp1s0f0np0 timed out!
Oct 04 10:50:11 host2 fabricd[1444769]: [HHXDJ-1DA93] ISIS-Adj (1): Threeway state change Initializing to Up
Oct 04 10:50:11 host2 fabricd[1444769]: [R18GA-MS9R7] OpenFabric: Started initial synchronization with 1111.1111.1111 on enp1s0f1np1
Oct 04 10:50:14 host2 fabricd[1444769]: [HHXDJ-1DA93] ISIS-Adj (1): Threeway state change Up to Initializing
Oct 04 10:50:15 host2 fabricd[1444769]: [NT6J7-1RYRF] OpenFabric: Initial synchronization on enp1s0f1np1 timed out!
Oct 04 10:50:16 host2 fabricd[1444769]: [R18GA-MS9R7] OpenFabric: Started initial synchronization with 1111.1111.1111 on enp1s0f1np1
Oct 04 10:50:18 host2 fabricd[1444769]: [HHXDJ-1DA93] ISIS-Adj (1): Threeway state change Initializing to Up
The `Threeway state change..` message is guarded by a debug, but the other 2 are not.
Let's guard those with debugs since the log will be filled up rather quickly
with any sort of aggressive timers.
bgpd: Show why the prefix is inaccessible in show commands
```
donatas-pc# show ip bgp 100.100.100.0/24 longer-prefixes
BGP table version is 13, local router ID is 10.10.10.10, vrf id 0
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
100.100.100.0/24 0.0.0.0 0 32768 i
Displayed 1 routes and 15 total paths
donatas-pc# show ip bgp 100.100.100.0/24
BGP routing table entry for 100.100.100.0/24, version 0
Paths: (1 available, no best path)
Not advertised to any peer
Local
0.0.0.0 (inaccessible, import-check enabled) from 0.0.0.0 (10.10.10.10)
Origin IGP, metric 0, weight 32768, invalid, sourced, local
Last update: Tue Oct 4 11:31:44 2022
donatas-pc# show ip bgp 100.100.100.0/24 json
{
"prefix":"100.100.100.0\/24",
"version":0,
"paths":[
{
"aspath":{
"string":"Local",
"segments":[
],
"length":0
},
"origin":"IGP",
"metric":0,
"weight":32768,
"valid":false,
"version":0,
"sourced":true,
"local":true,
"lastUpdate":{
"epoch":1664872304,
"string":"Tue Oct 4 11:31:44 2022\n"
},
"nexthops":[
{
"ip":"0.0.0.0",
"hostname":"donatas-pc",
"afi":"ipv4",
"accessible":false,
"importCheckEnabled":true,
"used":true
}
],
"peer":{
"peerId":"0.0.0.0",
"routerId":"10.10.10.10"
}
}
]
}
donatas-pc#
```