David Lamparter [Tue, 28 Jan 2025 01:58:09 +0000 (02:58 +0100)]
lib: don't shadow `_once` in `frr_with_mutex`
The `_once` loop variable will result in a `-Wshadow` warning when that
is turned on. Use `__COUNTER__` to give these variables distinct names,
like is already done with `_mtx_`.
(and because I touched it, clang-format wants it reformatted... ohwell.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Loïc Sang [Tue, 8 Apr 2025 14:57:36 +0000 (16:57 +0200)]
pimd: add YANG attr to YANG cmd
Those commands are using northbound api, add YANG attr to them. This
will allow them to use with pending commit, else the validation will
failed as they are detected as non YANG cmd.
Loïc Sang [Tue, 8 Apr 2025 14:55:33 +0000 (16:55 +0200)]
pathd: add YANG attr to YANG cmd
Those commands are using northbound api, add YANG attr to them. This
will allow them to use with pending commit, else the validation will
failed as they are detected as non YANG cmd.
Loïc Sang [Tue, 8 Apr 2025 13:23:01 +0000 (15:23 +0200)]
isisd: add YANG attr to YANG cmd
Those commands are using northbound api, add YANG attr to them. This
will allow them to use with pending commit, else the validation will
failed as they are detected as non YANG cmd.
Christian Hopps [Tue, 8 Apr 2025 05:15:53 +0000 (05:15 +0000)]
mgmtd: remove bogus "hedge" code which corrupted active candidate DS
Say you have 2 mgmtd frontend sessions (2 vtysh's) the first one is long
running and is actively changing the global candidate datastore (DS),
the second one starts and exits, this code would then copy running
back over the candidate, blowing away any changes made by the first
session.
(the long running session could technically be any user)
Instead we need to trust the various cleanup code that already exits.
For example in the commit_cfg_reply on success candidate is copied to
running, and on failure *for implicit commit* running is copied back to
candidate clearing the change. This leaves the non-implicit
configuration changes in this case we actually want candidate to keep
it's changes in transactional cases, in the other case of pending commit
during a file read the code restores candidate (if needed) on exit from
"config terminal", with this call stack:
nhrpd: Add Hop Count Validation Before Forwarding in nhrp_peer_recv()
According to [RFC 2332, Section 5.1], if an NHS receives a packet that it would normally forward and the hop count is zero, it must send an error indication back to the source and drop the packet.
Rajasekar Raja [Fri, 4 Apr 2025 20:27:03 +0000 (13:27 -0700)]
staticd: Avoid requesting SRv6 sid from zebra when loc and sid block dont match
Currently, when the locator block and sid block differs, staticd would
still go ahead and request zebra to allocate the SID which it does if
there is atleast one match (from any locators).
Only when staticd tries to install the route, it sees that the locator
block and sid block are different and avoids installing the route.
Fix:
Check if the locator block and sid block match before even requesting
Zebra to allocate one.
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
bfdd: Fix demultiplexing to rely solely on Your Discriminator as per RFC 5880.
According to RFC 5880 Section 6.3, once the remote peer reflects back the local discriminator, the receiver MUST demultiplex subsequent BFD packets based solely on the Your Discriminator field. The source IP or interface MUST NOT be used in demultiplexing once the session is established.
bgpd: flowspec: remove sizelimit check applied to the wrong length field (issue 18557)
Section 4.1 of RFC8955 defines how the length field of flowspec NLRIs is encoded.
The method use implies a maximum length of 4095 for a single flowspec NLRI.
However, in bgp_flowspec.c, we check the length attribute of the bgp_nlri structure against this maximum value, which actually is the *total* length of all NLRI included in the considered MP_REACH_NLRI path attribute.
Due to this confusion, frr would reject valid announces that contain many flowspec NLRIs, when their cummulative length exceeds 4095, and close the session.
The proposed change removes that check entirely. Indeed, there is no need to check the length field of each invidual NLRI because the method employed make it impossible to encode a length greater than 4095.
`"` was accidentally added, and random tests failures happening.
Fixes: a4f61b78dd382c438ff4fec2fda7450ecc890edf ("tests: Check if routes are marked as stale and retained with N-bit for GR") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
G. Paul Ziemba [Sun, 30 Mar 2025 22:43:04 +0000 (15:43 -0700)]
bgpd: rfapi: track outstanding rib and import timers, free mem at exit
While here, also make "VPN SAFI clear" test wait for clear result
(tests/topotests/bgp_rfapi_basic_sanity{,_config2})
Original RFAPI code relied on the frr timer system to remember
various allocations that were supposed to be freed at future times
rather than manage a parallel database. However, if bgpd is terminated
before the times expire, those pending allocations are marked as
memory leaks, even though they wouldn't be leaks under normal operation.
This change adds some hash tables to track these outstanding
allocations that are associated with pending timers, and uses
those tables to free the allocations when bgpd exits.
zmw12306 [Mon, 31 Mar 2025 04:01:30 +0000 (00:01 -0400)]
babeld: Add input validation for update TLV.
1. If the metric is infinite and AE is 0, Plen and Omitted MUST both be 0
2. Use INFINITY to replace 0xFFFF
3. Ignore unkown ae
4. If the metric field if 0xFFFF, a retraction happens So it is acceptable for no router_id when metric is 0xFFFF while ae is not 0.
Donald Sharp [Sat, 29 Mar 2025 16:02:11 +0000 (12:02 -0400)]
bgpd: Free memory associated with aspath_dup
Fix this:
==3890443== 92 (48 direct, 44 indirect) bytes in 1 blocks are definitely lost in loss record 68 of 98
==3890443== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==3890443== by 0x49737B3: qcalloc (memory.c:106)
==3890443== by 0x3EA63B: aspath_dup (bgp_aspath.c:703)
==3890443== by 0x2F5438: route_set_aspath_exclude (bgp_routemap.c:2604)
==3890443== by 0x49BC52A: route_map_apply_ext (routemap.c:2708)
==3890443== by 0x2C1069: bgp_input_modifier (bgp_route.c:1925)
==3890443== by 0x2C9F12: bgp_update (bgp_route.c:5205)
==3890443== by 0x2CF281: bgp_nlri_parse_ip (bgp_route.c:7271)
==3890443== by 0x2A28C7: bgp_nlri_parse (bgp_packet.c:338)
==3890443== by 0x2A7F5C: bgp_update_receive (bgp_packet.c:2448)
==3890443== by 0x2ACCA6: bgp_process_packet (bgp_packet.c:4046)
==3890443== by 0x49EB77C: event_call (event.c:2019)
==3890443== by 0x495FAD1: frr_run (libfrr.c:1247)
==3890443== by 0x208D6D: main (bgp_main.c:557)
Donald Sharp [Fri, 28 Mar 2025 18:58:01 +0000 (14:58 -0400)]
bgpd: On shutdown, unlock table when clearing the bgp metaQ
There are some tables not being freed upon shutdown. This
is happening because the table is being locked as dests
are being put on the metaQ. When in shutdown it was clearing
the MetaQ it was not unlocking the table
Modified the fsm state machine to attempt to not
clear routes on a peer that was not established.
The peer should be not a peer self. We do not want
to ever clear the peer self.
Donald Sharp [Thu, 27 Mar 2025 14:20:07 +0000 (10:20 -0400)]
pimd: Fix memory leak on shutdown
The gm_join_list has a setup where it attempts to only
create the list upon need and deletes it when the list
is empty. On interface shutdown it was calling the
function to empty the list but it was not empty so
the list was being left at the end. Just add a bit
of code to really clean up the list in the shutdown
case.
Direct leak of 40 byte(s) in 1 object(s) allocated from:
0 0x7f84850b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
1 0x7f8484c391c4 in qcalloc lib/memory.c:106
2 0x7f8484c1ad36 in list_new lib/linklist.c:49
3 0x55d982827252 in pim_if_gm_join_add pimd/pim_iface.c:1354
4 0x55d982852b59 in lib_interface_gmp_address_family_join_group_create pimd/pim_nb_config.c:4499
5 0x7f8484c6a5d3 in nb_callback_create lib/northbound.c:1512
6 0x7f8484c6a5d3 in nb_callback_configuration lib/northbound.c:1910
7 0x7f8484c6bb51 in nb_transaction_process lib/northbound.c:2042
8 0x7f8484c6c164 in nb_candidate_commit_apply lib/northbound.c:1381
9 0x7f8484c6c39f in nb_candidate_commit lib/northbound.c:1414
10 0x7f8484c6cf1c in nb_cli_classic_commit lib/northbound_cli.c:57
11 0x7f8484c72f67 in nb_cli_apply_changes_internal lib/northbound_cli.c:195
12 0x7f8484c73a2e in nb_cli_apply_changes lib/northbound_cli.c:251
13 0x55d9828bd30f in interface_ip_igmp_join_magic pimd/pim_cmd.c:5436
14 0x55d9828bd30f in interface_ip_igmp_join pimd/pim_cmd_clippy.c:6366
15 0x7f8484bb5cbd in cmd_execute_command_real lib/command.c:1003
16 0x7f8484bb5fdc in cmd_execute_command lib/command.c:1062
17 0x7f8484bb6508 in cmd_execute lib/command.c:1228
18 0x7f8484cfb6ec in vty_command lib/vty.c:626
19 0x7f8484cfbc3f in vty_execute lib/vty.c:1389
20 0x7f8484cff9f0 in vtysh_read lib/vty.c:2408
21 0x7f8484cec846 in event_call lib/event.c:1984
22 0x7f8484c1a10a in frr_run lib/libfrr.c:1246
23 0x55d9828fc765 in main pimd/pim_main.c:166
24 0x7f848470c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:
Nathan Bahr [Fri, 28 Mar 2025 16:02:19 +0000 (16:02 +0000)]
pimd: Only create and bind the autorp socket when really needed
Previously, the autorp socket would get created and bind if needed
by autorp configuration.
This update limits it further to also require pim enabled interfaces
in the vrf before the socket is created and bind.
So now the socket will automatically close if there are no pim
enabled interfaces left, or if autorp is turned off. It will
automatically turn on if autorp is turned on and there are pim
enabled interfaces in the vrf.
Donald Sharp [Thu, 27 Mar 2025 12:51:05 +0000 (08:51 -0400)]
yang: Limit eigrp to just 1 instance per vrf
Currently EIGRP has built in yang code that expects only
1 ASN used per vrf. Let's just limit the operator from
putting themselves in a bad position by allowing something like
this:
router eigrp 33
....
!
router eigrp 99
...
!
no router eigrp 99 would crash because of assumptions
made in the yang processing.
Let's just hard code that assumption into the EIGRP yang
at the moment such that it will not allow you to enter
a `router eigrp 99` instance at all.
This is purely a software limitation to prevent the code
from violating it's current assumptions. I do not see
much need to support this at this point in time so I
fixed the problem this way instead of having to possibly
touch a bunch of code.
Donald Sharp [Wed, 26 Mar 2025 15:14:57 +0000 (11:14 -0400)]
tests: Modify simple_snmp_test to use frr.conf
The simple_snmp_test was not properly testing
the rip snmp code because of weirdness w/ mgmtd
and non-integrated configs. Modify the whole
test to use a integrated config and voila
ripd is talking snmp again in the test.
Donatas Abraitis [Wed, 26 Mar 2025 12:47:44 +0000 (14:47 +0200)]
tests: Use label 0x800000 instead of 0x000000 for BMP tests
Related-to: 94e2aadf7187d7d695babce21033b5bc8e454f25 ("bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000") Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Martin Buck [Tue, 25 Mar 2025 15:53:12 +0000 (16:53 +0100)]
tests: Fix wait times in test_ospf6_gr_topo1 topotest
Increase wait times to at least the minimum wait time accepted by
topotest.run_and_expect(). Also change poll interval to 1s, no point in
doings this more frequently.
Finally, slightly improve the topology diagram to also include area numbers.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Donatas Abraitis [Wed, 26 Mar 2025 08:30:52 +0000 (10:30 +0200)]
bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000
RFC8277 says:
The procedures in [RFC3107] for withdrawing the binding of a label
or sequence of labels to a prefix are not specified clearly and correctly.
=> How to Explicitly Withdraw the Binding of a Label to a Prefix
Suppose a BGP speaker has announced, on a given BGP session, the
binding of a given label or sequence of labels to a given prefix.
Suppose it now wishes to withdraw that binding. To do so, it may
send a BGP UPDATE message with an MP_UNREACH_NLRI attribute. The
NLRI field of this attribute is encoded as follows:
Upon transmission, the Compatibility field SHOULD be set to 0x800000.
Upon reception, the value of the Compatibility field MUST be ignored.
[RFC3107] also made it possible to withdraw a binding without
specifying the label explicitly, by setting the Compatibility field
to 0x800000. However, some implementations set it to 0x000000. In
order to ensure backwards compatibility, it is RECOMMENDED by this
document that the Compatibility field be set to 0x800000, but it is
REQUIRED that it be ignored upon reception.
In FRR case where a single label is used per-prefix, we should send 0x800000,
and not 0x000000.