git.puffer.fish Git - matthieu/frr.git/log
matthieu/frr.git
4 weeks ago  Merge pull request #18470 from zmw12306/NH_Init
Russ White [Tue, 1 Apr 2025 14:13:11 +0000 (10:13 -0400)]
Merge pull request #18470 from zmw12306/NH_Init

babeld: Add next hop initialization

4 weeks ago  Merge pull request #18450 from donaldsharp/bgp_packet_reads
Russ White [Tue, 1 Apr 2025 14:12:37 +0000 (10:12 -0400)]
Merge pull request #18450 from donaldsharp/bgp_packet_reads

Bgp packet reads conversion to a FIFO

4 weeks ago  Merge pull request #18544 from donaldsharp/memory_leaks_all_over
Donatas Abraitis [Mon, 31 Mar 2025 11:50:59 +0000 (14:50 +0300)]
Merge pull request #18544 from donaldsharp/memory_leaks_all_over

Memory leaks all over

4 weeks ago  bgpd: Free memory associated with aspath_dup
Donald Sharp [Sat, 29 Mar 2025 16:02:11 +0000 (12:02 -0400)]
bgpd: Free memory associated with aspath_dup

Fix this:

==3890443== 92 (48 direct, 44 indirect) bytes in 1 blocks are definitely lost in loss record 68 of 98
==3890443==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==3890443==    by 0x49737B3: qcalloc (memory.c:106)
==3890443==    by 0x3EA63B: aspath_dup (bgp_aspath.c:703)
==3890443==    by 0x2F5438: route_set_aspath_exclude (bgp_routemap.c:2604)
==3890443==    by 0x49BC52A: route_map_apply_ext (routemap.c:2708)
==3890443==    by 0x2C1069: bgp_input_modifier (bgp_route.c:1925)
==3890443==    by 0x2C9F12: bgp_update (bgp_route.c:5205)
==3890443==    by 0x2CF281: bgp_nlri_parse_ip (bgp_route.c:7271)
==3890443==    by 0x2A28C7: bgp_nlri_parse (bgp_packet.c:338)
==3890443==    by 0x2A7F5C: bgp_update_receive (bgp_packet.c:2448)
==3890443==    by 0x2ACCA6: bgp_process_packet (bgp_packet.c:4046)
==3890443==    by 0x49EB77C: event_call (event.c:2019)
==3890443==    by 0x495FAD1: frr_run (libfrr.c:1247)
==3890443==    by 0x208D6D: main (bgp_main.c:557)

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  zebra: Clean up memory associated with affinity maps
Donald Sharp [Sat, 29 Mar 2025 00:08:35 +0000 (20:08 -0400)]
zebra: Clean up memory associated with affinity maps

Zebra is using affinity maps but not cleaning up memory on shutdown.
BAD!

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  isisd: Tie isis into cleaning up affinity maps
Donald Sharp [Sat, 29 Mar 2025 00:04:22 +0000 (20:04 -0400)]
isisd: Tie isis into cleaning up affinity maps

Affinity maps are being leaked.  STOP

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  lib: Add an affinity_map_terminate() function
Donald Sharp [Sat, 29 Mar 2025 00:03:50 +0000 (20:03 -0400)]
lib: Add an affinity_map_terminate() function

This function will clean up memory associated with affinity maps
on shutdown.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  *: Ensure prefix lists are freed on shutdown.
Donald Sharp [Fri, 28 Mar 2025 21:17:37 +0000 (17:17 -0400)]
*: Ensure prefix lists are freed on shutdown.

Several daemons were not calling prefix_list_reset
to clean up memory on shutdown.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  bgpd: On shutdown, unlock table when clearing the bgp metaQ
Donald Sharp [Fri, 28 Mar 2025 18:58:01 +0000 (14:58 -0400)]
bgpd: On shutdown, unlock table when clearing the bgp metaQ

There are some tables not being freed upon shutdown.  This
is happening because the table is being locked as dests
are being put on the metaQ.  When shutdown was clearing
the metaQ, it was not unlocking the table.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  bgpd: When shutting down do not clear self peers
Donald Sharp [Fri, 28 Mar 2025 18:54:37 +0000 (14:54 -0400)]
bgpd: When shutting down do not clear self peers

Commit: e0ae285eb8beeef7b43bdadc073d8ae346eaeb6c

Modified the fsm state machine to attempt to not
clear routes on a peer that was not established.
The peer should not be a peer self.  We do not want
to ever clear the peer self.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  Merge pull request #15471 from opensourcerouting/frrreload_logfile
Christian Hopps [Sun, 30 Mar 2025 09:52:43 +0000 (05:52 -0400)]
Merge pull request #15471 from opensourcerouting/frrreload_logfile

tools: Add option to frr-reload to specify alternate logfile

4 weeks ago  Merge pull request #18532 from y-bharath14/srib-tests-v8
Donatas Abraitis [Fri, 28 Mar 2025 10:38:07 +0000 (12:38 +0200)]
Merge pull request #18532 from y-bharath14/srib-tests-v8

tests: Irrelevant code in lutil.py

4 weeks ago  tests: Irrelevant code in lutil.py
Y Bharath [Fri, 28 Mar 2025 05:22:36 +0000 (10:52 +0530)]
tests: Irrelevant code in lutil.py

Irrelevant code in lutil.py

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
4 weeks ago  Merge pull request #18520 from y-bharath14/srib-tests-v7
Donatas Abraitis [Thu, 27 Mar 2025 13:07:45 +0000 (15:07 +0200)]
Merge pull request #18520 from y-bharath14/srib-tests-v7

tests: Fix potential issues at send_bsr_packet.py

4 weeks ago  Merge pull request #18515 from donaldsharp/route_map_show_fix
Donatas Abraitis [Thu, 27 Mar 2025 07:13:51 +0000 (09:13 +0200)]
Merge pull request #18515 from donaldsharp/route_map_show_fix

lib: `show route-map` should not print (null)

4 weeks ago  tests: Fix potential issues at send_bsr_packet.py
Y Bharath [Thu, 27 Mar 2025 04:00:40 +0000 (09:30 +0530)]
tests: Fix potential issues at send_bsr_packet.py

Fix potential issues at send_bsr_packet.py

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
4 weeks ago  lib: `show route-map` should not print (null)
Donald Sharp [Wed, 26 Mar 2025 18:35:13 +0000 (14:35 -0400)]
lib: `show route-map` should not print (null)

With this configuration:
route-map FOOBAR permit 10
 set ipv6 next-hop prefer-global
 set community 5060:12345 additive
!

Issuing a `show route-map ...` command displays this:

route-map: FOOBAR Invoked: 0 (0 milliseconds total) Optimization: enabled Processed Change: false
 permit, sequence 5 Invoked 0 (0 milliseconds total)
  Match clauses:
  Set clauses:
    ipv6 next-hop prefer-global (null)
    community 5060:12345 additive
  Call clause:
  Action:
    Exit routemap

Modify the code so that it no longer displays the NULL when there
is nothing to display.
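
The fix amounts to printing a set clause's argument only when one exists. A minimal Python sketch of that idea follows. `format_set_clause` is a hypothetical helper written for illustration; it is not the FRR function, and the real change lives in the C library code.

```python
from typing import Optional


def format_set_clause(name: str, arg: Optional[str]) -> str:
    """Render one 'Set clauses' line, omitting the argument when absent."""
    # Hypothetical helper: append the argument only if there is one,
    # rather than formatting a missing value as "(null)".
    if arg:
        return f"    {name} {arg}"
    return f"    {name}"


print(format_set_clause("ipv6 next-hop prefer-global", None))
print(format_set_clause("community", "5060:12345 additive"))
```

With this guard, a clause that takes no argument is rendered as just its name.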

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  Merge pull request #18498 from opensourcerouting/fix/keep_stale_routes_on_clear
Russ White [Wed, 26 Mar 2025 18:02:52 +0000 (14:02 -0400)]
Merge pull request #18498 from opensourcerouting/fix/keep_stale_routes_on_clear

bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart

4 weeks ago  Merge pull request #18508 from donaldsharp/rip_snmp_test_fixup
Jafar Al-Gharaibeh [Wed, 26 Mar 2025 17:38:05 +0000 (12:38 -0500)]
Merge pull request #18508 from donaldsharp/rip_snmp_test_fixup

tests: Modify simple_snmp_test to use frr.conf

4 weeks ago  Merge pull request #18502 from opensourcerouting/fix/mpls_withdraw_label
Donald Sharp [Wed, 26 Mar 2025 15:26:19 +0000 (11:26 -0400)]
Merge pull request #18502 from opensourcerouting/fix/mpls_withdraw_label

bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000

4 weeks ago  tests: Modify simple_snmp_test to use frr.conf
Donald Sharp [Wed, 26 Mar 2025 15:14:57 +0000 (11:14 -0400)]
tests: Modify simple_snmp_test to use frr.conf

The simple_snmp_test was not properly testing
the rip snmp code because of weirdness w/ mgmtd
and non-integrated configs.  Modify the whole
test to use an integrated config and voila,
ripd is talking snmp again in the test.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
4 weeks ago  Merge pull request #18500 from y-bharath14/srib-yang-v7
Donald Sharp [Wed, 26 Mar 2025 14:51:10 +0000 (10:51 -0400)]
Merge pull request #18500 from y-bharath14/srib-yang-v7

yang: Fixed pyang errors at frr-isisd.yang

4 weeks ago  Merge pull request #18506 from donaldsharp/ripng_test_aggregate_address
Russ White [Wed, 26 Mar 2025 14:31:59 +0000 (10:31 -0400)]
Merge pull request #18506 from donaldsharp/ripng_test_aggregate_address

tests: Add ripng aggregate address testing

4 weeks ago  Merge pull request #18503 from gromit1811/bugfix/ospf6_gr_leak
Donald Sharp [Wed, 26 Mar 2025 14:30:30 +0000 (10:30 -0400)]
Merge pull request #18503 from gromit1811/bugfix/ospf6_gr_leak

ospf6d: Fix LSA memory leaks related to graceful restart

4 weeks ago  tests: Use label 0x800000 instead of 0x000000 for BMP tests
Donatas Abraitis [Wed, 26 Mar 2025 12:47:44 +0000 (14:47 +0200)]
tests: Use label 0x800000 instead of 0x000000 for BMP tests

Related-to: 94e2aadf7187d7d695babce21033b5bc8e454f25 ("bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
4 weeks ago  Merge pull request #18482 from donaldsharp/eigrp_typesafe
Mark Stapp [Wed, 26 Mar 2025 11:54:23 +0000 (07:54 -0400)]
Merge pull request #18482 from donaldsharp/eigrp_typesafe

Eigrp typesafe

4 weeks ago  tests: Fix wait times in test_ospf6_gr_topo1 topotest
Martin Buck [Tue, 25 Mar 2025 15:53:12 +0000 (16:53 +0100)]
tests: Fix wait times in test_ospf6_gr_topo1 topotest

Increase wait times to at least the minimum wait time accepted by
topotest.run_and_expect(). Also change poll interval to 1s, no point in
doing this more frequently.

Finally, slightly improve the topology diagram to also include area numbers.

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
4 weeks ago  ospf6d: Fix LSA memory leaks related to graceful restart
Martin Buck [Tue, 25 Mar 2025 15:32:47 +0000 (16:32 +0100)]
ospf6d: Fix LSA memory leaks related to graceful restart

Fixes leaks reported by ospf6_gr_topo1 topotest.

Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
4 weeks ago  Merge pull request #18448 from Shbinging/fix_babel_hello_interval
Donatas Abraitis [Wed, 26 Mar 2025 08:37:58 +0000 (10:37 +0200)]
Merge pull request #18448 from Shbinging/fix_babel_hello_interval

babeld: fix hello packets not sent with configured hello timer

4 weeks ago  Merge pull request #18476 from y-bharath14/srib-tests-v6
Donatas Abraitis [Wed, 26 Mar 2025 08:34:23 +0000 (10:34 +0200)]
Merge pull request #18476 from y-bharath14/srib-tests-v6

tests: Handling potential errors gracefully

4 weeks ago  bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000
Donatas Abraitis [Wed, 26 Mar 2025 08:30:52 +0000 (10:30 +0200)]
bgpd: Set the label for MP_UNREACH_NLRI 0x800000 instead of 0x000000

RFC8277 says:

The procedures in [RFC3107] for withdrawing the binding of a label
or sequence of labels to a prefix are not specified clearly and correctly.

=> How to Explicitly Withdraw the Binding of a Label to a Prefix

Suppose a BGP speaker has announced, on a given BGP session, the
   binding of a given label or sequence of labels to a given prefix.
   Suppose it now wishes to withdraw that binding.  To do so, it may
   send a BGP UPDATE message with an MP_UNREACH_NLRI attribute.  The
   NLRI field of this attribute is encoded as follows:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Length     |        Compatibility                          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                          Prefix                               ~
     ~                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 4: NLRI for Withdrawal

   Upon transmission, the Compatibility field SHOULD be set to 0x800000.
   Upon reception, the value of the Compatibility field MUST be ignored.

[RFC3107] also made it possible to withdraw a binding without
   specifying the label explicitly, by setting the Compatibility field
   to 0x800000.  However, some implementations set it to 0x000000.  In
   order to ensure backwards compatibility, it is RECOMMENDED by this
   document that the Compatibility field be set to 0x800000, but it is
   REQUIRED that it be ignored upon reception.

In FRR's case, where a single label is used per prefix, we should send 0x800000,
and not 0x000000.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
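
The withdrawal encoding quoted from RFC 8277 above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the bgpd implementation: `encode_withdraw_nlri` and `WITHDRAW_COMPATIBILITY` are hypothetical names, and the Length octet is assumed to count the 24-bit Compatibility field plus the prefix bits, as it does for the label field in the announcement encoding.

```python
import ipaddress

# Value the quoted RFC text says SHOULD be sent (the name is hypothetical).
WITHDRAW_COMPATIBILITY = 0x800000


def encode_withdraw_nlri(prefix: str,
                         compatibility: int = WITHDRAW_COMPATIBILITY) -> bytes:
    """Build Length | Compatibility | Prefix for one withdrawn prefix."""
    net = ipaddress.ip_network(prefix)
    # Only the significant octets of the prefix are carried on the wire.
    prefix_octets = (net.prefixlen + 7) // 8
    # Assumption: Length is in bits and covers Compatibility (24) + prefix.
    length_bits = 24 + net.prefixlen
    return (
        bytes([length_bits])
        + compatibility.to_bytes(3, "big")
        + net.network_address.packed[:prefix_octets]
    )


print(encode_withdraw_nlri("10.0.0.0/8").hex())
```

A receiver following the quoted text would skip the three Compatibility octets without interpreting them, so either value parses; the commit only changes what is sent.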
4 weeks ago  yang: Fixed pyang errors at frr-isisd.yang
Y Bharath [Wed, 26 Mar 2025 07:16:08 +0000 (12:46 +0530)]
yang: Fixed pyang errors at frr-isisd.yang

Fixed pyang errors at frr-isisd.yang

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
4 weeks ago  bgpd: Remove unused defines from bgp_label.h
Donatas Abraitis [Wed, 26 Mar 2025 06:50:06 +0000 (08:50 +0200)]
bgpd: Remove unused defines from bgp_label.h

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
4 weeks ago  Merge pull request #18496 from mjstapp/fix_bgp_clearing_sa
Donald Sharp [Tue, 25 Mar 2025 22:00:02 +0000 (18:00 -0400)]
Merge pull request #18496 from mjstapp/fix_bgp_clearing_sa

bgpd: fix SA warning in bgp clearing code

4 weeks ago  tests: Add ripng aggregate address testing
Donald Sharp [Tue, 25 Mar 2025 21:35:47 +0000 (17:35 -0400)]
tests: Add ripng aggregate address testing

Looking at gcov, I noticed that ripngd does not
test any aggregate address addition/deletion
to ensure that it works.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks ago  tests: Check if routes are marked as stale and retained with N-bit for GR
Donatas Abraitis [Tue, 25 Mar 2025 15:35:41 +0000 (17:35 +0200)]
tests: Check if routes are marked as stale and retained with N-bit for GR

Related-to: b7c657d4e065f310fcf6454abae1a963c208c3b8 ("bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
5 weeks ago  bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart
Donatas Abraitis [Tue, 25 Mar 2025 15:20:56 +0000 (17:20 +0200)]
bgpd: Retain the routes if we do a clear with N-bit set for Graceful-Restart

On the receiving side we already did the job correctly, but the peer which initiates
the clear does not retain the other's routes. This commit fixes that.

Fixes: 20170775da3a3c5d41aba714d0c1d5a29b0da61c ("bgpd: Activate Graceful-Restart when receiving CEASE/HOLDTIME notifications")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
5 weeks ago  bgpd: Delay processing MetaQ in some events
Donald Sharp [Tue, 25 Mar 2025 14:43:14 +0000 (10:43 -0400)]
bgpd: Delay processing MetaQ in some events

If the number of peers that are being handled on
the peer connection fifo is greater than 10, that
means we have some network event going on.  Let's
allow the packet processing to continue instead
of running the metaQ.  This has advantages because
everything else in BGP is only run after the metaQ
is run.  This includes best path processing,
installation of the route into zebra as well as
telling our peers about this change.  Why does
this matter?  It matters because if we are receiving
the same route multiple times, best path processing
runs far fewer times, and consequently we also send
the route update out and install the route far fewer
times as well.
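
The effect described above can be illustrated with a toy model. This is only a sketch: the threshold of 10 comes from the commit message, while `should_delay_metaq`, `count_best_path_runs` and the event tuples are hypothetical and do not mirror bgpd's data structures.

```python
PEER_FIFO_THRESHOLD = 10  # from the commit message; all names are hypothetical


def should_delay_metaq(peers_on_fifo: int) -> bool:
    """Keep reading packets while many peers still wait on the FIFO."""
    return peers_on_fifo > PEER_FIFO_THRESHOLD


def count_best_path_runs(events, delay_enabled=True):
    """Count best-path computations for (prefix, peers_on_fifo) events."""
    pending = set()  # prefixes waiting for the metaQ to run
    runs = 0
    for prefix, peers_on_fifo in events:
        pending.add(prefix)  # a repeated prefix collapses into one entry
        if delay_enabled and should_delay_metaq(peers_on_fifo):
            continue  # keep processing packets, defer the metaQ
        runs += len(pending)  # metaQ runs: one best-path pass per prefix
        pending.clear()
    return runs + len(pending)  # flush whatever is still queued


# The same prefix received from 20 peers while the FIFO drains from 19 to 0.
events = [("10.0.0.0/24", depth) for depth in range(19, -1, -1)]
print(count_best_path_runs(events), count_best_path_runs(events, False))
```

In this model the duplicates that arrive while the FIFO is deep are folded into a single best-path pass, which is the saving the commit message attributes to the delay.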

Prior to this patch, with 512 peers and 5k routes,
CPU time for bgpd was 3:10 and zebra was 3:28.  After
the patch, CPU time for bgpd was 0:55 and zebra was
0:25.

Here are the prior `show event cpu` results:
Event statistics for bgpd:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0         20.749     33144        0       395        1       396         0         0          0    T    (bgp_generate_updgrp_packets)
    0       9936.199      1818     5465     43588     5466     43589         0         0          0     E   bgp_handle_route_announcements_to_zebra
    0          0.220        84        2        20        3        20         0         0          0    T    update_subgroup_merge_check_thread_cb
    0          0.058         2       29        43       29        43         0         0          0     E   zclient_connect
    0      17297.733       466    37119     67428    37124     67429         0         0          0   W     zclient_flush_data
    1          0.134         7       19        40       20        42         0         0          0  R      vtysh_accept
    0        151.396      1067      141      1181      142      1189         0         0          0  R      vtysh_read
    0          0.297      1030        0        14        0        14         0         0          0    T    (bgp_routeadv_timer)
    0          0.001         1        1         1        2         2         0         0          0    T    bgp_sync_label_manager
    2          9.374       544       17       261       17       262         0         0          0  R      bgp_accept
    0          0.001         1        1         1        2         2         0         0          0    T    bgp_startup_timer_expire
    0          0.012         1       12        12       13        13         0         0          0     E   frr_config_read_in
    0          0.308         1      308       308      309       309         0         0          0    T    subgroup_coalesce_timer
    0          4.027       105       38        77       39        78         0         0          0    T    (bgp_start_timer)
    0     112206.442      1818    61719     84726    61727     84736         0         0          0    TE   work_queue_run
    0          0.345         1      345       345      346       346         0         0          0    T    bgp_config_finish
    0          0.710       620        1         6        1         9         0         0          0   W     bgp_connect_check
    2         39.420      8283        4       110        5       111         0         0          0  R      zclient_read
    0          0.052         1       52        52      578       578         0         0          0    T    bgp_start_label_manager
    0          0.452        87        5        90        5        90         0         0          0    T    bgp_announce_route_timer_expired
    0        185.837      3088       60       537       92     21705         0         0          0     E   bgp_event
    0      48719.671      4346    11210     78292    11215     78317         0         0          0     E   bgp_process_packet

Showing statistics for pthread BGP I/O thread
---------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0        321.915     28597       11        86       11       265         0         0          0   W     bgp_process_writes
  515        115.586     26954        4       121        4       128         0         0          0  R      bgp_process_reads

Event statistics for zebra:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0          0.109         2       54        62       55        63         0         0          0    T    timer_walk_start
    1          0.550        11       50       100       50       100         0         0          0  R      vtysh_accept
    0     112848.163      4441    25410    405489    25413    410127         0         0          0     E   zserv_process_messages
    0          0.007         1        7         7        7         7         0         0          0     E   frr_config_read_in
    0          0.005         1        5         5        5         5         0         0          0    T    rib_sweep_route
    1        573.589      4789      119      1567      120      1568         0         0          0    T    wheel_timer_thread
  347         30.848        97      318      1367      318      1366         0         0          0    T    zebra_nhg_timer
    0          0.005         1        5         5        6         6         0         0          0    T    zebra_evpn_mh_startup_delay_exp_cb
    0          5.404       521       10        38       10        70         0         0          0    T    timer_walk_continue
    1          1.669         9      185       219      186       219         0         0          0  R      zserv_accept
    1          0.174        18        9        53       10        53         0         0          0  R      msg_conn_read
    0          3.028       520        5        47        6        47         0         0          0    T    if_zebra_speed_update
    0          0.324       274        1         5        1         6         0         0          0   W     msg_conn_write
    1         24.661      2124       11       359       12       359         0         0          0  R      kernel_read
    0      73683.333      2964    24859    143223    24861    143239         0         0          0    TE   work_queue_run
    1         46.649      6789        6       424        7       424         0         0          0  R      rtadv_read
    0         52.661        85      619      2087      620      2088         0         0          0  R      vtysh_read
    0         42.660        18     2370     21694     2373     21695         0         0          0     E   msg_conn_proc_msgs
    0          0.034         1       34        34       35        35         0         0          0     E   msg_client_connect_timer
    0       2786.938      2300     1211     29456     1219     29555         0         0          0     E   rib_process_dplane_results

Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0       4875.670    200371       24       770       24       776         0         0          0     E   dplane_thread_loop
    0          0.059         1       59        59       76        76         0         0          0     E   dplane_incoming_request
    1          9.640       722       13      4510       15      5343         0         0          0  R      dplane_incoming_read

Here are the post `show event cpu` results:

Event statistics for bgpd:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0      21297.497      3565     5974     57912     5981     57913         0         0          0     E   bgp_process_packet
    0        149.742      1068      140      1109      140      1110         0         0          0  R      vtysh_read
    0          0.013         1       13        13       14        14         0         0          0     E   frr_config_read_in
    0          0.459        86        5       104        5       105         0         0          0    T    bgp_announce_route_timer_expired
    0          0.139        81        1        20        2        21         0         0          0    T    update_subgroup_merge_check_thread_cb
    0        405.889    291687        1       179        1       450         0         0          0    T    (bgp_generate_updgrp_packets)
    0          0.682       618        1         6        1         9         0         0          0   W     bgp_connect_check
    0          3.888       103       37        81       38        82         0         0          0    T    (bgp_start_timer)
    0          0.074         1       74        74      458       458         0         0          0    T    bgp_start_label_manager
    0          0.000         1        0         0        1         1         0         0          0    T    bgp_sync_label_manager
    0          0.121         3       40        54      100       141         0         0          0     E   bgp_process_conn_error
    0          0.060         2       30        49       30        50         0         0          0     E   zclient_connect
    0          0.354         1      354       354      355       355         0         0          0    T    bgp_config_finish
    0          0.283         1      283       283      284       284         0         0          0    T    subgroup_coalesce_timer
    0      29365.962      1805    16269     99445    16273     99454         0         0          0    TE   work_queue_run
    0        185.532      3097       59       497       94     26107         0         0          0     E   bgp_event
    1          0.290         8       36       151       37       158         0         0          0  R      vtysh_accept
    2          9.462       548       17       320       17       322         0         0          0  R      bgp_accept
    2         40.219      8283        4       128        5       128         0         0          0  R      zclient_read
    0          0.322      1031        0         4        0         5         0         0          0    T    (bgp_routeadv_timer)
    0        356.812       637      560      3007      560      3007         0         0          0     E   bgp_handle_route_announcements_to_zebra

Showing statistics for pthread BGP I/O thread
---------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
  515         62.965     14335        4       103        5       181         0         0          0  R      bgp_process_reads
    0       1986.041    219813        9       213        9       315         0         0          0   W     bgp_process_writes

Event statistics for zebra:

Showing statistics for pthread default
--------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0          0.006         1        6         6        7         7         0         0          0     E   frr_config_read_in
    0       3673.365      2044     1797    259281     1800    261342         0         0          0     E   zserv_process_messages
    1        651.846      8041       81      1090       82      1233         0         0          0    T    wheel_timer_thread
    0         38.184        18     2121     21345     2122     21346         0         0          0     E   msg_conn_proc_msgs
    1          0.651        12       54       112       55       112         0         0          0  R      vtysh_accept
    0          0.102         2       51        55       51        56         0         0          0    T    timer_walk_start
    0        202.721      1577      128     29172      141     29226         0         0          0     E   rib_process_dplane_results
    1         41.650      6645        6       140        6       140         0         0          0  R      rtadv_read
    1         22.518      1969       11       106       12       154         0         0          0  R      kernel_read
    0          4.265        48       88      1465       89      1466         0         0          0  R      vtysh_read
    0       6099.851       650     9384     28313     9390     28314         0         0          0    TE   work_queue_run
    0          5.104       521        9        30       10        31         0         0          0    T    timer_walk_continue
    0          3.078       520        5        53        6        55         0         0          0    T    if_zebra_speed_update
    0          0.005         1        5         5        5         5         0         0          0    T    rib_sweep_route
    0          0.034         1       34        34       35        35         0         0          0     E   msg_client_connect_timer
    1          1.641         9      182       214      183       215         0         0          0  R      zserv_accept
    0          0.358       274        1         6        2         6         0         0          0   W     msg_conn_write
    1          0.159        18        8        54        9        54         0         0          0  R      msg_conn_read

Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
                               CPU (user+system): Real (wall-clock):
Active   Runtime(ms)   Invoked Avg uSec Max uSecs Avg uSec Max uSecs  CPU_Warn Wall_Warn Starv_Warn   Type  Event
    0        301.404      7280       41      1878       41      1878         0         0          0     E   dplane_thread_loop
    0          0.048         1       48        48       49        49         0         0          0     E   dplane_incoming_request
    1          9.558       727       13      4659       14      5420         0         0          0  R      dplane_incoming_read

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks ago  Merge pull request #18494 from opensourcerouting/fix/duplicate_prefix_list
Mark Stapp [Tue, 25 Mar 2025 14:43:24 +0000 (10:43 -0400)]
Merge pull request #18494 from opensourcerouting/fix/duplicate_prefix_list

lib: Return duplicate prefix-list entry test

5 weeks ago  Merge pull request #18474 from zmw12306/Hop-Count
Russ White [Tue, 25 Mar 2025 14:38:23 +0000 (10:38 -0400)]
Merge pull request #18474 from zmw12306/Hop-Count

babeld: Hop Count must not be 0.

5 weeks ago  Merge pull request #18369 from huchaogithup/master-dev-pr1
Russ White [Tue, 25 Mar 2025 14:18:13 +0000 (10:18 -0400)]
Merge pull request #18369 from huchaogithup/master-dev-pr1

isisd: Fix the issue where redistributed routes do not change when th…

5 weeks ago  Merge pull request #18311 from Z-Yivon/fix-isis-hello-timer-bug
Russ White [Tue, 25 Mar 2025 14:15:42 +0000 (10:15 -0400)]
Merge pull request #18311 from Z-Yivon/fix-isis-hello-timer-bug

isisd:IS-IS hello packets not sent with configured hello timer

5 weeks ago  bgpd: fix SA warnings in bgp clearing code
Mark Stapp [Tue, 25 Mar 2025 13:32:14 +0000 (09:32 -0400)]
bgpd: fix SA warnings in bgp clearing code

Fix a possible use-after-free in the recent bgp batch
clearing code, CID 1639091.

Signed-off-by: Mark Stapp <mjs@cisco.com>
5 weeks ago  zebra: Limit reading packets when MetaQ is full
Donald Sharp [Mon, 24 Mar 2025 18:11:35 +0000 (14:11 -0400)]
zebra: Limit reading packets when MetaQ is full

Currently Zebra is just reading packets off the zapi
wire and stacking them up for later processing in
zebra.  When there is significant churn in the
network, the size of zebra can grow without bounds
due to the MetaQ sizing constraints.  This shows up
in the number of nexthops in the system.  Reducing
the number of packets serviced, so that the metaQ
size is limited to the packets to process, alleviates
this problem.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
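
One plausible reading of this change is a simple back-pressure rule: the more work already sits on the metaQ, the fewer new packets are read in a pass. The Python sketch below shows that rule only. `packets_to_service` and both parameter names are hypothetical, and the exact arithmetic zebra uses is an assumption not confirmed by the commit message.

```python
def packets_to_service(packets_to_process: int, metaq_size: int) -> int:
    """Return how many packets to read this pass (hypothetical helper).

    Assumption: the metaQ should never hold more than packets_to_process
    entries, so only the remaining headroom is read off the wire.
    """
    return max(0, packets_to_process - metaq_size)


for queued in (0, 400, 5000):
    print(queued, packets_to_service(1000, queued))
```

When the queue is already at or above the limit, nothing is read and the client's data stays in the socket until the metaQ drains.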
5 weeks ago  bgpd: Modify bgp to handle packet events in a FIFO
Donald Sharp [Fri, 21 Mar 2025 11:48:50 +0000 (07:48 -0400)]
bgpd: Modify bgp to handle packet events in a FIFO

Current behavior of BGP is to have an event per connection.  On
startup of BGP with a high number of neighbors you therefore end
up with 2 * # of peers events being processed.  Additionally,
once BGP has selected the connection, this still only comes down
to 512 events.  This number of events is swamping the event system
and, in addition, delaying any other work from being done in BGP at
all, because the 512 events are always going to take precedence
over everything else.  The other main events are the handling
of the metaQ (1 event), update group events (1 per update group)
and the zebra batching event.  These are being swamped.

Modify the BGP code to have a FIFO of connections.  As new data
comes in to read, place the connection on the end of the FIFO.
Have the bgp_process_packet handle up to 100 packets spread
across the individual peers where each peer/connection is limited
to the original quanta.  During testing I noticed that withdrawal
events at very very large scale are taking up to 40 seconds to process
so I added a check for yielding to further limit the number of packets
being processed.

This change also allows BGP to be interactive again on scale
setups on initial convergence.  Prior to this change any vtysh
command entered would be delayed by tens of seconds in my setup
while BGP was doing other work.
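
The scheduling described above can be sketched roughly as follows. This is a Python illustration, not bgpd's code: the cap of 100 comes from the text, while `PER_PEER_QUANTA = 10`, `process_fifo` and the `inbox` mapping are assumptions made for the example, and the extra yield check is left out.

```python
from collections import deque

MAX_PACKETS_PER_RUN = 100   # overall cap named in the commit message
PER_PEER_QUANTA = 10        # hypothetical value for the per-connection quanta


def process_fifo(fifo: deque, inbox: dict) -> list:
    """Handle one scheduling run; return the (peer, packet) pairs processed."""
    handled = []
    while fifo and len(handled) < MAX_PACKETS_PER_RUN:
        peer = fifo.popleft()
        budget = min(PER_PEER_QUANTA, MAX_PACKETS_PER_RUN - len(handled))
        for _ in range(budget):
            if not inbox[peer]:
                break
            handled.append((peer, inbox[peer].popleft()))
        if inbox[peer]:
            fifo.append(peer)  # still has data: go to the back of the line
    return handled


# Five peers with 25 packets each: more work than one run may handle.
inbox = {f"peer{i}": deque(range(25)) for i in range(5)}
fifo = deque(inbox)
run = process_fifo(fifo, inbox)
```

A single run ends after 100 packets, and every peer that still has data is back on the FIFO for the next run, so one event serves all connections in turn.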

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agotests: Expand hold timer to 60 seconds for high_ecmp
Donald Sharp [Fri, 21 Mar 2025 14:47:14 +0000 (10:47 -0400)]
tests: Expand hold timer to 60 seconds for high_ecmp

The hold timer is 5/20.  At load with a very very
large number of routes, the tests are experiencing
some issues with this.  Let's just give ourselves
some headroom associated with the receiving
of packets

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoMerge pull request #18471 from zmw12306/NH-TLV
Donald Sharp [Tue, 25 Mar 2025 13:03:16 +0000 (09:03 -0400)]
Merge pull request #18471 from zmw12306/NH-TLV

babeld: add check incorrect AE value for NH TLV.

5 weeks agobabeld: fix hello packets not sent with configured hello timer
Shbinging [Fri, 21 Mar 2025 02:57:28 +0000 (02:57 +0000)]
babeld: fix hello packets not sent with configured hello timer

Same issue occurring as previously addressed in https://github.com/FRRouting/frr/pull/9092. The root cause is: "Sending a Hello message before restarting the hello timer to avoid session flaps in case of larger hello interval configurations."

Signed-off-by: Shbinging <bingshui@smail.nju.edu.cn>
5 weeks agolib: Return duplicate prefix-list entry test
Donatas Abraitis [Tue, 25 Mar 2025 11:54:24 +0000 (13:54 +0200)]
lib: Return duplicate prefix-list entry test

If we do e.g.:

ip prefix-list PL_LoopbackV4 permit 10.1.0.32/32
ip prefix-list PL_LoopbackV4 permit 10.1.0.32/32
ip prefix-list PL_LoopbackV4 permit 10.1.0.32/32

We end up with duplicate records that differ only in their sequence number.
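
A duplicate test of this kind can be sketched as below. It is a Python illustration rather than the lib code: `Entry` and `add_entry` are hypothetical, only action and prefix are compared (ge/le are left out), and the step of 5 for automatic sequence numbers is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    seq: int
    action: str      # "permit" or "deny"
    prefix: str      # e.g. "10.1.0.32/32"


def add_entry(plist: list, action: str, prefix: str) -> bool:
    """Append an entry unless one with the same action and prefix exists."""
    if any(e.action == action and e.prefix == prefix for e in plist):
        return False                      # duplicate: sequence number ignored
    next_seq = (plist[-1].seq + 5) if plist else 5
    plist.append(Entry(next_seq, action, prefix))
    return True


pl = []
results = [add_entry(pl, "permit", "10.1.0.32/32") for _ in range(3)]
```

Only the first of the three identical commands adds a record; the other two are rejected as duplicates.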

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
5 weeks agoMerge pull request #18483 from donaldsharp/holdtime_mistake
Donatas Abraitis [Tue, 25 Mar 2025 07:38:09 +0000 (09:38 +0200)]
Merge pull request #18483 from donaldsharp/holdtime_mistake

bgpd: Fix holdtime not working properly when busy

5 weeks agoMerge pull request #18484 from mjstapp/fix_evpn_rt_cli
Donatas Abraitis [Tue, 25 Mar 2025 07:37:08 +0000 (09:37 +0200)]
Merge pull request #18484 from mjstapp/fix_evpn_rt_cli

bgpd: fix handling of configured route-targets for l2vni, l3vni

5 weeks agobgpd: fix handling of configured RTs for l2vni, l3vni
Mark Stapp [Mon, 24 Mar 2025 20:53:32 +0000 (16:53 -0400)]
bgpd: fix handling of configured RTs for l2vni, l3vni

Test for existing explicit config as part of validation of
route-target configuration: allow explicit config of generic/
default AS+VNI, for example, instead of rejecting it.

Signed-off-by: Mark Stapp <mjs@cisco.com>
5 weeks agoMerge pull request #18447 from donaldsharp/bgp_clear_batch
Russ White [Mon, 24 Mar 2025 20:13:49 +0000 (16:13 -0400)]
Merge pull request #18447 from donaldsharp/bgp_clear_batch

Bgp clear batch

5 weeks agobabeld: add check incorrect AE value for NH TLV.
zmw12306 [Mon, 24 Mar 2025 19:55:08 +0000 (15:55 -0400)]
babeld: add check incorrect AE value for NH TLV.

According to RFC 8966, for NH TLV, AE SHOULD be 1 (IPv4) or 3 (link-local IPv6), and MUST NOT be 0.
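
One way to express that rule is the Python sketch below. It is not babeld's parser: `classify_nh_ae` and its return strings are hypothetical, and how an AE outside the SHOULD set is treated is a policy choice left to the caller.

```python
# Address Encoding values from RFC 8966.
AE_WILDCARD, AE_IPV4, AE_IPV6, AE_IPV6_LL = 0, 1, 2, 3


def classify_nh_ae(ae: int) -> str:
    """Classify the AE field of a Next Hop TLV."""
    if ae == AE_WILDCARD:
        return "reject"        # MUST NOT be 0
    if ae in (AE_IPV4, AE_IPV6_LL):
        return "accept"        # the two values the RFC says SHOULD be used
    return "unexpected"        # anything else: caller decides, e.g. ignore


verdicts = [classify_nh_ae(ae) for ae in (0, 1, 2, 3)]
```
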
Signed-off-by: zmw12306 <zmw12306@gmail.com>
5 weeks agobgpd: Fix holdtime not working properly when busy
Donald Sharp [Mon, 24 Mar 2025 18:28:38 +0000 (14:28 -0400)]
bgpd: Fix holdtime not working properly when busy

Commit:  cc9f21da2218d95567eff1501482ce58e6600f54

Modified the bgp_fsm code to disallow the extension
of the hold time when the system is under extremely
heavy load.  This was an attempt to remove the return
code but it was too aggressive and messed up this bit
of code.

Put the behavior back that was introduced in:
d0874d195d0127009a7d9c06920c52c95319eff9

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agobabeld: Hop Count must not be 0.
zmw12306 [Mon, 24 Mar 2025 19:32:18 +0000 (15:32 -0400)]
babeld: Hop Count must not be 0.

According to RFC 8966:
Hop Count The maximum number of times that this TLV may be forwarded, plus 1. This MUST NOT be 0.
Signed-off-by: zmw12306 <zmw12306@gmail.com>
5 weeks agoeigrpd: Remove unneeded function declaration
Donald Sharp [Mon, 24 Mar 2025 14:27:20 +0000 (10:27 -0400)]
eigrpd: Remove unneeded function declaration

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agozebra: On shutdown call appropriate finish functions
Donald Sharp [Mon, 24 Mar 2025 13:37:25 +0000 (09:37 -0400)]
zebra: On shutdown call appropriate finish functions

The vrf_terminate and route_map_finish functions are not being called and as such
memory was being dropped on shutdown.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoeigrpd: Cleanup memory issues on shutdown
Donald Sharp [Mon, 24 Mar 2025 12:07:02 +0000 (08:07 -0400)]
eigrpd: Cleanup memory issues on shutdown

a) EIGRP was having issues with the prefix created as part
of the topology destination.  Make this just a part of the
topology data structure instead of allocating it.

b) EIGRP was not freeing up any memory associated with
the network table.  Free it.

c) EIGRP was treating zebra shutdown as part of the deletion
of the last eigrp data structure.  This was inappropriate; it
should be part of the `I'm just shutting down` path.

d) The QOBJ was not being properly freed, free it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoeigrpd: Convert eigrp list to a typesafe hash
Donald Sharp [Mon, 24 Mar 2025 11:35:22 +0000 (07:35 -0400)]
eigrpd: Convert eigrp list to a typesafe hash

Convert the eigrp_om->eigrp list to a typesafe hash.
Allow for quicker lookup and all that jazz.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoeigrpd: Convert the eiflist to a typesafe hash
Donald Sharp [Mon, 24 Mar 2025 01:16:56 +0000 (21:16 -0400)]
eigrpd: Convert the eiflist to a typesafe hash

The eigrp->eiflist is a linked list and should just
be a hash instead.  The full conversion to hash-like
functionality is going to wait until the connected
eigrp data structure is created.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoeigrpd: Convert the nbrs list to a typesafe hash
Donald Sharp [Mon, 24 Mar 2025 00:28:59 +0000 (20:28 -0400)]
eigrpd: Convert the nbrs list to a typesafe hash

Convert the ei->nbrs list to a typesafe hash to
facilitate quick lookups.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agolib: expose comparison function to allow a typesafe conversion
Donald Sharp [Mon, 24 Mar 2025 01:17:53 +0000 (21:17 -0400)]
lib: expose comparison function to allow a typesafe conversion

The interface hash comparison function is needed in eigrpd.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoMerge pull request #18473 from zmw12306/Request-TLV
Donald Sharp [Mon, 24 Mar 2025 14:29:36 +0000 (10:29 -0400)]
Merge pull request #18473 from zmw12306/Request-TLV

babeld: Missing Validation for AE=0 and Plen!=0

5 weeks agotests: Handling potential errors gracefully
Y Bharath [Mon, 24 Mar 2025 07:43:22 +0000 (13:13 +0530)]
tests: Handling potential errors gracefully

Handling potential errors gracefully at exa-receive.py

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
5 weeks agoMerge pull request #18467 from cscarpitta/fix/fix_srv6_static_sids_crash_2
Donatas Abraitis [Mon, 24 Mar 2025 12:16:37 +0000 (14:16 +0200)]
Merge pull request #18467 from cscarpitta/fix/fix_srv6_static_sids_crash_2

staticd: Fix a crash that occurs when modifying an SRv6 SID

5 weeks agoMerge pull request #18469 from donaldsharp/fix_update_groups
Donatas Abraitis [Mon, 24 Mar 2025 12:08:19 +0000 (14:08 +0200)]
Merge pull request #18469 from donaldsharp/fix_update_groups

tests: high_ecmp creates 2 update groups

5 weeks agoMerge pull request #18475 from LabNConsulting/chopps/pylint
Donatas Abraitis [Mon, 24 Mar 2025 12:03:17 +0000 (14:03 +0200)]
Merge pull request #18475 from LabNConsulting/chopps/pylint

tests: add another directory to search path for pylint

5 weeks agotests: add another directory to search path for pylint
Christian Hopps [Mon, 24 Mar 2025 05:07:28 +0000 (05:07 +0000)]
tests: add another directory to search path for pylint

Some IDEs (e.g., emacs+lsp) run pylint from the root directory and so
we need to add `tests/topotests` so that `lib` and `munet` are found
by pylint when used in imports

Signed-off-by: Christian Hopps <chopps@labn.net>
5 weeks agobabeld: Missing Validation for AE=0 and Plen!=0
zmw12306 [Mon, 24 Mar 2025 02:37:59 +0000 (22:37 -0400)]
babeld: Missing Validation for AE=0 and Plen!=0

A Request TLV with AE set to 0 and Plen not set to 0 MUST be ignored.
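
As a small Python sketch of that check (illustrative only; `request_tlv_ignored` is a hypothetical name, not a babeld function):

```python
def request_tlv_ignored(ae: int, plen: int) -> bool:
    """Return True when a Request TLV must be ignored.

    AE 0 is the wildcard encoding and carries no prefix, so any
    non-zero Plen alongside it makes the TLV invalid.
    """
    return ae == 0 and plen != 0


cases = {
    (0, 0): request_tlv_ignored(0, 0),     # wildcard request, valid
    (0, 24): request_tlv_ignored(0, 24),   # wildcard with a prefix length
    (1, 24): request_tlv_ignored(1, 24),   # ordinary IPv4 request
}
```
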
Signed-off-by: zmw12306 <zmw12306@gmail.com>
5 weeks agobabeld: Add next hop initialization
zmw12306 [Sun, 23 Mar 2025 23:02:14 +0000 (19:02 -0400)]
babeld: Add next hop initialization

Initialize v4_nh/v6_nh from source address at the beginning of packet parsing
Signed-off-by: zmw12306 <zmw12306@gmail.com>
5 weeks agotests: high_ecmp creates 2 update groups
Donald Sharp [Sun, 23 Mar 2025 21:48:02 +0000 (17:48 -0400)]
tests: high_ecmp creates 2 update groups

The high_ecmp test was creating 2 update groups, where
513 of the neighbors are in 1 and the remaining is in
another.  They should just all be in 1 update group.
Modify the test creation such that interfaces r1-eth514
and r2-eth514 have v4 and v6 addresses.

Signed-off-by: Donald Sharp <donaldsharp72@gmail.com>
5 weeks agotests: Add test case to verify SRv6 SID modify
Carmine Scarpitta [Sun, 23 Mar 2025 15:57:15 +0000 (16:57 +0100)]
tests: Add test case to verify SRv6 SID modify

This commit adds a test case that modifies a SID and verifies that the
RIB is as expected.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
5 weeks agostaticd: Fix crash that occurs when modifying an SRv6 SID
Carmine Scarpitta [Sun, 23 Mar 2025 15:56:52 +0000 (16:56 +0100)]
staticd: Fix crash that occurs when modifying an SRv6 SID

When the user modifies an SRv6 SID and then removes all SIDs, staticd
crashes:

```
2025/03/23 08:37:22.691860 STATIC: lib/memory.c:74: mt_count_free(): assertion (mt->n_alloc) failed
STATIC: Received signal 6 at 1742715442 (si_addr 0x8200007cf0); aborting...
STATIC: zlog_signal+0x390                  fcc704a844b8     ffffd7450390 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: core_handler+0x1f8                 fcc704b79990     ffffd7450590 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC:     ---- signal ----
STATIC: ?                                  fcc705c008f8     ffffd74507a0 linux-vdso.so.1 (mapped at 0xfcc705c00000)
STATIC: pthread_key_delete+0x1a0           fcc70458f1f0     ffffd7451a00 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: raise+0x1c                         fcc70454a67c     ffffd7451ad0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: abort+0xe4                         fcc704537130     ffffd7451af0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: _zlog_assert_failed+0x3c4          fcc704c407c8     ffffd7451c40 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: mt_count_free+0x12c                fcc704a93c74     ffffd7451dc0 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: qfree+0x28                         fcc704a93fa0     ffffd7451e70 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: static_srv6_sid_free+0x1c          adc1df8fa544     ffffd7451e90 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: delete_static_srv6_sid+0x14        adc1df8faafc     ffffd7451eb0 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: list_delete_all_node+0x104         fcc704a60eec     ffffd7451ed0 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: list_delete+0x8c                   fcc704a61054     ffffd7451f00 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: static_srv6_cleanup+0x20           adc1df8fabdc     ffffd7451f20 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: sigint+0x40                        adc1df8be544     ffffd7451f30 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: frr_sigevent_process+0x148         fcc704b79460     ffffd7451f40 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: event_fetch+0x1c4                  fcc704bc0834     ffffd7451f60 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: frr_run+0x650                      fcc704a5d230     ffffd7452080 /usr/lib/frr/libfrr.so.0 (mapped at 0xfcc704800000)
STATIC: main+0x1d0                         adc1df8be75c     ffffd7452270 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
STATIC: __libc_init_first+0x7c             fcc7045373fc     ffffd74522b0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: __libc_start_main+0x98             fcc7045374cc     ffffd74523c0 /lib/aarch64-linux-gnu/libc.so.6 (mapped at 0xfcc704510000)
STATIC: _start+0x30                        adc1df8be0f0     ffffd7452420 /usr/lib/frr/staticd (mapped at 0xadc1df8a0000)
```

Tracking this down, the crash occurs because every time we modify a
SID, staticd executes some callbacks to modify the SID and finally it
calls `apply_finish`, which re-adds the SID to the list `srv6_sids`.

This leads to having the same SID multiple times in the `srv6_sids`
list. When we delete all SIDs, staticd attempts to deallocate the same
SID multiple times, which leads to the crash.

This commit fixes the issue by moving the code that adds the SID to the
list from the `apply_finish` callback to the `create` callback.
This ensures that the SID is inserted into the list only once, when it
is created. For all subsequent modifications, the SID is modified but
not added to the list.
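
The difference between the two behaviours can be modelled with a short Python sketch. It is a toy, not staticd's code: `SidStore`, `readd_on_apply` and the SID address and behavior strings are made up for the example, and the double free is modelled as an exception.

```python
class SidStore:
    """Toy model of a SID list; not staticd's actual data structures."""

    def __init__(self, readd_on_apply: bool):
        self.srv6_sids = []
        self.readd_on_apply = readd_on_apply   # True models the old behaviour

    def create(self, addr: str) -> dict:
        sid = {"addr": addr, "behavior": None}
        if not self.readd_on_apply:
            self.srv6_sids.append(sid)         # fixed: insert once, at creation
        return sid

    def apply_finish(self, sid: dict, behavior: str) -> None:
        sid["behavior"] = behavior
        if self.readd_on_apply:
            self.srv6_sids.append(sid)         # old: insert on every change

    def cleanup(self) -> int:
        """Free every list entry; raise if the same SID is freed twice."""
        freed = set()
        for sid in self.srv6_sids:
            if id(sid) in freed:
                raise RuntimeError("double free of " + sid["addr"])
            freed.add(id(sid))
        self.srv6_sids.clear()
        return len(freed)


def modify_twice_then_cleanup(store: SidStore) -> str:
    sid = store.create("fcbb:bbbb:1::/48")     # illustrative address
    store.apply_finish(sid, "uN")
    store.apply_finish(sid, "uDT6")
    try:
        return f"freed {store.cleanup()}"
    except RuntimeError as exc:
        return str(exc)


old = modify_twice_then_cleanup(SidStore(readd_on_apply=True))
new = modify_twice_then_cleanup(SidStore(readd_on_apply=False))
```

With the old behaviour the list holds the same SID twice and cleanup trips over it; with insertion moved to `create`, cleanup frees it exactly once.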

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
5 weeks agoMerge pull request #18378 from Tuetuopay/fix-route-map-gateway-ip
Donatas Abraitis [Sun, 23 Mar 2025 10:38:38 +0000 (12:38 +0200)]
Merge pull request #18378 from Tuetuopay/fix-route-map-gateway-ip

bgpd: fix `set evpn gateway-ip ipv[46]` route-map

5 weeks agoMerge pull request #18339 from y-bharath14/srib-tests-v3
Donald Sharp [Sat, 22 Mar 2025 23:53:53 +0000 (19:53 -0400)]
Merge pull request #18339 from y-bharath14/srib-tests-v3

tests: Corrected typo at path_attributes.py

5 weeks agoMerge pull request #18446 from louis-6wind/test_bfd_static_vrf
Donatas Abraitis [Sat, 22 Mar 2025 10:21:30 +0000 (12:21 +0200)]
Merge pull request #18446 from louis-6wind/test_bfd_static_vrf

tests: add bfd_static_vrf

5 weeks agoMerge pull request #18452 from donaldsharp/bmp_changes
Donatas Abraitis [Sat, 22 Mar 2025 10:20:18 +0000 (12:20 +0200)]
Merge pull request #18452 from donaldsharp/bmp_changes

tests: Change up start order of bmp tests

5 weeks agotests: Change up start order of bmp tests
Donald Sharp [Fri, 21 Mar 2025 22:08:25 +0000 (18:08 -0400)]
tests: Change up start order of bmp tests

Currently the tests appear to do this:
a) Start the neighbors
b) Start the bmp server connection
c) Look for the neighbors up
d) Look for the neighbor up messages in the bmp log

This is not great from a testing perspective in that
even though we started a) first it may not happen
until after b) happens.  Or even worse if it is
partially up ( 1 of the 2 peers ) then the dump
will have the neighbor connecting after parts
of the table.  This doesn't work too well because
the SEQ number is something that is kept and compared
against to make sure only new data is being looked at.

Let's modify the startup configuration to start
the bmp server first and then have a delayopen
on the bgp neighbor statements so that the bmp
peering can come up first.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agoMerge pull request #18442 from y-bharath14/srib-yang-v6
Donald Sharp [Fri, 21 Mar 2025 19:08:45 +0000 (15:08 -0400)]
Merge pull request #18442 from y-bharath14/srib-yang-v6

yang: Code inline with RFC 8407 rules

5 weeks agoMerge pull request #18359 from soumyar-roy/soumya/streamsize
Mark Stapp [Fri, 21 Mar 2025 15:30:16 +0000 (11:30 -0400)]
Merge pull request #18359 from soumyar-roy/soumya/streamsize

zebra: zebra crash for zapi stream

5 weeks agoMerge pull request #17986 from dmytroshytyi-6WIND/fix-static-30-01-2025
Donatas Abraitis [Fri, 21 Mar 2025 10:19:50 +0000 (12:19 +0200)]
Merge pull request #17986 from dmytroshytyi-6WIND/fix-static-30-01-2025

lib: fix static analysis error

5 weeks agoMerge pull request #18277 from y-bharath14/srib-tests-v2
Donatas Abraitis [Fri, 21 Mar 2025 10:13:13 +0000 (12:13 +0200)]
Merge pull request #18277 from y-bharath14/srib-tests-v2

tests: Catch specific exceptions

5 weeks agotests: add bfd_static_vrf
Louis Scalbert [Wed, 19 Mar 2025 14:51:30 +0000 (15:51 +0100)]
tests: add bfd_static_vrf

Add bfd_static_vrf to test BFD tracking of static routes in VRF.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
5 weeks agoMerge pull request #18330 from usrivastava-nvidia/master
Jafar Al-Gharaibeh [Thu, 20 Mar 2025 21:28:05 +0000 (16:28 -0500)]
Merge pull request #18330 from usrivastava-nvidia/master

pimd: Skip RPF check for SA message from mesh group peer

5 weeks agoMerge pull request #18409 from donaldsharp/typesafe_zclient
Russ White [Thu, 20 Mar 2025 16:48:47 +0000 (12:48 -0400)]
Merge pull request #18409 from donaldsharp/typesafe_zclient

Typesafe zclient

5 weeks agopimd:Skip RPF check for SA message received from the MSDP mesh group peers
usrivastava-nvidia [Fri, 7 Mar 2025 06:05:52 +0000 (06:05 +0000)]
pimd:Skip RPF check for SA message received from the MSDP mesh group peers

Signed-off-by: Utkarsh Srivastava <usrivastava@nvidia.com>
5 weeks agopimd:Setting the flag PIM_MSDP_PEERF_IN_GROUP for MSDP mesh group peers
usrivastava-nvidia [Fri, 7 Mar 2025 06:05:06 +0000 (06:05 +0000)]
pimd:Setting the flag PIM_MSDP_PEERF_IN_GROUP for MSDP mesh group peers

Signed-off-by: Utkarsh Srivastava <usrivastava@nvidia.com>
5 weeks agozebra: reduce memory usage by streams when redistributing routes
Soumya Roy [Fri, 14 Mar 2025 22:01:51 +0000 (22:01 +0000)]
zebra: reduce memory usage by streams when redistributing routes

This commit undoes 8c9b007a0c7efb2e9afc2eac936ba9dd971c6707.
The stream lib has been modified to expand the stream if needed.
Now, for zapi route encode, we use an expandable stream.

Signed-off-by: Soumya Roy <souroy@nvidia.com>
5 weeks agozebra: zebra crash for zapi stream
Soumya Roy [Fri, 14 Mar 2025 21:56:48 +0000 (21:56 +0000)]
zebra: zebra crash for zapi stream

Issue:
If static route is created with a BGP route as nexthop, which
recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode
fails, as there is not enough memory allocated for stream. This causes
assert/core dump in zebra. Right now we allocate fixed memory
of ZEBRA_MAX_PACKET_SIZ size.

Fix:
1)Dynamically calculate required memory size for the stream
2)try to optimize memory usage

Testing:
No crash happens anymore with the fix
r1#
r1# sharp install routes 2100:cafe:: nexthop 2001:db8::1 1000
r1#

r2# conf
r2(config)# ipv6 route 2503:feca::100/128 2100:cafe::1
r2(config)# exit
r2#

Signed-off-by: Soumya Roy <souroy@nvidia.com>
5 weeks agotests: Add staticd/ospfd/ospf6d/pimd for high ecmp
Soumya Roy [Fri, 14 Mar 2025 21:48:20 +0000 (21:48 +0000)]
tests: Add staticd/ospfd/ospf6d/pimd for high ecmp

Signed-off-by: Soumya Roy <souroy@nvidia.com>
5 weeks agolib: Add support for stream buffer to expand
Soumya Roy [Fri, 14 Mar 2025 21:44:39 +0000 (21:44 +0000)]
lib: Add support for stream buffer to expand

Issue:
 Currently, at encode time, if the required memory is
 more than the available space in the stream buffer, the stream
 buffer can't be expanded. This fix introduces new APIs to support
 stream buffer expansion.

 Testing:
 Tested with zebra nexthop encoding with 512 nexthops, which triggers
 the new code; it works fine. Without the fix, the same trigger
 asserts.
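
The behaviour can be illustrated with a small Python model. It is not the lib/stream API: the `Stream` class, its `expandable` flag and the doubling growth policy are assumptions of this sketch, and the 16-byte record merely stands in for one encoded nexthop.

```python
class Stream:
    """Toy write-only byte stream with an optional grow-on-demand mode."""

    def __init__(self, size: int, expandable: bool = False):
        self.data = bytearray(size)
        self.endp = 0                      # next write position
        self.expandable = expandable

    def put(self, payload: bytes) -> None:
        needed = self.endp + len(payload)
        if needed > len(self.data):
            if not self.expandable:
                raise OverflowError("not enough space in stream")
            # Grow to at least what is needed; doubling limits reallocations.
            new_size = max(needed, 2 * len(self.data))
            self.data.extend(bytes(new_size - len(self.data)))
        self.data[self.endp:needed] = payload
        self.endp = needed


grow = Stream(16, expandable=True)
for _ in range(512):                       # 512 nexthops, 16 bytes each
    grow.put(b"\x00" * 16)

fixed = Stream(16)
try:
    fixed.put(b"\x00" * 17)
    fixed_failed = False
except OverflowError:
    fixed_failed = True
```
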

Signed-off-by: Soumya Roy <souroy@nvidia.com>
5 weeks agobgpd: Tie in more clear events to clear code
Donald Sharp [Wed, 12 Mar 2025 16:37:21 +0000 (12:37 -0400)]
bgpd: Tie in more clear events to clear code

The `clear bgp *` and the interface down events
cause a global clearing of data from the bgp rib.
Let's tie those into the clear peer code such
that we can take advantage of the reduced load
in these cases too.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
5 weeks agobgpd: Allow batch clear to do partial work and continue later
Mark Stapp [Wed, 12 Mar 2025 13:55:53 +0000 (09:55 -0400)]
bgpd: Allow batch clear to do partial work and continue later

Modify the batch clear code to be able to stop after processing
some of the work and to pick back up again.  This will allow
the very expensive nature of the batch clearing to be spread out
and allow bgp to continue to be responsive.
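
Stopping and resuming can be sketched in Python with a saved cursor and a per-pass budget. The names `clear_batch`, `state` and the budget of 4 are hypothetical and only illustrate the pattern, not the bgpd implementation.

```python
def clear_batch(state: dict, budget: int) -> bool:
    """Clear up to `budget` routes; return True once everything is done."""
    end = min(state["cursor"] + budget, len(state["routes"]))
    for i in range(state["cursor"], end):
        state["cleared"].append(state["routes"][i])
    state["cursor"] = end                  # remember where to pick back up
    return end == len(state["routes"])


state = {"routes": list(range(10)), "cursor": 0, "cleared": []}
passes = 0
done = False
while not done:
    # In a daemon, other events would get a chance to run between passes.
    done = clear_batch(state, budget=4)
    passes += 1
```
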

Signed-off-by: Mark Stapp <mjs@cisco.com>
5 weeks agobgpd: fix evpn attributes being dropped on input
Tuetuopay [Mon, 17 Mar 2025 14:08:15 +0000 (15:08 +0100)]
bgpd: fix evpn attributes being dropped on input

All assignments of the EVPN attributes (ESI and Gateway IP) are gated
behind the peer being set up for inbound soft-reconfiguration.

There is no actual reason for this limitation, so let's perform the
EVPN attribute assignment regardless, including when soft
reconfiguration is not enabled.

Fixes: 6e076ba5231 ("bgpd: Fix for ain->attr corruption during path update")
Signed-off-by: Tuetuopay <tuetuopay@me.com>
5 weeks agoyang: Code inline with RFC 8407 rules
Y Bharath [Thu, 20 Mar 2025 06:41:46 +0000 (12:11 +0530)]
yang: Code inline with RFC 8407 rules

Code inline with RFC 8407 rules

Signed-off-by: y-bharath14 <y.bharath@samsung.com>
5 weeks agoMerge pull request #18325 from chdxD1/topotests/evpn-multipath-flap
Jafar Al-Gharaibeh [Thu, 20 Mar 2025 04:14:38 +0000 (23:14 -0500)]
Merge pull request #18325 from chdxD1/topotests/evpn-multipath-flap

topotests: Add EVPN RT5 multipath flap test

5 weeks agoMerge pull request #18431 from donaldsharp/fpm_listener_reject
Jafar Al-Gharaibeh [Thu, 20 Mar 2025 04:00:47 +0000 (23:00 -0500)]
Merge pull request #18431 from donaldsharp/fpm_listener_reject

Fpm listener reject

5 weeks agoMerge pull request #18435 from donaldsharp/fix_valgrind_found_memory_leak_in_bgp
Jafar Al-Gharaibeh [Thu, 20 Mar 2025 03:58:37 +0000 (22:58 -0500)]
Merge pull request #18435 from donaldsharp/fix_valgrind_found_memory_leak_in_bgp

bgpd: Fix leaked memory when showing some bgp routes

5 weeks agoMerge pull request #18432 from donaldsharp/fix_topotest_to_wait_for_zebra_connection
Jafar Al-Gharaibeh [Thu, 20 Mar 2025 03:55:31 +0000 (22:55 -0500)]
Merge pull request #18432 from donaldsharp/fix_topotest_to_wait_for_zebra_connection

Fix topotest to wait for zebra connection