path: root/lib/vrf.c
Age | Commit message | Author
2024-04-16 | lib, zebra: fix exit commands | Igor Ryzhov
If a command is not marked as `YANG`-converted, the current command batching buffer is flushed before executing the command. We shouldn't flush the buffer when executing an `exit` command. It should only be flushed if the next command is not `YANG`-converted, which is checked by the command itself, not the previous `exit`. Fixes #15706. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2024-03-15 | zebra: fix route deletion during zebra shutdown | Alexander Skorichenko
Split zebra's vrf_terminate() into disable() and delete() stages. The former enqueues all events for the dplane thread. Memory freeing is performed in the second stage. Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
2024-02-04 | lib, mgmtd: don't register NB config callbacks in mgmtd | Igor Ryzhov
mgmtd is supposed to only register CLI callbacks. If configuration callbacks are registered, they are getting called on startup when mgmtd reads config files, and they can use infrastructure that is not initialized on mgmtd, or allocate some memory that is never freed. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2024-02-02 | lib: fix "no vrf" command | Igor Ryzhov
Remove operational data check from CLI command. It never works in mgmtd and it is not needed in backend daemons because it's done in `lib_vrf_destroy` callback. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2024-01-28 | zebra: convert vrf configuration output to NB | Igor Ryzhov
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2024-01-04 | *: Remove sys/ioctl.h from zebra.h | Donald Sharp
Practically no-one uses this and ioctls are pretty much wrappered. Further wrappering could make this even better. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-12-28 | zebra: support yielding between oper state routes query | Christian Hopps
Signed-off-by: Christian Hopps <chopps@labn.net>
2023-11-29 | lib: all: remove './' from xpath 22% speedup | Christian Hopps
fixes #8299 Signed-off-by: Christian Hopps <chopps@labn.net>
2023-06-26 | *: Rearrange vrf_bitmap_X api to reduce memory footprint | Donald Sharp
When running all daemons with config for most of them, FRR has

    sharpd@janelle:~/frr$ vtysh -c "show debug hashtable" | grep "VRF BIT HASH" | wc -l
    3570

3570 hashes for bitmaps associated with the vrf. This is a very large number of hashes. Let's do two things:

a) Reduce the created size of the actually created hashes to 2 instead of 32.
b) Delay generation of the hash *until* a set operation happens, since the absence of a hash directly implies an unset value if/when checked.

This reduces the number of hashes to 61 in my setup for normal operation.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
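The delayed-allocation idea in (b) can be sketched as follows. This is a minimal illustration, not FRR's `vrf_bitmap` code: it swaps the hash for a plain bit array, and `struct lazy_bitmap` and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical lazily allocated bitmap; a plain bit array, not FRR's hash. */
struct lazy_bitmap {
	uint8_t *bits; /* NULL until the first set operation */
	size_t nbits;  /* number of valid bit positions */
};

/* Set a bit, allocating storage on first use. Returns false on failure. */
bool lazy_bitmap_set(struct lazy_bitmap *bm, size_t bit)
{
	assert(bm != NULL);
	if (bit >= bm->nbits)
		return false;
	if (bm->bits == NULL) {
		bm->bits = calloc((bm->nbits + 7) / 8, 1);
		if (bm->bits == NULL)
			return false;
	}
	bm->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	return true;
}

/* No storage means nothing was ever set, so every bit reads as unset. */
bool lazy_bitmap_check(const struct lazy_bitmap *bm, size_t bit)
{
	if (bm == NULL || bm->bits == NULL || bit >= bm->nbits)
		return false;
	return (bm->bits[bit / 8] >> (bit % 8)) & 1u;
}

/* Release the storage and return to the "never set" state. */
void lazy_bitmap_free(struct lazy_bitmap *bm)
{
	assert(bm != NULL);
	free(bm->bits);
	bm->bits = NULL;
}
```

A bitmap that is only ever checked never allocates anything, which is the effect the commit describes for the many per-VRF bitmaps that stay empty.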
2023-03-21 | *: Add a hash_clean_and_free() function | Donald Sharp
Add a hash_clean_and_free() function as well as convert the code to use it. This function also takes a double pointer to the hash to set it NULL. Also it cleanly does nothing if the pointer is NULL (as a bunch of code tested for). Signed-off-by: Donald Sharp <sharpd@nvidia.com>
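The double-pointer pattern described above can be sketched like this. `struct table` and `table_clean_and_free()` are hypothetical stand-ins; the real `hash_clean_and_free()` operates on FRR's own hash type and may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash table; not FRR's struct hash. */
struct table {
	void **items;
	size_t count;
};

/*
 * Release every item, free the table itself, and set the caller's
 * pointer to NULL. A NULL table is silently accepted.
 */
void table_clean_and_free(struct table **tp, void (*free_item)(void *))
{
	assert(tp != NULL);
	if (*tp == NULL)
		return;
	for (size_t i = 0; i < (*tp)->count; i++)
		if (free_item != NULL)
			free_item((*tp)->items[i]);
	free((*tp)->items);
	free(*tp);
	*tp = NULL;
}
```

Because the callee clears the pointer, callers no longer need their own `if (ptr) { ...; ptr = NULL; }` wrapper, and a second call on the same pointer is harmless.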
2023-02-09 | *: auto-convert to SPDX License IDs | David Lamparter
Done with a combination of regex'ing and banging my head against a wall. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-01-31 | lib: Add missing enum's to switch statement | Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-27 | *: fix non-const northbound XPath format strings | David Lamparter
Passing a pre-formatted buffer in these places needs a `"%s"` in front so it doesn't get formatted twice. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
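A small sketch of why the `"%s"` is needed. `set_xpath()` is a hypothetical printf-style function standing in for any API that takes a format string; it is not a northbound function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style setter; stands in for a format-string API. */
int set_xpath(char *dst, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(dst != NULL && len > 0);
	va_start(ap, fmt);
	n = vsnprintf(dst, len, fmt, ap);
	va_end(ap);
	return n;
}

/* Copy an already formatted XPath verbatim, even if it contains '%'. */
int set_preformatted_xpath(char *dst, size_t len, const char *xpath)
{
	/*
	 * Calling set_xpath(dst, len, xpath) here would interpret any '%'
	 * in xpath a second time; "%s" passes it through untouched.
	 */
	return set_xpath(dst, len, "%s", xpath);
}
```

Compilers can also flag the unsafe form (for example with `-Wformat-security` in GCC and Clang), which is consistent with the "non-const format string" wording in the commit title.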
2023-01-19 | lib: remove dead logic code | Rafael Zalamena
If we got inside the condition of `vrfp->status == VRF_ACTIVE` then don't make the same check again. Found by Coverity Scan (CID 1519760) Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-11-22 | lib: disable vrf before terminating interfaces | Stephen Worley
We must disable the vrf before we start terminating interfaces. On termination, we free the 'zebra_if' struct from the interface ->info pointer. We rely on that for subsystems like vxlan for cleanup when shutting down.

'''
==497406== Invalid read of size 8
==497406==    at 0x47E70A: zebra_evpn_del (zebra_evpn.c:1103)
==497406==    by 0x47F004: zebra_evpn_cleanup_all (zebra_evpn.c:1363)
==497406==    by 0x4F2404: zebra_evpn_vxlan_cleanup_all (zebra_vxlan.c:1158)
==497406==    by 0x4917041: hash_iterate (hash.c:267)
==497406==    by 0x4F25E2: zebra_vxlan_cleanup_tables (zebra_vxlan.c:5676)
==497406==    by 0x4D52EC: zebra_vrf_disable (zebra_vrf.c:209)
==497406==    by 0x49A247F: vrf_disable (vrf.c:340)
==497406==    by 0x49A2521: vrf_delete (vrf.c:245)
==497406==    by 0x49A2E2B: vrf_terminate_single (vrf.c:533)
==497406==    by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406==    by 0x441240: sigint (main.c:192)
==497406==    by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406==  Address 0x6d68c68 is 200 bytes inside a block of size 272 free'd
==497406==    at 0x48470E4: free (vg_replace_malloc.c:872)
==497406==    by 0x4942CF0: qfree (memory.c:141)
==497406==    by 0x49196A9: if_delete (if.c:293)
==497406==    by 0x491C54C: if_terminate (if.c:1031)
==497406==    by 0x49A2E22: vrf_terminate_single (vrf.c:532)
==497406==    by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406==    by 0x441240: sigint (main.c:192)
==497406==    by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406==    by 0x499A5F0: thread_fetch (thread.c:1775)
==497406==    by 0x492850E: frr_run (libfrr.c:1197)
==497406==    by 0x441746: main (main.c:476)
==497406==  Block was alloc'd at
==497406==    at 0x4849464: calloc (vg_replace_malloc.c:1328)
==497406==    by 0x49429A5: qcalloc (memory.c:116)
==497406==    by 0x491D971: if_new (if.c:174)
==497406==    by 0x491ACC8: if_create_name (if.c:228)
==497406==    by 0x491ABEB: if_get_by_name (if.c:613)
==497406==    by 0x427052: netlink_interface (if_netlink.c:1178)
==497406==    by 0x43BC18: netlink_parse_info (kernel_netlink.c:1188)
==497406==    by 0x4266D7: interface_lookup_netlink (if_netlink.c:1288)
==497406==    by 0x42B634: interface_list (if_netlink.c:2368)
==497406==    by 0x4ABF83: zebra_ns_enable (zebra_ns.c:127)
==497406==    by 0x4AC17E: zebra_ns_init (zebra_ns.c:216)
==497406==    by 0x44166C: main (main.c:408)
'''

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-01-17 | Merge pull request #10183 from idryzhov/rework-vrf-rename | Rafael Zalamena
*: rework renaming the default VRF
2021-12-21 | *: rework renaming the default VRF | Igor Ryzhov
Currently, it is possible to rename the default VRF either by passing `-o` option to zebra or by creating a file in `/var/run/netns` and binding it to `/proc/self/ns/net`.

In both cases, only zebra knows about the rename and other daemons learn about it only after they connect to zebra. This is a problem, because daemons may read their config before they connect to zebra. To handle this rename after the config is read, we have some special code in every single daemon, which is not very bad but not desirable in my opinion. But things are getting worse when we need to handle this in northbound layer as we have to manually rewrite the config nodes. This approach is already hacky, but still works as every daemon handles its own NB structures. But it is completely incompatible with the central management daemon architecture we are aiming for, as mgmtd doesn't even have a connection with zebra to learn from it. And it shouldn't have it, because operational state changes should never affect configuration.

To solve the problem and simplify the code, I propose to expand the `-o` option to all daemons. By using the startup option, we let daemons know about the rename before they read their configs so we don't need any special code to deal with it. There's an easy way to pass the option to all daemons by using `frr_global_options` variable.

Unfortunately, the second way of renaming by creating a file in `/var/run/netns` is incompatible with the new mgmtd architecture. Theoretically, we could force daemons to read their configs only after they connect to zebra, but it means adding even more code to handle a very specific use-case. And anyway this won't work for mgmtd as it doesn't have a connection with zebra. So I had to remove this option.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-12-14 | lib: default VRF may not exist on early exit | David Lamparter
If we're exiting before we finished initializing, we can end up trying to shut down a NULL vrf here. Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2021-11-23 | lib, yang: remove vrf from the interface list key | Igor Ryzhov
This is needed for the following two reasons:

1. To be able to remove the northbound HACK in if_update_to_new_vrf. It is totally wrong to rewrite the configuration datastore when some operational state changes. It is a hard blocker for storing a configuration data in a management daemon which knows nothing about the operational state.
2. To allow changing the VRF of the interface using FRR CLI or any other frontend in the future. If the VRF is a part of the key, it can't be changed. If the VRF is a simple leaf, it becomes possible to change it and thus move the interface between VRFs. For now I mark the leaf as a "config false" as it's not yet possible to control it from FRR.

But we can't simply remove the VRF from the key, because it is needed to distinguish interfaces when using netns based VRFs, as it is possible to have multiple interfaces with the same name in different namespaces. To handle this, I came up with an idea to store both VRF and an interface name in the "name" leaf using the pattern "vrfname:ifname". For example, if there's an interface "eth0" in VRF "red" then its "name" leaf will be "red:eth0".

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
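The "vrfname:ifname" pattern can be illustrated with a pair of helpers. This is a sketch only: `ifkey_build()` and `ifkey_split()` are hypothetical names, and splitting at the first `:` assumes VRF names themselves contain no colon.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the combined key, e.g. ("red", "eth0") -> "red:eth0".
 * Returns 0 on success, -1 if the result does not fit in dst. */
int ifkey_build(char *dst, size_t len, const char *vrf, const char *ifname)
{
	int n;

	assert(dst != NULL && vrf != NULL && ifname != NULL);
	n = snprintf(dst, len, "%s:%s", vrf, ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Split a combined key at the first ':' (assumption: no ':' in VRF names).
 * Returns 0 on success, -1 if there is no separator or a buffer is too small. */
int ifkey_split(const char *key, char *vrf, size_t vrflen, char *ifname,
		size_t iflen)
{
	const char *sep = strchr(key, ':');
	size_t n;

	if (sep == NULL)
		return -1;
	n = (size_t)(sep - key);
	if (n >= vrflen || strlen(sep + 1) >= iflen)
		return -1;
	memcpy(vrf, key, n);
	vrf[n] = '\0';
	strcpy(ifname, sep + 1); /* length was checked above */
	return 0;
}
```

With this encoding two interfaces named "eth0" in different namespaces get distinct list keys ("red:eth0" and "blue:eth0") without the VRF being a separate key leaf.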
2021-11-22 | *: cleanup ifp->vrf_id | Igor Ryzhov
Since f60a1188 we store a pointer to the VRF in the interface structure. There's no need anymore to store a separate vrf_id field. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-11-11 | lib: fix vrf deletion when the last interface is deleted | Igor Ryzhov
Currently, we automatically delete an inactive VRF when its last interface is deleted. This code introduces a couple of crashes because of the following problems:

- vrf_delete is called before calling if_del hook, so daemons may try to dereference an ifp->vrf pointer which is freed
- in if_terminate, we continue to use the VRF in the loop condition after the last interface is deleted

This check is needed only when the interface is deleted by the user, because if the interface is deleted by the system, VRF must still exist in the system. Move the check to appropriate places to fix crashes.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-11-03 | lib: fix crash when terminating inactive VRFs | Igor Ryzhov
If the VRF is not enabled, if_terminate deletes the VRF after the last interface is removed from it. Therefore daemons crash on the subsequent call to vrf_delete. We should call vrf_delete only for enabled VRFs. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-11-03 | zebra: fix stale pointer when netns is deleted | Igor Ryzhov
When the netns is deleted, we should always clear the vrf->ns_ctxt pointer. Currently, it is not cleared when there are interfaces in the netns at the time of deletion. If the netns is re-created, zebra crashes because it tries to use the stale pointer. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-10-26 | Merge pull request #9846 from idryzhov/lib-zebra-netns | Mark Stapp
lib: move zebra-only netns stuff to zebra
2021-10-19 | lib: allow to create interfaces in non-existing VRFs | Igor Ryzhov
It allows FRR to read the interface config even when the necessary VRFs are not yet created and interfaces are in "wrong" VRFs. Currently, such config is rejected.

For VRF-lite backend, we don't care at all about the VRF of the inactive interface. When the interface is created in the OS and becomes active, we always use its actual VRF instead of the configured one. So there's no need to reject the config.

For netns backend, we may have multiple interfaces with the same name in different VRFs. So we care about the VRF of inactive interfaces. And we must allow preconfiguring the interface in a VRF even before it is moved to the corresponding netns. From now on, we allow creating multiple configs for the same interface name in different VRFs and the necessary config is applied once the OS interface is moved to the corresponding netns.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-10-19 | lib: move zebra-only netns stuff to zebra | Igor Ryzhov
When something is used only from zebra and part of its description is "should be called from zebra only" then it belongs to zebra, not lib. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-09-07 | vrf_name_to_id(): remove | G. Paul Ziemba
vrf_name_to_id() returned VRF_DEFAULT when the vrf name was unknown, hiding errors. Per community recommendation, vrf_name_to_id() is now removed and the few callers now use vrf_lookup_by_name() directly. Signed-off-by: G. Paul Ziemba <paulz@labn.net>
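The error-hiding problem can be shown with a toy lookup table. Everything here is hypothetical (`vrf_entry_lookup()`, `name_to_id_lossy()`, the table contents); the sketch only contrasts a lookup that can report "not found" with one that maps unknown names to the default id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical VRF table entry, for illustration only. */
struct vrf_entry {
	const char *name;
	unsigned int id;
};

static const struct vrf_entry vrf_table[] = {
	{ "default", 0 },
	{ "red", 10 },
};

/* Returns NULL for an unknown name, so the caller has to handle the error. */
const struct vrf_entry *vrf_entry_lookup(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(vrf_table) / sizeof(vrf_table[0]); i++)
		if (strcmp(vrf_table[i].name, name) == 0)
			return &vrf_table[i];
	return NULL;
}

/* The removed pattern: an unknown name silently becomes the default id 0. */
unsigned int name_to_id_lossy(const char *name)
{
	const struct vrf_entry *v = vrf_entry_lookup(name);

	return v != NULL ? v->id : 0;
}
```

A misspelled name such as "rde" yields id 0 from the lossy helper, indistinguishable from the default VRF, while the pointer-returning lookup makes the failure visible.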
2021-09-02 | lib: Remove unused function vrf_generate_id | Donald Sharp
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-08-26 | lib: remove unused argument from vrf_cmd_init | Igor Ryzhov
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-08-23 | lib, zebra: move vrf netns commands from lib to zebra | Igor Ryzhov
"[no] netns NAME" commands are part of the lib, but they are actually zebra-only:

- they are using vrf_netns_handler_create and its description clearly says that it "should be called from zebra only"
- vtysh sends these commands only to zebra
- only zebra outputs the netns related config
- zebra notifies other daemons about netns attachment

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-07-30 | lib, zebra: Preserve user-configured VRF on netns deletion | Xiao Liang
Don't clear VRF's user-configured flag when netns is deleted. Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
2021-06-21 | lib: remove vrf-interface config when removing the VRF | Igor Ryzhov
If we have the following configuration:

```
vrf red
 smth
exit-vrf
!
interface red vrf red
 smth
```

And we delete the VRF using "no vrf red" command, we end up with:

```
interface red
 smth
```

Interface config is preserved but moved to the default VRF. This is not an expected behavior. We should remove the interface config when the VRF is deleted.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-06-11 | lib: terminate default vrf last | Stephen Worley
Always terminate default VRF last during FRR shutdown. On shutdown we were simply looping over the RB tree and terminating VRFs from the ROOT. That order is not guaranteed to leave the default VRF for last. Instead switch to RB_SAFE and skip the default VRF till the very end. Signed-off-by: Stephen Worley <sworley@nvidia.com>
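The "skip the default, then handle it at the end" loop can be sketched over a plain array. This is not the RB-tree code from lib/vrf.c; `termination_order()` and `DEFAULT_ID` are hypothetical, and the sketch also tolerates a missing default VRF.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_ID 0u /* assumption: the default VRF has id 0 */

/*
 * Copy ids into order[] in termination order: every non-default VRF
 * first, in iteration order, and the default VRF (if present) last.
 * order[] must have room for n entries. Returns the number written.
 */
size_t termination_order(const unsigned int *ids, size_t n,
			 unsigned int *order)
{
	size_t out = 0;
	int have_default = 0;

	assert(ids != NULL && order != NULL);
	for (size_t i = 0; i < n; i++) {
		if (ids[i] == DEFAULT_ID) {
			have_default = 1; /* defer until the very end */
			continue;
		}
		order[out++] = ids[i];
	}
	if (have_default)
		order[out++] = DEFAULT_ID;
	return out;
}
```

In the real tree walk the same shape appears as a deletion-safe iteration that skips the default entry, followed by one explicit call for the default VRF.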
2021-06-04 | zebra: fix config after exit from vrf | Igor Ryzhov
When the VRF node is exited using "exit" or "quit", there's still a VRF pointer stored in the vty context. If you try to configure some router related command, it will be applied to the previous VRF instead of the default VRF. For example:

```
(config)# vrf test
(config-vrf)# ip router-id 1.1.1.1
(config-vrf)# do show run
...
!
vrf test
 ip router-id 1.1.1.1
exit-vrf
!
...
(config-vrf)# exit
(config)# ip router-id 2.2.2.2
(config)# do show run
...
!
vrf test
 ip router-id 2.2.2.2
exit-vrf
!
...
```

`exit-vrf` works correctly, because it stores a pointer to the default VRF into the vty context (but weirdly keeping the VRF_NODE instead of changing it to CONFIG_NODE). Instead of relying on the behavior of exit function, always use the default VRF when in CONFIG_NODE.

Another problem is missing `VTY_CHECK_CONTEXT`. If someone deletes the VRF whose node the user is in when entering the command, then zebra applies the command to the default VRF instead of throwing an error.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-06-02 | Merge pull request #8210 from LabNConsulting/chopps/always-batch | Donald Sharp
northbound: KISS always batch yang config, it's faster.
2021-06-02 | northbound: KISS always batch yang config (file read), it's faster | Christian Hopps
The backoff code assumed that yang operations always completed quickly. It checked for > 100 YANG modeled commands happening in under 1 second to enable batching. If 100 yang modeled commands always take longer than 1 second batching is never enabled. This is the exact opposite of what we want to happen since batching speeds the operations up.

Here are the results for libyang2 code without and with batching.

| action        | 1K rts  | 2K rts  | 1K rts | 2K rts | 20k rts |
|               | nobatch | nobatch | batch  | batch  | batch   |
| Add IPv4      | .881    | 1.28    | .703   | 1.04   | 8.16    |
| Add Same IPv4 | 28.7    | 113     | .590   | .860   | 6.09    |
| Rem 1/2 IPv4  | .376    | .442    | .379   | .435   | 1.44    |
| Add Same IPv4 | 28.7    | 113     | .576   | .841   | 6.02    |
| Rem All IPv4  | 17.4    | 71.8    | .559   | .813   | 5.57    |

(IPv6 numbers are basically the same as IPv4, a couple percent slower)

Clearly we need this. Please note the growth (1K to 2K) w/o batching is non-linear and 100 times slower than batched.

Notes on code: The use of the new `nb_cli_apply_changes_clear_pending` is to commit any pending changes (including the current one). This is done when the code would not correctly handle a single diff that included the current changes with possible following changes. For example, a "no" command followed by a new value to replace it would be merged into a change, and the code would not deal well with that. A good example of this is BGP neighbor peer-group changing. The other use is after entering a router level (e.g., "router bgp") where the follow-on command handlers expect that router object to now exist. The code eventually needs to be cleaned up to not fail in these cases, but that is for future NB cleanup.

Signed-off-by: Christian Hopps <chopps@labn.net>
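The flaw in the removed backoff heuristic can be restated as a predicate. This is a paraphrase of the commit message, not the northbound code: both function names are hypothetical, and only the "> 100 commands in under 1 second" thresholds come from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Paraphrase of the removed heuristic: batching is enabled only when
 * more than 100 YANG-modeled commands ran in under one second.
 */
bool old_backoff_enables_batching(unsigned int cmds, double elapsed_sec)
{
	return cmds > 100 && elapsed_sec < 1.0;
}

/* The replacement policy from the commit title: always batch. */
bool new_policy_enables_batching(unsigned int cmds, double elapsed_sec)
{
	(void)cmds;
	(void)elapsed_sec;
	return true;
}
```

When unbatched commands are slow, the elapsed time is always above the threshold, so the old predicate never turns batching on in exactly the situation where it would help most.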
2021-05-31 | lib: fix binding to a vrf | Igor Ryzhov
There are two possible use-cases for the `vrf_bind` function:

- bind socket to an interface in a vrf
- bind socket to a vrf device

For the former case, there's one problem - success is returned when the interface is not found. In that case, the socket is left unbound without throwing an error.

For the latter case, there are multiple possible problems:

- If the name is not set, then the socket is left unbound (zebra, vrrp).
- If the name is "default" and there's an interface with that name in the default VRF, then the socket is bound to that interface.
- In most daemons, if the router is configured before the VRF is actually created, we're trying to open and bind the socket right after the daemon receives a VRF registration from zebra. We may not receive the VRF-interface registration from zebra yet at that point. Therefore, `if_lookup_by_name` fails, and the socket is left unbound.

This commit fixes all the issues and updates the function description.

Suggested-by: Pat Ruddy <pat@voltanet.io>
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
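The "missing interface must be an error" point can be sketched with standard socket calls. `bind_to_device()` is a hypothetical helper, not FRR's `vrf_bind`; it uses `if_nametoindex()` and the Linux `SO_BINDTODEVICE` socket option, and the choice of errno values is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: bind fd to the named device.
 * Returns 0 on success, -1 with errno set on any failure, including a
 * device that does not exist (instead of leaving the socket unbound).
 */
int bind_to_device(int fd, const char *ifname)
{
	assert(ifname != NULL);
	if (ifname[0] == '\0') {
		errno = EINVAL; /* an empty name is a caller error */
		return -1;
	}
	if (if_nametoindex(ifname) == 0) {
		errno = ENODEV; /* report the missing device explicitly */
		return -1;
	}
#ifdef SO_BINDTODEVICE
	return setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname,
			  (socklen_t)(strlen(ifname) + 1));
#else
	errno = ENOTSUP; /* option is Linux-specific */
	return -1;
#endif
}
```

The key design point is that every path that does not end in a successful bind returns -1, so a caller cannot mistake an unbound socket for a bound one.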
2021-05-24 | lib: fix missing newline | Igor Ryzhov
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-05-13 | lib: adapt to version 2 of libyang | Christian Hopps
Compile with v2.0.0 tag of `libyang2` branch of: https://github.com/CESNET/libyang

staticd init load time of 10k routes now 6s vs ly1 time of 150s

Signed-off-by: Christian Hopps <chopps@labn.net>
2021-03-29 | *: modify VRF_CONFIGURED flag only in VRF NB layer | Igor Ryzhov
This is to fix the crash reproduced by the following steps:

* ip link add red type vrf table 1
  Creates VRF.
* vtysh -c "conf" -c "vrf red"
  Creates VRF NB node and marks VRF as configured.
* ip route 1.1.1.0/24 2.2.2.2 vrf red
* no ip route 1.1.1.0/24 2.2.2.2 vrf red
  (or similar l3vni set/unset in zebra)
  Marks VRF as NOT configured.
* ip link del red
  VRF is deleted, because it is marked as not configured, but NB node stays.

Subsequent attempt to configure something in the VRF leads to a crash because of the stale pointer in NB layer.

Fixes #8357.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-03-17 | *: require semicolon after DEFINE_QOBJ & co. | David Lamparter
Again, see previous commits. Signed-off-by: David Lamparter <equinox@diac24.net>
2021-03-17 | *: require semicolon after DEFINE_MTYPE & co | David Lamparter
Back when I put this together in 2015, ISO C11 was still reasonably new and we couldn't require it just yet. Without ISO C11, there is no "good" way (only bad hacks) to require a semicolon after a macro that ends with a function definition. And if you added one anyway, you'd get "spurious semicolon" warnings on some compilers... With C11, `_Static_assert()` at the end of a macro will make it so that the semicolon is properly required, consumed, and not warned about. Consistently requiring semicolons after "file-level" macros matches Linux kernel coding style and helps some editors against mis-syntax'ing these macros. Signed-off-by: David Lamparter <equinox@diac24.net>
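The `_Static_assert()` trick can be demonstrated with a toy macro. `DEFINE_COUNTER` is hypothetical and far simpler than `DEFINE_MTYPE`; it only shows how a file-level macro that ends in a function definition can still demand a trailing semicolon under C11.

```c
#include <assert.h>

/*
 * Hypothetical file-level macro ending in a function definition.
 * The final _Static_assert is deliberately left without its ';', so the
 * semicolon written by the user completes it and is never "spurious".
 */
#define DEFINE_COUNTER(name)                                                   \
	static int name##_value;                                               \
	int name##_next(void)                                                  \
	{                                                                      \
		return ++name##_value;                                         \
	}                                                                      \
	_Static_assert(1, "DEFINE_COUNTER requires a trailing semicolon")

/* Usage: omitting either semicolon below is a compile error. */
DEFINE_COUNTER(vrf_seq);
DEFINE_COUNTER(if_seq);
```

Without the trailing static assertion, `DEFINE_COUNTER(vrf_seq);` would expand to a function body followed by a stray `;` at file scope, which is the "spurious semicolon" warning mentioned above.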
2021-02-22 | lib: add definitions for vrf xpaths | Igor Ryzhov
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-02-14 | *: remove tabs & newlines from log messages | David Lamparter
Neither tabs nor newlines are acceptable in syslog messages. They also break line-based parsing of file logs. Signed-off-by: David Lamparter <equinox@diac24.net>
2021-02-10 | Merge pull request #7508 from sudhanshukumar22/zebra-vrf-delete | Stephen Worley
zebra: treat vrf add for existing vrf as update
2021-02-09 | vrf: use wrappers to change VRF_CONFIGURED flag | Igor Ryzhov
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-02-09 | vrf: mark vrf as configured when entering vrf node | Igor Ryzhov
The VRF must be marked as configured when user enters "vrf NAME" command. Otherwise, the following problem occurs:

`ip link add red type vrf table 1`
VRF structure is allocated.

`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is called, and pointer to the VRF structure is stored to the nb_config_entry.

`ip link del red`
VRF structure is freed (because it is not marked as configured), but the pointer is still stored in the nb_config_entry.

`vtysh -c "conf t" -c "no vrf red"`
Nothing happens, because VRF structure doesn't exist. It means that `lib_vrf_destroy` is not called, and nb_config_entry still exists in the running config with incorrect pointer.

`ip link add red type vrf table 1`
New VRF structure is allocated.

`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is NOT called, because the nb_config_entry for that VRF name still exists in the running config.

After that all NB commands for this VRF will use incorrect pointer to the freed VRF structure.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
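The lifetime rule behind this fix can be sketched as follows: the structure may be freed on kernel deletion only when no configuration refers to it. `struct vrf_obj` and `vrf_obj_kernel_delete()` are hypothetical and much simpler than the real VRF code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical VRF object with the two states the commit talks about. */
struct vrf_obj {
	bool active;     /* the device exists in the kernel */
	bool configured; /* a config entry holds a pointer to this object */
};

/*
 * Handle deletion of the kernel device. The object is freed, and the
 * caller's pointer cleared, only if nothing in the config refers to it.
 * Returns true if the object was freed.
 */
bool vrf_obj_kernel_delete(struct vrf_obj **vp)
{
	assert(vp != NULL && *vp != NULL);
	(*vp)->active = false;
	if ((*vp)->configured)
		return false; /* keep it: the config pointer stays valid */
	free(*vp);
	*vp = NULL;
	return true;
}
```

Marking the object as configured at the moment the config entry is created is what keeps the stored pointer valid across `ip link del red` in the sequence above.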
2021-02-01 | zebra: treat vrf add for existing vrf as update | sudhanshukumar22
Description: When we get a new vrf add and a vrf with the same name but a different vrf-id already exists in the database, we should treat the vrf add as an update. This happens mostly when there are lots of vrf and other configuration being replayed. There may be a stale vrf delete followed by a new vrf add. This can cause a timing race condition where the vrf delete could be missed and the following vrf add for the same vrf would get rejected instead of treating the last arrived vrf add as an update.

Treat vrf add for existing vrf as update:

1. Implicitly disable this VRF to cleanup routes and other functions as part of vrf disable.
2. Update vrf_id for the vrf and update vrf_id tree.
3. Re-enable VRF so that all routes are freshly installed.

The above 3 steps are mandatory, since with a config reload the stale routes installed in the vrf-1 table might contain routes from the older vrf-0 table, which might have been deleted because vrf-0 is missing in the new configuration.

Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
2020-09-21 | vrf: VRF_DEFAULT must be 0, remove useless code | Christophe Gouault
Code was added in the past to support a value of VRF_DEFAULT different from 0. This option was abandoned, the default vrf id is always 0. Remove this code, this will simplify the code and improve performance (use a constant value instead of a function that performs tests). Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
2020-09-21 | lib: optimize vrf_id_to_name(VRF_DEFAULT) case | Christophe Gouault
vrf_id_to_name() looks up in a RB_TREE to find the VRF entry, then reads the name. Avoid it for VRF_DEFAULT, which always exists and for which the translation is straightforward. Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>