| Internet-Draft | ROVBench | February 2026 |
| Liu & Geng | Expires 1 September 2026 |
This document defines a benchmarking methodology for routers that implement ROV. The methodology focuses on device-level behavior, including processing of Validated ROA Payload (VRP) updates, the interaction between ROV and BGP, control-plane resource utilization, and the scalability of ROV under varying operational conditions. The procedures described here follow the principles and constraints of the Benchmarking Methodology Working Group (BMWG) and are intended to produce repeatable and comparable results across implementations.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 1 September 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Route Origin Validation (ROV), as specified in [RFC6811], allows routers to use validated Route Origin Authorization (ROA) information, which is distributed via the RPKI-to-Router (RTR) protocol defined in [RFC8210], to classify BGP routes as Valid, Invalid, or NotFound. Deployments of ROV continue to increase across networks, and router vendors have implemented ROV processing as part of their control-plane functions.¶
While operational experience is growing, there is currently no standardized methodology for measuring the performance impact and behavioral characteristics of ROV on routing devices. As with other protocol features evaluated by the Benchmarking Methodology Working Group (BMWG), a consistent and repeatable test framework is essential for:¶
Comparing router implementations,¶
Evaluating scalability under controlled conditions,¶
Characterizing the control-plane costs of ROV processing, and¶
Understanding how ROV influences BGP convergence and routing stability.¶
This document defines a benchmarking methodology for routers that implement ROV, which builds upon the foundational benchmarking principles defined in [RFC1242], [RFC2285], [RFC2544], [RFC2889], and [RFC3918]. The methodology focuses on the Device Under Test (DUT) and uses controlled, reproducible inputs to isolate the effects of ROV from external dependencies. In particular, the benchmarking framework assumes the presence of an RPKI-to-Router (RTR) update source, which may be an RPKI Cache Server or an RTR traffic generator capable of delivering synthetic Validated ROA Payloads (VRPs).¶
The objective of this document is to define a set of metrics and procedures to quantify:¶
The latency of ROV state updates within the router,¶
The impact of ROV on BGP control-plane performance,¶
The scalability of ROV processing under varying VRP and BGP table sizes, and¶
The control-plane resource utilization associated with enabling ROV.¶
By providing a consistent framework, this document enables vendors, operators, and researchers to evaluate ROV functionality under controlled and repeatable conditions, improving understanding of implementation performance and supporting informed deployment decisions.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document specifies a laboratory-based benchmarking methodology for evaluating the performance of router implementations of ROV as defined in [RFC6811]. The scope of this benchmarking methodology includes:¶
ROV processing performance: Measurement of the time and resources required for a router to process VRP updates received via the RTR protocol.¶
Impact on BGP control-plane performance: Quantification of how enabling ROV affects BGP convergence times and routing table stability.¶
Scalability under controlled conditions: Evaluation of the router's ability to handle large VRP sets, rapid VRP churn, and BGP updates influenced by ROV.¶
Resource utilization: Measurement of CPU, memory, and internal control-plane load associated with ROV processing.¶
The goals of this document are to define the metrics and procedures enumerated above and to enable repeatable, comparable evaluation of ROV implementations across vendors and platforms.¶
The terminology used in this document follows the conventions of [RFC1242], [RFC2285], and subsequent BMWG publications. The following terms are used with specific meanings in the context of ROV benchmarking.¶
Route Origin Validation (ROV): A procedure defined in [RFC6811] that compares the origin AS of a BGP announcement with the set of authorized origins derived from validated ROA objects. ROV results in one of three states: Valid, Invalid, or NotFound.¶
Validated ROA Payload (VRP): The processed output from a relying party containing prefix-origin pairs that routers use for ROV decisions. VRPs are transported via the RPKI-to-Router (RTR) protocol.¶
RPKI-to-Router (RTR) Session: A protocol session between a router and an RPKI Cache Server. In benchmarking, RTR sessions may be emulated or generated using traffic/test tools to deliver synthetic VRP updates.¶
ROV Update Processing Latency: The time from when a router receives new VRP data (via RTR) until the updated ROV state is reflected in the router's local Routing Information Base (RIB) or applied to its routing decisions.¶
VRP-Triggered Revalidation Latency: The time interval between completion of VRP installation and the moment all affected prefixes have updated validation states.¶
BGP-Triggered ROV Validation Latency: The time interval between receipt of a BGP UPDATE message and completion of the ROV validation procedure for that route.¶
BGP Convergence Time: The time required for the router's control plane to process BGP updates and reach a stable routing state, while ROV validation is active.¶
Resource Utilization: CPU utilization and memory consumption of the router when performing ROV-related tasks, including processing of VRP updates and applying ROV policy.¶
ROV Churn: A burst of VRP changes (e.g., many ROA additions or withdrawals) that may trigger significant re-validation and BGP recalculation; such bursts are used as stimuli in stress tests.¶
ROV Scalability Limit: The maximum number of VRPs, RTR sessions, or ROV-triggered BGP changes that the router can process while maintaining normal operational performance.¶
This section describes the required test topology, equipment, DUT configuration, RPKI data emulation, and traffic generation conditions. The goal of the test environment is to isolate the DUT and subject it to clearly defined RPKI-RTR and BGP test conditions, while providing accurate timing and state measurements.¶
+-------------------+    RTR    +----------------------+
| RTR Emulator      |---------->|         DUT          |
|(RTR Update Source)|           |    (ROV Enabled)     |
+-------------------+           +----------------------+
                                  /\              /\
                                  |               | Data-plane Traffic
                            BGP   |               |
+---------------------+           |         +-----------------+
|BGP Traffic Generator|-----------+         |     Tester      |
+---------------------+                     |(Data-plane Load)|
                                            +-----------------+
The test topology consists of four primary components: the DUT, an RPKI-RTR update source, a BGP traffic generator, and a tester for generating data-plane load. The DUT is a router equipped with ROV capabilities, supporting the RPKI-RTR protocol and applying ROV policies to received BGP routes. The RPKI-RTR update source may be either a real RPKI cache implementation running in isolated mode or a dedicated emulator capable of producing arbitrary VRP sets and update patterns. This RTR source connects directly to the DUT using the RPKI-RTR protocol and provides precisely controlled VRP updates, including serial increments, cache resets, and bursty or delayed update sequences.¶
The BGP traffic generator establishes one or more BGP peering sessions with the DUT and is responsible for delivering a full global routing table, on the order of 800,000 to 1,000,000 prefixes, along with controlled withdrawal or re-announcement events. The generator should be capable of presenting both stable baseline routing conditions and timed ROV-affected prefixes whose validation status will change in response to VRP updates. A tester is connected to the DUT to introduce controlled data-plane load during benchmarking. When present, the tester SHOULD generate stable and deterministic traffic loads so that the impact of forwarding load on ROV processing can be evaluated. When data-plane load is applied, its rate, frame size, and traffic profile MUST be documented in the test report.¶
The DUT must be configured with ROV enabled on all BGP sessions receiving test routes. The router must establish a stable and fully functional RPKI-RTR session with the RTR emulator. To ensure that performance results are attributable solely to ROV behavior, all non-essential features on the DUT, such as additional routing protocols, unnecessary telemetry mechanisms, and unused services, should be disabled. Logging related to ROV may remain enabled for debugging purposes but must be rate-limited to avoid skewing CPU measurements or affecting test repeatability. All system parameters relevant to routing performance, such as multipath behavior or maximum-prefix limits, must be documented prior to testing.¶
The RTR emulator must be capable of generating synthetic VRP data sets with user-defined characteristics. This includes the ability to create arbitrary combinations of prefixes and ASNs, overlapping VRPs, conflicting VRPs, and other edge cases relevant to validation logic. The VRP datasets should mimic realistic global distributions where appropriate, but must also support scaling tests where VRP volumes are substantially higher than today's norm. The data source must further support generating controlled bursts of VRP updates, ranging from 100 to 10,000 VRP changes per second, and must allow for both additive updates and withdrawals. These capabilities are essential for evaluating the DUT's scalability and robustness under high churn.¶
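As a non-normative illustration, synthetic VRP sets with the properties described above could be produced by a small generator along the following lines. This is a sketch under stated assumptions: the (prefix, max_length, origin_asn) tuple layout, the private-use ASN range, and the `overlap_fraction` knob are choices of this example, not requirements of this document.

```python
import ipaddress
import random

def generate_vrps(count, seed=0, overlap_fraction=0.1):
    """Generate a synthetic IPv4 VRP set as (prefix, max_length, origin_asn)
    tuples, including a controlled fraction of overlapping more-specifics."""
    rng = random.Random(seed)  # fixed seed -> reproducible data sets
    vrps = []
    for _ in range(count):
        base = None
        if vrps and rng.random() < overlap_fraction:
            candidate = rng.choice(vrps)
            if ipaddress.ip_network(candidate[0]).prefixlen < 28:
                base = candidate
        if base is not None:
            # More-specific of an existing VRP, possibly with a different
            # origin ASN, to exercise overlapping/conflicting VRP handling.
            net = ipaddress.ip_network(base[0])
            prefix = str(rng.choice(list(net.subnets(prefixlen_diff=1))))
        else:
            # Random /16../24 prefix with host bits masked off.
            plen = rng.randint(16, 24)
            addr = rng.getrandbits(32) & (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            prefix = f"{ipaddress.ip_address(addr)}/{plen}"
        plen_cur = int(prefix.split("/")[1])
        max_length = min(plen_cur + rng.randint(0, 4), 32)
        vrps.append((prefix, max_length, rng.randint(64512, 65534)))
    return vrps
```

Because the generator is seeded, the same dataset can be replayed across test runs and across DUTs, which supports the repeatability goals of this methodology.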
The BGP traffic generator must present the DUT with a stable baseline routing table prior to initiating any benchmark. This ensures that the DUT begins each test run in a known, converged state with predictable CPU and memory utilization. The generator must also provide a set of ROV-affected prefixes whose origin AS can be manipulated in concert with VRP updates from the RTR emulator. These prefixes should span a range of prefix lengths and originate from diverse ASes to reflect realistic routing conditions. The traffic generator must support deterministic convergence triggers, such as the precise injection of BGP updates following a VRP change or the simultaneous application of both BGP and VRP events.¶
When data-plane traffic is used, the following parameters SHOULD be specified:¶
Frame size(s) used (e.g., 64, 512, 1518 bytes).¶
Traffic rate (percentage of line rate or packets per second).¶
Traffic pattern (constant rate, burst, IMIX).¶
Source and destination IP address ranges.¶
Whether traffic matches ROV-affected prefixes.¶
Each frame size and traffic rate combination SHOULD be reported separately.¶
This section describes the general methodology for benchmarking ROV behavior on a DUT. The goal is to ensure that all tests are repeatable, comparable across different environments, and representative of realistic deployment conditions. The methodology defines how to establish a controlled and stable test environment, how to specify and vary input conditions, and how to measure key performance metrics associated with ROV processing.¶
Before any measurements are taken, the DUT must reach a well-defined steady state in which the RPKI-RTR session is fully established, the VRP set has been completely synchronized, and the BGP control plane has converged. A warm-up period is recommended to eliminate any cold-start effects that could bias measurement results.¶
All sources of measurement noise should be avoided. Features such as logging, real-time telemetry export, or periodic background tasks can interfere with timing-sensitive measurements; therefore, such features should be disabled or rate-limited during benchmarking. CPU clock scaling, thermal throttling, or other variable-performance modes should be minimized if the test setup allows it.¶
Accurate benchmarking depends on precise control of the input conditions applied to the DUT. All tests should begin from a consistent baseline consisting of:¶
A predefined VRP set size (e.g., tens of thousands to millions of entries).¶
A stable and realistic baseline BGP RIB-in (e.g., ~1M global routes).¶
From this baseline, input variables may be modified to stress different aspects of ROV behavior. These variables include the VRP churn rate, ranging from steady incremental updates to high-intensity bursts, and the type of RPKI-RTR updates provided to the DUT, such as incremental updates versus full-table refreshes. Each of these conditions may trigger different processing strategies within the DUT, and therefore must be explicitly controlled and documented.¶
Benchmarking ROV behavior requires collecting quantitative performance metrics that reflect how the DUT processes validation information and incorporates it into the BGP decision process. Therefore, this document proposes key performance metrics including ROV update processing latency, ROV validation latency, BGP convergence time, VRP storage size, CPU and memory utilization, and ROV state rebuild time.¶
ROV update processing latency measures the time from receipt of an RTR update (incremental or full) until the DUT has fully updated its internal validation state. This metric captures the efficiency of ROV data structures and algorithms.¶
ROV validation latency measures the time interval between a router's receipt of a BGP UPDATE message that contains a new or changed route, and the completion of the ROV procedure for that route, producing a validation state of Valid, Invalid, or NotFound. This metric isolates the internal validation step, excluding the larger BGP convergence process, and provides insight into the responsiveness of the DUT's validation engine.¶
BGP convergence time with ROV enabled measures how long the DUT takes to converge on BGP prefixes whose validation states change due to VRP updates. This reflects the real operational behavior of ROV as it interacts with the control plane.¶
The VRP storage size inside the DUT should also be recorded to evaluate the scalability of the implementation when operating with large VRP datasets. Alongside this, CPU and memory utilization should be monitored to identify performance limits or resource-intensive operations triggered by ROV.¶
A recovery-related measurement, ROV state rebuild time after RTR session reset, quantifies the time needed for the DUT to re-establish a complete and correct ROV validation state after an RTR session reset or cache outage. This metric reflects robustness and recovery behavior under fault or restart scenarios.¶
Finally, the DUT should be evaluated under high-pressure scenarios by measuring its behavior when processing VRP bursts, such as surges of 100-10,000 VRPs per second. This measurement reveals whether the implementation can sustain abrupt workload increases without dropping updates, stalling, or entering unstable states.¶
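The VRP-burst scenario above can be driven by a simple pacing loop. The following sketch is illustrative only; `send_fn` stands in for the RTR emulator's transmit hook (an assumption of this example), and the fixed-cadence scheduling makes the burst intensity deterministic and therefore reportable.

```python
import time

def send_vrp_burst(vrps, rate_per_sec, send_fn):
    """Transmit `vrps` at a fixed rate (updates/second), sleeping between
    sends so the applied burst intensity is deterministic."""
    interval = 1.0 / rate_per_sec
    next_due = time.monotonic()
    sent = 0
    for vrp in vrps:
        now = time.monotonic()
        if now < next_due:
            time.sleep(next_due - now)
        send_fn(vrp)          # placeholder for the emulator's send hook
        sent += 1
        next_due += interval  # fixed cadence, independent of send_fn cost
    return sent
```

A rate sweep from 100 to 10,000 updates per second can then be expressed simply by varying `rate_per_sec` while keeping the VRP content constant.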
This section defines the individual benchmark tests used to evaluate the performance and behavior of a DUT implementing ROV. Each test focuses on a specific aspect of the ROV processing pipeline, including VRP ingestion, validation, interaction with BGP, scalability limits, and robustness under stress and failure conditions. All tests assume the laboratory setup and input conditions described previously.¶
Objective: Measure the latency from the arrival of an RTR PDU until the new VRP information is installed in the DUT's internal ROV tables.¶
The test procedures for ROV update processing latency are listed below:¶
Prepare baseline state¶
Inject controlled RTR update¶
Timestamp PDU transmission¶
Record the exact moment the first update PDU is sent.¶
Monitor DUT internal state¶
Calculate latency¶
Latency = (VRP applied timestamp) − (RTR PDU sent timestamp).¶
Repeat for multiple VRP table sizes¶
E.g., 50k, 100k, 500k, and 1M VRPs.¶
Repeat at least 10 times per condition¶
Compute mean and standard deviation.¶
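The steps above could be orchestrated by a test harness along the following lines. This is a non-normative sketch: `send_update` and `vrp_applied` are hypothetical hooks into the RTR update source and the DUT's state (for example, a CLI or API poller), not interfaces defined by this document.

```python
import statistics
import time

def measure_update_latency(send_update, vrp_applied,
                           poll_interval=0.001, timeout=30.0):
    """One Section 6.1 trial: timestamp the first update PDU, then poll the
    DUT until the new VRP is visible in its internal ROV tables."""
    t_sent = time.monotonic()   # RTR PDU sent timestamp
    send_update()
    deadline = t_sent + timeout
    while time.monotonic() < deadline:
        if vrp_applied():       # e.g., poll the DUT's ROV table via CLI/API
            return time.monotonic() - t_sent
        time.sleep(poll_interval)
    raise TimeoutError("VRP not applied within timeout")

def summarize(samples):
    """Aggregate the >= 10 repetitions into summary statistics."""
    return {"mean": statistics.mean(samples),
            "stdev": statistics.stdev(samples),
            "min": min(samples),
            "max": max(samples)}
```

Note that the polling interval bounds the measurement resolution; it should be reported alongside the results so that readers can judge the precision of the latency figures.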
Objective: Measure how long the DUT takes to apply updated VRPs to the validation states of affected BGP prefixes.¶
The test procedures for ROV validation latency are listed below:¶
Establish baseline¶
Select a controlled prefix set¶
Pick a set of prefixes (e.g., 1,000) whose origin AS is tied to specific VRPs.¶
Trigger validation update¶
Modify VRPs so that these prefixes change validation state (Valid->Invalid or Invalid->Valid).¶
Timestamp VRP installation completion¶
As measured in Section 6.1.¶
Monitor DUT validation table¶
Compute latency¶
Validation Latency = (all validation updated) − (VRP installed).¶
Repeat with varying set sizes¶
E.g., 10 prefixes, 100 prefixes, 1,000 prefixes.¶
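The latency computation in the procedure above can be sketched as follows. The per-prefix timestamp mapping is an assumed data layout for this example (gathered by polling the DUT's validation table), not a structure mandated by this document.

```python
def validation_latency(t_vrp_installed, prefix_update_times):
    """Section 6.2: Validation Latency = (time the last affected prefix
    reached its new validation state) - (time VRP installation completed).

    `prefix_update_times` maps each affected prefix to the timestamp at
    which its validation state changed, or None if it never changed.
    """
    pending = [p for p, t in prefix_update_times.items() if t is None]
    if pending:
        # A prefix that never revalidates indicates a correctness failure,
        # which should be reported rather than folded into the latency.
        raise RuntimeError(f"{len(pending)} prefixes never revalidated")
    return max(prefix_update_times.values()) - t_vrp_installed
```

Taking the maximum over the affected set captures the "all validation updated" endpoint of the definition, rather than the first or average prefix.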
Objective: Measure BGP convergence time for routes impacted by ROV state changes, and compare to BGP-only convergence without ROV.¶
The test procedures for BGP convergence with ROV enabled are listed below:¶
Prepare baseline¶
Select test prefixes¶
Choose prefixes that will transition from Valid to Invalid once VRP updates are applied.¶
Trigger VRP state change¶
Send VRP modifications via RTR.¶
Monitor BGP behavior¶
Measure convergence¶
Repeat test with ROV disabled¶
Use identical routing changes for baseline comparison.¶
Record convergence time with ROV enabled, convergence time with ROV disabled, and the difference attributable to ROV processing.¶
Objective: Evaluate DUT performance with varying VRP table sizes.¶
The test procedures for VRP scalability tests are listed below:¶
Generate VRP datasets at sizes:¶
E.g., 50k, 100k, 500k, 1M.¶
Load each dataset into the RTR emulator.¶
For each dataset, measure:¶
Full-table synchronization time.¶
VRP update processing latency (from Section 6.1).¶
ROV validation latency (from Section 6.2).¶
Memory consumption.¶
CPU utilization during sync and steady state.¶
Record failures¶
Repeat 10 times per size for statistical stability.¶
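A driver for the scalability sweep above might look like the following sketch. Here `run_trial(size)` is a placeholder assumed to load a dataset of `size` VRPs into the emulator, execute one run, and return a dict of measured metrics; the metric names are illustrative.

```python
def run_scalability_suite(sizes, run_trial, repetitions=10):
    """For each VRP table size, repeat the trial `repetitions` times and
    average each reported metric, yielding per-size aggregate results."""
    results = {}
    for size in sizes:
        trials = [run_trial(size) for _ in range(repetitions)]
        results[size] = {
            metric: sum(t[metric] for t in trials) / len(trials)
            for metric in trials[0]
        }
    return results
```

Averaging over repetitions per size supports the statistical-stability requirement; the raw per-trial values should also be retained so that minimum, maximum, and percentile figures can be reported.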
Objective: Stress-test the DUT under rapid VRP changes to measure stability, performance, and correctness.¶
The test procedures for VRP churn and stress tests are listed below:¶
Objective: Measure resource consumption under various ROV workloads.¶
The test procedures for resource utilization are listed below:¶
An ROV benchmarking report MUST provide enough detail to allow reproducibility and meaningful comparison across different DUTs. Each report MUST include the following elements:¶
Test environment description: The report MUST specify the DUT hardware and software versions, the testbed topology, and all ROV-related configuration parameters required to replicate the setup.¶
Input conditions: The report MUST document the VRP set size, RIB-in size, the presence and rate of VRP churn, and whether RTR updates were incremental or full.¶
Metrics and results: Each measured metric MUST include its definition, a brief description of the measurement procedure, and results presented in tabular numerical form (including minimum, average, maximum, and at least P95 values). Graphs MAY be included for clarification.¶
Deviations and anomalies: Any deviation from the expected behavior MUST be described, including the conditions under which it occurred and whether the test was repeated.¶
Summary of observations: The report MUST include a concise summary of overall DUT performance, scalability limits observed, and any significant effects of enabling ROV on BGP behavior.¶
In addition, the report MUST include, at minimum, the following parameters:¶
DUT hardware model, CPU architecture, memory size, and software version.¶
Complete DUT configuration relevant to ROV and BGP.¶
Testbed topology description.¶
VRP table size.¶
VRP churn rate.¶
RIB-in size.¶
Number of RTR sessions.¶
RTR timer configuration.¶
Presence and parameters of data-plane traffic (if used).¶
ROV policy mode (e.g., reject Invalid).¶
CPU sampling interval.¶
Measurement repetition count.¶
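The parameter checklist above lends itself to mechanical verification before a report is accepted. The sketch below uses illustrative field labels chosen for this example; they are not a normative schema.

```python
# One illustrative label per required Section 7 parameter.
REQUIRED_REPORT_FIELDS = (
    "dut_hardware", "dut_software_version", "dut_configuration", "topology",
    "vrp_table_size", "vrp_churn_rate", "rib_in_size", "rtr_session_count",
    "rtr_timers", "dataplane_traffic", "rov_policy_mode",
    "cpu_sampling_interval", "repetition_count",
)

def missing_report_fields(report):
    """Return the required parameters absent (or None) in a report dict."""
    return [f for f in REQUIRED_REPORT_FIELDS if report.get(f) is None]
```

A harness could refuse to emit a final report while `missing_report_fields` is non-empty, helping ensure the reproducibility requirements of this section are met.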
For each metric, the report MUST provide:¶
This document defines a benchmarking methodology for evaluating ROV on routing devices. As such, it does not introduce new protocols, modify existing security mechanisms, or create new vulnerabilities within the RPKI system or BGP itself. All benchmarking activities are intended to take place in isolated laboratory environments. Nevertheless, a number of security considerations apply to the execution and interpretation of the tests described in this document.¶
Benchmarking ROV necessarily involves the generation, manipulation, and replay of RPKI objects. These test artifacts MUST NOT be injected into production RPKI repositories, production RPKI caches, or live BGP routing systems. Test-generated RPKI data sets SHOULD be clearly separated from real-world trust anchors, and laboratory RPKI caches SHOULD use isolated test Trust Anchors to prevent accidental propagation.¶
Similarly, BGP routing information used in the tests, including simulated full tables, invalid prefixes, or artificially crafted origin-AS combinations, MUST NOT leak into production routing domains. All BGP sessions used for testing MUST be confined to a closed environment without external connectivity.¶
Tests involving stress conditions, such as high churn rates or large-scale VRP updates, may cause elevated CPU or memory consumption on the DUT. Operators performing such tests SHOULD ensure that the DUT is not simultaneously connected to any production network to avoid unintended service degradation.¶
This document has no actions for IANA.¶