IPPM Working Group                                          L. Melegassi
Internet-Draft                                                  Catellix
Intended status: Experimental                              18 June 2026
Expires: 20 December 2026


  Real-World Measurement of the Infrastructure-Cognitive Coupling
   Matrix R_cross: Closing the MVPS AI-Coherence Production
                          Conjecture (IC9.1)
          draft-melegassi-mvps-ai-coherence-coupling-real-00

Abstract

   The MVPS AI-Coherence framework [I-D.melegassi-mvps-ai-coherence]
   defines an infrastructure-cognitive coupling matrix R_cross =
   Sigma_net^{-1/2} Sigma_cross Sigma_AI^{-1/2} and proves, in
   simulation, that a non-zero R_cross is the necessary and sufficient
   condition for the joint network-AI anomaly space to carry detection
   information that neither standalone monitor can recover.  That
   document leaves two items open: (a) work item IC9.1, a statistical
   hypothesis test on R_cross over an empirical joint covariance, and
   (b) the CONJECTURE that E[R_cross] != 0 in production AI-on-network
   deployments.

   This companion document closes both.  It specifies a permutation-
   based hypothesis test for the normalized cross-block correlation
   estimator, reports the FIRST real-wire measurement of R_cross on a
   production large-language-model serving path (n = 100 ticks,
   DeepInfra), and documents a pure-arithmetic reference implementation
   embedded in an operational system that reproduces the measurement
   number-for-number.  The strongest coupling, latency_ms <-> output
   tokens, is r = +0.446 (permutation p = 0.0005) on the full series
   and survives the same-model confound control at r = +0.343
   (p = 0.0135) within a single serving regime.  The Frobenius norm
   ||R_cross||_F = 0.469 (full) / 0.443 (intra-regime) exceeds the
   non-triviality floor of 0.05, confirming the production conjecture
   for this deployment.

   The document also specifies how the measured coupling and the
   per-engine Mahalanobis distance D^2 are consumed by an operational
   Wald Sequential Probability Ratio Test (SPRT) as an additive
   evidence channel for surgical sub-environment bifurcation.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This document is a companion to [I-D.melegassi-mvps-ai-coherence]
   and [I-D.melegassi-mvps-perfsec-coupling].

   This Internet-Draft will expire on 20 December 2026.

Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Revised BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology and Requirements Language
   3.  The Normalized Cross-Block Estimator
   4.  Hypothesis Test (closes IC9.1)
   5.  Real-Wire Measurement Results
   6.  Reference Implementation (production, pure arithmetic)
   7.  Operational Use: R_cross and D^2 as SPRT Evidence
   8.  CAVEATs (Honest Limitations)
   9.  Security Considerations
   10. IANA Considerations
   11. References
   Appendix A.  Reproducibility
   Author's Address

==============================================================================
1.  Introduction
==============================================================================

   [I-D.melegassi-mvps-ai-coherence], Section 18.3, partitions the joint
   covariance of a network-coupled AI system as

      Sigma_joint = [ Sigma_net     | Sigma_cross ]
                    [ Sigma_cross^T  | Sigma_AI    ]

   and defines the coupling matrix

      R_cross = Sigma_net^{-1/2} * Sigma_cross * Sigma_AI^{-1/2}.

   Under the null hypothesis H_0: R_cross = 0 the joint Mahalanobis
   distance factorises, D^2_joint = D^2_net + D^2_AI, and the joint
   monitor adds nothing over two independent monitors.

   Section 18.4 establishes (CORRECTED THEOREM) that Phase 3 (COUPLED)
   existence is necessary but not sufficient for R_cross != 0, and
   defers "the proper test -- a statistical hypothesis test on R_cross
   using the empirical Sigma_joint" to open work item IC9.1.  It
   further records a CONJECTURE that E[R_cross] != 0 in production.
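
   The factorisation under H_0 stated above can be written out in a
   few lines.  The block below is a short worked derivation, not text
   from the parent draft; the centred vectors delta_net and delta_AI
   are notation assumed here for illustration, and both diagonal
   blocks are assumed positive definite.

```latex
% Assumed notation: \delta_{net} = x_{net} - \mu_{net},
%                   \delta_{AI}  = x_{AI}  - \mu_{AI}
\[
\begin{aligned}
R_{cross} = 0 \;&\Longleftrightarrow\; \Sigma_{cross} = 0
  \qquad (\Sigma_{net},\ \Sigma_{AI}\ \text{positive definite}) \\[4pt]
\Sigma_{joint}^{-1} &=
  \begin{bmatrix} \Sigma_{net}^{-1} & 0 \\ 0 & \Sigma_{AI}^{-1} \end{bmatrix} \\[4pt]
D^2_{joint} &=
  \begin{bmatrix} \delta_{net} \\ \delta_{AI} \end{bmatrix}^{T}
  \Sigma_{joint}^{-1}
  \begin{bmatrix} \delta_{net} \\ \delta_{AI} \end{bmatrix}
  = \delta_{net}^{T} \Sigma_{net}^{-1} \delta_{net}
  + \delta_{AI}^{T} \Sigma_{AI}^{-1} \delta_{AI}
  = D^2_{net} + D^2_{AI}
\end{aligned}
\]
```

   When Sigma_cross is non-zero the inverse is no longer block
   diagonal, and the quadratic form acquires cross terms between
   delta_net and delta_AI that neither standalone distance contains.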

   This document inherits the evidential-status discipline of the parent
   draft (Appendix A: THEOREM / DEFINITION / CONJECTURE / HYPOTHESIS /
   CAVEAT) and the reproducible-receipt discipline of
   [I-D.melegassi-irtf-mvps-methodology].  The measurement below is a
   NUMERICAL RECEIPT in the sense of
   [I-D.melegassi-ippm-mvps-proof-envelope]: a machine-regenerable
   artifact (evidence/rcross_real.json) whose SHA-256 digest can be bound
   into a proof envelope.  It does not introduce any new THEOREM; it
   converts the parent CONJECTURE into a measured result for one
   deployment and reports both the channels that reject H_0 and those
   that fail to reject it (Section 5.3).

   This document supplies the missing test, the missing measurement,
   and a reference implementation, replacing simulation-only evidence
   (scripts/simulate_three_domains.py in the parent draft) with a
   measurement on a live commercial inference API.

   This document follows the IP Performance Metrics framework of
   [RFC2330]: the metric (R_cross) is defined with an explicit
   measurement methodology (Sections 3-4), and the sources of
   measurement uncertainty are enumerated (Section 8), as that
   framework requires.  Both blocks are derived from operator
   telemetry in the sense of the Network Telemetry Framework [RFC9232],
   and the detection lineage (Coherence-BFD) inherits the sub-second
   timing model of Bidirectional Forwarding Detection [RFC5880] via
   [I-D.melegassi-coherence-bfd].

==============================================================================
2.  Terminology and Requirements Language
==============================================================================

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Network/infrastructure block (x_net):  per-tick vector
      [latency_ms, tok_per_s], where tok_per_s = tokens / (latency_ms /
      1000).  Both quantities are observable on every request at the
      client edge with zero additional instrumentation.

   Cognitive block (x_AI):  per-tick vector [out_tokens, len_dev],
      where out_tokens is the completion length and len_dev =
      |out_tokens - mean(out_tokens)| is the absolute deviation of the
      response length, a black-box proxy for cognitive instability when
      logit-level signals are unavailable (Section 8, L3).

   Tick:  one served request.

   Regime:  a contiguous run of ticks served by the same model
      identifier.
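
   The two block definitions above can be sketched as a small
   derivation step.  This is a minimal illustration, not the
   reference implementation: the row layout (dictionaries with keys
   "latency_ms" and "out_tokens") and the function name derive_blocks
   are hypothetical, and the sketch assumes that "tokens" in the
   tok_per_s formula is the completion length out_tokens and that
   every latency is strictly positive.

```python
from statistics import fmean


def derive_blocks(rows):
    """Derive x_net and x_AI series from per-tick telemetry rows.

    `rows` is a list of dicts with the hypothetical keys
    "latency_ms" and "out_tokens" (one dict per tick).
    """
    latency_ms = [float(r["latency_ms"]) for r in rows]
    out_tokens = [float(r["out_tokens"]) for r in rows]

    # tok_per_s = tokens / (latency_ms / 1000); assumes latency_ms > 0
    # and that "tokens" means the completion length.
    tok_per_s = [t / (l / 1000.0) for t, l in zip(out_tokens, latency_ms)]

    # len_dev = |out_tokens - mean(out_tokens)|
    mean_tokens = fmean(out_tokens)
    len_dev = [abs(t - mean_tokens) for t in out_tokens]

    x_net = {"latency_ms": latency_ms, "tok_per_s": tok_per_s}
    x_ai = {"out_tokens": out_tokens, "len_dev": len_dev}
    return x_net, x_ai
```

   Note that the mean in len_dev is taken over whichever rows are
   passed in, so restricting the rows to one regime also changes the
   reference length against which the deviation is measured.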

==============================================================================
3.  The Normalized Cross-Block Estimator
==============================================================================

   The framework definition R_cross = Sigma_net^{-1/2} Sigma_cross
   Sigma_AI^{-1/2} reduces, when each variable is standardised to unit
   variance, to the cross-block of the Pearson CORRELATION matrix.
   This document uses that normalized form as the estimator:

      R_cross[i][j] = corr(x_net,i , x_AI,j)

   where corr is the sample Pearson correlation.  The estimator
   coincides with the parent-draft definition exactly when the
   intra-block correlations are zero and approximately when they are
   small; otherwise it is a proxy that omits the intra-block whitening
   of the parent definition.  The aggregate coupling magnitude is the
   Frobenius norm

      ||R_cross||_F = sqrt( sum_{i,j} R_cross[i][j]^2 ).

   A coupling is reported as NON-TRIVIAL when ||R_cross||_F > 0.05.
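
   The estimator and its aggregate norm can be sketched in pure
   standard-library arithmetic as follows.  This is an illustrative
   sketch under stated assumptions, not the production code: the
   names pearson and cross_coupling_sketch are hypothetical, and
   returning 0.0 for a constant series is a choice made here for
   illustration.

```python
import math


def pearson(a, b):
    """Sample Pearson correlation of two equal-length series."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # assumption: a constant series carries no coupling
    return cov / math.sqrt(var_a * var_b)


def cross_coupling_sketch(block_net, block_ai, floor=0.05):
    """Cross-block correlation matrix, Frobenius norm, non-triviality flag.

    block_net and block_ai are lists of series (one list per variable).
    """
    r_cross = [[pearson(xn, xa) for xa in block_ai] for xn in block_net]
    frobenius = math.sqrt(sum(v * v for row in r_cross for v in row))
    return {
        "R_cross": r_cross,
        "frobenius": frobenius,
        "non_trivial": frobenius > floor,
    }
```

   Rows of the returned matrix follow the order of block_net and
   columns the order of block_ai, matching the row/column convention
   used for the result tables in Section 5.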

   CAVEAT (dimensionality).  The parent draft defines R_cross as a 3x3
   matrix over the full coherence axes (C_1, C_2, C_3).  This document
   measures a 2x2 black-box SUB-INSTANCE over the only axes observable
   without logit access (Section 2): it is a lower bound on the full
   ||R_cross||_F, never an over-statement.  A grey-box deployment would
   recover the remaining entries and can only increase the measured
   coupling.

==============================================================================
4.  Hypothesis Test (closes IC9.1)
==============================================================================

   For each cross pair (i, j) the null H_0: R_cross[i][j] = 0 is tested
   by a PERMUTATION test that makes no Gaussian assumption:

      1.  Compute the observed |r| = |corr(x_net,i, x_AI,j)|.
      2.  For B = 2000 iterations, randomly permute x_AI,j and
          recompute |r_b|.
      3.  p = (1 + #{b : |r_b| >= |r|}) / (B + 1).

   The permutation null is exact under exchangeability and is robust to
   the heavy-tailed latency distributions typical of shared inference
   wires.  A fixed seed (12345) makes the p-value reproducible.
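
   The three steps above can be sketched as follows.  This is a
   minimal sketch, assuming Python's standard random module as the
   permutation source; the function names are hypothetical, and the
   published p-values are reproduced only by an implementation that
   uses the same generator and the same shuffling procedure as the
   analysis script, which this sketch does not claim to match.

```python
import math
import random


def abs_corr(a, b):
    """Absolute sample Pearson correlation of two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # assumption: constant series treated as uncorrelated
    return abs(cov / math.sqrt(var_a * var_b))


def permutation_p_value(x, y, iterations=2000, seed=12345):
    """Two-sided permutation p-value for H_0: corr(x, y) = 0."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = abs_corr(x, y)          # step 1: observed |r|
    shuffled = list(y)
    hits = 0
    for _ in range(iterations):        # step 2: B permutations of y
        rng.shuffle(shuffled)
        if abs_corr(x, shuffled) >= observed:
            hits += 1
    return (1 + hits) / (iterations + 1)   # step 3: add-one estimator
```

   With B = 2000 the smallest attainable p-value is 1 / 2001, or about
   0.0005, so a reported p = 0.0005 means that no permuted series
   reached the observed |r|.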

   To control the MODEL-SWAP CONFOUND (a change of served model shifts
   both latency and output length jointly, manufacturing correlation
   that is not infrastructure-cognitive coupling), the test is run
   twice: once on the full series spanning a model swap (regimes A+B),
   and once restricted to the larger single-model regime (B).  Coupling
   that survives within a single regime cannot be attributed to the
   swap.
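
   The restriction to the larger single-model regime can be sketched
   as follows.  This is a minimal illustration, assuming each
   telemetry row carries its model identifier under a hypothetical
   key "model"; neither the key name nor the function name is taken
   from the reference implementation.

```python
from itertools import groupby


def largest_regime(rows, key="model"):
    """Return the longest contiguous run of rows sharing one model id.

    `rows` must be in serving order; `key` is a hypothetical field name.
    Ties are resolved in favour of the earliest run.
    """
    best = []
    # groupby only merges adjacent equal keys, which is exactly the
    # "contiguous run" notion of a regime.
    for _, run in groupby(rows, key=lambda r: r[key]):
        run = list(run)
        if len(run) > len(best):
            best = run
    return best
```

   The test is then applied once to the full row list and once to the
   rows returned by largest_regime, and the two results are reported
   side by side.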

   This two-regime design is the COUNTER-PROOF (falsification attempt)
   required by [I-D.melegassi-irtf-mvps-methodology]: the most plausible
   alternative explanation (the swap manufactured the correlation) is
   constructed and tested against, not assumed away.  The claim is
   retained only because it survives that attempt (Section 5.3, F2).

==============================================================================
5.  Real-Wire Measurement Results
==============================================================================

   Measurement context: n = 100 ticks collected on the same client wire
   against the DeepInfra inference API, spanning one deliberate model
   swap (regime A -> regime B).  Raw series:
   evidence/coupling_timeseries.json.  Computed verdict:
   evidence/rcross_real.json.

5.1.  Full series (A+B, includes the model swap)

      R_cross (rows = net block, cols = cognitive block):

                       out_tokens     len_dev
          latency_ms       +0.446      -0.063
          tok_per_s        -0.063      -0.113

      permutation p-values:

                       out_tokens     len_dev
          latency_ms      0.0005       0.5482
          tok_per_s       0.5467       0.2549

      ||R_cross||_F = 0.469
      strongest pair = (latency_ms, out_tokens), |r| = 0.446

5.2.  Within regime B (single model -- confound controlled)

      R_cross:

                       out_tokens     len_dev
          latency_ms       +0.343      +0.181
          tok_per_s        +0.164      -0.138

      permutation p-values:

                       out_tokens     len_dev
          latency_ms      0.0135       0.2144
          tok_per_s       0.2579       0.3453

      ||R_cross||_F = 0.443
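
   The two Frobenius norms can be checked directly from the matrix
   entries printed above.  The sketch below is plain arithmetic over
   the published three-decimal values, not a recomputation from the
   raw series, so it confirms internal consistency of the tables
   only.

```python
import math

# Entries as printed in Sections 5.1 and 5.2 (rounded to 3 decimals).
FULL = [[0.446, -0.063], [-0.063, -0.113]]
REGIME_B = [[0.343, 0.181], [0.164, -0.138]]


def frobenius(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


print(round(frobenius(FULL), 3))      # 0.469
print(round(frobenius(REGIME_B), 3))  # 0.443
```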

5.3.  Findings

   F1.  R_cross != 0 on the real wire.  ||R_cross||_F = 0.469 > 0.05.
        The production CONJECTURE of [I-D.melegassi-mvps-ai-coherence]
        Section 18 holds for this deployment.

   F2.  The coupling SURVIVES the model-swap confound:
        ||R_cross||_F = 0.443 within a single regime, with the leading
        pair latency_ms <-> out_tokens still significant
        (r = +0.343, p = 0.0135).  The coupling is therefore an
        intra-regime infrastructure-cognitive effect, not a swap
        artefact.

   F3.  The coupling is DIRECTIONAL and SPARSE: it concentrates in the
        latency <-> output-length channel, consistent with the
        drift-transfer mechanism of [I-D.melegassi-mvps-ai-coherence]
        Section 19, where serving-path state perturbs decode length.

   F4 (NEGATIVE RESULT, reported for falsifiability).  Three of the four
        cross pairs FAIL to reject H_0: (tok_per_s, out_tokens) p=0.5467,
        (latency_ms, len_dev) p=0.5482, (tok_per_s, len_dev) p=0.2549 on
        the full series.  Only the latency <-> out_tokens channel is
        significant.  Reporting the non-significant channels is required
        by the adversarial-audit discipline of
        [I-D.melegassi-irtf-mvps-methodology]: the claim is "one strong
        coupling channel exists", NOT "the blocks are densely coupled".
        A reader MUST NOT infer coupling on the silent pairs.

==============================================================================
6.  Reference Implementation (production, pure arithmetic)
==============================================================================

   The estimator of Section 3, the permutation test of Section 4, and
   the telemetry derivation of Section 2 are implemented in an
   operational system (Catellix "Aurix") as pure standard-library
   arithmetic with zero I/O and zero numerical dependencies:

      app/aurix2/trajectory.py:
        _pearson(a, b)               -- sample Pearson correlation
        cross_coupling(block_net,    -- R_cross + ||.||_F + max pair
                       block_ai)
        coupling_from_telemetry(rows)-- derives x_net, x_AI from the
                                        request-telemetry rows and
                                        returns R_cross, gated by a
                                        minimum sample size (default 12)

   The implementation runs on the SAME telemetry rows already queried
   for the per-engine trajectory report (Section 7); it introduces no
   additional database query and is exposed under the report key
   "_coupling".

6.1.  Exact-reproduction conformance test

   A conformance test (tests/test_aurix2_trajectory.py::
   test_cross_coupling_matches_validated_evidence) feeds the published
   raw series (evidence/coupling_timeseries.json) to the production
   cross_coupling() function and asserts BYTE-EXACT equality with the
   published verdict (evidence/rcross_real.json): the full R_cross
   matrix, the Frobenius norm, and the strongest pair.  The production
   path therefore computes the measurement of Section 5 with no
   deviation; the numbers in this document are not a separate analysis
   but the system's own output.  The full trajectory/coupling suite is
   17/17 passing.

==============================================================================
7.  Operational Use: R_cross and D^2 as SPRT Evidence
==============================================================================

   The measured coupling is consumed operationally, not merely
   reported.  Two mechanisms apply.

7.1.  Per-engine D^2 channel into the Wald SPRT

   The system maintains a per-engine trajectory report with a
   Mahalanobis distance D^2 (diagonal form over the state vector z(t) =
   [1 - C_4, CBF, truncation_rate, latency]) and Critical-Slowing-Down
   precursors (lag-1 autocorrelation and Kendall-tau variance trend).
   The current D^2 is fed as an additive evidence channel into the
   Wald SPRT [WALD1945] that decides whether an individual request
   merits a surgical sub-environment (bifurcation).  The channel uses
   the chi-square-quantile-calibrated log-likelihood ratio of the
   parent incremental draft [I-D.melegassi-mvps-incremental-be],
   Theorem 5 region:

      D^2 <= dof          -> LLR = -0.5  (evidence for H_0)
      D^2 = 7.815 (.05)   -> LLR = +1.0
      D^2 = 11.345 (.01)  -> LLR = +2.3

   The channel MUST default to a neutral log-likelihood ratio (LLR = 0)
   when no trajectory is available, so the addition is fail-safe: a
   missing trajectory contributes no evidence and cannot suppress the
   prior channels.
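
   The channel can be sketched as a small lookup followed by a
   standard Wald decision.  This is an illustration under several
   assumptions, not the operational wiring: the function names are
   hypothetical; the table above gives only three anchor points, so
   treating them as step thresholds and returning a neutral 0.0
   between dof and 7.815 is a choice made here; dof is left as an
   explicit argument because the table does not state its value (the
   quoted 7.815 and 11.345 are the chi-square upper 5% and 1% points
   for 3 degrees of freedom); and the error rates alpha and beta are
   placeholders.

```python
import math


def d2_llr(d2, dof):
    """Map a Mahalanobis D^2 to a log-likelihood-ratio contribution."""
    if d2 is None:
        return 0.0            # no trajectory: neutral, fail-safe default
    if d2 >= 11.345:
        return 2.3            # at or beyond the .01 point
    if d2 >= 7.815:
        return 1.0            # at or beyond the .05 point
    if d2 <= dof:
        return -0.5           # evidence for H_0
    return 0.0                # ASSUMPTION: neutral between dof and 7.815


def wald_thresholds(alpha, beta):
    """Classical Wald SPRT bounds (lower, upper) on the cumulative LLR."""
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    return lower, upper


def sprt_decision(llr_channels, alpha=0.05, beta=0.05):
    """Sum additive evidence channels and compare with the Wald bounds."""
    total = sum(llr_channels)
    lower, upper = wald_thresholds(alpha, beta)
    if total >= upper:
        return "accept_h1"    # e.g. request merits a sub-environment
    if total <= lower:
        return "accept_h0"
    return "continue"
```

   Because the contributions are summed, a channel that returns 0.0
   leaves the decision of the remaining channels unchanged, which is
   the property the neutral default relies on.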

7.2.  Why coupling matters for the SPRT

   Because R_cross != 0 (Section 5), the infrastructure axes carry
   information about the cognitive state.  The latency component of
   z(t) and the D^2 channel are therefore not redundant with the
   coherence probe (C_2/C_4/CBF): they are a partially independent,
   zero-cost-to-observe leading indicator.  Quantifying ||R_cross||_F
   tells the operator HOW MUCH independent precision the infrastructure
   channel adds, exactly as predicted by
   [I-D.melegassi-mvps-ai-coherence] Section 18.

==============================================================================
8.  CAVEATs (Honest Limitations)
==============================================================================

   Per the evidential discipline of [I-D.melegassi-mvps-ai-coherence]
   Appendix A, every limitation is stated explicitly as a CAVEAT.

   L1.  SINGLE SHARED WIRE.  The measurement is taken on one client
        wire against one commercial API.  It does not isolate pure
        network latency from server-side queueing/load; the coupling is
        between END-TO-END infrastructure latency and cognitive output,
        which is the operationally relevant quantity but not a clean
        physical-layer measurement.

   L2.  SINGLE BATCH (n = 100).  Effect sizes and p-values are
        indicative, not definitive.  The intra-regime significance
        (p = 0.0135, n of roughly 60) is the conservative figure; the
        significance of the full-series result (p = 0.0005) is
        inflated by the swap.  Replication across wires, providers, and
        time-of-day is required before any normative claim.

   L3.  BLACK-BOX COGNITIVE PROXY.  The cognitive block uses output
        length and its deviation, not logit-level coherence, because
        the tested API returns logprobs = null.  len_dev is a coarse
        proxy; a grey-box deployment with logprobs would measure a
        sharper cognitive axis and likely a larger ||R_cross||_F.

   L4.  CORRELATION, NOT MECHANISM.  This document measures coupling;
        the causal drift-transfer mechanism is argued in
        [I-D.melegassi-mvps-ai-coherence] Section 19 and is not
        re-proved here.

==============================================================================
9.  Security Considerations
==============================================================================

   The coupling channel is a DETECTION aid; it adds no new attack
   surface because both blocks are derived from telemetry the operator
   already collects.  An adversary who can shape serving-path latency
   could, in principle, attempt to bias the cognitive proxy via the
   measured coupling; the fail-safe SPRT wiring (Section 7.1) bounds the
   influence of any single channel and the cross-check quorum of the
   trajectory layer requires corroboration from at least two
   independent axes before a strong action.  No part of the proprietary
   coherence calibration is disclosed by R_cross itself.

   PRIVACY.  R_cross is computed over aggregate per-engine telemetry
   (latency and token counts), not over request content; the privacy
   considerations framework of [RFC6973] applies to the underlying
   telemetry collection but R_cross adds no new personal-data exposure.

==============================================================================
10.  IANA Considerations
==============================================================================

   This document has no IANA actions.

==============================================================================
11.  References
==============================================================================

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998, <https://www.rfc-editor.org/info/rfc2330>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in
              RFC 2119 Key Words", BCP 14, RFC 8174, May 2017,
              <https://www.rfc-editor.org/info/rfc8174>.

   [I-D.melegassi-mvps-ai-coherence]
              Melegassi, L., "MVPS AI-Coherence Extension: Semantic,
              Byzantine, and Infrastructure-Cognitive Coherence for
              AI-Serving Network Deployments",
              draft-melegassi-mvps-ai-coherence-01, May 2026.

11.2.  Informative References

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding
              Detection (BFD)", RFC 5880, June 2010,
              <https://www.rfc-editor.org/info/rfc5880>.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              July 2013, <https://www.rfc-editor.org/info/rfc6973>.

   [RFC9232]  Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L.,
              and A. Wang, "Network Telemetry Framework", RFC 9232,
              May 2022, <https://www.rfc-editor.org/info/rfc9232>.

   [I-D.melegassi-mvps-incremental-be]
              Melegassi, L., "Incremental Bandwidth-Efficient Multi-
              Vantage Path Synchrony (BE-MVPS): Cell-Partitioned
              Coherence with epsilon-Gated Sherman-Morrison Updates",
              draft-melegassi-mvps-incremental-be-00, May 2026.

   [I-D.melegassi-mvps-perfsec-coupling]
              Melegassi, L., "MVPS Performance-Security Coupling
              Profile: Joint Volume-Independence and Authentication
              Guarantees for Coherence-BFD with Coherent-Witness Trust
              (CWT)", draft-melegassi-mvps-perfsec-coupling-00,
              May 2026.

   [I-D.melegassi-coherence-bfd]
              Melegassi, L., "Coherence-BFD: Sub-Second Coherence
              Detection Using Bidirectional Forwarding Detection
              Patterns", draft-melegassi-coherence-bfd-00, May 2026.

   [I-D.melegassi-irtf-mvps-methodology]
              Melegassi, L., "The MVPS Adversarial-Audit Methodology: A
              Reproducible Discipline for Measurement-Security Internet-
              Drafts", draft-melegassi-irtf-mvps-methodology-00,
              May 2026.

   [I-D.melegassi-ippm-mvps-proof-envelope]
              Melegassi, L., "MVPS Proof Envelope: Tamper-Evident
              Binding of Theorem Catalogues, Validators, and Numerical
              Receipts, with an Optional Post-Quantum Profile",
              draft-melegassi-ippm-mvps-proof-envelope-00, May 2026.

   [WALD1945] Wald, A., "Sequential Tests of Statistical Hypotheses",
              Annals of Mathematical Statistics, 16(2):117-186, 1945.

Appendix A.  Reproducibility

   Raw series:    evidence/coupling_timeseries.json
   Verdict:       evidence/rcross_real.json
   Analysis:      scripts/_rcross_real.py
   Production:    app/aurix2/trajectory.py (cross_coupling,
                  coupling_from_telemetry)
   Conformance:   tests/test_aurix2_trajectory.py
                  (test_cross_coupling_matches_validated_evidence)

   The conformance test asserts that the production function reproduces
   the published verdict exactly; running the trajectory suite
   regenerates the agreement (17/17 passing).

Author's Address

   Leonardo Melegassi
   Catellix
   Andradina, SP
   Brazil

   Email: melegassi@catellix.com
   URI:   https://catellix.com/