
Measuring Architectural Impact: A Load Testing Study

Part 2 of a series on scaling data-intensive applications

Coming in 2026: The distributed connector architecture described in this series is planned for an upcoming XMPro release. This series shares our engineering findings and the architectural approach we’re taking.


Architecture decisions are hypotheses. Load testing is the experiment.

In the previous post, we introduced the distributed connector pattern—moving data connector execution from the application process to a stream host collection. The architectural reasoning was sound, but reasoning alone doesn’t validate a design. Measurement does.

This post examines our load testing methodology and what the numbers reveal about each architecture’s behaviour under pressure.

Why Full Page Loads Matter

Most load testing focuses on individual API endpoints. A single endpoint might respond quickly, leading teams to conclude the system can handle high request rates. But this extrapolation misses something important: users don't experience endpoints—they experience pages.

A single page load in XMPro Application Designer triggers a sequence of API calls:

  1. Authentication validation — verify the session is valid
  2. Application metadata — retrieve page structure and layout
  3. Page configuration — load block definitions and bindings
  4. Data source connections — establish connector contexts
  5. Data retrieval — fetch actual data for each grid or chart (potentially multiple calls)
  6. Asset loading — retrieve images, styles, and scripts

In our test pages, this translated to multiple API calls per full page load. Testing individual endpoints in isolation would have masked the cumulative impact of connector loading overhead—each data call triggers the connector instantiation cycle described in Part 1.

The key insight: architectural bottlenecks compound across a request sequence. Per-call overhead multiplies on a page with multiple data sources, and the resulting pressure cascades into connection pool exhaustion and memory contention.
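The compounding effect can be sketched as a toy model. The timings below are hypothetical placeholders chosen for illustration, not measured values from the study:

```python
# Illustrative model of how per-call connector overhead compounds across a
# page load. All timings are hypothetical, not measurements from the study.

CONNECTOR_LOAD_OVERHEAD_MS = 400   # hypothetical: binary fetch + assembly load
QUERY_EXECUTION_MS = 50            # hypothetical: actual database query time
FIXED_CALLS_MS = 120               # hypothetical: auth, metadata, config, assets

def page_load_ms(data_sources: int, per_call_overhead_ms: float) -> float:
    """Total page time = fixed calls + (overhead + query) per data source."""
    return FIXED_CALLS_MS + data_sources * (per_call_overhead_ms + QUERY_EXECUTION_MS)

# In-process pays the loading overhead on every data call; distributed pays
# only a small fixed broker hop (here assumed at 5 ms).
in_process = page_load_ms(3, CONNECTOR_LOAD_OVERHEAD_MS)
distributed = page_load_ms(3, 5)

print(f"in-process:  {in_process:.0f} ms")   # 120 + 3*(400+50) = 1470 ms
print(f"distributed: {distributed:.0f} ms")  # 120 + 3*(5+50)   = 285 ms
```

With three data sources the per-call overhead dominates the total; the fixed calls and query time barely move the result.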

Test Infrastructure

Important Context: We deliberately used constrained infrastructure—below our recommended production sizing—to stress-test architectural limits. This approach surfaces architectural differences more clearly. The improvement ratios we report are consistent across infrastructure sizes, but absolute capacity numbers depend on your specific deployment configuration.

To isolate architectural impact from infrastructure variation, we held the environment constant:

  • Application Server: Undersized cloud instance (2 vCPUs, limited RAM)
  • Database: Basic tier
  • Stream Host Collection: Single host (distributed tests only)
  • Message Broker: MQTT broker with persistent connections
  • Load Generator: k6 running on separate infrastructure

We chose constrained resources to understand behaviour where architectural inefficiencies surface first.

Test Methodology

Staged Ramp-Up

Rather than immediately flooding the system with maximum load, we used a staged ramp-up pattern—gradually increasing concurrent users over time.

This approach reveals how each architecture degrades. A system that handles moderate load gracefully but collapses at a threshold behaves differently from one that degrades linearly. The staged ramp-up captures these transitions.
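In k6 this pattern is expressed as a list of stages, each ramping linearly to a new virtual-user target. A minimal Python model of the same idea follows; the stage durations and targets are hypothetical, not the ones used in the study:

```python
# Minimal model of a staged ramp-up: each stage ramps linearly from the
# previous target to a new target over its duration. Stage values are
# hypothetical placeholders.

stages = [          # (duration_s, target_users)
    (120, 10),      # warm up to 10 virtual users
    (300, 50),      # ramp to moderate load
    (300, 150),     # push toward the threshold
]

def users_at(t: float) -> float:
    """Linearly interpolate the concurrent-user target at elapsed time t."""
    start, prev_target = 0.0, 0.0
    for duration, target in stages:
        if t <= start + duration:
            frac = (t - start) / duration
            return prev_target + frac * (target - prev_target)
        start += duration
        prev_target = target
    return float(stages[-1][1])   # hold the final target after the last stage

print(users_at(60))    # halfway through stage 1 -> 5.0
print(users_at(270))   # halfway through stage 2 -> 30.0
```

Plotting response time against `users_at(t)` rather than wall-clock time is what makes the degradation curve of each architecture directly comparable.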

Automatic Threshold Detection

Tests included automatic termination conditions:

  • p95 response time exceeding acceptable limits — the point where user experience becomes unacceptable
  • Error rate exceeding threshold — indicating system instability

When either threshold was breached, the test ended and recorded the concurrent user count at failure. This gave us a consistent measure of practical capacity—not theoretical throughput, but the load level where the system remains usable.
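A sketch of the termination check, evaluated over a sliding window of recent requests; the limit values are hypothetical, not the study's actual thresholds:

```python
# Sketch of automatic threshold detection over a window of recent requests.
# The limits are hypothetical placeholders.
from collections import deque

P95_LIMIT_MS = 2000       # hypothetical p95 response-time limit
ERROR_RATE_LIMIT = 0.05   # hypothetical acceptable error rate

def p95(samples):
    """Nearest-rank p95 of a list of durations."""
    ordered = sorted(samples)
    return ordered[int(0.95 * (len(ordered) - 1))]

def should_terminate(window: deque) -> bool:
    """window holds (duration_ms, ok) tuples for recent requests."""
    durations = [d for d, _ in window]
    errors = sum(1 for _, ok in window if not ok)
    return p95(durations) > P95_LIMIT_MS or errors / len(window) > ERROR_RATE_LIMIT
```

When `should_terminate` fires, the harness records the concurrent-user count at that moment as the architecture's practical capacity.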

Full Page Simulation

Each virtual user executed a complete page load sequence:

  1. Authenticate (if session expired)
  2. Request page metadata
  3. Request page data (connector calls)
  4. Wait for response completion
  5. Think time (randomised pause)
  6. Repeat

This cycle continued until the test ended or thresholds were breached. The metrics captured both individual call performance and aggregate page load duration.
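The per-user loop above can be sketched as follows. The `request` callable, the endpoint paths, and the think-time range are hypothetical stand-ins for the real client calls:

```python
# Sketch of one virtual user's loop, mirroring the sequence above.
# request(), session_expired(), and the endpoint paths are hypothetical.
import random
import time

def run_virtual_user(request, session_expired, stop, think=(1.0, 5.0)):
    page_durations = []
    while not stop():                                    # 6. repeat until stopped
        if session_expired():
            request("POST", "/auth/refresh")             # 1. authenticate
        start = time.monotonic()
        request("GET", "/app/page-metadata")             # 2. page metadata
        request("GET", "/app/page-data")                 # 3. connector calls
        page_durations.append(time.monotonic() - start)  # 4. response complete
        time.sleep(random.uniform(*think))               # 5. randomised think time
    return page_durations
```

Aggregating `page_durations` across all virtual users yields both the per-call and the full-page metrics the tests report.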

The Numbers

Concurrent Users at Threshold

The first clear difference: how many concurrent users could each architecture support before degrading below acceptable thresholds?

  • In-Process: 1× (baseline)
  • Distributed: ~5×

The in-process architecture reached its threshold at a certain concurrency level. The distributed architecture handled approximately 5× more concurrent users while maintaining acceptable response times.

The ~5× difference is striking, but the shape of the degradation matters as much as the final number.

Response Time vs Concurrent Users

The critical question isn’t just “how much capacity” but “how does performance change as load increases?”

The degradation patterns diverge dramatically:

In-Process Architecture: Response times climb steeply as users increase. At a certain threshold, the p95 response time breaches acceptable limits. The test automatically terminates—the architecture can’t sustain higher load.

Distributed Architecture: Response times remain relatively flat. Even at 5× the in-process failure point, median page load times remain responsive. The architecture has headroom to spare.

The contrast illustrates a fundamental difference: the in-process model hits a wall where adding users causes exponential degradation. The distributed model degrades linearly—each additional user adds a small, predictable increment to response time rather than compounding existing pressure.

Throughput: Page Loads Completed

The ultimate measure of capacity is work completed. How many full page loads did each architecture serve during the test?

  • In-Process: 1× (baseline)
  • Distributed: ~50×

A ~50× difference in throughput. The distributed architecture served approximately 50 times more page loads while maintaining higher reliability.

This disparity exceeds what the ~5× user capacity might suggest. The explanation lies in test duration: the in-process test ended quickly (threshold breach), while the distributed test ran much longer. Users in the distributed system not only loaded pages faster—they had time to load more pages before the test concluded.
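The way the ratios multiply can be shown with a toy calculation. All figures below are hypothetical, chosen only to illustrate how a ~5× user ratio compounds into a ~50× throughput ratio:

```python
# Decomposing the throughput gap: total page loads scale with concurrent
# users, test duration, and per-user page rate. All numbers are hypothetical.

def total_pages(users, duration_min, pages_per_user_min):
    return users * duration_min * pages_per_user_min

in_process  = total_pages(users=20,  duration_min=6,  pages_per_user_min=0.5)
distributed = total_pages(users=100, duration_min=20, pages_per_user_min=1.5)

# 5x users * ~3.3x duration * 3x per-user rate = 50x total throughput
print(distributed / in_process)
```

Each factor is modest on its own; it is their product that produces the order-of-magnitude gap in completed work.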

Connector Read Performance

Isolating the connector operations specifically:

  • In-Process: 1× (baseline)
  • Distributed: ~6× faster

Connector read time improved approximately 6×. This confirms that the connector loading overhead in the in-process model dominates actual query execution time.

What the Data Reveals

The Threshold Effect

In-process connector loading exhibits a threshold behaviour. Below a certain concurrency level, the system performs adequately. Above that threshold, performance degrades rapidly as:

  1. Database connections queue for connector binary retrieval
  2. Memory pressure increases from concurrent assembly loading
  3. CPU cycles divide between connector instantiation and page rendering
  4. Thread pool exhaustion causes cascading delays

The distributed model avoids this threshold by eliminating the per-request connector loading entirely. The application server’s work is constant regardless of connector complexity—publish a message, await a response.
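The publish-and-await pattern can be sketched with correlation-id matching over a broker. This is an illustrative asyncio sketch under assumed interfaces, not XMPro's actual implementation; the `QueryGateway` class, topic names, and timeout are hypothetical:

```python
# Sketch of publish-and-await: the app server publishes a query request keyed
# by a correlation id and awaits the matching response. The broker interface
# and topic names are hypothetical.
import asyncio
import uuid

class QueryGateway:
    def __init__(self, publish):
        self._publish = publish          # callable(topic, payload)
        self._pending = {}               # correlation id -> Future

    async def query(self, connector_id, payload, timeout=5.0):
        cid = str(uuid.uuid4())
        fut = asyncio.get_running_loop().create_future()
        self._pending[cid] = fut
        self._publish(f"queries/{connector_id}", {"cid": cid, "payload": payload})
        try:
            # Constant work per request: no binary retrieval, no assembly load.
            return await asyncio.wait_for(fut, timeout)
        finally:
            self._pending.pop(cid, None)

    def on_response(self, message):      # called by the broker subscription
        fut = self._pending.get(message["cid"])
        if fut and not fut.done():
            fut.set_result(message["result"])
```

The application server's cost per request collapses to a dictionary insert, a publish, and an await, regardless of which connector executes the query on the stream host side.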

Network Overhead is Not the Bottleneck

A common concern with distributed architectures: doesn’t adding network hops slow things down?

In theory, yes. In practice, the numbers show the opposite. The distributed model responds ~3× faster, despite routing through a message broker.

The explanation: network round-trip time to the broker (typically milliseconds) is negligible compared to the connector loading overhead (hundreds of milliseconds to seconds). Trading a small fixed cost for a large variable cost is a good trade.

Success Rates Tell a Stability Story

The distributed system showed higher reliability—fewer failures per request despite serving dramatically more traffic.

In relative terms—failures per successful page load—it was significantly more reliable. And those failures occurred while serving ~50× more traffic.

Interpreting Results Carefully

Several caveats apply to these measurements:

Test environment was deliberately constrained. We chose undersized resources specifically to surface architectural differences. Larger infrastructure would raise both architectures’ thresholds—but the ratio between them would likely persist.

Single page type tested. Our test page had one data grid with a single data source. Pages with multiple data sources show even larger differences (explored in Part 3).

MQTT broker was not the bottleneck. We verified broker capacity exceeded test requirements. In production, broker sizing becomes a consideration.

Stream host was not the bottleneck. In the distributed tests, the stream host remained well within optimal resource limits throughout. The application server was still the limiting factor—the distributed model simply uses that resource more efficiently by eliminating connector loading overhead.

Database was not the bottleneck. Query execution time was consistent across both tests. The difference was in how often the application could dispatch queries, not how fast the database answered.

What This Means for Capacity Planning

The ~5× concurrent user improvement translates directly to infrastructure efficiency. Consider two deployment scenarios for high-concurrency requirements:

In-Process approach:

  • Requires scaling to multiple times baseline infrastructure (larger instance tier or multiple instances)
  • Each scale unit adds connector loading overhead proportionally
  • Memory requirements grow with connector binary caching

Distributed approach:

  • Smaller infrastructure supports the load directly
  • Stream host collection scales independently
  • Application server memory remains stable regardless of connector count

The distributed model changes the scaling equation. Instead of scaling application servers to handle connector load, you scale stream hosts to handle query volume—a more predictable and efficient allocation of resources.

Next in This Series

The single-data-source results are compelling, but they’re the easy case. In Part 3, we examine what happens when pages have multiple data sources—three data grids, each with its own connector. The compounding effect of connector loading overhead creates differences measured in orders of magnitude.


This is part 2 of a series on distributed architecture patterns. The series draws on load testing conducted on XMPro Application Designer, comparing in-process SQL connectors with distributed stream-based connectors. The distributed connector capability is planned for an upcoming 2026 release.