In this article, we explore how scaling is implemented in the Stingray platform, which components make up the cluster scheme, and how an operator can grow from a single node to terabit-scale configurations.
The Limit of a Traffic Processing Node
Let’s imagine a regional operator. They have several thousand subscribers, evening peak traffic in the tens of gigabits, IPTV, torrents, online games, video calls, corporate clients, and legislative requirements for filtering prohibited resources.
At the start, a single Stingray node can operate as a unified point of subscriber traffic processing. Based on Deep Packet Inspection technology, the Stingray platform performs the following functions:
- BNG/BRAS (Broadband Network Gateway / Broadband Remote Access Server);
- Carrier Grade NAT (IPv4 network address translation);
- Channel bandwidth control (Quality of Service);
- Traffic filtering;
- Statistics collection and analytics (Network Visibility and Quality of Experience).
Then the operator starts to grow. New districts appear, video consumption increases, the volume of mobile and OTT traffic grows, corporate clients connect, and the tariff lineup expands.
As network traffic grows, a single node may no longer be able to process all of it. The operator then needs to either add computing resources or change the traffic processing scheme.
At the first stage, the operator typically uses vertical scaling. The Stingray software is moved to a more powerful server with greater headroom in CPU, memory, and network interfaces.
| Approach | How it works | When it fits | Where the limit appears |
|---|---|---|---|
| Vertical scaling | The Stingray license is moved to a more powerful server | When a single server provides sufficient performance | CPU, memory, interfaces, cost, and physical server limitations |
| Horizontal scaling | Traffic is distributed across multiple servers | When a single server no longer provides the needed headroom or expansion is required | Traffic must be properly distributed among servers without additional load balancers |
| Cluster with multiple NPBs (balancers) | Additional NPBs and servers are added to the scheme | When growth to terabit values is needed | Scaling depends on NPB performance, number of nodes, and N+X redundancy |
Scaling approaches
But vertical growth is not endless. A server has a physical ceiling in terms of CPU, memory, PCIe lanes, network cards, cooling, and configuration cost.
At the same time, the architecture does not change. All processing is still concentrated at a single point through which a critical volume of subscriber traffic passes.
When a single node no longer provides the needed headroom, the system is moved to a cluster scheme.
How the Cluster Scheme with a Network Packet Broker (NPB) Works
The Stingray platform scales through the linear addition of NPB (Network Packet Broker) traffic balancers and processing servers. This scheme allows performance to be increased gradually, without replacing central components and without rebuilding the network logic — it is sufficient to add new elements to the cluster.
For the operator’s network, the platform remains a transparent L2 device and continues to operate in inline mode.
The main elements involved in scaling:
- NPB: a traffic balancing device that distributes flows between traffic processing nodes and keeps both directions of each session on the same Stingray server. In combination with bypass switches, it participates in a fault-tolerant traffic flow scheme.
- Processing servers: general-purpose x86_64 servers with Mellanox/Intel network cards running Stingray software for deep traffic analysis.
- Bypass switches: devices that keep communication uninterrupted by automatically passing traffic directly through the network if the processing system fails.
- NMS: the management system. Through its graphical interface, profiles, policies, filtering rules, lists, custom protocols, and monitoring parameters are configured.
Each platform node can operate independently or be part of a cluster. In a cluster scheme, the NPB distributes flows between servers based on the balancing algorithm and the current state of the nodes.
The typical processing logic is as follows:
- Operator traffic arrives at the optical bypass.
- Through the bypass, it is forwarded to the NPB.
- The NPB distributes traffic among processing nodes.
- Outgoing and incoming traffic of a single session/subscriber is directed to the same Stingray processing node.
- At the Stingray node, traffic is analyzed, QoS policies are applied, web filtering is performed, and CG-NAT address translation or full BNG/BRAS functions are executed.
- After processing, the traffic is returned to the line through the NPB and the bypass.
Network traffic processing logic using NPB and DPI
Depending on the scenario, balancing can be structured differently. For DPI scenarios, balancing at the level of individual sessions is acceptable. For BNG and CG-NAT, a subscriber-aware scheme is used, where all traffic from a single subscriber is processed by a single node.
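The difference between per-session and subscriber-aware balancing can be illustrated with a minimal sketch. It is not the NPB's actual algorithm, which the article does not describe: the `Packet` type, the SHA-256 based `_bucket` helper, and the `subscriber_net` parameter are all hypothetical names chosen for illustration.

```python
import hashlib
import ipaddress
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet header fields used as the balancing key."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


def _bucket(key: bytes, n_nodes: int) -> int:
    """Map a key to a node index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_nodes


def session_node(pkt: Packet, n_nodes: int) -> int:
    """Per-session balancing: both directions of a flow map to one node."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    # Sorting the endpoints makes the key direction-independent,
    # so A->B and B->A produce the same hash.
    lo, hi = sorted((a, b))
    return _bucket(f"{lo}|{hi}|{pkt.proto}".encode(), n_nodes)


def subscriber_node(pkt: Packet, n_nodes: int, subscriber_net: str) -> int:
    """Subscriber-aware balancing: hash only the subscriber-side address."""
    net = ipaddress.ip_network(subscriber_net)
    if ipaddress.ip_address(pkt.src_ip) in net:
        sub_ip = pkt.src_ip   # upstream packet: subscriber is the source
    else:
        sub_ip = pkt.dst_ip   # downstream packet: subscriber is the destination
    return _bucket(sub_ip.encode(), n_nodes)
```

With this sketch, an upstream packet and its downstream reply land on the same node in both modes, while only `subscriber_node` guarantees that two different sessions of the same subscriber share a node. A plain modulo remaps many flows when the node count changes; a real balancer would likely use a more stable mapping.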
If route asymmetry exists in the network (for example, incoming and outgoing traffic passes through different sites), only outbound traffic is mirrored between the NPBs at those sites. A copy of the flow is transmitted between sites and used for correct session analysis. After processing, the mirrored traffic is discarded and is not counted in statistics. The additional cost is small, since outbound traffic amounts to no more than 10% of inbound traffic.
This scenario is especially important for distributed networks where a subscriber’s traffic may pass through different points of presence.
How Performance Is Scaled Up
The cluster expands in stages. First, Stingray servers are added within the current scheme. When the capacity of the balancing layer becomes insufficient, new NPBs are added.
One NPB with 64x100G ports is designed for 1.2 Tbps of total traffic, of which 1 Tbps is download and 200 Gbps is upload.
A cluster can include up to 8 NPBs. In this configuration, the total throughput reaches 9.6 Tbps.
Cluster scaling example with 4 NPBs: 4xNPB = 4.8 Tbps total traffic (4 Tbps download + 800 Gbps upload)
For DPI servers, the calculation is done separately. Depending on the configuration, the working calculated load can range from 120 to 360 Gbps of total traffic per node. However, the final number of nodes always depends on the traffic profile, PPS, number of sessions, NAT translations, the set of enabled functions, and the redundancy coefficient the operator wishes to include.
| Configuration | Total traffic | Download | Upload |
|---|---|---|---|
| 1 NPB | 1.2 Tbps | 1 Tbps | 200 Gbps |
| 2 NPBs | 2.4 Tbps | 2 Tbps | 400 Gbps |
| 3 NPBs | 3.6 Tbps | 3 Tbps | 600 Gbps |
| 4 NPBs | 4.8 Tbps | 4 Tbps | 800 Gbps |
| 8 NPBs | 9.6 Tbps | 8 Tbps | 1.6 Tbps |
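The arithmetic behind the table can be expressed as a short sizing helper. This is a rough sketch that assumes the per-NPB download and upload figures act as independent limits; the function name `npbs_required` and the constants are illustrative, not part of any Stingray tooling.

```python
import math

# Per-NPB figures taken from the article (64x100G NPB).
NPB_DOWNLOAD_GBPS = 1000
NPB_UPLOAD_GBPS = 200
MAX_NPBS = 8  # cluster limit stated in the article


def npbs_required(download_gbps: float, upload_gbps: float) -> int:
    """Return the number of NPBs needed for the given peak traffic.

    Assumption: download and upload are checked separately, and the
    more demanding direction determines the NPB count.
    """
    needed = max(
        math.ceil(download_gbps / NPB_DOWNLOAD_GBPS),
        math.ceil(upload_gbps / NPB_UPLOAD_GBPS),
        1,  # a cluster scheme needs at least one balancer
    )
    if needed > MAX_NPBS:
        raise ValueError(
            f"{needed} NPBs needed, but a cluster supports at most {MAX_NPBS}"
        )
    return needed


print(npbs_required(3500, 500))  # download-bound: 4 NPBs
print(npbs_required(900, 350))   # upload-bound: 2 NPBs
```

The second call shows why both directions matter: 900 Gbps of download fits into one NPB, but 350 Gbps of upload does not.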
If an operator sizes a cluster exactly to the current load, any evening peak or node failure will immediately cause overload. Production schemes therefore include an N+X reserve: some processing nodes are designated as cluster reserves. If one server fails, the NPB redistributes traffic flows to the remaining nodes, and the cluster continues to process traffic.
When scaling a cluster, it is recommended to use servers of equal capacity.
Redundancy, Heartbeat, and Bypass
Scaling should not turn the network into a structure where the failure of a single server breaks the entire segment. Therefore, in the Stingray cluster scheme, redundancy is built on several levels.
External Optical BYPASS and DPI Node Health Control
For DPI and PCEF scenarios, an external optical bypass is used. The operator’s links are connected inline through a bypass switch, which monitors the signal state on the line and the health of every component in the complex. If a processing server or balancer fails critically, the bypass passes traffic straight through, so the failure does not affect subscribers. Network connectivity is maintained, but DPI functions are temporarily suspended.
Traffic passing through optical bypass in enabled and disabled modes
Cluster health is monitored through a heartbeat mechanism. The NPB can see which servers are available and uses only active nodes. If one of the nodes stops responding, the balancer removes it from the distribution scheme and directs new traffic flows to the remaining servers.
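The idea of heartbeat-based node selection can be sketched as follows. The article does not specify the heartbeat protocol or its timers, so the `HeartbeatTracker` class, the node names, and the 3-second timeout are assumptions made purely for illustration.

```python
import time
from typing import Dict, Iterable, List, Optional


class HeartbeatTracker:
    """Tracks the last heartbeat per node and reports which nodes are active."""

    def __init__(self, nodes: Iterable[str], timeout_s: float = 3.0) -> None:
        self.timeout_s = timeout_s  # assumed value, not a Stingray default
        self._last_seen: Dict[str, Optional[float]] = {n: None for n in nodes}

    def beat(self, node: str, now: Optional[float] = None) -> None:
        """Record a heartbeat from a node."""
        self._last_seen[node] = time.monotonic() if now is None else now

    def active_nodes(self, now: Optional[float] = None) -> List[str]:
        """Return nodes whose last heartbeat is within the timeout."""
        now = time.monotonic() if now is None else now
        return [
            node
            for node, seen in self._last_seen.items()
            if seen is not None and now - seen <= self.timeout_s
        ]


tracker = HeartbeatTracker(["dpi-1", "dpi-2", "dpi-3"])
for name in ("dpi-1", "dpi-2", "dpi-3"):
    tracker.beat(name, now=0.0)
# dpi-2 stops responding; the other two keep sending heartbeats.
tracker.beat("dpi-1", now=4.0)
tracker.beat("dpi-3", now=4.0)
print(tracker.active_nodes(now=5.0))  # ['dpi-1', 'dpi-3']
```

A balancer built on this idea would hand only the list returned by `active_nodes` to its flow distribution step, so new flows never reach a node that has gone silent.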
N+X Redundancy
For all functions of the Stingray platform, N+X redundancy is applied. Additional computing resources are built into the cluster, allowing it to survive the failure of individual nodes without immediately overloading the system.
If the remaining servers receive more traffic after a failure than they can process, the quality of processing may begin to degrade even while the service remains operational. It is therefore recommended to plan for a load of 80% of each Stingray node’s maximum performance, which leaves headroom for redistributed traffic.
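The N+X reserve and the 80% planning figure can be combined into a rough node-count estimate. This is a simplified sketch: the function names are hypothetical, and the 240 Gbps per-node capacity is only an example value from within the 120 to 360 Gbps range given earlier. Real sizing also depends on PPS, session counts, NAT translations, and enabled functions.

```python
import math


def nodes_required(
    total_gbps: float,
    node_capacity_gbps: float,
    spare_nodes: int = 1,
    planned_load: float = 0.8,
) -> int:
    """Estimate N+X node count: N working nodes at planned_load, plus X spares."""
    if not 0 < planned_load <= 1:
        raise ValueError("planned_load must be in (0, 1]")
    usable_gbps = node_capacity_gbps * planned_load
    working = math.ceil(total_gbps / usable_gbps)
    return working + spare_nodes


def load_after_failures(
    total_gbps: float, node_capacity_gbps: float, nodes: int, failed: int
) -> float:
    """Per-node utilization (0..1+) once `failed` nodes drop out of the cluster."""
    remaining = nodes - failed
    if remaining <= 0:
        raise ValueError("no nodes left to carry traffic")
    return total_gbps / remaining / node_capacity_gbps


n = nodes_required(total_gbps=1200, node_capacity_gbps=240, spare_nodes=1)
print(n)  # 8 nodes: 7 working at 80% planned load, plus 1 spare
print(round(load_after_failures(1200, 240, nodes=n, failed=1), 3))  # 0.714
```

In this example, losing one of the eight nodes leaves the remaining seven at roughly 71% utilization, still under the 80% planning figure.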
Conclusion
Stingray scaling follows a linear scheme. An operator can start with a single node and then move to a cluster scheme with multiple NPBs and a combined capacity of up to 9.6 Tbps.
This architecture provides a clear expansion path. Throughput grows by adding components, policy management remains centralized, and the platform itself continues to operate as a transparent L2 device for the operator’s network.
If your business requires a system that grows with you, the Stingray architecture provides a ready-made, proven expansion strategy without rebuilding the logical network scheme.