Converged BNG: Four Ways to Ensure Infrastructure Resilience

November 7, 2025
Today's telecom operators face unprecedented challenges: explosive traffic growth, increasingly complex network architectures, frequent cyberattacks, and ever-growing subscriber expectations for service quality and continuity. According to data for 2025, the number of DDoS attacks on the telecom industry increased by 55% compared to the previous year, with the industry accounting for 31% of all attacks. In such conditions, infrastructure reliability is no longer an option — it has become a necessity for business survival.

Converged Broadband Service Gateway solutions combine BNG, CG-NAT, DPI, router, and DDoS protection functions on a standard x86 server, simplifying administration and reducing total cost of ownership. However, the technological excellence of the platform is only part of the equation. True fault tolerance is achieved through proper architecture, intelligent redundancy, built-in protection, and process automation.

There are four complementary ways to ensure maximum network infrastructure reliability, from scaling to configuration automation. Let’s take a look at them in order.

Scaling BNG solutions: from a single server to a cluster

The performance of a single server running a converged BRAS/BNG can reach 100 Gbps in full duplex mode, which is enough to serve up to 50,000 subscribers at an average rate of 2 Mbps per user. Large operators, and those whose subscriber base is growing quickly, need to scale further.
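The sizing arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are hypothetical, and real capacity planning would also account for peak-hour load and the overhead of CG-NAT and DPI.

```python
def max_subscribers(throughput_gbps: float, avg_rate_mbps: float) -> int:
    """Subscribers one server can carry at a given average per-user rate."""
    if throughput_gbps <= 0 or avg_rate_mbps <= 0:
        raise ValueError("throughput and average rate must be positive")
    # 1 Gbps = 1000 Mbps; round down to whole subscribers.
    return int(throughput_gbps * 1000 // avg_rate_mbps)


def servers_needed(subscribers: int, throughput_gbps: float,
                   avg_rate_mbps: float) -> int:
    """Smallest number of identical servers that covers the subscriber base."""
    per_server = max_subscribers(throughput_gbps, avg_rate_mbps)
    return -(-subscribers // per_server)  # ceiling division


if __name__ == "__main__":
    print(max_subscribers(100, 2))          # figures from the text above
    print(servers_needed(120_000, 100, 2))  # a hypothetical larger operator
```

With the numbers from the text, 100 Gbps divided by 2 Mbps per user gives the 50,000 subscribers quoted above.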

Vertical scaling means making the hardware more powerful: adding processor cores, adding network adapters, and switching to faster interfaces. Modern platforms support interfaces from 1GbE to 100GbE, and higher-end configurations can process up to 400 Gbps and serve up to 150,000 subscribers on a single physical server. The systems use DPDK for direct access to network cards and for load balancing across processor cores, achieving a latency of around 30 microseconds.

When a single server’s performance is exhausted, horizontal scaling takes over. A cluster architecture built around traffic balancers (Network Packet Brokers) can combine up to several dozen servers into a single system. The balancer distributes traffic between cluster nodes by hashing the source and destination IP addresses, which preserves a critical property, symmetry: all packets within a single session go to the same server.
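A minimal Python sketch of symmetric hashing follows. It assumes the balancer keys only on the IP address pair, and the function name and choice of SHA-256 are illustrative; real packet brokers typically do this in hardware with their own hash functions.

```python
import hashlib
import ipaddress


def pick_node(src_ip: str, dst_ip: str, node_count: int) -> int:
    """Map a flow to a cluster node so that both directions land on the same node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    # Sorting the pair makes the key independent of packet direction.
    low, high = sorted((a, b))
    digest = hashlib.sha256(f"{low}-{high}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count


if __name__ == "__main__":
    upstream = pick_node("100.64.0.10", "203.0.113.7", 8)
    downstream = pick_node("203.0.113.7", "100.64.0.10", 8)
    print(upstream == downstream)  # both directions map to one node
```

Ordering the two addresses before hashing is what gives the symmetry: swapping source and destination produces the same key, so the return traffic reaches the server that holds the session state.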

The cluster’s performance scales almost linearly and can reach 4.6 Tbps on standard x86 servers. A practical example of this architecture is an operator in Central Asia that built its network as two physical arms, each with four BNGs deployed. This scheme not only provides the necessary performance, but also lays the foundation for the next level of reliability: redundancy.

Fault tolerance through redundancy: Active-Active and Active-Standby

Scaling solves the performance problem, but creates a new challenge: what happens if one of the nodes fails? The failure of even one BNG in a cluster can leave thousands of subscribers without connectivity if no redundancy mechanisms are in place. That is why the next critical step is a well-designed fault tolerance scheme.

Modern converged BNGs support two main redundancy modes, Active-Active and Active-Standby:

  • In Active-Active mode, the load is evenly distributed among all operating nodes. For example, four BNGs can process traffic simultaneously, each taking on 25% of subscribers. If one of the servers fails, its load is automatically redistributed among the remaining three, with minimal downtime for end users. This scheme not only ensures fault tolerance, but also maximizes resource utilization, since no server sits idle waiting for a failure.
  • Active-Standby mode, on the other hand, relies on a “hot” reserve: the backup BNG stays in standby and takes over only if the main one fails. VRRP is often used for L2 IPoE BNGs to ensure fast switchover. Although some of the equipment is not constantly involved in traffic processing, this scheme provides rapid service recovery in the event of critical failures.
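
The Active-Active redistribution described above can be illustrated with a short Python sketch. The node names and both helper functions are hypothetical; the point is only to show the arithmetic, including the headroom each node needs in order to absorb a neighbor’s load.

```python
def redistribute(shares: dict[str, float], failed: str) -> dict[str, float]:
    """Spread the failed node's share of subscribers evenly over the survivors."""
    if failed not in shares:
        raise KeyError(failed)
    survivors = {node: share for node, share in shares.items() if node != failed}
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the load")
    extra = shares[failed] / len(survivors)
    return {node: share + extra for node, share in survivors.items()}


def max_safe_utilization(node_count: int) -> float:
    """Highest steady-state utilization per node that still survives one failure."""
    if node_count < 2:
        raise ValueError("at least two nodes are needed for redundancy")
    return (node_count - 1) / node_count


if __name__ == "__main__":
    cluster = {"bng1": 0.25, "bng2": 0.25, "bng3": 0.25, "bng4": 0.25}
    print(redistribute(cluster, "bng2"))  # each survivor now carries about a third
    print(max_safe_utilization(4))        # 0.75
```

In a four-node cluster each survivor goes from 25% to roughly 33% of subscribers, so nodes should run at no more than about 75% of their capacity in normal operation if the cluster is to ride out a single failure.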

Server-level redundancy is complemented by hardware redundancy within each platform. Modern solutions are equipped with redundant power supplies and an N+1 fan system, which protects against most hardware failures without the need to switch to a backup server. Connecting via link aggregation groups (LAG) provides fault tolerance at the network level: if one physical port fails, traffic is automatically transferred to the remaining ports.

The practical implementation of such schemes has been demonstrated by an operator in Central Asia. Its network is built as two physical arms, each with four BNGs operating in Active-Active mode. Two logical segments are implemented within each arm, and engineers switch between them manually as needed. This architecture has proven its effectiveness: during the move to the operator’s own data center, a hardware problem was discovered with one of the BNGs, but thanks to redundancy, the downtime was minimal.

Protection against DDoS attacks: built-in security mechanisms

Even the most scalable and fault-tolerant infrastructure will be useless if attackers can paralyze it. Telecommunications infrastructure has become a priority target for cybercriminals, making built-in DDoS protection not an optional extra, but an essential element of network reliability.

Attacks on operators take various forms, each of which requires specific protection methods. The most common scenario is overflowing input channels through amplification attacks (DNS, NTP, UDP flood) or using botnets. The second popular vector is a high PPS (packets per second) attack, usually via SYN flood or UDP flood with spoofing of the source IP address, the purpose of which is to exhaust the resources of network equipment. Finally, attackers may attempt to hack into the operator’s network elements themselves, gaining control over critical infrastructure.

Modern converged BNGs offer two levels of protection depending on the operator’s needs. The basic level includes built-in automatic protection against the most common attacks: SYN Flood, UDP Flood, and HTTP Flood. These mechanisms are activated automatically when abnormal traffic is detected and do not require additional configuration, ensuring an immediate response to the threat. For operators who need deeper protection, there are comprehensive solutions with integrated quality of service analytics modules.

The advanced protection architecture is built on the principle of “detection – analysis – mitigation.”

Analytical modules continuously collect and analyze traffic statistics via IPFIX, identifying anomalies in real time. The detector uses neural network algorithms combined with deep packet inspection (DPI) to accurately determine the type of attack and separate legitimate traffic from malicious traffic. After detection, the system can act in two ways: completely block incoming traffic to the attacked resource (blackhole) or selectively clean the traffic, allowing only legitimate connections to pass.
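
The “detection – analysis – mitigation” loop can be sketched in Python under strong simplifying assumptions. Here a static packets-per-second threshold stands in for the neural network detector, the flow records are hand-made rather than parsed from IPFIX, and all names (`Flow`, `detect`, `mitigate`) are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    dst_ip: str   # destination of the traffic
    packets: int  # packets seen for this flow during the window


def detect(flows: list[Flow], window_s: float, pps_threshold: float) -> dict[str, float]:
    """Return destinations whose packet rate exceeds the threshold, with their rates."""
    totals: dict[str, int] = defaultdict(int)
    for flow in flows:
        totals[flow.dst_ip] += flow.packets
    rates = {ip: count / window_s for ip, count in totals.items()}
    return {ip: pps for ip, pps in rates.items() if pps > pps_threshold}


def mitigate(attacked: dict[str, float], policy: dict[str, str],
             default: str = "blackhole") -> dict[str, str]:
    """Pick 'scrub' or 'blackhole' per attacked destination from an operator policy."""
    return {ip: policy.get(ip, default) for ip in attacked}


if __name__ == "__main__":
    window = [
        Flow("203.0.113.10", 900_000),
        Flow("203.0.113.10", 300_000),
        Flow("198.51.100.5", 1_000),
    ]
    attacked = detect(window, window_s=10, pps_threshold=50_000)
    # Assumed policy: this resource is business-critical, so clean instead of dropping.
    print(mitigate(attacked, {"203.0.113.10": "scrub"}))
```

The per-destination policy table reflects the two responses named above: blackholing everything sent to the attacked resource, or selectively cleaning the traffic so that legitimate connections still pass.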

The advantage of modern protection systems lies in their distributed architecture — protection can be deployed on several network nodes simultaneously, ensuring high fault tolerance of the security system itself. Adaptive algorithms automatically update filtering rules as the attack develops, without requiring manual intervention by engineers. Deep analytics not only allow you to repel the current attack, but also accumulate knowledge about the methods used by attackers, gradually increasing the effectiveness of protection. At the same time, the solutions remain flexible — the operator can choose different blocking scenarios depending on the type of resource under attack and business priorities.

Automation through Ansible: From Manual Configuration to DevOps

Scaling, redundancy, and protection against attacks create a powerful technical foundation for a reliable infrastructure, but they also give rise to a new problem: increasing management complexity. When dozens of BNG servers serving hundreds of thousands of subscribers are operating on a network, manually configuring each change becomes a bottleneck. Adding new rate plans, mass updating security settings, or migrating subscribers between nodes all require repeating the same operations on multiple devices, where every mistake can lead to service downtime.

The transition to automation via Ansible radically changes the operating model. Instead of manually connecting to each BNG and entering commands, the engineer describes the desired state of the infrastructure in a playbook, a text file written in YAML, which is then applied to all nodes simultaneously. Ansible modules are designed to be idempotent: re-running the same playbook does not create conflicts or duplicate settings; the system simply checks that the desired state has already been achieved.
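
The idea of idempotency can be shown without Ansible itself. The Python sketch below is not Ansible code, and the setting names are invented for illustration; it only mimics the "check the current state, change only what differs, report whether anything changed" behavior described above.

```python
def ensure_settings(current: dict, desired: dict) -> bool:
    """Bring `current` in line with `desired`; return True only if something changed."""
    changed = False
    for key, value in desired.items():
        if current.get(key) != value:
            current[key] = value  # apply only the settings that actually differ
            changed = True
    return changed


if __name__ == "__main__":
    # Hypothetical node state and desired state; keys are not real BNG parameters.
    node = {"policing_mbps": 50}
    desired = {"policing_mbps": 100, "ddos_protection": True}

    print(ensure_settings(node, desired))  # True: first run changes the node
    print(ensure_settings(node, desired))  # False: second run finds nothing to do
```

The second call returns `False` because the node already matches the desired state, which is the property that makes repeated playbook runs safe.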

The time savings become apparent from the very first use of automation. What used to require hours of work by an engineer — for example, changing the policing parameters for a specific category of subscribers on eight BNG servers — can now be done in minutes with a single command. But even more important is the minimization of human error: once the configuration is described in code and verified, it is applied identically to all nodes, eliminating typos, missed steps, or differences in settings between servers.

Scaling the infrastructure also becomes a simple task. Adding a new BNG server to the cluster no longer requires hours of manual configuration — just add a new host to the Ansible inventory and run the standard deployment playbook. Connecting new subscribers is similarly simplified: instead of manually creating configuration entries, you can import data from billing and automatically generate all the necessary settings. Ansible’s flexibility allows you to manage multiple devices simultaneously, making it the ideal tool for dynamically evolving networks.
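
As a rough illustration of generating settings from billing data, the Python sketch below turns a CSV export into per-subscriber entries. The column names, plan names, and rates are all assumptions made for the example; a real billing export and BNG configuration format will differ.

```python
import csv
import io

# Hypothetical mapping from rate plan to policing rate in Mbps.
PLAN_RATES_MBPS = {"basic": 50, "premium": 300}


def subscriber_settings(billing_csv: str) -> list[dict]:
    """Convert a billing export (login, ip, plan columns) into settings entries."""
    entries = []
    for row in csv.DictReader(io.StringIO(billing_csv)):
        plan = row["plan"].strip().lower()
        if plan not in PLAN_RATES_MBPS:
            # Fail loudly rather than provision a subscriber with a guessed rate.
            raise ValueError(f"unknown plan {plan!r} for {row['login']}")
        entries.append({
            "login": row["login"].strip(),
            "ip": row["ip"].strip(),
            "rate_mbps": PLAN_RATES_MBPS[plan],
        })
    return entries


if __name__ == "__main__":
    export = "login,ip,plan\nalice,100.64.0.10,basic\nbob,100.64.0.11,Premium\n"
    for entry in subscriber_settings(export):
        print(entry)
```

The resulting list could then be handed to a playbook as variables, so that the same billing record always produces the same subscriber settings on every node.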

It is particularly important that Ansible is a free, open-source tool — operators do not need to pay for configuration management system licenses. At the same time, working with Ansible does not require in-depth programming skills: the YAML syntax is intuitive, and the library of ready-made modules covers most typical tasks. Even an engineer with no development experience can create a working playbook, making automation accessible to operators of any scale.

Automation completes the circle of infrastructure reliability: technologically advanced equipment is complemented by operational excellence, where the human factor is minimized and the speed of response to changes is maximized.

Conclusion

The reliability of modern telecommunications infrastructure does not come from a single “silver bullet” solution, but from a comprehensive approach in which each of the four methods described above reinforces the others. Scaling provides the necessary performance for growth, redundancy protects against technical failures, built-in DDoS protection deflects external threats, and automation minimizes the human factor and speeds up response to changes. Together, they create a synergistic effect in which the reliability of the system is greater than the sum of its parts.

Converged BNG/BRAS solutions are not just a technology platform, but the foundation for building a truly resilient network that can withstand both technical failures and targeted attacks, while remaining flexible and manageable.