BRAS Scaling and Standby

June 3, 2021
The key hardware requirement for scaling is that the components and their drivers are compatible with the core software. The main components are the CPU, RAM, HDD, and network cards.

Note that highly loaded systems are built with only one CPU, to avoid the high memory-access delays that the NUMA architecture introduces.
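One rough way to see how many NUMA nodes a Linux host exposes is to count the `nodeN` entries in sysfs. The sketch below assumes a Linux system with the standard `/sys/devices/system/node` directory; the function name `count_numa_nodes` is hypothetical and is not part of any Stingray tooling.

```python
import os
import re

def count_numa_nodes(sysfs_dir="/sys/devices/system/node"):
    """Count nodeN entries under sysfs_dir; return 0 if the directory is absent."""
    try:
        entries = os.listdir(sysfs_dir)
    except OSError:
        # Not Linux, or sysfs is not mounted: nothing can be said here.
        return 0
    # Only names such as node0, node1 are NUMA nodes; skip other files.
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))

if __name__ == "__main__":
    print("NUMA nodes reported:", count_numa_nodes())
```

A result greater than 1 suggests that memory access can cross NUMA nodes, which is the situation the single-CPU recommendation is meant to avoid.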

Software BRAS has certain advantages compared to hardware-software solutions:

  • Independence from the hardware vendor
  • Ability to use available hardware
  • Fast deployment
  • Ability to create a demo zone based on the Virtual Machine for configuration validation
  • Flexible scalability through performance increase.

VAS Experts has extensive experience in implementing BRAS expansion solutions. Our clients often face the challenge of scaling network equipment, and here is what they say about their development and migration experience:

“Stingray SG-20 to Stingray SG-40 migration was absolutely easy. Stingray software worked without errors and the migration itself passed without any problems.”
Roman Trepalin, Post Ltd.

In this article, we will talk about several expansion options in detail with examples of scaling.

Scale-up (Vertical)

Vertical expansion occurs within a single device to achieve the desired performance. The current performance limit per server is up to 120G of total traffic. When new processors appear, this limit will be raised. We recommend using a standby server that the load can be switched to if the primary server fails.

Example of scaling up based on Intel servers

Memory is ramped up according to requirements. A platform with 8 cores can be used to start, and it is then possible to switch to 16 and 28 cores. Network cards with 10G/25G/40G ports may be used; a combination of different port types is not supported in this case.

Here is an example of expanding Stingray SG-6 to Stingray SG-60 on the same platform.

Platform: Supermicro SYS-5019P-WTR

Platform characteristics:

  1. Single Socket P (LGA 3647) supports 2nd Gen Intel® Xeon® Scalable processor (Cascade Lake/Skylake)
  2. 6xDIMMs; up to 1.5TB 3DS ECC DDR4-2933MHz RDIMM/LRDIMM
  3. 2xPCI-E 3.0 x16 (FHFL) slots, 1 PCI-E 3.0 x8 (LP) slot
  4. 4xHot-swap 3.5″ SATA3 drive bays
  5. 2x10GBase-T ports with Intel X722 + X557
  6. 1xVGA, 2 COM, 2 USB 3.0, 2 USB 2.0
  7. 2xSuperDOM (Disk on Module) ports
  8. 500W Redundant power supplies Platinum Level Certified

 

Table 1: Intel processor scale-up

 

Example of scaling up based on AMD servers

AMD multi-core processors are well suited to systems that need 50G full-duplex performance. Performance can also be scaled up by replacing the processor and adding memory and network adapters.

Here is an example of Stingray SG-60 to Stingray SG-120 expansion on the same platform.

Platform: Supermicro AS-1014S-WTRT or Supermicro AS-2113S-WTRT.

Important: To use 100G interfaces, you must install a motherboard with PCIe 4.0 support.

Platform characteristics:

  1. Single AMD EPYC™ 7002 Series Processor
  2. 8xDIMMs; up to 2TB 3DS ECC DDR4-3200MHz RDIMM/LRDIMM
  3. 2xPCI-E 4.0 x16 (FHFL) slots; 1xPCI-E 4.0 x16 (LP) slot
  4. Integrated IPMI 2.0 + KVM with dedicated LAN
  5. 4xHot-swap 3.5″ SATA3 drive bays, Optional 4 U.2 NVMe (PCI-E 3.0) drive support via additional kit for NVMe devices
  6. 2x10GBase-T LAN ports via Broadcom BCM57416 Controller
  7. 1xVGA, 5xUSB 3.0 (4 rear, 1 Type A)
  8. 2xSuperDOM (Disk on Module) ports
  9. 500W Redundant Power Supplies Platinum Level High-efficiency

 

Table 2: AMD scale-up

 

Example of Supermicro Platform Specification

When Stingray SG is used as an L2 BRAS (DHCP/ARP/PPPoE authorization), consider the additional load caused by analyzing each packet against additional parameters. This increases CPU consumption. In these scenarios, we recommend increasing the number of CPU cores by 30%. For example, for a Stingray SG-40 license it is better to take the Stingray SG-60 platform.
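The 30% recommendation can be checked with quick arithmetic. The sketch below is only an illustration: the function name `l2_bras_cores` and the choice to round up to a whole core are assumptions, and the inputs are the 8, 16, and 28 core steps mentioned in the Intel example.

```python
def l2_bras_cores(l3_cores: int, overhead_percent: int = 30) -> int:
    """Cores to plan for L2 BRAS mode, given a core count sized without it.

    overhead_percent follows the 30% recommendation; rounding up to a
    whole core is an assumption made for this sketch.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    return (l3_cores * (100 + overhead_percent) + 99) // 100

if __name__ == "__main__":
    for cores in (8, 16, 28):
        print(cores, "cores ->", l2_bras_cores(cores), "cores for L2 BRAS")
```

In practice the result has to be matched to a processor that actually exists, which is why the article suggests stepping up to the next platform rather than adding individual cores.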

Table 3: Supermicro scale-up

 

Scale-out (Horizontal)

Horizontal scalability is achieved by using multiple Stingray SG servers to balance and split the load. Adding servers resolves situations where a single device cannot handle all the traffic. It also increases the fault tolerance of the solution as a whole.

We recommend taking the performance of each device into account if you intend to transfer traffic from a server that goes out of service to another server.
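A simple way to reason about this is to check whether the servers that remain can carry all traffic after any single server goes out of service. The sketch below assumes capacities and traffic are given in Gbit/s and that traffic can be redistributed freely among the remaining servers; the function name and the figures in the usage example are hypothetical.

```python
def survives_single_failure(capacities_gbps, total_traffic_gbps):
    """Return True if the remaining servers can carry all traffic
    after any one server fails (worst case: the largest one)."""
    if len(capacities_gbps) < 2:
        return False  # a single server has nowhere to fail over to
    remaining = sum(capacities_gbps) - max(capacities_gbps)
    return total_traffic_gbps <= remaining

if __name__ == "__main__":
    # Two 60G servers sharing 60G of traffic can lose either one.
    print(survives_single_failure([60, 60], 60))
    # The same pair carrying 100G cannot.
    print(survives_single_failure([60, 60], 100))
```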

Let’s consider scaling and redundancy options separately for L2 and L3 BRAS schemes.

Conditions:

  • The BRAS servers have the same performance and licenses.
  • The PCRF server synchronizes the internal UDR databases (authorization status, policing, services).

L3-Connected BRAS

An L3 IPoE BRAS communicates with subscribers through intermediate routers, so it does not see the original MAC addresses, and subscribers already have IP addresses assigned. In this scheme, IP addresses are assigned either statically in the network settings or on the access switches via DHCP Relay.

 

L3 BRAS Hot Standby via LAG (Active-Active)

 

Figure: L3 (Active-Active)

Two devices are placed in the same LAG, with traffic balancing by source and destination IP configured. This balances the load between the two devices, and if one fails, all traffic is directed to the remaining server. Implementation requires an active license for each BRAS server.
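The idea of balancing by source and destination IP can be sketched as a hash over the address pair that selects one LAG member. This is an illustration only: real switches use vendor-specific hash functions, the member names are hypothetical, and sorting the address pair so that both directions of a flow land on the same server is an assumption of this sketch rather than something the scheme above specifies.

```python
import ipaddress
import zlib

def pick_member(src_ip, dst_ip, members):
    """Pick a LAG member for a flow from its source and destination IP."""
    if not members:
        raise ValueError("no active LAG members")
    pair = sorted([
        ipaddress.ip_address(src_ip).packed,
        ipaddress.ip_address(dst_ip).packed,
    ])
    # CRC32 stands in for the switch's hash; any stable hash would do.
    return members[zlib.crc32(b"".join(pair)) % len(members)]

if __name__ == "__main__":
    both = ["bras-1", "bras-2"]
    print(pick_member("10.0.0.5", "203.0.113.7", both))
    # After bras-1 fails, only bras-2 remains in the LAG.
    print(pick_member("10.0.0.5", "203.0.113.7", ["bras-2"]))
```

Removing a failed server from the `members` list models the failover case: every flow then maps to the remaining server.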

 

L3 BRAS Cold Standby in LAG (Active-Standby)

 

Figure: L3 (Active-Standby)

Two devices are placed in the same LAG, with some links inactive. If the main links fail, the standby links are activated, so all traffic is directed to the standby server if the primary server fails. Implementation requires a standby license for the standby BRAS server.

 

Cold Standby L3 BRAS via Routing (Active-Standby)

 

Figure: L3 routing (Active-Standby)

Two devices are placed in different LAGs. The main traffic route goes through the active server, and routing is configured on the edge routers. If the primary server fails, traffic switches to the standby route, so all of it reaches the standby server via the alternate path. Implementation requires a standby license for the standby BRAS server.
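The switchover logic on the edge routers can be thought of as choosing the lowest-cost route whose next hop is still reachable. The sketch below only models that decision; it is not router configuration, and the server names and metric values are hypothetical.

```python
def select_next_hop(routes, alive):
    """Pick the reachable next hop with the lowest metric.

    routes: list of (next_hop, metric) pairs
    alive:  set of next hops that are currently reachable
    """
    usable = [(metric, hop) for hop, metric in routes if hop in alive]
    if not usable:
        return None  # no route left: traffic cannot be forwarded
    return min(usable)[1]

if __name__ == "__main__":
    routes = [("bras-active", 10), ("bras-standby", 20)]
    print(select_next_hop(routes, {"bras-active", "bras-standby"}))
    print(select_next_hop(routes, {"bras-standby"}))
```

With both servers reachable the active server is chosen; once it disappears from the `alive` set, the standby route takes over.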

L2-Connected BRAS

An L2-connected BRAS has a direct L2 connection to the subscriber, so it sees the original MAC addresses, VLAN or Q-in-Q tags, and DHCP requests, which form the basis for RADIUS requests. IP addresses are issued in an attribute of the RADIUS Access-Accept response.

L2 BRAS options:

  • DHCP: The subscriber receives an IP address through the Stingray SG DHCP Proxy and passes AAA in the billing system. The session is then terminated on Stingray SG, and traffic goes to the border router.
  • Static IP: The subscriber has a fixed IP address and passes AAA in the billing system through ARP authorization. The session is terminated on Stingray SG, and traffic goes to the border router.
  • PPPoE: The subscriber establishes a PPP tunnel with Stingray SG, and login/password authorization passes AAA in the billing system. The session is terminated on Stingray SG, and traffic goes to the border router.

 

Cold Standby L2 BRAS (DHCP, PPPoE) (Active-Standby)

 

Figure: L2 (Active-Standby)

Two devices are placed in the same L2 domain. The active server responds to subscriber requests and performs authorization. After successful authorization, routes are announced to the border router via OSPF/BGP, so routing is dynamic. If the primary server fails, the standby server begins to answer requests and announce subscribers. The switchover can be done in different ways, for example by bringing up ports on the standby server or by rewriting the VLAN on the aggregation switch. Implementation requires a standby license for the standby BRAS server.

 

L2 BRAS Hot Standby (DHCP, PPPoE) (Active-Active)

 

Figure: L2 (Active-Active)

Two devices are placed in the same L2 domain. Both servers are active, respond to subscriber requests, and perform authorization. For PPPoE, this mode follows from the design of the protocol itself: the subscriber establishes a connection to the server that responded first. For DHCP, this mode is in development. After successful authorization, routes are announced to the edge router via OSPF/BGP, so routing is dynamic. If one of the servers fails, the other server begins to handle all traffic. Implementation requires an active license for each BRAS server.
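The "first response wins" behavior of PPPoE can be modeled in a few lines. The sketch below only illustrates the selection rule described above: the server names and response delays are hypothetical, and a delay of `None` is used here to mark a server that is down and sends no offer.

```python
def choose_pppoe_server(offer_delays_ms):
    """Return the server whose offer arrives first, or None if none answers.

    offer_delays_ms: dict of server name -> offer delay in milliseconds,
                     with None for a server that does not respond.
    """
    offers = {name: d for name, d in offer_delays_ms.items() if d is not None}
    if not offers:
        return None
    # The subscriber connects to whichever server answered soonest.
    return min(offers, key=offers.get)

if __name__ == "__main__":
    print(choose_pppoe_server({"bras-1": 3.0, "bras-2": 5.0}))
    # bras-1 is down, so new sessions land on bras-2.
    print(choose_pppoe_server({"bras-1": None, "bras-2": 5.0}))
```

This is why no extra balancing mechanism is needed for PPPoE in this mode: a failed server simply stops answering, and new sessions go to the one that still does.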
