Yes, more than 100 Gb/s can be implemented on a single server: using 2–4 sockets, 400 Gbps can be achieved. Nevertheless, single-socket systems with a sufficient number of cores are preferable.
You need to disable Hyper-Threading for the correct operation of the Stingray SG: it works only with physical processor cores.
Yes, you can. Both the BRAS/BNG and the DPI functions can be controlled per subnet or per autonomous system; this is configurable.
The Stingray SG database contains a mapping of all addresses to their autonomous systems, so you can determine which ASN traffic belongs to and correlate this with the geographic location of the AS owner. At the moment, there is no direct integration with IP geolocation services.
Approximately 4 terabytes in 6 months with an average daily load of 1 gigabit/s. More accurate data can be found experimentally.
MPLS — yes.
GRE — yes, but only if the traffic in the tunnel is not encrypted.
QoE data is stored in the ClickHouse database.
Yes, it is possible. Both in the “In-Line” and in the “Mirror” schemes.
- DHCP L3 — The subscriber receives an IP address from a DHCP server (not connected to the SCAT in any way) and reaches the SCAT through the routed network, where it is authorized in Billing by source IP (policing, services) and passes to the next router after the SCAT (the IP can be translated using CG-NAT).
- DHCP L2 — The subscriber receives an IP address through the SCAT's DHCP Proxy or DHCP Relay and is authorized in Billing, then is terminated by the SCAT and passes to the router behind it.
- Static L2 — The subscriber has a fixed IP address; based on the data the SCAT obtains from the ARP request to the gateway, the subscriber is authorized in Billing, terminated by the SCAT, and passes to the border.
- PPPoE — The subscriber establishes a PPP tunnel to the SCAT, is authorized in Billing by login/password, is terminated by the Stingray SG, and passes to the border.
It is suggested to use the SNMP agent installed on the server to send information to the operator's existing monitoring system.
Many telecom operators use the Zabbix monitoring system. In this case, zabbix-agent is installed on the Platform servers with its standard configuration and is connected to the Zabbix server. Use the standard “Template OS Linux” template to monitor basic OS parameters, and also attach the custom “Template DPI” template, which monitors fastDPI-specific parameters. The availability of zabbix-agent for the Platform lets you use the full power of this monitoring system.
File rotation creates a daily backup of the daily log. By default, this process runs during the hours of least system load. The log retention depth is defined by the maxage parameter in the /etc/logrotate.d/fastdpi configuration; the value is specified in days.
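A sketch of what that configuration might look like (the log path and the other directives are illustrative assumptions; only the file location and the maxage parameter come from the text):

```
# /etc/logrotate.d/fastdpi (illustrative sketch; real directives may differ)
/var/log/dpi/fastdpi_*.log {
    daily            # rotate the daily log once per day
    compress
    missingok
    maxage 30        # retention depth in days
}
```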
The DPI server acts as a sensor: it records flow information in portions and sends them to the QoE Stor collector (default interval is 1 minute). After QoE Stor receives the data, it converts it into tabular values (Raw log) and then aggregates it (Aggregated log).
There are 4 types of raw logs in QoE Stor:
- raw full NetFlow, generated from the netflow_full stream
- raw Clickstream, generated from the ipfix_tcp stream
- raw GTP Flow, for mobile networks
- raw NAT Flow, information about NAT translations
4 types of aggregated logs:
- aggregated full NetFlow
- aggregated Clickstream
- GTP Flow
- NAT Flow
Raw logs consist exclusively of tables with all collected information on flows: session start/end/ID, IPv4/v6 source/destination addresses before/after NAT, source/destination ports, destination autonomous systems (AS), etc. The table template is described in NetFlow Formats.
Aggregated logs are generated based on raw data by combining several records for one IP address, which reduces the amount of storage and increases the speed of reporting, but the detail for each session is lost.
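As a toy illustration (not the actual QoE Stor schema), aggregation collapses per-session raw records into per-IP totals, losing the session detail:

```python
from collections import defaultdict

# Hypothetical raw flow records: in the real system these live in ClickHouse
# tables with many more fields (ports, AS numbers, NAT addresses, etc.).
raw_log = [
    {"ip": "10.0.0.1", "session_id": "a1", "bytes": 1200},
    {"ip": "10.0.0.1", "session_id": "a2", "bytes": 800},
    {"ip": "10.0.0.2", "session_id": "b1", "bytes": 500},
]

# Aggregation: combine several records for one IP address; per-session
# detail (session_id) is lost, but storage shrinks and reports speed up.
aggregated_log = defaultdict(int)
for record in raw_log:
    aggregated_log[record["ip"]] += record["bytes"]

print(dict(aggregated_log))  # {'10.0.0.1': 2000, '10.0.0.2': 500}
```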
For a subsystem you can use hardware or virtual machines with the following characteristics:
- Processor (CPU) 2.5 GHz — 1 pc. (SSE 4.2 instruction set support is required. The number of cores matters more than clock speed: for example, 16 cores at 2600 MHz is better than 8 cores at 3600 MHz)
- Random-access memory (RAM) — from 16 GB (memory should be no smaller than the volume of data queried. The more memory, the better the report-building performance and the lower the disk load. The minimum requirement is 16 GB. Always disable swap)
- Hard drive (SSD highly desirable) — from 500 GB (disk space required is from 16 GB per day of storage, depending on traffic. It is estimated that 10 Gbps of average daily traffic generates approximately 25 GB of data per hour in QoE Stor). A sizing calculator is available on the Wiki.
- Operating system — CentOS 8+
- Network card (NIC) — from 1 Gbps.
*The module for receiving and processing statistics must not be installed on the same server as the DPI platform: report generation is demanding on CPU resources, which may adversely affect the performance of the DPI platform.
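The sizing rule above (roughly 25 GB per hour in QoE Stor for every 10 Gbps of average traffic) can be turned into a back-of-the-envelope estimate; the function below is an illustrative sketch, not part of the product:

```python
def qoe_storage_gb(avg_traffic_gbps: float, retention_days: int,
                   gb_per_hour_per_10g: float = 25.0) -> float:
    """Rough QoE Stor disk estimate from average traffic and retention."""
    gb_per_hour = gb_per_hour_per_10g * (avg_traffic_gbps / 10.0)
    return gb_per_hour * 24 * retention_days

# 10 Gbps of average traffic kept for 30 days needs on the order of 18 TB:
print(qoe_storage_gb(10, 30))  # 18000.0
```

For exact planning, use the calculator on the Wiki; real volumes depend on the traffic mix and the logs enabled.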
Four components are used to work with IPFIX (NetFlow v10):
- Flow exporter: a device responsible for collecting information about flows and exporting it to the flow collector – the Stingray SG DPI module.
- Collector: The application that receives the exported flow information is ipfixreceiver2 as part of the QoE Stor.
- Flow Analyzer: An application that analyzes the flow information collected by the collector. Information about the flow is recorded for storage and analysis in the ClickHouse database.
- Management interface: an application that visualizes flow statistics and manages the entire traffic control and analysis system (SCAT DPI) – the DPIUI2 graphical user interface.
The Quality of Experience (QoE) module is a software product for collecting statistics and evaluating the quality of perception of services.
The statistics collected by the module are mapped onto specific metrics to measure user experience and answer the question of how good the communication and Internet access services received by the end user are.
The data obtained allow the operator to take the necessary actions to improve the quality of services and, as a result, to increase subscriber loyalty.
No; if the Stingray SG is restarted, it will continue to work in the same mode.
There are the following options for receiving and processing statistics:
- Free product NFSEN in conjunction with NFDUMP – only suitable for receiving NetFlow v5 to analyze statistics by protocol and direction.
- Ipfixreceiver – used to receive IPFIX, store it in a local file.
- QoE Stor module in conjunction with DPIUI2 GUI.
We recommend using QoE Stor in order to get the maximum effect from the statistics uploaded from the DPI module.
NetFlow is sent to the collector from a dedicated DPI interface, typically the mgmt interface which is used for SSH access, RADIUS integration, etc.
*There can be only one stream of each type. If it is necessary to send to several destinations, re-export from the QoE module or other collectors is used.
The bandwidth required to export NetFlow data is typically less than 0.5% of the total bandwidth consumption. For example, if you are monitoring a link using 100 Gbps, DPI will use less than 0.5 Gbps to export NetFlow data.
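The 0.5% rule of thumb is easy to turn into a quick estimate; this helper is purely illustrative:

```python
def netflow_export_gbps(monitored_gbps: float,
                        overhead_ratio: float = 0.005) -> float:
    """Upper estimate of export bandwidth: ~0.5% of the monitored link."""
    return monitored_gbps * overhead_ratio

# The example from the text: monitoring a 100 Gbps link
print(netflow_export_gbps(100))  # 0.5 (Gbps)
```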
DPI accumulates statistics and transmits it to the collector at a specified frequency. Based on the principles of DPI operation, statistics will always be uploaded with some delay depending on the time intervals set in the configuration file.
NetFlow v5 is limited to IPv4 flows. The selection of fields that can be exported using this version is also limited.
IPFIX — sometimes referred to as NetFlow v10. It introduces two new concepts for flow monitoring: first, templates are allowed; second, the user can “look” deep into packets and select the fields of interest for monitoring. IPFIX allows selecting almost all fields from the IP header, all types of TCP flags, and other information, including VLAN tags and URLs.
Benefits of IPFIX over traditional NetFlow v5:
- flexibility, scalability, and flow data aggregation beyond the capabilities of traditional NetFlow
- the ability to monitor a wide range of information about IP packets, from layer 2 to layer 7
- user-configurable flow information allowing customization of traffic identification (an opportunity to single out and monitor specific network behavior).
- Inserting advertising banners
- Ad blocking
- Website allow and deny lists
- Subscriber notification (for requests via HTTP)
- DDoS protection
- Collection of NetFlow statistics for billing and accounting
- Recording subscriber traffic in PCAP
- ACL (mini firewall)
- Diverting traffic to external platforms
- Supporting marketing activities
- By the IP address of a packet received when using BRAS in L3 mode.
- By PPPoE — an authorization packet (PADI) received from the subscriber's PPPoE port. It allows identifying a subscriber by login, MAC address, or QinQ tags.
- By DHCP packet — from the broadcast DHCP Discover packet received from the subscriber port in L2 mode, which provides the information to identify the subscriber by MAC address or QinQ tag.
If more traffic must be handled than one DPI platform can process (100G full-duplex), additional DPI modules are installed. For correct operation, traffic symmetry must be ensured: all packets within one session must pass through the same DPI device. Symmetry can be achieved by balancing on SRC/DST IP on switches and routers or by installing dedicated balancers — Packet Brokers.
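A direction-independent hash is one common way to meet the symmetry condition. The sketch below (not the actual balancer logic) keys on the unordered IP pair, so both directions of a session map to the same DPI device:

```python
import hashlib

def dpi_device(src_ip: str, dst_ip: str, n_devices: int) -> int:
    """Pick a DPI device from the unordered (src, dst) IP pair, so the
    forward and reverse directions of a session always land together."""
    key = "|".join(sorted((src_ip, dst_ip))).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_devices

# Both directions of the same session choose the same device:
assert dpi_device("10.1.2.3", "8.8.8.8", 4) == dpi_device("8.8.8.8", "10.1.2.3", 4)
```

Hardware balancers achieve the same effect by hashing only fields that are identical for both directions of a flow.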
The platform is completely transparent to the application of the LACP link aggregation protocol, subject to the following conditions:
- traffic processing ports belong to the same cluster
- ports are configured symmetrically and properly connected
- traffic from/to the Subscriber (source IP addresses) passes through the same traffic processing ports.
As a control protocol for LAG, both standard LACP (IEEE 802.3ad) and proprietary ones, such as PAgP, can be used.
*LAG protocols have a rather long “convergence” time of up to 30 seconds (without special tuning), which may affect the correct passage of traffic through the platform.
Bypass support is implemented for Silicom 40 GbE, 10 GbE, and 1 GbE cards. Multiport network cards have a special “Bypass” mode: when it is enabled, traffic is physically looped between the two ports directly on the NIC and thus bypasses processing on the platform.
This mode is used when it is important to keep traffic going even if the system fails. Of course, when “Bypass” is enabled, all platform functions are unavailable.
*Enabling this mode on the network card should be considered as an emergency.
The module is demanding on the number of processor cores, the amount of RAM, and the speed of the disk subsystem. Requirements: at least 16 GB of RAM, at least 4 CPU cores at 2.5 GHz or higher, and a disk subsystem on an SSD drive with at least 1000 MB for the database and at least 128 MB for the application. Storage calculation: on average, 10 Gbps of DPI traffic requires about 25 GB of data storage in the database.
For management – a separate network port (at least 1 Gbps) on any chipset supported by the operating system. Operating system: CentOS 8+, with swap disabled.
If only one server with DPI is used, the PCRF module can be placed on the same server. In cases where multiple DPIs are used (for load balancing/redundancy), it makes sense to move the PCRF to a separate server.
Requirements: at least 2 GB of RAM, 2 CPU cores at 2.5 GHz or higher, and at least 128 GB of disk space.
For management – a separate network port (at least 100 Mbps) on any chipset supported by the operating system.
Operating system: CentOS 8+ (x64).
It is quite a resource-intensive module: it imposes increased requirements on the CPU and on RAM speed and capacity; a fast HDD may suffice, but an NVMe SSD is preferable.
- One processor with SSE 4.2 support (Intel Nehalem or AMD EPYC Zen 2 and newer) with 4 or more cores and a base clock speed of 2.6 GHz or higher.
- Hyper-Threading must be disabled on the server.
- A minimum of 3 ports: one for SSH management (any chipset) and two for traffic processing – network cards based on chipsets with DPDK support.
It is recommended to use only tested cards based on Intel chipsets (with 2, 4, or 6 ports):
e1000 (82540, 82545, 82546)
e1000e (82571, 82572, 82573, 82574, 82583, ICH8, ICH9, ICH10, PCH, PCH2, I217, I218, I219)
igb (82573, 82576, 82580, I210, I211, I350, I354, DH89xx)
ixgbe (82598, 82599, X520, X540, X550)
10GbE and 40GbE interfaces
i40e (X710, XL710, X722, XXV710)
mlx5 (Mellanox ConnectX-5 Ex)
ice (Intel E810) – not recommended: there are problems in the Intel firmware on the card; it does not pass GRE tunnels
- Bypass support implemented for Silicom 40 GbE, 10 GbE, and 1 GbE cards
- Operating system: CentOS 8+ (x64)
It is possible to transfer traffic from/to Subscribers to the platform through a “mirror” using an optical splitter or SPAN mode on the router. In this mode, the platform can provide filtering by blocklists (similar to the “asymmetric” mode), collection of statistics on traffic in QoE Stor, notification of subscribers via HTTP Redirect.
A Radius monitor, which uploads IP–login bindings to the UDR DPI, is used to obtain information about which login an IP address belongs to.
IP packets toward clients and hosts are sent from a response port, which is used to implement filtering and notifications.
*Note that in this mode the platform's network interface works only to “receive” traffic. For example, if a 10 Gbps port is used and the “mirrored” traffic amounts to 8 Gbps inbound (to subscribers) and 5 Gbps outbound (from subscribers), the total is 13 Gbps, which clearly exceeds the capacity of the physical receiving port.
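The capacity check in the example reduces to simple arithmetic; this helper is illustrative only:

```python
def span_port_overloaded(port_gbps: float, to_subs_gbps: float,
                         from_subs_gbps: float) -> bool:
    """In mirror/SPAN mode both directions arrive on one receive-only
    port, so their sum must fit within the port's capacity."""
    return to_subs_gbps + from_subs_gbps > port_gbps

# The example above: 8 + 5 = 13 Gbps mirrored into a 10 Gbps port.
print(span_port_overloaded(10, 8, 5))  # True: the port is oversubscribed
```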
In some schemes for using the DPI platform, it is possible to use only outgoing traffic from subscribers. These options include the blocked sites filtering mode.
A feature of this mode is that the DPI module receives only traffic from subscribers, and the output from the platform is either the formation of a “fictitious” response from a prohibited resource for the router or simply blocking a packet destined for a prohibited resource.
This mode is used when the operator does not have the ability to use the platform in symmetrical mode or there is no need for other platform functions. This mode is not recommended.
If authorization functions are already implemented in the network, the DPI is inserted after the BRAS/BNG and before NAT. Thus, the real private IPs of subscribers will be visible in the analyzed traffic.
If the platform performs the BRAS/NAT/DPI functions, it is inserted into the network core between the aggregation level and the border router.
The DPI platform always has two directed ports: toward the Subscriber, connected to the data network, and toward the Internet, connected to the border router.
Thus, traffic from/to the Subscriber must pass through the platform; this is called a symmetric connection. This mode provides full functional use of the platform, but the platform can also work in “asymmetric” or “mirroring” mode.
The component is responsible for the interaction of the platform with the AAA server of the telecom operator via the RADIUS protocol. (AAA – Authentication, Authorization, Accounting). Used when BRAS/BNG features are enabled on the DPI Platform.
The DPI and PCRF components communicate with each other using an internal communication protocol via the TCP/IP stack. PCRF can be hosted either on a separate physical or virtual server or run on the same server along with DPI.
If multiple DPIs are used, the 1xPCRF: NxDPI scheme is used.
DPI (fastDPI) is the main component that the platform as a whole cannot function without. DPI provides analysis and processing of traffic passing through the platform, application of various services to traffic, and its management.
The main functional tasks of DPI:
- Application of rate policies to all traffic or individually per subscriber
- Application of platform services (CG-NAT, Allow list, Blocked Sites Filter, etc.)
- Formation of traffic information export in various formats (NetFlow, IPFIX-clickstream, IPFIX-nat, IPFIX-flow, etc.)
- BRAS functions (IPoE, PPPoE, DHCP L2)
* Optional – this option can be added to the main License with additional payment.
- The router announces a route (BGP, OSPF) to the NAT Pool at the time it is created.
- For subscribers with public addresses, the announcement is made after the subscriber has been authorized.
- The user makes the request via PPP (PAP, CHAP, MS-CHAPv2).
- SSG processes the request and forms an Access-Request to the Radius server via PCRF.
Acct-Session-Id = “001122334455667788”
User-Name = “login04”
User-Password = “password”
Calling-Station-Id = “84:16:f9:05:8b:12”
NAS-Port-Type = 5 (Virtual)
NAS-Port = 0
NAS-Port-Id = “321/654”
NAS-IP-Address = “220.127.116.11”
VasExperts-Service-Type = 3
- Radius generates an Access-Accept response specifying the name of the DHCP server pool from which the private IP address will be assigned.
Session-Timeout = 86400
User-Name = “login04”
Framed-Pool = default
VasExperts-Policing-Profile = “rate_50mbits”
VasExperts-Service-Profile = “11:CG-NAT_pool_001”
- The PCRF server makes a DHCP Request to the DHCP server and receives the parameters for a specific subscriber.
- The login received from Radius and the IP address received from the DHCP server are bound together. Policing and service data from Radius are applied to the login.
- The SSG sends a response to the user confirming that the PPPoE connection has been successfully established.
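The exchange above can be sketched as plain dictionaries. The attribute names follow the Access-Request and Access-Accept examples above; a real deployment would use a RADIUS library talking to PCRF, and both helpers below are purely illustrative (the CG-NAT-range IP is a made-up example):

```python
def build_access_request(login: str, password: str, mac: str,
                         session_id: str) -> dict:
    """Shape of the Access-Request the SSG sends via PCRF (illustrative)."""
    return {
        "Acct-Session-Id": session_id,
        "User-Name": login,
        "User-Password": password,
        "Calling-Station-Id": mac,
        "NAS-Port-Type": 5,  # Virtual
        "VasExperts-Service-Type": 3,
    }

def bind_session(access_accept: dict, dhcp_ip: str) -> dict:
    """Bind the login from Radius to the IP issued by the DHCP server,
    carrying over policing and service data (illustrative)."""
    return {
        "login": access_accept["User-Name"],
        "ip": dhcp_ip,
        "policing": access_accept["VasExperts-Policing-Profile"],
        "services": access_accept["VasExperts-Service-Profile"],
    }

accept = {"User-Name": "login04", "Framed-Pool": "default",
          "VasExperts-Policing-Profile": "rate_50mbits",
          "VasExperts-Service-Profile": "11:CG-NAT_pool_001"}
print(bind_session(accept, "100.64.0.17"))
```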
SSG provides analysis and processing of traffic that passes through the platform, applying traffic policing and management. Through a combination of DPI technology, software BRAS/BNG, and CG-NAT it resolves most of the ISP’s challenges.
The DHCP server is responsible for issuing private addresses to subscribers. Any CentOS-compatible implementation can be used. The current implementation uses the Kea DHCP server.
A software router is a component that announces and receives routes via OSPF, BGP, and other dynamic routing protocols supported by the router daemon. The current implementation uses BIRD.
PCRF provides the platform interaction with the operator’s AAA-server via the RADIUS protocol (AAA — Authentication, Authorization, Accounting).
QoE Stor is a set of applications for collecting and storing traffic information from SSG.
There are three options for implementation:
- purchase a standard x86 server from any manufacturer
- use the equipment you already have
- deploy BNG as a virtual system (VNF component on an ESXi, KVM, or Hyper-V server).