I'm pleased to announce the open-source release of Flow management, an anti-DDoS solution for high-traffic DNS servers. It is based on the well-known Solaris Crossbow flow feature, available since OpenSolaris 2009.06, and on a new ORACLE Solaris feature, CR 18734919 (support for DSCP marking on flows), which appeared in Solaris 11.2 on March 18, 2015.
Until recently, Solaris IP QoS (a complex feature) was the only Solaris feature that handled DSCP policies, but it lacks an API. In contrast, Solaris Crossbow flows (a lightweight feature) have an API (although unpublished) and a simpler configuration. Each flow has one or more attributes (which classify the traffic endpoint) and may have one or more properties (which are applied to the traffic that matches the endpoint). The flow property maxbw sets a bandwidth limit shared by all traffic in the flow and serves for traffic control. However, it makes little sense on its own when there are multiple sources of attack (the DDoS case) and the design requires each of them to be reflected by a dynamically created flow.
The newly introduced flow property dscp adds the ability to build cascaded flows: traffic originating from different sources is marked and then classified into one flow (via the flow attribute dsfield, which matches the ToS bits), and the maxbw limit is finally applied to that flow. This creates the basis for DoS mitigation software.
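To make the relationship between the DSCP marking and the DS field (ToS byte) concrete, here is a minimal Python sketch. It relies only on the RFC 2474 layout, where the DSCP occupies the upper six bits of the DS field. The codepoint chosen for suspect traffic, the example networks, and the function names are assumptions made for illustration; they are not taken from the project sources.

```python
import ipaddress

DSCP_MASK = 0xFC  # upper six bits of the DS field carry the DSCP (RFC 2474)

def dscp_to_dsfield(dscp: int) -> int:
    """Return the DS field (former ToS byte) value for a 6-bit DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits (0..63)")
    return dscp << 2

def dsfield_to_dscp(dsfield: int) -> int:
    """Extract the DSCP from a DS field byte, ignoring the two low bits."""
    return (dsfield & DSCP_MASK) >> 2

# Hypothetical policy: every suspect source gets the same codepoint, so one
# downstream flow matching that DS field value can cap them all together.
SUSPECT_DSCP = 8   # illustrative choice only
DEFAULT_DSCP = 0

def classify(source: str, suspects) -> int:
    """Return the DSCP to mark traffic from `source` with."""
    addr = ipaddress.ip_address(source)
    if any(addr in net for net in suspects):
        return SUSPECT_DSCP
    return DEFAULT_DSCP

suspects = [
    ipaddress.ip_network("198.51.100.0/24"),  # documentation ranges,
    ipaddress.ip_network("203.0.113.7/32"),   # used here as fake attackers
]
```

Many per-source marking decisions collapse into a single DS field value, which is what lets one aggregated flow with a single maxbw limit cover all of them.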
Note that at least two zones are needed to organize cascaded flows, due to Solaris flow design constraints. However, those zones can cohabit on the same physical server, providing an out-of-the-box, cost-effective solution where both DoS protection and the target service coexist.
The central part of the software is the flow-mgmtd binary, which can run both as a data producer (the client that sends what should be blocked) and as a daemon (the data consumer that handles flows dynamically). All interaction goes over IP multicast, so you don't need to configure connection endpoints in a complex HA scheme. The current production scheme includes 4 redundant Quagga routing zones and 6 standalone DNS server zones (each of which can be served by any available routing zone). The flow client operates within a DNS server zone, and the flow daemon operates on top of a Quagga zone.
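The producer/consumer exchange over IP multicast can be sketched in Python as follows. This is an illustration under stated assumptions, not the flow-mgmtd wire format: the group address, the port, and the JSON field names (source, dscp, ttl) are all hypothetical.

```python
import json
import socket
import struct

GROUP = "239.1.1.1"  # hypothetical administratively scoped multicast group
PORT = 5000          # hypothetical UDP port
REQUIRED = {"source", "dscp", "ttl"}  # hypothetical message fields

def encode_command(source: str, dscp: int, ttl_seconds: int) -> bytes:
    """Serialize one marking request as a JSON datagram."""
    message = {"source": source, "dscp": dscp, "ttl": ttl_seconds}
    return json.dumps(message, sort_keys=True).encode("utf-8")

def decode_command(datagram: bytes) -> dict:
    """Parse a datagram and reject messages with missing fields."""
    command = json.loads(datagram.decode("utf-8"))
    if not isinstance(command, dict) or REQUIRED - set(command):
        raise ValueError("malformed command")
    return command

def open_sender(hops: int = 1) -> socket.socket:
    """Producer side: a UDP socket with a bounded multicast TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", hops))
    return sock

def open_receiver() -> socket.socket:
    """Consumer side: bind to the port and join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    membership = struct.pack("4s4s", socket.inet_aton(GROUP),
                             socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock
```

A producer would call `open_sender().sendto(encode_command("198.51.100.9", 8, 300), (GROUP, PORT))`, and every consumer that has joined the group receives the same datagram. This is why no per-router endpoint configuration is needed when routing zones are added or removed.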
DSCP-marked traffic arrives at the DNS server zone, where two statically configured flows aggregate the traffic by DS field. It is possible to create an additional flow with priority set to high in order to ensure that traffic originating from certain clients is prioritized according to SLA.
The project has been running in production since April 15, 2015 and helps mitigate various DNS DoS attacks (slow drip, resource exhaustion, etc.) on a caching DNS platform that serves 100 Mbps of DNS traffic.
This article does not cover the detection of attacks or the possible application of the project to other fields. For example, it is possible to protect web services with the existing software without major changes, because the analytical data about attacks is fed in as JSON. You could imagine a simple script that periodically generates a JSON file, which is then replicated by the Flow Management Agent.
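Such a generator script might look like the following Python sketch. The report layout (generated_at, entries, source, dscp, ttl) is a hypothetical schema chosen for illustration; the real format expected by the agent is defined by the project sources. The file is written to a temporary name and then renamed, so a reader never sees a half-written report.

```python
import json
import os
import tempfile
import time

def write_report(path: str, offenders: dict, ttl_seconds: int = 300) -> dict:
    """Write a JSON report of offending sources, replacing `path` atomically.

    `offenders` maps a source address to the DSCP it should be marked with.
    """
    report = {
        "generated_at": int(time.time()),
        "entries": [
            {"source": source, "dscp": dscp, "ttl": ttl_seconds}
            for source, dscp in sorted(offenders.items())
        ],
    }
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".report-",
                                    suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(report, handle)
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except Exception:
        os.unlink(tmp_path)  # do not leave the temporary file behind
        raise
    return report
```

Run from cron or a loop, a call such as `write_report("attackers.json", {"198.51.100.9": 8})` would refresh the file on each pass; the file name here is only an example.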
You can download the sources.
Project goals:
- protection of DNS servers against DDoS attacks and improved quality of service,
- traffic prioritization according to SLA levels.
Features:
- based on Solaris Crossbow flow feature and flow DSCP marking (CR 18734919),
- tested on ORACLE Solaris 11.2 SRU8.4,
- running the software in Solaris non-global zones is supported,
- compatible with Differentiated Services Field (protocol standard RFC 2474),
- the source of analytics can be file-based (currently implemented) or interactive,
- autoprovisioning of new attacking sources thanks to the use of IP multicast transport,
- an SQLite3 backend provides fast recovery with minimal impact on the system. When flow-mgmtd is restarted, it uses data from the hardcoded file /var/run/flow-mgmt.flow.db, which contains previously recorded information about expiring events and created flows. Please note that flows are created as temporary and are not written to the standard flow database path, in order to support a custom data model and to stay compatible with the Solaris immutable zones feature.
How it works:
- the flow client (DNS server side) receives traffic analytics from outside (DNS cache) and decides whether to apply traffic policies,
- it transforms the analytical results into JSON,
- it sends the JSON payload to the subscribed members of the IP multicast group (located on redundant OSPF routers),
- the flow agent (OSPF router side) receives the JSON commands and applies DSCP policies to forwarded traffic (by creating dynamic flows),
- the flow agent expires the policies after a configurable timeout (thus removing expired flows from the system),
- customer traffic arrives at the consumer (DNS server) and is distributed among different (statically configured) flows according to the DSCP policies,
- high traffic spikes are suppressed before they reach the consumer (DNS server), thus saving CPU cycles on the DNS server,
- traffic from customers classified into different sinks cannot affect one another.
Advantages:
- small CPU footprint under high load,
- traffic is managed intelligently, not simply dropped; packets start to be dropped only when the flow bandwidth allocation maxbw is exceeded,
- if it isn’t running, all things continue to work as before, but unprotected.
Limitations:
- ORACLE doesn't publish the libdlflow API (part of the Datalink management library), and it remains undocumented.
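The expiry and restart-recovery behaviour described in the list above can be sketched with Python's sqlite3 module. This is a minimal illustration, assuming a hypothetical single-table schema (name, source, dscp, expires_at); the actual layout of /var/run/flow-mgmt.flow.db is not documented here and may differ.

```python
import sqlite3
import time

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the state database, creating the (hypothetical) table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS flows ("
        " name TEXT PRIMARY KEY,"
        " source TEXT NOT NULL,"
        " dscp INTEGER NOT NULL,"
        " expires_at REAL NOT NULL)"
    )
    return db

def record_flow(db, name, source, dscp, ttl_seconds, now=None):
    """Remember a dynamically created flow and when it should expire."""
    now = time.time() if now is None else now
    with db:  # commit on success, roll back on error
        db.execute("INSERT OR REPLACE INTO flows VALUES (?, ?, ?, ?)",
                   (name, source, dscp, now + ttl_seconds))

def expire_flows(db, now=None):
    """Delete expired records and return their names for flow removal."""
    now = time.time() if now is None else now
    with db:
        rows = db.execute(
            "SELECT name FROM flows WHERE expires_at <= ? ORDER BY name",
            (now,)).fetchall()
        db.execute("DELETE FROM flows WHERE expires_at <= ?", (now,))
    return [row[0] for row in rows]

def live_flows(db, now=None):
    """Flows to recreate after a restart, since temporary flows are lost."""
    now = time.time() if now is None else now
    return db.execute(
        "SELECT name, source, dscp FROM flows WHERE expires_at > ?"
        " ORDER BY name", (now,)).fetchall()
```

On startup a daemon built this way would call `live_flows` to recreate the flows that are still valid, then call `expire_flows` periodically and remove the corresponding system flows. Keeping this state in a private database rather than in the standard flow database path matches the temporary-flow design described above.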