RESEARCH AND DEVELOPMENT

Shared Memory Enabled Service Plane Optimization

Networking devices that support "bump in the wire" deployment (e.g. all Cisco routers and switches) will benefit from the service plane optimizations identified here. We focus on Snort, an Intrusion Prevention System (IPS) that runs on such devices, because the optimizations use shared memory and machine learning logic to significantly improve CPU core utilization and per-packet latency.

First, we eliminate the copying of packets destined for the punt path by using our shared memory interface, "memif". Copying packets is time consuming in general, and with small packets the copies are quite frequent. The new approach saves one memory copy in each direction (tx/rx) and avoids all kernel calls as well.

Second, the packet need not be re-inserted. Instead, we send only the verdict on the packet under inspection to the Traffic Manager (TM) over the memif. The TM drops or forwards the packet based on the verdict. The latency incurred in waiting for the verdict over the memif is negligible compared to the latency of re-inserting the packet in the existing approach.

A combination of packet fields is usually used to identify the flow that a packet belongs to. Suppose the first packet of a flow to reach Snort is identified as malicious. This observation can be used to avoid punting every subsequent packet belonging to the same flow, which removes even the need to wait on a verdict from Snort.
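The flow-level shortcut described above can be sketched as a small verdict cache. This is a minimal illustration, not the production design: the 5-tuple `FlowKey` is an assumption (the text only says "a combination of packet fields"), and the class and method names are hypothetical.

```python
from typing import NamedTuple, Set


class FlowKey(NamedTuple):
    """Assumed flow identifier: the classic 5-tuple (hypothetical choice)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


class FlowVerdictCache:
    """Remembers flows that the inspector has already marked malicious."""

    def __init__(self) -> None:
        self._malicious: Set[FlowKey] = set()

    def record_verdict(self, key: FlowKey, malicious: bool) -> None:
        # Called when a verdict for a punted packet comes back.
        if malicious:
            self._malicious.add(key)

    def decide(self, key: FlowKey) -> str:
        # Known-malicious flows are dropped without punting;
        # everything else is still punted for inspection.
        return "drop" if key in self._malicious else "punt"


cache = FlowVerdictCache()
flow = FlowKey("10.0.0.1", "10.0.0.2", 1234, 80, 6)
first = cache.decide(flow)                 # no verdict yet, so punt
cache.record_verdict(flow, malicious=True)
later = cache.decide(flow)                 # cached verdict, so drop
```

In this sketch only malicious verdicts are cached, so unknown flows keep being inspected.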

However, a bad actor may send a legitimate packet at the start of a flow and malware packets thereafter. This strategy usually works because the flow is marked as legitimate on the first packet, and subsequent packets are forwarded without inspection since they belong to a "legitimate" flow.

Third, we use machine learning logic to detect and thwart this strategy. The learning logic uses a similarity metric that assigns each packet a soft value, such as a probability; by thresholding this value, the TM decides whether to inspect the packet further by punting it to Snort. Using Snort's verdict (drop or forward the packet) as supervision, the learning logic can be made more aggressive or more conservative in its punting decisions.
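The thresholding and supervision loop can be illustrated with a deliberately simple rule. The sketch below assumes the soft value is a suspicion score in [0, 1] and uses a fixed-step threshold update; both the update rule and the names (`PuntPolicy`, `step`, `lo`, `hi`) are assumptions for illustration and are not the learning logic itself.

```python
class PuntPolicy:
    """Threshold-based punt decision, tuned by the inspector's verdicts."""

    def __init__(self, threshold: float = 0.5, step: float = 0.05,
                 lo: float = 0.05, hi: float = 0.95) -> None:
        self.threshold = threshold
        self.step = step
        self.lo = lo   # most aggressive setting (punt almost everything)
        self.hi = hi   # most conservative setting (punt almost nothing)

    def should_punt(self, suspicion: float) -> bool:
        # Punt when the soft score reaches the current threshold.
        return suspicion >= self.threshold

    def update(self, snort_dropped: bool) -> None:
        # Supervision is only available for packets that were punted.
        if snort_dropped:
            # Inspection found a bad packet: punt more aggressively.
            self.threshold = max(self.lo, self.threshold - self.step)
        else:
            # Inspection found nothing: punt more conservatively.
            self.threshold = min(self.hi, self.threshold + self.step)


policy = PuntPolicy()
if policy.should_punt(0.7):
    policy.update(snort_dropped=True)   # hypothetical verdict from Snort
```

A lower threshold punts more packets and costs more inspection work, while a higher one saves work at the risk of missing malware; the clamp values bound that trade-off.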

MmWave Radio Resource Allocation Scheme for V2X Communication

Autonomous self-driving vehicles of the future will demand 3D HD maps to plan and navigate routes, as well as frequent software upgrades. Moreover, such vehicles will have numerous on-board sensors (including cameras); the large volume of sensory data generated can be preprocessed and then pushed to the Cloud.

Cisco Systems is building a Gigabit Ethernet network housed in the vehicle’s body to aggregate all the sensory data traffic and push it to the Cloud. Cisco, however, is not interested in developing applications that consume that data; what to do with it is left to the automakers.

LTE/4G maxes out at ~100 Mbps, which is not sufficient to stream all the sensor data generated (currently 10-100 sensors on board). mmWave mobile communication can in principle enable downloading/uploading such high-volume data in under, say, a minute or two, as there are contiguous GHz-wide mmWave bands. We developed a radio resource allocation scheme that handles the data needs of multiple vehicles concurrently, enabling high data rate communication from moving vehicles to an infrastructure tower (base station).
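To make the rate gap concrete, the following back-of-the-envelope calculation compares transfer times. Only the ~100 Mbps LTE/4G figure comes from the text; the 10 GB payload and the 2 Gbps mmWave rate are hypothetical values chosen for illustration, and protocol overhead is ignored.

```python
def transfer_seconds(payload_bytes: int, rate_bps: int) -> float:
    """Ideal transfer time: payload size in bits divided by link rate."""
    return payload_bytes * 8 / rate_bps


PAYLOAD_BYTES = 10 * 10**9      # hypothetical 10 GB of sensor data
LTE_RATE_BPS = 100 * 10**6      # ~100 Mbps, as stated in the text
MMWAVE_RATE_BPS = 2 * 10**9     # hypothetical 2 Gbps mmWave link

lte_s = transfer_seconds(PAYLOAD_BYTES, LTE_RATE_BPS)
mmwave_s = transfer_seconds(PAYLOAD_BYTES, MMWAVE_RATE_BPS)
print(f"LTE/4G: {lte_s:.0f} s, mmWave: {mmwave_s:.0f} s")
# prints "LTE/4G: 800 s, mmWave: 40 s"
```

Under these assumed numbers the upload drops from over 13 minutes to well under a minute, which is the regime the paragraph above has in mind.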

Dynamic Spectrum Switching between mmWave and THz Small Cells

Reliable and continuous high-bandwidth connectivity within the next generation of vehicles will enable driverless cars, data backhauling and ultra-high-definition infotainment services. The ability to achieve data transfer rates on the order of several gigabits per second is key to enabling such applications and has so far been unattainable through state-of-the-art dedicated short-range communication (DSRC) and 4G cellular communication. In that regard, small cells that can utilize the massive spectrum bandwidth available in the millimeter-wave (mmWave) and Terahertz (THz) frequencies promise a paradigm shift, leading to effective data transfer rates of up to several Tbps.

We propose a new software-defined network (SDN) framework for vehicles equipped with transceivers capable of dynamically switching between THz and mmWave bands. We present a novel SDN-controlled admission policy that preferentially hands off between the mmWave and THz small cells, accommodates asymmetric uplink/downlink traffic, performs error recovery and handles the distinct link states that arise from motion along practical vehicular paths.

We then formulate the optimal procedure for scheduling multiple vehicles at a given infrastructure tower under practical road congestion scenarios. To that end, we design a computationally feasible polynomial-time scheduling algorithm and compare its performance against the optimal procedure and random access. Additionally, we present a simulation-based case study of data center backhauling in the city of Boston to showcase the benefits of our approach.

Prototyping IEEE 802.11b Link Layer for MATLAB-based SDR

Software defined radio (SDR) moves the radio communication system from a rigid hardware platform to a more user-controlled software paradigm, allowing unprecedented flexibility in parameter settings. However, programming and operating such SDRs has typically required deep knowledge of the operating environment and intricate tuning of existing code, which adds delay and overhead to network design.

We make a systems contribution: we developed the first MATLAB implementation of the PHY and MAC layers that is entirely compliant with the IEEE 802.11b standard. The code base supports software defined radios (SDRs) and has been extensively tested on USRP N210s (with WBX daughterboards). We modeled the system as a finite state machine (FSM) that transitions only on clock cycles derived from the USRP. In addition to physical carrier sensing, the software implements virtual carrier sensing in CSMA/CA with the optional IEEE 802.11 RTS/CTS exchange. By exposing the system settings as parameters, we adopted a software-only approach that lets the user fully reconfigure them at run time. The DATA/ACK packet structure and the link-layer protocols are modeled to be fully compliant with the IEEE 802.11b specifications.
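The clock-driven FSM idea can be illustrated with a heavily simplified CSMA/CA sender. This sketch is written in Python rather than MATLAB and is not the released code: it keeps only carrier sensing, random backoff and contention-window doubling, and it omits DIFS/SIFS timing, the RTS/CTS exchange, the NAV and ACK timeouts. The contention window limits (31 and 1023) are the IEEE 802.11b DSSS values; the class, state and method names are hypothetical.

```python
import random
from enum import Enum, auto

CW_MIN = 31     # 802.11b minimum contention window (slots)
CW_MAX = 1023   # 802.11b maximum contention window (slots)


class State(Enum):
    IDLE = auto()
    BACKOFF = auto()
    TRANSMIT = auto()
    WAIT_ACK = auto()


class CsmaCaNode:
    """Sender FSM that changes state only when tick() is called."""

    def __init__(self, rng: random.Random = None) -> None:
        self.rng = rng or random.Random()
        self.state = State.IDLE
        self.cw = CW_MIN
        self.backoff_slots = 0
        self.queue = 0          # frames waiting to be sent

    def enqueue(self) -> None:
        self.queue += 1

    def tick(self, channel_busy: bool, ack_received: bool = False) -> State:
        """Advance the FSM by one slot-time clock tick."""
        if self.state is State.IDLE:
            if self.queue > 0:
                # Draw a random backoff from the current window.
                self.backoff_slots = self.rng.randint(0, self.cw)
                self.state = State.BACKOFF
        elif self.state is State.BACKOFF:
            # The counter is frozen while the channel is sensed busy.
            if not channel_busy:
                if self.backoff_slots == 0:
                    self.state = State.TRANSMIT
                else:
                    self.backoff_slots -= 1
        elif self.state is State.TRANSMIT:
            self.state = State.WAIT_ACK
        elif self.state is State.WAIT_ACK:
            if ack_received:
                self.cw = CW_MIN            # success: reset the window
                self.queue -= 1
            else:
                # Failure: binary exponential backoff, capped at CW_MAX.
                self.cw = min(2 * (self.cw + 1) - 1, CW_MAX)
            self.state = State.IDLE
        return self.state
```

Driving every transition from `tick()` mirrors the design choice described above, where the state machine advances only on clock cycles derived from the radio.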

The link layer's ability to mitigate packet collisions and enforce fairness among nodes accessing the channel was established through extensive experiments on a USRP N210 based SDR platform. The code is modular, which makes it relatively easy to manage and lets the research community extend it. The technical merits of this work are full flexibility in system parameters, elimination of the dependence on an external clock, and a significant reduction in programming complexity. The software has been released publicly for research purposes under the GNU General Public License (GPL) and is available for download directly from GitHub and MATLAB Central. Further, this work provides a testbed for experimenting with new MAC protocols.

Detecting Adversaries among Heterogeneous Annotators

Supervised and semi-supervised learning settings face an increasingly common scenario: the ground truth exists, but it is unavailable or expensive to obtain. However, multiple sources of annotation are available. The question then becomes one of evaluating the annotators: are they adversarial, spammers, or helpful? The goal is to identify helpful and unhelpful annotators as early as possible, which can also help evaluate data collection and annotation processes and mechanisms.

Example scenarios include product quality from user reviews, conference paper ratings from multiple reviews, and the quality of online contributions from user history. In healthcare, coronary artery disease (CAD) is diagnosed by measuring and scoring regional heart-wall motion in echocardiography. Cardiologists’ expert diagnoses differ considerably, and the quality of a diagnosis depends heavily on skill and training. This degree of variability raises practical questions for healthcare: how to diagnose when doctors do not agree, and how to tell which doctors are skilled enough.

The problem is challenging because not all annotators label all data points, and we have to account for the variability among annotators, their relative reliability, and each individual's internal reliability and degree of maliciousness.

We developed novel scoring metrics to evaluate labelers. The metrics involve information-theoretic measures and are robust enough to identify adversarial labelers. We demonstrated the strength of this approach as a function of the number of adversaries, their degree of malicious behavior, and their effect on classification accuracy.
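As an illustration of how an information-theoretic measure can separate annotator types, the sketch below scores each annotator against a leave-one-out majority vote of the others. This is not the scoring metric developed in this work; the choice of mutual information plus agreement rate, the majority-vote reference and all names are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Labels = Dict[str, Dict[int, int]]   # annotator -> {item id: label}


def mutual_information(pairs: List[Tuple[int, int]]) -> float:
    """Empirical mutual information (bits) between paired labels."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def score_annotator(name: str, labels: Labels) -> Tuple[float, float]:
    """Return (mutual information, agreement rate) versus the others."""
    pairs = []
    for item, own in labels[name].items():
        # Majority vote of the other annotators who labeled this item.
        votes = Counter(l[item] for other, l in labels.items()
                        if other != name and item in l)
        top = votes.most_common(2)
        if not top or (len(top) == 2 and top[0][1] == top[1][1]):
            continue   # nobody else labeled it, or the vote is tied
        pairs.append((own, top[0][0]))
    if not pairs:
        return 0.0, 0.0
    agreement = sum(a == b for a, b in pairs) / len(pairs)
    return mutual_information(pairs), agreement
```

Under this sketch a helpful annotator shows high mutual information and high agreement, a spammer who labels without regard to the data shows mutual information near zero, and an adversary who systematically flips labels shows high mutual information with low agreement. Items an annotator did not label are simply skipped, which accommodates incomplete labeling.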

Human Intrusion Detection Algorithm in a Wireless Sensor Network

Unattended surveillance in terrains such as ravines and dense forests is often necessary to identify harmful activities such as tree felling, poaching and forest burning, and to notify the authorities. Surveillance cameras like the ones found in malls are a natural choice, but such cameras do not work at night. Infrared cameras that operate at night are the next option. However, such surveillance cameras are expensive and power hungry, because they must perform a lot of image processing and power a lot of electronics. Being expensive is not a serious limitation, but being power hungry is: power outlets are not going to be available in the terrains of interest. Further, such terrains are prone to infiltration, and it is impossible to position security personnel there. We therefore require a system that is power efficient, reliable and not tied down by wires. A Wireless Sensor Network (WSN) using Passive Infra-Red (PIR) sensors is best suited for this application.

Given a placement of sensing nodes, the problem is to detect an intruder, who is assumed to be moving during the period of detection, in the presence of clutter and with a low false alarm rate. The intruder is a human traveling in the vicinity of the sensor. The term clutter describes the waveform generated at the output of the sensor by other sources of disturbance, such as animal or vegetative movement or natural phenomena such as wind and rain.

The problem is challenging because intrusion is a random, rare and ephemeral event, while clutter is always present, and frequent false alarms would effectively render the system useless. Moreover, the nodes are likely to be inaccessible for battery replacement because of their risky sensing environments. Hence, energy consumption in the nodes is an equally important issue.

We designed a low-complexity algorithm for intrusion detection in the presence of clutter arising from wind-blown vegetation. The algorithm is based on a combination of the Haar Transform (HT) and Support Vector Machine (SVM) based training. The amplitude and frequency of the intruder signature are used to differentiate it from the clutter signal. Intruder data collected in a laboratory and clutter data gathered from various types of vegetation are fed into the SVM for training. The optimal decision rule returned by the SVM is then used to separate intruder from clutter in outdoor settings.
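The Haar Transform stage can be sketched as follows. This is an illustrative feature extractor only: using per-band energies as the features passed to the SVM is an assumption, the function names are hypothetical, and the SVM training step is omitted. Each band's energy reflects signal amplitude within one frequency range, which is how amplitude and frequency information could both reach the classifier.

```python
import math
from typing import List, Sequence, Tuple


def haar_transform(signal: Sequence[float]) -> Tuple[List[float], List[List[float]]]:
    """Orthonormal Haar decomposition of a power-of-two-length signal.

    Returns the final approximation and the detail coefficients per
    level, ordered from the finest level to the coarsest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(x) for x in signal]
    details = []
    root2 = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        details.append([(approx[i] - approx[i + 1]) / root2 for i in pairs])
        approx = [(approx[i] + approx[i + 1]) / root2 for i in pairs]
    return approx, details


def band_energies(signal: Sequence[float]) -> List[float]:
    """Feature vector: energy per band, from lowest to highest frequency."""
    approx, details = haar_transform(signal)
    bands = [approx] + list(reversed(details))
    return [sum(c * c for c in band) for band in bands]


# A slowly varying window puts its energy in the low bands, while a
# rapidly alternating one puts it in the highest band.
slow = band_energies([1, 1, 1, 1, 1, 1, 1, 1])
fast = band_energies([1, -1, 1, -1, 1, -1, 1, -1])
```

Because the transform needs only additions, subtractions and a fixed scaling per level, it suits the low-complexity, energy-constrained setting described above.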