CURRENT RESEARCH DIRECTIONS

My past work has been on making routers fast by developing high-speed algorithms for packet lookups, classification and Quality of Service that are widely deployed in products. In recent years, I have continued this focus (see later) but have also helped start two new areas: algorithmic measurement and algorithmic security. I have also worked on new problems at the interface of networking and computer architecture.

ALGORITHMIC MEASUREMENT

While the initial focus of network research was on speed and scale, the size, complexity, and importance of the Internet has led to a need to provide service and security guarantees. A foundation of both these guarantees is accurate measurement tools. While measurement of network phenomena is a well studied field, there are two major problems: first, collection of even basic data at high speeds is challenging because of high speeds and memory limits; second, the amount of data collected on major networks now exceeds a Petabyte a day, and is difficult to assimilate or understand.

In this context, my research introduces (and makes fair progress in) a new field that one can call *algorithmic network measurement*, which in turn opens up a new field of *algorithmic network security*. Briefly, we address the challenge of data collection by inventing new online algorithms that collect data that network managers need (flow counts, heaviest flows) using very little memory and with algorithms that can be implemented in hardware at high speeds. We addresses the challenge of understanding the data deluge by new algorithms (and a complete system implementation) for data mining traffic patterns. There are 6 major papers:

A) solution to the so-called heavy-hitter problem introduces new techniques called sample-and-hold and multistage filters that reduce the error in standard sampling solutions from 1/sqrt(S) to 1 /S, where S is the number of samples. For a paper written 2 years ago, it has already been cited 22 times. It was ranked in the top 5 papers in SIGCOMM 2002 (the top networking conference) and was fast-tracked for ACM Transactions on Networking, and finally appeared in the top systems journal, ACM Transactions on Computer Systems.

B) A solution to the flow counting problem was to develop simple bitmap counting algorithms. To reduce the size of the bitmap, he uses a technique to sample part of the hash space, and scale up. The resulting algorithm has been appreciated by a number of pioneers in the field of streaming algorithms including Pierre Flajolet of INRIA and Piotr Indyk of MIT. It was one of the top 5 papers in the Internet Measurement Conference 2002 where it was fast-tracked for submission to ACM Transactions on Networking where it is undergoing review.

C) A solution (with S. Ramabadran, SIGMETRICS 04) to the problem of maintaining a large number of counters at high-speed show how to do this by maintaining the high-order bits of the counters in slow memory and only the low-order bits in fast memory. The main insight is how this is managed without overflowing counters and without requiring complex sorting of counters.

D) A solution to the data-mining problem (with C. Estan and S. Savage) to automatically determine high-volume clusters of traffic, where clusters are defined as prefixes across protocol fields, at the "right" level of granularity. The prefix requirement distinguishes it from Association Rules in Data Mining. This paper was one of the top 10 ranked papers in SIGCOMM 2003.

E) While the mechanisms in A and B) require router changes, a more easily deployable proposal is proposed in a SIGCOMM 2004 paper called "Building a Better NetFlow".

These papers have had impact. A company is currently negotiating with UCSD to license Cristi's data mining system. The SIGCOMM 2004 paper has resulted in Tom Zingale the head of the NetFlow *the standard measurement mechanism in routers today) group at Cisco talking to us about incorporating these ideas in Cisco NetFlow.

ALGORITHMIC SECURITY

The field of network security has grown in importance with high-profile Denial of Service and Worms attacking networks almost on a monthly basis. The current mechanisms are defined by a lack of agility (a new worm runs amok till a signature is obtained several days later with human intervention), high false-positives, and poor performance (most intrusion detection systems run at low speeds in software). My research into algorithmic security looks for new algorithms for network security that can be implemented at wire speed and yet address deficiencies of existing security solutions.

I have worked on 4 major papers in this area:

A) Automated Worm Detection: Using the work in algorithmic measurement as a foundation, with S. Singh, C. Estan and S. Savage, I have helped pioneer the building of a high-speed implementation of an automated signature extraction tool called EarlyBird (that was a top-4 poster in SOSP 2003 and one of the top 5 papers selected for OSDI 2004) More importantly, EarlyBird has detected the signatures of most major worms and viruses in 2003 in a matter of seconds compared to the hours or days required by anti-virus vendors. UCSD has already filed a patent for this technology (which already has considerable interest from the 3 major router vendors) to a company called NetSift which will pay several hundred thousand dollars in fixed costs (and potentially much more in royalties) over the next few years.

B) Scalable DOS Detection: Standard technologies for detecting Denial of Service Attack keep track of every conversation rendering it difficult to implement at high speeds. With R. Kompella and S. Singh, invented a high speed scalable algorithm for DoS Detection using a filtering technique called Partial Completion Filters that requires very little memory to detect imbalances between control messages (e.g., conversation start message followed by no conversation end message).

C) Fast String Matching: Most IDS systems must search packets at high speeds for suspicious strings that are signatures of common attacks such as Code Red. George has worked on two papers in this period, With Mike Fisk, he has extended the Boyer-Moore algorithm to sets of strings. While this paper has not been published yet, it was initially ported to the publically available and widely used tool Snort. Following that, (with Brad Calder and students in Infocom 2004) worked on making a compact version of such set string matching algorithms that could work on a small amount of high speed memory.

D) Impossibility of Scalable Detection: With Kirill Levchenko and Mohan Paturi, we were able to reduce a number of security problems to the Set Disjointness problem in communication complexity and show that many problems cannot be solved scalably (however, scalable solutions are possible by changing the statement of the problem). This paper will appear in ACM CCS 2004, one of the top security conferences.

ROUTER ALGORITHMICS

I have continued to work on router algorithmics. Every router in the world today does 3 major functions on every message sent through the Internet: they do packet lookup (to see how to forward a packet), packet classification (special checks for security and QoS), and packet scheduling for QoS. In the 2001-2004 period, we have improved on previous algorithms for each of these functions, leading to the best-performing algorithms known in the literature and possibly in industry.

A) IP Lookups: Previous algorithms developed by George and his colleagues required more memory than could be afforded by very compact implementations. With co-author (and former student) Will Eatherton, we have developed the Tree Bitmap algorithm for lookups. This algorithm is being used in Cisco Routers, and is currently the fastest and most compact IP lookup algorithm known in the literature, allowing both software and hardware implementations. Cisco has confirmed that the latest Cisco HFR Router (officially known as CRS-1), the fastest and largest router in the market, is using Tree Bitmap. TreeBitmap is also being proposed for placement in Cisco IOS; if so it will be widely deployed across several Cisco router platforms.

B) Packet Classification: We have developed the fastest packet classification algorithm currently known in the literature, that they call HyperCuts that beats the Stanford algorithm HiCuts that was the best so far. This paper was published in SIGCOMM 2004 and was also a top 10 ranked paper. More importantly, they have been asked permission for use of the algorithm by Nevis Networks, Cisco Systems, and NewBridge Networks. While all these three companies are currently evaluating their algorithm, there is a some chance that the algorithm will be practically deployed. Florin Baboescu has been hired by Cisco, in no small part to implement HyperCuts at Cisco.

B) Quality of Service: The scheduling algorithm DRR is widely used in routers and was invented by M. Shreedhar and I in 1993. In this period, R. Kompella and I invented a version of DRR, called randomized DRR, that reduces state needs considerably, allowing DRR to scale to much larger numbers of queues, which may be required by core routers.

COUPLING ARCHITECTURE AND NETWORKS

I have also been exploring interdisciplinary opportunities at the interface of networking and architecture in collaboration (especially with Brad Calder and his architecture group at UCSD).

A) APPLYING NETWORKING MEASUREMENT TECHNIQUES TO ARCHITECTURE: Motivated by the observation that computer CPUs also require high speed measurement, with B. Calder and students, we found a way of applying some of the algorithms our group developed for high-speed measurement (e.g., multistage filters) for architecture. The resulting profiler can do in hardware what the best previous profiler (from Jim Smith at Wisconsin) does using a mixture of hardware and software. The paper appeared in a major architecture conference, HPCA

B) NEW NETWORK PROCESSOR DESIGNS: Network processors are being widely used in routers as programmable processor that can be used to implement router functions. Unfortunately, they do not scale to very high speeds. With B. Calder and T. Sherwood, we proposed a unique network processor architecture based on wide words, that is dramatically different from current network processor designs. The paper introduces a new technique called internal memory pipelining. The paper appeared in the top computer architecture conference, ISCA 2003.

C) INTERACTION OF MEMORY ALLOCATION AND PIPELINES: With Ron and Fan Graham, we have shown that while pipeline memories cannot be shared without causing a great deal of contention, a series of 2-port memories in a pipeline architecture can come very close the memory sharing capability of pure shared memory while reducing contention slowdown to at most a factor of 2: SPAA 2004.