Large-scale Distributed Computation


Shonan Meeting Seminar 011


Foto Afrati “Transitive Closure and Recursive Datalog Implemented on Clusters”

Implementing recursive algorithms using a distributed computational environment that is an extension of map-reduce presents new challenges. A central problem is that recursive tasks cannot deliver their output only at the end, which makes recovery from failures much more complicated than in map-reduce and its nonrecursive extensions. Another is the endgame problem: later rounds of a recursion often transfer only small amounts of data, causing high overhead for interprocessor communication.
I will focus in the endgame problem. One way to deal with it is to use an algorithm that reduces the number of rounds of the recursion. I will present a number of algorithms for transitive closure that are efficient also as regards the amount of data transmitted among the compute nodes because they have the “unique decomposition” property. Then I will present results about reducing the number of rounds for more general Datalog programs and show also that there are cases where you can do so efficiently and cases where you can reduce the number of rounds but only at the expense of increasing the arity of the intermediately derived predicates.

(This is joint work with Jeff Ullman.)

Graham Cormode, “Continuous Distributed Monitoring: A short survey

In the model of continuous distributed monitoring, a number of observers each see a stream of observations. Their goal is to work together to compute a function of the union of their observations. This can be as simple as counting the total number of observations, or more complex non-linear functions such as tracking the entropy of the induced distribution. Assuming that it is too costly to simply centralize all the observations, it becomes quite challenging to design solutions which provide a good approximation to the current answer, while bounding the communication cost of the observers, and their other resources such as their space usage. This survey introduces this model, and describe a selection results in this setting, from the simple counting problem to a variety of other functions that have been studied.

Amr El Abbadi “From the Trenches of Continuous Monitoring of Multitenant DBMSs for Consolidation and Load-balancing.”

Cloud platforms handle large numbers of applications with small data footprints, different types of workloads, and unpredictable load characteristics. A multitenant database management system (DBMS) storing and serving the data for these applications (tenants) is therefore an important component of the platform’s software stack. Continuous monitoring of tenant and node performance is critical for resource provisioning and elastic load balancing to ensure effective resource utilization and minimize operational cost. A self-managing system controller faces multiple algorithmic and system level challenges; including, characterizing a tenant given the variety of workloads a tenant might issue, reducing the impact of co-locating multiple tenants, and adapting to changes in tenant behavior. In this presentation, I will discuss some of the challenges we have faced in the trenches of designing a self-managing coordinating component that monitors system performance, controls tenant placement, and manages elasticity. Our approach attempts to “”learn”" tenant behavior through observation and analysis of the tenants. We use a collection of system level and DBMS agnostic database level performance measures and machine learning techniques to characterize tenant and node behavior. We are also developing a combination of reactive mechanisms and pro-active actions (when possible) to detect and mitigate performance crises arising from unpredictable workloads and behavior.

This work was done jointly with Divy Agrawal, Sudipto Das and Aaron Elmore.

Donatella Firmani “Minimum Spanning Tree and Connectivity of Large Scale Graphs in MapReduce

Minos Garofalakis “Geometric Query Tracking Using Sketches and Models”

The geometric method offers a generic methodology for non-linear distributed threshold monitoring tasks. Still, the approach relies on maintaining complete state at each site and assumes static estimates of the data, thus raising efficiency concerns for massive, dynamic data streams. In this talk, I will discuss novel query tracking and threshold monitoring schemes that rely on extending the geometric method with compact sketch summaries and dynamic prediction models for the local data streams. Time permitting, I will also discuss some of our current research directions and open problems in this space

Ashish Goel “Algorithms for Distributed Stream Processing”

An Active Distributed Hash Table (Active DHT) is a distributed system that stores key-value pairs in distributed main memory and allows arbitrary User Defined Functions (UDFs) to be executed on any key-value pair. An Active DHT can be viewed as a streaming version of the popular Map-Reduce paradigm. We will outline algorithms for Locality Sensitive Hashing, Social Search, PageRank computation, and Graph Sparsification in this model. In each case, we will show a significant improvement over the straightforward distributed implementations.

Daniel Keren “Safe-Zones for Distributed Monitoring”

Many monitoring tasks over distributed data streams can be formulated as a continuous threshold query on a function defined over the global average of the stream values. A fundamental problem in efficient monitoring is to correctly identify all threshold crossings while minimizing communication. We present a novel scheme for communication reduction in distributed monitoring using local constraints. Communication is required only in the event that the local constraints are violated by the incoming data. As opposed to previous work on geometric monitoring, we present here local constraints which are tailored to fit the local data distribution at each stream. Also, we attempt to define local constraints which are as simple as possible. The result is a substantial decrease in the required volume of communication compared to previous state of the art, up to two orders of magnitude in experiments with real-life data. Further theoretical analysis reveals that the reduction can sometimes be unbounded, and that it typically improves with the dimensionality of the data, which is borne out in the experiments.

Akihiro Kishimoto “Large-scale Parallel Best-First Search for Optimal Planning”

Large-scale, parallel clusters composed of commodity processors are increasingly available, enabling the use of vast processing capabilities and distributed RAM to solve hard search problems. I investigate a parallel algorithm for optimal sequential planning, with an emphasis on exploiting distributed memory computing clusters. The scaling behavior of the algorithm is evaluated experimentally on clusters using up to 1024 processors. I show that this approach scales well, allowing us to effectively utilize the large amount of distributed memory to optimally solve problems which require a terabyte of RAM to solve.

Silvio Lattanzi “Filtering: A Method for Solving Graph Problems in MapReduce”

The MapReduce framework is currently the de facto standard used throughout both industry and academia for petabyte scale data analysis. As the input to a typical MapReduce computation is large, one of the key requirements of the framework is that the input cannot be stored on a single machine and must be processed in parallel. In this talk we describe a general algorithmic design technique in the MapReduce framework called filtering. The main idea behind filtering is to reduce the size of the input in a distributed fashion so that the resulting, much smaller, problem instance can be solved on a single machine. Using this approach we give new algorithms in the MapReduce framework for a variety of fundamental graph problems for sufficiently dense graphs. Specifically, we present algorithms for minimum spanning trees, maximal matchings, approximate weighted matchings, approximate vertex and edge covers and minimum cuts. In all of these cases, we parameterize our algorithms by the amount of memory available on the machines allowing us to show tradeoffs between the memory available and the number of MapReduce rounds. For each setting we will show that even if the machines are only given substantially sublinear memory, our algorithms run in a constant number of MapReduce rounds. To demonstrate the practical viability of our algorithms we implement the maximal matching algorithm that lies at the core of our analysis and show that it achieves a significant speedup over the sequential version.

Luigi Laura “Biconnected components in MapReduce, using Graph (Navigational) Sketches”

Andrew McGregor “Graph Sketches”

Vahab Mirrokni “Overlapping Clustering and Distributed Computation”

I will discuss overlapping clustering and distributed computation for large-scale social networks. The talk has two parts. One part discusses the use of overlapping clustering techniques in distributed computation and in particular the computation of PageRank vectors in a distributed infrastructure. The other part describes implementation and results of a large-scale overlapping clustering algorithm. For example we discuss an application of such community discovery over Youtube videos using the co-watched graph in which we manage to cluster a graph of 150 million nodes in 8 hours.

Benjamin Moseley “Fast Clustering Using MapReduce”

Clustering problems have numerous applications and are becoming more challenging as the size of the data increases. In this paper, we consider designing clustering algorithms that can be used in MapReduce, the most popular programming environment for processing large datasets. We focus on the practical and popular clustering problems, $k$-center and $k$-median. We develop fast clustering algorithms with constant factor approximation guarantees. From a theoretical perspective, we give the first analysis that shows several clustering algorithms are in $MRC^0$, a theoretical MapReduce class introduced by Karloff et al. Our algorithms use sampling to decrease the data size and they run a time consuming clustering algorithm such as local search or Lloyd’s algorithm on the resulting data set. Our algorithms have sufficient flexibility to be used in practice since they run in a constant number of MapReduce rounds. We complement these results by performing experiments using our algorithms. We compare the empirical performance of our algorithms to several sequential and parallel algorithms for the $k$-median problem. The experiments show that our algorithms’ solutions are similar to or better than the other algorithms’ solutions. Furthermore, on data sets that are sufficiently large, our algorithms are faster than the other parallel algorithms that we tested.

Appeared in KDD 2011. Joint work with Alina Ene and Sungjin Im

Jeff M. Phillips “Mergeable Summaries”

We study the *mergeability* of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in a way like other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones like those for heavy hitters and quantiles, are not (known to be) mergeable. In this paper, we demonstrate that these summaries are indeed mergeable or can be made mergeable after appropriate modifications. Specifically, we show that for eps-approximate heavy hitters, there is a deterministic mergeable summary of size O(1/eps); for eps-approximate quantiles, there is a deterministic summary of size O(1/eps) log(eps n)) that has a restricted form of mergeability, and a randomized one of size O((1/eps) log^{3/2} (1/eps)) with full mergeability. We also extend our results to geometric summaries such as eps-approximations and eps-kernels.

We also achieve two results of independent interest: (1) we provide the best known randomized streaming bound for eps-approximate quantiles that depends only on eps, of size O((1/eps) log^{3/2}(1/eps)), and (2) we demonstrate that the MG and the SpaceSaving summaries for heavy hitters are isomorphic.

Joint work with: Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Zhewei Wei, and Ke Yi.

Assaf Schuster “Geometric Monitoring of Distributed Streams”

Monitoring data streams in a distributed system has been the focus of much recent research. Most of the proposed schemes, however, deal with monitoring simple aggregated values, such as the frequency of appearance of items in the streams. More involved challenges, such as feature selection (e.g., by monitoring the information gain of various features), still require very high communication overhead using naive, centralized algorithms. I will present a geometric approach which reduces monitoring the value of a function to a set of constraints applied locally on each of the streams. The constraints are used to locally filter out data increments that do not affect the monitoring outcome, thus avoiding unnecessary communication. I will discuss both the basic approach we introduced in 2007 and some recent extensions.

Izchak Sharfman “Monitoring the Median and Related Functions”

The median function has numerous applications in computer science and robust estimation. In the context of stream monitoring, it is applied to the output of sketches in order to obtain tight and reliable approximations to the underlying stream data. I will present a computationally efficient algorithm for monitoring the median of linear and quadratic functions defined over streams, which correspond to the problems of tracking range queries, self-joins, and joins in streaming data.

Cliff Stein, “Scheduling a Google data center”

Jun Tarui, “Complexity of Finding a Duplicate in a Stream”

Srikanta Tirthapura, “Distributed Random Sampling”

I will talk about an algorithm for continuous maintenance of a random sample in a distributed setting. This simple algorithm is proved to be optimal in communication cost. I will also talk about models for distributed streams, and some ideas to unify previously studied “one-pass” and “continuous” models of streams.

(joint work with David Woodruff)

Jeffrey Ullman “One-Pass Map-Reduce Algorithms”

I shall review a number of algorithms that use a single round of map-reduce to solve an interesting problem. Such algorithms can be evaluated both in terms of their processing cost at the mappers and reducers and their communication cost (the total number of key-value pairs that must be passed from mappers to reducers). Depending on the computing environment, either can be the dominant cost. For the problem of fuzzy joins (finding pairs of elements that are within some minimum distance of each other), we offer a spectrum of algorithms that offer different communication/computation trade-offs. For the problem of enumerating triangles or more complex structures (“”sample graphs”") in a large data graph, we use a one-round map-reduce implementation of conjunctive queries from Afrati and U. (EDBT, 2010) to minimize communication cost. To minimize computation cost for the same class of problems, we offer “”convertible algorithms”" (serial algorithms that can be implemented at the reducers without an order-of-magnitude increase in total computation cost). For general data graphs, there are convertible algorithms that meet the lower bounds established by Noga Alon (IJM, 1981) and for all connected sample graphs there is a convertible algorithm that runs in time O(m^{p/2})$, where p is the number of nodes in the sample graph and m is the number of edges in the data graph.

[Work variously coauthored with Foto Aftrati, Anish das Sarma, Dimitris Fotakis, David Menestrina, and Aditya Parameswaran]

Takeaki Uno “Fast Algorithms for Big-data”

One of recent business/research topic is big data. The big data helps us analysis of the data, for example, increase of accuracy, clarifying minorities. One of the main difficulties of big data analysis is the heavy computation. There would be many approaches, including distributed computation. In this talk, I would like to talk about algorithmic approaches to big data analysis in practically short time. The algorithms are for pattern mining, similarity analysis, compression, etc, and I also show the efficiency of the algorithms.

Sergei Vassilivitskii “MapReduce Algorithmics”

Suresh Venkatasubramanian “Protocols for Distributed Learning

We consider the problem of learning classifiers for labeled data that has been distributed across several nodes. Our goal is to find a single classifier, with small approximation error, across all datasets while minimizing the communication between nodes. This setting models real-world communication bottlenecks in the processing of massive distributed datasets. We present several very general sampling-based solutions as well as some two-way protocols which have a provable exponential speed-up over any one-way protocol. We focus on core problems for noiseless data distributed across two or more nodes. The techniques we introduce are reminiscent of active learning, but rather than actively probing labels, nodes actively communicate with each other, each node simultaneously learning the important data from another node.

Milan Vojnovic, “Continuous Distributed Counting for Non-Monotonic Streams”

I will discuss the problem of tracking a sum of values in distributed environments where input values arrive from k distributed streams. It is required that an estimate of the sum is maintained by a coordinator node that is within a prescribed relative error tolerance, at any time with high probability. It is desired that this is achieved with small total communication with the coordinator node, in order to minimize the use of network bandwidth that may be a scarce resource. This is a basic distributed computation problem that is of interest for various applications including data aggregation and iterative solving of various computational problems. I will show that a sub-linear total communication with respect to the number of updates can be achieved for input values that are selected by an adversary but arrive in a random order, and show matching lower bounds. This result stands in between the previously known results that for monotonic partial sums (all value updates either positive or negative), the expected total communication is logarithmic in the number of updates, and that for general worst-case inputs, the total required communication is linear in the number of updates. I will also discuss applications of the sum tracking module for tracking the second frequency moment and for solving a Bayesian linear regression.

Joint work with Zhenming Liu and Bozidar Radunovic.

Ke Yi “Computing Statistical Summaries over Massive Distributed Data”

Facing today’s big data challenge, it is often impossible and unnecessary to examine the entire data set for many analytical tasks. What is more useful is various concise statistical summaries extracted from the underlying data that capture certain data characteristics. In this talk, I will present simple and efficient algorithms for computing some classical statistical summaries over distributed data, in particular the frequency estimation problem and quantiles. These algorithms can be easily implemented in distributed systems like MapReduce, Dremel, and sensor networks.

Qin Zhang, “Tight bounds for Distributed Continuous Monitoring”

In this talk we will discuss recent developments in distributed continuous monitoring. We will introduce several new techniques to get tight bounds for central problems in this model, including frequency moments, heavy hitters and empirical entropy. In particular we will talk about how to get (almost) tight lower bounds for F0 (number of distinct elements) and F2 (size of self-join), based on multiparty number-in-hand communication complexity. These results resolve several major theoretical open problems in the area of distributed continuous monitoring.

Category: Schedule


One Response

  1. [...] Jeffrey Ullman, “One-Pass Map-Reduce Algorithms” 9.50 Foto Afrati, “Transitive Closure and Recursive Datalog Implemented on Clusters” 10.30 Break 10.55 Assaf Schuster, “Geometric Monitoring of Distributed Streams” 11.30 Milan [...]