dkolovos.8: Hybrid Distributed Processing Architectures

Description

State-of-the-art stream processing frameworks such as Apache Spark [1] and Apache Flink [2] distribute the processing of incoming data streams (e.g. end-user events, incidents) over a large number of computing nodes, and are used extensively in organisations such as Amazon, TripAdvisor, Netflix and Alibaba. One of their main current limitations is the assumption that every computing node is equally capable of carrying out every type of processing in the execution graph. The aim of this project is to investigate novel extensions of such big-data stream processing frameworks that support hybrid architectures involving "opinionated" computing nodes, which can declare their capabilities and preferences at runtime.
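
To make the idea of "opinionated" nodes more concrete, the following is a minimal Java sketch of one possible placement rule: each node declares the task types it can run and a numeric preference for each, and a task goes to the capable node with the highest preference. All names here (CapabilityAwarePlacement, Node, place, the task-type strings) are hypothetical and chosen for illustration only; none of them is part of the Spark or Flink APIs, and the project may well arrive at a different design.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Illustrative sketch only: these types are assumptions, not Spark or Flink APIs. */
public class CapabilityAwarePlacement {

    /** A node that declares which task types it can run and how much it prefers each. */
    static final class Node {
        final String id;
        final Set<String> capabilities;          // task types this node is able to run
        final Map<String, Integer> preferences;  // higher value = stronger preference

        Node(String id, Set<String> capabilities, Map<String, Integer> preferences) {
            this.id = id;
            this.capabilities = capabilities;
            this.preferences = preferences;
        }

        boolean canRun(String taskType) {
            return capabilities.contains(taskType);
        }

        /** Undeclared preferences default to 0. */
        int preferenceFor(String taskType) {
            return preferences.getOrDefault(taskType, 0);
        }
    }

    /** Returns the capable node with the highest declared preference, if any node is capable. */
    static Optional<Node> place(String taskType, List<Node> nodes) {
        return nodes.stream()
                .filter(n -> n.canRun(taskType))
                .max(Comparator.comparingInt((Node n) -> n.preferenceFor(taskType)));
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
                new Node("gpu-1", Set.of("map", "ml-inference"), Map.of("ml-inference", 10)),
                new Node("cpu-1", Set.of("map", "filter"), Map.of("map", 5)));

        System.out.println(place("ml-inference", nodes).map(n -> n.id).orElse("none")); // gpu-1
        System.out.println(place("map", nodes).map(n -> n.id).orElse("none"));          // cpu-1
        System.out.println(place("join", nodes).map(n -> n.id).orElse("none"));         // none
    }
}
```

In this sketch a task type that no node supports yields an empty result rather than falling back to an arbitrary node, which is the opposite of the "all nodes are equal" assumption described above. A real extension would also need to handle nodes changing their declarations at runtime, which this example does not attempt.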

Prerequisites

Suitable applicants should have strong object-oriented skills and good knowledge of Java (or a similar OO language).

Eligibility

This project meets the project specifications of the following courses:
BSc

Resources

  1. Apache Spark
  2. Apache Flink
