Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities.

  • A streaming-first runtime that supports both batch processing and data streaming programs
  • Elegant and fluent APIs in Java and Scala
  • A runtime that supports very high throughput and low event latency at the same time
  • Support for event time and out-of-order processing in the DataStream API, based on the Dataflow Model
  • Flexible windowing (time, count, sessions, custom triggers) across different time semantics (event time, processing time)
  • Fault tolerance with exactly-once processing guarantees
  • Natural back-pressure in streaming programs
  • Libraries for Graph processing (batch), Machine Learning (batch), and Complex Event Processing (streaming)
  • Built-in support for iterative programs (BSP) in the DataSet (batch) API
  • Custom memory management for efficient and robust switching between in-memory and out-of-core data processing algorithms
  • Compatibility layers for Apache Hadoop MapReduce
  • Integration with YARN, HDFS, HBase, and other components of the Apache Hadoop ecosystem
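
To make the event-time and out-of-order items above more concrete, here is a minimal plain-Python sketch of event-time tumbling windows driven by a watermark. It does not use Flink or its API: the function name `tumbling_window_counts` and the parameters `window_size` and `max_out_of_orderness` are hypothetical, and the watermark rule (largest timestamp seen minus a fixed delay) is a simplifying assumption rather than a description of Flink's runtime.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size, max_out_of_orderness):
    """Count events per key in event-time tumbling windows.

    events: iterable of (key, event_time) pairs in ARRIVAL order,
            which may differ from event-time order.
    Returns (fired, late):
      fired: list of (window_start, key, count) in the order windows close
      late:  events that arrived after their window had already closed
    """
    open_windows = defaultdict(lambda: defaultdict(int))  # window_start -> key -> count
    fired, late = [], []
    watermark = float("-inf")  # assumption: "no event older than this is expected"

    for key, ts in events:
        window_start = ts - (ts % window_size)
        if window_start + window_size <= watermark:
            # The window for this event was already emitted: treat it as late.
            late.append((key, ts))
        else:
            open_windows[window_start][key] += 1

        # Advance the watermark: largest timestamp seen, minus the allowed delay.
        watermark = max(watermark, ts - max_out_of_orderness)

        # Close every window whose end is at or before the watermark.
        closable = sorted(w for w in open_windows if w + window_size <= watermark)
        for start in closable:
            for k, count in sorted(open_windows.pop(start).items()):
                fired.append((start, k, count))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        for k, count in sorted(open_windows[start].items()):
            fired.append((start, k, count))
    return fired, late
```

With a window size of 10 and an allowed delay of 5, an event with timestamp 3 that arrives after one with timestamp 12 is still counted in window [0, 10), because the watermark has only reached 7 at that point. An event with timestamp 4 that arrives after timestamp 25 is reported as late, since the watermark (20) has already passed the end of its window.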

Source code: https://github.com/apache/kafka
Documentation: https://kafka.apache.org/

Other DingTalk discussion groups
