Apache Spark Features
- In-memory computation
- Distributed processing using parallelize
- Can be used with many cluster managers (Spark standalone, YARN, Mesos, etc.)
- Fault-tolerant
- Immutable
- Lazy evaluation
- Cache & persistence
- Built-in optimization when using DataFrames
- Supports ANSI SQL