Introduction To Apache Spark

I have just started a new job, where we will be using the following technology stack:

  • Apache Spark
  • Apache Zookeeper
  • Cassandra
  • Scala

As I get to grips with these, I will be writing introductory articles on each of them, which will hopefully help those who wish to take their first steps with these cool bits of tech.

The first one, on Apache Spark, is done: http://www.codeproject.com/Articles/1023037/Introduction-to-Apache-Spark

This is what the creators of Apache Spark have to say about their own work:

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.

So if this sounds interesting to you, I hope you enjoy the article.