akka : ‘hello world’

So last time I outlined the road map for this series of articles on Akka, and posted up some information from the Akka creators on what they had to say about Akka.

This time we will look at a simple example of Akka.

But just before we do that, let’s talk a bit more about WHY I think Akka is great stuff.

I have been a .NET programmer for many years and have seen asynchronous programming come in many different flavors:

  • Asynchronous delegates
  • BackgroundWorker
  • Task (TPL)
  • Async Await
  • RX
  • Concurrent collections
  • Critical sections (synchronized sections of code)

All of these pretty much have their JVM equivalents, and if you have used the .NET or JVM versions you will likely have seen the use of locks from time to time. For example, under the covers the concurrent collections still use some locking (monitor enter/exit) to achieve their critical sections.

This is all good stuff, and has got better to work with over the years, but there could be a better, more elegant, lock-free way of working with concurrent/parallel programming.

For me this is what Akka brings to the table. Instead of working with shared state that MUST be protected when writing multithreaded code, we simply don’t use any shared state and create dedicated micro sized bits of code that deal with one thing and one thing only. These are called “Actors”.

Actors do NOT share state; instead they work independently of each other and rely on message passing, where the message payload should give the actor either everything it needs to do its job, or at the very least enough information to look things up (say an Id, such that the actor can fetch the entity it needs by that Id).
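In practice these messages tend to be simple immutable case classes or case objects. Purely as an illustrative sketch (the message name and its field here are made up), such a message might look like this:

//an immutable message carrying just enough information (an Id)
//for the receiving actor to look up whatever else it needs
case class ProcessOrder(orderId: Long)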

At no point will we be using locks within actors.

Ok so that is my mini-rant/intro over. Let’s now carry on and have a look at what it takes to write some Akka code in Scala.

What Libs Do We Need?

As I mentioned in the introduction post I will be using Scala exclusively to do this series of posts. As such I shall also be using SBT to do the build side of things.

For this post we are only showing how to use simple actors, so we don’t have to pull in that many dependencies. We can keep things simple and use the following “build.sbt” file, where I am pulling in the following 2 dependencies:

  • Basic Akka stuff
  • Joda time
name := "HelloWorld"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "com.typesafe.akka" % "akka-actor_2.11" % "2.4.8",
  "joda-time" % "joda-time" % "2.9.4")

NOTE : This SBT set of dependencies may grow in subsequent articles, but where it does require pulling in more Akka JARs I will show that when the time comes.

 

How Do We Create An Actor System?

To be able to use actors, we must first create an ActorSystem, which is the fabric that all actors run under. This is easily achieved as follows:

import akka.actor.ActorSystem
import scala.io.StdIn

object Demo extends App {

  //create the actor system
  val system = ActorSystem("HelloSystem")

  //---------------------------
  //   EXTRA STUFF WILL GO HERE
  //---------------------------


  //shutdown the actor system
  system.terminate()

  StdIn.readLine()
}

Note that you should ensure that the Akka actor system is also shut down correctly.

The example I show here is a simple console type application, so my startup/shutdown logic is all in the same file, but in a real production app, things may be more complex (well they should be I would hope).

 

How Do We Create An Actor?

Now that we have an actor system, the next thing we need to do is create some actors that will live within it. I use the word live to suggest an ownership type of arrangement.

So how do we create an actor exactly?

Well luckily this too is dead simple: we just need to inherit from Actor and provide an implementation for the receive method to handle the different messages that may be sent to the actor.

Recall that an actor works by receiving messages and acting upon them.

Here is what the skeleton code may look like:

import akka.actor.Actor

class HelloActor extends Actor {
  def receive = {
    //DO THE MESSAGE HANDLING HERE, for example
    case _ => println("unknown message")
  }
}

We will talk more about the receive method in a while; for now just know that you must implement this method for an Akka actor to work correctly.

Difference Between Tell And Ask

Just before we get on to seeing the examples, let’s take a brief diversion and talk about the difference between ask and tell.

When we ask (the ? method in Scala) an actor something, we expect a response by way of a Future[T]; this is an asynchronous operation.

When we tell (the ! method in Scala) an actor something, this is analogous to fire and forget (of course the receiving actor could send a response back to the initiating actor via a different message, but that’s a different story); this is an asynchronous operation that returns immediately.
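As a quick side-by-side sketch of the two styles (the actor reference and message here are made up purely for illustration; full working examples follow below):

import akka.actor.ActorRef
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._
import scala.language.postfixOps

object TellVsAskSketch {

  case object SomeMessage

  def demo(someActor: ActorRef): Unit = {
    //tell (!) : fire and forget, returns immediately
    someActor ! SomeMessage

    //ask (?) : returns a Future that we narrow to the expected reply type
    implicit val timeout = Timeout(5 seconds)
    val reply: Future[String] = (someActor ? SomeMessage).mapTo[String]
    println(reply)
  }
}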

Sender

We have not covered this ground yet (we will in one of the subsequent posts), but for now all you need to know is that when you are sending messages to an actor, you do so via a construct called an ActorRef, which is a kind of handle to an actor.

One special ActorRef is “sender”, which (from the context of the receiving actor) is the ActorRef of the actor that initiated the message currently being received. You will see “sender” used in the examples below.
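As a tiny sketch of the idea (this EchoActor is hypothetical), an actor can reply to whoever sent the message currently being processed by using sender():

import akka.actor.Actor

class EchoActor extends Actor {
  def receive = {
    //reply to the ActorRef that sent us this message
    case msg: String => sender() ! s"echo: $msg"
    case _           => println("unknown message")
  }
}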

How Do We Send A Message To An Actor?

Here is how we would send a message to an actor. This is a tell (!) so is fire and forget.

import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.language.postfixOps
import scala.io.StdIn
import scala.util.{Success, Failure}
import ExecutionContext.Implicits.global

object Demo extends App {

  //create the actor system
  val system = ActorSystem("HelloSystem")

  // default Actor constructor
  val helloActor = system.actorOf(Props[HelloActor], name = "helloactor")

  //send some messages to the HelloActor (fire and forget)
  helloActor ! "hello"
  helloActor ! "tis a fine day for Akka"

  //shutdown the actor system
  system.terminate()

  StdIn.readLine()
}

Where we have the following HelloActor implementation

import akka.actor.Actor

class HelloActor extends Actor {
  def receive = {
    case "hello" => println("world")
    case _       => println("unknown message")
  }
}

See how for this simple actor we only ever deal with 2 things in the receive method:

  • “hello”
  • Anything else

It is considered good practice to handle the messages you expect and to deal with unknown messages too.
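As an aside, any message that does not match a case in receive gets passed to the actor’s unhandled method (which by default publishes an akka.actor.UnhandledMessage to the event stream), so another option is to override that rather than adding a catch-all case. A rough sketch:

import akka.actor.Actor

class StrictHelloActor extends Actor {
  def receive = {
    case "hello" => println("world")
  }

  //called by Akka for any message the receive block did not match
  override def unhandled(message: Any): Unit = {
    println(s"unknown message: $message")
  }
}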

How Do We Wait For A Response From An Actor?

So we just saw a tell (fire and forget), but how about an ask? This is slightly harder, but not by much; we simply have to deal with the fact that a Future[T] will be the result of an ask. There are numerous ways of dealing with that. Assuming we have the following actor:

import akka.actor.Actor

class AskActor extends Actor {
  def receive = {
    case GetDateMessage => sender ! new org.joda.time.DateTime().toDate().toString()
    case _       => println("unknown message")
  }
}
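Note that GetDateMessage is not defined in the snippet above; it is assumed to be a simple marker message, for example:

case object GetDateMessage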

Here are some examples of how to deal with the result of the ask

import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.language.postfixOps
import scala.io.StdIn
import scala.util.{Success, Failure}
import ExecutionContext.Implicits.global

object Demo extends App {

  //create the actor system
  val system = ActorSystem("HelloSystem")

  // default Actor constructor
  val askActor = system.actorOf(Props[AskActor], name = "askactor")

  //send some messages to the AskActor, we want a response from it

  // (1) this is one way to "ask" another actor for information
  implicit val timeout = Timeout(5 seconds)
  val future1 = askActor ? GetDateMessage
  val result1 = Await.result(future1, timeout.duration).asInstanceOf[String]
  println(s"result1=$result1")

  // (2) a slightly different way to ask another actor for information
  val future2: Future[String] = ask(askActor, GetDateMessage).mapTo[String]
  val result2 = Await.result(future2, 5 second)
  println(s"result2=$result2")

  // (3) don't use blocking call at all, just use future callbacks
  val future3: Future[String] = ask(askActor, GetDateMessage).mapTo[String]
  future3 onComplete {
    case Success(result3) =>  println(s"result3=$result3")
    case Failure(t) => println("An error has occurred: " + t.getMessage)
  }


  //shutdown the actor system
  system.terminate()

  StdIn.readLine()
}

Piping Futures

Another use case you may have is that you want to use Future[T] internally within your actor code and send Future[T] results around from actor to actor. Akka also supports this via the pipe pattern, which you can use like this, where we are piping the Future[List[Int]] back to the sender:

import akka.actor._
import akka.pattern.pipe
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

class FutureResultActor extends Actor {
  def receive = {
    case GetIdsFromDatabase => {
      Future(List(1,2,3)).pipeTo(sender)
    }
    case _       => println("unknown message")
  }
}
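As with GetDateMessage earlier, GetIdsFromDatabase is assumed to be a simple marker message:

case object GetIdsFromDatabase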

Where the code that initiated the sending of the GetIdsFromDatabase message (the sender in the code above) looks like this:

import akka.actor._
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.language.postfixOps
import scala.io.StdIn
import scala.util.{Success, Failure}
import ExecutionContext.Implicits.global

object Demo extends App {

  //create the actor system
  val system = ActorSystem("HelloSystem")

  // default Actor constructor
  val futureResultActor = system.actorOf(Props[FutureResultActor], name = "futureresultactor")

  //send some messages to the FutureResultActor, we expect a Future back from it
  implicit val timeout = Timeout(5 seconds)
  val future4: Future[List[Int]] = ask(futureResultActor, GetIdsFromDatabase).mapTo[List[Int]]
  future4 onComplete {
    case Success(result4) =>  println(s"result4=$result4")
    case Failure(t) => println("An error has occurred: " + t.getMessage)
  }


  //shutdown the actor system
  system.terminate()

  StdIn.readLine()
}

 

Where Is The Code?

As previously stated all the code for this series will end up in this GitHub repo:

https://github.com/sachabarber/SachaBarber.AkkaExamples

Akka series

I have been idle for a while, which kind of irks me. Thing is, I have not really been that idle, it’s just that I have not been doing that much in the .NET space of late. That is not to say I do not like .NET anymore, it’s just that I am spreading my time between .NET and the JVM (which I am actually enjoying).

I thought the time has come for me to come out with a few posts of what I have been up to. So to that end I thought I would start out with a series of posts on Akka.

There will be quite a few parts to this mini series, and I will update this page as I bring new posts online.

This is what I am planning on covering

  • What is Akka (this post)
  • “hello world”
  • Hierarchies / Lifecycles
  • Mailboxes
  • Logging
  • Dead letters and how to monitor for them
  • Persistent Actors
  • Routing
  • Finite state machines
  • Selection
  • Remoting
  • Clustering
  • Testkit
  • Http
  • Streams

What Is Akka

I can think of no better way to describe Akka than to screen scrape what the creators of Akka have to say about it. So here is what they say:

We believe that writing correct distributed, concurrent, fault-tolerant and scalable applications is too hard. Most of the time it’s because we are using the wrong tools and the wrong level of abstraction. Akka is here to change that. Using the Actor Model we raise the abstraction level and provide a better platform to build scalable, resilient and responsive applications—see the Reactive Manifesto for more details. For fault-tolerance we adopt the “let it crash” model which the telecom industry has used with great success to build applications that self-heal and systems that never stop. Actors also provide the abstraction for transparent distribution and the basis for truly scalable and fault-tolerant applications.

Actors

Actors give you:

  • Simple and high-level abstractions for distribution, concurrency and parallelism.
  • Asynchronous, non-blocking and highly performant message-driven programming model.
  • Very lightweight event-driven processes (several million actors per GB of heap memory).

Fault Tolerance

  • Supervisor hierarchies with “let-it-crash” semantics.
  • Actor systems can span over multiple JVMs to provide truly fault-tolerant systems.
  • Excellent for writing highly fault-tolerant systems that self-heal and never stop.

Location Transparency

Everything in Akka is designed to work in a distributed environment: all interactions of actors use pure message passing and everything is asynchronous.

Persistence

State changes experienced by an actor can optionally be persisted and replayed when the actor is started or restarted. This allows actors to recover their state, even after JVM crashes or when being migrated to another node.

 

So that is the 1000 mile view of what Akka is. You will get more familiar with it as we move through the series I hope.

What Form Will The Examples Take

As I am trying to improve my Scala to get it in line with what I can do in .NET, all examples will be written in Scala (and built with SBT).

Where Can I Find The Code Examples?

I will be augmenting this GitHub repo with the example projects as I move through this series

https://github.com/sachabarber/SachaBarber.AkkaExamples

I hope you all enjoy the series

That’s all for now. Like I say, I will update this page when more posts become available, which may be a while since I have a day job, 2 kids, one wife and a cat, and I like a drink too, and also enjoy my weekends. So it happens when it happens folks.

Out with RavenDB Embedded, in with LiteDB

I recently started (and have now finished) writing a small internal web site using the following things:

  • WebApi2
  • OAuth
  • JWT support
  • OWIN
  • AutoFac
  • Raven DB Embedded
  • Aurelia.io for front end

I have to say it worked out great. It was a pleasure to work on, all the way through.

I quite like Raven embedded for this type of app. It’s completely standalone, and does just what I need from it.

So I got to the end of the project, and I was pretty sure I had checked that we had licenses for everything I was using. Turns out we didn’t have one for RavenDB.

Mmm. This app was really a tool to help us internally, so we did not want to spend that much on it.

Shame as I like Raven. I started to look around for another tool that could fit the bill.

This was my shopping list

  • Had to be .NET
  • Had to support document storage
  • Had to have LINQ support
  • Had to support the same set of features that I was using from Raven Embedded (CRUD + indexes essentially)
  • Had to be free
  • Had to be embedded as a single DLL

It did not take me long to stumble upon LiteDB.

This ticked all my boxes and more. I decided to try it out in a little console app to test it, and was extremely happy. I did not do any performance testing, as that is not such a concern for the app that I was building, but from an API point of view it proved to be very easy to replace the Raven Embedded code I had written so far.

I was happy.

Just thought I would show you all a little bit of its usage right here

 

Installation

This is done via NuGet. The package is called “LiteDB”

CRUD

Assuming we have this entity

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace LiteDBDemo
{
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string[] Phones { get; set; }
        public bool IsActive { get; set; }

        public override string ToString()
        {
            return string.Format("Id : {0}, Name : {1}, Phones : {2}, IsActive : {3}",
                Id,
                Name,
                Phones.Aggregate((x, y) => string.Format("{0}{1}", x, y)),
                IsActive);
        }
    }
}

Here is how you might use LiteDB to perform CRUD operations. See how it has the concept of collections; this is kind of like MongoDB, if you have used that.

using LiteDB;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace LiteDBDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            // Open database (or create if it does not exist)
            using (var db = new LiteDatabase(@"MyData.db"))
            {
                //clean out entire collection, will drop documents, indexes everything
                db.GetCollection<Customer>("customers").Drop();


                Create(db);
                Read(db);
                Update(db);
                Delete(db);
            }
            Console.ReadLine();
        }

        private static void Create(LiteDatabase db)
        {
            Console.WriteLine("\r\nCREATE\r\n");

            // Get customer collection
            var customers = db.GetCollection<Customer>("customers");

            // Create your new customer instance
            var customer = new Customer
            {
                Name = "John Doe",
                Phones = new string[] { "8000-0000", "9000-0000" },
                IsActive = true
            };

            // Insert new customer document (Id will be auto-incremented)
            customers.Insert(customer);
            Console.WriteLine("Inserted customer");
        }

        private static void Read(LiteDatabase db)
        {
            Console.WriteLine("\r\nREAD\r\n");

            // Get customer collection
            var customers = db.GetCollection<Customer>("customers");

            // Index document using a document property
            customers.EnsureIndex(x => x.Name);

            // Use Linq to query documents
            var firstCustomer = customers.Find(x => x.Name.StartsWith("Jo")).FirstOrDefault();
            Console.WriteLine(firstCustomer);

        }

        private static void Update(LiteDatabase db)
        {
            Console.WriteLine("\r\nUPDATE\r\n");

            // Get customer collection
            var customers = db.GetCollection<Customer>("customers");
            // Use Linq to query documents
            var johnDoe = customers.Find(x => x.Name == "John Doe").First();
            Console.WriteLine("Before update");
            Console.WriteLine(johnDoe);

            johnDoe.Name = "John Doe MODIFIED";
            customers.Update(johnDoe);

            var johnDoe2 = customers.Find(x => x.Name == "John Doe MODIFIED").First();
            Console.WriteLine("Read updated");
            Console.WriteLine(johnDoe2);

        }

        private static void Delete(LiteDatabase db)
        {
            Console.WriteLine("\r\nDELETE\r\n");

            // Get customer collection
            var customers = db.GetCollection<Customer>("customers");
            // Use Linq to query documents
            customers.Delete(x => x.Name == "John Doe MODIFIED");
            Console.WriteLine("Deleting Name = 'John Doe MODIFIED'");

            var johnDoe = customers.Find(x => x.Name == "John Doe MODIFIED").FirstOrDefault();
            Console.WriteLine("Looking for Name = 'John Doe MODIFIED'");
            Console.WriteLine(johnDoe == null ? "It's GONE" : johnDoe.ToString());


        }

    }
}

You can learn more about this over at the LiteDB website

http://www.litedb.org/

 

Overall I was very very happy with LiteDB. I particularly like the fact that it was free, and it did pretty much exactly the same as RavenDB Embedded (sometimes it was even easier to work with).

I would use this library again for sure; I found it spot on to be honest.

Like a nice Gin and Tonic on a summer’s day.

 

The Nuances of Loading and Unloading Assemblies with AppDomain

I don’t normally like just pointing out other people’s work, but this time I have no hesitation at all in doing just that. If you have ever worked with AppDomain(s) in .NET you will certainly have had some fun.

CodeProject’s Marc Clifton has written a truly great article on AppDomain(s) which you should all read. You can find it here: http://www.codeproject.com/Articles/1091726/The-Nuances-of-Loading-and-Unloading-Assemblies-wi

Nice one Marc

WebApi POST + [Serializable] + JSON .NET

At work I have taken on the task of building a small utility web site for admin needs. Thing is, I wanted it to be very self contained, so I have opted for this:

  • Self hosted web API
  • JSON data exchanges
  • Aurelia.IO front end
  • Raven DB database

So I set out to create a nice Web API endpoint like this:

private IDocumentStore _store;

public LoginController(IDocumentStore store)
{
	_store = store;
}

[HttpPost]
public IHttpActionResult Post(LoginUser loginUser)
{
    //
}

Where I then had this data model that I was trying to post via the awesome REST plugin for Chrome:

using System;
 
namespace Model
{
    [Serializable]
    public class LoginUser
    {
        public LoginUser()
        {
 
        }
 
        public LoginUser(string userName, string password)
        {
            UserName = userName;
            Password = password;
        }
 
        public string UserName { get; set; }
        public string Password { get; set; }
 
        public override string ToString()
        {
            return string.Format("UserName: {0}, Password: {1}", UserName, Password);
        }
    }
}

This just would not work. I could see the endpoint being called ok, but no matter what I did, the LoginUser model in the post would always have NULL properties. After a little fiddling I removed the [Serializable] attribute and it all just started to work.

Turns out this is to do with the way JSON.NET works when it sees the [Serializable] attribute.

For example if you had this model

[Serializable]
public class ResortModel
{
    public int ResortKey { get; set; }
    public string ResortName { get; set; }
}

Without the [Serializable] attribute the JSON output is:

{
    "ResortKey": 1,
    "ResortName": "Resort A"
}

With the [Serializable] attribute the JSON output is:

{
    "<ResortKey>k__BackingField": 1,
    "<ResortName>k__BackingField": "Resort A"
}

I told one of my colleagues about this, and he found this article: http://stackoverflow.com/questions/29962044/using-serializable-attribute-on-model-in-webapi which explains it all nicely, including how to fix it.

Hope that helps, sure bit me in the Ass

Entity framework 7 in memory provider test

A while ago I wrote an article, http://www.codeproject.com/Articles/875165/To-Repository-Or-NOT, which talked about how to test repository classes with and without Entity Framework.

It has been a while and I am just about to start a small project for one of the Admin staff at work, to aid her in her day to day activities.

As always there will be a database involved.

I will likely be using OWIN or MVC5, with Aurelia.IO for the client side.

Not sure about the DB, so I decided to try out the in-memory support in the yet-to-be-released Entity Framework 7.

Grabbing The Nuget Package

So let’s have a look. The first thing you will need to do is grab the NuGet package, which for me was as easy as using the NuGet package window in Visual Studio 2015.


The package name is “EntityFramework.InMemory”; this will bring in the other bits and pieces you need.

NOTE : This is a pre-release NuGet package so you will need to include prerelease packages.

 

The Model

So now that I have the correct packages in place, it’s just a question of crafting some model classes. I am using the following 2:

Person

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace EF7_InMemoryProviderTest
{
    public class Person
    {
        public Person()
        {
            Qualifications = new List<Qualification>();
        }

        public int Id { get; set; }

        public string FirstName { get; set; }

        public string LastName { get; set; }

        public ICollection<Qualification> Qualifications { get; set; }

        public override string ToString()
        {
            string qualifications = Qualifications.Any() ?
                    Qualifications.Select(x => x.Description)
                        .Aggregate((x, y) => string.Format("{0} {1}", x, y)) :
                    string.Empty;

            return string.Format("Id : {0}, FirstName : {1}, LastName : {2}, \r\nQualifications : {3}\r\n",
                        Id, FirstName, LastName, qualifications);
        }
    }
}

Qualification

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace EF7_InMemoryProviderTest
{
    public class Qualification
    {
       
        public int Id { get; set; }

        public string Description { get; set; }

        public override string ToString()
        {
            return string.Format("Id : {0}, Description : {1}",
                        Id, Description);
        }

    }
}

 

Custom DbContext

Nothing more to it than that. So now let’s look at creating a DbContext which has our stuff in it. For me this again is very simple, I just do this:

using Microsoft.Data.Entity;
using Microsoft.Data.Entity.Infrastructure;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace EF7_InMemoryProviderTest
{
    public class ClassDbContext : DbContext
    {
        public ClassDbContext(DbContextOptions options)
            : base(options)
        {
        }

        public DbSet<Person> Members { get; set; }
        public DbSet<Qualification> Qualifications { get; set; }
    }
}

Writing Some Test Code Using The InMemory Provider

So now that we have all the pieces in place, let’s run some code to do a few things:

  1. Seed some data
  2. Obtain a Person
  3. Add a Qualification to the Person obtained

Here is all the code to do this

using Microsoft.Data.Entity;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace EF7_InMemoryProviderTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var optionsBuilder = new DbContextOptionsBuilder<ClassDbContext>();
            optionsBuilder.UseInMemoryDatabase();

            using (var classDbContext = new ClassDbContext(optionsBuilder.Options))
            {
                SeedData(classDbContext);

                var personId1 = GetMember(classDbContext, 1);
                Console.WriteLine("> Adding a qualication\r\n");
                personId1.Qualifications.Add(classDbContext.Qualifications.First());
                classDbContext.SaveChanges();

                personId1 = GetMember(classDbContext, 1);


                Console.ReadLine();

            }
        }

        private static Person GetMember(ClassDbContext classDbContext, int id)
        {
            var person = classDbContext.Members.FirstOrDefault(x => x.Id == id);
            Console.WriteLine(person);
            return person;
        }


        private static void SeedData(ClassDbContext classDbContext)
        {
            classDbContext.Members.Add(new Person()
                {
                    Id = 1,
                    FirstName = "Sacha",
                    LastName = "Barber"
                });
            classDbContext.Members.Add(new Person()
                {
                    Id = 2,
                    FirstName = "Sarah",
                    LastName = "Barber"
                });

            classDbContext.Qualifications.Add(new Qualification()
                {
                    Id = 1,
                    Description = "Bsc Hons : Computer Science"
                });
            classDbContext.Qualifications.Add(new Qualification()
                {
                    Id = 2,
                    Description = "Msc : Computer Science"
                });
            classDbContext.Qualifications.Add(new Qualification()
                {
                    Id = 3,
                    Description = "Bsc Hons : Naturapathic medicine"
                });

            classDbContext.SaveChanges();
        }
    }
}

Running this prints the seeded Person, then prints the same Person again after a Qualification has been added to them.

Closing Note

Quite happy with how easy this was, and I think I would definitely try this out for real.

If you want to play along, I have a demo project (for Visual Studio 2015) here:

https://github.com/sachabarber/EF7_InMemoryTest

Apache Kafka 0.9 Scala Producer/Consumer

For my job at the moment, I am roughly spending 50% of my time working on .NET and the other 50% working with Scala. As such, a lot of Scala/JVM toys have piqued my interest of late. My latest quest was to try and learn Apache Kafka well enough that I at least understood the core concepts. I have even read a book or two on Apache Kafka now, so I feel I am at least talking partial sense in this article.

So what is Apache Kafka, exactly?

Here is what the Apache Kafka folks have to say about their own tool.

Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.
Fast
A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients.

Scalable
Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers

Durable
Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.

Distributed by Design
Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.

Taken from http://kafka.apache.org/ as of 11/03/16

Apache Kafka was designed and built by a team of engineers at LinkedIn, where I am sure you will agree they probably had to deal with quite a bit of data.

 

I decided to learn a bit more about all this and have written an article on this over at code project :

 

http://www.codeproject.com/Articles/1085758/Apache-Kafka-Scala-Producer-Consumer-With-Some-RxS

 

In this article I will talk you through some of the core Apache Kafka concepts, and will also show how to create a Scala Apache Kafka Producer and a Scala Apache Kafka Consumer. I will also sprinkle some RxScala pixie dust on top of the Apache Kafka Consumer code, such that RX operators can be applied to the incoming Apache Kafka messages.
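To give a flavour of what the article covers, here is a rough sketch of what a minimal Kafka 0.9 producer can look like from Scala (the broker address, topic name and messages below are made up for the example):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object SimpleProducer extends App {

  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)

  //send a few fire and forget messages to a (made up) topic
  (1 to 3).foreach { i =>
    producer.send(new ProducerRecord[String, String]("demo-topic", s"key-$i", s"message $i"))
  }

  producer.close()
}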

Cassandra + Spark 2 of 2

Last time I walked you through how to install Cassandra in the simplest manner possible, which was as a single node installation using the DataStax community edition.

All good stuff. So I have also just written up how you might use Scala and the DataStax Cassandra/Spark connector to retrieve data from Cassandra, and to hydrate Cassandra tables into Spark RDDs.
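As a taster, reading a Cassandra table into a Spark RDD with the DataStax connector looks roughly like this (the keyspace/table names and connection settings are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object CassandraSparkDemo extends App {

  val conf = new SparkConf()
    .setAppName("CassandraSparkDemo")
    .setMaster("local[*]")
    .set("spark.cassandra.connection.host", "127.0.0.1")

  val sc = new SparkContext(conf)

  //cassandraTable comes from the connector's implicits and gives back an RDD[CassandraRow]
  val rows = sc.cassandraTable("my_keyspace", "my_table")
  println(s"row count = ${rows.count()}")

  sc.stop()
}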

These 2 Cassandra articles and the 1st Spark one kind of form a series of articles, which you can find using the series links at the top of the articles.

Anyway here is the latest installment

Apache Spark/Cassandra 2 of 2

Cassandra + Spark 1 of 2

A while ago I wrote about using Apache Spark, which is a great tool. I have been using Cassandra for a bit at work now, so I thought it might be nice to revisit that article and talk through how to use Spark with Cassandra.

Here is the 1st part of that: http://www.codeproject.com/Articles/1073158/Apache-Spark-Cassandra-of

Scala : multi project sbt setup

A while ago I wrote a post about how to use SBT (Scala Build Tool):

https://sachabarbs.wordpress.com/2015/10/13/sbt-wheres-my-nuget/

In that post I showed simple usages of SBT. Thing is, that was not really that realistic, so I wanted to have a go at a more real world example. One where we might have multiple projects, say like this:

 

SachasSBTDemo-App, which depends on 2 sub projects:

  • SachasSBTDemo-Server
  • SachasSBTDemo-Common

So how do we go about doing this with SBT?

There are 5 main steps to do this, which we will look at in turn.

SBT Directory Structure

The first thing that we need to do is create a new folder called “project” (if you are from a Visual Studio / .NET background, think of this as the solution folder).

In here we will create 2 files

build.properties, which just lists the version of SBT we will use. It looks like this:

sbt.version=0.13.8

SachaSBTDemo.scala is what I have called the other file, but you can call it what you like. Here is the contents of that file; this is the main SBT file that governs how it all hangs together. I will be explaining each of these parts as we go.

  import sbt._
  import Keys._

object BuildSettings {


  val buildOrganization = "sas"
  val buildVersion      = "1.0"
  val buildScalaVersion = "2.11.5"

  val buildSettings = Defaults.defaultSettings ++ Seq (
    organization := buildOrganization,
    version      := buildVersion,
    scalaVersion := buildScalaVersion
  )
}


object Dependencies {
  val jacksonjson = "org.codehaus.jackson" % "jackson-core-lgpl" % "1.7.2"
  val scalatest = "org.scalatest" % "scalatest_2.9.0" % "1.4.1" % "test"
}


object SachasSBTDemo extends Build {

  import Dependencies._
  import BuildSettings._

  // Sub-project specific dependencies
  val commonDeps = Seq (
     jacksonjson,
     scalatest
  )

  val serverDeps = Seq (
     scalatest
  )


  lazy val demoApp = Project (
    "SachasSBTDemo-App",
    file ("SachasSBTDemo-App"),
    settings = buildSettings
  )
  //build these projects when main App project gets built
  .aggregate(common, server)
  .dependsOn(common, server)

  lazy val common = Project (
    "common",
    file ("SachasSBTDemo-Common"),
    settings = buildSettings ++ Seq (libraryDependencies ++= commonDeps)
  )

  lazy val server = Project (
    "server",
    file ("SachasSBTDemo-Server"),
    settings = buildSettings ++ Seq (libraryDependencies ++= serverDeps)
  ) dependsOn (common)
  
}

 

Projects

In order to have separate projects we need to use the Project item from the SBT library JARs. A minimal Project setup will tell SBT where to create the new Project. Here is an example of a Project, where the folder we expect SBT to create will be called “SachasSBTDemo-App”.

lazy val demoApp = Project (
    "SachasSBTDemo-App",
    file ("SachasSBTDemo-App"),
    settings = buildSettings
  )

Project Dependencies

We can also specify Project dependencies using “dependsOn”, which takes a Seq of other projects that this Project depends on.

That means that when we apply an action to the Project that is depended on, the Project that has the dependency will also have the action applied.

lazy val demoApp = Project (
    "SachasSBTDemo-App",
    file ("SachasSBTDemo-App"),
    settings = buildSettings
  )
  //build these projects when main App project gets built
  .aggregate(common, server)
  .dependsOn(common, server)

Project Aggregation

We can also specify that a Project aggregates results from other projects, using “aggregate”, which takes a Seq of other projects that this Project aggregates.

What “aggregate” means is that whenever we apply an action on the aggregating Project we should also see the same action applied to the aggregated Projects.

lazy val demoApp = Project (
    "SachasSBTDemo-App",
    file ("SachasSBTDemo-App"),
    settings = buildSettings
  )
  //build these projects when main App project gets built
  .aggregate(common, server)
  .dependsOn(common, server)

Library Dependencies

Just like in the simple post I did before, we still need to bring in our JAR files using SBT. But this time we come up with a nicer way to manage them: we simply wrap them all up in a simple object, and then use that object to satisfy the various dependencies of the Projects. Much neater.

import sbt._
import Keys._


object Dependencies {
  val jacksonjson = "org.codehaus.jackson" % "jackson-core-lgpl" % "1.7.2"
  val scalatest = "org.scalatest" % "scalatest_2.9.0" % "1.4.1" % "test"
}

  // Sub-project specific dependencies
  val serverDeps = Seq (
     scalatest
  )

  .....
  .....
  lazy val server = Project (
    "server",
    file ("SachasSBTDemo-Server"),
    //bring in the library dependencies
    settings = buildSettings ++ Seq (libraryDependencies ++= serverDeps)
  ) dependsOn (common)

The Finished Product

The final product, once run through SBT, gives you the three project folders (SachasSBTDemo-App, SachasSBTDemo-Common and SachasSBTDemo-Server), whether viewed in IntelliJ IDEA or on the file system.

If you want to grab my source files, they are available here at GitHub : https://github.com/sachabarber/SBT_MultiProject_Demo
