Elasticsearch

Most developers will have heard of Elasticsearch by now, but what exactly is it? Here is how Elastic describes it:

Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now known as Elastic). Known for its simple REST APIs, distributed nature, speed, and scalability, Elasticsearch is the central component of the Elastic Stack, a set of open source tools for data ingestion, enrichment, storage, analysis, and visualization. Commonly referred to as the ELK Stack (after Elasticsearch, Logstash, and Kibana), the Elastic Stack now includes a rich collection of lightweight shipping agents known as Beats for sending data to Elasticsearch.

https://www.elastic.co/what-is/elasticsearch

In essence, it is a tool for searching and analysing data held in indexes inside a NoSQL-style document store that is clustered, sharded and fault tolerant. As the description above states, it is built on top of Lucene. If you are interested in Lucene itself, I wrote a small article on using Lucene.Net some time ago: https://www.codeproject.com/Articles/609980/Small-Lucene-NET-Demo-App

This post walks you through downloading Elasticsearch for Windows and shows you how to use NEST, the high-level C# client.

We will be learning how to do the following things:

  • Create and index new documents
  • Search for documents
  • Update documents
  • Delete documents

So let’s carry on and learn how we can download Elasticsearch.

Download

You can download it from here: https://www.elastic.co/downloads/elasticsearch. On my setup (Windows), once the download is extracted, we can simply open the bin folder and run elasticsearch.bat to start a node.

Once you run that BAT file and wait a while, a console window should fill with the node's startup log. By default the node listens on http://localhost:9200, which is the address the demo code below connects to.

Demo

For this set of demos I am using Visual Studio 2019 (Community) and have installed the following NuGet packages for Elasticsearch:

<PackageReference Include="Elasticsearch.Net" Version="7.5.1" />
<PackageReference Include="NEST" Version="7.5.1" />

With those in place, let's proceed to the meat of this post: how to do the things listed at the start. As mentioned, this demo uses NEST, the high-level Elasticsearch .NET client, which you can read more about here: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/nest.html
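Before indexing anything, it can help to confirm that the client can actually reach the node. The following is a minimal sketch rather than part of the demo project; it assumes the node started earlier is listening on the default http://localhost:9200, and the class name PingSketch is made up for illustration.

```csharp
using System;
using Nest;

public static class PingSketch
{
    public static void Main()
    {
        // Assumption: a local node on the default port.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
        var client = new ElasticClient(settings);

        // Ping sends a lightweight request to the node.
        var ping = client.Ping();

        // IsValid is false when the call failed; DebugInformation explains why.
        Console.WriteLine(ping.IsValid
            ? "Elasticsearch is reachable"
            : ping.DebugInformation);
    }
}
```

If the node is still starting up, the second branch should print the connection error instead of throwing.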

Indexing documents

The first step is to get some data into Elasticsearch, which means crafting some documents and indexing them. Elasticsearch is clever enough to infer the field types to use when it indexes a document, but you can override this should you want to. Let's see an example.

This is the only class we will use during this demo; every operation works with it.

using System; // needed for DateTime

namespace ElasticDemoApp_CSharp.Models
{
    public class Person
    {
        public string Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public bool IsManager { get; set; }
        public DateTime StartedOn { get; set; }
    }
}

Note the Id property on that POCO. It matters because NEST infers the document's _id from it, which is what lets us update and delete by id later. Let's see how we can get some data in.

var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .DefaultIndex("people");

var client = new ElasticClient(settings);

//CREATE
var person = new Person
{
    Id = "1",
    FirstName = "Tom",
    LastName = "Laarman",
    StartedOn = new DateTime(2016, 1, 1)
};

var people = new[]
{
    new Person
    {
        Id = "2",
        FirstName = "Tom",
        LastName = "Pand",
        StartedOn = new DateTime(2017, 1, 1)
    },
    new Person
    {
        Id = "3",
        FirstName = "Tom",
        LastName = "grand",
        StartedOn = new DateTime(2017, 5, 4)
    }
};

client.IndexDocument(person);
client.IndexMany(people);

var manager1 = new Person
{
    Id = "4",
    FirstName = "Tom",
    LastName = "Foo",
    StartedOn = new DateTime(2017, 1, 1)
};

//index into an explicitly named index instead of the default "people" index
client.Index(manager1, i => i.Index("managerpeople"));

The code above shows how to create the client, how to insert a single document, and how to insert many documents in one call. NEST has a few ways of doing the same thing, so it is up to you which API syntax you prefer, but the examples above all do much the same job: they get data into Elasticsearch at a particular index (the default "people" index for the first two calls, and "managerpeople" for the last). You can read more about indexing here: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/indexing-documents.html
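As mentioned at the start of this section, the inferred field types can be overridden. One way to do that is to create the index with an explicit mapping, sketched below under a few assumptions: the index name "people-explicit" is hypothetical (chosen so it does not clash with the auto-created "people" index), and mapping LastName as a keyword is just an illustrative choice. A mapping like this would need to be in place before the first document is indexed, because the type of an existing field cannot be changed in place.

```csharp
using System;
using Nest;

public class Person
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public bool IsManager { get; set; }
    public DateTime StartedOn { get; set; }
}

public static class MappingSketch
{
    public static void Main()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
        var client = new ElasticClient(settings);

        // "people-explicit" is a hypothetical index name for this sketch.
        var createResponse = client.Indices.Create("people-explicit", c => c
            .Map<Person>(m => m
                .AutoMap() // start from the types NEST infers from the POCO
                .Properties(ps => ps
                    // override: store LastName as an exact-match keyword
                    .Keyword(k => k.Name(p => p.LastName)))));

        Console.WriteLine(createResponse.IsValid
            ? "Index created"
            : createResponse.DebugInformation);
    }
}
```

Documents could then be sent to that index with the same Index(...) overload used for "managerpeople" above.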

Query

Now that we have some data in, we may want to search for it. Elasticsearch comes with a rich query API, which you can read about here: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/search.html

Here is an example that queries the data we just stored. Note the use of “&&” to combine queries; you can read about that here: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/bool-queries.html#binary-and-operator. It's worth getting to know these operators, as they make your queries more readable.

//SEARCH
var searchResponse = client.Search<Person>(s => s
    .From(0)
    .Size(10)
    .AllIndices()
    .Query(q =>
            q.Match(m => m
            .Field(f => f.FirstName)
            .Query("Tom")
            ) &&
            q.DateRange(r => r
            .Field(f => f.StartedOn)
            .GreaterThanOrEquals(new DateTime(2017, 1, 1))
            .LessThan(new DateTime(2018, 1, 1))
            )
    )
);

var matches = searchResponse.Documents;
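For comparison, the “&&” in the query above is shorthand for a bool query with two must clauses. The sketch below writes that out by hand and also prints each hit; it is my own illustration of the equivalent form, not code from the demo project, and the class name BoolQuerySketch is made up.

```csharp
using System;
using Nest;

public class Person
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public bool IsManager { get; set; }
    public DateTime StartedOn { get; set; }
}

public static class BoolQuerySketch
{
    public static void Main()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("people");
        var client = new ElasticClient(settings);

        var searchResponse = client.Search<Person>(s => s
            .AllIndices()
            .Query(q => q
                .Bool(b => b
                    .Must(
                        // both clauses must match, like the && form
                        mu => mu.Match(m => m
                            .Field(f => f.FirstName)
                            .Query("Tom")),
                        mu => mu.DateRange(r => r
                            .Field(f => f.StartedOn)
                            .GreaterThanOrEquals(new DateTime(2017, 1, 1))
                            .LessThan(new DateTime(2018, 1, 1)))
                    )
                )
            )
        );

        if (!searchResponse.IsValid)
        {
            Console.WriteLine(searchResponse.DebugInformation);
            return;
        }

        // Hits carry metadata (index, id) alongside the deserialized document.
        foreach (var hit in searchResponse.Hits)
        {
            Console.WriteLine($"{hit.Index}/{hit.Id}: {hit.Source.FirstName} {hit.Source.LastName}");
        }
    }
}
```

The operator form is shorter, which is why it tends to read better once queries grow beyond two clauses.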

Update

Now that we have some data and can search it, let's turn our hand to updating it. Here are a few examples, with some queries mixed in to check the updated data.

//UPDATE 

//update the single document whose id is person.Id ("1") in the "people" index
person.FirstName = "Tim";
client.UpdateAsync(new DocumentPath<Person>(person.Id),
    u => u.Index("people")
    .DocAsUpsert(true)
    .Doc(person)
    .Refresh(Elasticsearch.Net.Refresh.True))
    .ConfigureAwait(false).GetAwaiter().GetResult();

searchResponse = client.Search<Person>(s => s
    .From(0)
    .Size(10)
    .AllIndices()
    .Query(q =>
            q.Match(m => m
            .Field(f => f.FirstName)
            .Query("Tim")
            )
    )
);

matches = searchResponse.Documents;

//update "Tim" to "Samantha" by sending a partial document (an anonymous object)
client.UpdateAsync<Person, object>(new DocumentPath<Person>(1),
    u => u.Index("people")
        .DocAsUpsert(true)
        .RetryOnConflict(3)
        .Doc(new { FirstName = "Samantha" })
        .Refresh(Elasticsearch.Net.Refresh.True))
        .ConfigureAwait(false).GetAwaiter().GetResult();


searchResponse = client.Search<Person>(s => s
    .From(0)
    .Size(10)
    .AllIndices()
    .Query(q =>
        q.Match(m => m
            .Field(f => f.FirstName)
            .Query("Samantha")
        )
    )
);

matches = searchResponse.Documents;

There is not much more to say there, apart from paying special attention to how we pass a partial document to the fluent Doc(…) method, so that only the supplied fields are changed. We also use Refresh(…), which refreshes the shards that hold this document so that the change is visible to subsequent searches.
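The updates above block on the async call with GetAwaiter().GetResult(). Inside an async method you could await the same call instead and inspect the response. This is a minimal sketch assuming document "1" already exists in "people" (there is no DocAsUpsert here, so a missing document is reported as a failure); the class name UpdateSketch is made up for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Nest;

public class Person
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public bool IsManager { get; set; }
    public DateTime StartedOn { get; set; }
}

public static class UpdateSketch
{
    public static async Task Main()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("people");
        var client = new ElasticClient(settings);

        // Partial update: only FirstName is sent, other fields are untouched.
        var response = await client.UpdateAsync<Person, object>(
            new DocumentPath<Person>("1"),
            u => u.Index("people")
                .Doc(new { FirstName = "Samantha" })
                .RetryOnConflict(3)
                .Refresh(Elasticsearch.Net.Refresh.True));

        // Result reports what the server did with the request.
        Console.WriteLine(response.IsValid
            ? response.Result.ToString()
            : response.DebugInformation);
    }
}
```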

Deleting data

Now that we have put data in, queried it, and updated it, we should talk about deletes. This is done as follows:

//DELETE
client.DeleteAsync<Person>(1,
    d => d.Index("people")
        .Refresh(Elasticsearch.Net.Refresh.True))
        .ConfigureAwait(false).GetAwaiter().GetResult();

searchResponse = client.Search<Person>(s => s
    .From(0)
    .Size(10)
    .AllIndices()
    .Query(q =>
        q.Match(m => m
            .Field(f => f.Id)
            .Query("1")
        )
    )
);


matches = searchResponse.Documents;

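//delete using a query
//note: Name(...) only labels a query, so Field(...) is needed to pick the field.
//A term query is not analyzed, so this targets the "keyword" sub-field,
//assuming the default dynamic mapping created one for FirstName.
client.DeleteByQueryAsync<Person>(
    d => d.AllIndices()
        .Query(qry => qry.Term(p => p.Field(f => f.FirstName.Suffix("keyword")).Value("Tom")))
        .Refresh(true)
        .WaitForCompletion())
        .ConfigureAwait(false).GetAwaiter().GetResult();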

var response = client.DeleteByQueryAsync<Person>(
    q => q
        .AllIndices()
        .Query(rq => rq
            .Match(m => m
            .Field(f => f.FirstName)
            .Query("Tom")))
        .Refresh(true)
        .WaitForCompletion())
        .ConfigureAwait(false).GetAwaiter().GetResult();

searchResponse = client.Search<Person>(s => s
    .From(0)
    .Size(10)
    .AllIndices()
    .Query(q =>
        q.Match(m => m
            .Field(f => f.FirstName)
            .Query("Tom")
        )
    )
);


matches = searchResponse.Documents;

As before, I have included queries to check the deletes. Hopefully you get the idea from the code above: we can delete by id, or by using a query that matches any number of documents.
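Rather than running a follow-up search, the response from a delete-by-query can also tell you what happened. The sketch below uses the synchronous DeleteByQuery call and restricts itself to the "people" index; both are illustrative choices of mine rather than what the demo project does, and the class name DeleteSketch is made up.

```csharp
using System;
using Nest;

public class Person
{
    public string Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public bool IsManager { get; set; }
    public DateTime StartedOn { get; set; }
}

public static class DeleteSketch
{
    public static void Main()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("people");
        var client = new ElasticClient(settings);

        var response = client.DeleteByQuery<Person>(d => d
            .Index("people")
            .Query(q => q
                .Match(m => m
                    .Field(f => f.FirstName)
                    .Query("Tom")))
            .Refresh(true));

        if (!response.IsValid)
        {
            Console.WriteLine(response.DebugInformation);
            return;
        }

        // Deleted counts removed documents; conflicts and failures flag partial success.
        Console.WriteLine($"deleted={response.Deleted}, conflicts={response.VersionConflicts}, failures={response.Failures.Count}");
    }
}
```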

Demo Project

Anyway, that is all I wanted to show this time. Hopefully it gives you a small taste of using the .NET Elasticsearch client. You can download a demo project from here: https://github.com/sachabarber/Elasticdemo
