F# and Elasticsearch

Recently I was working on a project using F# and Elasticsearch, and I thought it would be fun to post a light introduction. Conveniently, Elastic provides not one but two .NET clients: a low-level interface (Elasticsearch.Net) and a high-level interface (NEST). As is sometimes the case, using C#-style libraries from F# requires some careful navigation of the interface. For this post I will focus only on the high-level interface (NEST).

Before getting into the code, an Elasticsearch server is needed. This isn’t a tutorial on Elasticsearch, so I won’t go into much setup and configuration detail; Elastic’s installation instructions cover that. It’s a quick install, and below is what I did.

# Download
curl -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.4.0.deb

# Install
sudo dpkg -i elasticsearch-6.4.0.deb

# Configure: Edit /etc/elasticsearch/elasticsearch.yml
cluster.name: simple-search
node.name: node-1

# Start service
sudo systemctl start elasticsearch.service

# Quick test
curl -X GET 'http://localhost:9200'

# Results:
{
  "name" : "node-1",
  "cluster_name" : "simple-search",
  "cluster_uuid" : "qAfHvbAgTC6O-r_jKS6qmA",
  "version" : {
    "number" : "6.4.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "595516e",
    "build_date" : "2018-10-07T23:18:47.308994Z",
    "build_snapshot" : false,
    "lucene_version" : "7.4.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

That was easy. Continuing with the other prerequisites, I used .NET Core 2.1; install the SDK for your platform. After that, create an F# console project, then add the NEST package.

dotnet new console --language F# --name SimpleSearch
cd SimpleSearch
dotnet add package NEST --version 6.3.1

# Not required, but here is the other (low-level) .NET package
dotnet add package Elasticsearch.Net --version=6.3.1

A couple of things need to be in place before the interesting parts. First, open the necessary namespaces. The example will index files, so Elasticsearch needs a datatype for the documents; the FileData record below serves that purpose.

open System
open System.Globalization
open System.IO
open System.Security.Cryptography
open System.Text
open Nest

/// elasticsearch index name
[<Literal>]
let SearchIndex = "simple-search"

/// Datatype for indexing
type FileData = {
    Id: string;
    Directoryname: string;
    Filename: string;
    Filetype: string;
    Contents: string;
    CreateDate: DateTime;
    ModifyDate: DateTime;
    IndexDate: DateTime }

A couple of supporting functions help the process along. nullable converts an int to Nullable<int>, which the NEST interface expects for optional settings such as Size. To create document ids I use a hash of the filename.

/// Convert int to Nullable int
let inline nullable (a:int) :Nullable<int> = System.Nullable<int>(a)

/// Hashing algorithm
let hashAlgorithm = new SHA1Managed()

/// Hash a string
let hash (s:string) =
    let bytes = Encoding.Unicode.GetBytes(s)
    hashAlgorithm.ComputeHash(bytes)
    |> Array.map (fun x -> String.Format("{0:x2}", x))
    |> String.concat ""
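
Because the id is derived only from the filename, indexing the same file twice yields the same id, so Elasticsearch overwrites the earlier document instead of adding a duplicate. Here is a small, self-contained sketch of that property. It uses SHA1.Create() in place of SHA1Managed (my substitution, since SHA1Managed is marked obsolete in newer .NET versions), and the sample paths are made up for illustration.

```fsharp
open System
open System.Security.Cryptography
open System.Text

/// Hash a string to a lowercase hex digest (same shape as the hash function above).
let hash (s: string) =
    // Assumption: SHA1.Create() stands in for SHA1Managed here
    use sha1 = SHA1.Create()
    sha1.ComputeHash(Encoding.Unicode.GetBytes(s))
    |> Array.map (fun b -> b.ToString("x2"))
    |> String.concat ""

// Made-up sample inputs
let first = hash "Program.fs"
let second = hash "Program.fs"
let other = hash "src/Program.fs"

printfn "%s" first
printfn "same input, same id: %b" (first = second)
printfn "different path, different id: %b" (first <> other)
```

One thing to keep in mind: the id depends on the exact string passed in, so "Program.fs" and "src/Program.fs" are treated as different documents even if they point at the same file.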

Now to the interesting part. It turns out connecting to the Elasticsearch service is straightforward. Setting up a default index makes later calls more convenient. More defaults could be set up here as well.

[<EntryPoint>]
let main argv =
    // Configuration
    let node = new Uri("http://127.0.0.1:9200")
    let settings = new ConnectionSettings(node)
    settings.DefaultIndex(SearchIndex) |> ignore
    let client = new ElasticClient(settings)
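
As a rough idea of what other defaults might look like, here is a sketch that chains a few more settings onto ConnectionSettings. The createClient helper, the 30 second timeout, and the choice of options are my own illustration rather than anything the project required; it assumes the NEST 6.x package referenced earlier.

```fsharp
open System
open Nest

[<Literal>]
let SearchIndex = "simple-search"

/// Build a client with a few extra defaults.
/// `createClient` is a hypothetical helper; the values are illustrative.
let createClient (node: Uri) =
    let settings =
        (new ConnectionSettings(node))
            .DefaultIndex(SearchIndex)
            // Assumed value: give up on a request after 30 seconds
            .RequestTimeout(TimeSpan.FromSeconds(30.0))
            // Buffer request/response bytes so they can be inspected when debugging
            .DisableDirectStreaming()
            // Ask for indented JSON, which is easier to read while exploring
            .PrettyJson()
    new ElasticClient(settings)
```

Each setter returns the settings object, which is why the calls chain and why the single-call version above pipes the result to ignore.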

Adding a document to the index can be done using an F# record. For this case I’ll take my Program.fs file and add it to the document index.

    let filename = "Program.fs"
    let data = {
        FileData.Id = (hash filename);
        Directoryname = Path.GetDirectoryName(filename);
        Filename = Path.GetFileName(filename);
        Filetype = Path.GetExtension(filename);
        Contents = File.ReadAllText(filename);
        CreateDate = File.GetCreationTime(filename);
        ModifyDate = File.GetLastWriteTime(filename);
        IndexDate = DateTime.Now }

    // The response is discarded here; ignoring it avoids an unused-value warning
    client.Index<FileData>(
        new IndexRequest<FileData>(
            new DocumentPath<FileData>(data)))
    |> ignore
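
The index call returns a response object, and it can be worth checking before moving on. The sketch below is one way to do that; reportIndexResult is a hypothetical helper of mine, not part of NEST, and it assumes the NEST 6.x response type IIndexResponse.

```fsharp
open Nest

/// Print whether an index call succeeded and return the validity flag.
/// `reportIndexResult` is a hypothetical helper, not part of NEST.
let reportIndexResult (response: IIndexResponse) =
    if response.IsValid then
        printfn "Indexed id %s (%A)" response.Id response.Result
    else
        // DebugInformation summarizes the call and any error that came back
        printfn "Indexing failed: %s" response.DebugInformation
    response.IsValid
```

In the example above, the value returned by client.Index<FileData>(...) could be piped into reportIndexResult.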

Once the document is indexed, it can be displayed by showing all documents in the index.

    let showContents = true
    let searchResults = client.Search<FileData>(new SearchRequest<FileData>())
    searchResults.Documents
    |> Seq.iter (fun doc ->
        printfn "%s (%A)" (Path.Combine(doc.Directoryname, doc.Filename)) (doc.ModifyDate)
        if showContents then printfn "%s" doc.Contents else ())

Showing all documents is fine, but not very interesting. Here is a more useful example: a boolean search for text in either the filename or contents attributes of the document. A couple of notes: constructing the search is a bit more involved, and when building the SearchRequest, additional attributes can be set, like Size (the number of records to return). The request must also be upcast to ISearchRequest to be consumed.

    /// Search and display results
    let text = "fsharp"
    let result = client.Search<FileData>(fun (s:SearchDescriptor<FileData>) ->
        new SearchRequest(
            Size = (nullable 1000),
            Query = new QueryContainer(query = BoolQuery(Should = [
                new QueryContainer(query = new TermQuery(Field = new Field("filename"), Value = text));
                new QueryContainer(query = new TermQuery(Field = new Field("contents"), Value = text))
            ]))
        ) :> ISearchRequest)

    printfn "MaxScore: %f" result.MaxScore
    result.Documents
    |> Seq.iter (fun doc ->
        printfn "%s (%A)" (Path.Combine(doc.Directoryname, doc.Filename)) (doc.ModifyDate)
        if showContents then printfn "%s" doc.Contents else ())

The above approach is typical. But in the spirit of there-is-more-than-one-way-to-do-it, queries can be created in raw form. Below is the same query.

    /// Search and display results (use raw query)
    let result = client.Search<FileData>(fun (s:SearchDescriptor<FileData>) ->
        let sr = new Nest.SearchRequest(Size = (nullable 1000))
        let query = new QueryContainerDescriptor<FileData>()
        // query.Raw((sprintf "{ \"match\": { \"contents\": \"%s\"} }" text)) |> ignore
        let queryString =
            sprintf "
                {
                    \"bool\": {
                        \"should\": [
                            { \"term\": { \"filename\": \"%s\" }},
                            { \"term\": { \"contents\": \"%s\" }} ] } }"
                text text

        query.Raw(queryString) |> ignore
        sr.Query <- query
        sr :> ISearchRequest)

    printfn "MaxScore: %f" result.MaxScore
    result.Documents
    |> Seq.iter (fun doc ->
        printfn "%s (%A)" (Path.Combine(doc.Directoryname, doc.Filename)) (doc.ModifyDate)
        if showContents then printfn "%s" doc.Contents else ())
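
Since a raw query is just a string, splicing search text in with sprintf breaks as soon as the text contains a quote or backslash. One possible way to keep raw queries composable is to describe them with a small discriminated union and render that to JSON. The sketch below is only an illustration: the Query type, escapeJson, and toJson are hypothetical names of mine, not part of NEST, and the union covers only the two query shapes used in this post.

```fsharp
open System

/// Hypothetical query model for illustration; these names are not part of NEST.
type Query =
    | Term of field: string * value: string
    | Should of Query list

/// Escape the characters that would break a JSON string literal.
let escapeJson (s: string) =
    s
    |> String.collect (fun c ->
        match c with
        | '"' -> "\\\""
        | '\\' -> "\\\\"
        | '\n' -> "\\n"
        | '\r' -> "\\r"
        | '\t' -> "\\t"
        | c when c < ' ' -> sprintf "\\u%04x" (int c)
        | c -> string c)

/// Render a Query as an Elasticsearch query DSL string.
let rec toJson (query: Query) =
    match query with
    | Term (field, value) ->
        sprintf "{ \"term\": { \"%s\": \"%s\" } }" (escapeJson field) (escapeJson value)
    | Should clauses ->
        clauses
        |> List.map toJson
        |> String.concat ", "
        |> sprintf "{ \"bool\": { \"should\": [ %s ] } }"

// Same search as the raw example: text in either filename or contents
let text = "fsharp"
let queryString =
    Should [ Term ("filename", text); Term ("contents", text) ] |> toJson

printfn "%s" queryString
```

The rendered string could then be handed to query.Raw(queryString) in place of the sprintf template. Because Should holds a list of Query values, clauses nest without any further string handling.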

Now that I’ve shown an insert and searches, here is how to delete an index. Deleting an index is easy enough, but the call doesn’t take an index name as a string directly, so an Indices object needs to be created from the index name string.

    let result = client.DeleteIndex(new DeleteIndexRequest(Indices.Parse(SearchIndex)))
    printfn "%A" result

    0 // exit code; an [<EntryPoint>] function must return an int

There you have it. This has been a short introduction to using F# with the Elasticsearch NEST library. There is certainly more, but most of the interesting composition lies in constructing custom searches, and the above patterns should be enough to guide the rest of the way. One way this process could improve is to use discriminated unions and records. To that end, a quick search found some projects in various states of completeness. I certainly enjoy finding these, since coding the F# way is often more pleasant, so they are on my list for future evaluation. That is all for today. Until next time, thanks.
