
Administration guide



1. ElasticSearch Clustering

ElasticSearch is built to be always available and to scale with your needs. Scale can come from buying bigger servers (vertical scale, or scaling up) or from buying more servers (horizontal scale, or scaling out). While ElasticSearch can benefit from more powerful hardware, vertical scale has its limits. Real scalability comes from horizontal scale: the ability to add more nodes to the cluster and to spread load and reliability between them. With most databases, scaling horizontally usually requires a major overhaul of your application to take advantage of the extra boxes. In contrast, ElasticSearch is distributed by nature: it knows how to manage multiple nodes to provide scale and high availability, which also means that your application does not need to care about it.

A node is a running instance of ElasticSearch, while a cluster consists of one or more nodes with the same cluster.name that work together to share their data and workload. As nodes are added to or removed from the cluster, the cluster reorganizes itself to spread the data evenly.

One node in the cluster is elected to be the master node, which is in charge of managing cluster-wide changes such as creating or deleting an index, or adding or removing a node from the cluster. The master node does not need to be involved in document-level changes or searches, which means that having just one master node will not become a bottleneck as traffic grows. Any node can become the master. Every node knows where each document lives and can forward a request directly to the nodes that hold the data of interest. Whichever node is contacted manages the process of gathering the response from the node or nodes holding the data and returning the final response to the client. All of this is managed transparently by ElasticSearch.

CyberQuest takes advantage of this technology: whether the underlying database is a clustered or a single-node deployment, no additional configuration of CyberQuest is required.
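
As a quick illustration, cluster membership and the elected master can also be inspected from the command line with the _cat APIs (a minimal sketch; it assumes curl is available on the appliance and ElasticSearch listens on the default port 9200):

# List all nodes in the cluster; the node marked with * is the elected master
curl -s 'http://localhost:9200/_cat/nodes?v'

# Show which node currently holds the master role
curl -s 'http://localhost:9200/_cat/master?v'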

1.1. Checking cluster health

CyberQuest comes installed with the ElasticSearch Kopf plugin for a visual representation of the database.

It can be accessed from a web browser on port 9000 of the CyberQuest IP address:

http://CyberQuestIP:9000/
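
The same information is also exposed directly by the ElasticSearch REST API, so basic cluster health can be verified without the plugin (a minimal sketch, assuming the API is reachable on port 9200 from the appliance itself):

# Returns the cluster name, status (green/yellow/red) and number of nodes
curl -s 'http://localhost:9200/_cluster/health?pretty'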

! By default, port 9000 is secured by the local Debian iptables utility and is only accessible from localhost. This can be checked from an SSH (PuTTY) session by running the following command:

iptables -L


To be able to use the Kopf plugin remotely, these rules need to be cleared and flushed with the following commands:

iptables -X


iptables -F


Verify the modifications with the same command as before:

iptables -L


With the rules cleared, the plugin can be launched and can communicate with the database.
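
Alternatively, instead of flushing all rules, access to port 9000 can be limited to a single management workstation (a hedged sketch; the address 192.168.1.50 is a placeholder for your own administrative host, and the rule is inserted ahead of the existing localhost-only restriction):

# Accept connections to the Kopf port only from the management workstation
iptables -I INPUT -p tcp --dport 9000 -s 192.168.1.50 -j ACCEPT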

In a web browser, type the following link in the address bar:

http://CyberQuestIP:9200/_plugin/kopf/

The result should show the cluster overview: a single node for a single-node installation, or all nodes for a deployment with two or more clustered nodes.

1.2. Adding cluster nodes

ElasticSearch is configured to use unicast discovery out of the box to prevent nodes from accidentally joining a cluster; only nodes running on the same machine will automatically form a cluster. To use unicast, you provide ElasticSearch with a list of nodes that it should try to contact. When a node contacts a member of the unicast list, it receives a full cluster state that lists all of the nodes in the cluster. It then contacts the master and joins the cluster.

This means your unicast list does not need to include all of the nodes in your cluster. It just needs enough nodes that a new node can find someone to talk to. If you use dedicated masters, listing the three dedicated masters is enough. This setting is configured in elasticsearch.yml:

discovery.zen.ping.unicast.hosts: ["OtherElasticSearchHost1","OtherElasticSearchHost2"]

Save and restart the ElasticSearch service:

systemctl restart elasticsearch.service
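
For reference, a minimal clustering block in elasticsearch.yml might look as follows (a sketch only; the cluster name, node name and host list are placeholders that must be adapted to your own deployment):

cluster.name: cyberquest-cluster                # must be identical on every node
node.name: cyberquest-node-1                    # unique per node
network.host: 0.0.0.0                           # listen on all interfaces, not only localhost
discovery.zen.ping.unicast.hosts: ["OtherElasticSearchHost1", "OtherElasticSearchHost2"]
discovery.zen.minimum_master_nodes: 2           # quorum when three nodes are master eligible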

2. Additional ElasticSearch documentation

2.1. Zen Discovery

Zen discovery is the built-in, default discovery module for ElasticSearch. It provides unicast discovery but can be extended to support cloud environments and other forms of discovery.

Zen discovery is integrated with other modules; for example, all communication between nodes is done using the transport module.

It is separated into several sub modules, which are explained below:

Ping

This is the process where a node uses the discovery mechanisms to find other nodes.

2.2. Unicast

Unicast discovery requires a list of hosts to use as gossip routers. These hosts can be specified as hostnames or IP addresses; hosts specified as hostnames are resolved to IP addresses during each round of pinging. Note that with the Java security manager in place, the JVM defaults to caching positive hostname resolutions indefinitely. This can be modified by adding networkaddress.cache.ttl=<timeout> to your Java security policy. Any hosts that fail to resolve will be logged. Note also that with the Java security manager in place, the JVM defaults to caching negative hostname resolutions for ten seconds. This can be modified by adding networkaddress.cache.negative.ttl=<timeout> to your Java security policy.

| Setting | Description |
| --- | --- |
| hosts | Either an array setting or a comma delimited setting. Each value should be in the form of host:port or host (where port defaults to the setting transport.profiles.default.port, falling back to transport.tcp.port if not set). Note that IPv6 hosts must be bracketed. Defaults to 127.0.0.1, [::1]. |
| hosts.resolve_timeout | The amount of time to wait for DNS lookups on each round of pinging. Specified as time units. Defaults to 5s. |

The unicast discovery uses the transport module to perform the discovery.
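
For example, the unicast host list may mix hostnames, hostnames with an explicit port and plain IP addresses, and the DNS lookup timeout can be raised on slow networks (a sketch with placeholder hosts):

discovery.zen.ping.unicast.hosts: ["es-node-1", "es-node-2:9301", "10.0.0.15"]
discovery.zen.ping.unicast.hosts.resolve_timeout: 10s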

2.3. Master Election

As part of the ping process a master of the cluster is either elected or joined to. This is done automatically. The discovery.zen.ping_timeout (which defaults to 3s) allows for the tweaking of election time to handle cases of slow or congested networks (higher values assure less chance of failure). Once a node joins, it will send a join request to the master (discovery.zen.join_timeout) with a timeout defaulting to 20 times the ping timeout.

When the master node stops or has encountered a problem, the cluster nodes start pinging again and will elect a new master. This pinging round also serves as a protection against (partial) network failures where a node may unjustly think that the master has failed. In this case the node will simply hear from other nodes about the currently active master. If discovery.zen.master_election.ignore_non_master_pings is true, pings from nodes that are not master eligible (nodes where node.master is false) are ignored during master election; the default value is false. Nodes can be excluded from becoming a master by setting node.master to false.

The discovery.zen.minimum_master_nodes setting sets the minimum number of master eligible nodes that need to join a newly elected master in order for an election to complete and for the elected node to accept its mastership. The same setting controls the minimum number of active master eligible nodes that should be a part of any active cluster. If this requirement is not met, the active master node will step down and a new master election will begin. This setting must be set to a quorum of your master eligible nodes. It is recommended to avoid having only two master eligible nodes, since a quorum of two is two; the loss of either master eligible node would then result in an inoperable cluster.
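
As a worked example, with three master-eligible nodes the quorum is (3 / 2) + 1 = 2, so the corresponding line in elasticsearch.yml would be (illustrative only):

discovery.zen.minimum_master_nodes: 2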

2.4. Fault Detection

There are two fault detection processes running. The first is run by the master, which pings all the other nodes in the cluster to verify that they are alive. On the other end, each node pings the master to verify that it is still alive or whether an election process needs to be initiated. The following settings control the fault detection process using the discovery.zen.fd prefix:

| Setting | Description |
| --- | --- |
| ping_interval | How often a node gets pinged. Defaults to 1s. |
| ping_timeout | How long to wait for a ping response. Defaults to 30s. |
| ping_retries | How many ping failures / timeouts cause a node to be considered failed. Defaults to 3. |
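
On slow or congested networks these defaults can be relaxed in elasticsearch.yml, for example (illustrative values only):

discovery.zen.fd.ping_interval: 5s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 5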

2.5. Cluster state updates

The master node is the only node in a cluster that can make changes to the cluster state. The master node processes one cluster state update at a time, applies the required changes and publishes the updated cluster state to all the other nodes in the cluster. Each node receives the publish message and acknowledges it, but does not yet apply it. If the master does not receive acknowledgement from at least discovery.zen.minimum_master_nodes nodes within a certain time (controlled by the discovery.zen.commit_timeout setting, which defaults to 30 seconds) the cluster state change is rejected.

Once enough nodes have responded, the cluster state is committed and a message is sent to all the nodes. The nodes then proceed to apply the new cluster state to their internal state. The master node waits for all nodes to respond, up to a timeout, before processing the next updates in the queue. The discovery.zen.publish_timeout is set by default to 30 seconds and is measured from the moment publishing started. Both timeout settings can be changed dynamically through the cluster update settings API.
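
For example, both timeouts could be raised on a slow network through the cluster update settings API (a sketch; the values are illustrative):

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{ "transient": { "discovery.zen.commit_timeout": "60s", "discovery.zen.publish_timeout": "60s" } }'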

2.6. Extensive documentation

Extensive database documentation can be found here:

https://www.elastic.co/guide/en/elasticsearch/guide/index.html

3. System backup

3.1. Mysql database dump

The majority of the CyberQuest application configuration is stored in the local MySQL database.

This is backed up on a daily basis by using a MySQL dump script:

DATE=$(date +%Y-%m-%d)
mkdir -p /data/mysqlbackups/
mysqldump -u [dbuser] -p[dbpass] --all-databases | gzip > /data/mysqlbackups/$DATE.sql.gz

Replace [dbuser] and [dbpass] with the MySQL username and password, respectively.

This script is executed daily on a cronjob basis.

To check if the script is added to the crontab scheduler use the following command:

root@cyberquest:~# crontab -l


The following line needs to be present:

30 2 * * * /var/opt/cyberquest/dataacquisition/bin/mysql_full_backup.sh

If it is not, it can be added with:

crontab -e

and paste the line on the last row:

30 2 * * * /var/opt/cyberquest/dataacquisition/bin/mysql_full_backup.sh

Save and exit.

Finally, check the /data/mysqlbackups/ folder for the backed-up databases.
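
To restore a backup, the compressed dump can simply be streamed back into MySQL (a hedged sketch; replace [date], [dbuser] and [dbpass] with the actual file date and credentials):

# Restore all databases from a given day's dump
gunzip < /data/mysqlbackups/[date].sql.gz | mysql -u [dbuser] -p[dbpass]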

3.2. Event data backup

By default, every message that is collected by CyberQuest is automatically sent to the configured data storages:

Default: /data/storage/default

The events are normalized (JSON format), compressed, encrypted and digitally signed.

These can later be imported by creating another data storage from the Settings > Data Storages > New Data Storage button, copying the backed-up files into this new data storage, and creating a new import job from the Settings > Jobs > New Job button.

Create a new job with "Job Type" set to Import Job, and select the newly created data storage from the "Data Storage" drop-down menu.

3.3. Backup locations

Default backup locations for 3rd party backup solutions are:

a. Configuration database backups. Default: /data/mysqlbackups/ folder

b. Event data backups. Default: /data/storage/default/ folder

c. The current CyberQuest installation files. Default: /var/opt/cyberquest/ folder