

Single Node Setup Instructions

1. Ensure you have the syndeia-cloud-3.5_cassandra_zookeeper_kafka_setup.zip (or latest service pack) downloaded to your home directory from the download/license instructions sent out by our team.

(info)  Note: the .ZIP creates its own folder when extracted, so there is no need to create one for it beforehand.

2. Ensure you satisfy Zookeeper's prerequisites, ie: a supported JDK/JRE (OpenJDK or Oracle), sufficient memory, disk space, etc. (see https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_systemReq for more details).

3. If using a firewall, ensure the following port is accessible (consult your local network admin if required): TCP port 2181 (the port on which Zookeeper listens for client connections).


...

Download, Install, Configure & Run Apache Zookeeper

1. Download Zookeeper 3.6.3 from https://archive.apache.org/dist/zookeeper/zookeeper-3.6.3/apache-zookeeper-3.6.3-bin.tar.gz (ie: wget https://archive.apache.org/dist/zookeeper/zookeeper-3.6.3/apache-zookeeper-3.6.3-bin.tar.gz)
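
It can be worth verifying the download before extracting it. The sketch below is one possible way to do that, not part of the official instructions. It assumes a .sha512 file is published next to the tarball and that its first field is the hex digest; the helper names zk_tarball_url and verify_sha512 are illustrative only.

```shell
# Build the archive URL for a given Zookeeper version (same pattern as the URL above).
zk_tarball_url() {
    echo "https://archive.apache.org/dist/zookeeper/zookeeper-$1/apache-zookeeper-$1-bin.tar.gz"
}

# Compare a file's SHA-512 digest with the first field of a checksum file.
# Assumption: the checksum file starts with the hex digest.
verify_sha512() {
    expected=$(awk '{print $1; exit}' "$2")
    actual=$(sha512sum "$1" | awk '{print $1}')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage (needs network access):
#   url=$(zk_tarball_url 3.6.3)
#   wget "$url" "$url.sha512"
#   verify_sha512 apache-zookeeper-3.6.3-bin.tar.gz apache-zookeeper-3.6.3-bin.tar.gz.sha512 \
#       && echo "checksum OK"
```

If the checksum file uses a different layout, compare the digest printed by sha512sum against it by eye instead.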

...

7.  Create the configuration file /etc/zookeeper/conf/zoo.cfg using the bundled sample as a template, ie:  sudo mkdir -p /etc/zookeeper/conf/ && sudo cp /opt/zookeeper-${ZK_build_ver}/conf/zoo_sample.cfg /etc/zookeeper/conf/zoo.cfg, then edit it and replace its contents with the configuration below:

Code Block
# http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper
# Place the dataLogDir to a separate physical disc for better performance
# dataLogDir=/disk2/zookeeper

# the port at which the clients will connect
clientPort=2181

# specify all zookeeper servers
# The first port is used by followers to connect to the leader
# The second one is used for leader election
#server.1=zookeeperServer1.mydomain.tld:2888:3888
#server.2=zookeeperServer2.mydomain.tld:2888:3888
#server.3=zookeeperServer3.mydomain.tld:2888:3888

# To avoid seeks ZooKeeper allocates space in the transaction log file in
# blocks of preAllocSize kilobytes. The default block size is 64M. One reason
# for changing the size of the blocks is to reduce the block size if snapshots
# are taken more often. (Also, see snapCount).
#preAllocSize=65536

# Clients can submit requests faster than ZooKeeper can process them,
# especially if there are a lot of clients. To prevent ZooKeeper from running
# out of memory due to queued requests, ZooKeeper will throttle clients so that
# there is no more than globalOutstandingLimit outstanding requests in the
# system. The default limit is 1,000.

# ZooKeeper logs transactions to a transaction log. After snapCount
# transactions are written to a log file a snapshot is started and a new
# transaction log file is started. The default snapCount is 100,000.
#snapCount=1000

# If this option is defined, requests will be logged to a trace file named
# traceFile.year.month.day. 
#traceFile=

# Leader accepts client connections. Default value is "yes". The leader machine
# coordinates updates. For higher update throughput at the slight expense of
# read throughput the leader can be configured to not accept clients and focus
# on coordination.
#leaderServes=yes

(warning) Note:  In particular, please pay attention to the value of  dataDir=/var/lib/zookeeper and ensure the zookeeper user and kafka-zookeeper group have access to the directory.  
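
To see who currently owns the data directory, something like the following may help. It is a minimal sketch that assumes GNU stat (as shipped with most Linux distributions); the helper name dir_owner_mode is illustrative, not a Zookeeper tool.

```shell
# Print "owner:group mode" for a directory (default: the dataDir used above).
# Assumption: GNU stat, which supports the -c format option.
dir_owner_mode() {
    stat -c '%U:%G %a' "${1:-/var/lib/zookeeper}"
}

# Usage:
#   dir_owner_mode /var/lib/zookeeper
```

If the owner and group shown are not zookeeper and kafka-zookeeper, a command along the lines of sudo chown -R zookeeper:kafka-zookeeper /var/lib/zookeeper would correct it.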

(info) Note:  For a quick start, the above settings will get you up and running. For any multi-node deployment, however, you will need to specify a server.n entry for each server (where n = the server's id) and create a /var/lib/zookeeper/myid file on each server containing that server's id (see steps 4-5 of https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_zkMulitServerSetup for more details).
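
The server.n entries and the myid file follow a simple pattern, which the sketch below generates. The ports 2888 and 3888 are taken from the commented example in the configuration above; the helper names zk_server_lines and write_myid, and the hostnames in the usage comment, are hypothetical.

```shell
# Print one "server.N=host:2888:3888" line per host, numbering from 1.
zk_server_lines() {
    n=1
    for host in "$@"; do
        echo "server.${n}=${host}:2888:3888"
        n=$((n + 1))
    done
}

# Write a node's id into <dataDir>/myid.
# $1 = data directory, $2 = id of this node (must match its server.N entry).
write_myid() {
    printf '%s\n' "$2" > "$1/myid"
}

# Usage (hypothetical hostnames):
#   zk_server_lines zookeeper1.mycompany.com zookeeper2.mycompany.com zookeeper3.mycompany.com
#   write_myid /var/lib/zookeeper 1
```

The generated lines need to be identical in zoo.cfg on every node, while the myid value differs per node. Writing into /etc/zookeeper/conf or /var/lib/zookeeper will normally require sufficient privileges.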

8. Start the Zookeeper service, ie:  /opt/zookeeper-<release_ver>/bin/zkServer.sh start /etc/zookeeper/conf/zoo.cfg

(info) Note: Apache Zookeeper doesn't include a native systemd .service file by default.  While systemd will dynamically create one at runtime via its SysV compatibility module, you may wish to create one yourself to exercise better control over the various service parameters.  For your convenience we have created a systemd zookeeper.service file (included in the syndeia-cloud-3.5_cassandra_zookeeper_kafka_setup.zip download).  To use this, copy it to /etc/systemd/system, reload systemd units, enable zookeeper to start on boot and start the service, ie:  sudo cp <service_file_download_dir>/zookeeper.service /etc/systemd/system/. && sudo systemctl daemon-reload && sudo systemctl enable zookeeper && sudo systemctl start zookeeper
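
Whichever way the service was started, one way to confirm it is answering is Zookeeper's srvr four-letter command, whose output includes a "Mode:" line. The sketch below assumes nc (netcat) is installed and that srvr has not been removed from the four-letter-word whitelist; the helper names parse_zk_mode and zk_mode are illustrative.

```shell
# Extract the "Mode:" value (for example standalone, leader or follower)
# from srvr output supplied on stdin.
parse_zk_mode() {
    sed -n 's/^Mode: //p'
}

# Send srvr to a node and print its mode.
# $1 = host (default localhost), $2 = client port (default 2181).
# Assumption: nc is available on this machine.
zk_mode() {
    echo srvr | nc "${1:-localhost}" "${2:-2181}" | parse_zk_mode
}

# Usage:
#   zk_mode localhost 2181
```

On a single-node install the expected mode is standalone; an empty result suggests the service is not listening on that port.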

...

(info)  Before making the image you may wish to first stop and optionally disable the service temporarily to prevent auto-start on boot, ie:  sudo systemctl disable zookeeper  


...

Multi-Node (Cluster) Setup Instructions (Adding nodes to an existing single-node)

11. Deploy another instance of your Zookeeper base image.
12. Make any appropriate changes for the MAC address (ex: in the VM settings and/or udev, if used).
13. Set up forward & reverse DNS records on your DNS server (consult your IT admin/sysadmin if required) and set the hostname and primary DNS suffix on the machine itself (sudo hostnamectl set-hostname <new_Zookeeper_node_FQDN> where FQDN = Fully Qualified Domain Name, ex: zookeeper2.mycompany.com).
14. SSH to the IP (or the FQDN of the new node if DNS has already propagated).
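
Before continuing, it may be useful to confirm that the forward and reverse records from step 13 agree. The following is a rough sketch that assumes getent is available (it is part of glibc-based Linux systems); note that getent also consults /etc/hosts, so a local hosts entry can mask a missing DNS record. The helper name check_dns is hypothetical.

```shell
# Check that a name resolves and that its address resolves back to the same name.
# Returns 2 on missing argument, 1 on a lookup mismatch, 0 on success.
check_dns() {
    [ -n "$1" ] || { echo "usage: check_dns <fqdn>" >&2; return 2; }
    ip=$(getent hosts "$1" | awk '{print $1; exit}')
    [ -n "$ip" ] || { echo "no forward record for $1" >&2; return 1; }
    name=$(getent hosts "$ip" | awk '{print $2; exit}')
    [ "$name" = "$1" ] || { echo "reverse lookup of $ip gave '$name'" >&2; return 1; }
    echo "$1 <-> $ip"
}

# Usage:
#   check_dns zookeeper2.mycompany.com
```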

...

16. Repeat steps 11-15 for each additional cluster node.


...

Validating Zookeeper Operation for 1-node (or multiple nodes)

17. To validate Zookeeper operation, we connect to each node using the included client script.  To do this, start zkCli.sh on the server (or node 1 if testing a cluster) and perform the following steps:

...
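
As a complement to the interactive check, the same client script can be run non-interactively against each node. The sketch below is an assumption-laden example rather than the documented procedure: it assumes zkCli.sh exits non-zero when the command fails, uses the client port 2181 from the configuration above, and the helper name zk_validate_nodes is hypothetical.

```shell
# Run "ls /" against every node via the client script and report OK/FAILED per node.
# $1 = path to zkCli.sh, remaining arguments = node hostnames.
# Assumption: zkCli.sh returns a non-zero exit code if the command fails.
zk_validate_nodes() {
    zkcli="$1"
    shift
    rc=0
    for node in "$@"; do
        if "$zkcli" -server "${node}:2181" ls / >/dev/null 2>&1; then
            echo "OK ${node}"
        else
            echo "FAILED ${node}"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (hypothetical hostnames):
#   zk_validate_nodes /opt/zookeeper-<release_ver>/bin/zkCli.sh \
#       zookeeper1.mycompany.com zookeeper2.mycompany.com
```

The function returns non-zero if any node fails, which makes it easy to call from a larger health-check script.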