

Pre-requisites:

1.  Ensure you have syndeia-cloud-3.5_cassandra_zookeeper_kafka_setup.zip (or the latest service pack) downloaded to your home directory, per the download/license instructions sent out by our team.

(info)  Note: the .ZIP will pre-create a separate folder for its contents when extracted so there is no need to pre-create a separate folder for it.  

2.  Review Apache Cassandra's recommendations, e.g.: (Open|Oracle)JDK/JRE, memory, FS selection, params, etc., in Deployment.

...

1. Deploy a new standard RHEL/CentOS 7 headless image on a physical or virtual machine (VM), install from a Kickstart script, or install manually from media.

2. Set up forward & reverse DNS records on your DNS server (consult your IT admin/sysadmin if required) and set the hostname and primary DNS suffix on the machine itself if necessary.

3. If using a firewall, ensure the following ports are accessible (consult your local network admin if required): TCP ports 7000, 7001, 7199, 9042, 9142, 9160 (for details on what each port is used for see http://cassandra.apache.org/doc/latest/faq/index.html#what-ports & https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/security/secFirewallPorts.html#secFirewallPorts__firewall_table).

(info) Note: If required by your IT department, perform any other standard configuration (e.g.: create a separate admin account, set the timezone, set the date & time or synchronize with an NTP server, disable root logins, change the default SSH port, install Fail2Ban, enable & configure the local firewall, etc.)
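
If the node uses firewalld as its local firewall, the ports from step 3 could be opened along the lines below. This is only a sketch: it assumes firewalld is in use and that its default zone is the right one, and the helper name print_firewall_commands is hypothetical. It prints the commands instead of running them so they can be reviewed first.

```shell
#!/usr/bin/env bash
# TCP ports listed in step 3.
CASSANDRA_PORTS="7000 7001 7199 9042 9142 9160"

# Print one firewall-cmd call per port, followed by a reload.
# Assumption: firewalld is the local firewall and its default zone applies.
print_firewall_commands() {
  local port
  for port in $CASSANDRA_PORTS; do
    echo "sudo firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "sudo firewall-cmd --reload"
}

print_firewall_commands
```

Once the printed commands look right, they can be pasted into a shell or piped to one (print_firewall_commands | bash).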

...

4. Follow the "Installation from RPM packages" section at https://cassandra.apache.org/download/#installation-from-rpm-packages. This will set up https://www.apache.org/dist/cassandra/redhat/311x/ as a yum repo, install Cassandra (sudo yum install cassandra-3.11.13) + its dependencies (e.g.: openjdk), set it up to run as a boot-service & start it.

(info) Note1: The instructions on the Apache Cassandra download page currently mention using the legacy SysV command sudo service cassandra start to start the Cassandra service.  While this works, you will notice it redirects to systemctl, which is the standard way to manage services on RHEL/CentOS 7, i.e.: sudo systemctl start cassandra

(info) Note2: Apache Cassandra doesn't include a native systemd .service file by default.  While systemd will dynamically create one at runtime via its SysV compatibility module, you may wish to create one yourself to exercise better control over the various service parameters.  For your convenience we have created a systemd cassandra.service & a tmpfiles.d cassandra.conf file (included in the syndeia-cloud-3.5_cassandra_zookeeper_kafka_setup.zip download).  To use these, copy the tmpfiles.d cassandra.conf to /etc/tmpfiles.d/, run it, copy cassandra.service to /etc/systemd/system/, reload systemd's units, enable cassandra to start on boot and start the service, i.e.:  sudo cp <tmpfiles.d_conf_file_download_dir>/cassandra.conf /etc/tmpfiles.d/. ; sudo systemd-tmpfiles --create --boot /etc/tmpfiles.d/cassandra.conf ; sudo cp <service_file_download_dir>/cassandra.service /etc/systemd/system/. && sudo systemctl daemon-reload && sudo systemctl enable cassandra && sudo systemctl start cassandra
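
The chained command in Note2 may be easier to follow with one step per line. The sketch below wraps the same commands in a function. Its two arguments stand in for <tmpfiles.d_conf_file_download_dir> and <service_file_download_dir>; the function names and the DRY_RUN switch are hypothetical additions for previewing the commands, not part of the supplied files. Unlike the one-liner, it stops at the first step that fails.

```shell
#!/usr/bin/env bash
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# install_cassandra_unit TMPFILES_DIR SERVICE_DIR
# Both arguments are the directories where the downloaded files were extracted.
install_cassandra_unit() {
  local tmpfiles_dir="$1" service_dir="$2"
  run sudo cp "${tmpfiles_dir}/cassandra.conf" /etc/tmpfiles.d/ &&
  run sudo systemd-tmpfiles --create --boot /etc/tmpfiles.d/cassandra.conf &&
  run sudo cp "${service_dir}/cassandra.service" /etc/systemd/system/ &&
  run sudo systemctl daemon-reload &&
  run sudo systemctl enable cassandra &&
  run sudo systemctl start cassandra
}

# Preview without changing anything (paths are placeholders):
# DRY_RUN=1 install_cassandra_unit /path/to/tmpfiles_dir /path/to/service_dir
```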

5. Verify/configure the following settings in /etc/cassandra/conf/cassandra.yaml ( (info) 'YourClusterName' = a cluster name of your choice, ex:  'SC 3.5 Prod Cluster')

Code Block
languagesass
themeMidnight
cluster_name: 'YourClusterName'
num_tokens: 256
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
seed_provider:
 - class_name: org.apache.cassandra.locator.SimpleSeedProvider
   parameters:
        - seeds: "127.0.0.1"
listen_address: localhost
rpc_address: localhost
write_request_timeout_in_ms: 20000

(info) Note1:  FQDN = Fully Qualified Domain Name, ex: cassandra.mycompany.com 

(info) Note2:  The above settings will get you up and running for a quick start. For any production deployment, however, you may wish to implement other settings to enhance security (e.g.: changing the default cassandra superuser password, enabling encryption, etc.) & performance (setting the data & commitlog directories, swap file settings, etc.).  See Appendix B2.11 for more details.

(warning) If you frequently deal with large artifact sizes, you may also want to raise batch_size_fail_threshold_in_kb from its default of 50 (KB) to, for example, 100.
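
One way to verify the step 5 settings without opening an editor is to read each top-level key back from the file. The sketch below assumes each key appears uncommented at the start of a line with nothing after its value; show_setting and check_setting are hypothetical helper names, and nested values such as seeds are not covered.

```shell
#!/usr/bin/env bash
# show_setting FILE KEY: print the value of an uncommented top-level "KEY: value" line.
show_setting() {
  local file="$1" key="$2"
  sed -n "s/^${key}:[[:space:]]*//p" "$file"
}

# check_setting FILE KEY EXPECTED: succeed only if the value matches EXPECTED exactly.
check_setting() {
  [ "$(show_setting "$1" "$2")" = "$3" ]
}

# Example (commented out so the block has no side effects):
# check_setting /etc/cassandra/conf/cassandra.yaml authenticator PasswordAuthenticator && echo "authenticator OK"
# show_setting  /etc/cassandra/conf/cassandra.yaml batch_size_fail_threshold_in_kb
```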

6. If any changes were made in the above step, type sudo systemctl restart cassandra to restart the service.  If the service starts successfully you should get the command prompt back.  To confirm, verify that "Active: active (running)" shows up in the output of systemctl status cassandra

Code Block
languagebash
themeRDark
$ systemctl status cassandra
● cassandra.service - LSB: distributed storage system for structured data
   Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
   Active: active (running) since Wed 2022-09-07 02:41:38 EST; 1 weeks 6 days ago
     Docs: man:systemd-sysv-generator(8)
  Process: 3536 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, status=0/SUCCESS)
 Main PID: 3761 (java)
   CGroup: /system.slice/cassandra.service
           ‣ 3761 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-2.el8_6.x86_64/jre/bin/java -Xloggc:/var/log/cassandra/gc.log -XX:+UseParNewGC -XX:+UseCon...

Sep 07 02:41:35 cassandra.mycompany.com systemd[1]: Starting LSB: distributed storage system for structured data...
Sep 07 02:41:36 cassandra.mycompany.com su[3570]: (to cassandra) root on none
Sep 07 02:41:38 cassandra.mycompany.com cassandra[3536]: Starting Cassandra: OK
Sep 07 02:41:38 cassandra.mycompany.com systemd[1]: Started LSB: distributed storage system for structured data.
$
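
A script that follows step 6 may want to poll until the service reports active rather than reading the status output by eye. Below is a small sketch of a generic retry helper; wait_until is a hypothetical name, and systemctl is-active --quiet is used as the probe in the usage comment.

```shell
#!/usr/bin/env bash
# wait_until TRIES DELAY CMD...: run CMD up to TRIES times, sleeping DELAY
# seconds after each failed attempt; succeed as soon as CMD succeeds.
wait_until() {
  local tries="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (commented out so the block has no side effects):
# sudo systemctl restart cassandra
# wait_until 30 2 systemctl is-active --quiet cassandra && echo "cassandra is active"
```

Note that "active" only means the service process is running; /var/log/cassandra/system.log shows whether Cassandra itself finished starting.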

...

Code Block
languagetext
themeRDark
$ less /var/log/cassandra/system.log
[...]
INFO  [main] 2022-09-07 13:55:43,277 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
INFO  [main] 2022-09-07 13:55:43,613 Config.java:481 - Node configuration:[allocate_tokens_for_keyspace=null; authenticator=PasswordAuthenticator; authorizer=CassandraAuthorizer; auto_boo
tstrap=true; auto_snapshot=true; back_pressure_enabled=false; back_pressure_strategy=org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5, flow=FAST}; batch_size_fail_t
hreshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_address=null; broadcast_rpc_address=null; buffer_pool_use_heap_if_exhausted=true; cas_contention_timeout_in_ms=1000; 
[...]
INFO  [main] 2022-09-07 13:55:43,613 DatabaseDescriptor.java:367 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
[...]
INFO  [main] 2022-09-07 13:55:43,886 CassandraDaemon.java:471 - Hostname: cassandra.mycompany.com
INFO  [main] 2022-09-07 13:55:43,887 CassandraDaemon.java:478 - JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.8.0_332
[...]
INFO  [main] 2022-09-07 13:55:50,097 QueryProcessor.java:163 - Preloaded 328 prepared statements
INFO  [main] 2022-09-07 13:55:50,098 StorageService.java:617 - Cassandra version: 3.11.13
INFO  [main] 2022-09-07 13:55:50,098 StorageService.java:618 - Thrift API version: 20.1.0
INFO  [main] 2022-09-07 13:55:50,098 StorageService.java:619 - CQL supported versions: 3.4.4 (default: 3.4.4)
INFO  [main] 2022-09-07 13:55:50,099 StorageService.java:621 - Native protocol supported versions: 3/v3, 4/v4, 5/v5-beta (default: 4/v4)
INFO  [main] 2022-09-07 13:55:50,134 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 98 MB and a resize interval of 60 minutes
INFO  [main] 2022-09-07 13:55:50,142 MessagingService.java:753 - Starting Messaging Service on cassandra.mycompany.com/127.0.0.1:7000 (eth0)
INFO  [main] 2022-09-07 13:55:50,168 StorageService.java:706 - Loading persisted ring state
INFO  [main] 2022-09-07 13:55:50,169 StorageService.java:819 - Starting up server gossip
INFO  [main] 2022-09-07 13:55:50,224 TokenMetadata.java:479 - Updating topology for cassandra.mycompany.com/127.0.0.1
INFO  [main] 2022-09-07 13:55:50,225 TokenMetadata.java:479 - Updating topology for cassandra.mycompany.com/127.0.0.1
[...]
INFO  [main] 2022-09-07 13:55:50,392 StorageService.java:2268 - Node localhost/127.0.0.1 state jump to NORMAL
INFO  [main] 2022-09-07 13:55:50,404 AuthCache.java:172 - (Re)initializing CredentialsCache (validity period/update interval/max entries) (2000/2000/1000)
INFO  [main] 2022-09-07 13:55:50,406 Gossiper.java:1655 - Waiting for gossip to settle...
INFO  [main] 2022-09-07 13:55:58,408 Gossiper.java:1686 - No gossip backlog; proceeding
INFO  [main] 2022-09-07 13:55:58,470 NativeTransportService.java:70 - Netty using native Epoll event loop
[...]
INFO  [main] 2022-09-07 13:55:58,520 Server.java:156 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
INFO  [main] 2022-09-07 13:55:58,623 ThriftServer.java:116 - Binding thrift service to localhost/127.0.0.1:9160
INFO  [Thread-2] 2022-09-07 13:55:58,629 ThriftServer.java:133 - Listening for thrift clients...
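
Two lines in the log excerpt above are the most useful startup markers: the node reaching NORMAL state and the CQL listener starting. The sketch below checks for both. It assumes the message texts match the excerpt (they may differ between Cassandra versions), and cassandra_started is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# cassandra_started [LOGFILE]: succeed if the log contains both startup markers.
# Caveat: system.log also holds earlier startups, so old lines can match too;
# check the timestamps if in doubt.
cassandra_started() {
  local log="${1:-/var/log/cassandra/system.log}"
  grep -q 'state jump to NORMAL' "$log" &&
  grep -q 'Starting listening for CQL clients' "$log"
}

# Usage (commented out so the block has no side effects):
# cassandra_started && echo "startup markers found"
```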

...

10. Validate correct operation and create an archive image to use as a new base image if the node needs to be rebuilt or if you wish to create a cluster.  

(info)  Before making the image you may wish to first stop the service and optionally disable it temporarily to prevent auto-start on boot, i.e.:  sudo systemctl stop cassandra ; sudo systemctl disable cassandra


...

Multi-Node (Cluster) Setup Instructions

...

(warning) IMPORTANT: Pay special attention to steps 3b & 4: the data dir must be empty for a node to join the cluster, and auto_bootstrap: false should only be added to cassandra.yaml on seed nodes. Per the “Prerequisites” section, normally one would elect a subset of the nodes to be seeds (usually 2-3 per datacenter is sufficient). However, be aware that a regression bug in Cassandra v3.6 ~ v3.11.1 prevents non-seed nodes from starting; the workaround is to set all nodes as seeds, ex: seeds: <node1_IP>, <node2_IP>, ... <nodeN_IP> (see https://issues.apache.org/jira/browse/CASSANDRA-13851).
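
Both points in the warning above can be scripted. The sketch below builds a seeds entry from a list of node addresses and checks that a data directory is empty. seeds_line and dir_is_empty are hypothetical helper names, the addresses in the usage comment are placeholders, and /var/lib/cassandra/data is assumed to be the data directory (check data_file_directories in cassandra.yaml if it was changed).

```shell
#!/usr/bin/env bash
# seeds_line IP...: print a cassandra.yaml seeds entry listing every given address.
seeds_line() {
  local IFS=','
  printf -- '- seeds: "%s"\n' "$*"
}

# dir_is_empty DIR: succeed if DIR exists and has no entries (dotfiles included).
dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Usage (commented out so the block has no side effects):
# seeds_line 10.0.0.1 10.0.0.2 10.0.0.3
# dir_is_empty /var/lib/cassandra/data || echo "data dir is not empty; this node cannot join yet"
```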

20. Repeat steps 16 ~ 19 for each additional cluster node.

...