Deployment

Overview

This section describes how to deploy Syndeia Cloud.

(1) Please review the following sections before proceeding further.

  • Architecture - lists the Syndeia Cloud services and the infrastructure components required for Syndeia Cloud

  • Version Compatibility - lists the OS versions and infrastructure component versions that this release of Syndeia Cloud has been qualified and verified with

  • Requirements - lists the OS requirements, hardware requirements, and sizing guidelines for this release of Syndeia Cloud

(2) Syndeia Cloud - New installation vs. Upgrading from 3.5 SP2.

  • New Syndeia users will be doing a fresh deployment of Syndeia Cloud 3.6. Follow the instructions under Deployment Methods.

  • Existing Syndeia users will be upgrading from Syndeia Cloud 3.5 SP2 → Syndeia Cloud 3.6. Follow the instructions under Upgrades and Migration.


Requirements

Software

Architecture

A Syndeia Cloud (SC) 3.6 installation has the following stack of component dependencies (#1 = bottom of stack, #5 = top of stack):  

  1. Apache (or DataStax) Cassandra:   NoSQL Database (DB) layer

  2. Janusgraph (JG):  Gremlin query language and graph framework (depends on #1 for DB)

  3. Apache Zookeeper (ZK):  Cluster broker (even in single-node deployments)

  4. Apache Kafka:  Event processing queue for streams (depends on #3, even in single-node deployments); used by the SC graph service

  5. Syndeia Cloud (SC):  Server-based platform to create a queryable, visualizable & extensible federated digital thread among/within various engineering PLM tools.  

    1. SC core services:  core services that make up the SC framework (ideally started in the order below)

      1. sc-store:  communicates with backend DB (depends on #1 (Cassandra))

      2. sc-auth:  handles authentication-related requests (depends on #1 (Cassandra))

      3. sc-graph:  handles graph visualization tasks, ie: conversion of data from main DB into graph-friendly DB format, processing Gremlin graph queries, etc. (depends on #1 (Cassandra) & #4 (Apache Kafka))

      4. sc-web-gateway:  web dashboard + service request router (no service start dependencies but other services should be available for this service to be useful)

    2. SC integration services:  an integration-specific service for each PLM tool being integrated with (ex: Aras Innovator, JFrog Artifactory, Bitbucket (BB), Atlassian Confluence, SmartBear Collaborator, IBM Jazz CLM (includes DOORS NG (DNG)), GitHub, GitLab, Jama, JIRA, RESTful, Siemens Teamcenter, SysMLv2, TestRail, NoMagic Teamwork Cloud (TWCloud), VOLTA, PTC Windchill (WC), Zuken DS-CR, Zuken DS-E3, Zuken Genesys).  These have no service start dependencies, but the core services should be available for these services to be accessible.
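The dependencies listed in the stack above can be written down as data and used to sanity-check a planned start order. The following is a minimal sketch only: the dictionary layout and the helper name order_violations are illustrative assumptions and are not part of Syndeia Cloud or its setup scripts.

```python
# Dependencies exactly as listed in the stack above. The names used as keys
# are labels for this sketch, not necessarily the real OS service names.
DEPENDS_ON = {
    "cassandra": [],
    "janusgraph": ["cassandra"],
    "zookeeper": [],
    "kafka": ["zookeeper"],
    "sc-store": ["cassandra"],
    "sc-auth": ["cassandra"],
    "sc-graph": ["cassandra", "kafka"],
    "sc-web-gateway": [],
}


def order_violations(start_order, depends_on=DEPENDS_ON):
    """Return (service, dependency) pairs where the dependency is missing
    from start_order or is started after the service that needs it."""
    position = {name: i for i, name in enumerate(start_order)}
    problems = []
    for service in start_order:
        for dep in depends_on.get(service, []):
            if dep not in position or position[dep] > position[service]:
                problems.append((service, dep))
    return problems


# The order in which the components are listed in this section.
documented_order = [
    "cassandra", "janusgraph", "zookeeper", "kafka",
    "sc-store", "sc-auth", "sc-graph", "sc-web-gateway",
]
print(order_violations(documented_order))  # prints []
```

An empty list means every component starts after the components it depends on. Starting sc-graph before Cassandra and Kafka, for example, would be reported as two violations.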


Software: Version Compatibility

Syndeia Cloud 3.6 has been tested with the following software versions:  

  • RHEL/Alma Linux 8.6 ~ 8.10 (Linux strongly recommended)

    • GNU bash v4.X

    • python3 v3.6

  • Windows Server 2016 (you will need to run this within Docker/WSL2)

  • Java (Oracle or Open)JDK/JRE v11.0.23

  • Apache Cassandra v4.1.0

  • Janusgraph v1.0.0

  • Zookeeper v3.8.4

  • Kafka v2.13-3.7.0 (Note, “2.13” = the Scala language version & “3.7.0” = the actual Kafka version. The Apache Kafka project releases builds for multiple Scala versions & lists their download versions as vX.XX-Z.Z.Z where X.XX = Scala version and Z.Z.Z = actual Kafka version)

Note, versions newer than those listed above may or may not work.
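To make the Kafka version label above concrete, here is a small sketch that splits a label of the form vX.XX-Z.Z.Z into its Scala and Kafka parts. The function name split_kafka_download_version is hypothetical and exists only for this illustration.

```python
def split_kafka_download_version(label):
    """Split a label such as 'v2.13-3.7.0' into (scala_version, kafka_version).

    The part before the hyphen is the Scala version the build targets;
    the part after it is the Kafka version itself.
    """
    scala, sep, kafka = label.lstrip("v").partition("-")
    if not sep or not scala or not kafka:
        raise ValueError("expected '<scala>-<kafka>', got %r" % label)
    return scala, kafka


print(split_kafka_download_version("v2.13-3.7.0"))  # ('2.13', '3.7.0')
```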

Software: OS Account Permissions

  • Windows: Administrator access

  • Linux: DO NOT INSTALL ANY COMPONENT AS root OR INSTALL WITH sudo!
    Log in/SSH in as a normal user. The setup scripts will ask for permission where needed and will install each component under its own segregated system account, per standard *NIX security best practices.
    (Failing to heed this advice will result in the creation of files that the syndeia-cloud:syndeia-cloud, cassandra:cassandra, janusgraph:janusgraph, zookeeper:kafka-zookeeper, or kafka:kafka-zookeeper user:group accounts will subsequently NOT have access to!)


Hardware

Hosting

Currently, we recommend that any machine running Syndeia Cloud (SC), Cassandra, Janusgraph, Zookeeper, and Kafka together be dedicated (ie: not shared with other vendor software, ex: sharing the Cassandra DB with Teamwork Cloud, etc.).

A word about hosting infrastructure:  

WARNING:  Attempting to run on a non-dedicated node (ie: a default shared cloud/VPS/AWS instance or an (overcommitted) internal IT VM host) may subject you to unknown "noisy guest neighbors".  Depending on your relationship with the hosting provider (ie: internal or external 3rd-party) and the monitoring tools provided, you may or may not have any visibility into or control over this.  This could cause intermittent CQLSH timeouts or SC Circuit Breaker errors.  If you experience this, please increase (and/or reserve) the resources allocated to your node.  Linux guests have a bit more visibility because they can examine the CPU steal metric, ex: by running iostat 1 10 and/or top or htop (htop is available from the EPEL repository).
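As a complement to iostat and top, the steal metric can also be sampled directly from /proc/stat on a Linux guest. The sketch below is an illustration under assumptions: the function names are hypothetical, and it relies on the standard Linux layout of the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

```python
import time


def read_cpu_counters(stat_text):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before, after):
    """Percentage of elapsed CPU time reported as steal between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    if len(deltas) < 8:
        return 0.0  # kernel does not report a steal column
    # guest and guest_nice are already counted inside user and nice,
    # so only the first eight columns are summed for the total.
    total = sum(deltas[:8])
    return 100.0 * deltas[7] / total if total > 0 else 0.0


def sample_steal(interval=1.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return the steal %."""
    with open(path) as handle:
        first = read_cpu_counters(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = read_cpu_counters(handle.read())
    return steal_percent(first, second)
```

On a Linux guest, calling sample_steal() a few times during a period of CQLSH timeouts gives a rough indication of whether the hypervisor is withholding CPU time. A value that stays well above zero suggests contention from other guests.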

With the above in mind, we have the following hardware sizing requirements:

Hardware: Minimum

Single-Node Deployment Topology

  • CPU cores:  8 cores

  • RAM:  32GB of RAM

  • HD space:  100GB
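The minimums above lend themselves to a quick pre-flight check before installation. The following is a sketch, not part of the SC setup scripts: the names MINIMUM, find_shortfalls, and measure_host are hypothetical, the 5% tolerance is an assumption added because operating systems usually report slightly less RAM than the nominal figure, and measure_host relies on os.sysconf names that are available on Linux.

```python
import os
import shutil

# Single-node minimums taken from the list above.
MINIMUM = {"cpu_cores": 8, "ram_gb": 32, "disk_gb": 100}


def find_shortfalls(measured, minimum=MINIMUM, tolerance=0.05):
    """Return {name: (measured, required)} for each value that falls more
    than 'tolerance' (a fraction) below its required minimum."""
    shortfalls = {}
    for name, required in minimum.items():
        value = measured.get(name, 0)
        if value < required * (1.0 - tolerance):
            shortfalls[name] = (value, required)
    return shortfalls


def measure_host(disk_path="/"):
    """Best-effort measurement on Linux; sizes are reported in GiB."""
    gib = 1024 ** 3
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return {
        "cpu_cores": os.cpu_count() or 0,
        "ram_gb": ram_bytes / gib,
        "disk_gb": shutil.disk_usage(disk_path).total / gib,
    }
```

Calling find_shortfalls(measure_host()) on the target machine returns an empty dictionary when all three minimums are met within the tolerance.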

Caution regarding partitioning layouts on Linux:  Some installers will by default suggest overly complex partition layouts in which the disk space is wastefully chopped up across the FHS paths, so that /home, /opt, and /var/{lib,log} end up with minuscule space.  Please ensure this is NOT the case.  The majority of disk space should be allocated to these directories, which are used by the SC software, infrastructure components, DB data files, and log files.  The simplest solution is to use a 2-partition layout.  However, if you absolutely must have more partitions, please ensure the following minimums:

  • /home can at least fit the downloaded & extracted SC media (currently ~2 x 2GB = 4GB)

  • /opt can at least fit the installed ZK + Kafka + JG + SC (~3GB)

  • /var/lib/cassandra can at least fit your Cassandra DB (~7GB for a DB with, ex: 31.3k artifacts + 16.1k relations)

  • /var/log can at least fit any large logging events (ex: ~6GB for a DB with activity on, ex: 31.3k artifacts + 16.1k relations)
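The per-path minimums above can be checked with a few lines of Python. This is a sketch under assumptions: the names MIN_FREE_GB, nearest_existing, and undersized are hypothetical, and a path that does not exist yet (ex: /var/lib/cassandra before Cassandra is installed) is checked against its closest existing parent directory.

```python
import os
import shutil

# Minimum free space per path, in GB, taken from the list above.
MIN_FREE_GB = {
    "/home": 4,
    "/opt": 3,
    "/var/lib/cassandra": 7,
    "/var/log": 6,
}


def nearest_existing(path):
    """Walk up from 'path' until a directory that exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        path = os.path.dirname(path)
    return path


def undersized(min_free_gb=MIN_FREE_GB,
               free_bytes=lambda p: shutil.disk_usage(p).free):
    """Return {path: free_gb} for each path with less free space than required."""
    gb = 1000 ** 3
    report = {}
    for path, needed in min_free_gb.items():
        free = free_bytes(nearest_existing(path)) / gb
        if free < needed:
            report[path] = round(free, 1)
    return report
```

Note that the sketch checks each path on its own. If several of these paths live on the same partition, their minimums add up, so the free space on that partition has to cover the sum.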

Hardware: Ideal for Production (~100 Users)

Single-Node Deployment Topology (Recommended)

  • CPU cores:  16

  • RAM:  32GB of RAM

  • HD space:  200GB

Multi-Node Deployment Topologies

  • CPU cores:  varies depending on the topology (see Single-Node Deployment for the core breakdown)

  • RAM:  32GB of RAM

  • HD space:  100GB

Note, multi-node topologies are an advanced deployment with cluster management responsibilities.  We recommend starting with a single-node deployment and re-evaluating as your usage requirements grow.


Sizing References

  • CPU: Adjust per your expected CCU requirements.  As a reference point, internal SC 3.6 benchmark scalability testing showed the following results  

  • RAM: Adjust per your expected graph sizing requirements.

  • HD Space: As a reference point, SC itself requires ~1.8GB when extracted.  A sample environment with 9.2k relations, 66 relation types, ~16.4k artifacts, 285 artifact types, ~1k containers, 101 container types, 71 repositories, and 18 repository types requires ~306MB at the Cassandra database data layer (Replication Factor (RF) = 1) and ~281MB at the Kafka stream log layer, on an EXT4 file system in a single-node Cassandra deployment.  Note, we recommend periodically monitoring the size of the Kafka stream "logs" and the gremlin-server.log file(s), as they tend to grow (the latter can grow up to 1GB in a year with the default slf4jReporter metric enabled and logging every 180000 ms).
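One way to follow the monitoring recommendation above is to measure the on-disk size of the relevant directories or files on a schedule. The sketch below is illustrative only: the function name tree_size_bytes is hypothetical, and the locations of the Kafka stream logs and of gremlin-server.log depend on your installation, so they are left as arguments rather than hard-coded.

```python
import os


def tree_size_bytes(root):
    """Total size in bytes of a single file, or of all regular files
    found under a directory (symbolic links are skipped)."""
    if os.path.isfile(root):
        return os.path.getsize(root)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # File was rotated or removed while walking; ignore it.
                pass
    return total


def size_mb(path):
    """Convenience wrapper returning the size in MB, rounded to one decimal."""
    return round(tree_size_bytes(path) / 1000 ** 2, 1)
```

Running size_mb() against the Kafka log directory and the gremlin-server.log file from a periodic job, and recording the results, gives an early signal before either fills its partition.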