JanusGraph Setup & Testing Instructions for Windows (2012-R2 x64)
Single Node Setup Instructions
Pre-requisites
1. Ensure you have the syndeia-cloud-3.4_janusgraph_setup.zip (or latest service pack) downloaded to your home directory (or home directory's Downloads folder) from the download/license instructions sent out by our team.
Note: the .ZIP will pre-create a separate folder for its contents when extracted so there is no need to pre-create a separate folder for it.
2. Ensure you satisfy JanusGraph's pre-requisites, ie: have (Open|Oracle)JDK/JRE + Apache Cassandra (& optionally Elasticsearch) installed. CPU, memory, and disk space requirements will depend on how much data you expect to store, query, and graph.
3. If using a firewall, and JG will be on a separate server from SC, ensure the following port is accessible (consult your local network admin if required): TCP port 8182 (this is the port to listen to for client connections).
Note: If required by your IT department, perform any other standard configuration and server hardening (ie: enabling & configuring the local firewall, etc.).
4. Ensure you have Apache Commons Daemon installed from https://archive.apache.org/dist/commons/daemon/binaries/windows/commons-daemon-1.2.3-bin-windows.zip.
Note, if you followed the Note from step 11 from Download, Install, & Run Apache Cassandra, you should already have this installed.
To avoid any issues on Windows, it is recommended you do NOT extract it to a path that contains any spaces in any of the directory names.
Downloading & Extracting JanusGraph
1. Launch a Cygwin Terminal; it should start in your home dir.
2. Download JanusGraph 0.3.1 from https://github.com/JanusGraph/janusgraph/releases/tag/v0.3.1, ie: wget https://github.com/JanusGraph/janusgraph/releases/download/v0.3.1/janusgraph-0.3.1-hadoop2.zip
Note, releases currently seem to be all suffixed with “...-hadoop2...” despite being applicable for all databases, even Cassandra, so download the one suffixed with “...-hadoop2.zip”.
3. Download Syndeia Cloud's JanusGraph setup package from the Downloads page.
4. Unzip the JanusGraph package into /opt, ie: JG_build_ver=0.3.1-hadoop2; unzip janusgraph-${JG_build_ver}.zip -d /opt/
Note, on Windows, HortonWorks' winutils.exe is "required" to avoid an error when gremlin-server.bat or gremlin.bat is run. To avoid this error, cd to your JanusGraph bin dir and download winutils.exe (you may need to also copy in MSVCR100.DLL from $JAVA_HOME\bin too), ie: cd /opt/janusgraph-${JG_build_ver}/bin; wget https://github.com/cdarlint/winutils/raw/master/hadoop-3.2.1/bin/winutils.exe; cp "$JAVA_HOME/bin/msvcr100.dll" /opt/janusgraph-${JG_build_ver}/bin;
IMPORTANT! There is currently a bug in JanusGraph's bin\gremlin.bat file that prevents proper CLI script execution mode. This requires applying a patch (see Appendix E5.1).
5. Unzip Syndeia Cloud's JanusGraph setup package into /opt/icx, ie: unzip syndeia-cloud-3.4.SP3_2022-07-01_janusgraph_setup.zip -d /opt/icx/
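The download and extract steps above repeat the same build string in several places. As a minimal sketch (the helper names jg_zip_name, jg_zip_url and jg_home are hypothetical, not part of the setup package), the build string can be set once and the file name, URL and install directory derived from it:

```shell
# Sketch only: derive the JanusGraph zip name, download URL and install
# directory from a single build string such as 0.3.1-hadoop2.
# jg_zip_name, jg_zip_url and jg_home are hypothetical helper names.

jg_zip_name() {
  # $1 = build string, e.g. 0.3.1-hadoop2
  printf 'janusgraph-%s.zip\n' "$1"
}

jg_zip_url() {
  # The release tag is the build string without its "-hadoop2" suffix.
  build=$1
  tag="v${build%%-*}"
  printf 'https://github.com/JanusGraph/janusgraph/releases/download/%s/%s\n' \
    "$tag" "$(jg_zip_name "$build")"
}

jg_home() {
  # Directory the zip creates when extracted into /opt
  printf '/opt/janusgraph-%s\n' "$1"
}
```

Usage, assuming the same JG_build_ver variable as in step 4: JG_build_ver=0.3.1-hadoop2; wget "$(jg_zip_url "$JG_build_ver")"; unzip "$(jg_zip_name "$JG_build_ver")" -d /opt/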
Pre-Janusgraph Setup Health Check
When completed, confirm all of the following are true via the Cygwin Terminal before proceeding:
(The tools used and the network locations used are site-dependent.)
- Cassandra service is running:
sc queryex cassandra
- Cassandra port is accessible/open:
netstat -ab | grep -A1 9042
- Cassandra service is accessible via CQLSH (see the Windows cqlsh check).
- JanusGraph files exist at:
/opt/janusgraph-${JG_build_ver}
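The port check above relies on reading netstat output by eye. A minimal sketch of the same check as a function follows; port_listening is a hypothetical helper name, and it assumes the usual Windows netstat -an column layout (Proto, Local Address, Foreign Address, State):

```shell
# Sketch only: succeed when a TCP port is in the LISTENING state.
# port_listening is a hypothetical helper; it reads `netstat -an` output
# from stdin so it can be tried against saved output as well.

port_listening() {
  # $1 = TCP port number
  awk -v port="$1" '
    { sub(/\r$/, "") }                      # drop Windows line endings
    $1 == "TCP" && $NF == "LISTENING" && $2 ~ (":" port "$") { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

Usage from the Cygwin Terminal: netstat -an | port_listening 9042 && echo "Cassandra port is open"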
JanusGraph Keyspace Setup, Service Setup & Start:
6. cd into the bin dir of the extracted setup package and run the Syndeia Cloud JanusGraph setup script:
cd /opt/icx/syndeia-cloud-3.4_janusgraph_setup/bin; ./syndeia-cloud-3.4_janusgraph_setup_windows.bash
This will:
- Create/update janusgraph-current symlink to specified version, default = 0.3.1
- Create a new syndeia_admin superuser in Cassandra with a password you specify
- Create new syndeia_cloud_graph and syndeia_cloud_graph_config keyspaces
- GRANT ALL PERMISSIONS ON KEYSPACE syndeia_cloud_graph TO syndeia_admin
- GRANT ALL PERMISSIONS ON KEYSPACE syndeia_cloud_graph_config TO syndeia_admin
- Install & run a Groovy JanusGraph setup script to set storage params for your graph and build indexes
- Create a renamed copy of the file /opt/janusgraph-<release_ver>/conf/janusgraph-cql-configurationgraph.properties as janusgraph-cql-configurationgraph-syndeia.properties with:
  graph.graphname=syndeia_cloud_graph_config
  Add storage.username=syndeia_admin
  Add storage.password=<password_specified>
  storage.hostname=<your_Cassandra_host>
- If you installed Elasticsearch on the same machine, add index.search.backend=elasticsearch and index.search.hostname=localhost to use Elasticsearch for search indexing
- Create a renamed copy of the file /opt/janusgraph-<release_ver>/conf/gremlin-server/gremlin-server-configuration.yaml as gremlin-server-configuration-syndeia.yaml and set ConfigurationManagementGraph to point to janusgraph-cql-configurationgraph-syndeia.properties.
- Install service file for JanusGraph service
- Start JanusGraph service
Note: use *NIX style paths, ie: / vs \, and remember that C:\cygwin64 = / (FS root). You will also be prompted for your cassandra account password (default = cassandra), your new syndeia_admin password, and the FQDN of your Cassandra host.
Avoid all of these special characters: \?*[]+#&.{}$ when choosing your syndeia_admin password.
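To check a candidate syndeia_admin password against the character list above before running the setup script, something like the following sketch may help. password_ok is a hypothetical helper, not part of the setup package, and rejecting an empty password is an added assumption:

```shell
# Sketch only: reject passwords containing any of \ ? * [ ] + # & . { } $
# password_ok is a hypothetical helper name.

password_ok() {
  # $1 = candidate password; returns 0 when acceptable, 1 otherwise
  [ -n "$1" ] || return 1          # assumption: empty passwords are rejected
  # Bracket expression: "]" first so it is literal, "\" is literal inside [].
  if printf '%s' "$1" | grep -q '[]\?*[+#&.{}$]'; then
    return 1
  fi
  return 0
}
```

Usage: password_ok 'MyCandidate_1' && echo "OK to use" (use single quotes so the shell does not expand $ or \ before the check).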
Configuration Health Check
When completed, confirm all of the following are true before proceeding or when trying to determine if the JanusGraph service is properly configured and operating.
(The tools used and the network locations used are site-dependent.)
- SC's .groovy script for JG ran successfully, ie: you should have only received a WARN-ing similar to:
WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- JanusGraph service is now running:
sc queryex janusgraph
- JanusGraph port is accessible:
netstat -ab | grep -A1 8182
- Cassandra syndeia_cloud_graph and syndeia_cloud_graph_config keyspaces exist, ie: if you type SELECT * FROM syndeia_cloud_ in CQLSH and hit tab, you should see both syndeia_cloud_graph. and syndeia_cloud_graph_config. offered as completions.
- Following files exist as described below:
  /opt/janusgraph-current symlink exists, and points to the latest Intercax-approved version of JanusGraph
  /opt/janusgraph-current/conf/gremlin-server/gremlin-server-configuration-syndeia.yaml exists. When completed with the defaults, the file begins with a host: localhost entry followed by port: 8182.
  /opt/janusgraph-current/conf/janusgraph-cql-configurationgraph-syndeia.properties exists. Its storage.hostname entry should be set to the hostname/IP address for the Cassandra host.
  /opt/janusgraph-current/log/gremlin-server.log exists (may only apply for Linux)
- A message similar to
INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Channel started at port 8182.
appears near the end of the gremlin-server.log file
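The properties-file check above can be sketched as a small lookup function. prop_value is a hypothetical helper name; it assumes plain key=value lines with no spaces around the = sign, as in the entries listed for the setup script:

```shell
# Sketch only: print the value of a key from a key=value properties file.
# prop_value is a hypothetical helper name.

prop_value() {
  # $1 = key, $2 = properties file
  # Prints the value of the last matching line; returns 1 if the key is absent.
  awk -v key="$1" '
    { sub(/\r$/, "") }                      # drop Windows line endings
    index($0, key "=") == 1 { value = substr($0, length(key) + 2); found = 1 }
    END { if (found) print value; exit (found ? 0 : 1) }
  ' "$2"
}
```

Usage: prop_value storage.hostname /opt/janusgraph-current/conf/janusgraph-cql-configurationgraph-syndeia.properties should print the hostname/IP address of your Cassandra host.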
Managing JanusGraph:
7. To check the status of JanusGraph services, use either the services.msc Control Panel applet or run sc "\\localhost" queryex janusgraph from an Administrator "Command Prompt" (CMD.EXE) (launch via Start→Run or if using GUI, R-click on icon and select "Run as Administrator"). You can confirm that it started by checking that "4 RUNNING" shows up in the STATE field of the output:
C:\>sc "\\localhost" queryex janusgraph

SERVICE_NAME: janusgraph
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 8336
        FLAGS              :

C:\>
8. To stop/start the JanusGraph services, use either the services.msc Control Panel applet or run sc "\\localhost" <action> janusgraph; where <action> = one of start|stop from the Command Prompt (CMD.EXE). The service will be issued a start|stop command (query the status a few seconds afterwards to confirm successful start), ex:
C:\>sc "\\localhost" start janusgraph

SERVICE_NAME: janusgraph
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 2  START_PENDING
                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x7d0
        PID                : 8336
        FLAGS              :

C:\>
Note: If you wish to ensure the services run on startup, use either the services.msc Control Panel applet or run sc "\\localhost" config janusgraph start= auto depend= cassandra (sc requires the space after each =). For alternative ways to set up non-native Windows services (to auto-start), see Setting up Services to Start on Boot.
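When checking the service from the Cygwin Terminal rather than CMD.EXE, the STATE field of the sc queryex output shown in steps 7 and 8 can be extracted with a short filter. This is a sketch only; sc_state is a hypothetical helper name and it assumes the "STATE : <code> <NAME>" layout shown above:

```shell
# Sketch only: print the state name (RUNNING, START_PENDING, STOPPED, ...)
# from `sc queryex <service>` output supplied on stdin.
# sc_state is a hypothetical helper name.

sc_state() {
  awk '
    { sub(/\r$/, "") }                      # drop Windows line endings
    $1 == "STATE" { print $4; found = 1; exit }
    END { exit (found ? 0 : 1) }
  '
}
```

Usage: sc queryex janusgraph | sc_state prints RUNNING once the service has started; a result of START_PENDING means the query should be repeated a few seconds later.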
9. To view the logs for JanusGraph, use less /opt/janusgraph-0.3.1-hadoop2/log/gremlin-server.log. To follow the log file, you can use tail -f /opt/janusgraph-0.3.1-hadoop2/log/gremlin-server.log in the Cygwin terminal.
Note, for your convenience you may wish to create a symlink to /opt/janusgraph-0.3.1-hadoop2/log/ from /var/log, ie: ln -nfs /opt/janusgraph-0.3.1-hadoop2/log/ /var/log/janusgraph
You should see output similar to the following (abridged) text:
$ less /var/log/janusgraph/gremlin-server.log
[...]
2987 [main] INFO com.datastax.driver.core.NettyUtil - Found Netty's native epoll transport in the classpath, using it
4827 [main] INFO com.datastax.driver.core.policies.DCAwareRoundRobinPolicy - Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
4835 [main] INFO com.datastax.driver.core.Cluster - New Cassandra host janusgraph.domain.com/localhost:9042 added
9043 [main] INFO org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration - Set default timestamp provider MICRO
11325 [main] INFO org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration - Generated unique-instance-id=2d3877742385-janusgraph-domain-com1
11341 [main] INFO com.datastax.driver.core.ClockFactory - Using java.lang.System clock to generate timestamps.
11902 [main] INFO com.datastax.driver.core.policies.DCAwareRoundRobinPolicy - Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
11902 [main] INFO com.datastax.driver.core.Cluster - New Cassandra host janusgraph.domain.com/localhost:9042 added
12251 [main] INFO org.janusgraph.diskstorage.Backend - Initiated backend operations thread pool of size 8
27969 [main] INFO org.janusgraph.diskstorage.Backend - Configuring total store cache size: 208138508
33098 [main] INFO org.janusgraph.diskstorage.log.kcvs.KCVSLog - Loaded unidentified ReadMarker start time 2019-03-18T01:55:02.918Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@28d6290
35668 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized Gremlin thread pool. Threads in pool named with pattern gremlin-*
36019 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized GremlinExecutor and preparing GremlinScriptEngines instances.
40510 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized gremlin-groovy GremlinScriptEngine and registered metrics
40532 [main] INFO org.apache.tinkerpop.gremlin.server.op.OpLoader - Adding the standard OpProcessor.
40540 [main] INFO org.apache.tinkerpop.gremlin.server.op.OpLoader - Adding the session OpProcessor.
41030 [main] INFO org.apache.tinkerpop.gremlin.server.op.OpLoader - Adding the traversal OpProcessor.
41076 [main] INFO org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor - Initialized cache for TraversalOpProcessor with size 1000 and expiration time of 600000 ms
41097 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - idleConnectionTimeout was set to 0 which resolves to 0 seconds when configuring this value - this feature will be disabled
41098 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - keepAliveInterval was set to 0 which resolves to 0 seconds when configuring this value - this feature will be disabled
41247 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v3.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0
41248 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v3.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0
41298 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v3.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0
41299 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0
41306 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
41311 [main] WARN org.apache.tinkerpop.gremlin.server.AbstractChannelizer - The org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0 serialization class is deprecated.
41314 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-lite with org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0
41315 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
41325 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v2.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0
41347 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
41350 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - application/json already has org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0 configured - it will not be replaced by org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0.
41437 [gremlin-server-boss-1] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Gremlin Server configured with worker thread pool of 1, gremlin pool of 4 and boss thread pool of 1.
41438 [gremlin-server-boss-1] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Channel started at port 8182.
181192 [metrics-logger-reporter-thread-1] INFO org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics - type=GAUGE, name=org.apache.tinkerpop.gremlin.server.GremlinServer.gremlin-groovy.sessionless.class-cache.average-load-penalty, value=2.450843818E9
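Rather than scanning the whole log, the "Channel started at port" line can be searched for directly. The sketch below uses a hypothetical helper, channel_started_ms, and assumes the leading number on each log line is the elapsed time in milliseconds since server start, which is how it appears in the sample output above:

```shell
# Sketch only: find the most recent "Channel started at port <port>." line
# in a gremlin-server.log and print its leading number.
# channel_started_ms is a hypothetical helper name.

channel_started_ms() {
  # $1 = port, $2 = log file; returns 1 when no such line exists
  awk -v port="$1" '
    { sub(/\r$/, "") }                      # drop Windows line endings
    index($0, "Channel started at port " port ".") { ms = $1; found = 1 }
    END { if (found) print ms; exit (found ? 0 : 1) }
  ' "$2"
}
```

Usage: channel_started_ms 8182 /opt/janusgraph-0.3.1-hadoop2/log/gremlin-server.log; a non-zero exit status suggests the server has not finished starting.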
10. Validate correct operation and create/update an archive image to use as a new base image if the node needs to be rebuilt or if you wish to create a cluster.
Before making the image you may wish to first stop and optionally disable the service temporarily to prevent auto-start on boot, ie: sc "\\localhost" config janusgraph start= disabled (sc requires the space after the =).
Multi-Node (Cluster) Setup Instructions
Enabling your single-node deployment for cluster operation
11. If Cassandra will be on a separate server from JG, ensure you first enable Cassandra for cluster operation.
12. Update the JanusGraph syndeia_cloud_graph configuration by running (from the Cygwin Terminal) /opt/janusgraph-current/bin/gremlin.bat and entering the following Groovy code, where cassandra.mydomain.com = the FQDN of the Cassandra server that you have now exposed:
:remote connect tinkerpop.server conf/remote.yaml session
:remote console
map = new HashMap();
map.put('storage.backend', 'cql'); // should return: ==>null
map.put('storage.hostname', 'cassandra.mydomain.com');
ConfiguredGraphFactory.updateConfiguration('syndeia_cloud_graph', map);
13. If JanusGraph will be on a different server than SC, please also:
13A. Update the host: localhost entry on line 1 in /opt/janusgraph-current/conf/gremlin-server/gremlin-server-configuration-syndeia.yaml, ex:
host: jg-server.domain.tld
port: 8182
[...]
where jg-server.domain.tld = the FQDN of your JG server
Note, Java apps tend to not reset the TTY terminal correctly in Cygwin's terminal, so if after running the above you do not see any of your keystrokes being output, ie: your typing is "invisible", type "stty sane" <Enter> (you may need to hit ^C and then Enter a few times to ensure you're at a fresh prompt).
13B. Ensure port 8182 is accessible on the JG server.
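The host: edit in step 13A can also be made from the Cygwin Terminal. The following is a sketch under stated assumptions: set_gremlin_host is a hypothetical helper, the host: entry is assumed to start at the beginning of a line, and the new host is assumed to contain only hostname characters (no / or &):

```shell
# Sketch only: replace the top-level "host:" entry in a Gremlin Server YAML
# file, keeping a .bak copy of the original next to it.
# set_gremlin_host is a hypothetical helper name.

set_gremlin_host() {
  # $1 = YAML file, $2 = new host (FQDN or IP address)
  cp "$1" "$1.bak" || return 1
  # Read from the backup and rewrite the original; other lines are unchanged.
  sed "s/^host:.*/host: $2/" "$1.bak" > "$1"
}
```

Usage: set_gremlin_host /opt/janusgraph-current/conf/gremlin-server/gremlin-server-configuration-syndeia.yaml jg-server.domain.tld, then restart the janusgraph service so the change takes effect.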
14. Follow "Things to Consider in a Multi-Node JanusGraph Cluster" https://docs.janusgraph.org/v0.3/basics/multi-node/
Adding new nodes to an existing single-node
15. Deploy another instance of your Janusgraph base image.
16. Make any appropriate changes for the MAC address (ex: in the VM settings).
17. Set up forward & reverse DNS records on your DNS server (consult your IT admin/sysadmin if required) and set the hostname and primary DNS suffix on the machine itself (sudo hostnamectl set-hostname <new_JanusGraph_node_FQDN> where FQDN = Fully Qualified Domain Name, ex: janusgraph2.mycompany.com)
18. RDP to the IP (or the FQDN of the new node if DNS has already propagated).
19. Verify the Cassandra “datacenter” name (default = dc1) from the CLI by running nodetool status and then from Cassandra CQLSH increment the Replication Factor (RF) for your JanusGraph keyspace(s):
-- where <datacenter_name> = the name of the datacenter as shown via nodetool status, and <total_number_of_nodes> = total # of nodes (in the cluster)
ALTER KEYSPACE syndeia_cloud_graph WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '<datacenter_name>' : <total_number_of_nodes> };
ALTER KEYSPACE syndeia_cloud_graph_config WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '<datacenter_name>' : <total_number_of_nodes> };
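To avoid typos when filling in the two placeholders, the statements for both keyspaces can be generated from the datacenter name and node count. This is a sketch only; alter_keyspace_cql is a hypothetical helper name:

```shell
# Sketch only: print the ALTER KEYSPACE statements for both JanusGraph
# keyspaces with the placeholders filled in.
# alter_keyspace_cql is a hypothetical helper name.

alter_keyspace_cql() {
  # $1 = datacenter name as shown by nodetool status
  # $2 = total number of nodes in the cluster (positive integer)
  case "$2" in
    '' | *[!0-9]*) echo "node count must be a number" >&2; return 1 ;;
  esac
  for ks in syndeia_cloud_graph syndeia_cloud_graph_config; do
    printf "ALTER KEYSPACE %s WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '%s' : %s };\n" \
      "$ks" "$1" "$2"
  done
}
```

Usage: run alter_keyspace_cql dc1 3 and paste the two printed statements into CQLSH.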
20. Follow "Things to Consider in a Multi-Node JanusGraph Cluster" https://docs.janusgraph.org/v0.3/basics/multi-node/
21. Repeat steps 15 ~ 20 for each additional cluster node.
Validating JanusGraph Operation for 1-node (or multiple nodes)
22. To validate JanusGraph operation, we start a Gremlin client and issue commands to check the edges and vertices in our graph. To do this, perform the following steps:
22.1. Open a new terminal window, ie: CMD.EXE
22.2. Go to the bin folder in the JanusGraph installation and type the following command: gremlin.bat. You should see output similar to the below:
c:\cygwin64\opt\janusgraph-current\bin>gremlin.bat

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/janusgraph-0.3.1-hadoop2/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/janusgraph-0.3.1-hadoop2/lib/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
plugin activated: janusgraph.imports
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
20:19:16 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin>
22.3. Type the following commands, substituting your appropriate values as required:
:remote connect tinkerpop.server conf/remote.yaml session
:remote console
graph = ConfiguredGraphFactory.open('syndeia_cloud_graph'); // should return: ==>standardjanusgraph[cql:[cassandra.mydomain.com]]
g = graph.traversal();
g.V();
g.E();
The last 2 commands above should not return any results since the graph (syndeia_cloud_graph) is empty - no vertices or edges.
22.4. Type :quit