Exercise 23.2 - Gremlin Queries for Bitbucket

Objectives

The objective of this exercise is to help new users of Syndeia graph analysis formulate Gremlin queries to analyze their graph with the Syndeia Web Dashboard. Bitbucket is a Git-based source code repository hosting service and DevOps tool developed by Atlassian. The specific learning objectives of this exercise are to create lists of

  • Bitbucket artifacts of a specific artifact type

  • Bitbucket artifacts in a specific Container (Git Repository)

  • Bitbucket artifacts connected as part of a specific Syndeia Project

Preparation

This exercise assumes the student has

  • Syndeia Cloud 3.3 or 3.4 installed with a valid user account, and

  • An existing Syndeia graph containing Bitbucket objects connected to elements in other repositories.

Because the content of your Syndeia graph will be different, the specific examples in the following exercise instructions are only a guide and example for your actions. It is generally advisable to carry out these exercises in a non-production repository, a “sandbox”, set up for training and practice purposes.

See the tutorials under Syndeia Cloud Web-Dashboard/Part 19 – Syndeia Cloud Graph Analysis for an overview of this feature.

Background – Syndeia Cloud Data Model

Figure 1 provides a simplified schema for elements in the Syndeia Cloud graph. All graph nodes are either Repositories, Containers, or Artifacts, where each Artifact is owned by a Container and each Container is owned by a Repository. Each has a Type; the sets of ContainerTypes and ArtifactTypes are owned by the Repository. How the Bitbucket data model maps to the Syndeia Cloud data model is discussed in the next section.

Figure 1 Syndeia Cloud Schema (simplified)
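
The ownership rules in this simplified schema can be sketched as a few Python data classes. This is only an illustration of the relationships described above, not Syndeia's actual implementation: the class and field names are assumptions, and the file name README.md is hypothetical. The keys ART-TYPE238 and CONT842 are the example values used later in this exercise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    name: str


@dataclass(frozen=True)
class ContainerType:
    name: str
    repository: Repository  # ContainerTypes are owned by the Repository


@dataclass(frozen=True)
class ArtifactType:
    key: str
    name: str
    repository: Repository  # ArtifactTypes are owned by the Repository


@dataclass(frozen=True)
class Container:
    key: str
    name: str
    type: ContainerType
    repository: Repository  # each Container is owned by a Repository


@dataclass(frozen=True)
class Artifact:
    name: str
    type: ArtifactType
    container: Container  # each Artifact is owned by a Container


def owning_repository(artifact: Artifact) -> Repository:
    """Follow the ownership chain Artifact -> Container -> Repository."""
    return artifact.container.repository


bitbucket = Repository("BitBucket @ Intercax")
git_repo_type = ContainerType("Git Repository", bitbucket)
source_type = ArtifactType("ART-TYPE238", "source", bitbucket)
hello = Container("CONT842", "Hello-BitBucket", git_repo_type, bitbucket)
readme = Artifact("README.md", source_type, hello)  # hypothetical file name

print(owning_repository(readme).name)
```

The Gremlin queries in this exercise follow the same chain: they start from one kind of node and step along ownedBy or hasType edges to filter on a property of the owner or type.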

Background – Bitbucket

As of Syndeia release 3.4, the Syndeia Web Dashboard can extract and display some model information from a Bitbucket repository. Figure 2 shows a tree view of this information, with labels identifying the Bitbucket element types. Note the different icons. The label color coding indicates how the Bitbucket element type is mapped to the Syndeia Cloud element types: Repository (green), Container (red), and Artifact (blue). The Syndeia Bitbucket integration supports a large number of standard and custom Bitbucket artifact types, including Branch, Commit, Tag, and Source (files and folders). A more complete diagram of the Bitbucket data model as it is understood by Syndeia is available through the web dashboard help menu on the left.

It is also important to understand the limitations of graph queries with respect to the Bitbucket repositories. As of Syndeia 3.4, graph queries cannot extract the internal structure of a Bitbucket repository, i.e. they cannot be used to obtain the full structure of the Bitbucket repository or internal (intra-model) relations between Bitbucket artifacts. Graph queries are most useful in viewing inter-model connections from Bitbucket elements to other repositories.

Figure 2 Tree view of Bitbucket repository

Exercise

 

  1. Log on to the Syndeia Cloud Web Dashboard (see Video 1.9) and click on the Graph Queries icon on the left border.

  2. The first task is to compile a list of Bitbucket Artifacts of a specific type. Per Figure 1, ArtifactTypes are owned by (specific to) a Repository. We typically want to begin by creating a list of Artifact types available in such a Repository.

    1. If we use Query Builder (Figure 3), we select ArtifactType from the pull-down menu under Label.

    2. To restrict the list of ArtifactTypes to our current Bitbucket repository, we click Filters. We will filter by the name of our Repository, so we select Repository from the pull-down menu at the top marked Property of. Under Property Key, we select the Name property and under Property Value, we enter BitBucket @ Intercax. We then click the Plus (+) button to add the filter in the bottom list and the window should look similar to Figure 4. Click Close.

  3. Back on the Graph Queries page, click Run. The results, a list of all ArtifactTypes in BitBucket @ Intercax, may be displayed in table form as in Figure 5. Key ArtifactType properties in the table are Name and Key because we will use these in the next search. Click the Exports icon to export the list as a CSV file for future reference, if desired.
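
If the table is exported, the Key for a given type Name can be looked up with a few lines of Python. This is a minimal sketch under stated assumptions: the column headers "Key" and "Name" and every row except the source type ART-TYPE238 are hypothetical, so check them against your own exported file.

```python
import csv
import io

# Hypothetical excerpt of an exported ArtifactType table. Only the "source"
# row with key ART-TYPE238 comes from this exercise; the other rows and the
# column headers "Key" and "Name" are assumptions.
EXPORTED_CSV = """Key,Name
ART-TYPE238,source
ART-TYPE901,branch
ART-TYPE902,source
"""


def keys_for_name(csv_text, name):
    """Return the Key of every row whose Name matches exactly."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["Key"] for row in reader if row["Name"] == name]


matches = keys_for_name(EXPORTED_CSV, "source")
print(matches)  # ['ART-TYPE238', 'ART-TYPE902']
```

A lookup that returns more than one Key signals that the Name alone does not identify a single type, so the Key is the safer value to carry into later filters.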

  4. Note at the top of Figure 5, the Query Builder utility has created a Gremlin query. We could have performed the same search with the same results by going to the Raw Query mode and entering this query directly.
    g.V().has('sLabel','ArtifactType').where(outE().has('sLabel','ownedBy').inV().has('name','BitBucket @ Intercax'))
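
The query above follows a pattern: pick vertices by label, then keep those whose outgoing edge of a given label leads to a vertex with a given property value. A small Python helper can assemble such query text for pasting into Raw Query mode. This is a sketch, not part of Syndeia; the function names are hypothetical, and the quote escaping assumes the query is parsed with Groovy-style string escapes.

```python
def quote(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def vertices_related_to(label: str, edge_label: str, key: str, value: str) -> str:
    """Build a query for vertices of `label` whose outgoing `edge_label`
    edge leads to a vertex whose property `key` equals `value`."""
    return (
        f"g.V().has('sLabel',{quote(label)})"
        f".where(outE().has('sLabel',{quote(edge_label)})"
        f".inV().has({quote(key)},{quote(value)}))"
    )


artifact_types = vertices_related_to(
    "ArtifactType", "ownedBy", "name", "BitBucket @ Intercax"
)
print(artifact_types)
```

Keeping the edge label and property key as parameters lets the same helper produce other filters of this shape, for example a filter on a different edge label or on a key instead of a name.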

  5. The final part of the first task is to generate a list of all Artifacts of type source within the BitBucket @ Intercax Repository. Note that Syndeia will return only those Bitbucket Sources that are connected within the Syndeia Cloud graph or own elements that are connected within the Syndeia Cloud graph, not all files and folders in the repository.

    1. We can search by ArtifactType Name (“source”) or Key (ART-TYPE238), which we got from the table in Figure 5. Generally, it is better to search by Key, which is unique within the Syndeia Cloud database, rather than Name, which is not unique.

    2. If we use Query Builder, we select Artifact from the pull-down menu under Label, as in Figure 6.

    3. To restrict the list of Artifacts to the Bitbucket source type, we click Filters. We will filter by the ArtifactType Key, so we select ArtifactType from the pull-down menu at the top marked Property of. Under Property Key, we select the sKey property and under Property Value, we enter ART-TYPE238, which we took from the table in Figure 5.  After we click the Plus (+) icon, the Filters window should look like Figure 7. Click Close.

  6. Back on the Graph Queries page, click Run. The results, a list of all Artifacts of type ART-TYPE238, which is owned by the repository BitBucket @ Intercax, may be displayed in table form as in Figure 8. Click the Exports icon to export the list as a CSV file for future reference, if desired.

  7. Note at the top of Figure 8, the Query Builder utility has created a Gremlin query. We could have performed the same search with the same results by going to the Raw Query mode and entering this query directly.
    g.V().has('sLabel','Artifact').where(outE().has('sLabel','hasType').inV().has('sKey','ART-TYPE238'))

  8. The second task is to compile a list of Bitbucket Artifacts in a specific Bitbucket Git Repository. Per Figure 2, Git Repositories in Bitbucket are Containers. We will begin by creating a list of Containers available in a Bitbucket Repository.

    1. If we use Query Builder (Figure 9), we select Container from the pull-down menu under Label.

    2. To restrict the list of Containers to our current Bitbucket repository, we click Filters. We will filter by the name of our Repository, so we select Repository from the pull-down menu at the top marked Property of. Under Property Key, we select the Name property and under Property Value, we enter BitBucket @ Intercax. We then click the Plus (+) button to add the filter in the bottom list and the window should look similar to Figure 10. Click Close.

  9. Back on the Graph Queries page, click Run. The results, a list of all Containers in BitBucket @ Intercax, may be displayed in table form as in Figure 11. The key Container properties in the table are Name and Key, which we will use in the next search. Click the Exports icon to export the list as a CSV file for future reference, if desired.

    Caution: The list of Bitbucket Containers in Figure 11 includes both Workspaces and Git Repositories. Because the Syndeia data model in Figure 1 does not map perfectly to the Bitbucket data model in Figure 2, Gremlin queries related to Workspaces work irregularly and we will only be working with Git Repositories as Containers. The list also does not include all Containers in the BitBucket @ Intercax repository. Only those Git Repositories that own Artifacts that are connected to other models (or are connected directly themselves) appear on the list. Other Bitbucket Git Repositories that do not involve connections to other repositories are not part of the Syndeia Cloud graph and do not appear in Gremlin graph query results.

  10. Note at the top of Figure 11, the Query Builder utility has created a Gremlin query. We could have performed the same search with the same results by going to the Raw Query mode and entering this query directly.
    g.V().has('sLabel','Container').where(outE().has('sLabel','ownedBy').inV().has('name','BitBucket @ Intercax'))

  11. The final part of the second task is to generate a list of all Artifacts in a specific Container within the BitBucket @ Intercax Repository. Note that Syndeia will return only those Bitbucket Artifacts that are connected within the Syndeia Cloud graph, not all Artifacts in the container or repository.

    1. We can search by Container Name (“Hello-BitBucket”) or Key (CONT842), which we got from the table in Figure 11. Generally, it is better to search by Key, which is unique within the Syndeia Cloud database, rather than Name, which is not unique.

    2. If we use Query Builder, we select Artifact from the pull-down menu under Label, as in Figure 12.

    3. To restrict the list of Artifacts to the Bitbucket Git Repository Hello-BitBucket, we click Filters. We will filter by the Container Key, so we select Container from the pull-down menu at the top marked Property of. Under Property Key, we select the sKey property and under Property Value, we enter CONT842, which we took from the table in Figure 11. After we click the Plus (+) icon, the Filters window should look like Figure 13. Click Close.

  12. Back on the Graph Queries page, click Run. The results, a list of all Artifacts in Container CONT842, which is owned by the repository BitBucket @ Intercax, may be displayed in table form as in Figure 14. Note that only Bitbucket elements that are part of the Syndeia Cloud graph appear; other Bitbucket elements in this Git Repository that have no connections to other repositories do not appear.

  13. Note at the top of Figure 14, the Query Builder utility has created a Gremlin query. We could have performed the same search with the same results by going to the Raw Query mode and entering this query directly.
    g.V().has('sLabel','Artifact').where(outE().has('sLabel','ownedBy').inV().has('sKey','CONT842'))

  14. The third task is to compile a list of Bitbucket Artifacts that are connected as part of a specific Syndeia Project. Syndeia Projects are partitions within the Syndeia Cloud graph database that separate different projects or system models. Syndeia Projects are Containers owned by the Syndeia Repository. Unlike Bitbucket Git Repositories, Syndeia Projects contain only relations, the inter-model relations that define the “macrostructure” of the Digital Thread for that system or project. In this case, we are looking not for the Bitbucket elements directly; we are looking for inter-model connections where one end is a Bitbucket element.

  15. We will begin by creating a list of Containers available in the Syndeia Repository.

    1. If we use Query Builder (Figure 15), we select Container from the pull-down menu under Label.

      Figure 15 Graph Queries page (icon outlined in red) – Query Builder

    2. To restrict the list of Containers to the Syndeia repository, we click Filters. We will filter by the name of our Repository, so we select Repository from the pull-down menu at the top marked Property of. Under Property Key, we select the Name property and under Property Value, we enter Syndeia Repository. We then click the Plus (+) button to add the filter in the bottom list and the window should look similar to Figure 16. Click Close.

      Figure 16 Query Builder Filters window

  16. Back on the Graph Queries page, click Run. The results, a list of all Containers in the Syndeia Repository, may be displayed in table form as in Figure 17. The key Container properties in the table are Name and Key, which we will use in the next search. Click the Exports icon to export the list as a CSV file for future reference, if desired.

  17. Note at the top of Figure 17, the Query Builder utility has created a Gremlin query. We could have performed the same search with the same results by going to the Raw Query mode and entering this query directly.
    g.V().has('sLabel','Container').where(outE().has('sLabel','ownedBy').inV().has('name','Syndeia Repository'))

  18. The next part of the third task is to generate a list of all Relations within a specific Syndeia Project.

    1. We can search by Container Name (“Manas Sandbox #1”) or Key (MBSB01), which we got from the table in Figure 17. Generally, it is better to search by Key, which is unique within the Syndeia Cloud database, rather than Name, which is not unique.

      Figure 17  Graph Queries page, Containers results in table format, truncated

    2. If we use Query Builder, we select Relation from the pull-down menu under Label, as in Figure 18. Remember, the Syndeia Projects contain relations, not artifacts.

      Figure 18  Query Builder, Relation search

    3. To restrict the list of Relations to a specific Syndeia Project, we click Filters. We will filter by the Container Key, so we select Container from the pull-down menu at the top marked Property of. Under Property Key, we select the sKey property and under Property Value, we enter MBSB01, which we took from the table in Figure 17.  After we click the Plus (+) icon, the Filters window should look like Figure 19. Click Close.

      Figure 19  Query Builder Filters window, filter by Container sKey

  19. Back on the Graph Queries page, click Run. The results, a list of all Relations in Container MBSB01, which is owned by the Syndeia Repository, may be displayed in table form as in Figure 20. Note that all relations within the project appear, not just those with a Bitbucket artifact at one end.

    Figure 20  Graph Queries page, Relations (Edges) results in table format, truncated

  20. The final step is to identify the Bitbucket source elements that participate in these relations, but this cannot be done in Query Builder alone. Note at the top of Figure 20, the Query Builder utility has created a Gremlin query.

     

    g.E().has('sLabel','Relation').has('container','MBSB01')

    We will use the Gremlin query language to append an additional condition. First, we will add a traversal step to go to the vertices at the ends of the relations. Since we don’t know whether the Bitbucket artifact will have an incoming or outgoing relation in the Syndeia Project, we use the bothV() step to cover both ends.

     

    g.E().has('sLabel','Relation').has('container', 'MBSB01').bothV()

    Next, we will check all vertices for ArtifactType. Going back to the table in Figure 5, we choose Bitbucket source, ART-TYPE238.

     

    g.E().has('sLabel','Relation').has('container', 'MBSB01').bothV().has('type','ART-TYPE238')

    If we select Raw Query and enter this in the Gremlin Query field, we generate the table in Figure 21, showing all Bitbucket elements of ART-TYPE238 used in the Syndeia Project MBSB01.

    Figure 21  Graph Queries page, Artifacts results in table format, truncated
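
The way Step 20 grows the query one step at a time can be mirrored with a tiny string builder. This is a sketch for assembling query text only; the Traversal class is hypothetical and does not talk to Syndeia or to any Gremlin server.

```python
class Traversal:
    """Tiny immutable builder that appends Gremlin steps to a query string."""

    def __init__(self, text: str = "g"):
        self.text = text

    def step(self, name: str, *args: str) -> "Traversal":
        """Return a new Traversal with `.name('arg1','arg2',...)` appended."""
        quoted = ",".join("'" + arg + "'" for arg in args)
        return Traversal(f"{self.text}.{name}({quoted})")

    def __str__(self) -> str:
        return self.text


# Start from the query generated by Query Builder: relations in the project.
relations = (
    Traversal()
    .step("E")
    .step("has", "sLabel", "Relation")
    .step("has", "container", "MBSB01")
)
# Move from each relation to the vertices at both of its ends.
endpoints = relations.step("bothV")
# Keep only the vertices of the Bitbucket source type.
sources = endpoints.step("has", "type", "ART-TYPE238")

for query in (relations, endpoints, sources):
    print(query)
```

Each intermediate string is itself a valid query, so it can be run in Raw Query mode to inspect the result before the next step is appended.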

  21. There are alternative ways to approach the problem. If we wanted to search for Bitbucket elements in a specific Bitbucket Git Repository (CONT842) that were used in a Syndeia Project (MBSB01), we could reformulate the query using the first part from Step 20 and the second part from Step 13.

     

    g.E().has('sLabel','Relation').has('container','MBSB01').bothV().where(outE().has('sLabel','ownedBy').inV().has('sKey','CONT842'))
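
To see what this combined query selects, its logic can be mimicked on a small in-memory graph in Python. This is a sketch of the filtering logic only: the Vertex and Relation classes, the artifact keys ART1 to ART4, the container CONT500, and the project OTHER01 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    key: str
    owner: str  # sKey of the Container that owns this vertex


@dataclass(frozen=True)
class Relation:
    container: str  # key of the Syndeia Project that holds the relation
    source: Vertex
    target: Vertex


def both_v_owned_by(relations, project, container):
    """Mimic: relations in `project`, then bothV(), then keep vertices
    whose owning Container is `container`."""
    found = []
    for rel in relations:
        if rel.container != project:
            continue
        for vertex in (rel.source, rel.target):  # bothV(): both ends of the edge
            if vertex.owner == container:
                found.append(vertex.key)
    return found


# Hypothetical vertices and relations
file_a = Vertex("ART1", "CONT842")
file_b = Vertex("ART2", "CONT842")
requirement = Vertex("ART3", "CONT500")
other_file = Vertex("ART4", "CONT842")

relations = [
    Relation("MBSB01", requirement, file_a),
    Relation("MBSB01", file_a, file_b),
    Relation("OTHER01", requirement, other_file),
]

result = both_v_owned_by(relations, "MBSB01", "CONT842")
print(result)  # ['ART1', 'ART1', 'ART2']
```

ART1 appears twice because it sits at the end of two relations in the project. In standard Gremlin, bothV() likewise emits a vertex once per edge it touches; if repeated rows appear in your results, appending the standard dedup() step to the query should collapse them.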