Gremlin Tricks: Filtering With the Where-Step

This article shows how the where-step enables more advanced filtering across graph traversals.

Written by Kristine Marhilevica

The most basic and common filtering is done with the following steps:

  • filter

  • not (a "reversed filter")

  • has (can check if a field exists and if a field has a specific value)

  • hasNot (check if a field does not exist)

  • hasLabel (check the component/reference type)

  • is (to check if a value matches)

Here's an example that uses all of these filters at once to find components that

  • are Applications

  • have a Lifecycle Phase of Live

  • do not have a Criticality

  • have an owner (via the Owns-reference)

  • have more than two references

  • do not realize a Business Capability (via the Is Realized By-reference)

g.V().
  hasLabel('Application').
  has('lifecycle_phase', 'Live').
  hasNot('criticality').
  filter(__.in('Owns')).
  filter(both().count().is(gt(2))).
  not(__.in('Is Realized By').hasLabel('Business Capability'))

Advanced Filtering Using "Where"

The filter options shown above take care of most use cases, but you'll notice that none of them compares a value from one part of the graph traversal against a value from another part of the traversal.

For example, how would we see if there are two directly connected components with the same Lifecycle Phase? This is when the where-step becomes useful:

g.V().
  has('lifecycle_phase').as('lp1').
  bothE().otherV().has('lifecycle_phase').as('lp2').
  where('lp1', eq('lp2')).by('lifecycle_phase').
  path()
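To try the traversal above outside the product, the following is a minimal sketch that builds a tiny in-memory graph first. It assumes the Apache TinkerPop Gremlin Console with TinkerGraph; the component names, the name field, and the Depends On reference are made up for illustration and are not taken from this article.

```groovy
// Assumption: run in the Apache TinkerPop Gremlin Console (TinkerGraph is bundled).
// All names, field values, and the 'Depends On' reference type are hypothetical.
graph = TinkerGraph.open()
g = graph.traversal()

// Three Applications: two are Live, one is Phasing Out
g.addV('Application').property('name', 'CRM').property('lifecycle_phase', 'Live').as('crm').
  addV('Application').property('name', 'Billing').property('lifecycle_phase', 'Live').as('billing').
  addV('Application').property('name', 'Legacy ERP').property('lifecycle_phase', 'Phasing Out').as('erp').
  addE('Depends On').from('crm').to('billing').
  addE('Depends On').from('crm').to('erp').
  iterate()

// Directly connected components that share a Lifecycle Phase.
// path().by('name').by(label) prints component names and reference labels
// instead of raw vertices and edges.
g.V().
  has('lifecycle_phase').as('lp1').
  bothE().otherV().has('lifecycle_phase').as('lp2').
  where('lp1', eq('lp2')).by('lifecycle_phase').
  path().by('name').by(label)
```

Because the traversal follows references in both directions, each matching pair shows up twice, once starting from each end. Here only the CRM and Billing pair matches, since Legacy ERP has a different Lifecycle Phase.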

Explaining the Parts ("as" and "by")

For the where-step to do its job, we need to add some aliases to the entities found during the graph traversal. This is done using the as-step.

In our example, we create a traversal that starts by finding all components that have a lifecycle_phase, and then give them the alias "lp1".

Then, we follow all incoming and outgoing references (stepping onto the references themselves ensures they are included in the path) that connect to a component that also has a Lifecycle Phase. We give this other component the alias "lp2".

Now that we have two aliases, we can use where to see if "lp1" equals "lp2". However, this means that we are comparing two components, which will never be the same (so we get zero results)!
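For contrast, here is a sketch of that traversal with the where-step left unmodulated, so the two aliased components themselves are compared:

```groovy
// No by-step: 'lp1' and 'lp2' are compared as whole components,
// not by the value of any field on them.
g.V().
  has('lifecycle_phase').as('lp1').
  bothE().otherV().has('lifecycle_phase').as('lp2').
  where('lp1', eq('lp2')).
  path()
```

Since "lp2" is always the component at the other end of a reference, this version typically returns nothing.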

This is where the "by-step" comes in, which lets us modulate the where-statement to instead use a different value for the comparison. We write by('lifecycle_phase') which tells Gremlin that it should compare the Lifecycle Phase field for these two components instead of comparing them in their entirety. Since we are comparing the same field on both components we may use only one by-step, but if you want to compare two different fields, just add two by-steps, for example:

where('lp1', eq('lp2')).
  by('lifecycle_phase').
  by('criticality')
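Placed in a full traversal, the two by-steps could look like the sketch below. It assumes the by-steps are applied in order, so the first modulates "lp1" and the second modulates "lp2", and it filters "lp2" on criticality so that the field being compared is present.

```groovy
// Sketch: connected pairs where the Lifecycle Phase of 'lp1'
// equals the Criticality of 'lp2'.
g.V().
  has('lifecycle_phase').as('lp1').
  bothE().otherV().has('criticality').as('lp2').
  where('lp1', eq('lp2')).
    by('lifecycle_phase').   // value taken from 'lp1'
    by('criticality').       // value taken from 'lp2'
  path()
```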

More Advanced By-step Modulation

The by-step used in the where-filter can be very powerful! For example, what if we simply want to find all components that are connected to at least one component with the same Lifecycle Phase? In that case, we don't even need to traverse to the neighbours beforehand: we can create two aliases for the same component and perform the traversal inside the by-modulation. Note that since we now compare against a list of referenced components, we need to use "within" instead of "eq(uals)", and fold the referenced components' Lifecycle Phases together into a list:

g.V().
  has('lifecycle_phase').as('lp1', 'lp2').
  where('lp1', within('lp2')).
    by('lifecycle_phase').
    by(both().values('lifecycle_phase').fold())
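When a filter like this returns unexpected results, it can help to look at the two values being compared. The sketch below uses the project-step to list, for each component, its own Lifecycle Phase next to the folded list of its neighbours' Lifecycle Phases. The name field and the three output keys are assumptions chosen for illustration.

```groovy
// Sketch: show the same two values the by-steps above produce.
// 'name' is an assumed field; replace it with one that exists on your components.
g.V().
  has('lifecycle_phase').
  project('component', 'own_phase', 'neighbour_phases').
    by('name').
    by('lifecycle_phase').
    by(both().values('lifecycle_phase').fold())
```

A component passes the where-filter when its own_phase value appears in its neighbour_phases list.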

More Examples

Find all components that have a second-degree reference to themselves

g.V().as('c1').
  out().out().as('c2').
  where('c1', eq('c2')).
  path()

Find all Applications that do not have a second-degree reference to themselves

g.V().
  hasLabel('Application').as('c1', 'c2').
  where('c1', without('c2')).
    by(identity()).
    by(out().out().fold())

Still have questions? Feel free to reach out to us via the in-app chat or our website.
