All Collections
Data Analysis
Gremlin Tricks: Assigning Numeric Values to Text and List Fields
Gremlin Tricks: Assigning Numeric Values to Text and List Fields

A how-to guide for assigning numeric values and storing fields as text when writing Gremlin queries.

Kristine Marhilevica avatar
Written by Kristine Marhilevica
Updated over a week ago

When writing Gremlin queries, you may find yourself in a situation where you need to interpret a non-numeric value as a number. In general, if you want to perform numeric calculations based on a field's value, we recommend storing the field as a number field. However, there may be cases where storing such fields as text can be convenient:

  • You have stored numeric values in a list field (list field values in Ardoq are stored as text strings), so as to only make it possible to choose among a predefined set of values.

  • The values are not inherently numeric, but it may make sense to assign numeric values to them during analysis. Consider a field called Hosting Type that may have the values "On Premise", "Cloud", "SaaS" or "3rd. Party". In this case it makes sense to store the values as text, but if you were to calculate an application's rating based on its hosting type it is useful to be able to assign a number to each value.

Example: Parse integers stored as text

Assume that we have a field called Technical Integrity, whose values are integers that, for whatever reason, are stored as text. We want to find the average technical integrity for all of our applications, but we get an error if we run the following:

g.V().  hasLabel('Application').  values('technical_integrity').  mean()
Ardoq Parse integers stored as text error example

We get this error because the Technical Integrity values are not integers, but text representations of integers, so it is not possible to calculate their average.

To make this work, we can create a function that converts strings to integers, and then use this to map our text values to numeric values:

def toInteger = { it.get().toInteger() }g.V().  hasLabel('Application').  values('technical_integrity').  map(toInteger).  mean()
Ardoq Gremlin map text values to numeric values

Example: Parse non-integer numbers stored as text

Our approach of converting string representations of numbers to integers only works if the numbers are indeed integers, e.g. "3". If the number is not an integer, e.g. "3.14", we will get an error:

Ardoq Parse non-integer numbers stored as text

To avoid this, we can write a more robust function that converts string representations of numbers to floating-point numbers instead of integers. This way, we will be able to handle both integers and non-integers:

def toFloat = { it.get().toFloat() }g.V().  values('technical_integrity').  map(toFloat).  mean()
Ardoq convert string representations of numbers to floating-point numbers

Example: Handle cases where not all the values can be interpreted as numbers

If we try to interpret a text value that is not a number, e.g. "High" as a floating-point number we will get an error:

Ardoq example error when value cannot be interpreted as a number

There are two ways to handle this:

  • We can provide a fallback value for cases where the value cannot be converted to numbers.

  • We can filter away those values that cannot be converted to numbers before converting them to numbers.

Example: Providing a fallback value for values that cannot be converted to numbers

In the below example we have modified the toFloat closure to convert the text value to a number if possible, and return 0 otherwise.

def toFloat = {  try {    it.get().toFloat()  } catch (NumberFormatException e) {    return 0  }}g.V().  values('technical_integrity').  map(toFloat).  mean()
Ardoq fallback value for values that cannot be converted to numbers

Example: Filtering away values that cannot be converted to numbers

In the below example we have kept our toFloat function as it was, but introduced a separate function canBeConvertedToFloat which returns true if the value can be converted to a number and false otherwise. We provide this function to a filter()-step before the map()-step where we convert the values to numbers.

def canBeConvertedToFloat = {  try {    it.get().toFloat()    true  } catch (NumberFormatException e) {    false  }}def toFloat = { it.get().toFloat() }g.V().  values('technical_integrity').  filter(canBeConvertedToFloat).  map(toFloat).  mean()
ardoq gremlin tricks

Example: Assigning values to non-numerical fields

Assume that applications have a field called Hosting Type, which may have the values "Saas", "3rd. Party", "Cloud" or "On Premise". Assume that we eventually want to only use SaaS applications and want to write a query that gives our application portfolio a score from 1 to 5 based on the hosting type of the applications, that can be tracked over time. We want to achieve this by assigning the value 5 to "SaaS", 4 to "3rd. Party", 3 to "Cloud", 2 to "On Premise" and 1 to any other value, and then take the average of this number for all of our applications.

If Hosting Type was already a number field we could calculate the average application score with the following query:

g.V().  hasLabel('Application').  values('hosting_type').  mean()

In our case Hosting Type is not a number field, so running the above query would give us an error:

Ardoq Assigning values to non-numerical fields

The following example shows how we can map each possible value of Hosting Type to a number:

g.V().  hasLabel('Application').  has('hosting_type').  choose(values('hosting_type')).    option('SaaS',       constant(5)).    option('3rd. Party', constant(4)).    option('Cloud',      constant(3)).    option('On Premise', constant(2)).    option(none,         constant(1)).  mean()
Ardoq map each possible value of Hosting Type to a number

Notice how we use has('hosting_type') to check that the Hosting Type field exists on all the components that are passed to the choose()-step which maps the Hosting Type values to numbers. The choose()-step requires that the number of values returned from the choose()-step is equal to the number of values we have before the choose()-step. The values()-step is only able to extract the Hosting Type value from those components that have it, so if some of our applications do not have a Hosting Type field the choose()-step would not be able to map all of the values it is passed to another value.

πŸ’¬ We're always happy to help. Feel free to reach out to us via the in-app chat if you have any follow-up questions about this article.

Did this answer your question?