Step Functions: Forget JSONPath, JSONata is here

Gerard Alquezar
Principal Solutions Architect

Select, sort, and filter data with JSONata in AWS Step Functions
I have been enjoying AWS Step Functions since its launch at the end of 2016. I welcomed this orchestration service with enthusiasm because it was an alternative to Amazon Simple Workflow Service (SWF), a much more complex service to use. As a Cloud Support Engineer long ago, I supported both. Step Functions is an easy-to-use orchestration service that helps you build distributed applications via workflows and tasks. Yet it wasn’t all sunshine and rainbows: when you needed JSONPath to access values in your JSON input or output document, you struggled with the syntax and limitations of the feature.
Imagine my surprise when I recently needed to implement a couple of Step Functions state machines for a project and discovered the new Step Functions support for JSONata, released in November 2024.
This article describes some common use cases of JSONata, a powerful query and transformation language for JSON documents that Step Functions recently started supporting. They will definitely make your life easier and your state machines more maintainable.
Let’s start.
Imagine you receive the following JSON document from a previous state in your Step Functions state machine:
{
  "data": [
    {
      "name": "Emma",
      "age": 64,
      "cars": [
        "Mercedes-Benz"
      ]
    },
    {
      "name": "Oliver",
      "age": 31,
      "cars": [
        "Renault",
        "Seat"
      ]
    },
    {
      "name": "John",
      "age": 18,
      "cars": [
        "Ford"
      ]
    },
    {
      "name": "Lily",
      "age": 42,
      "motorcycles": [
        "Honda",
        "KTM"
      ]
    },
    {
      "name": "Amelia",
      "age": 20,
      "cars": [
        "Toyota",
        "Volkswagen"
      ]
    },
    {
      "name": "George",
      "age": 56,
      "cars": [
        "Volvo"
      ]
    }
  ]
}
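However, not all of that information is useful to you. You want to filter it.
You might want to perform an operation in a later state on car drivers only. If so, you could filter them in the output of a Pass state like this (the snippets in this article assume that the state machine, or the state itself, sets "QueryLanguage": "JSONata"):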
"Pass": {
  "Type": "Pass",
  "Output": "{% $states.input.data[cars] %}",
  "Next": "Next State"
}
Which will produce the following result:
[
  {
    "name": "Emma",
    "age": 64,
    "cars": [
      "Mercedes-Benz"
    ]
  },
  {
    "name": "Oliver",
    "age": 31,
    "cars": [
      "Renault",
      "Seat"
    ]
  },
  {
    "name": "John",
    "age": 18,
    "cars": [
      "Ford"
    ]
  },
  {
    "name": "Amelia",
    "age": 20,
    "cars": [
      "Toyota",
      "Volkswagen"
    ]
  },
  {
    "name": "George",
    "age": 56,
    "cars": [
      "Volvo"
    ]
  }
]
As you can see, the resulting array no longer contains Lily: she is a motorcycle rider, so her entry has no cars key and the [cars] predicate drops it.
What if you want to sort the result, say alphabetically by name? You would use the following output:
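If you are more used to a general-purpose language, the following Python sketch is a rough analogue of the `data[cars]` predicate. It is only an illustration of the idea, not how Step Functions evaluates JSONata; the variable names are my own, and edge cases (for example a `cars` array containing only empty strings) may be treated differently by JSONata.

```python
# Abbreviated copy of the input document's "data" array.
data = [
    {"name": "Emma", "age": 64, "cars": ["Mercedes-Benz"]},
    {"name": "Oliver", "age": 31, "cars": ["Renault", "Seat"]},
    {"name": "John", "age": 18, "cars": ["Ford"]},
    {"name": "Lily", "age": 42, "motorcycles": ["Honda", "KTM"]},
    {"name": "Amelia", "age": 20, "cars": ["Toyota", "Volkswagen"]},
    {"name": "George", "age": 56, "cars": ["Volvo"]},
]

# Rough analogue of data[cars]: keep the items whose "cars" field
# exists and is non-empty. Input order is preserved.
drivers = [person for person in data if person.get("cars")]

driver_names = [person["name"] for person in drivers]
print(driver_names)  # → ['Emma', 'Oliver', 'John', 'Amelia', 'George']
```

The JSONata version expresses the same filter in a single bracketed predicate, which is what keeps the state definition short.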
"Pass": {
  "Type": "Pass",
  "Output": "{% $states.input.data[cars]^(<name) %}",
  "Next": "Next State"
}
Which will generate this result:
[
  {
    "name": "Amelia",
    "age": 20,
    "cars": [
      "Toyota",
      "Volkswagen"
    ]
  },
  {
    "name": "Emma",
    "age": 64,
    "cars": [
      "Mercedes-Benz"
    ]
  },
  {
    "name": "George",
    "age": 56,
    "cars": [
      "Volvo"
    ]
  },
  {
    "name": "John",
    "age": 18,
    "cars": [
      "Ford"
    ]
  },
  {
    "name": "Oliver",
    "age": 31,
    "cars": [
      "Renault",
      "Seat"
    ]
  }
]
Or perhaps you want the result sorted by age, oldest first. The Pass state would be:
"Pass": {
  "Type": "Pass",
  "Output": "{% $states.input.data[cars]^(>age) %}",
  "Next": "Next State"
}
And the result:
[
  {
    "name": "Emma",
    "age": 64,
    "cars": [
      "Mercedes-Benz"
    ]
  },
  {
    "name": "George",
    "age": 56,
    "cars": [
      "Volvo"
    ]
  },
  {
    "name": "Oliver",
    "age": 31,
    "cars": [
      "Renault",
      "Seat"
    ]
  },
  {
    "name": "Amelia",
    "age": 20,
    "cars": [
      "Toyota",
      "Volkswagen"
    ]
  },
  {
    "name": "John",
    "age": 18,
    "cars": [
      "Ford"
    ]
  }
]
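To make the two sort directions concrete, here is a small Python sketch of what `^(<name)` and `^(>age)` do to the filtered drivers. It is an analogy under my own naming, not the JSONata implementation: `<` corresponds to an ascending sort and `>` to a descending one.

```python
from operator import itemgetter

# The car drivers left after the filter (cars omitted for brevity).
drivers = [
    {"name": "Emma", "age": 64},
    {"name": "Oliver", "age": 31},
    {"name": "John", "age": 18},
    {"name": "Amelia", "age": 20},
    {"name": "George", "age": 56},
]

# Analogue of ^(<name): ascending by name.
by_name = sorted(drivers, key=itemgetter("name"))

# Analogue of ^(>age): descending by age, oldest first.
by_age_desc = sorted(drivers, key=itemgetter("age"), reverse=True)

print([p["name"] for p in by_name])      # → ['Amelia', 'Emma', 'George', 'John', 'Oliver']
print([p["name"] for p in by_age_desc])  # → ['Emma', 'George', 'Oliver', 'Amelia', 'John']
```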
Both sorting criteria could be combined like this:
"Pass": {
  "Type": "Pass",
  "Output": "{% $states.input.data[cars]^(<name, >age) %}",
  "Next": "Next State"
}
The resulting array is sorted by name first; age only breaks ties when two or more people share the same name, in which case the oldest comes first.
Maybe that is too many results for you, and you only want the top 3. You can add another filtering option like this:
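The sample input has no duplicate names, so the tie-breaking rule never kicks in there. The Python sketch below uses a hypothetical list with two people named Emma (invented purely for illustration, not part of the original data) to show the behaviour of a name-ascending, age-descending sort.

```python
# Hypothetical data: two entries share the name "Emma".
people = [
    {"name": "Emma", "age": 30},
    {"name": "Amelia", "age": 20},
    {"name": "Emma", "age": 64},
]

# Analogue of ^(<name, >age): name ascending, then age descending.
# Negating the age turns the ascending tuple comparison into
# "oldest first" within each group of equal names.
ordered = sorted(people, key=lambda p: (p["name"], -p["age"]))

print([(p["name"], p["age"]) for p in ordered])
# → [('Amelia', 20), ('Emma', 64), ('Emma', 30)]
```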
"Pass": {
  "Type": "Pass",
  "Next": "Next State",
  "Output": "{% $states.input.data[cars]^(<name, >age)[[0..2]] %}"
}
Resulting in:
[
  {
    "name": "Amelia",
    "age": 20,
    "cars": [
      "Toyota",
      "Volkswagen"
    ]
  },
  {
    "name": "Emma",
    "age": 64,
    "cars": [
      "Mercedes-Benz"
    ]
  },
  {
    "name": "George",
    "age": 56,
    "cars": [
      "Volvo"
    ]
  }
]
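Putting the three steps together (filter, sort, take the first three), a Python sketch of the whole expression might look like the following. The function name `top_drivers` and its parameters are hypothetical. One detail worth noticing is that the JSONata range `[0..2]` includes both ends, whereas a Python slice excludes its end index, hence the `last + 1`.

```python
def top_drivers(data, first=0, last=2):
    """Rough analogue of data[cars]^(<name, >age)[[first..last]]."""
    # 1. Filter: keep entries with a non-empty "cars" field.
    drivers = [p for p in data if p.get("cars")]
    # 2. Sort: name ascending, then age descending.
    ordered = sorted(drivers, key=lambda p: (p["name"], -p["age"]))
    # 3. Select positions first..last, both inclusive.
    return ordered[first:last + 1]


data = [
    {"name": "Emma", "age": 64, "cars": ["Mercedes-Benz"]},
    {"name": "Oliver", "age": 31, "cars": ["Renault", "Seat"]},
    {"name": "John", "age": 18, "cars": ["Ford"]},
    {"name": "Lily", "age": 42, "motorcycles": ["Honda", "KTM"]},
    {"name": "Amelia", "age": 20, "cars": ["Toyota", "Volkswagen"]},
    {"name": "George", "age": 56, "cars": ["Volvo"]},
]

top_three = top_drivers(data)
print([p["name"] for p in top_three])  # → ['Amelia', 'Emma', 'George']
```

The order of the steps matters: selecting the first three entries before sorting would return Emma, Oliver, and John instead.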
Alternatively, you might want to select only the motorcycle riders and put the result under a riders key instead of at the top level of the resulting JSON document. You would do it like this:
"Pass": {
  "Type": "Pass",
  "Output": {
    "riders": "{% $states.input.data[motorcycles] %}"
  },
  "Next": "Next State"
}
Which will result in this document:
{
  "riders": {
    "name": "Lily",
    "age": 42,
    "motorcycles": [
      "Honda",
      "KTM"
    ]
  }
}
Note that the riders value is an object, not an array: only one person matched, and JSONata returns a single match as the value itself. This could be troublesome if your code expects an array. Nonetheless, it is easily fixed by wrapping the whole expression in [] like this:
"Pass": {
  "Type": "Pass",
  "Output": {
    "riders": "{% [$states.input.data[motorcycles]] %}"
  },
  "Next": "Next State"
}
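The difference between the two expressions can be mimicked in Python. This is a simplified model of how JSONata collapses a one-element result, written with hypothetical helper names; `None` here merely stands in for "no result" and is not meant to describe how Step Functions reacts to an empty match.

```python
def collapsed(matches):
    """Model of a bare path expression: a single match is returned
    as the value itself rather than as a one-element list."""
    if not matches:
        return None          # stand-in for "no result"
    if len(matches) == 1:
        return matches[0]    # singleton collapses to the item
    return matches


def as_array(matches):
    """Model of wrapping the expression in []: always a list."""
    return list(matches)


lily = {"name": "Lily", "age": 42, "motorcycles": ["Honda", "KTM"]}
riders = [lily]  # the only entry with a "motorcycles" field

print(type(collapsed(riders)).__name__)  # → dict
print(type(as_array(riders)).__name__)   # → list
```

Wrapping the expression is the safer choice whenever a downstream consumer iterates over the result, because the shape no longer depends on how many items happen to match.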
The previous examples are the ones I wanted to show, since selecting, sorting, and filtering data is helpful in many use cases. However, JSONata’s capabilities go further than those. You can check other use cases and features in the JSONata project documentation.
My AWS Serverless mentor made a very good point when we discussed the wonders of JSONata in Step Functions, and I could not agree more with him. It is important to remember that Step Functions is an orchestration service, and features like JSONata open the door to entangling orchestration logic with business logic. Thus, it is key to follow software engineering best practices when developing Step Functions state machines with JSONata.
Stay tuned for more articles about Serverless and Step Functions.