Finding log records with more than 10 names in JSON array

Hi everyone… We have logs that are JSON and contain a field that is an array:
{
"names": [ "mickey", "minnie", "donald", "goofy", "pluto" ]
}
I want to query our logs to count how many log records have more than 10 names. I can’t figure out how to do this. As pseudo-syntax for the query, I would imagine something like this:
@names.length > 10
Does anyone know how to do this?

Is the array parsed as an attribute? And have you created a facet on that attribute?

If yes, you could do something like this:

Note: your names field will need to be a facet, which should be indicated by the lack of a @ in front of it; I am using the @names value as an example here.

Use grouping by names to get a count, then add a cutoff formula with a minimum value of 10.

Thanks so much for your response!
It is definitely a facet. I’m not sure if it is “parsed as an attribute”.
I tried what you suggested and I am not seeing what I want (though I maybe wasn’t clear in my initial question). The question I ultimately want answered is: how many logs had more than 10 items in the “names” field over the last 6 months?

This is an interesting use case. I have a workaround for you, but I’m not sure if it will make sense to use; it really depends on the end goal you have.

The issue here is that you need a value you can count on, which can then be used to count the number of logs involved. I’m not aware of a way to do this without adding a pipeline modification that creates a new value to count on, one that encodes the 10-or-more requirement.

So here’s the idea:
In your pipeline, create a Grok parser that uses a targeted source attribute value for names.
Then use a boolean logic rule with a regex that matches instances where the names array has 10 or more values.
This will be used to output a new attribute with a true or false value.

You can then filter your logs on the new attribute set to true, which should display only the logs where you have 10 or more values.
You could also use this with a table view to get a simple count.

Details:
Grok rule
rule %{boolean("(?:[^,]+,){9,}[^,]+",".+"):Ten_Or_more}
Set the “Extract from” value from the default message to names. NOTE: you should not need to use @names; adding the @ will likely cause it not to match the desired array.

See attached screenshots for a working example.
You will need to tweak the rule’s logic to match the log pattern in your array; what I’ve shared should get you 95% of the way there.
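
If you want to sanity-check that regex locally before wiring it into your pipeline, here’s a quick Python sketch of what the boolean rule’s true pattern does (assuming the serialized array value the parser sees keeps commas between items). One thing to flag: {9,} matches 10 or more items, so if you strictly need more than 10, bump it to {10,}.

import re

# True pattern from the grok rule above: nine or more "item," groups
# followed by one final item, i.e. at least 10 comma-separated values.
# For strictly MORE than 10 items, change {9,} to {10,}.
ten_or_more = re.compile(r"(?:[^,]+,){9,}[^,]+")

few = "mickey, minnie, donald, goofy, pluto"     # 5 items
many = ", ".join(f"name{i}" for i in range(12))  # 12 items

print(bool(ten_or_more.search(few)))   # False
print(bool(ten_or_more.search(many)))  # True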

Hope this helps!

For your consideration: be aware that pipeline and attribute changes will typically only apply to logs from that moment forward. In practice, I find that logs from within the last 15 or 30 minutes can sometimes pick up updated attribute or pipeline changes, but results vary.

All of that to say: if you NEED to answer the 10-or-more question over the last 6 months, you will most likely need to rehydrate your old logs so they are reprocessed by Datadog, which in turn will let you search using your new 10-or-more attribute.

Also might be worth pointing out, depending on your goal/needs: if this is a one-off question and the cost to rehydrate 6 months of data is not an option, you could run some local filtering on raw logs pulled from storage to tally/count the number of logs with 10 or more name values in them (grep, awk, ripgrep, something in Python, etc.). A rough Python sketch is below.
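
Here’s a minimal sketch for the Python route, assuming your archived logs are newline-delimited JSON (one record per line) with the names array at the top level. The file paths and field name are placeholders for your setup, and the threshold comparison is set to the “more than 10” from the original question; tweak as needed.

import json
import sys

THRESHOLD = 10  # count records with MORE than this many names

def count_matches(paths):
    """Count records whose 'names' array has more than THRESHOLD items."""
    matched = total = 0
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip malformed lines rather than crash
                total += 1
                names = record.get("names", [])
                if isinstance(names, list) and len(names) > THRESHOLD:
                    matched += 1
    return matched, total

if __name__ == "__main__":
    matched, total = count_matches(sys.argv[1:])
    print(f"{matched} of {total} logs had more than {THRESHOLD} names")

Run it over whatever files you pulled down, e.g. python count_names.py logs/*.json.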