Help with Loki Dashboard Setup – Duplicate Fields

michaleObrien · August 26, 2025, 9:48pm

Hi all!
I’m setting up my first local Loki dashboard with Promtail (I know I should use Alloy instead, but I wanted to get Promtail working first). Why do my logs have duplicated fields, where each property appears as both property and property_extracted? It’s cluttering my logs with extra fields. I’ve attached what my fields look like in Grafana, and my current config file is above ^

kennyHarold · August 26, 2025, 10:17pm

https://grafana.com/docs/loki/latest/configure/#limits_config
Check discover_service_name and discover_log_levels.

zTakacs · August 26, 2025, 11:00pm

P stands for Parsed, I for Indexed.
Example: say your log line is level=INFO host=Ubuntu msg="Connected to PostgreSQL database" and you send it to Loki with host as a label. When you run {host=~".+"} | logfmt, Loki returns both host and host_extracted, because the parser produces a key that conflicts with an existing indexed label.
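
To make the renaming concrete, here is a minimal Python sketch of the behaviour described above. It is not Loki's actual implementation; the function names `parse_logfmt` and `merge_with_labels` are made up for illustration, and the parser only handles simple key=value pairs with optional quotes.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Parse a simple logfmt line into key/value pairs (quoted values supported)."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip bare words that have no '='
            fields[key] = value
    return fields


def merge_with_labels(labels: dict, parsed: dict) -> dict:
    """Keep indexed labels untouched; suffix parsed keys that collide with them."""
    merged = dict(labels)
    for key, value in parsed.items():
        name = f"{key}_extracted" if key in labels else key
        merged[name] = value
    return merged


line = 'level=INFO host=Ubuntu msg="Connected to PostgreSQL database"'
stream_labels = {"host": "Ubuntu"}  # the indexed label sent with the stream

fields = merge_with_labels(stream_labels, parse_logfmt(line))
print(fields)
```

The indexed host stays as it is, and the parsed copy shows up as host_extracted next to it. Nothing extra is stored; the second key only exists in the query result.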

zTakacs · August 26, 2025, 11:57pm

You are not storing duplicated keys. The conflict only appears when you parse, so the number of keys seems to grow. It has no impact on performance.

michaleObrien · August 27, 2025, 12:31am

Right, so in this context, if I removed host from the labels in my config file, I would only get one key for host? Would it therefore be better to keep my labels and remove them from the actual log line instead? Is having it in both the log line and the label what causes this issue?

zTakacs · August 27, 2025, 12:36am

Since it has no impact on performance, it’s better to keep it this way.

zTakacs · August 27, 2025, 1:15am

Loki’s “ideal” scenario is when you index nothing about what is in the log line. You don’t even read it; you just ship it and add labels about where it’s coming from (hostname, filename, cluster name, etc.).
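
As a rough illustration of that ideal case, this Python sketch builds the JSON body that Loki's push endpoint (/loki/api/v1/push) accepts: the log line is shipped untouched and the labels only describe the source. The helper name `build_push_payload` and the label values are hypothetical, and nothing is actually sent.

```python
import json
import time


def build_push_payload(line, source_labels, ts_ns=None):
    """Build a push body: one stream, one entry, labels describe the source only."""
    if ts_ns is None:
        ts_ns = time.time_ns()
    return {
        "streams": [
            {
                "stream": source_labels,  # indexed labels
                # each entry is [timestamp in nanoseconds as a string, raw log line]
                "values": [[str(ts_ns), line]],
            }
        ]
    }


payload = build_push_payload(
    'level=INFO msg="Connected to PostgreSQL database"',
    # hypothetical source labels; none of them are read from the line itself
    {"host": "my-server-1", "filename": "/var/log/app.log", "cluster": "local"},
    ts_ns=1756245600000000000,
)
print(json.dumps(payload, indent=2))
```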

zTakacs · August 27, 2025, 1:42am

Now, in real-life scenarios it’s never perfect: you might need to add a label that is already present in the log line (a conflict), or you might need as a label a detail that is already inside the line.

zTakacs · August 27, 2025, 2:13am

So Loki just uses this concept of “extracted” when parsing yields the same keys as your labels, but it has no impact on features or performance.

zTakacs · August 27, 2025, 2:23am

You will be happy to have it the day you have a real conflict

michaleObrien · August 27, 2025, 3:06am

Right, so in many logs something like “service” or “env” won’t be parsed and will have to be inferred by labels

zTakacs · August 27, 2025, 4:02am

Exactly!
Real example of why you want to have extracted vs indexed: an app on hostname=my-server-1 logging something like “I received this batch from hostname=my-backend-12”… You will be happy to have a hostname both for where the log comes from and for what’s inside the log line.

michaleObrien · August 27, 2025, 4:03am

Right, so where does that indexed information come from if it’s not parsed from the log? I thought it was all just parsed from the text. I guess things like Promtail or Alloy add this information under the hood before pushing to Loki?

zTakacs · August 27, 2025, 4:15am

Alloy/Promtail can either enrich from the environment or read the log line (extracting data with regex).
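
A small Python sketch of those two sources, just to show the idea. This is not how Alloy or Promtail are implemented; the function names are hypothetical, and the regex uses a named capture group in the same spirit as Promtail's regex stage.

```python
import re
import socket

# Named group: the group name becomes the key of the extracted value.
HOSTNAME_IN_LINE = re.compile(r"hostname=(?P<hostname>\S+)")


def labels_from_environment():
    """Enrichment: information the agent knows without reading the line."""
    return {"hostname": socket.gethostname()}


def extract_from_line(line):
    """Extraction: pull a value out of the log line with a regex."""
    match = HOSTNAME_IN_LINE.search(line)
    return match.groupdict() if match else {}


line = "I received this batch from hostname=my-backend-12"
print(labels_from_environment())  # where the log comes from
print(extract_from_line(line))    # what the log line talks about
```

The two values answer different questions, which is why keeping both (hostname and hostname_extracted) is useful when they differ.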

kennyHarold · August 27, 2025, 5:07am

Loki is doing that extraction. It is in the limits_config, and I provided the config parameters. In my understanding, _extracted has nothing to do with the LogQL query.

michaleObrien · August 27, 2025, 5:16am

Right, but doesn’t that just say that any label listed in discover_service_name should be used to populate the service_name field, and any label listed in log_level_fields should be used to populate the level field? That doesn’t quite explain why things like env_extracted are being created or how to stop them from showing. The explanation that it’s pulling these fields from both the log line and the label config, effectively creating the same field twice, makes sense to me though.

michaleObrien · August 27, 2025, 5:39am

As a last question, I just wanted to check I’ve understood labels correctly. Labels are used to create streams for logs, and the more unique label values there are, the more demanding this is on the machine running Loki. I previously had the fields client_ip and user_id (hashed) in my labels (as I want to be able to monitor bad actors or malicious traffic), but I moved them out of the labels and just put them into the log line. Is this the correct practice? Have I misunderstood anything there?

kennyHarold · August 27, 2025, 6:21am

Correct. You should avoid labels with high cardinality. client_ip and user_id are probably bad candidates for labels.
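
To see why, here is a quick Python sketch that counts how many streams a given label set would create, since each unique combination of label values is its own stream. The sample data and the helper `count_streams` are made up for illustration.

```python
def count_streams(entries, label_names):
    """Count distinct label-value combinations, i.e. the number of streams."""
    return len({tuple(entry[name] for name in label_names) for entry in entries})


# Hypothetical traffic: 1000 requests, 250 distinct client IPs, 1000 distinct users.
entries = [
    {"service": "api", "env": "dev", "client_ip": f"10.0.0.{i % 250}", "user_id": f"u{i}"}
    for i in range(1000)
]

print(count_streams(entries, ["service", "env"]))                          # 1
print(count_streams(entries, ["service", "env", "client_ip"]))             # 250
print(count_streams(entries, ["service", "env", "client_ip", "user_id"]))  # 1000
```

With client_ip and user_id kept in the log line, you can still filter on them at query time, for example {service="api"} | logfmt | client_ip="10.0.0.7", without creating a stream per value.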