NATS JetStream backend troubleshooting

Symptom

Events were not received by the consumers.

Remedy

  1. Follow the diagnostic steps described in Eventing Troubleshooting.

  2. Use the nats CLI to check if the stream was created:

    1. Port forward the Kyma Eventing NATS Service to localhost. Use port 4222. Run:
      kubectl -n kyma-system port-forward svc/eventing-nats 4222
    2. Use the nats CLI to list the streams:

      $ nats stream ls
      ╭────────────────────────────────────────────────────────────────────────────╮
      │                                  Streams                                   │
      ├──────┬─────────────┬─────────────────────┬──────────┬───────┬──────────────┤
      │ Name │ Description │ Created             │ Messages │ Size  │ Last Message │
      ├──────┼─────────────┼─────────────────────┼──────────┼───────┼──────────────┤
      │ sap  │             │ 2022-05-03 00:00:00 │ 0        │ 318 B │ 5.80s        │
      ╰──────┴─────────────┴─────────────────────┴──────────┴───────┴──────────────╯
    3. If the stream exists, check the Last Message column, which shows how long ago the stream last received a message. A recent value means that the event was published correctly.

    4. Check if the consumers were created and have the expected configurations.

      nats consumer info

      To correlate the consumer with the Subscription and the specific event type, check the description field of the consumer.

    5. If the PVC storage is fully consumed and its usage matches the stream size shown above, the stream can no longer receive messages. Either increase the PVC storage size or set the MaxBytes property so that old messages are removed once the stream reaches that size.
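
The storage check in the last step can be sketched as a small shell helper. This is only an illustration: the function name usage_percent and the example values are hypothetical, and jq is assumed to be installed if you use the commented command to read the stream size. The stream name sap is taken from the example output above.

```shell
#!/usr/bin/env bash
# Sketch: estimate how much of the PVC capacity the stream occupies.
#
# Assumed way to obtain the inputs (needs the port-forward to eventing-nats):
#   nats stream info sap --json | jq '.state.bytes'   # stored bytes of the stream
#   kubectl -n kyma-system get pvc                    # capacity of the NATS volumes

# usage_percent USED_BYTES CAPACITY_BYTES
# Prints the usage as an integer percentage, rounded down.
usage_percent() {
  local used=$1 capacity=$2
  if [ "$capacity" -le 0 ]; then
    echo "capacity must be a positive number of bytes" >&2
    return 1
  fi
  echo $(( used * 100 / capacity ))
}

# Example with assumed values: 900 MiB stored on a 1 GiB volume.
usage_percent $((900 * 1024 * 1024)) $((1024 * 1024 * 1024))
```

A value close to 100 suggests that the volume is the limiting factor, in which case one of the two remedies from the last step applies.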

  3. Check the JetStream Grafana dashboard:

    1. Port forward the Grafana Service to localhost. Use port 8081. Run:
      kubectl -n kyma-system port-forward svc/monitoring-grafana 8081:80
    2. On localhost:8081, search for the NATS JetStream dashboard. It shows the stream and consumer metrics as well as the storage and memory consumption.

    3. Also search for the JetStream Event Types Summary and Delivery per Subscription dashboards to visualize and identify the phase in which the events were lost.
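
Both procedures above depend on a kubectl port-forward that is still running in another terminal. The following is a minimal sketch for checking that the local ports are open before you use the nats CLI or the browser. The helper name port_open is hypothetical, and the sketch assumes bash (for /dev/tcp) and the coreutils timeout command are available.

```shell
#!/usr/bin/env bash
# Sketch: check that the forwarded local ports accept TCP connections.

# port_open HOST PORT
# Succeeds if a TCP connection can be opened within 2 seconds.
port_open() {
  local host=$1 port=$2
  # /dev/tcp is a bash feature; the child shell closes the socket when it exits.
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# 4222 is the NATS port and 8081 the Grafana port used in the steps above.
for port in 4222 8081; do
  if port_open localhost "$port"; then
    echo "localhost:${port} is reachable"
  else
    echo "localhost:${port} is not reachable; is kubectl port-forward still running?"
  fi
done
```

An open port only shows that something is listening locally; it does not confirm that the Service behind the port-forward is healthy.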