Dynamic Sampling Context (Experimental)

Until now, trace sampling could only be configured through the traces_sample_rate option in the SDKs. This has several drawbacks for users of Sentry SDKs:

  • Changing the sampling rate involves either redeploying applications (which is problematic for applications that are not updated automatically, e.g., mobile apps or physically distributed software) or building complex systems to fetch a sampling rate dynamically.
  • Sampling is based only on randomness. Employing sampling rules, for example, based on event parameters, is currently very complex. While writing rules for individual transactions is possible, enforcing them on an entire trace is infeasible.

The solution for these problems is Dynamic Sampling. Dynamic Sampling allows users to configure sampling rules directly in the Sentry interface. Important: Sampling rules may also be applied to entire traces.

High-Level Problem Statement

Ingest

Implementing Dynamic Sampling comes with challenges, especially on the ingestion side of things. For Dynamic Sampling, we want to make sampling decisions for entire traces. However, to keep ingestion speedy, Relay only looks at singular transactions in isolation (as opposed to looking at whole traces). This means that we need the exact same decision basis for all transactions belonging to a trace. In other words, all transactions of a trace need to hold all of the information to make a sampling decision, and that information needs to be the same across all transactions of the trace. We call the information we base sampling decisions on "Dynamic Sampling Context" or "DSC".
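
To illustrate why identical context leads to identical decisions, here is a minimal sketch of a sampling function that depends only on the DSC. It is not Relay's actual algorithm: the function name sample_decision, the rule format, and the hashing of the trace ID are all assumptions made for this example.

```python
import hashlib

def sample_decision(dsc: dict, rules: list) -> bool:
    """Return a keep/drop decision that depends only on the DSC (hypothetical)."""
    # Use the rate of the first rule whose conditions all match the DSC.
    rate = 1.0
    for rule in rules:
        if all(dsc.get(key) == value for key, value in rule["match"].items()):
            rate = rule["sample_rate"]
            break

    # Derive a stable number in [0, 1) from the trace ID, so every
    # transaction of the same trace ends up with the same number.
    digest = hashlib.sha256(dsc["trace_id"].encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big") / 2**32
    return value < rate
```

Because the function reads nothing but the DSC, two transactions that carry the same DSC always receive the same decision, even when they are processed in isolation and at different times.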

SDKs

SDKs are responsible for propagating Dynamic Sampling Context across all applications that are part of a trace. This involves:

  1. Either collecting the information that makes up the DSC or extracting the DSC from incoming requests (never both).
  2. Propagating DSC to downstream SDKs.
  3. Sending the DSC to Sentry via an envelope.

Because there are quite a few things to keep in mind for DSC propagation and to avoid every SDK running into the same problems, we defined a unified propagation mechanism (step-by-step instructions) that all SDK implementations should be able to follow.

Baggage

We chose baggage, as defined in the W3C Baggage specification, as the propagation mechanism for DSC. Baggage is a standard HTTP header that carries URI-encoded key-value pairs.

For the propagation of DSC, SDKs first read the DSC from the baggage header of incoming requests/messages. To propagate DSC to downstream SDKs/services, we create a baggage header (or modify an existing one) through HTTP request instrumentation.

The following is an example of what a baggage header containing Dynamic Sampling Context may look like:

baggage: other-vendor-value-1=foo;bar;baz, sentry-traceid=771a43a4192642f0b136d5159a501700, sentry-publickey=49d0f7386ad645858ae85020e393bef3, sentry-userid=Am%C3%A9lie, other-vendor-value-2=foo;bar;

See the Payloads section for a complete list of key-value pairs that SDKs should propagate.
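
As a rough sketch of the reading side, the following shows one way to split a baggage header into its members and pick out the entries set by Sentry SDKs. The function names parse_baggage and extract_dsc are hypothetical, and the sketch assumes that Sentry entries carry no baggage properties (the ";"-separated suffixes other vendors may use).

```python
from urllib.parse import unquote

SENTRY_PREFIX = "sentry-"

def parse_baggage(header: str) -> dict:
    """Split a baggage header value into a {key: raw_value} dict."""
    members = {}
    for member in header.split(","):
        member = member.strip()
        if "=" not in member:
            continue  # skip empty or malformed members
        key, _, value = member.partition("=")
        members[key.strip()] = value.strip()
    return members

def extract_dsc(header: str) -> dict:
    """Return only the Sentry entries, prefix removed and values URI-decoded."""
    return {
        key[len(SENTRY_PREFIX):]: unquote(value)
        for key, value in parse_baggage(header).items()
        if key.startswith(SENTRY_PREFIX)
    }

header = "other-vendor-value-1=foo;bar;baz, sentry-traceid=771a43a4192642f0b136d5159a501700, sentry-userid=Am%C3%A9lie"
print(extract_dsc(header))
# {'traceid': '771a43a4192642f0b136d5159a501700', 'userid': 'Amélie'}
```

Entries from other vendors are left untouched by parse_baggage, so they can be forwarded unchanged.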

Payloads

Dynamic Sampling Context is propagated via a baggage header and sent to Sentry via transaction envelope headers.

Baggage Header

SDKs may set the following key-value pairs on baggage headers. While all of these values are optional, SDKs should make their best effort to add as many of them to the baggage header as possible when starting a trace.

  • sentry-traceid - The original trace ID as generated by the SDK
  • sentry-publickey - Public key as defined by the user via the DSN in the SDK options
  • sentry-release - The release as defined by the user in the SDK options
  • sentry-environment - The environment as defined by the user in the SDK options
  • sentry-transaction - The name of the trace's origin transaction in unparameterized (raw) format
  • sentry-userid - User ID as set by the user with scope.set_user
  • sentry-usersegment - User segment as set by the user with scope.set_user
  • sentry-samplerate - Sample rate as defined by the user in the SDK options

SDKs must set all of the keys in the form of "sentry-[name]". The prefix "sentry-" identifies key-value pairs set by Sentry SDKs. Additionally, [name] is written in all lowercase, that is, in "snake case" without any underscore ( _ ) characters. We chose this naming convention because it is the most language-agnostic.
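
The naming convention can be sketched as a small serializer. This is only an illustration under assumptions: the function name dsc_to_baggage_header is hypothetical, and it expects a dict keyed by the bare [name] part (for example "traceid").

```python
from urllib.parse import quote

def dsc_to_baggage_header(dsc: dict) -> str:
    """Serialize DSC values into sentry-prefixed, URI-encoded baggage members."""
    members = []
    for name, value in dsc.items():
        if value is None:
            continue  # all values are optional; skip what could not be collected
        members.append(f"sentry-{name}={quote(str(value), safe='')}")
    return ",".join(members)

print(dsc_to_baggage_header({"traceid": "771a43a4192642f0b136d5159a501700", "userid": "Amélie", "release": None}))
# sentry-traceid=771a43a4192642f0b136d5159a501700,sentry-userid=Am%C3%A9lie
```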

Envelope Header

Dynamic Sampling Context is transferred to Sentry through the transaction envelope headers, under the key trace. It corresponds directly to the definition of Trace Context.

When a transaction is reported to Sentry, the Dynamic Sampling Context must be mapped to Trace Context in the following way:

  • sentry-release → release
  • sentry-environment → environment
  • sentry-transaction → transaction
  • sentry-userid → user.id
  • sentry-usersegment → user.segment
  • sentry-samplerate → sample_rate
  • sentry-traceid → trace_id
  • sentry-publickey → public_key
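
The mapping can be expressed as a lookup table. The following is a minimal sketch, assuming that the dotted names user.id and user.segment denote a nested user object; the names BAGGAGE_TO_TRACE_CONTEXT and baggage_to_trace_context are hypothetical.

```python
# Baggage key -> path inside the Trace Context object.
BAGGAGE_TO_TRACE_CONTEXT = {
    "sentry-release": ("release",),
    "sentry-environment": ("environment",),
    "sentry-transaction": ("transaction",),
    "sentry-userid": ("user", "id"),
    "sentry-usersegment": ("user", "segment"),
    "sentry-samplerate": ("sample_rate",),
    "sentry-traceid": ("trace_id",),
    "sentry-publickey": ("public_key",),
}

def baggage_to_trace_context(baggage: dict) -> dict:
    """Build the Trace Context from a dict of decoded baggage entries."""
    trace = {}
    for key, path in BAGGAGE_TO_TRACE_CONTEXT.items():
        if key not in baggage:
            continue  # all values are optional
        target = trace
        for part in path[:-1]:
            target = target.setdefault(part, {})
        target[path[-1]] = baggage[key]
    return trace
```

Entries that do not appear in the table, such as values from other vendors, are ignored and never reach the envelope header.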

Unified Propagation Mechanism

SDKs should follow these steps for all incoming and outgoing requests (shown in Python pseudo-code for illustrative purposes):

def collect_dynamic_sampling_context():
  # Placeholder function that collects as many values for Dynamic Sampling Context
  # as possible and returns a dict
  return {}

def on_incoming_request(request):
  if has_header(request, "sentry-trace") and (not has_header(request, "baggage") or not has_sentry_value_in_baggage_header(request)):
    # Request comes from an old SDK which doesn't support Dynamic Sampling Context yet
    # --> we don't propagate baggage for this trace
    transaction.baggage_locked = True
    transaction.baggage = {}
  elif has_header(request, "baggage") and has_sentry_value_in_baggage_header(request):
    # Request already carries Dynamic Sampling Context
    # --> we take it over as-is and don't modify it anymore
    transaction.baggage_locked = True
    transaction.baggage = baggage_header_to_dict(request.headers.baggage)

def on_outgoing_request(request):
  if not transaction.baggage_locked:
    # This SDK is the head of the trace
    # --> collect the Dynamic Sampling Context once, then lock it
    transaction.baggage_locked = True
    if not transaction.baggage:
      transaction.baggage = {}
    transaction.baggage = merge_dicts(collect_dynamic_sampling_context(), transaction.baggage)

  if has_header(request, "baggage"):
    # Keep the entries that are already on the outgoing request
    outgoing_baggage_dict = baggage_header_to_dict(request.headers.baggage)
    merged_baggage_dict = merge_dicts(outgoing_baggage_dict, transaction.baggage)
    merged_baggage_header = dict_to_baggage_header(merged_baggage_dict)
    set_header(request, "baggage", merged_baggage_header)
  else:
    baggage_header = dict_to_baggage_header(transaction.baggage)
    set_header(request, "baggage", baggage_header)

While the transaction.baggage_locked flag is not strictly necessary yet, there is a future use case that needs it: we might want users to be able to set Dynamic Sampling Context values themselves. The flag becomes relevant after the first propagation, at which point the Dynamic Sampling Context becomes immutable. When users attempt to set DSC values afterwards, our SDKs should make this operation a no-op.
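
A minimal sketch of how such a no-op could look, assuming a hypothetical set_dsc_value method on the transaction (no SDK is required to expose this name or signature):

```python
class Transaction:
    """Hypothetical transaction holding the DSC and its lock flag."""

    def __init__(self):
        self.baggage = {}
        self.baggage_locked = False

    def set_dsc_value(self, key: str, value: str) -> bool:
        """Set a DSC value; silently do nothing once the DSC is locked."""
        if self.baggage_locked:
            return False  # no-op: DSC is immutable after the first propagation
        self.baggage[key] = value
        return True

transaction = Transaction()
transaction.set_dsc_value("sentry-release", "1.0.0")  # accepted
transaction.baggage_locked = True                     # first propagation happened
transaction.set_dsc_value("sentry-release", "2.0.0")  # ignored
print(transaction.baggage)
# {'sentry-release': '1.0.0'}
```

Returning a boolean instead of raising keeps the call harmless for user code while still letting an SDK log that the value was dropped.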

Considerations

Todo:

  • Why baggage and not Trace Context (https://www.w3.org/TR/trace-context/)?
  • Why must baggage be immutable before the second transaction has been started?
  • Why can't we just make the decision for the whole trace in Relay after the trace is complete?