Elasticsearch Integration Set Up for DSR Automation
Configure your Elasticsearch integration to fulfill Data Subject Requests (DSRs), both ACCESS (right-to-know) and ERASURE (right-to-be-forgotten), against any index in your cluster.
With Transcend's Elasticsearch integration, you can fulfill ACCESS and ERASURE DSRs directly against the documents stored in your Elasticsearch indices.
Elasticsearch indices are schema-flexible, which means PII often lands in fields you didn't plan for: emails inside log payloads, user IDs in clickstream events, names buried in error messages. Transcend's integration is built for that reality: every index found through datapoint schema discovery automatically picks up DSR support. Each DSR runs the ACCESS query and the ERASURE redaction script, which you define per index using the Edit Query UI on the datapoint.
Because Elasticsearch is most often used as a logging or analytics store, this integration redacts matching fields rather than deleting documents. That preserves log integrity and audit trails while still removing the personal data.
The first step is making sure your Elasticsearch indices are registered as datapoints in Transcend. The recommended path is to run Schema Discovery against your Elasticsearch silo — this enumerates every non-system index and surfaces its fields automatically.
Once schema discovery has run, every discovered index is treated as one datapoint. You'll then need to enable ACCESS and ERASURE actions for the relevant indices, as well as define a query JSON statement for the enabled actions.
Note: System indices (those whose names start with ., like .kibana and .security-*) are excluded from discovery by default.
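The naming rule in the note above can be illustrated with a short Python sketch. It only demonstrates the leading-dot convention; the function names are hypothetical, and this is not Transcend's discovery implementation.

```python
def is_system_index(name):
    """Treat any index whose name starts with '.' as a system index."""
    return name.startswith(".")


def discoverable_indices(names):
    """Keep only the indices that the naming rule would not exclude."""
    return [name for name in names if not is_system_index(name)]


# Hypothetical index names, for illustration only.
indices = [".kibana", ".security-7", "app-logs-2024", "clickstream"]
print(discoverable_indices(indices))  # ['app-logs-2024', 'clickstream']
```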
Transcend's Elasticsearch integration supports both ACCESS and ERASURE. Each datapoint accepts a JSON payload. You can either enter the statement manually or click "Generate ACCESS Query Statements" to auto-generate the statement.
| Action | Auto-generated Statement | Manually Defined Statement |
|---|---|---|
| ACCESS | Field-agnostic query_string search against every analyzed field, matching the request identifier | A custom Query DSL body |
| ERASURE | Redact every top-level field of every document returned by ACCESS, replacing the value with [REDACTED] | A custom script body |
For ACCESS, your query JSON statement is a JSON object matching Elasticsearch's Query DSL. What you provide is the value of the query field in a _search request — not the full request body.
Use the {{identifier}} placeholder anywhere inside your query statement where the DSR identifier (the data subject's email, user ID, etc.) should be substituted. Note: Transcend's Sombra encryption gateway fills {{identifier}} in just before the request leaves — the integration code itself never sees the plaintext identifier.
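To make the relationship between the query statement, the {{identifier}} placeholder, and the final _search body concrete, here is a minimal Python sketch. The helper names (`fill_identifier`, `build_search_body`) and the sample identifier are hypothetical, and the substitution logic is an assumption made for illustration; the real substitution happens inside Sombra and may work differently.

```python
import json

PLACEHOLDER = "{{identifier}}"


def fill_identifier(node, identifier):
    """Recursively replace the placeholder inside every string value (hypothetical helper)."""
    if isinstance(node, str):
        return node.replace(PLACEHOLDER, identifier)
    if isinstance(node, list):
        return [fill_identifier(item, identifier) for item in node]
    if isinstance(node, dict):
        return {key: fill_identifier(value, identifier) for key, value in node.items()}
    return node  # numbers, booleans, and null pass through unchanged


def build_search_body(query_statement, identifier):
    """Place the filled-in statement under the `query` field of a _search body."""
    return {"query": fill_identifier(query_statement, identifier)}


# The statement is only the value of `query`, not the full request body.
statement = {"term": {"user.email.keyword": "{{identifier}}"}}
body = build_search_body(statement, "jane@example.com")
print(json.dumps(body, indent=2))
```

Substituting on the parsed structure rather than on the raw JSON text means an identifier containing quotes or backslashes cannot break the JSON; that is a design choice of this sketch, not a documented property of the integration.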
If you click "Generate ACCESS Query Statements", Transcend auto-generates the following query JSON payload:
{
  "query_string": {
    "query": "{{identifier}}",
    "default_operator": "AND"
  }
}
This matches any document where every token in the identifier appears in at least one analyzed field. The AND operator keeps multi-token identifiers (e.g. "first last") from over-matching.
Here is an example of a custom payload using a bool compound query:
{
  "bool": {
    "should": [
      { "term": { "message": "quick" }},
      { "term": { "message": "brown" }},
      { "prefix": { "message": "f" }}
    ]
  }
}
In this example, Transcend returns any document whose message field contains the term quick, contains the term brown, or contains a term that starts with f.
A more typical query statement that targets a specific field by identifier could look like this:
{
  "bool": {
    "must": [
      { "term": { "user.email.keyword": "{{identifier}}" }}
    ]
  }
}
- The payload must be a JSON object, not an array or a raw string.
- Only the contents of the query parameter are supplied; do not wrap your override in an outer { "query": { ... } }.
- {{identifier}} is the only placeholder Transcend substitutes. Any other {{...}} token is sent to Elasticsearch as-is.
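The rules above lend themselves to a quick pre-flight check before pasting a statement into the Edit Query UI. The following Python sketch is one possible way to do that; `check_access_statement` is a hypothetical helper, its checks are heuristics derived from the rules listed here, and it is not part of Transcend's product.

```python
import json
import re

# Matches any {{...}} token so that unsupported placeholders can be reported.
PLACEHOLDER_PATTERN = re.compile(r"\{\{(.*?)\}\}")


def check_access_statement(raw):
    """Return a list of problems found in a raw ACCESS query statement."""
    try:
        statement = json.loads(raw)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error.msg}"]
    if not isinstance(statement, dict):
        return ["payload must be a JSON object, not an array or a raw string"]
    problems = []
    # Heuristic: a single top-level "query" key suggests the full body was pasted.
    if set(statement) == {"query"}:
        problems.append('remove the outer { "query": ... } wrapper')
    for token in PLACEHOLDER_PATTERN.findall(raw):
        if token != "identifier":
            problems.append(f"{{{{{token}}}}} is not substituted and is sent as-is")
    return problems


print(check_access_statement('{"term": {"user.email.keyword": "{{identifier}}"}}'))
print(check_access_statement('{"query": {"term": {"email": "{{email}}"}}}'))
```

An empty list means none of the three rules was violated; it does not mean Elasticsearch will accept the query.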
ERASURE runs against the same set of documents ACCESS finds. For every document ACCESS returns, Transcend calls POST /{index}/_update/{_id} with a Painless script that determines how the document is rewritten.
Important: ACCESS must be enabled for ERASURE to run successfully, because ERASURE operates on the documents that ACCESS returns.
The ERASURE query statement is a JSON object with only the following keys, matching Elasticsearch's script object shape: source, lang, and params.
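For orientation, Elasticsearch's Update API expects a script under a top-level `script` field of the request body. The Python sketch below shows how a statement with the three allowed keys could be checked and wrapped into that shape. The helper name `build_update_body`, the sample script, and the requirement that `source` be present are assumptions for illustration; the exact request Transcend sends is not specified here.

```python
ALLOWED_KEYS = {"source", "lang", "params"}


def build_update_body(erasure_statement):
    """Validate the statement's keys and wrap it as the `script` of an _update body."""
    unexpected = set(erasure_statement) - ALLOWED_KEYS
    if unexpected:
        raise ValueError(f"unsupported keys: {sorted(unexpected)}")
    if "source" not in erasure_statement:
        # Assumption of this sketch: an inline script always carries its source.
        raise ValueError("the statement needs a `source`")
    return {"script": dict(erasure_statement)}


# Hypothetical single-field statement, for illustration only.
statement = {
    "source": "ctx._source.email = params.redacted;",
    "lang": "painless",
    "params": {"redacted": "[REDACTED]"},
}
update_body = build_update_body(statement)
print(update_body["script"]["lang"])  # painless
```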
If you click "Generate ERASURE Query Statements", Transcend auto-generates the following query statement:
{
  "source": "List keys = new ArrayList(ctx._source.keySet()); for (def key : keys) { ctx._source[key] = params.redacted; }",
  "lang": "painless",
  "params": {
    "redacted": "[REDACTED]"
  }
}
This replaces every top-level field of _source with the string [REDACTED]. The document itself is preserved; only its field values are overwritten. This is the safest default because it guarantees no personal data remains in the document, regardless of which fields contain it.
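To see what the default script does to a document, here is a Python equivalent of the same loop applied to a plain dictionary. The sample document is invented, and the sketch models only the rewrite of _source; it does not model index mappings or the Painless runtime.

```python
def redact_all_top_level(source, redacted="[REDACTED]"):
    """Overwrite the value of every top-level key, mirroring the default script."""
    # Copy the key list first, as the Painless script does, before mutating.
    for key in list(source.keys()):
        source[key] = redacted
    return source


# Hypothetical log document, for illustration only.
doc = {"email": "jane@example.com", "user": {"id": 42, "name": "Jane"}, "status": 500}
print(redact_all_top_level(doc))
# {'email': '[REDACTED]', 'user': '[REDACTED]', 'status': '[REDACTED]'}
```

Note that the nested `user` object is replaced as a whole, which is why nothing inside it survives.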
If you want to redact only specific fields (e.g. only email and name), provide a custom script:
{
  "source": "if (ctx._source.containsKey('email')) { ctx._source.email = params.redacted; } if (ctx._source.containsKey('name')) { ctx._source.name = params.redacted; }",
  "lang": "painless",
  "params": {
    "redacted": "[REDACTED]"
  }
}
ACCESS and ERASURE are wired together. The flow on every DSR is:
- ACCESS runs the configured (or default) Query DSL search against the index. Matching documents are streamed back through Transcend's Sombra encryption gateway.
- For each matching document, Transcend records the document _id.
- ERASURE calls _update/{_id} once per document, applying the configured (or default) Painless script.
This means scoping ACCESS also scopes ERASURE. If your ACCESS query is broad, your ERASURE will run against a broad set of documents. If your ACCESS query is narrow, only those documents are redacted.
It also means ERASURE does not delete documents. The document remains in the index; its field values are overwritten. This is intentional: most Elasticsearch deployments are used for logs, and deleting log entries breaks audit-trail integrity. Redaction satisfies the privacy requirement while leaving the surrounding event record intact.
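The coupling between the two actions can be illustrated with a toy in-memory model in Python. Everything here is hypothetical: the index is a plain dictionary, the ACCESS query is stood in for by a predicate function, and the ERASURE script by a field-level redaction. It is a sketch of the flow described above, not of Transcend's or Elasticsearch's implementation.

```python
def run_access(index, matches):
    """Return the _id of every document the ACCESS predicate matches."""
    return [doc_id for doc_id, source in index.items() if matches(source)]


def run_erasure(index, doc_ids, fields, redacted="[REDACTED]"):
    """Apply one update per recorded _id, redacting only the listed fields."""
    for doc_id in doc_ids:
        source = index[doc_id]
        for field in fields:
            if field in source:
                source[field] = redacted


# Hypothetical index contents, keyed by document _id.
index = {
    "a1": {"email": "jane@example.com", "message": "login ok"},
    "b2": {"email": "sam@example.com", "message": "login failed"},
}

ids = run_access(index, lambda src: src.get("email") == "jane@example.com")
run_erasure(index, ids, fields=["email"])

print(ids)                # ['a1']
print(len(index))         # 2
print(index["a1"])        # {'email': '[REDACTED]', 'message': 'login ok'}
```

Only the document that ACCESS matched is touched, both documents remain in the index, and the surrounding `message` field is left intact.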