split

Split a data extract into multiple extracts.

Introduction

Use the split enrichment to split the data extract into multiple extracts based on given criteria. You can split a data extract using a given number of rows or based on distinct values in a given column.

Creating the enrichment

For more information on creating an enrichment, see Using custom scripts.

Configuring the enrichment

To configure the enrichment, fill in the following fields. Required fields are marked with an asterisk (*).

Key*

Enter the name of the column to search for distinct values. The enrichment creates a new data extract for each unique value in the column.
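The idea of splitting on distinct values can be sketched in plain Python. This is only an illustration, not the enrichment's actual implementation: the function name `split_by_key` and the list-of-dictionaries row format are assumptions made for this example.

```python
def split_by_key(rows, key):
    """Group rows into one list per distinct value of `key`.

    Hypothetical helper for illustration only. Groups appear in the
    order in which each distinct value is first seen.
    """
    groups = {}
    for row in rows:
        # Start a new group the first time a value appears, then append the row.
        groups.setdefault(row[key], []).append(row)
    return groups


# Sample rows invented for this sketch.
rows = [
    {"Campaign": "Brand", "Clicks": 7},
    {"Campaign": "Dashboard", "Clicks": 4},
    {"Campaign": "Brand", "Clicks": 3},
]

extracts = split_by_key(rows, "Campaign")
for value, group in extracts.items():
    print(value, len(group))
```

Each entry in `extracts` corresponds to one new data extract, holding only the rows that share the same value in the key column.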

Transformers

Enter the names of the enrichment scripts to apply to the new data extracts created after the split enrichment is complete. Each name entered in this field must exactly match the name of an existing enrichment.

Tags

Enter the names of the tags to apply to the data extracts created after the split enrichment is complete.

Update Metadata From Key

Select this field to add the key to the metadata of the new data extracts created.

Name Pattern

Enter a name pattern from which the names of the split data extracts are created. You can use the following placeholders in the name pattern:

  • {datastream_slug} - This is an identifier for the datastream that combines the data source, the ID of the datastream and the name assigned to the datastream.

  • {id} - This is the ID of the datastream.

  • {app_label} - This is the data source.

  • {split_counter} - This is a count of how many data extracts have been created. Counting starts at 0.

  • {key} - This is the key used to split the data extract.

  • {uuid} - This is the unique file ID.

  • {meta[value]} - This is a value from the data extract metadata.
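The placeholders above resemble Python format fields, so their substitution can be approximated with `str.format`. Whether the enrichment resolves patterns exactly this way is an assumption; the helper `render_name`, the metadata key `region`, and all sample values below are hypothetical.

```python
def render_name(pattern, **values):
    """Fill a name pattern with the given placeholder values.

    Hypothetical helper: it relies on Python's str.format, which may
    differ from how the enrichment itself resolves placeholders.
    """
    return pattern.format(**values)


# A pattern built from the key and the split counter.
name = render_name("{key}_{split_counter}", key="Campaign", split_counter=0)

# A pattern that reads one entry from the metadata; "region" is an
# invented metadata key used only for this sketch.
meta_name = render_name(
    "{app_label}_{meta[region]}",
    app_label="example_source",
    meta={"region": "emea"},
)

print(name)
print(meta_name)
```

In this sketch, a placeholder that is missing from the supplied values raises a `KeyError`, so a pattern should only use placeholders for which a value exists.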

Rowcount

Enter the number of rows that each split data extract will contain. For example, enter 10 to split a data extract of 120 rows into twelve separate data extracts, each containing 10 rows.
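The arithmetic of a row-count split can be sketched as follows. This is an illustration under assumptions, not the enrichment's implementation: `split_by_rowcount` is a hypothetical helper, and the text above does not state how a remainder is handled, so the shorter final chunk in this sketch is an assumption.

```python
def split_by_rowcount(rows, rowcount):
    """Cut `rows` into consecutive chunks of at most `rowcount` rows.

    Hypothetical helper for illustration only. If the number of rows is
    not a multiple of `rowcount`, the last chunk is shorter (assumed).
    """
    if rowcount <= 0:
        raise ValueError("rowcount must be a positive integer")
    return [rows[i:i + rowcount] for i in range(0, len(rows), rowcount)]


# 120 placeholder rows split into chunks of 10 rows each.
rows = list(range(120))
chunks = split_by_rowcount(rows, 10)
print(len(chunks))
```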

Subtable

Enter the name for a subtable that you want to contain the enriched data. The enrichment is applied to the whole data extract, then the enriched data is output into the subtable you have named here.

This subtable is a temporary table, which means it only exists for this custom script. You can apply additional instructions within the same custom script to the subtable. However, the subtable cannot be used in any other custom scripts.

Example

Enrichment configuration

Key             Campaign
Name Pattern    {key}_{split_counter}

Data table before enrichment

Campaign     Ad Group        Clicks
Brand        media           7
Brand        ecommerce       3
Brand        festivals       18
Dashboard    ecommerce       4
Dashboard    media|social    5
Dashboard    media           11

Data table after enrichment

Table 1

This table is called Campaign_0.

Campaign     Ad Group        Clicks
Brand        media           7
Brand        ecommerce       3
Brand        festivals       18

Table 2

This table is called Campaign_1.

Campaign     Ad Group        Clicks
Dashboard    ecommerce       4
Dashboard    media|social    5
Dashboard    media           11