In this article
We review the steps to set up data routing from Permutive to your AWS S3 destination.
Introduction
Permutive Routing makes it easy to load all of your first-party event data from web and mobile into destinations such as Google BigQuery and Amazon S3.
The Permutive S3 Router gives you access to your raw event data from Permutive, allowing you to run your own analysis or integrate with your existing data pipelines in AWS. Data can be written as JSON or Parquet (GZIP or Snappy compressed, respectively).
This guide will walk you through how to set up the S3 bucket and the permissions required for the Routing integration. With these in place, our system will write event data to your bucket every 24 hours.
Getting started with S3
Create an AWS account
You will need to create an AWS account if you do not already have one.
AWS offers a free tier that includes 5GB of free storage in S3 for the first 12 months of opening your account.
Create an S3 bucket
If you do not already have one, you will need to create a new S3 bucket within your AWS account to hold your Permutive data. Instructions for doing this are available from the AWS documentation.
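If you prefer to create the bucket programmatically, a minimal sketch using boto3 is below. The bucket name and region are placeholder assumptions, not values Permutive requires:

```python
import boto3

# A minimal sketch of creating a bucket with boto3. The bucket name and
# region are placeholders -- substitute your own values.
s3 = boto3.client("s3", region_name="eu-west-1")
s3.create_bucket(
    Bucket="my-permutive-export-bucket",
    # LocationConstraint is required for regions other than us-east-1.
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
```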
Creating your S3 Router
Set permissions on the S3 bucket
You must grant cross-account access to the Permutive AWS user for your integration with the following bucket permissions:
- PutObject
- GetObject
- DeleteObject
- ListBucket
The policy requires a specific user ARN, which Permutive support will provide on request during setup. We can also supply a working policy document for you to apply when creating the bucket policy.
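As an illustration, the policy could be applied with boto3 as sketched below. The bucket name and user ARN here are hypothetical placeholders; use the actual ARN (and, if supplied, the policy document) you receive from Permutive support:

```python
import json
import boto3

# Placeholder values -- replace with your bucket name and the user ARN
# provided by Permutive support during setup.
BUCKET = "my-permutive-export-bucket"
PERMUTIVE_USER_ARN = "arn:aws:iam::123456789012:user/permutive-router"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Object-level actions apply to the objects within the bucket.
            "Sid": "PermutiveObjectAccess",
            "Effect": "Allow",
            "Principal": {"AWS": PERMUTIVE_USER_ARN},
            "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            # ListBucket applies to the bucket itself, not its objects.
            "Sid": "PermutiveListAccess",
            "Effect": "Allow",
            "Principal": {"AWS": PERMUTIVE_USER_ARN},
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```

Note that the object-level actions (PutObject, GetObject, DeleteObject) and the bucket-level action (ListBucket) target different resource ARNs; both statements are needed for the integration's full permission set.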
Notify Permutive of the bucket details
You must now provide the following details to your contact at Permutive:
- S3 bucket name
- path prefix (optional)
- the format for exported data (gzip-compressed JSON or Snappy-compressed Parquet)
We will then activate your S3 integration, with data being written every 24 hours (at approximately 3am UTC). Read on for details on the paths within the bucket for the various types of data we export.
Data included in the export
Events
Data for the events created in your Permutive account will be uploaded under a path containing _events. Events are written to separate paths for each day of data. For example, Pageview events for Jan 1st, 2021 will be written to the following path:
s3://<bucket_name>/<prefix>/data/pageview_events/year=2021/month=1/day=1
Similar paths will be created for each event defined in your Permutive account.
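As an illustration of working with these date-partitioned paths, the following boto3 sketch lists the objects exported for Pageview events on Jan 1st, 2021. The bucket name and prefix are placeholders for your own values:

```python
import boto3

# A sketch: list the files exported for Pageview events on a given day.
# Bucket name and prefix are placeholders for your own values.
s3 = boto3.client("s3")
resp = s3.list_objects_v2(
    Bucket="my-permutive-export-bucket",
    Prefix="my-prefix/data/pageview_events/year=2021/month=1/day=1",
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```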
Aliases
Alias data will be written similarly to events, with each day's data under a separate path:
s3://<bucket_name>/<prefix>/data/aliases/year=2021/month=1/day=1
Domains
Domains data is not date-specific but contains a snapshot of the latest domains in your Permutive account. This latest data is overwritten with each export. Data is written to:
s3://<bucket_name>/<prefix>/data/domains
Segment metadata
As with domains, segment metadata is written as a snapshot of the latest data, overwritten with each export:
s3://<bucket_name>/<prefix>/data/segment_metadata
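Because these snapshots are overwritten in place, fetching from these paths always returns the latest data. Below is a sketch of downloading the segment metadata snapshot with boto3; the bucket name and prefix are placeholders, and the same approach works for the domains path:

```python
import os
import boto3

# A sketch: download the latest segment metadata snapshot. The bucket name
# and prefix are placeholders; the same approach works for domains.
BUCKET = "my-permutive-export-bucket"
s3 = boto3.client("s3")
resp = s3.list_objects_v2(Bucket=BUCKET, Prefix="my-prefix/data/segment_metadata")
for obj in resp.get("Contents", []):
    # Save each exported file locally under its base name.
    s3.download_file(BUCKET, obj["Key"], os.path.basename(obj["Key"]))
```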