This guide is intended for publishers and marketers who want to upload second-party data into Permutive. This is typically user cohort membership data that comes from an advertising partner or an advertising partner’s DMP. It could also be the publisher's first-party data on their subscribers or other identifiable users.
This type of data import can be used as a one-off data dump or a periodic import.
In order to use Permutive data imports, you’ll first need access to a Google Cloud Storage bucket for data uploads.
Permutive supports one GCS bucket per customer, with each project/data source (we refer to them as data providers) stored in its own subfolder.
Please contact Permutive support with the following information and our team will provide you with a bucket and relevant access:
1. Users to have access to the bucket. There are a couple of options depending on your intended upload method:
- Manual file upload: we will grant Google Workspace group email addresses access to the GCS bucket so that you can upload files via the Google Cloud Platform Console or command-line tool. Each account has to be recognised as associated with an active Google Account or Google Apps account. Using group email accounts lets you manage user access yourself, so you do not need to reach out to us to add or remove users. If you do not have a group email address, please let us know.
- Programmatic file upload: you will need to create a service account within your GCP project, which will be used to perform uploads to Permutive. In this case, please let us know the email address of your service account so that we can grant access.
Note: You can also request access for both types of users to the same bucket.
2. The name of the project/data source (this could be a partner's name or an internal project name if it's your data) - it is quite arbitrary and will form a part of your bucket's path. Please use just letters, numbers and an underscore (ie.
3. Name of the user id alias to be used in matching. If the data comes from an advertising partner then this will most likely be an AppNexus ID. When you are using your own data, users have to be identified via our identity framework, see User Matching below.
Sending data into Permutive
Data should be uploaded to your Permutive GCS bucket in the format described in this document. The Permutive platform will detect new uploads made to this bucket and immediately ingest new data into the platform.
Any data item included in your upload must have a user ID associated with it. In order for Permutive to tie imported data with a user landing on your site, the same user ID must be available on-site. This could be a user ID picked up by one of our existing cookie syncs, or it could be your own internal user ID that you are using with permutive.identify. The ID must be a string containing up to 100 characters.
Permutive treats these external user IDs as aliases, and each alias has an alias type associated with it. We typically rely on other third-party IDs such as the AppNexus ID to match Permutive users with second and third-party data that are sent into the platform. This alias has type Appnexus. If Permutive is not already picking up AppNexus IDs for your users, or if you’d like to send a different third-party ID in your data imports, please discuss this with our support team.
Below, we've used a sample scenario where you have data collected on subscribers. This will only apply to visitors who have logged in. We assume that when a user logs in you will have an opportunity to execute code and will have some form of internal ID for that user (in a user.id variable). You could then use this code to associate the login ID with that visitor:
We would be able to use subscriber_id as a user ID alias in the audience data.
Please refer to our Identity Framework Guide for more details.
Note: Ensure that the value passed as an ID is never empty. An empty value would cause ingestion errors and could make different users on the platform collapse into one.
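The ID constraints above (a string, never empty, up to 100 characters) can be checked before an ID is sent to the page or written to an import file. The following is a minimal sketch only: the function name is hypothetical, and treating whitespace-only values as empty is an assumption rather than a documented Permutive rule.

```python
MAX_ID_LENGTH = 100  # upper bound on ID length stated in this guide


def is_valid_user_id(user_id: object) -> bool:
    """Return True for a non-empty string of at most MAX_ID_LENGTH characters."""
    if not isinstance(user_id, str):
        return False
    # Assumption: a whitespace-only value is treated as empty.
    if not user_id.strip():
        return False
    return len(user_id) <= MAX_ID_LENGTH


if __name__ == "__main__":
    for candidate in ["76E5F445-1993", "", None, "x" * 101]:
        print(repr(candidate)[:20], is_valid_user_id(candidate))
```

Running a check like this at the point where the internal ID is read means an empty value is rejected before it can reach the platform.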
Prior to beginning uploads for a new second-party data provider, you must send us, via email, a cohort taxonomy that describes the cohort IDs you will be sending in your data import files. We are able to receive taxonomy files in either Excel (xlsx) or CSV format.
A taxonomy file should include the following information about each cohort:
- ID: A unique identifier for the cohort. This can be any alphanumeric string. It will never be displayed in the UI as it's only used by our platform. The best practice here would be to have a sequence (ie.
- Name: The display name for the cohort. This will show in the Permutive dashboard. You may choose to organise your cohorts into categories, in which case category levels should be delimited by a hyphen. You will be able to update this value.
- Description: A description of the cohort. This will show in the Permutive dashboard.
- CPM: The CPM for the cohort. This will be displayed in the dashboard, but it has no effect on how the cohort functions. You can leave it as '0' if you do not need to see the value in the dashboard.
- Lifetime: The expiry time for this cohort. At the moment we don't use this value to determine validity; a default of 60 days is used instead. However, we still require this field for consistency with our third-party taxonomies. We plan to support custom Lifetimes in the future.
You can update the taxonomy at any time in the future, adding, removing or modifying cohorts as required. Please always send us the full current taxonomy, not just the changes you've made.
We'll continue our subscribers' data scenario. Let's assume that you have the following data on each user, most of them optional:
- The country a user lives in
- Their income bracket
- Their declared gender
- Subscription type (required value)
- List of interests
You would model this as follows, with each possible value as a separate cohort:
| ID | Name | Description | CPM (USD) | Lifetime (days) |
|---|---|---|---|---|
| 0001 | Country - France | Users living in France | 0 | 45 |
| 0002 | Country - Spain | Users living in Spain | 0 | 45 |
| 0003 | Country - Germany | Users living in Germany | 0 | 45 |
| 0004 | Country - India | Users living in India | 0 | 45 |
| 0005 | Country - China | Users living in China | 0 | 45 |
| 0006 | Income < $20,000 | Having an income of less than $20,000 | 0 | 60 |
| 0007 | Income $20,000 - $40,000 | Having an income of between $20,000 and $40,000 | 0 | 60 |
| 0008 | Income > $40,000 | Having an income of over $40,000 | 0 | 60 |
| 0009 | Gender - Female | People that identify as Female | 0 | 30 |
| 0010 | Gender - Male | People that identify as Male | 0 | 30 |
| 0011 | Subscriber - Free | Non-paying subscribers | 0 | 30 |
| 0012 | Subscriber - Premium | Paying subscribers | 0 | 30 |
| 0013 | Interest - Cars | Those who are interested in cars | 0 | 30 |
| 0014 | Interest - Travel | Those who are interested in travelling | 0 | 30 |
| 0015 | Interest - Musicals | Those who are interested in musicals | 0 | 30 |
You can find the example taxonomy file attached at the bottom of the page.
Each row in your file should describe a list of second-party cohorts for a specific user. The row must be in the following format:
<USER ID><SPACE><SEGMENT IDS AS CSV>
For example, an import of four cohorts against a user ID would appear as a single row in the data file, with a comma-separated list of cohort IDs:
Note: Cohort updates for an individual user are incremental. This means that if a user is already a member of other second-party cohorts from the same provider, as a result of a previous upload, Permutive will append the new list of cohorts to the existing cohorts. User IDs and cohort IDs must not contain spaces. Every line in your input should be terminated with a new line and there should be no enclosing quotes (or any other wrapping characters).
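As an illustration of the row format and the constraints in the note above, the sketch below builds one newline-terminated row from a user ID and a list of cohort IDs. The function name is hypothetical, and rejecting commas inside a cohort ID is an assumption made here because a comma would be indistinguishable from the list separator.

```python
def format_row(user_id, cohort_ids):
    """Build '<USER ID> <ID1>,<ID2>,...' terminated by a newline."""
    if not user_id or any(ch.isspace() for ch in user_id):
        raise ValueError("user ID must be non-empty and contain no whitespace")
    if not cohort_ids:
        raise ValueError("at least one cohort ID is required")
    for cohort_id in cohort_ids:
        if not cohort_id or any(ch.isspace() for ch in cohort_id):
            raise ValueError(f"invalid cohort ID: {cohort_id!r}")
        # Assumption: a comma inside an ID would break the CSV list.
        if "," in cohort_id:
            raise ValueError(f"cohort ID contains a comma: {cohort_id!r}")
    # No quotes or other wrapping characters are added.
    return f"{user_id} {','.join(cohort_ids)}\n"


print(repr(format_row("76E5F445-1993", ["0002", "0007", "0012"])))
# prints '76E5F445-1993 0002,0007,0012\n'
```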
In our subscribers' data scenario, we have already modelled our data and are ready to prepare the file for upload.
We start with the following data in our database:
- User 1 - subscriber-id: 76E5F445-1993, a premium subscriber from Spain with Income $20,000 - $40,000
- User 2 - subscriber-id: 5E824DCF-2C6D, a male user, free account
- User 3 - subscriber-id: 69E0985B-50C0, a female user from China, premium account, interested in cars and travelling
- User 4 - subscriber-id: 2DABE6C1-07DD, a male user from France, premium account, interested in cars, with Income > $40,000
This would be translated into the following structure, ready for the upload to the Permutive GCS:
Please note that lines 3-5 describe the same user - they will be treated the same as:
You could also format each line to have a user id and just one cohort if this format is easiest for you.
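The mapping from the four sample subscribers to file rows can be sketched in Python using the cohort IDs from the taxonomy table. The row order and the order of cohort IDs within a row are illustrative choices, not requirements stated in this guide, and the helper names are hypothetical. The sketch also shows that the combined layout and the one-cohort-per-line layout carry the same memberships.

```python
# Cohort IDs per subscriber, looked up in the taxonomy table of this guide.
USERS = {
    "76E5F445-1993": ["0002", "0007", "0012"],                  # Spain, income band, premium
    "5E824DCF-2C6D": ["0010", "0011"],                          # male, free
    "69E0985B-50C0": ["0005", "0009", "0012", "0013", "0014"],  # China, female, premium, cars, travel
    "2DABE6C1-07DD": ["0001", "0008", "0010", "0012", "0013"],  # France, income, male, premium, cars
}


def combined_rows(users):
    """One row per user with all cohort IDs comma-separated."""
    return [f"{uid} {','.join(ids)}" for uid, ids in users.items()]


def single_cohort_rows(users):
    """One row per (user, cohort) pair."""
    return [f"{uid} {cid}" for uid, ids in users.items() for cid in ids]


def parse_rows(rows):
    """Collect the set of cohort IDs seen for each user across all rows."""
    memberships = {}
    for row in rows:
        user_id, cohort_csv = row.split(" ", 1)
        memberships.setdefault(user_id, set()).update(cohort_csv.split(","))
    return memberships


print("\n".join(combined_rows(USERS)))
# Both layouts describe the same memberships.
assert parse_rows(combined_rows(USERS)) == parse_rows(single_cohort_rows(USERS))
```

Because cohort updates are appended per user, splitting a user across several rows and listing everything on one row should lead to the same result after ingestion.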
Please email the file (or the first 100 lines if the file is larger than 5 MB) to email@example.com for format verification before uploading to the bucket.
You can find the example upload file attached at the bottom of the page.
Warning: It is important to ensure that the CPM is listed as '0' when importing second-party cohorts. There is no additional charge to import second-party data. However, if there is any number larger than zero listed for the CPM, you will be charged for that amount as part of third-party billing.
All data should be uploaded under a 2p subdirectory in your bucket:
You will need a data provider ID for each second-party data partner you import data from. Please liaise with our support team to ensure your second-party data IDs are set up prior to beginning data uploads and to ensure you know the alias type for your uploads.
The exact URL will be provided after bucket and data provider creation.
For efficiency, data files should be uploaded in compressed format. Please gzip compress your files prior to upload. Post compression, file names should end with a .gz extension.
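For programmatic uploads, compression can be done with Python's standard gzip module. This is a minimal sketch; the function name and the example file name are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(source: Path) -> Path:
    """Write a gzip-compressed copy of source next to it, with '.gz' appended."""
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams the data, so large files are fine
    return target


if __name__ == "__main__":
    # Hypothetical file name; replace with the path of your prepared data file.
    print(gzip_file(Path("cohorts.txt")))
```

The uncompressed source file is left in place, so it can still be sent for format verification if needed.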
To ensure seamless data ingestion, we impose some system limits for each file upload. If you think you will need to exceed these limits, please let us know.
User IDs per file
Total daily data volume
Please be advised that the turnaround time for this task will be between 2 and 3 weeks.
Creating Cohorts with Second-party Data
Once our support team has confirmed the cohorts are available:
1. Navigate to the 'Audience' > 'Custom Cohorts' tab in the Permutive dashboard
2. Select '+ Add Cohort'
4. Set up any first-party rules
5. Choose '+OR/+AND' and then 'Second party'
6. Search for and select the relevant second-party cohort
7. Save the cohort
If you have any questions, please contact customer support by emailing firstname.lastname@example.org or chat to the Customer Operations Team via the LiveChat icon in the bottom right corner of your screen.