{{appName}}

Install

          ## clone it
          $ git clone https://github.com/forcedotcom/Analytics-Cloud-Dataset-Utils.git
          $ cd Analytics-Cloud-Dataset-Utils
          ## build it
          $ mvn package
          ## run it
          $ cd target

        

Data Prep

Data Prep - contd


           $ cat Oppty-1.csv
              ID,Name,Amount,Closedate,Stage
              006f4000002uWRFAA2,Dickenson Mobile Generators,15000,05/15/17,Qualification
              006f4000002uWRGAA2,United Oil Office Portable Generators,125000,05/03/17,Negotiation/Review
              006f4000002uWRHAA2,Express Logistics Standby Generator,220000,03/29/17,Closed Won
              006f4000002uWRIAA2,GenePoint Standby Generator,85000,05/06/17,Closed Won
              006f4000002uWRJAA2,Grand Hotels Kitchen Generator,15000,03/24/17,Id. Decision Makers
              006f4000002uWRKAA2,United Oil Refinery Generators,270000,06/17/17,Proposal/Price Quote
              006f4000002uWRLAA2,United Oil SLA,120000,06/24/17,Closed Won
              006f4000002uWRMAA2,Grand Hotels Guest Portable Generators,250000,07/02/17,Value Proposition
              006f4000002uWRNAA2,Edge Emergency Generator,75000,07/02/17,Closed Won
              006f4000002uWROAA2,University of AZ Portable Generators,50000,04/02/17,Closed Won
              006f4000002uWRPAA2,Pyramid Emergency Generators,100000,05/11/17,Prospecting
              006f4000002uWRQAA2,Express Logistics Portable Truck Generators,80000,03/27/17,Value Proposition
              006f4000002uWRRAA2,GenePoint Lab Generators,60000,06/26/17,Id. Decision Makers
              ...
         

Loading CSV file into EA

Error file


$ cat Oppty-1_err.csv
Error,ID,Name,Amount,Closedate,Stage
Amount is not a valid number value {null} : null,006f4000002E1k5AAC,Stardollar-7000,null,09/30/17,Qualification



        

Generated Schema File

Lenses

Lenses - Contd.

Dataset Recipe

Dataset Recipe - Contd

Column Profile

Recipe to Dataset

Dataflow Recipe Completion Report

Create Dataset with dataflow

Sample Workflow ELT JSON file

Sample Workflow ELT Editor

Sample Workflow ELT Editor - Contd.

Simple Workflow With Extracts, Augument and Filter

Uploading and Running the Dataflow

Simple filter Transformation

edgemart Transformation

  • Gives dataflow access to an existing, registered dataset, which can contain Salesforce data, external data, or a combination of the two.
  • Example: Compare the final-sales-amount against the opportunity-amount to determine if heavy discounts were offered to close deals

edgemart Transformation

SAQL - Salesforce Analytics Query Language

  • Used to access data in Analytics datasets
  • Used in Lenses, Dashboards and Explorer to gather data for visualizations

SAQL how to edit?

SAQL how to edit? - contd.

SAQL how to edit? - contd..

SAQL Reference

Analytics Studio Menu Structure

  • Apps
    • Dashboards
    • Lenses
    • Datasets
    • App Details - Created, LastModified

Exploring a Dashboard

Define SFDC Extract Data Flow

Define SFDC Extract Data Flow - Contd.

Define SFDC Extract Data Flow - Multiple

Define SFDC Extract Data Flow - Multiple - Contd.

Define SFDC Extract Data Flow - Multiple - Contd.

Define Dataset Augment Data Flow

Define Dataset Augment Data Flow - Contd.

Download All XMD Json Files

Download All XMD Json Files - Contd.

Detect file encoding

Available Datasetutil Actions:
1. Load CSV
2. Define SFDC Extract Data Flow
3. Define Dataset Augment Data Flow
4. Download All XMD Json Files
5. Upload User XMD Json File
6. Detect file encoding
7. Fetch CSV Upload Error Report

Enter Action number (0  = Exit): 6
Enter inputFile: /tmp/output.csv

*******************************************************************************
File {/tmp/output.csv} has Charset {ISO-8859-2}, Confidence: {45%}
*******************************************************************************
  

Append Transformation

Combines records from multiple datasets into a single dataset

Append Use Case

  • PS (process status) data ps_all needs to be updated frequently (daily) by the input ps_inc
  • Inputs:
    1. ps_all.csv - startup data
    2. ps_d1.csv - day-1 data
    3. ps_d2.csv - day-2 data
  • Operation needed:
    1. append(ps_all, ps_d1) => ps_all
    2. append(ps_all, ps_d2) => ps_all
    3. ...

Steps

  1. Load CSV: ps_all.csv into dataset ps_all
  2. Load CSV: ps_d1.csv into dataset ps_d1
  3. Schedule the Dataflow to run right after daily Load of csv (ps_dn.csv) is completed.

Datafiles

Load ps_all.csv

Load ps_d1.csv

Run Dataflow

Dataflow file

Status - After first run

Datasets - after first run

Next day - Load ps_d2.csv

Run Dataflow - Daily Run after ps_d2.csv load

After Run Dataflow - Datasets updated

Append in single step via command line


$ java -jar datasetutils-37.0.4-SNAPSHOT.jar --server false --action load --dataset ps_all --operation Append  --inputFile file.csv
    

Dataset before append

Dataset after append

Upsert Operation

  • First operation Overwrite

  java -jar datasetutils-37.0.4-SNAPSHOT.jar --server false --action load --dataset fruits \
              --operation Overwrite --u username \
              --inputFile /Users/mchinnappan/Downloads/fruits.csv

# fruits.csv
Id,Name,Qty
1,Apple,100
2,Mango,22
3,Peach,11
4,Apricot,78
5,Cherry,45
6,Banana,23
7,Jackfriut,22
8,Pear,62
9,Plum,21
# make sure that for Name column in fruits_schema.json has:
  "isUniqueId" : true,
# set

Upsert - 1

# fruits_update.csv
Id,Name,Qty
31,Strawberry,96
# make sure that for Name column in fruits_update_schema.json has:
  "isUniqueId" : true,
# set
# command
java -jar datasetutils-37.0.4-SNAPSHOT.jar --server false --action load --dataset fruits \
      --operation Upsert --u username \
      --inputFile /Users/mchinnappan/Downloads/fruits_update.csv
  

Upsert - 1 - contd.

Upsert - 2

# fruits_update.csv
Id,Name,Qty
31,Strawberry,201
99,Orange,10
# command
java -jar datasetutils-37.0.4-SNAPSHOT.jar --server false --action load --dataset fruits \
      --operation Upsert --u username \
      --inputFile /Users/mchinnappan/Downloads/fruits_update.csv
  

Upsert - 2 - contd.

Transformations

  • sfdcDigest
  • append
  • augment
  • computeExpression
  • computeRelative
  • delta
  • dim2mea
  • edgemart
  • filter
  • flatten
  • update
  • sfdcRegister

Download

Download datasetutils-37.0.5-SNAPSHOT.jar

References