DataWeave Analytics Library

Collection of DataWeave functions for data analysis

The DataWeave Analytics library provides a collection of simple functions
for analyzing datasets.
DataWeave supports data transformations across multiple formats and can
analyze datasets quickly. This library fills a gap by adding some
commonly used analysis functions.

Overview

The CSVSummary mapping in analytics::summaries consumes a CSV file and produces
a summary of the values in each of its columns.

For example, assume the following CSV input:

id,time,age
1,234,18
2,333,24
3,108,15
4,444,44

The CSVSummary mapping produces the following JSON output from the CSV input:

{
  "id": {
    "mean": 2.5,
    "modes": [
      "1",
      "2",
      "3",
      "4"
    ],
    "median": 2.5,
    "stdev": 1.118033988749894848204586834365638,
    "quartiles": [
      1.5,
      2.5,
      3.5
    ]
  },
  "time": {
    "mean": 279.75,
    "modes": [
      "234",
      "333",
      "108",
      "444"
    ],
    "median": 283.5,
    "stdev": 123.8999092009352523399554085511010,
    "quartiles": [
      171,
      283.5,
      388.5
    ]
  },
  "age": {
    "mean": 25.25,
    "modes": [
      "18",
      "24",
      "15",
      "44"
    ],
    "median": 21,
    "stdev": 11.29988937998952271159312403049506,
    "quartiles": [
      16.5,
      21,
      34
    ]
  }
}

Notice that the result provides a summary consisting of the mean, the modes, the median,
the standard deviation, and the quartiles of the values of each column in the CSV file.
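
To make the figures above easier to verify, the following Python sketch recomputes them independently of the library. It is not the library's implementation. The function names (summarize, csv_summary) are hypothetical, and the choices of population standard deviation and median-of-halves quartiles are assumptions picked because they reproduce the sample output shown above.

```python
import csv
import io
import math
from collections import Counter

CSV_TEXT = """id,time,age
1,234,18
2,333,24
3,108,15
4,444,44
"""


def median(sorted_values):
    """Median of an already sorted, non-empty list of numbers."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def summarize(raw_values):
    """Summarize one column; assumes at least two numeric values."""
    numbers = sorted(float(v) for v in raw_values)
    n = len(numbers)
    mean = sum(numbers) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    stdev = math.sqrt(sum((x - mean) ** 2 for x in numbers) / n)
    # Modes are the raw string values with the highest frequency, in input order.
    counts = Counter(raw_values)
    top = max(counts.values())
    modes = [value for value, count in counts.items() if count == top]
    # Assumption: quartiles are the medians of the lower half, the whole
    # column, and the upper half (the middle value is excluded when n is odd).
    half = n // 2
    lower, upper = numbers[:half], numbers[n - half:]
    return {
        "mean": mean,
        "modes": modes,
        "median": median(numbers),
        "stdev": stdev,
        "quartiles": [median(lower), median(numbers), median(upper)],
    }


def csv_summary(text):
    """Build a per-column summary from CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return {col: summarize([row[col] for row in rows]) for col in rows[0]}


summary = csv_summary(CSV_TEXT)
print(summary["age"]["quartiles"])  # [16.5, 21.0, 34.0]
```

Because every value in the sample appears exactly once, each column reports all four of its values as modes, which is why the modes arrays in the output above list the whole column.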

Contributions Welcome

Contributions to this project can be made through Pull Requests and Issues on the
GitHub Repository.

Before creating a pull request, review the following:

When you submit your pull request, you are asked to sign a contributor license agreement (CLA) if we don't have one on file for you.


DataWeave Version

This library requires DataWeave version 2.4 or higher.

Modules

Name        Description
Statistics  This module provides basic functionality for analyzing datasets of numeric and non-numeric values.
Common      This module defines common functions for all summaries.

Mappings

Name         Description
CSVSummary   Transform a CSV file into a summary of some important statistics.
JsonSummary  Transform a JSON file into a summary of some important statistics.

Type: DataWeave Library
Organization: MuleSoft
Published by: MuleSoft Organization
Published on: Mar 23, 2022

Asset versions for 1.0.x

Version
1.0.1
1.0.0