
Use the Statistical Tests Function

The Statistical Tests Function helps you determine how well a set of data matches a given distribution, especially a normal distribution. It does this by analyzing the data's characteristics and calculating the corresponding test statistics and p-values.

Understanding Different Types of Statistical Tests

You can use the following types of statistical tests.

Jarque-Bera Test

The Jarque-Bera test measures how far a dataset's skewness and kurtosis are from those of a normal distribution (skewness 0, kurtosis 3). The larger the test value, the stronger the evidence of non-normality.

Formula: JB = (n/6) (skewness^2 + (1/4)(kurtosis-3)^2)
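The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the processor's implementation: the function name jarque_bera is hypothetical, the moments use the divide-by-n convention, and the p-value assumes the usual asymptotic chi-square reference distribution with 2 degrees of freedom.

```python
import math


def jarque_bera(data):
    """Return (JB statistic, approximate p-value) for a sequence of numbers."""
    n = len(data)
    mean = sum(data) / n
    # Central moments, population (divide-by-n) convention.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = (n / 6) * (skewness ** 2 + 0.25 * (kurtosis - 3) ** 2)
    # Assumption: asymptotic chi-square(2) distribution, whose survival
    # function is exp(-x / 2). This is only approximate for small samples.
    p_value = math.exp(-jb / 2)
    return jb, p_value


jb, p = jarque_bera([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"JB = {jb:.4f}, p = {p:.4f}")
```

A small p-value (for example, below 0.05) would suggest that the data is unlikely to come from a normal distribution.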

Cramer-Von Mises Test

This test assesses the fit of a data sample to a specified cumulative distribution by summing the squared differences between the empirical and the specified cumulative distribution functions. A low p-value indicates a poor fit.

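Formula: cramerVonMises = (1/(12*n)) + sum(((2*i - 1)/(2*n) - Phi(Z_i))^2), where the sum runs over i = 1 to n, Z_i is the i-th smallest standardized value, and Phi is the standard normal cumulative distribution function.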
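As a rough sketch of the Cramer-Von Mises formula, the following Python computes the statistic with the standard library only. It assumes that each Z value is a data point standardized with the sample mean and sample standard deviation and then sorted in ascending order; the guide does not state how the processor standardizes the data, and the function name cramer_von_mises is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def cramer_von_mises(data):
    """Return the Cramer-von Mises statistic against a fitted normal."""
    n = len(data)
    mu, sigma = mean(data), stdev(data)
    # Assumption: standardize with the sample mean and standard deviation.
    z = sorted((x - mu) / sigma for x in data)
    phi = NormalDist().cdf  # standard normal CDF
    total = sum(
        ((2 * i - 1) / (2 * n) - phi(z_i)) ** 2
        for i, z_i in enumerate(z, start=1)
    )
    return 1 / (12 * n) + total


# Normal quantiles: an idealized sample that follows a normal shape closely.
near_normal = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
# Two tight clusters: clearly not normal.
two_clusters = [0.0] * 10 + [10.0] * 10

print(cramer_von_mises(near_normal))
print(cramer_von_mises(two_clusters))
```

The clustered sample should produce the larger statistic of the two, which corresponds to a poorer fit.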

Anderson-Darling Test

The Anderson-Darling test evaluates whether the data conforms to a specified distribution. Compared with the Cramer-Von Mises test, it gives more weight to the tails of the distribution, which makes it more sensitive to extreme values.

Formula: A^2 = -n - (1/n) * sum((2*i - 1) * log(Phi(Z_i)) + (2*(n - i) + 1) * log(1 - Phi(Z_i))), where the sum runs over i = 1 to n and Z_i is the i-th smallest standardized value.
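The Anderson-Darling statistic can be sketched the same way. This version pairs the weight 2*(n - i) + 1 with log(1 - Phi(Z_i)), which is algebraically equivalent to the common form that pairs 2*i - 1 with the mirrored index n + 1 - i. As before, standardizing with the sample mean and standard deviation is an assumption, and anderson_darling is a hypothetical name.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling(data):
    """Return the Anderson-Darling A^2 statistic against a fitted normal."""
    n = len(data)
    mu, sigma = mean(data), stdev(data)
    # Assumption: standardize with the sample mean and standard deviation.
    z = sorted((x - mu) / sigma for x in data)
    phi = NormalDist().cdf
    total = 0.0
    for i, z_i in enumerate(z, start=1):
        p = phi(z_i)
        # math.log raises ValueError if p rounds to exactly 0 or 1, which
        # can happen for very extreme values; this sketch does not guard it.
        total += (2 * i - 1) * math.log(p)
        total += (2 * (n - i) + 1) * math.log(1 - p)
    return -n - total / n


near_normal = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
two_clusters = [0.0] * 10 + [10.0] * 10

print(anderson_darling(near_normal))
print(anderson_darling(two_clusters))
```

Because of the logarithms, values far out in either tail contribute heavily to the sum, which is where the test's tail sensitivity comes from.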

D'Agostino Pearson Test

This test combines skewness and kurtosis into a single statistic to examine whether the shape of a dataset's distribution aligns with a normal distribution. It looks for deviations in symmetry (skewness) and in the peakedness and tail weight of the distribution (kurtosis).

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test, particularly the Lilliefors modification, evaluates the normality of a dataset by comparing the empirical distribution function with the cumulative normal distribution.

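Formula: K = max(D+, D-) * sqrt(n), where D+ and D- are the largest distances by which the empirical distribution function lies above and below the cumulative normal distribution.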
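A minimal sketch of the Kolmogorov-Smirnov statistic follows. Because the mean and standard deviation are estimated from the sample itself, this corresponds to the Lilliefors setting; computing the matching p-value requires Lilliefors tables or simulation and is not attempted here. The function name lilliefors_statistic is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(data):
    """Return (D, K) where D = max(D+, D-) and K = D * sqrt(n)."""
    n = len(data)
    mu, sigma = mean(data), stdev(data)
    # Parameters are estimated from the sample (the Lilliefors setting).
    z = sorted((x - mu) / sigma for x in data)
    phi = NormalDist().cdf
    # D+: largest amount the empirical CDF rises above the normal CDF.
    d_plus = max(i / n - phi(z_i) for i, z_i in enumerate(z, start=1))
    # D-: largest amount the empirical CDF falls below the normal CDF.
    d_minus = max(phi(z_i) - (i - 1) / n for i, z_i in enumerate(z, start=1))
    d = max(d_plus, d_minus)
    return d, d * math.sqrt(n)


near_normal = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
two_clusters = [0.0] * 10 + [10.0] * 10

print(lilliefors_statistic(near_normal))
print(lilliefors_statistic(two_clusters))
```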

User Scenario

Review the following scenario for the Statistical Tests function. Then, you will simulate PLC data and calculate the corresponding test values for the collected data.

In a chemical manufacturing plant, quality control engineers use the Statistical Tests Function to ensure that the mixture ratios of raw materials are consistent with the required standards for product batches. By analyzing characteristics such as consistency and concentration, the tests determine whether the batch data deviates from a normal distribution, which is critical for product quality.

Step 1: Add a Device

Follow the steps to Connect a Device and configure the following parameters:

  • Device Type: Simulator
  • Driver Name: Generator
  • Enable Alias Topics: Select the checkbox.

Step 2: Add Tags

After connecting the device, add the following tag. See Add Tags to learn more.

Tag 1: input1

  • Name: Select S - Random value generator
  • Value Type: Select float64
  • Polling Interval: Enter 1
  • Tag Name: Enter input1
  • Min_value: Enter 101
  • Max_value: Enter 199
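To experiment with the tests outside Litmus Edge, you can mimic this tag offline. The sketch below assumes the random value generator draws values uniformly between Min_value and Max_value; the guide does not state the generator's distribution, and the function name simulate_input1 and its seed parameter are hypothetical.

```python
import random


def simulate_input1(count, min_value=101.0, max_value=199.0, seed=None):
    """Return `count` float samples between min_value and max_value."""
    rng = random.Random(seed)  # a seed makes the run reproducible
    # Assumption: values are uniformly distributed over the range.
    return [rng.uniform(min_value, max_value) for _ in range(count)]


samples = simulate_input1(100, seed=42)
print(min(samples), max(samples))
```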

Step 3: Create Analytics Flows

You can now create the analytics flows using data from the device and tag you previously created.

To create an analytics flow with the Statistical Function processor:

  1. In Litmus Edge, navigate to Analytics.
  2. On the analytics canvas, click Add processor. The Create a processor dialog box displays.

    The Add processor option
    
  3. Select DataHub Subscribe.
  4. In the Topic field, click the Search icon, select the device you previously created, and then select the alias topic for the input1 tag.

    Create a Processor dialog box
    
  5. Click Save.
  6. Click Add processor again and select the Statistical Function processor. The Edit a Processor dialog box appears.
    • Window Size: Enter the size of the window of values over which the statistical tests are applied. For this example, enter 100.
    • Select the Jarque Bera, Anderson Darling, Cramer Von Mises, D Agostino Pearson, and Kolmogorov Smirnov checkboxes.
    • Click Save.

      Edit a Processor dialog box
      
  7. Connect the DataHub Subscribe processor (tag: input1) to the Statistical Function processor with a wire and use the events connection.
  8. On the analytics canvas, click Save. The configured analytics flows should look like the following:
Completed Flows Canvas


Step 4: View Output of Processor

Click the View icon in the Statistical Function processor to view the output values.

The tests suggest the sample data may not fit a normal distribution, as indicated by the p-values and test statistics provided.
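To see why values spread evenly across a fixed range tend to fail a normality test, the following sketch applies the Jarque-Bera formula over a window of 100 values. It is an illustration under assumptions, not the processor's behavior: the window is treated as a sliding window, the p-value uses the asymptotic chi-square approximation, the 0.05 threshold is a conventional choice, and an evenly spaced grid stands in for data spread uniformly between 101 and 199. All function names are hypothetical.

```python
import math
from collections import deque


def jarque_bera(window):
    """Return (JB, approximate p-value) for the values in `window`."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((x - mean) ** 2 for x in window) / n
    if m2 == 0:
        raise ValueError("window is constant; skewness is undefined")
    m3 = sum((x - mean) ** 3 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    jb = (n / 6) * ((m3 / m2 ** 1.5) ** 2 + 0.25 * (m4 / m2 ** 2 - 3) ** 2)
    # Assumption: asymptotic chi-square(2) p-value, exp(-JB / 2).
    return jb, math.exp(-jb / 2)


def windowed_results(values, window_size=100, alpha=0.05):
    """Yield (JB, p, rejects_normality) once the window is full."""
    window = deque(maxlen=window_size)  # assumption: sliding window
    for value in values:
        window.append(value)
        if len(window) == window_size:
            jb, p = jarque_bera(window)
            yield jb, p, p < alpha


# Deterministic stand-in for values spread evenly between 101 and 199.
grid = [101 + 98 * (i + 0.5) / 100 for i in range(100)]
for jb, p, rejects in windowed_results(grid):
    print(f"JB = {jb:.3f}, p = {p:.4f}, rejects normality: {rejects}")
```

For this grid the p-value lands just under 0.05, so with randomly generated values and a window of 100, individual windows may fall on either side of the threshold. The other tests can be more or less sensitive to this kind of flat shape.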

Output of Statistical Test Function