Analyzing numerical data and validating identification numbers
With a few recipe steps, you can create custom validation checks to verify values. In the Summary area, you can review the count of Outlier values. In the data quality bar at the top of a column, you can review the valid (green), mismatched (red), and missing (black) values. Through the Column Details panel, you can review statistical information about individual columns. In Cloud Dataprep, an outlier is defined as any value that is more than 4 standard deviations from the mean of the column's values.
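The outlier rule above can be illustrated outside the product with a short Python sketch. This is not Cloud Dataprep's implementation: the function name `find_outliers` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant is applied.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=4.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mu) > threshold * sigma]


# 99 ordinary readings and one extreme reading.
readings = [10] * 99 + [1000]
print(find_outliers(readings))
```

Because the threshold is measured in standard deviations of the same column, a single extreme value in a very small column inflates the standard deviation enough that it may not be flagged.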
Before you start building a recipe for your dataset, consider creating a visual profile of your source data.
This visual profile information is part of the record for the job, which remains in the system after execution.
For more information, see Profile Your Source Data.
The following transform evaluates to … You can add more permitted characters inside the square brackets. Cloud Dataprep provides easy methods for identifying whether cells are missing values or contain null values.
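The transform itself does not appear in the text above, so the following Python sketch only illustrates the general idea of validating an identifier against a set of permitted characters in square brackets, and of telling missing values apart from null values. The pattern, the function name `classify`, and the category labels are all assumptions chosen for illustration; they are not Cloud Dataprep syntax.

```python
import re

# Hypothetical rule: an identifier may contain only digits and hyphens.
# Add more permitted characters inside the square brackets as needed.
ID_PATTERN = re.compile(r"[0-9\-]+")


def classify(value):
    """Label a cell as 'null', 'missing', 'valid', or 'mismatched'."""
    if value is None:
        return "null"
    if value.strip() == "":
        return "missing"
    # fullmatch requires every character to belong to the permitted set.
    return "valid" if ID_PATTERN.fullmatch(value) else "mismatched"


for cell in ["123-45-6789", "123 45 6789", "", None]:
    print(repr(cell), classify(cell))
```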
You can also create lookups to identify if values are not represented in your dataset.
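As a rough illustration of the lookup idea, the sketch below checks a column of values against a reference list and reports the values that have no match. It is plain Python rather than a Cloud Dataprep lookup, and the function name and sample data are hypothetical.

```python
def values_without_match(values, reference):
    """Return the distinct values that do not appear in the reference list."""
    known = set(reference)  # set membership keeps each check fast
    return sorted({v for v in values if v not in known})


# Hypothetical data: state codes found in a dataset versus a reference list.
reference_codes = ["CA", "NY", "TX"]
column_values = ["CA", "ZZ", "NY", "QQ", "ZZ"]
print(values_without_match(column_values, reference_codes))
```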
You can create custom transforms to evaluate standard deviations from the mean for a specific column. If you need to test a column of values against two fixed values, you can use the following transform. If the value in … transform allows you to remove identical rows.
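The two checks mentioned here, testing values against two fixed bounds and removing identical rows, can be sketched in Python as follows. The transforms themselves are not shown in the text, so the function names are hypothetical, and treating both bounds as inclusive is an assumption.

```python
def in_range(value, low, high):
    """Return True if value lies between the two fixed bounds (assumed inclusive)."""
    return low <= value <= high


def remove_identical_rows(rows):
    """Keep the first occurrence of each row and drop exact repeats."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # rows are compared on every column
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical rows of (id, score).
rows = [[1, 50], [2, 120], [1, 50], [3, 75]]
deduplicated = remove_identical_rows(rows)
flags = [in_range(score, 0, 100) for _, score in deduplicated]
print(deduplicated)
print(flags)
```

Rows count as identical only when every column matches, so two rows that share an id but differ in another column are both kept.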