This is an early access preview, but you are encouraged to try it out, file bug reports, and add features.
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. We are happy to receive feedback and contributions. Deequ depends on ...
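To make the "unit tests for data" idea concrete, here is a minimal sketch of a check suite. It assumes Deequ and Spark are already on the classpath and that a local Spark session is acceptable. The `Item` case class, the column names, and the sample rows are made up for illustration. Treat it as a starting point rather than a reference implementation.

```scala
import org.apache.spark.sql.SparkSession
import com.amazon.deequ.VerificationSuite
import com.amazon.deequ.checks.{Check, CheckLevel, CheckStatus}

object DataUnitTestSketch {
  // Hypothetical record type, used only for this illustration.
  case class Item(id: Long, productName: String, numViews: Long)

  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark session is good enough for a small experiment.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; the third row has a missing productName.
    val data = Seq(
      Item(1L, "Thingy A", 0L),
      Item(2L, "Thingy B", 5L),
      Item(3L, null, 12L)
    ).toDF()

    // Each method on Check declares one constraint the data should satisfy.
    val result = VerificationSuite()
      .onData(data)
      .addCheck(
        Check(CheckLevel.Error, "basic integrity checks")
          .hasSize(_ == 3)            // expect exactly three rows
          .isComplete("id")           // no nulls in id
          .isUnique("id")             // no duplicate ids
          .isComplete("productName")  // violated by the null in the third row
          .isNonNegative("numViews")) // no negative view counts
      .run()

    // The overall status is Success only if every constraint held.
    if (result.status == CheckStatus.Success) {
      println("All data checks passed")
    } else {
      println("Some data checks failed")
    }

    spark.stop()
  }
}
```

The checks are declared up front and evaluated together when `run()` is called, so the suite reads much like a list of assertions in an ordinary unit test.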