Quality Report
Semantic Audit Pipelines
A Quality Report is a validation checkpoint generated after each key pipeline step that identifies data quality issues and validation failures.
In the semantic audit pipeline, quality reports run at three points: after clustering (Silhouette Score, cluster sizes, outliers), after embeddings (value distribution, zero vectors, anomalies), and after content gaps (analysis completeness, graph coverage). Each report acts as a quality gate that provides a go/no-go decision between pipeline steps, so problems are caught before they corrupt downstream processes.
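As a rough illustration of such a gate, the sketch below checks a batch of embeddings for zero vectors and non-finite values and returns a go/no-go result. The names `QualityReport` and `check_embeddings` and the `max_zero_fraction` threshold are hypothetical, chosen for this example only; a real pipeline would likely check more (for instance the value distribution) and use its own thresholds.

```python
import math
from dataclasses import dataclass, field


@dataclass
class QualityReport:
    """Hypothetical report object: a step name plus the issues found."""
    step: str
    issues: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Go/no-go decision: the gate passes only if no issue was recorded.
        return not self.issues


def check_embeddings(vectors, max_zero_fraction=0.0):
    """Flag empty output, all-zero vectors, and NaN/inf values.

    max_zero_fraction is an assumed threshold, not taken from any standard.
    """
    report = QualityReport(step="embeddings")
    if not vectors:
        report.issues.append("no embeddings produced")
        return report

    zero = sum(1 for v in vectors if all(x == 0.0 for x in v))
    non_finite = sum(1 for v in vectors if any(not math.isfinite(x) for x in v))

    if zero / len(vectors) > max_zero_fraction:
        report.issues.append(f"{zero} zero vector(s)")
    if non_finite:
        report.issues.append(f"{non_finite} vector(s) contain NaN or inf")
    return report


report = check_embeddings([[0.1, 0.2], [0.0, 0.0], [0.3, float("nan")]])
if not report.passed:
    # A real pipeline would stop here instead of running the next step.
    print(f"NO-GO after {report.step}: {report.issues}")
```

Returning a report object rather than raising immediately lets the caller log every issue found at a step before deciding whether to halt.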
For example, a quality report after clustering shows a Silhouette Score of 0.35, which indicates poor cluster separation. The report also flags 2 clusters with 100+ elements, suggesting over-broad groupings. Its recommendation: increase k and re-cluster. Without this checkpoint, these issues would surface only in the final report, forcing a complete pipeline rerun. This follows the fail-fast principle: detect problems as early as possible. In practice, teams add automated quality checks after each pipeline step to prevent costly downstream failures.
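The clustering example above can be sketched as a small gate function. This is a minimal sketch under stated assumptions: the function name `clustering_gate`, the `min_silhouette` cutoff of 0.5, and the sample labels are illustrative and not part of the pipeline described here. Only the 0.35 score and the 100-element size limit come from the example. The Silhouette Score is passed in as a number; it could be computed elsewhere, for instance with scikit-learn's `silhouette_score`.

```python
from collections import Counter


def clustering_gate(silhouette, labels, min_silhouette=0.5, max_cluster_size=100):
    """Return a go/no-go decision for a clustering step.

    min_silhouette is an assumed threshold; max_cluster_size mirrors the
    "100+ elements" limit from the example.
    """
    sizes = Counter(labels)
    # Clusters with max_cluster_size or more members count as over-broad.
    oversized = sorted(c for c, n in sizes.items() if n >= max_cluster_size)

    issues = []
    if silhouette < min_silhouette:
        issues.append(f"Silhouette Score {silhouette:.2f} below {min_silhouette:.2f}")
    if oversized:
        issues.append(f"{len(oversized)} cluster(s) with {max_cluster_size}+ elements")

    recommendation = "increase k and re-cluster" if issues else "proceed"
    return {"go": not issues, "issues": issues, "recommendation": recommendation}


# Hypothetical labels: two over-broad clusters and one small one.
labels = [0] * 120 + [1] * 105 + [2] * 40
result = clustering_gate(0.35, labels)
if not result["go"]:
    print("NO-GO:", "; ".join(result["issues"]), "->", result["recommendation"])
```

Running the gate immediately after clustering surfaces both issues before any later step consumes the cluster assignments, which is the fail-fast behaviour the example describes.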