Data Dictionary
Bring Clarity to Your Data Landscape
A well-maintained data dictionary is the foundation of data understanding and governance.
Yavantha’s Data Dictionary provides a structured, accessible, and dynamic inventory of your data elements, bridging the gap between raw data and business meaning.
Ingestion & Source Diversity
Yavantha’s ingestion engine connects to a wide variety of structured, semi-structured, and unstructured data sources across cloud, on-premises, and hybrid environments:
- Source Types: Relational databases (Oracle, PostgreSQL, SQL Server, Snowflake, Databricks, ...), NoSQL (MongoDB, Cassandra, ...), file systems, cloud storage (S3, Azure Blob, GCS), BI tools (Power BI, Tableau), and more
- Formats Supported: CSV, Excel, JSON, XML, Parquet, Avro, Delta Lake, Iceberg, and others
- Connection Methods: JDBC/ODBC, REST APIs, native connectors
- Deployment Flexibility: Ingestion can be scheduled or triggered manually, and scoped to specific schemas, tables, or file paths
Smart Profiling & Enrichment
During ingestion, Yavantha performs advanced analysis to enrich metadata and reveal deeper insights:
- Smart Profiling: Automatically computes statistics such as mean, min/max, frequency, and percentiles, and detects patterns, distributions, and functional dependencies
- AI-Powered Auto-Tagging: Machine learning algorithms suggest classifications based on technical and semantic similarities across sources
- Automated Change Detection: Tracks structural changes between ingestion runs—new columns, type changes, deletions—and alerts users to updates
- Suggested Relationships: Identifies and recommends links between data elements, glossary terms, and quality rules
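To make the profiling and change-detection ideas above more concrete, here is a minimal Python sketch. It is an illustration only and does not represent Yavantha's implementation or API: the function names `profile_column` and `diff_schemas`, the dictionary layouts, and the sample data are all hypothetical, and a schema snapshot is assumed to be a simple mapping of column name to type name.

```python
from collections import Counter
from statistics import mean


def profile_column(values):
    """Compute basic statistics for one numeric column (None marks a missing value).

    Hypothetical helper for illustration; not part of any Yavantha API.
    """
    present = [v for v in values if v is not None]
    most_common = Counter(present).most_common(1)
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": mean(present) if present else None,
        "top_value": most_common[0][0] if most_common else None,
    }


def diff_schemas(previous, current):
    """Compare two {column_name: type_name} snapshots from successive ingestion runs.

    Hypothetical helper for illustration; the snapshot format is an assumption.
    """
    shared = set(previous) & set(current)
    return {
        "added": sorted(set(current) - set(previous)),
        "removed": sorted(set(previous) - set(current)),
        "type_changed": sorted(n for n in shared if previous[n] != current[n]),
    }


if __name__ == "__main__":
    # Sample data invented for this example.
    print(profile_column([10, 20, 20, None, 50]))

    before = {"id": "int", "email": "varchar", "age": "int"}
    after = {"id": "bigint", "email": "varchar", "signup_date": "date"}
    print(diff_schemas(before, after))
```

In this sketch, a change report with non-empty `added`, `removed`, or `type_changed` lists is the kind of result that could drive a user alert after an ingestion run.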
Integrate with Governance and Quality
The Data Dictionary is deeply integrated with the rest of Yavantha:
- Data Catalog: Enrich and document elements collaboratively
- Business Glossary: Add semantic context and definitions
- Data Quality: Associate rules and monitor anomalies
- Governance: Assign roles and manage access
Enterprise-Ready and Scalable
Built to support complex data environments:
- Scale across thousands of systems and millions of fields
- Deploy quickly via Docker containers
- Integrate with SSO, IAM, and data catalog platforms