
📚 Data Catalog Simulator

WIA-DATA-007 Interactive Experience


🔍 Data Discovery & Search

Search and discover all of your organization's data assets.

Total Datasets: 1,247
Tables: 8,934
Columns: 124,567
Documented: 89%

customers

Main table that stores customer information.

Database: prod_db | Schema: public | Type: Table

Owner: data-team@company.com | Updated: 2025-12-20

orders

Order information and transaction data.

Database: prod_db | Schema: sales | Type: Table

Owner: sales-team@company.com | Updated: 2025-12-25

user_behavior_logs

User behavior logs and event data.

Database: analytics_db | Schema: events | Type: Table

Owner: analytics-team@company.com | Updated: 2025-12-26

product_catalog

Product catalog and inventory information.

Database: prod_db | Schema: inventory | Type: Table

Owner: product-team@company.com | Updated: 2025-12-24
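
To illustrate how a search over entries like these might work, here is a minimal sketch. The `CatalogEntry` type and the `search` function are hypothetical names introduced for illustration, and matching by case-insensitive substring on name, description, and schema is an assumption; the simulator's actual search logic is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record holding the fields shown for each dataset."""
    name: str
    description: str
    database: str
    schema: str
    owner: str


# Entries copied from the listing above.
CATALOG = [
    CatalogEntry("customers", "Main table that stores customer information",
                 "prod_db", "public", "data-team@company.com"),
    CatalogEntry("orders", "Order information and transaction data",
                 "prod_db", "sales", "sales-team@company.com"),
    CatalogEntry("user_behavior_logs", "User behavior logs and event data",
                 "analytics_db", "events", "analytics-team@company.com"),
    CatalogEntry("product_catalog", "Product catalog and inventory information",
                 "prod_db", "inventory", "product-team@company.com"),
]


def search(catalog, query):
    """Return entries whose name, description, or schema contains the query.

    Assumption: plain case-insensitive substring matching, no ranking.
    """
    needle = query.strip().lower()
    if not needle:
        return []
    return [
        entry for entry in catalog
        if needle in f"{entry.name} {entry.description} {entry.schema}".lower()
    ]


print([entry.name for entry in search(CATALOG, "inventory")])
```

A real catalog would more likely use a full-text index with ranking; substring matching is used here only to keep the sketch self-contained.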

📋 Metadata Management

Manage and document the metadata of your data assets.

Table: customers

🔗 Data Lineage Tracking

Track how data flows and how it is transformed along the way.

Lineage: customer_analytics_report

Source: customers (PostgreSQL)
→ ETL: customer_transform (Apache Spark)
→ Staging: stg_customers (Data Warehouse)
→ Analytics: customer_analytics (BigQuery)
→ Report: customer_report (Tableau)
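
A lineage chain like this one can be modeled as a directed graph and walked backwards to answer "what feeds this report?". The sketch below is illustrative only: the edge list is copied from the chain above, while `LINEAGE_EDGES`, `build_upstream_index`, and `upstream_of` are hypothetical names, not part of any catalog API.

```python
from collections import defaultdict

# Edges taken from the lineage chain above, as (upstream, downstream) pairs.
LINEAGE_EDGES = [
    ("customers", "customer_transform"),
    ("customer_transform", "stg_customers"),
    ("stg_customers", "customer_analytics"),
    ("customer_analytics", "customer_report"),
]


def build_upstream_index(edges):
    """Map each node to the list of nodes that feed it directly."""
    parents = defaultdict(list)
    for upstream, downstream in edges:
        parents[downstream].append(upstream)
    return parents


def upstream_of(node, edges):
    """Return every ancestor of `node`, nearest first (breadth-first walk)."""
    parents = build_upstream_index(edges)
    seen, order, frontier = set(), [], [node]
    while frontier:
        next_frontier = []
        for current in frontier:
            for parent in parents.get(current, []):
                if parent not in seen:  # guard against revisiting shared ancestors
                    seen.add(parent)
                    order.append(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return order


print(upstream_of("customer_report", LINEAGE_EDGES))
# ['customer_analytics', 'stg_customers', 'customer_transform', 'customers']
```

The same index built in the opposite direction would support impact analysis, that is, listing everything downstream of a source table.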

Transformation Details

Step 1: Data Extraction

Source: PostgreSQL customers table

Tool: Apache Airflow DAG

Schedule: Daily at 02:00 UTC

Records: ~500K rows

Step 2: Data Transformation

Process: customer_transform Spark job

Operations: Cleansing, standardization, enrichment

Business Rules: Customer segmentation, LTV calculation

Output: Cleaned and enriched customer data

Step 3: Data Loading

Destination: BigQuery customer_analytics table

Load Type: Incremental (SCD Type 2)

Partition: By registration_date

Update Frequency: Daily
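
Step 3 names an incremental SCD Type 2 load, which preserves history by closing the current version of a changed row and appending a new one. The sketch below shows that idea in plain Python rather than in BigQuery. The `apply_scd2` function and the `valid_from`, `valid_to`, and `is_current` column names are assumptions made for illustration; the pipeline's real schema is not given.

```python
from datetime import date


def apply_scd2(history, incoming, load_date, key="customer_id"):
    """Apply one incremental batch to a history table, SCD Type 2 style.

    history:  list of dicts that carry valid_from / valid_to / is_current
              (hypothetical column names)
    incoming: list of dicts holding the key plus the tracked attributes
    Returns a new list; the input history is left untouched.
    """
    result = [dict(row) for row in history]
    current = {row[key]: row for row in result if row["is_current"]}

    for new in incoming:
        old = current.get(new[key])
        if old is not None:
            # Skip rows whose tracked attributes did not change.
            if all(old.get(col) == val for col, val in new.items()):
                continue
            # Close the previous version instead of overwriting it.
            old["valid_to"] = load_date
            old["is_current"] = False
        version = {**new, "valid_from": load_date, "valid_to": None, "is_current": True}
        result.append(version)
        current[new[key]] = version
    return result


history = [{"customer_id": 1, "customer_segment": "New",
            "valid_from": date(2025, 1, 1), "valid_to": None, "is_current": True}]
batch = [{"customer_id": 1, "customer_segment": "Regular"},
         {"customer_id": 2, "customer_segment": "New"}]

for row in apply_scd2(history, batch, date(2025, 12, 20)):
    print(row)
```

Applying the same batch a second time adds no rows, because unchanged records are skipped. In a warehouse this logic would typically be expressed as a single merge statement rather than row-by-row code.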

📚 Business Glossary

Standardize the definitions and meanings of business terms.

Customer Lifetime Value (LTV)

Definition: The total revenue a customer is expected to generate over the course of their relationship with the company

Formula: (average order value) × (annual purchase frequency) × (customer retention period)

Business purpose: Measuring marketing ROI, customer segmentation

Related data: customers.lifetime_value, orders.total_amount

Owner: Marketing Analytics Team
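
As a worked example of the LTV formula, the sketch below multiplies the three factors directly. The function and parameter names are hypothetical, and the retention period is assumed to be measured in years so that it is consistent with an annual purchase frequency.

```python
def lifetime_value(avg_order_value, purchases_per_year, retention_years):
    """LTV = average order value x annual purchase frequency x retention period.

    Assumption: retention is expressed in years to match the annual frequency.
    """
    if min(avg_order_value, purchases_per_year, retention_years) < 0:
        raise ValueError("all inputs must be non-negative")
    return avg_order_value * purchases_per_year * retention_years


# A customer who spends $80 per order, orders 5 times a year, and stays 3 years.
print(lifetime_value(80.0, 5, 3))  # 1200.0
```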

Customer Segment

Definition: A group of customers classified according to specific criteria

Categories:

  • VIP: LTV > $10,000, activity score > 90
  • Regular: LTV $1,000-$10,000, activity score 50-90
  • New: Within 3 months of registration
  • At-Risk: No purchases in the last 6 months

Business purpose: Targeted marketing, personalized service delivery

Related data: customers.customer_segment

Owner: Customer Success Team
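
The segment categories can be expressed as a small rule function. The glossary does not say which rule wins when several apply, or how boundary values are treated, so the sketch below makes its own assumptions: rules are checked in the order New, At-Risk, VIP, Regular; "3 months" and "6 months" are approximated as 90 and 180 days; the Regular ranges are inclusive; and customers matching no rule get `None`. All names are hypothetical.

```python
from datetime import date, timedelta

NEW_WINDOW = timedelta(days=90)       # assumption: "3 months" is about 90 days
AT_RISK_WINDOW = timedelta(days=180)  # assumption: "6 months" is about 180 days


def customer_segment(ltv, activity_score, registered_on, last_purchase_on, today):
    """Return the segment name, or None when no listed category applies.

    Rule order (New, At-Risk, VIP, Regular) is an assumption; the glossary
    does not define precedence.
    """
    if today - registered_on <= NEW_WINDOW:
        return "New"
    if last_purchase_on is None or today - last_purchase_on > AT_RISK_WINDOW:
        return "At-Risk"
    if ltv > 10_000 and activity_score > 90:
        return "VIP"
    if 1_000 <= ltv <= 10_000 and 50 <= activity_score <= 90:
        return "Regular"
    return None


today = date(2025, 12, 26)
print(customer_segment(15_000, 95, date(2024, 1, 1), date(2025, 12, 1), today))  # VIP
print(customer_segment(5_000, 70, date(2024, 1, 1), date(2025, 12, 1), today))   # Regular
```

Combinations the categories leave open, such as a high LTV with a low activity score, fall through to `None` here rather than being forced into a segment.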

Active User

Definition: A user who used the service within a specified period

Measurement criteria:

  • DAU (Daily Active User): Logged in at least once during the day
  • WAU (Weekly Active User): Logged in at least once during the week
  • MAU (Monthly Active User): Logged in at least once during the month

Business purpose: Service health monitoring, growth metrics

Related data: user_behavior_logs.event_type, users.last_login_date

Owner: Product Analytics Team
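
The three active-user metrics differ only in window length, so one function can cover them. The sketch below assumes rolling windows of 1, 7, and 30 days ending on the reporting date; the glossary does not say whether calendar or rolling periods are intended. The `active_users` name and the `(user_id, login_date)` tuple shape are hypothetical.

```python
from datetime import date, timedelta


def active_users(logins, as_of, window_days):
    """Count distinct users with at least one login in the window.

    logins: iterable of (user_id, login_date) pairs (hypothetical shape)
    The window covers `window_days` days ending on `as_of`, inclusive.
    """
    start = as_of - timedelta(days=window_days - 1)
    return len({user for user, day in logins if start <= day <= as_of})


logins = [
    ("u1", date(2025, 12, 26)),
    ("u1", date(2025, 12, 26)),  # repeat logins count once
    ("u2", date(2025, 12, 22)),
    ("u3", date(2025, 12, 1)),
]
as_of = date(2025, 12, 26)

print("DAU:", active_users(logins, as_of, 1))   # DAU: 1
print("WAU:", active_users(logins, as_of, 7))   # WAU: 2
print("MAU:", active_users(logins, as_of, 30))  # MAU: 3
```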

Conversion Rate

Definition: The percentage of users who completed a specific action

Formula: (number of conversions / total number of visitors) × 100

Examples:

  • Sign-up conversion rate: Completed sign-ups / landing page visits
  • Purchase conversion rate: Completed purchases / product page views

Business purpose: Funnel optimization, A/B test evaluation

Related data: events.conversion_events, orders.completed

Owner: Growth Team
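
A short worked example of the conversion rate formula follows. The function name and the sample counts are made up for illustration; rejecting a visitor count of zero is a design choice in this sketch, not something the glossary specifies.

```python
def conversion_rate(conversions, visitors):
    """Return conversions as a percentage of visitors."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    if conversions < 0:
        raise ValueError("conversions must be non-negative")
    return 100.0 * conversions / visitors


# Hypothetical numbers: 45 completed sign-ups out of 1,500 landing page visits.
print(conversion_rate(45, 1500))  # 3.0
```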

🏷️ Data Classification & Tagging

Classify and tag your data so it can be managed consistently.

Classification Categories

PII (Personal Info): 2,145
Financial: 1,876
Business: 4,523
Operational: 3,390

Auto-Classification Rules

Rule 1: PII Detection

Pattern: email, phone, ssn, address, name

Classification: PII (Personally Identifiable Information)

Security Level: High

Actions: Encryption required, Access logging enabled

Rule 2: Financial Data

Pattern: amount, price, revenue, cost, payment

Classification: Financial

Security Level: High

Actions: Audit trail required, Restricted access

Rule 3: Temporal Data

Pattern: date, timestamp, time, created_at, updated_at

Classification: Temporal

Security Level: Low

Actions: Time-based partitioning recommended
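
The three rules above can be sketched as a pattern table applied to column names. How the patterns are matched is not stated, so this example assumes a pattern matches when it appears as a whole underscore-delimited part of the column name, and that the first matching rule wins. `RULES` and `classify_column` are hypothetical names.

```python
import re

# (classification, security level, patterns), copied from Rules 1 to 3 above.
RULES = [
    ("PII", "High", ("email", "phone", "ssn", "address", "name")),
    ("Financial", "High", ("amount", "price", "revenue", "cost", "payment")),
    ("Temporal", "Low", ("date", "timestamp", "time", "created_at", "updated_at")),
]


def classify_column(column_name):
    """Return (classification, security_level) for the first matching rule.

    Assumption: a pattern must match a whole underscore-delimited part of the
    name, so "first_name" matches "name" but "username" does not.
    """
    lowered = column_name.lower()
    for label, level, patterns in RULES:
        for pattern in patterns:
            if re.search(rf"(?:^|_){re.escape(pattern)}(?:$|_)", lowered):
                return label, level
    return None


for column in ("customer_email", "total_amount", "registration_date", "lifetime_value"):
    print(column, "->", classify_column(column))
```

Because rules are evaluated in order, a column such as `payment_date` is classified as Financial rather than Temporal. A production classifier would likely also sample column values instead of relying on names alone.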

Tag Management

customers table

Domain: Customer | Team: CRM | Criticality: High | Compliance: GDPR | Compliance: CCPA | Quality: Gold

orders table

Domain: Sales | Team: Sales Ops | Criticality: High | Compliance: PCI-DSS | Quality: Gold

user_behavior_logs

Domain: Analytics | Team: Data Science | Criticality: Medium | Compliance: GDPR | Quality: Silver

product_catalog

Domain: Product | Team: Product Ops | Criticality: Medium | Quality: Gold