📚 Data Catalog Simulator

WIA-DATA-007 Interactive Experience

🔍 Data Discovery
📋 Metadata Management
🔗 Data Lineage
📚 Business Glossary
🏷️ Classification

🔍 Data Discovery & Search

Search and discover all of your organization's data assets.

Total Datasets: 1,247 | Tables: 8,934 | Columns: 124,567 | Documented: 89%

customers

Main table that stores customer information.

Database: prod_db | Schema: public | Type: Table

Owner: data-team@company.com | Updated: 2025-12-20

orders

Order information and transaction data.

Database: prod_db | Schema: sales | Type: Table

Owner: sales-team@company.com | Updated: 2025-12-25

user_behavior_logs

User behavior logs and event data.

Database: analytics_db | Schema: events | Type: Table

Owner: analytics-team@company.com | Updated: 2025-12-26

product_catalog

Product catalog and inventory information.

Database: prod_db | Schema: inventory | Type: Table

Owner: product-team@company.com | Updated: 2025-12-24

📋 Metadata Management

Manage and document the metadata of your data assets.

Table: customers

🔗 Data Lineage Tracking

Track how data flows and how it is transformed along the way.

Lineage: customer_analytics_report

Source: customers (PostgreSQL)
→ ETL: customer_transform (Apache Spark)
→ Staging: stg_customers (Data Warehouse)
→ Analytics: customer_analytics (BigQuery)
→ Report: customer_report (Tableau)
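
The lineage chain above can be treated as a small directed graph, which makes impact analysis ("what breaks if this table changes?") a simple traversal. The following is a minimal sketch, not the simulator's implementation: the asset names come from the lineage above, while `LINEAGE_EDGES`, `downstream_of`, and `upstream_of` are hypothetical names chosen for illustration.

```python
from collections import deque

# Edges taken from the lineage shown above: (producer, consumer).
# The list layout and the function names are illustrative assumptions.
LINEAGE_EDGES = [
    ("customers", "customer_transform"),
    ("customer_transform", "stg_customers"),
    ("stg_customers", "customer_analytics"),
    ("customer_analytics", "customer_report"),
]


def _reachable(start, edges):
    """Breadth-first walk returning every node reachable from `start`."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def downstream_of(asset):
    """Assets affected by a change to `asset`, nearest first."""
    return _reachable(asset, LINEAGE_EDGES)


def upstream_of(asset):
    """Assets that `asset` depends on, nearest first."""
    reversed_edges = [(dst, src) for src, dst in LINEAGE_EDGES]
    return _reachable(asset, reversed_edges)


print(downstream_of("stg_customers"))
print(upstream_of("customer_report"))
```

Storing edges rather than a fixed chain keeps the sketch usable if an asset later gains more than one consumer.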

Transformation Details

Step 1: Data Extraction

Source: PostgreSQL customers table

Tool: Apache Airflow DAG

Schedule: Daily at 02:00 UTC

Records: ~500K rows

Step 2: Data Transformation

Process: customer_transform Spark job

Operations: Cleansing, standardization, enrichment

Business Rules: Customer segmentation, LTV calculation

Output: Cleaned and enriched customer data

Step 3: Data Loading

Destination: BigQuery customer_analytics table

Load Type: Incremental (SCD Type 2)

Partition: By registration_date

Update Frequency: Daily
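
Step 3 names an incremental SCD Type 2 load. As a rough illustration of that technique only, the sketch below keeps every version of a row in memory: when a tracked value changes, the current version is closed and a new one is appended. The column names `valid_from`, `valid_to`, and `is_current`, the `customer_id` key, and the function `apply_scd2` are assumptions for this example; the actual BigQuery load logic is not described in the source.

```python
from datetime import date

# Bookkeeping columns added by this sketch (assumed names).
META_COLUMNS = ("valid_from", "valid_to", "is_current")


def apply_scd2(history, incoming, load_date, key="customer_id"):
    """Merge `incoming` rows into `history`, preserving earlier versions."""
    current = {row[key]: row for row in history if row["is_current"]}
    for new_row in incoming:
        old = current.get(new_row[key])
        if old is not None:
            old_values = {k: v for k, v in old.items() if k not in META_COLUMNS}
            if old_values == new_row:
                continue  # unchanged: keep the existing current version
            # Changed: close the old version instead of overwriting it.
            old["valid_to"] = load_date
            old["is_current"] = False
        version = {**new_row, "valid_from": load_date,
                   "valid_to": None, "is_current": True}
        history.append(version)
        current[new_row[key]] = version
    return history


history = []
apply_scd2(history, [{"customer_id": 1, "customer_segment": "New"}], date(2025, 12, 1))
apply_scd2(history, [{"customer_id": 1, "customer_segment": "Regular"}], date(2025, 12, 2))
for row in history:
    print(row)
```

After the two loads, the history holds two rows for customer 1: the closed "New" version and the current "Regular" version.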

📚 Business Glossary

Standardize the definitions and meanings of business terms.

Customer Lifetime Value (LTV)

Definition: The total revenue a customer is expected to generate over the course of their relationship with the company

Formula: (average order value) × (annual purchase frequency) × (customer retention period)

Business purpose: Measuring marketing investment ROI, customer segmentation

Related data: customers.lifetime_value, orders.total_amount

Owner: Marketing Analytics Team
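
The LTV formula above is a straight product of three inputs. A minimal sketch follows, assuming the retention period is expressed in years so that it matches the annual purchase frequency; the function and parameter names are hypothetical and the figures in the usage line are made up for illustration.

```python
def lifetime_value(avg_order_value, purchases_per_year, retention_years):
    """LTV = average order value x annual purchase frequency x retention period.

    Assumption: retention is given in years to match the annual frequency.
    """
    if min(avg_order_value, purchases_per_year, retention_years) < 0:
        raise ValueError("inputs must be non-negative")
    return avg_order_value * purchases_per_year * retention_years


# Illustrative inputs only: $80 average order, 5 orders a year, 3 years.
print(lifetime_value(80, 5, 3))  # 1200
```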

Customer Segment

Definition: A group of customers classified according to specific criteria

Categories:

  • VIP: LTV > $10,000, activity score > 90
  • Regular: LTV $1,000-$10,000, activity score 50-90
  • New: Within 3 months of registration
  • At-Risk: No purchases in the last 6 months

Business purpose: Targeted marketing, personalized service delivery

Related data: customers.customer_segment

Owner: Customer Success Team
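
The four categories above overlap (a new customer can also have a high LTV), and the glossary does not say which rule wins. The sketch below is one possible reading, with its assumptions spelled out: rules are checked in the order New, At-Risk, VIP, Regular; 3 and 6 months are approximated as 90 and 180 days; the Regular bounds are inclusive; and customers matching no rule get `None`. The function name and parameters are hypothetical.

```python
from datetime import date, timedelta


def customer_segment(ltv, activity_score, registered_on, last_purchase_on, today):
    """Assign a segment using the glossary thresholds.

    Assumed precedence: New > At-Risk > VIP > Regular.
    Assumed month lengths: 3 months = 90 days, 6 months = 180 days.
    """
    if today - registered_on <= timedelta(days=90):
        return "New"
    if last_purchase_on is None or today - last_purchase_on > timedelta(days=180):
        return "At-Risk"
    if ltv > 10_000 and activity_score > 90:
        return "VIP"
    if 1_000 <= ltv <= 10_000 and 50 <= activity_score <= 90:
        return "Regular"
    return None  # no glossary category applies


today = date(2025, 12, 26)
print(customer_segment(12_500, 95, date(2023, 1, 10), date(2025, 12, 1), today))
print(customer_segment(3_000, 70, date(2024, 6, 1), date(2025, 3, 1), today))
```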

Active User

Definition: A user who used the service within a specified period

Measurement criteria:

  • DAU (Daily Active User): Logged in at least once during the day
  • WAU (Weekly Active User): Logged in at least once during the week
  • MAU (Monthly Active User): Logged in at least once during the month

Business purpose: Service health monitoring, growth metrics

Related data: user_behavior_logs.event_type, users.last_login_date

Owner: Product Analytics Team
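
DAU, WAU, and MAU differ only in the length of the window, so one function can compute all three from a list of login events. This is a sketch under assumptions: logins arrive as `(user_id, login_date)` pairs, windows are rolling and end on the reporting date, and a month is taken as 30 days. None of these details are specified in the glossary, and `active_users` is a hypothetical name.

```python
from datetime import date, timedelta


def active_users(logins, as_of, window_days):
    """Count distinct users with at least one login in the rolling window.

    The window covers `window_days` days and ends on `as_of` (inclusive).
    """
    start = as_of - timedelta(days=window_days - 1)
    return len({user for user, day in logins if start <= day <= as_of})


# Made-up login events for illustration.
logins = [
    ("u1", date(2025, 12, 26)),
    ("u1", date(2025, 12, 26)),  # repeat logins count once
    ("u2", date(2025, 12, 22)),
    ("u3", date(2025, 12, 1)),
]
as_of = date(2025, 12, 26)
print("DAU:", active_users(logins, as_of, 1))   # DAU: 1
print("WAU:", active_users(logins, as_of, 7))   # WAU: 2
print("MAU:", active_users(logins, as_of, 30))  # MAU: 3
```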

Conversion Rate

Definition: The percentage of users who completed a specific action

Formula: (number of conversions / total number of visitors) × 100

Examples:

  • Sign-up conversion rate: completed sign-ups / landing page visits
  • Purchase conversion rate: completed purchases / product page views

Business purpose: Funnel optimization, A/B test evaluation

Related data: events.conversion_events, orders.completed

Owner: Growth Team
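
The conversion rate formula above is a simple ratio expressed as a percentage. A minimal sketch, with a hypothetical function name and made-up counts in the usage line; the guard against a zero denominator is an addition not mentioned in the glossary.

```python
def conversion_rate(conversions, visitors):
    """Return (conversions / visitors) x 100 as a percentage."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return 100 * conversions / visitors


# Illustrative counts: 25 completed sign-ups from 1,000 landing page visits.
print(conversion_rate(25, 1000))  # 2.5
```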

🏷️ Data Classification & Tagging

Classify and tag your data to keep it manageable.

Classification Categories

PII (Personal Info): 2,145 | Financial: 1,876 | Business: 4,523 | Operational: 3,390

Auto-Classification Rules

Rule 1: PII Detection

Pattern: email, phone, ssn, address, name

Classification: PII (Personally Identifiable Information)

Security Level: High

Actions: Encryption required, Access logging enabled

Rule 2: Financial Data

Pattern: amount, price, revenue, cost, payment

Classification: Financial

Security Level: High

Actions: Audit trail required, Restricted access

Rule 3: Temporal Data

Pattern: date, timestamp, time, created_at, updated_at

Classification: Temporal

Security Level: Low

Actions: Time-based partitioning recommended
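
The three rules above can be read as keyword matching on column names. The sketch below is one way to do that, not the simulator's actual matcher: it assumes case-insensitive matching on whole underscore-separated words (so `lifetime_value` does not trigger the `time` pattern) and that the first matching rule wins. `RULES` and `classify_column` are hypothetical names; the patterns, classifications, and security levels are copied from the rules above.

```python
# (classification, security level, patterns) in the order listed above.
RULES = [
    ("PII", "High", ("email", "phone", "ssn", "address", "name")),
    ("Financial", "High", ("amount", "price", "revenue", "cost", "payment")),
    ("Temporal", "Low", ("date", "timestamp", "time", "created_at", "updated_at")),
]


def classify_column(column_name):
    """Return (classification, security_level), or None if no rule matches.

    Assumptions: whole-word matching on underscore boundaries,
    case-insensitive, first matching rule wins.
    """
    padded = f"_{column_name.lower()}_"
    for classification, level, patterns in RULES:
        if any(f"_{pattern}_" in padded for pattern in patterns):
            return classification, level
    return None


for column in ("email", "total_amount", "registration_date", "lifetime_value"):
    print(column, "->", classify_column(column))
```

Matching on word boundaries rather than raw substrings avoids false positives such as `lifetime_value` being tagged Temporal, at the cost of missing names that do not use underscores.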

Tag Management

customers table

Domain: Customer | Team: CRM | Criticality: High | Compliance: GDPR | Compliance: CCPA | Quality: Gold

orders table

Domain: Sales | Team: Sales Ops | Criticality: High | Compliance: PCI-DSS | Quality: Gold

user_behavior_logs

Domain: Analytics | Team: Data Science | Criticality: Medium | Compliance: GDPR | Quality: Silver

product_catalog

Domain: Product | Team: Product Ops | Criticality: Medium | Quality: Gold