πŸ’— WIA Emotion AI Standard Ebook | Chapter 5 of 8


πŸ’— Chapter 5: Phase 2 - API Interface

Hongik Ingan (εΌ˜η›ŠδΊΊι–“)

"Benefit All Humanity"

A well-designed API enables developers to easily integrate emotion recognition into their applications, regardless of the underlying implementation.


5.1 API Design Principles

5.1.1 Core Principles

Principle       Implementation
RESTful         Standard HTTP methods, resource-based URLs
JSON            All requests and responses in JSON format
Versioned       API version in URL path (/v1/)
Authenticated   API keys or OAuth 2.0
Rate Limited    Rate limits published per plan and reported in response headers

5.1.2 Base URL

Production: https://api.wiastandards.com/emotion-ai/v1
Staging:    https://api-staging.wiastandards.com/emotion-ai/v1

5.2 Authentication

5.2.1 API Key Authentication

Header: X-WIA-API-Key: your_api_key_here

Example Request:
curl -X POST https://api.wiastandards.com/emotion-ai/v1/analyze/face \
  -H "X-WIA-API-Key: sk_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/face.jpg"}'

5.2.2 OAuth 2.0 (Optional)

Header: Authorization: Bearer <access_token>

Token Endpoint: POST /oauth/token
Scopes:
  - emotion:read    - Read emotion analysis results
  - emotion:analyze - Submit content for analysis
  - emotion:stream  - Real-time streaming access
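
The sketch below shows one way a client might obtain an access token in Python. The standard specifies only the token endpoint and scopes; the client_credentials grant type, the form-encoded parameter names, and the placement of the token URL under the documented base URL are illustrative assumptions.

# Illustrative sketch: obtain an OAuth 2.0 access token.
# The client_credentials grant and parameter names are assumptions;
# consult your WIA account configuration for the actual flow.
import requests

TOKEN_URL = "https://api.wiastandards.com/emotion-ai/v1/oauth/token"  # assumed location

def get_access_token(client_id: str, client_secret: str) -> str:
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",   # assumed grant type
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "emotion:read emotion:analyze",
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

# Usage: pass the token in the Authorization header.
# headers = {"Authorization": f"Bearer {get_access_token(cid, secret)}"}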

5.3 Facial Emotion Analysis API

5.3.1 Analyze Face Image

POST /v1/analyze/face

Request Body:
{
    "image_url": "https://example.com/face.jpg",
    // OR
    "image_base64": "data:image/jpeg;base64,/9j/4AAQ...",

    "options": {
        "return_action_units": true,
        "return_dimensions": true,
        "return_landmarks": false,
        "min_face_size": 50,
        "max_faces": 5
    }
}

Response (200 OK):
{
    "request_id": "req_abc123",
    "processing_time_ms": 145,
    "faces": [
        {
            "face_id": 0,
            "bbox": { "x": 120, "y": 80, "width": 200, "height": 250 },
            "emotions": {
                "primary": { "label": "happiness", "confidence": 0.87 },
                "all": [
                    { "label": "happiness", "confidence": 0.87 },
                    { "label": "neutral", "confidence": 0.08 },
                    { "label": "surprise", "confidence": 0.05 }
                ]
            },
            "dimensions": {
                "valence": 0.72,
                "arousal": 0.45
            },
            "action_units": [
                { "au": "AU6", "intensity": 0.8 },
                { "au": "AU12", "intensity": 0.9 }
            ]
        }
    ]
}
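
A minimal Python sketch of the call above, sending a local file as image_base64 and printing the primary emotion per detected face. The helper name and option values are illustrative; the request and response fields follow the schema shown.

# Minimal sketch: analyze a local image via POST /analyze/face.
import base64
import requests

BASE_URL = "https://api.wiastandards.com/emotion-ai/v1"
API_KEY = "sk_live_abc123"  # replace with your key

def analyze_face(image_path: str) -> dict:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    payload = {
        "image_base64": f"data:image/jpeg;base64,{b64}",
        "options": {"return_action_units": True, "return_dimensions": True},
    }
    resp = requests.post(
        f"{BASE_URL}/analyze/face",
        headers={"X-WIA-API-Key": API_KEY, "Content-Type": "application/json"},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

result = analyze_face("face.jpg")
for face in result["faces"]:
    primary = face["emotions"]["primary"]
    print(face["face_id"], primary["label"], primary["confidence"])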

5.3.2 Analyze Video Frame

POST /v1/analyze/face/video

Request Body:
{
    "video_url": "https://example.com/video.mp4",
    // OR
    "frames_base64": ["data:image/jpeg;base64,...", ...],

    "options": {
        "sample_rate": 5,  // Analyze every 5th frame
        "start_time_ms": 0,
        "end_time_ms": 10000,
        "track_faces": true
    }
}

Response (200 OK):
{
    "request_id": "req_video123",
    "duration_ms": 10000,
    "frame_count": 60,
    "analyzed_frames": 12,
    "timeline": [
        {
            "timestamp_ms": 0,
            "faces": [{ ... }]
        },
        {
            "timestamp_ms": 833,
            "faces": [{ ... }]
        }
    ],
    "summary": {
        "dominant_emotion": "happiness",
        "average_valence": 0.65,
        "average_arousal": 0.42,
        "emotion_transitions": 3
    }
}
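
As a post-processing illustration, the sketch below derives an emotion-transition count from a timeline like the one above. The field names follow the response schema; the counting rule (a transition whenever the primary label of the first detected face changes between analyzed frames) is an assumption about how such a summary value could be computed client-side.

# Illustrative sketch: count emotion transitions in a
# /analyze/face/video timeline (rule is an assumption).
def count_emotion_transitions(timeline: list[dict]) -> int:
    transitions = 0
    previous_label = None
    for frame in timeline:
        faces = frame.get("faces", [])
        if not faces:
            continue
        label = faces[0]["emotions"]["primary"]["label"]
        if previous_label is not None and label != previous_label:
            transitions += 1
        previous_label = label
    return transitions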

5.4 Voice Emotion Analysis API

5.4.1 Analyze Audio

POST /v1/analyze/voice

Request Body:
{
    "audio_url": "https://example.com/audio.wav",
    // OR
    "audio_base64": "data:audio/wav;base64,...",

    "options": {
        "language": "en-US",
        "return_transcript": true,
        "return_prosody": true,
        "segment_by": "utterance"
    }
}

Response (200 OK):
{
    "request_id": "req_voice456",
    "duration_ms": 5500,
    "language_detected": "en-US",
    "transcript": "I'm really excited about this opportunity!",

    "emotions": {
        "primary": { "label": "excitement", "confidence": 0.82 },
        "all": [
            { "label": "excitement", "confidence": 0.82 },
            { "label": "happiness", "confidence": 0.65 },
            { "label": "neutral", "confidence": 0.12 }
        ]
    },

    "dimensions": {
        "valence": 0.78,
        "arousal": 0.85
    },

    "prosody": {
        "pitch_mean_hz": 185.5,
        "pitch_range_hz": 120.3,
        "intensity_db": 68.2,
        "speech_rate_wpm": 145,
        "pause_ratio": 0.12
    },

    "segments": [
        {
            "start_ms": 0,
            "end_ms": 2500,
            "text": "I'm really excited",
            "emotion": "excitement",
            "confidence": 0.85
        },
        {
            "start_ms": 2500,
            "end_ms": 5500,
            "text": "about this opportunity!",
            "emotion": "happiness",
            "confidence": 0.78
        }
    ]
}
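
The following Python sketch submits an audio URL to the endpoint above and prints the emotion for each returned segment. Option values and timeout are illustrative; the fields follow the response schema.

# Minimal sketch: analyze hosted audio via POST /analyze/voice.
import requests

BASE_URL = "https://api.wiastandards.com/emotion-ai/v1"

def analyze_voice(audio_url: str, api_key: str) -> dict:
    resp = requests.post(
        f"{BASE_URL}/analyze/voice",
        headers={"X-WIA-API-Key": api_key},
        json={
            "audio_url": audio_url,
            "options": {"return_transcript": True,
                        "return_prosody": True,
                        "segment_by": "utterance"},
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()

result = analyze_voice("https://example.com/audio.wav", "sk_live_abc123")
for seg in result["segments"]:
    print(f'{seg["start_ms"]}-{seg["end_ms"]}ms: '
          f'{seg["emotion"]} ({seg["confidence"]:.2f}) "{seg["text"]}"')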

5.4.2 Real-time Voice Analysis

POST /v1/analyze/voice/stream

Request: WebSocket upgrade (see Chapter 6)

Initial Message:
{
    "type": "config",
    "sample_rate": 16000,
    "encoding": "LINEAR16",
    "language": "en-US"
}

Audio Chunks:
Binary audio data (PCM)

Response Messages:
{
    "type": "partial",
    "timestamp_ms": 1500,
    "emotion": { "label": "neutral", "confidence": 0.7 }
}

{
    "type": "final",
    "segment": {
        "start_ms": 0,
        "end_ms": 3000,
        "text": "Hello, how are you?",
        "emotion": { "label": "happiness", "confidence": 0.82 }
    }
}
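
A rough client sketch of this streaming flow is shown below, using the third-party websockets package. The wss:// URL and the way the API key is attached are assumptions for illustration only; Chapter 6 defines the streaming protocol in full.

# Rough sketch of the real-time voice stream (full protocol: Chapter 6).
# The wss:// URL and api_key query parameter are assumptions.
import asyncio
import json
import websockets

WS_URL = "wss://api.wiastandards.com/emotion-ai/v1/analyze/voice/stream"  # assumed

async def stream_audio(pcm_chunks, api_key: str):
    async with websockets.connect(f"{WS_URL}?api_key={api_key}") as ws:
        # 1. Send the configuration message.
        await ws.send(json.dumps({
            "type": "config",
            "sample_rate": 16000,
            "encoding": "LINEAR16",
            "language": "en-US",
        }))
        # 2. Send binary PCM chunks.
        for chunk in pcm_chunks:
            await ws.send(chunk)
        # 3. Read partial and final emotion messages until the server closes.
        async for message in ws:
            event = json.loads(message)
            if event["type"] == "partial":
                print("partial:", event["emotion"]["label"])
            elif event["type"] == "final":
                print("final:", event["segment"]["emotion"]["label"])

# asyncio.run(stream_audio(chunks, "sk_live_abc123"))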

5.5 Text Sentiment Analysis API

5.5.1 Analyze Text

POST /v1/analyze/text

Request Body:
{
    "text": "I absolutely love this product! Best purchase ever.",
    "language": "en",

    "options": {
        "return_aspects": true,
        "return_entities": true,
        "detect_sarcasm": true
    }
}

Response (200 OK):
{
    "request_id": "req_text789",
    "text_length": 52,
    "language": "en",

    "sentiment": {
        "polarity": 0.92,
        "subjectivity": 0.85,
        "label": "very_positive"
    },

    "emotions": {
        "primary": { "label": "happiness", "confidence": 0.91 },
        "all": [
            { "label": "happiness", "confidence": 0.91 },
            { "label": "excitement", "confidence": 0.75 },
            { "label": "satisfaction", "confidence": 0.68 }
        ]
    },

    "dimensions": {
        "valence": 0.88,
        "arousal": 0.65
    },

    "aspects": [
        {
            "aspect": "product",
            "sentiment": 0.95,
            "mentions": ["product", "purchase"]
        }
    ],

    "entities": [
        {
            "text": "product",
            "type": "PRODUCT",
            "emotion": "happiness",
            "sentiment": 0.95
        }
    ],

    "sarcasm": {
        "detected": false,
        "confidence": 0.02
    }
}
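
Once the response is parsed, the relevant fields can be summarized for display or logging. The helper below is an illustrative sketch over the schema above; the sarcasm note reflects a common interpretation (detected sarcasm may invert the apparent polarity) rather than a rule of the standard.

# Illustrative sketch: summarize a /analyze/text response.
def summarize_text_result(result: dict) -> str:
    sentiment = result["sentiment"]
    primary = result["emotions"]["primary"]
    lines = [
        f'Sentiment: {sentiment["label"]} (polarity {sentiment["polarity"]:+.2f})',
        f'Primary emotion: {primary["label"]} ({primary["confidence"]:.2f})',
    ]
    for aspect in result.get("aspects", []):
        lines.append(f'Aspect "{aspect["aspect"]}": sentiment {aspect["sentiment"]:+.2f}')
    if result.get("sarcasm", {}).get("detected"):
        lines.append("Note: sarcasm detected; stated polarity may not be literal.")
    return "\n".join(lines)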

5.5.2 Batch Text Analysis

POST /v1/analyze/text/batch

Request Body:
{
    "texts": [
        { "id": "review_1", "text": "Great product!" },
        { "id": "review_2", "text": "Terrible experience." },
        { "id": "review_3", "text": "It was okay, nothing special." }
    ],
    "language": "en"
}

Response (200 OK):
{
    "request_id": "req_batch001",
    "results": [
        {
            "id": "review_1",
            "sentiment": { "polarity": 0.85, "label": "positive" },
            "emotion": { "label": "happiness", "confidence": 0.82 }
        },
        {
            "id": "review_2",
            "sentiment": { "polarity": -0.78, "label": "negative" },
            "emotion": { "label": "anger", "confidence": 0.71 }
        },
        {
            "id": "review_3",
            "sentiment": { "polarity": 0.1, "label": "neutral" },
            "emotion": { "label": "neutral", "confidence": 0.85 }
        }
    ],
    "summary": {
        "average_sentiment": 0.06,
        "positive_count": 1,
        "negative_count": 1,
        "neutral_count": 1
    }
}
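
When a corpus is larger than a single call can comfortably carry, texts can be submitted in chunks and the per-item results concatenated. The sketch below assumes a hypothetical 100-item chunk size; the standard does not specify a batch size limit.

# Illustrative sketch: submit many texts via /analyze/text/batch in chunks.
# The 100-item chunk size is a hypothetical limit, not one defined here.
import requests

BASE_URL = "https://api.wiastandards.com/emotion-ai/v1"

def analyze_texts(texts: list[dict], api_key: str, chunk_size: int = 100) -> list[dict]:
    results = []
    for start in range(0, len(texts), chunk_size):
        chunk = texts[start:start + chunk_size]
        resp = requests.post(
            f"{BASE_URL}/analyze/text/batch",
            headers={"X-WIA-API-Key": api_key},
            json={"texts": chunk, "language": "en"},
            timeout=60,
        )
        resp.raise_for_status()
        results.extend(resp.json()["results"])
    return results

reviews = [{"id": f"review_{i}", "text": t} for i, t in
           enumerate(["Great product!", "Terrible experience."], start=1)]
print(analyze_texts(reviews, "sk_live_abc123"))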

5.6 Biosignal Analysis API

5.6.1 Analyze Biosignals

POST /v1/analyze/biosignal

Request Body:
{
    "signals": {
        "ecg": {
            "sample_rate": 256,
            "data": [0.12, 0.15, 0.18, ...],
            "unit": "mV"
        },
        "eda": {
            "sample_rate": 4,
            "data": [2.5, 2.6, 2.8, ...],
            "unit": "uS"
        }
    },
    "duration_ms": 60000,

    "options": {
        "return_hrv": true,
        "return_stress": true,
        "return_engagement": true
    }
}

Response (200 OK):
{
    "request_id": "req_bio001",
    "duration_ms": 60000,

    "heart_rate": {
        "mean_bpm": 72,
        "min_bpm": 65,
        "max_bpm": 82,
        "std_bpm": 5.2
    },

    "hrv": {
        "rmssd_ms": 42.5,
        "sdnn_ms": 55.3,
        "pnn50": 0.18,
        "lf_hf_ratio": 1.8
    },

    "eda": {
        "scl_mean": 3.2,
        "scr_count": 5,
        "scr_amplitude_mean": 0.8
    },

    "derived_states": {
        "stress_level": 0.35,
        "relaxation": 0.55,
        "engagement": 0.72,
        "arousal": 0.45
    },

    "timeline": [
        { "time_ms": 0, "stress": 0.3, "engagement": 0.7 },
        { "time_ms": 10000, "stress": 0.35, "engagement": 0.75 },
        { "time_ms": 20000, "stress": 0.4, "engagement": 0.68 }
    ]
}
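
Building the request body requires raw samples plus their sample rates. The sketch below assembles one from in-memory lists and checks that both channels cover roughly the same time window; the consistency check and tolerance are the author's suggestion, not a requirement of the standard.

# Illustrative sketch: assemble a /analyze/biosignal request body.
def build_biosignal_request(ecg: list[float], eda: list[float],
                            ecg_rate: int = 256, eda_rate: int = 4) -> dict:
    duration_ms = int(len(ecg) / ecg_rate * 1000)
    eda_duration_ms = int(len(eda) / eda_rate * 1000)
    # Suggested sanity check: both channels should span a similar window.
    if abs(duration_ms - eda_duration_ms) > 1000:
        raise ValueError("ECG and EDA cover different time windows")
    return {
        "signals": {
            "ecg": {"sample_rate": ecg_rate, "data": ecg, "unit": "mV"},
            "eda": {"sample_rate": eda_rate, "data": eda, "unit": "uS"},
        },
        "duration_ms": duration_ms,
        "options": {"return_hrv": True, "return_stress": True,
                    "return_engagement": True},
    }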

5.7 Multimodal Fusion API

5.7.1 Multimodal Analysis

POST /v1/analyze/multimodal

Request Body:
{
    "modalities": {
        "face": {
            "image_url": "https://example.com/face.jpg"
        },
        "voice": {
            "audio_url": "https://example.com/audio.wav"
        },
        "text": {
            "text": "I'm feeling great today!"
        }
    },

    "fusion": {
        "method": "weighted_average",
        "weights": {
            "face": 0.5,
            "voice": 0.3,
            "text": 0.2
        }
    }
}

Response (200 OK):
{
    "request_id": "req_multi001",

    "fused_result": {
        "emotions": {
            "primary": { "label": "happiness", "confidence": 0.88 }
        },
        "dimensions": {
            "valence": 0.75,
            "arousal": 0.52
        }
    },

    "modality_results": {
        "face": {
            "emotions": { "primary": { "label": "happiness", "confidence": 0.85 } },
            "dimensions": { "valence": 0.72, "arousal": 0.48 }
        },
        "voice": {
            "emotions": { "primary": { "label": "happiness", "confidence": 0.82 } },
            "dimensions": { "valence": 0.78, "arousal": 0.55 }
        },
        "text": {
            "emotions": { "primary": { "label": "happiness", "confidence": 0.90 } },
            "dimensions": { "valence": 0.80, "arousal": 0.60 }
        }
    },

    "fusion_details": {
        "method": "weighted_average",
        "weights_used": { "face": 0.5, "voice": 0.3, "text": 0.2 },
        "agreement_score": 0.92
    }
}
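
For the weighted_average method, the fused dimensional scores in the example can be reproduced directly from the per-modality results and the requested weights, as the short check below shows.

# Reproduce the fused valence/arousal from the example response
# using the weighted_average method and the requested weights.
weights = {"face": 0.5, "voice": 0.3, "text": 0.2}
dimensions = {
    "face":  {"valence": 0.72, "arousal": 0.48},
    "voice": {"valence": 0.78, "arousal": 0.55},
    "text":  {"valence": 0.80, "arousal": 0.60},
}

fused = {axis: sum(weights[m] * dimensions[m][axis] for m in weights)
         for axis in ("valence", "arousal")}
print(f'valence={fused["valence"]:.3f}, arousal={fused["arousal"]:.3f}')
# valence=0.754, arousal=0.525 -> reported as 0.75 / 0.52 in fused_result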

5.8 Error Handling

5.8.1 Error Response Format

{
    "error": {
        "code": "INVALID_IMAGE",
        "message": "The provided image could not be processed",
        "details": {
            "reason": "No face detected in image",
            "suggestion": "Ensure the image contains a clearly visible face"
        }
    },
    "request_id": "req_err001"
}

5.8.2 Error Codes

HTTP Status   Error Code         Description
400           INVALID_REQUEST    Malformed request body
400           INVALID_IMAGE      Image cannot be processed
400           NO_FACE_DETECTED   No face found in image
401           UNAUTHORIZED       Invalid or missing API key
403           FORBIDDEN          Insufficient permissions
429           RATE_LIMITED       Too many requests
500           INTERNAL_ERROR     Server error
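
A small client-side helper can translate this envelope into exceptions. The sketch below is illustrative; treating RATE_LIMITED and INTERNAL_ERROR as retryable is a common convention, not a rule of the standard.

# Illustrative sketch: turn the error envelope into Python exceptions.
import requests

RETRYABLE_CODES = {"RATE_LIMITED", "INTERNAL_ERROR"}  # assumed convention

class WIAAPIError(Exception):
    def __init__(self, code: str, message: str, request_id: str, retryable: bool):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.retryable = retryable

def check_response(resp: requests.Response) -> dict:
    if resp.ok:
        return resp.json()
    body = resp.json()
    error = body["error"]
    raise WIAAPIError(
        code=error["code"],
        message=error["message"],
        request_id=body.get("request_id", ""),
        retryable=error["code"] in RETRYABLE_CODES,
    )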

5.9 Rate Limits

Plan         Requests/Minute   Requests/Day
Free         10                100
Developer    60                10,000
Business     300               100,000
Enterprise   Custom            Custom
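
Clients should back off when they receive 429 RATE_LIMITED. The sketch below retries with exponential backoff and honors a Retry-After header if the server sends one; the specific rate-limit header names are an assumption, as this chapter does not enumerate them.

# Illustrative sketch: retry a request on 429 with exponential backoff.
# Honoring Retry-After is an assumption about the server's headers.
import time
import requests

def post_with_backoff(url: str, api_key: str, payload: dict,
                      max_retries: int = 5) -> requests.Response:
    delay = 1.0
    for attempt in range(max_retries + 1):
        resp = requests.post(url, headers={"X-WIA-API-Key": api_key},
                             json=payload, timeout=30)
        if resp.status_code != 429:
            return resp
        retry_after = resp.headers.get("Retry-After")
        wait = float(retry_after) if retry_after else delay
        time.sleep(wait)
        delay *= 2  # double the wait between attempts
    return resp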

5.10 Chapter Summary

Key Takeaways:

  1. RESTful Design: Standard HTTP methods, JSON format
  2. Four Modality APIs: Face, Voice, Text, Biosignal
  3. Multimodal Fusion: Combine modalities with configurable weights
  4. Comprehensive Output: Emotions, dimensions, AUs, metadata
  5. Error Handling: Clear error codes and messages
  6. Rate Limiting: Tiered plans for different use cases

Chapter 5 Complete | Approximate pages: 16

Next: Chapter 6 - Phase 3: Streaming Protocol


WIA - World Certification Industry Association

Hongik Ingan - Benefit All Humanity

https://wiastandards.com