Podcastle vs Riverside: AI Podcast Recording Tools Compared

For developers building podcast production workflows, choosing the right AI-powered recording platform impacts both production quality and automation capabilities. This comparison examines Castle and Riverside from a technical perspective, focusing on API access, integration patterns, and practical use cases for programmatic podcast creation.

Understanding the Recording Platform Landscape

Both platforms position themselves as AI-enhanced recording solutions, but their architectural approaches differ significantly. Castle emphasizes browser-based recording with AI-powered editing features, while Riverside focuses on studio-quality local recording with cloud processing. Understanding these foundational differences helps you select the platform that aligns with your technical requirements.

The core distinction lies in where audio processing occurs. Castle performs more processing in the browser, offering faster turnaround for basic edits. Riverside captures high-fidelity local audio on each participant’s machine before uploading to the cloud, resulting in higher quality but requiring more bandwidth during recording.

API Capabilities and Developer Integration

Castle API Overview

Castle provides a REST API for managing recordings, accessing transcripts, and triggering AI-powered edits. The API authentication uses API keys that you generate through the developer dashboard.

import requests

# Castle API: Start a new recording session
def create_podcastle_session(api_key, title, hosts):
    url = "https://api.castle.ai/v1/sessions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "title": title,
        "hosts": hosts,
        "ai Enhancement": True,
        "transcription": True
    }
    response = requests.post(url, json=payload, headers=headers)
    return response.json()

# Example usage
session = create_podcastle_session(
    api_key="your_api_key",
    title="Tech Developer Podcast Ep. 42",
    hosts=["speaker1@example.com", "speaker2@example.com"]
)
print(f"Recording session created: {session['join_url']}")

The API returns a join URL that you can distribute to participants. Castle handles the recording session management, including automatic transcription and AI noise removal.

Riverside API Overview

Riverside offers a more comprehensive API suite, including their standard REST API and a webhook system for event-driven workflows. Their API supports both recording management and media file retrieval.

import requests
import json

# Riverside API: Create recording session and configure AI features
def create_riverside_session(api_key, scene_id):
    url = "https://api.riverside.fm/api/v1/scenes"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "name": "Developer Podcast Session",
        "sceneId": scene_id,
        "settings": {
            "localRecording": True,
            "videoQuality": "1080p",
            "audioOnly": False,
            "aiTranscription": {
                "enabled": True,
                "language": "en"
            },
            "studioMode": {
                "enabled": True,
                "remote Participants": 4
            }
        }
    }
    response = requests.post(url, json=payload, headers=headers)
    return response.json()

# Example usage
session = create_riverside_session(
    api_key="your_riverside_key",
    scene_id="your_scene_id"
)
print(f"Riverside session ready: {session['inviteLink']}")

Audio Quality and Processing

Recording Quality Analysis

For technical implementations, audio fidelity matters significantly. Riverside’s local recording approach captures uncompressed audio (up to 48kHz/16-bit WAV) on each participant’s machine before any compression or transmission occurs. This means each speaker’s track remains independent, enabling precise post-production editing.

Castle uses WebRTC-based recording with adaptive bitrate streaming. While convenient for browser-based workflows, the compression applied during recording can introduce artifacts, particularly with unstable network connections. However, their AI-powered enhancement can compensate for some quality loss.

AI-Powered Enhancement Features

Both platforms offer AI enhancement, but their implementations differ:

Castle AI Enhancement:

Noise reduction (applied in-browser during recording)
Auto-leveling for consistent volume across speakers
Silence removal in post-production
Speaker diarization for multi-host identification

Riverside AI Enhancement:

Studio Sound (local processing, maintains quality)
AI-powered transcription with speaker labels
Automatic filler word removal
Cross-talk detection for overlapping speech

# Example: Processing Castle transcription with speaker labels
def process_castle_transcript(transcript_data):
    """Parse Castle transcript with speaker identification."""
    processed = []
    for segment in transcript_data['segments']:
        processed.append({
            'speaker': segment['speaker_id'],
            'text': segment['text'],
            'start': segment['start_time'],
            'end': segment['end_time'],
            'confidence': segment.get('confidence', 0.0)
        })
    return processed

# Example: Processing Riverside transcription with timestamps
def process_riverside_transcript(transcript_data):
    """Parse Riverside transcript with enhanced speaker labels."""
    speakers = {s['id']: s['name'] for s in transcript_data['speakers']}
    return [{
        'speaker': speakers.get(seg['speakerId'], 'Unknown'),
        'text': seg['text'],
        'timestamp': seg['timestamp'],
        'words': seg['words']  # Word-level timestamps available
    } for seg in transcript_data['transcript']]

Integration Patterns for Podcast Automation

Webhook Configuration

Both platforms support webhooks for event-driven automation. Here’s how to configure them:

# Flask webhook handler for Riverside recording events
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/webhook/riverside', methods=['POST'])
def handle_riverside_webhook():
    event = request.json
    
    if event['type'] == 'recording.completed':
        recording_id = event['data']['recordingId']
        # Trigger your post-processing workflow
        process_podcast_recording.delay(recording_id)
        
    elif event['type'] == 'transcription.ready':
        # Handle new transcription available
        transcript_url = event['data']['transcriptUrl']
        fetch_and_index_transcript.delay(transcript_url)
    
    return jsonify({'status': 'received'}), 200

Media File Retrieval

After recording completes, you need programmatic access to the media files:

# Retrieve processed media from Castle
def get_castle_media(session_id, api_key):
    url = f"https://api.castle.ai/v1/sessions/{session_id}/media"
    headers = {"Authorization": f"Bearer {api_key}"}
    response = requests.get(url, headers=headers)
    data = response.json()
    
    return {
        'video': data['video_url'],
        'audio_mixed': data['audio_mixed_url'],
        'audio_individual': data['audio_tracks'],  # Per-speaker tracks
        'transcript': data['transcript_url']
    }

# Retrieve processed media from Riverside
def get_riverside_media(session_id, api_key):
    url = f"https://api.riverside.fm/api/v1/recordings/{session_id}"
    headers = {"Authorization": f"Bearer {api_key}"}
    response = requests.get(url, headers=headers)
    data = response.json()
    
    return {
        'video': data['output']['videoUrl'],
        'audio_wav': data['output']['audioWavUrl'],
        'audio_mp3': data['output']['audioMp3Url'],
        'transcript_srt': data['output']['transcriptSrtUrl'],
        'transcript_json': data['output']['transcriptJsonUrl']
    }

Cost Considerations for Scaling

When building automated podcast workflows, understanding the pricing model impacts your architecture decisions:

Feature	Castle	Riverside
Free tier	Limited recordings/month	Limited hours/month
API access	Available on Pro plans	Available on Business plans
Storage	Included (with limits)	Pay per storage used
Transcription	Included	Pay per minute
Individual tracks	Limited on lower tiers	Full access on Business

For large-scale implementations, evaluate your monthly recording volume and whether you need individual speaker tracks for post-production. Riverside’s local recording approach provides more flexibility for complex editing workflows, while Castle offers faster setup for simpler use cases.

Decision Framework for Developers

Choose Castle if you need:

Quick browser-based recording setup
Faster AI enhancement turnaround
Simpler integration for basic workflows
Cost-effective for shorter recordings

Choose Riverside if you require:

Maximum audio quality for post-production
Individual speaker track isolation
Webhook-driven automation at scale
More granular control over recording settings

Both platforms continue evolving their AI features, so evaluate current capabilities against your specific requirements rather than relying solely on general comparisons.

Built by theluckystrike — More at zovo.one