Build Screen Sharing with Chrome Extensions and WebRTC

18 min read

Build Screen Sharing with Chrome Extensions and WebRTC

Build Screen Sharing with Chrome Extensions and WebRTC

Real-time screen sharing has become an essential feature for remote collaboration, technical support, online education, and product demonstrations. Combining Chrome Extensions with WebRTC provides a powerful platform for building sophisticated screen sharing applications that run directly in the browser. This comprehensive guide walks you through creating a complete screen sharing extension using Manifest V3, the Screen Capture API, and WebRTC peer connections.

The integration of screen capture capabilities within Chrome extensions opens up numerous possibilities for developers. Whether you are building a collaboration tool like Zoom or Google Meet, a technical support application, an online learning platform, or a product demonstration recorder, understanding how to implement screen sharing in Chrome extensions is a valuable skill that can set your projects apart.


Understanding the Screen Capture API in Chrome Extensions

Chrome provides the getDisplayMedia API as part of the Media Capture and Streams specification, which allows web applications and extensions to capture screen content in real-time. This API is the foundation upon which all screen sharing functionality is built in modern Chrome extensions.

The getDisplayMedia method works similarly to the familiar getUserMedia API used for camera and microphone access, but instead of capturing input devices, it prompts the user to select a screen, window, or browser tab to share. When invoked, Chrome displays a native picker UI that lets users choose exactly what they want to share, providing granular control over their privacy.

For Chrome extensions specifically, the Screen Capture API requires proper permissions in the manifest file and user consent through the picker dialog. Unlike traditional web applications, Chrome extensions can leverage additional capabilities and maintain persistent states that make them ideal for building dedicated screen sharing tools.

The API returns a MediaStream object that contains video tracks representing the screen content. This stream can then be processed, recorded, or transmitted to other participants using WebRTC. Understanding how to properly capture, handle, and transmit these streams is crucial for building reliable screen sharing applications.

Key Capabilities of the Screen Capture API

The Screen Capture API provides several powerful features that make it suitable for various use cases. First, it supports capturing entire displays, individual application windows, or specific browser tabs. This flexibility allows users to choose exactly what content to share, whether it is their entire desktop for a presentation or a single application window for focused demonstration.

The API also supports audio capture on Windows and Mac, enabling shared audio alongside screen content. This is particularly useful for applications that need to share system audio or application audio along with visual content, such as video tutorials or product demonstrations.

Additionally, the API provides automatic content type detection, meaning Chrome intelligently identifies whether the selected source is a screen, window, or tab and optimizes the capture accordingly. This helps ensure the best possible quality for different types of content.


Setting Up the Chrome Extension Project

Every Chrome extension begins with the manifest file, and screen sharing extensions require specific permissions to access the capture functionality. Let us set up a proper Manifest V3 configuration for our screen sharing extension.

Creating the Manifest

Create a new directory for your extension and add the manifest.json file with the necessary permissions. The key permission needed is "desktopCapture", which grants the extension access to screen capture capabilities.

{
  "manifest_version": 3,
  "name": "Screen Share Pro",
  "version": "1.0",
  "description": "Professional screen sharing extension with WebRTC support",
  "permissions": [
    "desktopCapture",
    "storage",
    "tabs"
  ],
  "host_permissions": [
    "<all_urls>"
  ],
  "action": {
    "default_popup": "popup.html",
    "default_icon": {
      "16": "icons/icon16.png",
      "48": "icons/icon48.png",
      "128": "icons/icon128.png"
    }
  },
  "background": {
    "service_worker": "background.js"
  }
}

The desktopCapture permission is the critical addition that enables screen sharing functionality. The storage permission allows you to save user preferences, while tabs permission enables accessing tab information for enhanced functionality.

Setting Up the Popup Interface

The popup serves as the main user interface for initiating screen sharing. Create a simple HTML file that provides controls for starting and stopping screen capture.

<!DOCTYPE html>
<html>
<head>
  <style>
    body {
      width: 300px;
      padding: 16px;
      font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
    }
    h2 { margin-top: 0; font-size: 18px; }
    .btn {
      display: block;
      width: 100%;
      padding: 12px;
      margin: 8px 0;
      border: none;
      border-radius: 6px;
      cursor: pointer;
      font-size: 14px;
      font-weight: 600;
      transition: background 0.2s;
    }
    .btn-primary {
      background: #4285f4;
      color: white;
    }
    .btn-primary:hover { background: #3367d6; }
    .btn-danger {
      background: #ea4335;
      color: white;
    }
    .btn-danger:hover { background: #d33426; }
    .status {
      margin-top: 12px;
      padding: 8px;
      border-radius: 4px;
      font-size: 12px;
      text-align: center;
    }
    .status.active { background: #e6f4ea; color: #137333; }
    .status.inactive { background: #fce8e6; color: #c5221f; }
    #streamInfo { display: none; margin-top: 12px; }
  </style>
</head>
<body>
  <h2>Screen Share Pro</h2>
  <button id="startShare" class="btn btn-primary">Start Screen Share</button>
  <button id="stopShare" class="btn btn-danger" style="display:none;">Stop Sharing</button>
  <div id="status" class="status inactive">Not sharing</div>
  <div id="streamInfo">
    <p><strong>Sharing:</strong> <span id="sourceType">-</span></p>
  </div>
  <script src="popup.js"></script>
</body>
</html>

This popup provides a clean interface for users to initiate and terminate screen sharing sessions. The status indicator provides visual feedback about the current sharing state.


Implementing Screen Capture Logic

The core functionality of screen sharing is implemented in the popup JavaScript file. This handles the invocation of the Screen Capture API and manages the resulting media streams.

Capturing the Screen

Create the popup.js file with the screen capture implementation:

let currentStream = null;

document.addEventListener('DOMContentLoaded', () => {
  const startBtn = document.getElementById('startShare');
  const stopBtn = document.getElementById('stopShare');
  const statusDiv = document.getElementById('status');
  const streamInfo = document.getElementById('streamInfo');
  const sourceTypeSpan = document.getElementById('sourceType');

  startBtn.addEventListener('click', async () => {
    try {
      // Request screen capture
      const stream = await navigator.mediaDevices.getDisplayMedia({
        video: {
          displaySurface: 'monitor', // prefer entire screen
          width: { ideal: 1920 },
          height: { ideal: 1080 },
          frameRate: { ideal: 30 }
        },
        audio: true // capture system audio on supported systems
      });

      currentStream = stream;
      
      // Handle user stopping via browser UI
      stream.getVideoTracks()[0].onended = () => {
        stopSharing();
      };

      // Update UI
      startBtn.style.display = 'none';
      stopBtn.style.display = 'block';
      statusDiv.textContent = 'Sharing active';
      statusDiv.className = 'status active';
      streamInfo.style.display = 'block';
      
      // Get source information
      const settings = stream.getVideoTracks()[0].getSettings();
      sourceTypeSpan.textContent = settings.displaySurface || 'Unknown';
      
      // Send stream to background for processing or WebRTC
      chrome.runtime.sendMessage({
        type: 'STREAM_STARTED',
        streamId: stream.id
      });

    } catch (error) {
      console.error('Screen capture failed:', error);
      statusDiv.textContent = 'Capture failed: ' + error.message;
      statusDiv.className = 'status inactive';
    }
  });

  stopBtn.addEventListener('click', stopSharing);

  function stopSharing() {
    if (currentStream) {
      currentStream.getTracks().forEach(track => track.stop());
      currentStream = null;
    }
    
    startBtn.style.display = 'block';
    stopBtn.style.display = 'none';
    statusDiv.textContent = 'Not sharing';
    statusDiv.className = 'status inactive';
    streamInfo.style.display = 'none';
    
    chrome.runtime.sendMessage({ type: 'STREAM_STOPPED' });
  }
});

This implementation properly handles the screen capture lifecycle, including user-initiated stopping through the browser’s built-in UI. The code also captures additional metadata about the shared source, which can be useful for analytics or UI purposes.


Building WebRTC Peer Connections

Once you have captured the screen stream, the next step is transmitting it to other participants using WebRTC. WebRTC provides real-time communication capabilities that enable low-latency streaming of screen content.

Understanding WebRTC for Screen Sharing

WebRTC (Web Real-Time Communication) is a free, open-source project that provides web browsers and mobile applications with real-time communication capabilities. For screen sharing, WebRTC handles the transmission of video and audio streams between peers with minimal latency.

The three main components of WebRTC relevant to screen sharing are the MediaStream API for handling audio and video, the RTCPeerConnection API for establishing peer-to-peer connections, and the RTCSessionDescription API for negotiating connection parameters.

When implementing WebRTC for screen sharing, you create an RTCPeerConnection, add the screen capture stream as a track, and then exchange session descriptions with the remote peer to establish the connection.

Implementing the Background Service Worker

The background service worker acts as the central hub for managing WebRTC connections and coordinating screen sharing sessions. Create background.js with the WebRTC implementation:

// Store active connections
const peerConnections = {};

const rtcConfig = {
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    { urls: 'stun:stun1.l.google.com:19302' }
  ]
};

// Initialize WebRTC peer connection
function createPeerConnection(peerId) {
  const pc = new RTCPeerConnection(rtcConfig);
  
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      // Send ICE candidate to remote peer
      sendSignalingMessage(peerId, {
        type: 'ice-candidate',
        candidate: event.candidate
      });
    }
  };
  
  pc.ontrack = (event) => {
    // Handle incoming tracks from remote peer
    console.log('Received remote track:', event.track.kind);
  };
  
  pc.onconnectionstatechange = () => {
    console.log('Connection state:', pc.connectionState);
    if (pc.connectionState === 'disconnected' || pc.connectionState === 'failed') {
      cleanupPeerConnection(peerId);
    }
  };
  
  peerConnections[peerId] = pc;
  return pc;
}

// Add local screen share stream to peer connection
async function addScreenStream(pc, stream) {
  stream.getTracks().forEach(track => {
    pc.addTrack(track, stream);
  });
}

// Handle incoming messages from popup
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
  if (message.type === 'STREAM_STARTED') {
    console.log('Screen sharing started');
    // Here you would typically initiate WebRTC connections
    // to share the stream with other participants
  } else if (message.type === 'STREAM_STOPPED') {
    console.log('Screen sharing stopped');
    // Clean up WebRTC connections
    Object.keys(peerConnections).forEach(peerId => {
      cleanupPeerConnection(peerId);
    });
  }
});

function cleanupPeerConnection(peerId) {
  if (peerConnections[peerId]) {
    peerConnections[peerId].close();
    delete peerConnections[peerId];
  }
}

// Placeholder for signaling server communication
function sendSignalingMessage(peerId, message) {
  // Implement your signaling mechanism here
  // This could be WebSocket, Firebase, or any other messaging service
  console.log('Signaling message:', message);
}

This background service worker provides the foundation for WebRTC screen sharing. It manages peer connections, handles ICE candidate exchange, and coordinates the flow of screen capture streams to remote participants.


Advanced Screen Sharing Features

Building a production-ready screen sharing extension requires implementing additional features that enhance usability, performance, and user experience.

Source Type Selection

Allowing users to specifically choose what they want to share improves the user experience. You can customize the getDisplayMedia prompt behavior using the selfBrowserSurface and systemSurface constraints:

const stream = await navigator.mediaDevices.getDisplayMedia({
  video: {
    displaySurface: 'browser', // restrict to browser tabs
    selfBrowserSurface: 'include', // include current tab
    systemSurface: 'exclude', // exclude entire screen
    monitorTypeSurface: 'exclude' // exclude monitors
  },
  audio: true
});

These constraints help guide users toward appropriate sharing sources for your application’s specific use case, reducing confusion and improving the quality of shared content.

Stream Quality Management

Different use cases require different quality settings. For presentations, you might prioritize higher resolution, while for remote support sessions, lower latency might be more important. Implement quality presets that users can choose from:

const qualityPresets = {
  presentation: {
    width: { ideal: 1920 },
    height: { ideal: 1080 },
    frameRate: { ideal: 30 },
    bitrate: { ideal: 5000000 }
  },
  lowLatency: {
    width: { ideal: 1280 },
    height: { ideal: 720 },
    frameRate: { ideal: 60 },
    bitrate: { ideal: 2500000 }
  },
  recording: {
    width: { ideal: 1920 },
    height: { ideal: 1080 },
    frameRate: { ideal: 60 },
    bitrate: { ideal: 10000000 }
  }
};

By allowing users to select their preferred quality mode, you can optimize the screen sharing experience for their specific needs.

Handling Stream Changes

Users may change their shared source during an active session. The Screen Capture API provides events to handle these transitions gracefully:

stream.getVideoTracks()[0].onaddstream = (event) => {
  // Handle when user switches from one window to another
  console.log('Source changed:', event.stream.id);
};

stream.getVideoTracks()[0].onended = () => {
  // Handle when user clicks the browser's stop sharing button
  stopSharing();
};

Properly handling these events ensures a smooth user experience even when unexpected changes occur during a sharing session.


Testing and Debugging Extensions

Developing screen sharing extensions requires careful testing to ensure compatibility across different scenarios and configurations.

Loading the Extension

To test your extension, open Chrome and navigate to chrome://extensions/. Enable Developer mode using the toggle in the top right corner, then click Load unpacked and select your extension’s directory. The extension will appear in your browser’s toolbar, ready for testing.

Debugging Tips

When debugging screen sharing extensions, use the Chrome DevTools for the extension’s popup and background service worker. Right-click on the extension icon and select “Inspect popup” to open DevTools for the popup. For background worker debugging, click the service worker link on the extensions page.

Common issues include permission errors, which typically stem from incorrect manifest configuration, stream errors from incompatible constraints or hardware issues, and WebRTC connection failures often related to network configuration or ICE server availability.


Best Practices and Security Considerations

Building secure and reliable screen sharing extensions requires following established best practices and security guidelines.

User Privacy

Always prioritize user privacy by clearly explaining what is being captured and transmitted. Never capture or transmit screen content without explicit user consent through the getDisplayMedia prompt. Implement visual indicators showing when sharing is active, and ensure users can easily stop sharing at any time.

Performance Optimization

Screen sharing can be resource-intensive. Optimize performance by adjusting video quality based on network conditions, using hardware acceleration when available, properly cleaning up streams and connections when they are no longer needed, and implementing efficient encoding settings to balance quality and performance.

Error Handling

Implement comprehensive error handling to provide users with meaningful feedback when issues occur. Handle permission denied errors gracefully, provide fallback options when screen capture fails, and maintain clear communication about connection status throughout the sharing session.


Conclusion

Building a screen sharing Chrome extension with WebRTC opens up extensive possibilities for creating collaborative applications. This guide covered the essential components: configuring Manifest V3 with appropriate permissions, implementing the Screen Capture API for capturing screen content, establishing WebRTC peer connections for real-time transmission, and adding advanced features for production-ready applications.

The combination of Chrome extensions and WebRTC provides a powerful platform for building screen sharing tools that rival dedicated applications. With the foundation established in this guide, you can extend functionality to include recording capabilities, multi-participant support, annotation tools, and integration with broader collaboration platforms.

Remember to thoroughly test your extension across different scenarios and Chrome versions, and always prioritize user privacy and security in your implementation. The screen sharing capabilities in Chrome extensions continue to evolve, so stay updated with the latest Chrome documentation for new features and API changes.

Start building your screen sharing extension today, and explore the countless possibilities for enhancing remote collaboration and communication through the power of Chrome extensions and WebRTC technology.


Part of the Chrome Extension Guide by theluckystrike. Built at zovo.one.

No previous article
No next article