Chrome Extension Screen Capture: Build a Screenshot and Recording Tool
Screen capture functionality is among the most sought-after features for Chrome extensions. Whether you’re building a documentation tool that needs to capture web pages, a collaboration platform requiring screen sharing, or a productivity application with screenshot annotation capabilities, the Chrome tabCapture API provides the foundation you need. This comprehensive guide will teach you how to leverage the chrome.tabCapture API and related methods to create powerful screen capture and recording functionality in your Chrome extension.
The Chrome Extension APIs offer multiple approaches to screen capture, each with distinct capabilities and use cases. Understanding these options and their appropriate applications is essential for building robust screenshot chrome extension solutions that meet real-world user needs.
Understanding Screen Capture in Chrome Extensions
Chrome extensions can capture screen content through several APIs, each designed for specific scenarios. The primary options include the chrome.tabCapture API for capturing tab content as media streams, the desktopCapture API for capturing entire screens or windows, and the Canvas API for capturing rendered page content as images.
The chrome.tabCapture API
The chrome.tabCapture API is specifically designed for capturing the visual and audio content of a browser tab. This API returns a MediaStream that contains video and audio tracks representing the tab’s content. This makes it ideal for building a screen recording extension that needs to capture browser tab activity, online presentations, or web-based tutorials.
The tabCapture API offers several significant advantages. First, it provides high-quality video capture at the tab’s native resolution. Second, it can capture both visual and audio content when the tab is playing media. Third, it works within Chrome’s security model without requiring additional permissions for screen recording at the system level.
However, there are important limitations to consider. The chrome.tabCapture API can only capture tab content, not the entire desktop or other applications. Users must explicitly grant permission each time capture begins through a user-initiated action. Additionally, some websites implement measures to prevent or restrict capture due to digital rights management or privacy concerns.
The desktopCapture API
For scenarios requiring capture of the entire screen or specific windows, Chrome provides the chrome.desktopCapture API. This API is more powerful but requires stricter permissions and user confirmation. It’s the appropriate choice when building a screen recording extension that needs to capture multiple applications, system UI elements, or content outside the browser.
The desktopCapture API supports several capture source types: screens, windows, and tabs from all browsers. This flexibility makes it essential for building comprehensive screen capture tools that work across the entire operating system.
Setting Up Your Extension Manifest
Before implementing screen capture functionality, you must configure your extension’s manifest file with the appropriate permissions and declarations.
Required Permissions
For basic tab capture functionality, you’ll need the “tabCapture” permission in your manifest. This permission is considered sensitive and requires you to explain its use during the Chrome Web Store review process.
{
"name": "Screen Capture Pro",
"version": "1.0.0",
"manifest_version": 3,
"permissions": [
"tabCapture"
],
"host_permissions": [
"<all_urls>"
],
"action": {
"default_popup": "popup.html",
"default_icon": {
"16": "icons/icon16.png",
"48": "icons/icon48.png",
"128": "icons/icon128.png"
}
}
}
For desktop capture capabilities, you’ll need the “desktopCapture” permission instead:
{
"permissions": [
"desktopCapture"
],
"host_permissions": [
"<all_urls>"
]
}
Understanding Permission Implications
The permissions you request affect both the security model of your extension and the review process for Chrome Web Store publication. The tabCapture permission allows capture of tab content but doesn’t grant access to system-level screen recording. The desktopCapture permission provides broader capabilities but undergoes more rigorous review.
When publishing to the Chrome Web Store, you must provide a video demonstrating your extension’s functionality and explain why screen capture is essential for your extension’s core purpose. Extensions that request screen capture permissions without clear justification may be rejected or require additional review time.
Implementing Tab Capture
Now let’s implement the core screen capture functionality using the chrome.tabCapture API. The implementation involves requesting capture, handling the resulting media stream, and managing the capture lifecycle.
Initiating Tab Capture
The primary method for starting capture is chrome.tabCapture.capture(), which accepts configuration options and returns a Promise resolving to a MediaStream.
// capture.js - Background script or popup script
async function startTabCapture(tabId, options = {}) {
const defaultOptions = {
audio: true,
video: true,
videoConstraints: {
mandatory: {
minWidth: 1280,
maxWidth: 1920,
minHeight: 720,
maxHeight: 1080,
minFrameRate: 30,
maxFrameRate: 60
}
},
audioConstraints: {
mandatory: {
chromeMediaSource: 'tab'
}
}
};
const captureOptions = { ...defaultOptions, ...options };
try {
const stream = await chrome.tabCapture.capture(captureOptions);
return stream;
} catch (error) {
console.error('Tab capture failed:', error);
throw error;
}
}
The capture options allow you to specify whether to capture audio and video, along with constraints for video quality. The videoConstraints object lets you define resolution and frame rate parameters, which is particularly important for building a screen recording extension that needs consistent quality.
Handling Capture Streams
Once you have a MediaStream, you can use it in various ways depending on your extension’s requirements. Common use cases include recording to a file, streaming to a server, or displaying in a preview element.
// recorder.js - Handle media stream recording
class TabRecorder {
constructor(stream) {
this.stream = stream;
this.mediaRecorder = null;
this.chunks = [];
}
startRecording(options = {}) {
const mimeType = options.mimeType || 'video/webm;codecs=vp9';
this.mediaRecorder = new MediaRecorder(this.stream, {
mimeType,
videoBitsPerSecond: options.bitrate || 2500000
});
this.chunks = [];
this.mediaRecorder.ondataavailable = (event) => {
if (event.data.size > 0) {
this.chunks.push(event.data);
}
};
this.mediaRecorder.onstop = () => {
this.onRecordingComplete();
};
this.mediaRecorder.start(1000); // Collect data every second
}
stopRecording() {
if (this.mediaRecorder && this.mediaRecorder.state !== 'inactive') {
this.mediaRecorder.stop();
this.stream.getTracks().forEach(track => track.stop());
}
}
async getRecordingBlob() {
return new Blob(this.chunks, { type: 'video/webm' });
}
onRecordingComplete() {
console.log('Recording completed');
}
}
This implementation creates a reusable recorder class that handles the MediaRecorder API. The recorder collects video chunks and can generate a Blob containing the recorded content when stopped.
Implementing Desktop Capture
For applications requiring full screen or window capture, the chrome.desktopCapture API provides the necessary functionality. This API requires a different approach, using a picker UI to let users select what to capture.
Requesting Desktop Capture
The desktopCapture API uses a different pattern than tabCapture, requiring you to present a source selection UI to users.
// desktop-capture.js
async function requestDesktopCapture() {
// Request capture of screens and windows
const sources = await chrome.desktopCapture.getDesktopSources({
types: ['window', 'screen'],
thumbnailSize: { width: 320, height: 180 }
});
return sources;
}
// In your popup or UI code
document.getElementById('captureButton').addEventListener('click', async () => {
const sources = await requestDesktopCapture();
displaySourcePicker(sources);
});
function displaySourcePicker(sources) {
const picker = document.getElementById('sourcePicker');
picker.innerHTML = '';
sources.forEach(source => {
const option = document.createElement('div');
option.className = 'source-option';
option.innerHTML = `
<img src="${source.thumbnail.toDataURL()}" alt="${source.name}">
<span>${source.name}</span>
`;
option.onclick = () => selectSource(source.id);
picker.appendChild(option);
});
}
async function selectSource(sourceId) {
const stream = await navigator.mediaDevices.getUserMedia({
audio: false,
video: {
mandatory: {
chromeMediaSource: 'desktop',
chromeMediaSourceId: sourceId
}
}
});
return stream;
}
This implementation requests available capture sources and presents them to the user. The user must explicitly select which screen or window to capture, maintaining user privacy and control.
Building a Complete Screenshot Extension
Beyond video recording, many Chrome extension screen capture tools need to capture static screenshots. There are several approaches to implementing screenshot functionality.
Canvas-Based Screenshot Capture
The most straightforward approach uses the Canvas API to render captured content:
// screenshot.js
async function captureTabScreenshot(tabId) {
// First, get the tab details
const tab = await chrome.tabs.get(tabId);
// Inject a content script to capture the page
const results = await chrome.scripting.executeScript({
target: { tabId },
func: () => {
return {
width: window.innerWidth,
height: window.innerHeight,
devicePixelRatio: window.devicePixelRatio
};
}
});
const { width, height, devicePixelRatio } = results[0];
// Use chrome.tabs.captureVisibleTab
const dataUrl = await chrome.tabs.captureVisibleTab(tab.windowId, {
format: 'png',
quality: 100
});
return dataUrl;
}
// Alternative: Capture at specific resolution
async function captureAtResolution(tabId, width, height) {
const dataUrl = await chrome.tabs.captureVisibleTab(tabId, {
format: 'png',
quality: 100,
fromSurface: true
});
// Resize if needed using Canvas
if (width && height) {
return await resizeImage(dataUrl, width, height);
}
return dataUrl;
}
async function resizeImage(dataUrl, width, height) {
return new Promise((resolve) => {
const img = new Image();
img.onload = () => {
const canvas = document.createElement('canvas');
canvas.width = width;
canvas.height = height;
const ctx = canvas.getContext('2d');
ctx.drawImage(img, 0, 0, width, height);
resolve(canvas.toDataURL('image/png'));
};
img.src = dataUrl;
});
}
The chrome.tabs.captureVisibleTab method provides a simpler approach for basic screenshots, capturing exactly what the user sees in the viewport. For more comprehensive capture of the entire page, including scrollable content, you’ll need to implement a scrolling capture approach.
Full Page Capture
Capturing an entire page beyond the visible viewport requires a more sophisticated approach:
// fullpage-screenshot.js
async function captureFullPage(tabId) {
// Get original scroll position
const originalPosition = await chrome.scripting.executeScript({
target: { tabId },
func: () => ({ x: window.scrollX, y: window.scrollY })
});
// Get total page dimensions
const dimensions = await chrome.scripting.executeScript({
target: { tabId },
func: () => ({
width: document.documentElement.scrollWidth,
height: document.documentElement.scrollHeight,
windowWidth: window.innerWidth,
windowHeight: window.innerHeight
})
});
const { width, height, windowWidth, windowHeight } = dimensions[0];
// Capture the full page by capturing multiple sections
const screenshots = [];
const segments = Math.ceil(height / windowHeight);
for (let i = 0; i < segments; i++) {
// Scroll to position
await chrome.scripting.executeScript({
target: { tabId },
func: (y) => window.scrollTo(0, y),
args: [i * windowHeight]
});
// Wait for scroll to complete
await new Promise(resolve => setTimeout(resolve, 100));
// Capture visible portion
const segment = await chrome.tabs.captureVisibleTab(tabId, {
format: 'png',
quality: 100
});
screenshots.push(segment);
}
// Combine segments
const combinedImage = await combineScreenshots(screenshots, width, height);
// Restore original position
await chrome.scripting.executeScript({
target: { tabId },
func: (pos) => window.scrollTo(pos.x, pos.y),
args: [originalPosition[0]]
});
return combinedImage;
}
async function combineScreenshots(screenshots, totalWidth, totalHeight) {
const canvas = document.createElement('canvas');
canvas.width = totalWidth;
canvas.height = totalHeight;
const ctx = canvas.getContext('2d');
let yOffset = 0;
for (const dataUrl of screenshots) {
await new Promise((resolve) => {
const img = new Image();
img.onload = () => {
ctx.drawImage(img, 0, yOffset);
yOffset += img.height;
resolve();
};
img.src = dataUrl;
});
}
return canvas.toDataURL('image/png');
}
This implementation captures a full page by taking multiple screenshots while scrolling through the document, then stitching them together into a single image.
Best Practices and Considerations
When building a screenshot chrome extension, several best practices ensure a positive user experience and successful Chrome Web Store publication.
User Privacy and Consent
Always clearly communicate to users when capture is active. Consider implementing visible indicators such as a badge or toolbar icon that shows when recording or capture is in progress. This transparency builds user trust and complies with privacy expectations.
Performance Optimization
Screen capture can be resource-intensive. Implement these optimizations to maintain performance:
- Use appropriate resolution settings based on the intended use
- Implement frame rate limiting for video capture when high frame rates aren’t necessary
- Consider using web workers for video encoding to prevent UI blocking
- Release media tracks promptly when capture ends
Error Handling
Implement comprehensive error handling for common failure scenarios:
// error-handling.js
async function safeCapture(tabId) {
try {
// Check if tab exists and is accessible
const tab = await chrome.tabs.get(tabId);
if (tab.status !== 'complete') {
throw new Error('Tab is still loading');
}
if (tab.url.startsWith('chrome://') ||
tab.url.startsWith('chrome-extension://') ||
tab.url.startsWith('about:')) {
throw new Error('Cannot capture Chrome internal pages');
}
return await chrome.tabs.captureVisibleTab(tabId, { format: 'png' });
} catch (error) {
handleCaptureError(error);
throw error;
}
}
function handleCaptureError(error) {
const errorMessages = {
'Tab is still loading': 'Please wait for the page to fully load before capturing.',
'Cannot capture Chrome internal pages': 'This type of page cannot be captured due to browser restrictions.',
'Permission denied': 'Please grant screen capture permissions when prompted.'
};
const message = errorMessages[error.message] || 'An error occurred during capture. Please try again.';
showNotification(message);
}
File Management
For extensions that save captured content, implement proper file handling:
// file-manager.js
async function saveCapture(dataUrl, filename) {
// Convert data URL to blob
const response = await fetch(dataUrl);
const blob = await response.blob();
// Use downloads API to save the file
const downloadId = await chrome.downloads.download({
url: dataUrl,
filename: filename,
saveAs: true,
conflictAction: 'uniquify'
});
return downloadId;
}
function generateFilename(prefix, extension) {
const timestamp = new Date().toISOString()
.replace(/[:.]/g, '-')
.replace('T', '_')
.split('.')[0];
return `${prefix}_${timestamp}.${extension}`;
}
Conclusion
Building a chrome extension screen capture tool requires understanding the available APIs and their appropriate use cases. The chrome.tabCapture API provides an excellent foundation for capturing tab content as video streams, while chrome.desktopCapture enables broader screen and window capture capabilities. For static screenshots, chrome.tabs.captureVisibleTab offers a straightforward solution.
By following the implementation patterns and best practices outlined in this guide, you can create a screenshot chrome extension or screen recording extension that provides reliable functionality while maintaining good performance and user privacy. Remember to thoroughly test your extension across different scenarios and browsers, and ensure proper error handling for a polished user experience.
The screen capture functionality you’ve now learned to implement opens doors to numerous extension ideas, from documentation tools and collaboration platforms to productivity applications and content creation tools. With these foundations in place, you’re well-equipped to build sophisticated screen capture solutions that serve real user needs.