Chrome Extension Web Clipper — Developer Guide
31 min readBuild a Web Clipper Extension – Full Tutorial
What We’re Building
A Notion/Evernote-style web clipper Chrome extension that lets you save content from any page:
- Clip full pages, text selections, or screenshots
- Extract page metadata (title, description, og:image, author)
- Convert selections to Markdown
- Reader-mode article extraction
- Screenshot capture via
chrome.tabs.captureVisibleTab - Side panel for managing and browsing saved clips
- Export clips as Markdown or JSON
Prerequisites
- Chrome 116+ with Manifest V3 and Side Panel API support
- Node.js 18+ and npm
- Basic TypeScript and Chrome extension knowledge (cross-ref:
docs/guides/extension-architecture.md)
mkdir web-clipper && cd web-clipper
npm init -y
npm install -D typescript @types/chrome
npm install @theluckystrike/webext-storage @theluckystrike/webext-messaging
mkdir -p src public
Step 1: Manifest with activeTab, Storage, and Side Panel Permissions
public/manifest.json:
{
"manifest_version": 3,
"name": "Web Clipper",
"version": "1.0.0",
"description": "Clip pages, selections, and screenshots from any website",
"permissions": ["activeTab", "storage", "sidePanel"],
"action": {
"default_popup": "popup.html",
"default_icon": { "48": "icons/icon-48.png", "128": "icons/icon-128.png" }
},
"background": { "service_worker": "background.js" },
"side_panel": { "default_path": "sidepanel.html" },
"content_scripts": [
{
"matches": ["<all_urls>"],
"js": ["content.js"],
"run_at": "document_idle"
}
],
"icons": { "48": "icons/icon-48.png", "128": "icons/icon-128.png" }
}
Key permissions:
activeTab– granted when the user clicks the action button, gives access to the current tab without broad host permissionsstorage– for persisting clips inchrome.storage.localsidePanel– for the clip manager sidebar
Step 2: Action Popup with Clip Options
public/popup.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="stylesheet" href="popup.css" />
</head>
<body>
<h2>Web Clipper</h2>
<button class="clip-btn" id="clip-page">
<span class="icon">📄</span>
<span>
<span class="label">Clip Full Page</span><br/>
<span class="desc">Extract article content in reader mode</span>
</span>
</button>
<button class="clip-btn" id="clip-selection">
<span class="icon">✍</span>
<span>
<span class="label">Clip Selection</span><br/>
<span class="desc">Save highlighted text as Markdown</span>
</span>
</button>
<button class="clip-btn" id="clip-screenshot">
<span class="icon">📷</span>
<span>
<span class="label">Clip Screenshot</span><br/>
<span class="desc">Capture visible area as image</span>
</span>
</button>
<div class="status" id="status"></div>
<hr />
<button class="open-panel" id="open-panel">Open clip manager</button>
<script src="popup.js"></script>
</body>
</html>
src/popup.ts:
import { createMessenger } from '@theluckystrike/webext-messaging';
type Messages = {
clipPage: { request: void; response: { success: boolean } };
clipSelection: { request: void; response: { success: boolean } };
clipScreenshot: { request: void; response: { success: boolean } };
openSidePanel: { request: void; response: void };
};
const messenger = createMessenger<Messages>();
function showStatus(msg: string) {
const el = document.getElementById('status')!;
el.textContent = msg;
el.classList.add('visible');
setTimeout(() => el.classList.remove('visible'), 2000);
}
document.getElementById('clip-page')!.addEventListener('click', async () => {
const { success } = await messenger.sendMessage('clipPage', undefined);
showStatus(success ? 'Page clipped!' : 'Failed to clip page');
});
document.getElementById('clip-selection')!.addEventListener('click', async () => {
const { success } = await messenger.sendMessage('clipSelection', undefined);
showStatus(success ? 'Selection clipped!' : 'No text selected');
});
document.getElementById('clip-screenshot')!.addEventListener('click', async () => {
const { success } = await messenger.sendMessage('clipScreenshot', undefined);
showStatus(success ? 'Screenshot saved!' : 'Failed to capture');
});
document.getElementById('open-panel')!.addEventListener('click', () => {
messenger.sendMessage('openSidePanel', undefined);
window.close();
});
Step 3: Content Script – Extract Page Metadata
src/content.ts runs on every page and responds to messages from the background script. It extracts structured metadata using Open Graph tags and standard meta elements:
import { createMessenger } from '@theluckystrike/webext-messaging';
type ContentMessages = {
getMetadata: { request: void; response: PageMetadata };
getSelection: { request: void; response: string };
getArticle: { request: void; response: ArticleContent };
};
interface PageMetadata {
title: string;
description: string;
url: string;
ogImage: string;
author: string;
siteName: string;
publishedDate: string;
}
interface ArticleContent {
title: string;
content: string;
textContent: string;
byline: string;
}
const messenger = createMessenger<ContentMessages>();
function getMeta(name: string): string {
const selectors = [
`meta[property="${name}"]`,
`meta[name="${name}"]`,
`meta[property="og:${name}"]`,
];
for (const sel of selectors) {
const el = document.querySelector<HTMLMetaElement>(sel);
if (el?.content) return el.content;
}
return '';
}
messenger.onMessage('getMetadata', async () => {
return {
title: document.title || getMeta('og:title'),
description: getMeta('description') || getMeta('og:description'),
url: window.location.href,
ogImage: getMeta('og:image') || getMeta('twitter:image'),
author: getMeta('author') || getMeta('article:author') ||
document.querySelector('[rel="author"]')?.textContent?.trim() || '',
siteName: getMeta('og:site_name'),
publishedDate: getMeta('article:published_time') || getMeta('datePublished'),
};
});
Step 4: Selection Clipper with Markdown Conversion
Add the selection handler to src/content.ts. This converts the selected HTML to Markdown by walking the DOM tree:
function htmlToMarkdown(html: string): string {
const div = document.createElement('div');
div.innerHTML = html;
function walk(node: Node): string {
if (node.nodeType === Node.TEXT_NODE) return node.textContent || '';
if (node.nodeType !== Node.ELEMENT_NODE) return '';
const el = node as HTMLElement;
const tag = el.tagName.toLowerCase();
const children = Array.from(el.childNodes).map(walk).join('');
switch (tag) {
case 'strong': case 'b': return `**${children}**`;
case 'em': case 'i': return `*${children}*`;
case 'code': return `\`${children}\``;
case 'a': return `[${children}](${(el as HTMLAnchorElement).href})`;
case 'h1': return `\n# ${children}\n`;
case 'h2': return `\n## ${children}\n`;
case 'h3': return `\n### ${children}\n`;
case 'p': return `\n${children}\n`;
case 'br': return '\n';
case 'li': return `- ${children}\n`;
case 'blockquote': return `\n> ${children.trim()}\n`;
case 'pre': return `\n\`\`\`\n${el.textContent}\n\`\`\`\n`;
case 'img': return `.src})`;
default: return children;
}
}
return walk(div).replace(/\n{3,}/g, '\n\n').trim();
}
messenger.onMessage('getSelection', async () => {
const selection = window.getSelection();
if (!selection || selection.isCollapsed) return '';
const range = selection.getRangeAt(0);
const fragment = range.cloneContents();
const wrapper = document.createElement('div');
wrapper.appendChild(fragment);
return htmlToMarkdown(wrapper.innerHTML);
});
Step 5: Full Page Article Extraction (Reader Mode Parsing)
Add the article extractor to src/content.ts. This is a simplified reader-mode parser that identifies the main content node using scoring heuristics:
function extractArticle(): ArticleContent {
// Score candidate nodes by text density and semantic signals
const candidates = document.querySelectorAll('article, [role="main"], main, .post-content, .entry-content, .article-body');
let bestNode: Element | null = null;
let bestScore = 0;
if (candidates.length > 0) {
// Prefer semantic elements
bestNode = candidates[0];
} else {
// Fall back to scoring paragraphs' parent nodes
const paragraphs = document.querySelectorAll('p');
const scores = new Map<Element, number>();
paragraphs.forEach((p) => {
const parent = p.parentElement;
if (!parent) return;
const text = p.textContent || '';
if (text.length < 25) return;
const score = (scores.get(parent) || 0) + text.length;
scores.set(parent, score);
if (score > bestScore) { bestScore = score; bestNode = parent; }
});
}
const node = bestNode || document.body;
const title = document.title;
const byline = getMeta('author') || '';
// Convert content to markdown
const content = htmlToMarkdown(node.innerHTML);
const textContent = node.textContent?.trim() || '';
return { title, content, textContent, byline };
}
messenger.onMessage('getArticle', async () => {
return extractArticle();
});
Step 6: Screenshot Capture with chrome.tabs.captureVisibleTab
The background service worker handles all clip actions and screenshot capture. src/background.ts:
import { createMessenger } from '@theluckystrike/webext-messaging';
import { createStorage, defineSchema } from '@theluckystrike/webext-storage';
interface Clip {
id: string;
type: 'page' | 'selection' | 'screenshot';
title: string;
url: string;
content: string; // Markdown text or data URL for screenshots
metadata: {
description: string;
ogImage: string;
author: string;
siteName: string;
};
createdAt: number;
}
const schema = defineSchema({ clips: 'json' as const });
const storage = createStorage(schema, 'local');
type Messages = {
clipPage: { request: void; response: { success: boolean } };
clipSelection: { request: void; response: { success: boolean } };
clipScreenshot: { request: void; response: { success: boolean } };
openSidePanel: { request: void; response: void };
getClips: { request: void; response: Clip[] };
deleteClip: { request: { id: string }; response: void };
exportClips: { request: { format: 'markdown' | 'json' }; response: string };
};
type ContentMessages = {
getMetadata: { request: void; response: any };
getSelection: { request: void; response: string };
getArticle: { request: void; response: any };
};
const messenger = createMessenger<Messages>();
const contentMessenger = createMessenger<ContentMessages>();
async function getClips(): Promise<Clip[]> {
const raw = await storage.get('clips');
return (raw as Clip[] | null) || [];
}
async function saveClip(clip: Clip): Promise<void> {
const clips = await getClips();
clips.unshift(clip);
await storage.set('clips', clips as any);
}
async function getActiveTab(): Promise<chrome.tabs.Tab> {
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
return tab;
}
// Clip full page -- sends message to content script to extract article
messenger.onMessage('clipPage', async () => {
try {
const tab = await getActiveTab();
if (!tab.id) return { success: false };
const metadata = await chrome.tabs.sendMessage(tab.id, { type: 'getMetadata' });
const article = await chrome.tabs.sendMessage(tab.id, { type: 'getArticle' });
const clip: Clip = {
id: crypto.randomUUID(),
type: 'page',
title: article.title || tab.title || 'Untitled',
url: tab.url || '',
content: article.content,
metadata: {
description: metadata.description,
ogImage: metadata.ogImage,
author: metadata.author || article.byline,
siteName: metadata.siteName,
},
createdAt: Date.now(),
};
await saveClip(clip);
return { success: true };
} catch {
return { success: false };
}
});
// Clip selection -- gets selected text as Markdown from content script
messenger.onMessage('clipSelection', async () => {
try {
const tab = await getActiveTab();
if (!tab.id) return { success: false };
const markdown = await chrome.tabs.sendMessage(tab.id, { type: 'getSelection' });
if (!markdown) return { success: false };
const metadata = await chrome.tabs.sendMessage(tab.id, { type: 'getMetadata' });
const clip: Clip = {
id: crypto.randomUUID(),
type: 'selection',
title: metadata.title || tab.title || 'Untitled',
url: tab.url || '',
content: markdown,
metadata: {
description: metadata.description,
ogImage: metadata.ogImage,
author: metadata.author,
siteName: metadata.siteName,
},
createdAt: Date.now(),
};
await saveClip(clip);
return { success: true };
} catch {
return { success: false };
}
});
// Screenshot -- uses chrome.tabs.captureVisibleTab to grab the viewport
messenger.onMessage('clipScreenshot', async () => {
try {
const tab = await getActiveTab();
const dataUrl = await chrome.tabs.captureVisibleTab(tab.windowId!, { format: 'png' });
const clip: Clip = {
id: crypto.randomUUID(),
type: 'screenshot',
title: tab.title || 'Screenshot',
url: tab.url || '',
content: dataUrl,
metadata: { description: '', ogImage: '', author: '', siteName: '' },
createdAt: Date.now(),
};
await saveClip(clip);
return { success: true };
} catch {
return { success: false };
}
});
// Open side panel
messenger.onMessage('openSidePanel', async () => {
const tab = await getActiveTab();
await chrome.sidePanel.open({ tabId: tab.id! });
});
The captureVisibleTab call captures only the currently visible viewport as a PNG data URL. No extra permissions are needed beyond activeTab because the capture happens in response to the user clicking the extension action.
Step 7: Side Panel – Clip Storage and Management
public/sidepanel.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="stylesheet" href="sidepanel.css" />
</head>
<body>
<h2>Saved Clips</h2>
<div class="toolbar">
<input type="text" id="search" placeholder="Search clips..." />
<select id="filter">
<option value="all">All</option>
<option value="page">Pages</option>
<option value="selection">Selections</option>
<option value="screenshot">Screenshots</option>
</select>
<button id="export-md">Export MD</button>
<button id="export-json">Export JSON</button>
</div>
<ul class="clip-list" id="clip-list"></ul>
<script src="sidepanel.js"></script>
</body>
</html>
src/sidepanel.ts:
import { createMessenger } from '@theluckystrike/webext-messaging';
import { createStorage, defineSchema } from '@theluckystrike/webext-storage';
interface Clip {
id: string;
type: 'page' | 'selection' | 'screenshot';
title: string;
url: string;
content: string;
metadata: { description: string; ogImage: string; author: string; siteName: string };
createdAt: number;
}
const schema = defineSchema({ clips: 'json' as const });
const storage = createStorage(schema, 'local');
async function getClips(): Promise<Clip[]> {
const raw = await storage.get('clips');
return (raw as Clip[] | null) || [];
}
function formatDate(ts: number): string {
return new Date(ts).toLocaleDateString('en-US', {
month: 'short', day: 'numeric', year: 'numeric', hour: '2-digit', minute: '2-digit',
});
}
function truncate(text: string, len: number): string {
return text.length > len ? text.slice(0, len) + '...' : text;
}
function renderClips(clips: Clip[], searchQuery: string, typeFilter: string): void {
const list = document.getElementById('clip-list')!;
list.innerHTML = '';
let filtered = clips;
if (typeFilter !== 'all') filtered = filtered.filter((c) => c.type === typeFilter);
if (searchQuery) {
const q = searchQuery.toLowerCase();
filtered = filtered.filter((c) =>
c.title.toLowerCase().includes(q) || c.content.toLowerCase().includes(q)
);
}
if (filtered.length === 0) {
list.innerHTML = '<li class="empty">No clips found</li>';
return;
}
filtered.forEach((clip) => {
const li = document.createElement('li');
li.className = 'clip-item';
const preview = clip.type === 'screenshot'
? `<div class="clip-preview"><img src="${clip.content}" alt="screenshot" /></div>`
: `<div class="clip-preview">${truncate(clip.content.replace(/[#*`>\[\]]/g, ''), 150)}</div>`;
li.innerHTML = `
<div class="clip-title">${clip.title}</div>
<div class="clip-meta">${clip.type} · ${formatDate(clip.createdAt)} ·
<a href="${clip.url}" target="_blank" style="color:#7c83ff">${truncate(clip.url, 40)}</a></div>
${preview}
<div class="clip-actions">
<button class="btn-copy" data-id="${clip.id}">Copy</button>
<button class="btn-delete" data-id="${clip.id}">Delete</button>
</div>`;
list.appendChild(li);
});
// Event delegation for actions
list.querySelectorAll('.btn-delete').forEach((btn) => {
btn.addEventListener('click', async () => {
const id = (btn as HTMLElement).dataset.id!;
let clips = await getClips();
clips = clips.filter((c) => c.id !== id);
await storage.set('clips', clips as any);
renderClips(clips, searchQuery, typeFilter);
});
});
list.querySelectorAll('.btn-copy').forEach((btn) => {
btn.addEventListener('click', async () => {
const id = (btn as HTMLElement).dataset.id!;
const clip = (await getClips()).find((c) => c.id === id);
if (clip) await navigator.clipboard.writeText(clip.content);
});
});
}
// Initialize
(async () => {
const clips = await getClips();
const searchInput = document.getElementById('search') as HTMLInputElement;
const filterSelect = document.getElementById('filter') as HTMLSelectElement;
renderClips(clips, '', 'all');
searchInput.addEventListener('input', async () => {
renderClips(await getClips(), searchInput.value, filterSelect.value);
});
filterSelect.addEventListener('change', async () => {
renderClips(await getClips(), searchInput.value, filterSelect.value);
});
// Export buttons
document.getElementById('export-md')!.addEventListener('click', () => exportClips('markdown'));
document.getElementById('export-json')!.addEventListener('click', () => exportClips('json'));
})();
Step 8: Export Clips as Markdown or JSON
Add the export function to src/sidepanel.ts:
async function exportClips(format: 'markdown' | 'json'): Promise<void> {
const clips = await getClips();
let output: string;
let filename: string;
let mimeType: string;
if (format === 'markdown') {
output = clips.map((clip) => {
const header = [
`# ${clip.title}`,
'',
`- **URL:** ${clip.url}`,
`- **Type:** ${clip.type}`,
`- **Date:** ${formatDate(clip.createdAt)}`,
clip.metadata.author ? `- **Author:** ${clip.metadata.author}` : '',
clip.metadata.siteName ? `- **Site:** ${clip.metadata.siteName}` : '',
'',
'---',
'',
].filter(Boolean).join('\n');
if (clip.type === 'screenshot') {
return `${header}\n`;
}
return `${header}${clip.content}\n`;
}).join('\n\n');
filename = 'clips-export.md';
mimeType = 'text/markdown';
} else {
const exportData = clips.map(({ content, ...rest }) => ({
...rest,
// Truncate screenshot data URLs in JSON export for readability
content: rest.type === 'screenshot' ? '[screenshot data URL]' : content,
}));
output = JSON.stringify(exportData, null, 2);
filename = 'clips-export.json';
mimeType = 'application/json';
}
// Trigger download via Blob URL
const blob = new Blob([output], { type: mimeType });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = filename;
a.click();
URL.revokeObjectURL(url);
}
This produces a clean Markdown file with YAML-like frontmatter for each clip, or a structured JSON array. The download triggers via a temporary Blob URL – no server needed.
Build, Load, and Test
Compile with npx tsc or your bundler of choice, targeting entry points popup.ts, content.ts, background.ts, and sidepanel.ts alongside the HTML and manifest in dist/.
Load the extension:
- Open
chrome://extensions/and enable Developer mode - Click “Load unpacked” and select the
dist/folder - Navigate to any page and click the extension icon
Testing Checklist
- Clip Full Page: Visit an article, click “Clip Full Page” – check the side panel for the saved Markdown clip
- Clip Selection: Highlight text, click “Clip Selection” – verify bold, links, and headings survive conversion
- Clip Screenshot: Click “Clip Screenshot” – verify the PNG preview renders in the side panel
- Side Panel: Search, filter by type, delete clips, and copy content to clipboard
- Export: Click “Export MD” or “Export JSON” – verify the downloaded file is well-formed
- Metadata: Verify title, URL, author, and site name are correctly extracted from meta tags
Debugging Tips
- Run
chrome.storage.local.get(null, console.log)in the service worker console to inspect stored clips - Content script errors appear in the page’s DevTools console, not the extension’s
- If
captureVisibleTabfails, ensure the capture runs in response to a user click soactiveTabis granted
Summary
This tutorial built a complete web clipper extension with three capture modes (full page, selection, screenshot), a content script that extracts metadata and converts HTML to Markdown, a side panel for managing clips, and export functionality. The extension uses @theluckystrike/webext-storage for typed persistent storage and @theluckystrike/webext-messaging for type-safe communication between popup, background, and content scripts.
-e
—
Part of the Chrome Extension Guide by theluckystrike. Built at zovo.one.