Build a Data Export Chrome Extension: CSV, JSON and PDF
Have you ever needed to extract data from a website and save it for offline use? Whether you are collecting product prices for comparison shopping, gathering contact information from directories, or compiling research data from multiple sources, the ability to export web data in multiple formats is incredibly valuable. Building a data export extension empowers users to capture structured information from any webpage and download it in their preferred format.
In this comprehensive tutorial, we will build a fully functional data export Chrome extension that can scrape table data, extract structured information, and export it to CSV, JSON, and PDF formats. This extension will serve as a powerful tool for researchers, marketers, data analysts, and anyone who needs to collect and organize web data efficiently.
The project will teach you essential Chrome extension development skills including content script injection for data extraction, message passing between components, file generation and download handling, and working with the chrome.storage API for persisting user preferences. By the end of this guide, you will have a complete data export extension ready for personal use or publication to the Chrome Web Store.
Why Build a Data Export Extension
Data export extensions are among the most useful tools in the Chrome Web Store ecosystem. They bridge the gap between web content and offline data analysis, enabling users to transform dynamic web pages into structured datasets. There are several compelling reasons to build this type of extension.
First, the demand for web data extraction tools continues to grow across industries. Marketing professionals need to export competitor pricing data. Researchers need to compile information from academic databases. Sales teams need to extract contact lists from lead generation websites. A well-built data export extension addresses these real-world needs and can attract a substantial user base.
Second, building a data export extension teaches you fundamental concepts that apply to many other extension types. You will learn how to inject content scripts to interact with webpage DOM, how to parse and transform data into different formats, how to handle file downloads programmatically, and how to create intuitive user interfaces for data selection. These skills transfer directly to building scrapers, analyzers, and other data-focused extensions.
Third, the technical challenges involved are manageable for developers at intermediate skill levels. You do not need complex backend infrastructure because the extension runs entirely in the browser. Modern JavaScript provides excellent built-in support for CSV and JSON manipulation, and several libraries make PDF generation straightforward. The entire application can be client-side, keeping development costs minimal.
Project Overview and Features
Our data export extension will include a comprehensive feature set designed for maximum utility. First, intelligent data detection that automatically identifies tabular data, lists, and structured elements on any webpage. Second, flexible format selection allowing users to choose between CSV, JSON, and PDF export formats. Third, column selection capability enabling users to choose which fields to include in the export. Fourth, preview functionality showing a sample of the data before export. Fifth, batch export support for exporting multiple pages of data. Sixth, export history storing previous exports for quick access. Seventh, custom delimiter options for CSV exports including comma, semicolon, and tab delimiters. Eighth, encoding selection supporting UTF-8 and other common encodings.
This feature set balances sophistication with usability. The extension will automatically detect data when possible but give users full control over what gets exported. The preview system prevents data loss from incorrect selections, and the history feature helps users manage ongoing projects.
Setting Up the Project Structure
Every Chrome extension requires a specific file organization to function correctly. Let us set up our project directory first. Create a new folder named data-export-extension in your development workspace. Inside this folder, we will create the following files and directories.
The manifest.json file serves as the configuration file that tells Chrome about our extension capabilities and permissions. The popup.html file defines the user interface that appears when clicking the extension icon. The popup.css file styles our popup with a clean, professional appearance. The popup.js file contains the JavaScript logic for handling user interactions and coordinating data export. The content.js file runs in the context of web pages to detect and extract data. The background.js file handles long-running tasks and manages extension state. The utils folder will contain helper functions for data processing. The icons folder will store different sized icons for the extension.
Let us begin by creating the manifest.json file with all necessary permissions and configuration.
{
"manifest_version": 3,
"name": "Data Export Pro",
"version": "1.0",
"description": "Export web data to CSV, JSON, and PDF formats with powerful extraction tools.",
"permissions": [
"storage",
"activeTab",
"scripting",
"downloads"
],
"host_permissions": [
"<all_urls>"
],
"action": {
"default_popup": "popup.html",
"default_icon": {
"16": "icons/icon16.png",
"48": "icons/icon48.png",
"128": "icons/icon128.png"
}
},
"background": {
"service_worker": "background.js"
},
"content_scripts": [
{
"matches": ["<all_urls>"],
"js": ["content.js"],
"run_at": "document_idle"
}
],
"icons": {
"16": "icons/icon16.png",
"48": "icons/icon48.png",
"128": "icons/icon128.png"
}
}
This manifest declares the permissions our extension needs. The storage permission allows us to save user preferences and export history. The activeTab permission gives us access to the current tab for data extraction. The scripting permission enables us to inject and execute content scripts. The downloads permission lets us programmatically trigger file downloads. The host permissions grant access to all URLs so the extension can work on any website.
Building the Content Script for Data Detection
The content script is the heart of our data export extension. It runs in the context of each webpage and is responsible for detecting extractable data, analyzing the page structure, and preparing data for export. Let us create a robust content script that can handle various data structures.
// content.js - Data detection and extraction
(function() {
// Store detected data structures
let detectedData = {
tables: [],
lists: [],
structuredData: []
};
// Detect all tables on the page
function detectTables() {
const tables = document.querySelectorAll('table');
return Array.from(tables).map((table, index) => {
const rows = table.querySelectorAll('tr');
const headers = Array.from(table.querySelectorAll('th')).map(th => th.textContent.trim());
const data = Array.from(rows).slice(1).map(row => {
const cells = row.querySelectorAll('td');
const rowData = {};
cells.forEach((cell, i) => {
rowData[headers[i] || `column${i}`] = cell.textContent.trim();
});
return rowData;
});
return {
type: 'table',
index,
headers,
data,
preview: data.slice(0, 5)
};
});
}
// Detect structured data (JSON-LD)
function detectStructuredData() {
const jsonLdScripts = document.querySelectorAll('script[type="application/ld+json"]');
return Array.from(jsonLdScripts).map((script, index) => {
try {
const data = JSON.parse(script.textContent);
return {
type: 'jsonld',
index,
data: Array.isArray(data) ? data : [data]
};
} catch (e) {
return null;
}
}).filter(Boolean);
}
// Detect list structures
function detectLists() {
const lists = document.querySelectorAll('ul, ol');
return Array.from(lists).map((list, index) => {
const items = Array.from(list.querySelectorAll('li')).map(li => li.textContent.trim());
return {
type: 'list',
index,
items,
preview: items.slice(0, 10)
};
});
}
// Main detection function
function detectAllData() {
detectedData.tables = detectTables();
detectedData.lists = detectLists();
detectedData.structuredData = detectStructuredData();
return detectedData;
}
// Listen for messages from popup
chrome.runtime.onMessage.addListener((request, sender, sendResponse) => {
if (request.action === 'detectData') {
const data = detectAllData();
sendResponse(data);
}
if (request.action === 'extractData') {
const { type, index } = request;
let extractedData = null;
if (type === 'table' && detectedData.tables[index]) {
extractedData = detectedData.tables[index];
} else if (type === 'list' && detectedData.lists[index]) {
extractedData = detectedData.lists[index];
} else if (type === 'jsonld' && detectedData.structuredData[index]) {
extractedData = detectedData.structuredData[index];
}
sendResponse(extractedData);
}
return true;
});
// Auto-detect on page load
detectAllData();
})();
This content script provides comprehensive data detection capabilities. The detectTables function finds all HTML tables and converts them to structured data with headers and rows. The detectStructuredData function locates JSON-LD schema markup, which many websites use for structured information. The detectLists function identifies unordered and ordered lists that may contain valuable data.
The message listener handles communication between the popup and content script. When the popup requests data detection, the content script scans the page and returns all detected structures. When the popup requests specific data extraction, the content script returns the selected data in a format suitable for export.
Creating the Popup Interface
The popup interface is what users interact with when they click our extension icon. It should provide a clean, intuitive interface for selecting data and choosing export options. Let us create the HTML structure first.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Data Export Pro</title>
<link rel="stylesheet" href="popup.css">
</head>
<body>
<div class="container">
<header>
<h1>Data Export Pro</h1>
<p class="subtitle">Extract and export web data</p>
</header>
<section id="data-selection">
<h2>Detected Data</h2>
<div id="detected-data-list"></div>
</section>
<section id="export-options">
<h2>Export Options</h2>
<div class="option-group">
<label for="format">Export Format:</label>
<select id="format">
<option value="csv">CSV (Comma Separated)</option>
<option value="json">JSON</option>
<option value="pdf">PDF Document</option>
</select>
</div>
<div class="option-group" id="delimiter-group">
<label for="delimiter">Delimiter:</label>
<select id="delimiter">
<option value=",">Comma (,)</option>
<option value=";">Semicolon (;)</option>
<option value="\t">Tab</option>
</select>
</div>
<div class="option-group">
<label for="encoding">Encoding:</label>
<select id="encoding">
<option value="utf-8">UTF-8</option>
<option value="iso-8859-1">ISO-8859-1</option>
</select>
</div>
</section>
<section id="preview-section">
<h2>Data Preview</h2>
<div id="preview-content"></div>
</section>
<button id="export-btn" class="primary-btn">Export Data</button>
<div id="status-message"></div>
</div>
<script src="popup.js"></script>
</body>
</html>
This HTML structure provides a logical flow for the user. The detected data section shows what the extension found on the page. The export options section lets users configure their output. The preview section shows a sample of the data before export. The export button triggers the download.
Styling the Popup
The popup CSS should be clean, professional, and easy to read. Users should be able to quickly identify data and understand their options. Let us create the stylesheet.
/* popup.css */
* {
box-sizing: border-box;
margin: 0;
padding: 0;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
width: 400px;
min-height: 500px;
background: #f5f5f5;
color: #333;
}
.container {
padding: 20px;
}
header {
text-align: center;
margin-bottom: 20px;
padding-bottom: 15px;
border-bottom: 1px solid #e0e0e0;
}
h1 {
font-size: 20px;
color: #1a73e8;
margin-bottom: 5px;
}
.subtitle {
font-size: 12px;
color: #666;
}
h2 {
font-size: 14px;
color: #444;
margin-bottom: 10px;
margin-top: 15px;
}
section {
margin-bottom: 20px;
}
#detected-data-list {
max-height: 150px;
overflow-y: auto;
border: 1px solid #ddd;
border-radius: 4px;
background: white;
}
.data-item {
padding: 10px;
border-bottom: 1px solid #eee;
cursor: pointer;
transition: background 0.2s;
}
.data-item:hover {
background: #f0f7ff;
}
.data-item.selected {
background: #e8f0fe;
border-left: 3px solid #1a73e8;
}
.data-item:last-child {
border-bottom: none;
}
.data-type {
font-size: 10px;
text-transform: uppercase;
color: #1a73e8;
font-weight: 600;
}
.data-label {
font-size: 13px;
margin-top: 3px;
}
.option-group {
margin-bottom: 12px;
}
.option-group label {
display: block;
font-size: 12px;
color: #555;
margin-bottom: 5px;
}
select {
width: 100%;
padding: 8px;
border: 1px solid #ddd;
border-radius: 4px;
font-size: 13px;
background: white;
}
#preview-content {
max-height: 120px;
overflow: auto;
background: white;
border: 1px solid #ddd;
border-radius: 4px;
padding: 10px;
font-size: 11px;
font-family: monospace;
white-space: pre-wrap;
}
.primary-btn {
width: 100%;
padding: 12px;
background: #1a73e8;
color: white;
border: none;
border-radius: 4px;
font-size: 14px;
font-weight: 600;
cursor: pointer;
transition: background 0.2s;
}
.primary-btn:hover {
background: #1557b0;
}
.primary-btn:disabled {
background: #ccc;
cursor: not-allowed;
}
#status-message {
margin-top: 10px;
padding: 8px;
border-radius: 4px;
font-size: 12px;
text-align: center;
display: none;
}
#status-message.success {
display: block;
background: #e6f4ea;
color: #1e8e3e;
}
#status-message.error {
display: block;
background: #fce8e6;
color: #d93025;
}
These styles create a clean, modern interface that follows Chrome extension design guidelines. The selected data item is highlighted with a blue border, making it clear what will be exported. The preview section uses a monospace font to display data clearly.
Implementing Export Logic
The popup JavaScript coordinates the entire export process. It requests data from the content script, processes it according to user selections, and triggers downloads. Let us create the main popup script.
// popup.js - Main extension logic
document.addEventListener('DOMContentLoaded', async () => {
const formatSelect = document.getElementById('format');
const delimiterSelect = document.getElementById('delimiter');
const encodingSelect = document.getElementById('encoding');
const detectedDataList = document.getElementById('detected-data-list');
const previewContent = document.getElementById('preview-content');
const exportBtn = document.getElementById('export-btn');
const statusMessage = document.getElementById('status-message');
const delimiterGroup = document.getElementById('delimiter-group');
let detectedData = [];
let selectedData = null;
// Show/hide delimiter option based on format
formatSelect.addEventListener('change', () => {
delimiterGroup.style.display =
formatSelect.value === 'csv' ? 'block' : 'none';
});
// Get current tab and detect data
async function detectData() {
try {
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
const response = await chrome.tabs.sendMessage(tab.id, { action: 'detectData' });
return response;
} catch (error) {
console.error('Error detecting data:', error);
return null;
}
}
// Render detected data in the list
function renderDetectedData(data) {
detectedData = [];
detectedDataList.innerHTML = '';
if (data.tables.length === 0 && data.lists.length === 0 && data.structuredData.length === 0) {
detectedDataList.innerHTML = '<div class="data-item">No exportable data detected</div>';
return;
}
// Process tables
data.tables.forEach((table, index) => {
detectedData.push({ ...table, sourceIndex: index });
const item = document.createElement('div');
item.className = 'data-item';
item.innerHTML = `
<div class="data-type">Table</div>
<div class="data-label">${table.headers.length} columns, ${table.data.length} rows</div>
`;
item.addEventListener('click', () => selectData(detectedData.length - 1));
detectedDataList.appendChild(item);
});
// Process lists
data.lists.forEach((list, index) => {
detectedData.push({ ...list, sourceIndex: index });
const item = document.createElement('div');
item.className = 'data-item';
item.innerHTML = `
<div class="data-type">List</div>
<div class="data-label">${list.items.length} items</div>
`;
item.addEventListener('click', () => selectData(detectedData.length - 1));
detectedDataList.appendChild(item);
});
// Process JSON-LD
data.structuredData.forEach((item, index) => {
detectedData.push({ ...item, sourceIndex: index });
const div = document.createElement('div');
div.className = 'data-item';
div.innerHTML = `
<div class="data-type">Structured Data</div>
<div class="data-label">JSON-LD Schema</div>
`;
div.addEventListener('click', () => selectData(detectedData.length - 1));
detectedDataList.appendChild(div);
});
}
// Select and preview data
async function selectData(index) {
// Update UI selection
document.querySelectorAll('.data-item').forEach((item, i) => {
item.classList.toggle('selected', i === index);
});
selectedData = detectedData[index];
// Get full data from content script
try {
const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
const response = await chrome.tabs.sendMessage(tab.id, {
action: 'extractData',
type: selectedData.type,
index: selectedData.sourceIndex
});
selectedData = { ...selectedData, ...response };
updatePreview();
} catch (error) {
console.error('Error extracting data:', error);
}
}
// Update preview based on format
function updatePreview() {
if (!selectedData) return;
const format = formatSelect.value;
let previewText = '';
if (format === 'csv') {
const delimiter = delimiterSelect.value;
if (selectedData.headers) {
previewText += selectedData.headers.join(delimiter) + '\n';
}
selectedData.data.slice(0, 3).forEach(row => {
const values = selectedData.headers.map(h => row[h] || '');
previewText += values.join(delimiter) + '\n';
});
} else if (format === 'json') {
const preview = selectedData.data ? selectedData.data.slice(0, 3) : selectedData.items;
previewText = JSON.stringify(preview, null, 2);
} else {
previewText = 'PDF preview not available. Click Export to generate PDF.';
}
previewContent.textContent = previewText;
}
// Export data to file
async function exportData() {
if (!selectedData) {
showStatus('Please select data to export', 'error');
return;
}
const format = formatSelect.value;
const encoding = encodingSelect.value;
let content = '';
let filename = '';
let mimeType = '';
try {
if (format === 'csv') {
const delimiter = delimiterSelect.value;
if (selectedData.headers) {
content += selectedData.headers.join(delimiter) + '\n';
}
if (selectedData.data) {
selectedData.data.forEach(row => {
const values = selectedData.headers.map(h => {
const val = row[h] || '';
return val.includes(delimiter) ? `"${val}"` : val;
});
content += values.join(delimiter) + '\n';
});
}
filename = 'export.csv';
mimeType = 'text/csv';
} else if (format === 'json') {
const exportData = selectedData.data || selectedData.items || selectedData.data;
content = JSON.stringify(exportData, null, 2);
filename = 'export.json';
mimeType = 'application/json';
} else if (format === 'pdf') {
// PDF generation would use a library like jsPDF
showStatus('PDF export requires additional library integration', 'error');
return;
}
// Trigger download
const blob = new Blob([content], { type: `${mimeType};charset=${encoding}` });
const url = URL.createObjectURL(blob);
await chrome.downloads.download({
url: url,
filename: filename,
saveAs: true
});
// Save to history
await saveToHistory(filename, format);
showStatus(`Successfully exported to ${filename}`, 'success');
} catch (error) {
console.error('Export error:', error);
showStatus('Export failed: ' + error.message, 'error');
}
}
// Save export to history
async function saveToHistory(filename, format) {
const history = await chrome.storage.local.get('exportHistory') || { exportHistory: [] };
history.exportHistory.unshift({
filename,
format,
timestamp: new Date().toISOString(),
url: (await chrome.tabs.query({ active: true, currentWindow: true }))[0].url
});
// Keep only last 50 exports
history.exportHistory = history.exportHistory.slice(0, 50);
await chrome.storage.local.set(history);
}
// Show status message
function showStatus(message, type) {
statusMessage.textContent = message;
statusMessage.className = type;
setTimeout(() => {
statusMessage.className = '';
}, 3000);
}
// Event listeners
exportBtn.addEventListener('click', exportData);
formatSelect.addEventListener('change', updatePreview);
delimiterSelect.addEventListener('change', updatePreview);
// Initialize
const data = await detectData();
if (data) {
renderDetectedData(data);
}
});
This popup script handles the complete export workflow. It communicates with the content script to detect and extract data, processes the data into the selected format, and triggers downloads using the Chrome downloads API. The export history feature helps users keep track of their exports.
Adding PDF Export Support
For PDF export, we need to integrate a library. The simplest approach is to use jsPDF, a popular JavaScript PDF generation library. Let us update our manifest to include the library and modify the popup to support PDF generation.
First, download jsPDF from a CDN or include it in your project. Then update the manifest to declare it as a web-accessible resource:
{
"web_accessible_resources": [
{
"resources": ["jspdf.umd.min.js"],
"matches": ["<all_urls>"]
}
]
}
Update the popup.js to include PDF generation:
// Add PDF export function
async function exportToPDF(data, filename) {
// Dynamically load jsPDF if not already loaded
if (!window.jspdf) {
await loadScript('jspdf.umd.min.js');
}
const { jsPDF } = window.jspdf;
const doc = new jsPDF();
// Add title
doc.setFontSize(16);
doc.text('Data Export', 20, 20);
// Add metadata
doc.setFontSize(10);
doc.text(`Generated: ${new Date().toLocaleString()}`, 20, 30);
doc.text(`Source: ${data.source || 'Web Page'}`, 20, 36);
// Add table data
doc.setFontSize(12);
let yPos = 50;
if (data.headers && data.data) {
// Table headers
doc.setFont(undefined, 'bold');
data.headers.forEach((header, i) => {
doc.text(String(header), 20 + (i * 40), yPos);
});
yPos += 10;
// Table rows
doc.setFont(undefined, 'normal');
data.data.forEach(row => {
if (yPos > 270) {
doc.addPage();
yPos = 20;
}
data.headers.forEach((header, i) => {
const value = String(row[header] || '').substring(0, 30);
doc.text(value, 20 + (i * 40), yPos);
});
yPos += 8;
});
}
// Save the PDF
doc.save(filename);
}
This PDF implementation creates formatted documents with headers, metadata, and tabular data. The jsPDF library handles the complex PDF generation process, making it easy to create professional-looking exports.
Testing Your Extension
Before deploying your extension, thorough testing ensures everything works correctly. Load your extension in Chrome by navigating to chrome://extensions, enabling Developer Mode, and clicking Load Unpacked. Select your extension folder to install it.
Test the following scenarios to verify functionality. Visit a page with HTML tables and verify detection works correctly. Select different data items and ensure the preview updates appropriately. Try exporting to each format and verify the files generate correctly. Test with different delimiter options for CSV export. Check that export history saves properly. Test on multiple websites with varying data structures.
Pay special attention to error handling. What happens when no data is detected? What occurs when the export fails? Your extension should handle these situations gracefully with clear error messages.
Publishing to the Chrome Web Store
Once your extension is tested and working, you can publish it to the Chrome Web Store. First, create a developer account at the Chrome Web Store if you do not have one. Package your extension by going to chrome://extensions and clicking Pack Extension. Select your extension folder and optionally specify a private key.
Navigate to the Chrome Web Store Developer Dashboard and click Add New Item. Upload your packaged extension ZIP file. Fill in the required information including the extension name, description, and screenshots. Set appropriate categories and keywords to help users find your extension.
The keywords you use significantly impact discoverability. Include variations like data export extension, export to csv chrome, web data exporter extension, chrome extension csv export, and json export chrome extension throughout your description. These keywords match what users search for when looking for data export tools.
Submit your extension for review. Google reviews typically complete within a few hours to a few days. Once approved, your extension becomes available to all Chrome users.
Conclusion
Congratulations! You have built a complete data export Chrome extension capable of detecting, extracting, and exporting web data in multiple formats. This extension demonstrates core Chrome extension development concepts including content script injection, message passing, file generation, and download handling.
The skills you have learned in building this extension transfer directly to many other extension types. Data extraction, file generation, and user preference storage are fundamental capabilities used in countless successful Chrome extensions.
Consider extending this project with additional features. Add support for scheduled exports, cloud storage integration, or data transformation options. Implement data cleaning features to remove duplicates or filter specific content. Create templates for common export scenarios. The foundation you have built provides endless possibilities for expansion.
Building useful tools for yourself and others is one of the most rewarding aspects of extension development. Your data export extension solves real problems for researchers, marketers, and data analysts. With further development and refinement, it has the potential to become a valuable asset in the Chrome Web Store.
Turn Your Extension Into a Business
Ready to monetize? The Extension Monetization Playbook covers freemium models, Stripe integration, subscription architecture, and growth strategies for Chrome extension developers.
Built by theluckystrike at zovo.one