User Guide

Welcome to Page2Table AI! This guide will help you get started quickly and make the most of all the extension's features.

Getting Started

Install the Extension

  1. Download and install the Page2Table AI extension from the Chrome Web Store.
  2. After installation, click the extension icon in the browser toolbar.
  3. On first use, you'll need to log in with your Google account (for AI analysis services).

Note: The extension requires access to webpage content to extract data. These permissions are only used when you actively click the "Convert Current Page" button.

First Extraction

  1. Open the webpage you want to extract data from (e.g., product list page, article list, etc.).
  2. Click the extension icon to open the sidebar workbench.
  3. Click the "⚡️ Convert Current Page" button in the top left.
  4. Wait for AI analysis to complete (first analysis may take a few seconds).
  5. Data will be displayed in table format in the workbench.

Basic Usage

Convert Current Page

This is the core feature of the extension. AI automatically analyzes page structure, identifies structured data such as lists and tables, and generates extraction schemas.

💡 Tip: If the page contains multiple logical data areas (such as "Product List" and "Filter Conditions"), they will be automatically separated into different worksheets, which you can switch between using the worksheet tabs at the bottom.

View Extracted Data

Export Data

Click the "🗂️ Local Storage" button in the top right to open the file manager:

Advanced Features

🔗 Drill Down to Detail Pages

In the extracted table, click a cell containing a link or the ⚡️ icon next to it to automatically extract detail page data.

  1. After clicking the link, AI will automatically analyze the detail page using the default extraction schema.
  2. On first drill-down, AI will learn and cache the extraction schema for this type of page.
  3. Subsequent drill-downs to similar pages will directly reuse the cached schema for instant extraction.
  4. After extraction is complete, the detail page data opens as a new child file in the workbench, and the icon in the original table changes to show that the row has been extracted.

🔄 Smart Pagination Collection

If a list worksheet is identified as having pagination functionality, pagination controls will appear below the table.

  1. The pagination control will show the pagination mode identified by AI (e.g., "Click Next Page" or "Infinite Scroll").
  2. Set the maximum number of pages (or scroll times) you want to capture in the input box.
  3. Click the "Start Pagination" button, and the program will automatically simulate click or scroll operations.
  4. Newly captured data will be deduplicated and appended to the end of the current table. You can click the "Stop" button at any time to interrupt.
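
The capture loop described in these steps can be pictured with a short Python sketch. It is an illustration only, not the extension's actual code: the names paginate, row_key, and should_stop are hypothetical, and it assumes that two rows count as duplicates when every field matches, which the guide does not specify.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def row_key(row: Row) -> tuple:
    # Assumption: two rows are duplicates when every field matches.
    return tuple(sorted(row.items()))


def paginate(
    table: List[Row],
    pages: Iterable[List[Row]],
    max_pages: int,
    should_stop: Callable[[], bool] = lambda: False,
) -> int:
    """Append unseen rows from up to max_pages pages; return pages captured."""
    seen = {row_key(row) for row in table}
    page_iter = islice(pages, max_pages)  # never pulls more than max_pages
    captured = 0
    while not should_stop():  # models the "Stop" button
        rows = next(page_iter, None)
        if rows is None:  # page limit reached or no further pages
            break
        for row in rows:
            key = row_key(row)
            if key not in seen:  # skip rows already in the table
                seen.add(key)
                table.append(row)
        captured += 1
    return captured


# Hypothetical data: the table already holds row A from the first extraction.
table = [{"name": "A"}]
pages = iter([
    [{"name": "A"}, {"name": "B"}],
    [{"name": "B"}, {"name": "C"}],
    [{"name": "D"}],
])
captured = paginate(table, pages, max_pages=2)
print(captured, [row["name"] for row in table])
```

In this sketch the third page is never requested because the page limit is 2, and the repeated rows A and B are appended only once.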

Batch Drill-Down

If a column contains multiple drill-down links, a ⚡️ batch drill-down button will appear in the column header.

  1. Click the batch drill-down button. If there's no cached schema for this type of page, the program will prompt you to enter extraction requirements for the detail page (you can use the default prompt).
  2. After confirmation, the program will automatically access all unextracted links in that column in the background.
  3. Icons in the corresponding table rows will update in real time (loading 🔄, then completed).
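
As a rough mental model of the batch drill-down steps, the following Python sketch walks a column of links, skips rows that were already extracted, and updates a per-row status. Everything here is hypothetical: the function name batch_drill_down, the "status" and "child" keys, and the extract callback are stand-ins chosen for illustration, and error handling is left out.

```python
from typing import Callable, Dict, List


def batch_drill_down(
    rows: List[Dict[str, object]],
    link_column: str,
    extract: Callable[[str], Dict[str, str]],
) -> int:
    """Extract every not-yet-extracted link in one column; return how many ran."""
    processed = 0
    for row in rows:
        url = row.get(link_column)
        # Skip rows without a link and rows that were already extracted.
        if not url or row.get("status") == "completed":
            continue
        row["status"] = "loading"      # shown as the loading icon
        row["child"] = extract(url)    # detail page data becomes a child record
        row["status"] = "completed"
        processed += 1
    return processed


def fake_extract(url: str) -> Dict[str, str]:
    # Stand-in for the real detail page extraction.
    return {"source": url}


rows = [
    {"link": "https://example.com/item/1", "status": "completed"},
    {"link": "https://example.com/item/2"},
    {"link": "https://example.com/item/3"},
    {"link": ""},
]
processed = batch_drill_down(rows, "link", fake_extract)
print(processed)
```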

📦 Batch Operations

After batch drill-down is complete, a batch operations panel will appear. If the newly generated child pages also contain paginated lists, options will be provided to perform batch pagination on all these child pages.

💡 Tip: Chaining batch operations multiplies the amount of data you can collect. For example, first batch drill down into 100 product detail pages, then run batch pagination on each of those detail pages to quickly collect a large amount of data.

🧠 Schema Cache & Reuse

For similar pages on the same website (such as multiple product detail pages), Page2Table AI only needs to analyze once. Subsequent extractions will automatically reuse cached extraction schemas, greatly improving speed and saving AI call costs.

Note: Cached extraction schemas only contain data structure information (such as field names, selectors, etc.) and do not contain any original webpage content, ensuring data privacy and security.
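
To make the caching idea concrete, here is a minimal Python sketch. It is not the extension's implementation. It assumes, purely for illustration, that "similar pages" are recognized by hostname plus a URL path in which any segment containing digits is treated as an ID; the names page_type_key, SchemaCache, and fake_analyze are hypothetical. The cached value holds only field names and selectors, in line with the note above.

```python
import re
from typing import Callable, Dict
from urllib.parse import urlparse

Schema = Dict[str, str]  # field name -> selector


def page_type_key(url: str) -> str:
    """Assumed grouping rule: same host, same path shape, IDs ignored."""
    parts = urlparse(url)
    segments = [
        "{id}" if re.search(r"\d", segment) else segment
        for segment in parts.path.split("/")
    ]
    return parts.netloc + "/".join(segments)


class SchemaCache:
    def __init__(self, analyze: Callable[[str], Schema]) -> None:
        self._analyze = analyze          # the expensive AI analysis step
        self._schemas: Dict[str, Schema] = {}
        self.analysis_calls = 0

    def get_schema(self, url: str, html: str) -> Schema:
        key = page_type_key(url)
        if key not in self._schemas:     # first page of this type: analyze once
            self.analysis_calls += 1
            self._schemas[key] = self._analyze(html)
        return self._schemas[key]        # later pages reuse the cached schema


def fake_analyze(html: str) -> Schema:
    # Stand-in for the AI analysis; returns structure only, no page content.
    return {"title": "h1", "price": ".price"}


cache = SchemaCache(fake_analyze)
first = cache.get_schema("https://shop.example.com/product/123", "<html>...</html>")
second = cache.get_schema("https://shop.example.com/product/456", "<html>...</html>")
print(cache.analysis_calls)
```

Both product URLs map to the same key, so the second call reuses the schema produced by the first.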

Data Management

Local Storage

All extracted data, file relationships, and extraction schemas are securely stored in your local browser:

File Manager

Click the "🗂️ Local Storage" button in the top right to:

FAQ

Q: Why do I need to log in with a Google account?

A: Your Google account is used for user authentication and for managing calls to the AI analysis service. We collect only your email address and no other sensitive information.

Q: Will extracted data be sent to the server?

A: No. All extracted data is stored locally in your browser. Only HTML content used to generate extraction schemas is temporarily sent to the server for AI analysis and deleted immediately after analysis is complete.

Q: How can I improve extraction accuracy?

A: Make sure the page is fully loaded before extraction. For dynamically loaded content, wait for all content to finish loading. If the extraction results are not ideal, try clicking the "Force Reanalyze" button to let AI reanalyze the page structure.

Q: What types of webpages are supported?

A: Page2Table AI can extract data from any webpage that contains structured data, including product lists, article lists, search results, and tables. For dynamically loaded content, wait for the page to finish loading before extraction.

Q: Can I extract webpages that require login?

A: Yes. As long as you are logged in and can normally access the webpage in your browser, the extension can extract data from it.

Q: What is the relationship ID in exported Excel files?

A: Relationship IDs are used to identify parent-child relationships between data. The parent_file_sheet_row_id field in child file data corresponds to the unique ID (file_sheet_row_id) of a row in the parent file, making it easy to reconstruct data relationships in databases or BI tools.
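
As an illustration of how these two fields can be used after export, the Python sketch below links each child row back to its parent row. Only the field names file_sheet_row_id and parent_file_sheet_row_id come from this guide; the ID values, the other columns, and the helper name attach_parents are made up for the example, and the rows are assumed to have already been loaded from the exported file into dictionaries.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]


def attach_parents(parent_rows: List[Row], child_rows: List[Row]) -> List[Dict[str, Optional[Row]]]:
    """Pair each child row with the parent row its relationship ID points to."""
    parents_by_id = {row["file_sheet_row_id"]: row for row in parent_rows}
    joined = []
    for child in child_rows:
        parent = parents_by_id.get(child["parent_file_sheet_row_id"])
        joined.append({"parent": parent, "child": child})
    return joined


# Hypothetical exported rows; the real ID format may differ.
parent_rows = [
    {"file_sheet_row_id": "p-1", "product": "Lamp"},
    {"file_sheet_row_id": "p-2", "product": "Desk"},
]
child_rows = [
    {"file_sheet_row_id": "c-1", "parent_file_sheet_row_id": "p-2", "color": "Oak"},
    {"file_sheet_row_id": "c-2", "parent_file_sheet_row_id": "p-1", "color": "White"},
]

joined = attach_parents(parent_rows, child_rows)
for pair in joined:
    print(pair["parent"]["product"], pair["child"]["color"])
```

The same lookup corresponds to a join on parent_file_sheet_row_id = file_sheet_row_id in a database or BI tool.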

Troubleshooting

Analysis Failed

Incomplete Data

Extension Not Working

⚠️ Important Note: If you encounter any issues, first try refreshing the page or reopening the extension. If the problem persists, please contact us via email and we will resolve it as soon as possible.
