API and Database Integration

Overview

For custom-built sites and non-standard e-commerce platforms, ORBEXA provides flexible data integration methods. When Shopify/WooCommerce/WordPress plugins are not applicable, merchants can bring product data into the platform through CSV import, ETL pipelines, or visual scraping.

CSV Import

Basic Flow

ORBEXA provides CSV file import functionality through the ETL Router:
  1. Upload CSV file — The merchant uploads a CSV file containing product data
  2. Field mapping — The platform provides a visual field mapping interface to map CSV column names to UCP/ACP standard fields
  3. Data validation — Automatically checks data format, required fields, and data types
  4. Import execution — After validation passes, product data is bulk imported
  5. AI cleaning — Imported data automatically enters the Refinery Pipeline for cleaning
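
The data validation step (step 3) can be pictured with a minimal sketch. It assumes rows have already been parsed and mapped, and the field names `sku`, `title`, and `price` are hypothetical placeholders, since this chapter does not list the actual UCP/ACP fields or ORBEXA's real validation rules.

```typescript
// Hypothetical target shape; the real UCP/ACP field names are not given in this chapter.
interface ProductRecord {
  sku: string;
  title: string;
  price: number;
}

// A parsed CSV row after field mapping: standard field name -> raw cell text.
type RawRow = Record<string, string>;

interface ValidationResult {
  valid: ProductRecord[];
  errors: { row: number; message: string }[];
}

// Checks required fields and data types, collecting errors instead of
// stopping at the first bad row so the merchant sees every problem at once.
function validateRows(rows: RawRow[]): ValidationResult {
  const result: ValidationResult = { valid: [], errors: [] };

  rows.forEach((row, index) => {
    const rowNumber = index + 1; // 1-based, counting data rows only
    const sku = (row["sku"] ?? "").trim();
    const title = (row["title"] ?? "").trim();
    const rawPrice = (row["price"] ?? "").trim();

    // Required-field check (the required set is an assumption).
    const missing: string[] = [];
    if (sku === "") missing.push("sku");
    if (title === "") missing.push("title");
    if (rawPrice === "") missing.push("price");
    if (missing.length > 0) {
      result.errors.push({
        row: rowNumber,
        message: `missing required field(s): ${missing.join(", ")}`,
      });
      return;
    }

    // Data-type check: price must parse to a finite, non-negative number.
    const price = Number(rawPrice);
    if (!Number.isFinite(price) || price < 0) {
      result.errors.push({
        row: rowNumber,
        message: `price is not a non-negative number: "${rawPrice}"`,
      });
      return;
    }

    result.valid.push({ sku, title, price });
  });

  return result;
}
```

In this sketch, import execution (step 4) would proceed only when `errors` is empty. Whether the platform rejects the whole file or imports the valid rows is not stated in this chapter.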

Field Mapping

Field mapping is the core of CSV import. Column names vary widely across different merchants’ CSV files. ORBEXA’s mapping engine supports:
  • Manual mapping: Merchants specify the corresponding standard field for each column in the interface
  • Smart suggestions: Automatically recommends mapping relationships based on column names and data content
  • Saved mappings: Mapping rules are saved for reuse, eliminating the need to reconfigure on subsequent imports
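
As a rough illustration of smart suggestions, the sketch below matches normalized column names against an alias table. The alias table, the field names, and the functions `normalizeColumnName` and `suggestMapping` are all hypothetical. The sketch covers only name-based matching, not the content-based analysis mentioned above, and is not ORBEXA's actual mapping engine.

```typescript
// Hypothetical standard field names and aliases, for illustration only.
const FIELD_ALIASES: Record<string, string[]> = {
  title: ["title", "name", "product name", "product title"],
  price: ["price", "unit price", "amount"],
  sku: ["sku", "item number", "product code"],
  description: ["description", "desc", "details"],
};

// Lowercase and collapse separators so "Product_Name" and "product-name"
// both compare equal to "product name".
function normalizeColumnName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Returns a suggestion per CSV column; null means no suggestion was found
// and the merchant would map the column manually.
function suggestMapping(columns: string[]): Record<string, string | null> {
  const suggestions: Record<string, string | null> = {};

  for (const column of columns) {
    const normalized = normalizeColumnName(column);
    let match: string | null = null;

    for (const field of Object.keys(FIELD_ALIASES)) {
      const aliases = FIELD_ALIASES[field] ?? [];
      if (aliases.indexOf(normalized) !== -1) {
        match = field;
        break;
      }
    }
    suggestions[column] = match;
  }

  return suggestions;
}
```

A saved mapping could then be as simple as this suggestion object after the merchant has confirmed or corrected it, stored and reapplied to later files that share the same header row.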

Visual Scraping

Stagehand Engine

For platforms that have no API and cannot export CSV files, ORBEXA provides visual scraping capabilities through visualScrapeRouter.ts. Powered by the Stagehand engine, it can browse web pages like a human and extract product information:
  • Automatically identifies product listing pages and detail pages
  • Extracts product name, price, images, description, and other fields
  • Handles dynamically rendered page content
  • Supports pagination and infinite scrolling

Applicable Scenarios

  • E-commerce sites built on traditional CMS platforms
  • Legacy platforms without API interfaces
  • Product data from third-party marketplaces or directories

Waterfall Data Ingestion

ORBEXA implements a waterfall data ingestion strategy:
WooCommerce REST API
        |
        v
  Success? --Yes--> Data stored
        |
        No
        |
        v
  Visual Scraping Fallback
        |
        v
    Data stored

  • API-first approach: If the platform provides a REST API (such as WooCommerce), data is fetched through the API first.
  • Automatic fallback to visual scraping: If the API is unavailable or data is incomplete, Stagehand visual scraping is automatically enabled as a fallback.
This strategy ensures ORBEXA can obtain product data regardless of the merchant’s technical setup.
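
The waterfall can be sketched as a small function that tries the API source first and falls back to scraping. This is an illustration under stated assumptions: `ProductSource`, `isComplete`, and `ingestWithFallback` are hypothetical names, and the completeness rule (at least one product, each with a name and a price) is a guess, since the chapter does not define what counts as incomplete data.

```typescript
interface Product {
  name: string;
  price: number | null;
}

// Any async function that yields products: an API client, a scraper, etc.
type ProductSource = () => Promise<Product[]>;

interface IngestResult {
  source: "api" | "scrape";
  products: Product[];
}

// Assumed completeness rule: non-empty, and every product has a name and a price.
function isComplete(products: Product[]): boolean {
  return (
    products.length > 0 &&
    products.every((p) => p.name.trim() !== "" && p.price !== null)
  );
}

// API first; fall back to scraping if the API throws or returns incomplete data.
async function ingestWithFallback(
  fetchFromApi: ProductSource,
  scrape: ProductSource
): Promise<IngestResult> {
  try {
    const products = await fetchFromApi();
    if (isComplete(products)) {
      return { source: "api", products };
    }
    // Incomplete data: fall through to the scraping fallback.
  } catch {
    // API unavailable: fall through to the scraping fallback.
  }
  return { source: "scrape", products: await scrape() };
}
```

Recording which source produced the data, as `IngestResult.source` does here, makes it possible to tell API-sourced records from scraped ones later. An error from the scraper itself is deliberately left unhandled in this sketch so the caller can decide what to do when both sources fail.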

Prism Pipeline

Prism Pipeline is ORBEXA’s intelligent data extraction engine:
  • Multi-source normalization: Normalizes data from different sources (CSV, API, scraping) into a unified format
  • Smart field recognition: Automatically identifies non-standard field names and maps them to UCP/ACP standard fields
  • Data quality scoring: Evaluates the completeness and quality of each product record
  • Anomaly detection: Flags price anomalies, missing descriptions, broken images, and other issues
Prism Pipeline serves as the preprocessing stage before data enters the AI Refinery Pipeline for cleaning.
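
To make data quality scoring and anomaly detection more concrete, here is a minimal sketch. The record shape, the flag names, the equal weighting of checks, and the function `assessQuality` are all assumptions for illustration; the chapter does not describe Prism Pipeline's actual scoring formula. Detecting broken images would require fetching each URL, so this sketch only flags records with no images at all.

```typescript
// Hypothetical normalized record shape, for illustration only.
interface NormalizedProduct {
  name: string;
  price: number | null;
  description: string;
  imageUrls: string[];
}

interface QualityReport {
  score: number; // 0 (every check failed) to 1 (every check passed)
  flags: string[];
}

// Runs four equally weighted checks and reports the fraction that passed.
function assessQuality(product: NormalizedProduct): QualityReport {
  const flags: string[] = [];

  if (product.name.trim() === "") flags.push("missing_name");
  // A missing, zero, or negative price is treated as a price anomaly here.
  if (product.price === null || product.price <= 0) flags.push("price_anomaly");
  if (product.description.trim() === "") flags.push("missing_description");
  if (product.imageUrls.length === 0) flags.push("missing_images");

  const totalChecks = 4;
  return { score: (totalChecks - flags.length) / totalChecks, flags };
}
```

For example, a record with a name and an image but a price of 0 and an empty description fails two of the four checks, so it scores 0.5 and carries the `price_anomaly` and `missing_description` flags.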

Integration Method Comparison

| Method           | Applicable Scenario      | Technical Requirements | Data Freshness      |
|------------------|--------------------------|------------------------|---------------------|
| Shopify OAuth    | Shopify merchants        | None                   | Real-time (Webhook) |
| WooCommerce API  | WooCommerce merchants    | Generate API keys      | Real-time (Webhook) |
| WordPress Plugin | Non-e-commerce WordPress | Install plugin         | Low frequency       |
| CSV Import       | Any platform             | Prepare CSV file       | Manually triggered  |
| Visual Scraping  | Platforms without APIs   | Provide URL            | Periodic scraping   |
| API Integration  | Custom-built sites       | API development        | On demand           |

Summary

API and database integration covers the scenarios that the Shopify, WooCommerce, and WordPress plugins do not. CSV import is the most universal bulk method, visual scraping handles platforms without APIs, the waterfall strategy provides a fallback when API ingestion fails or returns incomplete data, and Prism Pipeline normalizes, scores, and flags records before they enter the AI Refinery Pipeline.
Next chapter: MCP Server — 5 tools and 3 resources that let AI agents directly query products and inventory