File Upload Integration

ACP’s Product Feed file upload uses a push-based SFTP model. Merchants proactively push product data files to an SFTP server designated by OpenAI, rather than having OpenAI pull from the merchant.

3.1 SFTP Push Model

Merchant Data System
  | SFTP upload (merchant-initiated push)
OpenAI SFTP Server
  | automatic parsing and indexing
ChatGPT Product Discovery Engine

Key point: This is a one-way push. Merchants control the upload timing and frequency. OpenAI does not actively pull data from merchant systems. SFTP credentials are provided by OpenAI after the merchant receives partner approval.
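The merchant-initiated push can be sketched as a small helper. This is a minimal illustration, not an official client: the function name `push_files` and the remote directory are hypothetical, and `sftp` is assumed to be any object exposing a `put(localpath, remotepath)` method, such as the client returned by paramiko's `SSHClient.open_sftp()`. The real host, credentials, and directory layout come from OpenAI after partner approval.

```python
import posixpath
from pathlib import Path


def push_files(sftp, local_paths, remote_dir):
    """Upload each local file into remote_dir, keeping its file name.

    `sftp` is any object with a put(localpath, remotepath) method.
    Returns the list of remote paths that were written.
    """
    uploaded = []
    for local in local_paths:
        # Remote SFTP paths use forward slashes regardless of the local OS.
        remote = posixpath.join(remote_dir, Path(local).name)
        sftp.put(str(local), remote)
        uploaded.append(remote)
    return uploaded
```

Because the file name is preserved, uploading the same shard again overwrites the previous copy on the server.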

3.2 Supported File Formats

Format	Compression	Recommendation	Description
Parquet	zstd	Recommended	Columnar storage, highest compression efficiency
jsonl.gz	gzip	Optional	JSON Lines format, one record per line
csv.gz	gzip	Optional	Comma-separated, requires UTF-8 encoding
tsv.gz	gzip	Optional	Tab-separated, requires UTF-8 encoding

Encoding requirement: All text files must use UTF-8 encoding. Parquet with zstd compression is the preferred option because:

Columnar storage natively supports efficient field-level reads
zstd achieves higher compression ratios than gzip with faster decompression
Built-in schema information reduces type ambiguity
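For merchants who start with one of the optional text formats, the following sketch writes a jsonl.gz file using only the Python standard library. The function name `write_jsonl_gz` and the record fields are illustrative assumptions. Writing Parquet requires a third-party library; with pyarrow, for example, `pyarrow.parquet.write_table` accepts `compression="zstd"`.

```python
import gzip
import json


def write_jsonl_gz(records, path):
    """Write one JSON object per line, UTF-8 encoded and gzip-compressed."""
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as fh:
        for record in records:
            # ensure_ascii=False keeps non-ASCII text as real UTF-8 characters
            # instead of \uXXXX escapes.
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```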

3.3 Snapshot Type: Full Catalog Override

ACP file uploads use a full catalog snapshot model, not an incremental (delta) model. Each uploaded file represents the complete source of truth for the product catalog. This means:

The uploaded file contains complete information for all active products
Each upload fully replaces the previous data
There is no need to mark operations as “add”, “modify”, or “delete”
If a product is absent from the latest snapshot, it is treated as delisted

Day 1 snapshot: [Product A, Product B, Product C]  -> catalog = A, B, C
Day 2 snapshot: [Product A, Product C, Product D]  -> catalog = A, C, D (B removed, D added)
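The effect of replacing one snapshot with the next can be previewed on the merchant side with a set comparison. This is a sketch for sanity-checking a snapshot before upload; the function name `diff_snapshots` is hypothetical and the code does not describe how OpenAI processes files internally.

```python
def diff_snapshots(previous_ids, latest_ids):
    """Compare two snapshots by product ID.

    Returns which products the latest snapshot adds, removes, and keeps
    relative to the previous one.
    """
    previous, latest = set(previous_ids), set(latest_ids)
    return {
        "added": sorted(latest - previous),
        "removed": sorted(previous - latest),
        "kept": sorted(previous & latest),
    }
```

Checking the size of the "removed" list before uploading is a cheap guard against accidentally delisting a large part of the catalog with a truncated export.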

3.4 Sharding Strategy

Large product catalogs need to be sharded for upload. Sharding guidelines:

Parameter	Recommended Value
Max products per shard	500,000 items
Target file size	Under 500 MB

Sharding example:

# Sharding plan for a 1 million product catalog
products_shard_001.parquet  -> Products 1 - 500,000
products_shard_002.parquet  -> Products 500,001 - 1,000,000

When sharding, ensure each product (including all its Variants) is contained entirely within a single shard file. Do not split different Variants of the same Product across different files.
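A shard plan along these lines can be sketched as follows. The sketch assumes each product record carries its Variants as a nested list, so cutting only between records never splits a product across files. It also assumes the 500,000 limit counts products rather than Variants. The function name `plan_shards` is hypothetical, and the file size limit still has to be checked separately after each file is written.

```python
def plan_shards(products, max_per_shard=500_000):
    """Split a list of product records into named shards.

    Each record is assumed to contain its own variants, so a product is
    never divided between two shards.
    Returns a list of (file_name, records) pairs.
    """
    if max_per_shard < 1:
        raise ValueError("max_per_shard must be at least 1")
    shards = []
    for start in range(0, len(products), max_per_shard):
        index = start // max_per_shard + 1
        # Stable, zero-padded names: products_shard_001.parquet, ...
        name = f"products_shard_{index:03d}.parquet"
        shards.append((name, products[start:start + max_per_shard]))
    return shards
```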

3.5 Upload Frequency

Strategy	Frequency	Purpose
SFTP full snapshot	At least once daily	Product catalog baseline sync
REST API incremental	Real-time throughout the day	Price, inventory, promotion changes

Recommended approach: Upload a complete SFTP full snapshot at least once daily (for example, in the early morning), and push real-time changes via REST API during the day (price adjustments, inventory updates, new product launches). This dual-channel strategy ensures:

Full snapshots provide a data consistency baseline
API incremental updates guarantee data freshness
Even if the API encounters brief issues, the full snapshot corrects data the next day
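The interaction between the two channels can be illustrated with a toy model. This is only a conceptual sketch of the behavior described above, not OpenAI's implementation; the function names and the numeric `price` field are hypothetical.

```python
def apply_snapshot(snapshot_records):
    """A full snapshot replaces the whole catalog."""
    return {record["id"]: dict(record) for record in snapshot_records}


def apply_update(catalog, product_id, changes):
    """An incremental update changes fields on a single product."""
    catalog.setdefault(product_id, {"id": product_id}).update(changes)
    return catalog
```

In this model, a price change that fails to reach the API leaves the catalog stale only until the next call to `apply_snapshot`, which is why the daily snapshot acts as the consistency baseline.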

3.6 File Naming Conventions

Use stable, consistent file names. Uploading a file with the same name overwrites the previous content.

# Correct: fixed file names, overwrite each time
products_shard_001.parquet
products_shard_002.parquet

# Incorrect: do not append timestamps to file names
products_20260411_001.parquet
products_20260412_001.parquet

Do not append timestamps or dates to file names. ACP expects stable file names whose content is overwritten on each upload. Timestamped names leave old files on the server, because they are not cleaned up automatically, and this can cause data inconsistency.
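A pre-upload check can catch date-stamped names before they reach the server. The sketch below is a heuristic under an assumed convention: it flags any standalone run of eight digits that looks like a YYYYMMDD date. The function name `has_date_stamp` is hypothetical.

```python
import re

# Eight digits, not adjacent to other digits, that form a plausible
# YYYYMMDD date (years 1900-2099).
_DATE_STAMP = re.compile(
    r"(?<!\d)(19|20)\d{2}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])(?!\d)"
)


def has_date_stamp(file_name):
    """Return True if the file name appears to contain a YYYYMMDD date."""
    return _DATE_STAMP.search(file_name) is not None
```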

3.7 Product Delisting

In the full snapshot model, there are two ways to delist a product:

Method 1: Omit from the snapshot
The simplest approach. If the product is not included in the next SFTP full snapshot, it naturally disappears from the catalog.

Method 2: Set is_eligible_search to false
Keep the product record in the snapshot but set the is_eligible_search field to false. The product data still exists but will not appear in ChatGPT’s product discovery.

{
  "id": "prod_discontinued_001",
  "is_eligible_search": false,
  "variants": [
    {
      "id": "var_001",
      "title": "Discontinued Product",
      "price": { "amount": 0, "currency": "USD" }
    }
  ]
}

Method 2 is suitable for scenarios where you need to retain the product record but temporarily hide it (e.g., seasonal products, temporarily out of stock).
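Both delisting methods can be applied while assembling the next snapshot. The following is a minimal sketch; the function name `build_snapshot` and its parameters are hypothetical, and only the `id` and `is_eligible_search` fields come from the example above.

```python
def build_snapshot(products, hide_ids=(), remove_ids=()):
    """Build the next snapshot from the current product records.

    Products in remove_ids are omitted entirely (Method 1).
    Products in hide_ids are kept with is_eligible_search set to False
    (Method 2). The input records are not modified.
    """
    hide, remove = set(hide_ids), set(remove_ids)
    snapshot = []
    for product in products:
        if product["id"] in remove:
            continue  # Method 1: absent from the snapshot means delisted
        record = dict(product)  # shallow copy so the source stays untouched
        if record["id"] in hide:
            record["is_eligible_search"] = False  # Method 2: hidden, not deleted
        snapshot.append(record)
    return snapshot
```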

3.8 Feed Header

Every uploaded file must include Feed Header information identifying the data source and target:

Field	Type	Description
`feed_id`	string	Unique identifier for the data feed
`account_id`	string	Merchant account ID
`target_merchant`	string	Target merchant identifier
`target_country`	string	Target country (ISO 3166-1 alpha-2)

{
  "feed_id": "feed_electronics_us",
  "account_id": "acct_merchant_123",
  "target_merchant": "merchant_xyz",
  "target_country": "US"
}
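A header can be checked for the four fields before a file is produced. This sketch only verifies presence and basic shape. It does not check `target_country` against the actual ISO 3166-1 list, and it makes no assumption about how the header is embedded in each file format. The names `REQUIRED_HEADER_FIELDS` and `validate_feed_header` are hypothetical.

```python
import re

REQUIRED_HEADER_FIELDS = ("feed_id", "account_id", "target_merchant", "target_country")


def validate_feed_header(header):
    """Return a list of problems; an empty list means the header passed."""
    problems = []
    for field in REQUIRED_HEADER_FIELDS:
        value = header.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: missing or not a non-empty string")
    country = header.get("target_country")
    # Shape check only: two uppercase letters, as in alpha-2 codes.
    if isinstance(country, str) and country.strip() and not re.fullmatch(r"[A-Z]{2}", country):
        problems.append("target_country: expected a two-letter uppercase code")
    return problems
```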

3.9 Best Practices Checklist

Practice	Description
Use Parquet + zstd	Highest compression efficiency, fastest parsing
Maintain UTF-8 encoding	Avoid character set issues
Daily full + API incremental	Dual channel ensures data freshness
Fixed file names with overwrite	Do not append timestamped files
Max 500,000 items per shard	Maintain processing efficiency
Max 500 MB per file	Avoid upload timeouts
Complete product in one shard	Product and Variants must not span files
Monitor upload status	Confirm SFTP transfers complete successfully

Next chapter: Chapter 4: REST API Integration — Product Feed REST API complete reference
