| Metric | Value | Note |
| Total unique SKUs | 3,211 | Across all sources |
| Bilingual EN+FR | 2,659 | Names in both languages |
| Box artwork PDFs | 2,595 | Provided by client |
| Studio photos | 2,808 | Quasimodo product shots |
| Have UPC | 2,506 | Barcode populated |
Field coverage across all SKUs
| Field | Have | Missing | Coverage |
| Studio photo (Quasimodo) | 2,808 | 403 | 87.4% |
| Product category | 2,661 | 550 | 82.9% |
| EN name | 2,660 | 551 | 82.8% |
| FR name | 2,660 | 551 | 82.8% |
| Product subcategory | 2,623 | 588 | 81.7% |
| Box artwork PDF | 2,595 | 616 | 80.8% |
| UPC barcode | 2,506 | 705 | 78.0% |
| Short description | 2,141 | 1,070 | 66.7% |
| Site categories | 2,141 | 1,070 | 66.7% |
| Product images (current site) | 2,138 | 1,073 | 66.6% |
| Appears in a catalog PDF | 1,715 | 1,496 | 53.4% |
| WP full description | 0 | 3,211 | 0.0% |
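The Missing and Coverage figures follow directly from the Have counts and the 3,211 total. A minimal sketch of that arithmetic, assuming coverage is simply Have divided by total SKUs and rounded to one decimal (the function and dictionary names are illustrative, not part of the audit tooling):

```python
from typing import Tuple

TOTAL_SKUS = 3211  # total unique SKUs reported in this audit

# "Have" counts copied from three rows of the table; the dict name is illustrative.
have_counts = {
    "Studio photo (Quasimodo)": 2808,
    "UPC barcode": 2506,
    "WP full description": 0,
}

def coverage(have: int, total: int = TOTAL_SKUS) -> Tuple[int, float]:
    """Return (missing count, coverage percent rounded to one decimal)."""
    missing = total - have
    return missing, round(100 * have / total, 1)

for field, have in have_counts.items():
    missing, pct = coverage(have)
    print(f"{field}: {have} have, {missing} missing, {pct}% coverage")
```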
Per-SKU completeness score distribution
| Score band | SKUs | Distribution |
| 80–100 | 1,611 | 50.2% |
| 60–79 | 3 | 0.1% |
| 40–59 | 1,053 | 32.8% |
| 20–39 | 521 | 16.2% |
| 0–19 | 23 | 0.7% |
Migration action breakdown
| Action | SKUs | Meaning |
| preserve | 1,100 | Live on jessar.ca AND active per masters → migrate as-is |
| add | 567 | Active per masters but missing from current site → must add |
| cleanup | 81 | Live on jessar.ca but discontinued per masters → remove before migration |
| investigate | 366 | Live on jessar.ca but not in any master → status unclear, investigate |
| skip | 1,097 | Not live and not active, OR excluded — out of scope |
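The Meaning column amounts to a small decision rule over two facts per SKU: whether it is live on jessar.ca and what the masters say about it. A hedged sketch of that rule follows. The function name, the `master_status` values and the choice to check `excluded` first are assumptions made for illustration; the actual pipeline may order or encode these checks differently.

```python
from typing import Optional

def migration_action(live: bool, master_status: Optional[str],
                     excluded: bool = False) -> str:
    """Map a SKU to one of the five migration actions.

    master_status is "active", "discontinued", or None when the SKU
    appears in no master (hypothetical encoding).
    """
    if excluded:
        return "skip"  # assumption: exclusion overrides everything else
    if master_status == "active":
        return "preserve" if live else "add"
    if live:
        # Live on the site but not active per masters
        return "cleanup" if master_status == "discontinued" else "investigate"
    return "skip"  # not live and not active

print(migration_action(live=True, master_status="active"))   # preserve
print(migration_action(live=False, master_status="active"))  # add
print(migration_action(live=True, master_status=None))       # investigate
```

Note that preserve (1,100) plus add (567) gives the 1,667 SKUs that are active per masters.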
Content Extraction Pipeline
Phase 1 output — 1,667 primary SKUs · box artwork PDFs + WooCommerce specs + Excel merge · 2026-05-03
| Metric | Value | Note |
| Primary SKUs processed | 1,667 | Entire primary tier |
| PDFs extracted | 1,557 | 110 have no PDF on drive |
| Avg fields populated | 8.3/17 | Across all primary SKUs |
| Fully populated | 0 | No SKU hits all 17 fields* |
| Weight sourced from Excel | 969 | Items-Jessar for Sean.xlsx |
Extracted field coverage — 1,667 primary SKUs
Fields marked ⚑ are category-specific and have lower nominal coverage because they apply only to a subset of products; see the note below the table.
| Field | Have | Missing | Coverage | Status / Source |
| Product name (EN) | 1,556 | 111 | 93.3% | From PDF + DB |
| Product name (FR) | 1,556 | 111 | 93.3% | From PDF + DB |
| Dimensions | 1,570 | 97 | 94.2% | PDF text + Excel (W×L×D) |
| Weight | 1,005 | 662 | 60.3% | Excel poid-stoc (kg); gaps may lack data in any source |
| Features (EN) | 1,030 | 637 | 61.8% | Bullet points from PDF |
| Features (FR) | 988 | 679 | 59.3% | Bullet points from PDF |
| Country of origin | 911 | 756 | 54.6% | From PDF text |
| Materials | 898 | 769 | 53.9% | From PDF text |
| Romance copy (EN) | 822 | 845 | 49.3% | → Generate for 845 SKUs |
| Romance copy (FR) | 765 | 902 | 45.9% | → Generate for 902 SKUs |
| Warranty | 480 | 1,187 | 28.8% | From PDF text |
| Voltage ⚑ | 609 | 1,058 | 36.5% | 76% within electrical/lighting (678 SKUs) |
| Wattage ⚑ | 503 | 1,164 | 30.2% | 60.5% within electrical/lighting (678 SKUs) |
| Capacity | 337 | 1,330 | 20.2% | Only for applicable products (cookware, storage) |
| Certifications ⚑ | 234 | 1,433 | 14.0% | → Logos in PDF images; needs vision pass |
| Care instructions (EN) ⚑ | 206 | 1,461 | 12.4% | 33% within KITCH (608 SKUs); N/A for lighting |
| Care instructions (FR) ⚑ | 181 | 1,486 | 10.9% | 33% within KITCH; N/A for lighting/electrical |
* "Fully populated" at 17/17 would require certifications and care on every product, including bulbs and fixtures where those fields are structurally inapplicable. Adjusted completeness (excluding N/A fields per category) is tracked separately in content_gaps.csv.
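One way to picture the adjusted completeness mentioned in the footnote is to drop the inapplicable fields from the denominator before scoring. The sketch below is only an illustration of that idea: the field keys, the `LIGHT` category code and the `NOT_APPLICABLE` map are hypothetical (only `KITCH` appears in this report), and the method actually used to produce content_gaps.csv is not described here.

```python
# The 17 extracted fields from the table, with hypothetical key names.
ALL_FIELDS = [
    "name_en", "name_fr", "dimensions", "weight", "features_en",
    "features_fr", "country_of_origin", "materials", "romance_en",
    "romance_fr", "warranty", "voltage", "wattage", "capacity",
    "certifications", "care_en", "care_fr",
]

# Hypothetical map of fields treated as N/A per category.
NOT_APPLICABLE = {
    "LIGHT": {"care_en", "care_fr", "capacity"},
    "KITCH": {"voltage", "wattage"},
}

def adjusted_completeness(record: dict, category: str) -> float:
    """Share of applicable fields that hold a non-empty value."""
    skipped = NOT_APPLICABLE.get(category, set())
    applicable = [f for f in ALL_FIELDS if f not in skipped]
    filled = sum(1 for f in applicable if record.get(f))
    return filled / len(applicable)

# A lighting SKU with every applicable field filled scores 14/17 nominally
# but 14/14 once the N/A fields are excluded.
bulb = {f: "x" for f in ALL_FIELDS if f not in NOT_APPLICABLE["LIGHT"]}
print(len(bulb), adjusted_completeness(bulb, "LIGHT"))
```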
Extraction score distribution — 1,667 primary SKUs · out of 17 possible fields
| Score band | SKUs | Distribution |
| 80–100 (14–17 fields) | 24 | 1.4% |
| 60–79 (11–13 fields) | 475 | 28.5% |
| 40–59 (7–10 fields) | 657 | 39.4% |
| 20–39 (4–6 fields) | 383 | 23.0% |
| 0–19 (0–3 fields) | 128 | 7.7% |
The 128 SKUs at 0–19 are predominantly no-PDF records (no box artwork provided). The 383 at 20–39 typically have name + dimensions + weight from Excel, but no PDF text content.
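The mapping from field count to score band can be checked with a few lines, assuming the score is simply fields populated divided by 17, times 100 (an assumption; the report does not state the formula). Under that assumption 3 fields scores about 17.6 and 10 fields about 58.8, so each field count falls in exactly one band. The function name and the ASCII band labels are illustrative.

```python
def score_band(fields_populated: int, total_fields: int = 17) -> str:
    """Return the score band for a SKU, assuming score = populated / total * 100."""
    score = 100 * fields_populated / total_fields
    if score >= 80:
        return "80-100"
    if score >= 60:
        return "60-79"
    if score >= 40:
        return "40-59"
    if score >= 20:
        return "20-39"
    return "0-19"

# Group the possible field counts (0 through 17) by band.
bands = {}
for n in range(18):
    bands.setdefault(score_band(n), []).append(n)

for band, counts in bands.items():
    print(f"{band}: {counts[0]} to {counts[-1]} fields")
```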