Constellation Harvest Regularization (CHR)
Arrange your document into data constellations for maximum harvestable energy.
Upload a .docx file. We embed each unit (paragraphs or sentences), then optimize a set of constellation directions to reduce range entropy and align slabs (the CHR principle).
You’ll get:
- A harvest score (MHEP) showing how much structure we extracted.
- A constellation map (2D PCA) with anchors (★) and your units as points.
- A structured table grouped by constellation and ordered along each ray.
- CSV/JSON exports for your pipeline.
How it Works (Short Version)
- We convert your document into units (paragraphs by default; you can switch to sentences).
- We compute embeddings (MiniLM or a local fallback).
- We initialize K anchor directions and iteratively adjust them to lower the global range entropy while forming low-entropy slabs along each anchor.
- The Maximum Harvestable Energy Potential (MHEP) combines two normalized entropy drops: the drop in global range entropy and the drop in slab entropy.
- We then group units by constellation and order them radially, making the dataset easier to exploit downstream (routing, chunking, sparsity).
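The last two steps above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names (`assign_and_order`, `histogram_entropy`, `mhep_score`), the histogram-based entropy estimate, the use of projection length as the "radial" order, and the equal weighting of the two entropy drops are all assumptions made for the example. Embeddings are assumed to be non-zero vectors.

```python
import numpy as np

def assign_and_order(embeddings, anchors):
    """Group units by best-matching anchor, then sort each group along its ray.

    Assumption: "best-matching" means highest cosine similarity, and the
    position along a ray is the projection of the unit onto the unit-length anchor.
    """
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    A = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    labels = (E @ A.T).argmax(axis=1)          # hard assignment, shape (n_units,)
    # Projection of each raw embedding onto its own anchor direction.
    proj = (embeddings @ A.T)[np.arange(len(embeddings)), labels]
    groups = {}
    for k in range(len(anchors)):
        members = np.flatnonzero(labels == k)
        groups[k] = members[np.argsort(proj[members])].tolist()
    return labels, proj, groups

def histogram_entropy(values, bins=8):
    """Shannon entropy (nats) of a histogram over `values` (assumed estimator)."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())

def normalized_drop(before, after):
    """Fractional entropy reduction, clipped to [0, 1]."""
    if before <= 0:
        return 0.0
    return float(np.clip((before - after) / before, 0.0, 1.0))

def mhep_score(global_before, global_after, slab_before, slab_after, w_global=0.5):
    """Weighted mix of the two normalized drops (equal weights are an assumption)."""
    return (w_global * normalized_drop(global_before, global_after)
            + (1.0 - w_global) * normalized_drop(slab_before, slab_after))
```

With two axis-aligned anchors, units pointing mostly along the first axis land in constellation 0 and are listed from the shortest projection to the longest. The score stays in [0, 1], where 0 means no entropy was removed.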
Tip: Increase K for more granular constellations; increase iterations or beta for sharper structures.
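To see why a larger beta could give sharper structures, here is a small sketch. It assumes beta acts as an inverse temperature in a softmax over unit-to-anchor similarities. The page does not define beta, so this reading and the helper name `soft_assign` are hypothetical.

```python
import numpy as np

def soft_assign(similarities, beta):
    """Softmax over anchors; `beta` is treated as an inverse temperature (assumption)."""
    z = beta * np.asarray(similarities, dtype=float)
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    w = np.exp(z)
    return w / w.sum(axis=1, keepdims=True)

# One unit compared against K = 3 anchors (made-up similarities).
sims = np.array([[0.6, 0.5, 0.1]])
soft = soft_assign(sims, beta=1.0)    # weight spread across anchors
sharp = soft_assign(sims, beta=20.0)  # weight concentrated on the best anchor
```

Under this assumption, a low beta leaves each unit shared among several constellations, while a high beta pushes it toward a single anchor, which is one plausible meaning of "sharper".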
Structured Output (head)