Constellation Harvest Regularization (CHR)

Arrange your document into data constellations for maximum harvestable energy.
Upload a .docx file. We split it into units (paragraphs or sentences), embed each unit, then optimize a set of constellation directions to reduce range entropy and align slabs (the CHR principle).
You’ll get:

  • A harvest score (MHEP) showing how much structure we extracted.
  • A constellation map (2D PCA) with anchors (★) and your units as points.
  • A structured table grouped by constellation and ordered along each ray.
  • CSV/JSON exports for your pipeline.

Upload a file to begin.

How it Works (Short Version)

  • We convert your document into units (paragraphs by default; you can switch to sentences).
  • We compute embeddings (MiniLM or a local fallback).
  • We initialize K anchor directions and iteratively adjust them to lower the global range entropy while forming low-entropy slabs along each anchor.
  • The Maximum Harvestable Energy Potential (MHEP) combines the normalized drop in global and slab entropy.
  • We then group units by constellation and order them radially, making the dataset easier to exploit downstream (routing, chunking, sparsity).
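
The last three steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the app's actual code: it treats "range entropy" as the Shannon entropy of a histogram of projections onto a direction, and it treats MHEP as a weighted mean of the two normalized entropy drops. The function names, the bins parameter, and the weight parameter are all hypothetical.

```python
import math
from collections import Counter


def histogram_entropy(values, bins=8):
    """Shannon entropy (in nats) of a fixed-width histogram of `values`.

    Assumption: stands in for the page's "range entropy", which is not defined here.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # every value lands in the same bin
    width = (hi - lo) / bins
    # Clamp the top edge into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def normalized_drop(before, after):
    """Fraction of the initial entropy that was removed, clipped to [0, 1]."""
    if before <= 0:
        return 0.0
    return max(0.0, min(1.0, (before - after) / before))


def mhep(global_before, global_after, slab_before, slab_after, weight=0.5):
    """Hypothetical MHEP: weighted mean of the global and slab entropy drops."""
    return (weight * normalized_drop(global_before, global_after)
            + (1 - weight) * normalized_drop(slab_before, slab_after))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def group_and_order(embeddings, anchors):
    """Assign each unit to the anchor with the largest projection,
    then order each group by that projection (smallest first)."""
    groups = {k: [] for k in range(len(anchors))}
    for i, e in enumerate(embeddings):
        projections = [dot(e, a) for a in anchors]
        k = max(range(len(anchors)), key=projections.__getitem__)
        groups[k].append((projections[k], i))
    return {k: [i for _, i in sorted(members)] for k, members in groups.items()}


# Toy 2D "embeddings" and two anchor directions (made-up values).
units = [(1.0, 0.1), (0.2, 0.9), (0.5, 0.0), (0.0, 0.4)]
anchors = [(1.0, 0.0), (0.0, 1.0)]
constellations = group_and_order(units, anchors)
score = mhep(global_before=2.0, global_after=1.0, slab_before=4.0, slab_after=1.0)
```

In this sketch, constellations maps each anchor index to the unit indices ordered along its ray, and score lies between 0 and 1. The anchor optimization itself (the iterative adjustment, and the role of beta) is not shown, since this page does not specify it.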

Tip: Increase K for more granular constellations; increase iterations or beta for sharper structures.

Structured Output (head)