Defines the three corpus tiers (source, commentary, reference) that govern how works enter the OpenCosmos knowledge base, grounded in ethical respect for authorship rights and alignment between means and ends.
OpenCosmos is built on interconnection and generosity. How we build the corpus must reflect those same values. The means must match the ends.
For the knowledge base README, see knowledge/README.md. For the publication workflow, see opencosmos-knowledge-publish-workflow.md.
The question is not only "is this legal?" but "does this honor the author?"
These authors are wisdom carriers. Their life's work deserves the same care we'd want for our own. The corpus should function as a web of relationships — pointing toward, contextualizing, bridging, and always sending the reader back to the source.
The deciding question:
Does our use drive people toward the original work, or replace it?
If someone reads our summary and feels moved to seek out the original, we've amplified the author's work. If they feel they've gotten the gist and don't need the book, we've extracted from it. OpenCosmos amplifies. It does not extract.
Every document in the knowledge base must declare a corpus_tier in its frontmatter. This field makes the ethical decision explicit and machine-readable.
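As a rough illustration of what "machine-readable" could mean in practice, the sketch below reads corpus_tier from frontmatter that has already been parsed into an object. Only the field name and the three tier names come from this document; `CORPUS_TIERS`, `isCorpusTier`, and `readCorpusTier` are hypothetical names, and this is not the project's actual code.

```typescript
// The three tier names come from this document; everything else is illustrative.
const CORPUS_TIERS = ["source", "commentary", "reference"] as const;
type CorpusTier = (typeof CORPUS_TIERS)[number];

// Type guard: true only for one of the three declared tier names.
function isCorpusTier(value: unknown): value is CorpusTier {
  return typeof value === "string" && CORPUS_TIERS.some((tier) => tier === value);
}

// Hypothetical helper: reads corpus_tier from already-parsed frontmatter
// and refuses to guess when the field is missing or unrecognized.
function readCorpusTier(frontmatter: Record<string, unknown>): CorpusTier {
  const tier = frontmatter["corpus_tier"];
  if (!isCorpusTier(tier)) {
    throw new Error(`corpus_tier must be one of: ${CORPUS_TIERS.join(", ")}`);
  }
  return tier;
}
```

Throwing on a missing or unknown value, rather than falling back to a default tier, keeps the ethical decision explicit: a document with no declared tier is never silently treated as reproducible.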
source — Full Text
What it means: The complete original work is reproduced in the corpus.
When it's appropriate: Only for works that are genuinely in the public domain or released under an open license.
Public domain includes:
Important nuance — translations: Many ancient texts are public domain in the original language, but popular English translations are copyrighted. Coleman Barks's Rumi is copyrighted. Modern translations of Dogen, the Upanishads, and other classical texts are often under copyright. Always verify the specific translation, not just the original work.
What to look for: Project Gutenberg texts, public domain translations, works with explicit open licensing.
Frontmatter: corpus_tier: source
commentary — Curated Fair-Use Commentary
What it means: An original document written by an OpenCosmos curator that describes a copyrighted work's key ideas, includes limited direct quotation for illustration, and always directs the reader to the original.
When it's appropriate: For copyrighted works whose ideas are important to the corpus — modern authors, living authors, works under active copyright.
Fair-use principles that govern this tier:
Examples of works that belong in this tier:
Frontmatter: corpus_tier: commentary
reference — Pointer Only
What it means: A reference entry that names the work, describes its relevance to the corpus, and directs the reader to it — without reproducing any content, not even brief quotation.
When it's appropriate:
What a reference entry looks like: A short paragraph identifying the work, its author, its relevance to the OpenCosmos mission, and where to find it.
Frontmatter: corpus_tier: reference
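Because each tier carries different obligations, each tier's frontmatter can be checked against a different set of required fields. The sketch below is one hypothetical way to express that, loosely following the publish-time checks this document describes. The `source` field is named in this document; `attribution`, `seek_original`, `title`, and `author` are assumed field names chosen for illustration, and `missingFields` is not the publish CLI's real implementation.

```typescript
type Tier = "source" | "commentary" | "reference";

interface Frontmatter {
  corpus_tier: Tier;
  source?: string;        // named in this document: public domain status or open license
  attribution?: string;   // assumed field name
  seek_original?: string; // assumed field name: recommendation to seek the original
  title?: string;         // assumed field name
  author?: string;        // assumed field name
}

// Assumed mapping from tier to the fields that tier must carry.
const REQUIRED_FIELDS: Record<Tier, (keyof Frontmatter)[]> = {
  source: ["source"],
  commentary: ["attribution", "seek_original"],
  reference: ["title", "author"],
};

// Returns the names of required fields that are absent or blank for the declared tier.
function missingFields(frontmatter: Frontmatter): string[] {
  return REQUIRED_FIELDS[frontmatter.corpus_tier].filter((key) => {
    const value = frontmatter[key];
    return typeof value !== "string" || value.trim() === "";
  });
}

// A source-tier document with no licensing information fails the check.
console.log(missingFields({ corpus_tier: "source" })); // prints [ 'source' ]
```

Returning the list of missing fields, instead of a bare pass or fail, lets a curator see exactly what a document still needs before it can be published.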
The /groom skill is the first human-in-the-loop step when text enters knowledge/incoming/. Before formatting, /groom checks whether the incoming text is a copyrighted work. If it is:
The pnpm knowledge:publish workflow requires corpus_tier in frontmatter. The publish CLI validates:
source tier documents must have a source field indicating public domain status or open license
commentary tier documents must include attribution and a recommendation to seek the original
reference tier documents are validated for minimal required fields
Cosmo handles tiers differently at retrieval time:
source — Cosmo can draw directly from the text, quote it, and engage with it in depth
commentary — Cosmo synthesizes from the commentary and may direct the person to the original work
reference — Cosmo recommends the work but does not attempt to reproduce or summarize its content
A foundational principle of copyright law — and of ethical curation:
The system prompts already model the right approach: they describe methodologies, name sources, and speak in their own voice. The corpus should follow the same pattern for copyrighted works.
Ask: Would this author feel honored by how we're using their work?
If the answer is yes — if we're amplifying their voice, directing people to their books, and treating their ideas with fidelity and care — proceed. If the answer is uncertain, err on the side of less reproduction, more attribution, and always a clear path to the original.
The corpus is not a library of extracted content. It is a web of relationships — each entry a node that connects to the living tradition it draws from. The authors are wisdom carriers. We honor them by pointing back to the source.