Patterns — research notes
You have a document. You want a graph of it. This is the walk from plaintext to queryable notes, with the honest bits in plain sight.
The helper tool text2N4L will take a plaintext document and propose an N4L
file from it. The operative word is propose. The draft it produces is
deliberately imperfect: it picks out sentences it thinks are high-signal,
sketches containment links, and decorates everything with n-gram context
tags, and then it hands the result to you to argue with. That argument is
the point. A corpus you have not argued with is not a corpus you know.
Expect to spend real time on this — certainly longer than it takes to run
the command. The first pass gives you a rough shape; the second pass gives
you chapters you believe in; later passes turn the chapters into something
you can tell a story from. Everything in SSTorytime is designed to make
that iteration cheap: the .n4l file is plain text, the upload is
idempotent, -wipe lets you start over without regret. Treat the tool
chain as a loom rather than a printer.
This cookbook walks the mechanics: generate a draft, refine it, upload it, search it. The judgement happens between the steps, not inside them.
Before you start
- N4L, text2N4L, and searchN4L on your $PATH (from Install in 5 minutes).
- A running PostgreSQL instance loaded with the SSTorytime schema.
- A plaintext .txt document you want to explore. This walkthrough uses an imagined set of project retrospectives from a home-security platform, though any prose works equally well: meeting notes, interview transcripts, book chapters, essay drafts. If you'd rather see the pattern on something smaller and already in the repo, the reading list at examples/reading-list.n4l was written by hand and makes a good comparison when you want to see what a polished, fully-edited N4L file looks like.
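A quick way to confirm the first prerequisite is to ask the shell where each tool lives. The sketch below relies only on the POSIX `command -v` builtin; the helper name `missing_tools` is hypothetical and not part of SSTorytime.

```shell
# Print the names of any tools that cannot be found on $PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space left by the accumulation above.
    printf '%s\n' "${missing# }"
}

not_found=$(missing_tools N4L text2N4L searchN4L)
if [ -n "$not_found" ]; then
    echo "Not on PATH: $not_found"
else
    echo "All three tools found."
fi
```

This checks only that the binaries are reachable; it says nothing about whether the database is up or the schema is loaded.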
1. Drop the source document in place
A habit worth picking up: keep raw corpora under examples/
alongside the sample data that ships with the project. The files
are text, git handles them happily, and future-you will thank
past-you for keeping the source next to the N4L that came out of
it.
cd examples
cat > mycorpus.txt <<'EOF'
The system was originally designed as a single Raspberry Pi
running in the garage. Early tests showed that the camera
latency was too high for real-time motion alerts. After a
redesign we moved the inference pipeline onto a small x86 box
and kept the Pi as a network relay. The first production cut
went live in September and caught a delivery driver who tried
to move a package from the porch.
EOF
For a real run, substitute any plain UTF-8 text.
2. Fractionate the text
text2N4L reads the file and picks out the highest-signal
sentences, writing a proposed N4L skeleton alongside the source:
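A minimal sketch of the invocation, assuming the `-% 30` flag described in the list that follows goes before the file name. The variable names `src` and `draft` are only for illustration, and the guard keeps the snippet harmless on a machine without the tool.

```shell
src=mycorpus.txt
# The draft is named after the source file, as described in this section.
draft="${src}_edit_me.n4l"

if command -v text2N4L >/dev/null 2>&1; then
    # Ask for roughly 30% of the sentences.
    text2N4L -% 30 "$src"
    ls -l "$draft"
else
    echo "text2N4L not found on PATH" >&2
fi
```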
What you'll find:
- A new file mycorpus.txt_edit_me.n4l sitting next to the source.
- -% 30 asked for about 30% of the sentences; the output will usually be a little more than that, because two internal heuristics vote and their union gets kept.
- Each selected sentence becomes an aliased item (@senN) with a containment link back to the source document.
- A _sequence_ context pulls the sentences into a running narrative so you can still read the document in order.
- N-gram phrases from the source are added as context tags; this is what makes the output searchable by topic rather than only by sentence.
A typical file header looks like:
- Samples from mycorpus.txt
# TABLE OF CONTENTS ...
# themes and topics
# selected samples
# final fraction 40.00 of requested 30.00
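The last header line records how far the union of the two heuristics overshot the request. A small sketch for reading it back, assuming the line keeps exactly the wording shown above; the stand-in file `header_sample.n4l` and the helper `fraction_summary` are made up so the example runs without text2N4L.

```shell
# Stand-in header, copied from the example above.
cat > header_sample.n4l <<'EOF'
- Samples from mycorpus.txt
# TABLE OF CONTENTS ...
# themes and topics
# selected samples
# final fraction 40.00 of requested 30.00
EOF

# Print the kept and requested percentages from the "final fraction" comment.
# Fields: 1="#", 2="final", 3="fraction", 4=kept, 5="of", 6="requested", 7=asked.
fraction_summary() {
    awk '/^# final fraction/ { print "kept " $4 "%, requested " $7 "%" }' "$1"
}

fraction_summary header_sample.n4l
```

On a real draft, point the helper at mycorpus.txt_edit_me.n4l instead of the stand-in.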
See Turning documents into stories for the
bigger picture on when to reach for text2N4L and when to write
N4L by hand instead.
3. Open and refine
This is where the human judgement goes. Open the generated file, mycorpus.txt_edit_me.n4l, in your editor of choice.
Things to do while reading:
- Split into chapters. Replace the single "- Samples from mycorpus.txt" header with one "- <chapter>" line per conceptual section. Chapters are how you scope later searches, and a file with one huge chapter is painful to query in the same way a book with no table of contents is painful to re-read.
- Add arrows. The generator only emits containment links: "this sentence is part of the document." Everywhere two sentences share a concept, add an explicit arrow. Pick from the four shapes: (then) for sequence, (contains) for membership, (about) or (by) for properties, (see also) for adjacency. Thinking in arrows has the full catalogue.
- Fix ambiguity. If "the system" appears in several places and means different things, rename the anchors (@garage_system, @production_system) so later searches keep them apart.
- Delete noise. Sentences the tool picked up that add no signal: kill them.
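As a rough illustration of the first two edits, the heredoc below writes a tiny refined file with two chapters and one explicit arrow in each, reusing sentences from the sample corpus. The file name refined_sketch.n4l and the chapter titles are invented for the example. The layout is an assumption that has not been run through the parser, so compare it with examples/reading-list.n4l before copying the style.

```shell
cat > refined_sketch.n4l <<'EOF'
- hardware history

 The system was originally designed as a single Raspberry Pi running in the garage. (then) After a redesign we moved the inference pipeline onto a small x86 box and kept the Pi as a network relay.

- first production cut

 The first production cut went live in September (see also) Early tests showed that the camera latency was too high for real-time motion alerts.
EOF

# Quick sanity check: count the chapter header lines.
grep -c '^- ' refined_sketch.n4l
```

Running N4L on the file, as described in the upload step, is the real test of whether the arrows are accepted.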
Iterate small
Don't try to perfect every chapter on the first pass. Get a rough structure, upload it, search it, see what's missing, then go back and refine. The tool chain is cheap to re-run.
4. Upload to the graph
Once the file parses cleanly (N4L mycorpus.txt_edit_me.n4l with
no errors), upload:
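A minimal sketch of the parse-then-upload step. It assumes that N4L exits with a non-zero status when the parse fails (not confirmed here) and that -u takes the file name the same way as in the N4L -wipe -u *.n4l form used elsewhere on this page. The helper name `parse_then_upload` is hypothetical.

```shell
n4l_file=mycorpus.txt_edit_me.n4l

# Parse first; upload only if the parse step succeeds.
parse_then_upload() {
    N4L "$1" && N4L -u "$1"
}

# Guarded so the snippet is harmless where N4L is not installed.
if command -v N4L >/dev/null 2>&1; then
    parse_then_upload "$n4l_file"
else
    echo "N4L not found on PATH" >&2
fi
```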
For an atomic re-upload that clears previous state — useful during iteration — run with the wipe flag:
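A sketch of the wipe-and-reload form, built from the N4L -wipe -u *.n4l command quoted in the warning that follows. The helper name `reload_clean` is made up for illustration; pass it every file you want to survive the wipe.

```shell
# Destructive: clears existing SSTorytime state, then loads the given files.
reload_clean() {
    N4L -wipe -u "$@"
}

# Guarded so the snippet is harmless where N4L is not installed.
if command -v N4L >/dev/null 2>&1; then
    reload_clean mycorpus.txt_edit_me.n4l
fi
```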
Expect a few seconds of progress output followed by a "Finally
done!" line, or an error pointing at the offending input line.
-wipe is destructive
-wipe drops all SSTorytime state and rebuilds it from the
files you give it. If you have other corpora already loaded,
re-upload them in the same command: N4L -wipe -u *.n4l.
5. Search and browse
The graph is now queryable. Three useful first queries:
# Substring search across all chapters
searchN4L "delivery driver"
# Browse notes in original input order for one chapter
searchN4L "\\notes mycorpus"
# Find paths from one idea to another
searchN4L "\\from Raspberry \\to production"
Finding things has the shape of a question and the commands you'll type. For patterns on larger corpora, see Patterns — search recipes.
Next steps
- When the structure feels right, commit the refined .n4l file to version control. That file, not the database, is your source of truth.
- Re-run N4L -wipe -u *.n4l any time you want a clean slate. The upload is idempotent and fast.
- If you need to remove a chapter without wiping everything, the removeN4L tool in the repo's developers/ folder handles targeted deletion; most users never reach for it.