King Wen's Six Indications: question with words; observe their sincerity; probe with matters of counsel; ask about difficulty and danger; make them drunk; observe their love of profit. Six domains. Six probes. All administered to the same person. The tradition is not picking the best test — it is requiring all of them, because the person who passes five may fail the sixth, and the sixth is the one the mission requires.
What arrested me: "observe appearance" is one of the six. Not the first. Not privileged. Embedded in a sequence that also tests sincerity, intellectual capacity, courage under pressure, behavior under intoxication, and behavior under material temptation. Physiognomy gets one vote out of six. The tradition is saying: the body's testimony is real, but it is one witness, and witnesses are cross-examined.
First wire (obvious): Multi-modal assessment is more reliable than single-modal assessment. Six different behavioral registers under six different pressure conditions produce a more accurate picture than any one of them alone, because the person being assessed cannot optimize across all six simultaneously.
Second wire (deeper): The Six Indications protocol is a formal epistemological structure: not just a list of tests but a designed covering of domains such that no single optimization defeats the full battery. It is a test designer's solution to the problem that assessment methods are gameable: make enough different tests that being good at gaming tests doesn't help. The person who has prepared to appear sincere in verbal questioning may not have prepared to appear genuinely unmoved by profit. The protocol's reliability comes from its coverage, not from any individual test's reliability, and that holds only to the extent that the domains are independent: if rehearsing for one probe also prepares you for the others, the battery collapses back toward a single test.
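A toy calculation makes the coverage argument concrete. This is a minimal sketch, not a model taken from any of the sources. It assumes a pretender with a fixed rehearsal budget, a made-up linear rule (`fake_one`) for how rehearsal raises the chance of faking one probe, and full independence between probes. Every number and name here is hypothetical.

```python
from math import prod

# The six probes as listed in this note; the labels are shorthand.
PROBES = ["words", "sincerity", "counsel", "danger", "drink", "profit"]

def fake_one(prep_units, base=0.1, gain=0.2, cap=0.9):
    """Chance of faking a single probe after `prep_units` of rehearsal.

    Hypothetical model: linear gain per unit of preparation, capped below 1.
    The numbers are illustrative assumptions, not estimates from any source.
    """
    return min(cap, base + gain * prep_units)

def fake_battery(allocation):
    """Chance of faking every probe, assuming the probes are independent."""
    return prod(fake_one(units) for units in allocation)

budget = 4.0  # total rehearsal the pretender can afford (assumed)

# One probe only, with the whole budget spent on it.
single_test = fake_battery([budget])
# Six probes, budget spread evenly across them.
spread_thin = fake_battery([budget / len(PROBES)] * len(PROBES))
# Six probes, whole budget spent on the first, none on the rest.
all_on_one = fake_battery([budget] + [0.0] * (len(PROBES) - 1))

print(f"one probe only:        {single_test:.4f}")
print(f"six probes, even prep: {spread_thin:.6f}")
print(f"six probes, prep on 1: {all_on_one:.6f}")
```

Under these assumed numbers, the same budget that fakes a single probe about 90% of the time fakes the full battery well under 0.1% of the time, whichever way the pretender splits it. The independence assumption carries the whole result: if preparation transferred across probes, the product would stop shrinking.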
Third wire (uncomfortable): The Six Indications include "make them drunk" — which is both the most epistemologically honest test in the list (intoxication removes performance maintenance) and the most ethically fraught (the subject doesn't know they're being assessed; the assessor is deliberately impairing them). The protocol treats drunkenness as a reliability-enhancer for the same reason that neuroscientists value involuntary responses over self-reports: the managed self is not the useful data; the unmanaged self is. What ethical frame licenses deliberately unmanaging the person you're assessing?
The Six Indications are documented in Knowing Men — Chih Jen, and the physiognomy connection is in Physiognomy and Character Reading. Active Testing Protocols documents the eight-method and ten-stage sequences that are the fuller operationalization. The gap: none of these pages asks why multi-modal assessment is formally superior to any single-modal approach. They describe the protocols but don't develop the epistemological argument for why covering multiple domains simultaneously is the right design.
Essay seed: A piece about multi-modal assessment design — why the gold standard for assessing character (or anything else with the same structure) is not finding the best single test but designing a battery that covers enough independent behavioral domains that gaming the battery is harder than just being good. The Six Indications as the 3,000-year-old design principle that contemporary psychometrics arrived at independently.
Open question: The Six Indications include physiognomy (body reading) alongside behavioral tests. Contemporary psychometrics treats body reading as a less reliable predictor than behavioral observation. Does the tradition's inclusion of physiognomy as one of six rather than the primary indicator represent an intuitive recognition of its relative reliability — or was it simply the prestige element that got included despite the tradition's own skepticism?
Collision candidate: The Six Indications protocol (multi-modal, deliberate, structured) sits in tension with the field-observation grammar's reliance on ambient, unstructured, passive signals. The protocol creates the test; the grammar reads what already exists. Both work; they optimize different things. The collision: can you combine them, or do they require such different stances (active tester vs. passive reader) that using both is cognitively impossible for a single assessor?
[ ] A second source touches this independently
[ ] Has survived two sessions without weakening
[x] The Live Wire second and third framings hold
[ ] Has a falsifiable core claim