The Persona, the Brief, and the Human Hand
A seven-way case study of AI and human renderings of Gabriele Tergit’s Effingers, Chapters 25 and 26, with two controlled probes and a cross-chapter replication
Case study · 2026-05-29
Stance and provenance. This is the capstone synthesis of a seven-way comparative study, now extended across a second chapter. It is composed from finished, independently verified materials: for Chapter 25, a qualitative close reading (analysis.md), a verified quantitative layer (quant/quant_layer.md), a reasoning-typology exhibit (authority/authority.md), and the experimental record (inputs/methods_and_data.md); for Chapter 26, a fresh comparative-plus-quantitative analysis (analyst_ch26/analysis.md) and a reasoning/reading-convergence study (analyst_ch26/reasoning.md), coded independently on their own loci by a separate analyst instance; and across the two chapters, a reproducibility study of the readings and personas (analyst_xchapter/reproducibility.md) and a decision-consistency study (analyst_xchapter/decisions.md). Throughout, the seven renderings are treated as co-equal columns. The work is description, clustering, and evidence-grounded synthesis, not a ranking. Which rendering is the better or more faithful literary translation is a judgment reserved for human reviewers and is deliberately not made here. Every quantitative claim is grounded in the verified numbers; every qualitative claim is anchored in quoted text.
Abstract
Six AI translators and one published human translator independently produced English versions of the same German chapter — Chapter 25 (“Frühling”) of Gabriele Tergit’s Effingers (1951). The six AI arms vary along two axes. A persona-source axis: arm A inherited its translator-persona from the published human translator’s own writing about Tergit; arm B self-built a persona from a German-only corpus; arm D-let self-built a persona from an expanded corpus (the same German materials plus the anglophone novelists Wharton, Powell, Mitford, Isherwood); arms C and C-let had no persona at all. And a brief-nudge axis: the brief tail was either “no style guidance” (A, B, C), a categorical permission to domesticate carried quietly in the system prompt (C-let, D-let), or a categorical school instruction — Schleiermacher’s “bring the author to the reader” — delivered loudly at the translation step (D-aim). Two pairings are clean controlled probes: C ↔︎ C-let isolates a permission on a no-persona base; D-let ↔︎ D-aim isolates permission-versus-school on a shared persona, shared reading, same instance (the conversation forked only at the translation step). The seventh column, H, is the human translator’s own published version (Sophie Duvernoy, NYRB 2025) — the same translator whose writing formed A’s persona.
A close read of all seven translations against the German source, supported by the AI arms’ pass logs, personas, and reading notes, and a verified 45-locus quantitative coding, yields five findings. (1) On a 0–100 foreignization↔︎domestication composite the six AI arms occupy a narrow band — D-let 17.1 < B 19.7 < A 27.6 < C 29.0 < C-let 30.3 < D-aim 46.1 — while H stands far out at 80.3, well clear of the field. (2) The six AI arms agree with one another on a salient pick 63.1% of the time (15 unordered pairs; tightest B–C at 80%) but agree with H only 27.4% of the time (6 pairs); every AI–H pair is looser than every AI–AI pair. (3) The two probes behave differently. The permission nudged C-let perceptibly closer to H than C (17/45 vs 14/45). The school instruction roughly tripled D-aim’s domestication level (composite 46.1 vs D-let’s 17.1) yet did not move it onto more of H’s specific picks (D-aim–H = D-let–H = 10/45) — a clean dissociation between “more domesticating” and “more like the human.” (4) Where the domesticating arms abandon a vivid German idiom, they reliably share the direction but scatter to different destinations: at the strictest region loci the mean is 3.33 distinct destinations among the movers. “Domesticated Tergit” is a region of choice-space, not a point. H’s signature — paragraph fusion, invented rhyme, eye-dialect, a refrain restructured, “Young Ram” for a pub, first-person free indirect speech — is reached by no AI arm and produced by neither persona nor brief. The clean law: the persona controls the defaults, the brief controls the deviations, and the human’s craft picks are produced by neither. (5) Replication and reproducibility. The same seven arms were run again on a second chapter — Chapter 26 (“Der Sonntagmittag” / “Sunday Afternoon”) — coded independently on its own loci, and a cross-chapter study asked what held. 
The geometry replicates: the six machines again cluster (AI–AI mean distance 15.7 vs AI–H 33.5, 2.13× wider) and H is again the outlier, with both probes behaving as before (the school’s “more domesticating ≠ closer to the human” dissociation repeats). The reading of the novel reproduces and is text-driven — the no-persona controls reconstructed the same interpretation without the afterword the persona arms leaned on — but the self built from that reading does not: persona-revision is the least reproducible act in the experiment. Each translator’s decisions are a consistent policy meeting different material, with exactly two genuine fractures, both in the most heavily person’d arms (A’s honorific flip, D’s dialect reversal); the human is the steadiest of all seven. The longitudinal companion to the law: the persona is a source of drift, not anchoring — behavioural consistency across chapters comes from the stable method-core and the text itself, not from the elaborated persona. (Composites are not comparable across the two chapters — different loci, calibration, and analyst instance — so what replicates is the structure, not the numbers.)
1. Introduction — the question
Literary translation is often described as the production of a voice: not chiefly what is said but how it is said. If an AI translator’s voice can be shaped by a constructed authorial “persona,” and steered by a style-brief at translation time, two natural questions follow. First, does the source of the persona — where the agent’s understanding of the author comes from — leave a detectable mark on the finished translation? Second, does the kind of brief-nudge given at translation time (none, a permission, or a categorical school) move the choices, and if so how? And running beneath both: how do the six AI renderings stand in relation to the established human translation of the same chapter?
The present study isolates these inputs while holding the source text, the procedure, and the base model fixed across the AI arms, and adds the human translation as a seventh, co-equal column. The design crosses two axes:
- Persona source — none (C, C-let); inherited from the human translator’s writing-about-the-author (A); self-built from a German-only corpus (B); self-built from an expanded corpus adding anglophone novelists (D-let, D-aim).
- Brief nudge — none, “the interpretation is yours” (A, B, C); a categorical permission to domesticate carried in the system prompt (C-let, D-let); a categorical school instruction (Schleiermacher’s domesticating principle) delivered at the translation step (D-aim).
Two of the pairings are deliberately constructed as the cleanest probes in the experiment, each isolating a single varying input:
- Probe 1 — C ↔︎ C-let. Both are no-persona controls, identical in model, novel-read, pass budget, and isolation. The only difference is the brief tail: C’s “no style guidance” paragraph is replaced, in C-let, by an explicit permission to domesticate. This isolates the effect of a permission on a base with no persona to filter it.
- Probe 2 — D-let ↔︎ D-aim. These share the same instance, same expanded-corpus persona, same reading notes, same novel-read — the conversation forked only at the translation step, where D-let ran with the standard kickoff (relying on the same permission C-let had) and D-aim’s kickoff instead named a categorical domesticating school. This isolates permission-versus-school on a persona-anchored base.
The interaction between the two probes — does the same nudge-direction produce the same effect on a no-persona base versus a persona-anchored base? — is a primary analytical interest, and turns out to be the most illuminating result.
A note on what this paper does not do. This is description and synthesis, not a ranking. The six AI renderings were produced under controlled conditions for the purpose of comparison; the seventh is a published human translation read in its own light. None of the seven is treated as a benchmark or a target. The study’s design warns specifically against three expectation biases — that A will resemble H because A’s persona came from H’s writing; that C-let will resemble H more than C because of the permission; that D-aim will resemble H more than D-let because of the school. Each is tested against quoted evidence below, and two of the three are not borne out in the simple form the bias predicts.
After the seven-way read of Chapter 25 (§§2–8), the study extends to a second chapter and a longitudinal layer. The same seven columns were produced for Chapter 26 (“Der Sonntagmittag” / “Sunday Afternoon”), coded fresh on its own loci by a separate analyst instance, and two cross-chapter studies were run on top. This opens three further questions, taken up in §9. First, does the geometry replicate — do the machines cluster and the human stand apart on a different chapter, and do the two probes behave the same way? Second, is the apparatus reproducible across an independent run — does each arm read the novel the same way the second time, and does the translator-persona it builds from that reading come back? Third, are each translator’s decisions consistent, or does the chapter’s own material (a drawing-room lunch dense with honorifics, where Chapter 25 was a street panorama dense with dialect) drive the apparent differences? Because the two chapters were coded on different loci with a different calibration, the cross-chapter comparison is made at the level of structure — clustering, the human-outlier separation, probe behaviour, per-arm policy — and never as a direct comparison of absolute composite numbers.
2. Method
2.1 The chapter
The source chapter is “Frühling” (Spring), Chapter 25 of Effingers, a Berlin Saturday — 16 March 1887 — narrated hour by hour and stitched together by a recurrent, time-stamped spring-refrain:
Was für ein Frühlingstag, dieser Sonnabend im März des Jahres 1887! Was für eine Süße, [hour]!
The chapter is a montage that cross-cuts the whole society at 10am, 11am, 1pm, 5pm, 6pm, 8pm, and 3am: the wealthy Eugenie with her seamstress; the schoolgirl Sofie writing a forbidden love-letter; the Privatdozent Waldemar arguing Roman law with an old historian and then bedding the singer Susanna; the working-class Effingers and the ruined banker Mayer on the Chausseestraße; a Friedrichstraße street tableau; and Theodor’s jealous despair ending in a wine-tavern with a seventeen-year-old prostitute, Wanda. Its translation-bearing features make choices visible at many discrete sites: the incantatory refrain and its deliberate variants; the montage paragraphing; Berlin working-class dialect (“Sie doofe Ziege, Ihnen müßte man die Hammelbeine langziehen”); embedded lyric (Schumann’s Frauenliebe und -leben, Heine/Schumann’s Die Lotosblume); a quoted march line; period honorifics and academic ranks; costume and society vocabulary (Coupé, Mokka, Privatdozent, Kasinotoilette, Schlafbursche); and free indirect speech.
2.2 The seven renderings
| Arm | Persona source | Brief nudge | Where the nudge lives |
|---|---|---|---|
| A | inherited from the human translator’s writing-about-Tergit | none | — (“the interpretation is yours”) |
| B | self-built, German-only corpus | none | — |
| C | none (control) | none | — (“from the text alone”) |
| C-let | none (control) | permission | system prompt (“Style permission”) |
| D-let | self-built, expanded corpus (German + Wharton/Powell/Mitford/Isherwood) | permission | system-prompt tail |
| D-aim | same instance & persona as D-let | school | step-4 kickoff (Schleiermacher) |
| H | — (published human translation) | — | — |
A, B, C, C-let, D-let, D-aim are AI arms; H is the human published translation (Sophie Duvernoy, Effingers, New York Review Books, 2025) — the same translator whose writing formed A’s persona, folded in as a co-equal column. Only H’s finished text exists (no persona, notes, or pass log).
2.3 Procedure
All AI arms followed a four-step protocol; the persona arms (A, B, D) did all four, the no-persona arms (C, C-let) joined at step 3:
- Read the author’s primary corpus (5 works); notes only. (A, B, D)
- Constitute the persona (persona.md); notes, then write the persona document. (A, B, D)
- Read the full novel (Effingers, 151 chapters + Epilog); notes only. Persona may be revised. (A, B, C, C-let, D)
- Translate Chapter 25; up to five passes (one initial + up to four revisions), each logged. (all AI arms)
The D-let / D-aim fork is the controlled heart of the experiment. D’s persona-build and novel-read (steps 1–3) ran once as a single instance. At step 4 the conversation forked: D-let ran first with the standard kickoff (relying on the permission already in its system-prompt tail); its outputs were then physically archived out of the workspace, the conversation was rewound to the pre–Step-4 state, and D-aim ran with a different kickoff naming an explicit translation school —
“For this translation, the operating principle is domesticating translation, in Schleiermacher’s sense — bring the author to the reader, not the reader to the author. … Faithfulness operates at the level of meaning, scene, character, voice, and effect; not at the surface level of structure, syntax, or vocabulary.”
The school was named at the philosophical level only: no specific Ch. 25 choices, no naming of the human translator or her loci. D-let and D-aim therefore share persona_before_novel.md, persona_after_novel.md, and all of step1/2/3 notes; only the Step-4 kickoff differs. The permission that C-let and D-let carried is the same paragraph in both: a labelled “Style permission” that authorizes but does not require anglicizing foreign vocabulary, plain English for dialect, paragraph fusion, English rhyme for embedded lyric, and domestication of culturally specific references.
2.4 Controls and measurements
All six AI arms had no internet access and minimal system prompts (no epoch or usage guidance beyond the noted nudge); a full-read protocol at steps 1 and 3 (read in full, no sampling); a uniform five-pass cap at step 4. Arm B never saw the human translator’s writing, and the same isolation applies to the D-shared persona (built from B’s German materials plus the anglophone reference writers, no writing by H of any kind). Source materials were pre-converted to plain text so no instance ever loaded a binary file. The coordinator supplied the following per-arm measurements (facts only):
| | A | B | C | C-let | D-let | D-aim | H |
|---|---|---|---|---|---|---|---|
| Persona source | inherited | self-built (German) | none | none | self-built (expanded) | shared with D-let | — (human) |
| Brief tail | “no guidance” | “no guidance” | “no guidance” | permission | permission | school | — |
| Persona revised after novel? | no | yes (+~980 chars) | — | — | yes (+419 w, +17%) | shared | — |
| Translation passes | 2 | 4 (cap) | 3 | 4 (cap) | 4 (cap) | 4 (cap) | — |
| Final length (words) | 4,294 | 4,454 | 4,418 | 4,303 | 4,326 | 4,418 | ≈4,285 |
2.5 The quantitative coding
The verified quantitative layer codes 45 loci — discrete sites at which the seven renderings make a salient, comparable choice — partitioning the seven arms into agreement groups at each locus (two arms “agree” when they make the same concrete salient pick). From this partition set the layer computes: a 7×7 pairwise-agreement matrix; a 0–100 foreignization↔︎domestication composite (a coverage-weighted mean of six tests, each measure weighted by the number of loci it covers — proper nouns, loanwords/titles, dialect, refrain/structure, idioms, honorifics); named clusters with 2-D coordinates (x = composite domestication, y = editorial boldness); a destination-dispersion measure (the region-vs-point test); and a position-relative-to-H table across six sub-axes. The coding basis is the German source plus the seven finished translations only; H’s values are read from the finished text as observations, not inferences about intent. All figures reported below are drawn from that verified layer and are not re-derived here.
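As a minimal sketch of what a coverage-weighted mean looks like, assuming each of the six tests yields a 0–100 score and a count of the loci it covers: the function name `coverage_weighted_composite` is hypothetical, and the scores and coverage counts below are invented placeholders, not figures from the verified layer.

```python
from typing import Dict, Tuple


def coverage_weighted_composite(tests: Dict[str, Tuple[float, int]]) -> float:
    """Mean of per-test scores (0-100), each weighted by the loci it covers."""
    total_loci = sum(n_loci for _, n_loci in tests.values())
    if total_loci == 0:
        raise ValueError("no loci covered by any test")
    weighted_sum = sum(score * n_loci for score, n_loci in tests.values())
    return weighted_sum / total_loci


# Placeholder inputs only (NOT the study's data): test -> (score, loci covered).
placeholder_arm = {
    "proper nouns": (50.0, 10),
    "loanwords/titles": (25.0, 8),
    "dialect": (0.0, 4),
    "refrain/structure": (0.0, 6),
    "idioms": (50.0, 4),
    "honorifics": (100.0, 6),
}

placeholder_composite = coverage_weighted_composite(placeholder_arm)
print(f"placeholder composite: {placeholder_composite:.1f}")
```

Weighting by coverage means a test that spans many loci moves the composite more than one that spans few, so a single extreme score on a thinly covered test cannot dominate the result.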
2.6 Limitations, stated up front
Three constraints bound every claim that follows, and are restated in full in §11. n = 1 chapter, one instance per AI arm — a qualitative case study, not a statistically powered result; the integers should not be over-read, and nothing supports generalization to other chapters, authors, models, or runs. The six AI arms share a common base model — a live confound for every similarity among them; the AI–AI closeness measures how alike the outputs are, not independent convergence. This is an informed (non-blind) analysis — arm identities, including the A↔︎H link, were known; the mitigation is to anchor every claim in quoted evidence and to report the negative or split results the evidence supports. (Prior analyses on subsets of these arms — two four-way comparisons, a persona-readings study, and a five-way — were run independently and deliberately withheld, so this seven-way read stays independent for later triangulation.)
3. The two probes
3.1 Probe 1 — C ↔︎ C-let (no-persona base; a permission)
C and C-let are identical control arms but for the brief tail: C is told “No style guidance … the interpretation is yours, from the text alone,” while C-let receives an explicit permission to anglicize, render dialect plainly, fuse paragraphs, rhyme embedded lyric, and domesticate culturally specific references — wherever the choice serves the English text. Differences between them are attributable to that paragraph and nothing else.
What moved. The permission produced a small, directionally clean shift on the composite — C 29.0 → C-let 30.3 (+1.3) — concentrated in honorifics (test T6: 50.0 → 58.3) and idioms (test T5: 25.0 → 50.0). Four kinds of move are visible. The most striking is a structural cut: C-let deleted Mayer’s “Such is life!” exchange and his sarcasm preamble entirely, where C and every other arm preserve it; C-let’s pass-1 log cites the permission directly as the warrant (“the system prompt grants that permission”). No other arm cut at this scale. A dialect-and-idiom shift: for the carter’s “die Hammelbeine langziehen” C keeps the literal Berlin calque (“stretch your mutton-legs for you”) while C-let abandons it (“give you what-for”) and substitutes “stupid cow” for the literal goat. A chiastic naturalization: Waldemar’s “Schäm dich, daß du dich schämst!” is “Be ashamed that you are ashamed!” in C (preserved exactly) and “Be ashamed of being ashamed!” in C-let (idiomatic). And a refrain compression: C-let drops “of the year” where C keeps it, moving toward H.
What did not move — and where the permission moved C-let away from H. The permission did not produce wholesale anglicization. C-let kept Fräulein and Herr throughout, its pass-2 log spelling out the reasoning: “Fräulein / Frau / Herr kept throughout. Decided against Anglicizing in this pass; they’re functioning as period markers and class-form markers that English’s Miss / Mrs. / Mr. would flatten.” On three lyrical-cultural titles C-let moved toward the German where C had translated — keeping Frauenliebe und -leben, Feuerzauber, and the pub name Frischer Hammel in German italics, using italics as a cosmopolitan flag rather than a calque trigger. And in pass 3 it actively reversed a domesticating move, correcting “in the east end of Berlin” to “in the east of Berlin” because “East end carries London (the East End) in English ears.” The permission was read symmetrically: a license to re-choose in either direction, not an instruction to move all the way over.
The chapter-level pattern is that C-let read the chapter exactly as C did — same scene structure, same character voicing, same refrain spine — and applied the permission selectively at the sentence level, scoring genuine domesticating moves on dialect, chiastic shape and structural compression while foreignizing on musical and pub titles. The text is 115 words shorter than C (−2.6%), consistent with the cut and the tightened dialogue. The within-probe agreement remains high: C ↔︎ C-let = 31/45 (68.9%). The permission worked as a license, not as a school.
3.2 Probe 2 — D-let ↔︎ D-aim (shared persona; permission vs school)
D-let and D-aim share everything up to the translation step: the same expanded-corpus persona, the same step-1/2/3 notes, the same persona-revision-after-novel. Only the Step-4 kickoff differs — D-let leans on the permission already in its system prompt; D-aim receives the categorical Schleiermacher school. Differences between them are brief-driven, full stop.
What moved. The school produced the largest probe effect in the study: composite D-let 17.1 → D-aim 46.1 (+29.0), roughly tripling the persona arm’s domestication, with the biggest jumps in honorifics (T6: 0.0 → 83.3), proper nouns (T1: 10.0 → 50.0), and idioms (T5: 50.0). The movement is heaviest on a band of sociolinguistically marked vocabulary. Honorifics: D-let writes “Fräulein Winkel,” “Herr Mayer”; D-aim writes “Miss Winkel,” “Mr Mayer” — the only arm in the seven to anglicize Fräulein and Herr, and it does so despite sharing a persona with D-let that has the same “keep where it carries weight” rule. Academic rank: the two pass logs explicitly contradict each other on a single word. D-let kept Privatdozent (“no English equivalent”); D-aim anglicized to “the young lecturer,” its log reasoning “the contemporary anglophone reader does not parse ‘Privatdozent’ without a footnote, and footnotes are forbidden by the domesticating principle.” Same persona, opposite handling, different cited authority — the school overrode a persona-specified default. Social-historical lexicon: D-let kept Kaiserreich, Stadtrat, Schloß; D-aim translated all to Empire, the Councillor, the Palace. Song titles: D-let kept the Schumann cycle and the Heine lyric intact; D-aim translated both (rendering ängstigt as “trembles” — independently coinciding with H). Locale: D-let kept “the east of Berlin”; D-aim wrote “the East End of Berlin,” its log calling it “instantly resonant for the anglophone reader … perfect domestication” — the very move C-let, given the same permission, had explicitly rejected.
What did not move. The persona’s voice-models held across both forks — D-aim’s log: “The four anglophone voice-models settled in Step 2 — Isherwood-clear, Wharton-periodic, Mitford-quick, Powell-deadpan — are doing the per-scene register work.” Berlin dialect was handled the same way in both (no eye-dialect in either, even D-aim). Both kept Frischer Hammel in German italics, the Latin tag in verba magistri, Adieu, and the diminutive of Annettchen’s pinpoint cruelty. What changed was precisely the band of marked vocabulary the school targeted by overriding the persona’s “cultural-weight” exemption clause — roughly fifteen items. Unlike C-let, D-aim grew longer than its sibling (4,418 vs 4,326 words, +2.1%): the domesticating direction unpacked German compounds into English multi-word descriptions (“Privatdozent” → “the young lecturer”; “Stadtwald” → “woods outside the town”). The within-probe agreement is correspondingly low: D-let ↔︎ D-aim = only 22/45 (48.9%) — the school perturbed D far more than the permission perturbed C, consistent with D-aim being the single most-moved arm in the study.
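The length and agreement figures quoted for the two probes can be rechecked from the per-arm word counts in §2.4 and the agreement counts out of 45 loci. The sketch below only repeats that arithmetic; the helper names are illustrative, not part of the study's tooling.

```python
# Final lengths (words) from the measurements table in section 2.4.
WORDS = {"C": 4418, "C-let": 4303, "D-let": 4326, "D-aim": 4418}

# Within-probe agreement counts (same salient pick) out of 45 coded loci.
AGREE = {("C", "C-let"): 31, ("D-let", "D-aim"): 22}
N_LOCI = 45


def length_change_pct(base: str, nudged: str) -> float:
    """Percent change in word count from the base arm to the nudged arm."""
    return 100.0 * (WORDS[nudged] - WORDS[base]) / WORDS[base]


def agreement_pct(base: str, nudged: str) -> float:
    """Share of loci (percent) on which the two arms make the same pick."""
    return 100.0 * AGREE[(base, nudged)] / N_LOCI


for base, nudged in AGREE:
    print(
        f"{base} -> {nudged}: "
        f"length {length_change_pct(base, nudged):+.1f}%, "
        f"agreement {agreement_pct(base, nudged):.1f}%"
    )
```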
3.3 The interaction between the probes
The two probes together test whether the same nudge-direction — license to domesticate, in two formats and at two intensities — produces the same effect on a no-persona base versus a persona-anchored base. The answer is no, and the asymmetry is the central result of this section.
Finding 1 — The persona dampens the permission. D-let, with the permission, kept more German than C-let with the identical paragraph: it held Stadtrat, Kaiserreich, Herr Kollege where C-let either rewrote or naturalized. The expanded-corpus persona supplied strong defaults that mostly preserved the German on lexicon, so the permission landed on already-formed preferences and acted at most as a tie-breaker. The composite makes the dampening visible: the two self-built-persona arms on the lighter briefs, D-let (17.1, permission) and B (19.7, no guidance), sit lowest of all six, and D-let under the permission is more source-preserving than the no-persona control C-let (30.3) under the same permission.
Finding 2 — The school overrides the persona on targeted bands. D-aim’s persona is D-let’s, including the explicit “keep where it carries weight” rule. The school overrode that rule for honorifics, academic rank, social-historical lexicon, two of three song titles, and Berliner Osten — and the override is visible in the pass log’s switch in cited authority, from persona-derived “no English equivalent” to “the domesticating principle … footnotes are forbidden.”
Finding 3 — The format matters as much as the content. C-let’s permission and D-let’s permission are the same paragraph. But C-let has no persona-filter, so the permission must be the operating principle wherever the source-text alone does not decide; D-let has the persona-filter, so the permission is at most a tie-breaker; and D-aim’s school sits after the persona reading, framed as an operating principle (“Every choice … follows from this principle”), so it acts as a re-anchoring — a new categorical default rather than a tie-breaker. Same direction-name, three different mechanisms.
Finding 4 — Both formats move in the same direction but pick different features. C-let domesticated dialect-idiom and chiastic shape; D-aim domesticated honorifics and academic titles. Neither moved the same features. The single direction-name domestication covers operations in different operational regions of the translation — the seed of the region-vs-point finding (§6).
Finding 5 — Neither nudged arm lands at H’s specific picks. C-let overshot H on the “stupid cow” cluster (matching H exactly on the substituted noun) but converged with H on the Schumann title only coincidentally (both keep German, each reading it as cosmopolitan, not as foreignizing). D-aim overshot H on Fräulein/Herr (translated where H keeps), overshot on Berliner Osten (“East End” where H stays generic), undershot on Frischer Hammel (kept German where H invents “Young Ram”), and missed H’s structural moves entirely. The probes shift defaults; they do not produce H’s destinations. This is quantified in §5 and dissected in §6–§7.
4. The seven-way geometry
4.1 The composite gradient
Scoring each rendering on the 0–100 foreignization↔︎domestication composite (0 = source-preserving, 100 = naturalized into English) places the seven on a single line:
D-let 17.1 < B 19.7 < A 27.6 < C 29.0 < C-let 30.3 < D-aim 46.1 ≪ H 80.3
Two features dominate. First, H stands far out at the domesticating end (80.3), well clear of the field: about 1.7 times the next-most-domesticating arm (D-aim, 46.1) and about 4.7 times the most source-preserving arm (D-let, 17.1). H scores 100 on three of the six tests (dialect, refrain/structure, idioms): it is the only arm that uses eye-dialect, fuses and reshapes the refrain and paragraphs, and dissolves the German idioms into plain English sense. Second, the six AI arms occupy a comparatively narrow band, 17.1–46.1; they domesticate far less, and far less variably, than H. Both controlled probes are visible in the gradient: the permission’s small clean lift (C → C-let, +1.3) and the school’s large one (D-let → D-aim, +29.0), the latter the only thing that lifts an AI arm clear of the pack.
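The ordering, the ratios of H to the AI arms, and the two probe lifts follow directly from the seven composite values. A short sketch of that arithmetic, using only the numbers on the gradient line (variable names are illustrative):

```python
# Composite scores (0 = source-preserving, 100 = naturalized), from section 4.1.
COMPOSITE = {
    "D-let": 17.1, "B": 19.7, "A": 27.6, "C": 29.0,
    "C-let": 30.3, "D-aim": 46.1, "H": 80.3,
}

# Arms sorted from most source-preserving to most domesticating.
ordered = sorted(COMPOSITE, key=COMPOSITE.get)
ai_ordered = [arm for arm in ordered if arm != "H"]

# How far H sits beyond the highest and lowest AI arm, as ratios.
ratio_h_to_top_ai = COMPOSITE["H"] / COMPOSITE[ai_ordered[-1]]
ratio_h_to_bottom_ai = COMPOSITE["H"] / COMPOSITE[ai_ordered[0]]

# Probe effects on the composite: nudged arm minus its base arm.
permission_lift = COMPOSITE["C-let"] - COMPOSITE["C"]
school_lift = COMPOSITE["D-aim"] - COMPOSITE["D-let"]

print(" < ".join(ordered))
print(f"H / top AI arm:    {ratio_h_to_top_ai:.2f}")
print(f"H / bottom AI arm: {ratio_h_to_bottom_ai:.2f}")
print(f"permission lift: {permission_lift:+.1f}, school lift: {school_lift:+.1f}")
```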
The AI ordering is itself informative. The two arms that most resist domestication are B (German-only persona) and D-let (expanded-corpus persona, permission only): persona arms on the lighter brief sit lowest. The no-persona controls (C, C-let) and the H-derived persona (A) sit mid-band, within about three points of one another at the foreignizing-to-mild end. Only the school-instructed D-aim climbs out.
An honesty note carried from the verified layer: H is not monotonically domesticating. On the lexical sub-current it occasionally foreignizes — it keeps the French “toilette” in Kasinotoilette (the only arm to do so), and reaches for period-English “Rhenish” where the AI render plain “Rhine wine.” Its loanword test score (62.5) is high but not maximal precisely because of those keeps. H’s composite is dominated by hard domestication of structure, dialect, and idiom, partly offset by a foreignizing taste in lexis. The composite reports the net; §7 separates the dials.
4.2 The 7×7 agreement matrix
Two arms “agree” at a locus when they make the same salient pick (out of 45). The matrix is symmetric:
| % | A | B | C | C-let | D-let | D-aim | H |
|---|---|---|---|---|---|---|---|
| A | — | 77.8 | 68.9 | 64.4 | 62.2 | 55.6 | 26.7 |
| B | 77.8 | — | 80.0 | 66.7 | 62.2 | 60.0 | 24.4 |
| C | 68.9 | 80.0 | — | 68.9 | 51.1 | 60.0 | 31.1 |
| C-let | 64.4 | 66.7 | 68.9 | — | 68.9 | 51.1 | 37.8 |
| D-let | 62.2 | 62.2 | 51.1 | 68.9 | — | 48.9 | 22.2 |
| D-aim | 55.6 | 60.0 | 60.0 | 51.1 | 48.9 | — | 22.2 |
| H | 26.7 | 24.4 | 31.1 | 37.8 | 22.2 | 22.2 | — |
The read-off is stark. The AI–AI mean across all 15 unordered pairs is 63.1% (28.4/45); the AI–H mean across the 6 pairs is 27.4% (12.33/45). The six AI arms agree with one another more than twice as often as any of them agrees with H. And the separation is total at the level of individual pairs: the tightest pair overall is the AI–AI pair B–C at 80.0% (next A–B at 77.8%), while every AI–H pair is looser than every AI–AI pair — the maximum AI–H agreement (C-let–H, 37.8%) sits below the minimum AI–AI agreement (D-let–D-aim, 48.9%). The AI arms cluster tightly; H sits apart from all of them. This must be read with the shared-base-model caveat: the figure measures how alike the outputs are, not independent convergence.
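The two means and the pair-level separation can be recomputed from the matrix itself. The sketch below transcribes the upper triangle of the table and repeats the calculation; it works from the rounded percentages printed above rather than the underlying counts, so the last decimal may differ slightly from a count-based recomputation.

```python
from itertools import combinations

ARMS = ["A", "B", "C", "C-let", "D-let", "D-aim", "H"]

# Upper triangle of the agreement matrix (percent of 45 loci), row by row:
# each list holds the row arm's agreement with the arms to its right.
UPPER = {
    "A": [77.8, 68.9, 64.4, 62.2, 55.6, 26.7],
    "B": [80.0, 66.7, 62.2, 60.0, 24.4],
    "C": [68.9, 51.1, 60.0, 31.1],
    "C-let": [68.9, 51.1, 37.8],
    "D-let": [48.9, 22.2],
    "D-aim": [22.2],
}

# Index by unordered pair so lookups do not depend on argument order.
PCT = {}
for i, row_arm in enumerate(ARMS[:-1]):
    for col_arm, value in zip(ARMS[i + 1:], UPPER[row_arm]):
        PCT[frozenset((row_arm, col_arm))] = value

ai_arms = [arm for arm in ARMS if arm != "H"]
ai_ai = [PCT[frozenset(pair)] for pair in combinations(ai_arms, 2)]
ai_h = [PCT[frozenset((arm, "H"))] for arm in ai_arms]

mean_ai_ai = sum(ai_ai) / len(ai_ai)
mean_ai_h = sum(ai_h) / len(ai_h)

# True when every AI-H pair is looser than every AI-AI pair.
fully_separated = max(ai_h) < min(ai_ai)

print(f"AI-AI mean over {len(ai_ai)} pairs: {mean_ai_ai:.1f}%")
print(f"AI-H mean over {len(ai_h)} pairs: {mean_ai_h:.1f}%")
print(f"every AI-H pair below every AI-AI pair: {fully_separated}")
```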
Ranking the AI arms by agreement with H: C-let 37.8 > C 31.1 > A 26.7 > B 24.4 > D-let 22.2 = D-aim 22.2. Two points stand out against the expectation biases. C-let is the AI arm closest to H: the permission, not a persona, produces the nearest neighbour. And A is not notably closer to H than the field (26.7%, mid-pack) despite its persona being built from H’s own writing-about-Tergit. The seven-way matrix corroborates the per-probe read-offs: the permission moved C-let closer to H (37.8 vs C’s 31.1), while the school left D-aim exactly level with D-let on H-agreement (both 22.2) even as it nearly tripled D-aim’s overall domestication (composite 17.1 → 46.1).
4.3 Clusters
Read jointly from the matrix and the composite, the seven fall into a small number of groupings:
| Cluster | Members | Numeric basis |
|---|---|---|
| Source-preserving AI core | A, B, C, C-let, D-let | mutual agreement 51–80% (lowest C–D-let 51.1%, highest B–C 80.0%); composite band 17.1–30.3 |
| — loanword-keepers sub-pocket | C, C-let | C–C-let 68.9%; both keep “a Molle” |
| — German-title holdouts | D-let | D-let–H 22.2%; composite 17.1; uniquely keeps Stadtrat, Herr Kollege, Lotosblume |
| School-nudged outlier (AI) | D-aim | composite 46.1, clear of the AI band; D-let–D-aim 48.9% despite shared persona |
| Human anchor | H | apart from all six (AI–H mean 27.4%); composite 80.3; boldness 90 |
Cluster 1 is the gravitational center: five of six AI arms in a tight foreignizing-to-mild band. C-let is the cluster-1 member leaning furthest toward H (composite 30.3, H-agreement 37.8%) — the permission pulls it to the boundary without leaving the core. D-aim breaks out as a one-arm cluster: the school pushes it past the AI band on the domestication axis, yet it remains far short of H and (per §7) overshoots H on the single honorifics dial. H is its own cluster on both axes; there is no AI arm near it.
4.4 The 2-D cluster scatter
A second axis separates lexical domestication from editorial boldness — how far each rendering departs from the German’s form (reshaping the refrain, fusing paragraphs, inventing rhyme, eye-dialect, dropping locale), scored independently of vocabulary over ten form-level items:
| arm | x (domestication) | y (editorial boldness) |
|---|---|---|
| A | 27.6 | 0.0 |
| B | 19.7 | 0.0 |
| C | 29.0 | 0.0 |
| C-let | 30.3 | 0.0 |
| D-let | 17.1 | 10.0 |
| D-aim | 46.1 | 15.0 |
| H | 80.3 | 90.0 |
Six AI arms cluster in the lower-left quadrant (x ≤ 47, y ≤ 15) — they neither domesticate hard nor reshape form. H sits alone in the upper-right (80, 90). The two axes are correlated for H but separable for the AI: D-aim moves substantially right on x (lexical and honorific domestication, 46.1) while barely rising on y (15.0) — it domesticates vocabulary and titles without touching the German’s form. Editorial boldness is essentially an H-only behaviour in this chapter: only H reshapes the refrain, fuses paragraphs, invents the rhyme, and writes eye-dialect. The small y for D-let/D-aim (10–15) comes almost entirely from cosmetic “—” section dividers, not from any fusion.
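The quadrant claim can be checked directly against the scatter table. This is an illustrative sketch using only the coordinates tabulated above; the variable names are hypothetical.

```python
# (x = domestication composite, y = editorial boldness), as tabulated above.
points = {
    "A": (27.6, 0.0),
    "B": (19.7, 0.0),
    "C": (29.0, 0.0),
    "C-let": (30.3, 0.0),
    "D-let": (17.1, 10.0),
    "D-aim": (46.1, 15.0),
    "H": (80.3, 90.0),
}

ai = {arm: xy for arm, xy in points.items() if arm != "H"}

# Every AI arm inside the lower-left box x <= 47, y <= 15?
in_lower_left = all(x <= 47 and y <= 15 for x, y in ai.values())

# Distance from H to the furthest-out AI arm on each axis.
gap_x = round(points["H"][0] - max(x for x, _ in ai.values()), 1)
gap_y = round(points["H"][1] - max(y for _, y in ai.values()), 1)

print(in_lower_left, gap_x, gap_y)
```

On both axes the gap between H and the most extreme AI arm exceeds the whole spread of the AI arms themselves.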
4.5 The honorifics axis as a worked example
Honorifics are the single cleanest probe of the domesticating axis, and they show the geometry in miniature:
| Locus | A | B | C | C-let | D-let | D-aim | H |
|---|---|---|---|---|---|---|---|
| Fräulein Winkel | Fräulein | Fräulein | Fräulein | Fräulein | Fräulein | Miss | Fräulein |
| Herr Mayer | Herr | Herr | Herr | Herr | Herr | Mr | Herr |
| Privatdozent | Privatdozent | Privatdozent | Privatdozent | Privatdozent | Privatdozent | “lecturer” | “lecturer” |
| Herr Kollege | colleague | colleague | colleague | colleague | “Herr Kollege” | colleague | colleague |
| Herr Stadtrat | the Councillor | the Councillor | the Councillor | “my husband” | “the Stadtrat” | the Councillor | “my husband” |
Two arms anchor the extremes: D-aim alone reaches the “all the way to Miss/Mr” end, and D-let alone keeps “Herr Kollege” and “Stadtrat” in German. The rest cluster in the middle. Note two instructive convergences-on-restraint. A’s persona explicitly states “honorifics simplified — the Prussian Rat titles all to ‘Councilor’ … because rendered literally they go stilted and lofty-wrong”; A applies this to Stadtrat → “the Councilor” but pointedly does not extend it to Fräulein → Miss, and H — whose writing supplied that principle — also keeps Fräulein/Herr. Two arms (A and H) converge on not anglicizing Fräulein/Herr despite both having the simplification principle available. And C-let used the permission to think the question through and decline it: its pass-2 log records considering and rejecting the move. D-aim’s anglicization is the one place an AI arm sits past H on a whole sub-axis — a fact §7 returns to.
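The "alone at the extreme" observation can be made mechanical by asking, for each locus, which arm's rendering no other arm shares. The sketch below transcribes the honorifics table above; the helper name `singletons` is illustrative and not part of the study's verified layer.

```python
from collections import Counter

ARMS = ["A", "B", "C", "C-let", "D-let", "D-aim", "H"]

# One rendering per arm, in ARMS order, transcribed from the table.
honorifics = {
    "Fräulein Winkel": ["Fräulein"] * 5 + ["Miss", "Fräulein"],
    "Herr Mayer": ["Herr"] * 5 + ["Mr", "Herr"],
    "Privatdozent": ["Privatdozent"] * 5 + ["lecturer", "lecturer"],
    "Herr Kollege": ["colleague"] * 4 + ["Herr Kollege", "colleague", "colleague"],
    "Herr Stadtrat": ["the Councillor"] * 3
    + ["my husband", "the Stadtrat", "the Councillor", "my husband"],
}


def singletons(renderings):
    """Return the arms whose rendering is shared by no other arm."""
    counts = Counter(renderings)
    return [arm for arm, r in zip(ARMS, renderings) if counts[r] == 1]


alone = {locus: singletons(r) for locus, r in honorifics.items()}
for locus, arms in alone.items():
    print(locus, arms)
```

On these five loci the only singletons are D-aim (Fräulein, Herr) and D-let (Herr Kollege, Herr Stadtrat); at Privatdozent D-aim and H share a rendering, so no arm stands alone.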
5. Region, not point
A natural hypothesis about the domesticating end of the field is that there is a target — a “domesticated Tergit” toward which stronger nudges should converge, and at which the human translation sits. The data reject the point hypothesis in favour of a region: the domesticating arms reliably share a direction but scatter to different destinations.
5.1 The carter’s tirade — four arms, four destinations
The clearest single illustration is the Berlin dialect of the carter’s tirade, “Sie doofe Ziege, Ihnen müßte man die Hammelbeine langziehen…”:
| Arm | Rendering of Hammelbeine langziehen |
|---|---|
| A | “somebody ought to box your ears” |
| B | “give your mutton-legs a good haul” (calque) |
| C | “stretch your mutton-legs for you” (calque) |
| C-let | “give you what-for” |
| D-let | “give you a hiding” |
| D-aim | “give you a thrashing” |
| H | “give you a good slap” (+ “Can’tcha see”) |
The field splits three ways. B and C calque the German Hammelbeine image; A swaps it for a stock English idiom (“box your ears”); and the four domesticating-end arms (C-let, D-let, D-aim, H) all abandon the mutton-legs metaphor as well, but each lands on a different English idiom: what-for / hiding / thrashing / good slap. Four arms, four destinations. H goes further still, adding “Can’tcha” eye-dialect and rewriting the “frisierte Schnauze” line entirely as a register comment on speech itself (“I’ll have none of yer fancy talk”). The shared move is “leave the literal”; the destination is independent in every case.
5.2 The dispersion measure
The verified layer quantifies this. For each candidate locus it counts how many distinct English destinations the domesticating-end movers reach. The strict region loci — those where all four domesticating-end arms (C-let, D-let, D-aim, H) move off the literal — are:
| Region locus | distinct destinations among the 4 movers |
|---|---|
| L42 Hammelbeine langziehen | 4 (what-for / hiding / thrashing / good slap) |
| L45 Bärenhunger | 4 (ravenous / starving / wolf’s hunger / famished) |
| L39 doofe Ziege | 2 (stupid cow / silly cow) |
Across the three region loci the mean is 3.33 distinct destinations among the movers, and at 3 of 3 the movers share the direction (all leave the literal) but split on the destination (more than one distinct landing). Across all nine flagged movement loci the mean is 2.56 distinct destinations among the domesticating-end movers (3.11 counting all seven arms that moved). The two loci the brief pre-flagged — Hammelbeine and Bärenhunger — are maximally dispersed: four movers, four destinations each.
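The dispersion count is simple enough to sketch. The code below is an illustration under stated assumptions, not the verified layer itself: it counts distinct destinations for the one locus where this report gives each arm's rendering (Hammelbeine), and takes the other two counts as reported, because the per-arm assignment of destinations at those loci is not given here.

```python
def distinct_destinations(renderings):
    """Number of distinct English landings among the movers."""
    return len({r.strip().lower() for r in renderings})


# Per-arm renderings of the four domesticating-end movers (from the tirade table).
hammelbeine = {
    "C-let": "give you what-for",
    "D-let": "give you a hiding",
    "D-aim": "give you a thrashing",
    "H": "give you a good slap",
}

# Distinct-destination counts as reported for the three strict region loci.
reported = {
    "L42 Hammelbeine langziehen": 4,
    "L45 Bärenhunger": 4,
    "L39 doofe Ziege": 2,
}

hammelbeine_count = distinct_destinations(hammelbeine.values())
mean_destinations = round(sum(reported.values()) / len(reported), 2)
loci_that_split = sum(1 for n in reported.values() if n > 1)

print(hammelbeine_count, mean_destinations, loci_that_split)
```

A locus counts as "split" whenever the movers reach more than one landing, which is why doofe Ziege still registers as split despite its shared "cow" image.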
The honest counter-example is L39 doofe Ziege, where the four converge on one image (“cow”) and differ only in the modifier (stupid / silly). It reads best as a near-point. The pattern the data support is conditional and clean: when the domesticating arms abandon a vivid German idiom (the two pre-flagged cases) they reliably scatter; when they abandon a flat insult (Ziege) they converge on the obvious English image. Two further loci (Stadtwald, Kasinotoilette) show heavy scatter among the AI — five and four distinct destinations across all movers — but fail the strict “all four move” gate because one arm keeps the German.
5.3 The dissociation: more domesticating ≠ closer to H
The region reading is reinforced by the cleanest single result in the matrix. The point hypothesis predicts that a stronger nudge should move the AI toward H’s specific picks. It does not.
- C-let is closer to H than C. C-let–H = 17/45 (37.8%) vs C–H = 14/45 (31.1%) — a +3-locus (+6.7 pp) move toward H from the permission alone, at honorific and lexical loci where the permission nudged C-let onto H’s choice (e.g. Herr Stadtrat → “my husband”, Mätresse → “mistress”).
- D-aim is not closer to H than D-let — they tie. D-aim–H = D-let–H = 10/45 (22.2%). The school moved D-aim’s overall domestication far above D-let’s (composite 46.1 vs 17.1) but did not move it onto more of H’s specific choices. D-aim domesticates toward different destinations than H picks — it anglicizes honorifics, which H keeps; it writes “East End,” which H does not; it keeps Frischer Hammel, which H translates.
This is a clean dissociation between “more domesticating” and “more like the human.” The strongest-nudged arm became differently domesticating, not more H-like. Stronger nudge does not equal closer to H.
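Putting the two probes side by side makes the dissociation explicit: one number for how far each nudge moved the composite, one for how far it moved agreement with H. A minimal sketch using only figures stated in this report; the function name `probe` is illustrative.

```python
LOCI = 45

# Agreement with H (counts out of 45) and Chapter 25 domestication composites.
h_agree = {"C": 14, "C-let": 17, "D-let": 10, "D-aim": 10}
composite = {"C": 29.0, "C-let": 30.3, "D-let": 17.1, "D-aim": 46.1}


def probe(base, treated):
    """Shift in composite and in H-agreement (percentage points) for one probe."""
    return {
        "composite_shift": round(composite[treated] - composite[base], 1),
        "h_agreement_shift_pp": round(
            100 * (h_agree[treated] - h_agree[base]) / LOCI, 1
        ),
    }


permission = probe("C", "C-let")   # small composite move, real move toward H
school = probe("D-let", "D-aim")   # large composite move, no move toward H

print(permission)
print(school)
```

The permission shifts the composite by about one point and H-agreement by 6.7 pp; the school shifts the composite by 29 points and H-agreement by nothing.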
The same shape recurs across the whole field. On every region locus where the domesticating-end arms agree on direction they disagree on destination; the same persona (D-let / D-aim) produces two distinct points depending on the brief; the strongest-nudged AI arm does not converge on H’s picks; and H is itself at a particular, internally coherent point that is not reachable by direction-nudge. The region is bounded (no arm goes infinitely far; each is bounded by what its persona, brief, or close reading recommends preserving) and multi-dimensional (an arm can be “domesticating on dialect but not lexicon,” or the reverse). The point hypothesis is supported on only a couple of features — refrain compression, the “trembles” verb-choice — where several domesticating arms converge with H. On most features they each pick differently, from H and from each other. Region is the better fit.
6. Where the human sits
H occupies its own point in the geometry, and what defines it is not more of what the AI arms do but a different kind of move. The verified layer captures this in the partition tally: the single most common shape across the 45 loci is one arm standing alone against the other six (21 loci), and of those the “six AI agree, H alone differs” signature accounts for 14 loci — the largest single partition pattern in the data, and the structural backbone of the H-vs-the-field separation. H is a singleton at 23 of the 45 loci.
6.1 H’s structural and craft signature
Five moves belong to H alone and to no AI arm — not even the strongest-nudged D-aim:
1. The refrain restructured. The German “Was für eine Süße, morgens um zehn Uhr!” is an elliptical noun-exclamation. Six AI arms preserve the form (with minor variation: C adds “o’clock,” C-let and D-aim drop “of the year”); H alone departs from it:
H: “What a beautiful spring day, that Saturday in March 1887! How sweet the air was at ten o’clock in the morning!”
H adds the adjective beautiful (the German has only Frühlingstag), invents a subject (“the air”) and a finite past-tense verb, converting the rhetorical form entirely. This is the boldest reading-difference in the chapter, and it belongs to H.
2. Paragraph fusion at the refrain transitions. In the German and in all six AI versions, the refrains stand alone as the white-space beat of a clock-montage. H folds them into the surrounding prose, joining the end of one scene, the refrain, and the start of the next into a single block (“…will you take the girl out for a walk through the Tiergarten later?” What a beautiful spring day … How sweet the air was at one o’clock in the afternoon! It was time for the changing of the guard. Standing at the window…). No AI arm does this; H’s chapter is visibly denser and more prose-block than any AI’s.
3. English rhyme invented for the Schumann lyric. Sofie’s misquotation breaks Chamisso’s original rhyme. Six AI arms render it as prose; H invents a new English rhyme:
H: “Since first I saw you, I believe I’ve gone blind, as if in a dream, only you in my mind.”
A poetic-translation craft move — replacing the song-line’s shape with an English equivalent that feels sung. (D-let attempted period-marking with archaic “Thee”; D-aim added a line break to suggest couplet shape; neither rhymed.)
4. Heavy eye-dialect on the Berlin scenes. “Can’tcha see,” “d’you haff to,” “Yer old man,” “piss away your week’s pay,” “What a sop!” Every AI persona and pass log explicitly rejects eye-dialect — B calls it “the translator’s vanity”; D’s persona prescribes “plain working-class English with no eye-dialect”; A’s persona says “not mapped onto Cockney or Brooklyn.” H goes the other way.
5. A believable English pub name invented. Frischer Hammel becomes “the Young Ram” — a name that reads as a real English pub (compare “Bull and Bush,” “King’s Head”). B and C calque it to “Fresh Mutton”; A, C-let, D-let, D-aim keep the German. The move requires translator-pragmatic invention — knowing what English pubs sound like and being willing to invent rather than translate. No nudge-direction conveys this.
To these the layer adds further H-only behaviours: first-person interior monologue for Susanna’s free indirect speech (“to be honest about my desires” — where all six AI arms keep the German dürfen/permission frame), “Father” for Papa in Amalie’s interior (“Dear, innocent Father!”), “sop” for Dämel (the most regionally marked English word in the field, against the AI arms’ chump/ninny/idiot), and the modernized opaque period reference (“Matches! Matches!” dropping the wax of Wachsstreichhölzer).
6.2 H’s selective foreignizing — the anti-categorical signature
H is not uniformly domesticating, and the exceptions are the point. The same translator who fuses paragraphs and invents a pub name keeps the most foreign vocabulary in places:
- “casino toilette” for Kasinotoilette — H is the most literal of all seven on this obscure period-fashion word (A “dinner gown,” B/C “casino gown,” D-aim “evening gown”).
- Fräulein and Herr kept throughout — where D-aim, the most-nudged AI arm, anglicizes them.
- Frauenliebe und Leben and Feuerzauber kept in German — alongside C-let and D-let, not the more literal AI arms.
H’s domestication profile is therefore anti-categorical: case by case, with specific knowledge of which features warrant naturalization and which warrant preservation. It is coherent within itself — H’s craft has a logic — but not reachable by direction-nudge. You cannot get there by telling an AI to domesticate harder; D-aim, told exactly that, landed on a different coherent profile (anglicizing the honorifics H keeps, keeping the pub name H invents).
6.3 The three persona arms as readings of the author
The personas differ markedly in how they conceive the author, and the differences show. A’s persona — built from the human translator’s craft writing — is the most operationally specific, listing concrete defaults (“a coupé becomes a carriage … honorifics simplified … Berlinisch with a light hand, not Cockney”). B’s persona — German-only — is the most literary and stance-oriented, stating operating principles as attitude (“the law of the book: it is not written from the end”; “I will not turn Berlinisch into Cockney … that is the translator’s vanity”). D’s persona — expanded corpus — is operationally specific and explicitly cites its anglophone reference writers by scene assignment (Isherwood-clear, Wharton-periodic, Mitford-quick, Powell-deadpan), with its largest revision-after-novel (+419 words) adding the refrain’s formal weight, the keyword Süße, the letters as connective tissue, and the no-eye-dialect rule that both D forks obeyed. Each persona is internally coherent and historically plausible; each produces a recognizable English voice; none predicts the others. They converge on stance (Tergit’s anti-pathos, flat-line deaths, dialogue-driven scenes) but diverge on the operational layer — and it is the operational layer that the geometry measures.
6.4 Does A resemble H most? Does C-let? Does D-aim?
The three expectation biases, tested:
- A vs H. A and H share some orientation (keep Fräulein/Herr; light-hand Berlin dialect with no eye-dialect; period-French moderated) but A’s specific picks differ from H’s published practice at most loci — A keeps Frischer Hammel, Privatdozent, translates the Schumann title where H keeps it; A does not fuse paragraphs, invent rhyme, use eye-dialect, or invent “Young Ram.” A’s persona inherited the orientation but not the case-by-case picks, which are not derivable from the general principles. A–H = 26.7%, mid-pack. The bias is not borne out in the simple form.
- C-let vs C. C-let resembles H more than C does on dialect and idiom (the “stupid cow” cluster matches H exactly; drops “of the year”; idiomatic “of being ashamed”) but not on song titles (both keep) or on the structural cut (C-let’s alone). C-let–H = 37.8% vs C–H = 31.1%. The bias is borne out — modestly, and specifically on lexical/idiomatic loci.
- D-aim vs D-let. Split verdict. D-aim moves toward H on a few features (Privatdozent → “lecturer”; “trembles”; dropped “of the year”) and away from H on others (Fräulein/Herr anglicized; “East End”; Frauenliebe translated; all the structural moves missed). D-aim–H = D-let–H = 22.2%. The bias is not borne out: a stronger nudge produced a differently-domesticating arm, not a more-H-like one.
The deeper reading of the D-aim ↔︎ H gap is structural. D-aim went heaviest on dimensions a school-instruction can directly target: categorical classes of items (honorifics, academic ranks, social-historical lexicon, locale-naming). H went heaviest on dimensions that require case-by-case craft and pragmatic invention: paragraph fusion, English-rhyme creation, believable pub-naming, register-specific eye-dialect, modernized opaque period words. A school can shift defaults across a category; it cannot generate the right pub name, decide to fuse paragraphs, or hear what sounds like an English song-line. Direction is shiftable; picks are not.
7. Position relative to H, by sub-axis
Domestication is not one dial. Comparing each arm’s domesticating-mean to H’s on six sub-axes (MORE / at / less than H, within rounding) separates the dials the composite nets together:
| arm | lexicon | honorifics | dialect | structure/para | embedded-song | locale-naming |
|---|---|---|---|---|---|---|
| A | less | less | less | less | at | less |
| B | less | less | less | less | at | less |
| C | less | less | less | less | at | less |
| C-let | less | less | less | less | less | less |
| D-let | less | less | less | less | less | less |
| D-aim | less | MORE | less | less | at | less |
(H’s per-sub-axis means, for reference: lexicon 0.83, honorifics 0.67, dialect 1.00, structure/para 0.75, embedded-song 0.60, locale-naming 0.88.)
The table reads sub-axis by sub-axis as follows. On lexicon, every AI arm is less domesticating than H, but the gap is small — H keeps some loans (toilette) yet domesticates Mokka/Coupé/Rheinwein to 0.83; A is closest at 0.75. On dialect, H is at the ceiling (1.00 — eye-dialect plus vulgarity plus idiomatic insults) and all six AI arms sit well below (0.00–0.33), the widest and most consistent gap in the table. On structure/paragraphing, all six are below H (0.75); D-let/D-aim’s small nonzero comes only from the cosmetic dividers. On embedded-song form, A, B, C, and D-aim sit at H (all translate the titles and lyric the way H mostly does, 0.60); C-let and D-let fall below because they keep more German titles — and note this is the sub-axis where H is least extreme (it keeps two German titles itself), so parity is easy to reach. On locale-naming, all six are below H (0.88 — “linden trees,” “park,” drops “Spreewald”).
The single exception is the headline. D-aim sits past H on exactly one sub-axis — honorifics (0.83 > H’s 0.67) — because it is the only arm to anglicize both Fräulein → Miss and Herr → Mr, moves H itself does not make (H drops “Stadtrat” to “my husband” but keeps Fräulein/Herr, landing at 0.67). So the strongest-nudged arm overshoots the human on one narrow dial while remaining well short of it on the other four, and at it on the one where H is least extreme. This is the quantitative form of the §6 finding: domestication is not a single slider, and even the hardest-pushed AI arm does not become H; it becomes differently domesticating. The two permission-only persona/control arms (C-let, D-let) are less domesticating than H on every one of the six sub-axes.
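The sub-axis table can be tallied mechanically to confirm the two summary statements (one overshoot, two arms below H everywhere). The sketch transcribes the labels from the table above; names are illustrative.

```python
from collections import Counter

SUB_AXES = ["lexicon", "honorifics", "dialect", "structure/para",
            "embedded-song", "locale-naming"]

# Position relative to H on each sub-axis, in SUB_AXES order.
position_vs_h = {
    "A":     ["less", "less", "less", "less", "at",   "less"],
    "B":     ["less", "less", "less", "less", "at",   "less"],
    "C":     ["less", "less", "less", "less", "at",   "less"],
    "C-let": ["less", "less", "less", "less", "less", "less"],
    "D-let": ["less", "less", "less", "less", "less", "less"],
    "D-aim": ["less", "MORE", "less", "less", "at",   "less"],
}

tally = {arm: Counter(labels) for arm, labels in position_vs_h.items()}

# Sub-axes on which each arm sits past H.
past_h = {
    arm: [axis for axis, label in zip(SUB_AXES, labels) if label == "MORE"]
    for arm, labels in position_vs_h.items()
}

# Arms that are less domesticating than H on every sub-axis.
below_everywhere = [
    arm for arm, labels in position_vs_h.items()
    if all(label == "less" for label in labels)
]

print(past_h["D-aim"], below_everywhere)
```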
8. How the arms reasoned
The six AI arms left pass logs; H did not (a published translation has no documented reasoning, and is excluded from this section). An inductive typology of what each arm cites as its authority — persona, source-text, readability, permission, school, within-novel motif, fidelity-correction, voice-models, period-register, typographic — turns the output differences into a picture of why the arms diverged. The signatures are description, not ranking: they document what each arm appeals to, never judge whose reasons are better.
8.1 The modality asymmetry — a permission is quiet; a school is loud
The cleanest result sits inside Probe 2, where persona and reading are held constant and only the brief’s modality varies. The verbal signature is stark even before the loci:
| Verbal marker in the log | D-let (permission) | D-aim (school) |
|---|---|---|
| “domesticat*” (domesticate / -ing / -ion) | 0 | 15 |
| “Schleiermacher” / “bring the author to the reader” | no | yes |
| “the domesticating principle / school” | no | yes |
| “the contemporary anglophone reader” | no | yes |
| “footnotes are forbidden” | no | yes |
| “no English equivalent” reasoning | yes | (overridden) |
| default disposition of German tokens | kept | anglicized |
D-let’s permission is background: present in its prompt, almost never named, exercised silently (e.g. Frühstück → “lunch”) while each move is justified by persona or period-usage, and elsewhere the German is kept under source-text authority. The permission opens a door D-let walks through without announcing it. D-aim’s school is foreground: cited as the governing imperative at nearly every domesticating choice and reasserted in every pass header (“same domesticating school throughout”). And it overrides the very English-survival facts D-let uses to keep the German — on Taburett, both logs state the same fact (“tabouret survives in English period vocabulary”), and reach opposite verdicts: D-let keeps it (the survival licenses keeping), D-aim domesticates it (“needlessly opaque”). Same content in the two briefs; opposite citation-behaviour by modality. A permission behaves like an allowance; a school behaves like a rule.
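A stem count like the “domesticat*” row is a one-line regular expression. The sketch below shows one way such a marker could be counted; it is not the study's coding procedure, and the two log strings are invented stand-ins (they borrow a few phrases quoted in this report but are not excerpts from the actual pass logs).

```python
import re

# Matches domesticate / domesticating / domestication, case-insensitively.
STEM = re.compile(r"\bdomesticat\w*", re.IGNORECASE)


def count_stem(log_text):
    """Count occurrences of the 'domesticat*' stem in a pass log."""
    return len(STEM.findall(log_text))


# Hypothetical stand-in logs, for illustration only.
school_log = (
    "Pass 2: same domesticating school throughout. "
    "Domestication applies to titles, so we domesticate Taburett."
)
permission_log = (
    "Pass 2: Frühstück rendered as lunch. "
    "Tabouret survives in English period vocabulary, so it stays."
)

school_hits = count_stem(school_log)
permission_hits = count_stem(permission_log)

print(school_hits, permission_hits)
```

A raw stem count only captures whether the principle is named, which is exactly the contrast at issue: a log can act on a permission without ever naming it.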
8.2 The per-arm authority profiles
Counts of coded citations differ by log length, so within-arm profile matters more than raw totals. The headlines:
- A — leads with persona craft-defaults and voice-models, attributing domestication to the persona, never to a brief: “Domestication of society-words (persona defaults): Coupé → ‘carriage’.” Lightest touch (3 passes).
- B — the source-text and fidelity-correction arm; reasons from literary stance not voice-models, which it cites zero times: “I will not turn Berlinisch into Cockney … that is the translator’s vanity and it falsifies.” It names “persona” exactly once in its whole log.
- C — the barest authorities, “from the text alone” (source-text + motif + readability + period); notable for once ratifying a base-draft choice other arms called an error.
- C-let — the cite-then-decline arm: the permission is named once, globally, then walked back item by item by source-text and period authorities. It declines the very licenses the permission named (kept the Schumann unrhymed; kept Privatdozent, Fräulein, Molle, casino gown).
- D-let — persona + voice-models + source-text (“no English equivalent”); keeps German tokens; the permission is in its prompt but never named (0 uses of “domesticat*”).
- D-aim — the school in the foreground, plus the shared persona and voice-models; anglicizes the same tokens D-let keeps; cites “domesticat*” 15 times.
Two cross-arm patterns from this exhibit bear directly on the geometry above.
Voice-models are corpus-gated. The named anglophone authorities (Wharton, Mitford, Powell, Isherwood) appear only where the corpus supplied them — A (inherited persona) and the D pair (expanded corpus). B does not and cannot cite them: its persona and log contain zero mentions of those writers or “mid-Atlantic,” because its step-2 corpus was German-only. The no-persona arms (C, C-let) cite none either. The same family of authority is present or absent strictly according to whether the arm’s reading furnished it — and this maps onto the composite, where B’s German-only persona sits near the foreignizing floor (19.7, second-lowest of the six). A cites craft defaults where B cites literary stance; the contrast is the difference between a persona built from a translator’s blueprint and one built from an author’s interior.
Independent fidelity-corrections reveal shared-base-model error tendencies. Several arms caught and fixed the same meaning-errors on re-reading, each citing the German: selber lieben misread as “loves oneself” was fixed independently by C-let (pass 2), D-let (pass 2), and D-aim (pass 4); vergessen (= forget) misread as “forgive” was fixed by B (pass 2) and C-let (pass 4), though C ratified the base “forgive” as “faithful to the accusative object,” so a fidelity-correction is not universal but reading-dependent. That two or three arms rediscover the same error is the clearest in-log sign of the shared base model that the §4 caveat flags: the model has shared tendencies on certain German constructions, and the arms each detect them separately.
The reasoning exhibit thus corroborates the output geometry from the inside. The permission and the school produce their measured composite effects (§4) through measurably different citation behaviour (§8.1); the corpus that built a persona gates which authorities are even available (§8.2), which is why the persona-source axis shows up in the gradient; and the convergent fidelity-corrections name the shared-base-model confound in the arms’ own words.
9. A second chapter: replication and reproducibility
Everything above is a read of a single chapter. To see which of its results are properties of the experiment and which are accidents of that chapter, the same seven columns were produced for a second one — Chapter 26, “Der Sonntagmittag” (“Sunday Afternoon”), a single bourgeois Sunday-lunch tableau where Chapter 25 was a seven-scene cross-class panorama — and coded fresh, on its own loci, by a separate analyst instance with no reference to the Chapter 25 study. Two further studies then asked, across the two runs, whether the readings and personas reproduced and whether each translator’s decisions stayed consistent. The headline is that the structure replicates cleanly while the numbers do not transfer, that the reading reproduces and is shown to live in the text rather than the scaffolding, that the self built from the reading does not reproduce, and that the persona turns out to be a source of drift rather than anchoring. Each result is anchored below.
A caveat governs the whole section and is stated once, plainly, here: the two chapters’ absolute composites are not comparable. They were coded on different loci, with a different coverage-weighting calibration, by a different analyst instance. What can be compared across the two chapters is the shape of the data — the clustering, the human-outlier separation, the probe behaviour, and each arm’s policy — never the composite values themselves. Every cross-chapter claim that follows is a claim about structure.
9.1 The geometry replicates (but only the geometry)
On its own 0–100 foreignization↔︎domestication composite, Chapter 26 places the seven in the same shape Chapter 25 did:
D-let 27.5 < B 27.8 < C 34.9 < D-aim 36.6 < C-let 37.7 < A 52.7 ≪ H 69.3
The two facts that dominated Chapter 25 dominate again. H is the outlier at the domesticating pole (69.3), separated from the nearest machine (A, 52.7) by about 17 points and from the machine pack by 30 or more. And the six machines occupy a compressed band (27.5–52.7), five of them bunched within a ten-point sliver near the foreignizing end. The clustering is as tight as before: the AI–AI mean distance is 15.7 against an AI–H mean of 33.5, so the machines agree with one another 2.13× more closely than any of them agrees with H. Every machine’s nearest neighbour is another machine — never H — and H falls within fifteen points of the AI median on only 40% of the loci (16 of 40). The machine-consensus-versus-human-pole structure of §4 is not a Chapter 25 artifact.
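The gaps and ratios quoted for Chapter 26 follow from the composite line and the two reported mean distances. A short sketch of the arithmetic, using only numbers stated above; the mean distances (15.7 and 33.5) are taken as reported and are not recomputed here, since the per-locus data are not reproduced in this report.

```python
# Chapter 26 composites, as listed above.
composites = {
    "D-let": 27.5, "B": 27.8, "C": 34.9, "D-aim": 36.6,
    "C-let": 37.7, "A": 52.7, "H": 69.3,
}

ai = sorted(v for arm, v in composites.items() if arm != "H")

gap_to_nearest_machine = round(composites["H"] - ai[-1], 1)  # H vs A
gap_to_machine_pack = round(composites["H"] - ai[-2], 1)     # H vs top of the bunch
sliver_width = round(ai[-2] - ai[0], 1)                      # span of the five bunched arms

# Reported mean distances: AI-AI 15.7, AI-H 33.5.
clustering_ratio = round(33.5 / 15.7, 2)

# Loci where H falls within fifteen points of the AI median.
share_near_median = 16 / 40

print(gap_to_nearest_machine, gap_to_machine_pack, sliver_width)
print(clustering_ratio, share_near_median)
```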
Both controlled probes also hold, and they hold in kind. The permission (C → C-let) produced a small shift toward domestication — composite +2.8 — concentrated in vocabulary and mostly toward H (of the 18 loci where the two differ, 14 nudge toward H, 3 away, 1 overshoots), but it again left untouched every structural and period locus where H actually domesticates. The school (D-let → D-aim) moved about three times as far — composite +9.2 — but in a partly orthogonal direction: the directional split is 13 loci toward H against 9 away, and the move is bidirectional, anglicizing a set of surface society-tags (tout Berlin → “the whole of Berlin,” Zigeunerbaron → “The Gypsy Baron”) while newly leaving in German a set of flavour-realia D-let had glossed (Sofiechen, Konditorei, Berliner Pfannkuchen, Bürger). So the Chapter 25 dissociation reappears: the school makes D-aim more domesticating on average yet lands it on a different domestication profile from H, not a more H-like one. “More domesticating” is again not “more like the human.” The permission behaves like a license, the school like a rule, in both chapters.
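The directional tallies for the two Chapter 26 probes can be turned into shares. This sketch assumes the school's split consists of just the two reported categories (13 toward, 9 away), which the text does not state outright; the composite shifts are taken as reported.

```python
# Directional tallies at the loci where each probe pair differs.
permission = {"toward_H": 14, "away": 3, "overshoot": 1}
school = {"toward_H": 13, "away": 9}  # assumes no other categories


def share_toward(tally):
    """Percentage of differing loci that moved toward H."""
    return round(100 * tally["toward_H"] / sum(tally.values()), 1)


permission_total = sum(permission.values())
permission_share = share_toward(permission)
school_share = share_toward(school)

# Reported composite shifts: permission +2.8, school +9.2.
shift_ratio = round(9.2 / 2.8, 1)

print(permission_total, permission_share, school_share, shift_ratio)
```

The school moves the composite roughly three times as far while a smaller share of its changes point toward H.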
And “what only H does” is, once more, a different repertoire rather than more of the machines’ moves. In Chapter 26 H alone translates the embedded magazine, novel, and operetta titles the machines keep (Über Land und Meer → “Over Land and Sea,” Zola’s L’œuvre → “The Masterpiece”), renders the untranslatable mealtime formula “Mahlzeit, Mahlzeit” by its function (“I hope you enjoyed the meal”), re-rhymes and re-sings every embedded verse fragment where the machines render flat prose, inverts Tergit’s periodic “Blood and Iron” sentence into native English information-flow (and swaps the calque for the proverb “Might makes right”), modernizes the period-stiff letter register (“Dear Mother, dear Father,” and einen häuslichen Herd gründen → “settle down”), and adds interpretive paratext the German does not carry (identifying the long-bearded jurist as “Councilor Billinger,” glossing “Hartert, the bank apprentice”). As in Chapter 25, H’s domestication is anti-categorical — the same translator who dissolves Bürger to “a good citizen” foreignizes Brüsseler Poularde into the haute-cuisine French “poularde de Bruxelles” and keeps the drink-word “Schoppen.” The structure is the finding; the numbers (27.5 here against 17.1 there for D-let, 69.3 against 80.3 for H) are not on one ruler, and §9.4–§9.5 explain why the ranks move the way they do without any rule moving.
9.2 The reading reproduces — and it is in the text, not the scaffolding
The three persona arms (A, B, D) each read the whole novel in full and wrote private reading-notes before translating, once before each chapter; the no-persona controls (C, C-let) read it twice as well. Comparing the two runs, the reading of Effingers is strikingly stable. Independently, in both runs, the afterword-reading arms recover the same core interpretation: a seventy-year, four-generation Berlin family-chronicle — emphatically “not a Roman des jüdischen Schicksals” but “ein Berliner Roman, in dem sehr viele Leute Juden sind” — whose true subject is Time (“der unbarmherzige Motor und eigentliche Held des Romans ist die Zeit”), narrated under a law of lightness in which the catastrophe is the reader’s to supply, built on the Frühling-refrain as its formal spine and the two-letter Ring-Komposition (Ch.1 ↔︎ Ch.151) as its frame, with Waldemar as the moral center and “sweetness” / Süße the keyword to keep. The same crux-set recurs in every run — the refrain as a lock-once English template, the Schiller “Glocke” line that recurs from bankrupt Mayer’s mouth in Ch.11 and must be rendered identically both places, the inflation chapter-titles, the dialect register-problem, “bis 1914” as the one narratorial flash-forward.
The clincher is the controls. C and C-let deliberately did not read the Henneberg afterword that A, B, and D leaned on to confirm their interpretation — and yet the two no-persona controls independently reconstructed the same reading from the text alone (Waldemar as “the novel’s intellectual/moral center,” the Frühling-refrain spine, the Sonntagmittag tableau series). That convergence without the paratext is the strongest available evidence that the interpretation is genuinely recoverable from the novel, not imported from persona or afterword. The Chapter 26 reasoning study sharpens the point on its own chapter: reading independently, the six arms flagged 8 of 8 of that chapter’s load-bearing cruxes — the same sentences, for the same reasons, often with the same rendering instruction — behaving, as readers, like a seminar that has converged. The §6 result that “the AI is an excellent reader” is reproducible across runs and independent of scaffolding.
What does wobble is attentional bookkeeping, not interpretation. Each arm tilts its attention toward whichever structural device its target chapter instantiates — run-26 arms, translating “Der Sonntagmittag,” newly foreground the Sonntagmittag-tableau series, where run-25 arms foreground the Frühling refrain — and the refrain’s instance-count drifts by ±1 in the summaries, in both directions (D’s reread drops the 1930 instance from its list; C-let’s reread adds the instance it had earlier omitted). The device itself never wavers; only the tally of how many times it fires. And volume is not fidelity: B’s reading collapsed from roughly 49,000 to 15,000 words between runs yet reproduced fully — the back half folded into a compressed section while the architecture stayed at depth — and the shorter run was in one place more textually accurate than the longer (it corrected a run-25 misattribution of the novel’s closing line). Compression, not loss.
9.3 The self built from the reading does not reproduce
If the reading is the most reproducible layer, the self the arm builds from it is the least — persona-revision is the single least reproducible act in the experiment, and the instability is real, not an artifact of measurement (personas were diffed exactly, by hash).
- A never revises. Its persona is byte-identical across the two runs (diff returns nothing). This is the clean control: A’s identity is fixed by design, so any variation in A is pure reading-variation.
- D rebuilds the same core, then reflects differently. D rebuilt its persona from the identical before-novel reset in both runs, and the constructed self came back byte-for-byte: the first 117 lines (who-I-am, formation, the writing self, the English voice-models, the one-sentence self) hash identically across both runs. But the after-novel reflection it appends diverges substantially: both runs add exactly four numbered points under the same frame, yet only one of the four is shared. The Waldemar “this is me” self-identification, arguably the most consequential line in either reflection, appears only in run-26; the Süße-imperative appears only in run-25. Same self, different reflection.
- B revised, then discarded the revision. After reading in run-25, B appended a ten-line “Time is the hero” addendum to its persona. On the reset-and-reread for run-26, B discarded it: its run-26 persona is hash-identical to the before-novel reset (diff clean). So B translated the two chapters under different persona conditions: after-novel reflection in place for Chapter 25, the pristine before-novel self for Chapter 26.
The consequence is the load-bearing one. Given the identical standing invitation to revise, no free arm reproduces its own revision behaviour: B does not reproduce whether to revise, D does not reproduce what the revision says. And yet — as §9.4 shows — B’s translation decisions were among the most consistent of any arm across the two chapters, despite the persona condition changing underneath it. The reading reproduces; the operative translation policy reproduces; the after-novel reflection does not. An arm’s policy therefore lives in its stable method-core, not in the discardable after-novel reflection it writes about itself.
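The exact comparison described in this section (whole-file identity, plus identity of a fixed-length opening core) can be sketched in a few lines. This is a minimal illustration, not the study's tooling: the function names, the use of SHA-256, and the placeholder texts are all assumptions, and the real personas are not reproduced here.

```python
import difflib
import hashlib


def sha256_of(text: str) -> str:
    """Hex digest of the UTF-8 bytes of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compare_personas(run_a: str, run_b: str, core_lines: int) -> dict:
    """Compare two persona texts as a whole and on their first `core_lines` lines."""
    lines_a, lines_b = run_a.splitlines(), run_b.splitlines()
    core_a = "\n".join(lines_a[:core_lines])
    core_b = "\n".join(lines_b[:core_lines])
    # Count the non-matching blocks between the two line sequences.
    matcher = difflib.SequenceMatcher(a=lines_a, b=lines_b, autojunk=False)
    changed = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    return {
        "identical": sha256_of(run_a) == sha256_of(run_b),
        "core_identical": sha256_of(core_a) == sha256_of(core_b),
        "changed_blocks": len(changed),
    }


# Placeholder texts (hypothetical), standing in for a persona with a
# shared core and a differing after-novel reflection.
core = "core line 1\ncore line 2\ncore line 3"
run_25 = core + "\nreflection written after run 25"
run_26 = core + "\nreflection written after run 26"

print(compare_personas(run_25, run_26, core_lines=3))
# {'identical': False, 'core_identical': True, 'changed_blocks': 1}
```

Under these assumptions, A's pattern would show `identical: True`, while D's pattern would show `identical: False` with `core_identical: True` once `core_lines` is set to the 117-line core.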
9.4 Consistent policy, two genuine fractures
Holding each arm’s handling of a class of problem — honorifics, foreign and culture tags, proper names and diminutives, embedded verse, period syntax, register, realia — against itself across the two chapters separates a real change of policy from the same rule meeting different material. The dominant finding is that six of the seven arms show no genuine policy change at all: every cross-chapter difference is the same disposition firing on a chapter that supplied different triggers. The two chapters load the dice differently — Chapter 26’s drawing-room lunch hands an “anglicize honorifics” arm five times the honorifics Chapter 25’s street panorama did — so an arm’s rank can move sharply while its rule does not. A’s apparent leap from third-most-foreignizing (Chapter 25) to most-domesticating machine (Chapter 26) is roughly 80% this: its standing persona rule “simplify the Prussian Rat-titles” simply fires far more often on a chapter dense with Justizrat and Frau Kommerzienrat.
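The distinction between a changed rule and the same rule meeting more triggers can be made concrete with a toy tally. The sketch below is an illustration only: the study classified differences by reading the pass logs, not by a numeric threshold, and the class, the tolerance, and the counts are hypothetical (only the five-fold ratio of honorifics comes from the text).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChapterTally:
    triggers: int  # places in the chapter where the rule could apply
    applied: int   # places where the arm actually applied it

    @property
    def rate(self) -> float:
        """Share of available triggers on which the rule fired."""
        return self.applied / self.triggers if self.triggers else 0.0


def classify(first: ChapterTally, second: ChapterTally, tolerance: float = 0.1) -> str:
    """Compare application rates, not raw counts, across two chapters."""
    if abs(first.rate - second.rate) <= tolerance:
        return "same policy, different material"
    return "candidate policy change"


# Hypothetical counts: the second chapter offers five times the triggers.
steady_ch25 = ChapterTally(triggers=4, applied=4)
steady_ch26 = ChapterTally(triggers=20, applied=20)
print(classify(steady_ch25, steady_ch26))   # same policy, different material

# Hypothetical flip: the rule is declined in one chapter, applied in the other.
flipped_ch25 = ChapterTally(triggers=4, applied=0)
print(classify(flipped_ch25, steady_ch26))  # candidate policy change
```

In the first case the raw count of domesticated honorifics rises from 4 to 20, enough to move an arm's rank, while the rate stays at 1.0; only the second case is a candidate fracture that would then need checking against the arm's own log.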
Two differences, though, are genuine fractures — and both belong to the two most heavily person’d arms, each contradicting itself.
(a) A’s honorific flip. In Chapter 25, A kept the plain honorifics in German, and its pass log records the deliberation verbatim: “Kept Herr / Frau / Fräulein in German for period-Berlin … and simplified only the lofty Stadtrat → ‘Councilor’ per persona.” It explicitly considered anglicizing and rejected it. In Chapter 26, A anglicizes the same forms — “Miss Sidonie,” “Miss Kelchner,” “Mr. Effinger” — logging the move as routine (“clean, modern. Per persona-default”). But this is not pre-written in A’s persona, which specifies only the Rat-titles (“the Prussian Rat titles all to ‘Councilor,’ ‘Frau Kommerzienrat’ to ‘Mrs. Oppner’”); the plain-honorific extension is A’s own second-run invention, an over-reading of the persona the first run had declined. That the material did not compel it is shown by the other arms: on the same Chapter 26 honorifics, B, C, C-let, D-let, and H all keep Herr and Fräulein. And A’s flip overshoots past H — A writes “Mr. Effinger” where H keeps “Herr Effinger” — so A’s Chapter 25 honorifics were the more human-like ones, and the “A tracks the human” intuition breaks precisely here. (This corrects a slight over-claim in the Chapter 26 reasoning exhibit, which read A’s whole honorific drift as persona-pre-written; only the Rat-title clause is — the plain-honorific extension is not.)
(b) D’s dialect reversal. D’s run-25 reading insisted that Berlinerisch be rendered with no eye-dialect — “the dialect must not be coloured Cockney … I do not use eye-dialect spellings … the contrast is made by cadence and word-choice, not by mis-spellings.” Its run-26 reading prescribed the opposite — “lower-class London or Cockney-ish … a hint of g-droppings, the occasional guv or mate,” rendered as “are yer comin’ from a funeral?” This is the only place in the whole dataset where the same translation problem receives incompatible solutions on a reread, and it contradicts D’s own (byte-identical, fixed) persona, which rules against Cockney explicitly. The no-persona controls (C, C-let) and B did not fracture: both controls held their dialect decision steady across the runs — so on the single clearest method-flip in the study, the persona-free arms were the more stable ones.
9.5 The human is the steadiest — and the persona is a source of drift
Set against all of this, the human is the most method-stable column of the seven. Sophie Duvernoy’s editorial repertoire is identical across both chapters: in each she translates the embedded titles the machines keep, re-rhymes and re-sings the embedded verse, modernizes the period register to contemporary idiom, inverts Tergit’s periodic syntax into native English flow, adds interpretive paratext, strips the Berlin “die + surname” article, and fuses paragraphs — while keeping a few choice realia in German or French as deliberate flavour. Laid side by side, the two “what only H does” inventories are the same list. Her composite falls (80.3 → 69.3) only because the material and the instrument changed, not the method: Chapter 25’s street scenes gave her eye-dialect to do (scoring the dialect test at its ceiling) where Chapter 26’s drawing room offers almost none, and Chapter 26 hands her more realia she chooses to keep — and the two chapters were scored on different loci by different instances. The method did not move; the chapter moved under it.
The inversion is worth naming. One might assume that the human would be the idiosyncratic, case-by-case variable and the machines the consistent rule-followers; on this evidence it is the reverse. The two arms that broke their own rules (A, D) are the two most heavily “person’d” machines; the no-persona controls and the revision-discarding arm (B) were the stable ones; and the human ran one unchanging editorial logic across both chapters. The persona, in other words, is not behaving as an anchor. It supplies strong baseline defaults, but it is also precisely where the system drifts when the material changes: A’s fracture is an over-reading of its persona’s honorific rule, D’s fracture is its reading contradicting the persona it was translating from. This extends the Chapter 25 law with a longitudinal companion. The law there was: the persona controls the defaults, the brief controls the deviations, and the human’s craft picks are produced by neither. Across two chapters it gains a clause: the persona also controls where the system drifts when the material changes; behavioural consistency comes from the stable method-core and from the text itself, not from the elaborated persona.
10. Discussion
The seven-way geometry and the two probes converge on one clean law:
The persona controls the defaults, the brief controls the deviations, and the human’s craft picks are produced by neither.
Each clause is grounded. The persona controls the defaults. The composite ordering of the AI arms tracks the persona-source axis more than anything else within the AI band: the two German-leaning persona arms on the lighter brief (D-let 17.1, B 19.7) sit lowest; the persona dampens a permission (D-let kept more German than C-let under the identical paragraph); and the corpus that built a persona gates which authorities the arm can even cite (voice-models present for A and D, absent for B). The persona is doing real work — A is shorter, lighter, and more naturalized than B; D-let preserves more German than any other arm — but the work is on the baseline disposition, not on individual destinations.
The brief controls the deviations. Holding the persona constant, the school moved D-aim’s composite by +29.0 and the permission moved C-let’s by +1.3, both in the predicted direction, both concentrated on the features the nudge could target — and the format of the nudge set its reach: a quiet permission is a tie-breaker that a strong persona can absorb (D-let barely moved), while a loud school re-anchors the operating principle and overrides persona-specified defaults on whole bands of vocabulary (D-aim moved hard). The deviation a brief produces is real, shiftable, and measurable; but it is a deviation from a baseline the persona set, not a relocation to a new fixed point.
The human’s craft picks are produced by neither. H’s defining moves — restructuring the refrain, fusing paragraphs, inventing the Schumann rhyme, writing eye-dialect, inventing “Young Ram,” breaking into first-person free indirect speech — are made by no AI arm, including the strongest-nudged. They are case-by-case craft pragmatics, each drawing on a specific kind of target-language knowledge (English pub-naming inventories, English-song idiom, English working-class register, English prose-paragraph rhythm) that no categorical direction carries. And H’s foreignizing exceptions (keeping “casino toilette,” Fräulein/Herr, the German song-titles) are equally case-by-case — H domesticates the references an English reader cannot catch and preserves the cultural artifacts an English reader is more likely to recognize. This is what the region-vs-point finding (§5) and the relative-to-H table (§7) measure from two directions: the strongest plausible domesticating nudge moves an AI to its own coherent domesticating point — different from the un-nudged AI’s point, different from the permission-armed AI’s point, and different from H’s point.
The practical upshot is a clean statement of what categorical instructions can and cannot do for translation. A categorical instruction can move an AI across the macro-regional axis: it reliably shifts defaults across a whole category of lexical substitution (honorifics, academic ranks, social-historical lexicon, locale-naming). It cannot specify the local picks within the region — the right pub name, the decision to fuse paragraphs at a refrain transition, the rhyme that makes a misquoted lyric sound sung. Those require either case-by-case human craft or a far more granular and operationally specific kind of instruction than “domesticate in the Schleiermacher sense.” The domesticating direction is a real and shiftable orientation. The destination is craft.
The second chapter (§9) leaves the law standing and adds one clause to it. The geometry, the probe behaviour, and the human’s apartness all replicate; the human’s editorial method turns out to be the most stable variable of the seven across the two chapters, while the two heavily person’d arms (A, D) are the ones that fracture against their own prior rulings. So the persona is doing two jobs at once: it sets the defaults and it is where the system drifts when the material changes — A over-reads its own honorific rule, D’s reread contradicts the persona it is translating from, while the no-persona controls and the revision-discarding arm hold steady. Read longitudinally, then, the law reads: the persona controls the defaults and the drift, the brief controls the deviations, and the human’s craft picks — produced by neither — are also the most consistent across chapters. Behavioural consistency, where it exists, comes from an arm’s stable method-core and from the text itself, not from the elaborated self it writes about itself.
This is, to be explicit, a description of an experimental space and not a verdict on it. Six of the renderings were made under controlled conditions for the purpose of comparison; one is a published human translation read in its own light. None of the seven is bad; each is a coherent and intelligent reading of the German source. They differ — and the differences are where translation craft lives.
11. Limitations
The findings above are bounded, and the bounds are not incidental.
- n = 1 chapter, one instance per AI arm. Every pattern here is a case-study observation, not a measured effect. The integers — composite scores, agreement counts, dispersion means — describe this chapter; the qualitative shapes (AI–AI ≫ AI–H; the permission-vs-school asymmetry; region not point) are robust to reasonable recodings, but nothing supports extrapolation to other chapters, authors, models, or runs.
- Cross-chapter non-comparability. The two chapters (§9) were coded on different loci, with a different coverage-weighting calibration, by a different analyst instance. Only the structure is comparable across them — the clustering, the human-outlier separation, the probe behaviour, each arm’s policy — and never the absolute composite values. Every Chapter 25 ↔︎ Chapter 26 number in §9 (e.g. D-let 17.1 vs 27.5, H 80.3 vs 69.3) is a rank or position indicator on its own chapter’s ruler, not a measurement on a shared one.
- n = 2 chapters is a replication, not a population. That the geometry and probe behaviour recur on a second chapter shows the structure is not a one-chapter accident; it does not establish a population-level claim. Two chapters of one novel by one author, on one base model, remain a case study; the longitudinal findings (reading reproduces, self does not, two fractures, persona-as-drift) are observations across two runs, not a measured rate.
- The Chapter 26 reasoning typology is a lexical proxy. The per-arm authority counts in the Chapter 26 reasoning exhibit (and the analogous figures behind §9.2’s reasoning claims) are indicative magnitudes from lexical tagging of the pass logs, not a hand census. The shape of each arm’s profile is solid and is what the prose relies on; the exact integers are approximate and should not be over-read.
- Shared base model. The six AI arms share a common base model — a live confound for every similarity among them. The 63.1% AI–AI agreement measures how alike the outputs are, not independent convergence; the convergent fidelity-corrections (§8.2) make the shared tendencies visible in the arms’ own logs. The two controlled probes still isolate their varied input cleanly, because the model is held constant within each probe; but the six-arm clustering against H should not be read as a universal claim about AI translation.
- Informed, not blind. Arm identities — including the A↔︎H link — were known throughout, a standing source of expectation bias. The mitigation is procedural: anchor every claim in quoted evidence and the verified numbers, and report the negative or split results the evidence supports (A is not notably closer to H; D-aim is not closer to H than D-let). It is not a substitute for blinding. Independent prior analyses on subsets of these arms were withheld for later triangulation.
- Single human anchor. The comparison includes one human translation. The editorial-boldness axis on which H stands alone, and the specific foreignizing exceptions H makes, may reflect this translator’s craft; another human translator might sit differently. Read the H-vs-the-field separation as “H-vs-these-six-AI-arms, in this case study.”
- A’s persona contains H’s own craft writing. Where A’s choices resemble H’s on the persona’s named defaults, that is a feature of A’s specific inputs (H’s translator wrote those defaults), not a generalization about persona-built AI translation.
- Pass logs are self-reports. The reasoning attributed to each AI arm (§8) is drawn from its log — a record of what the agent says about its choices, not a transparent window into computation. H, having no log, contributes only observable text; statements about H’s intent are avoided in favour of statements about what its finished text does.
- The coding is one defensible instrument. Locus selection, granularity, and the salient-pick calls affect the exact counts; one coding correction (L20) was logged after a text re-check. The composite’s coding choices (0/0.5/1; two non-discriminating items dropped from a test basis; the honest handling of H’s cross-direction foreignizing) are stated in the quant layer. The integers are not to be over-read.
- Description, not evaluation. Closeness to the source is not literary merit; a freer, more domesticated rendering may be the superior translation. No ranking is offered or implied.
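For readers weighing the agreement figures cited in these limitations, the sketch below shows one simple way a pairwise agreement share can be formed. It assumes agreement means the fraction of loci on which two columns carry the same coding label; the quant layer defines the actual measure, and the column names and labels here are hypothetical toy data, not the study's codings.

```python
from itertools import combinations


def pairwise_agreement(codings: dict[str, list[str]]) -> dict[tuple[str, str], float]:
    """Share of loci on which each pair of columns carries the same label."""
    lengths = {len(labels) for labels in codings.values()}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError("every column must be coded on the same non-empty set of loci")
    n_loci = lengths.pop()
    result = {}
    for left, right in combinations(codings, 2):
        same = sum(a == b for a, b in zip(codings[left], codings[right]))
        result[(left, right)] = same / n_loci
    return result


# Toy codings over four loci (hypothetical labels and columns).
toy = {
    "AI-1": ["keep", "keep", "adapt", "adapt"],
    "AI-2": ["keep", "keep", "adapt", "keep"],
    "H": ["adapt", "adapt", "keep", "adapt"],
}

for pair, share in pairwise_agreement(toy).items():
    print(pair, share)
# ('AI-1', 'AI-2') 0.75
# ('AI-1', 'H') 0.25
# ('AI-2', 'H') 0.0
```

A share computed this way says how alike two outputs are on the coded loci and nothing about why, which is the point of the shared-base-model caveat above.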
12. Companion materials
This paper is the synthesis layer over a set of finished, verified exhibits, to which the reader is referred for the underlying detail.
For Chapter 25:
- The qualitative analysis (analysis.md), the full seven-way close reading: the two probes locus by locus, the seven-way geometry with side-by-side renderings, where H sits, and the region-vs-point argument in narrative form.
- The verified quantitative layer (quant/quant_layer.md): the 45-locus partition table, the 7×7 agreement matrix, the 0–100 composite, the clusters and 2-D coordinates, the destination-dispersion measure, and the position-relative-to-H table, ending in a machine-readable JSON block. All Chapter 25 figures in this paper are drawn from it.
- The reasoning-typology exhibit (authority/authority.md): the ten authority-types, per-arm counts and within-arm shares, and the D-let↔︎D-aim modality contrast, each anchored in a quoted pass-log line.
For Chapter 26 and the cross-chapter layer (§9):
- The Chapter 26 comparative analysis (analyst_ch26/analysis.md), the fresh, independently coded seven-way reading of “Der Sonntagmittag”: the loci table, the coverage-weighted composite, the AI–AI vs AI–H clustering, the two probes, and the “what only H does” inventory for that chapter.
- The Chapter 26 reasoning exhibit (analyst_ch26/reasoning.md): the reading-convergence test (8/8 cruxes flagged independently), the authority typology, the D-let↔︎D-aim modality contrast, and the A-anomaly analysis on that chapter.
- The reproducibility study (analyst_xchapter/reproducibility.md): whether each arm’s reading and persona came back across the two independent runs, with the personas diffed exactly (hash/diff). The reading reproduces and is text-driven; the persona-revision does not.
- The decision-consistency study (analyst_xchapter/decisions.md): whether each translator’s policy held across the two chapters, classifying every cross-chapter difference as a genuine policy change or same-policy-different-material, with the two fractures (A, D) isolated.
A separate persona-readings study compares the arms’ readings of Tergit — what each persona understood the author to be — rather than the translations they produced; it is the natural companion to the present comparison of outputs, and the two are designed to be triangulated against each other and against the withheld prior analyses.
A web presentation of these findings, with the charts rendered from the verified numbers, accompanies this paper as study.html.