[bug #58500] default value for second parameter to .ss should follow modern typographic convention
Dave
2020-06-05 04:23:43 UTC
URL:
<https://savannah.gnu.org/bugs/?58500>

Summary: default value for second parameter to .ss should
follow modern typographic convention
Project: GNU troff
Submitted by: barx
Submitted on: Thu 04 Jun 2020 11:23:42 PM CDT
Category: Core
Severity: 3 - Normal
Item Group: New feature
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Planned Release: None

_______________________________________________________

Details:

As mentioned in passing in bug #58450, typographic convention of the last
70-100 years has been to use the same amount of space between sentences as
between words.

groff allows the user to specify any amount, including none, of additional
inter-sentence space to add. This flexibility is good.

However, by default, groff should follow modern typographic convention. This
would require the default value of sentence_space_size (register \n[.sss]) be
0. Instead, currently:

* If a document never calls the .ss request, groff defaults to a value of 12
for word_space_size and 12 for sentence_space_size.
* If a document calls the .ss request but with only one argument
(word_space_size), groff sets sentence_space_size to be equal to
word_space_size.

In both these cases, the default sentence_space_size should be 0.
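
The two cases can be observed directly. A minimal sketch, assuming a typesetter (troff-mode) run such as "groff -z" so that only the diagnostic messages appear; the message labels are arbitrary:

```roff
.\" Sketch only: report the space sizes on standard error via .tm.
.tm initial: word=\n[.ss] sentence=\n[.sss]
.\" One argument: the sentence space size is set to match the word space size.
.ss 10
.tm after ss 10: word=\n[.ss] sentence=\n[.sss]
.\" Two arguments: request no additional inter-sentence space.
.ss 12 0
.tm after ss 12 0: word=\n[.ss] sentence=\n[.sss]
```

If the behavior described above holds, the three messages should report 12 and 12, then 10 and 10, then 12 and 0.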




_______________________________________________________

Reply to this item at:

<https://savannah.gnu.org/bugs/?58500>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
G. Branden Robinson
2020-06-12 09:08:24 UTC
Follow-up Comment #1, bug #58500 (project groff):

I disagree with this proposal. While I think you're right to characterize the
growing consensus among professional writers who _use_ computerized
typesetting systems but don't _develop_ them, I don't think groff should be
bound by that consensus.

Frankly, as our irritable but erudite (and above all, research-driven)
Heraclitean friend you noted in bug #58450 observed, much of that consensus
seems to arise from ignorance and a seemingly willful refusal to interrogate
the typographic simplifications imposed by the semi-automation of typesetting
with the Linotype machine and similar.

I'm all for documenting ".ss 12 0" to tell people how to kill off that
additional inter-sentence space if that is what they (or their editors or
instructors) demand, but I am pretty uncomfortable with changing the
defaults.

Moreover, I think we should liaise with the TeX community before proceeding
along such an iconoclastic path. That community and ours are the only ones I
trust to produce well-reasoned opinions in this field.

Here's a counter-example, by Russell Harper, of a well-reasoned opinion:

https://cmosshoptalk.com/2020/03/24/one-space-or-two/

Yes, there's some stuff about historical practice in there, though it compares
poorly in depth and breadth to the link in bug #58450.

I would draw the reader's attention to how much of the emphasis in Harper's
hortatory has to do with the _input conventions_ one uses with Microsoft Word.
He pops open dialog boxes to do search and replace. He says "your editor"
will take your second space after a sentence out again, which presumes the
distribution of documents for review in source form--_rather than the form in
which they will appear in print_. One can easily infer that he fears the
tedium of having to distinguish between different sorts of space when the only
tool he has to express them in his source document is the number of times he
presses the space bar. He wants to pop open the dialog box, click the
"Replace All" button and be done forever.

This is not only unspeakably crude typography, but unspeakably crude computer
use. This is a man who can operate a car and thereby believes himself an
automotive engineer.

Harper is representative of the bulk of the professional writing community,
which seems to consider Microsoft Word the ne plus ultra of composition and
publishing. When I was in school, students were taught to compose _pages_, as
opposed to text, in tools like QuarkXPress and Aldus PageMaker. Why do these
receive no mention from Harper?

The false sense of expertise that WYSIWYG systems have imparted to these people
is grievous indeed.

G. Branden Robinson
2020-06-12 09:11:46 UTC
Update of bug #58500 (project groff):

Status: None => Need Info
Assigned to: None => gbranden


Dave
2020-06-13 12:26:58 UTC
Follow-up Comment #2, bug #58500 (project groff):

[comment #1:]
While I think you're right to characterize the growing consensus among
professional writers who _use_ computerized typesetting systems but don't
_develop_ them, I don't think groff should be bound by that consensus.

That's not how I would characterize this consensus. I'm not looking at who is
making the decisions, but at the decisions themselves.

As of 2020, with very little exception, major publishers use the same spacing
between words and sentences. We can't peer into those publishing houses and
tell if those decisions were made by professional writers, software
developers, or janitors. All we can do is look at the results, which tell us
that major publishers today follow this practice. This makes it the current
industry standard.

You may hate it. I'm no fan of it myself. But no less a typographical
authority than Robert Bringhurst endorses it in _The Elements of Typographic
Style_. His opinion carries a weight that yours and mine don't. Pretty much
every modern style manual agrees: there's an entire Wikipedia article
<http://en.wikipedia.org/wiki/Sentence_spacing_in_language_and_style_guides>
documenting this consensus. (And given how many details these various style
guides diverge on, the unanimity on this issue is remarkable.)

It's hard to dismiss every major publisher and style guide as guilty of
typographic ineptitude. Ultimately, it doesn't matter whether the current
standard arose from ignorance, or as a cost-saving measure, or as the result
of a long-ago meeting of the Secret Cabal of Typesetters whose minutes we are
not privy to. Even if you claim all these publishers and guides are "wrong,"
collectively they define the industry standard. "Wrong" is the new right.

Groff's default sentence spacing is out of step with the industry standard,
and that's a bug regardless how much you or I may wish the industry standard
were something other than what it is.

Dave
2020-06-13 22:16:34 UTC
Follow-up Comment #3, bug #58500 (project groff):

I think this passage warrants an additional response.

[comment #1:]
I think we should liaise with the TeX community before proceeding along such
an iconoclastic path.

But it is groff's and TeX's defaults that are the outliers on this. (I didn't
know till a quick google just now that TeX also defaults to wider sentence
spacing.) As an academic exercise, it would be interesting to know TeX's
reasoning. As a practical matter, it's irrelevant. The convention used by
the rest of the world is clear; groff's and TeX's defaults do not align with
it.

That community and ours are the only ones I trust to produce well-reasoned
opinions in this field.

Perhaps that's the core of our disagreement: I don't think it matters how well
reasoned anyone's defense of wider sentence spacing is. You could give me 100
reasons why it's better, and even if I agree with 99 of them, it doesn't
change today's industry standard.

I certainly can't impugn TeX's typography, regardless what I think of its
syntax. But trusting _only_ their opinion, while dismissing those of
typographic experts like Bringhurst, the designers of commercial typesetting
packages, and the publishing houses who use them, feels a bit like choosing
your allies based on the answer you already know you want.

(As a possibly interesting side note, if I understand
http://tex.stackexchange.com/a/4726 correctly, TeX's default is a sentence
space 33% wider than a word space, whereas groff's default is one 100% wider.
So groff's double-wide space is already out of step with TeX's much subtler
widening.)

G. Branden Robinson
2020-06-14 01:47:32 UTC
Follow-up Comment #4, bug #58500 (project groff):

[comment #3:]
Post by Dave
I think this passage warrants an additional response.

Sure. I don't see much I can fruitfully respond to in comment 2 that I can't
sweep up with this one--I like to get concrete, and I find the
virtue-preachings of authority figures less persuasive than empirical
measurement.

Post by Dave
[comment #1:]
I think we should liaise with the TeX community before proceeding along
such an iconoclastic path.
Post by Dave
But it is groff's and TeX's defaults that are the outliers on this. (I
didn't know till a quick google just now that TeX also defaults to wider
sentence spacing.)

TeX and *roff (historically, a mix--today, mostly TeX, I think) enjoy a status
in many fields of technical writing whereas they are almost completely unused
elsewhere. This boundary is meaningful: it represents community, shared
practice, some degree of shared values, and moreover this boundary is
constitutive of a discernible and justifiable set of independent idioms.

To be less abstract, there are several different citation styles in English
professional writing: legal, MLA, APA, Chicago/Turabian, and so on. By the
same argument you are making here, you could decree that groff should support
by default the single most popular of these. Perhaps even if there were only
a plurality winner, meaning a majority of users would be disserved?

Community matters.

Post by Dave
As an academic exercise, it would be interesting to know TeX's reasoning.
As a practical matter, it's irrelevant. The convention used by the rest of
the world is clear; groff's and TeX's defaults do not align with it.

It doesn't align with, arguendo, the vast majority of authorities writing
style guides, in English, outside the hard sciences.

That does not mean that how digital type actually gets set, even in specimens
that these authorities would declare normative, adheres to those
prescriptions.

This subject is so full of puffery and grandstanding (to which I, admittedly,
may be contributing my own vituperation) that I am suspicious of the claims
being made for the prevalence of single inter-sentence spacing. Here's why.

Again, arguendo, I'm willing to stipulate to an overwhelming prevalence of
identical spacing between words and sentences for the extremely common case of
filled-but-not-adjusted (i.e., ragged-right) text composed in Times New Roman
using Microsoft Word (or LibreOffice Writer) and then composited into a box on
a page (often directly to PDF by the same user) from there.

Such documents may comprise 95% or better of all "typeset" documents ever
produced, such as informal memoranda, though much of that has shifted to email
over the past two decades--as a matter of fact, let me interrupt my own
argument here.

I submit that a large part of what drove the uptake of paperless memorandizing
in professional contexts was the advent of "rich text" email. You couldn't
get people who weren't, at some level, computer nerds to adopt email until
people could put real boldface, italics, and underlining into the text (often,
all three at once) so that the excitable (or, less happily, authoritarian)
could place appropriate emphasis on their mandates and prohibitions. (To
level up your migraine to a cluster, change the font to Comic Sans and add
greengrocer's apostrophes.)

Among the many possible candidates, what technologies were used to achieve
that transition? "Rich Text", for a while stored and transmitted as "RTF",
which as I recall was a feature-limited version of a format originated by
Microsoft Works (a hobbled version of MS Office at a lower price point), and,
shortly thereafter and tellingly, HTML.

HTML is notorious for its (non-)handling of whitespace. Of course you were
going to get only one intersentence space in HTML. Type one, type 50, mix in
some tabs, one space was what you were going to get.

Rich Text and HTML are typography, but of a particular, limited, not to say
debased, variety.

I expect you to tell me that even if I'm precisely correct, my argument is
irrelevant. Don't worry, I haven't forgotten and will return to it.

Post by Dave
That community and ours are the only ones I trust to produce well-reasoned
opinions in this field.
Post by Dave
Perhaps that's the core of our disagreement: I don't think it matters how
well reasoned anyone's defense of wider sentence spacing is. You could give
me 100 reasons why it's better, and even if I agree with 99 of them, it
doesn't change today's industry standard.

It doesn't change what people are saying, certainly. But are we examining
professionally-set documents and taking their measurements? And by "we", I
mean real researchers. Societally, are we just taking Russell Harper's word
for it? Sure, he's an authority, but the way authorities achieve their status
is by building a peer-reviewed corpus of findings. An authority whose claims
you cannot interrogate (albeit with appropriate training or the aid of someone
who has it), is a false one.

The non-adjustability of groff's own intersentence spacing was, apparently,
little remarked-upon until you came to the issue and it remains hard to
observe, especially in proportionally-spaced fonts, until one contrives
rendering parameters to reveal it.

Bluntly, I don't trust most of the authorities you cite to experiment the way
we do. Before you say that doesn't matter, either, I would remind you that
majoritarian arguments about what _is_ the case have to be independently
derived if they come from sources that confuse their wishes with reality.

For example, it doesn't matter if Russell Harper tells us that 99% of all
literature published since 2000 by the University of Chicago Press follows the
single intersentence-space rule if a representative sample can be taken of
those publications and a statistically significant different proportion turns
up.

And that sort of empirical measurement is what I don't see in these arguments.
Possibly, when it comes down to the nuts and bolts of page composition for
publication, there is a community of experts who apply their own practices
regardless of what the old man in the glass-walled office upstairs says.

I admit that that is pure speculation. But _somebody_ has to be measuring
these things, don't they?

And if they're not, why should we take these exhortations as anything more
than aspirational?

Post by Dave
I certainly can't impugn TeX's typography, regardless what I think of its
syntax. But trusting _only_ their opinion, while dismissing those of
typographic experts like Bringhurst, the designers of commercial typesetting
packages, and the publishing houses who use them, feels a bit like choosing
your allies based on the answer you already know you want.

Where are the designers of the commercial typesetting packages on the record?
Where are the configurable parameters for these packages by major publishing
houses documented?

I'd like to see these for curiosity's sake, at the very least.

Post by Dave
(As a possibly interesting side note, if I understand
http://tex.stackexchange.com/a/4726 correctly, TeX's default is a sentence
space 33% wider than a word space, whereas groff's default is one 100% wider.
So groff's double-wide space is already out of step with TeX's much subtler
widening.)

So in groff, for a typesetter/troff device, we could express that as:


.ss 12 4


The above won't work (by my lights) for typewriter/nroff devices because there
is no fractional spacing; groff will round that 4 down to a zero.

Let me return to the majoritarian argument as promised.

The _majority_ of groff documents are not written in the raw language. I
myself never wrote a single such one until I joined the development team (and
now I do it all the time to test things).

They are written in the macro languages of packages, which take a variety of
approaches to intermixing their own lexicon with that of the underlying roff
engine. So I suggest that a good place to set defaults, as with hyphenation
and adjustment modes, is the macro package. Long ago I wondered why our macro
packages (and TeX's) didn't have modules for APA, MLA, and so forth. (The
answer appears to be a combination of long-toothed packages that have seen
little change for decades, and the relative dearth of use of either *roff or
TeX outside the hard sciences.)

If you want to get people's groff documents conforming to a set of common
practices, then a good way is to provide a macro package that encapsulates
them. I don't propose that you (or we) alter or rewrite ms or mm. We could
start small. As documented in the Texinfo manual, there are four fundamental
things groff does with text:

* filling
* hyphenation
* breaking
* adjusting

The .ss request directly affects the very first of these.

So a first cut of "chicago.tmac" might look like this:


.fi
.ss 12 0
.hy 6 \" no idea--what does the CMoS say?
.ad l


(My copy of CMoS, a book I otherwise highly esteem, unfortunately did not make
it with me in an intercontinental move.)

We then tell newcomers to groff who seek to produce workaday documents to pass
"-mchicago" as an option.
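
From inside a document, the same hypothetical package could instead be loaded with the .mso request. A sketch, assuming a chicago.tmac like the one above has been installed somewhere on the macro search path (no such file ships with groff):

```roff
.\" Hypothetical: chicago.tmac is the package proposed above, not a real groff file.
.mso chicago.tmac
.\" With .ss 12 0 in effect, the two spaces after the period should set
.\" no wider than an ordinary word space.
First sentence.  Second sentence.
```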

We certainly wouldn't tell them not to use a macro package at all; that's been
a bad idea since the 1970s.

groff the engine has long catered to a specialized audience of technical
writers in particular fields. It may very well remain there without a GUI
front end for document composition. (I find LyX impressive, personally, and
have used it for real work.) But it is _designed_ to cater to the broadest
expressible class of typographical challenges, including non-textual drawings
and non-English text.

I think I find your majoritarian argument unpersuasive because it poorly fits
both the majority of uses to which groff is actually put at present, and the,
shall we say, aspirational majority of typographical ends to which it could be
put, which includes such things as the annotation of architectural schematics
in Mandarin Chinese.

I have assigned this ticket to myself because I am interested and I am,
indeed, seeking more information as the status indicates. If the matter comes
down to "will the groff default for the second argument to .ss change in
src/roff/troff/input.cpp (or wherever)?", it is my intention to
unassign myself from the ticket rather than bottling it up as a sort of pocket
veto.

Right now, I oppose such a change, and would -1 it on the mailing list
(however we interpret such a thing in our community), but I do not wish to be
anything more or less than a non-volunteer for such work.

Thank you for the discussion, by which I certainly do not mean to draw it to a
close, but rather to reassure you that I am not personally piqued and continue
to appreciate your collegial work on groff.

Dave
2020-06-19 02:12:40 UTC
Follow-up Comment #5, bug #58500 (project groff):

Comment #4 covers an awful lot of ground and I don't have the brain space to
tackle it all at once, so I'm breaking it down based on the ideas it
introduces. Today's installment: community.

If we want to be data-driven, what is the data surrounding this concept of
community?

Certain scientific publications mandate TeX format for submissions, and it is
a de facto standard for typesetting in those fields. Its community is as
well-defined as such an inherently fuzzy concept can be.

*roff's largest community is man-page authors, where the software tends to
drive the style rather than the other way around. That is, man pages use
extra sentence space not because there is a man page style guide that mandates
it, but because they're rendered with tools that add it by default (historical
troff, and now groff). And for a significant subset of authors and readers
(probably a vast majority, but I don't have any data here and am not sure it's
even attainable), typeset output of man pages is far less important than
terminal output. Arguably, in this community, typographical minutiae are
largely irrelevant (though, again, this is guesswork in the absence of data).

Outside man page use... what is the *roff "community"? I genuinely don't
know, and due to the nature of software that anyone can download and use
freely without ever reporting its use, I'm not sure it's even answerable. But
my sense is that it's not mandated by any organizations nor used exclusively
by any set of people in some definable category. It's largely used ad hoc by
individuals who want to avoid WYSIWYG, who are fans of its powerful (like TeX)
yet terse (unlike TeX) markup style, and who have the freedom to choose their
own typesetting tool. Active members of the email list use it for everything
from technical publications to novels.

That is, it's used by a scattering of people across a large number of
disciplines, in a wide range of contexts, to present a diverse array of
subject matter. This makes us a "community" defined not by some commonality
of purpose, field, employer, or discipline, but a community whose only
defining characteristic is that we use *roff tools.

_If_ that's the case -- and again, I genuinely don't know and would welcome
any data -- it argues for groff's defaults to reflect convention across the
current world of typography. Any document author is free to use the second
parameter to .ss to override those defaults. But our default should not be
something out of step with what's done in modern professional typography in
general -- because that doesn't serve our community, which spans geography,
disciplines, and genres.

Your point about the variance in citation styles, I think, supports my
position better than yours. If I were arguing for defaults that championed
APA over MLA, there would be good grounds to dismiss my argument in light of
the fact that both styles are widely used. But this is simply not the case*
with sentence spacing. The typesetting world has coalesced around one style.

* absent evidence of the hypothesized silent revolt of in-the-trenches
typesetters from their tyrannical style-manual overlords, colluding on
dark-web forums to consistently add an amount of extra space between sentences
slight enough that it requires careful measurement to even detect (aka
Boringest Conspiracy Ever)

Dave
2020-07-08 11:21:28 UTC
Follow-up Comment #6, bug #58500 (project groff):

Today's episode: boring into the Boringest Conspiracy Ever

[comment #4:]
it doesn't matter if Russell Harper tells us that 99% of all literature
published since 2000 by the University of Chicago Press follows the single
intersentence-space rule if a representative sample can be taken of those
publications and a statistically significant different proportion turns up.

100% agree.

And that sort of empirical measurement is what I don't see in these
arguments.

True. We're taking a lot of authorities at their word because we don't have
an independent commission, perhaps a branch of the Department of the
Interior, whose mission includes taking typographical measurements across
publishers and over time. We lack chart-filled reports with titles like
Practices In Kerning: 1970-2000.

I'm gathering that your response to this dearth of data is to throw our hands
in the air, declare "All is unknowable!," and either wait for typography's
Robert Mueller to issue his report, or dive into the field and gather the data
ourselves.

My proposal is simpler: let's give the authorities the benefit of the doubt
until data emerges to call their authority into question.

If someone wants to claim there is a discrepancy between style-manual edicts
and practices by publishers claiming to follow such style manuals, the burden
of proof pretty much has to be on the person making that claim.

And I don't think accepting the authorities' statements at face value is going
that far out on a limb. Forget careful measurements for a moment: just _look_
at some modern examples of professionally typeset works (i.e., not the garbage
that Word or the web produces). You'll be hard-pressed to find examples with
discernible wider sentence spacing. Not even in _The New Yorker_, known for
its idiösyncratic style guide.

Thus, even if there is some additional sentence spacing that's not easily
visible to the naked eye, it's at levels on par with ".ss 12 3" or ".ss 12 4".
Groff's default, ".ss 12 12", is grossly out of step with that.
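
For anyone who wants to eyeball the difference, a small test file along these lines sets the same text under each of the three settings. This is only a sketch: the sample sentences are arbitrary, and it assumes a typesetter device (for instance "groff -Tps" or "groff -Tpdf"), since terminal devices have no fractional spacing.

```roff
.\" Sketch: the same two sentences under three sentence space sizes.
.\" Two spaces follow each period so the sentence space applies mid-line.
.ss 12 0
Sentence one.  Sentence two.
.br
.ss 12 4
Sentence one.  Sentence two.
.br
.ss 12 12
Sentence one.  Sentence two.
.br
```

Each .br ends the output line before the next .ss takes effect, so the three lines can be compared one above the other.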

So we have authorities telling us that word spaces and sentence spaces should
be the same, and we have a corpus of works published in the past 75 years,
where, by gum, they all pretty much look the same. Ockham's razor strongly
suggests that using normal word spaces between sentences is today's industry
standard, and has been for decades.

While we're waiting for concrete data, groff has to have _some_ default for
sentence spacing. I submit that the most sensible one is the one that pretty
much every modern authority agrees on. If and when data surfaces that shows
these authorities to be widely ignored, the question should absolutely be
revisited.
