Pre-submission advice: formatting
The following advice is of a general nature. Parts of it will be valuable both for native speakers of English and for non-native speakers. It will be useful as a reference prior to submission to a journal or conference, or prior to submission to a copy-editing or proof-reading service. It covers selected aspects that are fairly general to the publishing of technical work, and is thus applicable in a diverse range of languages (not just to English-language expression).
Mathematical formulæ and scientific symbols
Symbols: good and bad habits
How to insert symbols in Windows programs
Italic and roman fonts
Long variable names
Units
Hyphen
En-dash
Em-dash
Minus sign
Swung dash
Tilde
Text typesetting
Whatever software you use for word processing, find out how to:
- use styles for various parts of a document (top-level headings; subheadings; footnotes; captions; etc.), so that each component of your document has a consistent look;
- customise style definitions, so that each component of your document is formatted the way you (or a publisher) prefers;
- create your own new, user-defined styles, to cater for your own customised layout (e.g. you might want to create a new style for "warnings", or "sidebar notes", or "abstract text").
Styles are typically applied to entire paragraphs (not word-by-word, or character-by-character), albeit with a few exceptions.
Using styles has several benefits, including:
- ensuring that your document has a consistent look (e.g. all captions are in the same font, same size, same style, and so on);
- allowing you to change the look of all instances with just a couple of clicks (Want to add more white space before all of your dozens of subheadings? Just edit the style definition, and voilà!); and
- allowing you to very easily generate a table of contents (or a table of figures, etc.) in your document, as well as electronic 'bookmarks' when exporting to a PDF file.
For Microsoft Word there are numerous resources online to guide you, although most of the basics are fairly intuitive if you take the time. Charles Kenyon has prepared a quite extensive guide to the use of styles in Word, along with a list of dozens of other online (and print) resources.
Bonus: learning how to apply styles in your word processing software will help if you ever have to edit web pages (including HTML & CSS and, to a lesser extent, wikis).
Mathematical formulæ and scientific symbols
Symbols: good and bad habits
Always use the correct mathematical symbols.
Some people have developed 'bad habits', because they think that certain symbols are not available for them to use, or because they have not found out how to insert them, or because they simply do not know the difference between correct and incorrect symbols.
Consider the following examples:
INCORRECT: 500um x 2^3 = 4E3um = 4mm.
INCORRECT: 500 um * 2**3=4.00e+003 um=4 mm.
INCORRECT: 500 µm × 2³ = 4000 µm = 4×10³ µm = 4 mm.
POOR STYLE: 500 µm × 2³ = 4000 µm = 4×10³ µm = 4 mm.
CORRECT: 500 µm × 2³ = 4000 µm = 4×10³ µm = 4 mm.
CORRECT: 500 µm × 2³ = 4000 µm = 4×10³ µm = 4 mm.
Do not use letter "x" or the asterisk in place of a multiplication sign. (Exception: computer code.)
Do not use the caret or double-asterisk in place of superscript. (Exception: computer code.)
Do not use the "E" form for scientific notation. (Exception: computer code.)
Do not use "um" (without preceding space) as a unit symbol for micrometres; the correct form is " μm" (all roman, with preceding space). (Note: most modern programming languages allow a range of symbols, such as "μ", to be correctly produced in output.)
Notice also that a logical use of spacing helps greatly with readability.
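As a minimal sketch of the note above about producing correct symbols in program output, the following Python function renders a number in scientific notation with a true multiplication sign, superscript digits and a spaced unit. The function name `sci_text`, its parameters, and the choice of a non-breaking space and the micro sign (U+00B5) are assumptions made for illustration only.

```python
# Maps ASCII digits and the minus sign to their Unicode superscript forms.
SUPERSCRIPTS = str.maketrans(
    "-0123456789",
    "\u207b\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079",
)


def sci_text(value: float, unit: str, digits: int = 1) -> str:
    """Format value as 'm×10ⁿ unit' with `digits` significant figures (hypothetical helper)."""
    # Let Python compute mantissa and exponent, e.g. '4e+03' -> ('4', '+03').
    mantissa, exponent = f"{value:.{digits - 1}e}".split("e")
    exp_text = str(int(exponent)).translate(SUPERSCRIPTS)
    # U+00D7 is the multiplication sign; U+00A0 is a non-breaking space.
    return f"{mantissa}\u00d710{exp_text}\u00a0{unit}"


print(sci_text(4000, "\u00b5m"))      # instead of '4E3um'
print(sci_text(0.00025, "m", digits=2))
```

Inside the program the "E" form and the asterisk remain perfectly acceptable; the conversion is only applied to text that a reader will see.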
How to insert symbols in Windows programs
In Microsoft Windows there are three basic alternative methods of inserting symbols:
- Use the Character Map tool. In modern versions of Windows, just search for this to run it. In older versions of Windows you can also load it by running the command "charmap.exe", if you cannot find it in the menu.
- Choose a font. For consistency of appearance this will preferably be the same as the font that you want to use in your document, e.g. "Times New Roman". It should generally not be the "Symbol" font.
For uncommon symbols you may need to compromise by either: (a) using a font that contains more characters [e.g. "Times New Roman" has 3891 'glyphs', whereas "Arial Unicode MS" has 50377 'glyphs']; or (b) using a specialised font [e.g. "Cambria Math" has 7320 'glyphs', but contains several mathematical symbols not contained in "Arial Unicode MS"].
- Find the character you want.
- You can browse through the characters, until you find the one you want. If browsing, it will be easier to use the "Group by Unicode Subrange" option, to focus your search. (Tick "Advanced view" to view this option.)
- It is often easier to search for the character you want. For example, for the available integral symbols you might just search for "int". (Tick "Advanced view" to view the Search option.)
- Select the character.
- Copy the character. (Note: this will copy some undesirable formatting information too!)
- Paste the character into your document without its formatting.
- Some applications have a particular option to paste without formatting. In Word that is done with a "Paste Special" command.
- An easy way to strip formatting is paste from Character Map to a text-based application (for example, Notepad), copy the pasted text, and then paste this unformatted character into your document.
- Use the Alt key plus the numeric keypad to enter an extended-ASCII-style or Unicode code-number. All of these involve holding down the "Alt" key while pressing a few digits on the numeric keypad (with "Num Lock" turned on!). These will not work if you just press the digits on the main part of the keyboard! There are a few options.
- ‹Alt› + ‹xxx›, where xxx is the decimal value of a code point, generates an OEM-encoded character. This follows IBM / PC Extended ASCII, not the ISO version of extended ASCII. These codes can be obtained from a few resources online.
- ‹Alt› + ‹0xxx›, where xxx is the decimal value of a code point, generates a Windows-encoded character. This follows Windows' ANSI/ISO Latin-1/ANSI Extended ASCII, which is the ISO version of extended ASCII. These codes can be obtained: from the bottom-right corner of Character Map (see above) for some characters, after selecting the symbol of interest; or from a few resources online.
- ‹Alt› + ‹+› + ‹xxxx›, where xxxx is the hexadecimal Unicode code point, generates a Unicode-encoded (UTF-16) character. These codes can be obtained: from the bottom-left corner of Character Map (see above), after selecting the symbol of interest; from the Unicode Character Code Charts; from numerous resources online. Use of Character Map to identify the relevant code is recommended, as otherwise you may enter codes that your font does not support!
- And a few other variations.
- Use custom tools built into your application.
- For example, Word 2013 has a few options:
- Enter a single symbol using Insert > Symbol > More symbols. This is the preferred option for inserting individual symbols. It can also be used to construct in-line formulæ.
- Enter a formula. This is recommended for multi-line formulæ. It can also be used to construct in-line formulæ.
There are three ways of doing this:
- Insert > Equation > Insert New Equation (newest tool).
- Insert > Object > Object > Microsoft Equation 3.0 (older tool).
Equivalently: Insert > Quick Parts > Field; Field name = "Eq"; click on "Equation Editor".
- Insert > Quick Parts > Field; Field name = "Eq"; click on "Field Codes"; click on "Options". (Old tool with very limited features.)
If you can find the required symbol elsewhere (e.g. in a PDF file, online, somewhere else in the document), sometimes it is possible to simply copy from there and paste into the new location. However, this does not always work.
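The code numbers used by the Alt-key methods can also be looked up programmatically. The sketch below uses Python's standard `unicodedata` module and built-in codecs; the helper name `alt_codes` is hypothetical, and it assumes that the OEM code page is `cp437` and the Windows code page is `cp1252`, which is typical of English-language systems but may differ on yours.

```python
import unicodedata


def alt_codes(ch: str) -> dict:
    """Return the code numbers for one character under three encodings (hypothetical helper)."""
    codes = {"name": unicodedata.name(ch), "unicode_hex": f"{ord(ch):04X}"}
    # Assumed code pages: cp437 for Alt+xxx, cp1252 for Alt+0xxx.
    for label, codec in (("oem", "cp437"), ("windows", "cp1252")):
        try:
            codes[label] = ch.encode(codec)[0]
        except UnicodeEncodeError:
            codes[label] = None  # the character does not exist in this code page
    return codes


for ch in "\u00b5\u00d7\u222b":  # micro sign, multiplication sign, integral
    print(alt_codes(ch))
```

A value of `None` indicates that the character cannot be entered through that particular Alt-key method, in which case the Unicode (hexadecimal) method or Character Map is needed.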
Italic and roman fonts
Various international authorities, including IUPAC, NIST and ISO, have produced similar recommendations with regard to typesetting variables and other mathematical symbols (whether in equations or otherwise). The best and simplest guide to typesetting mathematical formulæ is the succinct list of recommendations for use of italic and roman font issued by IUPAC.
This document was slightly revised in 2007, and its full text was included in the Guidelines For Drafting IUPAC Technical Reports And Recommendations and also in the 3rd edition of the IUPAC Green Book.
See also Typefaces for Symbols in Scientific Manuscripts, issued by NIST in January 1998. This cites the family of ISO standards 31-0:1992 to 31-13:1992.
See also "More on Printing and Using Symbols and Numbers in Scientific and Technical Documents". Chapter 10 of NIST Special Publication 811 (SP 811): Guide for the Use of the International System of Units (SI). 2008 Edition, NIST. This cites the ISO standards 31-0:1992 and 31-11:1992, but notes "Currently ISO 31 is being revised [...]. The revised joint standards ISO/IEC 80000-1—ISO/IEC 80000-15 will supersede ISO 31-0:1992—ISO 31-13.".
The most useful aspect is that conscientious application of italic and roman fonts will make the meaning of symbols apparent.
In general, anything that represents a variable (for example, h for a patient's height) should be set in italic, and everything else should be set in roman type. This applies equally to characters from the Latin/English alphabet (a, b, c, ...; A, B, C, ...) and to letters from any other alphabet, most notably Greek (α, β, γ, ...; Α, Β, Γ, ...). Any operator, such as cos (representing cosine) or ∑ (representing summation), should therefore be set roman. Note that each element must be set on its own merits, including subscripts and superscripts. Thus, h_i with a roman subscript "i" (an abbreviation of "initial") would be suitable for the initial height, while h_i with an italic subscript i (an index variable) would represent one instance from a set of heights (h_1, h_2, h_3, ...). Notice that numbers (1, 2, 3, etc.) are not variables, and so are always set roman. Likewise, in some special cases symbols are used to represent general constants, such as π used to represent the ratio of a circle's circumference to its diameter, and such general constants can be set in roman. (This does not apply to parameters which are merely chosen to not vary.)
For vectors, matrices and tensors, it is recommended to set the variable itself in boldface (excluding any associated subscripts or superscripts). Hence, a bold u with a roman subscript "i" would be suitable for the initial velocity, while a bold u with an italic subscript i would represent one instance from a set of velocities (u_1, u_2, u_3, ...). Italic is still used for variables, both for lowercase and for uppercase symbols (Latin, Greek, or otherwise). The only general situation where italic is not used for bolded symbols is for vector operators, such as ∇ (nabla), set bold and roman.
[Note: the above text is adapted from text that I originally contributed to Wikipedia articles on conventions for mathematical formulæ and for use of italics.]
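For readers who typeset in LaTeX, the height and velocity examples above might be written as in the following sketch. It assumes the `amsmath` package is loaded (for `\boldsymbol`); the comments state the intended reading of each symbol.

```latex
% Italic variable with a ROMAN subscript: "i" abbreviates "initial".
$h_{\mathrm{i}}$
% Italic variable with an ITALIC subscript: i is an index (h_1, h_2, h_3, ...).
$h_{i}$
% Operators such as cos and the summation sign are set roman/upright.
$\cos\theta$, \quad $\sum_{i} h_{i}$
% Vectors: bold italic symbol; the subscript is not bold and follows the same rule as above.
$\boldsymbol{u}_{\mathrm{i}}$ (initial velocity), \quad $\boldsymbol{u}_{i}$ (i-th velocity)
% Vector operator: bold and upright.
$\boldsymbol{\nabla}$
```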
Long variable names
In engineering there is a multitude of dimensionless numbers. By convention, these are commonly given symbols composed of two letters derived from a surname: for example, Re for the "Reynolds number". As variables, these should be set italic. However, care must be taken in the spacing of variables in multiplications, to avoid incorrect interpretation of the symbol as R×e. There should be no confusion in writing:
μ Re = ρ v D. (Thin spaces and hair spaces look better, but ordinary spaces are better than no spaces.)
To avoid variable symbols having more than one character, a small number of people suggest calling all dimensionless numbers N, and then attaching a subscript to distinguish them (for example N_Re, N_Fr, N_Pr, and so on). This is not common, and not recommended because it is considerably harder to read at a glance (compare the preceding symbols with Re, Fr and Pr).
In practical application of mathematics to various scientific fields, and in teaching of students, sometimes authors like to use complete words or textual abbreviations instead of individual symbols. In that case these should be interpreted as 'annotations', and therefore always set roman. That is, consider that they are not the variable itself, but rather a description of it. Thus write:
Reynolds number = ρ v D / μ .
If the terms happen to be involved in mathematical operations, then enclosing in some sort of brackets can be considered, and is certainly recommended if more than one word is used for a variable's name. Thus write:
Reynolds number = {density of fluid} × {velocity of fluid} × {characteristic dimension} / {viscosity of fluid} .
Ugly hybrids such as "Re. No." or "ReNo" or "ReNum" should be avoided.
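In LaTeX, the two forms recommended above could be sketched as follows. This assumes the `amsmath` package (for `\text`); `\mathit` is used so that "Re" is spaced as a single two-letter symbol rather than as the product of R and e, and `\,` inserts a thin space between factors.

```latex
% Two-letter symbol set italic as one unit, with thin spaces between factors.
\[ \mu\,\mathit{Re} = \rho\,v\,D \]
% A full name is an annotation, so it is set roman.
\[ \text{Reynolds number} = \frac{\rho\,v\,D}{\mu} \]
```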
Units
Units are always given in roman (unless they occur in a passage of italicised text — for example a sentence that runs for 10 cm or more).
Units are always separated from the preceding value by a space (thin spaces and hair spaces look better, or otherwise a non-breaking regular space), with two possible exceptions.
The units of degrees for angles must not be preceded by a space ("60°" is correct); however, temperatures must include a space before the units ("60 °C" is correct).
In principle, percentages are supposed to include a space ("60 %" is recommended). Personally I find that if the percentages are all 'round numbers' (multiples of 5, say, and certainly integers), then a space does not really add much ("65%" still looks OK), whereas for any non-integers a space makes the text easier to read ("61.23 %" is clearer than "61.23%").
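When values and units are generated by software, the spacing rules above can be applied automatically. The following is a minimal sketch only: the helper name `with_unit` is hypothetical, and the choice of the narrow no-break space (U+202F) as the 'thin' separator is an assumption (U+00A0 would give a regular non-breaking space instead).

```python
NARROW_NBSP = "\u202f"  # narrow no-break space, assumed here as the thin separator
DEGREE = "\u00b0"


def with_unit(value, unit: str) -> str:
    """Join a value and its unit symbol following the spacing rules above (hypothetical helper)."""
    if unit == DEGREE:
        # Plane angles in degrees take no space: 60°
        return f"{value}{unit}"
    # Everything else, including °C and %, is separated by a non-breaking space.
    return f"{value}{NARROW_NBSP}{unit}"


print(with_unit(60, DEGREE))        # angle
print(with_unit(60, DEGREE + "C"))  # temperature
print(with_unit(61.23, "%"))        # percentage
```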
Significant figures
Don't report results like: "The average age of patients was 55.3376253 years." Think about it: that number corresponds to roughly 55 years, 123 days, 7 hours, 32 minutes, and 32 seconds — depending upon the interpretation of 'year'. The third and subsequent decimal places have no practical meaning here, and the statistic should have been rounded to "55.34 years" or "55.3 years", or possibly even "55 years".
Another obvious example would be reporting results such as: "A wavelength of 420.215±20.666 nm." Maybe a wavelength can be determined to a precision of picometres, but here the uncertainty is acknowledged to be of order tens of nanometres. This should have been reported as "420±21 nm".
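The rounding in the wavelength example can be automated. The sketch below rounds the uncertainty to a chosen number of significant figures (two by default, which is an assumption rather than a universal rule) and then rounds the value to the same decimal place; the function name `format_with_uncertainty` is hypothetical.

```python
import math


def format_with_uncertainty(value: float, uncertainty: float, unit: str, sig: int = 2) -> str:
    """Round the uncertainty to `sig` significant figures and the value to match (hypothetical helper)."""
    # Position of the leading digit of the uncertainty, e.g. 20.666 -> 1 (tens).
    exponent = math.floor(math.log10(abs(uncertainty)))
    decimals = sig - 1 - exponent  # may be zero or negative for large uncertainties
    v = round(value, decimals)
    u = round(uncertainty, decimals)
    places = max(decimals, 0)
    # U+00B1 is the plus-minus sign; U+00A0 is a non-breaking space before the unit.
    return f"{v:.{places}f}\u00b1{u:.{places}f}\u00a0{unit}"


print(format_with_uncertainty(420.215, 20.666, "nm"))
```

For the figures quoted above this gives 420±21 nm, matching the recommended form.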
Cross-referencing
When cross-referencing, your life will be easier if you: (i) use Styles to define your captions, and (ii) insert a cross-reference to the caption of interest.
In Word this is done with Insert > Cross-reference: choose the "Reference type", then change "Insert reference to", then choose the reference of interest, and finally click "Insert".
The benefits of this? If you already have ten figures in your document (with associated captions and cross-references in the text), and then decide you need to include a new figure between the original Figure 1 and Figure 2, then the software will take care of renumbering all of the subsequent captions (up to what will now be Figure 11) as well as correspondingly updating the cross-references in the text.
This also works for cross-references to heading numbers (e.g. §5·3·2), etc.
By the way: inserting a cross-reference like this in Word means you have inserted a Field. Some people find it easier to work with a document if all Fields are 'highlighted', which can be effected from File > Options > Advanced where you can set "Field shading" to "Always" (scroll down to the 'Show document content' section to find this option).
Citations
Citations should be treated as something akin to an editorial annotation to the text. Following this rule, the text must still make sense if the citations are deleted.
CORRECT:
"We followed the same method as Tetris et al. [1987a], which was first suggested by Avicenna almost a thousand years ago, circa 1030 [Carruthers & Lee, 1992]."
Notice that this makes perfect sense without the citations ("We followed the same method as Tetris et al., which was first suggested by Avicenna almost a thousand years ago, circa 1030.").
INCORRECT:
"We followed the same method as [Tetris et al., 1987a], which was first suggested by [Avicenna, 1030] almost a thousand years ago."
The text no longer makes any sense if read without the citations ("We followed the same method as, which was first suggested by almost a thousand years ago.").
Furthermore, it is implied in the original 'correct' text above that the authors have not read a particular document written by Avicenna, but rather they are aware of the history because it was reported (relatively recently) by Carruthers & Lee. Perhaps Avicenna never wrote about this method himself, and we are only aware of it through records made by his students or peers; perhaps Avicenna did write about it, but not in English; or perhaps one has to visit a library in the Middle East to view the only remaining copy of this document. In contrast, the 'incorrect' text indicates that Avicenna wrote about this method specifically in the year 1030, and the authors have read that document themselves.
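The "delete the citations and re-read" test described above is easy to mechanise for a draft. This is a rough sketch that assumes citations are written in square brackets, as in the examples; the names `CITATION` and `without_citations` are hypothetical, and the judgement of whether the remaining sentence makes sense is still left to the reader.

```python
import re

# A bracketed citation together with any whitespace immediately before it.
CITATION = re.compile(r"\s*\[[^\]]*\]")


def without_citations(sentence: str) -> str:
    """Return the sentence with all square-bracketed citations removed (hypothetical helper)."""
    return CITATION.sub("", sentence)


good = ("We followed the same method as Tetris et al. [1987a], which was first "
        "suggested by Avicenna almost a thousand years ago, circa 1030 "
        "[Carruthers & Lee, 1992].")
bad = ("We followed the same method as [Tetris et al., 1987a], which was first "
       "suggested by [Avicenna, 1030] almost a thousand years ago.")

print(without_citations(good))
print(without_citations(bad))
```

The first result reads as a complete sentence; the second reproduces the broken sentence quoted above.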
Dashes
Many writers are unaware that there are distinct types of dash, each with a different form and a different meaning.
- Hyphen (-)
- Predominantly used as follows.
- Between words in which the first word (or morpheme) modifies the following word, in adjectival phrases, and in 'double-barrelled' names.
EXAMPLE: "Our lawyer asked the executive to re-sign the contract, using a pen instead of a pencil."
EXAMPLE: "We used a low-cost mixture of sand and anthracite."
EXAMPLE: "Jar testing was performed in a six-paddle stirrer."
EXAMPLE: "Sir Tim Berners-Lee is credited with inventing the World Wide Web in 1989."
- To split a long word at the end of a line. (The split is preferably placed at a syllable or morpheme boundary.)
EXAMPLE:
"On Tuesday we constructed tetra-
hedral bricks, which we arranged
in order of size. On Wednesday we
painted the bricks with profes-
sional materials."
- To indicate chapter prefixes.
EXAMPLE: "Chapter 12 reviews the physiological variations; see especially the summary on page 12-10."
- A few other stylistic, semantic and syntactic variants exist, for example non-breaking hyphens, optional hyphens, and double-hyphens (where a usually-hyphenated word or phrase happens to be broken at the end of a line).
- En-dash (–)
- Predominantly used as follows.
- Between words of equal semantic value, including the combination of names from multiple individuals.
EXAMPLE: "The sand–coal mixture was dumped into the filter beds."
EXAMPLE: "The mixer was designed with a rotor–stator configuration."
EXAMPLE: "The Stokes–Einstein–Sutherland equation describes the fundamental process of diffusion."
- To indicate ranges of values.
EXAMPLE: "Theovin et al. used 1–2 mm spheres in their computer simulation."
EXAMPLE: "Chapter 7 explains the importance of hierarchical structure. There is a particularly useful summary on pages 7-10–7-12."
- Em-dash (—)
- Used for the following.
- Parenthetic remarks or 'asides' in the middle of a sentence.
EXAMPLE: "The mixture was stirred — even though theoretically this should not have affected the results — before loading into the analysis chamber."
In the above example, parentheses could have been used instead.
- Afterthoughts or 'addenda' to sentences.
EXAMPLE: "After lunch they would sit around and chat — which usually degenerated into office gossip."
In the above example, parentheses could have been used instead.
- Minus sign (−)
- Used for the following.
- In mathematical equations to denote subtraction.
EXAMPLE: "A_total = A_outer − A_inner."
- To indicate a negative number.
EXAMPLE: "The specimens were stored in a freezer at −18 °C (±2 °C) for between 2 and 6 weeks prior to analysis."
- Swung dash (⁓)
- Used for the following.
- In dictionaries to indicate repetition of the headword.
EXAMPLE: "Dash: a punctuation mark resembling a short horizontal line. Em-⁓: such a dash that is as wide as the letter 'm'."
In English the sentence should be rendered with an en-dash or the word 'to'. That is, either "Patients of ages 30–50 were enrolled in the trial." or, "Patients of ages 30 to 50 were enrolled in the trial."
- Tilde (~)
- Used for the following.
- To indicate approximation (in which uncertainty is larger than for ≈).
EXAMPLE: "The specimens were examined outside in the field (~25 °C), prior to shipping."
EXAMPLE: "We begin with a cylinder of dimensions: radius = 25 mm, thickness = 0.2 mm, and length ~ 300 mm."
Note: While most common fonts properly cater for all of these basic dashes (and sometimes more), unfortunately there are some less-popular fonts that have either incomplete coverage, or rather inappropriate glyphs, so a little bit of care is needed if using an 'unusual' font.
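Because several of these characters look alike on screen, it can help to check which one a document actually contains. The short sketch below lists the Unicode code point and official name of each character discussed above, using Python's standard `unicodedata` module; the dictionary name `DASHES` and its labels are chosen here for illustration.

```python
import unicodedata

# The six characters discussed above, written as escapes to avoid any ambiguity.
DASHES = {
    "hyphen": "\u002d",
    "en-dash": "\u2013",
    "em-dash": "\u2014",
    "minus sign": "\u2212",
    "swung dash": "\u2053",
    "tilde": "\u007e",
}

for label, ch in DASHES.items():
    # Example line: 'en-dash     U+2013  EN DASH'
    print(f"{label:<11} U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Note that the character produced by the hyphen key is officially named HYPHEN-MINUS, reflecting its historical double duty; the true minus sign is the separate character U+2212. The hexadecimal values are the ones to use with the Unicode Alt-key method described earlier.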