A space is a punctuation convention for providing interword separation in some scripts, including the Latin, Greek, Cyrillic, and Arabic.
Not all languages use spaces between words; the ancient Latin and Greek did not. Spaces were not used to separate words until roughly 600–800 AD. (See interword separation for more on the history.) Traditionally, all CJK languages have no space: modern Chinese and Japanese (except when written with little or no kanji) still do not, but modern Korean uses spaces.
For use of spaces after full stops, exclamation marks, and question marks, see discussion in the article Full stop.
Spaces and computers
In programming language syntax, spaces are frequently used to explicitly separate tokens. Aside from this use, most modern programming languages ignore spaces and other whitespace characters. The exceptions include Haskell, ABC, and Python, which use the amount of whitespace in indentation to indicate the scope of a block. Algol-derived languages instead mark blocks with explicit delimiters: the begin and end keywords in Pascal, and braces in C and Perl.
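The two roles of whitespace described above can be observed with the tokenize module in Python's standard library. The following is only a minimal sketch: the sample source strings and the helper name token_names are made up for illustration.

```python
import io
import tokenize

# Hypothetical snippet: the body of the `if` is marked only by its indentation.
SOURCE = "if x:\n    y = 1\nz = 2\n"

def token_names(source):
    """Return the names of the tokens Python's tokenizer produces for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]

names = token_names(SOURCE)
print(names)  # includes INDENT before the block body and DEDENT after it

# Spaces between tokens only separate them; they produce no tokens of their own.
print(token_names("y = 1\n") == token_names("y=1\n"))
```

The tokenizer reports a change of indentation as an INDENT or DEDENT token, whereas the spaces around the equals sign leave the token sequence unchanged.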
In word processors and text editors, if a line on a screen is shorter than the width of the screen or window, then the empty space to the right usually does not correspond to space characters in the file: there is simply a code indicating that the following text should start on a new line. Thus, the file is not made unnecessarily large. If there are space characters at the end of a line, one usually does not see the difference; text editors and word processors often have an option to make them visible. Also, if there is a space character, the cursor can move there; otherwise it usually cannot.
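The difference in file size can be illustrated with a short sketch. The 80-column window width and the helper name trailing_spaces are assumptions chosen for the example, not properties of any particular editor.

```python
# A short line is stored as its characters plus a line terminator,
# not padded with spaces out to the width of the window.
WINDOW_WIDTH = 80  # assumed display width, for illustration only

line = "short line"
stored = (line + "\n").encode("ascii")
padded = (line.ljust(WINDOW_WIDTH) + "\n").encode("ascii")

print(len(stored))  # 11 bytes: 10 characters plus the newline
print(len(padded))  # 81 bytes: the same line padded to the window width

def trailing_spaces(text):
    """Count the real space characters at the end of a line."""
    return len(text) - len(text.rstrip(" "))

print(trailing_spaces(line))                      # 0
print(trailing_spaces(line.ljust(WINDOW_WIDTH)))  # 70
```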
Spaces and digital typography
In computer programming, the normal space corresponds to character 32 in both ASCII and Unicode, written U+0020. In HTML and XML, multiple spaces or newline characters collapse into a single space, unless they are contained in an HTML element such as pre, the xml:space="preserve" XML attribute is used, or CSS sets the white-space property to pre or pre-wrap (pre-line preserves line breaks but still collapses runs of spaces). The special non-breaking space (U+00A0, written &nbsp; in HTML) always gives a non-collapsible space character. It is often used to indent text, though some web authorities discourage using it for that purpose.
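The collapsing rule can be approximated in a few lines. This is a simplified sketch rather than a description of what any browser does: real layout engines also strip whitespace at the start and end of lines and apply further rules. The names COLLAPSIBLE and collapse_whitespace are hypothetical.

```python
import re

# The ASCII whitespace characters that are subject to collapsing.
# U+00A0 (no-break space) is deliberately absent, so it is never collapsed.
COLLAPSIBLE = re.compile(r"[ \t\n\r\f]+")

def collapse_whitespace(text):
    """Replace each run of collapsible whitespace with a single space."""
    return COLLAPSIBLE.sub(" ", text)

sample = "one   two\n\tthree\u00a0\u00a0four"
print(repr(collapse_whitespace(sample)))  # 'one two three\xa0\xa0four'
```

An explicit character class is used instead of \s because, for Python strings, \s also matches the no-break space and the other Unicode spaces.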
Other kinds of spaces exist for special uses: for example, an em dash can optionally be surrounded with a so-called hair space, Unicode character 8202, or U+200A. This space should be much thinner than a normal space, and is seldom used on its own. It can be written in HTML by using the numeric character reference &#8202; or &#x200A;. Unfortunately, not all user agents render a hair space correctly: where the font in use lacks the glyph, the result may be an unwanted symbol or a question mark on the screen.
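The relationship between the decimal number, the hexadecimal code point, and the two HTML forms can be checked with Python's standard html module. This is a small sketch; the variable names are illustrative only.

```python
import html

HAIR_SPACE = "\u200a"

# Decimal 8202 and hexadecimal 200A name the same code point.
print(ord(HAIR_SPACE), hex(ord(HAIR_SPACE)))  # 8202 0x200a

# Both numeric character references decode to that character.
decimal_form = html.unescape("&#8202;")
hex_form = html.unescape("&#x200A;")
print(decimal_form == hex_form == HAIR_SPACE)  # True

# An em dash (U+2014) set off by hair spaces.
spaced_dash = f"left{HAIR_SPACE}\u2014{HAIR_SPACE}right"
print(spaced_dash)
```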
Normal space versus hair space
| Normal space | left right |
| Normal space with em dash | left — right |
| Hair space with em dash | left — right |
| No space with em dash | left—right |
Unicode defines several space characters for fine typography. Depending on the browser and fonts used to view this table, not all spaces may display properly:
Space characters defined in Unicode
| Code | HTML entity | Name | In Block | Display | Description |
|---|---|---|---|---|---|
| U+0020 | not necessary | Space | Basic Latin | ] [ | Normal space, same as ASCII character 0x20 |
| U+00A0 | &nbsp; | No-Break Space | Latin-1 Supplement | ] [ | Identical to U+0020, but not a point at which a line may be broken |
| U+1680 | &#x1680; | Ogham Space Mark | Ogham | ] [ | Used for interword separation in Ogham text. Normally a vertical line in vertical text or a horizontal line in horizontal text, but may also be a blank space in "stemless" fonts. Requires an Ogham font. |
| U+2002 | &ensp; | En Space, or Nut | General Punctuation | ] [ | Width of one en (half of one em) |
| U+2003 | &emsp; | Em Space, or Mutton | General Punctuation | ] [ | Width of one em |
| U+2004 | &#x2004; | Three-Per-Em Space, or Thick Space | General Punctuation | ] [ | One third of an em wide |
| U+2005 | &#x2005; | Four-Per-Em Space, or Mid Space | General Punctuation | ] [ | One fourth of an em wide |
| U+2006 | &#x2006; | Six-Per-Em Space | General Punctuation | ] [ | One sixth of an em wide |
| U+2007 | &#x2007; | Figure Space | General Punctuation | ] [ | In fonts with monospaced digits, equal to the width of one digit |
| U+2008 | &#x2008; | Punctuation Space | General Punctuation | ] [ | As wide as the narrow punctuation in a font |
| U+2009 | &thinsp; | Thin Space | General Punctuation | ] [ | One fifth (sometimes one sixth) of an em wide |
| U+200A | &#x200A; | Hair Space | General Punctuation | ] [ | Thinner than a thin space |
| U+200B | &#x200B; | Zero-Width Space | General Punctuation | ]​[ | Used to indicate word boundaries to text processing systems when using scripts that do not use explicit spacing; normally not a visible separation, but it may expand in passages that are fully justified |
| U+202F | &#x202F; | Narrow No-Break Space | General Punctuation | ] [ | Similar to U+00A0 No-Break Space, but narrower |
| U+205F | &#x205F; | Medium Mathematical Space | General Punctuation | ] [ | Used in mathematical formulae; four-eighteenths of an em wide |
| U+3000 | &#x3000; | Ideographic Space | CJK Symbols and Punctuation | ]　[ | As wide as a CJK character cell |
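The names in the table can be cross-checked against the Unicode Character Database bundled with Python, through the standard unicodedata module. This is a minimal sketch; the list SPACE_CODE_POINTS and the helper describe are illustrative names, and the output depends on the Unicode version shipped with the interpreter.

```python
import unicodedata

# Code points from the table above.
SPACE_CODE_POINTS = [
    0x0020, 0x00A0, 0x1680, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
    0x2007, 0x2008, 0x2009, 0x200A, 0x200B, 0x202F, 0x205F, 0x3000,
]

def describe(code_point):
    """Return (code, official name, general category, str.isspace()) for a code point."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        ch.isspace(),
    )

rows = [describe(cp) for cp in SPACE_CODE_POINTS]
for row in rows:
    print(row)
```

In current Unicode versions the zero-width space is classified as a format character (category Cf) rather than a space separator (Zs), so Python's str.isspace() returns False for it even though its name contains the word "space".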
Unicode also provides some visible characters to stand in for space when necessary in the "Control Pictures" block: the Symbol For Space ␠ (U+2420), the Blank Symbol ␢ (U+2422), and the Open Box ␣ (U+2423).
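One way an editor-style "show spaces" option could use these characters is sketched below. The function name show_spaces is hypothetical, and the choice of the Open Box as the default marker is just one possibility.

```python
import unicodedata

SYMBOL_FOR_SPACE = "\u2420"
BLANK_SYMBOL = "\u2422"
OPEN_BOX = "\u2423"

def show_spaces(text, marker=OPEN_BOX):
    """Return `text` with every normal space (U+0020) replaced by a visible marker."""
    return text.replace(" ", marker)

visible = show_spaces("a b  c ")
print(visible)

for ch in (SYMBOL_FOR_SPACE, BLANK_SYMBOL, OPEN_BOX):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Only U+0020 is replaced here; the other space characters in the table above would need their own markers if they are to be distinguished.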
See also