[Copyright 1975,1979,1983,2002,2003,2004,2005,2007,2008,2011 Frank Durda IV, All Rights Reserved. Mirroring of any material on this site in any form is expressly prohibited. The official web site for this material is: http://nemesis.lonestar.org Contact this address for use clearances: clearance at nemesis.lonestar.org Comments and queries to this address: web_reference at nemesis.lonestar.org]
Most Significant Bits | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 |
000 | (0) 00 | (1) 01 | (2) 02 | (3) 03 | (4) 04 | (5) 05 | (6) 06 | (7) 07 | (8) 08 | (9) 09 | (10) 0A | (11) 0B | (12) 0C | (13) 0D | (14) 0E | (15) 0F |
001 | (16) 10 | (17) 11 | (18) 12 | (19) 13 | (20) 14 | (21) 15 | (22) 16 | (23) 17 | (24) 18 | (25) 19 | (26) 1A | (27) 1B | (28) 1C | (29) 1D | (30) 1E | (31) 1F |
010 | (32) 20 | (33) 21 | (34) 22 | (35) 23 | (36) 24 | (37) 25 | (38) 26 | (39) 27 | (40) 28 | (41) 29 | (42) 2A | (43) 2B | (44) 2C | (45) 2D | (46) 2E | (47) 2F |
011 | (48) 30 | (49) 31 | (50) 32 | (51) 33 | (52) 34 | (53) 35 | (54) 36 | (55) 37 | (56) 38 | (57) 39 | (58) 3A | (59) 3B | (60) 3C | (61) 3D | (62) 3E | (63) 3F |
100 | (64) 40 | (65) 41 | (66) 42 | (67) 43 | (68) 44 | (69) 45 | (70) 46 | (71) 47 | (72) 48 | (73) 49 | (74) 4A | (75) 4B | (76) 4C | (77) 4D | (78) 4E | (79) 4F |
101 | (80) 50 | (81) 51 | (82) 52 | (83) 53 | (84) 54 | (85) 55 | (86) 56 | (87) 57 | (88) 58 | (89) 59 | (90) 5A | (91) 5B | (92) 5C | (93) 5D | (94) 5E | (95) 5F |
110 | (96) 60 | (97) 61 | (98) 62 | (99) 63 | (100) 64 | (101) 65 | (102) 66 | (103) 67 | (104) 68 | (105) 69 | (106) 6A | (107) 6B | (108) 6C | (109) 6D | (110) 6E | (111) 6F |
111 | (112) 70 | (113) 71 | (114) 72 | (115) 73 | (116) 74 | (117) 75 | (118) 76 | (119) 77 | (120) 78 | (121) 79 | (122) 7A | (123) 7B | (124) 7C | (125) 7D | (126) 7E | (127) 7F |
In this table, the code or symbol name is shown on the first line, followed by the decimal value for that code or symbol, followed by the hexadecimal value. The binary value can be computed based on the row and column where the code or symbol resides, or directly from the hexadecimal value. For example, the character "+" has the binary value "010 1011", with "010" taken from the row and "1011" taken from the column. Similarly, the lowercase letter 'p' has the binary value "111 0000".
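The row and column lookup described above can be sketched in a few lines of Python. This is only an illustration; the function name `ascii_bits` is made up for this example and is not part of any standard.

```python
def ascii_bits(ch: str) -> str:
    """Return the 7-bit binary value of an ASCII character as 'row column'."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    row = code >> 4         # three most significant bits (the table row)
    column = code & 0x0F    # four least significant bits (the table column)
    return f"{row:03b} {column:04b}"


print(ascii_bits("+"))  # 010 1011
print(ascii_bits("p"))  # 111 0000
```

The same split works in reverse: joining the row bits and the column bits gives the hexadecimal value shown in the cell.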
The background color for each code or symbol indicates the category that the code resides in. Red indicates control (non-printable) codes. Orange indicates basic punctuation and symbols. Yellow indicates numeric digits. Green indicates the uppercase letters. Blue indicates lowercase letters (part of the extended character set). Purple indicates punctuation and symbols that are in the extended character set. (If color viewing is not available, the following table gives these categories as numeric ranges.)
Control, 0 to 31 and 127 |
Control Characters (non-printable) (Red) | 0 to 31 | 0x00 to 0x1F | 0b0000000 to 0b0011111 |
Control Characters (non-printable) (Red) | 127 | 0x7F | 0b1111111 |
Basic Printable, 32 to 95 |
Symbols and Punctuation (Orange) | 32 to 47 | 0x20 to 0x2F | 0b0100000 to 0b0101111 |
Symbols and Punctuation (Orange) | 58 to 64 | 0x3A to 0x40 | 0b0111010 to 0b1000000 |
Symbols and Punctuation (Orange) | 91 to 95 | 0x5B to 0x5F | 0b1011011 to 0b1011111 |
Numbers (Yellow) | 48 to 57 | 0x30 to 0x39 | 0b0110000 to 0b0111001 |
Uppercase Letters (Green) | 65 to 90 | 0x41 to 0x5A | 0b1000001 to 0b1011010 |
Extended Printable, 96 to 126 |
Lowercase Letters (Blue) | 97 to 122 | 0x61 to 0x7A | 0b1100001 to 0b1111010 |
Extended Symbols and Punctuation (Purple) | 96 | 0x60 | 0b1100000 |
Extended Symbols and Punctuation (Purple) | 123 to 126 | 0x7B to 0x7E | 0b1111011 to 0b1111110 |
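For readers who want these ranges in program form, the following Python sketch classifies a code by the same numeric boundaries. The function name and the category strings are chosen for this example only.

```python
def ascii_category(code: int) -> str:
    """Classify a 7-bit ASCII code using the numeric ranges in the table."""
    if not 0 <= code <= 127:
        raise ValueError("not a 7-bit ASCII code")
    if code <= 31 or code == 127:
        return "control"
    if 48 <= code <= 57:
        return "number"
    if 65 <= code <= 90:
        return "uppercase letter"
    if 97 <= code <= 122:
        return "lowercase letter"
    if code == 96 or 123 <= code <= 126:
        return "extended symbol or punctuation"
    # What remains is 32 to 47, 58 to 64 and 91 to 95.
    return "symbol or punctuation"


for ch in ("A", "7", "{", "_"):
    print(repr(ch), ascii_category(ord(ch)))
```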
The extended printable character set was deliberately arranged so that if a symbol was received in this range and could not be displayed due to limitations of the printing or display device, the symbol in the basic printable range exactly 32 (0x20) positions earlier could be substituted and would provide reasonable results. In such situations, "{" and "}" would be displayed or printed as "[" and "]", while lowercase letters would be displayed or printed in uppercase.
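A device driver could apply that substitution with a single subtraction. The Python sketch below is a hypothetical illustration of the idea (the name `fold_to_basic` is invented here), not a description of how any particular terminal implemented it.

```python
def fold_to_basic(text: str) -> str:
    """Replace extended printable codes (96 to 126) with the code 32 lower."""
    folded = []
    for ch in text:
        code = ord(ch)
        if 96 <= code <= 126:
            code -= 32  # e.g. '{' (0x7B) becomes '[' (0x5B), 'a' becomes 'A'
        folded.append(chr(code))
    return "".join(folded)


print(fold_to_basic("if (x) { y[0] = 'a'; }"))  # IF (X) [ Y[0] = 'A'; ]
```

Codes outside the 96 to 126 range, including DEL (127), pass through unchanged.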
Codes shown in Yellow are used for physical level synchronous link idle fills, editing, and DCE command functions. Orange is for codes primarily used in Synchronous transmission protocols, such as SDLC. Green codes are used to direct printer (or VDT display) paper and print head (cursor) non-printing movement. Blue codes are peripheral operator alert controls. Red codes divide data in higher level protocols, including some file system and multi-track tape format structures.
It should be mentioned that the operating systems for computers made by the late Digital Equipment Corporation (DEC), particularly the PDP-8, PDP-11, and PDP-10/DECSystem-10/DECSystem-20 systems, had a profound and lasting influence on the uses of ASCII control characters that were employed by users at client terminals, directing operating systems and applications.
Virtually all the control character uses and conventions that were used in the DEC PDP-11 operating systems (including RT-11 and RSTS-11) were copied into the Digital Research CP/M operating system, which itself was later copied by another vendor to create a CP/M clone that ran on the Intel 8088/8086 processor, an operating system which Microsoft Corporation bought and licensed to IBM as PC-DOS. (Microsoft sold the same system to other manufacturers as MS-DOS.)
The earliest versions of UNIX were developed on DEC systems, and the Bell Labs programmers elected to use many of the same character codes that DEC had already defined for various functions in their operating systems.
Finally, additional control code uses from the DECSystem TOPS, TWENEX and ITS (aka "Incompatible Time Sharing") operating systems can be found today in BSD-derived versions of UNIX terminal drivers and applications. One specific duplication of the TOPS environment can be found in the "set filec" mode of BSD csh shell. (I am talking about the real BSD 4.x csh, not the Convex tcsh one commonly renamed as "csh" today. The "set" command with no parameters will show you if it is the real csh or not.)
Mnemonic | Decimal Binary Hexadecimal Control Key |
NUL | 0 0b0000000 0x00 CTRL-@ |
NULL - No Punch | Generates an unpunched position on paper tape (except
for the traction hole) and was commonly used to create leader and trailer
areas.
Some systems use the character to indicate that a [BREAK] signal has been sent, even though an actual asynchronous modem break has no character code and is actually represented by continuous spacing, with no marking, lasting longer than one character transmission time. (Discussed further below.) Also used by some systems as an idle transmission or pad character instead of SYN. |
SOH | 1 0b0000001 0x01 CTRL-A |
Start Of Heading | |
STX | 2 0b0000010 0x02 CTRL-B |
Start Of Text | |
ETX | 3 0b0000011 0x03 CTRL-C |
End Of Text | Many DEC-derived systems use ETX as an interrupt signal
to abort software execution.
Most UNIX systems use ETX as the default code to generate a SIGINT signal for the foreground process. (Exception: Solaris responds to ETX by sending the SIGINT signal to all processes on the given controlling terminal.) |
EOT | 4 0b0000100 0x04 CTRL-D |
End Of Transmission |
DEC TOPS-10/20 and UNIX C shell use EOT for command line options displays.
In C language environments with a Standard In (stdin) device, an EOT can indicate that the end of input has been reached. |
ENQ | 5 0b0000101 0x05 CTRL-E |
Enquiry, Also known as WRU (Who aRe You), HERE IS, and Answerback |
Some Teletype models would transmit an equipment identification string
in response to this code.
In TOPS-20 environments, applications usually responded to ENQ by displaying the executable version string and other identifying information. |
ACK | 6 0b0000110 0x06 CTRL-F |
Acknowledge | |
BEL | 7 0b0000111 0x07 CTRL-G |
Bell | Audible Signal or Alert, or visual indicator on some VDTs |
BS | 8 0b0001000 0x08 CTRL-H |
Backspace |
The print head or cursor is moved one position to the left.
In the case of VDTs, the character in that position may be erased,
depending on local settings. If VDT cursor is already at column 1,
cursor may move to end of previous line, depending on local settings.
In many operating systems and DCE devices, directs that the most recently entered character that has not yet been processed should be erased from the input buffer. On some operating systems that are aware that the client is using a hard copy terminal, transmitting this character to the server causes the server to send a sequence of characters that indicate that the previous character has been discarded, but the print head does not actually back over the now-erased character. Deleting the last three characters of the sequence ABCDEF on a DECSystem-20 with a printing terminal would result in ABCDEF\F\E\D\ being printed. If a VDT was being used, the characters DEF would be erased and the cursor positioned just after the "C" character. |
HT | 9 0b0001001 0x09 CTRL-I |
Horizontal Tabulation |
This moved the print head or cursor to the next tab stop, traditionally
placed every eight columns. Some electronic printers and VDTs allow tab
stops to be programmed. Most Teletype models did not implement horizontal
tab stops and would ignore the code entirely.
Card punch systems skip the current card in response to this code. The DEC TOPS-10/20 ESC COMND JSYS completion behavior is partly emulated in the UNIX tcsh shell using the HT code instead of the ESC code. The BSD csh shell uses the traditional ESC code for original TOPS behavior. (See ESC for more information.) |
LF | 10 0b0001010 0x0A CTRL-J |
Line Feed (Paper Advance) |
Paper Advance one line or move cursor down one line. If VDT is at the bottom
of screen already, scroll screen one line or wrap to top, depending on
settings.
UNIX system display routines treat LF as though they had received both CR and LF in most situations. However, TCP communication software on UNIX systems running in the default "cooked" mode must use the proper CR/LF sequence to end a given line of ASCII text that is transmitted or received. |
VT | 11 0b0001011 0x0B CTRL-K |
Vertical Tabulation | Paper Advance by number of lines dictated by the control tape or similar mechanism. |
FF | 12 0b0001100 0x0C CTRL-L |
Form Feed | Paper Advance to next page, screen clear and/or position to top or bottom line on some VDTs. |
CR | 13 0b0001101 0x0D CTRL-M |
Carriage Return |
Move print head or cursor to column 1.
Early TRS-80 systems performed the actions of a CR and LF when a CR was output by applications. |
SO | 14 0b0001110 0x0E CTRL-N |
Shift Out | Select alternate font or character set (such as APL) on some equipment. |
SI | 15 0b0001111 0x0F CTRL-O |
Shift In | De-select alternate font or character set on some equipment. On terminals with APL character sets, this code would return to using the normal ASCII character set. |
DLE | 16 0b0010000 0x10 CTRL-P |
Data Link Escape | Controls access on some DCE equipment by the DTE, such as voice-capable modems. On DEC VAX and some related equipment, halts main processor when entered from console. |
DC1 | 17 0b0010001 0x11 CTRL-Q |
Device Control 1, Also known as X-ON |
Starts paper reader on some equipment. Seven bit asynchronous communications frequently use this code for flow control. |
DC2 | 18 0b0010010 0x12 CTRL-R |
Device Control 2 | Starts paper punch or tape recorder on some equipment |
DC3 | 19 0b0010011 0x13 CTRL-S |
Device Control 3, Also known as X-OFF |
Stops paper reader on some equipment. Seven bit asynchronous communications frequently use this code for flow control. |
DC4 | 20 0b0010100 0x14 CTRL-T |
Device Control 4 |
Stops paper punch or tape recorder on some equipment.
On DEC TOPS-10/20 and some UNIX platforms, causes the display of the current load status of the system or the run status of the foreground process. May also cause the controlling terminal shell to send a SIGINFO signal to a foreground process on modern BSD and BSD-derived UNIX platforms. |
NAK | 21 0b0010101 0x15 CTRL-U |
Negative Acknowledge |
May initiate re-transmit of frame in Synchronous transmission systems.
DEC-derived systems use this to discard/erase an entire unprocessed line of command text. |
SYN | 22 0b0010110 0x16 CTRL-V |
Synchronous Idle | Transmitted to maintain timing on a Synchronous data link when no other data was ready for transmission. |
ETB | 23 0b0010111 0x17 CTRL-W |
End of Transmission Block | |
CAN | 24 0b0011000 0x18 CTRL-X |
Cancel | Early HP operating systems used CAN to signal that an entire line of unprocessed command text was to be discarded. |
EM | 25 0b0011001 0x19 CTRL-Y |
End of Medium | |
SUB | 26 0b0011010 0x1A CTRL-Z |
Substitute |
BSD-derived shells use this code to suspend program execution.
Some older DEC-derived systems use this character as an end-of-file indicator for text files. |
ESC | 27 0b0011011 0x1B CTRL-[ |
Escape |
For output to displays, ESC is commonly used to begin a
sequence of characters that are used to alter terminal behavior.
In these display control systems, the characters that immediately
follow the ESC character instruct the receiving device to reposition the
display cursor or printing position, erase the screen or reposition
paper, alter character sets or display colors to be used from this
point forward, even start or stop peripherals attached to the terminal.
In the 1970s, numerous display control code systems were developed by the various manufacturers. However, the control code system developed by DEC for the VT50/VT52 display terminals was extremely popular and was emulated by other manufacturers' equipment. The VT50/VT52 display control system was expanded for the DEC VT100, and that command set was largely adopted as an ANSI standard which is widely used today. For input from terminals, DEC TOPS-10/20 and the UNIX csh shell use this code to attempt a command line completion or guide word display. (Guide words only in TOPS COMND JSYS calls.) |
FS | 28 0b0011100 0x1C CTRL-\ |
File Separator | By default, the command shells in UNIX systems treat this as a QUIT signal, and will pass a signal to the current foreground process that it should abort, and if allowed and possible, make a core dump. |
GS | 29 0b0011101 0x1D CTRL-] |
Group Separator | |
RS | 30 0b0011110 0x1E CTRL-^ |
Record Separator | |
US | 31 0b0011111 0x1F CTRL-_ |
Unit Separator | |
DEL | 127 0b1111111 0x7F No Standard |
Delete, Also known as RUB OUT |
Used on paper tape systems to "erase" a bad punch by over-punching an
incorrect byte with all holes. Some systems also used this code to create
a leader and trailer sequence for paper and digital tape recordings.
Some operating systems from the paper tape and punch card eras
ignore this code when received.
Some operating systems use this as an alternate to the Backspace (BS) code, erasing the most recently received and unprocessed input character. |
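Reading down the Control Key column of the table above shows a pattern: each control code from 0 to 31 is the code of the named key (64 to 95, "@" through "_") with its two most significant bits cleared. The Python sketch below illustrates that relationship; the function name `ctrl_code` is invented for this example, and real keyboards and terminal drivers may differ in how they generate these codes.

```python
def ctrl_code(key: str) -> int:
    """Return the control code listed in the table for CTRL-<key>."""
    code = ord(key.upper())          # CTRL-c and CTRL-C are treated alike
    if not 0x40 <= code <= 0x5F:
        raise ValueError("no control code is listed for this key")
    return code & 0x1F               # keep only the five low-order bits


print(ctrl_code("C"))  # 3  (ETX)
print(ctrl_code("["))  # 27 (ESC)
print(ctrl_code("@"))  # 0  (NUL)
```

DEL (127) falls outside this pattern, which matches its "No Standard" entry in the Control Key column.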
There have been several versions of the ASCII coding system. There were formal versions in 1963, 1965, 1967 and the ANSI version in 1968. The following list details many of the changes made to the coding system during this period.
There are numerous symbols that do not exist in ASCII but might seem logical to have. Some do exist in other character sets, but these are not part of ASCII.
ASCII has only 94 code combinations that can be used to produce printable characters. Since 52 codes are consumed by the alphabet, and another 10 are consumed by numeric digits, this leaves only 32 codes for punctuation and other symbols. And of those 32 codes, 5 had to contain characters that look similar to the characters in another set of 5. For example, '[' and ']' are in one set of 5, with '{' and '}' in the other set of 5, and so on. ASCII was intentionally organized this way to allow simpler display devices to be produced that only had to print 63 of the 94 ASCII printable codes and could substitute something "close" when asked to display an ASCII character that the device was incapable of producing, such as using the uppercase letter when the lowercase letter could not be printed.
Consequently, many symbols that might be desired are just not present in ASCII. For example, to provide all accent marks commonly used in European languages on vowels, as many as twelve codes per vowel would be needed, requiring perhaps sixty codes. The ASCII character coding system just doesn't have the space to include these symbols.
Here is a list of codes that people frequently inquire about that do not exist in ASCII.
Some display systems also offer special "fonts" that include symbols specific to certain occupations or world regions, but these typically re-use the same numerical code values that ASCII uses for its printable and extended printable characters. Because of the overlap, in order to mix special character codes and normal ASCII characters together, the special font must be activated, the special character selected, and then the special font deactivated. This must be repeated each time a character from the set not currently selected is desired, and in some equipment, only one set can be displayed at a time.
The typical World Wide Web browser is able to display both ASCII and ISO-8859 characters simultaneously because their numerical codes do not overlap. However, since most keyboards can only produce ASCII characters, the display of ISO-8859 characters is achieved by using HTML escape codes that are entered using ASCII codes. For example in the HTML language, the ASCII character sequence '&cent;' in an HTML document will display the ISO-8859 character '¢' on most systems.
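The translation from an ASCII-only escape code to a character outside ASCII can be observed with Python's standard `html` module. This is a small sketch of the idea using the cent sign entity; the helper name `decode_entity` is made up for this example.

```python
import html


def decode_entity(entity: str) -> tuple[str, int]:
    """Decode an HTML escape code and return the character and its code."""
    if not entity.isascii():
        raise ValueError("the escape code itself must be plain ASCII")
    ch = html.unescape(entity)
    return ch, ord(ch)


ch, code = decode_entity("&cent;")
print(ch, code, hex(code))  # the cent sign, 162, 0xa2
```

The resulting code, 162 (0xA2), lies above the 0 to 127 range used by ASCII, which is why the two sets can coexist in one document.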
It should be understood that no ASCII code specifies the font type, font size or color of the ASCII printable characters. These and any other additional attributes of printable text are optionally specified at a higher coding level, usually by preceding the target characters with an escape sequence, followed by instructions specifying how printable characters from this point forward should be displayed.
For example, in the HTML language, the HTML tag sequence <FONT COLOR="#FF6666"> specifies that subsequent characters should be displayed in the same font type and size previously used, but that when displayed on a device capable of displaying colors, the color of the subsequent characters should be displayed in the specified shade of red. You can see some color tables and the HTML values needed to produce them in the appendix of this document: The Use and Misuse of Color in Web Pages
When the BREAK key is pressed on a real communications terminal or similar device, the asynchronous serial transmission line begins to send continuous Spacing, the opposite of the "rest" state of continuous Marking. While the BREAK signal condition is present, there are no start, data, stop or parity bits being sent.
If the duration of a continuous spacing condition exceeds 1.6 seconds, it is usually considered to be a Modem BREAK indication. Traditionally, a Modem BREAK indication directed the local and distant modems to drop carrier and end the call. Some teleprinters also turned their motors off in response to this signal, while some "party line" networks used a BREAK signal to attract the attention of the network controller.
In "current-loop" transmission systems, the Spacing condition is equivalent to a lack of current present on the loop, and if the condition persisted it was treated as a "break" in the circuit.
On time-sharing dial-up systems, a BREAK signal is usually interpreted by the receiving communications controller or computer as a directive to stop the current operation, usually by behaving as though it had received some ASCII control character that normally performed the interrupt function, like CTRL-C.
On keyboards with computers attached or integrated (such as all modern PCs), the BREAK key (if present) likely is translated from a keyboard scan code into some ASCII code (such as CTRL-C), rather than producing an actual BREAK line condition. If an actual BREAK signal was to be sent out a serial port, this would have to be done by a special instruction to the tty/serial driver of that operating system. (POSIX environments use an ioctl() call to do this.)
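As a rough illustration of such a driver request, the POSIX terminal interface offers `tcsendbreak()`, which many systems implement on top of an ioctl() call. The Python sketch below uses the standard `termios` wrapper (available on UNIX-like systems only); the function name `try_send_break` is an assumption made for this example.

```python
import os
import termios


def try_send_break(fd: int) -> bool:
    """Ask the tty driver to put a BREAK condition on the line behind fd.

    Returns True if the driver accepted the request, or False if fd does
    not refer to a terminal device or the request otherwise failed.
    """
    try:
        # A duration of 0 requests the default break length, which POSIX
        # defines as at least 0.25 seconds and no more than 0.5 seconds.
        termios.tcsendbreak(fd, 0)
        return True
    except termios.error:
        return False


# A pipe is not a terminal, so the driver request is refused.
read_end, write_end = os.pipe()
print(try_send_break(read_end))  # False
os.close(read_end)
os.close(write_end)
```

With a real serial port, the file descriptor would come from opening the port's device file, whose name varies from system to system.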
One of the earliest 7-bit ASCII devices was an improved line of electro-mechanical printers made by the Teletype corporation. With an operational speed of up to 10 characters per second, these devices were used worldwide for message transmission by Western Union, various news wire services and the military. Later, these devices found new uses as input/output devices connected to computer systems that also communicated using the ASCII character set.
The most widely-manufactured Teletype model was number 33, which was sold under a variety of model names such as the KSR-33 and ASR-33. These devices could only print the basic printable character portion of the ASCII character set (64 characters). This limited these devices to uppercase letters, numbers and most punctuation characters as shown in the table above. Some early video terminals and computers (such as the Digital Equipment Corporation VT50 and the Radio Shack TRS-80 Model I) supported only the basic printable set of characters, despite being designed and manufactured years after the ASCII extended character set was adopted. Some manufacturers did offer upgrades that allowed for the display of all ASCII printable characters.
Prior to the introduction of the ASCII-based Teletype printers, the Teletype Corporation produced teleprinters that used Baudot or "5-Level" character codes, operating at speeds between 40 and 75 baud. These were widely used for over thirty years, but were largely removed from service by the mid 1960s.
IBM's earlier mainframe computers (notably the IBM 360 and 370 families) did not use ASCII. Instead, they used an alternate character coding system called EBCDIC which was devised by IBM as a way to ensure that any peripherals to be connected to IBM computers were also made by IBM. IBM eventually lost this battle and by the late 1970s, it was common to see IBM systems that used EBCDIC internally, but had external communication processors that translated transmissions between IBM's EBCDIC and what other equipment makers were using, which was ASCII.
Baudot (5-Level) Character Code Reference (HTML)
SIXBIT Character Code Reference (HTML)
RADIX50 Character Code Reference (HTML)