Base 62
Base62 is an encoding system that encodes large numbers using ASCII characters. The digits 0-9 (value 0-9), uppercase letters A-Z (value 10-35) and lowercase letters a-z (value 36-61) are used. The high number base results in shorter character strings than with the decimal or hexadecimal system, which offers two main advantages:
- They can be entered by a human faster and with a lower risk of error. In this case, a font should be selected in which characters that could be confused, such as a lowercase l and an uppercase I, or a zero (0) and an uppercase O, are distinguishable.
- Length restrictions, for example when a number is to be used as part of an identifier or file name, can be bypassed. However, this only works if the processing system is case-sensitive.
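The digit mapping described above (0-9, then A-Z, then a-z) can be sketched in Python as follows. This is a minimal illustration for non-negative integers only; the names `ALPHABET`, `base62_encode` and `base62_decode` are chosen for this example and do not come from any particular library.

```python
import string

# Alphabet in the order given above: 0-9 (values 0-9),
# A-Z (values 10-35), a-z (values 36-61).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(n: int) -> str:
    """Return the Base62 representation of a non-negative integer."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)       # peel off the least significant digit
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(s: str) -> int:
    """Return the integer encoded by a Base62 string."""
    n = 0
    for ch in s:
        # str.index raises ValueError for characters outside the alphabet
        n = n * 62 + ALPHABET.index(ch)
    return n


biggest = 2**64 - 1
lengths = {
    "decimal": len(str(biggest)),
    "hexadecimal": len(format(biggest, "x")),
    "base62": len(base62_encode(biggest)),
}
```

With this sketch, `base62_encode(62)` gives `"10"`, and the largest unsigned 64-bit integer needs 20 decimal digits, 16 hexadecimal digits, but only 11 Base62 characters.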
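Base62 strings also sort in the same order as the underlying data, provided that they all have the same length (for example, through padding with leading zeros) and that the sort algorithm is case-sensitive and follows ASCII order, in which digits precede uppercase letters and uppercase letters precede lowercase letters.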
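The sorting behaviour can be checked with a short Python sketch. It assumes the 0-9, A-Z, a-z alphabet and pads every string to a fixed width with leading zeros; the helper name `base62_fixed` and the sample values are hypothetical and serve only as an illustration.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_fixed(n: int, width: int) -> str:
    """Encode a non-negative integer as Base62, left-padded with '0'."""
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits)).rjust(width, "0")


values = [5, 61, 62, 3843, 3844, 1_000_000]   # already in ascending order
padded = [base62_fixed(v, 4) for v in values]
unpadded = [s.lstrip("0") or "0" for s in padded]

# Python compares strings by code point, which for ASCII is case-sensitive
# and places 0-9 before A-Z before a-z, matching the digit values.
padded_keeps_order = sorted(padded) == padded
unpadded_keeps_order = sorted(unpadded) == unpadded
```

Here `padded_keeps_order` is `True`, while `unpadded_keeps_order` is `False`: without padding, `"10"` (62) sorts before `"5"`, so variable-length strings do not preserve numeric order.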
Base62 is not widely used. Although the large number base allows a high-density encoding, it is not human-friendly. The use of both upper and lower case letters makes the strings awkward to read out. There is also the risk of 0/O and l/1 confusion. Other encodings, such as Base58, restrict the symbol alphabet further to avoid these problems. Compared to the more widely used Base64, Base62 does have the advantage that its values may be used in URLs without requiring further percent-encoding. Accordingly, Base62 has mostly been used in entirely automated contexts, such as URL shortening services,[1] passing tokens to external ad servers,[2][3] and producing embeddable, URL-safe ID strings for applications such as message IDs in SMTP mail servers.[4]
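The difference from Base64 in URLs can be illustrated with Python's standard library. The sketch below picks three bytes whose standard Base64 form contains `+` and `/`, then encodes the same bytes as a Base62 number. Treating the bytes as one big-endian integer is an assumption of this example (it would drop leading zero bytes), and `base62_encode` is a hypothetical helper, not a library function.

```python
import base64
import string
from urllib.parse import quote

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(n: int) -> str:
    """Encode a non-negative integer using the 0-9, A-Z, a-z alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


raw = bytes([0xFB, 0xFF, 0xFE])

b64 = base64.b64encode(raw).decode("ascii")      # standard Base64 alphabet
b62 = base62_encode(int.from_bytes(raw, "big"))  # bytes read as one integer

# quote(..., safe="") percent-encodes everything except unreserved characters.
b64_in_url = quote(b64, safe="")
b62_in_url = quote(b62, safe="")
```

For these bytes, `b64` is `"+//+"` and becomes `"%2B%2F%2F%2B"` once percent-encoded, whereas the Base62 string consists only of letters and digits and passes through `quote` unchanged.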
Base62's cleanness within restricted character sets, such as simple alphanumeric ASCII, has also led to it being suggested for use as a means of character encoding ISO 10646 / Unicode, without requiring the 8-bit clean transmission needed for UTF-8. This was put forward as a means of permitting multilingual identifiers within programming languages that would otherwise be restricted to ASCII.[5]
Implementation
Here are some implementations in various languages:
- JavaScript: https://www.npmjs.com/package/base62
- Python: https://pypi.python.org/pypi/pybase62
- Rust: https://crates.io/crates/base62
- PHP: https://github.com/tuupola/base62
References
- Matthias Kerstner (28 July 2012). "Shortening Strings (URLs) using Base 62 Encoding".
- "Base 62 Converter". Accuweather.
- US 20080147848, "Method and system for tracking a cumulative number of identifiable visitors to different objects".
- Hazel, Philip (2003). "How Exim identifies messages". The Exim SMTP Mail Server. UIT Cambridge. p. 48. ISBN 9780954452902.
- Wu, Pei-Chi (2001). "A base62 transformation format of ISO 10646 for multilingual identifiers". Software: Practice and Experience. John Wiley & Sons. doi:10.1002/spe.408.
This article "Base 62" is from Wikipedia. The list of its authors can be seen in its page history. Articles copied from the Draft namespace on Wikipedia can be found in the Draft namespace of Wikipedia, not the main one.
