RE-flex
| Developer(s) | Robert van Engelen |
|---|---|
| Initial release | December 5, 2016 |
| Stable release | 6.0
/ June 22, 2025 |
| Written in | C++ |
| Engine | |
| Operating system | Cross-platform |
| Type | Lexical analysis |
| License | BSD license |
| Website | https://github.com/Genivia/RE-flex |
Search RE-flex on Amazon.
RE/flex[1][2] (or RE-flex) is a computer program that generates lexical analyzers also known as "scanners" or "lexers"[3][4]. Lexical analysis is the process of converting an input character stream into a sequence of tokens, a task known as lexical tokenization.
Overview
Most notable lexer generators used in practice, including Flex, Ragel, and RE/flex, are based on deterministic finite automata (DFA) for efficient pattern matching, despite the theoretical possibility of an exponential increase in DFA size[5]. In practice, lexer specifications typically use deterministic regular expressions, which makes substantial DFA blowup uncommon.
RE/flex translates a POSIX-compliant lexer specification directly into a DFA using standard construction techniques described in the compiler literature[6], extending the techniques to handle lazy matching and indentation detection applicable to specific programming language tokenization tasks. Like Flex, RE/flex generates efficient DFA-based scanners, but it shares no code with Flex and is implemented as a complete rewrite in C++.
In addition to its native DFA-based engine, RE/flex can also be combined with external regular expression libraries that are not DFA-based, such as the C++ standard library regex engine, PCRE, and boost.regex. This is achieved by systematically rewriting the set of lexer patterns into a form suitable for tokenization with the selected external library. RE/flex performs this rewriting automatically using translation rules that are specific to each supported regular expression library. A lexer specification defines a set of regular expression patterns corresponding to different token classes, such as identifiers, keywords, literals, and operators. These patterns can be combined into a single regular expression . When applied to an input string, a regular expression engine repeatedly matches , returning the index i of the matched subpattern , thereby decomposing the input into a sequence of tokens.
Example use cases include:
- Compiler construction, such as the use of RE/flex in the Tiger Compiler project within the EPITA compiler construction curriculum
- Compiler-compiler systems, including its use in Ox, an attribute-grammar–based compiling system[7]
- Pattern matching and search tools, such as grep-like utilities, including the use of RE/flex in ugrep
See also
References
- ↑ van Engelen, Robert (April 15, 2017). "Constructing Fast Lexical Analyzers with RE/flex - Why Another Scanner Generator?". CodeProject.
- ↑ Heng, Christopher (December 27, 2018). "Free Compiler Construction Tools".
- ↑ Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2nd ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7. Search this book on
- ↑ Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1. Search this book on
- ↑ Li, Angela; Mamouras, Konstantinos (2025). "Efficient Algorithms for the Uniform Tokenization Problem" (PDF). Proceedings of the ACM on Programming Languages. ACM. doi:10.1145/3720498. Retrieved 12 January 2026.
- ↑ Aho, Alfred; Sethi, Ravi; Ullman, Jeffrey (1986). Compilers: Principles, Techniques and Tools. Addison-Wesley. ISBN 9780201100884. Search this book on
- ↑ Bischoff, Kurt. Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C (PDF). Search this book on
Category:Compiling tools Category:Finite automata Category:Regular expressions Category:Free software programmed in C++ Category:Software using the BSD license Category:Lexical analysis
This article "RE-flex" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:RE-flex. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
