RE-flex

RE/flex
Developer(s)	Robert van Engelen
Initial release	December 5, 2016; 9 years ago
Stable release	6.0 / June 22, 2025
Written in	C++
Engine
Operating system	Cross-platform
Type	Lexical analysis
License	BSD license
Website	https://github.com/Genivia/RE-flex

Search RE-flex on Amazon.

RE/flex^[1]^[2] (or RE-flex) is a computer program that generates lexical analyzers also known as "scanners" or "lexers"^[3]^[4]. Lexical analysis is the process of converting an input character stream into a sequence of tokens, a task known as lexical tokenization.

Overview

Most notable lexer generators used in practice, including Flex, Ragel, and RE/flex, are based on deterministic finite automata (DFA) for efficient pattern matching, despite the theoretical possibility of an exponential increase in DFA size^[5]. In practice, lexer specifications typically use deterministic regular expressions, which makes substantial DFA blowup uncommon.

RE/flex translates a POSIX-compliant lexer specification directly into a DFA using standard construction techniques described in the compiler literature^[6], extending the techniques to handle lazy matching and indentation detection applicable to specific programming language tokenization tasks. Like Flex, RE/flex generates efficient DFA-based scanners, but it shares no code with Flex and is implemented as a complete rewrite in C++.

In addition to its native DFA-based engine, RE/flex can also be combined with external regular expression libraries that are not DFA-based, such as the C++ standard library regex engine, PCRE, and boost.regex. This is achieved by systematically rewriting the set of lexer patterns into a form suitable for tokenization with the selected external library. RE/flex performs this rewriting automatically using translation rules that are specific to each supported regular expression library. A lexer specification defines a set of regular expression patterns ${p_{i} : i = 1, \dots, n}$ corresponding to different token classes, such as identifiers, keywords, literals, and operators. These patterns can be combined into a single regular expression $R = (p_{1}) ∣ (p_{2}) ∣ \dots ∣ (p_{n})$ . When applied to an input string, a regular expression engine repeatedly matches $R$ , returning the index $i$ of the matched subpattern $(p_{i})$ , thereby decomposing the input into a sequence of tokens.

Example use cases include:

Compiler construction, such as the use of RE/flex in the Tiger Compiler project within the EPITA compiler construction curriculum
Compiler-compiler systems, including its use in Ox, an attribute-grammar–based compiling system^[7]
Pattern matching and search tools, such as grep-like utilities, including the use of RE/flex in ugrep

References

↑ van Engelen, Robert (April 15, 2017). "Constructing Fast Lexical Analyzers with RE/flex - Why Another Scanner Generator?". CodeProject.
↑ Heng, Christopher (December 27, 2018). "Free Compiler Construction Tools".
↑ Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2nd ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7. Search this book on
↑ Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1. Search this book on
↑ Li, Angela; Mamouras, Konstantinos (2025). "Efficient Algorithms for the Uniform Tokenization Problem" (PDF). Proceedings of the ACM on Programming Languages. ACM. doi:10.1145/3720498. Retrieved 12 January 2026.
↑ Aho, Alfred; Sethi, Ravi; Ullman, Jeffrey (1986). Compilers: Principles, Techniques and Tools. Addison-Wesley. ISBN 9780201100884. Search this book on
↑ Bischoff, Kurt. Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C (PDF). Search this book on

Category:Compiling tools Category:Finite automata Category:Regular expressions Category:Free software programmed in C++ Category:Software using the BSD license Category:Lexical analysis

This article "RE-flex" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:RE-flex. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.

[1] van Engelen, Robert (April 15, 2017). "Constructing Fast Lexical Analyzers with RE/flex - Why Another Scanner Generator?". CodeProject.

[2] Heng, Christopher (December 27, 2018). "Free Compiler Construction Tools".

[3] Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2nd ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7. Search this book on

[4] Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1. Search this book on

[5] Li, Angela; Mamouras, Konstantinos (2025). "Efficient Algorithms for the Uniform Tokenization Problem" (PDF). Proceedings of the ACM on Programming Languages. ACM. doi:10.1145/3720498. Retrieved 12 January 2026.

[6] Aho, Alfred; Sethi, Ravi; Ullman, Jeffrey (1986). Compilers: Principles, Techniques and Tools. Addison-Wesley. ISBN 9780201100884. Search this book on

[7] Bischoff, Kurt. Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C (PDF). Search this book on

[1]

[2]

[3]

[4]

[5]

[6]

[7]

RE-flex

Overview

See also

References

📰 Article(s) of the same category(ies)[edit]