You can edit almost every page by Creating an account and confirming your email.

RE-flex

From EverybodyWiki Bios & Wiki





RE/flex
Developer(s)Robert van Engelen
Initial releaseDecember 5, 2016; 9 years ago (2016-12-05)
Stable release
6.0 / June 22, 2025 (2025-06-22)
Written inC++
Engine
    Operating systemCross-platform
    TypeLexical analysis
    LicenseBSD license
    Websitehttps://github.com/Genivia/RE-flex

    Search RE-flex on Amazon.

    RE/flex[1][2] (or RE-flex) is a computer program that generates lexical analyzers also known as "scanners" or "lexers"[3][4]. Lexical analysis is the process of converting an input character stream into a sequence of tokens, a task known as lexical tokenization.

    Overview

    Most notable lexer generators used in practice, including Flex, Ragel, and RE/flex, are based on deterministic finite automata (DFA) for efficient pattern matching, despite the theoretical possibility of an exponential increase in DFA size[5]. In practice, lexer specifications typically use deterministic regular expressions, which makes substantial DFA blowup uncommon.

    RE/flex translates a POSIX-compliant lexer specification directly into a DFA using standard construction techniques described in the compiler literature[6], extending the techniques to handle lazy matching and indentation detection applicable to specific programming language tokenization tasks. Like Flex, RE/flex generates efficient DFA-based scanners, but it shares no code with Flex and is implemented as a complete rewrite in C++.

    In addition to its native DFA-based engine, RE/flex can also be combined with external regular expression libraries that are not DFA-based, such as the C++ standard library regex engine, PCRE, and boost.regex. This is achieved by systematically rewriting the set of lexer patterns into a form suitable for tokenization with the selected external library. RE/flex performs this rewriting automatically using translation rules that are specific to each supported regular expression library. A lexer specification defines a set of regular expression patterns {pi:i=1,,n} corresponding to different token classes, such as identifiers, keywords, literals, and operators. These patterns can be combined into a single regular expression R=(p1)(p2)(pn). When applied to an input string, a regular expression engine repeatedly matches R, returning the index i of the matched subpattern (pi), thereby decomposing the input into a sequence of tokens.

    Example use cases include:

    See also

    References

    1. van Engelen, Robert (April 15, 2017). "Constructing Fast Lexical Analyzers with RE/flex - Why Another Scanner Generator?". CodeProject.
    2. Heng, Christopher (December 27, 2018). "Free Compiler Construction Tools".
    3. Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2nd ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7. Search this book on
    4. Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1. Search this book on
    5. Li, Angela; Mamouras, Konstantinos (2025). "Efficient Algorithms for the Uniform Tokenization Problem" (PDF). Proceedings of the ACM on Programming Languages. ACM. doi:10.1145/3720498. Retrieved 12 January 2026.
    6. Aho, Alfred; Sethi, Ravi; Ullman, Jeffrey (1986). Compilers: Principles, Techniques and Tools. Addison-Wesley. ISBN 9780201100884. Search this book on
    7. Bischoff, Kurt. Ox: An Attribute-Grammar Compiling System based on Yacc, Lex, and C (PDF). Search this book on

    Category:Compiling tools Category:Finite automata Category:Regular expressions Category:Free software programmed in C++ Category:Software using the BSD license Category:Lexical analysis



    This article "RE-flex" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:RE-flex. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.