
Generative Flow Network


Generative Flow Network (GFlowNet) is a probabilistic modeling framework based on neural networks. It adjusts its weights to generate diverse explanatory hypotheses by sampling from a Bayesian posterior over the possible explanations for data acquired from the real world[1][2]. Using both the predictions of a world model and the uncertainty around those predictions, it is possible to construct a reward function which acts as a proxy for the actual values obtained from the real world. The evaluation process consists of several stages ranging from numerical simulations to expensive experiments, filtering candidates with progressively higher-fidelity oracles[3] that measure different aspects of the usefulness of a hypothesis. In its most basic form, the GFlowNet training objective is to match a reward function rather than to fit a finite dataset (as typical generative models do), which increases uncertainty in the model's predictions. In the context of active learning, this uncertainty can be a strong signal to guide exploration in novel parts of the search space.

The design of GFlowNet lives at the intersection of reinforcement learning, deep generative models and energy-based probabilistic modeling. The framework enables neural networks to model distributions over data structures and to sample objects with probability proportional to a given positive reward for each object, i.e., an approximation of the number of bits of useful information to be gained from an experiment.

GFlowNet was introduced at NeurIPS in 2021[4], and opens new doors for generative active learning to extract explanatory causal factors and to search in an exponentially large space of hypotheses.

Background[edit]

The motivation behind GFlowNet is the black-box optimization setting[5] in which the learner has access to an oracle that returns a reward for a batch of hypotheses. Diversity becomes a key consideration in the design of these hypotheses, particularly in cases where the oracle itself is uncertain.

A standard way to apply machine learning in exploration settings[6] where calling the real world is expensive is to take the data already collected and train a supervised learning proxy that predicts the oracle's output for a candidate hypothesis. A variant of this proxy which incorporates the uncertainty about its value[7][8] is then leveraged as a reward function to train a generative model that will sample hypotheses for the next round of experiments. Searching for a single hypothesis that maximizes this reward is not sufficient, because the final goal is to find many novel hypotheses that are different from the ones that are already known.
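
The outer loop described above can be sketched as follows. This is only a minimal illustration under assumed stand-ins: the synthetic oracle, the bootstrap ensemble used as an uncertainty-aware proxy, and the random candidate generator are all hypothetical, and in a real setting the generator would be a GFlowNet trained to sample in proportion to the proxy reward.

    # Minimal sketch of the proxy-based active learning loop (illustrative stand-ins).
    import numpy as np

    rng = np.random.default_rng(0)

    def oracle(x):
        # Expensive real-world evaluation (stand-in: a noisy synthetic function).
        return np.exp(-np.sum((x - 0.7) ** 2)) + 0.01 * rng.normal()

    def fit_proxy_ensemble(X, y, n_models=5):
        # Bootstrap ensemble of linear proxies; the spread across members
        # serves as a crude uncertainty estimate.
        models = []
        for _ in range(n_models):
            idx = rng.integers(0, len(X), len(X))
            coef, *_ = np.linalg.lstsq(np.c_[X[idx], np.ones(len(idx))], y[idx], rcond=None)
            models.append(coef)
        return models

    def proxy_reward(models, x, kappa=1.0):
        # UCB-style acquisition: mean prediction plus kappa times the uncertainty.
        preds = np.array([m[:-1] @ x + m[-1] for m in models])
        return preds.mean() + kappa * preds.std()

    # Seed dataset from a few expensive oracle calls.
    X = rng.random((8, 2))
    y = np.array([oracle(x) for x in X])

    for round_ in range(3):
        models = fit_proxy_ensemble(X, y)
        # Stand-in generator: random candidates scored by the proxy reward.
        # A GFlowNet would instead sample candidates with probability proportional to that reward.
        candidates = rng.random((64, 2))
        scores = np.array([proxy_reward(models, c) for c in candidates])
        batch = candidates[np.argsort(scores)[-4:]]   # next batch of experiments
        X = np.vstack([X, batch])
        y = np.concatenate([y, [oracle(b) for b in batch]])
        print(f"round {round_}: best oracle value so far = {y.max():.3f}")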

The advantage of GFlowNet is that the computational cost is linear in the size of the batch, while obtaining a high diversity of samples, especially when the environment from which samples are generated has an exponentially large search space.

Diverse set of hypotheses:[edit]

While standard return maximization in reinforcement learning is designed to search for a single candidate that maximizes the reward function[9], which results in poor diversity, GFlowNet is trained to sample a diverse set of hypotheses whose probability is proportional to a given positive return or reward function R(x). This is useful in tasks where exploration is important and cheap oracles may not reflect the future outcomes well. To maximize the chances that at least one of the hypotheses will work, it is important for these hypotheses to capture the modes of the imperfect reward function, which improves the likelihood of discovering a candidate that can satisfy many evaluation criteria.
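
The following toy example, with made-up rewards over four hypotheses, contrasts the two behaviours: return maximization collapses onto a single candidate, whereas sampling each object with probability proportional to R(x), which is what a perfectly trained GFlowNet would do, spreads the samples across all high-reward modes.

    # Toy contrast between argmax selection and reward-proportional sampling.
    import numpy as np

    rng = np.random.default_rng(0)

    hypotheses = np.array(["A", "B", "C", "D"])
    reward = np.array([10.0, 9.0, 8.5, 0.1])   # three good modes, one poor candidate

    # Return maximization always returns the single best candidate.
    best = hypotheses[np.argmax(reward)]

    # A perfectly trained GFlowNet samples each object with probability R(x) / Z.
    p = reward / reward.sum()
    samples = rng.choice(hypotheses, size=1000, p=p)

    print("argmax:", best)
    print("sample frequencies:", {h: int((samples == h).sum()) for h in hypotheses})
    # The sampler spreads its draws across A, B and C, preserving diversity.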

Highly-separated modes:[edit]

GFlowNet generalizes from seen modes of the reward function to explore new modes, and can jump directly to these modes by sampling from the learned distribution. In contrast, Markov chain Monte Carlo (MCMC) methods have to search stochastically for new modes of the energy function[10] by making many small random-walk-like perturbations. In a high-dimensional space where the modes are highly separated, it becomes exponentially unlikely to discover new isolated modes, because most random paths do not lead near them. The challenge of finding these highly separated modes, i.e., of mixing between modes, is the main drawback of MCMC.

Lower computational cost:[edit]

While MCMC methods can achieve diversity[11][12], they are expensive and in most cases only perform local exploration. In contrast, GFlowNet amortizes the cost of search (which may require a lengthy chain with MCMC methods) into the cost of training the generative model, so that drawing samples after training is cheap compared to MCMC.

Architecture[edit]

The training objective in GFlowNet is to match a reward function rather than to fit a finite dataset, which makes it possible to design larger neural network architectures without worrying about overfitting.

GFlowNet focuses on the specific machine learning problem of turning a given reward function into a generative policy which samples with a probability proportional to the reward. It views the probability assigned to an action given a state as the flow associated with a network whose nodes are states, and whose outgoing edges from each node are deterministic transitions driven by an action. A trained GFlowNet has an action space for deciding what to do at each state, and a state space for the partially constructed objects. Each sequence of constructive actions forms a trajectory visiting a sequence of states (s_0, s_1, …, s_n), and there can be many trajectories that lead to the same state s. The terminal state of a trajectory is a complete object x that GFlowNet is able to sample with probability proportional to the reward function R(x).

Neural networks:[edit]

The most basic GFlowNet architecture is one where a neural network outputs a stochastic policy P_F(a | s), where s is the partially constructed object, a is one of the possible actions from s leading to a new state s′, and P_F is the forward-sampling policy. The same neural net is used at every step, but it produces a stochastic output a_t, from which a next state s_{t+1} = T(s_t, a_t) is obtained (where the transition function T is application-specific). Since s is a variable-size object, the neural network should have an architecture appropriate for taking such objects as input, e.g., a recurrent neural network[13], a graph neural network[14] or a transformer[15]. Thus the application of GFlowNet can be viewed as a particular form of recurrent stochastic network in which the hidden recurrent state is stochastic.
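
A minimal sketch of such a policy network in PyTorch is given below, assuming for simplicity that states are already encoded as fixed-size vectors; the class name, layer sizes and the categorical sampling step are illustrative choices, and a real application would replace the input encoding with one of the architectures mentioned above.

    # Sketch of a GFlowNet forward-sampling policy (illustrative, fixed-size state encoding).
    import torch
    import torch.nn as nn

    class ForwardPolicy(nn.Module):
        def __init__(self, state_dim, n_actions, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, n_actions),   # one logit per constructive action
            )

        def forward(self, state):
            return self.net(state)              # unnormalized action logits

        def sample_action(self, state):
            # Stochastic policy: sample an action a_t from the softmax over the logits.
            dist = torch.distributions.Categorical(logits=self.forward(state))
            action = dist.sample()
            return action, dist.log_prob(action)

    policy = ForwardPolicy(state_dim=16, n_actions=5)
    s0 = torch.zeros(1, 16)                     # e.g., the encoding of the empty object
    action, logp = policy.sample_action(s0)
    print(action.item(), logp.item())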

Flows:[edit]

The total flow into the network is the sum of the rewards at the terminal states and can be shown to equal the flow at the initial state. A GFlowNet converges when the incoming and outgoing flows of each state match. The framework is named after flows by analogy with flow networks, where water enters the network in some places and flows out of the network through terminal nodes.

Directed Acyclic Graph:[edit]

GFlowNet requires the state graph to be directed and acyclic, meaning that actions are constructive and cannot be reverted. To undo an action, it is possible to sample the action removal from the backward-sampling policy, which picks a parent of a node in the DAG, while the forward-sampling policy picks a child of a node. The directed acyclic graph (DAG) contains all the trajectories that construct all the possible objects x. Each node in the DAG corresponds to a state s and each edge s → s′ corresponds to a constructive action a executed from state s and leading to the next state s′.
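
A small sketch of a constructive environment, using an illustrative state space of element sets, shows why the state graph is a DAG and why several trajectories can reach the same state:

    # Illustrative constructive environment: states are sets of elements,
    # actions add one element, and a stop action terminates the trajectory.
    STOP = "stop"
    ELEMENTS = ["a", "b", "c"]

    def allowed_actions(state):
        # Constructive actions only: add an element not yet present, or stop.
        return [e for e in ELEMENTS if e not in state] + [STOP]

    def step(state, action):
        # Adding elements is order-insensitive, so different trajectories
        # (e.g., add "a" then "b", or "b" then "a") lead to the same state;
        # the backward policy would pick a parent by removing one element.
        if action == STOP:
            return state, True                  # terminal state x
        return state | frozenset([action]), False

    state, done = frozenset(), False
    trajectory = [state]
    for action in ["a", "b", STOP]:             # one possible trajectory
        state, done = step(state, action)
        trajectory.append(state)
    print(trajectory, done)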

Flow-matching constraint:[edit]

Consider an initial state s_0 from which all trajectories start, and a special terminating action taken in a state s that forces the transition to a corresponding terminal state x. The total amount of flow entering the DAG is the flow going through the initial state s_0 and is then distributed across all the terminal states; a trained GFlowNet achieves the constraint that the amount of flow reaching each terminal state x is R(x).

Training GFlowNet is done by estimating the flow functions F(s) and F(s → s′), which specify the amount of flow going through states and transitions, given that the desired flow at each terminal state x is R(x). The stochastic policy associated with GFlowNet corresponds to the flows on the outgoing transitions, normalized:

P_F(s′ | s) = F(s → s′) / F(s)   [16]

The equation says that the probability of the transition from state s to state s′ is just the flow of that edge normalized by the total flow over all the outgoing edges of s. Thus, the forward policy P_F can be used to sample trajectories, starting at the initial state s_0. Similarly, the backward policy is

P_B(s | s′) = F(s → s′) / F(s′)   [17]

There is a flow function F that defines both P_F and P_B whenever F meets the flow-matching constraint, which is not necessarily true when F is estimated by a neural network that has not fully completed training and brought the training loss to 0 everywhere.[18]
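
The flow-matching constraint can be turned into a training loss by penalizing, at every non-initial state, the mismatch between the logarithm of the incoming flow and the logarithm of the reward plus outgoing flow. The sketch below optimizes free per-edge log-flows on a tiny enumerated DAG; the graph, the rewards and the hyperparameters are illustrative, and a real GFlowNet would parameterize the flows with a neural network.

    # Sketch of a flow-matching loss on a tiny enumerated DAG (illustrative).
    import torch

    # States: s0 -> {a}, s0 -> {b}, {a} -> {a,b}, {b} -> {a,b}; rewards at "a", "b", "ab".
    edges = [("s0", "a"), ("s0", "b"), ("a", "ab"), ("b", "ab")]
    reward = {"a": 1.0, "b": 2.0, "ab": 3.0}

    log_flow = torch.nn.Parameter(torch.zeros(len(edges)))   # log F(s -> s') per edge
    opt = torch.optim.Adam([log_flow], lr=0.1)

    for _ in range(500):
        F = log_flow.exp()
        loss = torch.tensor(0.0)
        for s in ["a", "b", "ab"]:
            inflow = sum((F[i] for i, (u, v) in enumerate(edges) if v == s), torch.tensor(0.0))
            outflow = sum((F[i] for i, (u, v) in enumerate(edges) if u == s), torch.tensor(0.0))
            # Flow-matching constraint: inflow = R(s) + outflow at every non-initial state.
            loss = loss + (torch.log(inflow) - torch.log(outflow + reward[s])) ** 2
        opt.zero_grad()
        loss.backward()
        opt.step()

    # The flow leaving s0 should approach the total reward Z = 1 + 2 + 3 = 6.
    F = log_flow.exp()
    print("flow out of s0:", (F[0] + F[1]).item())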

Applications[edit]

GFlowNet has the potential to find real-world applications in the field of iterative scientific discovery[19], where exact sampling is intractable and local exploration methods such as MCMC perform poorly, but diversity of samples is desired (so local optimization is also not a good option). Applications include:

Drug Discovery[20]

Energy Development

Language Modeling

Longevity Medicine[21]

Material Science

Telco (6G Networks)[22]

Space Exploration

Implementations[edit]

GFlowNet is implemented using PyTorch; the code is available on GitHub under the MIT license.

See also[edit]

Bayesian inference

Gaussian process

Markov Chain Tree Theorem

Thompson Sampling



  1. Bengio, Emmanuel; Jain, Moksh; Korablyov, Maksym; Precup, Doina; Bengio, Yoshua (2021-11-19). "Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation". arXiv:2106.04399 [cs.LG].
  2. Deleu, Tristan; Góis, António; Emezue, Chris; Rankawat, Mansi; Lacoste-Julien, Simon; Bauer, Stefan; Bengio, Yoshua (2022-06-28). "Bayesian Structure Learning with Generative Flow Networks". arXiv:2202.13903 [cs.LG].
  3. Seca, Diogo (2021-05-04). "A Review on Oracle Issues in Machine Learning". arXiv:2105.01407 [cs.LG].
  4. Emmanuel, Bengio; Moksh, Jain; Maksym, Korablyov; Doina, Precup; Yoshua, Bengio (2021-12-06). "Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation". Advances in Neural Information Processing Systems. 34. arXiv:2106.04399.
  5. Terayama, Kei; Sumita, Masato; Tamura, Ryo; Tsuda, Koji (2021-03-16). "Black-Box Optimization for Automated Discovery". Accounts of Chemical Research. 54 (6): 1334–1346. doi:10.1021/acs.accounts.0c00713. ISSN 0001-4842. PMID 33635621.
  6. Angermueller, Christof; Belanger, David; Gane, Andreea; Mariet, Zelda; Dohan, David; Murphy, Kevin; Colwell, Lucy; Sculley, D. (2020-11-21). "Population-Based Black-Box Optimization for Biological Sequence Design". International Conference on Machine Learning. PMLR: 324–334. arXiv:2006.03227.
  7. de Freitas, Nando; Smola, Alex; Zoghi, Masrour (2012-06-27). "Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations". arXiv:1206.6457 [cs.LG].
  8. Negoescu, Diana M.; Frazier, Peter I.; Powell, Warren B. (2011-08-01). "The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery". INFORMS Journal on Computing. 23 (3): 346–363. doi:10.1287/ijoc.1100.0417. ISSN 1091-9856.
  9. Bachman, Philip; Precup, Doina (2015). "Data Generation as Sequential Decision Making". Advances in Neural Information Processing Systems. Curran Associates, Inc. 28. arXiv:1506.03504.
  10. Nijkamp, Erik; Hill, Mitch; Han, Tian; Zhu, Song-Chun; Wu, Ying Nian (2019-11-27). "On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models". arXiv:1903.12370 [stat.ML].
  11. Grathwohl, Will; Swersky, Kevin; Hashemi, Milad; Duvenaud, David; Maddison, Chris (2021-07-01). "Oops I Took A Gradient: Scalable Sampling for Discrete Distributions". International Conference on Machine Learning. PMLR: 3831–3841. arXiv:2102.04509.
  12. Hanjun, Dai; Rishabh, Singh; Bo, Dai; Charles, Sutton; Dale, Schuurmans (2020). "Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration". Advances in Neural Information Processing Systems. 33. arXiv:2011.05363.
  13. Graves, Alex; Schmidhuber, Jürgen (2008). "Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks". Advances in Neural Information Processing Systems. Curran Associates, Inc. 21.
  14. Scarselli, Franco; Gori, Marco; Tsoi, Ah Chung; Hagenbuchner, Markus; Monfardini, Gabriele (2009). "The Graph Neural Network Model". IEEE Transactions on Neural Networks. 20 (1): 61–80. doi:10.1109/TNN.2008.2005605. PMID 19068426.
  15. Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Lukasz; Polosukhin, Illia (2017-12-05). "Attention Is All You Need". arXiv:1706.03762 [cs.CL].
  16. Bengio, Yoshua; Deleu, Tristan; Hu, Edward J.; Lahlou, Salem; Tiwari, Mo; Bengio, Emmanuel (2022-04-07). "GFlowNet Foundations". arXiv:2111.09266 [cs.LG].
  17. Bengio, Yoshua; Deleu, Tristan; Hu, Edward J.; Lahlou, Salem; Tiwari, Mo; Bengio, Emmanuel (2022-04-07). "GFlowNet Foundations". arXiv:2111.09266 [cs.LG].
  18. Malkin, Nikolay; Jain, Moksh; Bengio, Emmanuel; Sun, Chen; Bengio, Yoshua (2022-01-31). "Trajectory Balance: Improved Credit Assignment in GFlowNets". arXiv:2201.13259 [cs.LG].
  19. Yoshua Bengio: Generative Flow Networks | IACS Distinguished Lecturer, retrieved 2022-08-02
  20. Jain, Moksh; Bengio, Emmanuel; Garcia, Alex-Hernandez; Rector-Brooks, Jarrid; Dossou, Bonaventure F. P.; Ekbote, Chanakya; Fu, Jie; Zhang, Tianyu; Kilgour, Micheal; Zhang, Dinghuai; Simine, Lena (2022-03-02). "Biological Sequence Design with GFlowNets". arXiv:2203.04115 [q-bio.BM].
  21. Jain, Moksh; Bengio, Emmanuel; Garcia, Alex-Hernandez; Rector-Brooks, Jarrid; Dossou, Bonaventure F. P.; Ekbote, Chanakya; Fu, Jie; Zhang, Tianyu; Kilgour, Micheal; Zhang, Dinghuai; Simine, Lena (2022-03-02). "Biological Sequence Design with GFlowNets". arXiv:2203.04115 [q-bio.BM].
  22. Thomas, Christo Kurisummoottil; Saad, Walid (2022-05-22). "Neuro-Symbolic Artificial Intelligence (AI) for Intent based Semantic Communication". arXiv:2205.10768 [cs.LG].