Introduction
Schlandals is a state-of-the-art Projected Weighted Model Counter specialized for probabilistic inference over discrete probability distributions. Currently, there are known modelizations for the following problems:
- Computing the marginal probabilities of a variable in a Bayesian Network
- Computing the probability that two nodes are connected in a probabilistic graph
- Computing the probability of ProbLog programs
For more information on how to use Schlandals and its mechanics, check the documentation (still under construction). You can cite Schlandals using the following BibTeX entry:
@InProceedings{dubray_et_al:LIPIcs.CP.2023.15,
author = {Dubray, Alexandre and Schaus, Pierre and Nijssen, Siegfried},
title = {{Probabilistic Inference by Projected Weighted Model Counting on Horn Clauses}},
booktitle = {29th International Conference on Principles and Practice of Constraint Programming (CP 2023)},
pages = {15:1--15:17},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-300-3},
ISSN = {1868-8969},
year = {2023},
volume = {280},
editor = {Yap, Roland H. C.},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/19052},
URN = {urn:nbn:de:0030-drops-190520},
doi = {10.4230/LIPIcs.CP.2023.15},
annote = {Keywords: Model Counting, Bayesian Networks, Probabilistic Networks}
}
Installation
Before installing Schlandals, be sure to have the Rust toolchain installed.
From source (recommended)
We assume that the commands are run from $CURRENT_DIR.
git clone git@github.com:aia-uclouvain/schlandals.git && cd schlandals && cargo build --release
ln -s $CURRENT_DIR/schlandals/target/release/schlandals $HOME/.local/bin
This compiles the code in release mode (with optimizations included) and adds a symbolic link to the local binaries directory.
Using Cargo
Note that the code is still actively developed, so the version on crates.io should not be expected to be the most up-to-date. The main branch of Schlandals should be stable enough for most use cases.
cargo install schlandals
This will put the schlandals executable in ~/.cargo/bin/.
Building Schlandals with Torch support
Schlandals supports learning parameters using Torch tensors. We use the tch-rs crate for the bindings to libtorch tensors; this feature can be enabled by adding --features tensor as a flag to the install (cargo install schlandals --features tensor) or build (cargo build --release --features tensor) commands.
Note that this requires libtorch to be installed on your system and some environment variables to be set. For more information, see the tch install page.
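As an illustration, a typical setup on Linux might look like the following sketch. The paths are hypothetical and the exact variables depend on your libtorch installation and tch-rs version, so refer to the tch install page for the authoritative instructions.
# Hypothetical example paths; adjust to where libtorch is actually installed.
export LIBTORCH=/opt/libtorch            # root of the libtorch distribution
export LD_LIBRARY_PATH=$LIBTORCH/lib:$LD_LIBRARY_PATH
cargo build --release --features tensor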
Modelization
This section explains how the problems must be encoded for the Schlandals solver.
The solver supports multiple file formats, but they rely on the same principle: the probabilistic model (i.e., its structure and parameters) is encoded in one file, and the query, encoded as evidence (i.e., an assignment of some of the model's variables), is given in a separate file.
Alternatively, for each problem, the evidence can be given as a string in the command line arguments. For example, to compute a marginal probability in a Bayesian network specified in the UAI format, one can use schlandals -i bn.uai --evidence "2 1 0", where the evidence string is UAI-style formatted.
You can find in the following pages a description of the supported file formats for each problem that can be solved using Schlandals. If you would like to use a file format that is not yet supported, consider opening an issue. For a more formal description of the Schlandals language, see [1]. Notice that Schlandals works on CNF formulas; hence, all the file formats are, in the end, transformed into CNF (the DIMACS file format directly represents the formulas used).
- Bayesian network
- Reachability in probabilistic graphs
[1] Alexandre Dubray, Pierre Schaus, and Siegfried Nijssen. Probabilistic Inference by Projected Weighted Model Counting on Horn Clauses. In 29th International Conference on Principles and Practice of Constraint Programming (CP 2023), 2023
Modelizing Bayesian Networks
A Bayesian Network (BN) can be seen as a structural way to encode the conditional independences of a joint probability distribution. Let \( \mathcal{V} = \{V_1, \ldots, V_n\} \) be random variables; a Bayesian network is then a directed acyclic graph over these variables, as shown below.
This BN represents a joint probability distribution over binary variables A, B, C, and D. The directed edges encode the dependencies; for example, B and C depend on A. In particular, we have that B and C are independent given A. Each variable has an associated conditional probability table (CPT) that gives its probability distribution given the values of its parents. The probabilities are called the parameters of the BN and are denoted by \( \theta \).
Currently, Schlandals is able to compute marginal probabilities in the BN (e.g., queries like \( P(C = \top) \) or \( P(C = \top, B = \bot) \)).
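For this example network (assuming, as in the encoding below, the structure \( A \rightarrow B \), \( A \rightarrow C \), and \( \{B, C\} \rightarrow D \)), the encoded independences yield the usual BN factorization, and a marginal query sums out the unobserved variables:
\[ P(A, B, C, D) = P(A)\,P(B \mid A)\,P(C \mid A)\,P(D \mid B, C) \]
\[ P(C = \top) = \sum_{a, b, d} P(A = a)\,P(B = b \mid A = a)\,P(C = \top \mid A = a)\,P(D = d \mid B = b, C = \top) \]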
Dimacs file format
Notice that our encoding is very similar to WMC encodings; for more information about the WMC encodings used in our experiments, see [1,2]. One particularity of Schlandals is that distributions are first-class citizens: they are not transformed into clauses. Hence, we extended the classical DIMACS format to be able to define distributions, i.e., sets of mutually exclusive weighted variables whose weights sum up to 1. The variables inside the distributions always have the lowest indexes (1, 2, ...).
As an example, let us consider the following network
Header
The header is a single line in the format
p cnf <number variables> <number clauses>
The distributions
The distributions defined in the BN (\( P(A), P(B \mid A = a_0), P(B \mid A = a_1), \ldots \)) are directly encoded in the file
c p distribution 0.2 0.8
c p distribution 0.6 0.4
c p distribution 0.3 0.7
...
Notice that these lines start with c but are not comments.
Moreover, the variables associated with the distributions are not explicitly declared; they are assigned in order.
That is, the first distribution (\( P(A) \)) is composed of variables 1 and 2 (variables in the DIMACS format start at 1), distribution \( P(B \mid A = a_0) \) is composed of variables 3 and 4, etc.
Deterministic variables
The remaining variables (i.e., number variables minus the number of variables used for the distributions) are deterministic variables (without a weight).
There is one variable for each value of each network variable.
For example, 18 variables are needed to encode the distributions, hence variable 19 will be \( a_0 \) and variable 20 \( a_1 \).
In total there are 18 + 8 = 26 variables. As of now, our file looks like
p cnf 26 <number clauses>
c p distribution 0.2 0.8
c p distribution 0.6 0.4
c p distribution 0.3 0.7
c p distribution 0.25 0.75
c p distribution 0.75 0.25
c p distribution 1.0 0.0
c p distribution 0.35 0.65
c p distribution 0.8 0.2
c p distribution 0.0 1.0
The Clauses
The clauses encode the implicit constraints of the CPTs. As an example, let us consider the first line of the CPT for variable B. This line is encoded with the two following constraints, written in English:
- If \( A \) takes value \( a_0 \) and entry \( 0.6 \) is selected, then \( B \) must take value \( b_0 \)
- If \( A \) takes value \( a_0 \) and entry \( 0.4 \) is selected, then \( B \) must take value \( b_1 \)
Using the variables defined above, this can be encoded as
- \( 19 \land 3 \Rightarrow 21 \Leftrightarrow \lnot 19 \lor \lnot 3 \lor 21 \)
- \( 19 \land 4 \Rightarrow 22 \Leftrightarrow \lnot 19 \lor \lnot 4 \lor 22 \)
A small script generating such clauses is sketched after the file listing below.
The file now looks like
p cnf 26 <number clauses>
c p distribution 0.2 0.8
c p distribution 0.6 0.4
c p distribution 0.3 0.7
c p distribution 0.25 0.75
c p distribution 0.75 0.25
c p distribution 1.0 0.0
c p distribution 0.35 0.65
c p distribution 0.8 0.2
c p distribution 0.0 1.0
c CPT for A.
-1 19 0
-2 20 0
c CPT for B.
-3 -19 21 0
-4 -19 22 0
-5 -20 21 0
-6 -20 22 0
c CPT for C.
-7 -19 23 0
-8 -19 24 0
-9 -20 23 0
-10 -20 24 0
c CPT for D.
-11 -21 -23 25 0
-12 -21 -23 26 0
-13 -21 -24 25 0
-14 -21 -24 26 0
-15 -22 -23 25 0
-16 -22 -23 26 0
-17 -22 -24 25 0
-18 -22 -24 26 0
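To make the mapping concrete, here is a minimal Python sketch (illustrative only; the function name and structure are our own, not part of Schlandals) that emits the Horn clauses for one CPT row:
def cpt_row_clauses(dist_vars, parent_indicators, child_indicators):
    """Emit DIMACS clauses for one CPT row.

    dist_vars: distribution variables of this row, one per child value
               (e.g., [3, 4] for P(B | A = a_0)).
    parent_indicators: indicator variables selecting this row
               (e.g., [19] for A = a_0).
    child_indicators: indicator variables of the child values, in the
               same order as dist_vars (e.g., [21, 22] for b_0, b_1).
    """
    clauses = []
    for dist_var, child in zip(dist_vars, child_indicators):
        # parents AND dist_var => child, i.e., (-parents OR -dist_var OR child)
        clauses.append([-p for p in parent_indicators] + [-dist_var, child])
    return clauses

# First row of the CPT for B: P(B | A = a_0) uses variables 3 and 4.
for clause in cpt_row_clauses([3, 4], [19], [21, 22]):
    print(" ".join(map(str, clause)) + " 0")  # -19 -3 21 0 and -19 -4 22 0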
The evidences
Evidences represent knowledge about some part of the probabilistic model. For a Bayesian network, this means that some variables are known to take some values (e.g., \( D = d_0 \)). Such evidence is encoded, in Schlandals, using the deterministic variables: we add the constraint \( \lnot d_1 \) (i.e., the unit clause -26 below). With this evidence, the final file looks like
p cnf 26 19
c p distribution 0.2 0.8
c p distribution 0.6 0.4
c p distribution 0.3 0.7
c p distribution 0.25 0.75
c p distribution 0.75 0.25
c p distribution 1.0 0.0
c p distribution 0.35 0.65
c p distribution 0.8 0.2
c p distribution 0.0 1.0
c CPT for A.
-1 19 0
-2 20 0
c CPT for B.
-3 -19 21 0
-4 -19 22 0
-5 -20 21 0
-6 -20 22 0
c CPT for C.
-7 -19 23 0
-8 -19 24 0
-9 -20 23 0
-10 -20 24 0
c CPT for D.
-11 -21 -23 25 0
-12 -21 -23 26 0
-13 -21 -24 25 0
-14 -21 -24 26 0
-15 -22 -23 25 0
-16 -22 -23 26 0
-17 -22 -24 25 0
-18 -22 -24 26 0
-26 0
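Assuming the file above is saved as model.cnf (the filename is our own choice), the query can then be solved directly with the search solver:
schlandals -i model.cnf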
References
[1] Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6-7), 2008.
[2] Anicet Bart, Frédéric Koriche, Jean-Marie Lagniez, and Pierre Marquis. An improved CNF encoding scheme for probabilistic inference. In Proceedings of the Twenty-second European Conference on Artificial Intelligence, 2016.
UAI file format
For a complete description of the format, see here. As an example, let us consider the following network
Preamble
The preamble starts with a single line indicating which type of network is encoded. In our case, this is a Bayesian network, so the file starts with BAYES.
Then, the number of variables in the network is given, along with their domain sizes. In the above network there are four binary variables, so the preamble looks like
BAYES
4
2 2 2 2
Finally, the number of functions (i.e., CPTs for a Bayesian network) is given, followed by the functions' scopes. For each function, the number of variables in its scope is given first, followed by the variable indexes. It is important, in this step, that the variable for which the CPT is defined (i.e., the child) is the last variable in the scope. The preamble for our Bayesian network is
BAYES
4
2 2 2 2
4
1 0
2 0 1
2 0 2
3 1 2 3
Function tables
Finally, the weights of each CPT are given, in the order in which the functions are defined in the preamble.
For each CPT, the number of values inside it is given, followed by the values. The values are ordered following a "last increment first" convention: the last variable in the scope changes fastest.
For example, the CPT of B is defined over two binary variables, A and B (indexes 0 and 1).
If their domains are represented by 2-tuples, then the weights must be given in the following order: (0, 0), (0, 1), (1, 0), (1, 1). A small sketch of this ordering follows the file listing below.
The final file looks like
BAYES
4
2 2 2 2
4
1 0
2 0 1
2 0 2
3 1 2 3
2
0.2 0.8
4
0.6 0.4 0.3 0.7
4
0.25 0.75 0.75 0.25
8
1.0 0.0 0.35 0.65 0.8 0.2 0.0 1.0
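The "last increment first" ordering is exactly a row-major (lexicographic) enumeration of the scope. The following minimal Python sketch (our own illustration) computes the position of an assignment inside a function table:
def uai_flat_index(assignment, domain_sizes):
    """Position of an assignment in a UAI function table.

    assignment: values of the scope variables, in scope order (child last).
    domain_sizes: domain size of each scope variable, in the same order.
    The last variable increments first, i.e., row-major order.
    """
    index = 0
    for value, size in zip(assignment, domain_sizes):
        index = index * size + value
    return index

# CPT of B has scope (A, B): the entry for A=1, B=0 is at position 2,
# matching the order (0, 0), (0, 1), (1, 0), (1, 1).
print(uai_flat_index((1, 0), (2, 2)))  # prints 2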
The evidences
The evidences must be encoded separately (either in a file or passed as a string in the command line arguments). An evidence simply assigns a value to a variable; hence, to encode \( D = d_0 \), we use the following file
1 3 0
The file starts with the number of evidences and then gives, for each evidence, the variable index and its value.
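For instance, following the same convention, observing both \( D = d_0 \) and \( B = b_1 \) (a two-evidence file of our own illustration) would give:
2 3 0 1 1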
Modelizing Probabilistic Graphs
Let \( G = (V, E) \) be a graph (undirected or directed) such that each edge \( e \in E \) is present in the graph with probability \( p(e) \). Given a source \( s \) and a target \( t \), the goal is to compute the probability that \( s \) and \( t \) are not connected (note that this is the complement of the probability that they are connected). Below is an example of such a graph, with five nodes; a query would be to compute the probability that \( A \) and \( E \) are connected.
Dimacs file format
Note that our encoding is the same as the one presented in [1], but with the explicit use of the distributions. As an example, let us consider the following graph
Header
The header is a single line in the format
p cnf <number variables> <number clauses>
The distributions
One distribution is defined for each edge, representing whether the edge is present or not.
c p distribution 0.4 0.6
c p distribution 0.8 0.2
c p distribution 0.5 0.5
...
Notice that these lines start with c but are not comments.
Moreover, the variables associated with the distributions are not explicitly declared; they are assigned in order.
That is, the first distribution (edge from A to B) is composed of variables 1 and 2 (variables in the DIMACS format start at 1), the distribution for the edge from A to C is composed of variables 3 and 4, etc.
Deterministic variables
The remaining variables (i.e., number variables minus the number of variables used for the distributions) are deterministic variables (without a weight).
There is one variable for each node in the graph.
For example, 12 variables are needed to encode the distributions, hence variable 13 will be \( A \), variable 14 \( B \), etc.
In total there are 12 + 5 = 17 variables. As of now, our file looks like
p cnf 17 <number clauses>
c p distribution 0.4 0.6
c p distribution 0.8 0.2
c p distribution 0.5 0.5
c p distribution 0.6 0.4
c p distribution 0.7 0.3
c p distribution 0.3 0.7
The Clauses
The clauses encode the connectivity in the graph, using its transitivity property. Each deterministic variable can be seen as an indicator that tells whether the node is reachable from the source node. Intuitively, a node is reachable from the source if it is connected to a node that is reachable from the source. For example, for node B this translates as "if A is reachable from the source, and the edge from A to B is present, then B is reachable from the source". Using the variables defined above, we obtain the clause \( 13 \land 1 \Rightarrow 14 \Leftrightarrow \lnot 13 \lor \lnot 1 \lor 14 \) (a script generating such clauses is sketched after the file listing below). The file now looks like
p cnf 17 <number clauses>
c p distribution 0.4 0.6
c p distribution 0.8 0.2
c p distribution 0.5 0.5
c p distribution 0.6 0.4
c p distribution 0.7 0.3
c p distribution 0.3 0.7
-13 -1 14 0
-13 -3 15 0
-14 -5 16 0
-15 -7 16 0
-15 -9 17 0
-16 -11 17 0
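As in the Bayesian network section, here is a minimal Python sketch (our own illustration; the variable layout follows the conventions described above) that derives the reachability clauses from an edge list:
def reachability_clauses(edges):
    """Build the reachability clauses for a probabilistic graph.

    edges: list of (u, v) pairs with nodes numbered 0, 1, ..., in the
    order in which their distributions are declared. Edge i uses the
    distribution variables 2*i+1 ("present") and 2*i+2; node n uses
    the deterministic variable 2*len(edges) + n + 1.
    """
    def node_var(n):
        return 2 * len(edges) + n + 1
    clauses = []
    for i, (u, v) in enumerate(edges):
        present = 2 * i + 1  # first variable of the edge's distribution
        # node_var(u) AND present => node_var(v)
        clauses.append([-node_var(u), -present, node_var(v)])
    return clauses

# Edges of the example graph: A->B, A->C, B->D, C->D, C->E, D->E.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
for clause in reachability_clauses(edges):
    print(" ".join(map(str, clause)) + " 0")  # first line: -13 -1 14 0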
The evidences
The evidences for such a problem encode the source (\( s \)) and target (\( t \)) nodes. Since the goal is to compute the probability that they are not connected, we add the two unit clauses \( s \) and \( \lnot t \). In the example, if the query concerns the connectivity of A and E, then the file is
p cnf 17 <number clauses>
c p distribution 0.4 0.6
c p distribution 0.8 0.2
c p distribution 0.5 0.5
c p distribution 0.6 0.4
c p distribution 0.7 0.3
c p distribution 0.3 0.7
-13 -1 14 0
-13 -3 15 0
-14 -5 16 0
-15 -7 16 0
-15 -9 17 0
-16 -11 17 0
13 0
-17 0
References
[1] Leonardo Duenas-Osorio, Kuldeep Meel, Roger Paredes, and Moshe Vardi. Counting-based reliability estimation for power-transmission grids. In Proceedings of the AAAI Conference on Artificial Intelligence, 2017.
PG file format
The PG file format is a simple file format for encoding the structure of a probabilistic graph. When using this format, the evidence (source and target nodes) must be given either in a separate file or as a string in the command line arguments.
File structure
The first line of the file is either DIRECTED or UNDIRECTED and indicates the type of graph.
Then, the edges and their probabilities of being present are listed as 3-tuples (the two nodes and the probability).
For example, the following graph
is encoded as follows
DIRECTED
A B 0.4
A C 0.8
B D 0.5
C D 0.6
C E 0.7
D E 0.3
Then, the evidences can be given as follows
A D
which asks to compute the probability that A and D are disconnected.
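Putting everything together, and assuming the graph is saved as graph.pg (the filename and extension are our own choice), the query can be run by passing the evidence as a string:
schlandals -i graph.pg --evidence "A D"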
Probabilistic Inference
Schlandals provides exact and \( \epsilon \)-bounded approximate inference methods. It can be used either as a DPLL-style search-based solver or as a compiler that outputs an arithmetic circuit. Beware that, currently, the compiler is just a search solver saving its trace. Hence, you should use the search solver if you do not need to keep the circuits for further computation.
Search-based Inference
When solving an inference problem, Schlandals always performs a DPLL-style search over the solution space (even when compiling into an arithmetic circuit). However, due to the modelization choices made in Schlandals, there are a few differences with a classical DPLL search:
- The branching is done on the distributions, not the variables. Hence, the "variable selection" heuristic is replaced with a "distribution selection" heuristic
- When there are no more distributions to branch on, if Boolean Unit Propagation (BUP) did not return a failure, the remaining problem is SAT. This is due to the fact that there are only Horn clauses in the problem, and the SAT problem for Horn formulas can be solved with BUP.
- On top of that, there is an additional propagation, specific to Schlandals, relying on the Horn structure of the problem. For a detailed description of this propagation, see [1].
Exact inference
An exact inference strategy can be launched with the following command
[schlandals@schlandalspc]$ schlandals -i model.cnf
Estimated probability 1.1029004e-1 with bounds [1.1029004e-1 1.1029004e-1] found in 0 seconds
The solver outputs bounds on the probability and the time needed to solve the problem. If the model is specified in the UAI file format, an example command is
[schlandals@schlandalspc]$ schlandals -i model.uai --evidence "1 6 0"
Estimated probability 1.1029004e-1 with bounds [1.1029004e-1 1.1029004e-1] found in 0 seconds
For larger problems, it is possible to add a timeout; in that case, the bounds might not be tight. For a complete description of how the bounds are computed, see [2].
[schlandals@schlandalspc]$ schlandals -i large_model.cnf --timeout 30
Estimated probability 0.353553 with bounds [0.25 0.5] found in 30 seconds
Approximate inference
Schlandals can also perform anytime approximate inference with \( \varepsilon \)-guarantees [2]. If \( p \) is the true probability, it returns a probability \( \tilde p \) such that \[ \frac{p}{1 + \varepsilon} \leq \tilde p \leq p (1 + \varepsilon). \]
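These bounds can be inverted to bracket the true probability. For instance (our own arithmetic, using the final estimate of the run below), with \( \varepsilon = 0.01 \) and \( \tilde p = 1.3616 \times 10^{-2} \):
\[ \frac{\tilde p}{1 + \varepsilon} \approx 1.3481 \times 10^{-2} \leq p \leq \tilde p\,(1 + \varepsilon) \approx 1.3752 \times 10^{-2}. \]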
To use this approximate algorithm, add the --epsilon command line argument
[schlandals@schlandalspc]$ schlandals search -i model.cnf --epsilon 0.01 --approx lds
Estimated probability 0 with bounds [0 5.0653832e-1] found in 0 seconds
Estimated probability 0 with bounds [0 2.7010540e-1] found in 0 seconds
Estimated probability 4.7705749e-3 with bounds [1.7279254e-4 1.3170931e-1] found in 0 seconds
Estimated probability 1.5684742e-2 with bounds [2.9445549e-3 8.3547817e-2] found in 0 seconds
Estimated probability 1.8642883e-2 with bounds [7.2711825e-3 4.7799252e-2] found in 0 seconds
Estimated probability 1.6748061e-2 with bounds [1.0781966e-2 2.6015435e-2] found in 0 seconds
Estimated probability 1.5110826e-2 with bounds [1.2312104e-2 1.8545738e-2] found in 0 seconds
Estimated probability 1.4143238e-2 with bounds [1.3100930e-2 1.5268473e-2] found in 1 seconds
Estimated probability 1.3766330e-2 with bounds [1.3381450e-2 1.4162279e-2] found in 2 seconds
Estimated probability 1.3616029e-2 with bounds [1.3488540e-2 1.3744723e-2] found in 4 seconds
Estimated probability 1.3616029e-2 with bounds [1.3488540e-2 1.3744723e-2] found in 4 seconds
Notice that you can also do exact anytime inference with
[schlandals@schlandalspc]$ schlandals search -i model.cnf --approx lds
Estimated probability 0 with bounds [0 5.0653832e-1] found in 0 seconds
Estimated probability 0 with bounds [0 2.7010540e-1] found in 0 seconds
Estimated probability 4.7705749e-3 with bounds [1.7279254e-4 1.3170931e-1] found in 0 seconds
Estimated probability 1.5684742e-2 with bounds [2.9445549e-3 8.3547817e-2] found in 0 seconds
Estimated probability 1.8642883e-2 with bounds [7.2711825e-3 4.7799252e-2] found in 0 seconds
Estimated probability 1.6748061e-2 with bounds [1.0781966e-2 2.6015435e-2] found in 0 seconds
Estimated probability 1.5110826e-2 with bounds [1.2312104e-2 1.8545738e-2] found in 1 seconds
Estimated probability 1.4143238e-2 with bounds [1.3100930e-2 1.5268473e-2] found in 1 seconds
Estimated probability 1.3766330e-2 with bounds [1.3381450e-2 1.4162279e-2] found in 2 seconds
Estimated probability 1.3616029e-2 with bounds [1.3488540e-2 1.3744723e-2] found in 4 seconds
Estimated probability 1.3565927e-2 with bounds [1.3524913e-2 1.3607065e-2] found in 6 seconds
Estimated probability 1.3548943e-2 with bounds [1.3537128e-2 1.3560769e-2] found in 8 seconds
Estimated probability 1.3543849e-2 with bounds [1.3540853e-2 1.3546845e-2] found in 10 seconds
Estimated probability 1.3542610e-2 with bounds [1.3541749e-2 1.3543471e-2] found in 11 seconds
Estimated probability 1.3542321e-2 with bounds [1.3541960e-2 1.3542683e-2] found in 13 seconds
Estimated probability 1.3542321e-2 with bounds [1.3541960e-2 1.3542683e-2] found in 13 seconds
References
[1] Alexandre Dubray, Pierre Schaus, and Siegfried Nijssen. Probabilistic Inference by Projected Weighted Model Counting on Horn Clauses. In 29th International Conference on Principles and Practice of Constraint Programming (CP 2023), 2023
[2] Alexandre Dubray, Pierre Schaus, and Siegfried Nijssen. Anytime Weighted Model Counting with Approximation Guarantees For Probabilistic Inference. In 30th International Conference on Principles and Practice of Constraint Programming (CP 2024), 2024
Schlandals as Compiler
Schlandals can be used as a compiler that outputs an arithmetic circuit from a CNF formula. In this mode, it first executes the search and then parses the cache to obtain the arithmetic circuit [1]. Hence, the compilation mode should only be used if you actually need the compiled circuit. If you only want to evaluate the probability of a problem, use the search.
Running the compilation
The compilation can be run using the following command
schlandals -i model.cnf compile
Estimated probability 1.3542085e-2 with bounds [1.3542085e-2 1.3542085e-2] found in 4 seconds
This command executes the search, compiles the arithmetic circuit from the cache, and evaluates it. Notice that the compilation accepts the same arguments as the search, since it executes a search first. Hence, you can produce approximate arithmetic circuits by doing an approximate search.
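For example, to store the compiled circuit and a DOT representation of it (the output filenames are illustrative), one can combine the options described below:
schlandals -i model.cnf compile --fdac circuit.fdac --dotfile circuit.dot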
Command Line Arguments
schlandals compile --help
Usage: schlandals --input <INPUT> compile [OPTIONS]
Options:
--fdac <FDAC> If the problem is compiled, store it in this file
--dotfile <DOTFILE> If the problem is compiled, store a DOT graphical representation in this file
-h, --help Print help
Note: if you want to use general Schlandals options, they must come before the compile sub-command (e.g., schlandals -i model.cnf --epsilon 0.2 --approx lds compile --fdac ac.out).
Learning Distribution Parameters on a Set of Queries
Schlandals provides a module that allows learning distribution parameters from a set of queries on a probabilistic problem. It can be used with a float or a tensor semiring, and it is compatible with exact and partial compilation.
Learning Settings
Basic Setting
To initiate training, the minimum requirement is an input file in CSV format. This file should list the CNF files describing the queries to learn from, along with the expected output for each query. The CSV structure should follow that of the following example (saved as train.csv).
File, Probability
path_query1.cnf, 0.6
path_query2.cnf, 0.2
path_query3.cnf, 0.8
Queries within the same CSV file should pertain to the same probabilistic problem and thus share a common set of distributions.
The initial distribution values for training are taken from the first CNF file in the query list. Users can choose these initial values, e.g., randomly or based on prior knowledge of the probabilistic problem.
The learning can be initiated with the default parameters using the following command.
schlandals learn --trainfile train.csv
In this basic setup, arithmetic circuits based on queries are compiled exactly.
Partial Compilation
Schlandals supports learning distribution parameters for partially compiled arithmetic circuits, enabling the handling of queries that may be too large for full compilation.
To use partial compilation with the learning, the CNF files containing the query description should include an additional line specifying the distributions that are part of the compiled circuit and subject to learning.
The line is of the form c p learn, followed by the indexes of the distributions to be learned.
For instance, if a query involves 5 distributions in its input and one wishes to use a partially compiled circuit to learn distributions 1, 4, and 5 only, the corresponding line to include in the CNF file would be:
c p learn 1 4 5
In this partial compilation scenario, the values of the non-learned distributions (e.g., distributions 2 and 3 in our example) are the values provided in the CNF file. During the learning process, these values are considered as the true values of the non-learned distributions while the parameters of the specified distributions are learned.
The command to perform partial compilation is the following (assuming the CSV file referring to CNF files with partial compilation indications is saved as train_partial.csv).
schlandals learn --trainfile train_partial.csv
By default, this command performs learning on partially compiled queries using an epsilon value of 0. To set epsilon to a non-zero value (e.g., 0.3), use the following command:
schlandals learn --trainfile train_partial.csv --epsilon 0.3
Train-Test Setting
The learning module supports the use of a test set, allowing the assessment of learned distribution parameters on unseen queries related to the same probabilistic problem and sharing the same distributions.
A test CSV file containing the list of the test queries can be provided to the learning. The structure of the test file is the same as the one used for the train CSV file.
The test queries will be compiled, partially or not, depending on the CNF files. Then, the queries are evaluated a first time with the initial values of the distributions, and a second time with the learned values of the distributions.
The command to perform the learning with a test set is the following (assuming the CSV file we want to learn from is saved as train.csv and the CSV file with the test queries is saved as test.csv).
schlandals learn --trainfile train.csv --testfile test.csv
Logging Training Epochs (and Test Results)
To log the losses and predicted outputs of each query at each epoch of learning, use the --do-log option along with the --outputdir option, indicating the folder for log storage. If a test set is provided, a second CSV file is created with the losses and predicted outputs of each test query before and after learning.
The command to perform the learning with logging is the following (assuming the folder in which we want the output files to be saved is logs/).
schlandals learn --trainfile train.csv --testfile test.csv --do-log --outputdir logs/
Command Line Options
[schlandals@schlandalspc]$ schlandals learn --help
Learn distribution parameters from a set of queries
Usage: schlandals learn [OPTIONS] --trainfile <TRAINFILE>
Options:
--trainfile <TRAINFILE>
The csv file containing the cnf filenames and the associated expected output
--testfile <TESTFILE>
The csv file containing the test cnf filenames and the associated expected output
-b, --branching <BRANCHING>
How to branch
[default: min-in-degree]
Possible values:
- min-in-degree: Minimum In-degree of a clause in the implication-graph
- min-out-degree: Minimum Out-degree of a clause in the implication-graph
- max-degree: Maximum degree of a clause in the implication-graph
- vsids: Variable State Independent Decaying Sum
--outfolder <OUTFOLDER>
If present, folder in which to store the output files
-l, --lr <LR>
Learning rate
[default: 0.3]
--nepochs <NEPOCHS>
Number of epochs
[default: 6000]
-d, --do-log
If present, save a detailed csv of the training and use a codified output filename
--timeout <TIMEOUT>
If present, define the learning timeout
[default: 18446744073709551615]
-e, --epsilon <EPSILON>
If present, the epsilon used for the approximation. Value set by default to 0, thus performing exact search
[default: 0]
--loss <LOSS>
Loss to use for the training, default is the MAE Possible values: MAE, MSE
[default: mae]
[possible values: mae, mse]
-j, --jobs <JOBS>
Number of threads to use for the evaluation of the DACs
[default: 1]
-s, --semiring <SEMIRING>
The semiring on which to evaluate the circuits. If `tensor`, use torch to compute the gradients. If `probability`, use custom efficient backpropagations
[default: probability]
[possible values: probability, tensor]
-o, --optimizer <OPTIMIZER>
The optimizer to use if `tensor` is selected as semiring
[default: sgd]
[possible values: adam, sgd]
--lr-drop <LR_DROP>
The drop in the learning rate to apply at each step
[default: 0.75]
--epoch-drop <EPOCH_DROP>
The number of epochs after which to drop the learning rate (i.e. the learning rate is multiplied by `lr_drop`)
[default: 100]
--stopping-criterion <STOPPING_CRITERION>
The stopping criterion for the training (i.e. if the loss is below this value, stop the training)
[default: 0.0001]
--delta-early-stop <DELTA_EARLY_STOP>
The minimum of improvement in the loss to consider that the training is still improving (i.e. if the loss is below this value for a number of epochs, stop the training)
[default: 0.00001]
--patience <PATIENCE>
The number of epochs to wait before stopping the training if the loss is not improving (i.e. if the loss is below this value for a number of epochs, stop the training)
[default: 5]
-h, --help
Print help (see a summary with '-h')
A Python interface, pyschlandals, is available from PyPI here. The interface is still rudimentary; open a pull request if you need any functionality. To install the Python interface, run pip install pyschlandals.
Running a simple problem
A problem in pyschlandals is a set of distributions and clauses. The following code block shows how to create a simple problem instance for a Bayesian network and solve it using the DPLL-based search. Notice that the indexes in the clauses start at 1, and the distributions use the first indexes.
from pyschlandals.pwmc import PyProblem

problem = PyProblem()
# The five distributions use variables 1-10 (two variables each).
problem.add_distribution([0.2, 0.8])
problem.add_distribution([0.3, 0.7])
problem.add_distribution([0.4, 0.6])
problem.add_distribution([0.1, 0.9])
problem.add_distribution([0.5, 0.5])
# Variables 11-16 are the deterministic (indicator) variables.
# Each clause is a list of literals; e.g., [11, -1] encodes 1 => 11.
problem.add_clause([11, -1])
problem.add_clause([12, -2])
problem.add_clause([13, -11, -3])
problem.add_clause([13, -12, -5])
problem.add_clause([14, -11, -4])
problem.add_clause([14, -12, -6])
problem.add_clause([15, -13, -7])
problem.add_clause([15, -14, -9])
problem.add_clause([16, -13, -8])
problem.add_clause([16, -14, -10])
# The unit clause encodes the evidence (the queried assignment).
problem.add_clause([-15])
print(problem.solve())
The problem generation can be seen as lazy: the PyProblem is sent to the Rust code only when problem.solve() is called. At this point, the distributions and clauses are sent to Schlandals as an alternative to the .cnf files.