Learning¶
pyAgrum encapsulates all the learning processes for Bayesian networks in a simple class, BNLearner. This class gives direct access to the complete learning algorithms and their parameters (such as priors, scores, constraints, etc.), but also provides low-level functions that ease the work of developing new learning algorithms (for instance, computing the chi2 or the conditional likelihood on the database).
- class pyAgrum.BNLearner(filename, inducedTypes=True) -> BNLearner¶
- Parameters:
source (str or pandas.DataFrame) – the data to learn from
inducedTypes (bool) – whether BNLearner should try to automatically find the type of each variable
- BNLearner(filename,src) -> BNLearner
- Parameters:
source (str or pandas.DataFrame) – the data to learn from
src (pyAgrum.BayesNet) – the Bayesian network used to find the modalities of the variables
- BNLearner(learner) -> BNLearner
- Parameters:
learner (pyAgrum.BNLearner) – the BNLearner to copy
- G2(*args)¶
G2 computes the G2 statistic and pvalue for two columns, given a list of other columns.
- Parameters
name1 (str) – the name of the first column
name2 (str) – the name of the second column
knowing ([str]) – the list of names of conditioning columns
- Returns
the G2 statistic and the associated p-value as a Tuple
- Return type
statistic,pvalue
- chi2(*args)¶
chi2 computes the chi2 statistic and pvalue for two columns, given a list of other columns.
- Parameters
name1 (str) – the name of the first column
name2 (str) – the name of the second column
knowing ([str]) – the list of names of conditioning columns
- Returns
the chi2 statistic and the associated p-value as a Tuple
- Return type
statistic,pvalue
- currentTime()¶
- Returns
the current running time in seconds
- Return type
float
- databaseWeight()¶
- Return type
float
- domainSize(*args)¶
- Return type
int
- epsilon()¶
- Returns
the value of epsilon
- Return type
float
- fitParameters(bn)¶
Easy shortcut to the learnParameters method. fitParameters uses self to directly populate the CPTs of bn.
- Parameters
bn (pyAgrum.BayesNet) – a BN which will directly have its parameters learned.
- getNumberOfThreads()¶
- Return type
int
- hasMissingValues()¶
- Return type
bool
- history()¶
- Returns
the scheme history
- Return type
tuple
- Raises
pyAgrum.OperationNotAllowed – If the scheme has not been performed or if verbosity is set to false
- idFromName(var_name)¶
- Parameters
var_name (str) –
- Return type
int
- isGumNumberOfThreadsOverriden()¶
- Return type
bool
- latentVariables()¶
Warning
the learner must use the 3off2 or the MIIC algorithm
- Returns
the list of latent variables
- Return type
list
- learnBN()¶
learns a BayesNet from the database (the database must have been read beforehand)
- Returns
the learned BayesNet
- Return type
pyAgrum.BayesNet
- learnEssentialGraph()¶
- learnMixedStructure()¶
- Return type
- learnParameters(*args)¶
learns a BN (its parameters) when its structure is known.
- Parameters
dag (pyAgrum.DAG) –
bn (pyAgrum.BayesNet) –
take_into_account_score (bool) – The dag passed in argument may have been learnt from a structure learning. In this case, if the score used to learn the structure has an implicit apriori (like K2, which has a 1-smoothing apriori), it is important to also take this implicit apriori into account for parameter learning. By default, if a score exists, the parameters are learned by taking into account both the apriori specified by the useAprioriXXX() methods and the implicit apriori of the score; otherwise only the apriori specified by useAprioriXXX() is taken into account.
- Returns
the learned BayesNet
- Return type
pyAgrum.BayesNet
- Raises
pyAgrum.MissingVariableInDatabase – If a variable of the BN is not found in the database
pyAgrum.UnknownLabelInDatabase – If a label found in the database does not correspond to the variable
- logLikelihood(*args)¶
- Return type
float
- maxIter()¶
- Returns
the criterion on number of iterations
- Return type
int
- maxTime()¶
- Returns
the timeout(in seconds)
- Return type
float
- messageApproximationScheme()¶
- Returns
the approximation scheme message
- Return type
str
- minEpsilonRate()¶
- Returns
the value of the minimal epsilon rate
- Return type
float
- nameFromId(id)¶
- Parameters
id (int) –
- Return type
str
- names()¶
- Return type
List[str]
- nbCols()¶
- Return type
int
- nbRows()¶
- Return type
int
- nbrIterations()¶
- Returns
the number of iterations
- Return type
int
- periodSize()¶
- Returns
the number of samples between two stopping tests
- Return type
int
- Raises
pyAgrum.OutOfBounds – If p<1
- pseudoCount(vars)¶
access to the pseudo-counts (priors taken into account)
- Parameters
vars (list[str]) – the list of variable names for the pseudo-counts
- Return type
a pyAgrum.Potential containing the pseudo-counts
- rawPseudoCount(*args)¶
- Return type
List[float]
- recordWeight(i)¶
- Parameters
i (int) –
- Return type
float
- setAprioriWeight(weight)¶
Deprecated methods in BNLearner for pyAgrum>0.14.0
- setDatabaseWeight(new_weight)¶
- Parameters
new_weight (float) –
- Return type
None
- setEpsilon(eps)¶
- Parameters
eps (float) – the epsilon we want to use
- Raises
pyAgrum.OutOfBounds – If eps<0
- Return type
None
- setInitialDAG(dag)¶
- Parameters
dag (pyAgrum.DAG) – an initial DAG structure
- Return type
- setMaxIndegree(max_indegree)¶
- Parameters
max_indegree (int) – the limit number of parents
- Return type
- setMaxIter(max)¶
- Parameters
max (int) – the maximum number of iterations
- Raises
pyAgrum.OutOfBounds – If max <= 1
- Return type
None
- setMaxTime(timeout)¶
- Parameters
timeout (float) – stopping criterion on timeout (in seconds)
- Raises
pyAgrum.OutOfBounds – If timeout<=0.0
- Return type
None
- setMinEpsilonRate(rate)¶
- Parameters
rate (float) – the minimal epsilon rate
- Return type
None
- setNumberOfThreads(nb)¶
- Parameters
nb (int) –
- Return type
None
- setPeriodSize(p)¶
- Parameters
p (int) – the number of samples between two stopping tests
- Raises
pyAgrum.OutOfBounds – If p<1
- Return type
None
- setRecordWeight(i, weight)¶
- Parameters
i (int) –
weight (float) –
- Return type
None
- setVerbosity(v)¶
- Parameters
v (bool) – verbosity
- Return type
None
- state()¶
- Return type
object
- useAprioriDirichlet(filename, weight=1)¶
Use the Dirichlet apriori.
- Parameters
filename (str) – the Dirichlet related database
weight (float) –
- Return type
- useAprioriSmoothing(weight=1)¶
Use the apriori smoothing.
- Parameters
weight (float) – the weight assigned to the smoothing; if omitted, the current weight of the learner is used.
- Return type
- useEM(epsilon)¶
Indicates if we use EM for parameter learning.
- Parameters
epsilon (float) – if epsilon=0.0, EM is not used; if epsilon>0, EM is used and stops when the sum of the cumulative squared errors on the parameters is less than epsilon.
- Return type
- useGreedyHillClimbing()¶
Indicate that we wish to use a greedy hill climbing algorithm.
- Return type
- useLocalSearchWithTabuList(tabu_size=100, nb_decrease=2)¶
Indicate that we wish to use a local search with tabu list.
- Parameters
tabu_size (int) – the size of the tabu list
nb_decrease (int) – the maximum number of consecutive changes decreasing the score
- Return type
- useMDLCorrection()¶
Indicate that we wish to use the MDL correction for 3off2 or MIIC
- Return type
- useNMLCorrection()¶
Indicate that we wish to use the NML correction for 3off2 or MIIC
- Return type
- useNoCorrection()¶
Indicate that we wish to use the NoCorr correction for 3off2 or MIIC
- Return type
- useScoreLog2Likelihood()¶
Indicate that we wish to use a Log2Likelihood score.
- Return type
- verbosity()¶
- Returns
True if the verbosity is enabled
- Return type
bool