Learning essential graphs¶
In [1]:
%matplotlib inline
from pylab import *
import matplotlib.pyplot as plt
import os
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
Compare learning algorithms¶
Essentially MIIC and 3off2 computes the essential graph (CPDAG) from data. Essential graphs are mixed graphs.
In [2]:
learner=gum.BNLearner("out/sample_asia.csv")
learner.use3off2()
learner.useNMLCorrection()
print(learner)
ge3off2=learner.learnEssentialGraph()
Filename : out/sample_asia.csv
Size : (500000,8)
Variables : tuberculosis[2], smoking[2], tuberculos_or_cancer[2], visit_to_Asia[2], positive_XraY[2], bronchitis[2], lung_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : 3off2
Correction : NML
Prior : -
In [3]:
gnb.showDot(ge3off2.toDot());
In [4]:
learner=gum.BNLearner("out/sample_asia.csv")
learner.useMIIC()
learner.useNMLCorrection()
print(learner)
gemiic=learner.learnEssentialGraph()
gemiic
Filename : out/sample_asia.csv
Size : (500000,8)
Variables : tuberculosis[2], smoking[2], tuberculos_or_cancer[2], visit_to_Asia[2], positive_XraY[2], bronchitis[2], lung_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : MIIC
Correction : NML
Prior : -
Out[4]:
For the others methods, it is possible to obtain the essential graph from the learned BN.
In [5]:
learner=gum.BNLearner("out/sample_asia.csv")
learner.useGreedyHillClimbing()
bnHC=learner.learnBN()
print(learner)
geHC=gum.EssentialGraph(bnHC)
geHC
gnb.sideBySide(bnHC,geHC)
Filename : out/sample_asia.csv
Size : (500000,8)
Variables : tuberculosis[2], smoking[2], tuberculos_or_cancer[2], visit_to_Asia[2], positive_XraY[2], bronchitis[2], lung_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : Greedy Hill Climbing
Score : BDeu
Prior : -
In [6]:
learner=gum.BNLearner("out/sample_asia.csv")
learner.useLocalSearchWithTabuList()
print(learner)
bnTL=learner.learnBN()
geTL=gum.EssentialGraph(bnTL)
geTL
gnb.sideBySide(bnTL,geTL)
Filename : out/sample_asia.csv
Size : (500000,8)
Variables : tuberculosis[2], smoking[2], tuberculos_or_cancer[2], visit_to_Asia[2], positive_XraY[2], bronchitis[2], lung_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : Local Search with Tabu List
Tabu list size : 2
Score : BDeu
Prior : -
Hence we can compare the 4 algorithms.
In [7]:
(
gnb.flow.clear()
.add(ge3off2,"Essential graph from 3off2")
.add(gemiic,"Essential graph from miic")
.add(bnHC,"BayesNet from GHC")
.add(geHC,"Essential graph from GHC")
.add(bnTL,"BayesNet from TabuList")
.add(geTL,"Essential graph from TabuList")
.display()
)
In [ ]: