Using pyAgrum¶
In [1]:
%matplotlib inline
from pylab import *
import matplotlib.pyplot as plt
import os
Initialisation¶
importing pyAgrum
importing pyAgrum.lib tools
loading a BN
In [2]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
gnb.configuration()
| Library | Version |
|---|---|
| OS | nt [win32] |
| Python | 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)] |
| IPython | 8.4.0 |
| Matplotlib | 3.5.2 |
| Numpy | 1.22.4 |
| pyDot | 1.4.2 |
| pyAgrum | 1.1.0.9 |
Sat Jun 04 00:49:40 2022 Paris, Madrid (daylight saving time)
In [3]:
bn=gum.loadBN("res/alarm.dsl")
gnb.showBN(bn,size='9')
Visualisation and inspection¶
In [4]:
print(bn.variableFromName('SHUNT'))
SHUNT:Labelized({NORMAL|HIGH})
In [5]:
print(bn.cpt(bn.idFromName('SHUNT')))
|| SHUNT |
PULMEM|INTUBA||NORMAL |HIGH |
------|------||---------|---------|
TRUE |NORMAL|| 0.1000 | 0.9000 |
FALSE |NORMAL|| 0.9500 | 0.0500 |
TRUE |ESOPHA|| 0.1000 | 0.9000 |
FALSE |ESOPHA|| 0.9500 | 0.0500 |
TRUE |ONESID|| 0.0100 | 0.9900 |
FALSE |ONESID|| 0.0500 | 0.9500 |
In [6]:
gnb.showPotential(bn.cpt(bn.idFromName('SHUNT')),digits=3)
The CPT rendered as a table (3 digits):

| PULMEMBOLUS | INTUBATION | SHUNT=NORMAL | SHUNT=HIGH |
|---|---|---|---|
| TRUE | NORMAL | 0.100 | 0.900 |
| FALSE | NORMAL | 0.950 | 0.050 |
| TRUE | ESOPHAGEAL | 0.100 | 0.900 |
| FALSE | ESOPHAGEAL | 0.950 | 0.050 |
| TRUE | ONESIDED | 0.010 | 0.990 |
| FALSE | ONESIDED | 0.050 | 0.950 |
Results of inference¶
It is easy to look at the results of inference.
In [7]:
gnb.showPosterior(bn,{'SHUNT':'HIGH'},'PRESS')
In [8]:
gnb.showPosterior(bn,{'MINVOLSET':'NORMAL'},'VENTALV')
Overall results
In [9]:
gnb.showInference(bn,size="10")
What is the impact of observed variables (SHUNT and VENTALV, for instance) on another one (PRESS)?
In [10]:
ie=gum.LazyPropagation(bn)
ie.evidenceImpact('PRESS',['SHUNT','VENTALV'])
Out[10]:
$P(PRESS \mid SHUNT, VENTALV)$ — one row per configuration of (SHUNT, VENTALV), one column per label of PRESS:

| | | | |
|---|---|---|---|
| 0.0569 | 0.2669 | 0.2005 | 0.4757 |
| 0.0208 | 0.2515 | 0.0553 | 0.6724 |
| 0.0769 | 0.3267 | 0.1772 | 0.4192 |
| 0.0501 | 0.1633 | 0.2796 | 0.5071 |
| 0.0589 | 0.2726 | 0.1997 | 0.4688 |
| 0.0318 | 0.2237 | 0.0521 | 0.6924 |
| 0.1735 | 0.5839 | 0.1402 | 0.1024 |
| 0.0711 | 0.2347 | 0.2533 | 0.4410 |
Using inference as a function¶
It is also easy to use inference as a routine in more complex procedures.
In [11]:
import time
r=range(0,100,2)
xs=[x/100.0 for x in r]
tf=time.time()
ys=[gum.getPosterior(bn,{'MINVOLSET':[0,x/100.0,0.5]},'VENTALV').tolist()
for x in r]
delta=time.time()-tf
p=plot(xs,ys)
legend(p,[bn.variableFromName('VENTALV').label(i)
for i in range(bn.variableFromName('VENTALV').domainSize())],loc=7);
title('VENTALV (%d inferences in %.1f ms)'%(len(r),1000*delta));
ylabel('posterior Probability');
xlabel('Evidence on MINVOLSET : [0,x,0.5]')
plt.show()
Another example: Python gives access to a large set of tools. Here, the value at which two posterior probabilities become equal is easily computed.
In [12]:
x=[p/100.0 for p in range(0,100)]
tf=time.time()
y=[gum.getPosterior(bn,{'HRBP':[1.0-p/100.0,1.0-p/100.0,p/100.0]},'TPR').tolist()
for p in range(0,100)]
delta=time.time()-tf
p=plot(x,y)
title('HRBP (100 inferences in %.1f ms)'%(1000*delta));
v=bn.variableFromName('TPR');
legend([v.label(i) for i in range(v.domainSize())],loc='best');
np1=(transpose(y)[0]>transpose(y)[2]).argmin()
text(x[np1]-0.05,y[np1][0]+0.005,str(x[np1]),bbox=dict(facecolor='red', alpha=0.1))
plt.show()
BN as a classifier¶
Generation of databases¶
Using the CSV format for the database (`gum.generateSample` returns the log-likelihood of the generated sample):
In [13]:
gum.generateSample(bn,1000,"out/test.csv",with_labels=True)
Out[13]:
-14917.511725169048
In [14]:
with open("out/test.csv","r") as src:
for _ in range(10):
print(src.readline(),end="")
HYPOVOLEMIA,STROKEVOLUME,EXPCO2,CATECHOL,PULMEMBOLUS,VENTTUBE,BP,HRSAT,CO,ANAPHYLAXIS,VENTALV,VENTMACH,HR,PAP,ERRLOWOUTPUT,VENTLUNG,ARTCO2,PCWP,SHUNT,LVEDVOLUME,SAO2,CVP,DISCONNECT,KINKEDTUBE,ERRCAUTER,HISTORY,INSUFFANESTH,MINVOLSET,TPR,HREKG,PRESS,PVSAT,FIO2,INTUBATION,HRBP,MINVOL,LVFAILURE
FALSE,NORMAL,LOW,HIGH,FALSE,LOW,HIGH,HIGH,HIGH,FALSE,NORMAL,NORMAL,HIGH,NORMAL,FALSE,ZERO,NORMAL,NORMAL,HIGH,NORMAL,LOW,NORMAL,FALSE,FALSE,FALSE,FALSE,FALSE,NORMAL,NORMAL,HIGH,ZERO,LOW,NORMAL,ONESIDED,HIGH,LOW,FALSE
FALSE,NORMAL,ZERO,HIGH,FALSE,LOW,HIGH,HIGH,HIGH,FALSE,ZERO,NORMAL,HIGH,NORMAL,TRUE,ZERO,LOW,NORMAL,NORMAL,NORMAL,LOW,NORMAL,FALSE,FALSE,FALSE,FALSE,FALSE,NORMAL,NORMAL,HIGH,HIGH,LOW,NORMAL,NORMAL,NORMAL,HIGH,FALSE
FALSE,LOW,NORMAL,HIGH,FALSE,NORMAL,LOW,HIGH,LOW,FALSE,HIGH,HIGH,HIGH,NORMAL,FALSE,NORMAL,LOW,NORMAL,NORMAL,NORMAL,NORMAL,NORMAL,TRUE,FALSE,FALSE,FALSE,FALSE,HIGH,HIGH,HIGH,NORMAL,HIGH,NORMAL,NORMAL,HIGH,LOW,FALSE
TRUE,LOW,NORMAL,HIGH,FALSE,HIGH,LOW,HIGH,LOW,FALSE,LOW,HIGH,HIGH,NORMAL,FALSE,LOW,HIGH,HIGH,NORMAL,HIGH,LOW,HIGH,FALSE,FALSE,FALSE,FALSE,FALSE,NORMAL,HIGH,HIGH,HIGH,NORMAL,NORMAL,NORMAL,HIGH,NORMAL,FALSE
FALSE,NORMAL,LOW,HIGH,FALSE,ZERO,HIGH,HIGH,HIGH,FALSE,ZERO,NORMAL,HIGH,NORMAL,FALSE,ZERO,HIGH,LOW,NORMAL,LOW,LOW,LOW,TRUE,FALSE,FALSE,FALSE,FALSE,NORMAL,NORMAL,HIGH,LOW,LOW,NORMAL,NORMAL,HIGH,ZERO,FALSE
FALSE,NORMAL,LOW,HIGH,FALSE,LOW,HIGH,HIGH,HIGH,FALSE,ZERO,NORMAL,HIGH,NORMAL,FALSE,ZERO,HIGH,NORMAL,NORMAL,NORMAL,LOW,NORMAL,FALSE,FALSE,FALSE,FALSE,FALSE,NORMAL,NORMAL,HIGH,HIGH,LOW,NORMAL,NORMAL,HIGH,ZERO,FALSE
TRUE,LOW,LOW,HIGH,FALSE,LOW,NORMAL,HIGH,NORMAL,FALSE,ZERO,NORMAL,HIGH,NORMAL,FALSE,ZERO,HIGH,HIGH,NORMAL,HIGH,LOW,HIGH,FALSE,FALSE,FALSE,FALSE,TRUE,NORMAL,NORMAL,HIGH,HIGH,LOW,NORMAL,NORMAL,HIGH,ZERO,FALSE
FALSE,NORMAL,LOW,HIGH,FALSE,LOW,LOW,LOW,LOW,FALSE,ZERO,NORMAL,NORMAL,NORMAL,FALSE,ZERO,HIGH,NORMAL,NORMAL,NORMAL,LOW,NORMAL,FALSE,FALSE,FALSE,FALSE,FALSE,NORMAL,HIGH,LOW,HIGH,LOW,NORMAL,NORMAL,LOW,ZERO,FALSE
FALSE,NORMAL,LOW,HIGH,FALSE,LOW,HIGH,HIGH,HIGH,FALSE,ZERO,NORMAL,HIGH,NORMAL,FALSE,ZERO,HIGH,NORMAL,NORMAL,NORMAL,LOW,NORMAL,FALSE,FALSE,FALSE,FALSE,FALSE,NORMAL,NORMAL,HIGH,NORMAL,LOW,NORMAL,NORMAL,HIGH,ZERO,FALSE
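The generated CSV can be consumed with any standard tooling, for instance to compare empirical label frequencies with the BN's marginals. A stdlib-only sketch on an inline sample (the columns and rows below are illustrative, not taken from `out/test.csv`):

```python
import csv
import io
from collections import Counter

# A tiny illustrative sample with the same shape as the generated file.
sample = "SHUNT,PRESS\nNORMAL,HIGH\nNORMAL,LOW\nHIGH,HIGH\n"
rows = list(csv.DictReader(io.StringIO(sample)))
freq = Counter(r["SHUNT"] for r in rows)
total = sum(freq.values())
print({k: v / total for k, v in freq.items()})
```

With the real file, `io.StringIO(sample)` would simply be replaced by `open("out/test.csv")`.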
Probabilistic classifier using BN¶
(since the CSV file was generated from the BN itself, quite good ROC curves are expected)
In [15]:
from pyAgrum.lib.bn2roc import showROC_PR
showROC_PR(bn,"out/test.csv",
target='CATECHOL',label='HIGH', # class and label
show_progress=True,show_fig=True,with_labels=True)
out/test.csv: 100%
Out[15]:
(0.9525796038953934, 0.9300899828, 0.9980437992339163, 0.34813415895)
Using another class variable
In [16]:
showROC_PR(bn,"out/test.csv",'SAO2','HIGH',show_progress=True)
out/test.csv: 100%
Out[16]:
(0.9629093016516952, 0.048531175, 0.7758007513206207, 0.5385017134)
Fast prototyping for BNs¶
In [17]:
bn1=gum.fastBN("a->b;a->c;b->c;c->d",3)
gnb.sideBySide(*[gnb.getInference(bn1,evs={'c':val},targets={'a','c','d'}) for val in range(3)],
captions=[f"Inference given that $c={val}$" for val in range(3)])
In [18]:
print(gum.getPosterior(bn1,evs={'c':0},target='c'))
print(gum.getPosterior(bn1,evs={'c':0},target='d'))
# using pyagrum.lib.notebook's helpers
gnb.flow.row(gum.getPosterior(bn1,evs={'c':0},target='c'),gum.getPosterior(bn1,evs={'c':0},target='d'))
c |
0 |1 |2 |
---------|---------|---------|
1.0000 | 0.0000 | 0.0000 |
d |
0 |1 |2 |
---------|---------|---------|
0.6638 | 0.1259 | 0.2103 |
Joint posterior, impact of multiple evidence¶
In [19]:
bn=gum.fastBN("a->b->c->d;b->e->d->f;g->c")
gnb.sideBySide(bn,gnb.getInference(bn))
In [20]:
ie=gum.LazyPropagation(bn)
ie.addJointTarget({"e","f","g"})
ie.makeInference()
gnb.sideBySide(ie.jointPosterior({"e","f","g"}),ie.jointPosterior({"e","g"}),
captions=["Joint posterior $P(e,f,g)$","Joint posterior $P(e,g)$"])
In [21]:
gnb.sideBySide(ie.evidenceImpact("a",["e","f"]),ie.evidenceImpact("a",["d","e","f"]),
captions=["$\\forall e,f, P(a|e,f)$",
"$\\forall d,e,f, P(a|d,e,f)=P(a|d,e)$ using d-separation"]
)
In [22]:
gnb.sideBySide(ie.evidenceJointImpact(["a","b"],["e","f"]),ie.evidenceJointImpact(["a","b"],["d","e","f"]),
captions=["$\\forall e,f, P(a,b|e,f)$",
"$\\forall d,e,f, P(a,b|d,e,f)=P(a,b|d,e)$ using d-separation"]
)