Package gPy :: Module IO

Module IO

Functions for IO of models and data

Classes
  • GraphCanvas - Class for drawing graphs

Functions
  • read_bif(fobj) - Read in a BN in BIF format. Returns: Tuple
  • csv2db(csv, db_file) - Create a sqlite database from a CSV file
  • read_dimacs(fobj) - Read a graph in DIMACS format. Returns: Tuple
  • read_csv(fobj, sep=',') - Construct a list of records from a CSV file. Returns: Tuple
  • read_norsyslib(net, get_positions=False)
  • read_dnet(fobj, get_positions=False) - Return CPTs of a Bayesian net in Netica 'dnet' format. Returns: Tuple
  • read_dnet2(fobj, get_positions=False) - Return CPTs of a Bayesian net in Netica 'dnet' format. Returns: Dictionary
  • read_twlib(i) - Fetch and parse a twlib graph from Utrecht. Returns: Tuple
  • read_xdsl(fobj)

Variables
  • _version (String) = '$Id: IO.py,v 1.5 2008/10/07 09:10:27 jc Exp $' - Version of this module

Imports: urllib, re, Tkinter, random


Function Details

read_bif(fobj)

Read in a BN in BIF format

Returns 3 dictionaries. The first maps each variable to a tuple of its values. The second maps each variable to a tuple of its parents. The third maps each instantiation of a variable's parents (ordered as in the tuple in the second dictionary) to a list of conditional probabilities (ordered as in the tuple in the first dictionary).

Parameters:
  • fobj (File) - BIF file object
Returns: Tuple
3 dictionaries
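
As a rough illustration of this return value, the sketch below builds the three dictionaries by hand for a hypothetical two-variable network and looks up one conditional probability. The variable names, the numbers, the helper cond_prob, and the assumption that the third dictionary is keyed first by variable and then by a tuple of parent values are illustrative guesses, not taken from gPy.

```python
# Hand-built stand-ins for the three dictionaries described above.
# All names and numbers are hypothetical; nothing here calls gPy.
values = {"Rain": ("no", "yes"), "WetGrass": ("no", "yes")}
parents = {"Rain": (), "WetGrass": ("Rain",)}
# Assumption: keyed by variable, then by a tuple of parent values.
cpts = {
    "Rain": {(): [0.8, 0.2]},
    "WetGrass": {("no",): [0.9, 0.1], ("yes",): [0.2, 0.8]},
}


def cond_prob(var, val, parent_vals):
    """Return P(var = val | parents = parent_vals) from the three dictionaries."""
    # Parent values must follow the order given in the second dictionary.
    inst = tuple(parent_vals[p] for p in parents[var])
    # Probabilities follow the order of the values in the first dictionary.
    return cpts[var][inst][values[var].index(val)]


p = cond_prob("WetGrass", "yes", {"Rain": "yes"})
```

The lookup relies only on the two ordering conventions stated above: parent order from the second dictionary and value order from the first.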

read_dimacs(fobj)

Read a graph in DIMACS format

Only vertices appearing in an edge are considered. All vertices are assumed to be integers.

Parameters:
  • fobj (File) - DIMACS file object
Returns: Tuple
Vertices, Edges of the graph
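
The behaviour described above can be sketched as follows. This is not gPy's implementation: parse_dimacs_edges is a hypothetical helper, and it assumes the common DIMACS edge layout in which comment lines start with 'c', the problem line starts with 'p', and each edge is written as 'e u v'. The choice of a sorted list for the vertices is also an assumption.

```python
import io


def parse_dimacs_edges(fobj):
    """Collect vertices and edges from the 'e u v' lines of a DIMACS file."""
    vertices = set()
    edges = []
    for line in fobj:
        fields = line.split()
        # Skip blank lines, comment lines ('c ...') and the problem line ('p ...').
        if not fields or fields[0] != "e":
            continue
        u, v = int(fields[1]), int(fields[2])  # vertices are assumed to be integers
        vertices.update((u, v))
        edges.append((u, v))
    return sorted(vertices), edges


# The problem line declares 4 vertices, but vertex 4 appears in no edge.
sample = io.StringIO("c tiny example\np edge 4 2\ne 1 2\ne 2 3\n")
vertices, edges = parse_dimacs_edges(sample)
```

Because only edge lines contribute vertices, vertex 4 in the sample is absent from the result even though the problem line declares it.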

read_csv(fobj, sep=',')

Construct a list of records from a CSV file

The CSV file must have 3 sections.

Section 1 has lines of the form: variable:value1,value2,... There is one line for each variable

Section 2 is a single line of sep separated variables. If the jth one of these is varname then the jth field of each record is a value for varname.

Section 3 consists of records, one per line with sep separating the fields. Either each record has an extra 'count' field or none do.

No further lines are read after an empty line (so trailing empty lines do not cause an IOError).

Here's part of an acceptable CSV file (where the optional extra count field is present):

A:N,Y
S:0,1
T:0,1
L:0,1
B:0,1
E:0,1
D:0,1
X:0,1
A,S,T,L,B,E,D,X
N,1,0,1,1,1,1,1,12
N,1,0,0,0,0,0,0,66
Parameters:
  • fobj (File) - CSV file object (NOT the file name)
  • sep (String) - The field separator in fobj.
Returns: Tuple
(header, values, variables, records) where:
  1. header is a list of the variables in the data in the order they are given in the original data file. [A,S,T,L,B,E,D,X] in the example above.
  2. values is a dictionary mapping variables to their values, where these values are a list. {'A':['N','Y'],'S':['0','1']...} in the example above.
  3. variables is an ordered list of the variables in the data. [A,B,D,E,L,T,S,X] in the example above.
  4. records is a list of records, each record is a tuple of integers. For each tuple the jth element is the index of the value of the jth variable of variables found in that record. The final integer is a count of how often the record appeared.
Raises:
  • IOError - If any record has the wrong number of fields

To Do: It would be nice to allow this to be an iterator where possible.
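
A minimal sketch of a parser for the three-section layout is given below, to make the shape of the returned tuple concrete. It is not the gPy source. The name parse_sectioned_csv is hypothetical, and several details are assumptions: section 1 lines are recognised by the ':' they contain, their values are split on sep, "ordered" is taken to mean sorted, and a record without a count field is given a count of 1. Unlike the description above, the sketch does not check that either all records or none carry a count.

```python
import io


def parse_sectioned_csv(fobj, sep=","):
    """Parse the three-section layout into (header, values, variables, records)."""
    values = {}
    header = None
    variables = []
    records = []
    for line in fobj:
        line = line.strip()
        if not line:
            break  # no further lines are read after an empty line
        if header is None and ":" in line:
            # Section 1: variable:value1,value2,...
            name, vals = line.split(":", 1)
            values[name] = vals.split(sep)
        elif header is None:
            # Section 2: the single line naming each field.
            header = line.split(sep)
            variables = sorted(header)  # assumption: "ordered" means sorted
        else:
            # Section 3: one record per line, optionally ending in a count.
            fields = line.split(sep)
            if len(fields) == len(header):
                fields.append("1")  # assumption: a missing count means 1
            elif len(fields) != len(header) + 1:
                raise IOError("wrong number of fields: %r" % line)
            by_name = dict(zip(header, fields))  # zip drops the trailing count
            rec = [values[v].index(by_name[v]) for v in variables]
            records.append(tuple(rec) + (int(fields[-1]),))
    return header, values, variables, records


# Hypothetical input: the field order (S,A) differs from the sorted order (A,S).
sample = io.StringIO("A:N,Y\nS:0,1\nS,A\n1,N,12\n0,Y,3\n\nignored\n")
header, values, variables, records = parse_sectioned_csv(sample)
```

In the sample, the fields of each record are reordered to follow variables rather than header, and each value is replaced by its index in the corresponding list in values.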

read_dnet(fobj, get_positions=False)

Return CPTs of a Bayesian net in Netica 'dnet' format

Does not work if inheritance is being used in the dnet file!

Parameters:
  • fobj (File or String) - File or name of file containing the input
Returns: Tuple
(values,named_cpts) where:
  1. values is a dictionary mapping variables to their values, where these values are a list. For example, {'Tuberculosis':['Present','Absent'], ...}
  2. named_cpts is a dictionary mapping each variable to a tuple of (0) its parents and (1) the data for that CPT, where the data is correctly ordered for a factor.

If get_positions is True, then a dictionary mapping variables to grid positions is also obtained.

read_dnet2(fobj, get_positions=False)

Return CPTs of a Bayesian net in Netica 'dnet' format

Does not work if inheritance is being used in the dnet file!

Parameters:
  • fobj (File or String) - File or name of file containing the input
Returns: Dictionary
A dictionary. Each key is a node name; each value is itself a dictionary mapping fields to values. Fields typically include 'states', 'probs', etc.

read_twlib(i)

Fetch and parse a twlib graph from Utrecht

Parameters:
  • i (Int) - Number of the graph
Returns: Tuple
Vertices, Edges of the graph