There is one row in the table for each non-zero value. These are meant
to be used for factors with many variables, but not too many non-zero
values: contingency tables for example. This object is a sensible choice
when most of the information required from the data can be gleaned in a
few passes over it.
At present all databases are stored in RAM.
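As a rough illustration of the storage scheme described above (one row per non-zero value), the sketch below keeps counts in a dictionary keyed by joint instantiation. It is not the class's actual implementation; the function name `count_records` and the sample records are hypothetical.

```python
from collections import Counter


def count_records(records):
    """Map each distinct joint instantiation to the number of times it occurs."""
    return Counter(tuple(record) for record in records)


# Hypothetical records over three variables.
records = [
    ("yes", "male", "smoker"),
    ("no", "female", "nonsmoker"),
    ("yes", "male", "smoker"),
]
counts = count_records(records)

# Only instantiations that actually occur are stored, so iterating over the
# table visits exactly the cells with non-zero counts.
for inst, n in counts.items():
    print(inst, n)
```

With many variables the full contingency table has a number of cells that grows exponentially, while a table like this grows at most linearly with the number of records.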

__init__(self, data=None, variables=(), domain=None, new_domain_variables=None, must_be_new=False, check=False, convert=False)
Initialise a Data object.

Iterator
__iter__(self)
Iterates over those joint instantiations of the data which have non-zero counts associated with them.

Float
Data object
conditional_entropy(self, x, y)
Return the conditional entropy H(x|y) for variable sets x and y using the empirical distribution given by the data.

entropy(self, x)
Return the entropy of the marginal empirical distribution given by x and the data.

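The two entropy quantities above can be sketched from a table of counts as follows. This is a minimal illustration, not the class's own code: the data are assumed to be a dictionary from instantiation tuples to counts, variables are identified by tuple position, and natural logarithms are used (the documentation does not state the base). All function names here are hypothetical.

```python
import math
from collections import Counter


def marginal_counts(counts, positions):
    """Sum counts over all variables except those at `positions`."""
    marginal = Counter()
    for inst, n in counts.items():
        marginal[tuple(inst[i] for i in positions)] += n
    return marginal


def entropy(counts, x):
    """Entropy (in nats) of the empirical marginal distribution over x."""
    marginal = marginal_counts(counts, x)
    total = sum(marginal.values())
    return -sum((n / total) * math.log(n / total) for n in marginal.values())


def conditional_entropy(counts, x, y):
    """H(x|y), computed via the chain rule H(x|y) = H(x, y) - H(y)."""
    return entropy(counts, tuple(x) + tuple(y)) - entropy(counts, y)


# Two binary variables that always agree: knowing one determines the other.
counts = {(0, 0): 2, (1, 1): 2}
print(entropy(counts, (0,)))
print(conditional_entropy(counts, (0,), (1,)))
```

Only non-zero counts are visited, so the usual convention 0 log 0 = 0 never has to be handled explicitly.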
_bic_search2(self, child, n, child_df, old_parents, further_parents, store, pa_lim, highest_llh)

_bic_search(self, child, n, child_df, lower, bic_lower, upper, upper_bound, store)
Compute the BIC score for every parent set for child which is a proper superset of lower and a subset of the union of lower and upper, and add it to the dictionary store.

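For orientation, the BIC score of a single parent set can be sketched as below. This assumes the standard definition (maximised log-likelihood minus half the number of free parameters times the log of the sample size); the documentation does not spell out the exact formula used by the class, and the function name, argument names, and position-based variable indexing are all hypothetical.

```python
import math
from collections import Counter


def bic_family_score(counts, child, parents, child_arity, parent_arities):
    """BIC score of `child` given `parents`, from a dict of instantiation counts."""
    n_j = Counter()   # counts per parent configuration
    n_jk = Counter()  # counts per (parent configuration, child value)
    for inst, n in counts.items():
        pa = tuple(inst[i] for i in parents)
        n_j[pa] += n
        n_jk[(pa, inst[child])] += n
    total = sum(n_j.values())
    # Maximised log-likelihood: sum of N_jk * log(N_jk / N_j) over non-zero cells.
    loglik = sum(n * math.log(n / n_j[pa]) for (pa, _), n in n_jk.items())
    # Free parameters: (child arity - 1) for each parent configuration.
    free_params = math.prod(parent_arities) * (child_arity - 1)
    return loglik - 0.5 * math.log(total) * free_params


# Variable 1 is a copy of variable 0, so adding the parent pays for itself.
counts = {(0, 0): 2, (1, 1): 2}
with_parent = bic_family_score(counts, 1, (0,), 2, (2,))
without_parent = bic_family_score(counts, 1, (), 2, ())
print(with_parent, without_parent)
```

Because the penalty term grows with the number of parent configurations, a superset of a parent set can only score higher if its likelihood gain outweighs the extra penalty, which is what makes bounding arguments over parent sets possible.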
bic_search(self, child, pa_lim)
Branch and bound search for all parent sets for child which do not have a higher scoring subset.

Float
mutual_information(self, x, y)
Return the mutual information between the variable sets x and y in the empirical distribution determined by the data.

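A possible way to compute this quantity from sparse counts is sketched below. It assumes the data are a dictionary from instantiation tuples to counts, with variables identified by tuple position, and uses natural logarithms; the function is a hypothetical stand-alone helper, not the method itself.

```python
import math
from collections import Counter


def mutual_information(counts, x, y):
    """I(x; y) in nats under the empirical distribution given by `counts`."""
    total = sum(counts.values())
    cx, cy, cxy = Counter(), Counter(), Counter()
    for inst, n in counts.items():
        vx = tuple(inst[i] for i in x)
        vy = tuple(inst[i] for i in y)
        cx[vx] += n
        cy[vy] += n
        cxy[(vx, vy)] += n
    # p(x, y) / (p(x) p(y)) simplifies to N_xy * N / (N_x * N_y).
    return sum(
        (n / total) * math.log(n * total / (cx[vx] * cy[vy]))
        for (vx, vy), n in cxy.items()
    )


dependent = {(0, 0): 2, (1, 1): 2}
independent = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1}
print(mutual_information(dependent, (0,), (1,)))
print(mutual_information(independent, (0,), (1,)))
```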
The number of datapoints in the data
populate(self, records, variables=None)
Simply inserts the records into the database.

ub(self, qpa, alpha, ri)
The upper bound self provides on the score of a smaller parent set where

qh(self, h=0)
Return the number of instantiations (i.e. cells) having a value greater than h.

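On a table of counts, this amounts to a single pass over the stored values. The helper below is a hypothetical stand-in that assumes the counts are held in a dictionary.

```python
def cells_above(counts, h=0):
    """Number of cells whose count is strictly greater than h."""
    return sum(1 for n in counts.values() if n > h)


counts = {("a",): 3, ("b",): 1, ("c",): 5}
print(cells_above(counts))     # all three cells are non-zero
print(cells_above(counts, 1))  # cells with a count of at least 2
```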
make_family_scores_naively(self, pa_size_lim=4, precision=10.0, batch_size=65536)
Make scores for all parent sets for all variables where (1) the size of the parent set is at most pa_size_lim.

List
makeFactorsn(self, n, block=1000000)
Yield counts for all marginals with n variables in blocks of block.

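One plausible reading of this, sketched below, is a generator that enumerates every size-n subset of the variables and yields their marginal count tables a block at a time, so that only one block is held in memory at once. The unit of a block (here, a number of marginals) is an assumption, as are the function name and the position-based variable indexing.

```python
from collections import Counter
from itertools import combinations, islice


def marginals_in_blocks(counts, num_vars, n, block=1000000):
    """Yield lists of (positions, marginal_counts), at most `block` per list."""
    subsets = combinations(range(num_vars), n)
    while True:
        chunk = list(islice(subsets, block))
        if not chunk:
            return
        result = []
        for positions in chunk:
            marginal = Counter()
            for inst, c in counts.items():
                marginal[tuple(inst[i] for i in positions)] += c
            result.append((positions, marginal))
        yield result


# Three variables, so there are three two-variable marginals.
counts = {(0, 0, 1): 2, (1, 0, 1): 1}
for blk in marginals_in_blocks(counts, num_vars=3, n=2, block=2):
    print([positions for positions, _ in blk])
```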
_countsfromdata(self, count, mults, marginals_including)

List
family_score(self, child, parents, precision=1.0)

score_adg(self, adg, precision=1.0)
Get the BDeu score for an adg.

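The standard BDeu score decomposes into one term per family (a child and its parents), so a graph score can be sketched as a sum of family scores. The code below follows the textbook log-gamma formula and assumes that `precision` plays the role of the equivalent sample size, called `ess` here; that correspondence, the function names, and the graph representation (a dictionary from child position to a tuple of parent positions) are assumptions, not the class's documented behaviour.

```python
import math
from collections import Counter


def bdeu_family_score(counts, child, parents, arities, ess=1.0):
    """Log BDeu score of `child` given `parents` from a dict of counts."""
    q = math.prod(arities[p] for p in parents)  # number of parent configurations
    a_j = ess / q                               # prior mass per parent configuration
    a_jk = ess / (q * arities[child])           # prior mass per cell
    n_j, n_jk = Counter(), Counter()
    for inst, n in counts.items():
        pa = tuple(inst[p] for p in parents)
        n_j[pa] += n
        n_jk[(pa, inst[child])] += n
    # Unobserved configurations and cells contribute zero, so only
    # the non-zero counts need to be visited.
    score = sum(math.lgamma(a_j) - math.lgamma(a_j + n) for n in n_j.values())
    score += sum(math.lgamma(a_jk + n) - math.lgamma(a_jk) for n in n_jk.values())
    return score


def bdeu_graph_score(counts, graph, arities, ess=1.0):
    """Sum of family scores; `graph` maps each child to a tuple of parents."""
    return sum(
        bdeu_family_score(counts, child, parents, arities, ess)
        for child, parents in graph.items()
    )


# One binary variable observed once in each state, no parents.
single = bdeu_family_score({(0,): 1, (1,): 1}, 0, (), (2,), ess=1.0)
print(math.exp(single))
```

Working in log space with `math.lgamma` avoids overflow in the gamma function for realistic sample sizes.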
h_scores(self, precision=1.0, textfun=<type 'str'>)

Factor object
Factor object

Inherited from Variables.SubDomain: __add__, __div__, __iadd__, __idiv__, __imul__, __isub__, __mul__, __rdiv__, __repr__, __rmul__, __str__, __sub__, copy, drop_variable, drop_variables, inst2index, insts, insts_indices, marginalise_onto, sumout, table_size, uses_default_domain, variables, varvalues

Inherited from Variables.Domain: add_domain_variable, add_domain_variables, add_domain_variables_from_rawdata, change_domain_variable, change_domain_variables, common_domain, known_variable, numvals, values

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__