vfbn1 File Reference
Detailed Description
Learns the structure of a BeliefNet from a very large data set using sampling.
Learns the structure and parameters of a Bayesian network, accelerated with sampling as described in this paper. All variables must be categorical. vfbn1, like other Bayesian network learning programs, searches for high scoring Bayesian network structures by considering adding, removing, and reversing every possible edge, making the one that has highest score on training data, and repeating until no change improves the score. Unlike other learners, vfbn1 uses statistical tests and only uses enough data to be sure that it knows which change is best with high confidence (see the -delta parameter below). This allows vfbn1 to be much faster than traditional methods when there is enough data to make good decisions. It also allows it to learn from data streams (see the -stdin flag below). VFML also includes the vfbn2 algorithm which changes the search procedure used so that it can be faster and more scalable.
vfbn1 takes input and does output in c4.5 format. It expects to find the files <stem>.names
and <stem>.data
.
- Wish List:
- An API to this learner like the one to learning BeliefNet structure in beliefnet-engine.h
Arguments
- -f 'filestem'
- Set the name of the dataset (default DF)
- -source 'dir'
- Set the source data directory (default '.')
- -startFrom 'filename'
- Use net in 'filename' as starting point, must be BIF file (default start from empty net)
- -outputTo 'filename'
- Output the learned net to 'filename' in BIF format
- -delta 'prob'
- Allowed chance of error in each decision (default 0.00001 that's .001 percent)
- -tau 'tie error'
- Call a tie when score might change less than tau percent. (default 0.001)
- -chunk 'count'
- Accumulate 'count' examples before testing for a winning search step(default 10000)
- -limitMegs 'count'
- Limit dynamic memory allocation to 'count' megabytes, don't consider networks that take too much space (default no limit)
- -limitMinutes 'count'
- Limit the run to 'count' minutes then return current model (default no limit)
- -normal
- Use normal bound (default Hoeffding)
- -stdin
- Reads training examples from stdin instead of from 'stem'.data causes a 2 second delay to help give input time to setup (default off)
- -noReverse
- Doesn't reverse links to make nets for next search step (default reverse links)
- -parametersOnly
- Only estimate parameters for current structure, no other learning
- -seed 's'
- Seed for random numbers (default random)
- -maxSearchSteps 'num'
- Limit to 'num' search steps (default no max).
- -maxParentsPerNode 'num'
- Limit each node to 'num' parents (default no max).
- -maxParameterGrowthMult 'mult'
- Limit net to 'mult' times starting # of parameters (default no max).
- -maxParameterCount 'count'
- Limit net to 'count' parameters (default no max).
- -kappa 'kappa'
- The structure prior penalty for batch (0 - 1), 1 is no penalty (default 0.5)
- -structureTie
- Use the structural prior penalty in ties (default don't)
- -batch
- Run in batch mode, repeatedly scan disk, don't do hoeffding bounds (default off).
- -v
- Can be used multiple times to increase the debugging output.
- -h
- Run vfbn1 -h for a list of the arguments and their meanings.
Generated for VFML by
hosted by