When you implement a learner you need to worry about more than the learning: you need to interface with your environment, deal with command line arguments, locate and load files, and interface with other tools. All of these things are important, but they aren't particularly interesting. This page describes the main issues to consider when working in the VFML environment. The implement-learner example presents a framework that implements solutions for these issues.
Example For: The framework that every learner will need
This is a simple example that presents the simplest possible
learning algorithm - one that always predicts the most-common-class in
the training set. This code is very similar to the
implementation of VFML's mostcommonclass learner. It includes a
makefile and a source file which are located in the
<VFML-root>/examples/implement-learner/
directory. This document presents an overview of the code which
should be sufficient to get you started modifying it for your own
needs.
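To make the idea concrete, here is a minimal sketch of a most-common-class 'model' in plain C. It is not the code from implement-learner.c and it does not use the VFML API: the function names MostCommonClass and ErrorRate, and the assumption that class labels are integers in the range 0..numClasses-1, are invented for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of VFML): return the label that occurs most
 * often in labels[0..numExamples-1]. Labels are assumed to lie in
 * 0..numClasses-1. Ties go to the smallest label. Returns -1 if the
 * counting table cannot be allocated. */
int MostCommonClass(const int *labels, size_t numExamples, int numClasses) {
    long *counts;
    int best = 0;
    int c;
    size_t i;

    assert(numClasses > 0);
    counts = calloc((size_t)numClasses, sizeof *counts);
    if (counts == NULL) {
        return -1;
    }

    /* One pass over the training labels to count each class. */
    for (i = 0; i < numExamples; i++) {
        assert(labels[i] >= 0 && labels[i] < numClasses);
        counts[labels[i]]++;
    }

    /* Pick the class with the largest count. */
    for (c = 1; c < numClasses; c++) {
        if (counts[c] > counts[best]) {
            best = c;
        }
    }

    free(counts);
    return best;
}

/* Hypothetical helper (not part of VFML): fraction of labels that differ
 * from the single class the model always predicts. */
double ErrorRate(const int *labels, size_t numExamples, int predicted) {
    size_t i;
    size_t wrong = 0;

    if (numExamples == 0) {
        return 0.0;
    }
    for (i = 0; i < numExamples; i++) {
        if (labels[i] != predicted) {
            wrong++;
        }
    }
    return (double)wrong / (double)numExamples;
}
```

In a real learner the labels would come from the examples read out of the dataset rather than from an int array, and the test results would be printed in whatever format the surrounding tools expect; neither of those details is shown here.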
You might like to go to the <VFML-root>/examples/implement-learner/ directory
and get your favorite code/text editor ready. You might also
like to copy the directory somewhere and begin modifying the example
for your own needs.
This makefile will be a good starting point for your VFML projects. Glance at the makefile; the top couple of lines contain information you would need to update if you want to use the file with another project.
Make sure you've properly installed the VFML library (see the Getting Started section if you
haven't done this yet), then type 'make' to build the example
program. Run it by typing implement-learner -h, and
look at the output.
We've provided a starter project for Windows using VC++ 6.0. It is configured to work if you've installed the VFML library into c:/proj/uwml/. If you installed it somewhere else, see the Getting Started section for more information on how to update the configuration.
The Windows version also uses a different source file, implement-learner-windows.c. The only difference between this file and implement-learner.c is that it doesn't do any timing.
This will be a high-level overview of the code from the example; it should be enough to get you started. For a more detailed description of a VFML project see the loading data documentation.
_printUsage and _processArgs work together to get a valid command line and set a collection of globals from it. The example shows you how to accept flags, strings, ints, and doubles from the command line. Be careful not to have any arguments that are sub-strings of other arguments: if you don't get the ordering correct, the strcmp might accept the longer argument as an instance of the shorter one.
The program takes the -source <directory> and -f <file-stem> arguments and uses them to find a dataset. It then reads the names file into a global, gEs, and iterates over the examples from the .data file.
If you pass the -u argument to the program, it will test its 'model' on the examples in the .test file and output the results in a format appropriate for interfacing with xvalidate and batchtest.
If you pass the -v argument, the program will output more information about its progress. In your learner, you might want to implement higher message levels (in response to multiple -v flags on the command line) to print out more detailed information about your learner's progress. See VFML's Debugging API for some code that may help with this.
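The argument handling described above can be sketched roughly as follows. This is an illustration in plain C, not the example's actual _processArgs: the Settings struct and ParseArgs function are hypothetical (the example itself stores its settings in globals), and the default source directory of "." and the requirement that -f be present are assumptions. The sketch compares each argument with an exact strcmp match, so the order of the tests does not matter here; if you switch to a prefix comparison such as strncmp, test the longer flags first.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical settings record; the names below are illustrative only. */
typedef struct {
    const char *sourceDir;   /* value of -source <directory> */
    const char *fileStem;    /* value of -f <file-stem> */
    int doTests;             /* set by -u */
    int messageLevel;        /* raised by one for every -v */
} Settings;

/* Fill *s from the command line. Returns 1 if the command line is usable and
 * 0 if it is not (unknown flag, missing value, or no -f), in which case the
 * caller would print its usage message and exit. */
int ParseArgs(int argc, char *argv[], Settings *s) {
    int i;

    assert(s != NULL);
    s->sourceDir = ".";      /* assumed default: current directory */
    s->fileStem = NULL;
    s->doTests = 0;
    s->messageLevel = 0;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-source") == 0) {
            if (++i >= argc) {
                return 0;    /* flag given without its directory */
            }
            s->sourceDir = argv[i];
        } else if (strcmp(argv[i], "-f") == 0) {
            if (++i >= argc) {
                return 0;    /* flag given without its file stem */
            }
            s->fileStem = argv[i];
        } else if (strcmp(argv[i], "-u") == 0) {
            s->doTests = 1;
        } else if (strcmp(argv[i], "-v") == 0) {
            s->messageLevel++;   /* repeated -v flags raise the level */
        } else {
            return 0;        /* anything else, including -h, asks for usage */
        }
    }
    return s->fileStem != NULL;
}
```

Counting -v flags into a message level keeps the rest of the program simple: progress output can be guarded with a check such as messageLevel >= 2 without caring how the level was set.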