educe.pdtb.util package¶
Submodules¶
educe.pdtb.util.args module¶
Command line options
-
educe.pdtb.util.args.add_usual_input_args(parser)¶ Augment a subcommand argparser with typical input arguments. If your subcommand needs slightly different input arguments, just don’t call this function.
-
educe.pdtb.util.args.add_usual_output_args(parser)¶ Augment a subcommand argparser with typical output arguments. If your subcommand needs slightly different output arguments, just don’t call this function.
-
educe.pdtb.util.args.announce_output_dir(output_dir)¶ Tell the user where we saved the output
-
educe.pdtb.util.args.get_output_dir(args)¶ Return the output directory specified on (or inferred from) the command line arguments, creating it if necessary.
We try the following in order:
- If --output is given explicitly, we use that directory, creating it if needed.
- Otherwise, we make a temporary directory.
Later on, you’ll probably want to call announce_output_dir.
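The fallback order above can be sketched with the standard library alone. This is a minimal illustration, not the educe implementation: the `*_sketch` function names, the `DIR` metavar, the help text, and the message wording are all assumptions made for the example.

```python
import argparse
import os
import tempfile


def add_output_args_sketch(parser):
    """Hypothetical stand-in: add an optional --output directory flag."""
    parser.add_argument("--output", metavar="DIR", default=None,
                        help="output directory (default: a temporary directory)")


def get_output_dir_sketch(args):
    """Use --output if given (creating it if needed), else a temp directory."""
    if args.output:
        os.makedirs(args.output, exist_ok=True)
        return args.output
    # No explicit directory: fall back to a fresh temporary one.
    return tempfile.mkdtemp()


def announce_output_dir_sketch(output_dir):
    """Tell the user where the output was saved."""
    print("Output files written to", output_dir)


parser = argparse.ArgumentParser()
add_output_args_sketch(parser)
args = parser.parse_args([])  # no --output given, so a temp dir is used
output_dir = get_output_dir_sketch(args)
announce_output_dir_sketch(output_dir)
```

Announcing the directory matters mostly in the fallback case, since the user has no other way of knowing which temporary directory was chosen.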
-
educe.pdtb.util.args.mk_output_path(odir, k)¶ Path stub (needs extension) given an output directory and a PDTB corpus key
-
educe.pdtb.util.args.read_corpus(args, verbose=True)¶ Read the section of the corpus specified in the command line arguments.
educe.pdtb.util.features module¶
Feature extraction library functions for the PDTB corpus
-
class educe.pdtb.util.features.DocumentPlus(key, doc)¶ Bases: tuple
-
doc¶ Alias for field number 1
-
key¶ Alias for field number 0
-
-
class educe.pdtb.util.features.FeatureInput(corpus, debug)¶ Bases: tuple
-
corpus¶ Alias for field number 0
-
debug¶ Alias for field number 1
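The "alias for field number" wording indicates that DocumentPlus and FeatureInput behave like named tuples: each field can be read by name or by position. The sketch below rebuilds equivalent stand-ins with `collections.namedtuple` rather than importing them from educe; the key `"wsj_0001"` and the list used as the document are made-up placeholder values.

```python
from collections import namedtuple

# Stand-ins mirroring the documented field order; not imported from educe.
DocumentPlus = namedtuple("DocumentPlus", ["key", "doc"])
FeatureInput = namedtuple("FeatureInput", ["corpus", "debug"])

# Placeholder values: a real key and document would come from the corpus reader.
dplus = DocumentPlus(key="wsj_0001", doc=["relation-1", "relation-2"])

# Field names are aliases for tuple positions.
assert dplus.key == dplus[0]
assert dplus.doc == dplus[1]

# Being tuples, the bundles also unpack positionally.
key, doc = dplus

inputs = FeatureInput(corpus={key: doc}, debug=False)
assert inputs.corpus == inputs[0]
assert inputs.debug == inputs[1]
```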
-
-
class educe.pdtb.util.features.RelKeys(inputs)¶ Bases: educe.learning.keys.MergedKeyGroup
Features for relations
-
fill(current, rel, target=None)¶ See RelSubgroup.fill
-
-
class educe.pdtb.util.features.RelSubGroup_Core¶ Bases: educe.pdtb.util.features.RelSubgroup
Core features
-
fill(current, rel, target=None)¶
-
-
class educe.pdtb.util.features.RelSubgroup(description, keys)¶ Bases: educe.learning.keys.KeyGroup
Abstract keygroup for subgroups of the merged RelKeys. These subgroup classes provide modularity: the code that defines a set of related feature vector keys lives alongside the code that fills them out.
-
fill(current, rel, target=None)¶ Fill out a vector’s features. If target is None, we just fill out this group; in the case of a merged key group, you may find it desirable to fill out the merged group instead.
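The subgroup idea can be illustrated without educe at all. In the sketch below, every class is a hypothetical stand-in (the real KeyGroup and MergedKeyGroup in educe.learning.keys may be structured differently), and the relation record and feature names are invented. It only shows the pattern: each subgroup declares its keys and fills them, and the merged group passes itself as the target so that all values land in one vector.

```python
class KeyGroupSketch(dict):
    """Hypothetical stand-in for a key group: feature name -> value."""

    def __init__(self, description, keys):
        super().__init__((k, None) for k in keys)
        self.description = description


class CoreSubgroupSketch(KeyGroupSketch):
    """Declares its keys and fills them out in the same class."""

    def __init__(self):
        super().__init__("core features", ["rel_type", "num_args"])

    def fill(self, current, rel, target=None):
        # With no target, fill this group; otherwise fill the given vector.
        vec = self if target is None else target
        vec["rel_type"] = rel["type"]
        vec["num_args"] = len(rel["args"])


class MergedKeysSketch(KeyGroupSketch):
    """Merges the keys of several subgroups into one flat vector."""

    def __init__(self, subgroups):
        keys = [k for group in subgroups for k in group]
        super().__init__("merged features", keys)
        self.subgroups = subgroups

    def fill(self, current, rel, target=None):
        vec = self if target is None else target
        for group in self.subgroups:
            # Each subgroup writes into the merged vector, not into itself.
            group.fill(current, rel, vec)


rel = {"type": "Explicit", "args": ["arg1", "arg2"]}  # made-up relation record
merged = MergedKeysSketch([CoreSubgroupSketch()])
merged.fill(current=None, rel=rel)
print(dict(merged))  # {'rel_type': 'Explicit', 'num_args': 2}
```

Adding a new family of features then means adding one subgroup class, without touching the code for the existing ones.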
-
-
class educe.pdtb.util.features.SingleArgKeys(inputs)¶ Bases: educe.learning.keys.MergedKeyGroup
Features for a single EDU
-
fill(current, arg, target=None)¶ See SingleArgSubgroup.fill
-
-
class educe.pdtb.util.features.SingleArgSubgroup(description, keys)¶ Bases: educe.learning.keys.KeyGroup
Abstract keygroup for subgroups of the merged SingleArgKeys. These subgroup classes provide modularity: the code that defines a set of related feature vector keys lives alongside the code that fills them out.
-
fill(current, arg, target=None)¶ Fill out a vector’s features. If target is None, we just fill out this group; in the case of a merged key group, you may find it desirable to fill out the merged group instead.
-
-
educe.pdtb.util.features.extract_rel_features(inputs)¶ Return a pair of dictionaries, one for attachments and one for relations
-
educe.pdtb.util.features.mk_current(inputs, k)¶ Pre-process and bundle up a representation of the current document
-
educe.pdtb.util.features.spans_to_str(spans)¶ String representation of a list of spans, meant to work as an id
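As a rough illustration of what a span-based id might look like, the sketch below joins character offsets into a single string. The separator format, the function name, and the assumption that each span is a `(start, end)` pair of integers are all hypothetical; the actual output format of spans_to_str is not documented here.

```python
def spans_to_str_sketch(spans):
    """Join (start, end) pairs into a compact, deterministic identifier."""
    return "__".join("%d_%d" % (start, end) for start, end in spans)


print(spans_to_str_sketch([(0, 12), (20, 35)]))  # 0_12__20_35
```

Because the string depends only on the offsets and their order, the same list of spans always yields the same id.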