org.apache.uima.examples
Class XmlDetagger

java.lang.Object
  extended by org.apache.uima.analysis_component.AnalysisComponent_ImplBase
      extended by org.apache.uima.analysis_component.Annotator_ImplBase
          extended by org.apache.uima.analysis_component.CasAnnotator_ImplBase
              extended by org.apache.uima.examples.XmlDetagger
All Implemented Interfaces:
AnalysisComponent

public class XmlDetagger
extends CasAnnotator_ImplBase

A multi-sofa annotator that does XML detagging. Reads XML data from the input Sofa (named "xmlDocument"); this data can be stored in the CAS as a string or array, or it can be a URI to a remote file. The XML is parsed using the JVM's default parser, and the plain-text content is written to a new sofa called "plainTextDocument".
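
The detagging step described above can be illustrated without any UIMA classes. The following is a minimal sketch, not the actual XmlDetagger source: it assumes the XML arrives as a String, and the class name DetagSketch and the method detag are hypothetical. It uses the JVM's default SAX parser and collects character data, which is the general technique the description refers to.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; it is not part of UIMA.
class DetagSketch {

    // Parses the XML with the JVM's default SAX parser and returns only its text content.
    static String detag(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Character data between tags is the "plain text" of the document.
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(detag("<doc><title>Hello</title> <body>world</body></doc>"));
    }
}
```

In the real annotator the resulting text would be set as the data of the "plainTextDocument" sofa rather than returned to the caller.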


Field Summary
static java.lang.String PARAM_XMLTAG
          Name of optional configuration parameter that contains the name of an XML tag that appears in the input file.
 
Constructor Summary
XmlDetagger()
           
 
Method Summary
 void initialize(UimaContext aContext)
          Performs any startup tasks required by this component.
 void process(CAS aCAS)
          Inputs a CAS to the AnalysisComponent.
 void typeSystemInit(TypeSystem aTypeSystem)
          Informs this annotator that the CAS TypeSystem has changed.
 
Methods inherited from class org.apache.uima.analysis_component.CasAnnotator_ImplBase
getRequiredCasInterface, process
 
Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase
getCasInstancesRequired, hasNext, next
 
Methods inherited from class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
batchProcessComplete, collectionProcessComplete, destroy, getContext, getResultSpecification, reconfigure, setResultSpecification
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARAM_XMLTAG

public static final java.lang.String PARAM_XMLTAG
Name of optional configuration parameter that contains the name of an XML tag that appears in the input file. Only text that falls within this XML tag will be considered part of the "document" that is added to the CAS by this annotator. If not specified, the entire file will be considered the document.

See Also:
Constant Field Values
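
The effect of this parameter can be sketched with plain JDK classes. This is an illustration under stated assumptions, not the annotator's actual code: TagFilterSketch and textWithin are hypothetical names, and the depth counter used to handle nested occurrences of the same tag is a design choice of the sketch that the source does not confirm. Passing null for the tag name stands in for the parameter being unspecified.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; it is not part of UIMA.
class TagFilterSketch {

    // Returns the text inside elements named tagName, or all text when tagName is null.
    static String textWithin(String xml, String tagName) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0; // greater than 0 while inside the chosen tag

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals(tagName)) {
                    depth++;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals(tagName)) {
                    depth--;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Keep everything when no tag was configured, otherwise only text inside it.
                if (tagName == null || depth > 0) {
                    text.append(ch, start, length);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
        return text.toString();
    }
}
```

For example, with the input `<doc><meta>skip</meta><TEXT>keep <b>this</b></TEXT></doc>` and the tag name `TEXT`, only the text inside the TEXT element is retained; with a null tag name, the text of the whole file is retained.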
Constructor Detail

XmlDetagger

public XmlDetagger()
Method Detail

initialize

public void initialize(UimaContext aContext)
                throws ResourceInitializationException
Description copied from interface: AnalysisComponent
Performs any startup tasks required by this component. The framework calls this method only once, just after the AnalysisComponent has been instantiated.

The framework supplies this AnalysisComponent with a reference to the UimaContext that it will use, for example to access configuration settings or resources. This AnalysisComponent should store a reference to its UimaContext for later use.

Specified by:
initialize in interface AnalysisComponent
Overrides:
initialize in class AnalysisComponent_ImplBase
Parameters:
aContext - Provides access to services and resources managed by the framework. This includes configuration parameters, logging, and access to external resources.
Throws:
ResourceInitializationException - if this AnalysisComponent cannot initialize successfully.

typeSystemInit

public void typeSystemInit(TypeSystem aTypeSystem)
                    throws AnalysisEngineProcessException
Description copied from class: CasAnnotator_ImplBase
Informs this annotator that the CAS TypeSystem has changed. The Analysis Engine calls this method immediately following the call to AnalysisComponent_ImplBase.initialize(org.apache.uima.UimaContext), and will call it again whenever the CAS TypeSystem changes.

In this method, the Annotator should use the TypeSystem to resolve the names of Type and Features to the actual Type and Feature objects, which can then be used during processing.

Overrides:
typeSystemInit in class CasAnnotator_ImplBase
Throws:
AnalysisEngineProcessException - if the provided type system is missing types or features required by this annotator

process

public void process(CAS aCAS)
             throws AnalysisEngineProcessException
Description copied from class: CasAnnotator_ImplBase
Inputs a CAS to the AnalysisComponent. This method should be overridden by subclasses to perform analysis of the CAS.

Specified by:
process in class CasAnnotator_ImplBase
Parameters:
aCAS - A CAS that this AnalysisComponent should process.
Throws:
AnalysisEngineProcessException - if a problem occurs during processing


Copyright © 2012. All Rights Reserved.