public class HeaderAnnotator
extends org.apache.uima.fit.component.JCasAnnotator_ImplBase
Modifier and Type | Field and Description |
---|---|
static double | KEYWORDWEIGHT: Determines how the occurrence of a HeaderKeyword is weighted relative to the font size. |
static double | LENGTHWEIGHT: Determines how the block length is weighted relative to the font size. |
static double | SHARPNESS: A constant factor applied to the scalar product inside the logistic regression. |
static int | SWITCHLENGTH: A "normal" header is expected to have fewer characters than this number. |
static double | THRESHOLD: Everything whose confidence, according to the logistic regression (see below), is higher than THRESHOLD is regarded as a header. |
Constructor and Description |
---|
HeaderAnnotator() |
Modifier and Type | Method and Description |
---|---|
void | process(org.apache.uima.jcas.JCas jcas) |
getLogger, initialize
getRequiredCasInterface, process
getCasInstancesRequired, hasNext, next
public static final double LENGTHWEIGHT
public static final double KEYWORDWEIGHT
public static final int SWITCHLENGTH
public static final double THRESHOLD
public static final double SHARPNESS
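
The field descriptions name the ingredients of the classifier (font size, keyword occurrence, block length, a sharpness factor, and a threshold) but not how they are combined. The following is a minimal sketch of one plausible combination, not the actual implementation of HeaderAnnotator. Everything in it is an assumption made for illustration: the constant values are placeholders (the documentation does not state them), and the class name `HeaderConfidenceSketch`, the methods `confidence` and `isHeader`, and the exact feature definitions are hypothetical.

```java
public class HeaderConfidenceSketch {
    // Placeholder values only: the documentation does not give the real constants.
    static final double LENGTHWEIGHT = 1.0;
    static final double KEYWORDWEIGHT = 1.0;
    static final int SWITCHLENGTH = 100;
    static final double THRESHOLD = 0.5;
    static final double SHARPNESS = 1.0;

    /**
     * Hypothetical confidence that a text block is a header.
     *
     * @param relativeFontSize block font size divided by the body font size (assumed feature)
     * @param hasKeyword       whether the block contains a header keyword (assumed feature)
     * @param blockLength      number of characters in the block
     */
    static double confidence(double relativeFontSize, boolean hasKeyword, int blockLength) {
        // Font size enters with weight 1; the other weights are relative to it.
        double fontFeature = relativeFontSize - 1.0;
        double keywordFeature = hasKeyword ? 1.0 : 0.0;
        // Positive for blocks shorter than SWITCHLENGTH, negative for longer ones.
        double lengthFeature = (SWITCHLENGTH - blockLength) / (double) SWITCHLENGTH;

        // Scalar product of the weights and the features.
        double score = fontFeature
                + KEYWORDWEIGHT * keywordFeature
                + LENGTHWEIGHT * lengthFeature;

        // Logistic function; SHARPNESS scales the scalar product.
        return 1.0 / (1.0 + Math.exp(-SHARPNESS * score));
    }

    /** A block counts as a header if its confidence exceeds THRESHOLD. */
    static boolean isHeader(double relativeFontSize, boolean hasKeyword, int blockLength) {
        return confidence(relativeFontSize, hasKeyword, blockLength) > THRESHOLD;
    }

    public static void main(String[] args) {
        // Short, large-font block containing a keyword.
        System.out.println(isHeader(1.5, true, 20));
        // Long, small-font block without a keyword.
        System.out.println(isHeader(0.9, false, 400));
    }
}
```

Under these assumptions, a larger SHARPNESS pushes confidences toward 0 or 1, and SWITCHLENGTH marks the block length at which the length feature changes sign.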
public void process(org.apache.uima.jcas.JCas jcas) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Specified by: process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws: org.apache.uima.analysis_engine.AnalysisEngineProcessException
Copyright (C) 2013, 2014 Raphael Dickfelder, Jan Göpfert, Benjamin Paassen, Andreas Stöckel, licensed under the AGPL v. 3: http://openresearch.cit-ec.de/projects/scie