java.lang.Object
  org.apache.nutch.parse.ms.MSExtractor
public abstract class MSExtractor
Defines a Microsoft document content extractor.
| Field Summary | |
|---|---|
| protected static org.apache.commons.logging.Log | LOG |
| Constructor Summary | |
|---|---|
| protected | MSExtractor(): Constructs a new Microsoft document extractor. |
| Method Summary | |
|---|---|
| protected void | extract(InputStream input): Extracts properties and text from an MS Document input stream. |
| protected abstract String | extractText(InputStream input): Extracts the text content from a Microsoft document input stream. |
| protected Properties | getProperties(): Get the Properties of the Microsoft document. |
| protected String | getText(): Get the content text of the Microsoft document. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected static final org.apache.commons.logging.Log LOG
| Constructor Detail |
|---|
protected MSExtractor()
| Method Detail |
|---|
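protected void extract(InputStream input) throws Exception
Extracts properties and text from an MS Document input stream.
Throws: Exception

protected abstract String extractText(InputStream input) throws Exception
Extracts the text content from a Microsoft document input stream.
Throws: Exception

protected String getText()
Get the content text of the Microsoft document.

protected Properties getProperties()
Get the Properties of the Microsoft document.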
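
The way extract, extractText, getText, and getProperties fit together can be illustrated with a small stand-in. This is only a sketch: SketchExtractor and PlainTextExtractor are hypothetical classes that copy the documented signatures, and the body of extract (delegating to extractText and recording a placeholder charCount property) is an assumption for illustration, not the Nutch implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical stand-in that mirrors the documented MSExtractor signatures.
abstract class SketchExtractor {
    private String text;
    private final Properties properties = new Properties();

    // Assumption: extract() delegates to the subclass and caches the results.
    protected void extract(InputStream input) throws Exception {
        text = extractText(input);
        // Placeholder property; the key "charCount" is invented for this sketch.
        properties.setProperty("charCount", String.valueOf(text.length()));
    }

    // Each concrete extractor supplies its own format-specific text extraction.
    protected abstract String extractText(InputStream input) throws Exception;

    protected String getText() {
        return text;
    }

    protected Properties getProperties() {
        return properties;
    }
}

// Hypothetical subclass that treats the stream as UTF-8 text.
class PlainTextExtractor extends SketchExtractor {
    @Override
    protected String extractText(InputStream input) throws Exception {
        return new String(input.readAllBytes(), StandardCharsets.UTF_8).trim();
    }
}

class ExtractorDemo {
    public static void main(String[] args) throws Exception {
        PlainTextExtractor extractor = new PlainTextExtractor();
        byte[] bytes = "  hello  ".getBytes(StandardCharsets.UTF_8);
        extractor.extract(new ByteArrayInputStream(bytes));
        System.out.println(extractor.getText());
        System.out.println(extractor.getProperties().getProperty("charCount"));
    }
}
```

In this arrangement a subclass overrides only extractText, while callers run extract once and then read the cached results through getText and getProperties.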