public final class FictionBookTextExtractor extends TextExtractor implements ISearchable, IRegexSearchable, IHighlightExtractor, IStructuredExtractor
Provides the text extractor for FB2 (FictionBook) documents.
Extracts a line of characters from a document:
// Create a text extractor for FB2 (FictionBook) documents
TextExtractor extractor = new FictionBookTextExtractor(stream);
// Extract a line of the text
String line = extractor.extractLine();
// A null line means the end of the file has been reached
while (line != null) {
    // Print the line to the console
    System.out.println(line);
    // Extract the next line
    line = extractor.extractLine();
}
Extracts all characters from a document:
// Create a text extractor for FB2 (FictionBook) documents
TextExtractor extractor = new FictionBookTextExtractor(stream);
// Extract all of the text and print it to the console
System.out.println(extractor.extractAll());
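When the lines are needed as a collection rather than printed one by one, the extractLine loop can be wrapped in a small helper. The following is a minimal sketch, not part of the library: LineCollector and collectLines are hypothetical names, and the helper accepts a Supplier<String> so that it relies only on the documented contract that extractLine returns null at the end of the document. With a real extractor one would presumably pass extractor::extractLine, assuming extractLine declares no checked exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of the library.
class LineCollector {

    // Calls nextLine until it returns null and collects every line it produced.
    static List<String> collectLines(Supplier<String> nextLine) {
        List<String> lines = new ArrayList<>();
        String line = nextLine.get();
        while (line != null) {
            lines.add(line);
            line = nextLine.get();
        }
        return lines;
    }

    public static void main(String[] args) {
        // In-memory stand-in for extractor::extractLine (assumption for the demo)
        Iterator<String> source = Arrays.asList("Title", "Chapter 1").iterator();
        Supplier<String> nextLine = () -> source.hasNext() ? source.next() : null;
        System.out.println(collectLines(nextLine));
    }
}
```

Keeping the helper independent of the extractor type makes it easy to exercise without an FB2 file at hand.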
Constructor and Description |
---|
FictionBookTextExtractor(InputStream stream) Initializes a new instance of the FictionBookTextExtractor class. |
FictionBookTextExtractor(String fileName) Initializes a new instance of the FictionBookTextExtractor class. |
Modifier and Type | Method and Description |
---|---|
protected void | dispose(boolean disposing) Releases the unmanaged resources used by the extractor. |
List<String> | extractHighlights(HighlightOptions... highlightOptions) Extracts highlights. |
void | extractStructured(StructuredHandler handler) Extracts a structured text. |
protected String | prepareLine() Returns a line of the text. |
void | reset() Resets the current document. |
void | search(SearchOptions options, ISearchHandler handler, ISearchEngine searchEngine, List<String> keywords) Searches the keywords. |
void | search(SearchOptions options, ISearchHandler handler, List<String> keywords) Searches the keywords. |
void | searchWithRegex(String expression, ISearchHandler handler, RegexSearchOptions searchOptions) Searches the expression. |
Methods inherited from class TextExtractor:
checkDisposed, close, dispose, extractAll, extractLine, extractText, extractTextLine, getEncoding, getMediaType, getPassword, isDisposed, setEncoding, setMediaType
public FictionBookTextExtractor(String fileName)
Initializes a new instance of the FictionBookTextExtractor class.
Parameters:
fileName - The path to the file.
Throws:
UnsupportedDocumentFormatException - File format isn't supported.

public FictionBookTextExtractor(InputStream stream)
Initializes a new instance of the FictionBookTextExtractor class.
Parameters:
stream - The stream of the document.
Throws:
UnsupportedDocumentFormatException - File format isn't supported.

public void extractStructured(StructuredHandler handler)
Extracts a structured text.
Specified by:
extractStructured in interface IStructuredExtractor
Parameters:
handler - Structured text extraction handler.

public List<String> extractHighlights(HighlightOptions... highlightOptions)
Extracts highlights.
Specified by:
extractHighlights in interface IHighlightExtractor
Parameters:
highlightOptions - A collection of HighlightOptions.

public void search(SearchOptions options, ISearchHandler handler, List<String> keywords)
Searches the keywords.
Specified by:
search in interface ISearchable
Parameters:
options - Options for searching.
handler - An instance of the search handler.
keywords - A collection of words to search.

public void search(SearchOptions options, ISearchHandler handler, ISearchEngine searchEngine, List<String> keywords)
Searches the keywords.
Specified by:
search in interface ISearchable
Parameters:
options - Options for searching.
handler - An instance of the search handler.
searchEngine - An instance of the search engine.
keywords - A collection of words to search.

public void searchWithRegex(String expression, ISearchHandler handler, RegexSearchOptions searchOptions)
Searches the expression.
Specified by:
searchWithRegex in interface IRegexSearchable
Parameters:
expression - A regular expression.
handler - An instance of the search handler.
searchOptions - Options for searching.

public void reset()
Resets the current document. After a reset, the extractLine method returns the first line of the document.
Overrides:
reset in class TextExtractor
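One use for reset is reading the same document twice, for example counting the lines first and then processing them from the start. The following is a minimal sketch, not the library's implementation: TwoPassReader, InMemoryLines and countLinesAndRewind are hypothetical names, and InMemoryLines is a stand-in that mimics only the documented behaviour (extractLine returns null at the end, reset rewinds to the first line). With a real extractor one would presumably pass extractor::extractLine and extractor::reset, assuming neither declares checked exceptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical example class, not part of the library.
class TwoPassReader {

    // Stand-in that mimics only the documented contract:
    // extractLine() returns null at the end, reset() rewinds to the first line.
    static class InMemoryLines {
        private final List<String> lines;
        private int position = 0;

        InMemoryLines(String... lines) {
            this.lines = Arrays.asList(lines);
        }

        String extractLine() {
            return position < lines.size() ? lines.get(position++) : null;
        }

        void reset() {
            position = 0;
        }
    }

    // First pass: count the lines, then rewind so the caller can read them again.
    static int countLinesAndRewind(Supplier<String> nextLine, Runnable reset) {
        int count = 0;
        while (nextLine.get() != null) {
            count++;
        }
        reset.run();
        return count;
    }

    public static void main(String[] args) {
        InMemoryLines doc = new InMemoryLines("Title", "Chapter 1", "The end");
        int total = countLinesAndRewind(doc::extractLine, doc::reset);
        // The second pass starts from the first line again
        System.out.println(total + " lines, first: " + doc.extractLine());
    }
}
```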
protected void dispose(boolean disposing)
Releases the unmanaged resources used by the extractor.
Overrides:
dispose in class TextExtractor
Parameters:
disposing - true if invoked from dispose; otherwise, false.

protected String prepareLine()
Returns a line of the text.
Overrides:
prepareLine in class TextExtractor
Copyright © 2018. All rights reserved.