public abstract class CsvFormatDetector extends Object implements InputAnalysisProcess
An InputAnalysisProcess to detect column delimiters, quotes and quote escapes in a CSV input.

Constructor and Description |
---|
CsvFormatDetector(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart) Builds a new CsvFormatDetector |

Modifier and Type | Method and Description |
---|---|
protected abstract void | apply(char delimiter, char quote, char quoteEscape) Applies the discovered CSV format elements to the CsvParser |
protected Map<Character,Integer> | calculateTotals(List<Map<Character,Integer>> symbolsPerRow) |
void | execute(char[] characters, int length) A sequence of characters of the input buffer to be analyzed. |
protected char | getChar(Map<Character,Integer> map, Map<Character,Integer> totals, char defaultChar, boolean min) Returns the character with the highest or lowest associated number. |
protected void | increment(Map<Character,Integer> map, char symbol) Increments the number associated with a character in a map by 1 |
protected void | increment(Map<Character,Integer> map, char symbol, int incrementSize) Increments the number associated with a character in a map |
protected boolean | isAllowedDelimiter(char ch) |
protected boolean | isSymbol(char ch) |
protected char | max(Map<Character,Integer> map, Map<Character,Integer> totals, char defaultChar) Returns the character with the highest associated number. |
protected char | min(Map<Character,Integer> map, Map<Character,Integer> totals, char defaultChar) Returns the character with the lowest associated number. |
protected char | pickDelimiter(Map<Character,Integer> sums, Map<Character,Integer> totals) |
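
The two increment overloads listed in the summary above keep a per-character counter in a Map<Character,Integer>. The following is a minimal sketch of that bookkeeping, not the library's implementation: the class name IncrementSketch is hypothetical, the methods are static so the example runs on its own, and treating a missing entry as 0 is an assumption the source does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of univocity-parsers.
final class IncrementSketch {

    // Adds 1 to the number associated with `symbol`.
    static void increment(Map<Character, Integer> map, char symbol) {
        increment(map, symbol, 1);
    }

    // Adds `incrementSize` to the number associated with `symbol`.
    // Assumption: a symbol that is not yet in the map starts from 0.
    static void increment(Map<Character, Integer> map, char symbol, int incrementSize) {
        map.merge(symbol, incrementSize, Integer::sum);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        increment(counts, ',');
        increment(counts, ',');
        increment(counts, ';', 3);
        System.out.println(counts.get(',') + " " + counts.get(';')); // prints "2 3"
    }
}
```

Delegating the single-step overload to the sized one keeps the "missing entry" rule in one place.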
public CsvFormatDetector(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
Builds a new CsvFormatDetector
maxRowSamples - the number of row samples to collect before analyzing the statistics
settings - the configuration provided by the user with potential defaults in case the detection is unable to discover the proper column delimiter or quote character.
whitespaceRangeStart - starting range of characters considered to be whitespace.

protected Map<Character,Integer> calculateTotals(List<Map<Character,Integer>> symbolsPerRow)
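
This page gives no description for calculateTotals. Judging only by its name and signature, one plausible reading is that it adds up the per-row symbol counts into a single map. The sketch below illustrates that reading under that assumption; the class name TotalsSketch is hypothetical and the actual library behaviour may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of univocity-parsers.
final class TotalsSketch {

    // Assumed behaviour: for each symbol, sum the counts recorded in every sampled row.
    static Map<Character, Integer> calculateTotals(List<Map<Character, Integer>> symbolsPerRow) {
        Map<Character, Integer> totals = new HashMap<>();
        for (Map<Character, Integer> row : symbolsPerRow) {
            for (Map.Entry<Character, Integer> entry : row.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<Character, Integer> first = new HashMap<>();
        first.put(',', 2);
        first.put(';', 1);
        Map<Character, Integer> second = new HashMap<>();
        second.put(',', 2);

        List<Map<Character, Integer>> rows = new ArrayList<>();
        rows.add(first);
        rows.add(second);

        Map<Character, Integer> totals = calculateTotals(rows);
        System.out.println(totals.get(',') + " " + totals.get(';')); // prints "4 1"
    }
}
```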
public void execute(char[] characters, int length)
Description copied from interface: InputAnalysisProcess
A sequence of characters of the input buffer to be analyzed.
Specified by: execute in interface InputAnalysisProcess
characters - the input buffer
length - the last character position loaded into the buffer.

protected char pickDelimiter(Map<Character,Integer> sums, Map<Character,Integer> totals)
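
To illustrate the execute(char[] characters, int length) contract together with the maxRowSamples constructor argument, here is a rough sketch of collecting per-row symbol counts from a partially filled buffer. Everything in it is an assumption made for illustration: the class SamplingSketch, the sample method, the simplified isSymbol test, and the reading of length as the number of valid characters. It ignores quoted values and is not how the library itself analyzes the input.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of univocity-parsers.
final class SamplingSketch {

    // Simplified assumption: a symbol is anything that is not a letter, digit or whitespace.
    static boolean isSymbol(char ch) {
        return !Character.isLetterOrDigit(ch) && !Character.isWhitespace(ch);
    }

    // Scans characters[0..length) and returns one symbol-count map per row,
    // stopping once maxRowSamples rows have been collected.
    static List<Map<Character, Integer>> sample(char[] characters, int length, int maxRowSamples) {
        List<Map<Character, Integer>> symbolsPerRow = new ArrayList<>();
        Map<Character, Integer> row = new HashMap<>();
        for (int i = 0; i < length && symbolsPerRow.size() < maxRowSamples; i++) {
            char ch = characters[i];
            if (ch == '\n') {
                // End of row: store its counts and start a fresh map.
                symbolsPerRow.add(row);
                row = new HashMap<>();
            } else if (isSymbol(ch)) {
                row.merge(ch, 1, Integer::sum);
            }
        }
        // Keep a trailing row without a line terminator if there is still room.
        if (!row.isEmpty() && symbolsPerRow.size() < maxRowSamples) {
            symbolsPerRow.add(row);
        }
        return symbolsPerRow;
    }
}
```

Only the first length characters are read, so stale content left in the rest of the buffer does not affect the counts.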
protected void increment(Map<Character,Integer> map, char symbol)
Increments the number associated with a character in a map by 1
map - the map of characters and their numbers
symbol - the character whose number should be incremented

protected void increment(Map<Character,Integer> map, char symbol, int incrementSize)
Increments the number associated with a character in a map
map - the map of characters and their numbers
symbol - the character whose number should be incremented
incrementSize - the size of the increment

protected char min(Map<Character,Integer> map, Map<Character,Integer> totals, char defaultChar)
Returns the character with the lowest associated number.
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty

protected char max(Map<Character,Integer> map, Map<Character,Integer> totals, char defaultChar)
Returns the character with the highest associated number.
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty

protected char getChar(Map<Character,Integer> map, Map<Character,Integer> totals, char defaultChar, boolean min)
Returns the character with the highest or lowest associated number.
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
min - a flag indicating whether to return the character associated with the lowest number in the map. If false, the character associated with the highest number found is returned.

protected boolean isSymbol(char ch)
protected boolean isAllowedDelimiter(char ch)
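
The min, max and getChar helpers documented above select a character by its associated number, falling back to defaultChar when the map is empty. The sketch below shows that selection in a simplified form. It is not the library's code: the class ExtremeCharSketch is hypothetical, and the totals parameter is left out because this page does not document its role.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of univocity-parsers.
final class ExtremeCharSketch {

    // Returns the character with the lowest (min == true) or highest (min == false)
    // associated number, or defaultChar when the map is empty.
    // Ties keep whichever entry is visited first, which is unspecified for a HashMap.
    static char getChar(Map<Character, Integer> map, char defaultChar, boolean min) {
        char result = defaultChar;
        Integer best = null;
        for (Map.Entry<Character, Integer> entry : map.entrySet()) {
            int value = entry.getValue();
            if (best == null || (min ? value < best : value > best)) {
                best = value;
                result = entry.getKey();
            }
        }
        return result;
    }

    static char min(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, true);
    }

    static char max(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, false);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        counts.put(',', 9);
        counts.put(';', 2);
        counts.put('|', 5);
        System.out.println(max(counts, '?') + " " + min(counts, '?')); // prints ", ;"
    }
}
```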
protected abstract void apply(char delimiter, char quote, char quoteEscape)
Applies the discovered CSV format elements to the CsvParser
delimiter - the discovered delimiter character
quote - the discovered quote character
quoteEscape - the discovered quote escape character.

Copyright © 2024 Univocity Software Pty Ltd. All rights reserved.