Character processing settings control how Agent Ransack handles information found in files.
End of Line (EOL) Identifiers
Defines which additional EOL identifiers Agent Ransack should recognize. Normally a Windows text file uses a CRLF (carriage return 0x0d, line feed 0x0a) combination to indicate the end of a line. However, other operating systems use different standards, usually either a standalone CR or a standalone LF character.
Maximum characters per line - sets the maximum line length to use when no EOL identifier is found. Lines that exceed this length are broken into separate lines, although each resulting segment keeps the line number of the original line.
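The two settings above can be illustrated with a minimal Python sketch. It is not Agent Ransack's actual implementation; the function name split_lines and its parameters max_chars and eols are hypothetical names chosen for this example. It splits text on CRLF, CR, or LF, then breaks any over-long line into segments that share one line number.

```python
import re

def split_lines(text, max_chars=1024, eols=("\r\n", "\r", "\n")):
    """Return (line_number, segment) pairs for the given text.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    # Try longer identifiers first so CRLF is not read as CR followed by LF.
    ordered = sorted(eols, key=len, reverse=True)
    pattern = "|".join(re.escape(eol) for eol in ordered)
    result = []
    for line_no, line in enumerate(re.split(pattern, text), start=1):
        # Break an over-long line into chunks of at most max_chars characters.
        chunks = [line[i:i + max_chars] for i in range(0, len(line), max_chars)]
        # An empty line still counts as one (empty) segment.
        for chunk in chunks or [""]:
            result.append((line_no, chunk))
    return result

if __name__ == "__main__":
    sample = "ab\r\ncdefghijkl\rmn\nop"
    for line_no, segment in split_lines(sample, max_chars=4):
        print(line_no, repr(segment))
```

In this sketch the ten-character second line is reported as three segments, all carrying line number 2, which mirrors the behaviour described for lines that exceed the maximum length.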
Containing Text
The 'Include file name in content search' option treats the file name as part of the text being searched. For example, if a file named LondonHistory.doc is searched for Tower AND London, the file is matched as long as the word Tower appears in the document text, because the word London is already in the file name. With the option switched off, the file is matched only if both words appear in the document text, i.e. the document's file name is not included as part of the text.
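The effect of this option can be sketched in a few lines of Python. This is an illustration only: the function matches_all and its parameters are hypothetical, and the case-insensitive substring matching is an assumption, not a description of how Agent Ransack evaluates an AND expression.

```python
def matches_all(file_name, text, terms, include_file_name=True):
    """Return True if every term is found (an AND search).

    Hypothetical helper: simple case-insensitive substring matching is assumed.
    """
    searchable = text.lower()
    if include_file_name:
        # Treat the file name as extra text in front of the document content.
        searchable = file_name.lower() + "\n" + searchable
    return all(term.lower() in searchable for term in terms)

if __name__ == "__main__":
    name = "LondonHistory.doc"
    body = "The Tower was built in the 11th century."
    print(matches_all(name, body, ["Tower", "London"], include_file_name=True))
    print(matches_all(name, body, ["Tower", "London"], include_file_name=False))
```

With the option on, the file name supplies the word London and the file matches. With it off, the same file is rejected because London does not appear in the document text.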
Special
Convert to 7-bit chars - when checked, Agent Ransack uses only the low 7 bits of each data character. Some early word processors reserved the 8th bit of each character for formatting purposes, which causes problems when searching the data unless it is removed. This setting causes the 8th bit of every character to be ignored.
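Ignoring the 8th bit amounts to masking each byte with 0x7F. The following Python sketch shows the idea; the function name to_7bit and the sample bytes are made up for illustration and do not come from Agent Ransack itself.

```python
def to_7bit(data: bytes) -> bytes:
    """Clear the 8th (high) bit of every byte, keeping the low 7 bits."""
    return bytes(b & 0x7F for b in data)

if __name__ == "__main__":
    # Made-up sample: the last letter of each word has its high bit set,
    # as a formatting marker might do.
    raw = b"Towe\xf2 o\xe6 Londo\xee"
    print(to_7bit(raw).decode("ascii"))
```

Without the mask, a search for the word Tower would not find it in the sample bytes, because the final letter is stored as 0xF2 rather than 0x72. Note that the mask also alters genuine 8-bit characters such as accented letters, so it only makes sense for data known to be 7-bit text.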