A tool for searching & extracting information from multiple text files.
This package contains two tools: Raptor & Reptar
Raptor
Raptor extracts and displays information from a set of files of the same type, and creates a single file containing all the selected information.
The information in a file may be spread over multiple rows, one field per line:

PC01.txt:
User=ms123
Name=Mayra Sanz
OS=GNU/Linux
IP=10.226.140.1

The information may also be arranged in columns, so that several fields are read from a single line:

PC01.log:
User: ms123 Name: Mayra Sanz OS: GNU/Linux IP: 10.226.140.1
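To illustrate the two layouts, here is a minimal standard-library sketch of search-string extraction. It is not pysaurio's implementation: the function name `extract_fields` is hypothetical, and the rule that a value ends at the next search string on the same line (or at the end of the line) is an assumption made for this example.

```python
def extract_fields(text, fields):
    """Return {fieldname: value} for each search string found in text.

    Assumption: a value runs from the end of its search string to the
    start of the next search string on the same line, or to end of line.
    """
    record = {}
    for line in text.splitlines():
        # Locate every search string that occurs in this line, left to right.
        hits = sorted(
            (line.find(marker), name, marker)
            for name, marker in fields.items()
            if marker in line
        )
        for i, (pos, name, marker) in enumerate(hits):
            start = pos + len(marker)
            end = hits[i + 1][0] if i + 1 < len(hits) else len(line)
            record[name] = line[start:end].strip()
    return record


# Same data in the "rows" layout and in the "columns" layout.
rows = "User=ms123\nName=Mayra Sanz\nOS=GNU/Linux\nIP=10.226.140.1\n"
columns = "User: ms123 Name: Mayra Sanz OS: GNU/Linux IP: 10.226.140.1\n"

by_rows = extract_fields(
    rows, {"user": "User=", "name": "Name=", "os": "OS=", "ip": "IP="}
)
by_columns = extract_fields(
    columns, {"user": "User:", "name": "Name:", "os": "OS:", "ip": "IP:"}
)
print(by_rows)
print(by_rows == by_columns)
```

Both layouts yield the same record here because only the search strings differ.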
Example: data from the following files:
PC01.txt:
User=ms123
Name=Mayra Sanz
OS=GNU/Linux
IP=10.226.140.1

PC02.txt:
User=lt001
Name=Luis Toribio
OS=GNU/Linux
IP=10.226.140.2

PC03.txt:
User=co205
Name=Clara Osto
OS=Win
IP=10.226.140.3
… You can create a CSV file with the following information:
users.csv:
User,Name,OS,IP
MS123,Mayra Sanz,GNU/Linux,10.226.140.1
LT001,Luis Toribio,GNU/Linux,10.226.140.2
CO205,Clara Osto,Win,10.226.140.3
To achieve this you need to create a template (.rap) with Raptor, which is similar to an INI file with the following information:
users.rap:
[General]
description = Get list of users
extension = txt
prefix = PC
output_folder = txt
input_folder = txt
output_file = users.csv
delimiter = ,
quotechar = "
include_header = 1
include_file = 0
include_record_num = 0
include_empty_record = 0
search_multiple = 0
alternate_header =
[Fields]
user = User=
name = Name=
os = OS=
ip = IP=
[Rules]
rule1 = ('user', 'UPPER')
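Because a .rap template is laid out like an INI file, it can also be inspected with Python's standard `configparser` module. The following is a sketch for reading a template outside pysaurio, not a description of how Raptor itself parses it. The template text is a shortened copy of users.rap, and the use of `ast.literal_eval` to turn each rule into a tuple is an assumption of this example.

```python
import ast
import configparser

# Shortened copy of users.rap, embedded so the example runs on its own.
TEMPLATE = """\
[General]
description = Get list of users
extension = txt
prefix = PC
delimiter = ,
quotechar = "
include_header = 1

[Fields]
user = User=
name = Name=

[Rules]
rule1 = ('user', 'UPPER')
"""

parser = configparser.ConfigParser()
parser.read_string(TEMPLATE)

general = dict(parser["General"])
# Only the first "=" separates key and value, so "User=" survives intact.
fields = dict(parser["Fields"])
# Each rule is stored as the text of a Python tuple; literal_eval parses it safely.
rules = [ast.literal_eval(value) for value in parser["Rules"].values()]

print(general["delimiter"], fields, rules)
```

Note that `configparser` lowercases option names by default, so field names read this way always come back in lowercase.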
Create the .rap template as follows. If the .rap file already exists, it is not saved. Caution: field names must be lowercase.
from pysaurio import Raptor

rap1 = Raptor()

# [General] section
rap1.description = 'Get list of users'
rap1.extension = 'txt'
rap1.prefix = 'PC'
rap1.input_folder = 'txt'
rap1.output_folder = 'txt'
rap1.output_file = 'users.csv'
rap1.delimiter = ','
rap1.quotechar = '"'
rap1.include_header = '1'
rap1.include_file = '0'
rap1.include_record_num = '0'
rap1.include_empty_record = '0'
rap1.search_multiple = '0'
rap1.alternate_header = ''

# [Fields] section: field name (lowercase) -> search string
rap1.fields['user'] = 'User='
rap1.fields['name'] = 'Name='
rap1.fields['os'] = 'OS='
rap1.fields['ip'] = 'IP='

# [Rules] section
rap1.rules.append(('user', 'UPPER'))
# Further rules are appended the same way, for example:
# rap1.rules.append(('name', 'REMOVEFROM', ' '))

rap1.Save("users.rap")
del rap1
Attribute List:
description: short description of the .rap template
extension: extension of the files to read
prefix: files must begin with this string
input_folder: folder of files to read
output_folder: output folder to save file with result
output_file: output filename
delimiter: delimiter character
quotechar: quote character
include_header: '0' or '1'
include_file: '0' or '1'
include_file_datetime: '0' or '1' (file creation & modification date/time)
include_record_num: '0' or '1'
include_empty_record: '0' or '1'
search_multiple: '0' or '1'
alternate_header: alternative text of the report header
fields: dictionary with fieldnames and search string (read template)
record: dictionary with fieldnames and values (read template)
rules: list of rules (read template)
list_files: list of filenames to read (auto)
record_counter: number of records (auto)
errors: list of errors (auto)
number_errors: number of errors after you open or save a template
Functions available for rules:
rule1 = (fieldname, 'SUBSTR', initial_position, length)
rule1 = (fieldname, 'REPLACE', 'search_string', 'replace_string')
rule1 = (fieldname, 'REPLACEALL', 'search_string', 'replace_string')
rule1 = (fieldname, 'UPPER')
rule1 = (fieldname, 'LOWER')
rule1 = (fieldname, 'REVERSE')
rule1 = (fieldname, 'REMOVE')
rule1 = (fieldname, 'FIELDISDATA')
rule1 = (fieldname, 'REMOVEFROM', 'string')
rule1 = (fieldname, 'REMOVETO', 'string')
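As a rough illustration of how rule tuples of this shape can be applied to a record, here is a plain-Python sketch. It is not pysaurio's `ApplyRules`: the function name `apply_rules_sketch` is hypothetical, and the behaviour of UPPER, LOWER, REVERSE and REPLACEALL is inferred from their names only. The remaining rules are left out because their exact behaviour is not described here.

```python
def apply_rules_sketch(record, rules):
    """Apply a few rule tuples to a copy of record (assumed semantics)."""
    result = dict(record)
    for fieldname, function, *args in rules:
        value = result.get(fieldname)
        if value is None:
            continue  # skip rules that name a field the record lacks
        if function == "UPPER":
            value = value.upper()
        elif function == "LOWER":
            value = value.lower()
        elif function == "REVERSE":
            value = value[::-1]
        elif function == "REPLACEALL":
            search, replacement = args
            value = value.replace(search, replacement)
        else:
            raise ValueError(f"rule not covered by this sketch: {function}")
        result[fieldname] = value
    return result


record = {"user": "ms123", "os": "GNU/Linux"}
rules = [("user", "UPPER"), ("os", "REPLACEALL", "GNU/", "")]
print(apply_rules_sketch(record, rules))
```

Rules are applied in list order, so a later rule sees the value produced by an earlier one on the same field.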
Open the .rap template and create the .csv file from the data read from multiple text files:
from pysaurio import Raptor
import csv

rap2 = Raptor()
rap2.Open('users.rap')

if rap2.number_errors == 0:
    file_csv = open(rap2.output_file, 'w', newline='')
    csv_output = csv.writer(file_csv,
                            delimiter=rap2.delimiter,
                            quotechar=rap2.quotechar,
                            quoting=csv.QUOTE_MINIMAL)
    # Write the header row if the template asks for one
    if rap2.include_header == '1':
        fields_list = rap2.BuildHeader()
        print(fields_list)
        csv_output.writerow(fields_list)
    # Build one record per input file
    for row in rap2.list_files:
        valid_record, new_record = rap2.BuildRow(row)
        new_record = rap2.ApplyRules(new_record)
        if valid_record:
            new_record = list(new_record.values())
            print(new_record)
            csv_output.writerow(new_record)
    file_csv.close()
else:
    print(rap2.ShowError())
del rap2
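Once users.csv exists, it can be read back with `csv.DictReader` to check the result. The sketch below uses an in-memory copy of the sample output from the users example instead of the real file, so that it runs on its own. The column names are taken from that sample, and the delimiter and quote character match the template values.

```python
import csv
import io

# Stand-in for the contents of users.csv; a real check would open the file instead.
sample = io.StringIO(
    "User,Name,OS,IP\n"
    "MS123,Mayra Sanz,GNU/Linux,10.226.140.1\n"
    "LT001,Luis Toribio,GNU/Linux,10.226.140.2\n"
    "CO205,Clara Osto,Win,10.226.140.3\n"
)

# The first row is used as the header, so each record becomes a dict.
reader = csv.DictReader(sample, delimiter=",", quotechar='"')
rows = list(reader)
linux_users = [row["User"] for row in rows if row["OS"] == "GNU/Linux"]
print(len(rows), linux_users)
```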
Reptar
Reptar merges files, including only the necessary lines.
Example: data from the following files:
PCS01.txt:
User,Name,OS,IP
ms123,Mayra Sanz,GNU/Linux,10.226.140.1
lt001,Luis Toribio,GNU/Linux,10.226.140.2
co205,Clara Osto,Win,10.226.140.3

PCS02.txt:
User,Name,OS,IP
nn345,Nadia Pacheco,Win,10.226.140.4
jm401,Juan Madrid,GNU/Linux,10.226.140.5
… You can create a file with the following information:
Linux.csv:
User,Name,OS,IP
MS123,MAYRA SANZ,GNU/LINUX,10.226.140.1
LT001,LUIS TORIBIO,GNU/LINUX,10.226.140.2
CO205,CLARA OSTO,WIN,10.226.140.3
JM401,JUAN MADRID,GNU/LINUX,10.226.140.5

In this example, lines that contain the text “Linux” or begin with the text “co205” are included:
from pysaurio import Reptar
rep1 = Reptar()
rep1.description = 'Get list of Linux users'
rep1.extension = 'txt'
rep1.prefix = 'PCS'
rep1.input_folder = 'txt'
rep1.output_folder = 'txt'
rep1.output_file = 'Linux.csv'
rep1.include_header = '1'
rep1.include_file = '0'
rep1.include_record_num = '0'
rep1.alternate_header = ''
rep1.lines.append(('INCLUDE', 'Linux'))
rep1.lines.append(('INCLUDRE', '^co205'))
rep1.rules.append(('line', 'UPPER'))
rep1.Save("linux.rep")
del rep1
# Open the .rep template and create the output file
import os

rep2 = Reptar()
rep2.Open('linux.rep')

if rep2.number_errors == 0:
    file_csv = open(rep2.output_file, 'w')
    # The header is taken from the first input file
    if rep2.include_header == '1':
        header = rep2.BuildHeader(rep2.list_files[0])
        print(header)
        file_csv.write(header + '\n')
    for row in rep2.list_files:
        # os.path.join works whether or not input_folder ends with a separator
        current_file = open(os.path.join(rep2.input_folder, row), 'rb')
        while True:
            new_record = current_file.readline()
            new_record = new_record.decode("utf-8", "ignore")
            if not new_record:
                break
            valid_record, new_record = rep2.BuildRow(new_record, row)
            if valid_record:
                new_record = rep2.ApplyRules(new_record)
                print(new_record)
                file_csv.write(new_record + '\n')
        current_file.close()
    file_csv.close()
else:
    print(rep2.ShowError())
del rep2
Functions available for including and excluding lines:
line1 = ('EXCLUDE', 'string')
line1 = ('INCLUDE', 'string')
line1 = ('EXCLUDEND', 'string')
line1 = ('INCLUDEND', 'string')
line1 = ('EXCLUDRE', 'regex', '0'|'1')  # '1' = not case sensitive
line1 = ('INCLUDRE', 'regex', '0'|'1')  # regex syntax: see Python's re module
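The Linux users example can be approximated in plain Python to show how INCLUDE and INCLUDRE conditions select lines. This is a sketch under stated assumptions, not Reptar's code: the helper name `keep_line` is hypothetical, conditions are combined with "or" as the example suggests, a third element of '1' is mapped to `re.IGNORECASE`, and the EXCLUDE variants and the INCLUDEND/EXCLUDEND functions are not modelled.

```python
import re


def keep_line(line, conditions):
    """Return True if any INCLUDE/INCLUDRE condition matches the line."""
    for kind, pattern, *flags in conditions:
        if kind == "INCLUDE" and pattern in line:
            return True
        if kind == "INCLUDRE":
            # Assumed: a third element of '1' means ignore case.
            re_flags = re.IGNORECASE if flags and flags[0] == "1" else 0
            if re.search(pattern, line, re_flags):
                return True
    return False


conditions = [("INCLUDE", "Linux"), ("INCLUDRE", "^co205")]
lines = [
    "ms123,Mayra Sanz,GNU/Linux,10.226.140.1",
    "co205,Clara Osto,Win,10.226.140.3",
    "nn345,Nadia Pacheco,Win,10.226.140.4",
]
# Keep matching lines, then apply the equivalent of the ('line', 'UPPER') rule.
kept = [line.upper() for line in lines if keep_line(line, conditions)]
print(kept)
```

The third line is dropped because it neither contains "Linux" nor starts with "co205".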
The package contains more examples and data files to test.
Changelog
Pysaurio 0.2.6 - 2018-09-14 - Corrected error in function REMOVECOL
Pysaurio 0.2.5 - 2016-09-09 - new attribute: include_file_datetime = ‘0’ or ‘1’
Pysaurio 0.2.4 - Reptar include new rule: ‘REMOVECOL’, remove column
Pysaurio 0.2.3 - New functions: ‘INCLUDEND’ and ‘EXCLUDEND’
Pysaurio 0.2.2 - New argument in the ‘INCLUDRE’ and ‘EXCLUDRE’ functions
Pysaurio 0.2.1 - Reptar includes rules; regular expressions can be used in the ‘Lines’ section.
Pysaurio 0.2.0 - Initial release (continued “Pyraptor”).