Split & Merge utilities for large CSV files.
Project description
Split-Merge Package
This package splits a large CSV file on your local disk into multiple smaller CSV files for easier processing (Split), and merges those smaller files back into one large file (Merge). This is a first sample version.
Limitations
As of now, split files are created with the marker "splitted" in their names. Make sure the name of your original file does not contain the same naming pattern.
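The check described above can be sketched as a small guard that runs before splitting. This is only an illustration: the marker string "__splitted_" is an assumption taken from the split-file names this package produces, and is_safe_source_name is a hypothetical helper, not part of the package.

```python
import os

# Assumed marker, taken from the split-file names (e.g. 1__name__splitted_.csv).
SPLIT_MARKER = "__splitted_"


def is_safe_source_name(file_name):
    """Return True when the file name does not contain the split marker."""
    return SPLIT_MARKER not in os.path.basename(file_name)


for candidate in ("customer_addr_20180112.csv",
                  "1__customer_addr_20180112__splitted_.csv"):
    status = "ok" if is_safe_source_name(candidate) else "clashes with split naming"
    print(candidate, "->", status)
```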
For example, if your source file is named customer_addr_20180112.csv, the split files will be named as follows:
1__customer_addr_20180112__splitted_.csv
2__customer_addr_20180112__splitted_.csv
N__customer_addr_20180112__splitted_.csv
Where N depends on the size of the file. By default, each chunk contains at most 30000 records.
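To make the naming scheme and chunk size concrete, here is a minimal sketch of a splitter that produces files named this way. It is not the package's implementation: it uses the standard library csv module instead of pandas, split_csv and its parameters are hypothetical, and it assumes that every chunk repeats the header row.

```python
import csv
import os


def split_csv(src_path, out_dir, chunk_rows=30000):
    """Write numbered chunk files and return their paths in order."""
    stem = os.path.splitext(os.path.basename(src_path))[0]
    os.makedirs(out_dir, exist_ok=True)
    written = []

    def flush(rows, header):
        # Chunk numbers start at 1 and follow the N__name__splitted_.csv pattern.
        index = len(written) + 1
        out_path = os.path.join(out_dir, f"{index}__{stem}__splitted_.csv")
        with open(out_path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)  # assumption: each chunk keeps the header
            writer.writerows(rows)
        written.append(out_path)

    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:  # empty source file: nothing to split
            return written
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == chunk_rows:
                flush(chunk, header)
                chunk = []
        if chunk:  # last, possibly shorter, chunk
            flush(chunk, header)
    return written
```

Calling split_csv("customer_addr_20180112.csv", "temp") would write the chunk files into the temp directory, reading the source one row at a time so the whole file never has to fit in memory.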
This requires the pandas package to be installed in your Python environment. The regular expression module (re) is part of the Python standard library.
Sample code that uses this library. You can save it as ->
callSplitMergeFiles.py
from SplitMerge.clsSplitFiles import clsSplitFiles
from SplitMerge.clsMergeFiles import clsMergeFiles
import re
import platform as pl
import os


def main():
    print("Calling the custom Package for large file splitting..")
    os_det = pl.system()
    print("Running on :", os_det)

    ###############################################################
    ######            User Input based on Windows OS       ########
    ###############################################################
    srcF = str(input("Please enter the file name with extension:"))

    # Drop the digits, then the trailing "_.csv", to get the base name,
    # e.g. customer_addr_20180112.csv -> customer_addr
    base_name = re.sub(r'[0-9]', '', srcF)
    srcFileInit = base_name[:-5]

    # The split files go into a "temp" sub-directory next to this script
    if os_det == "Windows":
        subdir = "\\temp\\"
        path = os.path.dirname(os.path.realpath(__file__)) + "\\"
    else:
        subdir = "/temp/"
        path = os.path.dirname(os.path.realpath(__file__)) + '/'
    ###############################################################
    ######              End Of User Input                    ######
    ###############################################################

    # Split the large file into smaller files
    t = clsSplitFiles(srcF, path, subdir)
    ret_val = t.split_files()

    if ret_val == 0:
        print("Splitting Successful!")
    else:
        print("Splitting Failure!")

    print("-"*30)

    print("Finally, Merging small splitted files to make the same big file!")

    # Merge the smaller files back into one large file
    y = clsMergeFiles(srcFileInit, path)
    ret_val1 = y.merge_file()

    if ret_val1 == 0:
        print("Merge Successful!")
    else:
        print("Merge Failure!")

    print("-"*30)


if __name__ == "__main__":
    main()
End Of Sample Code - callSplitMergeFiles.py
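For readers curious about what the merge step involves, the following is a rough sketch and not the package's own code. It uses the standard library csv module rather than pandas; find_split_files and merge_csv are hypothetical helpers. It assumes each chunk starts with the same header row and, unlike the sample call above, it matches chunks on the full file stem including the digits.

```python
import csv
import os
import re


def find_split_files(directory, stem):
    """Return chunk paths for one source file, ordered by chunk number."""
    name_re = re.compile(r"^(\d+)__" + re.escape(stem) + r"__splitted_\.csv$")
    numbered = []
    for name in os.listdir(directory):
        match = name_re.match(name)
        if match:
            numbered.append((int(match.group(1)), os.path.join(directory, name)))
    # Sort on the integer prefix so that chunk 10 comes after chunk 2.
    return [path for _, path in sorted(numbered)]


def merge_csv(directory, stem, out_path):
    """Concatenate the chunks into out_path and return the data row count."""
    total = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for position, path in enumerate(find_split_files(directory, stem)):
            with open(path, newline="") as chunk:
                reader = csv.reader(chunk)
                header = next(reader, None)
                # Keep the header from the first chunk only.
                if position == 0 and header is not None:
                    writer.writerow(header)
                for row in reader:
                    writer.writerow(row)
                    total += 1
    return total
```

Sorting on the numeric prefix matters here: a plain alphabetical sort would place 10__... before 2__... and change the row order of the merged file.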
Bug fixes:
1. Module loading issue fixed.
2. Source & target directories can now be set as per the developer's choice.
Dependency package: you need to install the following package in order to run this package -
pip install pandas
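If you want to confirm that the dependency is available before running the sample, a small check along these lines may help. It is a sketch only; missing_packages is a hypothetical helper that relies on importlib.util.find_spec from the standard library and looks up top-level package names.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["pandas"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```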
File details
Details for the file SplitMerge-0.0.2.post0.tar.gz.
File metadata
- Download URL: SplitMerge-0.0.2.post0.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.6.3 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | e58f02ab3c030514ec923fa519299e1aac3385cb22e3e006a22a639f50c22f84
MD5 | c603daf3c04b55c7a00023186593b2e6
BLAKE2b-256 | b66f531da65fd48809cbf50521e52a06cf73d417ac4170f0f5aef3d6688bc89d
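A downloaded archive can be compared against the SHA256 value listed above. The snippet below is a minimal sketch using hashlib from the standard library; sha256_of_file and matches_published_hash are hypothetical helper names, and the local file path is whatever location you saved the download to.

```python
import hashlib

# SHA256 digest listed above for SplitMerge-0.0.2.post0.tar.gz
EXPECTED_SHA256 = "e58f02ab3c030514ec923fa519299e1aac3385cb22e3e006a22a639f50c22f84"


def sha256_of_file(path, block_size=65536):
    """Hash the file in blocks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_published_hash("SplitMerge-0.0.2.post0.tar.gz") should return True for an intact copy of the source distribution.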
File details
Details for the file SplitMerge-0.0.2.post0-py3-none-any.whl.
File metadata
- Download URL: SplitMerge-0.0.2.post0-py3-none-any.whl
- Upload date:
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.6.3 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | b8ea4c46c1a3f733c597dfd75299a3a2e416feb329c9a851c341d46a1995b285
MD5 | cbf91cce2a9a195f0e98f1a085470405
BLAKE2b-256 | 37c46939c3e06d35fff159c6c21d4f8cb8a1840e995587c7eac46271b53491cb