
Charsep

Background info

"Character-Separated-Values" files are widely used on computers. They are an easy way to store two-dimensional data structures, and such structures are so easy to picture, mentally or on paper, that they are the first solution that comes to mind for capturing structured information beyond simple lists.

Usually, simple text editors are used to quickly edit CSV files, with the user mentally reconstructing the "columns" of information. When the work becomes heavier, users turn to spreadsheet software such as Excel or OpenOffice "Calc", which is excellent at processing such grid structures. However, using spreadsheet tools has some drawbacks:

  • They "translate" CSV files, interpreting values as datatypes (e.g. dates, numbers...).
  • They may have limits: 256 columns and 65,536 rows for Excel versions prior to Excel 2007. Even current releases, because of their many powerful features, do not handle huge files perfectly.
  • Designed for calculation, they offer limited CSV-specific processing. They do not deal well with data profiling, structure management (column-based processing) or search/replace.
  • Their "native" format is not CSV: you need an import/export step each time you process such a file.
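
The first drawback can be illustrated with a minimal Python sketch. It does not show how Charsep or any spreadsheet works internally; the sample data and variable names are hypothetical. It only shows that reading a file as plain text keeps each value exactly as written, while interpreting a value as a number changes it.

```python
import csv
import io

# Hypothetical sample file: a product code with leading zeros and a date.
raw = "code,date\n00123,01/02/2003\n"

# A plain CSV reader returns every value as an untouched string.
rows = list(csv.reader(io.StringIO(raw)))
code, date = rows[1]

# Interpreting the code as a number (as a datatype-guessing tool might)
# silently drops the leading zeros.
as_number = str(int(code))

print(code, date)   # values exactly as stored in the file
print(as_number)    # the "translated" value no longer matches the file
```

Here `code` stays `"00123"` while `as_number` becomes `"123"`, which is the kind of silent change a text-oriented tool avoids.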

This simple tool was written to bridge the gap between the two types of software (text editors and spreadsheet processors) for processing CSV files. It does not replace either of them, but you may find it useful for many tasks. In addition, a simple command-line processor lets you automate some of the tasks you may need to perform on such files.

My YouTube channel has some videos that give an overview of Charsep's features. The first two show fundamental concepts; the others go deeper into specific topics. I will add more tutorials over time.

Some more information on the "CSV" format: CSV is commonly defined as an acronym for 'Comma-Separated Values'. However, the separator is not always a comma, since the comma is not the most convenient separator and is often a source of problems when it also appears inside values. Hence a better definition is 'Character-Separated Values'. The format is not a well-defined standard (many people "think" they are building CSV files without following any real standard), but RFC 4180 is a good place to find more information. Your preferred search engine or Wikipedia will also turn up plenty of documents on the topic.
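
To make the separator problem concrete, here is a small Python sketch using the standard `csv` module (not Charsep itself). The sample data is hypothetical: one value contains a comma and is therefore wrapped in double quotes, as RFC 4180 describes.

```python
import csv
import io

# Hypothetical sample: the name "Doe, Jane" contains the separator itself,
# so it is enclosed in double quotes.
raw = 'id,name,city\r\n1,"Doe, Jane",Paris\r\n'

# Naive splitting on the separator cuts the quoted value in two.
naive = raw.splitlines()[1].split(",")

# A CSV-aware reader honours the quotes and keeps the value intact.
rows = list(csv.reader(io.StringIO(raw, newline="")))

print(naive)    # ['1', '"Doe', ' Jane"', 'Paris']
print(rows[1])  # ['1', 'Doe, Jane', 'Paris']
```

The naive split yields four pieces for a three-column row, which is why a separator that also occurs in the data causes trouble for simple tools.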

Some features of Charsep

Charsep is a program developed in Java with a "Swing" user interface, and can therefore run on macOS, Linux, Windows...

Below is a non-exhaustive list of features provided by Charsep:
  • Direct in-grid editing of data
  • Structure management (changing column order, removing/adding columns; this can be automated based on 'template structures')
  • Support of huge files (in number of columns or number of rows)
  • Simple change of column separator (not a simple character replacement, since the separator may appear inside quoted strings) and of Unix/MS-DOS row separators (Lf or CrLf)
  • Support of different charsets including multibyte / unicode and quick conversion of charsets
  • Profiling: complete statistics on structure (distinct values per column, patterns, min/max/average size of values per column), extreme values... Checking that files conform to a defined profile
  • Merging of files, comparison ("Diff") of files with a variety of options (content-based or position-based, case-sensitive or not, similarity of values, etc.)
  • Quick append of rows or columns / support of clipboard as a source or target
  • Support of headers / headerless files
  • Partial load of files (First n rows, Last n rows, from row n to m, from col n to m, only rows where a column contains/does not contain a specific value...)
  • Detection and correction of corrupted files (missing columns)
  • 1-click removal of all empty columns
  • Easy and rich search functions: RegEx-based, or 'starts with', 'contains', 'does not contain', 'is unique/has duplicates', 'matches another column's values', 'matches a set of values'...
  • Composition of searches (And / Or) and appending to the current search set (Boolean operations on the selection of search results)
  • Search on 1 column or all columns
  • Search by similarity using fuzzy-matching algorithms: Jaro-Winkler, Levenshtein edit distance, Soundex
  • Search/Replace and transform - to uppercase, lowercase, filter alpha/numeric chars..., or RegEx based
  • Quick sort on any column - ascending / descending
  • Search in Header labels
  • Transposition of files (Switch columns to rows) taking into account headers/first column or not
  • Switching of column labels to column numbers or alphabetical column names ('Excel format')
  • Command-line processing of file transformations
  • Generation of random files with many features for columns definition
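
The note above about changing the column separator (not a simple character replacement) can be sketched in Python with the standard `csv` module. This is only an illustration of the idea under stated assumptions, not Charsep's implementation: the function name `change_separator` and the sample data are hypothetical.

```python
import csv
import io


def change_separator(text, old_sep, new_sep, line_end="\n"):
    """Re-emit CSV text with a different column separator and row ending."""
    # Parse with the old separator so quoted values are read as whole fields.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=old_sep)
    out = io.StringIO(newline="")
    # Write with the new separator; the writer adds quotes only where a
    # value now needs them.
    writer = csv.writer(out, delimiter=new_sep, lineterminator=line_end)
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical sample: semicolon-separated, CrLf row endings.
# One value contains ';' (quoted), another contains ','.
sample = 'id;comment\r\n1;"a;b"\r\n2;x,y\r\n'

converted = change_separator(sample, ";", ",")
print(converted)
```

With this sample, the value `a;b` no longer needs quotes once the separator is a comma, while `x,y` gains them. A plain replacement of `;` by `,` would instead have altered the first value and split the second into two columns.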



 

Please read carefully - Disclaimer

By installing, copying or otherwise using Charsep, you agree to be bound by the terms and conditions of this license. If you do not accept the terms and conditions of this license, you must uninstall Charsep and not use it.
This software "Charsep" is provided "as-is" without any warranty, either implied or expressed.

In no event shall Jacques Detroyat, the author of this software, be held liable for any data loss, misconfiguration or any other damage arising from the use of this software.

You are granted the right to install and use this software free of charge on any of your computer systems. You are granted the right to make an unlimited number of copies of this software.

You are not allowed to decompile or patch the "charsep.jar" file shipped with this distribution without prior written approval by the author of this software.

Jacques Detroyat reserves the right to license the same software to other individuals or entities under a different license agreement.
