
Drowning in Text Files

Everyone has them lying around their computer somewhere: folders full of loosely formatted files. They could be text, XML or CSV data created for a one-time use that is now becoming multiple use. The original person and program that created them no longer exist, but you still need to modify the data in these hundreds of files for a new analysis. In addition, the files might be scattered across different folder structures and under different names. Copying and pasting hundreds of small files together is a no-go, but so is putting in a ticket for someone else to help you with the task. Simple tasks like these are one example of where Composable Analytics shines, offering the power to manipulate many files in an easy-to-use interface, in a form that can be saved and stored for later use and shared with the next unlucky soul who might need to use these files.

Every person can relate to the situation of having folders filled with loosely formatted files that were created for a specific purpose but are now needed for multiple uses. These files could contain text, XML, or CSV data, and their original creators or programs may no longer be available. However, the data within these hundreds of files needs to be modified for a new analysis. Complicating matters further, these files might be scattered across different folder structures and have different names.

Manually copying and pasting hundreds of small files together is a time-consuming and error-prone approach. Similarly, relying on someone else to assist with the task by submitting a ticket can introduce delays and dependencies. This is where the process automation capabilities of the Composable DataOps Platform truly shine.

Composable offers a powerful solution for efficiently manipulating large numbers of files through its user-friendly interface. With Composable, users can easily navigate through their file system, locate the relevant files, and perform desired operations on them. Whether it’s merging, transforming, filtering, or any other data manipulation task, Composable provides an intuitive and streamlined environment for carrying out these operations.

What sets Composable apart is its ability to save and store these file manipulation workflows for future use. Once a workflow is created to modify the files, it can be saved as a reusable template or stored as part of a larger data processing pipeline. This allows users to automate repetitive tasks, ensuring consistent and efficient processing of similar file sets in the future.

Moreover, the collaborative nature of Composable enables easy sharing of these file manipulation workflows. The next person who encounters the same set of files can benefit from the saved workflow, avoiding the need to reinvent the wheel. This seamless sharing of workflows fosters knowledge transfer and accelerates productivity, ensuring that the next user won’t have to go through the same challenges.

The DataFlow above was made in just a few minutes when we were asked to modify a large number of XML files with new header data. The person who requested it wanted us to write a bash script to do the modification for them, but we pointed out that it could be done quickly and efficiently in a simple Composable DataFlow. The DataFlow begins with a Directory Lister Module, which is set to a specified file path and lists all files with a *.xml extension. The Table ForEach Module then iterates, or loops, through each file: it reads the data, appends the header information, and saves the result as a new file in another directory. DataFlows are effectively visual algorithms that make their function clear even to non-technical users, and they are easily enhanced or modified with more complex logic.
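For readers who think in code, the same two steps (list the matching files, then loop over them) can be sketched in plain Python. This is not Composable's API or the actual DataFlow, only a rough equivalent of the logic described above. The function names, the directory arguments, and the choice to place the header right after the XML declaration (or at the top of the file when there is none) are all assumptions made for illustration.

```python
from pathlib import Path


def add_header(text: str, header: str) -> str:
    """Return `text` with `header` added near the top.

    Assumption: if the file starts with an XML declaration (<?xml ... ?>),
    the header goes directly after it; otherwise it goes on the first line.
    """
    if text.startswith("<?xml"):
        end = text.index("?>") + 2  # position just past the declaration
        return f"{text[:end]}\n{header}\n{text[end:].lstrip()}"
    return f"{header}\n{text}"


def add_header_to_files(source_dir: Path, target_dir: Path, header: str,
                        pattern: str = "*.xml") -> list[Path]:
    """Write a copy of every matching file, with the header, into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Step 1: list the files (the role of the Directory Lister Module).
    for path in sorted(source_dir.glob(pattern)):
        if not path.is_file():
            continue
        # Step 2: read, modify, and save as a new file (the role of the loop).
        text = path.read_text(encoding="utf-8")
        out_path = target_dir / path.name
        out_path.write_text(add_header(text, header), encoding="utf-8")
        written.append(out_path)
    return written
```

A call such as `add_header_to_files(Path("input"), Path("output"), "<!-- new header -->")` would leave the originals untouched and write the modified copies to the second directory; both directory names and the header text are placeholders.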

Composable addresses the common pain points associated with manipulating numerous loosely formatted files. By providing a user-friendly interface, the platform enables users to navigate, modify, and process these files efficiently. The ability to save and store workflows (DataFlows) ensures repeatability and automation, while the sharing capabilities foster collaboration and knowledge transfer among users. With Composable, the once-daunting task of working with scattered and diverse files becomes a streamlined and manageable process.
