The SaaS lets users connect various data sources (SQL and NoSQL databases, CSV, XML, XLS, JSON, etc.) and builds reports on top of that data. Users can filter, aggregate, sort, and download the processed data.
Challenge: Find the best approach to upload and index large files (50 GB+) through the web. Solution: We developed a chunk uploader that works efficiently even over a slow Internet connection. The user can close the browser or tab and resume the upload at any time. A minimal sketch of such a resumable uploader is shown below.
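The sketch assumes a hypothetical /upload endpoint that accepts Content-Range headers and a hypothetical /upload/status endpoint that reports how many bytes the server already holds; neither is the project's actual API.

```typescript
// Sketch of a resumable chunk uploader (endpoints are hypothetical).
const CHUNK_SIZE = 8 * 1024 * 1024; // 8 MiB per chunk, illustrative

async function uploadFile(file: File, uploadId: string): Promise<void> {
  // Ask the server how much of this file it already has, so an
  // interrupted upload resumes instead of restarting from zero.
  const statusRes = await fetch(`/upload/status?id=${uploadId}`);
  let offset = statusRes.ok ? (await statusRes.json()).bytesReceived : 0;

  while (offset < file.size) {
    // Blob.slice lets the browser stream one chunk at a time,
    // so a 50 GB file never has to fit in memory.
    const chunk = file.slice(offset, offset + CHUNK_SIZE);
    const res = await fetch(`/upload?id=${uploadId}`, {
      method: "PUT",
      headers: {
        "Content-Range": `bytes ${offset}-${offset + chunk.size - 1}/${file.size}`,
      },
      body: chunk,
    });
    if (!res.ok) throw new Error(`Chunk failed at offset ${offset}`);
    offset += chunk.size;
  }
}
```

Because the current offset always comes from the server, closing the tab loses nothing: the next call to uploadFile picks up exactly where the server left off.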
Challenge: Find a solution to handle large and small files efficiently. Solution: We developed a hybrid data-ingestion pipeline: large files go into BigQuery, while medium and small files go into Elasticsearch. This let us process very large files and still run analytics queries on large datasets in under 2 seconds. A sketch of the routing layer follows.
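This sketch uses the standard @google-cloud/bigquery and @elastic/elasticsearch Node.js clients; the 1 GiB threshold and the dataset/index names are illustrative assumptions, not the production values.

```typescript
// Sketch of size-based routing between BigQuery and Elasticsearch.
import { BigQuery } from "@google-cloud/bigquery";
import { Client } from "@elastic/elasticsearch";
import { statSync } from "fs";

const SIZE_THRESHOLD = 1024 * 1024 * 1024; // 1 GiB cutoff, illustrative

const bigquery = new BigQuery();
const elastic = new Client({ node: "http://localhost:9200" });

// `rows` are the already-parsed records for the small-file path.
async function ingest(filePath: string, rows: object[]): Promise<void> {
  const { size } = statSync(filePath);
  if (size >= SIZE_THRESHOLD) {
    // Large files: load directly into BigQuery for scalable analytics.
    await bigquery
      .dataset("reports")
      .table("raw_data")
      .load(filePath, { sourceFormat: "CSV", autodetect: true });
  } else {
    // Small/medium files: bulk-index into Elasticsearch for fast queries.
    await elastic.helpers.bulk({
      datasource: rows,
      onDocument: () => ({ index: { _index: "raw_data" } }),
    });
  }
}
```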
Challenge: Determine the file format and the approximate progress of file ingestion. Solution: We implemented heuristic detection logic that identifies the format and reports accurate progress to the end user, as sketched below.
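The write-up doesn't spell out the heuristic, so this is one plausible sketch under our own assumptions: the format is guessed from the file's leading bytes, and progress is approximated as bytes processed over total size.

```typescript
// Hypothetical format sniffing and progress estimation.
type Format = "json" | "xml" | "xlsx" | "csv";

function sniffFormat(head: Uint8Array): Format {
  // XLSX files are ZIP containers with the well-known "PK" magic bytes.
  if (head[0] === 0x50 && head[1] === 0x4b) return "xlsx";
  const text = new TextDecoder().decode(head).trimStart();
  if (text.startsWith("{") || text.startsWith("[")) return "json";
  if (text.startsWith("<")) return "xml";
  return "csv"; // fallback: treat delimited text as CSV
}

function progress(bytesProcessed: number, totalBytes: number): number {
  // Row-based formats stream roughly linearly, so byte progress is a
  // reasonable approximation of ingestion progress; clamp to [0, 1].
  return Math.min(1, bytesProcessed / totalBytes);
}
```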
Challenge: Develop a rich and fast interface. Solution: We found an open-source canvas-based table component, forked it, and customized it for our needs. It works much like Google Sheets, with lazy loading and rendering only the data the user actually sees; the sketch after this item shows the core idea.
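A sketch of viewport-only rendering (ROW_HEIGHT, the column width, and the function name are illustrative, not taken from the actual component): only the rows intersecting the visible scroll window are redrawn each frame, so rendering cost stays constant no matter how large the dataset is.

```typescript
// Draw only the rows currently visible in the scroll viewport.
const ROW_HEIGHT = 24; // px per row, illustrative

function renderVisibleRows(
  ctx: CanvasRenderingContext2D,
  rows: string[][],
  scrollTop: number,
  viewportHeight: number,
): void {
  // Translate the scroll position into a visible row range.
  const first = Math.floor(scrollTop / ROW_HEIGHT);
  const last = Math.min(
    rows.length,
    Math.ceil((scrollTop + viewportHeight) / ROW_HEIGHT),
  );

  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  for (let i = first; i < last; i++) {
    const y = i * ROW_HEIGHT - scrollTop; // position relative to viewport
    rows[i].forEach((cell, col) => ctx.fillText(cell, col * 120 + 4, y + 16));
  }
}
```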