NLPurify

NLPurify is a text cleaning and extraction engine. It combines traditional techniques, such as Unicode translation and regular-expression cleaning, with modern tools, such as natural language processing (NLP) and large language models (LLMs), to detect and clean long texts and to create word vectors.
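To make the traditional techniques concrete, here is a minimal sketch of Unicode translation followed by regular-expression cleaning, using only the Python standard library. It does not use or reproduce NLPurify's own API: the names `PUNCT_TABLE`, `URL_RE`, `WS_RE`, and `basic_clean` are hypothetical and exist only for illustration.

```python
import re
import unicodedata

# Hypothetical translation table (not NLPurify's): maps common typographic
# punctuation to plain ASCII equivalents.
PUNCT_TABLE = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})

URL_RE = re.compile(r"https?://\S+")  # rough URL matcher, for illustration only
WS_RE = re.compile(r"\s+")            # any run of whitespace


def basic_clean(text: str) -> str:
    """Normalize Unicode, translate punctuation, drop URLs, collapse whitespace."""
    # NFKC folds compatibility characters, e.g. a no-break space becomes a space.
    text = unicodedata.normalize("NFKC", text)
    text = text.translate(PUNCT_TABLE)
    text = URL_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "\u201cHello\u201d\u00a0 world \u2014 see https://example.com  now"
    print(basic_clean(sample))
```

Running normalization before the regular expressions keeps the patterns simple, because they only have to match one canonical form of each character.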