Use jieba for Chinese word segmentation, then convert the segmented text into a Bunch object (the container type defined in scikit-learn) and persist it. Vectorize the text with TF-IDF. Finally, train and test a Naive Bayes classifier on the resulting vectors.
CNTexas/Chinese-Text-Classification