Part-of-speech tagging with Jieba and NLTK

  1. POS tagging with Jieba

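    from jieba import posseg  # third-party: pip install jieba

    def jieba_posseg(sent):
        """Segment a Chinese sentence and print each word with its POS flag."""
        # posseg.cut yields pair objects exposing .word and .flag
        for item in posseg.cut(sent):
            print(item.word, item.flag)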
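
Printing is fine for a quick look, but returning the pairs makes them easier to reuse. The following is a minimal sketch, not part of the original post: `tag_pairs` and `keep_by_prefix` are hypothetical helper names, and the sketch assumes that noun flags in Jieba's tag set start with `n` (for example `n`, `nr`, `ns`).

```python
def tag_pairs(sent):
    """Return [(word, flag), ...] for a sentence using jieba.posseg."""
    from jieba import posseg  # third-party: pip install jieba
    return [(item.word, item.flag) for item in posseg.cut(sent)]


def keep_by_prefix(pairs, prefix):
    """Keep the (word, flag) pairs whose flag starts with `prefix`.

    Assumption: related flags share a prefix, e.g. noun flags start with "n".
    """
    return [(word, flag) for word, flag in pairs if flag.startswith(prefix)]


if __name__ == "__main__":
    try:
        # Print only the words that Jieba tags as some kind of noun.
        print(keep_by_prefix(tag_pairs("我爱北京天安门"), "n"))
    except ImportError:
        print("jieba is not installed")
```

Because `keep_by_prefix` only looks at plain `(word, flag)` tuples, it can be applied to the output of other taggers as well.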

  2. POS tagging with NLTK

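    import nltk  # third-party: pip install nltk

    def nltk_posseg(sent):
        """Tokenize an English sentence and print each token with its POS tag."""
        # Requires the NLTK tokenizer and tagger data, fetched via nltk.download().
        tokens = nltk.word_tokenize(sent)
        # pos_tag returns a list of (word, tag) tuples
        for word, tag in nltk.pos_tag(tokens):
            print(word, tag)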
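
Since `nltk.pos_tag` returns a list of `(word, tag)` tuples, the result can be summarized directly. The following is a minimal sketch, not part of the original post: `tag_counts` and `nltk_tag_counts` are hypothetical helper names, and the tags are assumed to come from NLTK's default Penn Treebank tag set.

```python
from collections import Counter


def tag_counts(tagged):
    """Count how often each POS tag occurs in a list of (word, tag) tuples."""
    return Counter(tag for _, tag in tagged)


def nltk_tag_counts(sent):
    """Tokenize and tag a sentence with NLTK, then count the tags."""
    import nltk  # third-party; also needs its tokenizer and tagger data
    return tag_counts(nltk.pos_tag(nltk.word_tokenize(sent)))
```

Calling `nltk_tag_counts` on a sentence gives a quick overview of which tags dominate it, which can help when checking the tagger's output on your own text.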

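The NLP toolkits pkuseg (Peking University) and thulac (Tsinghua University) are also recommended. A comparison of segmentation results follows: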
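
One way to produce such a comparison yourself is sketched below. It is an illustration under stated assumptions rather than the original author's setup: `load_segmenters` and `format_comparison` are hypothetical helper names, each toolkit is used with its default model, and any toolkit that is missing or fails to load is skipped.

```python
def load_segmenters():
    """Return {name: callable(text) -> list of tokens} for usable toolkits."""
    segmenters = {}
    try:
        import jieba
        segmenters["jieba"] = jieba.lcut
    except Exception:  # package missing or failed to load
        pass
    try:
        import pkuseg
        segmenters["pkuseg"] = pkuseg.pkuseg().cut  # default model
    except Exception:
        pass
    try:
        import thulac
        thu = thulac.thulac(seg_only=True)  # segmentation only, no POS tags
        # With text=True, cut() returns one space-separated string.
        segmenters["thulac"] = lambda text: thu.cut(text, text=True).split()
    except Exception:
        pass
    return segmenters


def format_comparison(results):
    """Render {name: tokens} as one 'name: tok / tok / ...' line per toolkit."""
    return "\n".join(
        f"{name}: {' / '.join(tokens)}" for name, tokens in results.items()
    )


if __name__ == "__main__":
    text = "我爱北京天安门"
    results = {name: cut(text) for name, cut in load_segmenters().items()}
    print(format_comparison(results))
```

Running it prints one line per installed toolkit, so differences in how each one splits the same sentence are visible side by side.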
