Language processing tasks and corresponding NLTK modules

Accessing corpora

nltk.corpus : Standardized interfaces to corpora and lexicons
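
As a rough illustration of the corpus reader interface, the sketch below builds a tiny throwaway corpus on disk and reads it back with PlaintextCorpusReader. The file name and text are made up for the example; the bundled corpora (which need a separate download) expose the same style of fileids() and words() calls.

```python
import os
import tempfile

from nltk.corpus.reader import PlaintextCorpusReader

# Create a one-file corpus in a temporary directory (hypothetical content).
root = tempfile.mkdtemp()
with open(os.path.join(root, "sample.txt"), "w", encoding="utf-8") as f:
    f.write("NLTK reads corpora. It also reads lexicons.\n")

# Every .txt file under root becomes one document of the corpus.
reader = PlaintextCorpusReader(root, r".*\.txt")

file_ids = reader.fileids()
words = list(reader.words("sample.txt"))

print(file_ids)
print(words)
```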

String processing

nltk.tokenize, nltk.stem : Tokenizers, sentence tokenizers, stemmers  
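
A minimal sketch of tokenizing and stemming, using only components that need no extra data downloads (wordpunct_tokenize and PorterStemmer). The sample sentence is made up.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

text = "The runners were running quickly."

# Split on word characters vs. punctuation; no trained model required.
tokens = wordpunct_tokenize(text)

# Reduce each token to its Porter stem (stems are not always dictionary words).
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```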

Collocation discovery 

nltk.collocations : t-test, chi-squared, pointwise mutual information
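
Collocation discovery might look roughly like this: count bigrams, drop the rare ones, then rank what is left by an association measure (PMI here). The toy word list is invented for the example; real use would start from a much larger text.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Toy text (made up); tokens are already split on whitespace.
words = (
    "new york is big . new york is busy . "
    "she moved to new york last year ."
).split()

finder = BigramCollocationFinder.from_words(words)

# Ignore bigrams seen fewer than 2 times; PMI tends to overrate rare pairs.
finder.apply_freq_filter(2)

# Rank the remaining bigrams by pointwise mutual information.
top_bigrams = finder.nbest(BigramAssocMeasures.pmi, 5)

print(top_bigrams)
```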

Part-of-speech tagging

nltk.tag : n-gram, backoff, Brill, HMM, TnT
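
The n-gram and backoff taggers can be combined as in the sketch below: a unigram tagger learns the most frequent tag per word and falls back to a default tag for unseen words. The two training sentences are toy data, not a real tagged corpus.

```python
from nltk.tag import DefaultTagger, UnigramTagger

# Tiny hand-tagged training set (made up for illustration).
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

# Unknown words fall through to the default tagger and get "NN".
backoff = DefaultTagger("NN")
tagger = UnigramTagger(train_sents, backoff=backoff)

tagged = tagger.tag(["the", "cat", "barks", "loudly"])
print(tagged)
# [('the', 'DT'), ('cat', 'NN'), ('barks', 'VBZ'), ('loudly', 'NN')]
```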

Classification

nltk.classify, nltk.cluster: Decision tree, maximum entropy, naive Bayes, EM, k-means
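
A small naive Bayes sketch: classifiers in nltk.classify train on (feature dict, label) pairs. The names, labels, and the last_letter feature below are toy assumptions chosen to keep the example short.

```python
from nltk.classify import NaiveBayesClassifier


def name_features(name):
    """Hypothetical feature extractor: only the final letter of the name."""
    return {"last_letter": name[-1].lower()}


# Toy labeled data (made up).
labeled_names = [
    ("Anna", "female"), ("Maria", "female"), ("Julia", "female"),
    ("John", "male"), ("Mark", "male"), ("Peter", "male"),
]
train_set = [(name_features(name), label) for name, label in labeled_names]

classifier = NaiveBayesClassifier.train(train_set)

# "Sofia" ends in "a", which only occurs with "female" in the training data.
label = classifier.classify(name_features("Sofia"))
print(label)
```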

Chunking

nltk.chunk : Regular expression, n-gram, named entity
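
Regular-expression chunking works over part-of-speech tags. The sketch below defines one noun-phrase rule and applies it to a sentence that is already tagged by hand (the sentence and tags are made up).

```python
from nltk.chunk import RegexpParser

# NP = optional determiner, any number of adjectives, then a noun.
grammar = "NP: {<DT>?<JJ>*<NN>}"
chunker = RegexpParser(grammar)

tagged_sentence = [
    ("the", "DT"), ("little", "JJ"), ("dog", "NN"),
    ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN"),
]

tree = chunker.parse(tagged_sentence)

# Collect the words of every NP chunk found in the sentence.
noun_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees(lambda t: t.label() == "NP")
]
print(noun_phrases)
# ['the little dog', 'the cat']
```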

Parsing

nltk.parse : Chart, feature-based, unification, probabilistic, dependency
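
A chart parser needs a grammar to work with. The sketch below uses a toy context-free grammar written inline (an assumption for the example, not a grammar shipped with NLTK) and parses one sentence with it.

```python
from nltk.grammar import CFG
from nltk.parse import ChartParser

# Toy grammar covering exactly the words in the example sentence.
grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

parser = ChartParser(grammar)
tokens = "the dog chased the cat".split()

# parse() yields every tree the grammar licenses for the token list.
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)
```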

Semantic interpretation

nltk.sem, nltk.inference : Lambda calculus, first-order logic, model checking  
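
For the lambda calculus part, a short sketch: parse a logical expression from a string and beta-reduce it with simplify(). The predicate and constant names are made up.

```python
from nltk.sem.logic import Expression

read_expr = Expression.fromstring

# A lambda abstraction applied to the constant "john".
applied = read_expr(r"(\x.(walk(x) & talk(x)))(john)")

# simplify() performs the beta-reduction, substituting john for x.
reduced = applied.simplify()
print(reduced)
# (walk(john) & talk(john))
```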

Evaluation metrics

nltk.metrics : Precision, recall, agreement coefficients  
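
Precision and recall in nltk.metrics compare a reference set against a test set. A minimal sketch with made-up sets:

```python
from nltk.metrics import precision, recall

reference = {"cat", "dog", "bird", "fish"}   # gold-standard items
predicted = {"cat", "dog", "horse"}          # items the system returned

# Precision: share of predicted items that are correct (2 of 3).
p = precision(reference, predicted)

# Recall: share of reference items that were found (2 of 4).
r = recall(reference, predicted)

print(p, r)
```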

Probability and estimation

nltk.probability : Frequency distributions, smoothed probability distributions  
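
A sketch of a frequency distribution and two estimators built on it: maximum likelihood (unsmoothed) and Laplace (add-one). The sentence and the vocabulary size of 10 bins are assumptions for the example.

```python
from nltk.probability import FreqDist, LaplaceProbDist, MLEProbDist

fd = FreqDist("the cat sat on the mat".split())

# Unsmoothed estimate: count / total, so unseen words get probability 0.
mle = MLEProbDist(fd)

# Add-one smoothing over an assumed vocabulary of 10 word types.
laplace = LaplaceProbDist(fd, bins=10)

print(fd["the"], fd.N())        # 2 6
print(mle.prob("dog"))          # 0.0
print(laplace.prob("dog"))      # (0 + 1) / (6 + 10) = 0.0625
```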

Applications

nltk.app, nltk.chat : Graphical concordancer, parsers, WordNet browser, chatbots  
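
The chatbots in nltk.chat are built on a small pattern-matching helper. A rough sketch with two made-up rules (the graphical tools in nltk.app need a display and are not shown):

```python
from nltk.chat.util import Chat, reflections

# Each rule is (regex pattern, tuple of candidate responses).
# %1 is replaced by the first captured group from the pattern.
pairs = [
    (r"my name is (.*)", ("Hello %1, how can I help?",)),
    (r"(.*)", ("Tell me more.",)),
]

bot = Chat(pairs, reflections)

greeting = bot.respond("My name is Ada")
fallback = bot.respond("I like parsing")

print(greeting)
print(fallback)
```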

Linguistic fieldwork

nltk.toolbox : Manipulate data in SIL Toolbox format  
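
Toolbox files store each field on a line that starts with a backslash marker. The sketch below reads one invented record from a string with StandardFormat; a real lexicon would normally be opened from a file instead.

```python
from nltk.toolbox import StandardFormat

# One made-up lexicon record: lexeme, part of speech, English gloss.
record = "\\lx kaa\n\\ps N\n\\ge gag\n"

sf = StandardFormat()
sf.open_string(record)

# fields() yields (marker, value) pairs in file order.
fields = list(sf.fields())
sf.close()

print(fields)
```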


