Data Science Notebook

Category Archives: text mining

A simple “click” that creates LDA topic models for text mining

A Python library I wrote, available with "pip install easyLDA"

If you have Python and a collection of texts in a file, simply run “pip install easyLDA”, then run $ easyLDA in a shell. It won’t be long before your topic model is ready.

fun, LDA, package, python, text mining, topic model
deep learning, jupyter notebook, machine learning, project, Python, text mining, topic modeling, visualization
Posted on February 13, 2018

Simple LDA Topic Modeling in Python: implementation and visualization, without delving into the math

A very simple approach to training an LDA topic model, within 10 minutes!

 

Plot word importance
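
Here is a minimal sketch of what training a small LDA model and reading off word importance per topic can look like. The tags on this post mention gensim and pyLDAvis; this sketch swaps in scikit-learn's `LatentDirichletAllocation` instead, and the toy `docs` list and the choice of 2 topics are assumptions for illustration only, not the original code.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus; replace with your own collection of texts.
docs = [
    "the phone battery lasts all day",
    "battery life on this phone is great",
    "the camera takes sharp photos",
    "photos from the camera look sharp and bright",
]

# LDA works on raw word counts, so build a bag-of-words matrix first.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(docs)

# n_components is the number of topics (an assumption for this toy corpus).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # one topic distribution per document

# vocabulary_ maps each term to its column index; sort terms by that index.
terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

# components_ holds one row of word weights per topic; show the top 3 words.
for topic_id, weights in enumerate(lda.components_):
    top_indices = weights.argsort()[::-1][:3]
    print(topic_id, [terms[i] for i in top_indices])
```

The per-topic rows of `lda.components_` are the numbers you would pass to a bar chart if you wanted to plot word importance rather than print it.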

 

topic modeling, topic modeling python lda visualization gensim pyldavis nltk
data cleansing, Python, text mining, topic modeling, unsupervised learning
Posted on April 25, 2017

Sentiment Analysis model deployed!

I’ve trained a sentiment analysis model on a simple data set:

Amazon Reviews: Unlocked Mobile Phones

It is based on Amazon phone purchase reviews and uses a simple linear SVM classifier built with scikit-learn. The code is further down; please scroll down.

I’ve also successfully deployed the model on an AWS server! Original deployment page

Model building
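
As a rough idea of what a linear SVM sentiment classifier in scikit-learn can look like, here is a minimal sketch. It is not the code from the full post: the toy `reviews` and `sentiments` lists are made up, and pairing `TfidfVectorizer` with `LinearSVC` is an assumption about the feature step.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy reviews; the real post uses the Amazon phone review data.
reviews = [
    "great phone love it",
    "excellent battery life",
    "works perfectly very happy",
    "terrible screen broke quickly",
    "awful speaker poor quality",
    "stopped charging waste of money",
]
sentiments = ["positive", "positive", "positive",
              "negative", "negative", "negative"]

# Turn text into tf-idf features, then fit a linear SVM on top of them.
model = make_pipeline(TfidfVectorizer(), LinearSVC(random_state=0))
model.fit(reviews, sentiments)

# Classify a new, unseen review.
prediction = model.predict(["great battery"])[0]
print(prediction)
```

Wrapping both steps in one pipeline means a deployed service only needs to load a single object and call `predict` on raw text.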

Posted on April 17, 2017

What do people say about the iPhone? LDA topic models built from Amazon phone purchase reviews

I built topic models from the Amazon Reviews: Unlocked Mobile Phones data set.

Positive: here’s what the 5-star ratings say, 20-topic version

Posted on April 15, 2017

unsupervised learning-4: discovering interpretable features

Non-negative matrix factorization (NMF)

word-frequency array

Apply dimension reduction techniques to images

  • PCA

  • NMF

  • SVD

NMF components, PCA components

Cosine similarity
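
To make the NMF and cosine similarity steps concrete, here is a small sketch under stated assumptions: the 4x4 `word_counts` array is invented, and 2 components is an arbitrary choice. It factorizes a word-frequency array with scikit-learn's `NMF`, then compares documents by the cosine similarity of their NMF features.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.preprocessing import normalize

# Hypothetical word-frequency array: rows are documents, columns are words.
word_counts = np.array([
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
], dtype=float)

# NMF needs non-negative input and learns non-negative parts (components).
model = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
features = model.fit_transform(word_counts)  # document-by-component weights

# Scale each row to unit length so a dot product equals cosine similarity.
unit_features = normalize(features)

# Cosine similarity of every document to the first document.
similarities = unit_features @ unit_features[0]
print(similarities)
```

Because both `features` and `model.components_` are non-negative, each component can be read as a group of words that tend to appear together, which is what makes NMF features easier to interpret than PCA components.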

 

learning, longly, midnight, super
data cleansing, jupyter notebook, project, Python, text mining, unsupervised learning
Posted on February 20, 2017

unsupervised learning-3: Dimension reduction: PCA, tf-idf, sparse matrix, Twitter posts clustering

Intrinsic dimension, text mining, Word frequency arrays, csr_matrix, TruncatedSVD

Dimension reduction: PCA,  Intrinsic dimension

tf-idf,  Word frequency arrays

sparse matrix,  csr_matrix, TruncatedSVD
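
The pieces listed above fit together roughly as in this sketch. The short `posts` list stands in for real Twitter posts, and 2 SVD components and 2 clusters are arbitrary assumptions. `TfidfVectorizer` produces a sparse tf-idf matrix, `TruncatedSVD` reduces it without converting it to a dense array, and `KMeans` clusters the reduced rows.

```python
from scipy.sparse import issparse
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical short posts standing in for real Twitter data.
posts = [
    "new python release today",
    "python pandas tutorial for beginners",
    "learning pandas and python",
    "football match tonight",
    "great goal in the football match",
    "the match ended with a late goal",
]

# tf-idf word-frequency array, stored as a sparse CSR matrix.
tfidf = TfidfVectorizer().fit_transform(posts)

# TruncatedSVD accepts the sparse matrix directly.
svd = TruncatedSVD(n_components=2, random_state=0)
reduced = svd.fit_transform(tfidf)

# Cluster the posts in the reduced space.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(reduced)
print(labels)
```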

fun, pca, text mining, tf-idf
data cleansing, jupyter notebook, Python, statistic, text mining, unsupervised learning
Posted on February 18, 2017

case example 4 – N-gram and complex pipeline

Add more complex features and steps, e.g. scaling: HashingVectorizer, N-gram, SelectKBest, SparseInteractions, MaxAbsScaler, OneVsRestClassifier
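
A minimal sketch of how most of these pieces can be chained in one scikit-learn pipeline follows. The toy `texts` and `labels` are invented, and the parameter values (`n_features`, `k`) are arbitrary. `SparseInteractions` is left out because it is not a scikit-learn class, so this is a simplified version rather than the full pipeline from the post.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical labelled texts for illustration only.
texts = [
    "teacher salary for math department",
    "substitute teacher pay",
    "school bus fuel and repairs",
    "bus driver overtime",
    "cafeteria food supplies",
    "lunch program food order",
]
labels = ["staff", "staff", "transport", "transport", "food", "food"]

pipeline = Pipeline([
    # Hash unigrams and bigrams into a fixed number of columns.
    # alternate_sign=False keeps values non-negative, which chi2 requires.
    ("vectorizer", HashingVectorizer(ngram_range=(1, 2), n_features=2**10,
                                     alternate_sign=False, norm=None)),
    # Keep the k columns most associated with the labels.
    ("select", SelectKBest(chi2, k=20)),
    # Scale each column by its maximum absolute value; works on sparse data.
    ("scale", MaxAbsScaler()),
    # Fit one binary classifier per class.
    ("clf", OneVsRestClassifier(LogisticRegression())),
])

pipeline.fit(texts, labels)
predictions = pipeline.predict(["teacher pay raise"])
print(predictions)
```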

case, project, study
data cleansing, jupyter notebook, machine learning, Pandas, text mining
Posted on January 17, 2017

case example 3 – building a Pipeline

Building a Pipeline for text mining: preprocessing, model training, and prediction
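
For reference, a bare-bones text pipeline in scikit-learn might look like the sketch below. The training texts, labels, and step names are all hypothetical, and `LogisticRegression` is just one reasonable choice of model.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical training data for illustration only.
train_texts = [
    "printer paper and ink",
    "office paper supplies",
    "laptop repair service",
    "computer repair and service fee",
]
train_labels = ["supplies", "supplies", "services", "services"]

# Preprocessing and model live in one object, so fit and predict take raw text.
text_pipeline = Pipeline([
    ("vectorize", CountVectorizer()),
    ("classify", LogisticRegression()),
])

text_pipeline.fit(train_texts, train_labels)
predicted = text_pipeline.predict(["invoice for printer paper"])
print(predicted)
```

Keeping the vectorizer inside the pipeline also means its vocabulary is learned only from the data passed to `fit`, which avoids leaking test data into preprocessing.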

case, fun, project
data cleansing, jupyter notebook, machine learning, text mining
Posted on January 17, 2017

case example 2 – model exploration and NLP

CountVectorizer() in sklearn, Tokenizes all the strings, Bag-of-words
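
A quick sketch of what `CountVectorizer()` does to a couple of strings, using two made-up sentences: it tokenizes each string, builds a vocabulary, and returns a bag-of-words count matrix with one row per string and one column per token.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two made-up sentences for illustration.
sentences = ["the cat sat", "the cat sat on the mat"]

vectorizer = CountVectorizer()
bag_of_words = vectorizer.fit_transform(sentences)  # sparse count matrix

# Columns are ordered alphabetically by token.
tokens = sorted(vectorizer.vocabulary_)
print(tokens)                  # ['cat', 'mat', 'on', 'sat', 'the']
print(bag_of_words.toarray())  # [[1 0 0 1 1], [1 1 1 1 2]]
```

Word order is discarded: only how often each token occurs in each string is kept.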

case, project
data cleansing, jupyter notebook, machine learning, text mining
Posted on January 17, 2017

Richard Ji
