NLP

Natural Language Processing, Machine Learning, Development, Algorithm


NLP


[attention] NLP with attention

[lm] IRST Language Model Toolkit and KenLM

[brat] brat rapid annotation tool

[parsing] visualizer for the Sejong Tree Bank

[parsing] constituent tree (phrase structure tree) to dependency tree

[transliteration] m2m aligner

[parsing] SyntaxNet

[parsing] BIST Parser

[Rouzeta] FST based Korean morphological analyzer

Algorithm


[algorithm] sort & partial sort

[algorithm] generate all permutations of string

[algorithm] edit distance and longest common substring

[algorithm] binary tree

[algorithm] binary search tree

[algorithm] segment tree, rmq and autocomplete

[algorithm] KMP, BM string matching algorithm demo

[algorithm] Aho Corasick multi pattern matching

[algorithm] cascaded multi word multi pattern matching

[algorithm] structural pattern matching

[algorithm] linear time regular expression matcher via NFA

[algorithm] efficiently sorting linked lists

[algorithm] find most similar vectors


Common


[common] optimization vs maintainability, readability


Machine Learning


Basic Statistics

[statistics] review of basic probability theory

[statistics] why sample mean is unbiased estimator

[statistics] Central Limit Theorem

[statistics] p value

[statistics] posterior, likelihood, prior, evidence, MLE, MAP

[statistics] Entropy, Relative Entropy and Mutual Information

[statistics] Maximum Likelihood and Maximum Entropy

[statistics] Naive Bayesian, HMM, Maximum Entropy Model, CRF

[statistics] HMM, MEMM, CRF

[statistics] naive bayesian with aspect model

[statistics] Bayesian inference

[statistics] sentencepiece and unigram language model

HMM

[HMM] Hidden Markov Model from an implementation perspective

[HMM] Viterbi algorithm

[HMM] forward and backward variable

SVM

[svm] LIBSVM

ME

[me] maxent

CRF

[crf] Wapiti

[crf] CRFSuite

Boost

[boost] xgboost

Deep Learning

[neural network] neural network and deep learning

[neural network] notation and mathematics

[word2vec] Neural Language Model and Word2Vec

[word2vec] Word Embedding Visual Inspector

[CNN] tutorials

[RNN] tutorials

[layer norm] layer normalization


Development


Python

[python] find all occurrences in string

[python] how to implement dict ( hash )

[python] unlimited integer range and find duplicated number in array

[python] mod_python

[python] mod_fastcgi, mod_wsgi

[python] multi threading or multi processing for fetching url

[python] performance tuning and profiling

[python] remove control characters and all punctuations

[python] string compare disregarding white space

[python] unicode string, check digit and alphabet

[python] Berkeley DB

[python] LevelDB

[python] LMDB

[python] calling C functions from Python in OS X

[python] update python in os x

[python] GIL (Global Interpreter Lock) and releasing it in C extensions

[python] yield, json dump failure

[python] difflib, show differences between two strings

[python] memory mapped dictionary shared by multi process

[python] setup.py for egg

[python] manage path in OS X

[python] install pygraphviz

[python] filtering malformed utf8 character sequence (string)

[python] python tricks

C

[c] How to C(as of 2016)

[c] mmap() and munmap()

[c] variable-length struct

[c] Deep C

[JNA] Java Native Access

[c] reentrancy and thread safety

[c] program in memory

[c] using re2 from c, c wrapper

[c] pcre2 sample code

[c] using cpp library from c

[c] rpath for non standard shared library

[c] struct forward declaration in header

[c] ngram function

Go

[go] golang tutorial

Hadoop, Pig, Spark

[hadoop] Hadoop disk and memory spec

[pig] cache multiple files

[pig] GROUP operator and MAX,SUM,AVG,COUNT

[pig] map reduce for unbalanced key distribution

[pig] GROUP BY, reduce phase, STREAMING, nested FOREACH

[pig] MERGE JOIN

[pig] huge number of part files

[pig] set pig.splitCombination false

[spark] installation and test on Ubuntu, docker, OS X

NoSQL

[nosql] NoSQL

MySQL, Oracle

[mysql] escape backslash in load infile

[mysql] forward engineering error on ‘max key length is 767 bytes’

[mysql] workbench: points to watch when copying a schema

[oracle] using Oracle SQL Developer in OS X

Web Server

[tornado] multi process

[tornado] tcp request must end with ‘\r\n\r\n’

[tornado] Newline in header: ‘HTTP 1.0\n 404 Not Found’

[nginx unit] instruction

Graph DB

[neo4j] Install and Usage

[orientdb] Install and Usage

Etc

[automake] conditional config in configure.ac

[git] easy manual

[http] trace method check

[repo] epel.repo for redhat 6

[http,tcp ip] useful learning materials

[gdb] using in OS X

[rpmbuild] make rpm package

[socket] blocking, non blocking, synchronous, asynchronous

[docker] using Ubuntu from OS X and Windows

[bazel] install SyntaxNet

[interview] questions

Old posts

[daum blog]