Bitbucket – GitHub – CLTL Bitbucket – CLTL GitHub

Reproducible Experiments

I believe that every scientific experiment should be reproducible. I’ve therefore done my best to make all of my papers fully reproducible (with all results, not only the best ones). In addition, whenever I take the time to reproduce others’ experiments, I share the results with the community.

Stanford Neural Dependency Parsing Experiment

A fast reimplementation of the Stanford neural parser in Lua/Torch7. On a GPU, training takes only 1.5 hours instead of the 8 hours needed by Stanford’s Java implementation. The nitty-gritty details of Chen and Manning (2014) are all implemented. See this post for details.

Reinforcement Learning and Error Propagation (EACL 2017)

This codebase extends the Stanford neural parser (written in Java) by adding reinforcement learning and a way to measure error propagation. We found that reinforcement learning reduces error propagation and improves performance. The repo is hosted on Bitbucket.

BabelFy Reimplementation

BabelFy is state-of-the-art software for word sense disambiguation and entity linking (as of 2016), but its source code is proprietary. It can only be used via an API, either free for a very limited number of documents or for a fee. We attempted to reimplement BabelFy but didn’t have enough resources (time, RAM, CPU hours). We are releasing the source code anyway in the hope that somebody will continue the work. CLIN 26 presentation – GitHub repo

Vietnamese Text Processing


A collection of command-line tools and GATE plugins for a Vietnamese text processing pipeline. VNLP was developed at ePi technologies and includes:

  • Tokenizer based on vnTokenizer
  • POS tagger
  • Rule-based named entity recognizer (NER) implemented in GATE
  • Orthomatcher and Co-referencer for Vietnamese names
  • Clause recognizer (incubating)
  • Tool for annotating and linking in GATE

Link Grammar

Lienkate is a Java library for parsing Vietnamese text using the link grammar formalism, but it can also parse any other language given a link grammar dictionary for it. In addition to the basic features of a link grammar parser, it provides some utilities:

  • Link grammar expression to automaton converter
  • Deterministic parser for link grammar (incubating)
