Go-CoNLLU – Some Much Needed Machine Learning Support in Go

Python is commonly seen as the AI/ML language, but it is often a dull blade: its dynamic typing lets type errors slip through to runtime, and it is slow, like really slow. Many popular natural language processing toolkits only have Python APIs, and we want to see that change. At Nuvi, a social media marketing tool, we use Go for the majority of our data processing tasks because it lets us write simple, fast code. Today we are open-sourcing a tool that has made our ML lives in Go easier. Say hello to go-conllu.

What is CoNLL-U?

The Conference on Natural Language Learning (CoNLL) has created multiple file formats for storing natural language annotations. CoNLL-U is one such format and is used by the Universal Dependencies project, which hosts many annotated text corpora. To use these corpora, we need a parser that makes the data easy for developers to work with.
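To make the format concrete: in CoNLL-U, each word sits on its own line with ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), and an underscore marks an empty value. The sketch below splits one such line using only the standard library. The Token type and parseTokenLine function are made up for illustration and are not part of go-conllu.

```go
package main

import (
	"fmt"
	"strings"
)

// Token holds the ten tab-separated fields of one CoNLL-U word line.
// Illustrative only: this is NOT go-conllu's own struct.
type Token struct {
	ID, Form, Lemma, UPOS, XPOS, Feats, Head, Deprel, Deps, Misc string
}

// parseTokenLine splits a single CoNLL-U word line into its fields
// and rejects lines that do not have exactly ten of them.
func parseTokenLine(line string) (Token, error) {
	f := strings.Split(line, "\t")
	if len(f) != 10 {
		return Token{}, fmt.Errorf("expected 10 fields, got %d", len(f))
	}
	return Token{
		ID: f[0], Form: f[1], Lemma: f[2], UPOS: f[3], XPOS: f[4],
		Feats: f[5], Head: f[6], Deprel: f[7], Deps: f[8], Misc: f[9],
	}, nil
}

func main() {
	// A hand-written sample line; "_" means the field is unspecified.
	line := "2\tdogs\tdog\tNOUN\tNNS\tNumber=Plur\t3\tnsubj\t_\t_"

	tok, err := parseTokenLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s is a %s (lemma %q) attached to token %s as %s\n",
		tok.Form, tok.UPOS, tok.Lemma, tok.Head, tok.Deprel)
}
```

Every field is kept as a string here to keep the sketch short; a real parser would typically convert IDs and heads to numbers and break FEATS into key/value pairs.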

How Does Go-Conllu Help?

Go-conllu parses CoNLL-U data. It is a simple and reliable way to import CoNLL-U data into your application as Go structs.
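As a rough picture of what "CoNLL-U data as Go structs" means, here is a minimal standard-library sketch of the general idea, not go-conllu's actual implementation. It relies on three rules of the format: lines starting with # are comments, a blank line ends a sentence, and every other line is a token. The Sentence and Token types and the parseSentences function are hypothetical stand-ins, and only three of the ten fields are kept.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Token and Sentence are hypothetical stand-ins, NOT go-conllu's real types.
type Token struct {
	ID, Form, UPOS string
}

type Sentence struct {
	Tokens []Token
}

// parseSentences groups CoNLL-U token lines into sentences.
func parseSentences(input string) ([]Sentence, error) {
	var sentences []Sentence
	var current Sentence

	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case strings.TrimSpace(line) == "":
			// A blank line closes the sentence collected so far.
			if len(current.Tokens) > 0 {
				sentences = append(sentences, current)
				current = Sentence{}
			}
		case strings.HasPrefix(line, "#"):
			// Sentence-level comment such as "# text = ..."; skipped here.
		default:
			f := strings.Split(line, "\t")
			if len(f) != 10 {
				return nil, fmt.Errorf("expected 10 fields, got %d: %q", len(f), line)
			}
			current.Tokens = append(current.Tokens, Token{ID: f[0], Form: f[1], UPOS: f[3]})
		}
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	// Handle input that does not end with a blank line.
	if len(current.Tokens) > 0 {
		sentences = append(sentences, current)
	}
	return sentences, nil
}

func main() {
	// A small hand-written sample, not taken from a real corpus.
	input := "# text = Dogs bark.\n" +
		"1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n" +
		"2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\tSpaceAfter=No\n" +
		"3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n" +
		"\n"

	sentences, err := parseSentences(input)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, s := range sentences {
		for _, t := range s.Tokens {
			fmt.Printf("%s/%s ", t.Form, t.UPOS)
		}
		fmt.Println()
	}
}
```

A library saves you from maintaining this kind of code yourself, along with the parts the sketch leaves out, such as typed IDs, morphological features, and dependency fields.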

The specifics can be found in the GoDoc.

Let’s take a look at the quick-start example from the README. First, download the package.

go get github.com/nuvi/go-conllu

Then in a new project:

package main

import (
	"fmt"
	"log"

	conllu "github.com/nuvi/go-conllu"
)

func main() {
	// Parse the whole corpus file into a slice of sentences.
	sentences, err := conllu.ParseFile("path/to/model.conllu")
	if err != nil {
		log.Fatal(err)
	}

	// Print every token, with a blank line between sentences.
	for _, sentence := range sentences {
		for _, token := range sentence.Tokens {
			fmt.Println(token)
		}
		fmt.Println()
	}
}

All the sentences and tokens in the corpus will be printed to the console.

If you need a .conllu corpus file, you can download the Universal Dependencies English training set here: en_ewt-ud-train.conllu
